* [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating @ 2020-03-19 20:40 David Woodhouse 2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse 2020-03-20 6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß 0 siblings, 2 replies; 17+ messages in thread From: David Woodhouse @ 2020-03-19 20:40 UTC (permalink / raw) To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu From: David Woodhouse <dwmw@amazon.co.uk> The do_ls() function has somewhat inconsistent handling of errors. If reading the node's contents with xs_read() fails, then do_ls() will just quietly not display the contents. If reading the node's permissions with xs_get_permissions() fails, then do_ls() will print a warning, continue, and ultimately won't exit with an error code (unless another error happens). If recursing into the node with xs_directory() fails, then do_ls() will abort immediately, not printing any further nodes. For persistent failure modes — such as ENOENT because a node has been removed, or EACCES because it has had its permisions changed since the xs_directory() on the parent directory returned its name — it's obviously quite likely that if either of the first two errors occur for a given node, then so will the third and thus xenstore-ls will abort. The ENOENT one is actually a fairly common case, and has caused tools to fail to clean up a network device because it *apparently* already doesn't exist in xenstore. There is a school of thought that says, "Well, xenstore-ls returned an error. So the tools should not trust its output." The natural corollary of this would surely be that the tools must re-run xenstore-ls as many times as is necessary until its manages to exit without hitting the race condition. I am not keen on that conclusion. For the specific case of ENOENT it seems reasonable to declare that, but for the timing, we might as well just not have seen that node at all when calling xs_directory() for the parent. By ignoring the error, we give acceptable output. The issue can be reproduced as follows: (dom0) # for a in `seq 1 1000` ; do xenstore-write /local/domain/2/foo/$a $a ; done Now simultaneously: (dom0) # for a in `seq 1 999` ; do xenstore-rm /local/domain/2/foo/$a ; done (dom2) # while true ; do ./xenstore-ls -p /local/domain/2/foo | grep -c 1000 ; done We should expect to see node 1000 in the output, every time. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> --- tools/xenstore/xenstore_client.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c index 3afc630ab8..ae7ed3eb9e 100644 --- a/tools/xenstore/xenstore_client.c +++ b/tools/xenstore/xenstore_client.c @@ -148,14 +148,20 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms int i; unsigned int num, len; + e = xs_directory(h, XBT_NULL, path, &num); + if (e == NULL) { + if (cur_depth && errno == ENOENT) { + /* If a node disappears while recursing, silently move on. */ + return; + } + + err(1, "xs_directory (%s)", path); + } + newpath = malloc(STRING_MAX); if (!newpath) err(1, "malloc in do_ls"); - e = xs_directory(h, XBT_NULL, path, &num); - if (e == NULL) - err(1, "xs_directory (%s)", path); - for (i = 0; i<num; i++) { char buf[MAX_STRLEN(unsigned int)+1]; struct xs_permissions *perms; -- 2.21.0 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately 2020-03-19 20:40 [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating David Woodhouse @ 2020-03-19 20:40 ` David Woodhouse 2020-03-20 11:03 ` Paul Durrant 2020-03-20 6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß 1 sibling, 1 reply; 17+ messages in thread From: David Woodhouse @ 2020-03-19 20:40 UTC (permalink / raw) To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu From: David Woodhouse <dwmw@amazon.co.uk> Report only one error to stderr for each node, regardless of whether it's xs_read, xs_get_permissions or xs_directory on the child that fails. Always exit with a non-zero code if any failure happens, reporting the last error that occurred. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> --- tools/xenstore/xenstore_client.c | 35 ++++++++++++++++++++++++-------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c index ae7ed3eb9e..0c891961ae 100644 --- a/tools/xenstore/xenstore_client.c +++ b/tools/xenstore/xenstore_client.c @@ -140,7 +140,7 @@ static int show_whole_path = 0; #define MIN(a, b) (((a) < (b))? (a) : (b)) -static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms) +static int do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms, int error, int ignore_errors) { char **e; char *newpath, *val; @@ -150,9 +150,16 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms e = xs_directory(h, XBT_NULL, path, &num); if (e == NULL) { - if (cur_depth && errno == ENOENT) { - /* If a node disappears while recursing, silently move on. */ - return; + if (cur_depth && (errno == ENOENT || errno == EACCES)) { + /* + * If a node disappears or becomes inaccessible while traversing, + * only print an error if previous operations on this node haven't + * done do. Then move on. + */ + error = errno; + if (!ignore_errors) + warn("xs_directory (%s)", path); + return error; } err(1, "xs_directory (%s)", path); @@ -197,7 +204,8 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms /* Print value */ if (val == NULL) { - printf(":\n"); + error = errno; + printf(": (%s)", strerror(error)); } else { if (max_width < (linewid + len + TAG_LEN)) { @@ -222,7 +230,11 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms if (show_perms) { perms = xs_get_permissions(h, XBT_NULL, newpath, &nperms); if (perms == NULL) { - warn("\ncould not access permissions for %s", e[i]); + error = errno; + val = NULL; + /* Don't repeat an error message if xs_read() already failed */ + if (val) + warn("could not access permissions for %s", e[i]); } else { int i; @@ -238,11 +250,13 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms } putchar('\n'); - - do_ls(h, newpath, cur_depth+1, show_perms); + + error = do_ls(h, newpath, cur_depth+1, show_perms, error, !val); } free(e); free(newpath); + + return error; } static void @@ -448,7 +462,10 @@ perform(enum mode mode, int optind, int argc, char **argv, struct xs_handle *xsh break; } case MODE_ls: { - do_ls(xsh, argv[optind], 0, prefix); + int error = do_ls(xsh, argv[optind], 0, prefix, 0, 0); + if (error) { + errx(1, "Errors during traversal. Last error: %s", strerror(error)); + } optind++; break; } -- 2.21.0 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately 2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse @ 2020-03-20 11:03 ` Paul Durrant 2020-03-20 12:30 ` David Woodhouse 0 siblings, 1 reply; 17+ messages in thread From: Paul Durrant @ 2020-03-20 11:03 UTC (permalink / raw) To: 'David Woodhouse', xen-devel Cc: 'Juergen Gross', 'Ian Jackson', 'Wei Liu' > -----Original Message----- > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse > Sent: 19 March 2020 20:40 > To: xen-devel@lists.xenproject.org > Cc: Juergen Gross <jgross@suse.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org> > Subject: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit > appropriately > > From: David Woodhouse <dwmw@amazon.co.uk> > > Report only one error to stderr for each node, regardless of whether it's > xs_read, xs_get_permissions or xs_directory on the child that fails. > > Always exit with a non-zero code if any failure happens, reporting the > last error that occurred. > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > --- > tools/xenstore/xenstore_client.c | 35 ++++++++++++++++++++++++-------- > 1 file changed, 26 insertions(+), 9 deletions(-) > > diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c > index ae7ed3eb9e..0c891961ae 100644 > --- a/tools/xenstore/xenstore_client.c > +++ b/tools/xenstore/xenstore_client.c > @@ -140,7 +140,7 @@ static int show_whole_path = 0; > > #define MIN(a, b) (((a) < (b))? (a) : (b)) > > -static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms) > +static int do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms, int error, int > ignore_errors) > { > char **e; > char *newpath, *val; > @@ -150,9 +150,16 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > > e = xs_directory(h, XBT_NULL, path, &num); > if (e == NULL) { > - if (cur_depth && errno == ENOENT) { > - /* If a node disappears while recursing, silently move on. */ > - return; > + if (cur_depth && (errno == ENOENT || errno == EACCES)) { > + /* > + * If a node disappears or becomes inaccessible while traversing, > + * only print an error if previous operations on this node haven't > + * done do. Then move on. > + */ > + error = errno; > + if (!ignore_errors) > + warn("xs_directory (%s)", path); > + return error; > } > > err(1, "xs_directory (%s)", path); > @@ -197,7 +204,8 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > > /* Print value */ > if (val == NULL) { > - printf(":\n"); > + error = errno; > + printf(": (%s)", strerror(error)); > } > else { > if (max_width < (linewid + len + TAG_LEN)) { > @@ -222,7 +230,11 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > if (show_perms) { > perms = xs_get_permissions(h, XBT_NULL, newpath, &nperms); > if (perms == NULL) { > - warn("\ncould not access permissions for %s", e[i]); > + error = errno; > + val = NULL; > + /* Don't repeat an error message if xs_read() already failed */ > + if (val) How can the code get here? The line above the comment always sets val to NULL. Paul > + warn("could not access permissions for %s", e[i]); > } > else { > int i; > @@ -238,11 +250,13 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > } > > putchar('\n'); > - > - do_ls(h, newpath, cur_depth+1, show_perms); > + > + error = do_ls(h, newpath, cur_depth+1, show_perms, error, !val); > } > free(e); > free(newpath); > + > + return error; > } > > static void > @@ -448,7 +462,10 @@ perform(enum mode mode, int optind, int argc, char **argv, struct xs_handle *xsh > break; > } > case MODE_ls: { > - do_ls(xsh, argv[optind], 0, prefix); > + int error = do_ls(xsh, argv[optind], 0, prefix, 0, 0); > + if (error) { > + errx(1, "Errors during traversal. Last error: %s", strerror(error)); > + } > optind++; > break; > } > -- > 2.21.0 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately 2020-03-20 11:03 ` Paul Durrant @ 2020-03-20 12:30 ` David Woodhouse 2020-03-20 14:36 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately [and 1 more messages] Ian Jackson 0 siblings, 1 reply; 17+ messages in thread From: David Woodhouse @ 2020-03-20 12:30 UTC (permalink / raw) To: paul, xen-devel Cc: 'Juergen Gross', 'Ian Jackson', 'Wei Liu' [-- Attachment #1.1: Type: text/plain, Size: 551 bytes --] On Fri, 2020-03-20 at 11:03 +0000, Paul Durrant wrote: > > + val = NULL; > > + /* Don't repeat an error message if xs_read() already failed */ > > + if (val) > > How can the code get here? The line above the comment always sets val to NULL. Oops, I don't think it was supposed to. It was just stray testing which I neglected to remove. Will repost with that fixed, if that's the way we want to go. It's patch 1 which I really care about; this part is just yak shaving at Ian's prompting. [-- Attachment #1.2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5174 bytes --] [-- Attachment #2: Type: text/plain, Size: 157 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately [and 1 more messages] 2020-03-20 12:30 ` David Woodhouse @ 2020-03-20 14:36 ` Ian Jackson 0 siblings, 0 replies; 17+ messages in thread From: Ian Jackson @ 2020-03-20 14:36 UTC (permalink / raw) To: David Woodhouse; +Cc: Juergen Gross, xen-devel, Wei Liu, paul David Woodhouse writes ("Re: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately"): > It's patch 1 which I really care about; this part is just yak shaving > at Ian's prompting. Hi. Thanks for working on this cleanup. I confess that while reviewing your code I felt a bit confused and thickheaded. (I have had a difficult week.) So I think what I am about to say may not be as useful or true as it ought to be.... You'll see I've put my R-b on patch 1. Despite that, given what appears here in patch 2, I think you might want to try to look into using xs_transaction_{start,end} in xenstore-ls. That might make things simpler. In particular I think this because I'm not sure I fully understand the implications of when to ignore EACCES errors, purportedly on the grounds that they might be caused by concurrent updates to xenstore. My inclination is to think that this can't be right. But doing everything in a transaction would completely eliminate this issue. Regardless of that, it is still a bug that xenstore-ls ignores errors far too much and I very much welcome your efforts to fix this in 2/. I don't know if I already explained my overall theory about this but: ISTM that xenstore-ls should exit 0 iff it was able to find and print all the information requested. If it exits non-0 it should have printed at least one thing to stderr. David Woodhouse writes ("[PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately"): > From: David Woodhouse <dwmw@amazon.co.uk> > > Report only one error to stderr for each node, regardless of whether it's > xs_read, xs_get_permissions or xs_directory on the child that fails. > > Always exit with a non-zero code if any failure happens, reporting the > last error that occurred. I think though, that this patch has some remnants of previous iterations in it. > -static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms) > +static int do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms, int error, int ignore_errors) > { > char **e; > char *newpath, *val; > @@ -150,9 +150,16 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > > e = xs_directory(h, XBT_NULL, path, &num); > if (e == NULL) { > - if (cur_depth && errno == ENOENT) { > - /* If a node disappears while recursing, silently move on. */ > - return; ISTM that this code ought to be retained. It is still the case that you want to ignore ENOENT. (Unless you do the transaction thing.) > + if (cur_depth && (errno == ENOENT || errno == EACCES)) { > + /* > + * If a node disappears or becomes inaccessible while traversing, > + * only print an error if previous operations on this node haven't > + * done do. Then move on. > + */ > + error = errno; > + if (!ignore_errors) > + warn("xs_directory (%s)", path); > + return error; So this branch is just for EACCES ? What is the justification for handling EACCES specially ? Maybe it is the only "expected" error ? > @@ -197,7 +204,8 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > > /* Print value */ > if (val == NULL) { > - printf(":\n"); > + error = errno; > + printf(": (%s)", strerror(error)); Well done for making the output unambiguous. `= "..."' vs `: (...)' > @@ -222,7 +230,11 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms > if (show_perms) { > perms = xs_get_permissions(h, XBT_NULL, newpath, &nperms); > if (perms == NULL) { > - warn("\ncould not access permissions for %s", e[i]); > + error = errno; > + val = NULL; > + /* Don't repeat an error message if xs_read() already failed */ > + if (val) > + warn("could not access permissions for %s", e[i]); Maybe this wants the same ENOENT handling as before ? (Unless you do xs_transaction_*.) IDK what the rules are for xs_get_permissions.[1] Does it require read access ? If so then I think it might be better to simply skip the xs_get_permissions call if val==NULL. Especially if you do the xs_transaction_* thing - since in that case the xs_get_permissions call seems doomed. I hope this review was of some use. Please chat to me on irc or reply by email if it doesn't seem to make sense. Ian. [1] The in-tree docs refer here: https://wiki.xen.org/wiki/XenBus#Permissions but that part of the wiki page does not say what permissions are needed for GET_PERMS and SET_PERMS themselves. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-19 20:40 [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating David Woodhouse 2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse @ 2020-03-20 6:34 ` Jürgen Groß 2020-03-20 11:19 ` David Woodhouse 2020-03-20 14:11 ` Ian Jackson 1 sibling, 2 replies; 17+ messages in thread From: Jürgen Groß @ 2020-03-20 6:34 UTC (permalink / raw) To: David Woodhouse, xen-devel; +Cc: Ian Jackson, Wei Liu On 19.03.20 21:40, David Woodhouse wrote: > From: David Woodhouse <dwmw@amazon.co.uk> > > The do_ls() function has somewhat inconsistent handling of errors. > > If reading the node's contents with xs_read() fails, then do_ls() will > just quietly not display the contents. > > If reading the node's permissions with xs_get_permissions() fails, then > do_ls() will print a warning, continue, and ultimately won't exit with > an error code (unless another error happens). > > If recursing into the node with xs_directory() fails, then do_ls() will > abort immediately, not printing any further nodes. > > For persistent failure modes — such as ENOENT because a node has been > removed, or EACCES because it has had its permisions changed since the > xs_directory() on the parent directory returned its name — it's > obviously quite likely that if either of the first two errors occur for > a given node, then so will the third and thus xenstore-ls will abort. > > The ENOENT one is actually a fairly common case, and has caused tools to > fail to clean up a network device because it *apparently* already > doesn't exist in xenstore. > > There is a school of thought that says, "Well, xenstore-ls returned an > error. So the tools should not trust its output." > > The natural corollary of this would surely be that the tools must re-run > xenstore-ls as many times as is necessary until its manages to exit > without hitting the race condition. I am not keen on that conclusion. > > For the specific case of ENOENT it seems reasonable to declare that, > but for the timing, we might as well just not have seen that node at > all when calling xs_directory() for the parent. By ignoring the error, > we give acceptable output. Have you thought about the possibility to do the complete handling in a single transaction? This would ensure a complete consistent picture from the time the operation has started. Any inconsistency should be reported as an error then. Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß @ 2020-03-20 11:19 ` David Woodhouse 2020-03-20 11:26 ` Jürgen Groß 2020-03-20 14:11 ` Ian Jackson 1 sibling, 1 reply; 17+ messages in thread From: David Woodhouse @ 2020-03-20 11:19 UTC (permalink / raw) To: Jürgen Groß, xen-devel; +Cc: Ian Jackson, Wei Liu [-- Attachment #1.1: Type: text/plain, Size: 465 bytes --] On Fri, 2020-03-20 at 07:34 +0100, Jürgen Groß wrote: > Have you thought about the possibility to do the complete handling in a > single transaction? This would ensure a complete consistent picture > from the time the operation has started. Any inconsistency should be > reported as an error then. Hm, how would that work? Do I have to buffer *all* the output from do_ls() and then only print it if/when xs_transaction_end() succeeds, else try again? [-- Attachment #1.2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5174 bytes --] [-- Attachment #2: Type: text/plain, Size: 157 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 11:19 ` David Woodhouse @ 2020-03-20 11:26 ` Jürgen Groß 2020-03-20 14:12 ` Ian Jackson 0 siblings, 1 reply; 17+ messages in thread From: Jürgen Groß @ 2020-03-20 11:26 UTC (permalink / raw) To: David Woodhouse, xen-devel; +Cc: Ian Jackson, Wei Liu On 20.03.20 12:19, David Woodhouse wrote: > On Fri, 2020-03-20 at 07:34 +0100, Jürgen Groß wrote: >> Have you thought about the possibility to do the complete handling in a >> single transaction? This would ensure a complete consistent picture >> from the time the operation has started. Any inconsistency should be >> reported as an error then. > > Hm, how would that work? Do I have to buffer *all* the output from > do_ls() and then only print it if/when xs_transaction_end() succeeds, > else try again? No, you just don't care for the transaction to succeed or fail (IMO it should never fail as you are reading only). So just wrap everything into a transaction. I might be wrong, of course. :-) Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 11:26 ` Jürgen Groß @ 2020-03-20 14:12 ` Ian Jackson 2020-03-20 14:58 ` David Woodhouse 0 siblings, 1 reply; 17+ messages in thread From: Ian Jackson @ 2020-03-20 14:12 UTC (permalink / raw) To: Jürgen Groß; +Cc: xen-devel, David Woodhouse, Wei Liu Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): > No, you just don't care for the transaction to succeed or fail (IMO it > should never fail as you are reading only). > > So just wrap everything into a transaction. Yes. xenstored will do the needed buffering. I think in principle there is a risk here that the transaction might run for a long time, if the output from xenstore-ls goes to something that blocks (eg a pager) and can't be written all at once. But if this is a problem it is a problem afflicting xenstored, not xenstore-ls. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 14:12 ` Ian Jackson @ 2020-03-20 14:58 ` David Woodhouse 2020-03-20 15:23 ` Jürgen Groß 0 siblings, 1 reply; 17+ messages in thread From: David Woodhouse @ 2020-03-20 14:58 UTC (permalink / raw) To: Ian Jackson, Jürgen Groß; +Cc: xen-devel, Wei Liu [-- Attachment #1.1: Type: text/plain, Size: 1923 bytes --] On Fri, 2020-03-20 at 14:12 +0000, Ian Jackson wrote: > Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): > > No, you just don't care for the transaction to succeed or fail (IMO it > > should never fail as you are reading only). > > > > So just wrap everything into a transaction. > > Yes. xenstored will do the needed buffering. > > I think in principle there is a risk here that the transaction might > run for a long time, if the output from xenstore-ls goes to something > that blocks (eg a pager) and can't be written all at once. > > But if this is a problem it is a problem afflicting xenstored, not > xenstore-ls. > > Ian. So if I do this... --- a/tools/xenstore/xenstore_client.c +++ b/tools/xenstore/xenstore_client.c @@ -658,7 +658,6 @@ main(int argc, char **argv) case MODE_write: transaction = (argc - switch_argv - optind) > 2; break; - case MODE_ls: case MODE_watch: transaction = 0; break; @@ -683,6 +682,7 @@ again: xth = xs_transaction_start(xsh); if (xth == XBT_NULL) errx(1, "couldn't start transaction"); + printf("started transaction\n"); } ret = perform(mode, optind, argc - switch_argv, argv + switch_argv, xsh, xth, prefix, tidy, upto, recurse, nr_watches, raw); ... and then repeat my test case as shown in [PATCH 1/1], I should no longer see... xenstore-ls: could not access permissions for 407: No such file or directory xenstore-ls: in xs_directory (/local/domain/2/foo/407): No such file or directory 0 But it does still happen. And even if I turn the errx() into a warn() to stop it aborting, and add a warn() when the xs_transaction_end() returns EAGAIN... that isn't happening either. I'm just getting inconsistent data, within a transaction. [-- Attachment #1.2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5174 bytes --] [-- Attachment #2: Type: text/plain, Size: 157 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 14:58 ` David Woodhouse @ 2020-03-20 15:23 ` Jürgen Groß 2020-03-30 16:37 ` Ian Jackson 0 siblings, 1 reply; 17+ messages in thread From: Jürgen Groß @ 2020-03-20 15:23 UTC (permalink / raw) To: David Woodhouse, Ian Jackson; +Cc: xen-devel, Wei Liu On 20.03.20 15:58, David Woodhouse wrote: > On Fri, 2020-03-20 at 14:12 +0000, Ian Jackson wrote: >> Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): >>> No, you just don't care for the transaction to succeed or fail (IMO it >>> should never fail as you are reading only). >>> >>> So just wrap everything into a transaction. >> >> Yes. xenstored will do the needed buffering. >> >> I think in principle there is a risk here that the transaction might >> run for a long time, if the output from xenstore-ls goes to something >> that blocks (eg a pager) and can't be written all at once. >> >> But if this is a problem it is a problem afflicting xenstored, not >> xenstore-ls. >> >> Ian. > > > So if I do this... > > --- a/tools/xenstore/xenstore_client.c > +++ b/tools/xenstore/xenstore_client.c > @@ -658,7 +658,6 @@ main(int argc, char **argv) > case MODE_write: > transaction = (argc - switch_argv - optind) > 2; > break; > - case MODE_ls: > case MODE_watch: > transaction = 0; > break; > @@ -683,6 +682,7 @@ again: > xth = xs_transaction_start(xsh); > if (xth == XBT_NULL) > errx(1, "couldn't start transaction"); > + printf("started transaction\n"); > } > > ret = perform(mode, optind, argc - switch_argv, argv + switch_argv, xsh, xth, prefix, tidy, upto, recurse, nr_watches, raw); > > > ... and then repeat my test case as shown in [PATCH 1/1], I should no > longer see... > > xenstore-ls: > could not access permissions for 407: No such file or directory > xenstore-ls: in xs_directory (/local/domain/2/foo/407): No such file or directory > 0 > > > But it does still happen. And even if I turn the errx() into a warn() > to stop it aborting, and add a warn() when the xs_transaction_end() > returns EAGAIN... that isn't happening either. I'm just getting > inconsistent data, within a transaction. Hmm, yes, thinking more about it: a non-transactional write of a node which hasn't been written or read by an ongoing transaction is not handled in a special way. This could be changed, but would require some structural changes. Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 15:23 ` Jürgen Groß @ 2020-03-30 16:37 ` Ian Jackson 2020-03-30 16:40 ` Ian Jackson 0 siblings, 1 reply; 17+ messages in thread From: Ian Jackson @ 2020-03-30 16:37 UTC (permalink / raw) To: Jürgen Groß; +Cc: xen-devel, David Woodhouse, Wei Liu Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): > On 20.03.20 15:58, David Woodhouse wrote: > > But it does still happen. And even if I turn the errx() into a warn() > > to stop it aborting, and add a warn() when the xs_transaction_end() > > returns EAGAIN... that isn't happening either. I'm just getting > > inconsistent data, within a transaction. > > Hmm, yes, thinking more about it: a non-transactional write of a node > which hasn't been written or read by an ongoing transaction is not > handled in a special way. This could be changed, but would require some > structural changes. And making a node visible by XS_DIRECTORY[_PART] doesn't count as reading it. But it does count as reading the parent ? In principle adding or removing a node could be made to count as a change to the containing directory. But I don't think doing this as a response to David's issue is sensible. Ian. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-30 16:37 ` Ian Jackson @ 2020-03-30 16:40 ` Ian Jackson 2020-08-14 14:23 ` David Woodhouse 0 siblings, 1 reply; 17+ messages in thread From: Ian Jackson @ 2020-03-30 16:40 UTC (permalink / raw) To: Jürgen Groß, David Woodhouse, xen-devel, Wei Liu Ian Jackson writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): > And making a node visible by XS_DIRECTORY[_PART] doesn't count as > reading it. But it does count as reading the parent ? > In principle adding or removing a node could be made to count as a > change to the containing directory. But I don't think doing this as a > response to David's issue is sensible. So, err, putting that together and reviewing the state of the world: I still think David's 1/ patch is good. I think my comments on 2/ are still applicable, apart from the bits where I suggest using a transaction will fix all this. David: do you now intend to revise 2/ according to our comments ? Everyone else: is there some reason we shouldn't commit 1/ immediately ? Ian. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-30 16:40 ` Ian Jackson @ 2020-08-14 14:23 ` David Woodhouse 2020-08-14 15:19 ` Wei Liu 0 siblings, 1 reply; 17+ messages in thread From: David Woodhouse @ 2020-08-14 14:23 UTC (permalink / raw) To: Ian Jackson, Jürgen Groß, xen-devel, Wei Liu [-- Attachment #1: Type: text/plain, Size: 1302 bytes --] On Mon, 2020-03-30 at 17:40 +0100, Ian Jackson wrote: > Ian Jackson writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do > not abort xenstore-ls if a node disappears while iterating"): > > And making a node visible by XS_DIRECTORY[_PART] doesn't count as > > reading it. But it does count as reading the parent ? > > In principle adding or removing a node could be made to count as a > > change to the containing directory. But I don't think doing this > > as a > > response to David's issue is sensible. > > So, err, putting that together and reviewing the state of the world: > > I still think David's 1/ patch is good. > > I think my comments on 2/ are still applicable, apart from the > bits where I suggest using a transaction will fix all this. > > David: do you now intend to revise 2/ according to our comments ? I confess to having slightly lost the will to live, but sure. If #1 gets applied and actually fixes the bug that was biting us in production and which I started trying to upstream in March 2019, I'll happily revisit those subsequent cleanups you asked for. > Everyone else: is there some reason we shouldn't commit 1/ > immediately ? It was deliberately split out so that it could indeed be applied immediately when it was posted in March. [-- Attachment #2: smime.p7s --] [-- Type: application/x-pkcs7-signature, Size: 5174 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-08-14 14:23 ` David Woodhouse @ 2020-08-14 15:19 ` Wei Liu 0 siblings, 0 replies; 17+ messages in thread From: Wei Liu @ 2020-08-14 15:19 UTC (permalink / raw) To: David Woodhouse; +Cc: Ian Jackson, Jürgen Groß, xen-devel, Wei Liu On Fri, Aug 14, 2020 at 03:23:10PM +0100, David Woodhouse wrote: > On Mon, 2020-03-30 at 17:40 +0100, Ian Jackson wrote: > > Ian Jackson writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do > > not abort xenstore-ls if a node disappears while iterating"): > > > And making a node visible by XS_DIRECTORY[_PART] doesn't count as > > > reading it. But it does count as reading the parent ? > > > In principle adding or removing a node could be made to count as a > > > change to the containing directory. But I don't think doing this > > > as a > > > response to David's issue is sensible. > > > > So, err, putting that together and reviewing the state of the world: > > > > I still think David's 1/ patch is good. > > Everyone else: is there some reason we shouldn't commit 1/ > > immediately ? OK I will turn this into an ack and apply this patch. Sorry this fell off my radar somehow. Wei. > > It was deliberately split out so that it could indeed be applied > immediately when it was posted in March. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß 2020-03-20 11:19 ` David Woodhouse @ 2020-03-20 14:11 ` Ian Jackson 2020-03-20 14:12 ` Jürgen Groß 1 sibling, 1 reply; 17+ messages in thread From: Ian Jackson @ 2020-03-20 14:11 UTC (permalink / raw) To: Jürgen Groß; +Cc: xen-devel, David Woodhouse, Wei Liu Jürgen Groß writes ("Re: [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): > On 19.03.20 21:40, David Woodhouse wrote: > > From: David Woodhouse <dwmw@amazon.co.uk> ... > > For the specific case of ENOENT it seems reasonable to declare that, > > but for the timing, we might as well just not have seen that node at > > all when calling xs_directory() for the parent. By ignoring the error, > > we give acceptable output. Thanks. > Have you thought about the possibility to do the complete handling in a > single transaction? This would ensure a complete consistent picture > from the time the operation has started. Any inconsistency should be > reported as an error then. I think this would be a good idea (not least because it would mean that callers of xenstore-ls wouldn't see inconsistent data) but I think it would be an enhancement. For now, for David's original patch: Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> If and when we introduce a transaction, David's 1/ should be reverted as indeed then even ENOENT would indicate some kind of serious problem. Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating 2020-03-20 14:11 ` Ian Jackson @ 2020-03-20 14:12 ` Jürgen Groß 0 siblings, 0 replies; 17+ messages in thread From: Jürgen Groß @ 2020-03-20 14:12 UTC (permalink / raw) To: Ian Jackson; +Cc: xen-devel, David Woodhouse, Wei Liu On 20.03.20 15:11, Ian Jackson wrote: > Jürgen Groß writes ("Re: [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"): >> On 19.03.20 21:40, David Woodhouse wrote: >>> From: David Woodhouse <dwmw@amazon.co.uk> > ... >>> For the specific case of ENOENT it seems reasonable to declare that, >>> but for the timing, we might as well just not have seen that node at >>> all when calling xs_directory() for the parent. By ignoring the error, >>> we give acceptable output. > > Thanks. > >> Have you thought about the possibility to do the complete handling in a >> single transaction? This would ensure a complete consistent picture >> from the time the operation has started. Any inconsistency should be >> reported as an error then. > > I think this would be a good idea (not least because it would mean > that callers of xenstore-ls wouldn't see inconsistent data) but I > think it would be an enhancement. > > For now, for David's original patch: > > Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> > > If and when we introduce a transaction, David's 1/ should be reverted > as indeed then even ENOENT would indicate some kind of serious > problem. Fine with me. Juergen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2020-08-14 15:19 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-03-19 20:40 [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating David Woodhouse 2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse 2020-03-20 11:03 ` Paul Durrant 2020-03-20 12:30 ` David Woodhouse 2020-03-20 14:36 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately [and 1 more messages] Ian Jackson 2020-03-20 6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß 2020-03-20 11:19 ` David Woodhouse 2020-03-20 11:26 ` Jürgen Groß 2020-03-20 14:12 ` Ian Jackson 2020-03-20 14:58 ` David Woodhouse 2020-03-20 15:23 ` Jürgen Groß 2020-03-30 16:37 ` Ian Jackson 2020-03-30 16:40 ` Ian Jackson 2020-08-14 14:23 ` David Woodhouse 2020-08-14 15:19 ` Wei Liu 2020-03-20 14:11 ` Ian Jackson 2020-03-20 14:12 ` Jürgen Groß
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.