All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
@ 2020-03-19 20:40 David Woodhouse
  2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse
  2020-03-20  6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß
  0 siblings, 2 replies; 17+ messages in thread
From: David Woodhouse @ 2020-03-19 20:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu

From: David Woodhouse <dwmw@amazon.co.uk>

The do_ls() function has somewhat inconsistent handling of errors.

If reading the node's contents with xs_read() fails, then do_ls() will
just quietly not display the contents.

If reading the node's permissions with xs_get_permissions() fails, then
do_ls() will print a warning, continue, and ultimately won't exit with
an error code (unless another error happens).

If recursing into the node with xs_directory() fails, then do_ls() will
abort immediately, not printing any further nodes.

For persistent failure modes — such as ENOENT because a node has been
removed, or EACCES because it has had its permisions changed since the
xs_directory() on the parent directory returned its name — it's
obviously quite likely that if either of the first two errors occur for
a given node, then so will the third and thus xenstore-ls will abort.

The ENOENT one is actually a fairly common case, and has caused tools to
fail to clean up a network device because it *apparently* already
doesn't exist in xenstore.

There is a school of thought that says, "Well, xenstore-ls returned an
error. So the tools should not trust its output."

The natural corollary of this would surely be that the tools must re-run
xenstore-ls as many times as is necessary until its manages to exit
without hitting the race condition. I am not keen on that conclusion.

For the specific case of ENOENT it seems reasonable to declare that,
but for the timing, we might as well just not have seen that node at
all when calling xs_directory() for the parent. By ignoring the error,
we give acceptable output.

The issue can be reproduced as follows:

(dom0) # for a in `seq 1 1000` ; do
              xenstore-write /local/domain/2/foo/$a $a ;
         done

Now simultaneously:

(dom0) # for a in `seq 1 999` ; do
              xenstore-rm /local/domain/2/foo/$a ;
         done
(dom2) # while true ; do
              ./xenstore-ls -p /local/domain/2/foo | grep -c 1000 ;
         done

We should expect to see node 1000 in the output, every time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 tools/xenstore/xenstore_client.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c
index 3afc630ab8..ae7ed3eb9e 100644
--- a/tools/xenstore/xenstore_client.c
+++ b/tools/xenstore/xenstore_client.c
@@ -148,14 +148,20 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
     int i;
     unsigned int num, len;
 
+    e = xs_directory(h, XBT_NULL, path, &num);
+    if (e == NULL) {
+        if (cur_depth && errno == ENOENT) {
+            /* If a node disappears while recursing, silently move on. */
+            return;
+        }
+
+        err(1, "xs_directory (%s)", path);
+    }
+
     newpath = malloc(STRING_MAX);
     if (!newpath)
       err(1, "malloc in do_ls");
 
-    e = xs_directory(h, XBT_NULL, path, &num);
-    if (e == NULL)
-        err(1, "xs_directory (%s)", path);
-
     for (i = 0; i<num; i++) {
         char buf[MAX_STRLEN(unsigned int)+1];
         struct xs_permissions *perms;
-- 
2.21.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately
  2020-03-19 20:40 [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating David Woodhouse
@ 2020-03-19 20:40 ` David Woodhouse
  2020-03-20 11:03   ` Paul Durrant
  2020-03-20  6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß
  1 sibling, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2020-03-19 20:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu

From: David Woodhouse <dwmw@amazon.co.uk>

Report only one error to stderr for each node, regardless of whether it's
xs_read, xs_get_permissions or xs_directory on the child that fails.

Always exit with a non-zero code if any failure happens, reporting the
last error that occurred.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 tools/xenstore/xenstore_client.c | 35 ++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c
index ae7ed3eb9e..0c891961ae 100644
--- a/tools/xenstore/xenstore_client.c
+++ b/tools/xenstore/xenstore_client.c
@@ -140,7 +140,7 @@ static int show_whole_path = 0;
 
 #define MIN(a, b) (((a) < (b))? (a) : (b))
 
-static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms)
+static int do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms, int error, int ignore_errors)
 {
     char **e;
     char *newpath, *val;
@@ -150,9 +150,16 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
 
     e = xs_directory(h, XBT_NULL, path, &num);
     if (e == NULL) {
-        if (cur_depth && errno == ENOENT) {
-            /* If a node disappears while recursing, silently move on. */
-            return;
+        if (cur_depth && (errno == ENOENT || errno == EACCES)) {
+            /*
+             * If a node disappears or becomes inaccessible while traversing,
+             * only print an error if previous operations on this node haven't
+             * done do. Then move on.
+             */
+            error = errno;
+            if (!ignore_errors)
+                warn("xs_directory (%s)", path);
+            return error;
         }
 
         err(1, "xs_directory (%s)", path);
@@ -197,7 +204,8 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
 
         /* Print value */
         if (val == NULL) {
-            printf(":\n");
+            error = errno;
+            printf(": (%s)", strerror(error));
         }
         else {
             if (max_width < (linewid + len + TAG_LEN)) {
@@ -222,7 +230,11 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
         if (show_perms) {
             perms = xs_get_permissions(h, XBT_NULL, newpath, &nperms);
             if (perms == NULL) {
-                warn("\ncould not access permissions for %s", e[i]);
+                error = errno;
+                val = NULL;
+                /* Don't repeat an error message if xs_read() already failed */
+                if (val)
+                    warn("could not access permissions for %s", e[i]);
             }
             else {
                 int i;
@@ -238,11 +250,13 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
         }
 
         putchar('\n');
-            
-        do_ls(h, newpath, cur_depth+1, show_perms); 
+
+        error = do_ls(h, newpath, cur_depth+1, show_perms, error, !val);
     }
     free(e);
     free(newpath);
+
+    return error;
 }
 
 static void
@@ -448,7 +462,10 @@ perform(enum mode mode, int optind, int argc, char **argv, struct xs_handle *xsh
             break;
         }
         case MODE_ls: {
-            do_ls(xsh, argv[optind], 0, prefix);
+            int error = do_ls(xsh, argv[optind], 0, prefix, 0, 0);
+            if (error) {
+                errx(1, "Errors during traversal. Last error: %s", strerror(error));
+            }
             optind++;
             break;
         }
-- 
2.21.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-19 20:40 [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating David Woodhouse
  2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse
@ 2020-03-20  6:34 ` Jürgen Groß
  2020-03-20 11:19   ` David Woodhouse
  2020-03-20 14:11   ` Ian Jackson
  1 sibling, 2 replies; 17+ messages in thread
From: Jürgen Groß @ 2020-03-20  6:34 UTC (permalink / raw)
  To: David Woodhouse, xen-devel; +Cc: Ian Jackson, Wei Liu

On 19.03.20 21:40, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The do_ls() function has somewhat inconsistent handling of errors.
> 
> If reading the node's contents with xs_read() fails, then do_ls() will
> just quietly not display the contents.
> 
> If reading the node's permissions with xs_get_permissions() fails, then
> do_ls() will print a warning, continue, and ultimately won't exit with
> an error code (unless another error happens).
> 
> If recursing into the node with xs_directory() fails, then do_ls() will
> abort immediately, not printing any further nodes.
> 
> For persistent failure modes — such as ENOENT because a node has been
> removed, or EACCES because it has had its permisions changed since the
> xs_directory() on the parent directory returned its name — it's
> obviously quite likely that if either of the first two errors occur for
> a given node, then so will the third and thus xenstore-ls will abort.
> 
> The ENOENT one is actually a fairly common case, and has caused tools to
> fail to clean up a network device because it *apparently* already
> doesn't exist in xenstore.
> 
> There is a school of thought that says, "Well, xenstore-ls returned an
> error. So the tools should not trust its output."
> 
> The natural corollary of this would surely be that the tools must re-run
> xenstore-ls as many times as is necessary until its manages to exit
> without hitting the race condition. I am not keen on that conclusion.
> 
> For the specific case of ENOENT it seems reasonable to declare that,
> but for the timing, we might as well just not have seen that node at
> all when calling xs_directory() for the parent. By ignoring the error,
> we give acceptable output.

Have you thought about the possibility to do the complete handling in a
single transaction? This would ensure a complete consistent picture
from the time the operation has started. Any inconsistency should be
reported as an error then.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately
  2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse
@ 2020-03-20 11:03   ` Paul Durrant
  2020-03-20 12:30     ` David Woodhouse
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Durrant @ 2020-03-20 11:03 UTC (permalink / raw)
  To: 'David Woodhouse', xen-devel
  Cc: 'Juergen Gross', 'Ian Jackson', 'Wei Liu'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of David Woodhouse
> Sent: 19 March 2020 20:40
> To: xen-devel@lists.xenproject.org
> Cc: Juergen Gross <jgross@suse.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org>
> Subject: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit
> appropriately
> 
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Report only one error to stderr for each node, regardless of whether it's
> xs_read, xs_get_permissions or xs_directory on the child that fails.
> 
> Always exit with a non-zero code if any failure happens, reporting the
> last error that occurred.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  tools/xenstore/xenstore_client.c | 35 ++++++++++++++++++++++++--------
>  1 file changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c
> index ae7ed3eb9e..0c891961ae 100644
> --- a/tools/xenstore/xenstore_client.c
> +++ b/tools/xenstore/xenstore_client.c
> @@ -140,7 +140,7 @@ static int show_whole_path = 0;
> 
>  #define MIN(a, b) (((a) < (b))? (a) : (b))
> 
> -static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms)
> +static int do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms, int error, int
> ignore_errors)
>  {
>      char **e;
>      char *newpath, *val;
> @@ -150,9 +150,16 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
> 
>      e = xs_directory(h, XBT_NULL, path, &num);
>      if (e == NULL) {
> -        if (cur_depth && errno == ENOENT) {
> -            /* If a node disappears while recursing, silently move on. */
> -            return;
> +        if (cur_depth && (errno == ENOENT || errno == EACCES)) {
> +            /*
> +             * If a node disappears or becomes inaccessible while traversing,
> +             * only print an error if previous operations on this node haven't
> +             * done do. Then move on.
> +             */
> +            error = errno;
> +            if (!ignore_errors)
> +                warn("xs_directory (%s)", path);
> +            return error;
>          }
> 
>          err(1, "xs_directory (%s)", path);
> @@ -197,7 +204,8 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
> 
>          /* Print value */
>          if (val == NULL) {
> -            printf(":\n");
> +            error = errno;
> +            printf(": (%s)", strerror(error));
>          }
>          else {
>              if (max_width < (linewid + len + TAG_LEN)) {
> @@ -222,7 +230,11 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
>          if (show_perms) {
>              perms = xs_get_permissions(h, XBT_NULL, newpath, &nperms);
>              if (perms == NULL) {
> -                warn("\ncould not access permissions for %s", e[i]);
> +                error = errno;
> +                val = NULL;
> +                /* Don't repeat an error message if xs_read() already failed */
> +                if (val)

How can the code get here? The line above the comment always sets val to NULL.

  Paul

> +                    warn("could not access permissions for %s", e[i]);
>              }
>              else {
>                  int i;
> @@ -238,11 +250,13 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
>          }
> 
>          putchar('\n');
> -
> -        do_ls(h, newpath, cur_depth+1, show_perms);
> +
> +        error = do_ls(h, newpath, cur_depth+1, show_perms, error, !val);
>      }
>      free(e);
>      free(newpath);
> +
> +    return error;
>  }
> 
>  static void
> @@ -448,7 +462,10 @@ perform(enum mode mode, int optind, int argc, char **argv, struct xs_handle *xsh
>              break;
>          }
>          case MODE_ls: {
> -            do_ls(xsh, argv[optind], 0, prefix);
> +            int error = do_ls(xsh, argv[optind], 0, prefix, 0, 0);
> +            if (error) {
> +                errx(1, "Errors during traversal. Last error: %s", strerror(error));
> +            }
>              optind++;
>              break;
>          }
> --
> 2.21.0
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20  6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß
@ 2020-03-20 11:19   ` David Woodhouse
  2020-03-20 11:26     ` Jürgen Groß
  2020-03-20 14:11   ` Ian Jackson
  1 sibling, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2020-03-20 11:19 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel; +Cc: Ian Jackson, Wei Liu


[-- Attachment #1.1: Type: text/plain, Size: 465 bytes --]

On Fri, 2020-03-20 at 07:34 +0100, Jürgen Groß wrote:
> Have you thought about the possibility to do the complete handling in a
> single transaction? This would ensure a complete consistent picture
> from the time the operation has started. Any inconsistency should be
> reported as an error then.

Hm, how would that work? Do I have to buffer *all* the output from
do_ls() and then only print it if/when xs_transaction_end() succeeds,
else try again?


[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20 11:19   ` David Woodhouse
@ 2020-03-20 11:26     ` Jürgen Groß
  2020-03-20 14:12       ` Ian Jackson
  0 siblings, 1 reply; 17+ messages in thread
From: Jürgen Groß @ 2020-03-20 11:26 UTC (permalink / raw)
  To: David Woodhouse, xen-devel; +Cc: Ian Jackson, Wei Liu

On 20.03.20 12:19, David Woodhouse wrote:
> On Fri, 2020-03-20 at 07:34 +0100, Jürgen Groß wrote:
>> Have you thought about the possibility to do the complete handling in a
>> single transaction? This would ensure a complete consistent picture
>> from the time the operation has started. Any inconsistency should be
>> reported as an error then.
> 
> Hm, how would that work? Do I have to buffer *all* the output from
> do_ls() and then only print it if/when xs_transaction_end() succeeds,
> else try again?

No, you just don't care for the transaction to succeed or fail (IMO it
should never fail as you are reading only).

So just wrap everything into a transaction.

I might be wrong, of course. :-)


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately
  2020-03-20 11:03   ` Paul Durrant
@ 2020-03-20 12:30     ` David Woodhouse
  2020-03-20 14:36       ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately [and 1 more messages] Ian Jackson
  0 siblings, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2020-03-20 12:30 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Juergen Gross', 'Ian Jackson', 'Wei Liu'


[-- Attachment #1.1: Type: text/plain, Size: 551 bytes --]

On Fri, 2020-03-20 at 11:03 +0000, Paul Durrant wrote:
> > +                val = NULL;
> > +                /* Don't repeat an error message if xs_read() already failed */
> > +                if (val)
> 
> How can the code get here? The line above the comment always sets val to NULL.

Oops, I don't think it was supposed to. It was just stray testing which
I neglected to remove.

Will repost with that fixed, if that's the way we want to go.

It's patch 1 which I really care about; this part is just yak shaving
at Ian's prompting.

[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20  6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß
  2020-03-20 11:19   ` David Woodhouse
@ 2020-03-20 14:11   ` Ian Jackson
  2020-03-20 14:12     ` Jürgen Groß
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Jackson @ 2020-03-20 14:11 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel, David Woodhouse, Wei Liu

Jürgen Groß writes ("Re: [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
> On 19.03.20 21:40, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
...
> > For the specific case of ENOENT it seems reasonable to declare that,
> > but for the timing, we might as well just not have seen that node at
> > all when calling xs_directory() for the parent. By ignoring the error,
> > we give acceptable output.

Thanks.

> Have you thought about the possibility to do the complete handling in a
> single transaction? This would ensure a complete consistent picture
> from the time the operation has started. Any inconsistency should be
> reported as an error then.

I think this would be a good idea (not least because it would mean
that callers of xenstore-ls wouldn't see inconsistent data) but I
think it would be an enhancement.

For now, for David's original patch:

Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>

If and when we introduce a transaction, David's 1/ should be reverted
as indeed then even ENOENT would indicate some kind of serious
problem.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20 14:11   ` Ian Jackson
@ 2020-03-20 14:12     ` Jürgen Groß
  0 siblings, 0 replies; 17+ messages in thread
From: Jürgen Groß @ 2020-03-20 14:12 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, David Woodhouse, Wei Liu

On 20.03.20 15:11, Ian Jackson wrote:
> Jürgen Groß writes ("Re: [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
>> On 19.03.20 21:40, David Woodhouse wrote:
>>> From: David Woodhouse <dwmw@amazon.co.uk>
> ...
>>> For the specific case of ENOENT it seems reasonable to declare that,
>>> but for the timing, we might as well just not have seen that node at
>>> all when calling xs_directory() for the parent. By ignoring the error,
>>> we give acceptable output.
> 
> Thanks.
> 
>> Have you thought about the possibility to do the complete handling in a
>> single transaction? This would ensure a complete consistent picture
>> from the time the operation has started. Any inconsistency should be
>> reported as an error then.
> 
> I think this would be a good idea (not least because it would mean
> that callers of xenstore-ls wouldn't see inconsistent data) but I
> think it would be an enhancement.
> 
> For now, for David's original patch:
> 
> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> If and when we introduce a transaction, David's 1/ should be reverted
> as indeed then even ENOENT would indicate some kind of serious
> problem.

Fine with me.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20 11:26     ` Jürgen Groß
@ 2020-03-20 14:12       ` Ian Jackson
  2020-03-20 14:58         ` David Woodhouse
  0 siblings, 1 reply; 17+ messages in thread
From: Ian Jackson @ 2020-03-20 14:12 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel, David Woodhouse, Wei Liu

Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
> No, you just don't care for the transaction to succeed or fail (IMO it
> should never fail as you are reading only).
> 
> So just wrap everything into a transaction.

Yes.  xenstored will do the needed buffering.

I think in principle there is a risk here that the transaction might
run for a long time, if the output from xenstore-ls goes to something
that blocks (eg a pager) and can't be written all at once.

But if this is a problem it is a problem afflicting xenstored, not
xenstore-ls.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately [and 1 more messages]
  2020-03-20 12:30     ` David Woodhouse
@ 2020-03-20 14:36       ` Ian Jackson
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Jackson @ 2020-03-20 14:36 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Juergen Gross, xen-devel, Wei Liu, paul

David Woodhouse writes ("Re: [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately"):
> It's patch 1 which I really care about; this part is just yak shaving
> at Ian's prompting.

Hi.  Thanks for working on this cleanup.

I confess that while reviewing your code I felt a bit confused and
thickheaded.  (I have had a difficult week.)  So I think what I am
about to say may not be as useful or true as it ought to be....

You'll see I've put my R-b on patch 1.  Despite that, given what
appears here in patch 2, I think you might want to try to look into
using xs_transaction_{start,end} in xenstore-ls.  That might make
things simpler.

In particular I think this because I'm not sure I fully understand the
implications of when to ignore EACCES errors, purportedly on the
grounds that they might be caused by concurrent updates to xenstore.
My inclination is to think that this can't be right.  But doing
everything in a transaction would completely eliminate this issue.


Regardless of that, it is still a bug that xenstore-ls ignores errors
far too much and I very much welcome your efforts to fix this in 2/.

I don't know if I already explained my overall theory about this but:
ISTM that xenstore-ls should exit 0 iff it was able to find and print
all the information requested.  If it exits non-0 it should have
printed at least one thing to stderr.


David Woodhouse writes ("[PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately"):
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Report only one error to stderr for each node, regardless of whether it's
> xs_read, xs_get_permissions or xs_directory on the child that fails.
> 
> Always exit with a non-zero code if any failure happens, reporting the
> last error that occurred.

I think though, that this patch has some remnants of previous
iterations in it.

> -static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms)
> +static int do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms, int error, int ignore_errors)
>  {
>      char **e;
>      char *newpath, *val;
> @@ -150,9 +150,16 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
>  
>      e = xs_directory(h, XBT_NULL, path, &num);
>      if (e == NULL) {
> -        if (cur_depth && errno == ENOENT) {
> -            /* If a node disappears while recursing, silently move on. */
> -            return;

ISTM that this code ought to be retained.  It is still the case that
you want to ignore ENOENT.  (Unless you do the transaction thing.)

> +        if (cur_depth && (errno == ENOENT || errno == EACCES)) {
> +            /*
> +             * If a node disappears or becomes inaccessible while traversing,
> +             * only print an error if previous operations on this node haven't
> +             * done do. Then move on.
> +             */
> +            error = errno;
> +            if (!ignore_errors)
> +                warn("xs_directory (%s)", path);
> +            return error;

So this branch is just for EACCES ?  What is the justification for
handling EACCES specially ?  Maybe it is the only "expected" error ?

> @@ -197,7 +204,8 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
>  
>          /* Print value */
>          if (val == NULL) {
> -            printf(":\n");
> +            error = errno;
> +            printf(": (%s)", strerror(error));

Well done for making the output unambiguous.  `= "..."' vs `: (...)'

> @@ -222,7 +230,11 @@ static void do_ls(struct xs_handle *h, char *path, int cur_depth, int show_perms
>          if (show_perms) {
>              perms = xs_get_permissions(h, XBT_NULL, newpath, &nperms);
>              if (perms == NULL) {
> -                warn("\ncould not access permissions for %s", e[i]);
> +                error = errno;
> +                val = NULL;
> +                /* Don't repeat an error message if xs_read() already failed */
> +                if (val)
> +                    warn("could not access permissions for %s", e[i]);

Maybe this wants the same ENOENT handling as before ?  (Unless you do
xs_transaction_*.)

IDK what the rules are for xs_get_permissions.[1] Does it require read
access ?  If so then I think it might be better to simply skip the
xs_get_permissions call if val==NULL.  Especially if you do the
xs_transaction_* thing - since in that case the xs_get_permissions
call seems doomed.

I hope this review was of some use.  Please chat to me on irc or reply
by email if it doesn't seem to make sense.

Ian.

[1] The in-tree docs refer here:
  https://wiki.xen.org/wiki/XenBus#Permissions
but that part of the wiki page does not say what permissions are
needed for GET_PERMS and SET_PERMS themselves.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20 14:12       ` Ian Jackson
@ 2020-03-20 14:58         ` David Woodhouse
  2020-03-20 15:23           ` Jürgen Groß
  0 siblings, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2020-03-20 14:58 UTC (permalink / raw)
  To: Ian Jackson, Jürgen Groß; +Cc: xen-devel, Wei Liu


[-- Attachment #1.1: Type: text/plain, Size: 1923 bytes --]

On Fri, 2020-03-20 at 14:12 +0000, Ian Jackson wrote:
> Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
> > No, you just don't care for the transaction to succeed or fail (IMO it
> > should never fail as you are reading only).
> > 
> > So just wrap everything into a transaction.
> 
> Yes.  xenstored will do the needed buffering.
> 
> I think in principle there is a risk here that the transaction might
> run for a long time, if the output from xenstore-ls goes to something
> that blocks (eg a pager) and can't be written all at once.
> 
> But if this is a problem it is a problem afflicting xenstored, not
> xenstore-ls.
> 
> Ian.


So if I do this...

--- a/tools/xenstore/xenstore_client.c
+++ b/tools/xenstore/xenstore_client.c
@@ -658,7 +658,6 @@ main(int argc, char **argv)
     case MODE_write:
        transaction = (argc - switch_argv - optind) > 2;
        break;
-    case MODE_ls:
     case MODE_watch:
        transaction = 0;
        break;
@@ -683,6 +682,7 @@ again:
        xth = xs_transaction_start(xsh);
        if (xth == XBT_NULL)
            errx(1, "couldn't start transaction");
+        printf("started transaction\n");
     }
 
     ret = perform(mode, optind, argc - switch_argv, argv + switch_argv, xsh, xth, prefix, tidy, upto, recurse, nr_watches, raw);


... and then repeat my test case as shown in [PATCH 1/1], I should no
longer see...

xenstore-ls: 
could not access permissions for 407: No such file or directory
xenstore-ls: in xs_directory (/local/domain/2/foo/407): No such file or directory
0


But it does still happen. And even if I turn the errx() into a warn()
to stop it aborting, and add a warn() when the xs_transaction_end()
returns EAGAIN... that isn't happening either. I'm just getting
inconsistent data, within a transaction.



[-- Attachment #1.2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20 14:58         ` David Woodhouse
@ 2020-03-20 15:23           ` Jürgen Groß
  2020-03-30 16:37             ` Ian Jackson
  0 siblings, 1 reply; 17+ messages in thread
From: Jürgen Groß @ 2020-03-20 15:23 UTC (permalink / raw)
  To: David Woodhouse, Ian Jackson; +Cc: xen-devel, Wei Liu

On 20.03.20 15:58, David Woodhouse wrote:
> On Fri, 2020-03-20 at 14:12 +0000, Ian Jackson wrote:
>> Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
>>> No, you just don't care for the transaction to succeed or fail (IMO it
>>> should never fail as you are reading only).
>>>
>>> So just wrap everything into a transaction.
>>
>> Yes.  xenstored will do the needed buffering.
>>
>> I think in principle there is a risk here that the transaction might
>> run for a long time, if the output from xenstore-ls goes to something
>> that blocks (eg a pager) and can't be written all at once.
>>
>> But if this is a problem it is a problem afflicting xenstored, not
>> xenstore-ls.
>>
>> Ian.
> 
> 
> So if I do this...
> 
> --- a/tools/xenstore/xenstore_client.c
> +++ b/tools/xenstore/xenstore_client.c
> @@ -658,7 +658,6 @@ main(int argc, char **argv)
>       case MODE_write:
>          transaction = (argc - switch_argv - optind) > 2;
>          break;
> -    case MODE_ls:
>       case MODE_watch:
>          transaction = 0;
>          break;
> @@ -683,6 +682,7 @@ again:
>          xth = xs_transaction_start(xsh);
>          if (xth == XBT_NULL)
>              errx(1, "couldn't start transaction");
> +        printf("started transaction\n");
>       }
>   
>       ret = perform(mode, optind, argc - switch_argv, argv + switch_argv, xsh, xth, prefix, tidy, upto, recurse, nr_watches, raw);
> 
> 
> ... and then repeat my test case as shown in [PATCH 1/1], I should no
> longer see...
> 
> xenstore-ls:
> could not access permissions for 407: No such file or directory
> xenstore-ls: in xs_directory (/local/domain/2/foo/407): No such file or directory
> 0
> 
> 
> But it does still happen. And even if I turn the errx() into a warn()
> to stop it aborting, and add a warn() when the xs_transaction_end()
> returns EAGAIN... that isn't happening either. I'm just getting
> inconsistent data, within a transaction.

Hmm, yes, thinking more about it: a non-transactional write of a node
which hasn't been written or read by an ongoing transaction is not
handled in a special way. This could be changed, but would require some
structural changes.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-20 15:23           ` Jürgen Groß
@ 2020-03-30 16:37             ` Ian Jackson
  2020-03-30 16:40               ` Ian Jackson
  0 siblings, 1 reply; 17+ messages in thread
From: Ian Jackson @ 2020-03-30 16:37 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel, David Woodhouse, Wei Liu

Jürgen Groß writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
> On 20.03.20 15:58, David Woodhouse wrote:
> > But it does still happen. And even if I turn the errx() into a warn()
> > to stop it aborting, and add a warn() when the xs_transaction_end()
> > returns EAGAIN... that isn't happening either. I'm just getting
> > inconsistent data, within a transaction.
> 
> Hmm, yes, thinking more about it: a non-transactional write of a node
> which hasn't been written or read by an ongoing transaction is not
> handled in a special way. This could be changed, but would require some
> structural changes.

And making a node visible by XS_DIRECTORY[_PART] doesn't count as
reading it.  But it does count as reading the parent ?
In principle adding or removing a node could be made to count as a
change to the containing directory.  But I don't think doing this as a
response to David's issue is sensible.

Ian.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-30 16:37             ` Ian Jackson
@ 2020-03-30 16:40               ` Ian Jackson
  2020-08-14 14:23                 ` David Woodhouse
  0 siblings, 1 reply; 17+ messages in thread
From: Ian Jackson @ 2020-03-30 16:40 UTC (permalink / raw)
  To: Jürgen Groß, David Woodhouse, xen-devel, Wei Liu

Ian Jackson writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating"):
> And making a node visible by XS_DIRECTORY[_PART] doesn't count as
> reading it.  But it does count as reading the parent ?
> In principle adding or removing a node could be made to count as a
> change to the containing directory.  But I don't think doing this as a
> response to David's issue is sensible.

So, err, putting that together and reviewing the state of the world:

I still think David's 1/ patch is good.

I think my comments on 2/ are still applicable, apart from the
bits where I suggest using a transaction will fix all this.

David: do you now intend to revise 2/ according to our comments ?

Everyone else: is there some reason we shouldn't commit 1/
immediately ?

Ian.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-03-30 16:40               ` Ian Jackson
@ 2020-08-14 14:23                 ` David Woodhouse
  2020-08-14 15:19                   ` Wei Liu
  0 siblings, 1 reply; 17+ messages in thread
From: David Woodhouse @ 2020-08-14 14:23 UTC (permalink / raw)
  To: Ian Jackson, Jürgen Groß, xen-devel, Wei Liu

[-- Attachment #1: Type: text/plain, Size: 1302 bytes --]

On Mon, 2020-03-30 at 17:40 +0100, Ian Jackson wrote:
> Ian Jackson writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do
> not abort xenstore-ls if a node disappears while iterating"):
> > And making a node visible by XS_DIRECTORY[_PART] doesn't count as
> > reading it.  But it does count as reading the parent ?
> > In principle adding or removing a node could be made to count as a
> > change to the containing directory.  But I don't think doing this
> > as a
> > response to David's issue is sensible.
> 
> So, err, putting that together and reviewing the state of the world:
> 
> I still think David's 1/ patch is good.
> 
> I think my comments on 2/ are still applicable, apart from the
> bits where I suggest using a transaction will fix all this.
> 
> David: do you now intend to revise 2/ according to our comments ?

I confess to having slightly lost the will to live, but sure. If #1
gets applied and actually fixes the bug that was biting us in
production and which I started trying to upstream in March 2019, I'll
happily revisit those subsequent cleanups you asked for.

> Everyone else: is there some reason we shouldn't commit 1/
> immediately ?

It was deliberately split out so that it could indeed be applied
immediately when it was posted in March.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
  2020-08-14 14:23                 ` David Woodhouse
@ 2020-08-14 15:19                   ` Wei Liu
  0 siblings, 0 replies; 17+ messages in thread
From: Wei Liu @ 2020-08-14 15:19 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Ian Jackson, Jürgen Groß, xen-devel, Wei Liu

On Fri, Aug 14, 2020 at 03:23:10PM +0100, David Woodhouse wrote:
> On Mon, 2020-03-30 at 17:40 +0100, Ian Jackson wrote:
> > Ian Jackson writes ("Re: [Xen-devel] [PATCH 1/2] tools/xenstore: Do
> > not abort xenstore-ls if a node disappears while iterating"):
> > > And making a node visible by XS_DIRECTORY[_PART] doesn't count as
> > > reading it.  But it does count as reading the parent ?
> > > In principle adding or removing a node could be made to count as a
> > > change to the containing directory.  But I don't think doing this
> > > as a
> > > response to David's issue is sensible.
> > 
> > So, err, putting that together and reviewing the state of the world:
> > 
> > I still think David's 1/ patch is good.
> > Everyone else: is there some reason we shouldn't commit 1/
> > immediately ?

OK I will turn this into an ack and apply this patch. Sorry this fell
off my radar somehow.

Wei.

> 
> It was deliberately split out so that it could indeed be applied
> immediately when it was posted in March.




^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-08-14 15:19 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-19 20:40 [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating David Woodhouse
2020-03-19 20:40 ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately David Woodhouse
2020-03-20 11:03   ` Paul Durrant
2020-03-20 12:30     ` David Woodhouse
2020-03-20 14:36       ` [Xen-devel] [PATCH 2/2] tools/xenstore: Accumulate errors in xenstore-ls and exit appropriately [and 1 more messages] Ian Jackson
2020-03-20  6:34 ` [Xen-devel] [PATCH 1/2] tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating Jürgen Groß
2020-03-20 11:19   ` David Woodhouse
2020-03-20 11:26     ` Jürgen Groß
2020-03-20 14:12       ` Ian Jackson
2020-03-20 14:58         ` David Woodhouse
2020-03-20 15:23           ` Jürgen Groß
2020-03-30 16:37             ` Ian Jackson
2020-03-30 16:40               ` Ian Jackson
2020-08-14 14:23                 ` David Woodhouse
2020-08-14 15:19                   ` Wei Liu
2020-03-20 14:11   ` Ian Jackson
2020-03-20 14:12     ` Jürgen Groß

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.