All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/9] libxl error reporting
@ 2015-06-24 13:47 Rob Hoes
  2015-06-24 13:47 ` [PATCH RFC 1/9] libxl idl: add comments to error enum Rob Hoes
                   ` (9 more replies)
  0 siblings, 10 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson


Following the proposal from Euan Harris to improve libxl's error reporting [1],
I have written a first couple of patches that I would like some feedback on.
The focus of these patches is on improving the errors that can be raised by the
device_add functions and some related ones.

Does the approach look acceptable?
Do the error codes make sense?
Did I miss any error conditions?

One thing I wasn't sure about is what happens if libxl__wait_device_connection
times out.

Also, I realise that these patches essentially break backward compatibility,
and I have not done the "#define LIBXL_HAVE" stuff yet. Do people consider this
necessary, and if so, at what granularity (e.g. one LIBXL_HAVE for all new
error codes in a release)?

Thanks,
Rob

[1] http://lists.xen.org/archives/html/xen-devel/2015-05/msg02572.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:06   ` Ian Jackson
  2015-06-25 15:58   ` Ian Campbell
  2015-06-24 13:47 ` [PATCH RFC 2/9] libxl idl: allow implicit enum values Rob Hoes
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl_types.idl | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 65d479f..6dc18fa 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -44,26 +44,67 @@ MemKB = UInt(64, init_val = "LIBXL_MEMKB_DEFAULT", json_gen_fn = "libxl__uint64_
 #
 
 libxl_error = Enumeration("error", [
+    # Generic failure; code should be avoided (often seen as "rc = -1")
     (-1, "NONSPECIFIC"),
+    
+    # Libxl version mismatch
     (-2, "VERSION"),
+    
+    # General failure; code should be avoided
     (-3, "FAIL"),
+    
+    # Not implemented
     (-4, "NI"),
+    
+    # Out of memory (malloc or similar failed)
     (-5, "NOMEM"),
+    
+    # General failure; code should be avoided
     (-6, "INVAL"),
+    
+    # General failure; code should be avoided (used only in "xl")
     (-7, "BADFAIL"),
+    
+    # Domain responded to suspend request
     (-8, "GUEST_TIMEDOUT"),
+    
+    # A xenstore watch has timed out
     (-9, "TIMEDOUT"),
+    
+    # The operation requires PV control, but the domain does not offer it
     (-10, "NOPARAVIRT"),
+    
+    # Event has not happened (libxl_event_check)
     (-11, "NOT_READY"),
+    
+    # osevent registration or modification hook failed
     (-12, "OSEVENT_REG_FAIL"),
+    
+    # fd buffer full (libxl_osevent_beforepoll)
     (-13, "BUFFERFULL"),
+    
+    # Process is not a child of the current libxl instance (libxl_childproc_reaped)
     (-14, "UNKNOWN_CHILD"),
+    
+    # Could not acquire lock
     (-15, "LOCK_FAIL"),
+    
+    # Unable to find JSON domain config
     (-16, "JSON_CONFIG_EMPTY"),
+    
+    # The requested device already exists
     (-17, "DEVICE_EXISTS"),
+    
+    # Remus ops do not match device
     (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
+    
+    # Remus device not supported
     (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+    
+    # vNUMA config not valid
     (-20, "VNUMA_CONFIG_INVALID"),
+    
+    # Requested domain was not found
     (-21, "DOMAIN_NOTFOUND"),
     ], value_namespace = "")
 
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 2/9] libxl idl: allow implicit enum values
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
  2015-06-24 13:47 ` [PATCH RFC 1/9] libxl idl: add comments to error enum Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:08   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 3/9] libxl: introduce specific xenstore error codes Rob Hoes
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Introducing two special enum values:
* ENUM_NEXT: equal to the previous value in the enum plus 1
* ENUM_PREV: equal to the previous value in the enum minus 1

This makes it a little easier to maintain enums for which we do not care too
much about the exact enum values.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/idl.py | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/libxl/idl.py b/tools/libxl/idl.py
index 437049e..9fccefa 100644
--- a/tools/libxl/idl.py
+++ b/tools/libxl/idl.py
@@ -8,6 +8,9 @@ DIR_IN   = 1
 DIR_OUT  = 2
 DIR_BOTH = 3
 
+ENUM_NEXT = sys.maxint
+ENUM_PREV = sys.maxint - 1
+
 _default_namespace = ""
 def namespace(s):
     if type(s) != str:
@@ -174,9 +177,18 @@ class Enumeration(Type):
             self.namespace)
 
         self.values = []
+        last = None
         for v in values:
             # (value, name)
             (num,name) = v
+            if num == ENUM_NEXT or num == ENUM_PREV:
+            	if last == None:
+            		raise ValueError
+            	elif num == ENUM_NEXT:
+            		num = last + 1
+            	else:
+            		num = last - 1
+            last = num
             self.values.append(EnumerationValue(self, num, name,
                                                 typename=self.rawname))
     def lookup(self, name):
@@ -354,6 +366,7 @@ def parse(f):
             globs[n] = t
         elif n in ['PASS_BY_REFERENCE', 'PASS_BY_VALUE',
                    'DIR_NONE', 'DIR_IN', 'DIR_OUT', 'DIR_BOTH',
+                   'ENUM_NEXT', 'ENUM_PREV',
                    'namespace', 'hidden']:
             globs[n] = t
 
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 3/9] libxl: introduce specific xenstore error codes
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
  2015-06-24 13:47 ` [PATCH RFC 1/9] libxl idl: add comments to error enum Rob Hoes
  2015-06-24 13:47 ` [PATCH RFC 2/9] libxl idl: allow implicit enum values Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:10   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc Rob Hoes
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl_types.idl |  8 ++++++++
 tools/libxl/libxl_xshelp.c  | 14 +++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 6dc18fa..e9b3477 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -106,6 +106,14 @@ libxl_error = Enumeration("error", [
     
     # Requested domain was not found
     (-21, "DOMAIN_NOTFOUND"),
+    
+    # Xenstore errors
+    (ENUM_PREV, "XS_CONNECT"),
+    (ENUM_PREV, "XS_READ"),
+    (ENUM_PREV, "XS_WRITE"),
+    (ENUM_PREV, "XS_TRANS_START"),
+    (ENUM_PREV, "XS_TRANS_COMMIT"),
+    (ENUM_PREV, "XS_REMOVE"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
diff --git a/tools/libxl/libxl_xshelp.c b/tools/libxl/libxl_xshelp.c
index d7eaa66..e634ee5 100644
--- a/tools/libxl/libxl_xshelp.c
+++ b/tools/libxl/libxl_xshelp.c
@@ -174,7 +174,7 @@ int libxl__xs_read_checked(libxl__gc *gc, xs_transaction_t t,
     if (!result) {
         if (errno != ENOENT) {
             LOGE(ERROR, "xenstore read failed: `%s'", path);
-            return ERROR_FAIL;
+            return ERROR_XS_READ;
         }
     }
     *result_out = result;
@@ -187,7 +187,7 @@ int libxl__xs_write_checked(libxl__gc *gc, xs_transaction_t t,
     size_t length = strlen(string);
     if (!xs_write(CTX->xsh, t, path, string, length)) {
         LOGE(ERROR, "xenstore write failed: `%s' = `%s'", path, string);
-        return ERROR_FAIL;
+        return ERROR_XS_WRITE;
     }
     return 0;
 }
@@ -199,7 +199,7 @@ int libxl__xs_rm_checked(libxl__gc *gc, xs_transaction_t t, const char *path)
             return 0;
 
         LOGE(ERROR, "xenstore rm failed: `%s'", path);
-        return ERROR_FAIL;
+        return ERROR_XS_REMOVE;
     }
     return 0;
 }
@@ -210,7 +210,7 @@ int libxl__xs_transaction_start(libxl__gc *gc, xs_transaction_t *t)
     *t = xs_transaction_start(CTX->xsh);
     if (!*t) {
         LOGE(ERROR, "could not create xenstore transaction");
-        return ERROR_FAIL;
+        return ERROR_XS_TRANS_START;
     }
     return 0;
 }
@@ -225,7 +225,7 @@ int libxl__xs_transaction_commit(libxl__gc *gc, xs_transaction_t *t)
             return +1;
 
         LOGE(ERROR, "could not commit xenstore transaction");
-        return ERROR_FAIL;
+        return ERROR_XS_TRANS_COMMIT;
     }
 
     *t = 0;
@@ -257,7 +257,7 @@ int libxl__xs_path_cleanup(libxl__gc *gc, xs_transaction_t t,
     if (!xs_rm(CTX->xsh, t, path)) {
         if (errno != ENOENT)
             LOGE(DEBUG, "unable to remove path %s", path);
-        rc = ERROR_FAIL;
+        rc = ERROR_XS_REMOVE;
         goto out;
     }
 
@@ -274,7 +274,7 @@ int libxl__xs_path_cleanup(libxl__gc *gc, xs_transaction_t t,
         if (!xs_rm(CTX->xsh, t, path)) {
             if (errno != ENOENT)
                 LOGE(DEBUG, "unable to remove path %s", path);
-            rc = ERROR_FAIL;
+            rc = ERROR_XS_REMOVE;
             goto out;
         }
     }
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (2 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 3/9] libxl: introduce specific xenstore error codes Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:18   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 5/9] libxl: introduce specific JSON error codes Rob Hoes
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl.c         | 6 +++---
 tools/libxl/libxl_event.c   | 2 +-
 tools/libxl/libxl_types.idl | 3 +++
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index a6eb2df..f622981 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -89,7 +89,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
         LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Failed to initialize mutex");
         free(ctx);
         ctx = 0;
-        rc = ERROR_FAIL;
+        rc = ERROR_LOCK_FAIL;
         goto out;
     }
 
@@ -107,7 +107,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
     ctx->xch = xc_interface_open(lg,lg,0);
     if (!ctx->xch) {
         LOGEV(ERROR, errno, "cannot open libxc handle");
-        rc = ERROR_FAIL; goto out;
+        rc = ERROR_XC_CONNECT; goto out;
     }
 
     ctx->xsh = xs_daemon_open();
@@ -115,7 +115,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
         ctx->xsh = xs_domain_open();
     if (!ctx->xsh) {
         LOGEV(ERROR, errno, "cannot connect to xenstore");
-        rc = ERROR_FAIL; goto out;
+        rc = ERROR_XS_CONNECT; goto out;
     }
 
     *pctx = ctx;
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 9ff4e78..b1e82c1 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -740,7 +740,7 @@ int libxl__ctx_evtchn_init(libxl__gc *gc) {
     xce = xc_evtchn_open(CTX->lg, 0);
     if (!xce) {
         LOGE(ERROR,"cannot open libxc evtchn handle");
-        rc = ERROR_FAIL;
+        rc = ERROR_XC_CONNECT;
         goto out;
     }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index e9b3477..52795e4 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -114,6 +114,9 @@ libxl_error = Enumeration("error", [
     (ENUM_PREV, "XS_TRANS_START"),
     (ENUM_PREV, "XS_TRANS_COMMIT"),
     (ENUM_PREV, "XS_REMOVE"),
+        
+    # libxc connection error
+    (ENUM_PREV, "XC_CONNECT"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 5/9] libxl: introduce specific JSON error codes
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (3 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:20   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection Rob Hoes
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl_dom.c      | 6 +++---
 tools/libxl/libxl_internal.c | 4 +---
 tools/libxl/libxl_json.c     | 4 ++--
 tools/libxl/libxl_types.idl  | 5 +++++
 4 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a0c9850..efdb570 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -2323,7 +2323,7 @@ int libxl__userdata_store(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
-    rc = ERROR_FAIL;
+    rc = ERROR_JSON_SET_CONFIG;
 
     fd = open(newfilename, O_RDWR | O_CREAT | O_TRUNC, 0600);
     if (fd < 0)
@@ -2400,13 +2400,13 @@ int libxl__userdata_retrieve(libxl__gc *gc, uint32_t domid,
 
     e = libxl_read_file_contents(CTX, filename, data_r ? &data : 0, &datalen);
     if (e && errno != ENOENT) {
-        rc = ERROR_FAIL;
+        rc = ERROR_JSON_GET_CONFIG;
         goto out;
     }
     if (!e && !datalen) {
         LIBXL__LOG(CTX, LIBXL__LOG_ERROR, "userdata file %s is empty", filename);
         if (data_r) assert(!*data_r);
-        rc = ERROR_FAIL;
+        rc = ERROR_JSON_GET_CONFIG;
         goto out;
     }
 
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index c2c67bd..4ac43f8 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -500,7 +500,6 @@ int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
     if (rc) {
         LOGEV(ERROR, rc,
               "failed to retrieve domain configuration for domain %d", domid);
-        rc = ERROR_FAIL;
         goto out;
     }
 
@@ -527,7 +526,7 @@ int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
         LOGE(ERROR,
              "failed to convert domain configuration to JSON for domain %d",
              domid);
-        rc = ERROR_FAIL;
+        rc = ERROR_JSON_SET_CONFIG;
         goto out;
     }
 
@@ -537,7 +536,6 @@ int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
     if (rc) {
         LOGEV(ERROR, rc, "failed to store domain configuration for domain %d",
               domid);
-        rc = ERROR_FAIL;
         goto out;
     }
 
diff --git a/tools/libxl/libxl_json.c b/tools/libxl/libxl_json.c
index 346929a..973d535 100644
--- a/tools/libxl/libxl_json.c
+++ b/tools/libxl/libxl_json.c
@@ -1044,14 +1044,14 @@ int libxl__object_from_json(libxl_ctx *ctx, const char *type,
         LOG(ERROR,
             "unable to generate libxl__json_object from JSON representation of %s.",
             type);
-        rc = ERROR_FAIL;
+        rc = ERROR_JSON_PARSE_CONFIG;
         goto out;
     }
 
     rc = parse(gc, o, p);
     if (rc) {
         LOG(ERROR, "unable to convert libxl__json_object to %s. (rc=%d)", type, rc);
-        rc = ERROR_FAIL;
+        rc = ERROR_JSON_PARSE_CONFIG;
         goto out;
     }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 52795e4..b905353 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -117,6 +117,11 @@ libxl_error = Enumeration("error", [
         
     # libxc connection error
     (ENUM_PREV, "XC_CONNECT"),
+    
+    # JSON domain config errors
+    (ENUM_PREV, "JSON_GET_CONFIG"),
+    (ENUM_PREV, "JSON_SET_CONFIG"),
+    (ENUM_PREV, "JSON_PARSE_CONFIG"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (4 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 5/9] libxl: introduce specific JSON error codes Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:30   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add Rob Hoes
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl_device.c  | 1 +
 tools/libxl/libxl_types.idl | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 93bb41e..56c6e2e 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -768,6 +768,7 @@ void libxl__wait_device_connection(libxl__egc *egc, libxl__ao_device *aodev)
                                  LIBXL_INIT_TIMEOUT * 1000);
     if (rc) {
         LOG(ERROR, "unable to initialize device %s", be_path);
+        rc = ERROR_DEVICE_WAIT_INIT;
         goto out;
     }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index b905353..3c44b41 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -122,6 +122,9 @@ libxl_error = Enumeration("error", [
     (ENUM_PREV, "JSON_GET_CONFIG"),
     (ENUM_PREV, "JSON_SET_CONFIG"),
     (ENUM_PREV, "JSON_PARSE_CONFIG"),
+    
+    # Unable to initialise device connection watch
+    (ENUM_PREV, "DEVICE_WAIT_INIT"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (5 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:28   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert Rob Hoes
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl.c         | 21 ++++++++++++++-------
 tools/libxl/libxl_types.idl | 13 +++++++++++++
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f622981..2f56c6e 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2008,9 +2008,16 @@ static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
 static int libxl__resolve_domid(libxl__gc *gc, const char *name,
                                 uint32_t *domid)
 {
+    int rc;
     if (!name)
         return 0;
-    return libxl_domain_qualifier_to_domid(CTX, name, domid);
+
+    rc = libxl_domain_qualifier_to_domid(CTX, name, domid);
+
+    if (rc < 0)
+        return ERROR_DOMAIN_NOTFOUND;
+    else
+        return rc;
 }
 
 /******************************************************************************/
@@ -2326,7 +2333,7 @@ int libxl__device_from_disk(libxl__gc *gc, uint32_t domid,
     if (devid==-1) {
         LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Invalid or unsupported"
                " virtual disk identifier %s", disk->vdev);
-        return ERROR_INVAL;
+        return ERROR_INVAL_DISK_VDEV;
     }
 
     device->backend_domid = disk->backend_domid;
@@ -2345,7 +2352,7 @@ int libxl__device_from_disk(libxl__gc *gc, uint32_t domid,
         default:
             LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "unrecognized disk backend type: %d\n",
                        disk->backend);
-            return ERROR_INVAL;
+            return ERROR_INVAL_DISK_BACKEND;
     }
 
     device->domid = domid;
@@ -2391,7 +2398,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
     if (type == LIBXL_DOMAIN_TYPE_INVALID) {
-        rc = ERROR_FAIL;
+        rc = ERROR_INVAL_DOMAIN_TYPE;
         goto out;
     }
 
@@ -2423,7 +2430,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
             assert(get_vdev_user);
             disk->vdev = get_vdev(gc, get_vdev_user, t);
             if (disk->vdev == NULL) {
-                rc = ERROR_FAIL;
+                rc = ERROR_DISK_VDEV_UNDETERMINED;
                 goto out;
             }
         }
@@ -2488,7 +2495,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
                     if (!dev) {
                         LOG(ERROR, "failed to get blktap devpath for %p\n",
                             disk->pdev_path);
-                        rc = ERROR_FAIL;
+                        rc = ERROR_DISK_PDEV_NOT_FOUND;
                         goto out;
                     }
                 }
@@ -2511,7 +2518,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
                 break;
             default:
                 LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "unrecognized disk backend type: %d\n", disk->backend);
-                rc = ERROR_INVAL;
+                rc = ERROR_INVAL_DISK_BACKEND;
                 goto out;
         }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 3c44b41..15e4af2 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -125,6 +125,19 @@ libxl_error = Enumeration("error", [
     
     # Unable to initialise device connection watch
     (ENUM_PREV, "DEVICE_WAIT_INIT"),
+    
+    # Domain type invalid
+    (ENUM_PREV, "INVAL_DOMAIN_TYPE"),
+    
+    # Disk parameters invalid
+    (ENUM_PREV, "INVAL_DISK_VDEV"),
+    (ENUM_PREV, "INVAL_DISK_BACKEND"),
+    
+    # Disk parameters could not be determined
+    (ENUM_PREV, "DISK_VDEV_UNDETERMINED"),
+    
+    # Physical disk device could not be found
+    (ENUM_PREV, "DISK_PDEV_NOT_FOUND"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (6 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:26   ` Ian Jackson
  2015-06-24 13:47 ` [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add Rob Hoes
  2015-06-24 15:16 ` [PATCH RFC 0/9] libxl error reporting Ian Jackson
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl.c         | 12 ++++++------
 tools/libxl/libxl_device.c  |  6 +++---
 tools/libxl/libxl_qmp.c     |  4 +++-
 tools/libxl/libxl_types.idl | 22 +++++++++++++++++++++-
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2f56c6e..f41f291 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2848,25 +2848,25 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
     if (type == LIBXL_DOMAIN_TYPE_INVALID) {
-        rc = ERROR_FAIL;
+        rc = ERROR_INVAL_DOMAIN_TYPE;
         goto out;
     }
     if (type != LIBXL_DOMAIN_TYPE_HVM) {
         LOG(ERROR, "cdrom-insert requires an HVM domain");
-        rc = ERROR_INVAL;
+        rc = ERROR_NOHVM;
         goto out;
     }
 
     if (libxl_get_stubdom_id(ctx, domid) != 0) {
         LOG(ERROR, "cdrom-insert doesn't work for stub domains");
-        rc = ERROR_INVAL;
+        rc = ERROR_STUBDOM;
         goto out;
     }
 
     dm_ver = libxl__device_model_version_running(gc, domid);
     if (dm_ver == -1) {
         LOG(ERROR, "cannot determine device model version");
-        rc = ERROR_FAIL;
+        rc = ERROR_DM_VERSION_UNDETERMINED;
         goto out;
     }
 
@@ -2881,7 +2881,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
     }
     if (i == num) {
         LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Virtual device not found");
-        rc = ERROR_FAIL;
+        rc = ERROR_DISK_VDEV_NOT_FOUND;
         goto out;
     }
 
@@ -2941,7 +2941,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
         {
             LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Internal error: %s does not exist",
                        libxl__sprintf(gc, "%s/frontend", path));
-            rc = ERROR_FAIL;
+            rc = ERROR_INTERNAL;
             goto out;
         }
 
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 56c6e2e..1c5f659 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -271,7 +271,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
     if (disk->format == LIBXL_DISK_FORMAT_EMPTY) {
         if (!disk->is_cdrom) {
             LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev);
-            return ERROR_INVAL;
+            return ERROR_INVAL_DISK_FORMAT;
         }
         memset(&a.stab, 0, sizeof(a.stab));
     } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN ||
@@ -281,7 +281,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
         if (stat(disk->pdev_path, &a.stab)) {
             LOGE(ERROR, "Disk vdev=%s failed to stat: %s",
                         disk->vdev, disk->pdev_path);
-            return ERROR_INVAL;
+            return ERROR_DISK_PDEV_NOT_FOUND;
         }
     }
 
@@ -299,7 +299,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
     }
     if (!ok) {
         LOG(ERROR, "no suitable backend for disk %s", disk->vdev);
-        return ERROR_INVAL;
+        return ERROR_DISK_BACKEND_UNDETERMINED;
     }
     disk->backend = ok;
     return 0;
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 9aa7e2e..c687e86 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -817,9 +817,11 @@ static int qmp_run_command(libxl__gc *gc, int domid,
 
     qmp = libxl__qmp_initialize(gc, domid);
     if (!qmp)
-        return ERROR_FAIL;
+        return ERROR_QMP_INIT;
 
     rc = qmp_synchronous_send(qmp, cmd, args, callback, opaque, qmp->timeout);
+    if (rc < 0)
+        rc = ERROR_QMP_SEND;
 
     libxl__qmp_close(qmp);
     return rc;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 15e4af2..88262ca 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -107,6 +107,10 @@ libxl_error = Enumeration("error", [
     # Requested domain was not found
     (-21, "DOMAIN_NOTFOUND"),
     
+    # Internal error; not actionable by the caller other than by doing something like
+    # a retry/reboot (perhaps a libxl bug)
+    (ENUM_PREV, "INTERNAL"),
+    
     # Xenstore errors
     (ENUM_PREV, "XS_CONNECT"),
     (ENUM_PREV, "XS_READ"),
@@ -132,12 +136,28 @@ libxl_error = Enumeration("error", [
     # Disk parameters invalid
     (ENUM_PREV, "INVAL_DISK_VDEV"),
     (ENUM_PREV, "INVAL_DISK_BACKEND"),
+    (ENUM_PREV, "INVAL_DISK_FORMAT"),
     
     # Disk parameters could not be determined
     (ENUM_PREV, "DISK_VDEV_UNDETERMINED"),
+    (ENUM_PREV, "DISK_BACKEND_UNDETERMINED"),
     
-    # Physical disk device could not be found
+    # Physical/virtual disk device could not be found
     (ENUM_PREV, "DISK_PDEV_NOT_FOUND"),
+    (ENUM_PREV, "DISK_VDEV_NOT_FOUND"),
+    
+    # Operation requires an HVM domain
+    (ENUM_PREV, "NOHVM"),
+
+    # Operation is not compatible with a stub domain
+    (ENUM_PREV, "STUBDOM"),
+    
+    # Device model version could not be determined
+    (ENUM_PREV, "DM_VERSION_UNDETERMINED"),
+    
+    # QMP errors
+    (ENUM_PREV, "QMP_INIT"),
+    (ENUM_PREV, "QMP_SEND"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (7 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert Rob Hoes
@ 2015-06-24 13:47 ` Rob Hoes
  2015-06-24 15:11   ` Ian Jackson
  2015-06-24 15:16 ` [PATCH RFC 0/9] libxl error reporting Ian Jackson
  9 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-24 13:47 UTC (permalink / raw)
  To: xen-devel; +Cc: euan.harris, Ian.Jackson, Rob Hoes

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
---
 tools/libxl/libxl.c          | 8 ++++----
 tools/libxl/libxl_internal.c | 2 +-
 tools/libxl/libxl_types.idl  | 4 ++++
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f41f291..c0c0f12 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -3278,7 +3278,7 @@ int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
     }
     if ( !nic->script && asprintf(&nic->script, "%s/vif-bridge",
                                   libxl__xen_script_dir_path()) < 0 )
-        return ERROR_FAIL;
+        return ERROR_NIC_SCRIPT_UNDETERMINED;
 
     run_hotplug_scripts = libxl__hotplug_settings(gc, XBT_NULL);
     if (run_hotplug_scripts < 0) {
@@ -3297,12 +3297,12 @@ int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
     case LIBXL_DOMAIN_TYPE_PV:
         if (nic->nictype == LIBXL_NIC_TYPE_VIF_IOEMU) {
             LOG(ERROR, "trying to create PV guest with an emulated interface");
-            return ERROR_INVAL;
+            return ERROR_NOHVM;
         }
         nic->nictype = LIBXL_NIC_TYPE_VIF;
         break;
     case LIBXL_DOMAIN_TYPE_INVALID:
-        return ERROR_FAIL;
+        return ERROR_INVAL_DOMAIN_TYPE;
     default:
         abort();
     }
@@ -3349,7 +3349,7 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
 
     if (nic->devid == -1) {
         if ((nic->devid = libxl__device_nextid(gc, domid, "vif")) < 0) {
-            rc = ERROR_FAIL;
+            rc = ERROR_NIC_DEVID_UNDETERMINED;
             goto out;
         }
     }
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index 4ac43f8..a130c65 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -384,7 +384,7 @@ int libxl__hotplug_settings(libxl__gc *gc, xs_transaction_t t)
     val = libxl__xs_read(gc, t, DISABLE_UDEV_PATH);
     if (!val && errno != ENOENT) {
         LOGE(ERROR, "cannot read %s from xenstore", DISABLE_UDEV_PATH);
-        rc = ERROR_FAIL;
+        rc = ERROR_XS_READ;
         goto out;
     }
     if (!val) val = "0";
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 88262ca..1975e05 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -146,6 +146,10 @@ libxl_error = Enumeration("error", [
     (ENUM_PREV, "DISK_PDEV_NOT_FOUND"),
     (ENUM_PREV, "DISK_VDEV_NOT_FOUND"),
     
+    # NIC parameters could not be determined
+    (ENUM_PREV, "NIC_SCRIPT_UNDETERMINED"),
+    (ENUM_PREV, "NIC_DEVID_UNDETERMINED"),
+    
     # Operation requires an HVM domain
     (ENUM_PREV, "NOHVM"),
 
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-24 13:47 ` [PATCH RFC 1/9] libxl idl: add comments to error enum Rob Hoes
@ 2015-06-24 15:06   ` Ian Jackson
  2015-06-26 14:12     ` Rob Hoes
  2015-06-25 15:58   ` Ian Campbell
  1 sibling, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:06 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 1/9] libxl idl: add comments to error enum"):
...
>  libxl_error = Enumeration("error", [

Thanks for this.  It think at this stage it is probably most helpful
to concentrate on the specification, that is the list of error codes
and corresponding docs.

So:


I think it would be worth trying to categorise these errors by whose
fault they are likely to be.

> +    # Generic failure; code should be avoided (often seen as "rc = -1")
>      (-1, "NONSPECIFIC"),
> +    
> +    # Libxl version mismatch
>      (-2, "VERSION"),
> +    
> +    # General failure; code should be avoided
>      (-3, "FAIL"),

I think you should probably write `deprecated in new libxl code, due to
being too vague' instead of `should be avoided'.

> +    # Not implemented
>      (-4, "NI"),

Seems not to have any use sites.  Should presumably be deprecated due
to having a daft name.

> +    # Out of memory (malloc or similar failed)
>      (-5, "NOMEM"),

Should say whether this includes "host does not have enough memory to
create the specified domain" as well as "malloc failed in toolstack".

NB that we are trying to phase out the handling of the latter as
anything except a fatal error.

> +    # General failure; code should be avoided
>      (-6, "INVAL"),

ERROR_INVAL means "there is something wrong with the parameters to the
operation (but we don't say exactly what).

> +    # General failure; code should be avoided (used only in "xl")
>      (-7, "BADFAIL"),

Maybe it should be renamed.

> +    # Domain responded to suspend request
>      (-8, "GUEST_TIMEDOUT"),

Failed to respond, you mean.

> +    # A xenstore watch has timed out
>      (-9, "TIMEDOUT"),

Not true, I'm afraid.  xenstore watches cannot time out, and this is
used for other things besides `we were waiting for something in
xenstore and decided it was taking too long' (which anyway does not
really say what it was that we were waiting for - lots of things
communicate via xenstore).

Also this error doesn't say what timed out, so this error code should
be deprecated.

> +    # Event has not happened (libxl_event_check)
>      (-11, "NOT_READY"),

I would say "Requested event has not (yet) happened".

> +    # Could not acquire lock
>      (-15, "LOCK_FAIL"),

This seems to be used entirely for the userdata lock.  I would be
tempted to argue that it's not particularly be useful to report this
distinctly to the application, because what it really means is "either
the system is totally hosed, or the permissions or existence of the
userdata storage area are wrong".  The latter ought to be a separate
error code.

> +    # Unable to find JSON domain config
>      (-16, "JSON_CONFIG_EMPTY"),

There is clearly something wrong here.  This is supposed to be handled
by most callers I think.  The comment at the one generation site says:

        /* No logging, not necessary an error from caller's PoV. */
        rc = ERROR_JSON_CONFIG_EMPTY;
        goto out;

But I can find no special handling of it by callers.

> +    # Remus ops do not match device
>      (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),

I think this is supposed to be internal-only.

> +    # Remus device not supported
>      (-19, "REMUS_DEVICE_NOT_SUPPORTED"),

Starting remus was attempted but the domain has a device which is not
supported by remus.

> +    # Requested domain was not found
>      (-21, "DOMAIN_NOTFOUND"),

AFAICT might also sometimes happen if the domain disappears during an
operation ?

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 2/9] libxl idl: allow implicit enum values
  2015-06-24 13:47 ` [PATCH RFC 2/9] libxl idl: allow implicit enum values Rob Hoes
@ 2015-06-24 15:08   ` Ian Jackson
  2015-06-25 15:59     ` Ian Campbell
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:08 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 2/9] libxl idl: allow implicit enum values"):
> Introducing two special enum values:
> * ENUM_NEXT: equal to the previous value in the enum plus 1
> * ENUM_PREV: equal to the previous value in the enum minus 1
> 
> This makes it a little easier to maintain enums for which we do not care too
> much about the exact enum values.

It means that enum values can't be looked up without compiling the
code.  It also means that deleting an old enum value would result in
unexpected and undesirable changes to subsequent enums.

So I'm afraid I don't agree with this approach.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 3/9] libxl: introduce specific xenstore error codes
  2015-06-24 13:47 ` [PATCH RFC 3/9] libxl: introduce specific xenstore error codes Rob Hoes
@ 2015-06-24 15:10   ` Ian Jackson
  2015-06-26 14:36     ` Rob Hoes
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:10 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 3/9] libxl: introduce specific xenstore error codes"):
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>

I don't understand the distinctions you are trying to make here.  Is
it really useful for an application or a sysadmin to distinguish what
operation was being attempted against xenstore ?

I think the xenstore error code would be more informative.  Probably
they can be categorised into:
 * xenstored communication failed
 * xenstored seems to have done something mad
 * permission denied reading xenstore
 * permission denied writing xenstore
 * some other unexpected error happened talking to xenstore

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add
  2015-06-24 13:47 ` [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add Rob Hoes
@ 2015-06-24 15:11   ` Ian Jackson
  2015-06-26 16:36     ` Rob Hoes
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:11 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add"):
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
...    
> +    # NIC parameters could not be determined
> +    (ENUM_PREV, "NIC_SCRIPT_UNDETERMINED"),
> +    (ENUM_PREV, "NIC_DEVID_UNDETERMINED"),

Perhaps we could have a coherent naming scheme ?  These are invalid
parameter errors, aren't they ?  That is, libxl's caller specified
something wrong.

So maybe
  INVALID_NIC_SCRIPT_UNDETERMINED
?

What do others think ?

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 0/9] libxl error reporting
  2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
                   ` (8 preceding siblings ...)
  2015-06-24 13:47 ` [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add Rob Hoes
@ 2015-06-24 15:16 ` Ian Jackson
  9 siblings, 0 replies; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:16 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 0/9] libxl error reporting"):
> 
> Following the proposal from Euan Harris to improve libxl's error
> reporting [1], I have written a first couple of patches that I would
> like some feedback on.  The focus of these patches is on improving
> the errors that can be raised by the device_add functions and some
> related ones.
> 
> Does the approach look acceptable?

Your plan looks like 1. document existing stuff 2. go around fixing
things.  That seems good to me.

> Do the error codes make sense?

I will reply in detail.

> Did I miss any error conditions?

This is too hard to answer I think.

> One thing I wasn't sure about is what happens if
> libxl__wait_device_connection times out.

The devstate and hence the underlying libxl__ev_timeout will time out
generating rc == ERROR_TIMEDOUT.

I think if we are to do this properly we might want
libxl__ev_timeout's setup to take an rc value to generate for
timeouts.  We could have:

  ERROR_TIMEDOUT_GUEST
  ERROR_TIMEDOUT_BACKEND
  ERROR_TIMEDOUT_HOTPLUG

at the very least.

> Also, I realise that these patches essentially break backward
> compatibility, and I have not done the "#define LIBXL_HAVE" stuff
> yet. Do people consider this necessary, and if so, at what
> granularity (e.g. one LIBXL_HAVE for all new error codes in a
> release)?

We should probably have, at the end of the series,
LIBXL_HAVE_<something>, for the whole batch.

In general I don't think we need to care very much about backward
compatibility for API callers, because making application decisions
based on the existing vague error codes is basically impossible.
(Barring a few exceptions.)

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc
  2015-06-24 13:47 ` [PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc Rob Hoes
@ 2015-06-24 15:18   ` Ian Jackson
  0 siblings, 0 replies; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:18 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc"):
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
> ---
>  tools/libxl/libxl.c         | 6 +++---
>  tools/libxl/libxl_event.c   | 2 +-
>  tools/libxl/libxl_types.idl | 3 +++
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index a6eb2df..f622981 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -89,7 +89,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
>          LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Failed to initialize mutex");
>          free(ctx);
>          ctx = 0;
> -        rc = ERROR_FAIL;
> +        rc = ERROR_LOCK_FAIL;

I commented about ERROR_LOCK_FAIL earlier.

I think really if the mutex initialisation failed everything is
doomed.  You probably want to introduce
   ERROR_LIBXL_PROCESS_SEEMS_TOTALLY_BUST

>      ctx->xch = xc_interface_open(lg,lg,0);
>      if (!ctx->xch) {
>          LOGEV(ERROR, errno, "cannot open libxc handle");
> -        rc = ERROR_FAIL; goto out;
> +        rc = ERROR_XC_CONNECT; goto out;
...
> @@ -115,7 +115,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
>          ctx->xsh = xs_domain_open();
>      if (!ctx->xsh) {
>          LOGEV(ERROR, errno, "cannot connect to xenstore");
> -        rc = ERROR_FAIL; goto out;
> +        rc = ERROR_XS_CONNECT; goto out;

Good.

> @@ -740,7 +740,7 @@ int libxl__ctx_evtchn_init(libxl__gc *gc) {
>      xce = xc_evtchn_open(CTX->lg, 0);
>      if (!xce) {
>          LOGE(ERROR,"cannot open libxc evtchn handle");
> -        rc = ERROR_FAIL;
> +        rc = ERROR_XC_CONNECT;

If you're going to distinguish XS from XC, you probably want to
distinguish xce too.  It's a different /dev node.


If we're categorising errors these are "installation or permissions
problem".

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 5/9] libxl: introduce specific JSON error codes
  2015-06-24 13:47 ` [PATCH RFC 5/9] libxl: introduce specific JSON error codes Rob Hoes
@ 2015-06-24 15:20   ` Ian Jackson
  0 siblings, 0 replies; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:20 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 5/9] libxl: introduce specific JSON error codes"):
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
...
> -    rc = ERROR_FAIL;
> +    rc = ERROR_JSON_SET_CONFIG;

I think I'd rather have ERROR_USERDATA_ACCESS (see my previous
comments about LOCK_FAIL) or maybe ERROR_USERDATA_READ /
ERROR_USERDATA_WRITE (where LOCK is the latter).

>      rc = parse(gc, o, p);
>      if (rc) {
>          LOG(ERROR, "unable to convert libxl__json_object to %s. (rc=%d)", type, rc);
> -        rc = ERROR_FAIL;
> +        rc = ERROR_JSON_PARSE_CONFIG;

This error is an instance of "libxl's system-wide state has become
corrupted".

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert
  2015-06-24 13:47 ` [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert Rob Hoes
@ 2015-06-24 15:26   ` Ian Jackson
  2015-06-26 16:27     ` Rob Hoes
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:26 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert"):
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>

>      if (libxl_get_stubdom_id(ctx, domid) != 0) {
>          LOG(ERROR, "cdrom-insert doesn't work for stub domains");
> -        rc = ERROR_INVAL;
> +        rc = ERROR_STUBDOM;

ERROR_NOTIMPLEMENTED_STUBDOM ?

>          goto out;
>      }
>  
>      dm_ver = libxl__device_model_version_running(gc, domid);
>      if (dm_ver == -1) {
>          LOG(ERROR, "cannot determine device model version");
> -        rc = ERROR_FAIL;
> +        rc = ERROR_DM_VERSION_UNDETERMINED;

Is this an internal problem ?


>      if (i == num) {
>          LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Virtual device not found");
> -        rc = ERROR_FAIL;
> +        rc = ERROR_DISK_VDEV_NOT_FOUND;

ERROR_VDEV_NOTFOUND ?  Could be used for block-detach, nic-detach.

> @@ -2941,7 +2941,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
>          {
>              LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Internal error: %s does not exist",
>                         libxl__sprintf(gc, "%s/frontend", path));
> -            rc = ERROR_FAIL;
> +            rc = ERROR_INTERNAL;

Right, this is "libxl systemwide state seems corrupted".

ERROR_INTERNAL_BADSTATE_LIBXL ?

> @@ -299,7 +299,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
>      }
>      if (!ok) {
>          LOG(ERROR, "no suitable backend for disk %s", disk->vdev);
> -        return ERROR_INVAL;
> +        return ERROR_DISK_BACKEND_UNDETERMINED;

ERROR_INVALID_DISK... ?

> diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
> index 9aa7e2e..c687e86 100644
> --- a/tools/libxl/libxl_qmp.c
> +++ b/tools/libxl/libxl_qmp.c
> @@ -817,9 +817,11 @@ static int qmp_run_command(libxl__gc *gc, int domid,
>  
>      qmp = libxl__qmp_initialize(gc, domid);
>      if (!qmp)
> -        return ERROR_FAIL;
> +        return ERROR_QMP_INIT;

This is a bit like mutex initialisation, isn't it ?  It might mean the
process is out of fds.  Perhaps it would be best to have a function
which converts an errno value.


>      rc = qmp_synchronous_send(qmp, cmd, args, callback, opaque, qmp->timeout);
> +    if (rc < 0)
> +        rc = ERROR_QMP_SEND;

We need to distinguish various this probably by errno value, I think.

At least some of the results are going to be

ERROR_DEVICEMODEL_{CRASHED,MAD,HUNG}

In general it is not sufficient to know "what operation was libxl
trying to do" and more interesting to know "what might cause this
error".

> +    # Internal error; not actionable by the caller other than by doing something like
> +    # a retry/reboot (perhaps a libxl bug)
> +    (ENUM_PREV, "INTERNAL"),

The message is good.  But please wrap everything to 70 columns.  (That
goes for your 0/N and commit messages too.)


Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add
  2015-06-24 13:47 ` [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add Rob Hoes
@ 2015-06-24 15:28   ` Ian Jackson
  2015-06-26 16:49     ` Rob Hoes
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:28 UTC (permalink / raw)
  To: Rob Hoes; +Cc: euan.harris, xen-devel, Ian.Jackson

Rob Hoes writes ("[PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add"):
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
...
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index f622981..2f56c6e 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2008,9 +2008,16 @@ static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
>  static int libxl__resolve_domid(libxl__gc *gc, const char *name,
>                                  uint32_t *domid)
>  {
> +    int rc;
>      if (!name)
>          return 0;
> -    return libxl_domain_qualifier_to_domid(CTX, name, domid);
> +
> +    rc = libxl_domain_qualifier_to_domid(CTX, name, domid);
> +
> +    if (rc < 0)
> +        return ERROR_DOMAIN_NOTFOUND;
> +    else
> +        return rc;

This doesn't seem right.  You're smashing all the errors to domain not
found.  Surely libxl_domain_qualifier_to_domid should return the right
error to start with.

The rest looks plausible (modulo my quibbles about names in my earlier
mails).

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection
  2015-06-24 13:47 ` [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection Rob Hoes
@ 2015-06-24 15:30   ` Ian Jackson
  0 siblings, 0 replies; 33+ messages in thread
From: Ian Jackson @ 2015-06-24 15:30 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris

Rob Hoes writes ("[PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection"):
...
> diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
> index 93bb41e..56c6e2e 100644
> --- a/tools/libxl/libxl_device.c
> +++ b/tools/libxl/libxl_device.c
> @@ -768,6 +768,7 @@ void libxl__wait_device_connection(libxl__egc *egc, libxl__ao_device *aodev)
>                                   LIBXL_INIT_TIMEOUT * 1000);
>      if (rc) {
>          LOG(ERROR, "unable to initialize device %s", be_path);
> +        rc = ERROR_DEVICE_WAIT_INIT;
>          goto out;

Smashing rc values like this is an antipattern.  For example, here,
you would turn ERROR_CANCELLED (from my ao cancellation series) into
ERROR_DEVICE_WAIT_INIT.

See my earlier comments about making timeouts each have their own rc
value depending what was waited for.

This particular wait is normally waiting for the backend/frontend
negotiation and state machine to complete, I think.

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-24 13:47 ` [PATCH RFC 1/9] libxl idl: add comments to error enum Rob Hoes
  2015-06-24 15:06   ` Ian Jackson
@ 2015-06-25 15:58   ` Ian Campbell
  2015-06-25 16:36     ` Ian Jackson
  1 sibling, 1 reply; 33+ messages in thread
From: Ian Campbell @ 2015-06-25 15:58 UTC (permalink / raw)
  To: Rob Hoes; +Cc: xen-devel, euan.harris, Ian.Jackson

On Wed, 2015-06-24 at 14:47 +0100, Rob Hoes wrote:
> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
> ---
>  tools/libxl/libxl_types.idl | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 65d479f..6dc18fa 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -44,26 +44,67 @@ MemKB = UInt(64, init_val = "LIBXL_MEMKB_DEFAULT", json_gen_fn = "libxl__uint64_
>  #
>  
>  libxl_error = Enumeration("error", [
> +    # Generic failure; code should be avoided (often seen as "rc = -1")
>      (-1, "NONSPECIFIC"),

I wonder if we should extend the Enumeration IDL support to have an
optional 3rd field (or a named param) containing a textual description,
which could then be used to autogenerate an analogue of sterrror(3) for
enums which include such things?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 2/9] libxl idl: allow implicit enum values
  2015-06-24 15:08   ` Ian Jackson
@ 2015-06-25 15:59     ` Ian Campbell
  2015-06-26 14:20       ` Rob Hoes
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Campbell @ 2015-06-25 15:59 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, euan.harris, Rob Hoes

On Wed, 2015-06-24 at 16:08 +0100, Ian Jackson wrote:
> Rob Hoes writes ("[PATCH RFC 2/9] libxl idl: allow implicit enum values"):
> > Introducing two special enum values:
> > * ENUM_NEXT: equal to the previous value in the enum plus 1
> > * ENUM_PREV: equal to the previous value in the enum minus 1
> > 
> > This makes it a little easier to maintain enums for which we do not care too
> > much about the exact enum values.
> 
> It means that enum values can't be looked up without compiling the
> code.  It also means that deleting an old enum value would result in
> unexpected and undesirable changes to subsequent enums.
> 
> So I'm afraid I don't agree with this approach.

Neither do I.

An unstable ABI is one thing, having error messages change randomly from
"error -3" to "error -4" because someone added a new one is a step too
far though I think.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-25 15:58   ` Ian Campbell
@ 2015-06-25 16:36     ` Ian Jackson
  0 siblings, 0 replies; 33+ messages in thread
From: Ian Jackson @ 2015-06-25 16:36 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, euan.harris, Rob Hoes

Ian Campbell writes ("Re: [Xen-devel] [PATCH RFC 1/9] libxl idl: add comments to error enum"):
> On Wed, 2015-06-24 at 14:47 +0100, Rob Hoes wrote:
> >  libxl_error = Enumeration("error", [
> > +    # Generic failure; code should be avoided (often seen as "rc = -1")
> >      (-1, "NONSPECIFIC"),
> 
> I wonder if we should extend the Enumeration IDL support to have an
> optional 3rd field (or a named param) containing a textual description,
> which could then be used to autogenerate an analogue of sterrror(3) for
> enums which include such things?

That might be helpful, but we need significantly longer and more
discursive descriptions too.

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-24 15:06   ` Ian Jackson
@ 2015-06-26 14:12     ` Rob Hoes
  2015-06-26 14:26       ` Ian Campbell
  0 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 14:12 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris


> On 24 Jun 2015, at 16:06, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
...
>> +    # Out of memory (malloc or similar failed)
>>     (-5, "NOMEM"),
> 
> Should say whether this includes "host does not have enough memory to
> create the specified domain" as well as "malloc failed in toolstack".
> 
> NB that we are trying to phase out the handling of the latter as
> anything except a fatal error.

Indeed, we should probably split this into something like ERROR_NOMEM_FOR_NEW_DOMAIN and ERROR_NOMEM_IN_TOOLSTACK.

> 
>> +    # General failure; code should be avoided
>>     (-6, "INVAL"),
> 
> ERROR_INVAL means "there is something wrong with the parameters to the
> operation (but we don't say exactly what).

Yes, this is an easy target for splitting out into many ERROR_INVAL_<some parameter> (since we cannot parametrise the error itself).

> 
>> +    # General failure; code should be avoided (used only in "xl")
>>     (-7, "BADFAIL"),
> 
> Maybe it should be renamed.

Perhaps ERROR_INTERNAL_XL?

> 
>> +    # Domain responded to suspend request
>>     (-8, "GUEST_TIMEDOUT"),
> 
> Failed to respond, you mean.

Uhm, yes :)

>> +    # A xenstore watch has timed out
>>     (-9, "TIMEDOUT"),
> 
> Not true, I'm afraid.  xenstore watches cannot time out, and this is
> used for other things besides `we were waiting for something in
> xenstore and decided it was taking too long' (which anyway does not
> really say what it was that we were waiting for - lots of things
> communicate via xenstore).
> 
> Also this error doesn't say what timed out, so this error code should
> be deprecated.

Ok, so we’ll probably need to replace it by ERROR_<something>_TIMEDOUT, e.g. ERROR_HOTPLUG_TIMEDOUT.

>> +    # Event has not happened (libxl_event_check)
>>     (-11, "NOT_READY"),
> 
> I would say "Requested event has not (yet) happened".
> 
>> +    # Could not acquire lock
>>     (-15, "LOCK_FAIL"),
> 
> This seems to be used entirely for the userdata lock.  I would be
> tempted to argue that it's not particularly be useful to report this
> distinctly to the application, because what it really means is "either
> the system is totally hosed, or the permissions or existence of the
> userdata storage area are wrong".  The latter ought to be a separate
> error code.

For the “totally hosed” condition, we could use ERROR_INTERNAL (as in some of my other patches), implying “please kill me or just reboot the host”.

> 
>> +    # Unable to find JSON domain config
>>     (-16, "JSON_CONFIG_EMPTY"),
> 
> There is clearly something wrong here.  This is supposed to be handled
> by most callers I think.  The comment at the one generation site says:
> 
>        /* No logging, not necessary an error from caller's PoV. */
>        rc = ERROR_JSON_CONFIG_EMPTY;
>        goto out;
> 
> But I can find no special handling of it by callers.

I did see this error when testing Xen 4.5 under xenopsd+libxl. It can happen when executing a libxl function on a domain that was not started by libxl, for example dom0. Specifically, we saw it because we didn’t run xen-init-dom0 yet, and we tried to attach a block device to dom0 using libxl.

Cheers,
Rob

> 
>> +    # Remus ops do not match device
>>     (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
> 
> I think this is supposed to be internal-only.
> 
>> +    # Remus device not supported
>>     (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
> 
> Starting remus was attempted but the domain has a device which is not
> supported by remus.
> 
>> +    # Requested domain was not found
>>     (-21, "DOMAIN_NOTFOUND"),
> 
> AFAICT might also sometimes happen if the domain disappears during an
> operation ?
> 
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 2/9] libxl idl: allow implicit enum values
  2015-06-25 15:59     ` Ian Campbell
@ 2015-06-26 14:20       ` Rob Hoes
  0 siblings, 0 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 14:20 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, Euan Harris, xen-devel

On 25 Jun 2015, at 16:59, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> 
> On Wed, 2015-06-24 at 16:08 +0100, Ian Jackson wrote:
>> Rob Hoes writes ("[PATCH RFC 2/9] libxl idl: allow implicit enum values"):
>>> Introducing two special enum values:
>>> * ENUM_NEXT: equal to the previous value in the enum plus 1
>>> * ENUM_PREV: equal to the previous value in the enum minus 1
>>> 
>>> This makes it a little easier to maintain enums for which we do not care too
>>> much about the exact enum values.
>> 
>> It means that enum values can't be looked up without compiling the
>> code.  It also means that deleting an old enum value would result in
>> unexpected and undesirable changes to subsequent enums.
>> 
>> So I'm afraid I don't agree with this approach.
> 
> Neither do I.
> 
> An unstable ABI is one thing, having error messages change randomly from
> "error -3" to "error -4" because someone added a new one is a step too
> far though I think.
> 

Fair enough, I thought I’d just give it a go and see what happens. :)

My other option was to leave some gaps between error codes so that you can easily fit some in between in a logical order. Anyway, I am first going to focus on finding patterns and possible groupings of error codes without caring too much about the numbering.

Rob
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-26 14:12     ` Rob Hoes
@ 2015-06-26 14:26       ` Ian Campbell
  2015-06-26 14:33         ` Ian Jackson
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Campbell @ 2015-06-26 14:26 UTC (permalink / raw)
  To: Rob Hoes; +Cc: Ian Jackson, Euan Harris, xen-devel

On Fri, 2015-06-26 at 14:12 +0000, Rob Hoes wrote:
> >> +    # General failure; code should be avoided (used only in "xl")
> >>     (-7, "BADFAIL"),
> > 
> > Maybe it should be renamed.
> 
> Perhaps ERROR_INTERNAL_XL?

It would be a bit wrong to have an application specific error code in
the library space.

Perhaps we should declare some ranges which for specific uses. For
instance perhaps 0x7FFF0000..0x7FFFFFFF could be set aside for the
application, so xl can define itself some errors which it can use for
its own needs without needing to handle two separate error spaces. xl
could then define these codes itself.

Likewise perhaps libxl internal errors should be given their own range,
which the application can legitimately expect never to see.

I was originally think about this in the context of the xenstore patch
in this series, i.e. declaring that some range is used for the existing
XS error codes (e.g. 0x3000xxxx is a xenstore code). I'm not sure if
that's a good idea or not.

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-26 14:26       ` Ian Campbell
@ 2015-06-26 14:33         ` Ian Jackson
  2015-06-30 12:19           ` Ian Campbell
  0 siblings, 1 reply; 33+ messages in thread
From: Ian Jackson @ 2015-06-26 14:33 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Euan Harris, Rob Hoes

Ian Campbell writes ("Re: [Xen-devel] [PATCH RFC 1/9] libxl idl: add comments to error enum"):
> It would be a bit wrong to have an application specific error code in
> the library space.

Yes.

> Perhaps we should declare some ranges which for specific uses. For
> instance perhaps 0x7FFF0000..0x7FFFFFFF could be set aside for the
> application, so xl can define itself some errors which it can use for
> its own needs without needing to handle two separate error spaces. xl
> could then define these codes itself.

All libxl errors are negative and no libxl function returns (either (a
nonnegative success value) or (a negative error code)).

If we were just to promise that then the application has a wide space
of numbers.

> Likewise perhaps libxl internal errors should be given their own range,
> which the application can legitimately expect never to see.

This is unwise; they might escape...

> I was originally think about this in the context of the xenstore patch
> in this series, i.e. declaring that some range is used for the existing
> XS error codes (e.g. 0x3000xxxx is a xenstore code). I'm not sure if
> that's a good idea or not.

I have no objection to mapping existing error code spaces into libxl
ones.  But we need to do this with a bit of care.  I think passing
xenstore or hypervisor errno numbers directly to the caller will often
be a bad idea.

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 3/9] libxl: introduce specific xenstore error codes
  2015-06-24 15:10   ` Ian Jackson
@ 2015-06-26 14:36     ` Rob Hoes
  2015-06-26 14:42       ` Rob Hoes
  0 siblings, 1 reply; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 14:36 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris


> On 24 Jun 2015, at 16:10, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> Rob Hoes writes ("[PATCH RFC 3/9] libxl: introduce specific xenstore error codes"):
>> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
> 
> I don't understand the distinctions you are trying to make here.  Is
> it really useful for an application or a sysadmin to distinguish what
> operation was being attempted against xenstore ?
> 
> I think the xenstore error code would be more informative.  Probably
> they can be categorised into:
> * xenstored communication failed
> * xenstored seems to have done something mad
> * permission denied reading xenstore
> * permission denied writing xenstore
> * some other unexpected error happened talking to xenstore

Besides the connection problems, the possible error codes from xenstored seem to be:
* EINVAL: invalid input
* EEXIST: tried to create something that already exists
* ENOENT: what you asked for does not exist
* EACCES: permission denied
* EQUOTA: quota exceeded
* E2BIG: data too big
* ENOSYS: unknown operation
* EAGAIN and EBUSY: transaction problems… please retry (?)

Should we just pass these on as ERROR_XS_EINVAL, ERROR_XS_EEXIST, etc.?

Rob
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 3/9] libxl: introduce specific xenstore error codes
  2015-06-26 14:36     ` Rob Hoes
@ 2015-06-26 14:42       ` Rob Hoes
  0 siblings, 0 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 14:42 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris


> On 26 Jun 2015, at 15:36, Rob Hoes <rob.hoes@citrix.com> wrote:
> 
>> On 24 Jun 2015, at 16:10, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
>> 
>> Rob Hoes writes ("[PATCH RFC 3/9] libxl: introduce specific xenstore error codes"):
>>> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
>> 
>> I don't understand the distinctions you are trying to make here.  Is
>> it really useful for an application or a sysadmin to distinguish what
>> operation was being attempted against xenstore ?
>> 
>> I think the xenstore error code would be more informative.  Probably
>> they can be categorised into:
>> * xenstored communication failed
>> * xenstored seems to have done something mad
>> * permission denied reading xenstore
>> * permission denied writing xenstore
>> * some other unexpected error happened talking to xenstore
> 
> Besides the connection problems, the possible error codes from xenstored seem to be:
> * EINVAL: invalid input
> * EEXIST: tried to create something that already exists
> * ENOENT: what you asked for does not exist
> * EACCES: permission denied
> * EQUOTA: quota exceeded
> * E2BIG: data too big
> * ENOSYS: unknown operation
> * EAGAIN and EBUSY: transaction problems… please retry (?)
> 
> Should we just pass these on as ERROR_XS_EINVAL, ERROR_XS_EEXIST, etc.?

Of course these errors may occur somewhere deep down inside libxl, and might be translated into an error code that makes more sense in the context of the higher level operation that the user requested from libxl.

Rob
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert
  2015-06-24 15:26   ` Ian Jackson
@ 2015-06-26 16:27     ` Rob Hoes
  0 siblings, 0 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 16:27 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris


> On 24 Jun 2015, at 16:26, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> Rob Hoes writes ("[PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert"):
>> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
> 
>>     if (libxl_get_stubdom_id(ctx, domid) != 0) {
>>         LOG(ERROR, "cdrom-insert doesn't work for stub domains");
>> -        rc = ERROR_INVAL;
>> +        rc = ERROR_STUBDOM;
> 
> ERROR_NOTIMPLEMENTED_STUBDOM ?
> 
>>         goto out;
>>     }
>> 
>>     dm_ver = libxl__device_model_version_running(gc, domid);
>>     if (dm_ver == -1) {
>>         LOG(ERROR, "cannot determine device model version");
>> -        rc = ERROR_FAIL;
>> +        rc = ERROR_DM_VERSION_UNDETERMINED;
> 
> Is this an internal problem ?

In general, by ERROR_<something>_UNDETERMINED, I meant that the user did not specify <something> and expected libxl to figure it out or fill in a sensible default (e.g. based on some other params that _were_ specified, or some other state in the system), but libxl was unable to do so.

This as opposed to ERROR_INVAL_<parameter>, where there is something wrong with the <parameter> that was given to libxl by the caller.

Perhaps, for consistency, I should have made them ERROR_UNDETERMINED_<something> and ERROR_INVAL_<parameter> (ERROR_<condition>_<thing>), or the other way around: ERROR_<something>_UNDETERMINED and ERROR_<parameter>_INVAL (ERROR_<thing>_<condition>).

In this case, it probably means that QEMU was either not running (seems unlikely), an unknown version of it was running, or some state got messed up such that QEMU became unrecognisable.

> 
> 
>>     if (i == num) {
>>         LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Virtual device not found");
>> -        rc = ERROR_FAIL;
>> +        rc = ERROR_DISK_VDEV_NOT_FOUND;
> 
> ERROR_VDEV_NOTFOUND ?  Could be used for block-detach, nic-detach.

Though being more specific would do no harm?

> 
>> @@ -2941,7 +2941,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
>>         {
>>             LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Internal error: %s does not exist",
>>                        libxl__sprintf(gc, "%s/frontend", path));
>> -            rc = ERROR_FAIL;
>> +            rc = ERROR_INTERNAL;
> 
> Right, this is "libxl systemwide state seems corrupted".
> 
> ERROR_INTERNAL_BADSTATE_LIBXL ?
> 
>> @@ -299,7 +299,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
>>     }
>>     if (!ok) {
>>         LOG(ERROR, "no suitable backend for disk %s", disk->vdev);
>> -        return ERROR_INVAL;
>> +        return ERROR_DISK_BACKEND_UNDETERMINED;
> 
> ERROR_INVALID_DISK… ?

This follows the reasoning I gave above: the user did not specify the disk backend, and libxl was unable to fill in a suitable default. This is different from ERROR_INVAL_DISK_BACKEND.

> 
>> diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
>> index 9aa7e2e..c687e86 100644
>> --- a/tools/libxl/libxl_qmp.c
>> +++ b/tools/libxl/libxl_qmp.c
>> @@ -817,9 +817,11 @@ static int qmp_run_command(libxl__gc *gc, int domid,
>> 
>>     qmp = libxl__qmp_initialize(gc, domid);
>>     if (!qmp)
>> -        return ERROR_FAIL;
>> +        return ERROR_QMP_INIT;
> 
> This is a bit like mutex initialisation, isn't it ?  It might mean the
> process is out of fds.  Perhaps it would be best to have a function
> which converts an errno value.
> 
> 
>>     rc = qmp_synchronous_send(qmp, cmd, args, callback, opaque, qmp->timeout);
>> +    if (rc < 0)
>> +        rc = ERROR_QMP_SEND;
> 
> We need to distinguish various this probably by errno value, I think.
> 
> At least some of the results are going to be
> 
> ERROR_DEVICEMODEL_{CRASHED,MAD,HUNG}
> 
> In general it is not sufficient to know "what operation was libxl
> trying to do" and more interesting to know "what might cause this
> error”.

Yes, I agree. It’s similar to the xenstore case.

> 
>> +    # Internal error; not actionable by the caller other than by doing something like
>> +    # a retry/reboot (perhaps a libxl bug)
>> +    (ENUM_PREV, "INTERNAL"),
> 
> The message is good.  But please wrap everything to 70 columns.  (That
> goes for your 0/N and commit messages too.)
> 
> 
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add
  2015-06-24 15:11   ` Ian Jackson
@ 2015-06-26 16:36     ` Rob Hoes
  0 siblings, 0 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 16:36 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris


> On 24 Jun 2015, at 16:11, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> Rob Hoes writes ("[PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add"):
>> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
> ...    
>> +    # NIC parameters could not be determined
>> +    (ENUM_PREV, "NIC_SCRIPT_UNDETERMINED"),
>> +    (ENUM_PREV, "NIC_DEVID_UNDETERMINED"),
> 
> Perhaps we could have a coherent naming scheme ?  These are invalid
> parameter errors, aren't they ?  That is, libxl's caller specified
> something wrong.
> 
> So maybe
>  INVALID_NIC_SCRIPT_UNDETERMINED
> ?
> 

Also see my other reply on PATCH 8/9. I think that the distinction between UNDETERMINED and INVALID is useful.

I agree that a naming scheme would be good to have. I’ve tried to do that at least for the codes related to libxl_device_<type> structures, where I’ve included <device_type>_<field> in the error code, plus a condition (description of the problem) such as INVAL or UNDETERMINED.

Rob

> What do others think ?
> 
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add
  2015-06-24 15:28   ` Ian Jackson
@ 2015-06-26 16:49     ` Rob Hoes
  0 siblings, 0 replies; 33+ messages in thread
From: Rob Hoes @ 2015-06-26 16:49 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris

On 24 Jun 2015, at 16:28, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> Rob Hoes writes ("[PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add"):
>> Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
> ...
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index f622981..2f56c6e 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -2008,9 +2008,16 @@ static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
>> static int libxl__resolve_domid(libxl__gc *gc, const char *name,
>>                                 uint32_t *domid)
>> {
>> +    int rc;
>>     if (!name)
>>         return 0;
>> -    return libxl_domain_qualifier_to_domid(CTX, name, domid);
>> +
>> +    rc = libxl_domain_qualifier_to_domid(CTX, name, domid);
>> +
>> +    if (rc < 0)
>> +        return ERROR_DOMAIN_NOTFOUND;
>> +    else
>> +        return rc;
> 
> This doesn't seem right.  You're smashing all the errors to domain not
> found.  Surely libxl_domain_qualifier_to_domid should return the right
> error to start with.

The idea was to translate a lower-level error into one that makes sense in the context of the higher-level function. When talking about this offline, I think that we concluded that this may be the right thing to do in certain cases, but then we should explicitly translate specific error codes rather than doing it by “if (rc < 0)”.

However, in this specific case, I should have probably changed libxl_name_to_domid instead, to return ERROR_DOMAIN_NOTFOUND rather than ERROR_INVAL.

> 
> The rest looks plausible (modulo my quibbles about names in my earlier
> mails).
> 
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/9] libxl idl: add comments to error enum
  2015-06-26 14:33         ` Ian Jackson
@ 2015-06-30 12:19           ` Ian Campbell
  0 siblings, 0 replies; 33+ messages in thread
From: Ian Campbell @ 2015-06-30 12:19 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Euan Harris, Rob Hoes

On Fri, 2015-06-26 at 15:33 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [PATCH RFC 1/9] libxl idl: add comments to error enum"):
> > It would be a bit wrong to have an application specific error code in
> > the library space.
> 
> Yes.
> 
> > Perhaps we should declare some ranges which for specific uses. For
> > instance perhaps 0x7FFF0000..0x7FFFFFFF could be set aside for the
> > application, so xl can define itself some errors which it can use for
> > its own needs without needing to handle two separate error spaces. xl
> > could then define these codes itself.
> 
> All libxl errors are negative and no libxl function returns (either (a
> nonnegative success value) or (a negative error code)).

I'm having trouble parsing the boolean logic here.

I think you are saying that every libxl function either:

Returns 0 on success and a -ve code on error
-or-
Returns positive value success or 0 on error.

But there are no functions which return a positive value on success and
a negative error code on failure.

Is that right? I think I must have misinterpreted since a) I'm not sure
that is true and b) the presence of +ve success values makes the +ve
space unavailable for error codes in the bits of applications which use
them.

> If we were just to promise that then the application has a wide space
> of numbers.
> 
> > Likewise perhaps libxl internal errors should be given their own range,
> > which the application can legitimately expect never to see.
> 
> This is unwise; they might escape...

True. The reason to keep them together was to make it easier to check
for them en masse at library exit points. Doesn't stop them escaping but
might help.

> > I was originally think about this in the context of the xenstore patch
> > in this series, i.e. declaring that some range is used for the existing
> > XS error codes (e.g. 0x3000xxxx is a xenstore code). I'm not sure if
> > that's a good idea or not.
> 
> I have no objection to mapping existing error code spaces into libxl
> ones.  But we need to do this with a bit of care.  I think passing
> xenstore or hypervisor errno numbers directly to the caller will often
> be a bad idea.

Right.

Ian.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-06-30 12:19 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-24 13:47 [PATCH RFC 0/9] libxl error reporting Rob Hoes
2015-06-24 13:47 ` [PATCH RFC 1/9] libxl idl: add comments to error enum Rob Hoes
2015-06-24 15:06   ` Ian Jackson
2015-06-26 14:12     ` Rob Hoes
2015-06-26 14:26       ` Ian Campbell
2015-06-26 14:33         ` Ian Jackson
2015-06-30 12:19           ` Ian Campbell
2015-06-25 15:58   ` Ian Campbell
2015-06-25 16:36     ` Ian Jackson
2015-06-24 13:47 ` [PATCH RFC 2/9] libxl idl: allow implicit enum values Rob Hoes
2015-06-24 15:08   ` Ian Jackson
2015-06-25 15:59     ` Ian Campbell
2015-06-26 14:20       ` Rob Hoes
2015-06-24 13:47 ` [PATCH RFC 3/9] libxl: introduce specific xenstore error codes Rob Hoes
2015-06-24 15:10   ` Ian Jackson
2015-06-26 14:36     ` Rob Hoes
2015-06-26 14:42       ` Rob Hoes
2015-06-24 13:47 ` [PATCH RFC 4/9] libxl: use explicit error codes in libxl_ctx_alloc Rob Hoes
2015-06-24 15:18   ` Ian Jackson
2015-06-24 13:47 ` [PATCH RFC 5/9] libxl: introduce specific JSON error codes Rob Hoes
2015-06-24 15:20   ` Ian Jackson
2015-06-24 13:47 ` [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection Rob Hoes
2015-06-24 15:30   ` Ian Jackson
2015-06-24 13:47 ` [PATCH RFC 7/9] libxl: introduce specific error codes in libxl_device_disk_add Rob Hoes
2015-06-24 15:28   ` Ian Jackson
2015-06-26 16:49     ` Rob Hoes
2015-06-24 13:47 ` [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert Rob Hoes
2015-06-24 15:26   ` Ian Jackson
2015-06-26 16:27     ` Rob Hoes
2015-06-24 13:47 ` [PATCH RFC 9/9] libxl: introduce specific error codes in libxl_device_nic_add Rob Hoes
2015-06-24 15:11   ` Ian Jackson
2015-06-26 16:36     ` Rob Hoes
2015-06-24 15:16 ` [PATCH RFC 0/9] libxl error reporting Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.