xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function
@ 2018-12-06 15:02 George Dunlap
  2018-12-06 15:02 ` [PATCH v2 02/10] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN) George Dunlap
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony Perard, Ian Jackson, Wei Liu, George Dunlap

To reliably kill an untrusted devicemodel, we need to know not only
its pid, but its uid.  In preparation for this, move the userid
determination logic into a helper function.

Create a new field, `dm_runas`, in libxl__domain_build_state to store
the value during domain creation.

This change also removes unnecessary duplication of the argument
construction code.

While here, clean up some minor CODING_STYLE infractions (space
between * and variable name).

No functional change intended.

While here, delete some trailing whitespace.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2:
- Remove unnecessary space between * and dm_runas
- Additional code clean-up
- Delete trailing whitespace

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Anthony Perard <anthony.perard@citrix.com>
---
 tools/libxl/libxl_dm.c       | 260 +++++++++++++++++++----------------
 tools/libxl/libxl_internal.h |  22 +--
 2 files changed, 151 insertions(+), 131 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 5698fe8af3..bbcbc94b6c 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -65,6 +65,131 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
     return logfile_w;
 }
 
+/*
+ *  userlookup_helper_getpwnam(libxl__gc*, const char *user,
+ *                             struct passwd **pwd_r);
+ *
+ *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
+ *                             struct passwd **pwd_r);
+ *
+ *  returns 1 if the user was found, 0 if it was not, -1 on error
+ */
+#define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF)     \
+    static int userlookup_helper_##NAME(libxl__gc *gc,                  \
+                                        SPEC_TYPE spec,                 \
+                                        struct STRUCTNAME *resultbuf,   \
+                                        struct STRUCTNAME **out)        \
+    {                                                                   \
+        struct STRUCTNAME *resultp = NULL;                              \
+        char *buf = NULL;                                               \
+        long buf_size;                                                  \
+        int ret;                                                        \
+                                                                        \
+        buf_size = sysconf(SYSCONF);                                    \
+        if (buf_size < 0) {                                             \
+            buf_size = 2048;                                            \
+            LOG(DEBUG,                                                  \
+    "sysconf failed, setting the initial buffer size to %ld",           \
+                buf_size);                                              \
+        }                                                               \
+                                                                        \
+        while (1) {                                                     \
+            buf = libxl__realloc(gc, buf, buf_size);                    \
+            ret = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);   \
+            if (ret == ERANGE) {                                        \
+                buf_size += 128;                                        \
+                continue;                                               \
+            }                                                           \
+            if (ret != 0)                                               \
+                return ERROR_FAIL;                                      \
+            if (resultp != NULL) {                                      \
+                if (out) *out = resultp;                                \
+                return 1;                                               \
+            }                                                           \
+            return 0;                                                   \
+        }                                                               \
+    }
+
+DEFINE_USERLOOKUP_HELPER(getpwnam, const char*, passwd, _SC_GETPW_R_SIZE_MAX);
+DEFINE_USERLOOKUP_HELPER(getpwuid, uid_t,       passwd, _SC_GETPW_R_SIZE_MAX);
+
+static int libxl__domain_get_device_model_uid(libxl__gc *gc,
+                                              libxl__dm_spawn_state *dmss)
+{
+    int guest_domid = dmss->guest_domid;
+    libxl__domain_build_state *const state = dmss->build_state;
+    const libxl_domain_build_info *b_info = &dmss->guest_config->b_info;
+
+    struct passwd *user_base, user_pwbuf;
+    int ret;
+    char *user;
+
+    /* Only qemu-upstream can run as a different uid */
+    if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
+        return 0;
+
+    user = b_info->device_model_user;
+    if (user)
+        goto end_search;
+
+    if (!libxl_defbool_val(b_info->dm_restrict)) {
+        LOGD(DEBUG, guest_domid,
+             "dm_restrict disabled, starting QEMU as root");
+        return 0;
+    }
+
+    user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
+    ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
+    if (ret < 0)
+        return ret;
+    if (ret > 0)
+        goto end_search;
+
+    ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+                                         &user_pwbuf, &user_base);
+    if (ret < 0)
+        return ret;
+    if (ret > 0) {
+        struct passwd *user_clash, user_clash_pwbuf;
+        uid_t intended_uid = user_base->pw_uid + guest_domid;
+        ret = userlookup_helper_getpwuid(gc, intended_uid,
+                                         &user_clash_pwbuf, &user_clash);
+        if (ret < 0)
+            return ret;
+        if (ret > 0) {
+            LOGD(ERROR, guest_domid,
+                 "wanted to use uid %ld (%s + %d) but that is user %s !",
+                 (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
+                 guest_domid, user_clash->pw_name);
+            return ERROR_FAIL;
+        }
+        LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
+        user = GCSPRINTF("%ld:%ld", (long)intended_uid,
+                         (long)user_base->pw_gid);
+        goto end_search;
+    }
+
+    user = LIBXL_QEMU_USER_SHARED;
+    ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
+    if (ret < 0)
+        return ret;
+    if (ret > 0) {
+        LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
+             LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+        goto end_search;
+    }
+
+    LOGD(ERROR, guest_domid,
+         "Could not find user %s%d or %s or range base pseudo-user %s, cannot restrict",
+         LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
+         LIBXL_QEMU_USER_RANGE_BASE);
+    return ERROR_INVAL;
+
+end_search:
+    state->dm_runas = user;
+    return 0;
+}
+
 const char *libxl__domain_device_model(libxl__gc *gc,
                                        const libxl_domain_build_info *info)
 {
@@ -737,54 +862,6 @@ libxl__detect_gfx_passthru_kind(libxl__gc *gc,
     return LIBXL_GFX_PASSTHRU_KIND_DEFAULT;
 }
 
-/*
- *  userlookup_helper_getpwnam(libxl__gc*, const char *user,
- *                             struct passwd **pwd_r);
- *
- *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
- *                             struct passwd **pwd_r);
- *
- *  returns 1 if the user was found, 0 if it was not, -1 on error
- */
-#define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF)     \
-    static int userlookup_helper_##NAME(libxl__gc *gc,                  \
-                                        SPEC_TYPE spec,                 \
-                                        struct STRUCTNAME *resultbuf,   \
-                                        struct STRUCTNAME **out)        \
-    {                                                                   \
-        struct STRUCTNAME *resultp = NULL;                              \
-        char *buf = NULL;                                               \
-        long buf_size;                                                  \
-        int ret;                                                        \
-                                                                        \
-        buf_size = sysconf(SYSCONF);                                    \
-        if (buf_size < 0) {                                             \
-            buf_size = 2048;                                            \
-            LOG(DEBUG,                                                  \
-    "sysconf failed, setting the initial buffer size to %ld",           \
-                buf_size);                                              \
-        }                                                               \
-                                                                        \
-        while (1) {                                                     \
-            buf = libxl__realloc(gc, buf, buf_size);                    \
-            ret = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);   \
-            if (ret == ERANGE) {                                        \
-                buf_size += 128;                                        \
-                continue;                                               \
-            }                                                           \
-            if (ret != 0)                                               \
-                return ERROR_FAIL;                                      \
-            if (resultp != NULL) {                                      \
-                if (out) *out = resultp;                                \
-                return 1;                                               \
-            }                                                           \
-            return 0;                                                   \
-        }                                                               \
-    }
-
-DEFINE_USERLOOKUP_HELPER(getpwnam, const char*, passwd, _SC_GETPW_R_SIZE_MAX);
-DEFINE_USERLOOKUP_HELPER(getpwuid, uid_t,       passwd, _SC_GETPW_R_SIZE_MAX);
-
 /* colo mode */
 enum {
     LIBXL__COLO_NONE = 0,
@@ -928,11 +1005,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
     const char *keymap = dm_keymap(guest_config);
     char *machinearg;
     flexarray_t *dm_args, *dm_envs;
-    int i, connection, devid, ret;
+    int i, connection, devid;
     uint64_t ram_size;
     const char *path, *chardev;
-    char *user = NULL;
-    struct passwd *user_base, user_pwbuf;
 
     dm_args = flexarray_make(gc, 16, 1);
     dm_envs = flexarray_make(gc, 16, 1);
@@ -1414,10 +1489,10 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
         char *chroot_dir = GCSPRINTF("%s/qemu-root-%d",
                                       libxl__run_dir_path(), guest_domid);
         int r;
-        
+
         flexarray_append(dm_args, "-xen-domid-restrict");
 
-        /* 
+        /*
          * Run QEMU in a chroot at XEN_RUN_DIR/qemu-root-<domid>
          *
          * There is no library function to do the equivalent of `rm
@@ -1425,7 +1500,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
          * able to write any files, as the chroot would be owned by
          * root, but it would be running as an unprivileged process.
          * So in theory, old chroots should always be empty.
-         * 
+         *
          * rmdir the directory before attempting to create
          * it; if it returns anything other than ENOENT, fail domain
          * creation.
@@ -1436,7 +1511,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
                   "failed to remove existing chroot dir %s", chroot_dir);
             return ERROR_FAIL;
         }
-        
+
         for (;;) {
             r = mkdir(chroot_dir, 0000);
             if (!r)
@@ -1538,7 +1613,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
                 continue;
             }
 
-            /* 
+            /*
              * If qemu isn't doing the interpreting, the parameter is
              * always raw
              */
@@ -1563,7 +1638,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
                     continue;
                 }
 
-                /* 
+                /*
                  * We can't call libxl__blktap_devpath from
                  * libxl__device_disk_find_local_path for now because
                  * the bootloader is called before the disks are set
@@ -1685,71 +1760,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
             break;
         }
 
-        if (b_info->device_model_user) {
-            user = b_info->device_model_user;
-            goto end_search;
-        }
-
-        if (!libxl_defbool_val(b_info->dm_restrict)) {
-            LOGD(DEBUG, guest_domid,
-                 "dm_restrict disabled, starting QEMU as root");
-            goto end_search;
-        }
-
-        user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
-        ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
-        if (ret < 0)
-            return ret;
-        if (ret > 0)
-            goto end_search;
-
-        ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
-                                         &user_pwbuf, &user_base);
-        if (ret < 0)
-            return ret;
-        if (ret > 0) {
-            struct passwd *user_clash, user_clash_pwbuf;
-            uid_t intended_uid = user_base->pw_uid + guest_domid;
-            ret = userlookup_helper_getpwuid(gc, intended_uid,
-                                             &user_clash_pwbuf, &user_clash);
-            if (ret < 0)
-                return ret;
-            if (ret > 0) {
-                LOGD(ERROR, guest_domid,
-                     "wanted to use uid %ld (%s + %d) but that is user %s !",
-                     (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
-                     guest_domid, user_clash->pw_name);
-                return ERROR_FAIL;
-            }
-            LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
+        if (state->dm_runas && strcmp(state->dm_runas, "root")) {
             flexarray_append(dm_args, "-runas");
-            flexarray_append(dm_args,
-                             GCSPRINTF("%ld:%ld", (long)intended_uid,
-                                       (long)user_base->pw_gid));
-            user = NULL; /* we have taken care of it */
-            goto end_search;
-        }
-
-        user = LIBXL_QEMU_USER_SHARED;
-        ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
-        if (ret < 0)
-            return ret;
-        if (ret > 0) {
-            LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
-                    LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
-            goto end_search;
-        }
-
-        LOGD(ERROR, guest_domid,
- "Could not find user %s%d or %s or range base pseudo-user %s, cannot restrict",
-             LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
-             LIBXL_QEMU_USER_RANGE_BASE);
-        return ERROR_INVAL;
-
-end_search:
-        if (user != NULL && strcmp(user, "root")) {
-            flexarray_append(dm_args, "-runas");
-            flexarray_append(dm_args, user);
+            flexarray_append(dm_args, state->dm_runas);
         }
     }
     flexarray_append(dm_args, NULL);
@@ -2303,6 +2316,11 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
         rc = ERROR_FAIL;
         goto out;
     }
+
+    rc = libxl__domain_get_device_model_uid(gc, dmss);
+    if (rc)
+        goto out;
+
     rc = libxl__build_device_model_args(gc, dm, domid, guest_config,
                                           &args, &envs, state,
                                           &dm_state_fd);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e498435e16..c4a43bd0b7 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -332,7 +332,7 @@ struct libxl__ev_evtchn {
 typedef struct libxl__ev_watch_slot {
     LIBXL_SLIST_ENTRY(struct libxl__ev_watch_slot) empty;
 } libxl__ev_watch_slot;
-    
+
 _hidden libxl__ev_xswatch *libxl__watch_slot_contents(libxl__gc *gc,
                                                       int slotnum);
 
@@ -484,7 +484,7 @@ struct libxl__ctx {
         death_list /* sorted by domid */,
         death_reported;
     libxl__ev_xswatch death_watch;
-    
+
     LIBXL_LIST_HEAD(, libxl_evgen_disk_eject) disk_eject_evgens;
 
     const libxl_childproc_hooks *childproc_hooks;
@@ -1131,9 +1131,11 @@ typedef struct {
 
     libxl__file_reference pv_kernel;
     libxl__file_reference pv_ramdisk;
-    const char * shim_path;
-    const char * shim_cmdline;
-    const char * pv_cmdline;
+    const char *shim_path;
+    const char *shim_cmdline;
+    const char *pv_cmdline;
+
+    char *dm_runas;
 
     xen_vmemrange_t *vmemranges;
     uint32_t num_vmemranges;
@@ -1471,7 +1473,7 @@ _hidden void libxl__spawn_init(libxl__spawn_state*);
  *
  * what: string describing the spawned process, used for logging
  *
- * Logs errors.  A copy of "what" is taken. 
+ * Logs errors.  A copy of "what" is taken.
  * Return values:
  *  < 0   error, *spawn is now Idle and need not be detached
  *   +1   caller is the parent, *spawn is Attached and must be detached
@@ -2750,10 +2752,10 @@ static inline void libxl__device_disk_local_init(libxl__disk_local_state *dls)
     dls->rc = 0;
 }
 
-/* 
+/*
  * See if we can find a way to access a disk locally
  */
-_hidden char * libxl__device_disk_find_local_path(libxl__gc *gc, 
+_hidden char * libxl__device_disk_find_local_path(libxl__gc *gc,
                                                   libxl_domid guest_domid,
                                                   const libxl_device_disk *disk,
                                                   bool qdisk_direct);
@@ -3774,7 +3776,7 @@ struct libxl__dm_spawn_state {
 
 _hidden void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state*);
 
-/* 
+/*
  * Called after forking but before executing the local devicemodel.
  */
 _hidden int libxl__local_dm_preexec_restrict(libxl__gc *gc);
@@ -3963,7 +3965,7 @@ _hidden void libxl__remus_restore_setup(libxl__egc *egc,
  */
 #define GCNEW_ARRAY(var, nmemb)                                 \
     ((var) = libxl__calloc((gc), (nmemb), sizeof(*(var))))
-    
+
 /*
  * Expression statement  <type> *GCREALLOC_ARRAY(<type> *var, size_t nmemb);
  * Uses                  libxl__gc *gc;
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 02/10] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-06 15:02 ` [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper George Dunlap
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

QEMU_USER_BASE allows a user to specify the UID to use when running
the devicemodel for a specific domain number.  Unfortunately, this is
not really practical: It requires nearly 32,000 entries in
/etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.

Remove support for QEMU_USER_BASE.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
NB that I've chosen not to update the xl.cfg man page at this time; it
needs a lot of other updates as well, which would be easier to do all
at once at the end.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dm.c       | 16 ++++------------
 tools/libxl/libxl_internal.h |  1 -
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index bbcbc94b6c..6024d4b7b8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -138,13 +138,6 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
         return 0;
     }
 
-    user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
-    ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
-    if (ret < 0)
-        return ret;
-    if (ret > 0)
-        goto end_search;
-
     ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
                                          &user_pwbuf, &user_base);
     if (ret < 0)
@@ -174,15 +167,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
     if (ret < 0)
         return ret;
     if (ret > 0) {
-        LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
-             LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+        LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
+             LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
         goto end_search;
     }
 
     LOGD(ERROR, guest_domid,
-         "Could not find user %s%d or %s or range base pseudo-user %s, cannot restrict",
-         LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
-         LIBXL_QEMU_USER_RANGE_BASE);
+         "Could not find user %s or range base pseudo-user %s, cannot restrict",
+         LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE);
     return ERROR_INVAL;
 
 end_search:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c4a43bd0b7..b147f3803c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4387,7 +4387,6 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
                                             int *datalen_r);
 
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
-#define LIBXL_QEMU_USER_BASE   LIBXL_QEMU_USER_PREFIX"-domid"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
 
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
  2018-12-06 15:02 ` [PATCH v2 02/10] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN) George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-12 15:26   ` Ian Jackson
  2018-12-06 15:02 ` [PATCH v2 04/10] dm_depriv: Describe expected usage of device_model_user parameter George Dunlap
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

Bring conventions more in line with libxl__xs_read_checked():
- If found, return 0 and set pointer to non-NULL
- If not found, return 0 and set pointer to NULL
- On error, return libxl-style error number.

Update documentation to match.

Use CODING_STYLE compliant `r` rather than `ret`.

On error, log the error code before returning instead of discarding
it.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dm.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 6024d4b7b8..959fa0f46c 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -72,7 +72,13 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
  *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
  *                             struct passwd **pwd_r);
  *
- *  returns 1 if the user was found, 0 if it was not, -1 on error
+ *  If the user is found, return 0 and set *pwd_r to the appropriat
+ *  value.
+ *
+ *  If the user is not found but there are no errors, return 0
+ *  and set *pwd_r to NULL.
+ *
+ *  On error, return a libxl-style error code.
  */
 #define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF)     \
     static int userlookup_helper_##NAME(libxl__gc *gc,                  \
@@ -83,7 +89,7 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
         struct STRUCTNAME *resultp = NULL;                              \
         char *buf = NULL;                                               \
         long buf_size;                                                  \
-        int ret;                                                        \
+        int r;                                                          \
                                                                         \
         buf_size = sysconf(SYSCONF);                                    \
         if (buf_size < 0) {                                             \
@@ -95,17 +101,16 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
                                                                         \
         while (1) {                                                     \
             buf = libxl__realloc(gc, buf, buf_size);                    \
-            ret = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);   \
-            if (ret == ERANGE) {                                        \
+            r = NAME##_r(spec, resultbuf, buf, buf_size, &resultp);     \
+            if (r == ERANGE) {                                          \
                 buf_size += 128;                                        \
                 continue;                                               \
             }                                                           \
-            if (ret != 0)                                               \
+            if (r != 0) {                                               \
+                LOGEV(ERROR, r, "Looking up username/uid with " #NAME); \
                 return ERROR_FAIL;                                      \
-            if (resultp != NULL) {                                      \
-                if (out) *out = resultp;                                \
-                return 1;                                               \
             }                                                           \
+            *out = resultp;                                             \
             return 0;                                                   \
         }                                                               \
     }
@@ -142,14 +147,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
                                          &user_pwbuf, &user_base);
     if (ret < 0)
         return ret;
-    if (ret > 0) {
+    if (user_base) {
         struct passwd *user_clash, user_clash_pwbuf;
         uid_t intended_uid = user_base->pw_uid + guest_domid;
         ret = userlookup_helper_getpwuid(gc, intended_uid,
                                          &user_clash_pwbuf, &user_clash);
         if (ret < 0)
             return ret;
-        if (ret > 0) {
+        if (user_clash) {
             LOGD(ERROR, guest_domid,
                  "wanted to use uid %ld (%s + %d) but that is user %s !",
                  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
@@ -163,10 +168,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
     }
 
     user = LIBXL_QEMU_USER_SHARED;
-    ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
+    ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
     if (ret < 0)
         return ret;
-    if (ret > 0) {
+    if (user_base) {
         LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
              LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
         goto end_search;
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 04/10] dm_depriv: Describe expected usage of device_model_user parameter
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
  2018-12-06 15:02 ` [PATCH v2 02/10] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN) George Dunlap
  2018-12-06 15:02 ` [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-06 15:02 ` [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid George Dunlap
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony Perard, Ian Jackson, Wei Liu, George Dunlap

A number of subsequent patches rely on as-yet undefined behavior for
what the `device_model_user` parameter does.  Rather than implement it
incorrectly (or randomly), or remove the feature, describe an expected
usage for the feature.  Further patches will make decisions based on
this expected usage.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2:
- Remove stale comment about device_model_user not being ready

RFC: As we'll see in a later patch, this implementation is still
incomplete: we need a `reaper` uid from which to kill uids.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Anthony Perard <anthony.perard@citrix.com>
---
 docs/features/qemu-deprivilege.pandoc | 17 +++++++++++++++++
 tools/libxl/libxl_types.idl           |  1 -
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/features/qemu-deprivilege.pandoc b/docs/features/qemu-deprivilege.pandoc
index f941525189..49b571980e 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -66,6 +66,23 @@ this, create a user named `xen-qemuuser-shared`; for example:
 
     adduser --no-create-home --system xen-qemuuser-shared
 
+A final way to set up a separate process for qemus is to allocate one
+UID per VM, and set the UID in the domain config file with the
+`device_model_user` argument.  For example, suppose you have a VM
+named `c6-01`.  You might do the following:
+
+    adduser --system --no-create-home --group xen-qemuuuser-c6-01
+
+And then in your config file, the following line:
+
+    device_model_user="xen-qemuuser-c6-01"
+
+NOTE: It is important when using `device_model_user` that EACH VM HAVE
+A SEPARATE UID, and that none of these UIDs map to root.  xl will
+throw an error a uid maps to zero, but not if multiple VMs have the
+same uid.  Multiple VMs with the same device model uid will cause
+problems.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 51cf06a3a2..141c46e42a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -495,7 +495,6 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("device_model",     string),
     ("device_model_ssidref", uint32),
     ("device_model_ssid_label", string),
-    # device_model_user is not ready for use yet
     ("device_model_user", string),
 
     # extra parameters pass directly to qemu, NULL terminated
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
                   ` (2 preceding siblings ...)
  2018-12-06 15:02 ` [PATCH v2 04/10] dm_depriv: Describe expected usage of device_model_user parameter George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-12 15:45   ` Ian Jackson
  2018-12-06 15:02 ` [PATCH v2 06/10] libxl: Move qmp cleanup into devicemodel destroy function George Dunlap
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

At the moment, we check for equivalence to literal "root" before
deciding whether to add the `runas` command-line option to QEMU.  This
is unsatisfactory for several reasons.

First, just because the string doesn't match "root" doesn't mean the
final uid won't end up being zero; in particular, the range_base
calculations may end up producing "0:NNN", which would be root in any
case.

Secondly, it's almost certainly a configuration error if the resulting
uid ends up to be zero; rather than silently do what was specified but
probably not intended, throw an error.

To fix this, check for root once in
libxl__domain_get_device_model_uid.  If the result is root, return an
error; if appropriate, set the user.

After that, assume that the presence of state->dm_runas implies that a
`runas` argument should be constructed.

One side effect of this is to check whether device_model_user exists
before passing it to qemu, resulting in better error reporting.

While we're here:
- Refactor the function to use the "goto out" idiom in most cases
- Use 'rc' rather than 'ret', in line with CODING_STYLE
- Change the error returned in the "uid collision" case to
  ERROR_DEVICE_EXISTS

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v2:
- Refactor to use `out` rather than multiple labels
- Only check for root once
- Use 'out' rather than direct returns for errors (only use direct returns
  for early `succeed-without-setting-runas` paths)
- Use `rc` rather than `ret` to more closely align with CODING_STYLE
- Fill out comments about the cases we're handling
- Return ERROR_DEVICE_EXISTS rather than ERROR_FAIL if there's another
  username that maps to our calculated uid
- Report an error if the specified device_model_user doesn't exist

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dm.c | 86 +++++++++++++++++++++++++++++++-----------
 1 file changed, 65 insertions(+), 21 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 959fa0f46c..7ff3e3160a 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -126,65 +126,109 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
     const libxl_domain_build_info *b_info = &dmss->guest_config->b_info;
 
     struct passwd *user_base, user_pwbuf;
-    int ret;
+    int rc;
     char *user;
+    uid_t intended_uid;
 
     /* Only qemu-upstream can run as a different uid */
     if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
         return 0;
 
+    /*
+     * If device_model_user is present, set `-runas` even if
+     * dm_restrict isn't in use
+     */
     user = b_info->device_model_user;
-    if (user)
-        goto end_search;
+    if (user) {
+        rc = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
+        if (rc)
+            goto out;
+
+        if (!user_base) {
+            LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
+                 user);
+            rc = ERROR_INVAL;
+        } else
+            intended_uid = user_base->pw_uid;
 
+        goto out;
+    }
+
+    /*
+     * If dm_restrict isn't set, and we don't have a specified user, don't
+     * bother setting a `-runas` parameter.
+     */
     if (!libxl_defbool_val(b_info->dm_restrict)) {
         LOGD(DEBUG, guest_domid,
              "dm_restrict disabled, starting QEMU as root");
         return 0;
     }
 
-    ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+    /*
+     * dm_restrict is set, but device_model_user isn't set; look for
+     * QEMU_USER_BASE_RANGE
+     */
+    rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
                                          &user_pwbuf, &user_base);
-    if (ret < 0)
-        return ret;
+    if (rc)
+        goto out;
     if (user_base) {
         struct passwd *user_clash, user_clash_pwbuf;
-        uid_t intended_uid = user_base->pw_uid + guest_domid;
-        ret = userlookup_helper_getpwuid(gc, intended_uid,
+
+        intended_uid = user_base->pw_uid + guest_domid;
+        rc = userlookup_helper_getpwuid(gc, intended_uid,
                                          &user_clash_pwbuf, &user_clash);
-        if (ret < 0)
-            return ret;
+        if (rc < 0)
+            goto out;
         if (user_clash) {
             LOGD(ERROR, guest_domid,
                  "wanted to use uid %ld (%s + %d) but that is user %s !",
                  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
                  guest_domid, user_clash->pw_name);
-            return ERROR_FAIL;
+            rc = ERROR_DEVICE_EXISTS;
+            goto out;
         }
+
         LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
         user = GCSPRINTF("%ld:%ld", (long)intended_uid,
                          (long)user_base->pw_gid);
-        goto end_search;
+        goto out;
     }
 
+    /*
+     * We couldn't find QEMU_USER_BASE_RANGE; look for QEMU_USER_SHARED
+     */
     user = LIBXL_QEMU_USER_SHARED;
-    ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
-    if (ret < 0)
-        return ret;
+    rc = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
+    if (rc < 0)
+        goto out;
     if (user_base) {
         LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
              LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
-        goto end_search;
+        intended_uid = user_base->pw_uid;
+        goto out;
     }
 
+    /*
+     * dm_depriv is set, but we can't find a non-root uid to run as;
+     * fail domain creation
+     */
     LOGD(ERROR, guest_domid,
          "Could not find user %s or range base pseudo-user %s, cannot restrict",
          LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE);
-    return ERROR_INVAL;
+    rc = ERROR_INVAL;
 
-end_search:
-    state->dm_runas = user;
-    return 0;
+out:
+    if (!rc) {
+        if (intended_uid == 0) {
+            LOGD(ERROR, guest_domid, "intended_uid is 0 (root)!");
+            return ERROR_INVAL;
+        }
+
+        state->dm_runas = user;
+    }
+
+    return rc;
 }
 
 const char *libxl__domain_device_model(libxl__gc *gc,
@@ -1757,7 +1801,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
             break;
         }
 
-        if (state->dm_runas && strcmp(state->dm_runas, "root")) {
+        if (state->dm_runas) {
             flexarray_append(dm_args, "-runas");
             flexarray_append(dm_args, state->dm_runas);
         }
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 06/10] libxl: Move qmp cleanup into devicemodel destroy function
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
                   ` (3 preceding siblings ...)
  2018-12-06 15:02 ` [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-06 15:02 ` [PATCH v2 07/10] libxl: Make killing of device model asynchronous George Dunlap
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

Removing the qmp connection is logically part of the device model
destruction; having the caller destroy it is a mild layering
violation.

Move libxl__qmp_cleanup() into libxl__destroy_device_model().  This
will make it easier when we make devicemodel destruction asynchronous.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dm.c     | 9 +++++++--
 tools/libxl/libxl_domain.c | 2 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 7ff3e3160a..db10b692dc 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2679,12 +2679,17 @@ out:
 
 int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
 {
+    int rc;
     char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
     if (!xs_rm(CTX->xsh, XBT_NULL, path))
         LOGD(ERROR, domid, "xs_rm failed for %s", path);
     /* We should try to destroy the device model anyway. */
-    return kill_device_model(gc,
-                GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+    rc = kill_device_model(gc,
+              GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+    
+    libxl__qmp_cleanup(gc, domid);
+
+    return rc;
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 3377bba994..d46b97dedf 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1069,8 +1069,6 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
     if (dm_present) {
         if (libxl__destroy_device_model(gc, domid) < 0)
             LOGD(ERROR, domid, "libxl__destroy_device_model failed");
-
-        libxl__qmp_cleanup(gc, domid);
     }
     dis->drs.ao = ao;
     dis->drs.domid = domid;
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 07/10] libxl: Make killing of device model asynchronous
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
                   ` (4 preceding siblings ...)
  2018-12-06 15:02 ` [PATCH v2 06/10] libxl: Move qmp cleanup into devicemodel destroy function George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-12 15:46   ` Ian Jackson
  2018-12-06 15:02 ` [PATCH v2 08/10] libxl: Kill QEMU by uid when possible George Dunlap
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

Or at least, give it an asynchronous interface so that we can make it
actually asynchronous in subsequent patches.

Create state structures and callback function signatures.  Add the
state structure to libxl__destroy_domid_state.  Break
libxl__destroy_domid down into two functions.

No functional change intended.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v2:
- Note that libxl__devicemodel_destroy_cb may be called reentrantly

NB that I retain the comment before libxl__destroy_device_model, in
spite of the fact that it looks "pointless", to separate it logically
from the previous prototype.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dm.c       | 11 +++++++---
 tools/libxl/libxl_domain.c   | 40 ++++++++++++++++++++++++++++--------
 tools/libxl/libxl_internal.h | 20 ++++++++++++++++--
 3 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index db10b692dc..7f9c6a62fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2677,19 +2677,24 @@ out:
     return rc;
 }
 
-int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
+void libxl__destroy_device_model(libxl__egc *egc,
+                                 libxl__destroy_devicemodel_state *ddms)
 {
+    STATE_AO_GC(ddms->ao);
     int rc;
+    int domid = ddms->domid;
     char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+
     if (!xs_rm(CTX->xsh, XBT_NULL, path))
         LOGD(ERROR, domid, "xs_rm failed for %s", path);
+
     /* We should try to destroy the device model anyway. */
     rc = kill_device_model(gc,
               GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
-    
+
     libxl__qmp_cleanup(gc, domid);
 
-    return rc;
+    ddms->callback(egc, ddms, rc);
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index d46b97dedf..0ce1ba1327 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1008,6 +1008,10 @@ static void destroy_finish_check(libxl__egc *egc,
 }
 
 /* Callbacks for libxl__destroy_domid */
+static void dm_destroy_cb(libxl__egc *egc,
+                          libxl__destroy_devicemodel_state *ddms,
+                          int rc);
+
 static void devices_destroy_cb(libxl__egc *egc,
                                libxl__devices_remove_state *drs,
                                int rc);
@@ -1066,16 +1070,18 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
     if (rc < 0) {
         LOGEVD(ERROR, rc, domid, "xc_domain_pause failed");
     }
+
     if (dm_present) {
-        if (libxl__destroy_device_model(gc, domid) < 0)
-            LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+        dis->ddms.ao = ao;
+        dis->ddms.domid = domid;
+        dis->ddms.callback = dm_destroy_cb;
+
+        libxl__destroy_device_model(egc, &dis->ddms);
+        return;
+    } else {
+        dm_destroy_cb(egc, &dis->ddms, 0);
+        return;
     }
-    dis->drs.ao = ao;
-    dis->drs.domid = domid;
-    dis->drs.callback = devices_destroy_cb;
-    dis->drs.force = 1;
-    libxl__devices_destroy(egc, &dis->drs);
-    return;
 
 out:
     assert(rc);
@@ -1083,6 +1089,24 @@ out:
     return;
 }
 
+static void dm_destroy_cb(libxl__egc *egc,
+                          libxl__destroy_devicemodel_state *ddms,
+                          int rc)
+{
+    libxl__destroy_domid_state *dis = CONTAINER_OF(ddms, *dis, ddms);
+    STATE_AO_GC(dis->ao);
+    uint32_t domid = dis->domid;
+
+    if (rc < 0)
+        LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+
+    dis->drs.ao = ao;
+    dis->drs.domid = domid;
+    dis->drs.callback = devices_destroy_cb;
+    dis->drs.force = 1;
+    libxl__devices_destroy(egc, &dis->drs);
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
                                libxl__devices_remove_state *drs,
                                int rc)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b147f3803c..f9e0bf6578 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1705,8 +1705,6 @@ _hidden int libxl__wait_for_device_model_deprecated(libxl__gc *gc,
                                                       void *userdata),
                                 void *check_callback_userdata);
 
-_hidden int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid);
-
 _hidden const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *g_cfg);
 
 _hidden char *libxl__abs_path(libxl__gc *gc, const char *s, const char *path);
@@ -3672,6 +3670,7 @@ extern const struct libxl_device_type *device_type_tbl[];
 
 typedef struct libxl__domain_destroy_state libxl__domain_destroy_state;
 typedef struct libxl__destroy_domid_state libxl__destroy_domid_state;
+typedef struct libxl__destroy_devicemodel_state libxl__destroy_devicemodel_state;
 typedef struct libxl__devices_remove_state libxl__devices_remove_state;
 
 typedef void libxl__domain_destroy_cb(libxl__egc *egc,
@@ -3682,6 +3681,10 @@ typedef void libxl__domid_destroy_cb(libxl__egc *egc,
                                      libxl__destroy_domid_state *dis,
                                      int rc);
 
+typedef void libxl__devicemodel_destroy_cb(libxl__egc *egc,
+                                     libxl__destroy_devicemodel_state *ddms,
+                                     int rc);
+
 typedef void libxl__devices_remove_callback(libxl__egc *egc,
                                             libxl__devices_remove_state *drs,
                                             int rc);
@@ -3697,6 +3700,14 @@ struct libxl__devices_remove_state {
     int num_devices;
 };
 
+struct libxl__destroy_devicemodel_state {
+    /* filled in by user */
+    libxl__ao *ao;
+    uint32_t domid;
+    libxl__devicemodel_destroy_cb *callback; /* May be called re-entrantly */
+    /* private to implementation */
+};
+
 struct libxl__destroy_domid_state {
     /* filled in by user */
     libxl__ao *ao;
@@ -3704,6 +3715,7 @@ struct libxl__destroy_domid_state {
     libxl__domid_destroy_cb *callback;
     /* private to implementation */
     libxl__devices_remove_state drs;
+    libxl__destroy_devicemodel_state ddms;
     libxl__ev_child destroyer;
     bool soft_reset;
 };
@@ -3735,6 +3747,10 @@ _hidden void libxl__domain_destroy(libxl__egc *egc,
 _hidden void libxl__destroy_domid(libxl__egc *egc,
                                   libxl__destroy_domid_state *dis);
 
+/* Used to detroy the device model */
+_hidden void libxl__destroy_device_model(libxl__egc *egc,
+                                         libxl__destroy_devicemodel_state *ddms);
+
 /* Entry point for devices destruction */
 _hidden void libxl__devices_destroy(libxl__egc *egc,
                                     libxl__devices_remove_state *drs);
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 08/10] libxl: Kill QEMU by uid when possible
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
                   ` (5 preceding siblings ...)
  2018-12-06 15:02 ` [PATCH v2 07/10] libxl: Make killing of device model asynchronous George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-12 16:17   ` Ian Jackson
  2018-12-06 15:02 ` [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid George Dunlap
  2018-12-06 15:02 ` [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed George Dunlap
  8 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony Perard, Ian Jackson, Wei Liu, George Dunlap

The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
one specific domain ID.  This domain ID will probably eventually be
used again.  It is therefore necessary to make absolutely sure that a
rogue QEMU process cannot hang around after its domain has exited.

Killing QEMU by pid is insufficient in this situation, because QEMU
may be able to fork() to escape killing.  It is surprisingly tricky to
kill a process which can call fork() without races; the only reliable
way is to use kill(-1) to kill all processes with a given uid.

We can use this method only when we're sure that there's only one QEMU
instance per uid.  Add a dm_uid into the domain_build_state struct,
and set it in libxl__domain_get_device_model_uid() when it's safe to
kill by UID.  Store this in xenstore next to device-model-pid.

On domain destroy, check to see if device-model-uid is present in
xenstore.  If so, fork off a reaper process, setuid to that uid, and
do kill(-9) to kill all uids of that type.  Otherwise, carry on
destroying by pid.

While we're here, make libxl__destroy_device_model() consistently:
 1. Return an error when anything fails
 2. But continue to do as much clean-up as possible

NOTE that this is not yet completely safe: with ruid == dm_uid, the
device model may be able to kill(-9) the 'reaper' process before the
reaper process can kill it.  Further patches will address this.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v2:
- Rebase on top of previous "goto out" refactoring
- Rather than introducing a `uid` string, Introduce a boolean,
  "kill_by_uid"; and do the GCSPRINTF() once if that is set.
- Fix typo "starting"
- Always call kill_device_model_uid_cb(); only call
  libxl__qmp_cleanup() from there
- Refactor libxl__destroy_device_model() to follow "goto out on error"
  pattern
- Retain and report errors even when we continue trying to clean up
- Report errors removing DM xenstore directory (except -ENOENT)
- Report errors reading device-model-uid
- Put "kill by uid" child logic in a separate function
- Refactor "kill by uid" to follow "goto out on error" pattern
- Change "kill by uid" to return libxl-style error, rather than errno
- Document the intention of when to return errors
- Assert that dm_uid != 0
- Log what the reaper process setresuid'd to

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Anthony Perard <anthony.perard@citrix.com>
---
 tools/libxl/libxl_dm.c       | 206 +++++++++++++++++++++++++++++++++--
 tools/libxl/libxl_internal.h |   4 +-
 2 files changed, 200 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 7f9c6a62fe..53fdf8daf7 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -129,6 +129,8 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
     int rc;
     char *user;
     uid_t intended_uid;
+    bool kill_by_uid;
+
 
     /* Only qemu-upstream can run as a different uid */
     if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
@@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
             LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
                  user);
             rc = ERROR_INVAL;
-        } else
+        } else {
             intended_uid = user_base->pw_uid;
+            kill_by_uid = true;
+        }
 
         goto out;
     }
@@ -192,11 +196,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
         LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
         user = GCSPRINTF("%ld:%ld", (long)intended_uid,
                          (long)user_base->pw_gid);
+        kill_by_uid = true;
         goto out;
     }
 
     /*
-     * We couldn't find QEMU_USER_BASE_RANGE; look for QEMU_USER_SHARED
+     * We couldn't find QEMU_USER_BASE_RANGE; look for
+     * QEMU_USER_SHARED.  NB for QEMU_USER_SHARED, all QEMU will run
+     * as the same UID, we can't kill by uid; therefore don't set uid.
      */
     user = LIBXL_QEMU_USER_SHARED;
     rc = userlookup_helper_getpwnam(gc, user, &user_pwbuf, &user_base);
@@ -206,6 +213,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
         LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
              LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
         intended_uid = user_base->pw_uid;
+        kill_by_uid = false;
         goto out;
     }
 
@@ -226,6 +234,8 @@ out:
         }
 
         state->dm_runas = user;
+        if (kill_by_uid)
+            state->dm_uid = GCSPRINTF("%ld", (long)intended_uid);
     }
 
     return rc;
@@ -2408,6 +2418,15 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
 
     const char *dom_path = libxl__xs_get_dompath(gc, domid);
 
+    /*
+     * If we're starting the dm with a non-root UID, save the UID so
+     * that we can reliably kill it and any subprocesses
+     */
+    if (state->dm_uid)
+        libxl__xs_printf(gc, XBT_NULL,
+                         GCSPRINTF("%s/image/device-model-uid", dom_path),
+                         "%s", state->dm_uid);
+
     if (vnc && vnc->passwd) {
         /* This xenstore key will only be used by qemu-xen-traditionnal.
          * The code to supply vncpasswd to qemu-xen is later. */
@@ -2677,25 +2696,194 @@ out:
     return rc;
 }
 
+/* Asynchronous device model destroy functions */
+
+static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
+                                       const char *dm_uid_str);
+
+static void kill_device_model_uid_cb(libxl__egc *egc,
+                                     libxl__ev_child *destroyer,
+                                     pid_t pid, int status);
+
+/*
+ * If we have a uid, we shouldn't kill by pid.  This is because a
+ * hostile QEMU might have exited, in which case the pid we have may
+ * be that of another process.
+ *
+ * The running devicemodel has permission over a specific domain id;
+ * this means that ideally we wouldn't the domain in question (freeing
+ * up the domain id for reuse) until we're confident that we've killed
+ * the domain.
+ *
+ * In general, destroy as much as we can; but return an error if there
+ * are any errors, so that the domain destroy will be aborted, and the
+ * domain itself will remain, giving the admin an opportunity to fix
+ * any issues and re-try the domain destroy.
+ */
+#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc
+
 void libxl__destroy_device_model(libxl__egc *egc,
                                  libxl__destroy_devicemodel_state *ddms)
 {
     STATE_AO_GC(ddms->ao);
     int rc;
     int domid = ddms->domid;
-    char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+    char *path;
+    const char *dm_uid_str = NULL;
+    int reaper_pid;
 
-    if (!xs_rm(CTX->xsh, XBT_NULL, path))
+    ddms->rc = 0;
+
+    path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+    rc = libxl__xs_rm_checked(gc, XBT_NULL, path);
+    if (rc) {
+        PROPAGATE_RC;
         LOGD(ERROR, domid, "xs_rm failed for %s", path);
+    }
+
+    /*
+     * See if we should try to kill by uid
+     */
+    path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid);
+    rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str);
+
+    /*
+     * If there was an error here, accumulate the error and fall back
+     * to killing by pid.
+     */
+    if (rc) {
+        PROPAGATE_RC;
+        LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
+    }
+
+    /* The DM has its own uid; Attempt to kill all processes with that UID */
+    if (dm_uid_str) {
+        LOGD(DEBUG, domid, "Found DM uid %s, destroying by uid", dm_uid_str);
+
+        reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer,
+                                          kill_device_model_uid_cb);
+        if (reaper_pid < 0) {
+            rc = ERROR_FAIL;
+            PROPAGATE_RC;
+            /*
+             * Note that if this fails, we still don't kill by pid, to
+             * make sure that an untrusted DM has not "maliciously"
+             * exited (potentially causing us to kill an unrelated
+             * process which happened to get the same pid).
+             */
+            goto out;
+        }
+
+        if (!reaper_pid) {  /* child */
+            rc = kill_device_model_uid_child(ddms, dm_uid_str);
+            _exit(rc);
+        }
+
+        /*
+         * Parent of successful fork; execution will pick up in
+         * kill_device_model_uid_cb when child exits
+         */
+        return;
+    }
 
-    /* We should try to destroy the device model anyway. */
-    rc = kill_device_model(gc,
-              GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+    /*
+     * No uid to kill; attept to kill by pid.
+     */
+    LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid");
+
+    path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid);
+    rc = kill_device_model(gc, path);
+
+    if (rc) {
+        PROPAGATE_RC;
+        LOGD(ERROR, domid, "Killing device model pid from path %s", path);
+    }
+
+out:
+    /*
+     * NB that we always return '0' here for the "status of exited
+     * process"; since there is no process, it always "succeeds".
+     * Errors are accumulated in ddms->rc and will be handled
+     * correctly.
+     */
+    kill_device_model_uid_cb(egc, &ddms->destroyer, -1, 0);
+    return;
+}
+
+/*
+ * Destroy all processes of the given uid by setresuid to the
+ * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
+ * PROCESS from the normal libxl process.  Returns a libxl-style error
+ * code.
+ */
+static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
+                                       const char *dm_uid_str) {
+    STATE_AO_GC(ddms->ao);
+    int domid = ddms->domid;
+    int r, rc = 0;
+    uid_t dm_uid = atoi(dm_uid_str);
+
+    /*
+     * FIXME: the second uid needs to be distinct to avoid being
+     * killed by a potential rogue process
+     */
+
+    /*
+     * Should never happen; but if it does, better to have the
+     * toolstack crash with an error than nuking dom0.
+      */
+    assert(dm_uid);
+
+    LOGD(DEBUG, domid, "DM reaper: calling setresuid(%d, %d, 0)",
+         dm_uid, dm_uid);
+    r = setresuid(dm_uid, dm_uid, 0);
+    if (r) {
+        LOGED(ERROR, domid, "setresuid to (%d, %d, 0)", dm_uid, dm_uid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    /*
+     * And kill everyone but me.
+     *
+     * NB that it's not clear from either POSIX or the Linux man page
+     * that ESRCH would be returned with a pid value of -1, but it
+     * doesn't hurt to check.
+     */
+    r = kill(-1, 9);
+    if (r && errno != ESRCH) {
+        LOGED(ERROR, domid, "kill(-1,9)");
+        rc = ERROR_FAIL;
+    }
+
+out:
+    return rc;
+}
+
+static void kill_device_model_uid_cb(libxl__egc *egc,
+                                    libxl__ev_child *destroyer,
+                                    pid_t pid, int status)
+{
+    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, destroyer);
+    STATE_AO_GC(ddms->ao);
+
+    if (status) {
+        int rc = ERROR_FAIL;;
+
+        if (WIFEXITED(status))
+            rc = WEXITSTATUS(status) - 128;
+
+        PROPAGATE_RC;
+        libxl_report_child_exitstatus(CTX, XTL_ERROR,
+                                      "async domain destroy", pid, status);
+    }
 
-    libxl__qmp_cleanup(gc, domid);
+    /* Always try to clean up qmp, even if something went wrong */
+    libxl__qmp_cleanup(gc, ddms->domid);
 
-    ddms->callback(egc, ddms, rc);
+    ddms->callback(egc, ddms, ddms->rc);
 }
+#undef PROPAGATE_RC
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
 int libxl__need_xenpv_qemu(libxl__gc *gc, libxl_domain_config *d_config)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index f9e0bf6578..cd3208f4b8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1135,7 +1135,7 @@ typedef struct {
     const char *shim_cmdline;
     const char *pv_cmdline;
 
-    char *dm_runas;
+    char *dm_runas, *dm_uid;
 
     xen_vmemrange_t *vmemranges;
     uint32_t num_vmemranges;
@@ -3706,6 +3706,8 @@ struct libxl__destroy_devicemodel_state {
     uint32_t domid;
     libxl__devicemodel_destroy_cb *callback; /* May be called re-entrantly */
     /* private to implementation */
+    libxl__ev_child destroyer;
+    int rc; /* Accumulated return value for the destroy operation */
 };
 
 struct libxl__destroy_domid_state {
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
                   ` (6 preceding siblings ...)
  2018-12-06 15:02 ` [PATCH v2 08/10] libxl: Kill QEMU by uid when possible George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-12 16:30   ` Ian Jackson
  2018-12-06 15:02 ` [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed George Dunlap
  8 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

Using kill(-1) to killing an untrusted dm process with the real uid
equal to the dm_uid isn't guaranteed to succeed: the process in
question may be able to kill the reaper process after the setresuid()
and before the kill().

Instead, set the real uid to the QEMU user for domain 0
(QEMU_USER_RANGE_BASE + 0).  The reaper process will still be able to
kill the dm process, but not vice versa.

This, in turn, requires locking to make sure that only one reaper
process is using that uid at a time; otherwise one reaper process may
kill the other reaper process.

Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before
executing kill.

In the event that we can't get the lock for some reason, go ahead with
the kill using dm_uid for both real and effective UIDs.  This isn't
guaranteed to work, but it's no worse than not trying to kill the
process at all.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v2:
- Port over previous changes
- libxl__get_reaper_uid() won't set errno, use LOG rather than LOGE.
- Accumulate error and return for all failures
- Move flock() outside of the condition.  Also fix EINTR check (check
  errno rather than return value).
- Add a comment explaining why we return an error even if the kill()
  succeeds
- Move locking to a separate function to minimize gotos
- Refactor libxl__get_reaper_id to take a pointer for reaper_uid;
  return only success/failure.  Also return EINVAL if reaper_uid would
  resolve to 0.
- Handle "reaper_uid not found" specially; note issue with
  device_model_user feature
- Assert that final reaper_uid != 0

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dm.c | 117 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 108 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 53fdf8daf7..90b4e21d48 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -241,6 +241,35 @@ out:
     return rc;
 }
 
+/*
+ * Look up "reaper UID".  If present and non-root, returns 0 and sets
+ * reaper_uid.  If not present, returns 0 and leaves reaper_uid unset;
+ * otherwise returns libxl-style error.
+ */
+static int libxl__get_reaper_uid(libxl__gc *gc, uid_t *reaper_uid)
+{
+    struct passwd *user_base, user_pwbuf;
+    int rc;
+
+    rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+                                         &user_pwbuf, &user_base);
+    if (rc)
+        return rc;
+
+    if (!user_base) {
+        LOG(WARN, "Couldn't find uid for reaper process");
+    } else {
+        if(user_base->pw_uid == 0) {
+            LOG(ERROR, "UID for reaper process maps to root!");
+            return ERROR_INVAL;
+        }
+
+        *reaper_uid = user_base->pw_uid;
+    }
+
+    return 0;
+}
+
 const char *libxl__domain_device_model(libxl__gc *gc,
                                        const libxl_domain_build_info *info)
 {
@@ -2810,11 +2839,61 @@ out:
     return;
 }
 
+static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
+                                   uid_t *reaper_uid)
+{
+    STATE_AO_GC(ddms->ao);
+    int domid = ddms->domid;
+    int r;
+    const char * lockfile;
+    int fd;
+
+    /* Try to lock the "reaper uid" */
+    lockfile = GCSPRINTF("%s/dm-reaper-lock", libxl__run_dir_path());
+
+    /*
+     * NB that since we've just forked, we can't have any
+     * threads; so we don't need the libxl__carefd
+     * infrastructure here.
+     */
+    fd = open(lockfile, O_RDWR|O_CREAT, 0666);
+    if (fd < 0) {
+        /* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+        LOGED(ERROR, domid,
+              "unexpected error while trying to open lockfile %s, errno=%d",
+              lockfile, errno);
+        return ERROR_FAIL;
+    }
+
+    /* Try to lock the file, retrying on EINTR */
+    for (;;) {
+        r = flock(fd, LOCK_EX);
+        if (!r)
+            break;
+        if (errno != EINTR) {
+            /* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+            LOGED(ERROR, domid,
+                  "unexpected error while trying to lock %s, fd=%d, errno=%d",
+                  lockfile, fd, errno);
+            return ERROR_FAIL;
+        }
+    }
+
+    /*
+     * Get reaper_uid.  If we can't find such a uid, return an error.
+     *
+     * FIXME: This means that domain destruction will fail if
+     * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
+     */
+    return libxl__get_reaper_uid(gc, reaper_uid);
+}
+
+
 /*
  * Destroy all processes of the given uid by setresuid to the
  * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
- * PROCESS from the normal libxl process.  Returns a libxl-style error
- * code.
+ * PROCESS from the normal libxl process, and should exit immediately
+ * after return.  Returns a libxl-style error code.
  */
 static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
                                        const char *dm_uid_str) {
@@ -2822,24 +2901,44 @@ static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
     int domid = ddms->domid;
     int r, rc = 0;
     uid_t dm_uid = atoi(dm_uid_str);
+    uid_t reaper_uid = dm_uid;
 
     /*
-     * FIXME: the second uid needs to be distinct to avoid being
-     * killed by a potential rogue process
+     * Try to kill the devicemodel by uid.  The safest way to do this
+     * is to set euid == dm_uid, but the ruid to something else.  If
+     * we can't get a separate ruid, carry on trying to kill the
+     * process anyway using dm_uid for the ruid.  This is racy (the dm
+     * may be able to kill(-1) us before we kill them), but worth
+     * trying.
+     *
+     * NB: Even if we don't have a separate reaper_uid, the parent can
+     * know whether we won the race by looking at the status variable;
+     * so we don't strictly need to return failure in this case.  But
+     * if there's a misconfiguration, it's better to alert the
+     * administator sooner rather than later; so if we fail to get a
+     * reaper uid, report an error even if the kill succeeds.
      */
+    rc = get_reaper_lock_and_uid(ddms, &reaper_uid);
+
+    /* NB: Carry on in case of failure, but pass rc value back up. */
+
+    if (reaper_uid == dm_uid)
+        LOGD(WARN, domid, "Couldn't get separate reaper uid;"
+            "carrying on with unsafe kill");
 
     /*
      * Should never happen; but if it does, better to have the
      * toolstack crash with an error than nuking dom0.
       */
+    assert(reaper_uid);
     assert(dm_uid);
 
     LOGD(DEBUG, domid, "DM reaper: calling setresuid(%d, %d, 0)",
-         dm_uid, dm_uid);
-    r = setresuid(dm_uid, dm_uid, 0);
+         reaper_uid, dm_uid);
+    r = setresuid(reaper_uid, dm_uid, 0);
     if (r) {
-        LOGED(ERROR, domid, "setresuid to (%d, %d, 0)", dm_uid, dm_uid);
-        rc = ERROR_FAIL;
+        LOGED(ERROR, domid, "setresuid to (%d, %d, 0)", reaper_uid, dm_uid);
+        rc = rc ? rc : ERROR_FAIL;
         goto out;
     }
 
@@ -2853,7 +2952,7 @@ static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
     r = kill(-1, 9);
     if (r && errno != ESRCH) {
         LOGED(ERROR, domid, "kill(-1,9)");
-        rc = ERROR_FAIL;
+        rc = rc ? rc : ERROR_FAIL;
     }
 
 out:
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed
  2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
                   ` (7 preceding siblings ...)
  2018-12-06 15:02 ` [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid George Dunlap
@ 2018-12-06 15:02 ` George Dunlap
  2018-12-12 16:31   ` Ian Jackson
  8 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-06 15:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 docs/designs/qemu-deprivilege.md | 40 ++++++++++++++++----------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/designs/qemu-deprivilege.md b/docs/designs/qemu-deprivilege.md
index f7444a434d..81a5f5c05d 100644
--- a/docs/designs/qemu-deprivilege.md
+++ b/docs/designs/qemu-deprivilege.md
@@ -128,26 +128,6 @@ are specified; this does not apply to QEMU running as a Xen DM.
 
 '''Tested''': Not tested
 
-# Restrictions / improvements still to do
-
-This lists potential restrictions still to do.  It is meant to be
-listed in order of ease of implementation, with low-hanging fruit
-first.
-
-### Further RLIMITs
-
-RLIMIT_AS limits the total amount of memory; but this includes the
-virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
-fiddles with this; it would be straightforward to make it *set* the
-rlimit to what it thinks a sensible limit is.
-
-RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
-threads for some devices, so this would require some thought.
-
-Other things that would take some cleverness / changes to QEMU to
-utilize due to ordering constrants:
- - RLIMIT_NOFILES (after all necessary files are opened)
-
 ### libxl UID cleanup
 
 '''Description''': Domain IDs are reused, and thus restricted UIDs are
@@ -223,6 +203,26 @@ Since this will kill all other `reaper_uid` processes as well, we must
 either allocate a separate `reaper_uid` per domain, or use locking to
 ensure that only one killing process is active at a time.
 
+# Restrictions / improvements still to do
+
+This lists potential restrictions still to do.  It is meant to be
+listed in order of ease of implementation, with low-hanging fruit
+first.
+
+### Further RLIMITs
+
+RLIMIT_AS limits the total amount of memory; but this includes the
+virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
+fiddles with this; it would be straightforward to make it *set* the
+rlimit to what it thinks a sensible limit is.
+
+RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
+threads for some devices, so this would require some thought.
+
+Other things that would take some cleverness / changes to QEMU to
+utilize due to ordering constrants:
+ - RLIMIT_NOFILES (after all necessary files are opened)
+
 ## libxl: Treat QMP connection as untrusted
 
 '''Description''': Currently libxl talks with QEMU via QMP; but its
-- 
2.19.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper
  2018-12-06 15:02 ` [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper George Dunlap
@ 2018-12-12 15:26   ` Ian Jackson
  2018-12-12 17:39     ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Jackson @ 2018-12-12 15:26 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu

George Dunlap writes ("[PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper"):
> Bring conventions more in line with libxl__xs_read_checked():
> - If found, return 0 and set pointer to non-NULL
> - If not found, return 0 and set pointer to NULL
> - On error, return libxl-style error number.
> 
> Update documentation to match.
...
>  #define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF)     \
>      static int userlookup_helper_##NAME(libxl__gc *gc,                  \
> @@ -83,7 +89,7 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
>          struct STRUCTNAME *resultp = NULL;                              \

git diff has excelled itself in choice of heading line, hasn't it ?

> @@ -142,14 +147,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
>                                           &user_pwbuf, &user_base);
>      if (ret < 0)
>          return ret;
> -    if (ret > 0) {
> +    if (user_base) {

I would prefer to also:

  -    if (ret < 0)
  +    if (ret)
           return ret;

which is more conventional for rc and might reduce the impact of bugs
where the function returned a positive.  (Twice.)

With or without that change,

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Thanks for the cleanup!

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid
  2018-12-06 15:02 ` [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid George Dunlap
@ 2018-12-12 15:45   ` Ian Jackson
  2018-12-12 18:20     ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Jackson @ 2018-12-12 15:45 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu

George Dunlap writes ("[PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid"):
> At the moment, we check for equivalence to literal "root" before
> deciding whether to add the `runas` command-line option to QEMU.  This
> is unsatisfactory for several reasons.
...
> v2:
> - Refactor to use `out` rather than multiple labels
> - Only check for root once
> - Use 'out' rather than direct returns for errors (only use direct returns
>   for early `succeed-without-setting-runas` paths)
> - Use `rc` rather than `ret` to more closely align with CODING_STYLE
> - Fill out comments about the cases we're handling
> - Return ERROR_DEVICE_EXISTS rather than ERROR_FAIL if there's another
>   username that maps to our calculated uid
> - Report an error if the specified device_model_user doesn't exist

Thanks.  This is all much better now.  I still have a few style nits,
and one comment on your reuse of a libxl error code, but AFAICT the
algorithm is right.


> +        if (!user_base) {
> +            LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
> +                 user);
> +            rc = ERROR_INVAL;
> +        } else
> +            intended_uid = user_base->pw_uid;
>  
> +        goto out;

1. Would you mind adding the missing { } ?
  | ... To avoid confusion, either all
  | the blocks in an if...else chain have braces, or none of them do.

2. Can you please set rc in the other path ?  I think this is supposed
to be a success path.  Relying on the previous value of rc seems very
action at a distance.  (Later: yes, I see this this is a success path.
See my comments about the rules implied by your `out:' block.)

TBH I would prefer two goto outs rather than one, like this:

> +        if (!user_base) {
> +            LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
> +                 user);
> +            rc = ERROR_INVAL;
  +            goto out;
  +        } else {
> +            intended_uid = user_base->pw_uid;
  +            rc = 0;
  +            goto out;
  +        }
> +    }

which seems easier to read since it is not necessary to untangle the
whole if, and look at the context, to see the correctness of the
individual bits.

Or you could treat the !user_base as an error block and write this:

> +        if (!user_base) {
> +            LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
> +                 user);
> +            rc = ERROR_INVAL;
  +            goto out;
  +        }
  +        intended_uid = user_base->pw_uid;
  +        rc = 0;
  +        goto out;
> +    }



> +    /*
> +     * If dm_restrict isn't set, and we don't have a specified user, don't
> +     * bother setting a `-runas` parameter.
> +     */
>      if (!libxl_defbool_val(b_info->dm_restrict)) {
>          LOGD(DEBUG, guest_domid,
>               "dm_restrict disabled, starting QEMU as root");
>          return 0;
>      }

Why `return 0' here but `goto out' earlier ?  IMO all the success
returns should be the same.

> +        rc = userlookup_helper_getpwuid(gc, intended_uid,
>                                           &user_clash_pwbuf, &user_clash);
> -        if (ret < 0)
> -            return ret;
> +        if (rc < 0)
> +            goto out;

In my last mail I suggested this should be `ret < 0' but of course now
it should be `if (rc)' ...

>          if (user_clash) {
>              LOGD(ERROR, guest_domid,
>                   "wanted to use uid %ld (%s + %d) but that is user %s !",
>                   (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
>                   guest_domid, user_clash->pw_name);
> -            return ERROR_FAIL;
> +            rc = ERROR_DEVICE_EXISTS;
> +            goto out;

I appreciate your desire to specify particular error value, but I am
far from convinced that ERROR_DEVICE_EXISTS is appropriate.  It would
lead someone to ask which device was in the way.

We generally use EINVAL for bad configurations.  If you prefer, feel
free to introduce a new error code.  32-bit signed integers are pretty
cheap.

> +    if (rc < 0)
> +        goto out;
>      if (user_base) {
>          LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
>               LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
> -        goto end_search;
> +        intended_uid = user_base->pw_uid;
> +        goto out;

Here we have this pattern again with a `goto out' without a preceding
assignment to `rc'.  AFAICT the rules implied by your out block are:

 * Every goto out must be preceded by an assignment to rc.
   IMO there is no reason this should not immediately precede
   the goto out.

 * Additionally, if rc is 0 then the goto out must also be preceded
   relatively recently by an assignment to intended_uid.

> +out:
> +    if (!rc) {
> +        if (intended_uid == 0) {
> +            LOGD(ERROR, guest_domid, "intended_uid is 0 (root)!");
> +            return ERROR_INVAL;
> +        }
> +
> +        state->dm_runas = user;
> +    }
> +
> +    return rc;

TBH I dislike the early return in the `goto out' block.  If a resource
deallocation were to be introduced, that `return ERROR_INVAL' would
become a memory leak.

I think this means lifting state->dm_runas into else, or writing this:

> +out:
> +    if (!rc) {
> +        if (intended_uid == 0) {
> +            LOGD(ERROR, guest_domid, "intended_uid is 0 (root)!");
  +            rc = ERROR_INVAL;
> +        }
  +    }
  +    if (!rc) {
> +        state->dm_runas = user;
> +    }
> +
> +    return rc;

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 07/10] libxl: Make killing of device model asynchronous
  2018-12-06 15:02 ` [PATCH v2 07/10] libxl: Make killing of device model asynchronous George Dunlap
@ 2018-12-12 15:46   ` Ian Jackson
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Jackson @ 2018-12-12 15:46 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu

George Dunlap writes ("[PATCH v2 07/10] libxl: Make killing of device model asynchronous"):
> Or at least, give it an asynchronous interface so that we can make it
> actually asynchronous in subsequent patches.
> 
> Create state structures and callback function signatures.  Add the
> state structure to libxl__destroy_domid_state.  Break
> libxl__destroy_domid down into two functions.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 08/10] libxl: Kill QEMU by uid when possible
  2018-12-06 15:02 ` [PATCH v2 08/10] libxl: Kill QEMU by uid when possible George Dunlap
@ 2018-12-12 16:17   ` Ian Jackson
  2018-12-17 18:09     ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Jackson @ 2018-12-12 16:17 UTC (permalink / raw)
  To: George Dunlap; +Cc: Anthony Perard, xen-devel, Wei Liu

George Dunlap writes ("[PATCH v2 08/10] libxl: Kill QEMU by uid when possible"):
> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
> one specific domain ID.  This domain ID will probably eventually be
> used again.  It is therefore necessary to make absolutely sure that a
> rogue QEMU process cannot hang around after its domain has exited.

Thanks.  Detailed comments mostly on error handling follow...

> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index f9e0bf6578..cd3208f4b8 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -1135,7 +1135,7 @@ typedef struct {
>      const char *shim_cmdline;
>      const char *pv_cmdline;
>  
> -    char *dm_runas;
> +    char *dm_runas, *dm_uid;

I think `dm_uid' is misnamed.  It is only set if the dm has a
*dedicated* uid.  If the dm is run as uid 0 or as a shared qemu uid,
it is set to NULL.

> @@ -3706,6 +3706,8 @@ struct libxl__destroy_devicemodel_state {
>      uint32_t domid;
>      libxl__devicemodel_destroy_cb *callback; /* May be called re-entrantly */
>      /* private to implementation */
> +    libxl__ev_child destroyer;
> +    int rc; /* Accumulated return value for the destroy operation */

Excellent.

> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 7f9c6a62fe..53fdf8daf7 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -129,6 +129,8 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
>      int rc;
>      char *user;
>      uid_t intended_uid;
> +    bool kill_by_uid;
> +

I guess this new blank line is deliberate.  I think it is probably a
good idea.

> @@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
>              LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
>                   user);
>              rc = ERROR_INVAL;
> -        } else
> +        } else {
>              intended_uid = user_base->pw_uid;
> +            kill_by_uid = true;
> +        }
>  
>          goto out;
>      }

I think your changes to the out block newly imply that all `goto out'
with `rc=0' must also set kill_by_uid.

I have not (in this revision) applied this patch and searched for that
because I think there will have to be a respin to consistently apply
the rules I previously identified.

(I mention this here partly so that when I review v3 I know that my
former self wanted to double check that.)

> @@ -226,6 +234,8 @@ out:
>          }
>  
>          state->dm_runas = user;
> +        if (kill_by_uid)
> +            state->dm_uid = GCSPRINTF("%ld", (long)intended_uid);
>      }

More stuff is accumulating in this post-0-check success path.
Perhaps this means my suggestion of
   if (!rc) { ... }  if (!rc) { ... }
will be most convenient.

> +    /*
> +     * If we're starting the dm with a non-root UID, save the UID so
> +     * that we can reliably kill it and any subprocesses
> +     */
> +    if (state->dm_uid)
> +        libxl__xs_printf(gc, XBT_NULL,
> +                         GCSPRINTF("%s/image/device-model-uid", dom_path),

My comment about the misnamed libxl variable applies to
device-model-uid too I think.

> +#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc
                          ^

I like this macro.  But (i) it needs proper macro hygiene - either a
do{ }while(0) block, or refactoring into an expression (and then
putting in parens).  And (ii) you missed out a space.

(The references to ddms and rc are anaphoric, not macro parameters, so
do not need parens.)

> +    path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
> +    rc = libxl__xs_rm_checked(gc, XBT_NULL, path);
> +    if (rc) {
> +        PROPAGATE_RC;
>          LOGD(ERROR, domid, "xs_rm failed for %s", path);
> +    }

Your new macro makes this a nicely transparent pasttern.

> +    /*
> +     * See if we should try to kill by uid
> +     */
> +    path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid);
> +    rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str);
> +
> +    /*
> +     * If there was an error here, accumulate the error and fall back
> +     * to killing by pid.
> +     */
> +    if (rc) {
> +        PROPAGATE_RC;
> +        LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
> +    }

From the comment for libxl__xs_read_checked:
 | * On error, *result_out is undefined.
Arguably this is a bear trap.  Maybe you would like to fix it there
rather than by setting dm_uid_str to 0 here.

> +        reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer,
> +                                          kill_device_model_uid_cb);
> +        if (reaper_pid < 0) {
> +            rc = ERROR_FAIL;
> +            PROPAGATE_RC;
> +            /*
> +             * Note that if this fails, we still don't kill by pid, to
> +             * make sure that an untrusted DM has not "maliciously"
> +             * exited (potentially causing us to kill an unrelated
> +             * process which happened to get the same pid).
> +             */
> +            goto out;
> +        }
> +
> +        if (!reaper_pid) {  /* child */
> +            rc = kill_device_model_uid_child(ddms, dm_uid_str);
> +            _exit(rc);
> +        }

You cannot _exit(rc).  See my comments below...

Personally I like to put the child process block right after the fork,
especially when (like here) it is very short because the meat has been
lifted elsewhere.  Just a suggestion; you can leave it here if you
prefer.

> -    /* We should try to destroy the device model anyway. */
> -    rc = kill_device_model(gc,
> -              GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
> +    /*
> +     * No uid to kill; attept to kill by pid.
> +     */
> +    LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid");
> +
> +    path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid);
> +    rc = kill_device_model(gc, path);
> +
> +    if (rc) {
> +        PROPAGATE_RC;
> +        LOGD(ERROR, domid, "Killing device model pid from path %s", path);
> +    }
> +
> +out:

I would prefer the apparently-redundant:

  +    rc = 0;
> +out:

This avoids introducing a bug if, just before the success exit path,
new code gets added which doesn't happen to leave rc==0.

> +    /*
> +     * NB that we always return '0' here for the "status of exited
> +     * process"; since there is no process, it always "succeeds".

I had to read this three times to figure out what this meant.  Perhaps
I'm being dense but would you mind writing

  +     * NB that we always pass '0' here for the "status of exited
                            ^^^^

> +/*
> + * Destroy all processes of the given uid by setresuid to the
> + * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
> + * PROCESS from the normal libxl process.  Returns a libxl-style error
> + * code.
> + */
> +static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
> +                                       const char *dm_uid_str) {
> +    STATE_AO_GC(ddms->ao);
> +    int domid = ddms->domid;
> +    int r, rc = 0;
> +    uid_t dm_uid = atoi(dm_uid_str);
> +
> +    /*
> +     * FIXME: the second uid needs to be distinct to avoid being
> +     * killed by a potential rogue process
> +     */
> +
> +    /*
> +     * Should never happen; but if it does, better to have the
> +     * toolstack crash with an error than nuking dom0.
> +      */
> +    assert(dm_uid);
> +
> +    LOGD(DEBUG, domid, "DM reaper: calling setresuid(%d, %d, 0)",
> +         dm_uid, dm_uid);
> +    r = setresuid(dm_uid, dm_uid, 0);
> +    if (r) {
> +        LOGED(ERROR, domid, "setresuid to (%d, %d, 0)", dm_uid, dm_uid);
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    /*
> +     * And kill everyone but me.
> +     *
> +     * NB that it's not clear from either POSIX or the Linux man page
> +     * that ESRCH would be returned with a pid value of -1, but it
> +     * doesn't hurt to check.
> +     */
> +    r = kill(-1, 9);
> +    if (r && errno != ESRCH) {
> +        LOGED(ERROR, domid, "kill(-1,9)");
> +        rc = ERROR_FAIL;
> +    }
> +
> +out:
> +    return rc;
> +}
> +
> +static void kill_device_model_uid_cb(libxl__egc *egc,
> +                                    libxl__ev_child *destroyer,
> +                                    pid_t pid, int status)
> +{
> +    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, destroyer);
> +    STATE_AO_GC(ddms->ao);
> +
> +    if (status) {
> +        int rc = ERROR_FAIL;;
> +
> +        if (WIFEXITED(status))
> +            rc = WEXITSTATUS(status) - 128;

This can't be right.  Where does this 128 come from ?

I don't see a comment anywhere about your encoding of the libxl error
value in the exit status.

A libxl error code does not necessarily fit in an exit status since an
exit status is just one byte.  Your protocol reserves the exit status
0 for success.  The C implementation typically reserves -1 (255) and
sometimes also 127.  By convention the exit status is normally
regarded as positive and libxl error codes are negative.

I suggest one of the following strategies:

  - Give up on the idea of distinguishing these error codes at all and
    simply _exit(!!rc).  (After all the real error is logged and the
    function only ever returns FAIL.)

  - Say that the child function may only return one of a limited
    subset of libxl error codes (since only FAIL is currently needed),
    and assert that -125 <= rc <= -1, and _exit(-rc).  Then the exit
    status can be recovered with -WEXITSTATUS(status).

  - Do something more subtle involving turning out-of-range
    rc values into -126 or some such.  This seems like overkill.

> -    ddms->callback(egc, ddms, rc);
> +    ddms->callback(egc, ddms, ddms->rc);
>  }
> +#undef PROPAGATE_RC

I am tempted to suggest replacing each call
  PROPAGATE_RC;
with
  ACCUMULATE_RC(ddms);
and put the definition in libxl_internal.h for use elsewhere.

I think we would probably already have some other call sites and if we
go and fix the destroy stuff probably we will want a lot more of it.

Up to you.  Like this is certainly fine for now.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid
  2018-12-06 15:02 ` [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid George Dunlap
@ 2018-12-12 16:30   ` Ian Jackson
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Jackson @ 2018-12-12 16:30 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu

George Dunlap writes ("[PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid"):
> Using kill(-1) to killing an untrusted dm process with the real uid
> equal to the dm_uid isn't guaranteed to succeed: the process in
> question may be able to kill the reaper process after the setresuid()
> and before the kill().
> 
> Instead, set the real uid to the QEMU user for domain 0
> (QEMU_USER_RANGE_BASE + 0).  The reaper process will still be able to
> kill the dm process, but not vice versa.
> 
> This, in turn, requires locking to make sure that only one reaper
> process is using that uid at a time; otherwise one reaper process may
> kill the other reaper process.
> 
> Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before
> executing kill.
> 
> In the event that we can't get the lock for some reason, go ahead with
> the kill using dm_uid for both real and effective UIDs.  This isn't
> guaranteed to work, but it's no worse than not trying to kill the
> process at all.

Thanks.  Only minor comments here.

> +/*
> + * Look up "reaper UID".  If present and non-root, returns 0 and sets
> + * reaper_uid.  If not present, returns 0 and leaves reaper_uid unset;
> + * otherwise returns libxl-style error.
> + */

`leaves reaper_uid unset' is ambiguous.  It might mean `leaves it
unchanged' or `leaves it set to a sentinel value'.  The implementation
seems to be the former, which means that all callers must set it to a
sentinel value.

I think it would be better if it explicitly set it to (uid_t)-1
in that case.

...

I looked further and saw that actually you meant `leaves it unchanged'
and the caller is expected to set it to the value that it will want to
use if there is no dedicated reaper uid.  This is rather too subtle
for me.

Can you please put the logic that falls back to
reaper_uid=dm_dedicated_uid closer to the point of use ?

> +        if(user_base->pw_uid == 0) {
             ^
missing space.

.oO{ I wonder how that checkpatch project is coming on... }

> +static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
> +                                   uid_t *reaper_uid)
> +{
...
> +    fd = open(lockfile, O_RDWR|O_CREAT, 0666);

I think this mode is wrong.  If the process is running with a g+w
umask the file is g+w which is not desirable.  I suggest 0644.

> +    if (fd < 0) {
> +        /* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
> +        LOGED(ERROR, domid,
> +              "unexpected error while trying to open lockfile %s, errno=%d",
> +              lockfile, errno);
> +        return ERROR_FAIL;
;4~
`All other errno' - other to what ?  I think this comment has become
detached from its code.

> +    /* Try to lock the file, retrying on EINTR */
> +    for (;;) {
> +        r = flock(fd, LOCK_EX);
> +        if (!r)
> +            break;
> +        if (errno != EINTR) {
> +            /* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
> +            LOGED(ERROR, domid,
> +                  "unexpected error while trying to lock %s, fd=%d, errno=%d",
> +                  lockfile, fd, errno);
> +            return ERROR_FAIL;
> +        }
> +    }

LGTM.

> +    /*
> +     * Get reaper_uid.  If we can't find such a uid, return an error.
> +     *
> +     * FIXME: This means that domain destruction will fail if
> +     * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
> +     */
> +    return libxl__get_reaper_uid(gc, reaper_uid);
> +}

Did you mean to introduce this new FIXME ?  I think device_model_user
was there before your series (even though it probably won't work very
well...)  Forgive me if I am missing something.

If you did mean to, then you need to explain in the commit message why
it is OK.

>  /*
>   * Destroy all processes of the given uid by setresuid to the
>   * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
> - * PROCESS from the normal libxl process.  Returns a libxl-style error
> - * code.
> + * PROCESS from the normal libxl process, and should exit immediately
> + * after return.  Returns a libxl-style error code.

I think this rule that you must _exit immediately after return is not
new so can you move that comment into the patch that introduces the
setresuid ?  Also it would be marginally better to explicitly say
_exit rather than exit.

>      /*
...
> +     * NB: Even if we don't have a separate reaper_uid, the parent can
> +     * know whether we won the race by looking at the status variable;
> +     * so we don't strictly need to return failure in this case.  But
> +     * if there's a misconfiguration, it's better to alert the
> +     * administator sooner rather than later; so if we fail to get a
> +     * reaper uid, report an error even if the kill succeeds.

I approve of this reasoning and also of it being explained in a
comment.  Thanks.

> @@ -2853,7 +2952,7 @@ static int kill_device_model_uid_child(libxl__destroy_devicemodel_state *ddms,
>      r = kill(-1, 9);
>      if (r && errno != ESRCH) {
>          LOGED(ERROR, domid, "kill(-1,9)");
> -        rc = ERROR_FAIL;
> +        rc = rc ? rc : ERROR_FAIL;
>      }

This is a bit open-coded but I guess this is pretty bespoke set of
code.  So, fine.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed
  2018-12-06 15:02 ` [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed George Dunlap
@ 2018-12-12 16:31   ` Ian Jackson
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Jackson @ 2018-12-12 16:31 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu

George Dunlap writes ("[PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed"):
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper
  2018-12-12 15:26   ` Ian Jackson
@ 2018-12-12 17:39     ` George Dunlap
  0 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2018-12-12 17:39 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu



> On Dec 12, 2018, at 3:26 PM, Ian Jackson <ian.jackson@citrix.com> wrote:
> 
> George Dunlap writes ("[PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper"):
>> Bring conventions more in line with libxl__xs_read_checked():
>> - If found, return 0 and set pointer to non-NULL
>> - If not found, return 0 and set pointer to NULL
>> - On error, return libxl-style error number.
>> 
>> Update documentation to match.
> ...
>> #define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF)     \
>>     static int userlookup_helper_##NAME(libxl__gc *gc,                  \
>> @@ -83,7 +89,7 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char *name)
>>         struct STRUCTNAME *resultp = NULL;                              \
> 
> git diff has excelled itself in choice of heading line, hasn't it ?
> 
>> @@ -142,14 +147,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
>>                                          &user_pwbuf, &user_base);
>>     if (ret < 0)
>>         return ret;
>> -    if (ret > 0) {
>> +    if (user_base) {
> 
> I would prefer to also:
> 
>  -    if (ret < 0)
>  +    if (ret)
>           return ret;
> 
> which is more conventional for rc and might reduce the impact of bugs
> where the function returned a positive.  (Twice.)

Looks like thrice, but sure.

> 
> With or without that change,
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> Thanks for the cleanup!

No problem — it’s nice to have something tractable and satisfying to accomplish when my other active project is a seemingly never-ending ball of twine…

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid
  2018-12-12 15:45   ` Ian Jackson
@ 2018-12-12 18:20     ` George Dunlap
  2018-12-18 15:07       ` Ian Jackson
  0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-12 18:20 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu



> On Dec 12, 2018, at 3:45 PM, Ian Jackson <ian.jackson@citrix.com> wrote:
> 
> George Dunlap writes ("[PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid"):
>> At the moment, we check for equivalence to literal "root" before
>> deciding whether to add the `runas` command-line option to QEMU.  This
>> is unsatisfactory for several reasons.
> ...
>> v2:
>> - Refactor to use `out` rather than multiple labels
>> - Only check for root once
>> - Use 'out' rather than direct returns for errors (only use direct returns
>>  for early `succeed-without-setting-runas` paths)
>> - Use `rc` rather than `ret` to more closely align with CODING_STYLE
>> - Fill out comments about the cases we're handling
>> - Return ERROR_DEVICE_EXISTS rather than ERROR_FAIL if there's another
>>  username that maps to our calculated uid
>> - Report an error if the specified device_model_user doesn't exist
> 
> Thanks.  This is all much better now.

FWIW I agree. :-)

> Or you could treat the !user_base as an error block and write this:
> 
>> +        if (!user_base) {
>> +            LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
>> +                 user);
>> +            rc = ERROR_INVAL;
>  +            goto out;
>  +        }
>  +        intended_uid = user_base->pw_uid;
>  +        rc = 0;
>  +        goto out;
>> +    }

This looks good.

> 
>> +    /*
>> +     * If dm_restrict isn't set, and we don't have a specified user, don't
>> +     * bother setting a `-runas` parameter.
>> +     */
>>     if (!libxl_defbool_val(b_info->dm_restrict)) {
>>         LOGD(DEBUG, guest_domid,
>>              "dm_restrict disabled, starting QEMU as root");
>>         return 0;
>>     }
> 
> Why `return 0' here but `goto out' earlier ?  IMO all the success
> returns should be the same.

Not exactly — we only want to do the root check if we’re running as an alternate user.  In this case, neither device_model_user nor dm_restrict is set, so we’re not running as an alternate user, so we don’t want to run the `if(intended_uid == 0)` check on the normal ‘out’ path.

I take it you’d prefer to always jump to out, but to have the conditional be, `if (!rc && user)`?

> 
>>         if (user_clash) {
>>             LOGD(ERROR, guest_domid,
>>                  "wanted to use uid %ld (%s + %d) but that is user %s !",
>>                  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
>>                  guest_domid, user_clash->pw_name);
>> -            return ERROR_FAIL;
>> +            rc = ERROR_DEVICE_EXISTS;
>> +            goto out;
> 
> I appreciate your desire to specify particular error value, but I am
> far from convinced that ERROR_DEVICE_EXISTS is appropriate.  It would
> lead someone to ask which device was in the way.

Consider this patch the most concise way of asking the question, “What error should I use in this case?” :-)

> 
> We generally use EINVAL for bad configurations.  If you prefer, feel
> free to introduce a new error code.  32-bit signed integers are pretty
> cheap.

The integers may be cheap, but scanning through trying to comprehend them is not. :-)

I’ll think about whether to introduce a new one or just use ERROR_INVAL.

> 
>> +    if (rc < 0)
>> +        goto out;
>>     if (user_base) {
>>         LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
>>              LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
>> -        goto end_search;
>> +        intended_uid = user_base->pw_uid;
>> +        goto out;
> 
> Here we have this pattern again with a `goto out' without a preceding
> assignment to `rc'.  AFAICT the rules implied by your out block are:
> 
> * Every goto out must be preceded by an assignment to rc.
>   IMO there is no reason this should not immediately precede
>   the goto out.
> 
> * Additionally, if rc is 0 then the goto out must also be preceded
>   relatively recently by an assignment to intended_uid.

Those are rules that you’re implying, not me. :-)  My `goto out` invariant in this patch were:

1. rc may be an error code.  In this case, rc is returned.
2. rc may be zero; if rc is zero:
 2a. user must be non-NULL,
 2b. user must be verified to exist on the system, and
 2c. intended_uid must be set to the userid reported in the previous check

In order to accept your suggestion above to replace the `return` with a `goto out`, I have to make the invariant as follows:

1. rc may be an error code.  In this case, rc is returned.
2. rc may be zero, and user NULL.  In this case, rc is returned.
3. rc may be zero, and user non-NULL.  In this case:
  [2b and 2c from above]

In this case, we know that rc is 0 because we just checked the value 6 lines earlier.  If code is ever added in between such that rc becomes non-zero, *that* code should be calling `goto out` (or thinking carefully about why falling through to this code is OK).

I’ll write redundant statements everywhere if you want, but I thought that would count as the kind of code duplication you wanted to avoid.

> 
>> +out:
>> +    if (!rc) {
>> +        if (intended_uid == 0) {
>> +            LOGD(ERROR, guest_domid, "intended_uid is 0 (root)!");
>> +            return ERROR_INVAL;
>> +        }
>> +
>> +        state->dm_runas = user;
>> +    }
>> +
>> +    return rc;
> 
> TBH I dislike the early return in the `goto out' block.  If a resource
> deallocation were to be introduced, that `return ERROR_INVAL' would
> become a memory leak.
> 
> I think this means lifting state->dm_runas into else, or writing this:
> 
>> +out:
>> +    if (!rc) {
>> +        if (intended_uid == 0) {
>> +            LOGD(ERROR, guest_domid, "intended_uid is 0 (root)!");
>  +            rc = ERROR_INVAL;
>> +        }
>  +    }
>  +    if (!rc) {
>> +        state->dm_runas = user;
>> +    }

Would you want braces on the second `if`?  I suppose we’d add them in patch 7 anyway.

If we switch the earlier `return 0` in the !dm_restrict conditional to a “goto out”, then this would turn into:

if (!rc && user) {
 /* check */
}
if (!rc && user) {
}

or 

if (user) {
  if (!rc) {
    /* check */
  }
  if (!rc) {
    /* set */
  }
}

or of course:

if (!rc && user) {
  if (/*check*) {
    rc = ERROR_INVAL;
  } else {
    /* set */
  }
}

If you have a favorite color it might be better just to tell me. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 08/10] libxl: Kill QEMU by uid when possible
  2018-12-12 16:17   ` Ian Jackson
@ 2018-12-17 18:09     ` George Dunlap
  2018-12-18 11:18       ` Ian Jackson
  0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2018-12-17 18:09 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Wei Liu

On 12/12/18 4:17 PM, Ian Jackson wrote:
> George Dunlap writes ("[PATCH v2 08/10] libxl: Kill QEMU by uid when possible"):
>> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
>> one specific domain ID.  This domain ID will probably eventually be
>> used again.  It is therefore necessary to make absolutely sure that a
>> rogue QEMU process cannot hang around after its domain has exited.
> 
> Thanks.  Detailed comments mostly on error handling follow...
> 
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index f9e0bf6578..cd3208f4b8 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -1135,7 +1135,7 @@ typedef struct {
>>      const char *shim_cmdline;
>>      const char *pv_cmdline;
>>  
>> -    char *dm_runas;
>> +    char *dm_runas, *dm_uid;
> 
> I think `dm_uid' is misnamed.  It is only set if the dm has a
> *dedicated* uid.  If the dm is run as uid 0 or as a shared qemu uid,
> it is set to NULL.
dm_kill_uid?  If not some suggestions would be helpful.

I'll add a comment here describing these as well.

>> @@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
>>              LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
>>                   user);
>>              rc = ERROR_INVAL;
>> -        } else
>> +        } else {
>>              intended_uid = user_base->pw_uid;
>> +            kill_by_uid = true;
>> +        }
>>  
>>          goto out;
>>      }
> 
> I think your changes to the out block newly imply that all `goto out'
> with `rc=0' must also set kill_by_uid.

Yes; I didn't initialize kill_by_uid in the hopes (perhaps overly
optimistic) that the compiler would notice if there were paths where it
wasn't explicitly set and complain.

>> +    /*
>> +     * If we're starting the dm with a non-root UID, save the UID so
>> +     * that we can reliably kill it and any subprocesses
>> +     */
>> +    if (state->dm_uid)
>> +        libxl__xs_printf(gc, XBT_NULL,
>> +                         GCSPRINTF("%s/image/device-model-uid", dom_path),
> 
> My comment about the misnamed libxl variable applies to
> device-model-uid too I think.

Ack

>> +#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc
>                           ^
> 
> I like this macro.  But (i) it needs proper macro hygiene - either a
> do{ }while(0) block, or refactoring into an expression (and then
> putting in parens).  And (ii) you missed out a space.
[snip]
>> +#undef PROPAGATE_RC
>
> I am tempted to suggest replacing each call
>   PROPAGATE_RC;
> with
>   ACCUMULATE_RC(ddms);
> and put the definition in libxl_internal.h for use elsewhere.

I like that better.  What about `ACCUMULATE_RC(ddms->rc)` instead?  Then
the same macro could be used for a local variable.

[snip]

>> +    /*
>> +     * See if we should try to kill by uid
>> +     */
>> +    path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid);
>> +    rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str);
>> +
>> +    /*
>> +     * If there was an error here, accumulate the error and fall back
>> +     * to killing by pid.
>> +     */
>> +    if (rc) {
>> +        PROPAGATE_RC;
>> +        LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
>> +    }
> 
> From the comment for libxl__xs_read_checked:
>  | * On error, *result_out is undefined.
> Arguably this is a bear trap.  Maybe you would like to fix it there
> rather than by setting dm_uid_str to 0 here.

Saying it's "undefined" is probably a bear trap.  But you don't like the
way it's actually written -- i.e., that the pointer is only modified if
the value is successfully read?

>> +        reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer,
>> +                                          kill_device_model_uid_cb);
>> +        if (reaper_pid < 0) {
>> +            rc = ERROR_FAIL;
>> +            PROPAGATE_RC;
>> +            /*
>> +             * Note that if this fails, we still don't kill by pid, to
>> +             * make sure that an untrusted DM has not "maliciously"
>> +             * exited (potentially causing us to kill an unrelated
>> +             * process which happened to get the same pid).
>> +             */
>> +            goto out;
>> +        }
>> +
>> +        if (!reaper_pid) {  /* child */
>> +            rc = kill_device_model_uid_child(ddms, dm_uid_str);
>> +            _exit(rc);
>> +        }
> 
> You cannot _exit(rc).  See my comments below...

Is the 'not' attached to "rc" (i.e., the value must be positive), or
'_exit()' (i.e., you must call exit() rather than _exit())?

> 
> Personally I like to put the child process block right after the fork,
> especially when (like here) it is very short because the meat has been
> lifted elsewhere.  Just a suggestion; you can leave it here if you
> prefer.

OK -- I think this is long enough that I prefer it separate.

> 
>> -    /* We should try to destroy the device model anyway. */
>> -    rc = kill_device_model(gc,
>> -              GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
>> +    /*
>> +     * No uid to kill; attept to kill by pid.
>> +     */
>> +    LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid");
>> +
>> +    path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid);
>> +    rc = kill_device_model(gc, path);
>> +
>> +    if (rc) {
>> +        PROPAGATE_RC;
>> +        LOGD(ERROR, domid, "Killing device model pid from path %s", path);
>> +    }
>> +
>> +out:
> 
> I would prefer the apparently-redundant:
> 
>   +    rc = 0;
>> +out:

OK -- if I don't set rc initially it won't be redundant. :-)


>> +    /*
>> +     * NB that we always return '0' here for the "status of exited
>> +     * process"; since there is no process, it always "succeeds".
> 
> I had to read this three times to figure out what this meant.  Perhaps
> I'm being dense but would you mind writing
> 
>   +     * NB that we always pass '0' here for the "status of exited
>                             ^^^^

Ack

>> +static void kill_device_model_uid_cb(libxl__egc *egc,
>> +                                    libxl__ev_child *destroyer,
>> +                                    pid_t pid, int status)
>> +{
>> +    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, destroyer);
>> +    STATE_AO_GC(ddms->ao);
>> +
>> +    if (status) {
>> +        int rc = ERROR_FAIL;;
>> +
>> +        if (WIFEXITED(status))
>> +            rc = WEXITSTATUS(status) - 128;
> 
> This can't be right.  Where does this 128 come from ?

Looks like I was trying to figure out how WEXITSTATUS worked from
looking at the child of devices_destroy_cb() put in _exit() and  how
domain_destroy_domid_cb() interpreted it: Namely, I inferred that
_exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated
that -2 would get you 126, and so on.

> I don't see a comment anywhere about your encoding of the libxl error
> value in the exit status.
> 
> A libxl error code does not necessarily fit in an exit status since an
> exit status is just one byte.  Your protocol reserves the exit status
> 0 for success.  The C implementation typically reserves -1 (255) and
> sometimes also 127.  By convention the exit status is normally
> regarded as positive and libxl error codes are negative.
> 
> I suggest one of the following strategies:
> 
>   - Give up on the idea of distinguishing these error codes at all and
>     simply _exit(!!rc).  (After all the real error is logged and the
>     function only ever returns FAIL.)
> 
>   - Say that the child function may only return one of a limited
>     subset of libxl error codes (since only FAIL is currently needed),
>     and assert that -125 <= rc <= -1, and _exit(-rc).  Then the exit
>     status can be recovered with -WEXITSTATUS(status).

For some reason, I really don't want to drop the exit code.  If you'd
rather I do #1 I will, but given the choice I'd go with #2.

Thanks,
 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 08/10] libxl: Kill QEMU by uid when possible
  2018-12-17 18:09     ` George Dunlap
@ 2018-12-18 11:18       ` Ian Jackson
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Jackson @ 2018-12-18 11:18 UTC (permalink / raw)
  To: George Dunlap; +Cc: Anthony Perard, xen-devel, Wei Liu

George Dunlap writes ("Re: [PATCH v2 08/10] libxl: Kill QEMU by uid when possible"):
> On 12/12/18 4:17 PM, Ian Jackson wrote:
> > I am tempted to suggest replacing each call
> >   PROPAGATE_RC;
> > with
> >   ACCUMULATE_RC(ddms);
> > and put the definition in libxl_internal.h for use elsewhere.
> 
> I like that better.  What about `ACCUMULATE_RC(ddms->rc)` instead?  Then
> the same macro could be used for a local variable.

Yes, SGTM.

> >> +    /*
> >> +     * If there was an error here, accumulate the error and fall back
> >> +     * to killing by pid.
> >> +     */
> >> +    if (rc) {
> >> +        PROPAGATE_RC;
> >> +        LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
> >> +    }
> > 
> > From the comment for libxl__xs_read_checked:
> >  | * On error, *result_out is undefined.
> > Arguably this is a bear trap.  Maybe you would like to fix it there
> > rather than by setting dm_uid_str to 0 here.
> 
> Saying it's "undefined" is probably a bear trap.  But you don't like the
> way it's actually written -- i.e., that the pointer is only modified if
> the value is successfully read?

I think that's a weirdly complicated interface with strange
implications for normal callers.  I dislike threading the logic
through like this.  By "fix it there" I meant change
libxl__xs_read_checked to always set it to 0 on error, and to promise
to do that.  Sorry not to be explicit.

IOW I don't really think libxl__xs_read_checked ought to expect to be
used in an accumulate-y way, especially since it only leaves you with
the previous value on errors you probably weren't expecting, and not
on ENOENT.

> >> +        if (!reaper_pid) {  /* child */
> >> +            rc = kill_device_model_uid_child(ddms, dm_uid_str);
> >> +            _exit(rc);
> >> +        }
> > 
> > You cannot _exit(rc).  See my comments below...
> 
> Is the 'not' attached to "rc" (i.e., the value must be positive), or
> '_exit()' (i.e., you must call exit() rather than _exit())?

To rc.

> >> +static void kill_device_model_uid_cb(libxl__egc *egc,
> >> +                                    libxl__ev_child *destroyer,
> >> +                                    pid_t pid, int status)
> >> +{
> >> +    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, destroyer);
> >> +    STATE_AO_GC(ddms->ao);
> >> +
> >> +    if (status) {
> >> +        int rc = ERROR_FAIL;;
> >> +
> >> +        if (WIFEXITED(status))
> >> +            rc = WEXITSTATUS(status) - 128;
> > 
> > This can't be right.  Where does this 128 come from ?
> 
> Looks like I was trying to figure out how WEXITSTATUS worked from
> looking at the child of devices_destroy_cb() put in _exit() and  how
> domain_destroy_domid_cb() interpreted it: Namely, I inferred that
> _exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated
> that -2 would get you 126, and so on.

But _exit(-1) won't get you WEXITSTATUS(status) == 127.  Also, there
is a manual.  But it will be easier if I state what happens:

 * Only the bottom byte (2's complement) fo the argument to exit()
   or _exit() is used.  This value is called the `exit status'.
   It is a common idiom to write exit(255) as exit(-1).

 * The resulting wait status (what you get from waitpid) is
   exit_status << 8.

 * If your died due to an uncaught signal, the wait status is the
   signal number.  Plus 128 if the program dumped core.

 * HOWEVER a parent may experience other wait statuses:

 * The dynamic linker or other parts of the runtime may sometimes
   cause your program to exit nonconsensually, perhaps before
   it even ran any of your code.  By convention it will (if it
   can) print a message to stderr, and use the exit status
   255 or 127.

 * If a shell is involved, the rules are weirdly broken.  When a
   process run by a shell terminates, the shell squashes either the
   top or bottom byte of the wait status into $?.  That is, $? is now
   either the exit status, or the termination signal, or the
   termination signal plus 128.  Typically if a shell script exits due
   to failure of a program it ran, it will call exit on $?.  So signal
   numbers can end up showing up as exit statuses.

 * Kernels sometimes send processes SIGKILL due to OOM.  And of course
   there are a lot of other signals that might arise for various
   reasons only very loosely within the program's control.

 * I am shy of exit status 126 due to what I think is a well-founded
   superstition that someone may steal it for something.

 * This stuff with the wait status is the same everywhere that's
   actually a dialect of Unix but macros have been provided
      WIFEXITED      if true, you may use
         WEXITSTATUS
      WIFSIGNALED    if true, you may use
         WTERMSIG
         WCOREDUMP   <- missing on some ancient proprietary unices
   and as I am a careful programmer I usually handle the remaining
   case (by printing something like `unexpected wait status %x') too.
   (There are other macros WIFSTOPPED etc. which I am not discussing
   here, which arise if you ask wait* to tell you when your child
   stops not just when it ends.)

> > I suggest one of the following strategies:
> > 
> >   - Give up on the idea of distinguishing these error codes at all and
> >     simply _exit(!!rc).  (After all the real error is logged and the
> >     function only ever returns FAIL.)
> > 
> >   - Say that the child function may only return one of a limited
> >     subset of libxl error codes (since only FAIL is currently needed),
> >     and assert that -125 <= rc <= -1, and _exit(-rc).  Then the exit
> >     status can be recovered with -WEXITSTATUS(status).
> 
> For some reason, I really don't want to drop the exit code.  If you'd
> rather I do #1 I will, but given the choice I'd go with #2.

OK.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid
  2018-12-12 18:20     ` George Dunlap
@ 2018-12-18 15:07       ` Ian Jackson
  2018-12-20 22:12         ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Jackson @ 2018-12-18 15:07 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu

George Dunlap writes ("Re: [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid"):
> On Dec 12, 2018, at 3:45 PM, Ian Jackson <ian.jackson@citrix.com> wrote:
> >> +    /*
> >> +     * If dm_restrict isn't set, and we don't have a specified user, don't
> >> +     * bother setting a `-runas` parameter.
> >> +     */
> >>     if (!libxl_defbool_val(b_info->dm_restrict)) {
> >>         LOGD(DEBUG, guest_domid,
> >>              "dm_restrict disabled, starting QEMU as root");
> >>         return 0;
> >>     }
> > 
> > Why `return 0' here but `goto out' earlier ?  IMO all the success
> > returns should be the same.
> 
> Not exactly — we only want to do the root check if we’re running as an alternate user.  In this case, neither device_model_user nor dm_restrict is set, so we’re not running as an alternate user, so we don’t want to run the `if(intended_uid == 0)` check on the normal ‘out’ path.

Ah.  Fiddly.

> I take it you’d prefer to always jump to out, but to have the conditional be, `if (!rc && user)`?

How about
  if (!rc) {
     if (user && intended_uid=0) {
         error case,
         rc = 
     }
  }
  if (!rc) {
?

The reason I like this is because this keeps separate the error / flow
control logic from the precise details of the reason why we might
discover a later error.

> > We generally use EINVAL for bad configurations.  If you prefer, feel
> > free to introduce a new error code.  32-bit signed integers are pretty
> > cheap.
> 
> The integers may be cheap, but scanning through trying to comprehend them is not. :-)

It would be nice if we had a doc saying what they meant.

> >> +    if (rc < 0)
> >> +        goto out;
> >>     if (user_base) {
> >>         LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
> >>              LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
> >> -        goto end_search;
> >> +        intended_uid = user_base->pw_uid;
> >> +        goto out;
> > 
> > Here we have this pattern again with a `goto out' without a preceding
> > assignment to `rc'.  AFAICT the rules implied by your out block are:
> > 
> > * Every goto out must be preceded by an assignment to rc.
> >   IMO there is no reason this should not immediately precede
> >   the goto out.
> > 
> > * Additionally, if rc is 0 then the goto out must also be preceded
> >   relatively recently by an assignment to intended_uid.
> 
> Those are rules that you’re implying, not me. :-)  My `goto out` invariant in this patch were:
> 
> 1. rc may be an error code.  In this case, rc is returned.
> 2. rc may be zero; if rc is zero:
>  2a. user must be non-NULL,
>  2b. user must be verified to exist on the system, and
>  2c. intended_uid must be set to the userid reported in the previous check

This formulation of the rules cannot be verified locally.  In order to
verify these for any particular `goto out' it is necessary to scan
back through the previous logic to see whether 2a, 2b, 2c are true.

But you are right that I had failed to spot that assignment to user is
also required.  Perhaps this demonstrates that a comment stating the
rules explicitly would be useful.

> In order to accept your suggestion above to replace the `return` with a `goto out`, I have to make the invariant as follows:
> 
> 1. rc may be an error code.  In this case, rc is returned.
> 2. rc may be zero, and user NULL.  In this case, rc is returned.
> 3. rc may be zero, and user non-NULL.  In this case:
>   [2b and 2c from above]

I would suggest:

  1. If rc is an error code, all bets are off about user and
     intended_uid

  2. Otherwise we have success.  Then user and intended_uid are
     appropriate for the situation, which might mean both are
     sentinel, or both are set to specific values.

This seems to make sense to me since the purpose of this function is
to discover the right values for user and intended_uid.

> In this case, we know that rc is 0 because we just checked the value 6 lines earlier.  If code is ever added in between such that rc becomes non-zero, *that* code should be calling `goto out` (or thinking carefully about why falling through to this code is OK).

I really dislike the pattern of reusing an rc value.  It is OK to
build up the results of the computation in the intended answer
variables (in this case user and intended_uid).

But rc is a `throwaway' variable which has to be trashed by any call
to any libxl subfunction.  That is, except in the special case of
destruction functions, and during the out block, rc is not accumulated
or preserved, and has only very local significance.

You are saying `but if that other subfunction doesn't succeed, setting
rc!=0, it will have to goto out' but that is not a valid assumption.
Maybe there is a way of handling the error by trying an alternative
approach, or something.  In general there is nothing wrong with code
like this:

   rc = libxl_try_mutilate_wombat(gc, dom, &wombat);
   /* We do not need to mutilate a nonexistent wombat */
   if (rc && rc != ERROR_NOWOMBAT)
       goto out;

That is the standard pattern for making a subroutine call, adjusted
for ignoring an expected harmless error case.  Unless there is a good
reason to do otherwise, code in libxl should tolerate having something
like that dropped into it, in between other work.

But of course the fragment above does not leave rc set to anything in
particular.  If subsequent code wants rc==0 it would have to set it.

This implies that unless there is a good reason to do otherwise, all
code in libxl that wants rc==0 should set it.  Even if the preceding
little fragment of code is, at present, a necessarily-successful
subroutine call.

> I’ll write redundant statements everywhere if you want, but I thought that would count as the kind of code duplication you wanted to avoid.

The reason why code duplication is bad is that duplicated code, like
all code, has bugs, and fixing the bugs is then duplicated too.  This
hardly applies to `rc = 0;' before `goto out;'.  Rather, I regard
reliance on some previous rc a latent bug.

> > I think this means lifting state->dm_runas into else, or writing this:
> > 
> >> +out:
> >> +    if (!rc) {
> >> +        if (intended_uid == 0) {
> >> +            LOGD(ERROR, guest_domid, "intended_uid is 0 (root)!");
> >  +            rc = ERROR_INVAL;
> >> +        }
> >  +    }
> >  +    if (!rc) {
> >> +        state->dm_runas = user;
> >> +    }
> 
> Would you want braces on the second `if`?  I suppose we’d add them in patch 7 anyway.

Yes.  That's why I wrote them there :-).

> If we switch the earlier `return 0` in the !dm_restrict conditional to a “goto out”, then this would turn into:

I think setting intended_uid==0 when user==0 is a hostage to fortune.
Why not set it to (uid_t)-1 ?

Then you write:

  if (!rc) {
     if (intended_uid == 0) {
         complain
         rc = ERROR_INVAL;
     }
  }
  if (!rc) {
     save user and intended_uid
  }

> If you have a favorite color it might be better just to tell me. :-)

Do you see why I prefer the above ?

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid
  2018-12-18 15:07       ` Ian Jackson
@ 2018-12-20 22:12         ` George Dunlap
  0 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2018-12-20 22:12 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu


> On Dec 18, 2018, at 3:07 PM, Ian Jackson <ian.jackson@citrix.com> wrote:
> 
>> If we switch the earlier `return 0` in the !dm_restrict conditional to a “goto out”, then this would turn into:
> 
> I think setting intended_uid==0 when user==0 is a hostage to fortune.
> Why not set it to (uid_t)-1 ?
> 
> Then you write:
> 
>  if (!rc) {
>     if (intended_uid == 0) {
>         complain
>         rc = ERROR_INVAL;
>     }
>  }
>  if (!rc) {
>     save user and intended_uid
>  }
> 
>> If you have a favorite color it might be better just to tell me. :-)
> 
> Do you see why I prefer the above ?

Not at all I’m afraid.  Especially now that we have two separate “if (!rc)” checks,  I think having an extra label to jump to when we know the check needs to be done makes the code both more aesthetically pleasing and easier to understand.  But it’s your house, if you want it orange and purple, that’s what color I’ll paint it. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-12-20 22:12 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-06 15:02 [PATCH v2 01/10] libxl: Move dm user determination logic into a helper function George Dunlap
2018-12-06 15:02 ` [PATCH v2 02/10] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN) George Dunlap
2018-12-06 15:02 ` [PATCH v2 03/10] libxl: Clean up userlookup_helper_getpw* helper George Dunlap
2018-12-12 15:26   ` Ian Jackson
2018-12-12 17:39     ` George Dunlap
2018-12-06 15:02 ` [PATCH v2 04/10] dm_depriv: Describe expected usage of device_model_user parameter George Dunlap
2018-12-06 15:02 ` [PATCH v2 05/10] libxl: Do root checks once in libxl__domain_get_device_model_uid George Dunlap
2018-12-12 15:45   ` Ian Jackson
2018-12-12 18:20     ` George Dunlap
2018-12-18 15:07       ` Ian Jackson
2018-12-20 22:12         ` George Dunlap
2018-12-06 15:02 ` [PATCH v2 06/10] libxl: Move qmp cleanup into devicemodel destroy function George Dunlap
2018-12-06 15:02 ` [PATCH v2 07/10] libxl: Make killing of device model asynchronous George Dunlap
2018-12-12 15:46   ` Ian Jackson
2018-12-06 15:02 ` [PATCH v2 08/10] libxl: Kill QEMU by uid when possible George Dunlap
2018-12-12 16:17   ` Ian Jackson
2018-12-17 18:09     ` George Dunlap
2018-12-18 11:18       ` Ian Jackson
2018-12-06 15:02 ` [PATCH v2 09/10] libxl: Kill QEMU with "reaper" ruid George Dunlap
2018-12-12 16:30   ` Ian Jackson
2018-12-06 15:02 ` [PATCH v2 10/10] dm_depriv: Mark `UID cleanup` as completed George Dunlap
2018-12-12 16:31   ` Ian Jackson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).