All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
@ 2012-06-22 18:36 Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
                   ` (7 more replies)
  0 siblings, 8 replies; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

libvirt's sVirt security driver provides SELinux MAC isolation for
Qemu guest processes and their corresponding image files.  In other
words, sVirt uses SELinux to prevent a QEMU process from opening
files that do not belong to it.

sVirt provides this support by labeling guests and resources with
security labels that are stored in file system extended attributes.
Some file systems, such as NFS, do not support the extended
attribute security namespace, and therefore cannot support sVirt
isolation.

A solution to this problem is to provide fd passing support, where
libvirt opens files and passes file descriptors to QEMU.  This,
along with SELinux policy to prevent QEMU from opening files, can
provide image file isolation for NFS files stored on the same NFS
mount.

This patch series adds the pass-fd QMP monitor command, which allows
an fd to be passed via SCM_RIGHTS, and returns the received file
descriptor.  Support is also added to the block layer to allow QEMU
to dup the fd when the filename is of the /dev/fd/X format.  This
is useful if MAC policy prevents QEMU from opening specific types
of files.

One nice thing about this approach is that no new SELinux policy is
required to prevent open of NFS files (files with type nfs_t).  The
virt_use_nfs boolean type simply needs to be set to false, and open
will be prevented (and dup will be allowed).  For example:

    # setsebool virt_use_nfs 0
    # getsebool virt_use_nfs
    virt_use_nfs --> off

Corey Bryant (7):
  qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
  qapi: Convert getfd and closefd
  qapi: Add pass-fd QMP command
  qapi: Re-arrange monitor.c functions
  block: Prevent /dev/fd/X filename from being detected as floppy
  block: Convert open calls to qemu_open
  osdep: Enable qemu_open to dup pre-opened fd

 block/raw-posix.c |   22 +++++-----
 block/raw-win32.c |    4 +-
 block/vdi.c       |    5 ++-
 block/vmdk.c      |   21 ++++------
 block/vpc.c       |    2 +-
 block/vvfat.c     |   21 +++++-----
 cutils.c          |   26 +++++++++---
 dump.c            |    3 +-
 hmp-commands.hx   |    6 +--
 hmp.c             |   18 ++++++++
 hmp.h             |    2 +
 main-loop.c       |    6 +--
 migration-fd.c    |    2 +-
 monitor.c         |  120 ++++++++++++++++++++++++++++++++---------------------
 monitor.h         |    2 +-
 net.c             |    6 ++-
 osdep.c           |   91 ++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json  |   71 +++++++++++++++++++++++++++++++
 qemu-char.c       |    2 +-
 qemu-common.h     |    2 +-
 qmp-commands.hx   |   56 ++++++++++++++++++++++---
 21 files changed, 378 insertions(+), 110 deletions(-)

-- 
1.7.10.2

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-06-22 19:31   ` Eric Blake
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

This sets the close-on-exec flag for the file descriptor received
via SCM_RIGHTS.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v4
 -This patch is new in v4 (eblake@redhat.com)

 qemu-char.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-char.c b/qemu-char.c
index c2aaaee..f890113 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2263,7 +2263,7 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
     msg.msg_control = &msg_control;
     msg.msg_controllen = sizeof(msg_control);
 
-    ret = recvmsg(s->fd, &msg, 0);
+    ret = recvmsg(s->fd, &msg, MSG_CMSG_CLOEXEC);
     if (ret > 0 && s->is_unix)
         unix_process_msgfd(chr, &msg);
 
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-07-11 18:51   ` Luiz Capitulino
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v2:
 -Convert getfd and closefd to QAPI (lcapitulino@redhat.com)
 -Remove changes that returned fd from getfd (lcapitulino@redhat.com)
 -Wrap hmp_* functions around qmp_* functions (kwolf@redhat.com)
 -Move hmp_* functions to hmp.c (lcapitulino@redhat.com)
 -Drop .user_print lines (lcapitulino@redhat.com)
 -Use 'cmd' instead of 'cmd_new' for HMP (lcapitulino@redhat.com)
 -Change QMP command existance back to 0.14 (lcapitulino@redhat.com)

v3:
 -Remove unnecessary return statements from qmp_* functions
  (lcapitulino@redhat.com)
 -Add notes to QMP command describing behavior in more detail
  (lcapitulino@redhat.com, eblake@redhat.com)

v4:
 -No changes

 hmp-commands.hx  |    6 ++----
 hmp.c            |   18 ++++++++++++++++++
 hmp.h            |    2 ++
 monitor.c        |   32 ++++++++++++++------------------
 qapi-schema.json |   35 +++++++++++++++++++++++++++++++++++
 qmp-commands.hx  |   14 ++++++++++----
 6 files changed, 81 insertions(+), 26 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index f5d9d91..eea8b32 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1236,8 +1236,7 @@ ETEXI
         .args_type  = "fdname:s",
         .params     = "getfd name",
         .help       = "receive a file descriptor via SCM rights and assign it a name",
-        .user_print = monitor_user_noop,
-        .mhandler.cmd_new = do_getfd,
+        .mhandler.cmd = hmp_getfd,
     },
 
 STEXI
@@ -1253,8 +1252,7 @@ ETEXI
         .args_type  = "fdname:s",
         .params     = "closefd name",
         .help       = "close a file descriptor previously passed via SCM rights",
-        .user_print = monitor_user_noop,
-        .mhandler.cmd_new = do_closefd,
+        .mhandler.cmd = hmp_closefd,
     },
 
 STEXI
diff --git a/hmp.c b/hmp.c
index 2ce8cb9..6241856 100644
--- a/hmp.c
+++ b/hmp.c
@@ -999,3 +999,21 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict)
     qmp_netdev_del(id, &err);
     hmp_handle_error(mon, &err);
 }
+
+void hmp_getfd(Monitor *mon, const QDict *qdict)
+{
+    const char *fdname = qdict_get_str(qdict, "fdname");
+    Error *errp = NULL;
+
+    qmp_getfd(fdname, &errp);
+    hmp_handle_error(mon, &errp);
+}
+
+void hmp_closefd(Monitor *mon, const QDict *qdict)
+{
+    const char *fdname = qdict_get_str(qdict, "fdname");
+    Error *errp = NULL;
+
+    qmp_closefd(fdname, &errp);
+    hmp_handle_error(mon, &errp);
+}
diff --git a/hmp.h b/hmp.h
index 79d138d..8d2b0d7 100644
--- a/hmp.h
+++ b/hmp.h
@@ -64,5 +64,7 @@ void hmp_device_del(Monitor *mon, const QDict *qdict);
 void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict);
 void hmp_netdev_add(Monitor *mon, const QDict *qdict);
 void hmp_netdev_del(Monitor *mon, const QDict *qdict);
+void hmp_getfd(Monitor *mon, const QDict *qdict);
+void hmp_closefd(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/monitor.c b/monitor.c
index a3bc2c7..1a7f7e7 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2182,48 +2182,45 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict)
 }
 #endif
 
-static int do_getfd(Monitor *mon, const QDict *qdict, QObject **ret_data)
+void qmp_getfd(const char *fdname, Error **errp)
 {
-    const char *fdname = qdict_get_str(qdict, "fdname");
     mon_fd_t *monfd;
     int fd;
 
-    fd = qemu_chr_fe_get_msgfd(mon->chr);
+    fd = qemu_chr_fe_get_msgfd(cur_mon->chr);
     if (fd == -1) {
-        qerror_report(QERR_FD_NOT_SUPPLIED);
-        return -1;
+        error_set(errp, QERR_FD_NOT_SUPPLIED);
+        return;
     }
 
     if (qemu_isdigit(fdname[0])) {
-        qerror_report(QERR_INVALID_PARAMETER_VALUE, "fdname",
-                      "a name not starting with a digit");
-        return -1;
+        error_set(errp, QERR_INVALID_PARAMETER_VALUE, "fdname",
+                  "a name not starting with a digit");
+        return;
     }
 
-    QLIST_FOREACH(monfd, &mon->fds, next) {
+    QLIST_FOREACH(monfd, &cur_mon->fds, next) {
         if (strcmp(monfd->name, fdname) != 0) {
             continue;
         }
 
         close(monfd->fd);
         monfd->fd = fd;
-        return 0;
+        return;
     }
 
     monfd = g_malloc0(sizeof(mon_fd_t));
     monfd->name = g_strdup(fdname);
     monfd->fd = fd;
 
-    QLIST_INSERT_HEAD(&mon->fds, monfd, next);
-    return 0;
+    QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next);
 }
 
-static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data)
+void qmp_closefd(const char *fdname, Error **errp)
 {
-    const char *fdname = qdict_get_str(qdict, "fdname");
     mon_fd_t *monfd;
 
-    QLIST_FOREACH(monfd, &mon->fds, next) {
+    QLIST_FOREACH(monfd, &cur_mon->fds, next) {
         if (strcmp(monfd->name, fdname) != 0) {
             continue;
         }
@@ -2232,11 +2229,10 @@ static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data)
         close(monfd->fd);
         g_free(monfd->name);
         g_free(monfd);
-        return 0;
+        return;
     }
 
-    qerror_report(QERR_FD_NOT_FOUND, fdname);
-    return -1;
+    error_set(errp, QERR_FD_NOT_FOUND, fdname);
 }
 
 static void do_loadvm(Monitor *mon, const QDict *qdict)
diff --git a/qapi-schema.json b/qapi-schema.json
index 3b6e346..26a6b84 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1862,3 +1862,38 @@
 # Since: 0.14.0
 ##
 { 'command': 'netdev_del', 'data': {'id': 'str'} }
+
+##
+# @getfd:
+#
+# Receive a file descriptor via SCM rights and assign it a name
+#
+# @fdname: file descriptor name
+#
+# Returns: Nothing on success
+#          If file descriptor was not received, FdNotSupplied
+#          If @fdname is not valid, InvalidParameterType
+#
+# Since: 0.14.0
+#
+# Notes: If @fdname already exists, the file descriptor assigned to
+#        it will be closed and replaced by the received file
+#        descriptor.
+#        The 'closefd' command can be used to explicitly close the
+#        file descriptor when it is no longer needed.
+##
+{ 'command': 'getfd', 'data': {'fdname': 'str'} }
+
+##
+# @closefd:
+#
+# Close a file descriptor previously passed via SCM rights
+#
+# @fdname: file descriptor name
+#
+# Returns: Nothing on success
+#          If @fdname is not found, FdNotFound
+#
+# Since: 0.14.0
+##
+{ 'command': 'closefd', 'data': {'fdname': 'str'} }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 2e1a38e..e3cf3c5 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -873,8 +873,7 @@ EQMP
         .args_type  = "fdname:s",
         .params     = "getfd name",
         .help       = "receive a file descriptor via SCM rights and assign it a name",
-        .user_print = monitor_user_noop,
-        .mhandler.cmd_new = do_getfd,
+        .mhandler.cmd_new = qmp_marshal_input_getfd,
     },
 
 SQMP
@@ -892,6 +891,14 @@ Example:
 -> { "execute": "getfd", "arguments": { "fdname": "fd1" } }
 <- { "return": {} }
 
+Notes:
+
+(1) If the name specified by the "fdname" argument already exists,
+    the file descriptor assigned to it will be closed and replaced
+    by the received file descriptor.
+(2) The 'closefd' command can be used to explicitly close the file
+    descriptor when it is no longer needed.
+
 EQMP
 
     {
@@ -899,8 +906,7 @@ EQMP
         .args_type  = "fdname:s",
         .params     = "closefd name",
         .help       = "close a file descriptor previously passed via SCM rights",
-        .user_print = monitor_user_noop,
-        .mhandler.cmd_new = do_closefd,
+        .mhandler.cmd_new = qmp_marshal_input_closefd,
     },
 
 SQMP
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-06-22 20:24   ` Eric Blake
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

This patch adds the pass-fd QMP command using the QAPI framework.
Like the getfd command, it is used to pass a file descriptor via
SCM_RIGHTS and associate it with a name.  However, the pass-fd
command also returns the received file descriptor, which is a
difference in behavior from the getfd command, which returns
nothing.  pass-fd also supports a boolean "force" argument that
can be used to specify that an existing file descriptor is
associated with the specified name, and that it should become a
copy of the newly passed file descriptor.

The closefd command can be used to close a file descriptor that was
passed with the pass-fd command.

Note that when using getfd or pass-fd, there are some commands
(e.g. migrate with fd:name) that implicitly close the named fd.
When this is not the case, closefd must be used to avoid fd leaks.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v2:
 -Introduce new QMP command to pass/return fd (lcapitulino@redhat.com)
 -Use passfd as command name (berrange@redhat.com)

v3:
 -Use pass-fd as command name (lcapitulino@redhat.com)
 -Fail pass-fd if fdname already exists (lcapitulino@redhat.com)
 -Add notes to QMP command describing behavior in more detail
  (lcapitulino@redhat.com, eblake@redhat.com)
 -Add note about fd leakage (eblake@redhat.com)

v4:
 -Don't return same error class twice (lcapitulino@redhat.com)
 -Share code for qmp_gefd and qmp_pass_fd (lcapitulino@redhat.com)
 -Update qmp_closefd to call monitor_get_fd
 -Add support for "force" argument to pass-fd (eblake@redhat.com)

 dump.c           |    3 +-
 migration-fd.c   |    2 +-
 monitor.c        |   96 +++++++++++++++++++++++++++++++++++-------------------
 monitor.h        |    2 +-
 net.c            |    6 ++--
 qapi-schema.json |   36 ++++++++++++++++++++
 qmp-commands.hx  |   42 +++++++++++++++++++++++-
 7 files changed, 146 insertions(+), 41 deletions(-)

diff --git a/dump.c b/dump.c
index 4412d7a..2fd8ced 100644
--- a/dump.c
+++ b/dump.c
@@ -836,9 +836,8 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
 
 #if !defined(WIN32)
     if (strstart(file, "fd:", &p)) {
-        fd = monitor_get_fd(cur_mon, p);
+        fd = monitor_get_fd(cur_mon, p, errp);
         if (fd == -1) {
-            error_set(errp, QERR_FD_NOT_FOUND, p);
             return;
         }
     }
diff --git a/migration-fd.c b/migration-fd.c
index 50138ed..7335167 100644
--- a/migration-fd.c
+++ b/migration-fd.c
@@ -75,7 +75,7 @@ static int fd_close(MigrationState *s)
 
 int fd_start_outgoing_migration(MigrationState *s, const char *fdname)
 {
-    s->fd = monitor_get_fd(cur_mon, fdname);
+    s->fd = monitor_get_fd(cur_mon, fdname, NULL);
     if (s->fd == -1) {
         DPRINTF("fd_migration: invalid file descriptor identifier\n");
         goto err_after_get_fd;
diff --git a/monitor.c b/monitor.c
index 1a7f7e7..3433c06 100644
--- a/monitor.c
+++ b/monitor.c
@@ -814,7 +814,7 @@ static int add_graphics_client(Monitor *mon, const QDict *qdict, QObject **ret_d
     CharDriverState *s;
 
     if (strcmp(protocol, "spice") == 0) {
-        int fd = monitor_get_fd(mon, fdname);
+        int fd = monitor_get_fd(mon, fdname, NULL);
         int skipauth = qdict_get_try_bool(qdict, "skipauth", 0);
         int tls = qdict_get_try_bool(qdict, "tls", 0);
         if (!using_spice) {
@@ -828,18 +828,18 @@ static int add_graphics_client(Monitor *mon, const QDict *qdict, QObject **ret_d
         return 0;
 #ifdef CONFIG_VNC
     } else if (strcmp(protocol, "vnc") == 0) {
-	int fd = monitor_get_fd(mon, fdname);
+        int fd = monitor_get_fd(mon, fdname, NULL);
         int skipauth = qdict_get_try_bool(qdict, "skipauth", 0);
-	vnc_display_add_client(NULL, fd, skipauth);
-	return 0;
+        vnc_display_add_client(NULL, fd, skipauth);
+        return 0;
 #endif
     } else if ((s = qemu_chr_find(protocol)) != NULL) {
-	int fd = monitor_get_fd(mon, fdname);
-	if (qemu_chr_add_client(s, fd) < 0) {
-	    qerror_report(QERR_ADD_CLIENT_FAILED);
-	    return -1;
-	}
-	return 0;
+        int fd = monitor_get_fd(mon, fdname, NULL);
+        if (qemu_chr_add_client(s, fd) < 0) {
+            qerror_report(QERR_ADD_CLIENT_FAILED);
+            return -1;
+        }
+        return 0;
     }
 
     qerror_report(QERR_INVALID_PARAMETER, "protocol");
@@ -2182,57 +2182,69 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict)
 }
 #endif
 
-void qmp_getfd(const char *fdname, Error **errp)
+static int monitor_put_fd(Monitor *mon, const char *fdname, bool replace,
+                          bool copy, Error **errp)
 {
     mon_fd_t *monfd;
     int fd;
 
-    fd = qemu_chr_fe_get_msgfd(cur_mon->chr);
+    fd = qemu_chr_fe_get_msgfd(mon->chr);
     if (fd == -1) {
         error_set(errp, QERR_FD_NOT_SUPPLIED);
-        return;
+        return -1;
     }
 
     if (qemu_isdigit(fdname[0])) {
         error_set(errp, QERR_INVALID_PARAMETER_VALUE, "fdname",
                   "a name not starting with a digit");
-        return;
+        return -1;
     }
 
-    QLIST_FOREACH(monfd, &cur_mon->fds, next) {
+    QLIST_FOREACH(monfd, &mon->fds, next) {
         if (strcmp(monfd->name, fdname) != 0) {
             continue;
         }
 
-        close(monfd->fd);
-        monfd->fd = fd;
-        return;
+        if (replace) {
+            /* the existing fd is replaced with the new fd */
+            close(monfd->fd);
+            monfd->fd = fd;
+            return fd;
+        }
+
+        if (copy) {
+            /* the existing fd becomes a copy of the new fd */
+            if (dup2(fd, monfd->fd) == -1) {
+                error_set(errp, QERR_UNDEFINED_ERROR);
+                return -1;
+            }
+            close(fd);
+            return monfd->fd;
+        }
+
+        error_set(errp, QERR_INVALID_PARAMETER, "fdname");
+        return -1;
+    }
+
+    if (copy) {
+        error_set(errp, QERR_INVALID_PARAMETER, "fdname");
+        return -1;
     }
 
     monfd = g_malloc0(sizeof(mon_fd_t));
     monfd->name = g_strdup(fdname);
     monfd->fd = fd;
 
-    QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next);
+    QLIST_INSERT_HEAD(&mon->fds, monfd, next);
+    return fd;
 }
 
 void qmp_closefd(const char *fdname, Error **errp)
 {
-    mon_fd_t *monfd;
-
-    QLIST_FOREACH(monfd, &cur_mon->fds, next) {
-        if (strcmp(monfd->name, fdname) != 0) {
-            continue;
-        }
-
-        QLIST_REMOVE(monfd, next);
-        close(monfd->fd);
-        g_free(monfd->name);
-        g_free(monfd);
-        return;
+    int fd = monitor_get_fd(cur_mon, fdname, errp);
+    if (fd != -1) {
+        close(fd);
     }
-
-    error_set(errp, QERR_FD_NOT_FOUND, fdname);
 }
 
 static void do_loadvm(Monitor *mon, const QDict *qdict)
@@ -2247,7 +2259,7 @@ static void do_loadvm(Monitor *mon, const QDict *qdict)
     }
 }
 
-int monitor_get_fd(Monitor *mon, const char *fdname)
+int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp)
 {
     mon_fd_t *monfd;
 
@@ -2268,9 +2280,25 @@ int monitor_get_fd(Monitor *mon, const char *fdname)
         return fd;
     }
 
+    error_set(errp, QERR_FD_NOT_FOUND, fdname);
     return -1;
 }
 
+int64_t qmp_pass_fd(const char *fdname, bool has_force, bool force,
+                    Error **errp)
+{
+    bool replace = false;
+    bool copy = has_force ? force : false;
+    return monitor_put_fd(cur_mon, fdname, replace, copy, errp);
+}
+
+void qmp_getfd(const char *fdname, Error **errp)
+{
+    bool replace = true;
+    bool copy = false;
+    monitor_put_fd(cur_mon, fdname, replace, copy, errp);
+}
+
 /* mon_cmds and info_cmds would be sorted at runtime */
 static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
diff --git a/monitor.h b/monitor.h
index cd1d878..8fa515d 100644
--- a/monitor.h
+++ b/monitor.h
@@ -63,7 +63,7 @@ int monitor_read_block_device_key(Monitor *mon, const char *device,
                                   BlockDriverCompletionFunc *completion_cb,
                                   void *opaque);
 
-int monitor_get_fd(Monitor *mon, const char *fdname);
+int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp);
 
 void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
     GCC_FMT_ATTR(2, 0);
diff --git a/net.c b/net.c
index 4aa416c..28860b8 100644
--- a/net.c
+++ b/net.c
@@ -730,12 +730,14 @@ int qemu_find_nic_model(NICInfo *nd, const char * const *models,
 int net_handle_fd_param(Monitor *mon, const char *param)
 {
     int fd;
+    Error *local_err = NULL;
 
     if (!qemu_isdigit(param[0]) && mon) {
 
-        fd = monitor_get_fd(mon, param);
+        fd = monitor_get_fd(mon, param, &local_err);
         if (fd == -1) {
-            error_report("No file descriptor named %s found", param);
+            qerror_report_err(local_err);
+            error_free(local_err);
             return -1;
         }
     } else {
diff --git a/qapi-schema.json b/qapi-schema.json
index 26a6b84..2ac1261 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1864,6 +1864,41 @@
 { 'command': 'netdev_del', 'data': {'id': 'str'} }
 
 ##
+# @pass-fd:
+#
+# Pass a file descriptor via SCM rights and assign it a name
+#
+# @fdname: file descriptor name
+#
+# @force: #optional true specifies that an existing file descriptor is
+#         associated with @fdname, and that it should become a copy of the
+#         newly passed file descriptor.
+#
+# Returns: The QEMU file descriptor that is assigned to @fdname
+#          If file descriptor was not received, FdNotSupplied
+#          If @fdname is not valid, InvalidParameterValue
+#          If @fdname already exists, and @force is not true, InvalidParameter
+#          If @fdname does not already exist, and @force is true,
+#          InvalidParameter
+#          If @force fails to copy the passed file descriptor,
+#          UndefinedError
+#
+# Since: 1.2.0
+#
+# Notes: If @fdname already exists, and @force is not true, the
+#        command will fail.
+#
+#        If @fdname already exists, and @force is true, the value of
+#        the existing file descriptor is returned when the command is
+#        successful.
+#
+#        The 'closefd' command can be used to explicitly close the
+#        file descriptor when it is no longer needed.
+##
+{ 'command': 'pass-fd', 'data': {'fdname': 'str', '*force': 'bool'},
+  'returns': 'int' }
+
+##
 # @getfd:
 #
 # Receive a file descriptor via SCM rights and assign it a name
@@ -1879,6 +1914,7 @@
 # Notes: If @fdname already exists, the file descriptor assigned to
 #        it will be closed and replaced by the received file
 #        descriptor.
+#
 #        The 'closefd' command can be used to explicitly close the
 #        file descriptor when it is no longer needed.
 ##
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e3cf3c5..7e3c07e 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -869,9 +869,49 @@ Example:
 EQMP
 
     {
+        .name       = "pass-fd",
+        .args_type  = "fdname:s,force:b?",
+        .params     = "pass-fd fdname force",
+        .help       = "pass a file descriptor via SCM rights and assign it a name",
+        .mhandler.cmd_new = qmp_marshal_input_pass_fd,
+    },
+
+SQMP
+pass-fd
+-------
+
+Pass a file descriptor via SCM rights and assign it a name.
+
+Arguments:
+
+- "fdname": file descriptor name (json-string)
+- "force": true specifies that an existing file descriptor is associated
+           with "fdname", and that it should become a copy of the newly
+           passed file descriptor. (json-bool, optional)
+
+Return a json-int with the QEMU file descriptor that is assigned to "fdname".
+
+Example:
+
+-> { "execute": "pass-fd", "arguments": { "fdname": "fd1" } }
+<- { "return": 42 }
+
+Notes:
+
+(1) If the name specified by the "fdname" argument already exists, and
+    "force" is not true, the command will fail.
+(2) If the name specified by the "fdname" argument already exists, and
+    "force" is true, the value of the existing file descriptor is
+    returned when the command is successful.
+(3) The 'closefd' command can be used to explicitly close the file
+    descriptor when it is no longer needed.
+
+EQMP
+
+    {
         .name       = "getfd",
         .args_type  = "fdname:s",
-        .params     = "getfd name",
+        .params     = "getfd fdname",
         .help       = "receive a file descriptor via SCM rights and assign it a name",
         .mhandler.cmd_new = qmp_marshal_input_getfd,
     },
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
                   ` (2 preceding siblings ...)
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v4:
 -This patch is new in v4.

 monitor.c |   40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/monitor.c b/monitor.c
index 3433c06..153e949 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2239,26 +2239,6 @@ static int monitor_put_fd(Monitor *mon, const char *fdname, bool replace,
     return fd;
 }
 
-void qmp_closefd(const char *fdname, Error **errp)
-{
-    int fd = monitor_get_fd(cur_mon, fdname, errp);
-    if (fd != -1) {
-        close(fd);
-    }
-}
-
-static void do_loadvm(Monitor *mon, const QDict *qdict)
-{
-    int saved_vm_running  = runstate_is_running();
-    const char *name = qdict_get_str(qdict, "name");
-
-    vm_stop(RUN_STATE_RESTORE_VM);
-
-    if (load_vmstate(name) == 0 && saved_vm_running) {
-        vm_start();
-    }
-}
-
 int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp)
 {
     mon_fd_t *monfd;
@@ -2299,6 +2279,26 @@ void qmp_getfd(const char *fdname, Error **errp)
     monitor_put_fd(cur_mon, fdname, replace, copy, errp);
 }
 
+void qmp_closefd(const char *fdname, Error **errp)
+{
+    int fd = monitor_get_fd(cur_mon, fdname, errp);
+    if (fd != -1) {
+        close(fd);
+    }
+}
+
+static void do_loadvm(Monitor *mon, const QDict *qdict)
+{
+    int saved_vm_running  = runstate_is_running();
+    const char *name = qdict_get_str(qdict, "name");
+
+    vm_stop(RUN_STATE_RESTORE_VM);
+
+    if (load_vmstate(name) == 0 && saved_vm_running) {
+        vm_start();
+    }
+}
+
 /* mon_cmds and info_cmds would be sorted at runtime */
 static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
                   ` (3 preceding siblings ...)
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v3:
 -This patch is new in v3.  It was previously submitted on its
  own, and is now being included in this series.

v4
 -Moved patch to be earlier in series (lcapitulino@redhat.com)

 block/raw-posix.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 03fcfcc..0d9be0b 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -946,9 +946,11 @@ static int floppy_probe_device(const char *filename)
     int prio = 0;
     struct floppy_struct fdparam;
     struct stat st;
+    const char *p;
 
-    if (strstart(filename, "/dev/fd", NULL))
+    if (strstart(filename, "/dev/fd", &p) && p[0] != '/') {
         prio = 50;
+    }
 
     fd = open(filename, O_RDONLY | O_NONBLOCK);
     if (fd < 0) {
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
                   ` (4 preceding siblings ...)
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant
       [not found] ` <20120626091004.GA14451@redhat.com>
  7 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

This patch converts all block layer open calls to qemu_open.  This
enables all block layer open paths to dup(X) a pre-opened file
descriptor if the filename is of the format /dev/fd/X.  This is
useful if QEMU is restricted from opening certain files.

Note that this adds the O_CLOEXEC flag to the changed open paths
when the O_CLOEXEC macro is defined.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v2:
 -Convert calls to qemu_open instead of file_open (kwolf@redhat.com)
 -Mention introduction of O_CLOEXEC (kwolf@redhat.com)

v3:
 -No changes

v4:
 -No changes

 block/raw-posix.c |   18 +++++++++---------
 block/raw-win32.c |    4 ++--
 block/vdi.c       |    5 +++--
 block/vmdk.c      |   21 +++++++++------------
 block/vpc.c       |    2 +-
 block/vvfat.c     |   21 +++++++++++----------
 6 files changed, 35 insertions(+), 36 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 0d9be0b..68886cd 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -568,8 +568,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
     if (fd < 0) {
         result = -errno;
     } else {
@@ -741,7 +741,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
         if ( bsdPath[ 0 ] != '\0' ) {
             strcat(bsdPath,"s0");
             /* some CDs don't have a partition 0 */
-            fd = open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
+            fd = qemu_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
             if (fd < 0) {
                 bsdPath[strlen(bsdPath)-1] = '1';
             } else {
@@ -798,7 +798,7 @@ static int fd_open(BlockDriverState *bs)
 #endif
             return -EIO;
         }
-        s->fd = open(bs->filename, s->open_flags & ~O_NONBLOCK);
+        s->fd = qemu_open(bs->filename, s->open_flags & ~O_NONBLOCK);
         if (s->fd < 0) {
             s->fd_error_time = get_clock();
             s->fd_got_error = 1;
@@ -872,7 +872,7 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_BINARY);
+    fd = qemu_open(filename, O_WRONLY | O_BINARY);
     if (fd < 0)
         return -errno;
 
@@ -952,7 +952,7 @@ static int floppy_probe_device(const char *filename)
         prio = 50;
     }
 
-    fd = open(filename, O_RDONLY | O_NONBLOCK);
+    fd = qemu_open(filename, O_RDONLY | O_NONBLOCK);
     if (fd < 0) {
         goto out;
     }
@@ -1005,7 +1005,7 @@ static void floppy_eject(BlockDriverState *bs, bool eject_flag)
         close(s->fd);
         s->fd = -1;
     }
-    fd = open(bs->filename, s->open_flags | O_NONBLOCK);
+    fd = qemu_open(bs->filename, s->open_flags | O_NONBLOCK);
     if (fd >= 0) {
         if (ioctl(fd, FDEJECT, 0) < 0)
             perror("FDEJECT");
@@ -1055,7 +1055,7 @@ static int cdrom_probe_device(const char *filename)
     int prio = 0;
     struct stat st;
 
-    fd = open(filename, O_RDONLY | O_NONBLOCK);
+    fd = qemu_open(filename, O_RDONLY | O_NONBLOCK);
     if (fd < 0) {
         goto out;
     }
@@ -1179,7 +1179,7 @@ static int cdrom_reopen(BlockDriverState *bs)
      */
     if (s->fd >= 0)
         close(s->fd);
-    fd = open(bs->filename, s->open_flags, 0644);
+    fd = qemu_open(bs->filename, s->open_flags, 0644);
     if (fd < 0) {
         s->fd = -1;
         return -EIO;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index e4b0b75..8d7838d 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -255,8 +255,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
     if (fd < 0)
         return -EIO;
     set_sparse(fd);
diff --git a/block/vdi.c b/block/vdi.c
index 119d3c7..6183fdf 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -648,8 +648,9 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-              0644);
+    fd = qemu_open(filename,
+                   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                   0644);
     if (fd < 0) {
         return -errno;
     }
diff --git a/block/vmdk.c b/block/vmdk.c
index 18e9b4c..557dc1b 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1161,10 +1161,9 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
     VMDK4Header header;
     uint32_t tmp, magic, grains, gd_size, gt_size, gt_count;
 
-    fd = open(
-        filename,
-        O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-        0644);
+    fd = qemu_open(filename,
+                   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                   0644);
     if (fd < 0) {
         return -errno;
     }
@@ -1484,15 +1483,13 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
             (flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
             total_size / (int64_t)(63 * 16 * 512));
     if (split || flat) {
-        fd = open(
-                filename,
-                O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-                0644);
+        fd = qemu_open(filename,
+                       O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                       0644);
     } else {
-        fd = open(
-                filename,
-                O_WRONLY | O_BINARY | O_LARGEFILE,
-                0644);
+        fd = qemu_open(filename,
+                       O_WRONLY | O_BINARY | O_LARGEFILE,
+                       0644);
     }
     if (fd < 0) {
         return -errno;
diff --git a/block/vpc.c b/block/vpc.c
index 5cd13d1..60ebf5a 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -678,7 +678,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
     }
 
     /* Create the file */
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
     if (fd < 0) {
         return -EIO;
     }
diff --git a/block/vvfat.c b/block/vvfat.c
index 0fd3367..81c3a19 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1152,16 +1152,17 @@ static inline mapping_t* find_mapping_for_cluster(BDRVVVFATState* s,int cluster_
 static int open_file(BDRVVVFATState* s,mapping_t* mapping)
 {
     if(!mapping)
-	return -1;
+        return -1;
     if(!s->current_mapping ||
-	    strcmp(s->current_mapping->path,mapping->path)) {
-	/* open file */
-	int fd = open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE);
-	if(fd<0)
-	    return -1;
-	vvfat_close_current_file(s);
-	s->current_fd = fd;
-	s->current_mapping = mapping;
+            strcmp(s->current_mapping->path, mapping->path)) {
+        /* open file */
+        int fd = qemu_open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE);
+        if (fd < 0) {
+            return -1;
+        }
+        vvfat_close_current_file(s);
+        s->current_fd = fd;
+        s->current_mapping = mapping;
     }
     return 0;
 }
@@ -2215,7 +2216,7 @@ static int commit_one_file(BDRVVVFATState* s,
     for (i = s->cluster_size; i < offset; i += s->cluster_size)
 	c = modified_fat_get(s, c);
 
-    fd = open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
+    fd = qemu_open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
     if (fd < 0) {
 	fprintf(stderr, "Could not open %s... (%s, %d)\n", mapping->path,
 		strerror(errno), errno);
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd
  2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
                   ` (5 preceding siblings ...)
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant
@ 2012-06-22 18:36 ` Corey Bryant
  2012-06-22 19:58   ` Eric Blake
       [not found] ` <20120626091004.GA14451@redhat.com>
  7 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-06-22 18:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, pbonzini, eblake

This patch adds support to qemu_open to dup(fd) a pre-opened file
descriptor if the filename is of the format /dev/fd/X.

This can be used when QEMU is restricted from opening files, and
the management application opens files on QEMU's behalf.

If the fd was passed to the monitor with the pass-fd command, it
must be explicitly closed with the 'closefd' command when it is
no longer required, in order to prevent fd leaks.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v2:
 -Get rid of file_open and move dup code to qemu_open
  (kwolf@redhat.com)
 -Use strtol wrapper instead of atoi (kwolf@redhat.com)

v3:
 -Add note about fd leakage (eblake@redhat.com)

v4
 -Moved patch to be later in series (lcapitulino@redhat.com)
 -Update qemu_open to check access mode flags and set flags that
  can be set (eblake@redhat.com, kwolf@redhat.com)

 cutils.c      |   26 +++++++++++++----
 main-loop.c   |    6 ++--
 osdep.c       |   91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-common.h |    2 +-
 4 files changed, 116 insertions(+), 9 deletions(-)

diff --git a/cutils.c b/cutils.c
index af308cd..f45d921 100644
--- a/cutils.c
+++ b/cutils.c
@@ -339,17 +339,33 @@ bool buffer_is_zero(const void *buf, size_t len)
 }
 
 #ifndef _WIN32
-/* Sets a specific flag */
-int fcntl_setfl(int fd, int flag)
+/* Sets a specific flag on or off */
+int fcntl_setfl(int fd, int flag, int onoff)
 {
     int flags;
 
+    if (onoff != 0 && onoff != 1) {
+        return -EINVAL;
+    }
+
     flags = fcntl(fd, F_GETFL);
-    if (flags == -1)
+    if (flags == -1) {
         return -errno;
+    }
 
-    if (fcntl(fd, F_SETFL, flags | flag) == -1)
-        return -errno;
+    if (onoff == 1) {
+        if ((flags & flag) != flag) {
+            if (fcntl(fd, F_SETFL, flags | flag) == -1) {
+                return -errno;
+            }
+        }
+    } else {
+        if ((flags & flag) == flag) {
+            if (fcntl(fd, F_SETFL, flags & ~flag) == -1) {
+                return -errno;
+            }
+        }
+    }
 
     return 0;
 }
diff --git a/main-loop.c b/main-loop.c
index eb3b6e6..644fcc3 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -75,11 +75,11 @@ static int qemu_event_init(void)
     if (err == -1) {
         return -errno;
     }
-    err = fcntl_setfl(fds[0], O_NONBLOCK);
+    err = fcntl_setfl(fds[0], O_NONBLOCK, 1);
     if (err < 0) {
         goto fail;
     }
-    err = fcntl_setfl(fds[1], O_NONBLOCK);
+    err = fcntl_setfl(fds[1], O_NONBLOCK, 1);
     if (err < 0) {
         goto fail;
     }
@@ -154,7 +154,7 @@ static int qemu_signal_init(void)
         return -errno;
     }
 
-    fcntl_setfl(sigfd, O_NONBLOCK);
+    fcntl_setfl(sigfd, O_NONBLOCK, 1);
 
     qemu_set_fd_handler2(sigfd, NULL, sigfd_handler, NULL,
                          (void *)(intptr_t)sigfd);
diff --git a/osdep.c b/osdep.c
index 3e6bada..a6fc758d 100644
--- a/osdep.c
+++ b/osdep.c
@@ -73,6 +73,63 @@ int qemu_madvise(void *addr, size_t len, int advice)
 #endif
 }
 
+/*
+ * Dups an fd and sets the flags
+ */
+static int qemu_dup(int fd, int flags)
+{
+    int ret;
+    int serrno;
+
+    if (flags & O_CLOEXEC) {
+        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
+        if (ret == -1 && errno == EINVAL) {
+            ret = dup(fd);
+            if (ret == -1) {
+                goto fail;
+            }
+            if (fcntl_setfl(ret, O_CLOEXEC, (flags & O_CLOEXEC) ? 1 : 0) < 0) {
+                goto fail;
+            }
+        }
+    } else {
+        ret = dup(fd);
+    }
+
+    if (ret == -1) {
+        goto fail;
+    }
+
+    /* Truncate the file in the cases that open would truncate it */
+    if (flags & O_TRUNC ||
+            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
+        if (ftruncate(ret, 0) == -1) {
+            goto fail;
+        }
+    }
+
+    if ((fcntl_setfl(ret, O_APPEND,    (flags & O_APPEND)    ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_ASYNC,     (flags & O_ASYNC)     ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_DIRECT,    (flags & O_DIRECT)    ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_LARGEFILE, (flags & O_LARGEFILE) ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_NDELAY,    (flags & O_NDELAY)    ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_NOATIME,   (flags & O_NOATIME)   ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_NOCTTY,    (flags & O_NOCTTY)    ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_NONBLOCK,  (flags & O_NONBLOCK)  ? 1 : 0) < 0) ||
+        (fcntl_setfl(ret, O_SYNC,      (flags & O_SYNC)      ? 1 : 0) < 0)) {
+        goto fail;
+    }
+
+    return ret;
+
+fail:
+    serrno = errno;
+    if (ret != -1) {
+        close(ret);
+    }
+    errno = serrno;
+    return -1;
+}
 
 /*
  * Opens a file with FD_CLOEXEC set
@@ -82,6 +139,40 @@ int qemu_open(const char *name, int flags, ...)
     int ret;
     int mode = 0;
 
+#ifndef _WIN32
+    const char *p;
+
+    /* Attempt dup of fd for pre-opened file */
+    if (strstart(name, "/dev/fd/", &p)) {
+        int fd;
+        int eflags;
+
+        fd = qemu_parse_fd(p);
+        if (fd == -1) {
+            return -1;
+        }
+
+        /* Get the existing fd's flags */
+        eflags = fcntl(fd, F_GETFL);
+        if (eflags == -1) {
+            return -1;
+        }
+
+        if (((flags & O_RDWR) != (eflags & O_RDWR)) ||
+            ((flags & O_RDONLY) != (eflags & O_RDONLY)) ||
+            ((flags & O_WRONLY) != (eflags & O_WRONLY))) {
+            errno = EACCES;
+            return -1;
+        }
+
+        if (fcntl_setfl(fd, O_CLOEXEC, 1) < 0) {
+            return -1;
+        }
+
+        return qemu_dup(fd, flags);
+    }
+#endif
+
     if (flags & O_CREAT) {
         va_list ap;
 
diff --git a/qemu-common.h b/qemu-common.h
index 91e0562..99cbbc5 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -144,7 +144,7 @@ int qemu_strnlen(const char *s, int max_len);
 time_t mktimegm(struct tm *tm);
 int qemu_fls(int i);
 int qemu_fdatasync(int fd);
-int fcntl_setfl(int fd, int flag);
+int fcntl_setfl(int fd, int flag, int onoff);
 int qemu_parse_fd(const char *param);
 
 /*
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
@ 2012-06-22 19:31   ` Eric Blake
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Blake @ 2012-06-22 19:31 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino,
	pbonzini

[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]

On 06/22/2012 12:36 PM, Corey Bryant wrote:
> This sets the close-on-exec flag for the file descriptor received
> via SCM_RIGHTS.
> 
> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
> ---
> v4
>  -This patch is new in v4 (eblake@redhat.com)
> 
>  qemu-char.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/qemu-char.c b/qemu-char.c
> index c2aaaee..f890113 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -2263,7 +2263,7 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
>      msg.msg_control = &msg_control;
>      msg.msg_controllen = sizeof(msg_control);
>  
> -    ret = recvmsg(s->fd, &msg, 0);
> +    ret = recvmsg(s->fd, &msg, MSG_CMSG_CLOEXEC);

MSG_CMSG_CLOEXEC is not (yet) in POSIX (although it has been proposed
for addition); therefore, at the moment, it only exists on Linux and
Cygwin.  Does this need to have conditional code to allow compilation on
BSD, such as:

#ifndef MSG_CMSG_CLOEXEC
# define MSG_CMSG_CLOEXEC 0
#endif

as well as fallback code that sets FD_CLOEXEC manually via fcntl() when
MSG_CMSG_CLOEXEC is missing?

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant
@ 2012-06-22 19:58   ` Eric Blake
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Blake @ 2012-06-22 19:58 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino,
	pbonzini

[-- Attachment #1: Type: text/plain, Size: 4185 bytes --]

On 06/22/2012 12:36 PM, Corey Bryant wrote:
> This patch adds support to qemu_open to dup(fd) a pre-opened file
> descriptor if the filename is of the format /dev/fd/X.
> 
> This can be used when QEMU is restricted from opening files, and
> the management application opens files on QEMU's behalf.
> 
> If the fd was passed to the monitor with the pass-fd command, it
> must be explicitly closed with the 'closefd' command when it is
> no longer required, in order to prevent fd leaks.
> 

> +static int qemu_dup(int fd, int flags)
> +{
> +    int ret;
> +    int serrno;
> +
> +    if (flags & O_CLOEXEC) {
> +        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);

F_DUPFD_CLOEXEC is required by POSIX, but not implemented on all
platforms yet.  Do you need to be checking with #ifdef F_DUPFD_CLOEXEC
to avoid compilation failure?

> +        if (ret == -1 && errno == EINVAL) {
> +            ret = dup(fd);
> +            if (ret == -1) {
> +                goto fail;
> +            }
> +            if (fcntl_setfl(ret, O_CLOEXEC, (flags & O_CLOEXEC) ? 1 : 0) < 0) {

Broken.  O_CLOEXEC _only_ affects open(); to change it on an existing
fd, you have to use fcntl(F_GETFD/F_SETFD) (not F_GETFL/F_SETFL).


> +
> +    if ((fcntl_setfl(ret, O_APPEND,    (flags & O_APPEND)    ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_ASYNC,     (flags & O_ASYNC)     ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_DIRECT,    (flags & O_DIRECT)    ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_LARGEFILE, (flags & O_LARGEFILE) ? 1 : 0) < 0) ||

Pointless. O_LARGEFILE should _always_ be set, since we are compiling
for 64-bit off_t always.

> +        (fcntl_setfl(ret, O_NDELAY,    (flags & O_NDELAY)    ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_NOATIME,   (flags & O_NOATIME)   ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_NOCTTY,    (flags & O_NOCTTY)    ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_NONBLOCK,  (flags & O_NONBLOCK)  ? 1 : 0) < 0) ||
> +        (fcntl_setfl(ret, O_SYNC,      (flags & O_SYNC)      ? 1 : 0) < 0)) {

Yuck.  That's a lot of syscalls (1 per fcntl_setfl() if they are already
set correctly, and 2 per fcntl_setfl() call if we are toggling each
one).  It might be better to combine this into at most 2 fcntl() calls,
instead of a long sequence.


> +        /* Get the existing fd's flags */
> +        eflags = fcntl(fd, F_GETFL);
> +        if (eflags == -1) {
> +            return -1;
> +        }
> +
> +        if (((flags & O_RDWR) != (eflags & O_RDWR)) ||
> +            ((flags & O_RDONLY) != (eflags & O_RDONLY)) ||
> +            ((flags & O_WRONLY) != (eflags & O_WRONLY))) {

Broken.  O_RDWR, O_RDONLY, and O_WRONLY are NOT bitmasks, but are values
in the range of O_ACCMODE.  In particular, O_RDONLY==0 on some platforms
(Linux), and ==1 on others (Hurd), and although POSIX recommends that
O_RDWR==(O_RDONLY|O_WRONLY) for any new systems, no one has really done
that except Hurd.

A correct way to write this is:

switch (flags & O_ACCMODE) {
case O_RDWR:
    if ((eflags & O_ACCMODE) != O_RDWR) {
        goto error;
    break;
case O_RDONLY:
    if ((eflags & O_ACCMODE) != O_RDONLY) {
        goto error;
    break;
case O_RDONLY:
    if ((eflags & O_ACCMODE) != O_RDONLY) {
        goto error;
    break;
default:
    goto error:
}

[Technically, POSIX also requires O_ACCMODE to include O_SEARCH and
O_EXEC, although those two constants might be the same value; but right
now Linux has not yet implemented that bit; but unless qemu ever gains
the need to open executable binaries with O_EXEC or directories with
O_SEARCH, we probably don't have to worry about that aspect of O_ACCMODE
here.]

> +            errno = EACCES;
> +            return -1;
> +        }
> +
> +        if (fcntl_setfl(fd, O_CLOEXEC, 1) < 0) {

Again, broken.  Besides, why are you attempting it both here and in
qemu_dup()?  Shouldn't once be enough?

> +            return -1;
> +        }
> +
> +        return qemu_dup(fd, flags);

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
@ 2012-06-22 20:24   ` Eric Blake
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Blake @ 2012-06-22 20:24 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino,
	pbonzini

[-- Attachment #1: Type: text/plain, Size: 1616 bytes --]

On 06/22/2012 12:36 PM, Corey Bryant wrote:
> This patch adds the pass-fd QMP command using the QAPI framework.
> Like the getfd command, it is used to pass a file descriptor via
> SCM_RIGHTS and associate it with a name.  However, the pass-fd
> command also returns the received file descriptor, which is a
> difference in behavior from the getfd command, which returns
> nothing.  pass-fd also supports a boolean "force" argument that
> can be used to specify that an existing file descriptor is
> associated with the specified name, and that it should become a
> copy of the newly passed file descriptor.

> +        if (replace) {
> +            /* the existing fd is replaced with the new fd */
> +            close(monfd->fd);
> +            monfd->fd = fd;
> +            return fd;
> +        }
> +
> +        if (copy) {

Since 'replace' and 'copy' are never both true, should this instead be a
tri-state enum argument instead of two independent bool arguments?

> +            /* the existing fd becomes a copy of the new fd */
> +            if (dup2(fd, monfd->fd) == -1) {

Broken - you want to use dup3(fd, monfd->fd, O_CLOEXEC) (and fall back
to dup2()+fcntl(F_GETFD/F_SETFD) on platforms where dup3 is not
available; it has been proposed for the next revision of POSIX but right
now is pretty much limited to Linux and Cygwin).  Otherwise, you are
undoing the fact that patch 1/7 just changed the 'getfd' list to always
keep all its fds marked cloexec.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
       [not found]         ` <4FEA3D9C.8080205@redhat.com>
@ 2012-07-02 22:02           ` Corey Bryant
  2012-07-02 22:31             ` Eric Blake
  2012-07-03 15:40             ` Corey Bryant
  0 siblings, 2 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-02 22:02 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini



On 06/26/2012 06:54 PM, Eric Blake wrote:
> On 06/26/2012 04:28 PM, Corey Bryant wrote:
>
>>>>> With this proposed series, we have usage akin to:
>>>>>
>>>>>     1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing QEMU's
>>>>>        view of the FD
>>>>>     2. drive_add file=/dev/fd/N
>>>>>     3. if failure:
>>>>>          close_fd "/dev/fd/N"
>>>>
>>>> In fact, there are more steps:
>>>>
>>>> 4. use it successfully
>>>> 5. close_fd "/dev/fd/N"
>>>>
>>>> I think it would well be possible that qemu just closes the fd when it's
>>>> not used internally any more.
>>>
>>> pass-fd could have a flag indicating qemu to do that.
>>>
>>
>> It seems like libvirt would be in a better position to understand when a
>> file is no longer in use, and then it can call close_fd.  No?  Of course
>> the the only fd that needs to be closed is the originally passed fd. The
>> dup'd fd's are closed by QEMU.
>
> The more we discuss this, the more I'm convinced that commands like
> 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
> will need to have an optional argument that says the name of the file to
> reopen.  That is, I've seen three proposals:
>
> Proposal One: enhance every command to take an fd:name protocol.
> PRO: Successful use of a name removes the fd from the 'getfd' list.
> CON: Lots of commands to change.
> CON: The command sequence is not atomic.
> CON: Block layer does not have a file name tied to the fd, so the reopen
> operation also needs to learn the fd:name protocol, but since we're
> already touching the command it's not much more work.
> USAGE:
> 1. getfd FDSET={M}, ties new fd to "name"
> 2. drive_add fd=name - looks up the fd by name from the list
> 3a. on success, fd is no longer in the list, drive_add consumed it
> 3b. on failure, libvirt must call 'closefd name' to avoid a leak
> 4. getfd FDSET={M2}, ties new fd to "newname"
> 5. block-commit fd=newname - looks up fd by name from the list
> 6a. on success, fd is no longer in the list, block-commit consumed it
> 6b. on failure, libvirt must call 'closefd newname' to avoid a leak
>
> Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
> update that name to the new fd in advance of any operation that needs to
> do a reopen.
> PRO: All other commands work without impact by using qemu_open(), by
> getting a clone of the permanent name.
> CON: The permanent name must remain open as long as qemu might need to
> reopen it, and reopening needs the pass-fd force option.
> CON: The command sequence is not atomic.
> USAGE:
> 1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N")
> showing QEMU's permanent name of the FD
> 2. drive_add file=<permanent name> means that qemu_open() will clone the
> fd, and drive_add is now using yet another FD while N remains open;
> meanwhile, the block layer knows that the drive is tied to the permanent
> name
> 3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and
> still returns /dev/fd/N
> 4. block-commit - looks up the block-device name (/dev/fd/N), which maps
> back to the permanent fd, and gets a copy of the newly passed M2
> 5. on completion (success or failure), libvirt must call 'closefd name'
> to avoid a leak
>
> Proposal Three: Use thread-local fds passed alongside any command,
> rather than passing via a separate command
> PRO: All commands can now atomically handle fd passing
> PRO: Commands that only need a one-shot open can now use fd
> CON: Commands that need to reopen still need modification to allow a
> name override during the reopen
> 1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd
> is used as the block file, on failure it is atomically closed, so there
> is no leak either way. However, the block file has no permanent name.
> 2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file
> override must be passed again (since we have no guarantee that the
> /dev/fdlist/1 of _this_ command matches the command-local naming used in
> the previous command), but the fd is again used atomically
>
> Under proposal 3, there is no need to dup fd's, merely only to check
> that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd
> received via FDSET.  And the fact that things are made atomic means that
> libvirt no longer has to worry about calling closefd, nor does it have
> to worry about being interrupted between two monitor commands and not
> knowing what state qemu is in.  But it does mean teaching every possible
> command that wants to do a reopen to learn how to use fd overrides
> rather than to reuse the permanent name that was originally in place on
> the first open.
>

Here's another option that Kevin and I discussed today on IRC.  I've 
modified a few minor details since the discussion. And Kevin please 
correct me if anything is wrong.

Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds 
should all refer to the same file, but may have different access flags 
(ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the 
matching access mode flags.
pass-fds:
     { 'command': 'pass-fds',
       'data': {'fdsetname': 'str', '*close': 'bool'},
       'returns': 'string' }
close-fds:
     { 'command': 'close-fds',
       'data': {'fdsetname': 'str' }
where:
      @fdsetname - the file descriptor set name
      @close - *if true QEMU closes the monitor fds when they expire.
               if false, fds remain on the monitor until close-fds
               command.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage if close==true specified
CON: Determining when to close fds when close==true may be tricky
USAGE:
1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
the passed set of fds.
2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
list that has access flags matching the qemu_open() action flags.
3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
fd from the list that has access flags matching the qemu_open() action 
flags.
4. The monitor fds are closed:
    A) *If @close == true, qemu closes the set of fds when the timer
       expires
    B) If @close == false, libvirt must issue close-fds command to close
       the set of fds

*How to solve this has yet to be determined.  Kevin mentioned 
potentially using a bottom-half or a timer in qemu_close().

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-02 22:02           ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
@ 2012-07-02 22:31             ` Eric Blake
  2012-07-03  9:07               ` Daniel P. Berrange
                                 ` (2 more replies)
  2012-07-03 15:40             ` Corey Bryant
  1 sibling, 3 replies; 56+ messages in thread
From: Eric Blake @ 2012-07-02 22:31 UTC (permalink / raw)
  To: Corey Bryant
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini

[-- Attachment #1: Type: text/plain, Size: 3952 bytes --]

On 07/02/2012 04:02 PM, Corey Bryant wrote:

> Here's another option that Kevin and I discussed today on IRC.  I've
> modified a few minor details since the discussion. And Kevin please
> correct me if anything is wrong.
> 
> Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
> should all refer to the same file, but may have different access flags
> (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
> matching access mode flags.

But this means that libvirt has to open a file O_RDWR up front for any
file that it _might_ need qemu to reopen later, and that qemu is now
hanging on to 2 fds per fdset instead of 1 fd for the life of any client
of the fdset.

I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes
life harder for libvirt.  On the other hand, I can see from a safety
standpoint that passing in an O_RDWR fd opens up more potential for
errors than passing in an O_RDONLY fd, but if you don't know up front
whether you will ever need to write into a file, then it would be better
to pass in O_RDONLY.  At least I don't see it as a security risk:
passing in O_RDWR but setting SELinux permissions on the fd to only
allow read() but not write() via the labeling of the fd may be possible,
so that libvirt could still prevent accidental writes into an O_RDWR
file that starts life only needing to service reads.

> pass-fds:
>     { 'command': 'pass-fds',
>       'data': {'fdsetname': 'str', '*close': 'bool'},
>       'returns': 'string' }

This still doesn't solve Dan's complaint that things are not atomic; if
the monitor dies between the pass-fds and the later use of the fdset, we
would need a counterpart monitor command to query what fdsets are
currently known by qemu.  And like you pointed out, you didn't make it
clear how a timeout mechanism would be implemented to auto-close fds
that are not consumed in a fixed amount of time - would that be another
optional parameter to 'pass-fds'?

Or do we need a way to initially create a set with only one O_RDONLY fd,
but later pass in a new O_RDWR fd but associate it with the existing set
rather than creating a new set (akin to the 'force' option of 'pass-fd'
in proposal two)?

> close-fds:
>     { 'command': 'close-fds',
>       'data': {'fdsetname': 'str' }
> where:
>      @fdsetname - the file descriptor set name
>      @close - *if true QEMU closes the monitor fds when they expire.
>               if false, fds remain on the monitor until close-fds
>               command.
> PRO: Supports reopen
> PRO: All other commands work without impact by using qemu_open()
> PRO: No fd leakage if close==true specified
> CON: Determining when to close fds when close==true may be tricky
> USAGE:
> 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
> the passed set of fds.
> 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
> list that has access flags matching the qemu_open() action flags.
> 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
> fd from the list that has access flags matching the qemu_open() action
> flags.
> 4. The monitor fds are closed:
>    A) *If @close == true, qemu closes the set of fds when the timer
>       expires
>    B) If @close == false, libvirt must issue close-fds command to close
>       the set of fds
> 
> *How to solve this has yet to be determined.  Kevin mentioned
> potentially using a bottom-half or a timer in qemu_close().

Still, it is certainly an option worth thinking about, and I'm hoping we
can come to a solid design that everyone agrees provides the desired
security and flexibility without having to recode every single existing
command.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-02 22:31             ` Eric Blake
@ 2012-07-03  9:07               ` Daniel P. Berrange
  2012-07-03  9:40               ` Kevin Wolf
  2012-07-03 13:42               ` Corey Bryant
  2 siblings, 0 replies; 56+ messages in thread
From: Daniel P. Berrange @ 2012-07-03  9:07 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, Corey Bryant,
	qemu-devel, Luiz Capitulino, pbonzini

On Mon, Jul 02, 2012 at 04:31:09PM -0600, Eric Blake wrote:
> On 07/02/2012 04:02 PM, Corey Bryant wrote:
> 
> > Here's another option that Kevin and I discussed today on IRC.  I've
> > modified a few minor details since the discussion. And Kevin please
> > correct me if anything is wrong.
> > 
> > Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
> > should all refer to the same file, but may have different access flags
> > (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
> > matching access mode flags.
> 
> But this means that libvirt has to open a file O_RDWR up front for any
> file that it _might_ need qemu to reopen later, and that qemu is now
> hanging on to 2 fds per fdset instead of 1 fd for the life of any client
> of the fdset.
> 
> I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
> needs to use an O_RDONLY fd;

If libvirt has only granted read-only access to the file with sVirt, then
passing a O_RDWR file handle to QEMU will result in an SELinux denial,
even if QEMU doesn't try to do I/O on it. So this is out of the question.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-02 22:31             ` Eric Blake
  2012-07-03  9:07               ` Daniel P. Berrange
@ 2012-07-03  9:40               ` Kevin Wolf
  2012-07-03 13:42               ` Corey Bryant
  2 siblings, 0 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-03  9:40 UTC (permalink / raw)
  To: Eric Blake
  Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel,
	Luiz Capitulino, pbonzini

Am 03.07.2012 00:31, schrieb Eric Blake:
> On 07/02/2012 04:02 PM, Corey Bryant wrote:
> 
>> Here's another option that Kevin and I discussed today on IRC.  I've
>> modified a few minor details since the discussion. And Kevin please
>> correct me if anything is wrong.
>>
>> Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
>> should all refer to the same file, but may have different access flags
>> (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
>> matching access mode flags.
> 
> But this means that libvirt has to open a file O_RDWR up front for any
> file that it _might_ need qemu to reopen later, and that qemu is now
> hanging on to 2 fds per fdset instead of 1 fd for the life of any client
> of the fdset.

It doesn't have to be up front. There's no reason not to have monitor
commands that add or remove fds from a given fdset later.

> I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
> needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes
> life harder for libvirt.  On the other hand, I can see from a safety
> standpoint that passing in an O_RDWR fd opens up more potential for
> errors than passing in an O_RDONLY fd, 

Yes, this is exactly my consideration. Having a read-only file opened as
O_RDWR gives us the chance to make stupid mistakes.

> but if you don't know up front
> whether you will ever need to write into a file, then it would be better
> to pass in O_RDONLY.  At least I don't see it as a security risk:
> passing in O_RDWR but setting SELinux permissions on the fd to only
> allow read() but not write() via the labeling of the fd may be possible,
> so that libvirt could still prevent accidental writes into an O_RDWR
> file that starts life only needing to service reads.

But this would assume that libvirt knows exactly when a reopen happens,
for each current and future qemu version. I wouldn't want to tie qemu's
internals so closely to the management application, even if libvirt
could probably get it reasonably right (allow rw before sending a
monitor command; revoke rw when receiving a QMP event that the commit
has completed). It wouldn't work if we had a qemu-initiated ro->rw
transition, but I don't think we have one at the moment.

>> pass-fds:
>>     { 'command': 'pass-fds',
>>       'data': {'fdsetname': 'str', '*close': 'bool'},
>>       'returns': 'string' }
> 
> This still doesn't solve Dan's complaint that things are not atomic; if
> the monitor dies between the pass-fds and the later use of the fdset, we
> would need a counterpart monitor command to query what fdsets are
> currently known by qemu. 

If you want a query-fdsets, that should be easy enough.

Actually, we might be able to even make the command transactionable.
This would of course require blockdev-add to be transactionable as well
to be of any use.

> And like you pointed out, you didn't make it
> clear how a timeout mechanism would be implemented to auto-close fds
> that are not consumed in a fixed amount of time - would that be another
> optional parameter to 'pass-fds'?

Do you think a timeout is helpful? Can't we just drop libvirt's
reference automatically if the monitor connection gets closed?

The BH/timer thing Corey mentioned is more about the qemu internal
problem that during a reopen there may be a short period where the old
fd is closed, but the new one not yet opened, so the fdset might need to
survive a short period with refcount 0.

> Or do we need a way to initially create a set with only one O_RDONLY fd,
> but later pass in a new O_RDWR fd but associate it with the existing set
> rather than creating a new set (akin to the 'force' option of 'pass-fd'
> in proposal two)?

As I said above, yes, I think this makes a lot sense.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-02 22:31             ` Eric Blake
  2012-07-03  9:07               ` Daniel P. Berrange
  2012-07-03  9:40               ` Kevin Wolf
@ 2012-07-03 13:42               ` Corey Bryant
  2 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-03 13:42 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini



On 07/02/2012 06:31 PM, Eric Blake wrote:
> On 07/02/2012 04:02 PM, Corey Bryant wrote:
>
>> Here's another option that Kevin and I discussed today on IRC.  I've
>> modified a few minor details since the discussion. And Kevin please
>> correct me if anything is wrong.
>>
>> Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
>> should all refer to the same file, but may have different access flags
>> (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
>> matching access mode flags.
>
> But this means that libvirt has to open a file O_RDWR up front for any
> file that it _might_ need qemu to reopen later, and that qemu is now
> hanging on to 2 fds per fdset instead of 1 fd for the life of any client
> of the fdset.
>
> I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
> needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes
> life harder for libvirt.  On the other hand, I can see from a safety
> standpoint that passing in an O_RDWR fd opens up more potential for
> errors than passing in an O_RDONLY fd, but if you don't know up front
> whether you will ever need to write into a file, then it would be better
> to pass in O_RDONLY.  At least I don't see it as a security risk:
> passing in O_RDWR but setting SELinux permissions on the fd to only
> allow read() but not write() via the labeling of the fd may be possible,
> so that libvirt could still prevent accidental writes into an O_RDWR
> file that starts life only needing to service reads.
>
>> pass-fds:
>>      { 'command': 'pass-fds',
>>        'data': {'fdsetname': 'str', '*close': 'bool'},
>>        'returns': 'string' }
>
> This still doesn't solve Dan's complaint that things are not atomic; if
> the monitor dies between the pass-fds and the later use of the fdset, we
> would need a counterpart monitor command to query what fdsets are
> currently known by qemu.  And like you pointed out, you didn't make it
> clear how a timeout mechanism would be implemented to auto-close fds
> that are not consumed in a fixed amount of time - would that be another
> optional parameter to 'pass-fds'?

Yes see the description of the close bool on pass-fds below which 
determines how the fds are closed.

It's not an atomic operation, but it does handle Dan's concern of 
preventing fd leaks.

If libvirt did crash after the pass-fd, then couldn't it just abandon 
the fd set (which will get closed) and libvirt could send a new fd set 
and work with that?

>
> Or do we need a way to initially create a set with only one O_RDONLY fd,
> but later pass in a new O_RDWR fd but associate it with the existing set
> rather than creating a new set (akin to the 'force' option of 'pass-fd'
> in proposal two)?

Yeah I see what you're saying.

I was also wondering if it would make sense for qemu to consume the 
passed fds and use those (ie. no dup()).  This would prevent the issues 
of having to close the fds that linger on the monitor.  But I don't know 
if it's realistic for libvirt to know how many open calls qemu will need 
to make (this would need 1 fd passed per open call).

>
>> close-fds:
>>      { 'command': 'close-fds',
>>        'data': {'fdsetname': 'str' }
>> where:
>>       @fdsetname - the file descriptor set name
>>       @close - *if true QEMU closes the monitor fds when they expire.
>>                if false, fds remain on the monitor until close-fds
>>                command.
>> PRO: Supports reopen
>> PRO: All other commands work without impact by using qemu_open()
>> PRO: No fd leakage if close==true specified
>> CON: Determining when to close fds when close==true may be tricky
>> USAGE:
>> 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
>> the passed set of fds.
>> 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
>> list that has access flags matching the qemu_open() action flags.
>> 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
>> fd from the list that has access flags matching the qemu_open() action
>> flags.
>> 4. The monitor fds are closed:
>>     A) *If @close == true, qemu closes the set of fds when the timer
>>        expires
>>     B) If @close == false, libvirt must issue close-fds command to close
>>        the set of fds
>>
>> *How to solve this has yet to be determined.  Kevin mentioned
>> potentially using a bottom-half or a timer in qemu_close().
>
> Still, it is certainly an option worth thinking about, and I'm hoping we
> can come to a solid design that everyone agrees provides the desired
> security and flexibility without having to recode every single existing
> command.
>

I agree.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-02 22:02           ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
  2012-07-02 22:31             ` Eric Blake
@ 2012-07-03 15:40             ` Corey Bryant
  2012-07-03 15:59               ` Kevin Wolf
  1 sibling, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-03 15:40 UTC (permalink / raw)
  To: Eric Blake, Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino, pbonzini



On 07/02/2012 06:02 PM, Corey Bryant wrote:
>
>
> On 06/26/2012 06:54 PM, Eric Blake wrote:
>> On 06/26/2012 04:28 PM, Corey Bryant wrote:
>>
>>>>>> With this proposed series, we have usage akin to:
>>>>>>
>>>>>>     1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing
>>>>>> QEMU's
>>>>>>        view of the FD
>>>>>>     2. drive_add file=/dev/fd/N
>>>>>>     3. if failure:
>>>>>>          close_fd "/dev/fd/N"
>>>>>
>>>>> In fact, there are more steps:
>>>>>
>>>>> 4. use it successfully
>>>>> 5. close_fd "/dev/fd/N"
>>>>>
>>>>> I think it would well be possible that qemu just closes the fd when
>>>>> it's
>>>>> not used internally any more.
>>>>
>>>> pass-fd could have a flag indicating qemu to do that.
>>>>
>>>
>>> It seems like libvirt would be in a better position to understand when a
>>> file is no longer in use, and then it can call close_fd.  No?  Of course
>>> the the only fd that needs to be closed is the originally passed fd. The
>>> dup'd fd's are closed by QEMU.
>>
>> The more we discuss this, the more I'm convinced that commands like
>> 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
>> will need to have an optional argument that says the name of the file to
>> reopen.  That is, I've seen three proposals:
>>
>> Proposal One: enhance every command to take an fd:name protocol.
>> PRO: Successful use of a name removes the fd from the 'getfd' list.
>> CON: Lots of commands to change.
>> CON: The command sequence is not atomic.
>> CON: Block layer does not have a file name tied to the fd, so the reopen
>> operation also needs to learn the fd:name protocol, but since we're
>> already touching the command it's not much more work.
>> USAGE:
>> 1. getfd FDSET={M}, ties new fd to "name"
>> 2. drive_add fd=name - looks up the fd by name from the list
>> 3a. on success, fd is no longer in the list, drive_add consumed it
>> 3b. on failure, libvirt must call 'closefd name' to avoid a leak
>> 4. getfd FDSET={M2}, ties new fd to "newname"
>> 5. block-commit fd=newname - looks up fd by name from the list
>> 6a. on success, fd is no longer in the list, block-commit consumed it
>> 6b. on failure, libvirt must call 'closefd newname' to avoid a leak
>>
>> Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
>> update that name to the new fd in advance of any operation that needs to
>> do a reopen.
>> PRO: All other commands work without impact by using qemu_open(), by
>> getting a clone of the permanent name.
>> CON: The permanent name must remain open as long as qemu might need to
>> reopen it, and reopening needs the pass-fd force option.
>> CON: The command sequence is not atomic.
>> USAGE:
>> 1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N")
>> showing QEMU's permanent name of the FD
>> 2. drive_add file=<permanent name> means that qemu_open() will clone the
>> fd, and drive_add is now using yet another FD while N remains open;
>> meanwhile, the block layer knows that the drive is tied to the permanent
>> name
>> 3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and
>> still returns /dev/fd/N
>> 4. block-commit - looks up the block-device name (/dev/fd/N), which maps
>> back to the permanent fd, and gets a copy of the newly passed M2
>> 5. on completion (success or failure), libvirt must call 'closefd name'
>> to avoid a leak
>>
>> Proposal Three: Use thread-local fds passed alongside any command,
>> rather than passing via a separate command
>> PRO: All commands can now atomically handle fd passing
>> PRO: Commands that only need a one-shot open can now use fd
>> CON: Commands that need to reopen still need modification to allow a
>> name override during the reopen
>> 1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd
>> is used as the block file, on failure it is atomically closed, so there
>> is no leak either way. However, the block file has no permanent name.
>> 2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file
>> override must be passed again (since we have no guarantee that the
>> /dev/fdlist/1 of _this_ command matches the command-local naming used in
>> the previous command), but the fd is again used atomically
>>
>> Under proposal 3, there is no need to dup fd's, merely only to check
>> that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd
>> received via FDSET.  And the fact that things are made atomic means that
>> libvirt no longer has to worry about calling closefd, nor does it have
>> to worry about being interrupted between two monitor commands and not
>> knowing what state qemu is in.  But it does mean teaching every possible
>> command that wants to do a reopen to learn how to use fd overrides
>> rather than to reuse the permanent name that was originally in place on
>> the first open.
>>
>
> Here's another option that Kevin and I discussed today on IRC.  I've
> modified a few minor details since the discussion. And Kevin please
> correct me if anything is wrong.
>
> Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
> should all refer to the same file, but may have different access flags
> (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
> matching access mode flags.
> pass-fds:
>      { 'command': 'pass-fds',
>        'data': {'fdsetname': 'str', '*close': 'bool'},
>        'returns': 'string' }
> close-fds:
>      { 'command': 'close-fds',
>        'data': {'fdsetname': 'str' }
> where:
>       @fdsetname - the file descriptor set name
>       @close - *if true QEMU closes the monitor fds when they expire.
>                if false, fds remain on the monitor until close-fds
>                command.
> PRO: Supports reopen
> PRO: All other commands work without impact by using qemu_open()
> PRO: No fd leakage if close==true specified
> CON: Determining when to close fds when close==true may be tricky
> USAGE:
> 1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
> the passed set of fds.
> 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
> list that has access flags matching the qemu_open() action flags.
> 3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
> fd from the list that has access flags matching the qemu_open() action
> flags.
> 4. The monitor fds are closed:
>     A) *If @close == true, qemu closes the set of fds when the timer
>        expires
>     B) If @close == false, libvirt must issue close-fds command to close
>        the set of fds
>
> *How to solve this has yet to be determined.  Kevin mentioned
> potentially using a bottom-half or a timer in qemu_close().
>

Thanks again for taking time to discuss this at today's QEMU community call.

Here's the proposal we discussed at the call.  Please let me know if I 
missed anything or if there are any issues with this design.

Proposal Five:  New monitor commands enable adding/removing an fd 
to/from a set.  The set of fds should all refer to the same file, but 
may have different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can 
then dup the fd that has the matching access mode flags.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage (fds are associated with monitor connection and, if 
not in use, closed when monitor disconnects)
PRO: Security-wise this is ok since libvirt can manage the set of fd's 
(ie. it can add/remove an O_RDWR fd to/from the set as needed).
CON: Not atomic (e.g. doesn't add an fd with single drive_add command).
USAGE:
1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named 
"/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller.  libvirt 
in-use flag turned on for fd.
2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
set that has access flags matching the qemu_open action flags. 
qemu_open increments refcount for this fd.
3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named 
"/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt 
in-use flag turned on for fd.
3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
fd from the set that has access flags matching the qemu_open action 
flags.  qemu_open increments refcount for this fd.
4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the 
set.  turns libvirt in-use flag off marking the fd ready to be closed 
when qemu is done with it.
5. qemu_close decrements refcount for fd, and closes fd when refcount is 
zero and libvirt in use flag is off.

More functional details:
-If libvirt crashes it can call "query-fd /dev/fdset/1" to determine 
which fds are open in the set.
-If monitor connection closes, qemu will close fds that have a refcount 
of zero.  Do we also need a qemu in-use flag in case refcount is zero 
and fd is still in use?
-This support requires introduction of qemu_close function that will be 
called everywhere in block layer that close is currently called.

Notes:
-Patch series 1 will include support for all of the above.  This will be 
my initial focus.
-Patch series 2 will include command line support that enables 
association of command line fd with a monitor set.  This will be my 
secondary focus, most likely after patch series 1 is applied.


-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 15:40             ` Corey Bryant
@ 2012-07-03 15:59               ` Kevin Wolf
  2012-07-03 16:25                 ` Corey Bryant
  0 siblings, 1 reply; 56+ messages in thread
From: Kevin Wolf @ 2012-07-03 15:59 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 03.07.2012 17:40, schrieb Corey Bryant:
> Thanks again for taking time to discuss this at today's QEMU community call.
> 
> Here's the proposal we discussed at the call.  Please let me know if I 
> missed anything or if there are any issues with this design.
> 
> Proposal Five:  New monitor commands enable adding/removing an fd 
> to/from a set.  The set of fds should all refer to the same file, but 
> may have different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can 
> then dup the fd that has the matching access mode flags.
> PRO: Supports reopen
> PRO: All other commands work without impact by using qemu_open()
> PRO: No fd leakage (fds are associated with monitor connection and, if 
> not in use, closed when monitor disconnects)
> PRO: Security-wise this is ok since libvirt can manage the set of fd's 
> (ie. it can add/remove an O_RDWR fd to/from the set as needed).
> CON: Not atomic (e.g. doesn't add an fd with single drive_add command).
> USAGE:
> 1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named 
> "/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller.  libvirt 
> in-use flag turned on for fd.

I thought qemu would rather return the number of the fdset (which it
also assigns if none it passed, i.e. for fdset creation). Does libvirt
need the number of an individual fd?

If libvirt prefers to assign fdset numbers itself, I'm not against it,
it's just something that wasn't clear to me yet.

> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
> set that has access flags matching the qemu_open action flags. 
> qemu_open increments refcount for this fd.
> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named 
> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt 
> in-use flag turned on for fd.
> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
> fd from the set that has access flags matching the qemu_open action 
> flags.  qemu_open increments refcount for this fd.
> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the 
> set.  turns libvirt in-use flag off marking the fd ready to be closed 
> when qemu is done with it.

If we decided to not return the individual fd numbers to libvirt, file
descriptors would be uniquely identified by an fdset/flags pair here.

> 5. qemu_close decrements refcount for fd, and closes fd when refcount is 
> zero and libvirt in use flag is off.

The monitor could just hold another reference, then we save the
additional flag. But that's a qemu implementation detail.

> More functional details:
> -If libvirt crashes it can call "query-fd /dev/fdset/1" to determine 
> which fds are open in the set.

We also need a query-fdsets command that lists all fdsets that exist. If
we add information about single fds to the return value of it, we
probably don't need a separate query-fd that operates on a single fdset.

> -If monitor connection closes, qemu will close fds that have a refcount 
> of zero.  Do we also need a qemu in-use flag in case refcount is zero 
> and fd is still in use?

In use by whom? If it's still in use in qemu (as in "in-use flag would
be set") and we have a refcount of zero, then that's a bug.

> -This support requires introduction of qemu_close function that will be 
> called everywhere in block layer that close is currently called.
> 
> Notes:
> -Patch series 1 will include support for all of the above.  This will be 
> my initial focus.
> -Patch series 2 will include command line support that enables 
> association of command line fd with a monitor set.  This will be my 
> secondary focus, most likely after patch series 1 is applied.

Thanks, this is a good and as far as I can tell complete summary of what
we discussed.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 15:59               ` Kevin Wolf
@ 2012-07-03 16:25                 ` Corey Bryant
  2012-07-03 17:03                   ` Eric Blake
  0 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-03 16:25 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/03/2012 11:59 AM, Kevin Wolf wrote:
> Am 03.07.2012 17:40, schrieb Corey Bryant:
>> Thanks again for taking time to discuss this at today's QEMU community call.
>>
>> Here's the proposal we discussed at the call.  Please let me know if I
>> missed anything or if there are any issues with this design.
>>
>> Proposal Five:  New monitor commands enable adding/removing an fd
>> to/from a set.  The set of fds should all refer to the same file, but
>> may have different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can
>> then dup the fd that has the matching access mode flags.
>> PRO: Supports reopen
>> PRO: All other commands work without impact by using qemu_open()
>> PRO: No fd leakage (fds are associated with monitor connection and, if
>> not in use, closed when monitor disconnects)
>> PRO: Security-wise this is ok since libvirt can manage the set of fd's
>> (ie. it can add/remove an O_RDWR fd to/from the set as needed).
>> CON: Not atomic (e.g. doesn't add an fd with single drive_add command).
>> USAGE:
>> 1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>> "/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller.  libvirt
>> in-use flag turned on for fd.
>
> I thought qemu would rather return the number of the fdset (which it
> also assigns if none it passed, i.e. for fdset creation). Does libvirt
> need the number of an individual fd?
>
> If libvirt prefers to assign fdset numbers itself, I'm not against it,
> it's just something that wasn't clear to me yet.
>

That's fine.  QEMU can return the fdset number or a string 
(/dev/fdset/1) if none is specified.  And an fdset will need to be 
specified if adding to an existing set.

I think libvirt will need the fd returned by add-fd so that it can 
evaluate fds returned by query-fd.  It's also useful for remove-fd.

>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
>> set that has access flags matching the qemu_open action flags.
>> qemu_open increments refcount for this fd.
>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt
>> in-use flag turned on for fd.
>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
>> fd from the set that has access flags matching the qemu_open action
>> flags.  qemu_open increments refcount for this fd.
>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the
>> set.  turns libvirt in-use flag off marking the fd ready to be closed
>> when qemu is done with it.
>
> If we decided to not return the individual fd numbers to libvirt, file
> descriptors would be uniquely identified by an fdset/flags pair here.
>

Are you saying we'd pass the fdset name and flags parameters on 
remove-fd to somehow identify the fds to remove?

>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>> zero and libvirt in use flag is off.
>
> The monitor could just hold another reference, then we save the
> additional flag. But that's a qemu implementation detail.
>

I'm not sure I understand what you mean.

>> More functional details:
>> -If libvirt crashes it can call "query-fd /dev/fdset/1" to determine
>> which fds are open in the set.
>
> We also need a query-fdsets command that lists all fdsets that exist. If
> we add information about single fds to the return value of it, we
> probably don't need a separate query-fd that operates on a single fdset.
>

Yes, good point.  And maybe we don't need 2 commands.  query-fdsets 
could return all the sets and all the fds that are in those sets.

>> -If monitor connection closes, qemu will close fds that have a refcount
>> of zero.  Do we also need a qemu in-use flag in case refcount is zero
>> and fd is still in use?
>
> In use by whom? If it's still in use in qemu (as in "in-use flag would
> be set") and we have a refcount of zero, then that's a bug.
>

In use by qemu.  I don't think it's a bug.  I think there are situations 
where refcount gets to zero but qemu is still using the fd.

>> -This support requires introduction of qemu_close function that will be
>> called everywhere in block layer that close is currently called.
>>
>> Notes:
>> -Patch series 1 will include support for all of the above.  This will be
>> my initial focus.
>> -Patch series 2 will include command line support that enables
>> association of command line fd with a monitor set.  This will be my
>> secondary focus, most likely after patch series 1 is applied.
>
> Thanks, this is a good and as far as I can tell complete summary of what
> we discussed.
>
> Kevin
>

Definitely!  Thank you for all the input.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 16:25                 ` Corey Bryant
@ 2012-07-03 17:03                   ` Eric Blake
  2012-07-03 17:46                     ` Corey Bryant
  2012-07-04  8:00                     ` Kevin Wolf
  0 siblings, 2 replies; 56+ messages in thread
From: Eric Blake @ 2012-07-03 17:03 UTC (permalink / raw)
  To: Corey Bryant
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini

[-- Attachment #1: Type: text/plain, Size: 6833 bytes --]

On 07/03/2012 10:25 AM, Corey Bryant wrote:

>> I thought qemu would rather return the number of the fdset (which it
>> also assigns if none it passed, i.e. for fdset creation). Does libvirt
>> need the number of an individual fd?
>>
>> If libvirt prefers to assign fdset numbers itself, I'm not against it,
>> it's just something that wasn't clear to me yet.
>>
> 
> That's fine.  QEMU can return the fdset number or a string
> (/dev/fdset/1) if none is specified.  And an fdset will need to be
> specified if adding to an existing set.
> 
> I think libvirt will need the fd returned by add-fd so that it can
> evaluate fds returned by query-fd.  It's also useful for remove-fd.

Correct - since we will be adding a remove-fd, then that command needs
to know both the fdset name and the individual fd within the set to be
removed.

> 
>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
>>> set that has access flags matching the qemu_open action flags.
>>> qemu_open increments refcount for this fd.
>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt
>>> in-use flag turned on for fd.
>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
>>> fd from the set that has access flags matching the qemu_open action
>>> flags.  qemu_open increments refcount for this fd.
>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the
>>> set.  turns libvirt in-use flag off marking the fd ready to be closed
>>> when qemu is done with it.
>>
>> If we decided to not return the individual fd numbers to libvirt, file
>> descriptors would be uniquely identified by an fdset/flags pair here.
>>
> 
> Are you saying we'd pass the fdset name and flags parameters on
> remove-fd to somehow identify the fds to remove?

Passing the flag parameters is not trivial, as that would mean the QMP
code would have to define constants mapping to all of the O_* flags that
qemu_open supports.  It's easier to support closing by fd number.

> 
>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>>> zero and libvirt in use flag is off.
>>
>> The monitor could just hold another reference, then we save the
>> additional flag. But that's a qemu implementation detail.
>>
> 
> I'm not sure I understand what you mean.

pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
with initial use count of 1 (the use is the monitor).  qemu_open()
increments the use count.  A new qemu_close() wrapper would decrement
the use count.  And both calling 'remove-fd', or closing the QMP monitor
of an fd that has not yet been passed through 'remove-fd', serves as a
way to decrement the use count.  You'd still have to track whether the
monitor is using an fd (to avoid over-decrementing on QMP monitor
close), but by having the monitor's use also tracked under the refcount,
then refcount reaching 0 is sufficient to auto-close an fd.  I think
that also means that re-establishing the client QMP connection would
increment  For some examples:

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
now 0 so qemu closes it

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
open because it is still in use by the block device
4. client re-establishes QMP connection, and 'query-fds' lets client
learn about fd=4 still being open as part of fdset1, but also informs
client that fd is not in use by the monitor

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in use by the monitor, refcount decremented to 1 but still left
open because it is in use by the block device
4. client crashes, so all tracked fds are visited; but fd=4 is already
marked as not in use by the monitor, so its refcount is unchanged

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 1
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in use by the
monitor, refcount is now decremented to 0 and fd=4 is closed

I think that covers the idea; you need a bool in_use for tracking
monitor state (the monitor is in use until either a remove-fd or a
monitor connection closes), as well as a ref-count.

>> We also need a query-fdsets command that lists all fdsets that exist. If
>> we add information about single fds to the return value of it, we
>> probably don't need a separate query-fd that operates on a single fdset.
>>
> 
> Yes, good point.  And maybe we don't need 2 commands.  query-fdsets
> could return all the sets and all the fds that are in those sets.

Yes, I think a single query command is good enough here, something like:

{ "execute":"query-fdsets" } =>
{ "return" : { "sets": [
   { "name": "fdset1",
     "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] },
   { "name": "fdset2",
     "fds": [ { "fd": 5, "monitor": false, "refcount": 1 },
              { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } }


>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>> be set") and we have a refcount of zero, then that's a bug.
>>
> 
> In use by qemu.  I don't think it's a bug.  I think there are situations
> where refcount gets to zero but qemu is still using the fd.

I think the refcount being non-zero _is_ what defines an fd as being in
use by qemu (including use by the monitor).  Any place you have to close
an fd before reopening it is dangerous; the safe way is always to open
with the new permissions before closing the old permissions.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 17:03                   ` Eric Blake
@ 2012-07-03 17:46                     ` Corey Bryant
  2012-07-03 18:00                       ` Eric Blake
  2012-07-04  8:00                     ` Kevin Wolf
  1 sibling, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-03 17:46 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini



On 07/03/2012 01:03 PM, Eric Blake wrote:
> On 07/03/2012 10:25 AM, Corey Bryant wrote:
>
>>> I thought qemu would rather return the number of the fdset (which it
>>> also assigns if none it passed, i.e. for fdset creation). Does libvirt
>>> need the number of an individual fd?
>>>
>>> If libvirt prefers to assign fdset numbers itself, I'm not against it,
>>> it's just something that wasn't clear to me yet.
>>>
>>
>> That's fine.  QEMU can return the fdset number or a string
>> (/dev/fdset/1) if none is specified.  And an fdset will need to be
>> specified if adding to an existing set.
>>
>> I think libvirt will need the fd returned by add-fd so that it can
>> evaluate fds returned by query-fd.  It's also useful for remove-fd.
>
> Correct - since we will be adding a remove-fd, then that command needs
> to know both the fdset name and the individual fd within the set to be
> removed.
>

Ok

>>
>>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
>>>> set that has access flags matching the qemu_open action flags.
>>>> qemu_open increments refcount for this fd.
>>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt
>>>> in-use flag turned on for fd.
>>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
>>>> fd from the set that has access flags matching the qemu_open action
>>>> flags.  qemu_open increments refcount for this fd.
>>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the
>>>> set.  turns libvirt in-use flag off marking the fd ready to be closed
>>>> when qemu is done with it.
>>>
>>> If we decided to not return the individual fd numbers to libvirt, file
>>> descriptors would be uniquely identified by an fdset/flags pair here.
>>>
>>
>> Are you saying we'd pass the fdset name and flags parameters on
>> remove-fd to somehow identify the fds to remove?
>
> Passing the flag parameters is not trivial, as that would mean the QMP
> code would have to define constants mapping to all of the O_* flags that
> qemu_open supports.  It's easier to support closing by fd number.
>

I understand what you were saying now, although I guess it's not 
applicable at this point.  I'll plan on returning the fd from add-fd and 
passing the fd on the remove-fd command.

>>
>>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>>>> zero and libvirt in use flag is off.
>>>
>>> The monitor could just hold another reference, then we save the
>>> additional flag. But that's a qemu implementation detail.
>>>
>>
>> I'm not sure I understand what you mean.
>
> pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
> with initial use count of 1 (the use is the monitor).  qemu_open()
> increments the use count.  A new qemu_close() wrapper would decrement
> the use count.  And both calling 'remove-fd', or closing the QMP monitor
> of an fd that has not yet been passed through 'remove-fd', serves as a
> way to decrement the use count.  You'd still have to track whether the
> monitor is using an fd (to avoid over-decrementing on QMP monitor
> close), but by having the monitor's use also tracked under the refcount,
> then refcount reaching 0 is sufficient to auto-close an fd.  I think
> that also means that re-establishing the client QMP connection would
> increment  For some examples:

Yes, I think adding a +1 to the refcount for the monitor makes sense.

I'm a bit unsure how to increment the refcount when a monitor reconnects 
though.  Maybe it is as simple as adding a +1 to each fd's refcount when 
the next QMP monitor connects.

>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
> now 0 so qemu closes it

Just a note that the fd above also hasn't yet been referenced by a 
drive-add/device-add, so it will be closed in step 2.

>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
> open because it is still in use by the block device
> 4. client re-establishes QMP connection, and 'query-fds' lets client
> learn about fd=4 still being open as part of fdset1, but also informs
> client that fd is not in use by the monitor

And in step 4 the QMP connection will increment the refcount +1 for all 
fds that persisted through the QMP disconnect. (?)

>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in use by the monitor, refcount decremented to 1 but still left
> open because it is in use by the block device
> 4. client crashes, so all tracked fds are visited; but fd=4 is already
> marked as not in use by the monitor, so its refcount is unchanged
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 1
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu marks fd=4 as no longer in use by the
> monitor, refcount is now decremented to 0 and fd=4 is closed
>
> I think that covers the idea; you need a bool in_use for tracking
> monitor state (the monitor is in use until either a remove-fd or a
> monitor connection closes), as well as a ref-count.
>

Yes, it all makes sense to me.  Thanks for the scenarios.

>>> We also need a query-fdsets command that lists all fdsets that exist. If
>>> we add information about single fds to the return value of it, we
>>> probably don't need a separate query-fd that operates on a single fdset.
>>>
>>
>> Yes, good point.  And maybe we don't need 2 commands.  query-fdsets
>> could return all the sets and all the fds that are in those sets.
>
> Yes, I think a single query command is good enough here, something like:
>
> { "execute":"query-fdsets" } =>
> { "return" : { "sets": [
>     { "name": "fdset1",
>       "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] },
>     { "name": "fdset2",
>       "fds": [ { "fd": 5, "monitor": false, "refcount": 1 },
>                { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } }
>
>

Ok, thanks!

>>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>>> be set") and we have a refcount of zero, then that's a bug.
>>>
>>
>> In use by qemu.  I don't think it's a bug.  I think there are situations
>> where refcount gets to zero but qemu is still using the fd.
>
> I think the refcount being non-zero _is_ what defines an fd as being in
> use by qemu (including use by the monitor).  Any place you have to close
> an fd before reopening it is dangerous; the safe way is always to open
> with the new permissions before closing the old permissions.
>

Maybe Kevin wants to weigh in on this.  Perhaps it's an issue that can 
be separated from my patch series.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 17:46                     ` Corey Bryant
@ 2012-07-03 18:00                       ` Eric Blake
  2012-07-03 18:21                         ` Corey Bryant
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Blake @ 2012-07-03 18:00 UTC (permalink / raw)
  To: Corey Bryant
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini

[-- Attachment #1: Type: text/plain, Size: 2393 bytes --]

On 07/03/2012 11:46 AM, Corey Bryant wrote:

> 
> Yes, I think adding a +1 to the refcount for the monitor makes sense.
> 
> I'm a bit unsure how to increment the refcount when a monitor reconnects
> though.  Maybe it is as simple as adding a +1 to each fd's refcount when
> the next QMP monitor connects.

Or maybe delay a +1 until after a 'query-fds' - it is not until the
monitor has reconnected and learned what fds it should be aware of that
incrementing the refcount again makes sense.  But that would mean making
'query-fds' track whether this is the first call since the monitor
reconnected, as it shouldn't normally increase refcounts.

The other alternative is that the monitor never re-increments a
refcount.  Once a monitor disconnects, that fd is lost to the monitor,
and a reconnected monitor must pass in a new fd to be re-associated with
the fdset.  In other words, the monitor's use of an fd is a one-way
operation, starting life in use but ending at the first disconnect or
remove-fd.


>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>> use by monitor, as member of fdset1
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> so qemu_open() increments the refcount to 2
>> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
>> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
>> open because it is still in use by the block device
>> 4. client re-establishes QMP connection, and 'query-fds' lets client
>> learn about fd=4 still being open as part of fdset1, but also informs
>> client that fd is not in use by the monitor
> 
> And in step 4 the QMP connection will increment the refcount +1 for all
> fds that persisted through the QMP disconnect. (?)

I'm not sure we need the refcount increment on reconnect.  'query-fds'
should provide enough information for the new monitor to know what
fdsets are still in use by qemu, even though they are no longer
available to 'remove-fd' from the monitor, and if the monitor is worried
about keeping the fd set alive it can call 'add-fd' to again associate a
new fd with the set.  The lifetime of a set is thus as long as any of
its associated fds have a non-zero refcount.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 18:00                       ` Eric Blake
@ 2012-07-03 18:21                         ` Corey Bryant
  2012-07-04  8:09                           ` Kevin Wolf
  0 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-03 18:21 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini



On 07/03/2012 02:00 PM, Eric Blake wrote:
> On 07/03/2012 11:46 AM, Corey Bryant wrote:
>
>>
>> Yes, I think adding a +1 to the refcount for the monitor makes sense.
>>
>> I'm a bit unsure how to increment the refcount when a monitor reconnects
>> though.  Maybe it is as simple as adding a +1 to each fd's refcount when
>> the next QMP monitor connects.
>
> Or maybe delay a +1 until after a 'query-fds' - it is not until the
> monitor has reconnected and learned what fds it should be aware of that
> incrementing the refcount again makes sense.  But that would mean making
> 'query-fds' track whether this is the first call since the monitor
> reconnected, as it shouldn't normally increase refcounts.

This doesn't sound ideal.

>
> The other alternative is that the monitor never re-increments a
> refcount.  Once a monitor disconnects, that fd is lost to the monitor,
> and a reconnected monitor must pass in a new fd to be re-associated with
> the fdset.  In other words, the monitor's use of an fd is a one-way
> operation, starting life in use but ending at the first disconnect or
> remove-fd.
>
>

I would vote for this 2nd alternative.  As long as we're not introducing 
an fd leak.  And I don't think we are if we decrement the refcount on 
remove-fd or on QMP disconnect.

>>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>>> use by monitor, as member of fdset1
>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>> so qemu_open() increments the refcount to 2
>>> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
>>> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
>>> open because it is still in use by the block device
>>> 4. client re-establishes QMP connection, and 'query-fds' lets client
>>> learn about fd=4 still being open as part of fdset1, but also informs
>>> client that fd is not in use by the monitor
>>
>> And in step 4 the QMP connection will increment the refcount +1 for all
>> fds that persisted through the QMP disconnect. (?)
>
> I'm not sure we need the refcount increment on reconnect.  'query-fds'
> should provide enough information for the new monitor to know what
> fdsets are still in use by qemu, even though they are no longer
> available to 'remove-fd' from the monitor, and if the monitor is worried
> about keeping the fd set alive it can call 'add-fd' to again associate a
> new fd with the set.  The lifetime of a set is thus as long as any of
> its associated fds have a non-zero refcount.
>

This sounds good to me.

And qemu_open will need to make sure the monitor in_use flag is true 
before dup'ing an fd.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 17:03                   ` Eric Blake
  2012-07-03 17:46                     ` Corey Bryant
@ 2012-07-04  8:00                     ` Kevin Wolf
  2012-07-05 14:22                       ` Corey Bryant
  1 sibling, 1 reply; 56+ messages in thread
From: Kevin Wolf @ 2012-07-04  8:00 UTC (permalink / raw)
  To: Eric Blake
  Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel,
	Luiz Capitulino, pbonzini

Am 03.07.2012 19:03, schrieb Eric Blake:
>>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
>>>> set that has access flags matching the qemu_open action flags.
>>>> qemu_open increments refcount for this fd.
>>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt
>>>> in-use flag turned on for fd.
>>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
>>>> fd from the set that has access flags matching the qemu_open action
>>>> flags.  qemu_open increments refcount for this fd.
>>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the
>>>> set.  turns libvirt in-use flag off marking the fd ready to be closed
>>>> when qemu is done with it.
>>>
>>> If we decided to not return the individual fd numbers to libvirt, file
>>> descriptors would be uniquely identified by an fdset/flags pair here.
>>>
>>
>> Are you saying we'd pass the fdset name and flags parameters on
>> remove-fd to somehow identify the fds to remove?
> 
> Passing the flag parameters is not trivial, as that would mean the QMP
> code would have to define constants mapping to all of the O_* flags that
> qemu_open supports.  It's easier to support closing by fd number.

Good point. So pass-fd or whatever it will be called needs to returnn
both fdset number and the individual fd number.

>>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>>>> zero and libvirt in use flag is off.
>>>
>>> The monitor could just hold another reference, then we save the
>>> additional flag. But that's a qemu implementation detail.
>>>
>>
>> I'm not sure I understand what you mean.
> 
> pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
> with initial use count of 1 (the use is the monitor).  qemu_open()
> increments the use count.  A new qemu_close() wrapper would decrement
> the use count.  And both calling 'remove-fd', or closing the QMP monitor
> of an fd that has not yet been passed through 'remove-fd', serves as a
> way to decrement the use count.  You'd still have to track whether the
> monitor is using an fd (to avoid over-decrementing on QMP monitor
> close), but by having the monitor's use also tracked under the refcount,
> then refcount reaching 0 is sufficient to auto-close an fd.  

> I think
> that also means that re-establishing the client QMP connection would
> increment 

This is an interesting idea. I've never thought about what to do with a
reconnect. It felt a bit odd at first, but now your proposal totally
makes sense to me.


> For some examples:
> 
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
> now 0 so qemu closes it

Doesn't the refcount belong to an fdset rather than a single fd? As long
as the fdset has still a reference, it's possible to reach the other fds
in the set through a reopen. For example:

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.

If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.

> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
> open because it is still in use by the block device
> 4. client re-establishes QMP connection, and 'query-fds' lets client
> learn about fd=4 still being open as part of fdset1, but also informs
> client that fd is not in use by the monitor
> 
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in use by the monitor, refcount decremented to 1 but still left
> open because it is in use by the block device
> 4. client crashes, so all tracked fds are visited; but fd=4 is already
> marked as not in use by the monitor, so its refcount is unchanged
> 
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 1
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu marks fd=4 as no longer in use by the
> monitor, refcount is now decremented to 0 and fd=4 is closed

Yup, apart from the comment above the examples look good to me.

> I think that covers the idea; you need a bool in_use for tracking
> monitor state (the monitor is in use until either a remove-fd or a
> monitor connection closes), as well as a ref-count.

Yes, makes sense to me.

>> We also need a query-fdsets command that lists all fdsets that exist. If
>>> we add information about single fds to the return value of it, we
>>> probably don't need a separate query-fd that operates on a single fdset.
>>>
>>
>> Yes, good point.  And maybe we don't need 2 commands.  query-fdsets
>> could return all the sets and all the fds that are in those sets.
> 
> Yes, I think a single query command is good enough here, something like:
> 
> { "execute":"query-fdsets" } =>
> { "return" : { "sets": [
>    { "name": "fdset1",
>      "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] },
>    { "name": "fdset2",
>      "fds": [ { "fd": 5, "monitor": false, "refcount": 1 },
>               { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } }

Looks good, this is exactly what I was thinking of.

>>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>>> be set") and we have a refcount of zero, then that's a bug.
>>>
>>
>> In use by qemu.  I don't think it's a bug.  I think there are situations
>> where refcount gets to zero but qemu is still using the fd.
> 
> I think the refcount being non-zero _is_ what defines an fd as being in
> use by qemu (including use by the monitor).  Any place you have to close
> an fd before reopening it is dangerous; the safe way is always to open
> with the new permissions before closing the old permissions.

There is one case I'm aware of where we need to be careful: Before
opening an image, qemu may probe the format. In this case, the image
gets opened twice, and the first close comes before the second open. I'm
not entirely sure how hard it would be to get rid of that behaviour.

If we can't get rid of it, we have a small window that the refcount
doesn't really cover, and if we weren't careful it would become racy.
This is why I mentioned earlier that maybe we need to defer the refcount
decrease a bit. However, I can't see how the in-use flag would make a
difference there. If the refcount can't cover it, the in-use flag can't
either.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-03 18:21                         ` Corey Bryant
@ 2012-07-04  8:09                           ` Kevin Wolf
  2012-07-05 15:06                             ` Corey Bryant
  0 siblings, 1 reply; 56+ messages in thread
From: Kevin Wolf @ 2012-07-04  8:09 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 03.07.2012 20:21, schrieb Corey Bryant:
> On 07/03/2012 02:00 PM, Eric Blake wrote:
>> On 07/03/2012 11:46 AM, Corey Bryant wrote:
>>
>>>
>>> Yes, I think adding a +1 to the refcount for the monitor makes sense.
>>>
>>> I'm a bit unsure how to increment the refcount when a monitor reconnects
>>> though.  Maybe it is as simple as adding a +1 to each fd's refcount when
>>> the next QMP monitor connects.
>>
>> Or maybe delay a +1 until after a 'query-fds' - it is not until the
>> monitor has reconnected and learned what fds it should be aware of that
>> incrementing the refcount again makes sense.  But that would mean making
>> 'query-fds' track whether this is the first call since the monitor
>> reconnected, as it shouldn't normally increase refcounts.
> 
> This doesn't sound ideal.

Yes, it's less than ideal.

>> The other alternative is that the monitor never re-increments a
>> refcount.  Once a monitor disconnects, that fd is lost to the monitor,
>> and a reconnected monitor must pass in a new fd to be re-associated with
>> the fdset.  In other words, the monitor's use of an fd is a one-way
>> operation, starting life in use but ending at the first disconnect or
>> remove-fd.
> 
> I would vote for this 2nd alternative.  As long as we're not introducing 
> an fd leak.  And I don't think we are if we decrement the refcount on 
> remove-fd or on QMP disconnect.

In fact, I believe this one is even worse. I can already see hacks like
adding a dummy FD with invalid flags and removing it again just to
regain control over the fdset...

You earlier suggestion made a lot of sense to me: Whenever a new QMP
monitor is connected, increase the refcount. That is, as long as any
monitor is there, don't drop any fdsets unless explicitly requested via QMP.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-04  8:00                     ` Kevin Wolf
@ 2012-07-05 14:22                       ` Corey Bryant
  2012-07-05 14:51                         ` Kevin Wolf
  0 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-05 14:22 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/04/2012 04:00 AM, Kevin Wolf wrote:
> Am 03.07.2012 19:03, schrieb Eric Blake:
>>>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
>>>>> set that has access flags matching the qemu_open action flags.
>>>>> qemu_open increments refcount for this fd.
>>>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>>>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).  libvirt
>>>>> in-use flag turned on for fd.
>>>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
>>>>> fd from the set that has access flags matching the qemu_open action
>>>>> flags.  qemu_open increments refcount for this fd.
>>>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the
>>>>> set.  turns libvirt in-use flag off marking the fd ready to be closed
>>>>> when qemu is done with it.
>>>>
>>>> If we decided to not return the individual fd numbers to libvirt, file
>>>> descriptors would be uniquely identified by an fdset/flags pair here.
>>>>
>>>
>>> Are you saying we'd pass the fdset name and flags parameters on
>>> remove-fd to somehow identify the fds to remove?
>>
>> Passing the flag parameters is not trivial, as that would mean the QMP
>> code would have to define constants mapping to all of the O_* flags that
>> qemu_open supports.  It's easier to support closing by fd number.
>
> Good point. So pass-fd or whatever it will be called needs to returnn
> both fdset number and the individual fd number.
>
>>>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>>>>> zero and libvirt in use flag is off.
>>>>
>>>> The monitor could just hold another reference, then we save the
>>>> additional flag. But that's a qemu implementation detail.
>>>>
>>>
>>> I'm not sure I understand what you mean.
>>
>> pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
>> with initial use count of 1 (the use is the monitor).  qemu_open()
>> increments the use count.  A new qemu_close() wrapper would decrement
>> the use count.  And both calling 'remove-fd', or closing the QMP monitor
>> of an fd that has not yet been passed through 'remove-fd', serves as a
>> way to decrement the use count.  You'd still have to track whether the
>> monitor is using an fd (to avoid over-decrementing on QMP monitor
>> close), but by having the monitor's use also tracked under the refcount,
>> then refcount reaching 0 is sufficient to auto-close an fd.
>
>> I think
>> that also means that re-establishing the client QMP connection would
>> increment
>
> This is an interesting idea. I've never thought about what to do with a
> reconnect. It felt a bit odd at first, but now your proposal totally
> makes sense to me.
>
>
>> For some examples:
>>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>> use by monitor, as member of fdset1
>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
>> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
>> now 0 so qemu closes it
>
> Doesn't the refcount belong to an fdset rather than a single fd? As long
> as the fdset has still a reference, it's possible to reach the other fds
> in the set through a reopen. For example:
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
> of fdset1, in use by monitor, refcount 1
> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
> member of fdset, in use by monitor, refcount is still 1
> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its
> refcount decreased to 1, and both member fds 4 and 5 stay open.
>
> If you had refcounted a single fd, fd 5 would be closed now and qemu
> couldn't switch to O_RDONLY any more.
>

If the O_RDONLY is for a future command, it would make more sense to add 
that fd before that future command.  Or are you passing it at step 2 
because it is needed for the probe re-open issue?

>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>> use by monitor, as member of fdset1
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> so qemu_open() increments the refcount to 2
>> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
>> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
>> open because it is still in use by the block device
>> 4. client re-establishes QMP connection, and 'query-fds' lets client
>> learn about fd=4 still being open as part of fdset1, but also informs
>> client that fd is not in use by the monitor
>>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>> use by monitor, as member of fdset1
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> so qemu_open() increments the refcount to 2
>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>> longer in use by the monitor, refcount decremented to 1 but still left
>> open because it is in use by the block device
>> 4. client crashes, so all tracked fds are visited; but fd=4 is already
>> marked as not in use by the monitor, so its refcount is unchanged
>>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>> use by monitor, as member of fdset1
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> but the command fails for some other reason, so the refcount is still 1
>> at the end of the command (although it may have been temporarily
>> incremented then decremented during the command)
>> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
>> QMP connection is closed), so qemu marks fd=4 as no longer in use by the
>> monitor, refcount is now decremented to 0 and fd=4 is closed
>
> Yup, apart from the comment above the examples look good to me.
>
>> I think that covers the idea; you need a bool in_use for tracking
>> monitor state (the monitor is in use until either a remove-fd or a
>> monitor connection closes), as well as a ref-count.
>
> Yes, makes sense to me.
>
>>> We also need a query-fdsets command that lists all fdsets that exist. If
>>>> we add information about single fds to the return value of it, we
>>>> probably don't need a separate query-fd that operates on a single fdset.
>>>>
>>>
>>> Yes, good point.  And maybe we don't need 2 commands.  query-fdsets
>>> could return all the sets and all the fds that are in those sets.
>>
>> Yes, I think a single query command is good enough here, something like:
>>
>> { "execute":"query-fdsets" } =>
>> { "return" : { "sets": [
>>     { "name": "fdset1",
>>       "fds": [ { "fd": 4, "monitor": true, "refcount": 1 } ] },
>>     { "name": "fdset2",
>>       "fds": [ { "fd": 5, "monitor": false, "refcount": 1 },
>>                { "fd": 6, "monitor": true, "refcount": 2 } ] } ] } }
>
> Looks good, this is exactly what I was thinking of.
>
>>>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>>>> be set") and we have a refcount of zero, then that's a bug.
>>>>
>>>
>>> In use by qemu.  I don't think it's a bug.  I think there are situations
>>> where refcount gets to zero but qemu is still using the fd.
>>
>> I think the refcount being non-zero _is_ what defines an fd as being in
>> use by qemu (including use by the monitor).  Any place you have to close
>> an fd before reopening it is dangerous; the safe way is always to open
>> with the new permissions before closing the old permissions.
>
> There is one case I'm aware of where we need to be careful: Before
> opening an image, qemu may probe the format. In this case, the image
> gets opened twice, and the first close comes before the second open. I'm
> not entirely sure how hard it would be to get rid of that behaviour.
>
> If we can't get rid of it, we have a small window that the refcount
> doesn't really cover, and if we weren't careful it would become racy.
> This is why I mentioned earlier that maybe we need to defer the refcount
> decrease a bit. However, I can't see how the in-use flag would make a
> difference there. If the refcount can't cover it, the in-use flag can't
> either.

Yeah this is a problem.  Could we introduce another flag to cover this?

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 14:22                       ` Corey Bryant
@ 2012-07-05 14:51                         ` Kevin Wolf
  2012-07-05 16:35                           ` Corey Bryant
  0 siblings, 1 reply; 56+ messages in thread
From: Kevin Wolf @ 2012-07-05 14:51 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 05.07.2012 16:22, schrieb Corey Bryant:
>>> For some examples:
>>>
>>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>>> use by monitor, as member of fdset1
>>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
>>> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
>>> now 0 so qemu closes it
>>
>> Doesn't the refcount belong to an fdset rather than a single fd? As long
>> as the fdset has still a reference, it's possible to reach the other fds
>> in the set through a reopen. For example:
>>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
>> of fdset1, in use by monitor, refcount 1
>> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
>> member of fdset, in use by monitor, refcount is still 1
>> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
>> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
>> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its
>> refcount decreased to 1, and both member fds 4 and 5 stay open.
>>
>> If you had refcounted a single fd, fd 5 would be closed now and qemu
>> couldn't switch to O_RDONLY any more.
>>
> 
> If the O_RDONLY is for a future command, it would make more sense to add 
> that fd before that future command.  Or are you passing it at step 2 
> because it is needed for the probe re-open issue?

Come on, requiring realistic examples isn't fair. ;-)

Swap steps 2 and 3 in the example, then step 3 is just the preparation
for a command that uses the new fd. The problem stays the same. Or a
live commit operation could be in flight so that qemu must be able to
switch on completion without libvirt interaction.

There are enough reasons that an fd in an fdset exists and is not
opened, but I can't think of one why it should be dropped.

>>>>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>>>>> be set") and we have a refcount of zero, then that's a bug.
>>>>>
>>>>
>>>> In use by qemu.  I don't think it's a bug.  I think there are situations
>>>> where refcount gets to zero but qemu is still using the fd.
>>>
>>> I think the refcount being non-zero _is_ what defines an fd as being in
>>> use by qemu (including use by the monitor).  Any place you have to close
>>> an fd before reopening it is dangerous; the safe way is always to open
>>> with the new permissions before closing the old permissions.
>>
>> There is one case I'm aware of where we need to be careful: Before
>> opening an image, qemu may probe the format. In this case, the image
>> gets opened twice, and the first close comes before the second open. I'm
>> not entirely sure how hard it would be to get rid of that behaviour.
>>
>> If we can't get rid of it, we have a small window that the refcount
>> doesn't really cover, and if we weren't careful it would become racy.
>> This is why I mentioned earlier that maybe we need to defer the refcount
>> decrease a bit. However, I can't see how the in-use flag would make a
>> difference there. If the refcount can't cover it, the in-use flag can't
>> either.
> 
> Yeah this is a problem.  Could we introduce another flag to cover this?

Adding more refcounts or flags is not a problem, but it doesn't solve it
either. The hard question is when to set that flag.

I believe it may be easier to just change the block layer so that it
opens files only once during bdrv_open().

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-04  8:09                           ` Kevin Wolf
@ 2012-07-05 15:06                             ` Corey Bryant
  2012-07-09 14:05                               ` Luiz Capitulino
  0 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-05 15:06 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/04/2012 04:09 AM, Kevin Wolf wrote:
> Am 03.07.2012 20:21, schrieb Corey Bryant:
>> On 07/03/2012 02:00 PM, Eric Blake wrote:
>>> On 07/03/2012 11:46 AM, Corey Bryant wrote:
>>>
>>>>
>>>> Yes, I think adding a +1 to the refcount for the monitor makes sense.
>>>>
>>>> I'm a bit unsure how to increment the refcount when a monitor reconnects
>>>> though.  Maybe it is as simple as adding a +1 to each fd's refcount when
>>>> the next QMP monitor connects.
>>>
>>> Or maybe delay a +1 until after a 'query-fds' - it is not until the
>>> monitor has reconnected and learned what fds it should be aware of that
>>> incrementing the refcount again makes sense.  But that would mean making
>>> 'query-fds' track whether this is the first call since the monitor
>>> reconnected, as it shouldn't normally increase refcounts.
>>
>> This doesn't sound ideal.
>
> Yes, it's less than ideal.
>
>>> The other alternative is that the monitor never re-increments a
>>> refcount.  Once a monitor disconnects, that fd is lost to the monitor,
>>> and a reconnected monitor must pass in a new fd to be re-associated with
>>> the fdset.  In other words, the monitor's use of an fd is a one-way
>>> operation, starting life in use but ending at the first disconnect or
>>> remove-fd.
>>
>> I would vote for this 2nd alternative.  As long as we're not introducing
>> an fd leak.  And I don't think we are if we decrement the refcount on
>> remove-fd or on QMP disconnect.
>
> In fact, I believe this one is even worse. I can already see hacks like
> adding a dummy FD with invalid flags and removing it again just to
> regain control over the fdset...
>
> You earlier suggestion made a lot of sense to me: Whenever a new QMP
> monitor is connected, increase the refcount. That is, as long as any
> monitor is there, don't drop any fdsets unless explicitly requested via QMP.

Ok.  So refcount would be incremented (for the fd or fdset, whatever we 
decide on) when QMP reconnects.  I'm assuming we wouldn't wait until 
after a query-fds call.

I'll go with this approach until someone objects.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 14:51                         ` Kevin Wolf
@ 2012-07-05 16:35                           ` Corey Bryant
  2012-07-05 16:37                             ` Corey Bryant
  2012-07-05 17:00                             ` Eric Blake
  0 siblings, 2 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-05 16:35 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/05/2012 10:51 AM, Kevin Wolf wrote:
> Am 05.07.2012 16:22, schrieb Corey Bryant:
>>>> For some examples:
>>>>
>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
>>>> use by monitor, as member of fdset1
>>>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
>>>> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
>>>> now 0 so qemu closes it
>>>
>>> Doesn't the refcount belong to an fdset rather than a single fd? As long
>>> as the fdset has still a reference, it's possible to reach the other fds
>>> in the set through a reopen. For example:
>>>
>>> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
>>> of fdset1, in use by monitor, refcount 1
>>> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
>>> member of fdset, in use by monitor, refcount is still 1
>>> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
>>> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
>>> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its
>>> refcount decreased to 1, and both member fds 4 and 5 stay open.
>>>
>>> If you had refcounted a single fd, fd 5 would be closed now and qemu
>>> couldn't switch to O_RDONLY any more.
>>>
>>
>> If the O_RDONLY is for a future command, it would make more sense to add
>> that fd before that future command.  Or are you passing it at step 2
>> because it is needed for the probe re-open issue?
>
> Come on, requiring realistic examples isn't fair. ;-)

Heh :)

>
> Swap steps 2 and 3 in the example, then step 3 is just the preparation
> for a command that uses the new fd. The problem stays the same. Or a
> live commit operation could be in flight so that qemu must be able to
> switch on completion without libvirt interaction.

Good point.

I thought it would be useful to run through the examples again with each 
fdset having a refcount, and each fd having an in-use flag.  One 
difference though is that refcount is only incremented/decremented when 
qemu_open/qemu_close are called.  I think this works and covers Kevin's 
concerns.  Actually it might be a bit simpler too.

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned 
off and fd=4 is left open because it is still in use by the block device 
(refcount is 1)
4. client re-establishes QMP connection, so all tracked fds are visited, 
and in-use flags are turned back on; 'query-fds' lets client learn about 
fd=4 still being open as part of fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by 
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use 
but refcount is 1 so it is not closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 0
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in-use by the 
monitor; refcount is 0 so all fds in fdset1 are closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with 
refcount still 0; fd=5's in-use flag is turned on
3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not 
yet been passed to 'remove-fd', so their in-use flags are on; in-use 
flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in 
fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with 
refcount still 0; fd=5's in-use flag is turned on
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and fd=4 is closed since the refcount is 
0; fd=5 remains open

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 
with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 
with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'device-add' with /dev/fdset/1 as the backing filename 
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, but fd=4 remains open because refcount of 
fdset1 is 1
5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no
longer in-use by the monitor, and fd=5 remains open because refcount of 
fdset1 is 1
6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both 
fds are closed since refcount is 0 and their in-use flags are off

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 
with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 
with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'device-add' with /dev/fdset/1 as the backing filename 
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have 
not yet been passed to 'remove-fd', so their in-use flags are on; 
in-use flags are turned off and both fds are left open because the set 
is still in use by the block device (refcount is 1)
5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both 
fds are closed since refcount is 0 and their in-use flags are off

>
> There are enough reasons that an fd in an fdset exists and is not
> opened, but I can't think of one why it should be dropped.
>
>>>>>> In use by whom? If it's still in use in qemu (as in "in-use flag would
>>>>>> be set") and we have a refcount of zero, then that's a bug.
>>>>>>
>>>>>
>>>>> In use by qemu.  I don't think it's a bug.  I think there are situations
>>>>> where refcount gets to zero but qemu is still using the fd.
>>>>
>>>> I think the refcount being non-zero _is_ what defines an fd as being in
>>>> use by qemu (including use by the monitor).  Any place you have to close
>>>> an fd before reopening it is dangerous; the safe way is always to open
>>>> with the new permissions before closing the old permissions.
>>>
>>> There is one case I'm aware of where we need to be careful: Before
>>> opening an image, qemu may probe the format. In this case, the image
>>> gets opened twice, and the first close comes before the second open. I'm
>>> not entirely sure how hard it would be to get rid of that behaviour.
>>>
>>> If we can't get rid of it, we have a small window that the refcount
>>> doesn't really cover, and if we weren't careful it would become racy.
>>> This is why I mentioned earlier that maybe we need to defer the refcount
>>> decrease a bit. However, I can't see how the in-use flag would make a
>>> difference there. If the refcount can't cover it, the in-use flag can't
>>> either.
>>
>> Yeah this is a problem.  Could we introduce another flag to cover this?
>
> Adding more refcounts or flags is not a problem, but it doesn't solve it
> either. The hard question is when to set that flag.
>
> I believe it may be easier to just change the block layer so that it
> opens files only once during bdrv_open().
>
> Kevin
>

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 16:35                           ` Corey Bryant
@ 2012-07-05 16:37                             ` Corey Bryant
  2012-07-06  9:06                               ` Kevin Wolf
  2012-07-05 17:00                             ` Eric Blake
  1 sibling, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-05 16:37 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/05/2012 12:35 PM, Corey Bryant wrote:
>
>
> On 07/05/2012 10:51 AM, Kevin Wolf wrote:
>> Am 05.07.2012 16:22, schrieb Corey Bryant:
>>>>> For some examples:
>>>>>
>>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount
>>>>> 1, in
>>>>> use by monitor, as member of fdset1
>>>>> 2. client crashes, so all tracked fds are visited; fd=4 had not yet
>>>>> been
>>>>> passed to 'remove-fd', so qemu decrements refcount; refcount of
>>>>> fd=4 is
>>>>> now 0 so qemu closes it
>>>>
>>>> Doesn't the refcount belong to an fdset rather than a single fd? As
>>>> long
>>>> as the fdset has still a reference, it's possible to reach the other
>>>> fds
>>>> in the set through a reopen. For example:
>>>>
>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a
>>>> member
>>>> of fdset1, in use by monitor, refcount 1
>>>> 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
>>>> member of fdset, in use by monitor, refcount is still 1
>>>> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
>>>> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
>>>> 4. client crashes, so all tracked fdsets are visited; fdset1 gets its
>>>> refcount decreased to 1, and both member fds 4 and 5 stay open.
>>>>
>>>> If you had refcounted a single fd, fd 5 would be closed now and qemu
>>>> couldn't switch to O_RDONLY any more.
>>>>
>>>
>>> If the O_RDONLY is for a future command, it would make more sense to add
>>> that fd before that future command.  Or are you passing it at step 2
>>> because it is needed for the probe re-open issue?
>>
>> Come on, requiring realistic examples isn't fair. ;-)
>
> Heh :)
>
>>
>> Swap steps 2 and 3 in the example, then step 3 is just the preparation
>> for a command that uses the new fd. The problem stays the same. Or a
>> live commit operation could be in flight so that qemu must be able to
>> switch on completion without libvirt interaction.
>
> Good point.
>
> I thought it would be useful to run through the examples again with each
> fdset having a refcount, and each fd having an in-use flag.  One
> difference though is that refcount is only incremented/decremented when
> qemu_open/qemu_close are called.  I think this works and covers Kevin's
> concerns.  Actually it might be a bit simpler too.
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount of fdset1 to 1
> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned
> off and fd=4 is left open because it is still in use by the block device
> (refcount is 1)
> 4. client re-establishes QMP connection, so all tracked fds are visited,
> and in-use flags are turned back on; 'query-fds' lets client learn about
> fd=4 still being open as part of fdset1
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount of fdset1 to 1
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, and is left open because it is in use by
> the block device (refcount is 1)
> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
> but refcount is 1 so it is not closed
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 0
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu marks fd=4 as no longer in-use by the
> monitor; refcount is 0 so all fds in fdset1 are closed
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
> refcount still 0; fd=5's in-use flag is turned on
> 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not
> yet been passed to 'remove-fd', so their in-use flags are on; in-use
> flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in
> fdset1
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
> refcount still 0; fd=5's in-use flag is turned on
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, and fd=4 is closed since the refcount is
> 0; fd=5 remains open
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
> with refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
> with refcount still 0; fd=5's in-use flag is turned on
> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
> 4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, but fd=4 remains open because refcount of
> fdset1 is 1
> 5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no
> longer in-use by the monitor, and fd=5 remains open because refcount of
> fdset1 is 1
> 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both
> fds are closed since refcount is 0 and their in-use flags are off
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
> with refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
> with refcount still 0; fd=5's in-use flag is turned on
> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
> 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have
> not yet been passed to 'remove-fd', so their in-use flags are on; in-use
> flags are turned off and both fds are left open because the set is still
> in use by the block device (refcount is 1)
> 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both
> fds are closed since refcount is 0 and their in-use flags are off
>
>>
>> There are enough reasons that an fd in an fdset exists and is not
>> opened, but I can't think of one why it should be dropped.
>>
>>>>>>> In use by whom? If it's still in use in qemu (as in "in-use flag
>>>>>>> would
>>>>>>> be set") and we have a refcount of zero, then that's a bug.
>>>>>>>
>>>>>>
>>>>>> In use by qemu.  I don't think it's a bug.  I think there are
>>>>>> situations
>>>>>> where refcount gets to zero but qemu is still using the fd.
>>>>>
>>>>> I think the refcount being non-zero _is_ what defines an fd as
>>>>> being in
>>>>> use by qemu (including use by the monitor).  Any place you have to
>>>>> close
>>>>> an fd before reopening it is dangerous; the safe way is always to open
>>>>> with the new permissions before closing the old permissions.
>>>>
>>>> There is one case I'm aware of where we need to be careful: Before
>>>> opening an image, qemu may probe the format. In this case, the image
>>>> gets opened twice, and the first close comes before the second open.
>>>> I'm
>>>> not entirely sure how hard it would be to get rid of that behaviour.
>>>>
>>>> If we can't get rid of it, we have a small window that the refcount
>>>> doesn't really cover, and if we weren't careful it would become racy.
>>>> This is why I mentioned earlier that maybe we need to defer the
>>>> refcount
>>>> decrease a bit. However, I can't see how the in-use flag would make a
>>>> difference there. If the refcount can't cover it, the in-use flag can't
>>>> either.
>>>
>>> Yeah this is a problem.  Could we introduce another flag to cover this?
>>
>> Adding more refcounts or flags is not a problem, but it doesn't solve it
>> either. The hard question is when to set that flag.
>>
>> I believe it may be easier to just change the block layer so that it
>> opens files only once during bdrv_open().
>>

Can this fix be delivered after the fd passing patch series?

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 16:35                           ` Corey Bryant
  2012-07-05 16:37                             ` Corey Bryant
@ 2012-07-05 17:00                             ` Eric Blake
  2012-07-05 17:36                               ` Corey Bryant
  2012-07-06  9:11                               ` Kevin Wolf
  1 sibling, 2 replies; 56+ messages in thread
From: Eric Blake @ 2012-07-05 17:00 UTC (permalink / raw)
  To: Corey Bryant
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini

[-- Attachment #1: Type: text/plain, Size: 2946 bytes --]

On 07/05/2012 10:35 AM, Corey Bryant wrote:

> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount of fdset1 to 1
> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned
> off and fd=4 is left open because it is still in use by the block device
> (refcount is 1)
> 4. client re-establishes QMP connection, so all tracked fds are visited,
> and in-use flags are turned back on; 'query-fds' lets client learn about
> fd=4 still being open as part of fdset1

This says that an in-use fd comes back into use after a crash...

> 
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount of fdset1 to 1
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, and is left open because it is in use by
> the block device (refcount is 1)
> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
> but refcount is 1 so it is not closed
5. client re-establishes QMP connection, so all tracked fds are visited.
 What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).  An fd is then closed when
'refcount == 0 && (!inuse || removed)'.  On monitor disconnect, the
inuse flag is cleared, and on reconnect, it is set; but that does not
impact the removed flag.  And the 'query-fds' command would omit any fd
with the 'removed' flag, even when the fd is still kept open internally
thanks to the refcount being nonzero.

But I'm definitely agreeing that tying the refcount to the set rather
than to individual fds within the set makes sense; you still avoid the
fd leak in that all fds in the set are closed when both the monitor has
disavowed the set and when qemu_close() has finished using any of the fds.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 17:00                             ` Eric Blake
@ 2012-07-05 17:36                               ` Corey Bryant
  2012-07-06  9:11                               ` Kevin Wolf
  1 sibling, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-05 17:36 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	Luiz Capitulino, pbonzini



On 07/05/2012 01:00 PM, Eric Blake wrote:
> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>> refcount of 0; fd=4's in-use flag is turned on
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> so qemu_open() increments the refcount of fdset1 to 1
>> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
>> passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned
>> off and fd=4 is left open because it is still in use by the block device
>> (refcount is 1)
>> 4. client re-establishes QMP connection, so all tracked fds are visited,
>> and in-use flags are turned back on; 'query-fds' lets client learn about
>> fd=4 still being open as part of fdset1
>
> This says that an in-use fd comes back into use after a crash...
>
>>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>> refcount of 0; fd=4's in-use flag is turned on
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> so qemu_open() increments the refcount of fdset1 to 1
>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>> longer in-use by the monitor, and is left open because it is in use by
>> the block device (refcount is 1)
>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>> but refcount is 1 so it is not closed
> 5. client re-establishes QMP connection, so all tracked fds are visited.
>   What happens to the fd=4 in-use flag?
>
> ...but what are the semantics here?
>
> If we _always_ turn the in-use flag back on, then that says that even
> though libvirt successfully asked to close the fd, the reconnection
> means that libvirt once again has to ask to close things.
>

Ah, yeah I missed that.

> If we _never_ turn the in-use flag back on, then we've broken the first
> case above where we want an in-use fd to come back into use after a crash.
>
> Maybe that argues for two flags per fd: an in-use flag (there is
> currently a monitor connection that can manipulate the fd, either
> because it passed in the fd or because it reconnected) and a removed
> flag (a monitor called remove-fd, and no longer wants to know about the
> fd, even if it crashes and reconnects).  An fd is then closed when
> 'refcount == 0 && (!inuse || removed)'.  On monitor disconnect, the
> inuse flag is cleared, and on reconnect, it is set; but that does not
> impact the removed flag.  And the 'query-fds' command would omit any fd
> with the 'removed' flag, even when the fd is still kept open internally
> thanks to the refcount being nonzero.
>

I agree.  Having the 2 flags makes sense and solves the issues you 
mentioned above.


-- 
Regards,
Corey


> But I'm definitely agreeing that tying the refcount to the set rather
> than to individual fds within the set makes sense; you still avoid the
> fd leak in that all fds in the set are closed when both the monitor has
> disavowed the set and when qemu_close() has finished using any of the fds.
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 16:37                             ` Corey Bryant
@ 2012-07-06  9:06                               ` Kevin Wolf
  0 siblings, 0 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-06  9:06 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 05.07.2012 18:37, schrieb Corey Bryant:
>>>>> There is one case I'm aware of where we need to be careful: Before
>>>>> opening an image, qemu may probe the format. In this case, the image
>>>>> gets opened twice, and the first close comes before the second open.
>>>>> I'm
>>>>> not entirely sure how hard it would be to get rid of that behaviour.
>>>>>
>>>>> If we can't get rid of it, we have a small window that the refcount
>>>>> doesn't really cover, and if we weren't careful it would become racy.
>>>>> This is why I mentioned earlier that maybe we need to defer the
>>>>> refcount
>>>>> decrease a bit. However, I can't see how the in-use flag would make a
>>>>> difference there. If the refcount can't cover it, the in-use flag can't
>>>>> either.
>>>>
>>>> Yeah this is a problem.  Could we introduce another flag to cover this?
>>>
>>> Adding more refcounts or flags is not a problem, but it doesn't solve it
>>> either. The hard question is when to set that flag.
>>>
>>> I believe it may be easier to just change the block layer so that it
>>> opens files only once during bdrv_open().
> 
> Can this fix be delivered after the fd passing patch series?

Sure, we can't fix everything at once.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 17:00                             ` Eric Blake
  2012-07-05 17:36                               ` Corey Bryant
@ 2012-07-06  9:11                               ` Kevin Wolf
  2012-07-06 17:14                                 ` Corey Bryant
  2012-07-06 17:40                                 ` Corey Bryant
  1 sibling, 2 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-06  9:11 UTC (permalink / raw)
  To: Eric Blake
  Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel,
	Luiz Capitulino, pbonzini

Am 05.07.2012 19:00, schrieb Eric Blake:
> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>> refcount of 0; fd=4's in-use flag is turned on
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>> so qemu_open() increments the refcount of fdset1 to 1
>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>> longer in-use by the monitor, and is left open because it is in use by
>> the block device (refcount is 1)
>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>> but refcount is 1 so it is not closed
> 5. client re-establishes QMP connection, so all tracked fds are visited.
>  What happens to the fd=4 in-use flag?
> 
> ...but what are the semantics here?
> 
> If we _always_ turn the in-use flag back on, then that says that even
> though libvirt successfully asked to close the fd, the reconnection
> means that libvirt once again has to ask to close things.
> 
> If we _never_ turn the in-use flag back on, then we've broken the first
> case above where we want an in-use fd to come back into use after a crash.
> 
> Maybe that argues for two flags per fd: an in-use flag (there is
> currently a monitor connection that can manipulate the fd, either
> because it passed in the fd or because it reconnected) and a removed
> flag (a monitor called remove-fd, and no longer wants to know about the
> fd, even if it crashes and reconnects).  

I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-06  9:11                               ` Kevin Wolf
@ 2012-07-06 17:14                                 ` Corey Bryant
  2012-07-06 17:15                                   ` Corey Bryant
  2012-07-06 17:40                                 ` Corey Bryant
  1 sibling, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-06 17:14 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/06/2012 05:11 AM, Kevin Wolf wrote:
> Am 05.07.2012 19:00, schrieb Eric Blake:
>> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>> refcount of 0; fd=4's in-use flag is turned on
>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>> so qemu_open() increments the refcount of fdset1 to 1
>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>>> longer in-use by the monitor, and is left open because it is in use by
>>> the block device (refcount is 1)
>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>>> but refcount is 1 so it is not closed
>> 5. client re-establishes QMP connection, so all tracked fds are visited.
>>   What happens to the fd=4 in-use flag?
>>
>> ...but what are the semantics here?
>>
>> If we _always_ turn the in-use flag back on, then that says that even
>> though libvirt successfully asked to close the fd, the reconnection
>> means that libvirt once again has to ask to close things.
>>
>> If we _never_ turn the in-use flag back on, then we've broken the first
>> case above where we want an in-use fd to come back into use after a crash.
>>
>> Maybe that argues for two flags per fd: an in-use flag (there is
>> currently a monitor connection that can manipulate the fd, either
>> because it passed in the fd or because it reconnected) and a removed
>> flag (a monitor called remove-fd, and no longer wants to know about the
>> fd, even if it crashes and reconnects).
>
> I was in fact just going to suggest a removed flag as well, however
> combined with including the monitor connections in the refcount instead
> of an additional flag. This would also allow to have (the currently
> mostly hypothetical case of) multiple QMP monitors without interesting
> things happening.
>
> Maybe I'm missing some point that the inuse flag would allow and
> including monitors in the refcount doesn't. Is there one?
>
> Kevin
>

I think we need the granularity of having an in-use flag per fd.  Here's 
an example of why:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
is decremented to 1; fd=4 is left open because it is still in use by the 
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is 
incremented to 2; 'query-fds' lets client learn about fd=4 still being 
open as part of fdset1
5. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
1; fd=4 remains open as part of fdset1
6. QMP disconnects; refcount for fdset is decremented to zero;  fd=4 is 
closed and fdset1 is freed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (because of monitor reference); fd=4's remove flag is 
initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 1
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu turns on the remove flag for fd=4; 
refcount is 0 so all fds in fdset1 are closed


I think we need the granularity of having an in-use flag per fd.  If we 
track monitor in-use in the reference count, then we are tracking it for 
the fdset and I think this could cause leaks.

In the following example, we have a refcount for the fdset, an in-use 
flag per fd, and a remove flag per fd.  We're only 
incrementing/decrementing refcount in qemu_open/qemu_close.





In the following example, we have a refcount for the fdset, and a remove 
flag per fd.  We're incrementing refcount when the fdset is first 
created, when QMP re-connects, and in qemu_open.  We're decrementing 
refcount when QMP disconnects, and in qemu_close.

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client crashes, so all tracked fdsets are visited; fd=4 has
not yet been passed to 'remove-fd', so its remove flag is off; in-use
flags are turned off and both fds are left open because the set is still
in use by the block device (refcount is 1)








1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
is decremented to 1; fd=4 is left open because it is still in use by the 
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is 
incremented to 2; 'query-fds' lets client learn about fd=4 still being 
open as part of fdset1
5. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
1; fd=4 remains open as part of fdset1
6. QMP disconnects; refcount for fdset is decremented to zero;  fd=4 is 
closed and fdset1 is freed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 0
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in-use by the
monitor; refcount is 0 so all fds in fdset1 are closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
refcount still 0; fd=5's in-use flag is turned on
3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not
yet been passed to 'remove-fd', so their in-use flags are on; in-use
flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in
fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
refcount still 0; fd=5's in-use flag is turned on
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and fd=4 is closed since the refcount is
0; fd=5 remains open

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, but fd=4 remains open because refcount of
fdset1 is 1
5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no
longer in-use by the monitor, and fd=5 remains open because refcount of
fdset1 is 1
6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both
fds are closed since refcount is 0 and their in-use flags are off

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have
not yet been passed to 'remove-fd', so their in-use flags are on; in-use
flags are turned off and both fds are left open because the set is still
in use by the block device (refcount is 1)
5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both
fds are closed since refcount is 0 and their in-use flags are off


-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-06 17:14                                 ` Corey Bryant
@ 2012-07-06 17:15                                   ` Corey Bryant
  0 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-06 17:15 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Ugh... please disregard this.  I hit send accidentally.

On 07/06/2012 01:14 PM, Corey Bryant wrote:
>
>
> On 07/06/2012 05:11 AM, Kevin Wolf wrote:
>> Am 05.07.2012 19:00, schrieb Eric Blake:
>>> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>>> refcount of 0; fd=4's in-use flag is turned on
>>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>>> so qemu_open() increments the refcount of fdset1 to 1
>>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>>>> longer in-use by the monitor, and is left open because it is in use by
>>>> the block device (refcount is 1)
>>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>>>> but refcount is 1 so it is not closed
>>> 5. client re-establishes QMP connection, so all tracked fds are visited.
>>>   What happens to the fd=4 in-use flag?
>>>
>>> ...but what are the semantics here?
>>>
>>> If we _always_ turn the in-use flag back on, then that says that even
>>> though libvirt successfully asked to close the fd, the reconnection
>>> means that libvirt once again has to ask to close things.
>>>
>>> If we _never_ turn the in-use flag back on, then we've broken the first
>>> case above where we want an in-use fd to come back into use after a
>>> crash.
>>>
>>> Maybe that argues for two flags per fd: an in-use flag (there is
>>> currently a monitor connection that can manipulate the fd, either
>>> because it passed in the fd or because it reconnected) and a removed
>>> flag (a monitor called remove-fd, and no longer wants to know about the
>>> fd, even if it crashes and reconnects).
>>
>> I was in fact just going to suggest a removed flag as well, however
>> combined with including the monitor connections in the refcount instead
>> of an additional flag. This would also allow to have (the currently
>> mostly hypothetical case of) multiple QMP monitors without interesting
>> things happening.
>>
>> Maybe I'm missing some point that the inuse flag would allow and
>> including monitors in the refcount doesn't. Is there one?
>>
>> Kevin
>>
>
> I think we need the granularity of having an in-use flag per fd.  Here's
> an example of why:
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 1; fd=4's remove flag is initialized to off
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
> qemu_open() increments the refcount of fdset1 to 2
> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
> is decremented to 1; fd=4 is left open because it is still in use by the
> block device (refcount is 1)
> 4. client re-establishes QMP connection, refcount for fdset1 is
> incremented to 2; 'query-fds' lets client learn about fd=4 still being
> open as part of fdset1
> 5. qemu_close is called for fd=4; refcount for fdset1 is decremented to
> 1; fd=4 remains open as part of fdset1
> 6. QMP disconnects; refcount for fdset is decremented to zero;  fd=4 is
> closed and fdset1 is freed
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 1 (because of monitor reference); fd=4's remove flag is
> initialized to off
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 1
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu turns on the remove flag for fd=4;
> refcount is 0 so all fds in fdset1 are closed
>
>
> I think we need the granularity of having an in-use flag per fd.  If we
> track monitor in-use in the reference count, then we are tracking it for
> the fdset and I think this could cause leaks.
>
> In the following example, we have a refcount for the fdset, an in-use
> flag per fd, and a remove flag per fd.  We're only
> incrementing/decrementing refcount in qemu_open/qemu_close.
>
>
>
>
>
> In the following example, we have a refcount for the fdset, and a remove
> flag per fd.  We're incrementing refcount when the fdset is first
> created, when QMP re-connects, and in qemu_open.  We're decrementing
> refcount when QMP disconnects, and in qemu_close.
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 1; fd=4's remove flag is initialized to off
> 2. client crashes, so all tracked fdsets are visited; fd=4 has
> not yet been passed to 'remove-fd', so its remove flag is off; in-use
> flags are turned off and both fds are left open because the set is still
> in use by the block device (refcount is 1)
>
>
>
>
>
>
>
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 1; fd=4's remove flag is initialized to off
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
> qemu_open() increments the refcount of fdset1 to 2
> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
> is decremented to 1; fd=4 is left open because it is still in use by the
> block device (refcount is 1)
> 4. client re-establishes QMP connection, refcount for fdset1 is
> incremented to 2; 'query-fds' lets client learn about fd=4 still being
> open as part of fdset1
> 5. qemu_close is called for fd=4; refcount for fdset1 is decremented to
> 1; fd=4 remains open as part of fdset1
> 6. QMP disconnects; refcount for fdset is decremented to zero;  fd=4 is
> closed and fdset1 is freed
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount of fdset1 to 1
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, and is left open because it is in use by
> the block device (refcount is 1)
> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
> but refcount is 1 so it is not closed
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 0
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu marks fd=4 as no longer in-use by the
> monitor; refcount is 0 so all fds in fdset1 are closed
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
> refcount still 0; fd=5's in-use flag is turned on
> 3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not
> yet been passed to 'remove-fd', so their in-use flags are on; in-use
> flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in
> fdset1
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
> refcount still 0; fd=5's in-use flag is turned on
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, and fd=4 is closed since the refcount is
> 0; fd=5 remains open
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
> with refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
> with refcount still 0; fd=5's in-use flag is turned on
> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
> 4. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in-use by the monitor, but fd=4 remains open because refcount of
> fdset1 is 1
> 5. client calls 'remove-fd fdset=1 fd=5', so qemu marks fd=5 as no
> longer in-use by the monitor, and fd=5 remains open because refcount of
> fdset1 is 1
> 6. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both
> fds are closed since refcount is 0 and their in-use flags are off
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
> with refcount of 0; fd=4's in-use flag is turned on
> 2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
> with refcount still 0; fd=5's in-use flag is turned on
> 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
> and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 1
> 4. client crashes, so all tracked fds are visited; fd=4 and fd=5 have
> not yet been passed to 'remove-fd', so their in-use flags are on; in-use
> flags are turned off and both fds are left open because the set is still
> in use by the block device (refcount is 1)
> 5. qemu_close(fd=4) is called, refcount of fdset1 is decremented; both
> fds are closed since refcount is 0 and their in-use flags are off
>
>

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-06  9:11                               ` Kevin Wolf
  2012-07-06 17:14                                 ` Corey Bryant
@ 2012-07-06 17:40                                 ` Corey Bryant
  2012-07-06 18:19                                   ` [Qemu-devel] [libvirt] " Corey Bryant
  2012-07-09 14:04                                   ` [Qemu-devel] " Kevin Wolf
  1 sibling, 2 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-06 17:40 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/06/2012 05:11 AM, Kevin Wolf wrote:
> Am 05.07.2012 19:00, schrieb Eric Blake:
>> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>> refcount of 0; fd=4's in-use flag is turned on
>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>> so qemu_open() increments the refcount of fdset1 to 1
>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>>> longer in-use by the monitor, and is left open because it is in use by
>>> the block device (refcount is 1)
>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>>> but refcount is 1 so it is not closed
>> 5. client re-establishes QMP connection, so all tracked fds are visited.
>>   What happens to the fd=4 in-use flag?
>>
>> ...but what are the semantics here?
>>
>> If we _always_ turn the in-use flag back on, then that says that even
>> though libvirt successfully asked to close the fd, the reconnection
>> means that libvirt once again has to ask to close things.
>>
>> If we _never_ turn the in-use flag back on, then we've broken the first
>> case above where we want an in-use fd to come back into use after a crash.
>>
>> Maybe that argues for two flags per fd: an in-use flag (there is
>> currently a monitor connection that can manipulate the fd, either
>> because it passed in the fd or because it reconnected) and a removed
>> flag (a monitor called remove-fd, and no longer wants to know about the
>> fd, even if it crashes and reconnects).
>
> I was in fact just going to suggest a removed flag as well, however
> combined with including the monitor connections in the refcount instead
> of an additional flag. This would also allow to have (the currently
> mostly hypothetical case of) multiple QMP monitors without interesting
> things happening.
>
> Maybe I'm missing some point that the inuse flag would allow and
> including monitors in the refcount doesn't. Is there one?
>
> Kevin
>

Ok let me try this again. I was going through some of the examples and I 
think we need a separate in-use flag.  Otherwise, when refcount gets to 
1, we don't know if it is because of a monitor reference or a block 
device reference.  I think it would cause fds to sit on the monitor 
until refcount gets to zero (monitor disconnects). Here's an example 
without the in-use flag:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (incremented because of monitor reference); fd=4's remove 
flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
is decremented to 1; fd=4 is left open because it is still in use by the 
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is 
incremented to 2; 'query-fds' lets client learn about fd=4 still being 
open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for 
fd=4; but fd=4 remains open because refcount of fdset1 is 2
6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
1; fd=4 remains open because monitor still references fdset1 (refcount 
of fdset1 is 1)
7. Sometime later.. QMP disconnects; refcount for fdset is decremented 
to zero; fd=4 is closed

In the following case, we have an in-use and remove flag per fd and we 
only increment/decrement refcount on qemu_open/qemu_close:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's remove flag is initialized to off and in-use flag 
is initialized to on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; fd=4's in-use flag is 
turned off; fd=4 is left open because it is still in-use by the block 
device (refcount is still 1)
4. client re-establishes QMP connection, refcount for fdset1 is still 1; 
fd=4's in-use flag is turned on; 'query-fds' lets client learn about 
fd=4 still being open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for 
fd=4; but fd=4 remains open because refcount of fdset1 is 1
6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
0; fd=4 is closed because (refcount == 0 && (!inuse || removed)) is true
7. Sometime later.. QMP disconnects

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [libvirt] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-06 17:40                                 ` Corey Bryant
@ 2012-07-06 18:19                                   ` Corey Bryant
  2012-07-09 14:04                                   ` [Qemu-devel] " Kevin Wolf
  1 sibling, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-06 18:19 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: libvir-list, pbonzini, aliguori, stefanha, qemu-devel



On 07/06/2012 01:40 PM, Corey Bryant wrote:
>
>
> On 07/06/2012 05:11 AM, Kevin Wolf wrote:
>> Am 05.07.2012 19:00, schrieb Eric Blake:
>>> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>>> refcount of 0; fd=4's in-use flag is turned on
>>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>>> so qemu_open() increments the refcount of fdset1 to 1
>>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>>>> longer in-use by the monitor, and is left open because it is in use by
>>>> the block device (refcount is 1)
>>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>>>> but refcount is 1 so it is not closed
>>> 5. client re-establishes QMP connection, so all tracked fds are visited.
>>>   What happens to the fd=4 in-use flag?
>>>
>>> ...but what are the semantics here?
>>>
>>> If we _always_ turn the in-use flag back on, then that says that even
>>> though libvirt successfully asked to close the fd, the reconnection
>>> means that libvirt once again has to ask to close things.
>>>
>>> If we _never_ turn the in-use flag back on, then we've broken the first
>>> case above where we want an in-use fd to come back into use after a
>>> crash.
>>>
>>> Maybe that argues for two flags per fd: an in-use flag (there is
>>> currently a monitor connection that can manipulate the fd, either
>>> because it passed in the fd or because it reconnected) and a removed
>>> flag (a monitor called remove-fd, and no longer wants to know about the
>>> fd, even if it crashes and reconnects).
>>
>> I was in fact just going to suggest a removed flag as well, however
>> combined with including the monitor connections in the refcount instead
>> of an additional flag. This would also allow to have (the currently
>> mostly hypothetical case of) multiple QMP monitors without interesting
>> things happening.
>>
>> Maybe I'm missing some point that the inuse flag would allow and
>> including monitors in the refcount doesn't. Is there one?
>>
>> Kevin
>>
>
> Ok let me try this again. I was going through some of the examples and I
> think we need a separate in-use flag.  Otherwise, when refcount gets to
> 1, we don't know if it is because of a monitor reference or a block
> device reference.  I think it would cause fds to sit on the monitor
> until refcount gets to zero (monitor disconnects). Here's an example
> without the in-use flag:
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 1 (incremented because of monitor reference); fd=4's remove
> flag is initialized to off
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
> qemu_open() increments the refcount of fdset1 to 2
> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
> is decremented to 1; fd=4 is left open because it is still in use by the
> block device (refcount is 1)
> 4. client re-establishes QMP connection, refcount for fdset1 is
> incremented to 2; 'query-fds' lets client learn about fd=4 still being
> open as part of fdset1
> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
> fd=4; but fd=4 remains open because refcount of fdset1 is 2
> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
> 1; fd=4 remains open because monitor still references fdset1 (refcount
> of fdset1 is 1)
> 7. Sometime later.. QMP disconnects; refcount for fdset is decremented
> to zero; fd=4 is closed
>
> In the following case, we have an in-use and remove flag per fd and we
> only increment/decrement refcount on qemu_open/qemu_close:
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 0; fd=4's remove flag is initialized to off and in-use flag
> is initialized to on
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
> qemu_open() increments the refcount of fdset1 to 1
> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's remove flag is off; fd=4's in-use flag is
> turned off; fd=4 is left open because it is still in-use by the block
> device (refcount is still 1)
> 4. client re-establishes QMP connection, refcount for fdset1 is still 1;
> fd=4's in-use flag is turned on; 'query-fds' lets client learn about
> fd=4 still being open as part of fdset1
> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
> fd=4; but fd=4 remains open because refcount of fdset1 is 1
> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
> 0; fd=4 is closed because (refcount == 0 && (!inuse || removed)) is true
> 7. Sometime later.. QMP disconnects
>

I have one modification to this.  It looks the inuse flag could be at 
the fdset level, rather than an inuse flag per fd.  Although.. you did 
mention multiple monitors, so perhaps inuse should be a counter?

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-06 17:40                                 ` Corey Bryant
  2012-07-06 18:19                                   ` [Qemu-devel] [libvirt] " Corey Bryant
@ 2012-07-09 14:04                                   ` Kevin Wolf
  2012-07-09 15:23                                     ` Corey Bryant
  1 sibling, 1 reply; 56+ messages in thread
From: Kevin Wolf @ 2012-07-09 14:04 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 06.07.2012 19:40, schrieb Corey Bryant:
> 
> 
> On 07/06/2012 05:11 AM, Kevin Wolf wrote:
>> Am 05.07.2012 19:00, schrieb Eric Blake:
>>> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>>> refcount of 0; fd=4's in-use flag is turned on
>>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>>> so qemu_open() increments the refcount of fdset1 to 1
>>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>>>> longer in-use by the monitor, and is left open because it is in use by
>>>> the block device (refcount is 1)
>>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>>>> but refcount is 1 so it is not closed
>>> 5. client re-establishes QMP connection, so all tracked fds are visited.
>>>   What happens to the fd=4 in-use flag?
>>>
>>> ...but what are the semantics here?
>>>
>>> If we _always_ turn the in-use flag back on, then that says that even
>>> though libvirt successfully asked to close the fd, the reconnection
>>> means that libvirt once again has to ask to close things.
>>>
>>> If we _never_ turn the in-use flag back on, then we've broken the first
>>> case above where we want an in-use fd to come back into use after a crash.
>>>
>>> Maybe that argues for two flags per fd: an in-use flag (there is
>>> currently a monitor connection that can manipulate the fd, either
>>> because it passed in the fd or because it reconnected) and a removed
>>> flag (a monitor called remove-fd, and no longer wants to know about the
>>> fd, even if it crashes and reconnects).
>>
>> I was in fact just going to suggest a removed flag as well, however
>> combined with including the monitor connections in the refcount instead
>> of an additional flag. This would also allow to have (the currently
>> mostly hypothetical case of) multiple QMP monitors without interesting
>> things happening.
>>
>> Maybe I'm missing some point that the inuse flag would allow and
>> including monitors in the refcount doesn't. Is there one?
>>
>> Kevin
>>
> 
> Ok let me try this again. I was going through some of the examples and I 
> think we need a separate in-use flag.  Otherwise, when refcount gets to 
> 1, we don't know if it is because of a monitor reference or a block 
> device reference.  

Does it matter?

> I think it would cause fds to sit on the monitor 
> until refcount gets to zero (monitor disconnects). Here's an example 
> without the in-use flag:
> 
> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
> refcount of 1 (incremented because of monitor reference); fd=4's remove 
> flag is initialized to off
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
> qemu_open() increments the refcount of fdset1 to 2
> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
> is decremented to 1; fd=4 is left open because it is still in use by the 
> block device (refcount is 1)
> 4. client re-establishes QMP connection, refcount for fdset1 is 
> incremented to 2; 'query-fds' lets client learn about fd=4 still being 
> open as part of fdset1
> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for 
> fd=4; but fd=4 remains open because refcount of fdset1 is 2

It also decreases the reference count because the monitor doesn't use it
any more.

> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
> 1; fd=4 remains open because monitor still references fdset1 (refcount 
> of fdset1 is 1)

So here the refcount becomes 0 and the fdset is closed.

> 7. Sometime later.. QMP disconnects; refcount for fdset is decremented 
> to zero; fd=4 is closed

The only question that is a bit unclear to me is whether a remove-fd on
one monitor drops the refcount only for this monitor or for all
monitors. However, both options can be implemented without additional
flags or counters.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-05 15:06                             ` Corey Bryant
@ 2012-07-09 14:05                               ` Luiz Capitulino
  2012-07-09 15:05                                 ` Corey Bryant
  0 siblings, 1 reply; 56+ messages in thread
From: Luiz Capitulino @ 2012-07-09 14:05 UTC (permalink / raw)
  To: Corey Bryant
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	pbonzini, Eric Blake

On Thu, 05 Jul 2012 11:06:56 -0400
Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:

> 
> 
> On 07/04/2012 04:09 AM, Kevin Wolf wrote:
> > Am 03.07.2012 20:21, schrieb Corey Bryant:
> >> On 07/03/2012 02:00 PM, Eric Blake wrote:
> >>> On 07/03/2012 11:46 AM, Corey Bryant wrote:
> >>>
> >>>>
> >>>> Yes, I think adding a +1 to the refcount for the monitor makes sense.
> >>>>
> >>>> I'm a bit unsure how to increment the refcount when a monitor reconnects
> >>>> though.  Maybe it is as simple as adding a +1 to each fd's refcount when
> >>>> the next QMP monitor connects.
> >>>
> >>> Or maybe delay a +1 until after a 'query-fds' - it is not until the
> >>> monitor has reconnected and learned what fds it should be aware of that
> >>> incrementing the refcount again makes sense.  But that would mean making
> >>> 'query-fds' track whether this is the first call since the monitor
> >>> reconnected, as it shouldn't normally increase refcounts.
> >>
> >> This doesn't sound ideal.
> >
> > Yes, it's less than ideal.
> >
> >>> The other alternative is that the monitor never re-increments a
> >>> refcount.  Once a monitor disconnects, that fd is lost to the monitor,
> >>> and a reconnected monitor must pass in a new fd to be re-associated with
> >>> the fdset.  In other words, the monitor's use of an fd is a one-way
> >>> operation, starting life in use but ending at the first disconnect or
> >>> remove-fd.
> >>
> >> I would vote for this 2nd alternative.  As long as we're not introducing
> >> an fd leak.  And I don't think we are if we decrement the refcount on
> >> remove-fd or on QMP disconnect.
> >
> > In fact, I believe this one is even worse. I can already see hacks like
> > adding a dummy FD with invalid flags and removing it again just to
> > regain control over the fdset...
> >
> > You earlier suggestion made a lot of sense to me: Whenever a new QMP
> > monitor is connected, increase the refcount. That is, as long as any
> > monitor is there, don't drop any fdsets unless explicitly requested via QMP.
> 
> Ok.  So refcount would be incremented (for the fd or fdset, whatever we 
> decide on) when QMP reconnects.  I'm assuming we wouldn't wait until 
> after a query-fds call.

I'm not sure this is a good idea because we will leak fds if the client forgets
about the fds when re-connecting (ie. it was restarted) or if a different
client connects to QMP.

If we really want to do that, I think that the right way of doing this is to
add a command for clients to re-again ownership of the fds on reconnection.

But to be honest, I don't fully understand why this is needed.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 14:05                               ` Luiz Capitulino
@ 2012-07-09 15:05                                 ` Corey Bryant
  2012-07-09 15:46                                   ` Kevin Wolf
  2012-07-09 18:20                                   ` Corey Bryant
  0 siblings, 2 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-09 15:05 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	pbonzini, Eric Blake



On 07/09/2012 10:05 AM, Luiz Capitulino wrote:
> On Thu, 05 Jul 2012 11:06:56 -0400
> Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>
>>
>>
>> On 07/04/2012 04:09 AM, Kevin Wolf wrote:
>>> Am 03.07.2012 20:21, schrieb Corey Bryant:
>>>> On 07/03/2012 02:00 PM, Eric Blake wrote:
>>>>> On 07/03/2012 11:46 AM, Corey Bryant wrote:
>>>>>
>>>>>>
>>>>>> Yes, I think adding a +1 to the refcount for the monitor makes sense.
>>>>>>
>>>>>> I'm a bit unsure how to increment the refcount when a monitor reconnects
>>>>>> though.  Maybe it is as simple as adding a +1 to each fd's refcount when
>>>>>> the next QMP monitor connects.
>>>>>
>>>>> Or maybe delay a +1 until after a 'query-fds' - it is not until the
>>>>> monitor has reconnected and learned what fds it should be aware of that
>>>>> incrementing the refcount again makes sense.  But that would mean making
>>>>> 'query-fds' track whether this is the first call since the monitor
>>>>> reconnected, as it shouldn't normally increase refcounts.
>>>>
>>>> This doesn't sound ideal.
>>>
>>> Yes, it's less than ideal.
>>>
>>>>> The other alternative is that the monitor never re-increments a
>>>>> refcount.  Once a monitor disconnects, that fd is lost to the monitor,
>>>>> and a reconnected monitor must pass in a new fd to be re-associated with
>>>>> the fdset.  In other words, the monitor's use of an fd is a one-way
>>>>> operation, starting life in use but ending at the first disconnect or
>>>>> remove-fd.
>>>>
>>>> I would vote for this 2nd alternative.  As long as we're not introducing
>>>> an fd leak.  And I don't think we are if we decrement the refcount on
>>>> remove-fd or on QMP disconnect.
>>>
>>> In fact, I believe this one is even worse. I can already see hacks like
>>> adding a dummy FD with invalid flags and removing it again just to
>>> regain control over the fdset...
>>>
>>> You earlier suggestion made a lot of sense to me: Whenever a new QMP
>>> monitor is connected, increase the refcount. That is, as long as any
>>> monitor is there, don't drop any fdsets unless explicitly requested via QMP.
>>
>> Ok.  So refcount would be incremented (for the fd or fdset, whatever we
>> decide on) when QMP reconnects.  I'm assuming we wouldn't wait until
>> after a query-fds call.
>
> I'm not sure this is a good idea because we will leak fds if the client forgets
> about the fds when re-connecting (ie. it was restarted) or if a different
> client connects to QMP.
>
> If we really want to do that, I think that the right way of doing this is to
> add a command for clients to re-again ownership of the fds on reconnection.
>
> But to be honest, I don't fully understand why this is needed.
>

I'm not sure this is an issue with current design.  I know things have 
changed a bit as the email threads evolved, so I'll paste the current 
design that I am working from.  Please let me know if you still see any 
issues.

FD passing:
-----------
New monitor commands enable adding/removing an fd to/from a set.  New 
monitor command query-fdsets enables querying of current monitor fdsets. 
  The set of fds should all refer to the same file, with each fd having 
different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup 
the fd that has the matching access mode flags.

Design points:
--------------
1. add-fd
-> fd is passed via SCM rights and qemu adds fd to first unused fdset 
(e.g. /dev/fdset/1)
-> add-fd monitor function initializes the monitor inuse flag for the 
fdset to true
-> add-fd monitor function initializes the remove flag for the fd to false
-> add-fd returns fdset number and received fd number (e.g fd=3) to caller

2. drive_add file=/dev/fdset/1
-> qemu_open uses the first fd in fdset1 that has access flags matching 
the qemu_open action flags and has remove flag set to false
-> qemu_open increments refcount for the fdset
-> Need to make sure that if a command like 'device-add' fails that 
refcount is not incremented

3. add-fd fdset=1
-> fd is passed via SCM rights
-> add-fd monitor function adds the received fd to the specified fdset 
(or fails if fdset doesn't exist)
-> add-fd monitor function initializes the remove flag for the fd to false
-> add-fd returns fdset number and received fd number (e.g fd=4) to caller

4. block-commit
-> qemu_open performs "reopen" by using the first fd from the fdset that 
has access flags matching the qemu_open action flags and has remove flag 
set to false
-> qemu_open increments refcount for the fdset
-> Need to make sure that if a command like 'block-commit' fails that 
refcount is not incremented

5. remove-fd fdset=1 fd=4
-> remove-fd monitor function fails if fdset doesn't exist
-> remove-fd monitor function turns on remove flag for fd=4

6. qemu_close (need to replace all close calls in block layer with 
qemu_close)
-> qemu_close decrements refcount for fdset
-> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
-> qemu_close frees the fdset if no fds remain in it

7. disconnecting the QMP monitor
-> monitor disconnect visits all fdsets on monitor and turns off monitor 
in-use flag for fdset

8. connecting the QMP monitor
-> monitor connect visits all fdsets on monitor and turns on monitor 
in-use flag for fdset

9. query-fdsets
-> returns all fdsets and fds that don't have remove flag on

QMP command examples
--------------------
-> { "execute": "add-fd", "arguments": { "fdset": 1 } }
<- { "return": { "fdset": 1, "fd": 3 } }

-> { "execute": "remove-fd", "arguments": { "fdset": 1, "fd": 3 } }
<- { "return": {} }

-> { "execute":"query-fdsets" } =>
<- { "return" : { "fdsets": [
      { "name": "fdset1",
        "fds": [ { "fd": 4, "removed": false } ],
        "refcount": 1,
        "monitor": true },
      { "name": "fdset2",
        "fds": [ { "fd": 5, "removed": false },
                 { "fd": 6, "removed": true } ],
        "refcount": 1,
        "monitor": true } }

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 14:04                                   ` [Qemu-devel] " Kevin Wolf
@ 2012-07-09 15:23                                     ` Corey Bryant
  2012-07-09 15:30                                       ` Kevin Wolf
  0 siblings, 1 reply; 56+ messages in thread
From: Corey Bryant @ 2012-07-09 15:23 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/09/2012 10:04 AM, Kevin Wolf wrote:
> Am 06.07.2012 19:40, schrieb Corey Bryant:
>>
>>
>> On 07/06/2012 05:11 AM, Kevin Wolf wrote:
>>> Am 05.07.2012 19:00, schrieb Eric Blake:
>>>> On 07/05/2012 10:35 AM, Corey Bryant wrote:
>>>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>>>> refcount of 0; fd=4's in-use flag is turned on
>>>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
>>>>> so qemu_open() increments the refcount of fdset1 to 1
>>>>> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
>>>>> longer in-use by the monitor, and is left open because it is in use by
>>>>> the block device (refcount is 1)
>>>>> 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
>>>>> but refcount is 1 so it is not closed
>>>> 5. client re-establishes QMP connection, so all tracked fds are visited.
>>>>    What happens to the fd=4 in-use flag?
>>>>
>>>> ...but what are the semantics here?
>>>>
>>>> If we _always_ turn the in-use flag back on, then that says that even
>>>> though libvirt successfully asked to close the fd, the reconnection
>>>> means that libvirt once again has to ask to close things.
>>>>
>>>> If we _never_ turn the in-use flag back on, then we've broken the first
>>>> case above where we want an in-use fd to come back into use after a crash.
>>>>
>>>> Maybe that argues for two flags per fd: an in-use flag (there is
>>>> currently a monitor connection that can manipulate the fd, either
>>>> because it passed in the fd or because it reconnected) and a removed
>>>> flag (a monitor called remove-fd, and no longer wants to know about the
>>>> fd, even if it crashes and reconnects).
>>>
>>> I was in fact just going to suggest a removed flag as well, however
>>> combined with including the monitor connections in the refcount instead
>>> of an additional flag. This would also allow to have (the currently
>>> mostly hypothetical case of) multiple QMP monitors without interesting
>>> things happening.
>>>
>>> Maybe I'm missing some point that the inuse flag would allow and
>>> including monitors in the refcount doesn't. Is there one?
>>>
>>> Kevin
>>>
>>
>> Ok let me try this again. I was going through some of the examples and I
>> think we need a separate in-use flag.  Otherwise, when refcount gets to
>> 1, we don't know if it is because of a monitor reference or a block
>> device reference.
>
> Does it matter?

>> I think it would cause fds to sit on the monitor
>> until refcount gets to zero (monitor disconnects). Here's an example
>> without the in-use flag:
>>
>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>> refcount of 1 (incremented because of monitor reference); fd=4's remove
>> flag is initialized to off
>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
>> qemu_open() increments the refcount of fdset1 to 2
>> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
>> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
>> is decremented to 1; fd=4 is left open because it is still in use by the
>> block device (refcount is 1)
>> 4. client re-establishes QMP connection, refcount for fdset1 is
>> incremented to 2; 'query-fds' lets client learn about fd=4 still being
>> open as part of fdset1
>> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
>> fd=4; but fd=4 remains open because refcount of fdset1 is 2
>
> It also decreases the reference count because the monitor doesn't use it
> any more.

I don't think that will work because refcount is for the entire fdset. 
So we can't decrement the refcount for every fd that is removed from the 
fdset.

I think it is much simpler if we only increment refcount for an fdset on 
qemu_open, and only decrement refcount on qemu_close.

>
>> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
>> 1; fd=4 remains open because monitor still references fdset1 (refcount
>> of fdset1 is 1)
>
> So here the refcount becomes 0 and the fdset is closed.
>

>> 7. Sometime later.. QMP disconnects; refcount for fdset is decremented
>> to zero; fd=4 is closed
>
> The only question that is a bit unclear to me is whether a remove-fd on
> one monitor drops the refcount only for this monitor or for all
> monitors. However, both options can be implemented without additional
> flags or counters.

Before we go back and forth on this thread, would you mind taking a look 
at the last email I sent to Luiz?  It includes all the design points 
that I'm currently working from.  I think it's a good level set and we 
can work off that thread if there are still any issues.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 15:23                                     ` Corey Bryant
@ 2012-07-09 15:30                                       ` Kevin Wolf
  0 siblings, 0 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-09 15:30 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 09.07.2012 17:23, schrieb Corey Bryant:
>>> I think it would cause fds to sit on the monitor
>>> until refcount gets to zero (monitor disconnects). Here's an example
>>> without the in-use flag:
>>>
>>> 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
>>> refcount of 1 (incremented because of monitor reference); fd=4's remove
>>> flag is initialized to off
>>> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
>>> qemu_open() increments the refcount of fdset1 to 2
>>> 3. client crashes, so all fdsets are visited; fd=4 had not yet been
>>> passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
>>> is decremented to 1; fd=4 is left open because it is still in use by the
>>> block device (refcount is 1)
>>> 4. client re-establishes QMP connection, refcount for fdset1 is
>>> incremented to 2; 'query-fds' lets client learn about fd=4 still being
>>> open as part of fdset1
>>> 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
>>> fd=4; but fd=4 remains open because refcount of fdset1 is 2
>>
>> It also decreases the reference count because the monitor doesn't use it
>> any more.
> 
> I don't think that will work because refcount is for the entire fdset. 
> So we can't decrement the refcount for every fd that is removed from the 
> fdset.
> 
> I think it is much simpler if we only increment refcount for an fdset on 
> qemu_open, and only decrement refcount on qemu_close.

Ah right... So this would only work if we had explicit
fdset-create/close commands, where the former would increase the
refcount and the latter decrease it (fdset-open would be optional but I
like symmetry)

Maybe we need (or want) that anyway, but I need to think more about it
first.

>>> 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
>>> 1; fd=4 remains open because monitor still references fdset1 (refcount
>>> of fdset1 is 1)
>>
>> So here the refcount becomes 0 and the fdset is closed.
>>
> 
>>> 7. Sometime later.. QMP disconnects; refcount for fdset is decremented
>>> to zero; fd=4 is closed
>>
>> The only question that is a bit unclear to me is whether a remove-fd on
>> one monitor drops the refcount only for this monitor or for all
>> monitors. However, both options can be implemented without additional
>> flags or counters.
> 
> Before we go back and forth on this thread, would you mind taking a look 
> at the last email I sent to Luiz?  It includes all the design points 
> that I'm currently working from.  I think it's a good level set and we 
> can work off that thread if there are still any issues.

Ok, I'll have a look.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 15:05                                 ` Corey Bryant
@ 2012-07-09 15:46                                   ` Kevin Wolf
  2012-07-09 16:18                                     ` Luiz Capitulino
  2012-07-09 17:35                                     ` Corey Bryant
  2012-07-09 18:20                                   ` Corey Bryant
  1 sibling, 2 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-09 15:46 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 09.07.2012 17:05, schrieb Corey Bryant:
> I'm not sure this is an issue with current design.  I know things have 
> changed a bit as the email threads evolved, so I'll paste the current 
> design that I am working from.  Please let me know if you still see any 
> issues.
> 
> FD passing:
> -----------
> New monitor commands enable adding/removing an fd to/from a set.  New 
> monitor command query-fdsets enables querying of current monitor fdsets. 
>   The set of fds should all refer to the same file, with each fd having 
> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup 
> the fd that has the matching access mode flags.
> 
> Design points:
> --------------
> 1. add-fd
> -> fd is passed via SCM rights and qemu adds fd to first unused fdset 
> (e.g. /dev/fdset/1)
> -> add-fd monitor function initializes the monitor inuse flag for the 
> fdset to true
> -> add-fd monitor function initializes the remove flag for the fd to false
> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
> 
> 2. drive_add file=/dev/fdset/1
> -> qemu_open uses the first fd in fdset1 that has access flags matching 
> the qemu_open action flags and has remove flag set to false
> -> qemu_open increments refcount for the fdset
> -> Need to make sure that if a command like 'device-add' fails that 
> refcount is not incremented
> 
> 3. add-fd fdset=1
> -> fd is passed via SCM rights
> -> add-fd monitor function adds the received fd to the specified fdset 
> (or fails if fdset doesn't exist)
> -> add-fd monitor function initializes the remove flag for the fd to false
> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
> 
> 4. block-commit
> -> qemu_open performs "reopen" by using the first fd from the fdset that 
> has access flags matching the qemu_open action flags and has remove flag 
> set to false
> -> qemu_open increments refcount for the fdset
> -> Need to make sure that if a command like 'block-commit' fails that 
> refcount is not incremented
> 
> 5. remove-fd fdset=1 fd=4
> -> remove-fd monitor function fails if fdset doesn't exist
> -> remove-fd monitor function turns on remove flag for fd=4

What was again the reason why we keep removed fds in the fdset at all?

The removed flag would make sense for a fdset after a hypothetical
close-fdset call because the fdset needs to be kept around until the
last user closes it, but I think removed fds can be deleted immediately.

I think I might have confused remove-fd and close-fdset in earlier
emails in this thread, so I hope this isn't inconsistent with what I
said before.

> 6. qemu_close (need to replace all close calls in block layer with 
> qemu_close)
> -> qemu_close decrements refcount for fdset
> -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
> -> qemu_close frees the fdset if no fds remain in it
> 
> 7. disconnecting the QMP monitor
> -> monitor disconnect visits all fdsets on monitor and turns off monitor 
> in-use flag for fdset

And close all fds with refcount == 0.

> 8. connecting the QMP monitor
> -> monitor connect visits all fdsets on monitor and turns on monitor 
> in-use flag for fdset
> 
> 9. query-fdsets
> -> returns all fdsets and fds that don't have remove flag on

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 15:46                                   ` Kevin Wolf
@ 2012-07-09 16:18                                     ` Luiz Capitulino
  2012-07-09 17:59                                       ` Corey Bryant
  2012-07-09 17:35                                     ` Corey Bryant
  1 sibling, 1 reply; 56+ messages in thread
From: Luiz Capitulino @ 2012-07-09 16:18 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, Corey Bryant, qemu-devel,
	pbonzini, Eric Blake

On Mon, 09 Jul 2012 17:46:00 +0200
Kevin Wolf <kwolf@redhat.com> wrote:

> Am 09.07.2012 17:05, schrieb Corey Bryant:
> > I'm not sure this is an issue with current design.  I know things have 
> > changed a bit as the email threads evolved, so I'll paste the current 
> > design that I am working from.  Please let me know if you still see any 
> > issues.
> > 
> > FD passing:
> > -----------
> > New monitor commands enable adding/removing an fd to/from a set.  New 
> > monitor command query-fdsets enables querying of current monitor fdsets. 
> >   The set of fds should all refer to the same file, with each fd having 
> > different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup 
> > the fd that has the matching access mode flags.
> > 
> > Design points:
> > --------------
> > 1. add-fd
> > -> fd is passed via SCM rights and qemu adds fd to first unused fdset 
> > (e.g. /dev/fdset/1)

The fdset should be specified by the client, like:

 { "execute": "add-fd-set", "arguments": { "set-name": "/dev/fdset/1" } }

> > -> add-fd monitor function initializes the monitor inuse flag for the 
> > fdset to true

Why do we need the inuse flag?

> > -> add-fd monitor function initializes the remove flag for the fd to false
> > -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
> > 
> > 2. drive_add file=/dev/fdset/1
> > -> qemu_open uses the first fd in fdset1 that has access flags matching 
> > the qemu_open action flags and has remove flag set to false
> > -> qemu_open increments refcount for the fdset
> > -> Need to make sure that if a command like 'device-add' fails that 
> > refcount is not incremented
> > 
> > 3. add-fd fdset=1
> > -> fd is passed via SCM rights
> > -> add-fd monitor function adds the received fd to the specified fdset 
> > (or fails if fdset doesn't exist)
> > -> add-fd monitor function initializes the remove flag for the fd to false
> > -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
> > 
> > 4. block-commit
> > -> qemu_open performs "reopen" by using the first fd from the fdset that 
> > has access flags matching the qemu_open action flags and has remove flag 
> > set to false
> > -> qemu_open increments refcount for the fdset
> > -> Need to make sure that if a command like 'block-commit' fails that 
> > refcount is not incremented
> > 
> > 5. remove-fd fdset=1 fd=4
> > -> remove-fd monitor function fails if fdset doesn't exist
> > -> remove-fd monitor function turns on remove flag for fd=4
> 
> What was again the reason why we keep removed fds in the fdset at all?
> 
> The removed flag would make sense for a fdset after a hypothetical
> close-fdset call because the fdset needs to be kept around until the
> last user closes it, but I think removed fds can be deleted immediately.

Agreed.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 15:46                                   ` Kevin Wolf
  2012-07-09 16:18                                     ` Luiz Capitulino
@ 2012-07-09 17:35                                     ` Corey Bryant
  2012-07-09 17:48                                       ` Luiz Capitulino
  2012-07-10  7:53                                       ` Kevin Wolf
  1 sibling, 2 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-09 17:35 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake



On 07/09/2012 11:46 AM, Kevin Wolf wrote:
> Am 09.07.2012 17:05, schrieb Corey Bryant:
>> I'm not sure this is an issue with current design.  I know things have
>> changed a bit as the email threads evolved, so I'll paste the current
>> design that I am working from.  Please let me know if you still see any
>> issues.
>>
>> FD passing:
>> -----------
>> New monitor commands enable adding/removing an fd to/from a set.  New
>> monitor command query-fdsets enables querying of current monitor fdsets.
>>    The set of fds should all refer to the same file, with each fd having
>> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
>> the fd that has the matching access mode flags.
>>
>> Design points:
>> --------------
>> 1. add-fd
>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset
>> (e.g. /dev/fdset/1)
>> -> add-fd monitor function initializes the monitor inuse flag for the
>> fdset to true
>> -> add-fd monitor function initializes the remove flag for the fd to false
>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
>>
>> 2. drive_add file=/dev/fdset/1
>> -> qemu_open uses the first fd in fdset1 that has access flags matching
>> the qemu_open action flags and has remove flag set to false
>> -> qemu_open increments refcount for the fdset
>> -> Need to make sure that if a command like 'device-add' fails that
>> refcount is not incremented
>>
>> 3. add-fd fdset=1
>> -> fd is passed via SCM rights
>> -> add-fd monitor function adds the received fd to the specified fdset
>> (or fails if fdset doesn't exist)
>> -> add-fd monitor function initializes the remove flag for the fd to false
>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
>>
>> 4. block-commit
>> -> qemu_open performs "reopen" by using the first fd from the fdset that
>> has access flags matching the qemu_open action flags and has remove flag
>> set to false
>> -> qemu_open increments refcount for the fdset
>> -> Need to make sure that if a command like 'block-commit' fails that
>> refcount is not incremented
>>
>> 5. remove-fd fdset=1 fd=4
>> -> remove-fd monitor function fails if fdset doesn't exist
>> -> remove-fd monitor function turns on remove flag for fd=4
>
> What was again the reason why we keep removed fds in the fdset at all?

Because if refcount is > 0 for the fd set, then the fd could be in use 
by a block device.  So we keep it around until refcount is decremented 
to zero, at which point it is safe to close.

>
> The removed flag would make sense for a fdset after a hypothetical
> close-fdset call because the fdset needs to be kept around until the
> last user closes it, but I think removed fds can be deleted immediately.

fds in an fd set really need to be kept around until zero block devices 
reference them.  At that point, if '(refcount == 0 && (!inuse || 
remove))' is true, then we'll officially close the fd.

>
> I think I might have confused remove-fd and close-fdset in earlier
> emails in this thread, so I hope this isn't inconsistent with what I
> said before.
>

Ok no problem.

>> 6. qemu_close (need to replace all close calls in block layer with
>> qemu_close)
>> -> qemu_close decrements refcount for fdset
>> -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
>> -> qemu_close frees the fdset if no fds remain in it
>>
>> 7. disconnecting the QMP monitor
>> -> monitor disconnect visits all fdsets on monitor and turns off monitor
>> in-use flag for fdset
>
> And close all fds with refcount == 0.
>

Yes, this makes sense.

It also makes sense to close removed fds with refcount == 0 in the 
remove-fd function.  Basically this will be the same thing we do in 
qemu_close.  We'll close any fds that evaulate the following as true:

(refcount == 0 && (!inuse || remove))

>> 8. connecting the QMP monitor
>> -> monitor connect visits all fdsets on monitor and turns on monitor
>> in-use flag for fdset
>>
>> 9. query-fdsets
>> -> returns all fdsets and fds that don't have remove flag on

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 17:35                                     ` Corey Bryant
@ 2012-07-09 17:48                                       ` Luiz Capitulino
  2012-07-09 18:02                                         ` Corey Bryant
  2012-07-10  7:53                                       ` Kevin Wolf
  1 sibling, 1 reply; 56+ messages in thread
From: Luiz Capitulino @ 2012-07-09 17:48 UTC (permalink / raw)
  To: Corey Bryant
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	pbonzini, Eric Blake

On Mon, 09 Jul 2012 13:35:19 -0400
Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:

> 
> 
> On 07/09/2012 11:46 AM, Kevin Wolf wrote:
> > Am 09.07.2012 17:05, schrieb Corey Bryant:
> >> I'm not sure this is an issue with current design.  I know things have
> >> changed a bit as the email threads evolved, so I'll paste the current
> >> design that I am working from.  Please let me know if you still see any
> >> issues.
> >>
> >> FD passing:
> >> -----------
> >> New monitor commands enable adding/removing an fd to/from a set.  New
> >> monitor command query-fdsets enables querying of current monitor fdsets.
> >>    The set of fds should all refer to the same file, with each fd having
> >> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
> >> the fd that has the matching access mode flags.
> >>
> >> Design points:
> >> --------------
> >> 1. add-fd
> >> -> fd is passed via SCM rights and qemu adds fd to first unused fdset
> >> (e.g. /dev/fdset/1)
> >> -> add-fd monitor function initializes the monitor inuse flag for the
> >> fdset to true
> >> -> add-fd monitor function initializes the remove flag for the fd to false
> >> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
> >>
> >> 2. drive_add file=/dev/fdset/1
> >> -> qemu_open uses the first fd in fdset1 that has access flags matching
> >> the qemu_open action flags and has remove flag set to false
> >> -> qemu_open increments refcount for the fdset
> >> -> Need to make sure that if a command like 'device-add' fails that
> >> refcount is not incremented
> >>
> >> 3. add-fd fdset=1
> >> -> fd is passed via SCM rights
> >> -> add-fd monitor function adds the received fd to the specified fdset
> >> (or fails if fdset doesn't exist)
> >> -> add-fd monitor function initializes the remove flag for the fd to false
> >> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
> >>
> >> 4. block-commit
> >> -> qemu_open performs "reopen" by using the first fd from the fdset that
> >> has access flags matching the qemu_open action flags and has remove flag
> >> set to false
> >> -> qemu_open increments refcount for the fdset
> >> -> Need to make sure that if a command like 'block-commit' fails that
> >> refcount is not incremented
> >>
> >> 5. remove-fd fdset=1 fd=4
> >> -> remove-fd monitor function fails if fdset doesn't exist
> >> -> remove-fd monitor function turns on remove flag for fd=4
> >
> > What was again the reason why we keep removed fds in the fdset at all?
> 
> Because if refcount is > 0 for the fd set, then the fd could be in use 
> by a block device.  So we keep it around until refcount is decremented 
> to zero, at which point it is safe to close.

But then the refcount is associated with the set, not with any particular fd.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 16:18                                     ` Luiz Capitulino
@ 2012-07-09 17:59                                       ` Corey Bryant
  0 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-09 17:59 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	pbonzini, Eric Blake



On 07/09/2012 12:18 PM, Luiz Capitulino wrote:
> On Mon, 09 Jul 2012 17:46:00 +0200
> Kevin Wolf <kwolf@redhat.com> wrote:
>
>> Am 09.07.2012 17:05, schrieb Corey Bryant:
>>> I'm not sure this is an issue with current design.  I know things have
>>> changed a bit as the email threads evolved, so I'll paste the current
>>> design that I am working from.  Please let me know if you still see any
>>> issues.
>>>
>>> FD passing:
>>> -----------
>>> New monitor commands enable adding/removing an fd to/from a set.  New
>>> monitor command query-fdsets enables querying of current monitor fdsets.
>>>    The set of fds should all refer to the same file, with each fd having
>>> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
>>> the fd that has the matching access mode flags.
>>>
>>> Design points:
>>> --------------
>>> 1. add-fd
>>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset
>>> (e.g. /dev/fdset/1)
>
> The fdset should be specified by the client, like:
>
>   { "execute": "add-fd-set", "arguments": { "set-name": "/dev/fdset/1" } }
>

We could make the fdset name configurable.  Then we wouldn't force 
clients into using the file=/dev/fdset/1 alias on commands like 
device_add.  The risk with this is that clients need to be careful and 
use a unique name that doesn't conflict with any existing file names.

The way it is currently, if add-fd is not given an fdset name, it will 
generate a new fdset and add the fd to it.  add-fd always returns the 
fdset (int) and received fd (int) on success.  (e.g. fdset=1 corresponds 
to file=/dev/fdset/1).  Then the 2nd time you want to add an fd to this 
set, you have to specify fdset=1 on add-fd.

I'll do whatever you all prefer.  I think there are advantages in both 
approaches, however I'm leaning toward the current approach and forcing 
use of /dev/fdset/1 to keep all fd set names consistent.

>>> -> add-fd monitor function initializes the monitor inuse flag for the
>>> fdset to true
>
> Why do we need the inuse flag?
>

This helps to prevent fd leakage.  Let's say the client adds fd=3 to 
/dev/fdset/1 and then the QMP monitor disconnects.  Since the following 
evaluates to true when the monitor disconnects, the fd will be closed:

(refcount == 0 && (!inuse || remove))

Note: refcount is incremented for the fdset on qemu_open and decremented 
on qemu_close, and no commands caused it to be incremented from zero in 
this example.

>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
>>>
>>> 2. drive_add file=/dev/fdset/1
>>> -> qemu_open uses the first fd in fdset1 that has access flags matching
>>> the qemu_open action flags and has remove flag set to false
>>> -> qemu_open increments refcount for the fdset
>>> -> Need to make sure that if a command like 'device-add' fails that
>>> refcount is not incremented
>>>
>>> 3. add-fd fdset=1
>>> -> fd is passed via SCM rights
>>> -> add-fd monitor function adds the received fd to the specified fdset
>>> (or fails if fdset doesn't exist)
>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
>>>
>>> 4. block-commit
>>> -> qemu_open performs "reopen" by using the first fd from the fdset that
>>> has access flags matching the qemu_open action flags and has remove flag
>>> set to false
>>> -> qemu_open increments refcount for the fdset
>>> -> Need to make sure that if a command like 'block-commit' fails that
>>> refcount is not incremented
>>>
>>> 5. remove-fd fdset=1 fd=4
>>> -> remove-fd monitor function fails if fdset doesn't exist
>>> -> remove-fd monitor function turns on remove flag for fd=4
>>
>> What was again the reason why we keep removed fds in the fdset at all?
>>
>> The removed flag would make sense for a fdset after a hypothetical
>> close-fdset call because the fdset needs to be kept around until the
>> last user closes it, but I think removed fds can be deleted immediately.
>
> Agreed.
>

Please take a look at my recent reply to Kevin about this and let me 
know if it clears things up.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 17:48                                       ` Luiz Capitulino
@ 2012-07-09 18:02                                         ` Corey Bryant
  0 siblings, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-09 18:02 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	pbonzini, Eric Blake



On 07/09/2012 01:48 PM, Luiz Capitulino wrote:
> On Mon, 09 Jul 2012 13:35:19 -0400
> Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>
>>
>>
>> On 07/09/2012 11:46 AM, Kevin Wolf wrote:
>>> Am 09.07.2012 17:05, schrieb Corey Bryant:
>>>> I'm not sure this is an issue with current design.  I know things have
>>>> changed a bit as the email threads evolved, so I'll paste the current
>>>> design that I am working from.  Please let me know if you still see any
>>>> issues.
>>>>
>>>> FD passing:
>>>> -----------
>>>> New monitor commands enable adding/removing an fd to/from a set.  New
>>>> monitor command query-fdsets enables querying of current monitor fdsets.
>>>>     The set of fds should all refer to the same file, with each fd having
>>>> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
>>>> the fd that has the matching access mode flags.
>>>>
>>>> Design points:
>>>> --------------
>>>> 1. add-fd
>>>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset
>>>> (e.g. /dev/fdset/1)
>>>> -> add-fd monitor function initializes the monitor inuse flag for the
>>>> fdset to true
>>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
>>>>
>>>> 2. drive_add file=/dev/fdset/1
>>>> -> qemu_open uses the first fd in fdset1 that has access flags matching
>>>> the qemu_open action flags and has remove flag set to false
>>>> -> qemu_open increments refcount for the fdset
>>>> -> Need to make sure that if a command like 'device-add' fails that
>>>> refcount is not incremented
>>>>
>>>> 3. add-fd fdset=1
>>>> -> fd is passed via SCM rights
>>>> -> add-fd monitor function adds the received fd to the specified fdset
>>>> (or fails if fdset doesn't exist)
>>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
>>>>
>>>> 4. block-commit
>>>> -> qemu_open performs "reopen" by using the first fd from the fdset that
>>>> has access flags matching the qemu_open action flags and has remove flag
>>>> set to false
>>>> -> qemu_open increments refcount for the fdset
>>>> -> Need to make sure that if a command like 'block-commit' fails that
>>>> refcount is not incremented
>>>>
>>>> 5. remove-fd fdset=1 fd=4
>>>> -> remove-fd monitor function fails if fdset doesn't exist
>>>> -> remove-fd monitor function turns on remove flag for fd=4
>>>
>>> What was again the reason why we keep removed fds in the fdset at all?
>>
>> Because if refcount is > 0 for the fd set, then the fd could be in use
>> by a block device.  So we keep it around until refcount is decremented
>> to zero, at which point it is safe to close.
>
> But then the refcount is associated with the set, not with any particular fd.
>

Exactly, yes that's what we're doing.  Sorry, I thought that was clear 
in the design overview I sent earlier today.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 15:05                                 ` Corey Bryant
  2012-07-09 15:46                                   ` Kevin Wolf
@ 2012-07-09 18:20                                   ` Corey Bryant
  1 sibling, 0 replies; 56+ messages in thread
From: Corey Bryant @ 2012-07-09 18:20 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: Kevin Wolf, aliguori, stefanha, libvir-list, qemu-devel,
	pbonzini, Eric Blake



On 07/09/2012 11:05 AM, Corey Bryant wrote:
>
>
> On 07/09/2012 10:05 AM, Luiz Capitulino wrote:
>> On Thu, 05 Jul 2012 11:06:56 -0400
>> Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>>
>>>
>>>
>>> On 07/04/2012 04:09 AM, Kevin Wolf wrote:
>>>> Am 03.07.2012 20:21, schrieb Corey Bryant:
>>>>> On 07/03/2012 02:00 PM, Eric Blake wrote:
>>>>>> On 07/03/2012 11:46 AM, Corey Bryant wrote:
>>>>>>
>>>>>>>
>>>>>>> Yes, I think adding a +1 to the refcount for the monitor makes
>>>>>>> sense.
>>>>>>>
>>>>>>> I'm a bit unsure how to increment the refcount when a monitor
>>>>>>> reconnects
>>>>>>> though.  Maybe it is as simple as adding a +1 to each fd's
>>>>>>> refcount when
>>>>>>> the next QMP monitor connects.
>>>>>>
>>>>>> Or maybe delay a +1 until after a 'query-fds' - it is not until the
>>>>>> monitor has reconnected and learned what fds it should be aware of
>>>>>> that
>>>>>> incrementing the refcount again makes sense.  But that would mean
>>>>>> making
>>>>>> 'query-fds' track whether this is the first call since the monitor
>>>>>> reconnected, as it shouldn't normally increase refcounts.
>>>>>
>>>>> This doesn't sound ideal.
>>>>
>>>> Yes, it's less than ideal.
>>>>
>>>>>> The other alternative is that the monitor never re-increments a
>>>>>> refcount.  Once a monitor disconnects, that fd is lost to the
>>>>>> monitor,
>>>>>> and a reconnected monitor must pass in a new fd to be
>>>>>> re-associated with
>>>>>> the fdset.  In other words, the monitor's use of an fd is a one-way
>>>>>> operation, starting life in use but ending at the first disconnect or
>>>>>> remove-fd.
>>>>>
>>>>> I would vote for this 2nd alternative.  As long as we're not
>>>>> introducing
>>>>> an fd leak.  And I don't think we are if we decrement the refcount on
>>>>> remove-fd or on QMP disconnect.
>>>>
>>>> In fact, I believe this one is even worse. I can already see hacks like
>>>> adding a dummy FD with invalid flags and removing it again just to
>>>> regain control over the fdset...
>>>>
>>>> You earlier suggestion made a lot of sense to me: Whenever a new QMP
>>>> monitor is connected, increase the refcount. That is, as long as any
>>>> monitor is there, don't drop any fdsets unless explicitly requested
>>>> via QMP.
>>>
>>> Ok.  So refcount would be incremented (for the fd or fdset, whatever we
>>> decide on) when QMP reconnects.  I'm assuming we wouldn't wait until
>>> after a query-fds call.
>>
>> I'm not sure this is a good idea because we will leak fds if the
>> client forgets
>> about the fds when re-connecting (ie. it was restarted) or if a different
>> client connects to QMP.
>>
>> If we really want to do that, I think that the right way of doing this
>> is to
>> add a command for clients to re-again ownership of the fds on
>> reconnection.
>>
>> But to be honest, I don't fully understand why this is needed.
>>
>
> I'm not sure this is an issue with current design.  I know things have
> changed a bit as the email threads evolved, so I'll paste the current
> design that I am working from.  Please let me know if you still see any
> issues.
>
> FD passing:
> -----------
> New monitor commands enable adding/removing an fd to/from a set.  New
> monitor command query-fdsets enables querying of current monitor fdsets.
>   The set of fds should all refer to the same file, with each fd having
> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
> the fd that has the matching access mode flags.
>
> Design points:
> --------------
> 1. add-fd
> -> fd is passed via SCM rights and qemu adds fd to first unused fdset
> (e.g. /dev/fdset/1)
> -> add-fd monitor function initializes the monitor inuse flag for the
> fdset to true
> -> add-fd monitor function initializes the remove flag for the fd to false
> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
>
> 2. drive_add file=/dev/fdset/1
> -> qemu_open uses the first fd in fdset1 that has access flags matching
> the qemu_open action flags and has remove flag set to false
> -> qemu_open increments refcount for the fdset
> -> Need to make sure that if a command like 'device-add' fails that
> refcount is not incremented
>
> 3. add-fd fdset=1
> -> fd is passed via SCM rights
> -> add-fd monitor function adds the received fd to the specified fdset
> (or fails if fdset doesn't exist)
> -> add-fd monitor function initializes the remove flag for the fd to false
> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
>
> 4. block-commit
> -> qemu_open performs "reopen" by using the first fd from the fdset that
> has access flags matching the qemu_open action flags and has remove flag
> set to false
> -> qemu_open increments refcount for the fdset
> -> Need to make sure that if a command like 'block-commit' fails that
> refcount is not incremented
>
> 5. remove-fd fdset=1 fd=4
> -> remove-fd monitor function fails if fdset doesn't exist
> -> remove-fd monitor function turns on remove flag for fd=4
>
> 6. qemu_close (need to replace all close calls in block layer with
> qemu_close)
> -> qemu_close decrements refcount for fdset
> -> qemu_close closes all fds that have (refcount == 0 && (!inuse ||
> remove))
> -> qemu_close frees the fdset if no fds remain in it
>
> 7. disconnecting the QMP monitor
> -> monitor disconnect visits all fdsets on monitor and turns off monitor
> in-use flag for fdset
>
> 8. connecting the QMP monitor
> -> monitor connect visits all fdsets on monitor and turns on monitor
> in-use flag for fdset
>
> 9. query-fdsets
> -> returns all fdsets and fds that don't have remove flag on
>
> QMP command examples
> --------------------
> -> { "execute": "add-fd", "arguments": { "fdset": 1 } }
> <- { "return": { "fdset": 1, "fd": 3 } }
>
> -> { "execute": "remove-fd", "arguments": { "fdset": 1, "fd": 3 } }
> <- { "return": {} }
>
> -> { "execute":"query-fdsets" } =>
> <- { "return" : { "fdsets": [
>       { "name": "fdset1",
>         "fds": [ { "fd": 4, "removed": false } ],
>         "refcount": 1,
>         "monitor": true },
>       { "name": "fdset2",
>         "fds": [ { "fd": 5, "removed": false },
>                  { "fd": 6, "removed": true } ],
>         "refcount": 1,
>         "monitor": true } }
>

Here's a minor correction to query-fdsets.  s/name/fdset and I changed 
fdset to an int to stay in sync with add-fd and remove-fd.

-> { "execute":"query-fdsets" } =>
<- { "return" : { "fdsets": [
      { "fdset": 1,
        "fds": [ { "fd": 4, "removed": false } ],
        "refcount": 1,
        "monitor": true },
      { "fdset": 2,
        "fds": [ { "fd": 5, "removed": false },
                 { "fd": 6, "removed": true } ],
        "refcount": 1,
        "monitor": true } }

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
       [not found] ` <20120626091004.GA14451@redhat.com>
       [not found]   ` <4FE9A0F0.2050809@redhat.com>
@ 2012-07-09 18:40   ` Anthony Liguori
  2012-07-09 19:00     ` Luiz Capitulino
  2012-07-10  7:58     ` Kevin Wolf
  1 sibling, 2 replies; 56+ messages in thread
From: Anthony Liguori @ 2012-07-09 18:40 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: kwolf, stefanha, libvir-list, Corey Bryant, qemu-devel,
	lcapitulino, pbonzini, eblake

On 06/26/2012 04:10 AM, Daniel P. Berrange wrote:
> On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote:
>> libvirt's sVirt security driver provides SELinux MAC isolation for
>> Qemu guest processes and their corresponding image files.  In other
>> words, sVirt uses SELinux to prevent a QEMU process from opening
>> files that do not belong to it.
>>
>> sVirt provides this support by labeling guests and resources with
>> security labels that are stored in file system extended attributes.
>> Some file systems, such as NFS, do not support the extended
>> attribute security namespace, and therefore cannot support sVirt
>> isolation.
>>
>> A solution to this problem is to provide fd passing support, where
>> libvirt opens files and passes file descriptors to QEMU.  This,
>> along with SELinux policy to prevent QEMU from opening files, can
>> provide image file isolation for NFS files stored on the same NFS
>> mount.
>>
>> This patch series adds the pass-fd QMP monitor command, which allows
>> an fd to be passed via SCM_RIGHTS, and returns the received file
>> descriptor.  Support is also added to the block layer to allow QEMU
>> to dup the fd when the filename is of the /dev/fd/X format.  This
>> is useful if MAC policy prevents QEMU from opening specific types
>> of files.
>
> I was thinking about some of the sources complexity when using
> FD passing from libvirt and wanted to raise one idea for discussion
> before we continue.
>
> With this proposed series, we have usage akin to:
>
>    1. pass_fd FDSET={M} ->  returns a string "/dev/fd/N" showing QEMU's
>       view of the FD
>    2. drive_add file=/dev/fd/N
>    3. if failure:
>         close_fd "/dev/fd/N"
>
> My problem is that none of this FD passing is "transactional".

My original patch series did not suffer from this problem.

QEMU owned the file descriptor once it received it from libvirt.

I don't think the cited problem (QEMU failing an operation if libvirt was down) 
is really an actual problem since it would be libvirt that would be issuing the 
command in the first place (so the command would just fail which libvirt would 
have to assume anyway if it crashed).

I really dislike where this thread has headed with /dev/fdset.  This has become 
extremely complex and cumbersome.

Perhaps we should reconsider using an RPC for QEMU to request an fd as this 
solves all the cited problems in a much simpler fashion.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 18:40   ` Anthony Liguori
@ 2012-07-09 19:00     ` Luiz Capitulino
  2012-07-10  8:54       ` Daniel P. Berrange
  2012-07-10  7:58     ` Kevin Wolf
  1 sibling, 1 reply; 56+ messages in thread
From: Luiz Capitulino @ 2012-07-09 19:00 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kwolf, stefanha, libvir-list, Corey Bryant, qemu-devel, pbonzini, eblake

On Mon, 09 Jul 2012 13:40:34 -0500
Anthony Liguori <aliguori@us.ibm.com> wrote:

> On 06/26/2012 04:10 AM, Daniel P. Berrange wrote:
> > On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote:
> >> libvirt's sVirt security driver provides SELinux MAC isolation for
> >> Qemu guest processes and their corresponding image files.  In other
> >> words, sVirt uses SELinux to prevent a QEMU process from opening
> >> files that do not belong to it.
> >>
> >> sVirt provides this support by labeling guests and resources with
> >> security labels that are stored in file system extended attributes.
> >> Some file systems, such as NFS, do not support the extended
> >> attribute security namespace, and therefore cannot support sVirt
> >> isolation.
> >>
> >> A solution to this problem is to provide fd passing support, where
> >> libvirt opens files and passes file descriptors to QEMU.  This,
> >> along with SELinux policy to prevent QEMU from opening files, can
> >> provide image file isolation for NFS files stored on the same NFS
> >> mount.
> >>
> >> This patch series adds the pass-fd QMP monitor command, which allows
> >> an fd to be passed via SCM_RIGHTS, and returns the received file
> >> descriptor.  Support is also added to the block layer to allow QEMU
> >> to dup the fd when the filename is of the /dev/fd/X format.  This
> >> is useful if MAC policy prevents QEMU from opening specific types
> >> of files.
> >
> > I was thinking about some of the sources complexity when using
> > FD passing from libvirt and wanted to raise one idea for discussion
> > before we continue.
> >
> > With this proposed series, we have usage akin to:
> >
> >    1. pass_fd FDSET={M} ->  returns a string "/dev/fd/N" showing QEMU's
> >       view of the FD
> >    2. drive_add file=/dev/fd/N
> >    3. if failure:
> >         close_fd "/dev/fd/N"
> >
> > My problem is that none of this FD passing is "transactional".
> 
> My original patch series did not suffer from this problem.
> 
> QEMU owned the file descriptor once it received it from libvirt.
> 
> I don't think the cited problem (QEMU failing an operation if libvirt was down) 
> is really an actual problem since it would be libvirt that would be issuing the 
> command in the first place (so the command would just fail which libvirt would 
> have to assume anyway if it crashed).
> 
> I really dislike where this thread has headed with /dev/fdset.  This has become 
> extremely complex and cumbersome.

I agree, maybe it's time to start over and discuss the original problem again.

> 
> Perhaps we should reconsider using an RPC for QEMU to request an fd as this 
> solves all the cited problems in a much simpler fashion.
> 
> Regards,
> 
> Anthony Liguori
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 17:35                                     ` Corey Bryant
  2012-07-09 17:48                                       ` Luiz Capitulino
@ 2012-07-10  7:53                                       ` Kevin Wolf
  1 sibling, 0 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-10  7:53 UTC (permalink / raw)
  To: Corey Bryant
  Cc: aliguori, stefanha, libvir-list, qemu-devel, Luiz Capitulino,
	pbonzini, Eric Blake

Am 09.07.2012 19:35, schrieb Corey Bryant:
> 
> 
> On 07/09/2012 11:46 AM, Kevin Wolf wrote:
>> Am 09.07.2012 17:05, schrieb Corey Bryant:
>>> I'm not sure this is an issue with current design.  I know things have
>>> changed a bit as the email threads evolved, so I'll paste the current
>>> design that I am working from.  Please let me know if you still see any
>>> issues.
>>>
>>> FD passing:
>>> -----------
>>> New monitor commands enable adding/removing an fd to/from a set.  New
>>> monitor command query-fdsets enables querying of current monitor fdsets.
>>>    The set of fds should all refer to the same file, with each fd having
>>> different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
>>> the fd that has the matching access mode flags.
>>>
>>> Design points:
>>> --------------
>>> 1. add-fd
>>> -> fd is passed via SCM rights and qemu adds fd to first unused fdset
>>> (e.g. /dev/fdset/1)
>>> -> add-fd monitor function initializes the monitor inuse flag for the
>>> fdset to true
>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>> -> add-fd returns fdset number and received fd number (e.g fd=3) to caller
>>>
>>> 2. drive_add file=/dev/fdset/1
>>> -> qemu_open uses the first fd in fdset1 that has access flags matching
>>> the qemu_open action flags and has remove flag set to false
>>> -> qemu_open increments refcount for the fdset
>>> -> Need to make sure that if a command like 'device-add' fails that
>>> refcount is not incremented
>>>
>>> 3. add-fd fdset=1
>>> -> fd is passed via SCM rights
>>> -> add-fd monitor function adds the received fd to the specified fdset
>>> (or fails if fdset doesn't exist)
>>> -> add-fd monitor function initializes the remove flag for the fd to false
>>> -> add-fd returns fdset number and received fd number (e.g fd=4) to caller
>>>
>>> 4. block-commit
>>> -> qemu_open performs "reopen" by using the first fd from the fdset that
>>> has access flags matching the qemu_open action flags and has remove flag
>>> set to false
>>> -> qemu_open increments refcount for the fdset
>>> -> Need to make sure that if a command like 'block-commit' fails that
>>> refcount is not incremented
>>>
>>> 5. remove-fd fdset=1 fd=4
>>> -> remove-fd monitor function fails if fdset doesn't exist
>>> -> remove-fd monitor function turns on remove flag for fd=4
>>
>> What was again the reason why we keep removed fds in the fdset at all?
> 
> Because if refcount is > 0 for the fd set, then the fd could be in use 
> by a block device.  So we keep it around until refcount is decremented 
> to zero, at which point it is safe to close.
> 
>>
>> The removed flag would make sense for a fdset after a hypothetical
>> close-fdset call because the fdset needs to be kept around until the
>> last user closes it, but I think removed fds can be deleted immediately.
> 
> fds in an fd set really need to be kept around until zero block devices 
> reference them.  At that point, if '(refcount == 0 && (!inuse || 
> remove))' is true, then we'll officially close the fd.

Block devices don't reference an fd in the fdset. There are two
references in a block device. The first one is obviously the file
descriptor they are using; it is a fd dup()ed from an fd in the fdset,
but it's now independent of it. The other reference is the file name
that is kept in the BlockDriverState, and it always points to
"/dev/fdset/X", that is, the whole fdset instead of a single fd.

What happens if you remove a file descriptor from an fdset that is in
use, is that you can't reopen the fdset with the flags of the removed
file descriptor any more. Which I believe is exactly the expected
behaviour. libvirt would use this to revoke r/w access, for example (and
which behaviour you already provide by checking removed in qemu_open).

Are there any other use cases where it makes a difference whether a file
descriptor is kept in the fdset with removed=1 or whether it's actually
removed from the fdset?

>> I think I might have confused remove-fd and close-fdset in earlier
>> emails in this thread, so I hope this isn't inconsistent with what I
>> said before.
>>
> 
> Ok no problem.
> 
>>> 6. qemu_close (need to replace all close calls in block layer with
>>> qemu_close)
>>> -> qemu_close decrements refcount for fdset
>>> -> qemu_close closes all fds that have (refcount == 0 && (!inuse || remove))
>>> -> qemu_close frees the fdset if no fds remain in it
>>>
>>> 7. disconnecting the QMP monitor
>>> -> monitor disconnect visits all fdsets on monitor and turns off monitor
>>> in-use flag for fdset
>>
>> And close all fds with refcount == 0.
>>
> 
> Yes, this makes sense.
> 
> It also makes sense to close removed fds with refcount == 0 in the 
> remove-fd function.  Basically this will be the same thing we do in 
> qemu_close.  We'll close any fds that evaulate the following as true:
> 
> (refcount == 0 && (!inuse || remove))

Yes, whatever condition we'll come up with, but it should be the same
and checked in all places where its value might change.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 18:40   ` Anthony Liguori
  2012-07-09 19:00     ` Luiz Capitulino
@ 2012-07-10  7:58     ` Kevin Wolf
  1 sibling, 0 replies; 56+ messages in thread
From: Kevin Wolf @ 2012-07-10  7:58 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: stefanha, libvir-list, Corey Bryant, qemu-devel, lcapitulino,
	pbonzini, eblake

Am 09.07.2012 20:40, schrieb Anthony Liguori:
> On 06/26/2012 04:10 AM, Daniel P. Berrange wrote:
>> On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote:
>>> libvirt's sVirt security driver provides SELinux MAC isolation for
>>> Qemu guest processes and their corresponding image files.  In other
>>> words, sVirt uses SELinux to prevent a QEMU process from opening
>>> files that do not belong to it.
>>>
>>> sVirt provides this support by labeling guests and resources with
>>> security labels that are stored in file system extended attributes.
>>> Some file systems, such as NFS, do not support the extended
>>> attribute security namespace, and therefore cannot support sVirt
>>> isolation.
>>>
>>> A solution to this problem is to provide fd passing support, where
>>> libvirt opens files and passes file descriptors to QEMU.  This,
>>> along with SELinux policy to prevent QEMU from opening files, can
>>> provide image file isolation for NFS files stored on the same NFS
>>> mount.
>>>
>>> This patch series adds the pass-fd QMP monitor command, which allows
>>> an fd to be passed via SCM_RIGHTS, and returns the received file
>>> descriptor.  Support is also added to the block layer to allow QEMU
>>> to dup the fd when the filename is of the /dev/fd/X format.  This
>>> is useful if MAC policy prevents QEMU from opening specific types
>>> of files.
>>
>> I was thinking about some of the sources complexity when using
>> FD passing from libvirt and wanted to raise one idea for discussion
>> before we continue.
>>
>> With this proposed series, we have usage akin to:
>>
>>    1. pass_fd FDSET={M} ->  returns a string "/dev/fd/N" showing QEMU's
>>       view of the FD
>>    2. drive_add file=/dev/fd/N
>>    3. if failure:
>>         close_fd "/dev/fd/N"
>>
>> My problem is that none of this FD passing is "transactional".
> 
> My original patch series did not suffer from this problem.
> 
> QEMU owned the file descriptor once it received it from libvirt.
> 
> I don't think the cited problem (QEMU failing an operation if libvirt was down) 
> is really an actual problem since it would be libvirt that would be issuing the 
> command in the first place (so the command would just fail which libvirt would 
> have to assume anyway if it crashed).
> 
> I really dislike where this thread has headed with /dev/fdset.  This has become 
> extremely complex and cumbersome.

What exactly is complex about the interface we're going to provide? A
long discussion about how to get the details implemented best doesn't
mean at all that the result is complex.

> Perhaps we should reconsider using an RPC for QEMU to request an fd as this 
> solves all the cited problems in a much simpler fashion.

NACK. RPC is wrong and no way easier to handle for management.

Kevin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
  2012-07-09 19:00     ` Luiz Capitulino
@ 2012-07-10  8:54       ` Daniel P. Berrange
  0 siblings, 0 replies; 56+ messages in thread
From: Daniel P. Berrange @ 2012-07-10  8:54 UTC (permalink / raw)
  To: Luiz Capitulino
  Cc: kwolf, Anthony Liguori, stefanha, libvir-list, Corey Bryant,
	qemu-devel, pbonzini, eblake

On Mon, Jul 09, 2012 at 04:00:37PM -0300, Luiz Capitulino wrote:
> On Mon, 09 Jul 2012 13:40:34 -0500
> Anthony Liguori <aliguori@us.ibm.com> wrote:
> 
> > On 06/26/2012 04:10 AM, Daniel P. Berrange wrote:
> > > On Fri, Jun 22, 2012 at 02:36:07PM -0400, Corey Bryant wrote:
> > >> libvirt's sVirt security driver provides SELinux MAC isolation for
> > >> Qemu guest processes and their corresponding image files.  In other
> > >> words, sVirt uses SELinux to prevent a QEMU process from opening
> > >> files that do not belong to it.
> > >>
> > >> sVirt provides this support by labeling guests and resources with
> > >> security labels that are stored in file system extended attributes.
> > >> Some file systems, such as NFS, do not support the extended
> > >> attribute security namespace, and therefore cannot support sVirt
> > >> isolation.
> > >>
> > >> A solution to this problem is to provide fd passing support, where
> > >> libvirt opens files and passes file descriptors to QEMU.  This,
> > >> along with SELinux policy to prevent QEMU from opening files, can
> > >> provide image file isolation for NFS files stored on the same NFS
> > >> mount.
> > >>
> > >> This patch series adds the pass-fd QMP monitor command, which allows
> > >> an fd to be passed via SCM_RIGHTS, and returns the received file
> > >> descriptor.  Support is also added to the block layer to allow QEMU
> > >> to dup the fd when the filename is of the /dev/fd/X format.  This
> > >> is useful if MAC policy prevents QEMU from opening specific types
> > >> of files.
> > >
> > > I was thinking about some of the sources complexity when using
> > > FD passing from libvirt and wanted to raise one idea for discussion
> > > before we continue.
> > >
> > > With this proposed series, we have usage akin to:
> > >
> > >    1. pass_fd FDSET={M} ->  returns a string "/dev/fd/N" showing QEMU's
> > >       view of the FD
> > >    2. drive_add file=/dev/fd/N
> > >    3. if failure:
> > >         close_fd "/dev/fd/N"
> > >
> > > My problem is that none of this FD passing is "transactional".
> > 
> > My original patch series did not suffer from this problem.
> > 
> > QEMU owned the file descriptor once it received it from libvirt.
> > 
> > I don't think the cited problem (QEMU failing an operation if libvirt was down) 
> > is really an actual problem since it would be libvirt that would be issuing the 
> > command in the first place (so the command would just fail which libvirt would 
> > have to assume anyway if it crashed).
> > 
> > I really dislike where this thread has headed with /dev/fdset.  This has become 
> > extremely complex and cumbersome.
> 
> I agree, maybe it's time to start over and discuss the original problem again.

I must say, I'm not entirely sure of all the problems we're trying to
solve anymore. I don't think we've ever clearly stated in this thread
what all the requirements/problems are, so I'm finding it hard to see
what the optimal solution is.


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd
  2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
@ 2012-07-11 18:51   ` Luiz Capitulino
  0 siblings, 0 replies; 56+ messages in thread
From: Luiz Capitulino @ 2012-07-11 18:51 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, pbonzini, eblake

On Fri, 22 Jun 2012 14:36:09 -0400
Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:

> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>

I've cherry-picked this one into the qmp tree.

> ---
> v2:
>  -Convert getfd and closefd to QAPI (lcapitulino@redhat.com)
>  -Remove changes that returned fd from getfd (lcapitulino@redhat.com)
>  -Wrap hmp_* functions around qmp_* functions (kwolf@redhat.com)
>  -Move hmp_* functions to hmp.c (lcapitulino@redhat.com)
>  -Drop .user_print lines (lcapitulino@redhat.com)
>  -Use 'cmd' instead of 'cmd_new' for HMP (lcapitulino@redhat.com)
>  -Change QMP command existance back to 0.14 (lcapitulino@redhat.com)
> 
> v3:
>  -Remove unnecessary return statements from qmp_* functions
>   (lcapitulino@redhat.com)
>  -Add notes to QMP command describing behavior in more detail
>   (lcapitulino@redhat.com, eblake@redhat.com)
> 
> v4:
>  -No changes
> 
>  hmp-commands.hx  |    6 ++----
>  hmp.c            |   18 ++++++++++++++++++
>  hmp.h            |    2 ++
>  monitor.c        |   32 ++++++++++++++------------------
>  qapi-schema.json |   35 +++++++++++++++++++++++++++++++++++
>  qmp-commands.hx  |   14 ++++++++++----
>  6 files changed, 81 insertions(+), 26 deletions(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index f5d9d91..eea8b32 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1236,8 +1236,7 @@ ETEXI
>          .args_type  = "fdname:s",
>          .params     = "getfd name",
>          .help       = "receive a file descriptor via SCM rights and assign it a name",
> -        .user_print = monitor_user_noop,
> -        .mhandler.cmd_new = do_getfd,
> +        .mhandler.cmd = hmp_getfd,
>      },
>  
>  STEXI
> @@ -1253,8 +1252,7 @@ ETEXI
>          .args_type  = "fdname:s",
>          .params     = "closefd name",
>          .help       = "close a file descriptor previously passed via SCM rights",
> -        .user_print = monitor_user_noop,
> -        .mhandler.cmd_new = do_closefd,
> +        .mhandler.cmd = hmp_closefd,
>      },
>  
>  STEXI
> diff --git a/hmp.c b/hmp.c
> index 2ce8cb9..6241856 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -999,3 +999,21 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict)
>      qmp_netdev_del(id, &err);
>      hmp_handle_error(mon, &err);
>  }
> +
> +void hmp_getfd(Monitor *mon, const QDict *qdict)
> +{
> +    const char *fdname = qdict_get_str(qdict, "fdname");
> +    Error *errp = NULL;
> +
> +    qmp_getfd(fdname, &errp);
> +    hmp_handle_error(mon, &errp);
> +}
> +
> +void hmp_closefd(Monitor *mon, const QDict *qdict)
> +{
> +    const char *fdname = qdict_get_str(qdict, "fdname");
> +    Error *errp = NULL;
> +
> +    qmp_closefd(fdname, &errp);
> +    hmp_handle_error(mon, &errp);
> +}
> diff --git a/hmp.h b/hmp.h
> index 79d138d..8d2b0d7 100644
> --- a/hmp.h
> +++ b/hmp.h
> @@ -64,5 +64,7 @@ void hmp_device_del(Monitor *mon, const QDict *qdict);
>  void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict);
>  void hmp_netdev_add(Monitor *mon, const QDict *qdict);
>  void hmp_netdev_del(Monitor *mon, const QDict *qdict);
> +void hmp_getfd(Monitor *mon, const QDict *qdict);
> +void hmp_closefd(Monitor *mon, const QDict *qdict);
>  
>  #endif
> diff --git a/monitor.c b/monitor.c
> index a3bc2c7..1a7f7e7 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -2182,48 +2182,45 @@ static void do_inject_mce(Monitor *mon, const QDict *qdict)
>  }
>  #endif
>  
> -static int do_getfd(Monitor *mon, const QDict *qdict, QObject **ret_data)
> +void qmp_getfd(const char *fdname, Error **errp)
>  {
> -    const char *fdname = qdict_get_str(qdict, "fdname");
>      mon_fd_t *monfd;
>      int fd;
>  
> -    fd = qemu_chr_fe_get_msgfd(mon->chr);
> +    fd = qemu_chr_fe_get_msgfd(cur_mon->chr);
>      if (fd == -1) {
> -        qerror_report(QERR_FD_NOT_SUPPLIED);
> -        return -1;
> +        error_set(errp, QERR_FD_NOT_SUPPLIED);
> +        return;
>      }
>  
>      if (qemu_isdigit(fdname[0])) {
> -        qerror_report(QERR_INVALID_PARAMETER_VALUE, "fdname",
> -                      "a name not starting with a digit");
> -        return -1;
> +        error_set(errp, QERR_INVALID_PARAMETER_VALUE, "fdname",
> +                  "a name not starting with a digit");
> +        return;
>      }
>  
> -    QLIST_FOREACH(monfd, &mon->fds, next) {
> +    QLIST_FOREACH(monfd, &cur_mon->fds, next) {
>          if (strcmp(monfd->name, fdname) != 0) {
>              continue;
>          }
>  
>          close(monfd->fd);
>          monfd->fd = fd;
> -        return 0;
> +        return;
>      }
>  
>      monfd = g_malloc0(sizeof(mon_fd_t));
>      monfd->name = g_strdup(fdname);
>      monfd->fd = fd;
>  
> -    QLIST_INSERT_HEAD(&mon->fds, monfd, next);
> -    return 0;
> +    QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next);
>  }
>  
> -static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data)
> +void qmp_closefd(const char *fdname, Error **errp)
>  {
> -    const char *fdname = qdict_get_str(qdict, "fdname");
>      mon_fd_t *monfd;
>  
> -    QLIST_FOREACH(monfd, &mon->fds, next) {
> +    QLIST_FOREACH(monfd, &cur_mon->fds, next) {
>          if (strcmp(monfd->name, fdname) != 0) {
>              continue;
>          }
> @@ -2232,11 +2229,10 @@ static int do_closefd(Monitor *mon, const QDict *qdict, QObject **ret_data)
>          close(monfd->fd);
>          g_free(monfd->name);
>          g_free(monfd);
> -        return 0;
> +        return;
>      }
>  
> -    qerror_report(QERR_FD_NOT_FOUND, fdname);
> -    return -1;
> +    error_set(errp, QERR_FD_NOT_FOUND, fdname);
>  }
>  
>  static void do_loadvm(Monitor *mon, const QDict *qdict)
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 3b6e346..26a6b84 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1862,3 +1862,38 @@
>  # Since: 0.14.0
>  ##
>  { 'command': 'netdev_del', 'data': {'id': 'str'} }
> +
> +##
> +# @getfd:
> +#
> +# Receive a file descriptor via SCM rights and assign it a name
> +#
> +# @fdname: file descriptor name
> +#
> +# Returns: Nothing on success
> +#          If file descriptor was not received, FdNotSupplied
> +#          If @fdname is not valid, InvalidParameterType
> +#
> +# Since: 0.14.0
> +#
> +# Notes: If @fdname already exists, the file descriptor assigned to
> +#        it will be closed and replaced by the received file
> +#        descriptor.
> +#        The 'closefd' command can be used to explicitly close the
> +#        file descriptor when it is no longer needed.
> +##
> +{ 'command': 'getfd', 'data': {'fdname': 'str'} }
> +
> +##
> +# @closefd:
> +#
> +# Close a file descriptor previously passed via SCM rights
> +#
> +# @fdname: file descriptor name
> +#
> +# Returns: Nothing on success
> +#          If @fdname is not found, FdNotFound
> +#
> +# Since: 0.14.0
> +##
> +{ 'command': 'closefd', 'data': {'fdname': 'str'} }
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 2e1a38e..e3cf3c5 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -873,8 +873,7 @@ EQMP
>          .args_type  = "fdname:s",
>          .params     = "getfd name",
>          .help       = "receive a file descriptor via SCM rights and assign it a name",
> -        .user_print = monitor_user_noop,
> -        .mhandler.cmd_new = do_getfd,
> +        .mhandler.cmd_new = qmp_marshal_input_getfd,
>      },
>  
>  SQMP
> @@ -892,6 +891,14 @@ Example:
>  -> { "execute": "getfd", "arguments": { "fdname": "fd1" } }
>  <- { "return": {} }
>  
> +Notes:
> +
> +(1) If the name specified by the "fdname" argument already exists,
> +    the file descriptor assigned to it will be closed and replaced
> +    by the received file descriptor.
> +(2) The 'closefd' command can be used to explicitly close the file
> +    descriptor when it is no longer needed.
> +
>  EQMP
>  
>      {
> @@ -899,8 +906,7 @@ EQMP
>          .args_type  = "fdname:s",
>          .params     = "closefd name",
>          .help       = "close a file descriptor previously passed via SCM rights",
> -        .user_print = monitor_user_noop,
> -        .mhandler.cmd_new = do_closefd,
> +        .mhandler.cmd_new = qmp_marshal_input_closefd,
>      },
>  
>  SQMP

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2012-07-11 18:52 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-22 18:36 [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 1/7] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-06-22 19:31   ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 2/7] qapi: Convert getfd and closefd Corey Bryant
2012-07-11 18:51   ` Luiz Capitulino
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 3/7] qapi: Add pass-fd QMP command Corey Bryant
2012-06-22 20:24   ` Eric Blake
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 4/7] qapi: Re-arrange monitor.c functions Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 5/7] block: Prevent /dev/fd/X filename from being detected as floppy Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 6/7] block: Convert open calls to qemu_open Corey Bryant
2012-06-22 18:36 ` [Qemu-devel] [PATCH v4 7/7] osdep: Enable qemu_open to dup pre-opened fd Corey Bryant
2012-06-22 19:58   ` Eric Blake
     [not found] ` <20120626091004.GA14451@redhat.com>
     [not found]   ` <4FE9A0F0.2050809@redhat.com>
     [not found]     ` <20120626175045.2c7011b3@doriath.home>
     [not found]       ` <4FEA37A9.10707@linux.vnet.ibm.com>
     [not found]         ` <4FEA3D9C.8080205@redhat.com>
2012-07-02 22:02           ` [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd Corey Bryant
2012-07-02 22:31             ` Eric Blake
2012-07-03  9:07               ` Daniel P. Berrange
2012-07-03  9:40               ` Kevin Wolf
2012-07-03 13:42               ` Corey Bryant
2012-07-03 15:40             ` Corey Bryant
2012-07-03 15:59               ` Kevin Wolf
2012-07-03 16:25                 ` Corey Bryant
2012-07-03 17:03                   ` Eric Blake
2012-07-03 17:46                     ` Corey Bryant
2012-07-03 18:00                       ` Eric Blake
2012-07-03 18:21                         ` Corey Bryant
2012-07-04  8:09                           ` Kevin Wolf
2012-07-05 15:06                             ` Corey Bryant
2012-07-09 14:05                               ` Luiz Capitulino
2012-07-09 15:05                                 ` Corey Bryant
2012-07-09 15:46                                   ` Kevin Wolf
2012-07-09 16:18                                     ` Luiz Capitulino
2012-07-09 17:59                                       ` Corey Bryant
2012-07-09 17:35                                     ` Corey Bryant
2012-07-09 17:48                                       ` Luiz Capitulino
2012-07-09 18:02                                         ` Corey Bryant
2012-07-10  7:53                                       ` Kevin Wolf
2012-07-09 18:20                                   ` Corey Bryant
2012-07-04  8:00                     ` Kevin Wolf
2012-07-05 14:22                       ` Corey Bryant
2012-07-05 14:51                         ` Kevin Wolf
2012-07-05 16:35                           ` Corey Bryant
2012-07-05 16:37                             ` Corey Bryant
2012-07-06  9:06                               ` Kevin Wolf
2012-07-05 17:00                             ` Eric Blake
2012-07-05 17:36                               ` Corey Bryant
2012-07-06  9:11                               ` Kevin Wolf
2012-07-06 17:14                                 ` Corey Bryant
2012-07-06 17:15                                   ` Corey Bryant
2012-07-06 17:40                                 ` Corey Bryant
2012-07-06 18:19                                   ` [Qemu-devel] [libvirt] " Corey Bryant
2012-07-09 14:04                                   ` [Qemu-devel] " Kevin Wolf
2012-07-09 15:23                                     ` Corey Bryant
2012-07-09 15:30                                       ` Kevin Wolf
2012-07-09 18:40   ` Anthony Liguori
2012-07-09 19:00     ` Luiz Capitulino
2012-07-10  8:54       ` Daniel P. Berrange
2012-07-10  7:58     ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.