All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets
@ 2012-08-03 17:28 Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

libvirt's sVirt security driver provides SELinux MAC isolation for
Qemu guest processes and their corresponding image files.  In other
words, sVirt uses SELinux to prevent a QEMU process from opening
files that do not belong to it.

sVirt provides this support by labeling guests and resources with
security labels that are stored in file system extended attributes.
Some file systems, such as NFS, do not support the extended
attribute security namespace, and therefore cannot support sVirt
isolation.

A solution to this problem is to provide fd passing support, where
libvirt opens files and passes file descriptors to QEMU.  This,
along with SELinux policy to prevent QEMU from opening files, can
provide image file isolation for NFS files stored on the same NFS
mount.

This patch series adds the add-fd, remove-fd, and query-fdsets
QMP monitor commands, which allow file descriptors to be passed
via SCM_RIGHTS, and assigned to specified fd sets.  This allows
fd sets to be created per file with fds having, for example,
different access rights.  When QEMU needs to reopen a file with
different access rights, it can search for a matching fd in the
fd set.  Fd sets also allow for easy tracking of fds per file,
helping to prevent fd leaks.

Support is also added to the block layer to allow QEMU to dup an
fd from an fdset when the filename is of the /dev/fdset/nnn format,
where nnn is the fd set ID.

No new SELinux policy is required to prevent open of NFS files
(files with type nfs_t).  The virt_use_nfs boolean type simply
needs to be set to false, and open will be prevented (and dup will
be allowed).  For example:

    # setsebool virt_use_nfs 0
    # getsebool virt_use_nfs
    virt_use_nfs --> off

Corey Bryant (6):
  qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
  qapi: Introduce add-fd, remove-fd, query-fdsets
  monitor: Clean up fd sets on monitor disconnect
  block: Convert open calls to qemu_open
  block: Convert close calls to qemu_close
  block: Enable qemu_open/close to work with fd sets

 block/raw-posix.c |   42 ++++----
 block/raw-win32.c |    6 +-
 block/vdi.c       |    5 +-
 block/vmdk.c      |   25 ++---
 block/vpc.c       |    4 +-
 block/vvfat.c     |   16 +--
 cutils.c          |    5 +
 monitor.c         |  287 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 monitor.h         |    5 +
 osdep.c           |  117 ++++++++++++++++++++++
 qapi-schema.json  |  103 +++++++++++++++++++
 qemu-char.c       |   12 ++-
 qemu-common.h     |    2 +
 qemu-tool.c       |   21 ++++
 qerror.c          |    4 +
 qerror.h          |    3 +
 qmp-commands.hx   |  126 +++++++++++++++++++++++
 savevm.c          |    4 +-
 18 files changed, 732 insertions(+), 55 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
  2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
@ 2012-08-03 17:28 ` Corey Bryant
  2012-08-03 17:33   ` Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

Set the close-on-exec flag for the file descriptor received
via SCM_RIGHTS.

v4
 -This patch is new in v4 (eblake@redhat.com)

v5
 -Fallback to FD_CLOEXEC if MSG_CMSG_CLOEXEC is not available
  (eblake@redhat.com, stefanha@linux.vnet.ibm.com)

v6
 -Set cloexec on correct fd (eblake@redhat.com)

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
 qemu-char.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index c2aaaee..ab4a928 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2238,6 +2238,9 @@ static void unix_process_msgfd(CharDriverState *chr, struct msghdr *msg)
         if (fd < 0)
             continue;
 
+#ifndef MSG_CMSG_CLOEXEC
+        qemu_set_cloexec(fd);
+#endif
         if (s->msgfd != -1)
             close(s->msgfd);
         s->msgfd = fd;
@@ -2253,6 +2256,7 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
         struct cmsghdr cmsg;
         char control[CMSG_SPACE(sizeof(int))];
     } msg_control;
+    int flags = 0;
     ssize_t ret;
 
     iov[0].iov_base = buf;
@@ -2263,9 +2267,13 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
     msg.msg_control = &msg_control;
     msg.msg_controllen = sizeof(msg_control);
 
-    ret = recvmsg(s->fd, &msg, 0);
-    if (ret > 0 && s->is_unix)
+#ifdef MSG_CMSG_CLOEXEC
+    flags |= MSG_CMSG_CLOEXEC;
+#endif
+    ret = recvmsg(s->fd, &msg, flags);
+    if (ret > 0 && s->is_unix) {
         unix_process_msgfd(chr, &msg);
+    }
 
     return ret;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets
  2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
@ 2012-08-03 17:28 ` Corey Bryant
  2012-08-07 18:16   ` Stefan Hajnoczi
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

This patch adds support that enables passing of file descriptors
to the QEMU monitor where they will be stored in specified file
descriptor sets.

A file descriptor set can be used by a client like libvirt to
store file descriptors for the same file.  This allows the
client to open a file with different access modes (O_RDWR,
O_WRONLY, O_RDONLY) and add/remove the passed fds to/from an fd
set as needed.  This will allow QEMU to (in a later patch in this
series) "open" and "reopen" the same file by dup()ing the fd in
the fd set that corresponds to the file, where the fd has the
matching access mode flag that QEMU requests.

The new QMP commands are:
  add-fd: Add a file descriptor to an fd set
  remove-fd: Remove a file descriptor from an fd set
  query-fdsets: Return information describing all fd sets

Note: These commands are not compatible with the existing getfd
and closefd QMP commands.

v5:
 -This patch is new in v5 and replaces the pass-fd QMP command
  from v4.
 -By grouping fds in fd sets, we ease managability with an fd
  set per file, addressing concerns raised in v4 about handling
  "reopens" and preventing fd leakage. (eblake@redhat.com,
  kwolf@redhat.com, dberrange@redhat.com)

v6
 -Make @fd optional for remove-fd (eblake@redhat.com)
 -Make @fdset-id optional for add-fd (eblake@redhat.com)

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
 monitor.c        |  172 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json |  103 ++++++++++++++++++++++++++++++++
 qerror.c         |    4 ++
 qerror.h         |    3 +
 qmp-commands.hx  |  126 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 407 insertions(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index 49dccfe..9aa9f7e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -140,6 +140,24 @@ struct mon_fd_t {
     QLIST_ENTRY(mon_fd_t) next;
 };
 
+/* file descriptor associated with a file descriptor set */
+typedef struct mon_fdset_fd_t mon_fdset_fd_t;
+struct mon_fdset_fd_t {
+    int fd;
+    bool removed;
+    QLIST_ENTRY(mon_fdset_fd_t) next;
+};
+
+/* file descriptor set containing fds passed via SCM_RIGHTS */
+typedef struct mon_fdset_t mon_fdset_t;
+struct mon_fdset_t {
+    int64_t id;
+    int refcount;
+    bool in_use;
+    QLIST_HEAD(, mon_fdset_fd_t) fds;
+    QLIST_ENTRY(mon_fdset_t) next;
+};
+
 typedef struct MonitorControl {
     QObject *id;
     JSONMessageParser parser;
@@ -176,7 +194,8 @@ struct Monitor {
     int print_calls_nr;
 #endif
     QError *error;
-    QLIST_HEAD(,mon_fd_t) fds;
+    QLIST_HEAD(, mon_fd_t) fds;
+    QLIST_HEAD(, mon_fdset_t) fdsets;
     QLIST_ENTRY(Monitor) entry;
 };
 
@@ -2389,6 +2408,157 @@ int monitor_get_fd(Monitor *mon, const char *fdname)
     return -1;
 }
 
+static void monitor_fdset_cleanup(mon_fdset_t *mon_fdset)
+{
+    mon_fdset_fd_t *mon_fdset_fd;
+    mon_fdset_fd_t *mon_fdset_fd_next;
+
+    if (mon_fdset->refcount != 0) {
+        return;
+    }
+
+    QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, mon_fdset_fd_next) {
+        if (!mon_fdset->in_use || mon_fdset_fd->removed) {
+            close(mon_fdset_fd->fd);
+            QLIST_REMOVE(mon_fdset_fd, next);
+            g_free(mon_fdset_fd);
+        }
+    }
+
+    if (QLIST_EMPTY(&mon_fdset->fds)) {
+        QLIST_REMOVE(mon_fdset, next);
+        g_free(mon_fdset);
+    }
+}
+
+AddfdInfo *qmp_add_fd(bool has_fdset_id, int64_t fdset_id, Error **errp)
+{
+    int fd;
+    Monitor *mon = cur_mon;
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd;
+    AddfdInfo *fdinfo;
+
+    fd = qemu_chr_fe_get_msgfd(mon->chr);
+    if (fd == -1) {
+        qerror_report(QERR_FD_NOT_SUPPLIED);
+        return NULL;
+    }
+
+    if (has_fdset_id) {
+        QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+            if (mon_fdset->id == fdset_id) {
+                break;
+            }
+        }
+        if (mon_fdset == NULL) {
+            qerror_report(QERR_FDSET_NOT_FOUND, fdset_id);
+            return NULL;
+        }
+    } else {
+        int64_t fdset_id_prev = -1;
+        mon_fdset_t *mon_fdset_cur = QLIST_FIRST(&mon->fdsets);
+
+        /* Use first available fdset ID */
+        QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+            mon_fdset_cur = mon_fdset;
+            if (fdset_id_prev == mon_fdset_cur->id - 1) {
+                fdset_id_prev = mon_fdset_cur->id;
+                continue;
+            }
+            break;
+        }
+
+        mon_fdset = g_malloc0(sizeof(*mon_fdset));
+        mon_fdset->id = fdset_id_prev + 1;
+        mon_fdset->refcount = 0;
+        mon_fdset->in_use = true;
+
+        /* The fdset list is ordered by fdset ID */
+        if (mon_fdset->id == 0) {
+            QLIST_INSERT_HEAD(&mon->fdsets, mon_fdset, next);
+        } else if (mon_fdset->id < mon_fdset_cur->id) {
+            QLIST_INSERT_BEFORE(mon_fdset_cur, mon_fdset, next);
+        } else {
+            QLIST_INSERT_AFTER(mon_fdset_cur, mon_fdset, next);
+        }
+    }
+
+    mon_fdset_fd = g_malloc0(sizeof(*mon_fdset_fd));
+    mon_fdset_fd->fd = fd;
+    mon_fdset_fd->removed = false;
+    QLIST_INSERT_HEAD(&mon_fdset->fds, mon_fdset_fd, next);
+
+    fdinfo = g_malloc0(sizeof(*fdinfo));
+    fdinfo->fdset_id = mon_fdset->id;
+    fdinfo->fd = mon_fdset_fd->fd;
+
+    return fdinfo;
+}
+
+void qmp_remove_fd(int64_t fdset_id, bool has_fd, int64_t fd, Error **errp)
+{
+    Monitor *mon = cur_mon;
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd;
+    char fd_str[20];
+
+    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
+            if (has_fd && mon_fdset_fd->fd != fd) {
+                continue;
+            }
+            mon_fdset_fd->removed = true;
+            if (has_fd) {
+                break;
+            }
+        }
+        monitor_fdset_cleanup(mon_fdset);
+        return;
+    }
+    snprintf(fd_str, sizeof(fd_str), "%ld", fd);
+    qerror_report(QERR_FD_NOT_FOUND, fd_str);
+}
+
+FdsetInfoList *qmp_query_fdsets(Error **errp)
+{
+    Monitor *mon = cur_mon;
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd;
+    FdsetInfoList *fdset_list = NULL;
+
+    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+        FdsetInfoList *fdset_info = g_malloc0(sizeof(*fdset_info));
+        FdsetFdInfoList *fdsetfd_list = NULL;
+
+        fdset_info->value = g_malloc0(sizeof(*fdset_info->value));
+        fdset_info->value->fdset_id = mon_fdset->id;
+        fdset_info->value->refcount = mon_fdset->refcount;
+        fdset_info->value->in_use = mon_fdset->in_use;
+
+        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
+            FdsetFdInfoList *fdsetfd_info = g_malloc0(sizeof(*fdsetfd_info));
+
+            fdsetfd_info->value = g_malloc0(sizeof(*fdsetfd_info->value));
+            fdsetfd_info->value->fd = mon_fdset_fd->fd;
+            fdsetfd_info->value->removed = mon_fdset_fd->removed;
+
+            fdsetfd_info->next = fdsetfd_list;
+            fdsetfd_list = fdsetfd_info;
+        }
+
+        fdset_info->value->fds = fdsetfd_list;
+
+        fdset_info->next = fdset_list;
+        fdset_list = fdset_info;
+    }
+
+    return fdset_list;
+}
+
 /* mon_cmds and info_cmds would be sorted at runtime */
 static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
diff --git a/qapi-schema.json b/qapi-schema.json
index bc55ed2..4d46e5b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2183,3 +2183,106 @@
 # Since: 0.14.0
 ##
 { 'command': 'closefd', 'data': {'fdname': 'str'} }
+
+# @AddfdInfo:
+#
+# Information about a file descriptor that was added to an fd set.
+#
+# @fdset_id: The ID of the fd set that @fd was added to.
+#
+# @fd: The file descriptor that was received via SCM rights and
+#      added to the fd set.
+#
+# Since: 1.2.0
+##
+{ 'type': 'AddfdInfo', 'data': {'fdset_id': 'int', 'fd': 'int'} }
+
+##
+# @add-fd:
+#
+# Add a file descriptor, that was passed via SCM rights, to an fd set.
+#
+# @fdset_id: #optional The ID of the fd set to add the file descriptor to.
+#
+# Returns: @AddfdInfo on success
+#          If file descriptor was not received, FdNotSupplied
+#          If @fdset_id does not exist, FdSetNotFound
+#
+# Note: If @fdset_id is not specified, a new fd set will be created.
+#
+# Since: 1.2.0
+##
+{ 'command': 'add-fd', 'data': {'*fdset_id': 'int'},
+  'returns': 'AddfdInfo' }
+
+##
+# @remove-fd:
+#
+# Remove a file descriptor from an fd set.
+#
+# @fdset_id: The ID of the fd set that the file descriptor belongs to.
+#
+# @fd: #optional The file descriptor that is to be removed.
+#
+# Returns: Nothing on success
+#          If @fdset_id or @fd is not found, FdNotFound
+#
+# Since: 1.2.0
+#
+# Notes: File descriptors that are removed:
+#        o will not be closed until the reference count corresponding
+#          to @fdset_id reaches zero.
+#        o will not be available for use after successful completion
+#          of the remove-fd command.
+#
+#        If @fd is not specified, all file descriptors in @fdset_id
+#        will be removed.
+##
+{ 'command': 'remove-fd', 'data': {'fdset_id': 'int', '*fd': 'int'} }
+
+##
+# @FdsetFdInfo:
+#
+# Information about a file descriptor that belongs to an fd set.
+#
+# @fd: The file descriptor value.
+#
+# @removed: If true, the remove-fd command has been issued for this fd.
+#
+# Since: 1.2.0
+##
+{ 'type': 'FdsetFdInfo', 'data': {'fd': 'int', 'removed': 'bool'} }
+
+##
+# @FdsetInfo:
+#
+# Information about an fd set.
+#
+# @fdset_id: The ID of the fd set.
+#
+# @refcount: A count of the number of outstanding dup() references to
+#            this fd set.
+#
+# @in_use: If true, this fd set is in use by a connected QMP monitor.
+#
+# @fds: A list of file descriptors that belong to this fd set.
+#
+# Since: 1.2.0
+##
+{ 'type': 'FdsetInfo',
+  'data': {'fdset_id': 'int', 'refcount': 'int', 'in_use': 'bool',
+           'fds': ['FdsetFdInfo']} }
+
+##
+# @query-fdsets:
+#
+# Return information describing all fd sets.
+#
+# Returns: A list of @FdsetInfo
+#
+# Since: 1.2.0
+#
+# Notes: File descriptors are not closed until @refcount is zero,
+#        and either @in_use is false or @removed is true.
+##
+{ 'command': 'query-fdsets', 'returns': ['FdsetInfo'] }
diff --git a/qerror.c b/qerror.c
index 92c4eff..63a0aa1 100644
--- a/qerror.c
+++ b/qerror.c
@@ -148,6 +148,10 @@ static const QErrorStringTable qerror_table[] = {
         .desc      = "No file descriptor supplied via SCM_RIGHTS",
     },
     {
+        .error_fmt = QERR_FDSET_NOT_FOUND,
+        .desc      = "File descriptor set with ID '%(id)' not found",
+    },
+    {
         .error_fmt = QERR_FEATURE_DISABLED,
         .desc      = "The feature '%(name)' is not enabled",
     },
diff --git a/qerror.h b/qerror.h
index b4c8758..b908d3f 100644
--- a/qerror.h
+++ b/qerror.h
@@ -133,6 +133,9 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_FD_NOT_SUPPLIED \
     "{ 'class': 'FdNotSupplied', 'data': {} }"
 
+#define QERR_FDSET_NOT_FOUND \
+    "{ 'class': 'FdSetNotFound', 'data': { 'id': %ld } }"
+
 #define QERR_FEATURE_DISABLED \
     "{ 'class': 'FeatureDisabled', 'data': { 'name': %s } }"
 
diff --git a/qmp-commands.hx b/qmp-commands.hx
index e3cf3c5..7141168 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -926,6 +926,132 @@ Example:
 
 EQMP
 
+     {
+        .name       = "add-fd",
+        .args_type  = "fdset_id:i?",
+        .params     = "add-fd fdset_id",
+        .help       = "Add a file descriptor, that was passed via SCM rights, to an fd set",
+        .mhandler.cmd_new = qmp_marshal_input_add_fd,
+    },
+
+SQMP
+add-fd
+-------
+
+Add a file descriptor, that was passed via SCM rights, to an fd set.
+
+Arguments:
+
+- "fdset_id": The ID of the fd set to add the file descriptor to.
+              (json-int, optional)
+
+Return a json-object with the following information:
+
+- "fdset_id": The ID of the fd set that the fd was added to. (json-int)
+- "fd": The file descriptor that was received via SCM rights and added to the
+        fd set. (json-int)
+
+Example:
+
+-> { "execute": "add-fd", "arguments": { "fdset_id": 1 } }
+<- { "return": { "fdset_id": 1, "fd": 3 } }
+
+Note:  If "fdset_id" is not specified, a new fd set will be created.
+
+EQMP
+
+     {
+        .name       = "remove-fd",
+        .args_type  = "fdset_id:i,fd:i?",
+        .params     = "remove-fd fdset_id fd",
+        .help       = "Remove a file descriptor from an fd set",
+        .mhandler.cmd_new = qmp_marshal_input_remove_fd,
+    },
+
+SQMP
+remove-fd
+---------
+
+Remove a file descriptor from an fd set.
+
+Arguments:
+
+- "fdset_id": The ID of the fd set that the file descriptor belongs to.
+              (json-int)
+- "fd": The file descriptor that is to be removed. (json-int, optional)
+
+Example:
+
+-> { "execute": "remove-fd", "arguments": { "fdset_id": 1, "fd": 3 } }
+<- { "return": {} }
+
+Notes:
+
+(1) File descriptors that are removed:
+    (a) will not be closed until the reference count corresponding to fdset_id
+        reaches zero.
+    (b) will not be available for use after successful completion of the
+        remove-fd command.
+(2) If "fd" is not specified, all file descriptors in "fdset_id" will be
+    removed.
+
+EQMP
+
+    {
+        .name       = "query-fdsets",
+        .args_type  = "",
+        .help       = "Return information describing all fd sets",
+        .mhandler.cmd_new = qmp_marshal_input_query_fdsets,
+    },
+
+SQMP
+query-fdsets
+-------------
+
+Return information describing all fd sets.
+
+Arguments: None
+
+Example:
+
+-> { "execute": "query-fdsets" }
+<- { "return": [
+       {
+         "fdset_id": 1
+         "refcount": 0,
+         "in_use": true,
+         "fds": [
+           {
+             "fd": 21,
+             "removed": false
+           },
+           {
+             "fd": 23,
+             "removed": false
+           }
+         ],
+       },
+       {
+         "fdset_id": 2
+         "refcount": 0,
+         "in_use": true,
+         "fds": [
+           {
+             "fd": 22,
+             "removed": false
+           }
+         ],
+       }
+     ]
+   }
+
+Notes:
+
+(1) File descriptors are not closed until refcount is zero, and
+    either in_use is false or removed is true.
+
+EQMP
+
     {
         .name       = "block_passwd",
         .args_type  = "device:B,password:s",
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect
  2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
@ 2012-08-03 17:28 ` Corey Bryant
  2012-08-07 17:32   ` Stefan Hajnoczi
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 4/6] block: Convert open calls to qemu_open Corey Bryant
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

Each fd set has a boolean that keeps track of whether or not the
fd set is in use by a monitor connection.  When a monitor
disconnects, all fds that are members of an fd set with refcount
of zero are closed.  This prevents any fd leakage associated with
a client disconnect prior to using a passed fd.

v5:
 -This patch is new in v5.
 -This support addresses concerns from v4 regarding fd leakage
  if the client disconnects unexpectedly. (eblake@redhat.com,
  kwolf@redhat.com, dberrange@redhat.com)

v6:
 -No changes

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
 monitor.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/monitor.c b/monitor.c
index 9aa9f7e..a46ef8d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2559,6 +2559,19 @@ FdsetInfoList *qmp_query_fdsets(Error **errp)
     return fdset_list;
 }
 
+static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_t *mon_fdset_next;
+
+    QLIST_FOREACH_SAFE(mon_fdset, &mon->fdsets, next, mon_fdset_next) {
+        mon_fdset->in_use = in_use;
+        if (!in_use) {
+            monitor_fdset_cleanup(mon_fdset);
+        }
+    }
+}
+
 /* mon_cmds and info_cmds would be sorted at runtime */
 static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
@@ -4763,9 +4776,11 @@ static void monitor_control_event(void *opaque, int event)
         data = get_qmp_greeting();
         monitor_json_emitter(mon, data);
         qobject_decref(data);
+        monitor_fdsets_set_in_use(mon, true);
         break;
     case CHR_EVENT_CLOSED:
         json_message_parser_destroy(&mon->mc->parser);
+        monitor_fdsets_set_in_use(mon, false);
         break;
     }
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v6 4/6] block: Convert open calls to qemu_open
  2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
                   ` (2 preceding siblings ...)
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
@ 2012-08-03 17:28 ` Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 5/6] block: Convert close calls to qemu_close Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 6/6] block: Enable qemu_open/close to work with fd sets Corey Bryant
  5 siblings, 0 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

This patch converts all block layer open calls to qemu_open.

Note that this adds the O_CLOEXEC flag to the changed open paths
when the O_CLOEXEC macro is defined.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
v2:
 -Convert calls to qemu_open instead of file_open (kwolf@redhat.com)
 -Mention introduction of O_CLOEXEC (kwolf@redhat.com)

v3-v6:
 -No changes
---
 block/raw-posix.c |   18 +++++++++---------
 block/raw-win32.c |    4 ++--
 block/vdi.c       |    5 +++--
 block/vmdk.c      |   21 +++++++++------------
 block/vpc.c       |    2 +-
 block/vvfat.c     |    4 ++--
 6 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 0dce089..7408a42 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -572,8 +572,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
     if (fd < 0) {
         result = -errno;
     } else {
@@ -846,7 +846,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
         if ( bsdPath[ 0 ] != '\0' ) {
             strcat(bsdPath,"s0");
             /* some CDs don't have a partition 0 */
-            fd = open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
+            fd = qemu_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
             if (fd < 0) {
                 bsdPath[strlen(bsdPath)-1] = '1';
             } else {
@@ -903,7 +903,7 @@ static int fd_open(BlockDriverState *bs)
 #endif
             return -EIO;
         }
-        s->fd = open(bs->filename, s->open_flags & ~O_NONBLOCK);
+        s->fd = qemu_open(bs->filename, s->open_flags & ~O_NONBLOCK);
         if (s->fd < 0) {
             s->fd_error_time = get_clock();
             s->fd_got_error = 1;
@@ -977,7 +977,7 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_BINARY);
+    fd = qemu_open(filename, O_WRONLY | O_BINARY);
     if (fd < 0)
         return -errno;
 
@@ -1055,7 +1055,7 @@ static int floppy_probe_device(const char *filename)
     if (strstart(filename, "/dev/fd", NULL))
         prio = 50;
 
-    fd = open(filename, O_RDONLY | O_NONBLOCK);
+    fd = qemu_open(filename, O_RDONLY | O_NONBLOCK);
     if (fd < 0) {
         goto out;
     }
@@ -1108,7 +1108,7 @@ static void floppy_eject(BlockDriverState *bs, bool eject_flag)
         close(s->fd);
         s->fd = -1;
     }
-    fd = open(bs->filename, s->open_flags | O_NONBLOCK);
+    fd = qemu_open(bs->filename, s->open_flags | O_NONBLOCK);
     if (fd >= 0) {
         if (ioctl(fd, FDEJECT, 0) < 0)
             perror("FDEJECT");
@@ -1158,7 +1158,7 @@ static int cdrom_probe_device(const char *filename)
     int prio = 0;
     struct stat st;
 
-    fd = open(filename, O_RDONLY | O_NONBLOCK);
+    fd = qemu_open(filename, O_RDONLY | O_NONBLOCK);
     if (fd < 0) {
         goto out;
     }
@@ -1282,7 +1282,7 @@ static int cdrom_reopen(BlockDriverState *bs)
      */
     if (s->fd >= 0)
         close(s->fd);
-    fd = open(bs->filename, s->open_flags, 0644);
+    fd = qemu_open(bs->filename, s->open_flags, 0644);
     if (fd < 0) {
         s->fd = -1;
         return -EIO;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index e4b0b75..8d7838d 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -255,8 +255,8 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
     if (fd < 0)
         return -EIO;
     set_sparse(fd);
diff --git a/block/vdi.c b/block/vdi.c
index 57325d6..c4f1529 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -653,8 +653,9 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
         options++;
     }
 
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-              0644);
+    fd = qemu_open(filename,
+                   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                   0644);
     if (fd < 0) {
         return -errno;
     }
diff --git a/block/vmdk.c b/block/vmdk.c
index 18e9b4c..557dc1b 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1161,10 +1161,9 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
     VMDK4Header header;
     uint32_t tmp, magic, grains, gd_size, gt_size, gt_count;
 
-    fd = open(
-        filename,
-        O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-        0644);
+    fd = qemu_open(filename,
+                   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                   0644);
     if (fd < 0) {
         return -errno;
     }
@@ -1484,15 +1483,13 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
             (flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
             total_size / (int64_t)(63 * 16 * 512));
     if (split || flat) {
-        fd = open(
-                filename,
-                O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
-                0644);
+        fd = qemu_open(filename,
+                       O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
+                       0644);
     } else {
-        fd = open(
-                filename,
-                O_WRONLY | O_BINARY | O_LARGEFILE,
-                0644);
+        fd = qemu_open(filename,
+                       O_WRONLY | O_BINARY | O_LARGEFILE,
+                       0644);
     }
     if (fd < 0) {
         return -errno;
diff --git a/block/vpc.c b/block/vpc.c
index 5cd13d1..60ebf5a 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -678,7 +678,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
     }
 
     /* Create the file */
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
     if (fd < 0) {
         return -EIO;
     }
diff --git a/block/vvfat.c b/block/vvfat.c
index 7b1dcee..22b586a 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1162,7 +1162,7 @@ static int open_file(BDRVVVFATState* s,mapping_t* mapping)
     if(!s->current_mapping ||
 	    strcmp(s->current_mapping->path,mapping->path)) {
 	/* open file */
-	int fd = open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE);
+	int fd = qemu_open(mapping->path, O_RDONLY | O_BINARY | O_LARGEFILE);
 	if(fd<0)
 	    return -1;
 	vvfat_close_current_file(s);
@@ -2221,7 +2221,7 @@ static int commit_one_file(BDRVVVFATState* s,
     for (i = s->cluster_size; i < offset; i += s->cluster_size)
 	c = modified_fat_get(s, c);
 
-    fd = open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
+    fd = qemu_open(mapping->path, O_RDWR | O_CREAT | O_BINARY, 0666);
     if (fd < 0) {
 	fprintf(stderr, "Could not open %s... (%s, %d)\n", mapping->path,
 		strerror(errno), errno);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v6 5/6] block: Convert close calls to qemu_close
  2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
                   ` (3 preceding siblings ...)
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 4/6] block: Convert open calls to qemu_open Corey Bryant
@ 2012-08-03 17:28 ` Corey Bryant
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 6/6] block: Enable qemu_open/close to work with fd sets Corey Bryant
  5 siblings, 0 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

This patch converts all block layer close calls, that correspond
to qemu_open calls, to qemu_close.

v5:
 -This patch is new in v5. (kwolf@redhat.com, eblake@redhat.com)

v6:
 -No changes

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
---
 block/raw-posix.c |   24 ++++++++++++------------
 block/raw-win32.c |    2 +-
 block/vmdk.c      |    4 ++--
 block/vpc.c       |    2 +-
 block/vvfat.c     |   12 ++++++------
 osdep.c           |    5 +++++
 qemu-common.h     |    1 +
 savevm.c          |    4 ++--
 8 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7408a42..a172de3 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char *filename,
 out_free_buf:
     qemu_vfree(s->aligned_buf);
 out_close:
-    close(fd);
+    qemu_close(fd);
     return -errno;
 }
 
@@ -376,7 +376,7 @@ static void raw_close(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
     if (s->fd >= 0) {
-        close(s->fd);
+        qemu_close(s->fd);
         s->fd = -1;
         if (s->aligned_buf != NULL)
             qemu_vfree(s->aligned_buf);
@@ -580,7 +580,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
         if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
             result = -errno;
         }
-        if (close(fd) != 0) {
+        if (qemu_close(fd) != 0) {
             result = -errno;
         }
     }
@@ -850,7 +850,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags)
             if (fd < 0) {
                 bsdPath[strlen(bsdPath)-1] = '1';
             } else {
-                close(fd);
+                qemu_close(fd);
             }
             filename = bsdPath;
         }
@@ -889,7 +889,7 @@ static int fd_open(BlockDriverState *bs)
     last_media_present = (s->fd >= 0);
     if (s->fd >= 0 &&
         (get_clock() - s->fd_open_time) >= FD_OPEN_TIMEOUT) {
-        close(s->fd);
+        qemu_close(s->fd);
         s->fd = -1;
 #ifdef DEBUG_FLOPPY
         printf("Floppy closed\n");
@@ -988,7 +988,7 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options)
     else if (lseek(fd, 0, SEEK_END) < total_size * BDRV_SECTOR_SIZE)
         ret = -ENOSPC;
 
-    close(fd);
+    qemu_close(fd);
     return ret;
 }
 
@@ -1038,7 +1038,7 @@ static int floppy_open(BlockDriverState *bs, const char *filename, int flags)
         return ret;
 
     /* close fd so that we can reopen it as needed */
-    close(s->fd);
+    qemu_close(s->fd);
     s->fd = -1;
     s->fd_media_changed = 1;
 
@@ -1070,7 +1070,7 @@ static int floppy_probe_device(const char *filename)
         prio = 100;
 
 outc:
-    close(fd);
+    qemu_close(fd);
 out:
     return prio;
 }
@@ -1105,14 +1105,14 @@ static void floppy_eject(BlockDriverState *bs, bool eject_flag)
     int fd;
 
     if (s->fd >= 0) {
-        close(s->fd);
+        qemu_close(s->fd);
         s->fd = -1;
     }
     fd = qemu_open(bs->filename, s->open_flags | O_NONBLOCK);
     if (fd >= 0) {
         if (ioctl(fd, FDEJECT, 0) < 0)
             perror("FDEJECT");
-        close(fd);
+        qemu_close(fd);
     }
 }
 
@@ -1173,7 +1173,7 @@ static int cdrom_probe_device(const char *filename)
         prio = 100;
 
 outc:
-    close(fd);
+    qemu_close(fd);
 out:
     return prio;
 }
@@ -1281,7 +1281,7 @@ static int cdrom_reopen(BlockDriverState *bs)
      * FreeBSD seems to not notice sometimes...
      */
     if (s->fd >= 0)
-        close(s->fd);
+        qemu_close(s->fd);
     fd = qemu_open(bs->filename, s->open_flags, 0644);
     if (fd < 0) {
         s->fd = -1;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 8d7838d..c56bf83 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -261,7 +261,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
         return -EIO;
     set_sparse(fd);
     ftruncate(fd, total_size * 512);
-    close(fd);
+    qemu_close(fd);
     return 0;
 }
 
diff --git a/block/vmdk.c b/block/vmdk.c
index 557dc1b..daee426 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1258,7 +1258,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
 
     ret = 0;
  exit:
-    close(fd);
+    qemu_close(fd);
     return ret;
 }
 
@@ -1506,7 +1506,7 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
     }
     ret = 0;
 exit:
-    close(fd);
+    qemu_close(fd);
     return ret;
 }
 
diff --git a/block/vpc.c b/block/vpc.c
index 60ebf5a..c0b82c4 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -744,7 +744,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
     }
 
  fail:
-    close(fd);
+    qemu_close(fd);
     return ret;
 }
 
diff --git a/block/vvfat.c b/block/vvfat.c
index 22b586a..59d3c5b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1105,7 +1105,7 @@ static inline void vvfat_close_current_file(BDRVVVFATState *s)
     if(s->current_mapping) {
 	s->current_mapping = NULL;
 	if (s->current_fd) {
-		close(s->current_fd);
+		qemu_close(s->current_fd);
 		s->current_fd = 0;
 	}
     }
@@ -2230,7 +2230,7 @@ static int commit_one_file(BDRVVVFATState* s,
     }
     if (offset > 0) {
         if (lseek(fd, offset, SEEK_SET) != offset) {
-            close(fd);
+            qemu_close(fd);
             g_free(cluster);
             return -3;
         }
@@ -2251,13 +2251,13 @@ static int commit_one_file(BDRVVVFATState* s,
 	    (uint8_t*)cluster, (rest_size + 0x1ff) / 0x200);
 
         if (ret < 0) {
-            close(fd);
+            qemu_close(fd);
             g_free(cluster);
             return ret;
         }
 
         if (write(fd, cluster, rest_size) < 0) {
-            close(fd);
+            qemu_close(fd);
             g_free(cluster);
             return -2;
         }
@@ -2268,11 +2268,11 @@ static int commit_one_file(BDRVVVFATState* s,
 
     if (ftruncate(fd, size)) {
         perror("ftruncate()");
-        close(fd);
+        qemu_close(fd);
         g_free(cluster);
         return -4;
     }
-    close(fd);
+    qemu_close(fd);
     g_free(cluster);
 
     return commit_mappings(s, first_cluster, dir_index);
diff --git a/osdep.c b/osdep.c
index 03817f0..7b09dff 100644
--- a/osdep.c
+++ b/osdep.c
@@ -104,6 +104,11 @@ int qemu_open(const char *name, int flags, ...)
     return ret;
 }
 
+int qemu_close(int fd)
+{
+    return close(fd);
+}
+
 /*
  * A variant of write(2) which handles partial write.
  *
diff --git a/qemu-common.h b/qemu-common.h
index d26ff39..8c4638b 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -189,6 +189,7 @@ const char *path(const char *pathname);
 void *qemu_oom_check(void *ptr);
 
 int qemu_open(const char *name, int flags, ...);
+int qemu_close(int fd);
 ssize_t qemu_write_full(int fd, const void *buf, size_t count)
     QEMU_WARN_UNUSED_RESULT;
 ssize_t qemu_send_full(int fd, const void *buf, size_t count, int flags)
diff --git a/savevm.c b/savevm.c
index 6e82b2d..8ecd5d2 100644
--- a/savevm.c
+++ b/savevm.c
@@ -513,7 +513,7 @@ static void qemu_fill_buffer(QEMUFile *f)
  *
  * Returns f->close() return value, or 0 if close function is not set.
  */
-static int qemu_close(QEMUFile *f)
+static int _qemu_fclose(QEMUFile *f)
 {
     int ret = 0;
     if (f->close) {
@@ -535,7 +535,7 @@ int qemu_fclose(QEMUFile *f)
 {
     int ret;
     qemu_fflush(f);
-    ret = qemu_close(f);
+    ret = _qemu_fclose(f);
     /* If any error was spotted before closing, we should report it
      * instead of the close() return value.
      */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v6 6/6] block: Enable qemu_open/close to work with fd sets
  2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
                   ` (4 preceding siblings ...)
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 5/6] block: Convert close calls to qemu_close Corey Bryant
@ 2012-08-03 17:28 ` Corey Bryant
  5 siblings, 0 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:28 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, aliguori, stefanha, libvir-list, Corey Bryant,
	lcapitulino, eblake

When qemu_open is passed a filename of the "/dev/fdset/nnn"
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>

v2:
 -Get rid of file_open and move dup code to qemu_open
  (kwolf@redhat.com)
 -Use strtol wrapper instead of atoi (kwolf@redhat.com)

v3:
 -Add note about fd leakage (eblake@redhat.com)

v4
 -Moved patch to be later in series (lcapitulino@redhat.com)
 -Update qemu_open to check access mode flags and set flags that
  can be set (eblake@redhat.com, kwolf@redhat.com)

v5:
 -This patch was overhauled quite a bit in this version, with
  the addition of fd set and refcount support.
 -Use qemu_set_cloexec() on dup'd fd (eblake@redhat.com)
 -Modify flags set by fcntl on dup'd fd (eblake@redhat.com)
 -Reduce syscalls when setting flags for dup'd fd (eblake@redhat.com)
 -Fix O_RDWR, O_RDONLY, O_WRONLY checks (eblake@redhat.com)

v6:
 -Pass only the fd to qemu_close() and keep track of dup fds per fd
  set. (kwolf@redhat.com, eblake@redhat.com)
 -Handle refcount incr/decr in new dup_fd_add/remove fd functions.
 -Use qemu_set_cloexec() appropriately in qemu_dup() (kwolf@redhat.com)
 -Simplify setting of setfl_flags in qemu_dup() (kwolf@redhat.com)
 -Add preprocessor checks for F_DUPFD_CLOEXEC (eblake@redhat.com)
 -Simplify flag checking in monitor_fdset_get_fd() (kwolf@redhat.com)
---
 cutils.c      |    5 +++
 monitor.c     |  100 +++++++++++++++++++++++++++++++++++++++++++++++++++
 monitor.h     |    5 +++
 osdep.c       |  112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-common.h |    1 +
 qemu-tool.c   |   21 +++++++++++
 6 files changed, 244 insertions(+)

diff --git a/cutils.c b/cutils.c
index 9d4c570..8b0d2bb 100644
--- a/cutils.c
+++ b/cutils.c
@@ -382,3 +382,8 @@ int qemu_parse_fd(const char *param)
     }
     return fd;
 }
+
+int qemu_parse_fdset(const char *param)
+{
+    return qemu_parse_fd(param);
+}
diff --git a/monitor.c b/monitor.c
index a46ef8d..66b863f 100644
--- a/monitor.c
+++ b/monitor.c
@@ -155,6 +155,7 @@ struct mon_fdset_t {
     int refcount;
     bool in_use;
     QLIST_HEAD(, mon_fdset_fd_t) fds;
+    QLIST_HEAD(, mon_fdset_fd_t) dup_fds;
     QLIST_ENTRY(mon_fdset_t) next;
 };
 
@@ -2572,6 +2573,105 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use)
     }
 }
 
+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd;
+    int mon_fd_flags;
+
+    if (!mon) {
+        errno = ENOENT;
+        return -1;
+    }
+
+    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
+            if (mon_fdset_fd->removed) {
+                continue;
+            }
+
+            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
+            if (mon_fd_flags == -1) {
+                return -1;
+            }
+
+            if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
+                return mon_fdset_fd->fd;
+            }
+        }
+        errno = EACCES;
+        return -1;
+    }
+    errno = ENOENT;
+    return -1;
+}
+
+int monitor_fdset_dup_fd_add(Monitor *mon, int64_t fdset_id, int dup_fd)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd_dup;
+
+    if (!mon) {
+        return -1;
+    }
+
+    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
+            if (mon_fdset_fd_dup->fd == dup_fd) {
+                return -1;
+            }
+        }
+        mon_fdset_fd_dup = g_malloc0(sizeof(*mon_fdset_fd_dup));
+        mon_fdset_fd_dup->fd = dup_fd;
+        QLIST_INSERT_HEAD(&mon_fdset->dup_fds, mon_fdset_fd_dup, next);
+        mon_fdset->refcount++;
+        return 0;
+    }
+    return -1;
+}
+
+static int _monitor_fdset_dup_fd_find(Monitor *mon, int dup_fd, bool remove)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd_dup;
+
+    if (!mon) {
+        return -1;
+    }
+
+    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+        QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
+            if (mon_fdset_fd_dup->fd == dup_fd) {
+                if (remove) {
+                    QLIST_REMOVE(mon_fdset_fd_dup, next);
+                    mon_fdset->refcount--;
+                    if (mon_fdset->refcount == 0) {
+                        monitor_fdset_cleanup(mon_fdset);
+                    }
+                }
+                return mon_fdset->id;
+            }
+        }
+    }
+    return -1;
+}
+
+int monitor_fdset_dup_fd_find(Monitor *mon, int dup_fd)
+{
+    return _monitor_fdset_dup_fd_find(mon, dup_fd, false);
+}
+
+int monitor_fdset_dup_fd_remove(Monitor *mon, int dup_fd)
+{
+    return _monitor_fdset_dup_fd_find(mon, dup_fd, true);
+}
+
 /* mon_cmds and info_cmds would be sorted at runtime */
 static mon_cmd_t mon_cmds[] = {
 #include "hmp-commands.h"
diff --git a/monitor.h b/monitor.h
index 5f4de1b..afab88a 100644
--- a/monitor.h
+++ b/monitor.h
@@ -86,4 +86,9 @@ int qmp_qom_set(Monitor *mon, const QDict *qdict, QObject **ret);
 
 int qmp_qom_get(Monitor *mon, const QDict *qdict, QObject **ret);
 
+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags);
+int monitor_fdset_dup_fd_add(Monitor *mon, int64_t fdset_id, int dup_fd);
+int monitor_fdset_dup_fd_remove(Monitor *mon, int dup_fd);
+int monitor_fdset_dup_fd_find(Monitor *mon, int dup_fd);
+
 #endif /* !MONITOR_H */
diff --git a/osdep.c b/osdep.c
index 7b09dff..4a4e7e8 100644
--- a/osdep.c
+++ b/osdep.c
@@ -47,6 +47,7 @@ extern int madvise(caddr_t, size_t, int);
 #include "qemu-common.h"
 #include "trace.h"
 #include "qemu_socket.h"
+#include "monitor.h"
 
 static const char *qemu_version = QEMU_VERSION;
 
@@ -75,6 +76,69 @@ int qemu_madvise(void *addr, size_t len, int advice)
 #endif
 }
 
+/*
+ * Dups an fd and sets the flags
+ */
+static int qemu_dup(int fd, int flags)
+{
+    int ret;
+    int serrno;
+    int dup_flags;
+    int setfl_flags;
+
+    if (flags & O_CLOEXEC) {
+#ifdef F_DUPFD_CLOEXEC
+        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
+#else
+        ret = dup(fd);
+        if (ret != -1) {
+            qemu_set_cloexec(ret);
+        }
+#endif
+    } else {
+        ret = dup(fd);
+    }
+
+    if (ret == -1) {
+        goto fail;
+    }
+
+    dup_flags = fcntl(ret, F_GETFL);
+    if (dup_flags == -1) {
+        goto fail;
+    }
+
+    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
+        errno = EINVAL;
+        goto fail;
+    }
+
+    /* Set/unset flags that we can with fcntl */
+    setfl_flags = O_APPEND | O_ASYNC | O_DIRECT | O_NOATIME | O_NONBLOCK;
+    dup_flags &= ~setfl_flags;
+    dup_flags |= (flags & setfl_flags);
+    if (fcntl(ret, F_SETFL, dup_flags) == -1) {
+        goto fail;
+    }
+
+    /* Truncate the file in the cases that open() would truncate it */
+    if (flags & O_TRUNC ||
+            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
+        if (ftruncate(ret, 0) == -1) {
+            goto fail;
+        }
+    }
+
+    return ret;
+
+fail:
+    serrno = errno;
+    if (ret != -1) {
+        close(ret);
+    }
+    errno = serrno;
+    return -1;
+}
 
 /*
  * Opens a file with FD_CLOEXEC set
@@ -84,6 +148,39 @@ int qemu_open(const char *name, int flags, ...)
     int ret;
     int mode = 0;
 
+#ifndef _WIN32
+    const char *fdset_id_str;
+
+    /* Attempt dup of fd from fd set */
+    if (strstart(name, "/dev/fdset/", &fdset_id_str)) {
+        int64_t fdset_id;
+        int fd, dupfd;
+
+        fdset_id = qemu_parse_fdset(fdset_id_str);
+        if (fdset_id == -1) {
+            errno = EINVAL;
+            return -1;
+        }
+
+        fd = monitor_fdset_get_fd(default_mon, fdset_id, flags);
+        if (fd == -1) {
+            return -1;
+        }
+
+        dupfd = qemu_dup(fd, flags);
+        if (fd == -1) {
+            return -1;
+        }
+
+        ret = monitor_fdset_dup_fd_add(default_mon, fdset_id, dupfd);
+        if (ret == -1) {
+            return -1;
+        }
+
+        return dupfd;
+    }
+#endif
+
     if (flags & O_CREAT) {
         va_list ap;
 
@@ -106,6 +203,21 @@ int qemu_open(const char *name, int flags, ...)
 
 int qemu_close(int fd)
 {
+    int64_t fdset_id;
+
+    /* Close fd that was dup'd from an fdset */
+    fdset_id = monitor_fdset_dup_fd_find(default_mon, fd);
+    if (fdset_id != -1) {
+        int ret;
+
+        ret = close(fd);
+        if (ret == 0) {
+            monitor_fdset_dup_fd_remove(default_mon, fd);
+        }
+
+        return ret;
+    }
+
     return close(fd);
 }
 
diff --git a/qemu-common.h b/qemu-common.h
index 8c4638b..ee114aa 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -148,6 +148,7 @@ int qemu_fls(int i);
 int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
 int qemu_parse_fd(const char *param);
+int qemu_parse_fdset(const char *param);
 
 /*
  * strtosz() suffixes used to specify the default treatment of an
diff --git a/qemu-tool.c b/qemu-tool.c
index 318c5fc..269f3b7 100644
--- a/qemu-tool.c
+++ b/qemu-tool.c
@@ -31,6 +31,7 @@ struct QEMUBH
 };
 
 Monitor *cur_mon;
+Monitor *default_mon;
 
 int monitor_cur_is_qmp(void)
 {
@@ -57,6 +58,26 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
 {
 }
 
+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+    return -1;
+}
+
+int monitor_fdset_dup_fd_add(Monitor *mon, int64_t fdset_id, int dup_fd)
+{
+    return -1;
+}
+
+int monitor_fdset_dup_fd_remove(Monitor *mon, int dup_fd)
+{
+    return -1;
+}
+
+int monitor_fdset_dup_fd_find(Monitor *mon, int dup_fd)
+{
+    return -1;
+}
+
 int64_t cpu_get_clock(void)
 {
     return qemu_get_clock_ns(rt_clock);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
@ 2012-08-03 17:33   ` Corey Bryant
  0 siblings, 0 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-03 17:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, aliguori, stefanha, libvir-list, lcapitulino, eblake


If these patches are acceptable, I'll resend and get the version history 
out of the commit message.

-- 
Regards,
Corey

On 08/03/2012 01:28 PM, Corey Bryant wrote:
> Set the close-on-exec flag for the file descriptor received
> via SCM_RIGHTS.
>
> v4
>   -This patch is new in v4 (eblake@redhat.com)
>
> v5
>   -Fallback to FD_CLOEXEC if MSG_CMSG_CLOEXEC is not available
>    (eblake@redhat.com, stefanha@linux.vnet.ibm.com)
>
> v6
>   -Set cloexec on correct fd (eblake@redhat.com)
>
> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
> ---
>   qemu-char.c |   12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/qemu-char.c b/qemu-char.c
> index c2aaaee..ab4a928 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -2238,6 +2238,9 @@ static void unix_process_msgfd(CharDriverState *chr, struct msghdr *msg)
>           if (fd < 0)
>               continue;
>
> +#ifndef MSG_CMSG_CLOEXEC
> +        qemu_set_cloexec(fd);
> +#endif
>           if (s->msgfd != -1)
>               close(s->msgfd);
>           s->msgfd = fd;
> @@ -2253,6 +2256,7 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
>           struct cmsghdr cmsg;
>           char control[CMSG_SPACE(sizeof(int))];
>       } msg_control;
> +    int flags = 0;
>       ssize_t ret;
>
>       iov[0].iov_base = buf;
> @@ -2263,9 +2267,13 @@ static ssize_t tcp_chr_recv(CharDriverState *chr, char *buf, size_t len)
>       msg.msg_control = &msg_control;
>       msg.msg_controllen = sizeof(msg_control);
>
> -    ret = recvmsg(s->fd, &msg, 0);
> -    if (ret > 0 && s->is_unix)
> +#ifdef MSG_CMSG_CLOEXEC
> +    flags |= MSG_CMSG_CLOEXEC;
> +#endif
> +    ret = recvmsg(s->fd, &msg, flags);
> +    if (ret > 0 && s->is_unix) {
>           unix_process_msgfd(chr, &msg);
> +    }
>
>       return ret;
>   }
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
@ 2012-08-07 17:32   ` Stefan Hajnoczi
  2012-08-07 19:36     ` Corey Bryant
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-08-07 17:32 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, eblake

On Fri, Aug 03, 2012 at 01:28:06PM -0400, Corey Bryant wrote:
> Each fd set has a boolean that keeps track of whether or not the
> fd set is in use by a monitor connection.  When a monitor
> disconnects, all fds that are members of an fd set with refcount
> of zero are closed.  This prevents any fd leakage associated with
> a client disconnect prior to using a passed fd.
> 
> v5:
>  -This patch is new in v5.
>  -This support addresses concerns from v4 regarding fd leakage
>   if the client disconnects unexpectedly. (eblake@redhat.com,
>   kwolf@redhat.com, dberrange@redhat.com)
> 
> v6:
>  -No changes
> 
> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
> ---
>  monitor.c |   15 +++++++++++++++
>  1 file changed, 15 insertions(+)

The lifecycle of fdsets and fds isn't clear to me.  It seems like just a
refcount in fdset should handle this without extra fields like in_use.

Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets
  2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
@ 2012-08-07 18:16   ` Stefan Hajnoczi
  2012-08-07 19:59     ` Corey Bryant
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-08-07 18:16 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, eblake

On Fri, Aug 3, 2012 at 6:28 PM, Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
> diff --git a/monitor.c b/monitor.c
> index 49dccfe..9aa9f7e 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -140,6 +140,24 @@ struct mon_fd_t {
>      QLIST_ENTRY(mon_fd_t) next;
>  };
>
> +/* file descriptor associated with a file descriptor set */
> +typedef struct mon_fdset_fd_t mon_fdset_fd_t;
> +struct mon_fdset_fd_t {

QEMU coding style is:

typedef struct MonFdsetFd MonFdsetFd;
struct MonFdsetFd {

See ./CODING_STYLE for more info.

> +    int fd;
> +    bool removed;
> +    QLIST_ENTRY(mon_fdset_fd_t) next;
> +};
> +
> +/* file descriptor set containing fds passed via SCM_RIGHTS */
> +typedef struct mon_fdset_t mon_fdset_t;
> +struct mon_fdset_t {
> +    int64_t id;
> +    int refcount;
> +    bool in_use;
> +    QLIST_HEAD(, mon_fdset_fd_t) fds;
> +    QLIST_ENTRY(mon_fdset_t) next;
> +};

At this point in the patch series it's not clear to me whether the
removed and refcount/in_use fields are a clean and correct solution.
Exposing these fields via QMP is also something I'm going to carefully
review because they seem like internals.

> +
>  typedef struct MonitorControl {
>      QObject *id;
>      JSONMessageParser parser;
> @@ -176,7 +194,8 @@ struct Monitor {
>      int print_calls_nr;
>  #endif
>      QError *error;
> -    QLIST_HEAD(,mon_fd_t) fds;
> +    QLIST_HEAD(, mon_fd_t) fds;
> +    QLIST_HEAD(, mon_fdset_t) fdsets;
>      QLIST_ENTRY(Monitor) entry;
>  };
>
> @@ -2389,6 +2408,157 @@ int monitor_get_fd(Monitor *mon, const char *fdname)
>      return -1;
>  }
>
> +static void monitor_fdset_cleanup(mon_fdset_t *mon_fdset)
> +{
> +    mon_fdset_fd_t *mon_fdset_fd;
> +    mon_fdset_fd_t *mon_fdset_fd_next;
> +
> +    if (mon_fdset->refcount != 0) {
> +        return;
> +    }
> +
> +    QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, mon_fdset_fd_next) {
> +        if (!mon_fdset->in_use || mon_fdset_fd->removed) {
> +            close(mon_fdset_fd->fd);
> +            QLIST_REMOVE(mon_fdset_fd, next);
> +            g_free(mon_fdset_fd);
> +        }
> +    }
> +
> +    if (QLIST_EMPTY(&mon_fdset->fds)) {
> +        QLIST_REMOVE(mon_fdset, next);
> +        g_free(mon_fdset);
> +    }
> +}
> +
> +AddfdInfo *qmp_add_fd(bool has_fdset_id, int64_t fdset_id, Error **errp)
> +{
> +    int fd;
> +    Monitor *mon = cur_mon;
> +    mon_fdset_t *mon_fdset;
> +    mon_fdset_fd_t *mon_fdset_fd;
> +    AddfdInfo *fdinfo;
> +
> +    fd = qemu_chr_fe_get_msgfd(mon->chr);
> +    if (fd == -1) {
> +        qerror_report(QERR_FD_NOT_SUPPLIED);
> +        return NULL;
> +    }
> +
> +    if (has_fdset_id) {
> +        QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> +            if (mon_fdset->id == fdset_id) {
> +                break;
> +            }
> +        }
> +        if (mon_fdset == NULL) {
> +            qerror_report(QERR_FDSET_NOT_FOUND, fdset_id);
> +            return NULL;

fd is leaked?

> +        }
> +    } else {
> +        int64_t fdset_id_prev = -1;
> +        mon_fdset_t *mon_fdset_cur = QLIST_FIRST(&mon->fdsets);
> +
> +        /* Use first available fdset ID */
> +        QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> +            mon_fdset_cur = mon_fdset;
> +            if (fdset_id_prev == mon_fdset_cur->id - 1) {
> +                fdset_id_prev = mon_fdset_cur->id;
> +                continue;
> +            }
> +            break;
> +        }
> +
> +        mon_fdset = g_malloc0(sizeof(*mon_fdset));
> +        mon_fdset->id = fdset_id_prev + 1;
> +        mon_fdset->refcount = 0;
> +        mon_fdset->in_use = true;
> +
> +        /* The fdset list is ordered by fdset ID */
> +        if (mon_fdset->id == 0) {
> +            QLIST_INSERT_HEAD(&mon->fdsets, mon_fdset, next);
> +        } else if (mon_fdset->id < mon_fdset_cur->id) {
> +            QLIST_INSERT_BEFORE(mon_fdset_cur, mon_fdset, next);
> +        } else {
> +            QLIST_INSERT_AFTER(mon_fdset_cur, mon_fdset, next);
> +        }
> +    }
> +
> +    mon_fdset_fd = g_malloc0(sizeof(*mon_fdset_fd));
> +    mon_fdset_fd->fd = fd;
> +    mon_fdset_fd->removed = false;
> +    QLIST_INSERT_HEAD(&mon_fdset->fds, mon_fdset_fd, next);
> +
> +    fdinfo = g_malloc0(sizeof(*fdinfo));
> +    fdinfo->fdset_id = mon_fdset->id;
> +    fdinfo->fd = mon_fdset_fd->fd;
> +
> +    return fdinfo;
> +}
> +
> +void qmp_remove_fd(int64_t fdset_id, bool has_fd, int64_t fd, Error **errp)
> +{
> +    Monitor *mon = cur_mon;
> +    mon_fdset_t *mon_fdset;
> +    mon_fdset_fd_t *mon_fdset_fd;
> +    char fd_str[20];
> +
> +    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
> +        if (mon_fdset->id != fdset_id) {
> +            continue;
> +        }
> +        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
> +            if (has_fd && mon_fdset_fd->fd != fd) {
> +                continue;
> +            }
> +            mon_fdset_fd->removed = true;
> +            if (has_fd) {
> +                break;
> +            }
> +        }
> +        monitor_fdset_cleanup(mon_fdset);
> +        return;
> +    }
> +    snprintf(fd_str, sizeof(fd_str), "%ld", fd);
> +    qerror_report(QERR_FD_NOT_FOUND, fd_str);

Why use an fd_str instead of passing an int64_t into the error
message?  This also assumed sizeof(long) == 8, which isn't true on
32-bit hosts, so %ld should be %"PRId64".

There is a new policy on error reporting.  I think this patch series
may be affected/conflict, please qemu-devel to check.  I think Luiz
can also help here.

Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect
  2012-08-07 17:32   ` Stefan Hajnoczi
@ 2012-08-07 19:36     ` Corey Bryant
  0 siblings, 0 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-07 19:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, eblake



On 08/07/2012 01:32 PM, Stefan Hajnoczi wrote:
> On Fri, Aug 03, 2012 at 01:28:06PM -0400, Corey Bryant wrote:
>> Each fd set has a boolean that keeps track of whether or not the
>> fd set is in use by a monitor connection.  When a monitor
>> disconnects, all fds that are members of an fd set with refcount
>> of zero are closed.  This prevents any fd leakage associated with
>> a client disconnect prior to using a passed fd.
>>
>> v5:
>>   -This patch is new in v5.
>>   -This support addresses concerns from v4 regarding fd leakage
>>    if the client disconnects unexpectedly. (eblake@redhat.com,
>>    kwolf@redhat.com, dberrange@redhat.com)
>>
>> v6:
>>   -No changes
>>
>> Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
>> ---
>>   monitor.c |   15 +++++++++++++++
>>   1 file changed, 15 insertions(+)
>
> The lifecycle of fdsets and fds isn't clear to me.  It seems like just a
> refcount in fdset should handle this without extra fields like in_use.

The lifecycle of fdsets and fds starts with add-fd.

I'll explain the lifecycle end of fdsets and fds below.  To follow along 
with the code, this cleanup occurs in monitor_fdset_cleanup().

Fds are closed and removed from an fdset when there are no more open 
dup() fds (refcount == 0) for the fd set, and there are either no 
monitor connections (!in-use) or the fd has been removed with remove-fd. 
  In other words fds get cleaned up when:

(refcount == 0 && (!in-use || removed))

Let me explain each variable:

(1) refcount is incremented when qemu_open() dup()s an fd from an fd set 
and is decremented when qemu_close() closes a dup()d fd.  We don't want 
to close any fds in an fd set if refcount > 0, because this file could 
be reopened with different access mode flags, which would require dup() 
of another fd from the fdset.

(2) in-use is used to prevent fd leakage if a monitor disconnects and 
abandons fds. If libvirt adds fds and then disconnects without issuing a 
command that references the fds, then refcount will be zero, and in-use 
will be false, and the fds will be closed and removed from the fd set. 
When the monitor connection is restored, the query-fdsets command can be 
used to see what fd sets and fds are available.

(3) If the remove-fd command is issued, the fd is marked for removal. 
It won't be closed until there are no more outstanding dup() references 
on the fd set, for similar reasons to why we don't close the fd in (1).

fdsets simply get cleaned up once all fds from the set have been closed 
and removed.

Hopefully this clears things up a bit more.

Please also take a look at the v7 series that I sent out today.  Fd sets 
are now stored globally, rather than one per Monitor object.  This 
simplifies things a bit.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets
  2012-08-07 18:16   ` Stefan Hajnoczi
@ 2012-08-07 19:59     ` Corey Bryant
  2012-08-08  8:52       ` Stefan Hajnoczi
  0 siblings, 1 reply; 14+ messages in thread
From: Corey Bryant @ 2012-08-07 19:59 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, eblake



On 08/07/2012 02:16 PM, Stefan Hajnoczi wrote:
> On Fri, Aug 3, 2012 at 6:28 PM, Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>> diff --git a/monitor.c b/monitor.c
>> index 49dccfe..9aa9f7e 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -140,6 +140,24 @@ struct mon_fd_t {
>>       QLIST_ENTRY(mon_fd_t) next;
>>   };
>>
>> +/* file descriptor associated with a file descriptor set */
>> +typedef struct mon_fdset_fd_t mon_fdset_fd_t;
>> +struct mon_fdset_fd_t {
>
> QEMU coding style is:
>
> typedef struct MonFdsetFd MonFdsetFd;
> struct MonFdsetFd {
>
> See ./CODING_STYLE for more info.
>

Thanks, I'll fix that.

>> +    int fd;
>> +    bool removed;
>> +    QLIST_ENTRY(mon_fdset_fd_t) next;
>> +};
>> +
>> +/* file descriptor set containing fds passed via SCM_RIGHTS */
>> +typedef struct mon_fdset_t mon_fdset_t;
>> +struct mon_fdset_t {
>> +    int64_t id;
>> +    int refcount;
>> +    bool in_use;
>> +    QLIST_HEAD(, mon_fdset_fd_t) fds;
>> +    QLIST_ENTRY(mon_fdset_t) next;
>> +};
>
> At this point in the patch series it's not clear to me whether the
> removed and refcount/in_use fields are a clean and correct solution.
> Exposing these fields via QMP is also something I'm going to carefully
> review because they seem like internals.
>

Yes, please review the v7 patches and let me know what you think.  I 
explained the purpose of these fields in the previous email I just sent 
you, so I won't go into their details again here.  But I will point out 
that refcount/in-use came about after concern of fd leakage if libvirt's 
monitor connection disconnects.  query-fdsets allows the client to 
determine the state of the fd sets after reconnecting.

>> +
>>   typedef struct MonitorControl {
>>       QObject *id;
>>       JSONMessageParser parser;
>> @@ -176,7 +194,8 @@ struct Monitor {
>>       int print_calls_nr;
>>   #endif
>>       QError *error;
>> -    QLIST_HEAD(,mon_fd_t) fds;
>> +    QLIST_HEAD(, mon_fd_t) fds;
>> +    QLIST_HEAD(, mon_fdset_t) fdsets;
>>       QLIST_ENTRY(Monitor) entry;
>>   };
>>
>> @@ -2389,6 +2408,157 @@ int monitor_get_fd(Monitor *mon, const char *fdname)
>>       return -1;
>>   }
>>
>> +static void monitor_fdset_cleanup(mon_fdset_t *mon_fdset)
>> +{
>> +    mon_fdset_fd_t *mon_fdset_fd;
>> +    mon_fdset_fd_t *mon_fdset_fd_next;
>> +
>> +    if (mon_fdset->refcount != 0) {
>> +        return;
>> +    }
>> +
>> +    QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, mon_fdset_fd_next) {
>> +        if (!mon_fdset->in_use || mon_fdset_fd->removed) {
>> +            close(mon_fdset_fd->fd);
>> +            QLIST_REMOVE(mon_fdset_fd, next);
>> +            g_free(mon_fdset_fd);
>> +        }
>> +    }
>> +
>> +    if (QLIST_EMPTY(&mon_fdset->fds)) {
>> +        QLIST_REMOVE(mon_fdset, next);
>> +        g_free(mon_fdset);
>> +    }
>> +}
>> +
>> +AddfdInfo *qmp_add_fd(bool has_fdset_id, int64_t fdset_id, Error **errp)
>> +{
>> +    int fd;
>> +    Monitor *mon = cur_mon;
>> +    mon_fdset_t *mon_fdset;
>> +    mon_fdset_fd_t *mon_fdset_fd;
>> +    AddfdInfo *fdinfo;
>> +
>> +    fd = qemu_chr_fe_get_msgfd(mon->chr);
>> +    if (fd == -1) {
>> +        qerror_report(QERR_FD_NOT_SUPPLIED);
>> +        return NULL;
>> +    }
>> +
>> +    if (has_fdset_id) {
>> +        QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
>> +            if (mon_fdset->id == fdset_id) {
>> +                break;
>> +            }
>> +        }
>> +        if (mon_fdset == NULL) {
>> +            qerror_report(QERR_FDSET_NOT_FOUND, fdset_id);
>> +            return NULL;
>
> fd is leaked?
>

Yes, it looks like it is.  I'll fix that.

>> +        }
>> +    } else {
>> +        int64_t fdset_id_prev = -1;
>> +        mon_fdset_t *mon_fdset_cur = QLIST_FIRST(&mon->fdsets);
>> +
>> +        /* Use first available fdset ID */
>> +        QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
>> +            mon_fdset_cur = mon_fdset;
>> +            if (fdset_id_prev == mon_fdset_cur->id - 1) {
>> +                fdset_id_prev = mon_fdset_cur->id;
>> +                continue;
>> +            }
>> +            break;
>> +        }
>> +
>> +        mon_fdset = g_malloc0(sizeof(*mon_fdset));
>> +        mon_fdset->id = fdset_id_prev + 1;
>> +        mon_fdset->refcount = 0;
>> +        mon_fdset->in_use = true;
>> +
>> +        /* The fdset list is ordered by fdset ID */
>> +        if (mon_fdset->id == 0) {
>> +            QLIST_INSERT_HEAD(&mon->fdsets, mon_fdset, next);
>> +        } else if (mon_fdset->id < mon_fdset_cur->id) {
>> +            QLIST_INSERT_BEFORE(mon_fdset_cur, mon_fdset, next);
>> +        } else {
>> +            QLIST_INSERT_AFTER(mon_fdset_cur, mon_fdset, next);
>> +        }
>> +    }
>> +
>> +    mon_fdset_fd = g_malloc0(sizeof(*mon_fdset_fd));
>> +    mon_fdset_fd->fd = fd;
>> +    mon_fdset_fd->removed = false;
>> +    QLIST_INSERT_HEAD(&mon_fdset->fds, mon_fdset_fd, next);
>> +
>> +    fdinfo = g_malloc0(sizeof(*fdinfo));
>> +    fdinfo->fdset_id = mon_fdset->id;
>> +    fdinfo->fd = mon_fdset_fd->fd;
>> +
>> +    return fdinfo;
>> +}
>> +
>> +void qmp_remove_fd(int64_t fdset_id, bool has_fd, int64_t fd, Error **errp)
>> +{
>> +    Monitor *mon = cur_mon;
>> +    mon_fdset_t *mon_fdset;
>> +    mon_fdset_fd_t *mon_fdset_fd;
>> +    char fd_str[20];
>> +
>> +    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
>> +        if (mon_fdset->id != fdset_id) {
>> +            continue;
>> +        }
>> +        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
>> +            if (has_fd && mon_fdset_fd->fd != fd) {
>> +                continue;
>> +            }
>> +            mon_fdset_fd->removed = true;
>> +            if (has_fd) {
>> +                break;
>> +            }
>> +        }
>> +        monitor_fdset_cleanup(mon_fdset);
>> +        return;
>> +    }
>> +    snprintf(fd_str, sizeof(fd_str), "%ld", fd);
>> +    qerror_report(QERR_FD_NOT_FOUND, fd_str);
>
> Why use an fd_str instead of passing an int64_t into the error
> message?  This also assumed sizeof(long) == 8, which isn't true on
> 32-bit hosts, so %ld should be %"PRId64".

Can I pass an int64_t into the message if it takes a string?

I thought int64_t was a long long in 32-bit mode, but perhaps that's not 
always the case?

>
> There is a new policy on error reporting.  I think this patch series
> may be affected/conflict, please qemu-devel to check.  I think Luiz
> can also help here.

Ok thanks, I'll take a look at qemu-devel.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets
  2012-08-07 19:59     ` Corey Bryant
@ 2012-08-08  8:52       ` Stefan Hajnoczi
  2012-08-08 13:40         ` Corey Bryant
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-08-08  8:52 UTC (permalink / raw)
  To: Corey Bryant
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, eblake

On Tue, Aug 7, 2012 at 8:59 PM, Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>
>
> On 08/07/2012 02:16 PM, Stefan Hajnoczi wrote:
>>
>> On Fri, Aug 3, 2012 at 6:28 PM, Corey Bryant <coreyb@linux.vnet.ibm.com>
>> wrote:
>>> +    snprintf(fd_str, sizeof(fd_str), "%ld", fd);
>>> +    qerror_report(QERR_FD_NOT_FOUND, fd_str);
>>
>>
>> Why use an fd_str instead of passing an int64_t into the error
>> message?  This also assumed sizeof(long) == 8, which isn't true on
>> 32-bit hosts, so %ld should be %"PRId64".
>
>
> Can I pass an int64_t into the message if it takes a string?
>
> I thought int64_t was a long long in 32-bit mode, but perhaps that's not
> always the case?

The PRId64 format specifier macro from the C standard hides this so
you can pass int64_t values to printf()-style functions in a portable
way.

Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets
  2012-08-08  8:52       ` Stefan Hajnoczi
@ 2012-08-08 13:40         ` Corey Bryant
  0 siblings, 0 replies; 14+ messages in thread
From: Corey Bryant @ 2012-08-08 13:40 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, aliguori, stefanha, libvir-list, qemu-devel, lcapitulino, eblake



On 08/08/2012 04:52 AM, Stefan Hajnoczi wrote:
> On Tue, Aug 7, 2012 at 8:59 PM, Corey Bryant <coreyb@linux.vnet.ibm.com> wrote:
>>
>>
>> On 08/07/2012 02:16 PM, Stefan Hajnoczi wrote:
>>>
>>> On Fri, Aug 3, 2012 at 6:28 PM, Corey Bryant <coreyb@linux.vnet.ibm.com>
>>> wrote:
>>>> +    snprintf(fd_str, sizeof(fd_str), "%ld", fd);
>>>> +    qerror_report(QERR_FD_NOT_FOUND, fd_str);
>>>
>>>
>>> Why use an fd_str instead of passing an int64_t into the error
>>> message?  This also assumed sizeof(long) == 8, which isn't true on
>>> 32-bit hosts, so %ld should be %"PRId64".
>>
>>
>> Can I pass an int64_t into the message if it takes a string?
>>
>> I thought int64_t was a long long in 32-bit mode, but perhaps that's not
>> always the case?
>
> The PRId64 format specifier macro from the C standard hides this so
> you can pass int64_t values to printf()-style functions in a portable
> way.
>
> Stefan
>


Thanks, I'll use PRId64.

-- 
Regards,
Corey

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-08-08 13:43 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-03 17:28 [Qemu-devel] [PATCH v6 0/6] file descriptor passing using fd sets Corey Bryant
2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg Corey Bryant
2012-08-03 17:33   ` Corey Bryant
2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 2/6] qapi: Introduce add-fd, remove-fd, query-fdsets Corey Bryant
2012-08-07 18:16   ` Stefan Hajnoczi
2012-08-07 19:59     ` Corey Bryant
2012-08-08  8:52       ` Stefan Hajnoczi
2012-08-08 13:40         ` Corey Bryant
2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 3/6] monitor: Clean up fd sets on monitor disconnect Corey Bryant
2012-08-07 17:32   ` Stefan Hajnoczi
2012-08-07 19:36     ` Corey Bryant
2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 4/6] block: Convert open calls to qemu_open Corey Bryant
2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 5/6] block: Convert close calls to qemu_close Corey Bryant
2012-08-03 17:28 ` [Qemu-devel] [PATCH v6 6/6] block: Enable qemu_open/close to work with fd sets Corey Bryant

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.