All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
@ 2012-12-04 18:19 Thanos Makatos
  2012-12-04 18:19 ` [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers Thanos Makatos
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

blktap3 is a disk back-end driver. It is based on blktap2 but does not require
the blktap/blkback kernel modules as it allows tapdisk to talk directly to
blkfront. This primarily simplifies maintenance, and _may_ lead to performance
improvements. blktap3 is based on a blktap2 fork maintained mostly by Citrix
(it lives in github), so these changes are also imported, apart from the
blktap3 ones.

I've organised my upstream effort as follows:
1. Upstream the smallest possible subset of blktap3 that will allow guest VMs
   to use RAW images backed by blktap3. This will enable early testing on the
   bits introduced by blktap3.
2. Upstream the remaining of blktap3, most notably back-end drivers, e.g. VHD
   etc.
3. Import bug fixes from blktap2 living in github.
4. Import new features and optimisations from blktap2 living in github, e.g.
   the mirroring plug-in.

blktap3 is made of the following components:
1. blkfront (not a blktap3 component and already upstream): a virtual
   block device driver in the guest VM that receives block I/O requests and
   forwards them to tapdisk via the shared ring.
2. tapdisk: a user space process that receives block I/O requests from
   blkfront, translates them to whatever the current backing file format is
   (i.e. RAW, VHD, qcow etc.), and performs the actual I/O. Apart from block
   I/O requests, the tapdisk also allows basic management of each virtual
   block device, e.g. a device may be temporarily paused. tapdisk listens to
   a loopback socket for such commands. The tap-ctl utility (explained later)
   can be used for managing the tapdisk.
3. libtapback: a user space library that implements the functionality required
   to access the shared ring. It is used by tapdisk to obtain the block I/O
   requests forwarded by blkfront, and to produce the corresponding responses.
   This is the very "heart" of blktap3, it's architecture will be thoroughly
   explained by the patch series that introduces it.
4. tapback: a user space daemon that acts as the back-end of each virtual
   block device: it monitors XenStore for the block front-end's state changes,
   creates/destroys the shared ring, and instructs the tapdisk to connect
   to/disconnect from the shared ring. It also communicates to the block
   front-end required parameters (e.g. block device size in sectors) via
   XenStore.
5. libblktapctl: a user space library where the tapdisk management functions
   are implemented.
6. tap-ctl: a user space utility that allows management of the tapdisk, uses
   libblktapctl.

The tapdisk is spawned/destroyed by libxl when a domain is created/destroyed,
in the exact same way as in blktap2. libxl uses libblktapctl for this.

This patch series introduces a small subset of files required by tapback (the
tapback daemon is introduced by the next patch series):
- basic blktap3 header files
- a rudimentary implementation of libblktapctl. Only the bits required by
  tapback to manage the tapdisk are introduced, the rest of this library will
  be introduced by later patches.

---
Changed since v1:
* In all patches the patch message has been improved.
* Patches 1, 5, and 6 use GPLv2.
* Patch 0: Basic explanation of blktap3's fundamental components.
* Patch 9: Improved tools/blktap3/control/Makefile by moving hard coded
  paths to config/StdGNU.mk.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2013-01-18 13:45   ` Ian Campbell
  2012-12-04 18:19 ` [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures Thanos Makatos
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces basic blktap3 header files.

diff --git a/tools/blktap3/include/blktap3.h b/tools/blktap3/include/blktap3.h
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/include/blktap3.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ *
+ * Commonly used headers and definitions.
+ */
+
+#ifndef __BLKTAP_3_H__
+#define __BLKTAP_3_H__
+
+#include "compiler.h"
+
+/* TODO remove from other files */
+#include <xen-external/bsd-sys-queue.h>
+
+#define BLKTAP3_CONTROL_NAME        "blktap-control"
+#define BLKTAP3_CONTROL_DIR         "/var/run/"BLKTAP3_CONTROL_NAME
+#define BLKTAP3_CONTROL_SOCKET      "ctl"
+
+#define BLKTAP3_ENOSPC_SIGNAL_FILE  "/var/run/tapdisk3-enospc"
+
+/*
+ * TODO They may have to change due to macro namespacing.
+ */
+#define TAILQ_MOVE_HEAD(node, src, dst, entry)	\
+	TAILQ_REMOVE(src, node, entry);				\
+	TAILQ_INSERT_HEAD(dst, node, entry);
+
+#define TAILQ_MOVE_TAIL(node, src, dst, entry)	\
+	TAILQ_REMOVE(src, node, entry);				\
+	TAILQ_INSERT_TAIL(dst, node, entry);
+
+#endif /* __BLKTAP_3_H__ */
diff --git a/tools/blktap3/include/compiler.h b/tools/blktap3/include/compiler.h
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/include/compiler.h
@@ -0,0 +1,42 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ * 
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ * 
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301
+ * USA
+ */
+
+#ifndef __COMPILER_H__
+#define __COMPILER_H__
+
+#define likely(_cond)	__builtin_expect(!!(_cond), 1)
+#define unlikely(_cond)	__builtin_expect(!!(_cond), 0)
+
+/*
+ * FIXME taken from list.h, do we need to mention anything about the license?
+ */
+#define containerof(_ptr, _type, _memb) \
+	((_type*)((void*)(_ptr) - offsetof(_type, _memb)))
+
+#define __printf(a, b)	__attribute__((format(printf, a, b)))
+#define __scanf(_f, _a) __attribute__((format (scanf, _f, _a)))
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif /* ARRAY_SIZE */
+
+#define UNUSED_PARAMETER(x) \
+    (void)(x);
+
+#endif /* __COMPILER_H__ */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
  2012-12-04 18:19 ` [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2013-01-18 13:49   ` Ian Campbell
  2012-12-04 18:19 ` [PATCH 3 of 9 RFC v2] blktap3/libblktapctl: Introduce the tapdisk control header Thanos Makatos
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces function prototypes and structures that are implemented
by libblktapctl (the library that allows libxl, tap-ctl, and the tapback daemon
to manage a running tapdisk process). This file is based on the existing
blktap2 file, with some changes coming from blktap2 living in github (the STATS
message, support for mirroring).

tapdisk_message_name is now neater and uses a look up table instead of a big
switch.

blktap3 introduces the following messages:
	- DISK_INFO: used by xenio to communicate to blkfront via XenStore the
	  number of sectors and the sector size so that it can create the virtual
	  block device.
	- XENBLKIF_CONNECT/DISCONNECT: used by the xenio daemon to instruct a
	  running tapdisk process to connect to the ring. The
	  tapdisk_message_blkif structure is used to convey such messages.

The ATTACH message has been removed since it is now superseded by
XENBLKIF_CONNECT (this probably means that DETACH must be removed as well in
favour of XENBLKIF_DISCONNECT). However, it would probably be nicer to keep the
ATTACH/DETACH identifiers in order to minimize interface changes.

diff --git a/tools/blktap2/include/tapdisk-message.h b/tools/blktap3/include/tapdisk-message.h
copy from tools/blktap2/include/tapdisk-message.h
copy to tools/blktap3/include/tapdisk-message.h
--- a/tools/blktap2/include/tapdisk-message.h
+++ b/tools/blktap3/include/tapdisk-message.h
@@ -36,29 +36,35 @@
 #define TAPDISK_MESSAGE_MAX_MINORS \
 	((TAPDISK_MESSAGE_MAX_PATH_LENGTH / sizeof(int)) - 1)
 
-#define TAPDISK_MESSAGE_FLAG_SHARED      0x01
-#define TAPDISK_MESSAGE_FLAG_RDONLY      0x02
-#define TAPDISK_MESSAGE_FLAG_ADD_CACHE   0x04
-#define TAPDISK_MESSAGE_FLAG_VHD_INDEX   0x08
-#define TAPDISK_MESSAGE_FLAG_LOG_DIRTY   0x10
+#define TAPDISK_MESSAGE_FLAG_SHARED      0x001
+#define TAPDISK_MESSAGE_FLAG_RDONLY      0x002
+#define TAPDISK_MESSAGE_FLAG_ADD_CACHE   0x004
+#define TAPDISK_MESSAGE_FLAG_VHD_INDEX   0x008
+#define TAPDISK_MESSAGE_FLAG_LOG_DIRTY   0x010
+#define TAPDISK_MESSAGE_FLAG_ADD_LCACHE  0x020
+#define TAPDISK_MESSAGE_FLAG_REUSE_PRT   0x040
+#define TAPDISK_MESSAGE_FLAG_SECONDARY   0x080
+#define TAPDISK_MESSAGE_FLAG_STANDBY     0x100
 
 typedef struct tapdisk_message           tapdisk_message_t;
-typedef uint8_t                          tapdisk_message_flag_t;
+typedef uint32_t                         tapdisk_message_flag_t;
 typedef struct tapdisk_message_image     tapdisk_message_image_t;
 typedef struct tapdisk_message_params    tapdisk_message_params_t;
 typedef struct tapdisk_message_string    tapdisk_message_string_t;
 typedef struct tapdisk_message_response  tapdisk_message_response_t;
 typedef struct tapdisk_message_minors    tapdisk_message_minors_t;
 typedef struct tapdisk_message_list      tapdisk_message_list_t;
+typedef struct tapdisk_message_stat      tapdisk_message_stat_t;
+typedef struct tapdisk_message_blkif     tapdisk_message_blkif_t;
 
 struct tapdisk_message_params {
 	tapdisk_message_flag_t           flags;
 
-	uint8_t                          storage;
 	uint32_t                         devnum;
 	uint32_t                         domid;
-	uint16_t                         path_len;
 	char                             path[TAPDISK_MESSAGE_MAX_PATH_LENGTH];
+	uint32_t                         prt_devnum;
+	char                             secondary[TAPDISK_MESSAGE_MAX_PATH_LENGTH];
 };
 
 struct tapdisk_message_image {
@@ -88,6 +94,55 @@ struct tapdisk_message_list {
 	char                             path[TAPDISK_MESSAGE_MAX_PATH_LENGTH];
 };
 
+struct tapdisk_message_stat {
+	uint16_t type;
+	uint16_t cookie;
+	size_t length;
+};
+
+/**
+ * Tapdisk message containing all the necessary information required for the
+ * tapdisk to connect to a guest's blkfront.
+ */
+struct tapdisk_message_blkif {
+	/**
+	 * The domain ID of the guest to connect to.
+	 */
+	uint32_t domid;
+
+	/**
+	 * The device ID of the virtual block device.
+	 */
+	uint32_t devid;
+
+	/**
+	 * Grant references for the shared ring.
+	 * TODO Why 8 specifically?
+	 */
+	uint32_t gref[8];
+
+	/**
+	 * Number of pages in the ring, expressed as a page order.
+	 */
+	uint32_t order;
+
+	/**
+	 * Protocol to use: native, 32 bit, or 64 bit.
+	 * TODO Is this used for supporting a 32 bit guest on a 64 bit hypervisor?
+	 */
+	uint32_t proto;
+
+	/**
+	 * TODO Page pool? Can be NULL.
+	 */
+	char pool[TAPDISK_MESSAGE_STRING_LENGTH];
+
+	/**
+	 * The event channel port.
+	 */
+	uint32_t port;
+};
+
 struct tapdisk_message {
 	uint16_t                         type;
 	uint16_t                         cookie;
@@ -100,10 +155,15 @@ struct tapdisk_message {
 		tapdisk_message_minors_t minors;
 		tapdisk_message_response_t response;
 		tapdisk_message_list_t   list;
+		tapdisk_message_stat_t   info;
+		tapdisk_message_blkif_t  blkif;
 	} u;
 };
 
 enum tapdisk_message_id {
+	/*
+	 * TODO Why start from 1 and not from 0?
+	 */
 	TAPDISK_MESSAGE_ERROR = 1,
 	TAPDISK_MESSAGE_RUNTIME_ERROR,
 	TAPDISK_MESSAGE_PID,
@@ -120,84 +180,70 @@ enum tapdisk_message_id {
 	TAPDISK_MESSAGE_CLOSE_RSP,
 	TAPDISK_MESSAGE_DETACH,
 	TAPDISK_MESSAGE_DETACH_RSP,
-	TAPDISK_MESSAGE_LIST_MINORS,
-	TAPDISK_MESSAGE_LIST_MINORS_RSP,
+	TAPDISK_MESSAGE_LIST_MINORS,		/* TODO still valid? */
+	TAPDISK_MESSAGE_LIST_MINORS_RSP,	/* TODO still valid? */
 	TAPDISK_MESSAGE_LIST,
 	TAPDISK_MESSAGE_LIST_RSP,
+	TAPDISK_MESSAGE_STATS,
+	TAPDISK_MESSAGE_STATS_RSP,
 	TAPDISK_MESSAGE_FORCE_SHUTDOWN,
+	TAPDISK_MESSAGE_DISK_INFO,
+	TAPDISK_MESSAGE_DISK_INFO_RSP,
+	TAPDISK_MESSAGE_XENBLKIF_CONNECT,
+	TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP,
+	TAPDISK_MESSAGE_XENBLKIF_DISCONNECT,
+	TAPDISK_MESSAGE_XENBLKIF_DISCONNECT_RSP,
 	TAPDISK_MESSAGE_EXIT,
 };
 
-static inline char *
-tapdisk_message_name(enum tapdisk_message_id id)
-{
-	switch (id) {
-	case TAPDISK_MESSAGE_ERROR:
-		return "error";
+#define TAPDISK_MESSAGE_MAX TAPDISK_MESSAGE_EXIT
 
-	case TAPDISK_MESSAGE_PID:
-		return "pid";
+/**
+ * Retrieves a message's human-readable representation.
+ *
+ * @param id the message ID to translate
+ * @return the name of the message 
+ */
+static inline char const *
+tapdisk_message_name(const enum tapdisk_message_id id) {
+	static char const *msg_names[(TAPDISK_MESSAGE_MAX + 1)] = {
+		[TAPDISK_MESSAGE_ERROR] = "error",
+		[TAPDISK_MESSAGE_RUNTIME_ERROR] = "runtime error",
+		[TAPDISK_MESSAGE_PID] = "pid",
+		[TAPDISK_MESSAGE_PID_RSP] = "pid response",
+		[TAPDISK_MESSAGE_OPEN] = "open",
+		[TAPDISK_MESSAGE_OPEN_RSP] = "open response",
+		[TAPDISK_MESSAGE_PAUSE] = "pause",
+		[TAPDISK_MESSAGE_PAUSE_RSP] = "pause response",
+		[TAPDISK_MESSAGE_RESUME] = "resume",
+		[TAPDISK_MESSAGE_RESUME_RSP] = "resume response",
+		[TAPDISK_MESSAGE_CLOSE] = "close",
+		[TAPDISK_MESSAGE_FORCE_SHUTDOWN] = "force shutdown",
+		[TAPDISK_MESSAGE_CLOSE_RSP] = "close response",
+		[TAPDISK_MESSAGE_ATTACH] = "attach",
+		[TAPDISK_MESSAGE_ATTACH_RSP] = "attach response",
+		[TAPDISK_MESSAGE_DETACH] = "detach",
+		[TAPDISK_MESSAGE_DETACH_RSP] = "detach response",
+		[TAPDISK_MESSAGE_LIST_MINORS] = "list minors",
+		[TAPDISK_MESSAGE_LIST_MINORS_RSP] = "list minors response",
+		[TAPDISK_MESSAGE_LIST] = "list",
+		[TAPDISK_MESSAGE_LIST_RSP] = "list response",
+		[TAPDISK_MESSAGE_STATS] = "stats",
+		[TAPDISK_MESSAGE_STATS_RSP] = "stats response",
+		[TAPDISK_MESSAGE_DISK_INFO] = "disk info",
+		[TAPDISK_MESSAGE_DISK_INFO_RSP] = "disk info response",
+		[TAPDISK_MESSAGE_XENBLKIF_CONNECT] = "blkif connect",
+		[TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP] = "blkif connect response",
+		[TAPDISK_MESSAGE_XENBLKIF_DISCONNECT] = "blkif disconnect",
+		[TAPDISK_MESSAGE_XENBLKIF_DISCONNECT_RSP]
+			= "blkif disconnect response",
+		[TAPDISK_MESSAGE_EXIT] = "exit"
+	};
 
-	case TAPDISK_MESSAGE_PID_RSP:
-		return "pid response";
-
-	case TAPDISK_MESSAGE_OPEN:
-		return "open";
-
-	case TAPDISK_MESSAGE_OPEN_RSP:
-		return "open response";
-
-	case TAPDISK_MESSAGE_PAUSE:
-		return "pause";
-
-	case TAPDISK_MESSAGE_PAUSE_RSP:
-		return "pause response";
-
-	case TAPDISK_MESSAGE_RESUME:
-		return "resume";
-
-	case TAPDISK_MESSAGE_RESUME_RSP:
-		return "resume response";
-
-	case TAPDISK_MESSAGE_CLOSE:
-		return "close";
-
-	case TAPDISK_MESSAGE_FORCE_SHUTDOWN:
-		return "force shutdown";
-
-	case TAPDISK_MESSAGE_CLOSE_RSP:
-		return "close response";
-
-	case TAPDISK_MESSAGE_ATTACH:
-		return "attach";
-
-	case TAPDISK_MESSAGE_ATTACH_RSP:
-		return "attach response";
-
-	case TAPDISK_MESSAGE_DETACH:
-		return "detach";
-
-	case TAPDISK_MESSAGE_DETACH_RSP:
-		return "detach response";
-
-	case TAPDISK_MESSAGE_LIST_MINORS:
-		return "list minors";
-
-	case TAPDISK_MESSAGE_LIST_MINORS_RSP:
-		return "list minors response";
-
-	case TAPDISK_MESSAGE_LIST:
-		return "list";
-
-	case TAPDISK_MESSAGE_LIST_RSP:
-		return "list response";
-
-	case TAPDISK_MESSAGE_EXIT:
-		return "exit";
-
-	default:
+	if (id < 1 || id > TAPDISK_MESSAGE_MAX) {
 		return "unknown";
 	}
+	return msg_names[id];
 }
 
-#endif
+#endif /* _TAPDISK_MESSAGE_H_ */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 3 of 9 RFC v2] blktap3/libblktapctl: Introduce the tapdisk control header
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
  2012-12-04 18:19 ` [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers Thanos Makatos
  2012-12-04 18:19 ` [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2012-12-04 18:19 ` [PATCH 4 of 9 RFC v2] blktap3/libblktapctl: Introduce listing running tapdisks functionality Thanos Makatos
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces the header file where tapdisk control-related structures
and functions are declared. This file is based on the existing blktap2, with
most changes coming from blktap2 living in github.  Linux lists are replaced by
BSD tail queues. Few functions are partly documented, most of them are not
documented at all, this will be addressed by a future patch.

diff --git a/tools/blktap2/control/tap-ctl.h b/tools/blktap3/control/tap-ctl.h
copy from tools/blktap2/control/tap-ctl.h
copy to tools/blktap3/control/tap-ctl.h
--- a/tools/blktap2/control/tap-ctl.h
+++ b/tools/blktap3/control/tap-ctl.h
@@ -30,72 +30,180 @@
 
 #include <syslog.h>
 #include <errno.h>
+#include <sys/time.h>
 #include <tapdisk-message.h>
+#include "blktap3.h"
 
+/*
+ * TODO These are private, move to an internal header.
+ */
 extern int tap_ctl_debug;
 
 #ifdef TAPCTL
-#define DBG(_f, _a...)				\
-	do {					\
+#define DBG(_f, _a...)			\
+	do {						\
 		if (tap_ctl_debug)		\
 			printf(_f, ##_a);	\
 	} while (0)
 
 #define DPRINTF(_f, _a...) syslog(LOG_INFO, _f, ##_a)
 #define EPRINTF(_f, _a...) syslog(LOG_ERR, "tap-err:%s: " _f, __func__, ##_a)
-#define  PERROR(_f, _a...) syslog(LOG_ERR, "tap-err:%s: " _f ": %s", __func__, ##_a, \
-				  strerror(errno))
+#define PERROR(_f, _a...) syslog(LOG_ERR, "tap-err:%s: " _f ": %s", \
+        __func__, ##_a, strerror(errno))
 #endif
 
-void tap_ctl_version(int *major, int *minor);
-int tap_ctl_kernel_version(int *major, int *minor);
+/**
+ * Contains information about a tapdisk process.
+ */
+typedef struct tap_list {
 
-int tap_ctl_check_blktap(const char **message);
-int tap_ctl_check_version(const char **message);
-int tap_ctl_check(const char **message);
+    /**
+     * The process ID.
+     */
+    pid_t pid;
 
+    /**
+     * TODO
+     */
+    int minor;
+
+    /**
+     * State of the VBD, specified in drivers/tapdisk-vbd.h.
+     */
+    int state;
+
+    /**
+     * TODO
+     */
+    char *type;
+
+    /**
+     * /path/to/file
+     */
+    char *path;
+
+    /**
+     * for linked lists
+     */
+    TAILQ_ENTRY(tap_list) entry;
+} tap_list_t;
+
+TAILQ_HEAD(tqh_tap_list, tap_list);
+
+/**
+ * Iterate over a list of struct tap_list elements.
+ */
+#define tap_list_for_each_entry(_pos, _head) \
+    TAILQ_FOREACH(_pos, _head, entry)
+
+/**
+ * Iterate over a list of struct tap_list elements allowing deletions without
+ * having to restart the iteration.
+ */
+#define tap_list_for_each_entry_safe(_pos, _n, _head) \
+    TAILQ_FOREACH_SAFE(_pos, _head, entry, _n)
+
+/**
+ * Connects to a tapdisk.
+ *
+ * @param /path/to/file of the control socket (e.g. 
+ * /var/run/blktap-control/ctl/<pid>
+ * @param socket output parameter that receives the connection
+ * @returns 0 on success, an error code otherwise
+ */
 int tap_ctl_connect(const char *path, int *socket);
-int tap_ctl_connect_id(int id, int *socket);
-int tap_ctl_read_message(int fd, tapdisk_message_t *message, int timeout);
-int tap_ctl_write_message(int fd, tapdisk_message_t *message, int timeout);
-int tap_ctl_send_and_receive(int fd, tapdisk_message_t *message, int timeout);
-int tap_ctl_connect_send_and_receive(int id,
-				     tapdisk_message_t *message, int timeout);
+
+/**
+ * Connects to a tapdisk.
+ *
+ * @param id the process ID of the tapdisk to connect to
+ * @param socket output parameter that receives the connection
+ * @returns 0 on success, an error code otherwise
+ */
+int tap_ctl_connect_id(const int id, int *socket);
+
+/**
+ * Reads from the tapdisk connection to the buffer.
+ *
+ * @param fd the file descriptor of the socket to read from
+ * @param buf buffer that receives the output
+ * @param sz size, in bytes, of the buffer
+ * @param timeout (optional) specifies the maximum time to wait for reading
+ * @returns 0 on success, an error code otherwise
+ */
+int tap_ctl_read_raw(const int fd, void *buf, const size_t sz,
+        struct timeval *timeout);
+
+int tap_ctl_read_message(int fd, tapdisk_message_t * message,
+        struct timeval *timeout);
+
+int tap_ctl_write_message(int fd, tapdisk_message_t * message,
+        struct timeval *timeout);
+
+int tap_ctl_send_and_receive(int fd, tapdisk_message_t * message,
+        struct timeval *timeout);
+
+int tap_ctl_connect_send_and_receive(int id, tapdisk_message_t * message,
+        struct timeval *timeout);
+
 char *tap_ctl_socket_name(int id);
 
-typedef struct {
-	int         id;
-	pid_t       pid;
-	int         minor;
-	int         state;
-	char       *type;
-	char       *path;
-} tap_list_t;
+int tap_ctl_list_pid(pid_t pid, struct tqh_tap_list *list);
 
-int tap_ctl_get_driver_id(const char *handle);
+/**
+ * Retrieves a list of all tapdisks.
+ *
+ * @param list output parameter that receives the list of tapdisks
+ * @returns 0 on success, an error code otherwise
+ */
+int tap_ctl_list(struct tqh_tap_list *list);
 
-int tap_ctl_list(tap_list_t ***list);
-void tap_ctl_free_list(tap_list_t **list);
-int tap_ctl_find(const char *type, const char *path, tap_list_t *tap);
+/**
+ * Deallocates a list of struct tap_list.
+ *
+ * @param list the tapdisk information structure to deallocate.
+ */
+void tap_ctl_list_free(struct tqh_tap_list *list);
 
-int tap_ctl_allocate(int *minor, char **devname);
-int tap_ctl_free(const int minor);
+/**
+ * Creates a tapdisk process.
+ *
+ * TODO document parameters
+ * @param params
+ * @param flags
+ * @param prt_minor
+ * @param secondary
+ * @returns 0 on success, en error code otherwise
+ */
+int tap_ctl_create(const char *params, int flags, int prt_minor,
+        char *secondary);
 
-int tap_ctl_create(const char *params, char **devname);
-int tap_ctl_destroy(const int id, const int minor);
+int tap_ctl_destroy(const int id, const int minor, int force,
+        struct timeval *timeout);
 
+/*
+ * TODO The following functions are not currently used by anything else
+ * other than the tapdisk itself. Move to a private header?
+ */
 int tap_ctl_spawn(void);
 pid_t tap_ctl_get_pid(const int id);
 
 int tap_ctl_attach(const int id, const int minor);
 int tap_ctl_detach(const int id, const int minor);
 
-int tap_ctl_open(const int id, const int minor, const char *params);
-int tap_ctl_close(const int id, const int minor, const int force);
+int tap_ctl_open(const int id, const int minor, const char *params,
+        int flags, const int prt_minor, const char *secondary);
 
-int tap_ctl_pause(const int id, const int minor);
+int tap_ctl_close(const int id, const int minor, const int force,
+        struct timeval *timeout);
+
+int tap_ctl_pause(const int id, const int minor, struct timeval 
+        *timeout);
+
 int tap_ctl_unpause(const int id, const int minor, const char *params);
 
-int tap_ctl_blk_major(void);
+ssize_t tap_ctl_stats(pid_t pid, int minor, char *buf, size_t size); 
 
-#endif
+int tap_ctl_stats_fwrite(pid_t pid, int minor, FILE * out);
+
+#endif /* __TAP_CTL_H__ */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 4 of 9 RFC v2] blktap3/libblktapctl: Introduce listing running tapdisks functionality
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (2 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 3 of 9 RFC v2] blktap3/libblktapctl: Introduce the tapdisk control header Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2012-12-04 18:19 ` [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring Thanos Makatos
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces tap-ctl-list.c, the file where listing tapdisks
functionality is implemented. It is based on the existing blktap2 file with
most changes coming from blktap2 living in github. I have not examined the
changes in detail but it seems they are minor.

Function tap_ctl_list needs to be amended because minors are not retrieved
from sysfs (there's no sysfs blktap directory any more).

diff --git a/tools/blktap2/control/tap-ctl-list.c b/tools/blktap3/control/tap-ctl-list.c
copy from tools/blktap2/control/tap-ctl-list.c
copy to tools/blktap3/control/tap-ctl-list.c
--- a/tools/blktap2/control/tap-ctl-list.c
+++ b/tools/blktap3/control/tap-ctl-list.c
@@ -34,23 +34,47 @@
 #include <glob.h>
 
 #include "tap-ctl.h"
-#include "blktap2.h"
-#include "list.h"
+#include "blktap3.h"
+
+/**
+ * Allocates and initializes a tap_list_t.
+ */
+static tap_list_t *
+_tap_list_alloc(void)
+{
+	const size_t sz = sizeof(tap_list_t);
+	tap_list_t *tl;
+
+	tl = malloc(sz);
+	if (!tl)
+		return NULL;
+
+	tl->pid = -1;
+	tl->minor = -1;
+	tl->state = -1;
+	tl->type = NULL;
+	tl->path = NULL;
+
+	return tl;
+}
 
 static void
-free_list(tap_list_t *entry)
+_tap_list_free(tap_list_t * tl, struct tqh_tap_list *list)
 {
-	if (entry->type) {
-		free(entry->type);
-		entry->type = NULL;
+	if (tl->type) {
+		free(tl->type);
+		tl->type = NULL;
 	}
 
-	if (entry->path) {
-		free(entry->path);
-		entry->path = NULL;
+	if (tl->path) {
+		free(tl->path);
+		tl->path = NULL;
 	}
 
-	free(entry);
+	if (list)
+		TAILQ_REMOVE(list, tl, entry);
+
+	free(tl);
 }
 
 int
@@ -66,7 +90,7 @@ int
 	len = ptr - params;
 
 	*type = strndup(params, len);
-	*path =  strdup(params + len + 1);
+	*path = strdup(params + len + 1);
 
 	if (!*type || !*path) {
 		free(*type);
@@ -81,102 +105,26 @@ int
 	return 0;
 }
 
-static int
-init_list(tap_list_t *entry,
-	  int tap_id, pid_t tap_pid, int vbd_minor, int vbd_state,
-	  const char *params)
+void
+tap_ctl_list_free(struct tqh_tap_list *list)
 {
-	int err = 0;
+	tap_list_t *tl, *n;
 
-	entry->id     = tap_id;
-	entry->pid    = tap_pid;
-	entry->minor  = vbd_minor;
-	entry->state  = vbd_state;
-
-	if (params)
-		err = _parse_params(params, &entry->type, &entry->path);
-
-	return err;
-}
-
-void
-tap_ctl_free_list(tap_list_t **list)
-{
-	tap_list_t **_entry;
-
-	for (_entry = list; *_entry != NULL; ++_entry)
-		free_list(*_entry);
-
-	free(list);
-}
-
-static tap_list_t**
-tap_ctl_alloc_list(int n)
-{
-	tap_list_t **list, *entry;
-	size_t size;
-	int i;
-
-	size = sizeof(tap_list_t*) * (n+1);
-	list = malloc(size);
-	if (!list)
-		goto fail;
-
-	memset(list, 0, size);
-
-	for (i = 0; i < n; ++i) {
-		tap_list_t *entry;
-
-		entry = malloc(sizeof(tap_list_t));
-		if (!entry)
-			goto fail;
-
-		memset(entry, 0, sizeof(tap_list_t));
-
-		list[i] = entry;
-	}
-
-	return list;
-
-fail:
-	if (list)
-		tap_ctl_free_list(list);
-
-	return NULL;
-}
-
-static int
-tap_ctl_list_length(const tap_list_t **list)
-{
-	const tap_list_t **_entry;
-	int n;
-
-	n = 0;
-	for (_entry = list; *_entry != NULL; ++_entry)
-		n++;
-
-	return n;
-}
-
-static int
-_tap_minor_cmp(const void *a, const void *b)
-{
-	return *(int*)a - *(int*)b;
+	tap_list_for_each_entry_safe(tl, n, list)
+		_tap_list_free(tl, list);
 }
 
 int
-_tap_ctl_find_minors(int **_minorv)
+_tap_ctl_find_tapdisks(struct tqh_tap_list *list)
 {
-	glob_t glbuf = { 0 };
+	glob_t glbuf = { 0 };	
 	const char *pattern, *format;
-	int *minorv = NULL, n_minors = 0;
-	int err, i;
+	int err, i, n_taps = 0;
 
-	pattern = BLKTAP2_SYSFS_DIR"/blktap*";
-	format  = BLKTAP2_SYSFS_DIR"/blktap%d";
+	pattern = BLKTAP3_CONTROL_DIR "/" BLKTAP3_CONTROL_SOCKET "*";
+	format = BLKTAP3_CONTROL_DIR "/" BLKTAP3_CONTROL_SOCKET "%d";
 
-	n_minors = 0;
-	minorv   = NULL;
+	TAILQ_INIT(list);
 
 	err = glob(pattern, 0, NULL, &glbuf);
 	switch (err) {
@@ -186,337 +134,231 @@ int
 	case GLOB_ABORTED:
 	case GLOB_NOSPACE:
 		err = -errno;
-		EPRINTF("%s: glob failed, err %d", pattern, err);
-		goto fail;
-	}
-
-	minorv = malloc(sizeof(int) * glbuf.gl_pathc);
-	if (!minorv) {
-		err = -errno;
+		EPRINTF("%s: glob failed: %s", pattern, strerror(err));
 		goto fail;
 	}
 
 	for (i = 0; i < glbuf.gl_pathc; ++i) {
+		tap_list_t *tl;
 		int n;
 
-		n = sscanf(glbuf.gl_pathv[i], format, &minorv[n_minors]);
+		tl = _tap_list_alloc();
+		if (!tl) {
+			err = -ENOMEM;
+			goto fail;
+		}
+
+		n = sscanf(glbuf.gl_pathv[i], format, &tl->pid);
 		if (n != 1)
-			continue;
+			goto skip;
 
-		n_minors++;
+		tl->pid = tap_ctl_get_pid(tl->pid);
+		if (tl->pid < 0)
+			goto skip;
+
+		TAILQ_INSERT_TAIL(list, tl, entry);
+		n_taps++;
+		continue;
+
+	  skip:
+		_tap_list_free(tl, NULL);
 	}
 
-	qsort(minorv, n_minors, sizeof(int), _tap_minor_cmp);
-
-done:
-	*_minorv = minorv;
+  done:
 	err = 0;
-
-out:
-	if (glbuf.gl_pathv)
-		globfree(&glbuf);
-
-	return err ? : n_minors;
-
-fail:
-	if (minorv)
-		free(minorv);
-
-	goto out;
-}
-
-struct tapdisk {
-	int    id;
-	pid_t  pid;
-	struct list_head list;
-};
-
-static int
-_tap_tapdisk_cmp(const void *a, const void *b)
-{
-	return ((struct tapdisk*)a)->id - ((struct tapdisk*)b)->id;
-}
-
-int
-_tap_ctl_find_tapdisks(struct tapdisk **_tapv)
-{
-	glob_t glbuf = { 0 };
-	const char *pattern, *format;
-	struct tapdisk *tapv = NULL;
-	int err, i, n_taps = 0;
-
-	pattern = BLKTAP2_CONTROL_DIR"/"BLKTAP2_CONTROL_SOCKET"*";
-	format  = BLKTAP2_CONTROL_DIR"/"BLKTAP2_CONTROL_SOCKET"%d";
-
-	n_taps = 0;
-	tapv   = NULL;
-
-	err = glob(pattern, 0, NULL, &glbuf);
-	switch (err) {
-	case GLOB_NOMATCH:
-		goto done;
-
-	case GLOB_ABORTED:
-	case GLOB_NOSPACE:
-		err = -errno;
-		EPRINTF("%s: glob failed, err %d", pattern, err);
-		goto fail;
-	}
-
-	tapv = malloc(sizeof(struct tapdisk) * glbuf.gl_pathc);
-	if (!tapv) {
-		err = -errno;
-		goto fail;
-	}
-
-	for (i = 0; i < glbuf.gl_pathc; ++i) {
-		struct tapdisk *tap;
-		int n;
-
-		tap = &tapv[n_taps];
-
-		err = sscanf(glbuf.gl_pathv[i], format, &tap->id);
-		if (err != 1)
-			continue;
-
-		tap->pid = tap_ctl_get_pid(tap->id);
-		if (tap->pid < 0)
-			continue;
-
-		n_taps++;
-	}
-
-	qsort(tapv, n_taps, sizeof(struct tapdisk), _tap_tapdisk_cmp);
-
-	for (i = 0; i < n_taps; ++i)
-		INIT_LIST_HEAD(&tapv[i].list);
-
-done:
-	*_tapv = tapv;
-	err = 0;
-
-out:
+  out:
 	if (glbuf.gl_pathv)
 		globfree(&glbuf);
 
 	return err ? : n_taps;
 
-fail:
-	if (tapv)
-		free(tapv);
-
+  fail:
+	tap_ctl_list_free(list);
 	goto out;
 }
 
-struct tapdisk_list {
-	int  minor;
-	int  state;
-	char *params;
-	struct list_head entry;
-};
-
-int
-_tap_ctl_list_tapdisk(int id, struct list_head *_list)
+/**
+ * Retrieves all the VBDs a tapdisk is serving.
+ *
+ * @param pid the process ID of the tapdisk whose VBDs should be retrieved
+ * @param list output parameter that receives the list of VBD
+ * @returns 0 on success, an error code otherwise
+ */
+static int
+_tap_ctl_list_tapdisk(pid_t pid, struct tqh_tap_list *list)
 {
+	struct timeval timeout = {.tv_sec = 10,.tv_usec = 0 };
 	tapdisk_message_t message;
-	struct list_head list;
-	struct tapdisk_list *tl, *next;
+	tap_list_t *tl;
 	int err, sfd;
 
-	err = tap_ctl_connect_id(id, &sfd);
+	err = tap_ctl_connect_id(pid, &sfd);
 	if (err)
 		return err;
 
 	memset(&message, 0, sizeof(message));
-	message.type   = TAPDISK_MESSAGE_LIST;
+	message.type = TAPDISK_MESSAGE_LIST;
 	message.cookie = -1;
 
-	err = tap_ctl_write_message(sfd, &message, 2);
+	err = tap_ctl_write_message(sfd, &message, &timeout);
 	if (err)
 		return err;
 
-	INIT_LIST_HEAD(&list);
+	TAILQ_INIT(list);
+
 	do {
-		err = tap_ctl_read_message(sfd, &message, 2);
+		err = tap_ctl_read_message(sfd, &message, &timeout);
 		if (err) {
 			err = -EPROTO;
-			break;
+			goto fail;
 		}
 
 		if (message.u.list.count == 0)
 			break;
 
-		tl = malloc(sizeof(struct tapdisk_list));
+		tl = _tap_list_alloc();
 		if (!tl) {
 			err = -ENOMEM;
-			break;
+			goto fail;
 		}
 
-		tl->minor  = message.u.list.minor;
-		tl->state  = message.u.list.state;
+		tl->pid = pid;
+		tl->minor = message.u.list.minor;
+		tl->state = message.u.list.state;
+
 		if (message.u.list.path[0] != 0) {
-			tl->params = strndup(message.u.list.path,
-					     sizeof(message.u.list.path));
-			if (!tl->params) {
-				err = -errno;
-				break;
+			err = _parse_params(message.u.list.path, &tl->type, &tl->path);
+			if (err) {
+				_tap_list_free(tl, NULL);
+				goto fail;
 			}
-		} else
-			tl->params = NULL;
+		}
 
-		list_add(&tl->entry, &list);
+		TAILQ_INSERT_HEAD(list, tl, entry);
 	} while (1);
 
-	if (err)
-		list_for_each_entry_safe(tl, next, &list, entry) {
-			list_del(&tl->entry);
-			free(tl->params);
-			free(tl);
-		}
+	err = 0;
+  out:
+	close(sfd);
+	return 0;
 
-	close(sfd);
-	list_splice(&list, _list);
-	return err;
-}
-
-void
-_tap_ctl_free_tapdisks(struct tapdisk *tapv, int n_taps)
-{
-	struct tapdisk *tap;
-
-	for (tap = tapv; tap < &tapv[n_taps]; ++tap) {
-		struct tapdisk_list *tl, *next;
-
-		list_for_each_entry_safe(tl, next, &tap->list, entry) {
-			free(tl->params);
-			free(tl);
-		}
-	}
-
-	free(tapv);
+  fail:
+	tap_ctl_list_free(list);
+	goto out;
 }
 
 int
-_tap_list_join3(int n_minors, int *minorv, int n_taps, struct tapdisk *tapv,
-		tap_list_t ***_list)
+tap_ctl_list(struct tqh_tap_list *list)
 {
-	tap_list_t **list, **_entry, *entry;
-	int i, _m, err;
+	struct tqh_tap_list minors, tapdisks, vbds;
+	tap_list_t *t, *next_t, *v, *next_v, *m, *next_m;
+	int err;
 
-	list = tap_ctl_alloc_list(n_minors + n_taps);
-	if (!list) {
-		err = -ENOMEM;
+	/*
+	 * Find all minors, find all tapdisks, then list all minors
+	 * they attached to. Output is a 3-way outer join.
+	 */
+	TAILQ_INIT(&minors);
+
+	/*
+	 * TODO There's no blktap sysfs entry anymore, get rid of minors and
+	 * rationalise the rest of this function.
+	 */
+#if 0
+	err = _tap_ctl_find_minors(&minors);
+	if (err < 0) {
+		EPRINTF("error finding minors: %s\n", strerror(err));
+		goto fail;
+	}
+#endif
+
+	err = _tap_ctl_find_tapdisks(&tapdisks);
+	if (err < 0) {
+		EPRINTF("error finding tapdisks: %s\n", strerror(err));
 		goto fail;
 	}
 
-	_entry = list;
+	TAILQ_INIT(list);
 
-	for (i = 0; i < n_taps; ++i) {
-		struct tapdisk *tap = &tapv[i];
-		struct tapdisk_list *tl;
+	tap_list_for_each_entry_safe(t, next_t, &tapdisks) {
 
-		/* orphaned tapdisk */
-		if (list_empty(&tap->list)) {
-			err = init_list(*_entry++, tap->id, tap->pid, -1, -1, NULL);
-			if (err)
-				goto fail;
+		err = _tap_ctl_list_tapdisk(t->pid, &vbds);
+
+		/*
+		 * TODO Don't just swallow the error, print a warning, at least.
+		 */
+		if (err || TAILQ_EMPTY(&vbds)) {
+			TAILQ_MOVE_TAIL(t, &tapdisks, list, entry);
 			continue;
 		}
 
-		list_for_each_entry(tl, &tap->list, entry) {
+		tap_list_for_each_entry_safe(v, next_v, &vbds) {
 
-			err = init_list(*_entry++,
-					tap->id, tap->pid,
-					tl->minor, tl->state, tl->params);
-			if (err)
-				goto fail;
+			tap_list_for_each_entry_safe(m, next_m, &minors)
+				if (m->minor == v->minor) {
+				_tap_list_free(m, &minors);
+				break;
+			}
 
-			if (tl->minor >= 0) {
-				/* clear minor */
-				for (_m = 0; _m < n_minors; ++_m) {
-					if (minorv[_m] == tl->minor) {
-						minorv[_m] = -1;
-						break;
-					}
-				}
-			}
+			TAILQ_MOVE_TAIL(v, &vbds, list, entry);
 		}
+
+		_tap_list_free(t, &tapdisks);
 	}
 
 	/* orphaned minors */
-	for (_m = 0; _m < n_minors; ++_m) {
-		int minor = minorv[_m];
-		if (minor >= 0) {
-			err = init_list(*_entry++, -1, -1, minor, -1, NULL);
-			if (err)
-				goto fail;
-		}
-	}
-
-	/* free extraneous list entries */
-	for (; *_entry != NULL; ++entry) {
-		free_list(*_entry);
-		*_entry = NULL;
-	}
-
-	*_list = list;
+	TAILQ_CONCAT(list, &minors, entry);
 
 	return 0;
 
 fail:
-	if (list)
-		tap_ctl_free_list(list);
+	tap_ctl_list_free(list);
+
+	tap_ctl_list_free(&vbds);
+	tap_ctl_list_free(&tapdisks);
+	tap_ctl_list_free(&minors);
 
 	return err;
 }
 
 int
-tap_ctl_list(tap_list_t ***list)
+tap_ctl_list_pid(pid_t pid, struct tqh_tap_list *list)
 {
-	int n_taps, n_minors, err, *minorv;
-	struct tapdisk *tapv, *tap;
+	tap_list_t *t;
+	int err;
 
-	n_taps   = -1;
-	n_minors = -1;
+	t = _tap_list_alloc();
+	if (!t)
+		return -ENOMEM;
 
-	err = n_minors = _tap_ctl_find_minors(&minorv);
-	if (err < 0)
-		goto out;
-
-	err = n_taps = _tap_ctl_find_tapdisks(&tapv);
-	if (err < 0)
-		goto out;
-
-	for (tap = tapv; tap < &tapv[n_taps]; ++tap) {
-		err = _tap_ctl_list_tapdisk(tap->id, &tap->list);
-		if (err)
-			goto out;
+	t->pid = tap_ctl_get_pid(pid);
+	if (t->pid < 0) {
+		_tap_list_free(t, NULL);
+		return 0;
 	}
 
-	err = _tap_list_join3(n_minors, minorv, n_taps, tapv, list);
+	err = _tap_ctl_list_tapdisk(t->pid, list);
 
-out:
-	if (n_taps > 0)
-		_tap_ctl_free_tapdisks(tapv, n_taps);
+	if (err || TAILQ_EMPTY(list))
+		TAILQ_INSERT_TAIL(list, t, entry);
 
-	if (n_minors > 0)
-		free(minorv);
-
-	return err;
+	return 0;
 }
 
 int
-tap_ctl_find(const char *type, const char *path, tap_list_t *tap)
+tap_ctl_find_minor(const char *type, const char *path)
 {
-	tap_list_t **list, **_entry;
-	int ret = -ENOENT, err;
+	struct tqh_tap_list list = TAILQ_HEAD_INITIALIZER(list);
+	tap_list_t *entry;
+	int minor, err;
 
 	err = tap_ctl_list(&list);
 	if (err)
 		return err;
 
-	for (_entry = list; *_entry != NULL; ++_entry) {
-		tap_list_t *entry  = *_entry;
+	minor = -1;
+
+	tap_list_for_each_entry(entry, &list) {
 
 		if (type && (!entry->type || strcmp(entry->type, type)))
 			continue;
@@ -524,13 +366,11 @@ tap_ctl_find(const char *type, const cha
 		if (path && (!entry->path || strcmp(entry->path, path)))
 			continue;
 
-		*tap = *entry;
-		tap->type = tap->path = NULL;
-		ret = 0;
+		minor = entry->minor;
 		break;
 	}
 
-	tap_ctl_free_list(list);
+	tap_ctl_list_free(&list);
 
-	return ret;
+	return minor >= 0 ? minor : -ENOENT;
 }

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (3 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 4 of 9 RFC v2] blktap3/libblktapctl: Introduce listing running tapdisks functionality Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2013-01-18 13:54   ` Ian Campbell
  2012-12-04 18:19 ` [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality Thanos Makatos
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces functions tap_ctl_connect_xenblkif and
tap_ctl_disconnect_xenblkif, that are used by the tapback daemon to instruct a
running tapdisk process to connect to/disconnect from the shared ring.

diff --git a/tools/blktap3/control/tap-ctl-xen.c b/tools/blktap3/control/tap-ctl-xen.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/control/tap-ctl-xen.c
@@ -0,0 +1,80 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#include <stdio.h>
+#include <string.h>
+
+#include "tap-ctl.h"
+#include "tap-ctl-xen.h"
+
+int tap_ctl_connect_xenblkif(pid_t pid, int minor, domid_t domid,
+        int devid, const grant_ref_t * grefs, int order,
+        const evtchn_port_t port, int proto, const char *pool)
+{
+    tapdisk_message_t message;
+    int i, err;
+
+    memset(&message, 0, sizeof(message));
+    message.type = TAPDISK_MESSAGE_XENBLKIF_CONNECT;
+    message.cookie = minor;
+
+    message.u.blkif.domid = domid;
+    message.u.blkif.devid = devid;
+    for (i = 0; i < 1 << order; i++)
+        message.u.blkif.gref[i] = grefs[i];
+    message.u.blkif.order = order;
+    message.u.blkif.port = port;
+    message.u.blkif.proto = proto;
+    if (pool)
+        strncpy(message.u.blkif.pool, pool, sizeof(message.u.blkif.pool));
+    else
+        message.u.blkif.pool[0] = 0;
+
+    err = tap_ctl_connect_send_and_receive(pid, &message, NULL);
+    if (err)
+        return err;
+
+    if (message.type == TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP)
+        err = -message.u.response.error;
+    else
+        err = -EINVAL;
+
+    return err;
+}
+
+int tap_ctl_disconnect_xenblkif(pid_t pid, int minor, domid_t domid,
+        int devid, struct timeval *timeout)
+{
+    tapdisk_message_t message;
+    int err;
+
+    memset(&message, 0, sizeof(message));
+    message.type = TAPDISK_MESSAGE_XENBLKIF_DISCONNECT;
+    message.cookie = minor;
+    message.u.blkif.domid = domid;
+    message.u.blkif.devid = devid;
+
+    err = tap_ctl_connect_send_and_receive(pid, &message, timeout);
+    if (message.type == TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP)
+        err = -message.u.response.error;
+    else
+        err = -EINVAL;
+
+    return err;
+}
diff --git a/tools/blktap3/control/tap-ctl-xen.h b/tools/blktap3/control/tap-ctl-xen.h
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/control/tap-ctl-xen.h
@@ -0,0 +1,61 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#ifndef __TAP_CTL_XEN_H__
+#define __TAP_CTL_XEN_H__
+
+#include <xen/xen.h>
+#include <xen/grant_table.h>
+#include <xen/event_channel.h>
+
+/**
+ * Instructs a tapdisk to connect to the shared ring.
+ *
+ * TODO further explain parameters
+ *
+ * @param pid the process ID of the tapdisk that should connect to the shared
+ * ring
+ * @param minor the minor number of the virtual block device
+ * @param domid the domain ID of the guest VM
+ * @param devid the device ID
+ * @param grefs the grant references
+ * @param order number of grant references, expressed as a 2's order
+ * @param port event channel port
+ * @param proto the protocol: native (XENIO_BLKIF_PROTO_NATIVE),
+ * x86 (XENIO_BLKIF_PROTO_X86_32), or x64 (XENIO_BLKIF_PROTO_X86_64)
+ * @param pool TODO page pool?
+ * @returns 0 on success, an error code otherwise
+ */
+int tap_ctl_connect_xenblkif(pid_t pid, int minor, domid_t domid, int devid,
+        const grant_ref_t * grefs, int order, evtchn_port_t port, int proto,
+        const char *pool);
+
+/**
+ * Instructs a tapdisk to disconnect from the shared ring.
+ *
+ * @param pid the process ID of the tapdisk that should disconnect
+ * @param minor the minor number of the virtual block device
+ * @param domid the ID of the guest VM
+ * @param devid the device ID of the virtual block device
+ * @param timeout timeout to wait, if NULL the function will wait indefinitely
+ */
+int tap_ctl_disconnect_xenblkif(pid_t pid, int minor, domid_t domid,
+        int devid, struct timeval *timeout);
+
+#endif /* __TAP_CTL_XEN_H__ */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (4 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2013-01-18 13:55   ` Ian Campbell
  2012-12-04 18:19 ` [PATCH 7 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message exchange functionality Thanos Makatos
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces function tap_ctl_info. It is used by the tapback daemon
to retrieve the size of the block device, as well as the sector size,  in order
to communicate it to blkfront so that it can create the virtual block device.

diff --git a/tools/blktap3/control/tap-ctl-info.c b/tools/blktap3/control/tap-ctl-info.c
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/control/tap-ctl-info.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#include <stdio.h>
+#include <string.h>
+
+#include "tap-ctl.h"
+#include "tap-ctl-info.h"
+
+int tap_ctl_info(pid_t pid, int minor, unsigned long long *sectors,
+        unsigned int *sector_size, unsigned int *info)
+{
+    tapdisk_message_t message;
+    int err;
+
+    memset(&message, 0, sizeof(message));
+    message.type = TAPDISK_MESSAGE_DISK_INFO;
+    message.cookie = minor;
+
+    err = tap_ctl_connect_send_and_receive(pid, &message, NULL);
+    if (err)
+        return err;
+
+    if (message.type != TAPDISK_MESSAGE_DISK_INFO_RSP)
+        return -EINVAL;
+
+    *sectors = message.u.image.sectors;
+    *sector_size = message.u.image.sector_size;
+    *info = message.u.image.info;
+    return err;
+}
diff --git a/tools/blktap3/control/tap-ctl-info.h b/tools/blktap3/control/tap-ctl-info.h
new file mode 100644
--- /dev/null
+++ b/tools/blktap3/control/tap-ctl-info.h
@@ -0,0 +1,36 @@
+/*
+ * Copyright (C) 2012      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
+ * USA.
+ */
+
+#ifndef __TAP_CTL_INFO_H__
+#define __TAP_CTL_INFO_H__
+
+/**
+ * Retrieves virtual disk information from a tapdisk.
+ *
+ * @pid the process ID of the tapdisk process
+ * @minor the minor device number
+ * @sectors output parameter that receives the number of sectors
+ * @sector_size output parameter that receives the size of the sector
+ * @info TODO ?
+ * 
+ */
+int tap_ctl_info(pid_t pid, int minor, unsigned long long *sectors,
+        unsigned int *sector_size, unsigned int *info);
+
+#endif /* __TAP_CTL_INFO_H__ */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 7 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message exchange functionality
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (5 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2012-12-04 18:19 ` [PATCH 8 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk spawn functionality Thanos Makatos
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch introduces file conrol/tap-ctl-ipc.c, where the functionality of
talking to a tapdisk process is implemented. This file is imported from the
existing blktap2 implementation, with most changes coming from blktap2 living
in github.

diff --git a/tools/blktap2/control/tap-ctl-ipc.c b/tools/blktap3/control/tap-ctl-ipc.c
copy from tools/blktap2/control/tap-ctl-ipc.c
copy to tools/blktap3/control/tap-ctl-ipc.c
--- a/tools/blktap2/control/tap-ctl-ipc.c
+++ b/tools/blktap3/control/tap-ctl-ipc.c
@@ -30,44 +30,33 @@
 #include <unistd.h>
 #include <stdlib.h>
 #include <string.h>
+#include <fcntl.h>
 #include <sys/un.h>
 #include <sys/stat.h>
 #include <sys/types.h>
 #include <sys/socket.h>
 
 #include "tap-ctl.h"
-#include "blktap2.h"
+#include "blktap3.h"
 
 int tap_ctl_debug = 0;
 
 int
-tap_ctl_read_message(int fd, tapdisk_message_t *message, int timeout)
+tap_ctl_read_raw(int fd, void *buf, size_t size, struct timeval *timeout)
 {
 	fd_set readfds;
-	int ret, len, offset;
-	struct timeval tv, *t;
+	size_t offset = 0;
+	int ret;
 
-	t      = NULL;
-	offset = 0;
-	len    = sizeof(tapdisk_message_t);
-
-	if (timeout) {
-		tv.tv_sec  = timeout;
-		tv.tv_usec = 0;
-		t = &tv;
-	}
-
-	memset(message, 0, sizeof(tapdisk_message_t));
-
-	while (offset < len) {
+	while (offset < size) {
 		FD_ZERO(&readfds);
 		FD_SET(fd, &readfds);
 
-		ret = select(fd + 1, &readfds, NULL, NULL, t);
+		ret = select(fd + 1, &readfds, NULL, NULL, timeout);
 		if (ret == -1)
 			break;
 		else if (FD_ISSET(fd, &readfds)) {
-			ret = read(fd, message + offset, len - offset);
+			ret = read(fd, buf + offset, size - offset);
 			if (ret <= 0)
 				break;
 			offset += ret;
@@ -75,34 +64,24 @@ tap_ctl_read_message(int fd, tapdisk_mes
 			break;
 	}
 
-	if (offset != len) {
-		EPRINTF("failure reading message\n");
+	if (offset != size) {
+		EPRINTF("failure reading data %zd/%zd\n", offset, size);
 		return -EIO;
 	}
 
-	DBG("received '%s' message (uuid = %u)\n",
-	    tapdisk_message_name(message->type), message->cookie);
-
 	return 0;
 }
 
 int
-tap_ctl_write_message(int fd, tapdisk_message_t *message, int timeout)
+tap_ctl_write_message(int fd, tapdisk_message_t * message,
+					  struct timeval *timeout)
 {
 	fd_set writefds;
 	int ret, len, offset;
-	struct timeval tv, *t;
 
-	t      = NULL;
 	offset = 0;
 	len    = sizeof(tapdisk_message_t);
 
-	if (timeout) {
-		tv.tv_sec  = timeout;
-		tv.tv_usec = 0;
-		t = &tv;
-	}
-
 	DBG("sending '%s' message (uuid = %u)\n",
 	    tapdisk_message_name(message->type), message->cookie);
 
@@ -113,7 +92,7 @@ tap_ctl_write_message(int fd, tapdisk_me
 		/* we don't bother reinitializing tv. at worst, it will wait a
 		 * bit more time than expected. */
 
-		ret = select(fd + 1, NULL, &writefds, NULL, t);
+		ret = select(fd + 1, NULL, &writefds, NULL, timeout);
 		if (ret == -1)
 			break;
 		else if (FD_ISSET(fd, &writefds)) {
@@ -134,7 +113,8 @@ tap_ctl_write_message(int fd, tapdisk_me
 }
 
 int
-tap_ctl_send_and_receive(int sfd, tapdisk_message_t *message, int timeout)
+tap_ctl_send_and_receive(int sfd, tapdisk_message_t * message,
+						 struct timeval *timeout)
 {
 	int err;
 
@@ -161,7 +141,7 @@ tap_ctl_socket_name(int id)
 	char *name;
 
 	if (asprintf(&name, "%s/%s%d",
-		     BLKTAP2_CONTROL_DIR, BLKTAP2_CONTROL_SOCKET, id) == -1)
+				 BLKTAP3_CONTROL_DIR, BLKTAP3_CONTROL_SOCKET, id) == -1)
 		return NULL;
 
 	return name;
@@ -216,13 +196,15 @@ tap_ctl_connect_id(int id, int *sfd)
 	}
 
 	err = tap_ctl_connect(name, sfd);
+
 	free(name);
 
 	return err;
 }
 
 int
-tap_ctl_connect_send_and_receive(int id, tapdisk_message_t *message, int timeout)
+tap_ctl_connect_send_and_receive(int id, tapdisk_message_t * message,
+								 struct timeval *timeout)
 {
 	int err, sfd;
 
@@ -235,3 +217,20 @@ tap_ctl_connect_send_and_receive(int id,
 	close(sfd);
 	return err;
 }
+
+int
+tap_ctl_read_message(int fd, tapdisk_message_t * message,
+					 struct timeval *timeout)
+{
+	size_t size = sizeof(tapdisk_message_t);
+	int err;
+
+	err = tap_ctl_read_raw(fd, message, size, timeout);
+	if (err)
+		return err;
+
+	DBG("received '%s' message (uuid = %u)\n",
+		tapdisk_message_name(message->type), message->cookie);
+
+	return 0;
+}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 8 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk spawn functionality
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (6 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 7 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message exchange functionality Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2012-12-04 18:19 ` [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality Thanos Makatos
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch imports file control/tap-ctl-spawn.c from the existing blktap2
implementation, with most changes coming from blktap2 living in github.
Function tap-ctl-spawn is used for spawning a new tapdisk process in order to
serve a new virtual block device.

diff --git a/tools/blktap2/control/tap-ctl-spawn.c b/tools/blktap3/control/tap-ctl-spawn.c
copy from tools/blktap2/control/tap-ctl-spawn.c
copy to tools/blktap3/control/tap-ctl-spawn.c
--- a/tools/blktap2/control/tap-ctl-spawn.c
+++ b/tools/blktap3/control/tap-ctl-spawn.c
@@ -31,15 +31,17 @@
 #include <unistd.h>
 #include <stdlib.h>
 #include <string.h>
+#include <signal.h>
 #include <sys/wait.h>
 
 #include "tap-ctl.h"
-#include "blktap2.h"
+#include "blktap3.h"
+#include "_paths.h"
 
 static pid_t
 __tap_ctl_spawn(int *readfd)
 {
-	int err, child, channel[2];
+	int child, channel[2];
 	char *tapdisk;
 
 	if (pipe(channel)) {
@@ -71,14 +73,21 @@ static pid_t
 	close(channel[0]);
 	close(channel[1]);
 
-	tapdisk = getenv("TAPDISK2");
+	tapdisk = getenv("TAPDISK");
 	if (!tapdisk)
-		tapdisk = "tapdisk2";
+		tapdisk = getenv("TAPDISK2");
 
-	execlp(tapdisk, tapdisk, NULL);
+	if (tapdisk) {
+		execlp(tapdisk, tapdisk, NULL);
+		exit(errno);
+	}
 
-	EPRINTF("exec failed\n");
-	exit(1);
+	execl(TAPDISK_EXECDIR "/" TAPDISK_EXEC, TAPDISK_EXEC, NULL);
+
+	if (errno == ENOENT)
+		execl(TAPDISK_BUILDDIR "/" TAPDISK_EXEC, TAPDISK_EXEC, NULL);
+
+	exit(errno);
 }
 
 pid_t
@@ -90,7 +99,7 @@ tap_ctl_get_pid(const int id)
 	memset(&message, 0, sizeof(message));
 	message.type = TAPDISK_MESSAGE_PID;
 
-	err = tap_ctl_connect_send_and_receive(id, &message, 2);
+	err = tap_ctl_connect_send_and_receive(id, &message, NULL);
 	if (err)
 		return err;
 
@@ -119,6 +128,12 @@ tap_ctl_wait(pid_t child)
 	if (WIFSIGNALED(status)) {
 		int signo = WTERMSIG(status);
 		EPRINTF("tapdisk2[%d] killed by signal %d\n", child, signo);
+		if (signo == SIGUSR1)
+			/* NB. there's a race between tapdisk's
+			 * sigaction init and xen-bugtool shooting
+			 * debug signals. If killed by something as
+			 * innocuous as USR1, then retry. */
+			return -EAGAIN;
 		return -EINTR;
 	}
 
@@ -139,8 +154,8 @@ tap_ctl_get_child_id(int readfd)
 	}
 
 	errno = 0;
-	if (fscanf(f, BLKTAP2_CONTROL_DIR"/"
-		   BLKTAP2_CONTROL_SOCKET"%d", &id) != 1) {
+	if (fscanf(f, BLKTAP3_CONTROL_DIR "/"
+			   BLKTAP3_CONTROL_SOCKET "%d", &id) != 1) {
 		errno = (errno ? : EINVAL);
 		EPRINTF("parsing id failed: %d\n", errno);
 		id = -1;
@@ -158,13 +173,17 @@ tap_ctl_spawn(void)
 
 	readfd = -1;
 
+  again:
 	child = __tap_ctl_spawn(&readfd);
 	if (child < 0)
 		return child;
 
 	err = tap_ctl_wait(child);
-	if (err)
+	if (err) {
+		if (err == -EAGAIN)
+			goto again;
 		return err;
+	}
 
 	id = tap_ctl_get_child_id(readfd);
 	if (id < 0)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (7 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 8 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk spawn functionality Thanos Makatos
@ 2012-12-04 18:19 ` Thanos Makatos
  2013-01-18 13:59   ` Ian Campbell
  2013-01-11 15:40 ` [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
  2013-01-18 13:39 ` Ian Campbell
  10 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2012-12-04 18:19 UTC (permalink / raw)
  To: xen-devel; +Cc: thanos.makatos

This patch imports control/Makefile from the existing blktap2 implementation,
building only the libblktapctl functionality required by the tapback daemon.
The rest of the binaries/functionality will be introduced by a later patch.

Defines TAPDISK_EXEC, TAPDISK_EXECDIR, and TAPDISK_BUILDDIR are used by
tap-ctl-spawn, as it needs to know where the tapdisk binary is located in order
to spawn a tapdisk process. Variables TAPDISK_EXEC and TAPDISK_EXECDIR are
declared in config/StdGNU.mk and end up as #define's using buildmakevars2file,
just like libxl does.

diff --git a/Config.mk b/Config.mk
--- a/Config.mk
+++ b/Config.mk
@@ -147,7 +147,7 @@ define buildmakevars2file-closure
 	$(foreach var,                                                      \
 	          SBINDIR BINDIR LIBEXEC LIBDIR SHAREDIR PRIVATE_BINDIR     \
 	          XENFIRMWAREDIR XEN_CONFIG_DIR XEN_SCRIPT_DIR XEN_LOCK_DIR \
-	          XEN_RUN_DIR XEN_PAGING_DIR,                               \
+	          XEN_RUN_DIR XEN_PAGING_DIR TAPDISK_EXEC TAPDISK_EXECDIR,  \
 	          echo "$(var)=\"$($(var))\"" >>$(1).tmp;)        \
 	$(call move-if-changed,$(1).tmp,$(1))
 endef
diff --git a/config/StdGNU.mk b/config/StdGNU.mk
--- a/config/StdGNU.mk
+++ b/config/StdGNU.mk
@@ -77,3 +77,6 @@ ifeq ($(lto),y)
 CFLAGS += -flto
 LDFLAGS-$(clang) += -plugin LLVMgold.so
 endif
+
+TAPDISK_EXEC = "tapdisk"
+TAPDISK_EXECDIR = $(LIBEXEC)
diff --git a/tools/blktap2/control/Makefile b/tools/blktap3/control/Makefile
copy from tools/blktap2/control/Makefile
copy to tools/blktap3/control/Makefile
--- a/tools/blktap2/control/Makefile
+++ b/tools/blktap3/control/Makefile
@@ -6,40 +6,45 @@ MINOR              = 0
 LIBNAME            = libblktapctl
 LIBSONAME          = $(LIBNAME).so.$(MAJOR)
 
-IBIN               = tap-ctl
+genpath-target = $(call buildmakevars2file,_paths.h.tmp)
+$(eval $(genpath-target))
 
-CFLAGS            += -Werror
-CFLAGS            += -Wno-unused
-CFLAGS            += -I../include -I../drivers
-CFLAGS            += $(CFLAGS_xeninclude)
-CFLAGS            += $(CFLAGS_libxenctrl)
-CFLAGS            += -D_GNU_SOURCE
-CFLAGS            += -DTAPCTL
+_paths.h: genpath
+	sed -e "s/\([^=]*\)=\(.*\)/#define \1 \2/g" $@.tmp >$@.2.tmp
+	rm -f $@.tmp
+	$(call move-if-changed,$@.2.tmp,$@)
 
-CTL_OBJS  := tap-ctl-ipc.o
+override CFLAGS += \
+	-I../include \
+	-DTAPDISK_BUILDDIR='"../drivers"' \
+	$(CFLAGS_xeninclude) \
+	$(CFLAGS_libxenctrl) \
+	-D_GNU_SOURCE \
+	-DTAPCTL \
+    -Wall \
+    -Wextra \
+    -Werror
+
+# FIXME cause trouble
+override CFLAGS += \
+    -Wno-type-limits \
+    -Wno-missing-field-initializers \
+    -Wno-sign-compare
+
 CTL_OBJS  += tap-ctl-list.o
-CTL_OBJS  += tap-ctl-allocate.o
-CTL_OBJS  += tap-ctl-free.o
-CTL_OBJS  += tap-ctl-create.o
-CTL_OBJS  += tap-ctl-destroy.o
+CTL_OBJS  += tap-ctl-info.o
+CTL_OBJS  += tap-ctl-xen.o
+CTL_OBJS  += tap-ctl-ipc.o
 CTL_OBJS  += tap-ctl-spawn.o
-CTL_OBJS  += tap-ctl-attach.o
-CTL_OBJS  += tap-ctl-detach.o
-CTL_OBJS  += tap-ctl-open.o
-CTL_OBJS  += tap-ctl-close.o
-CTL_OBJS  += tap-ctl-pause.o
-CTL_OBJS  += tap-ctl-unpause.o
-CTL_OBJS  += tap-ctl-major.o
-CTL_OBJS  += tap-ctl-check.o
+
+tap-ctl-spawn.o: _paths.h
 
 CTL_PICS  = $(patsubst %.o,%.opic,$(CTL_OBJS))
 
-OBJS = $(CTL_OBJS) tap-ctl.o
 PICS = $(CTL_PICS)
 
 LIB_STATIC = $(LIBNAME).a
 LIB_SHARED = $(LIBSONAME).$(MINOR)
-IBIN = tap-ctl
 
 all: build
 
@@ -51,27 +56,24 @@ build: $(IBIN) $(LIB_STATIC) $(LIB_SHARE
 $(LIBSONAME): $(LIB_SHARED)
 	ln -sf $< $@
 
-tap-ctl: tap-ctl.o $(LIBNAME).so
-	$(CC) $(LDFLAGS) -o $@ $^
-
 $(LIB_STATIC): $(CTL_OBJS)
 	$(AR) r $@ $^
 
 $(LIB_SHARED): $(CTL_PICS)
 	$(CC) $(LDFLAGS) -fPIC  -Wl,$(SONAME_LDFLAG) -Wl,$(LIBSONAME) $(SHLIB_LDFLAGS) -rdynamic $^ -o $@
 
-install: $(IBIN) $(LIB_STATIC) $(LIB_SHARED)
+install: $(LIB_STATIC) $(LIB_SHARED)
 	$(INSTALL_DIR) -p $(DESTDIR)$(SBINDIR)
-	$(INSTALL_PROG) $(IBIN) $(DESTDIR)$(SBINDIR)
 	$(INSTALL_DATA) $(LIB_STATIC) $(DESTDIR)$(LIBDIR)
 	$(INSTALL_PROG) $(LIB_SHARED) $(DESTDIR)$(LIBDIR)
 	ln -sf $(LIBSONAME) $(DESTDIR)$(LIBDIR)/$(LIBNAME).so
 	ln -sf $(LIB_SHARED) $(DESTDIR)$(LIBDIR)/$(LIBSONAME)
 
 clean:
-	rm -f $(OBJS) $(PICS) $(DEPS) $(IBIN) $(LIB_STATIC) $(LIB_SHARED)
+	rm -f $(CTL_OBJS) $(PICS) $(DEPS) $(LIB_STATIC) $(LIB_SHARED)
 	rm -f $(LIBNAME).so $(LIBSONAME)
 	rm -f *~
+	rm -f _paths.h
 
 .PHONY: all build clean install

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (8 preceding siblings ...)
  2012-12-04 18:19 ` [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality Thanos Makatos
@ 2013-01-11 15:40 ` Thanos Makatos
  2013-01-18 13:39 ` Ian Campbell
  10 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2013-01-11 15:40 UTC (permalink / raw)
  To: xen-devel

Ping?

> -----Original Message-----
> From: Thanos Makatos [mailto:thanos.makatos@citrix.com]
> Sent: 04 December 2012 18:20
> To: xen-devel@lists.xensource.com
> Cc: Thanos Makatos
> Subject: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of
> blktap3 files
> 
> blktap3 is a disk back-end driver. It is based on blktap2 but does not
> require
> the blktap/blkback kernel modules as it allows tapdisk to talk directly
> to
> blkfront. This primarily simplifies maintenance, and _may_ lead to
> performance
> improvements. blktap3 is based on a blktap2 fork maintained mostly by
> Citrix
> (it lives in github), so these changes are also imported, apart from
> the
> blktap3 ones.
> 
> I've organised my upstream effort as follows:
> 1. Upstream the smallest possible subset of blktap3 that will allow
> guest VMs
>    to use RAW images backed by blktap3. This will enable early testing
> on the
>    bits introduced by blktap3.
> 2. Upstream the remaining of blktap3, most notably back-end drivers,
> e.g. VHD
>    etc.
> 3. Import bug fixes from blktap2 living in github.
> 4. Import new features and optimisations from blktap2 living in github,
> e.g.
>    the mirroring plug-in.
> 
> blktap3 is made of the following components:
> 1. blkfront (not a blktap3 component and already upstream): a virtual
>    block device driver in the guest VM that receives block I/O requests
> and
>    forwards them to tapdisk via the shared ring.
> 2. tapdisk: a user space process that receives block I/O requests from
>    blkfront, translates them to whatever the current backing file
> format is
>    (i.e. RAW, VHD, qcow etc.), and performs the actual I/O. Apart from
> block
>    I/O requests, the tapdisk also allows basic management of each
> virtual
>    block device, e.g. a device may be temporarily paused. tapdisk
> listens to
>    a loopback socket for such commands. The tap-ctl utility (explained
> later)
>    can be used for managing the tapdisk.
> 3. libtapback: a user space library that implements the functionality
> required
>    to access the shared ring. It is used by tapdisk to obtain the block
> I/O
>    requests forwarded by blkfront, and to produce the corresponding
> responses.
>    This is the very "heart" of blktap3, it's architecture will be
> thoroughly
>    explained by the patch series that introduces it.
> 4. tapback: a user space daemon that acts as the back-end of each
> virtual
>    block device: it monitors XenStore for the block front-end's state
> changes,
>    creates/destroys the shared ring, and instructs the tapdisk to
> connect
>    to/disconnect from the shared ring. It also communicates to the
> block
>    front-end required parameters (e.g. block device size in sectors)
> via
>    XenStore.
> 5. libblktapctl: a user space library where the tapdisk management
> functions
>    are implemented.
> 6. tap-ctl: a user space utility that allows management of the tapdisk,
> uses
>    libblktapctl.
> 
> The tapdisk is spawned/destroyed by libxl when a domain is
> created/destroyed,
> in the exact same way as in blktap2. libxl uses libblktapctl for this.
> 
> This patch series introduces a small subset of files required by
> tapback (the
> tapback daemon is introduced by the next patch series):
> - basic blktap3 header files
> - a rudimentary implementation of libblktapctl. Only the bits required
> by
>   tapback to manage the tapdisk are introduced, the rest of this
> library will
>   be introduced by later patches.
> 
> ---
> Changed since v1:
> * In all patches the patch message has been improved.
> * Patches 1, 5, and 6 use GPLv2.
> * Patch 0: Basic explanation of blktap3's fundamental components.
> * Patch 9: Improved tools/blktap3/control/Makefile by moving hard coded
>   paths to config/StdGNU.mk.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
                   ` (9 preceding siblings ...)
  2013-01-11 15:40 ` [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
@ 2013-01-18 13:39 ` Ian Campbell
  2013-01-18 16:37   ` Thanos Makatos
  10 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 13:39 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> 2. tapdisk: a user space process that receives block I/O requests from
>    blkfront, translates them to whatever the current backing file format is
>    (i.e. RAW, VHD, qcow etc.), and performs the actual I/O. Apart from block
>    I/O requests, the tapdisk also allows basic management of each virtual
>    block device, e.g. a device may be temporarily paused. tapdisk listens to
>    a loopback socket for such commands. The tap-ctl utility (explained later)
>    can be used for managing the tapdisk.
[...]
> 4. tapback: a user space daemon that acts as the back-end of each virtual
>    block device: it monitors XenStore for the block front-end's state changes,
>    creates/destroys the shared ring, and instructs the tapdisk to connect
>    to/disconnect from the shared ring. It also communicates to the block
>    front-end required parameters (e.g. block device size in sectors) via
>    XenStore.

There is 1 tapdisk per VBD but how many tapbacks are there? One per VBD
as well or one per domain or per driver domain?

Is tapback involved in things after the ring has been setup and it has
instructed tapdisk to use it? i.e. is it on the datapath at all?

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers
  2012-12-04 18:19 ` [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers Thanos Makatos
@ 2013-01-18 13:45   ` Ian Campbell
  2013-01-18 17:43     ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 13:45 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> This patch introduces basic blktap3 header files.
> 
> diff --git a/tools/blktap3/include/blktap3.h b/tools/blktap3/include/blktap3.h
> new file mode 100644
> --- /dev/null
> +++ b/tools/blktap3/include/blktap3.h
> @@ -0,0 +1,47 @@
> +/*
> + * Copyright (C) 2012      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.

I see a mix of GPL and LGPL in this patch (compile.h is LGPL, this is
GPL). That's fine if that is what you meant but it will mean the result
is GPL.

> + * 
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + * 
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301,
> + * USA.
> + *
> + * Commonly used headers and definitions.
> + */
> +
> +#ifndef __BLKTAP_3_H__
> +#define __BLKTAP_3_H__
> +
> +#include "compiler.h"
> +
> +/* TODO remove from other files */
> +#include <xen-external/bsd-sys-queue.h>
> +
> +#define BLKTAP3_CONTROL_NAME        "blktap-control"
> +#define BLKTAP3_CONTROL_DIR         "/var/run/"BLKTAP3_CONTROL_NAME
> +#define BLKTAP3_CONTROL_SOCKET      "ctl"
> +
> +#define BLKTAP3_ENOSPC_SIGNAL_FILE  "/var/run/tapdisk3-enospc"
> +
> +/*
> + * TODO They may have to change due to macro namespacing.
> + */
> +#define TAILQ_MOVE_HEAD(node, src, dst, entry)	\
> +	TAILQ_REMOVE(src, node, entry);				\
> +	TAILQ_INSERT_HEAD(dst, node, entry);
> +
> +#define TAILQ_MOVE_TAIL(node, src, dst, entry)	\
> +	TAILQ_REMOVE(src, node, entry);				\
> +	TAILQ_INSERT_TAIL(dst, node, entry);
> +
> +#endif /* __BLKTAP_3_H__ */
> diff --git a/tools/blktap3/include/compiler.h b/tools/blktap3/include/compiler.h
> new file mode 100644
> --- /dev/null
> +++ b/tools/blktap3/include/compiler.h
> @@ -0,0 +1,42 @@
> +/*
> + * Copyright (C) 2012      Citrix Ltd.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + * 
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + * 
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301
> + * USA
> + */
> +
> +#ifndef __COMPILER_H__
> +#define __COMPILER_H__
> +
> +#define likely(_cond)	__builtin_expect(!!(_cond), 1)
> +#define unlikely(_cond)	__builtin_expect(!!(_cond), 0)
> +
> +/*
> + * FIXME taken from list.h, do we need to mention anything about the license?

In general a good idea to. I assume it is GPL which is not the license
at the head of this file (which is LGPL).

You could use the LGPL tools/libxl/libxl_internal.h:CONTAINER_OF
instead, you could even change the name to containerof...

> + */
> +#define containerof(_ptr, _type, _memb) \
> +	((_type*)((void*)(_ptr) - offsetof(_type, _memb)))
> +
> +#define __printf(a, b)	__attribute__((format(printf, a, b)))
> +#define __scanf(_f, _a) __attribute__((format (scanf, _f, _a)))
> +
> +#ifndef ARRAY_SIZE

Is someone leaking this into your scope? Can we fix them?

> +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
> +#endif /* ARRAY_SIZE */
> +
> +#define UNUSED_PARAMETER(x) \
> +    (void)(x);
> +
> +#endif /* __COMPILER_H__ */
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures
  2012-12-04 18:19 ` [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures Thanos Makatos
@ 2013-01-18 13:49   ` Ian Campbell
  2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 13:49 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> +	/**
> +	 * Protocol to use: native, 32 bit, or 64 bit.
> +	 * TODO Is this used for supporting a 32 bit guest on a 64 bit hypervisor?

Sort of. The proto stuff is to support 32-bit dom0 talking to 64-bit
domU and vice versa.

> +	TAPDISK_MESSAGE_DISK_INFO_RSP,
> +	TAPDISK_MESSAGE_XENBLKIF_CONNECT,
> +	TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP,
> +	TAPDISK_MESSAGE_XENBLKIF_DISCONNECT,
> +	TAPDISK_MESSAGE_XENBLKIF_DISCONNECT_RSP,
>  	TAPDISK_MESSAGE_EXIT,
>  };
>  
> -static inline char *
> -tapdisk_message_name(enum tapdisk_message_id id)
> -{
> -	switch (id) {
> -	case TAPDISK_MESSAGE_ERROR:
> -		return "error";
> +#define TAPDISK_MESSAGE_MAX TAPDISK_MESSAGE_EXIT

A nice trick here is to include TAPDISK_MESSAGE_MAX in the enum.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring
  2012-12-04 18:19 ` [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring Thanos Makatos
@ 2013-01-18 13:54   ` Ian Campbell
  2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 13:54 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:

> +    err = tap_ctl_connect_send_and_receive(pid, &message, NULL);
> +    if (err)
> +        return err;
> +
> +    if (message.type == TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP)
> +        err = -message.u.response.error;
> +    else
> +        err = -EINVAL;
> +
> +    return err;
> +}
> +
> +int tap_ctl_disconnect_xenblkif(pid_t pid, int minor, domid_t domid,
> +        int devid, struct timeval *timeout)
> +{
> +    tapdisk_message_t message;
> +    int err;
> +
> +    memset(&message, 0, sizeof(message));
> +    message.type = TAPDISK_MESSAGE_XENBLKIF_DISCONNECT;
> +    message.cookie = minor;
> +    message.u.blkif.domid = domid;
> +    message.u.blkif.devid = devid;
> +
> +    err = tap_ctl_connect_send_and_receive(pid, &message, timeout);

In the code above you had an "if err return err" here. I'd expect them
to behave similarly, whichever is right.

If this is a common pattern it might be something to consider pushing
into tap_ctl_connect_send_and_receive.

> +    if (message.type == TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP)

CONNECT_RSP to a DISCONNECT message?

> +        err = -message.u.response.error;
> +    else
> +        err = -EINVAL;
> +
> +    return err;
> +}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality
  2012-12-04 18:19 ` [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality Thanos Makatos
@ 2013-01-18 13:55   ` Ian Campbell
  2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 13:55 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

> +int tap_ctl_info(pid_t pid, int minor, unsigned long long *sectors,
> +        unsigned int *sector_size, unsigned int *info)
> +{
> +    tapdisk_message_t message;
> +    int err;
> +
> +    memset(&message, 0, sizeof(message));
> +    message.type = TAPDISK_MESSAGE_DISK_INFO;
> +    message.cookie = minor;
> +
> +    err = tap_ctl_connect_send_and_receive(pid, &message, NULL);
> +    if (err)
> +        return err;
> +
> +    if (message.type != TAPDISK_MESSAGE_DISK_INFO_RSP)
> +        return -EINVAL;

Here's a third pattern to go with the two in the previous patch ;-)

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality
  2012-12-04 18:19 ` [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality Thanos Makatos
@ 2013-01-18 13:59   ` Ian Campbell
  2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 13:59 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> diff --git a/config/StdGNU.mk b/config/StdGNU.mk
> --- a/config/StdGNU.mk
> +++ b/config/StdGNU.mk
> @@ -77,3 +77,6 @@ ifeq ($(lto),y)
>  CFLAGS += -flto
>  LDFLAGS-$(clang) += -plugin LLVMgold.so
>  endif
> +
> +TAPDISK_EXEC = "tapdisk"
> +TAPDISK_EXECDIR = $(LIBEXEC) 

Do these need to be configurable?

I'd be tempted to suggest that you just do 
        #define TAPDISK_EXEC "tapdisk"
        #define TAPDISK_EXECDIR LIBEXEC
in one of you headers.

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 13:39 ` Ian Campbell
@ 2013-01-18 16:37   ` Thanos Makatos
  2013-01-18 16:46     ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2013-01-18 16:37 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> > 4. tapback: a user space daemon that acts as the back-end of each
> virtual
> >    block device: it monitors XenStore for the block front-end's state
> changes,
> >    creates/destroys the shared ring, and instructs the tapdisk to
> connect
> >    to/disconnect from the shared ring. It also communicates to the
> block
> >    front-end required parameters (e.g. block device size in sectors)
> via
> >    XenStore.
> 
> There is 1 tapdisk per VBD but how many tapbacks are there? One per VBD
> as well or one per domain or per driver domain?

There is one tapback daemon in total, serving all VBDs/domains.

The tapback daemon sets a XenStore watch to "backend/<device type>" to serve VBD creations for completely new domains, and then sets an additional watch to the front-end state path for each VBD. I guess there could be one tapback daemon detecting completely new domains, responsible for spawning tapback daemons designated to a specific domain/VBD. I'm not sure of the implications of this approach, though. (We could also have one thread designated to each domain/VBD.) Is there a particular reason why a single tapback daemon wouldn't suffice?

> 
> Is tapback involved in things after the ring has been setup and it has
> instructed tapdisk to use it? i.e. is it on the datapath at all?

The tapback daemon is involved only for running the Xenbus protocol when the front-end is set up/torn down -- no participation in the data path.

> 
> Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 16:37   ` Thanos Makatos
@ 2013-01-18 16:46     ` Ian Campbell
  2013-01-18 16:58       ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 16:46 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Fri, 2013-01-18 at 16:37 +0000, Thanos Makatos wrote:
> > > 4. tapback: a user space daemon that acts as the back-end of each
> > virtual
> > >    block device: it monitors XenStore for the block front-end's state
> > changes,
> > >    creates/destroys the shared ring, and instructs the tapdisk to
> > connect
> > >    to/disconnect from the shared ring. It also communicates to the
> > block
> > >    front-end required parameters (e.g. block device size in sectors)
> > via
> > >    XenStore.
> > 
> > There is 1 tapdisk per VBD but how many tapbacks are there? One per VBD
> > as well or one per domain or per driver domain?
> 
> There is one tapback daemon in total, serving all VBDs/domains.

You mean one per backend domain I assume?

> The tapback daemon sets a XenStore watch to "backend/<device type>" to
> serve VBD creations for completely new domains, and then sets an
> additional watch to the front-end state path for each VBD. I guess
> there could be one tapback daemon detecting completely new domains,
> responsible for spawning tapback daemons designated to a specific
> domain/VBD. I'm not sure of the implications of this approach, though.
> (We could also have one thread designated to each domain/VBD.) Is
> there a particular reason why a single tapback daemon wouldn't
> suffice?

No, it sounds reasonable. I was more concerned there might be one per
VBD in which case it would have made sense to merge tapdisk and tapback.

> > Is tapback involved in things after the ring has been setup and it has
> > instructed tapdisk to use it? i.e. is it on the datapath at all?
> 
> The tapback daemon is involved only for running the Xenbus protocol
> when the front-end is set up/torn down -- no participation in the data
> path.

Oh good.

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 16:46     ` Ian Campbell
@ 2013-01-18 16:58       ` Thanos Makatos
  2013-01-18 17:01         ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2013-01-18 16:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> -----Original Message-----
> From: Ian Campbell
> Sent: 18 January 2013 16:47
> To: Thanos Makatos
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH 0 of 9 RFC v2] blktap3: Introduce a
> small subset of blktap3 files
> 
> On Fri, 2013-01-18 at 16:37 +0000, Thanos Makatos wrote:
> > > > 4. tapback: a user space daemon that acts as the back-end of each
> > > virtual
> > > >    block device: it monitors XenStore for the block front-end's
> state
> > > changes,
> > > >    creates/destroys the shared ring, and instructs the tapdisk to
> > > connect
> > > >    to/disconnect from the shared ring. It also communicates to
> the
> > > block
> > > >    front-end required parameters (e.g. block device size in
> sectors)
> > > via
> > > >    XenStore.
> > >
> > > There is 1 tapdisk per VBD but how many tapbacks are there? One per
> VBD
> > > as well or one per domain or per driver domain?
> >
> > There is one tapback daemon in total, serving all VBDs/domains.
> 
> You mean one per backend domain I assume?

I haven't thought of that, but I believe there can be only one tapback daemon across multiple driver domains. This is because tapback monitors XenStore path "backend/<device type>" (i.e. backend/vbd), so if there were multiple tapback daemon they would all try to serve this event. Is my understanding correct? If this is true then it's a serious limitation for driver domains.

> 
> > The tapback daemon sets a XenStore watch to "backend/<device type>"
> to
> > serve VBD creations for completely new domains, and then sets an
> > additional watch to the front-end state path for each VBD. I guess
> > there could be one tapback daemon detecting completely new domains,
> > responsible for spawning tapback daemons designated to a specific
> > domain/VBD. I'm not sure of the implications of this approach,
> though.
> > (We could also have one thread designated to each domain/VBD.) Is
> > there a particular reason why a single tapback daemon wouldn't
> > suffice?
> 
> No, it sounds reasonable. I was more concerned there might be one per
> VBD in which case it would have made sense to merge tapdisk and
> tapback.
> 
> > > Is tapback involved in things after the ring has been setup and it
> has
> > > instructed tapdisk to use it? i.e. is it on the datapath at all?
> >
> > The tapback daemon is involved only for running the Xenbus protocol
> > when the front-end is set up/torn down -- no participation in the
> data
> > path.
> 
> Oh good.

I'll update the patch description with this information.

> 
> Ian.
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 16:58       ` Thanos Makatos
@ 2013-01-18 17:01         ` Ian Campbell
  2013-01-18 17:15           ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 17:01 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Fri, 2013-01-18 at 16:58 +0000, Thanos Makatos wrote:
> > -----Original Message-----
> > From: Ian Campbell
> > Sent: 18 January 2013 16:47
> > To: Thanos Makatos
> > Cc: xen-devel@lists.xensource.com
> > Subject: Re: [Xen-devel] [PATCH 0 of 9 RFC v2] blktap3: Introduce a
> > small subset of blktap3 files
> > 
> > On Fri, 2013-01-18 at 16:37 +0000, Thanos Makatos wrote:
> > > > > 4. tapback: a user space daemon that acts as the back-end of each
> > > > virtual
> > > > >    block device: it monitors XenStore for the block front-end's
> > state
> > > > changes,
> > > > >    creates/destroys the shared ring, and instructs the tapdisk to
> > > > connect
> > > > >    to/disconnect from the shared ring. It also communicates to
> > the
> > > > block
> > > > >    front-end required parameters (e.g. block device size in
> > sectors)
> > > > via
> > > > >    XenStore.
> > > >
> > > > There is 1 tapdisk per VBD but how many tapbacks are there? One per
> > VBD
> > > > as well or one per domain or per driver domain?
> > >
> > > There is one tapback daemon in total, serving all VBDs/domains.
> > 
> > You mean one per backend domain I assume?
> 
> I haven't thought of that, but I believe there can be only one tapback
> daemon across multiple driver domains. This is because tapback
> monitors XenStore path "backend/<device type>" (i.e. backend/vbd), so
> if there were multiple tapback daemon they would all try to serve this
> event. Is my understanding correct? If this is true then it's a
> serious limitation for driver domains.

"backend/<device-type>" is a relative not absolute path so it is
relative to the driver domains /local/domid/<DOMID>.

I wouldn't expect that a tapback process in one domain would be able to
setup the necessary rings/evtchns in such a way that a tapdisk in
another process could use them.

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 17:01         ` Ian Campbell
@ 2013-01-18 17:15           ` Thanos Makatos
  2013-01-18 17:25             ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2013-01-18 17:15 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> > > > > There is 1 tapdisk per VBD but how many tapbacks are there? One
> > > > > per
> > > VBD
> > > > > as well or one per domain or per driver domain?
> > > >
> > > > There is one tapback daemon in total, serving all VBDs/domains.
> > >
> > > You mean one per backend domain I assume?
> >
> > I haven't thought of that, but I believe there can be only one
> tapback
> > daemon across multiple driver domains. This is because tapback
> > monitors XenStore path "backend/<device type>" (i.e. backend/vbd), so
> > if there were multiple tapback daemon they would all try to serve
> this
> > event. Is my understanding correct? If this is true then it's a
> > serious limitation for driver domains.
> 
> "backend/<device-type>" is a relative not absolute path so it is
> relative to the driver domains /local/domid/<DOMID>.

Then there's no problem with that. So, going back to your original question, there is one tapback daemon per driver domain.

What I don't understand, however, is how a front-end's VBD creation request will end up to the correct driver domain. Does the front-end know under which /local/domid tree in XenStore it should write to? Is this explained somewhere?
 
> 
> I wouldn't expect that a tapback process in one domain would be able to
> setup the necessary rings/evtchns in such a way that a tapdisk in
> another process could use them.

I guess a tapback daemon could delegate this operation to the tapback running at the appropriate driver domain via XenStore, no?

> 
> Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 17:15           ` Thanos Makatos
@ 2013-01-18 17:25             ` Ian Campbell
  2013-01-22 17:29               ` Thanos Makatos
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2013-01-18 17:25 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Fri, 2013-01-18 at 17:15 +0000, Thanos Makatos wrote:
> > > > > > There is 1 tapdisk per VBD but how many tapbacks are there? One
> > > > > > per
> > > > VBD
> > > > > > as well or one per domain or per driver domain?
> > > > >
> > > > > There is one tapback daemon in total, serving all VBDs/domains.
> > > >
> > > > You mean one per backend domain I assume?
> > >
> > > I haven't thought of that, but I believe there can be only one
> > tapback
> > > daemon across multiple driver domains. This is because tapback
> > > monitors XenStore path "backend/<device type>" (i.e. backend/vbd), so
> > > if there were multiple tapback daemon they would all try to serve
> > this
> > > event. Is my understanding correct? If this is true then it's a
> > > serious limitation for driver domains.
> > 
> > "backend/<device-type>" is a relative not absolute path so it is
> > relative to the driver domains /local/domid/<DOMID>.
> 
> Then there's no problem with that. So, going back to your original
> question, there is one tapback daemon per driver domain.
> 
> What I don't understand, however, is how a front-end's VBD creation
> request will end up to the correct driver domain. Does the front-end
> know under which /local/domid tree in XenStore it should write to? Is
> this explained somewhere?

The frontend looks in devices/vbd/<foo>/backend which to get the full
path to to the backend. The toolstack writes this and knows both the
front- and back-end domain ids.

>  
> > 
> > I wouldn't expect that a tapback process in one domain would be able to
> > setup the necessary rings/evtchns in such a way that a tapdisk in
> > another process could use them.
> 
> I guess a tapback daemon could delegate this operation to the tapback
> running at the appropriate driver domain via XenStore, no?

Yes.

In fact I'm wondering why the tapback daemon does all the ring setup,
xenbus negotiation anyway. It could just watch "backend/<device-type>"
and when it sees a new backend/<device-type>/<domid>/<devid> fork a new
"tapdisk --domain <domid> --devid <devid>" and have tapdisk take care of
the protocol entirely. Saves a bunch of messaging back and forth I
think.

I don't have a problem with the existing model, it just strikes me as a
bit odd (but then I don't really know the architecture)

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers
  2013-01-18 13:45   ` Ian Campbell
@ 2013-01-18 17:43     ` Thanos Makatos
  2013-01-21 10:37       ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2013-01-18 17:43 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> -----Original Message-----
> From: Ian Campbell
> Sent: 18 January 2013 13:46
> To: Thanos Makatos
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH 1 of 9 RFC v2] blktap3: Introduce
> blktap3 headers
> 
> On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> > This patch introduces basic blktap3 header files.
> >
> > diff --git a/tools/blktap3/include/blktap3.h
> b/tools/blktap3/include/blktap3.h
> > new file mode 100644
> > --- /dev/null
> > +++ b/tools/blktap3/include/blktap3.h
> > @@ -0,0 +1,47 @@
> > +/*
> > + * Copyright (C) 2012      Citrix Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version 2
> > + * of the License, or (at your option) any later version.
> 
> I see a mix of GPL and LGPL in this patch (compile.h is LGPL, this is
> GPL). That's fine if that is what you meant but it will mean the result
> is GPL.

That was accidental, I'll make compiler.h's license GPL.

> 
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301,
> > + * USA.
> > + *
> > + * Commonly used headers and definitions.
> > + */
> > +
> > +#ifndef __BLKTAP_3_H__
> > +#define __BLKTAP_3_H__
> > +
> > +#include "compiler.h"
> > +
> > +/* TODO remove from other files */
> > +#include <xen-external/bsd-sys-queue.h>
> > +
> > +#define BLKTAP3_CONTROL_NAME        "blktap-control"
> > +#define BLKTAP3_CONTROL_DIR         "/var/run/"BLKTAP3_CONTROL_NAME
> > +#define BLKTAP3_CONTROL_SOCKET      "ctl"
> > +
> > +#define BLKTAP3_ENOSPC_SIGNAL_FILE  "/var/run/tapdisk3-enospc"
> > +
> > +/*
> > + * TODO They may have to change due to macro namespacing.
> > + */
> > +#define TAILQ_MOVE_HEAD(node, src, dst, entry)	\
> > +	TAILQ_REMOVE(src, node, entry);				\
> > +	TAILQ_INSERT_HEAD(dst, node, entry);
> > +
> > +#define TAILQ_MOVE_TAIL(node, src, dst, entry)	\
> > +	TAILQ_REMOVE(src, node, entry);				\
> > +	TAILQ_INSERT_TAIL(dst, node, entry);
> > +
> > +#endif /* __BLKTAP_3_H__ */
> > diff --git a/tools/blktap3/include/compiler.h
> b/tools/blktap3/include/compiler.h
> > new file mode 100644
> > --- /dev/null
> > +++ b/tools/blktap3/include/compiler.h
> > @@ -0,0 +1,42 @@
> > +/*
> > + * Copyright (C) 2012      Citrix Ltd.
> > + *
> > + * This library is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later
> version.
> > + *
> > + * This library is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with this library; if not, write to the Free
> Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > + * USA
> > + */
> > +
> > +#ifndef __COMPILER_H__
> > +#define __COMPILER_H__
> > +
> > +#define likely(_cond)	__builtin_expect(!!(_cond), 1)
> > +#define unlikely(_cond)	__builtin_expect(!!(_cond), 0)
> > +
> > +/*
> > + * FIXME taken from list.h, do we need to mention anything about the
> license?
> 
> In general a good idea to. I assume it is GPL which is not the license
> at the head of this file (which is LGPL).
> 
> You could use the LGPL tools/libxl/libxl_internal.h:CONTAINER_OF
> instead, you could even change the name to containerof...

That would work, should I include tools/libxl/libxl_internal.h or just copy CONTAINER_OF's definition?

> 
> > + */
> > +#define containerof(_ptr, _type, _memb) \
> > +	((_type*)((void*)(_ptr) - offsetof(_type, _memb)))
> > +
> > +#define __printf(a, b)	__attribute__((format(printf, a, b)))
> > +#define __scanf(_f, _a) __attribute__((format (scanf, _f, _a)))
> > +
> > +#ifndef ARRAY_SIZE
> 
> Is someone leaking this into your scope? Can we fix them?

Ok.

> 
> > +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
> > +#endif /* ARRAY_SIZE */
> > +
> > +#define UNUSED_PARAMETER(x) \
> > +    (void)(x);
> > +
> > +#endif /* __COMPILER_H__ */
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers
  2013-01-18 17:43     ` Thanos Makatos
@ 2013-01-21 10:37       ` Ian Campbell
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2013-01-21 10:37 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Fri, 2013-01-18 at 17:43 +0000, Thanos Makatos wrote:

> > > +/*
> > > + * FIXME taken from list.h, do we need to mention anything about the
> > license?
> > 
> > In general a good idea to. I assume it is GPL which is not the license
> > at the head of this file (which is LGPL).
> > 
> > You could use the LGPL tools/libxl/libxl_internal.h:CONTAINER_OF
> > instead, you could even change the name to containerof...
> 
> That would work, should I include tools/libxl/libxl_internal.h or just
> copy CONTAINER_OF's definition?

In the absence of any more central place to put it it would be better to
copy.

Or maybe move to libxc (not a great fit there though)

Ian.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality
  2013-01-18 13:59   ` Ian Campbell
@ 2013-01-21 18:16     ` Thanos Makatos
  2013-01-22  9:28       ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Thanos Makatos @ 2013-01-21 18:16 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> > diff --git a/config/StdGNU.mk b/config/StdGNU.mk
> > --- a/config/StdGNU.mk
> > +++ b/config/StdGNU.mk
> > @@ -77,3 +77,6 @@ ifeq ($(lto),y)
> >  CFLAGS += -flto
> >  LDFLAGS-$(clang) += -plugin LLVMgold.so  endif
> > +
> > +TAPDISK_EXEC = "tapdisk"
> > +TAPDISK_EXECDIR = $(LIBEXEC)
> 
> Do these need to be configurable?
> 
> I'd be tempted to suggest that you just do
>         #define TAPDISK_EXEC "tapdisk"
>         #define TAPDISK_EXECDIR LIBEXEC
> in one of you headers.
> 
> Ian.

They were originally specified in the Makefile and you suggested they were put elsewhere:

> > diff --git a/tools/blktap2/control/Makefile
> > b/tools/blktap3/control/Makefile copy from
> > tools/blktap2/control/Makefile copy to tools/blktap3/control/Makefile
> > --- a/tools/blktap2/control/Makefile
> > +++ b/tools/blktap3/control/Makefile
> > @@ -6,40 +6,36 @@ MINOR              = 0
> >  LIBNAME            = libblktapctl
> >  LIBSONAME          = $(LIBNAME).so.$(MAJOR)
> >
> > -IBIN               = tap-ctl
> > +override CFLAGS += \
> > +	-I../include \
> > +	-DTAPDISK_EXEC='"tapdisk"' \
> > +	-DTAPDISK_EXECDIR='"/usr/local/libexec"' \
> 
> This should come from tools/configure and config/Tools.mk etc rather
> than being hard coded here.
> 
> By default it should likely be under /usr/lib/xen (aka LIBEXEC_DIR).
> Check out buildmakevars2file in tools/Rules.mk and the use of it in
> tools/libxl/Makefile -- you probably want to do something similar.

Is the above still valid? I don't see any reason why they should be configurable, so I'd agree they should be placed in a header file.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures
  2013-01-18 13:49   ` Ian Campbell
@ 2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2013-01-21 18:16 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> > +	/**
> > +	 * Protocol to use: native, 32 bit, or 64 bit.
> > +	 * TODO Is this used for supporting a 32 bit guest on a 64 bit
> hypervisor?
> 
> Sort of. The proto stuff is to support 32-bit dom0 talking to 64-bit
> domU and vice versa.

I'll update the comment.

> 
> > +	TAPDISK_MESSAGE_DISK_INFO_RSP,
> > +	TAPDISK_MESSAGE_XENBLKIF_CONNECT,
> > +	TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP,
> > +	TAPDISK_MESSAGE_XENBLKIF_DISCONNECT,
> > +	TAPDISK_MESSAGE_XENBLKIF_DISCONNECT_RSP,
> >  	TAPDISK_MESSAGE_EXIT,
> >  };
> >
> > -static inline char *
> > -tapdisk_message_name(enum tapdisk_message_id id) -{
> > -	switch (id) {
> > -	case TAPDISK_MESSAGE_ERROR:
> > -		return "error";
> > +#define TAPDISK_MESSAGE_MAX TAPDISK_MESSAGE_EXIT
> 
> A nice trick here is to include TAPDISK_MESSAGE_MAX in the enum.

Ok.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring
  2013-01-18 13:54   ` Ian Campbell
@ 2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2013-01-21 18:16 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> > +    err = tap_ctl_connect_send_and_receive(pid, &message, NULL);
> > +    if (err)
> > +        return err;
> > +
> > +    if (message.type == TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP)
> > +        err = -message.u.response.error;
> > +    else
> > +        err = -EINVAL;
> > +
> > +    return err;
> > +}
> > +
> > +int tap_ctl_disconnect_xenblkif(pid_t pid, int minor, domid_t domid,
> > +        int devid, struct timeval *timeout) {
> > +    tapdisk_message_t message;
> > +    int err;
> > +
> > +    memset(&message, 0, sizeof(message));
> > +    message.type = TAPDISK_MESSAGE_XENBLKIF_DISCONNECT;
> > +    message.cookie = minor;
> > +    message.u.blkif.domid = domid;
> > +    message.u.blkif.devid = devid;
> > +
> > +    err = tap_ctl_connect_send_and_receive(pid, &message, timeout);
> 
> In the code above you had an "if err return err" here. I'd expect them
> to behave similarly, whichever is right.

An omission, I'll fix it.

> 
> If this is a common pattern it might be something to consider pushing
> into tap_ctl_connect_send_and_receive.

I'll put it in tap_ctl_send_and_receive as it fits better there.

> 
> > +    if (message.type == TAPDISK_MESSAGE_XENBLKIF_CONNECT_RSP)
> 
> CONNECT_RSP to a DISCONNECT message?

An omission, I'll fix it.

> 
> > +        err = -message.u.response.error;
> > +    else
> > +        err = -EINVAL;
> > +
> > +    return err;
> > +}
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality
  2013-01-18 13:55   ` Ian Campbell
@ 2013-01-21 18:16     ` Thanos Makatos
  0 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2013-01-21 18:16 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

> > +int tap_ctl_info(pid_t pid, int minor, unsigned long long *sectors,
> > +        unsigned int *sector_size, unsigned int *info) {
> > +    tapdisk_message_t message;
> > +    int err;
> > +
> > +    memset(&message, 0, sizeof(message));
> > +    message.type = TAPDISK_MESSAGE_DISK_INFO;
> > +    message.cookie = minor;
> > +
> > +    err = tap_ctl_connect_send_and_receive(pid, &message, NULL);
> > +    if (err)
> > +        return err;
> > +
> > +    if (message.type != TAPDISK_MESSAGE_DISK_INFO_RSP)
> > +        return -EINVAL;
> 
> Here's a third pattern to go with the two in the previous patch ;-)
> 
> Ian.

Ok.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality
  2013-01-21 18:16     ` Thanos Makatos
@ 2013-01-22  9:28       ` Ian Campbell
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Campbell @ 2013-01-22  9:28 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: xen-devel

On Mon, 2013-01-21 at 18:16 +0000, Thanos Makatos wrote:
> > On Tue, 2012-12-04 at 18:19 +0000, Thanos Makatos wrote:
> > > diff --git a/config/StdGNU.mk b/config/StdGNU.mk
> > > --- a/config/StdGNU.mk
> > > +++ b/config/StdGNU.mk
> > > @@ -77,3 +77,6 @@ ifeq ($(lto),y)
> > >  CFLAGS += -flto
> > >  LDFLAGS-$(clang) += -plugin LLVMgold.so  endif
> > > +
> > > +TAPDISK_EXEC = "tapdisk"
> > > +TAPDISK_EXECDIR = $(LIBEXEC)
> > 
> > Do these need to be configurable?
> > 
> > I'd be tempted to suggest that you just do
> >         #define TAPDISK_EXEC "tapdisk"
> >         #define TAPDISK_EXECDIR LIBEXEC
> > in one of you headers.
> > 
> > Ian.
> 
> They were originally specified in the Makefile and you suggested they were put elsewhere:

Hrm yes so I did. At the time I was imagining you wanted to control
these separately from $(LIBEXEC) for some reason but looking at what
you'd done here I don't see why that should be the case.

> 
> > > diff --git a/tools/blktap2/control/Makefile
> > > b/tools/blktap3/control/Makefile copy from
> > > tools/blktap2/control/Makefile copy to tools/blktap3/control/Makefile
> > > --- a/tools/blktap2/control/Makefile
> > > +++ b/tools/blktap3/control/Makefile
> > > @@ -6,40 +6,36 @@ MINOR              = 0
> > >  LIBNAME            = libblktapctl
> > >  LIBSONAME          = $(LIBNAME).so.$(MAJOR)
> > >
> > > -IBIN               = tap-ctl
> > > +override CFLAGS += \
> > > +	-I../include \
> > > +	-DTAPDISK_EXEC='"tapdisk"' \
> > > +	-DTAPDISK_EXECDIR='"/usr/local/libexec"' \
> > 
> > This should come from tools/configure and config/Tools.mk etc rather
> > than being hard coded here.
> > 
> > By default it should likely be under /usr/lib/xen (aka LIBEXEC_DIR).
> > Check out buildmakevars2file in tools/Rules.mk and the use of it in
> > tools/libxl/Makefile -- you probably want to do something similar.
> 
> Is the above still valid? I don't see any reason why they should be configurable, so I'd agree they should be placed in a header file.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files
  2013-01-18 17:25             ` Ian Campbell
@ 2013-01-22 17:29               ` Thanos Makatos
  0 siblings, 0 replies; 31+ messages in thread
From: Thanos Makatos @ 2013-01-22 17:29 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Jonathan Ludlam, xen-devel, Dave Scott, Keith Petley, xen-api

> In fact I'm wondering why the tapback daemon does all the ring setup,
> xenbus negotiation anyway. It could just watch "backend/<device-type>"
> and when it sees a new backend/<device-type>/<domid>/<devid> fork a new
> "tapdisk --domain <domid> --devid <devid>" and have tapdisk take care
> of the protocol entirely. Saves a bunch of messaging back and forth I
> think.
> 
> I don't have a problem with the existing model, it just strikes me as a
> bit odd (but then I don't really know the architecture)
> 
> Ian.

I'm not aware of any particular reason for this design option. However, this may turn out to be more interesting/complicated -- I'm including xen-api.

In the existing blktap3 prototype, when libxl starts a domain it also spawns the tapdisk. When tapback (formerly known as xenio) detects in XenStore that a front-end wants to connect to the VBD, it needs to locate which tapdisk is designated to serve the corresponding back-end file in order to instruct the tapdisk to connect to the ring. It seems that there is a number of ways for tapback to figure out which tapdisk it must use. I'm not sure whether all these alternatives are possible (please correct me if not), or whether all the necessary information is already written to XenStore.
* When libxl spawns the tapdisk, it instructs it to attach to the back-end file. Then, it associates the process ID of the tapdisk with the domain and device Ids (I assume by writing all this to XenStore). When tapback needs to figure out which tapdisk to use, it knows the domain and devices ids so all it has to do is to check XenStore for the corresponding tapdisk.
* When libxl spawns the tapdisk, it supplies to it the device and domain ID. It can also tell it which back-end file to use, or write this to XenStore. Tapback can then query all running tapdisk processes and select the one having the specified domain/device ID.

It seems that having tapback spawning the tapdisk is simpler, though tapback needs to tell tapdisk which back-end file to use, so it seems that libxl needs to associate the back-end file with the domain and device ids -- I guess this can only be achieved by having libxl writing all this to XenStore. Tapdisk will then take care of running the rest of the XenBus protocol.

Which solution is best, given that:
* multiple front-ends attached to the same VBD may be desirable
* VHD snapshot/coalesce operations need to figure out which tapdisks to pause/refresh
* various race conditions should be avoided (?)
* other?

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2013-01-22 17:29 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-04 18:19 [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
2012-12-04 18:19 ` [PATCH 1 of 9 RFC v2] blktap3: Introduce blktap3 headers Thanos Makatos
2013-01-18 13:45   ` Ian Campbell
2013-01-18 17:43     ` Thanos Makatos
2013-01-21 10:37       ` Ian Campbell
2012-12-04 18:19 ` [PATCH 2 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message types and structures Thanos Makatos
2013-01-18 13:49   ` Ian Campbell
2013-01-21 18:16     ` Thanos Makatos
2012-12-04 18:19 ` [PATCH 3 of 9 RFC v2] blktap3/libblktapctl: Introduce the tapdisk control header Thanos Makatos
2012-12-04 18:19 ` [PATCH 4 of 9 RFC v2] blktap3/libblktapctl: Introduce listing running tapdisks functionality Thanos Makatos
2012-12-04 18:19 ` [PATCH 5 of 9 RFC v2] blktap3/libblktapctl: Introduce functionality used by tapback to instruct tapdisk to connect to the sring Thanos Makatos
2013-01-18 13:54   ` Ian Campbell
2013-01-21 18:16     ` Thanos Makatos
2012-12-04 18:19 ` [PATCH 6 of 9 RFC v2] blktap3/libblktapctl: Introduce block device control information retrieval functionality Thanos Makatos
2013-01-18 13:55   ` Ian Campbell
2013-01-21 18:16     ` Thanos Makatos
2012-12-04 18:19 ` [PATCH 7 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk message exchange functionality Thanos Makatos
2012-12-04 18:19 ` [PATCH 8 of 9 RFC v2] blktap3/libblktapctl: Introduce tapdisk spawn functionality Thanos Makatos
2012-12-04 18:19 ` [PATCH 9 of 9 RFC v2] blktap3/libblktapctl: Introduce makefile that builds tapback-required libblktapctl functionality Thanos Makatos
2013-01-18 13:59   ` Ian Campbell
2013-01-21 18:16     ` Thanos Makatos
2013-01-22  9:28       ` Ian Campbell
2013-01-11 15:40 ` [PATCH 0 of 9 RFC v2] blktap3: Introduce a small subset of blktap3 files Thanos Makatos
2013-01-18 13:39 ` Ian Campbell
2013-01-18 16:37   ` Thanos Makatos
2013-01-18 16:46     ` Ian Campbell
2013-01-18 16:58       ` Thanos Makatos
2013-01-18 17:01         ` Ian Campbell
2013-01-18 17:15           ` Thanos Makatos
2013-01-18 17:25             ` Ian Campbell
2013-01-22 17:29               ` Thanos Makatos

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.