All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support
@ 2013-08-29 22:16 Shriram Rajagopalan
  2013-08-29 22:16 ` [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-08-29 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini

This patch series adds support for network buffering in the Remus
codebase in libxl.

Changes in V2:
[1/5] Configure script will automatically enable/disable network
      buffer support depending on the availability of the appropriate
      libnl3 version. [If libnl3 is unavailable, a warning message will be
      printed to let the user know that the feature has been disabled.]

      use macros from pkg.m4 instead of pkg-config commands
      removed redundant checks for libnl3 libraries.

[3,4/5] - Minor nits.

Version 1:

 1/5 - changes to autoconf scripts to check for libnl3. Add linker flags
       to libxl Makefile.

 2/5 - External script to setup/teardown network buffering using libnl3's CLI.
       This script will be invoked by libxl before starting Remus. The script's
       main job is to bring up an IFB device with plug qdisc attached to it.
       It then re-routes egress traffic from the guest's vif to the IFB device.

 3/5 - Libxl code to invoke the external setup script, followed by
       netlink related setup to obtain a handle on the output buffers attached
       to each vif.

 4/5 - Libxl interaction with network buffer module in the kernel via libnl3 API.

 5/5 - xl cmdline switch to explicitly enable network buffering when starting remus.


Few things to note:
a) Based on previous email discussions, the setup/teardown task has been moved to a
   hotplug style shell script which can be customized as desired, instead of
   implementing it as C code inside libxl.

b) Libnl3 is not available on NetBSD. Nor is it available on CentOS (Linux).
   So I have made network buffering support an optional feature so that it can be
   disabled if desired.

c) NetBSD does not have libnl3. So I have put the setup script under
   tools/hotplug/Linux folder.

thanks
shriram

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts
  2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
@ 2013-08-29 22:16 ` Shriram Rajagopalan
  2013-09-04 13:40   ` Ian Campbell
  2013-08-29 22:16 ` [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts Shriram Rajagopalan
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-08-29 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini

# HG changeset patch
# User Shriram Rajagopalan <rshriram@cs.ubc.ca>
# Date 1377641625 25200
# Node ID cc2b88cba1c67cc0e0a00c7e63a3ba97e14609cb
# Parent  40b079bd57dea24c3b000f233ed123763483a4d5
remus: add libnl3 dependency to autoconf scripts

Libnl3 is required for controlling Remus network buffering.
This patch adds dependency on libnl3 (>= 3.2.8) to autoconf scripts.
Also provide ability to configure tools without libnl3 support, that
is without network buffering support.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

diff -r 40b079bd57de -r cc2b88cba1c6 config/Tools.mk.in
--- a/config/Tools.mk.in	Sun Aug 25 14:39:36 2013 -0700
+++ b/config/Tools.mk.in	Tue Aug 27 15:13:45 2013 -0700
@@ -35,6 +35,7 @@ PTHREAD_LDFLAGS     := @PTHREAD_LDFLAGS@
 PTHREAD_LIBS        := @PTHREAD_LIBS@
 
 PTYFUNCS_LIBS       := @PTYFUNCS_LIBS@
+LIBNL3_LIBS         := @LIBNL3_LIBS@
 
 # Download GIT repositories via HTTP or GIT's own protocol?
 # GIT's protocol is faster and more robust, when it works at all (firewalls
@@ -54,6 +55,7 @@ CONFIG_QEMU_TRAD    := @qemu_traditional
 CONFIG_QEMU_XEN     := @qemu_xen@
 CONFIG_XEND         := @xend@
 CONFIG_BLKTAP1      := @blktap1@
+CONFIG_REMUS_NETBUF := @remus_netbuf@
 
 #System options
 ZLIB                := @zlib@
diff -r 40b079bd57de -r cc2b88cba1c6 tools/configure.ac
--- a/tools/configure.ac	Sun Aug 25 14:39:36 2013 -0700
+++ b/tools/configure.ac	Tue Aug 27 15:13:45 2013 -0700
@@ -208,4 +208,19 @@ AC_SUBST(libiconv)
 # Checks for header files.
 AC_CHECK_HEADERS([yajl/yajl_version.h sys/eventfd.h])
 
+# Check for libnl3 >=3.2.8. If present enable remus network buffering.
+PKG_CHECK_EXISTS([libnl-3.0 >= 3.2.8 libnl-route-3.0 >= 3.2.8 libnl-cli-3.0 >= 3.2.8],
+		[libnl3_lib="y"], [libnl3_lib="n"])
+AC_CHECK_HEADER([netlink/route/qdisc/plug.h], [libnl3_dev="y"], [libnl3_dev="n"])
+AS_IF([test "x$libnl3" = "xn" || test "x$libnl3_dev" = "xn"], [
+	    AC_MSG_WARN([Disabling support for Remus network buffering.
+	    Please install libnl3 libraries, command line tools and devel
+	    headers - version 3.2.8 or higher])
+	    AC_SUBST(remus_netbuf, [n])
+	    ],[
+	    LIBNL3_LIBS="-lnl-3 -lnl-route-3"
+	    AC_SUBST(LIBNL3_LIBS)
+	    AC_SUBST(remus_netbuf, [y])
+])
+
 AC_OUTPUT()
diff -r 40b079bd57de -r cc2b88cba1c6 tools/libxl/Makefile
--- a/tools/libxl/Makefile	Sun Aug 25 14:39:36 2013 -0700
+++ b/tools/libxl/Makefile	Tue Aug 27 15:13:45 2013 -0700
@@ -20,7 +20,7 @@ LIBUUID_LIBS += -luuid
 endif
 
 LIBXL_LIBS =
-LIBXL_LIBS = $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenstore) $(LDLIBS_libblktapctl) $(PTYFUNCS_LIBS) $(LIBUUID_LIBS)
+LIBXL_LIBS = $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenstore) $(LDLIBS_libblktapctl) $(PTYFUNCS_LIBS) $(LIBUUID_LIBS) $(LIBNL3_LIBS)
 
 CFLAGS_LIBXL += $(CFLAGS_libxenctrl)
 CFLAGS_LIBXL += $(CFLAGS_libxenguest)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts
  2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
  2013-08-29 22:16 ` [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
@ 2013-08-29 22:16 ` Shriram Rajagopalan
  2013-09-04 13:50   ` Ian Campbell
  2013-08-29 22:16 ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-08-29 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini

# HG changeset patch
# User Shriram Rajagopalan <rshriram@cs.ubc.ca>
# Date 1377812185 25200
# Node ID 79b21d778550f74e550af791eca41d4ca152492a
# Parent  cc2b88cba1c67cc0e0a00c7e63a3ba97e14609cb
tools/hotplug: Remus network buffering setup scripts

This patch introduces remus-netbuf-setup hotplug script responsible for
setting up and tearing down the necessary infrastructure required for
network output buffering in Remus.  This script is intended to be invoked
by libxl for each guest interface, when starting or stopping Remus.

Apart from returning success/failure indication via the usual hotplug
entires in xenstore, this script also writes to xenstore, the name of
the IFB device to be used to control the vif's network output.

The script relies on libnl3 command line utilities to perform various
setup/teardown functions. The script is confined to Linux platforms only
since NetBSD does not seem to have libnl3.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

diff -r cc2b88cba1c6 -r 79b21d778550 tools/hotplug/Linux/Makefile
--- a/tools/hotplug/Linux/Makefile	Tue Aug 27 15:13:45 2013 -0700
+++ b/tools/hotplug/Linux/Makefile	Thu Aug 29 14:36:25 2013 -0700
@@ -16,6 +16,7 @@ XEN_SCRIPTS += network-nat vif-nat
 XEN_SCRIPTS += vif-openvswitch
 XEN_SCRIPTS += vif2
 XEN_SCRIPTS += vif-setup
+XEN_SCRIPTS-$(CONFIG_REMUS_NETBUF) += remus-netbuf-setup
 XEN_SCRIPTS += block
 XEN_SCRIPTS += block-enbd block-nbd
 XEN_SCRIPTS-$(CONFIG_BLKTAP1) += blktap
diff -r cc2b88cba1c6 -r 79b21d778550 tools/hotplug/Linux/remus-netbuf-setup
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/hotplug/Linux/remus-netbuf-setup	Thu Aug 29 14:36:25 2013 -0700
@@ -0,0 +1,186 @@
+#!/bin/bash
+#============================================================================
+# ${XEN_SCRIPT_DIR}/remus-netbuf-setup
+#
+# Script for attaching a network buffer to the specified vif (in any mode).
+# The hotplugging system will call this script when starting remus via libxl
+# API, libxl_domain_remus_start.
+#
+# Usage:
+# remus-netbuf-setup (setup|teardown)
+#
+# Environment vars:
+# vifname     vif interface name (required).
+# XENBUS_PATH path in Xenstore, where the IFB device details will be stored
+#                      or read from (required).
+#             (libxl passes /libxl/<domid>/remus/netbuf/<devid>)
+#
+# Written to the store: (setup operation)
+# XENBUS_PATH/ifb=<ifbdevName> the IFB device serving
+#  as the intermediate buffer through which the interface's network output
+#  can be controlled.
+#
+# Read from the store: (teardown operation)
+# XENBUS_PATH/ifb the IFB device serving as the output buffer
+#			 controller for the vif
+#
+# To install a network buffer on a guest vif (vif1.0) using ifb (ifb0)
+# we need to do the following
+#
+#  ip link set dev ifb0 up
+#  tc qdisc add dev vif1.0 ingress
+#  tc filter add dev vif1.0 parent ffff: proto ip \
+#    prio 10 u32 match u32 0 0 action mirred egress redirect dev ifb0
+#  nl-qdisc-add --dev=ifb0 --parent root plug
+#  nl-qdisc-add --dev=ifb0 --parent root --update plug --limit=10000000
+#                                                (10MB limit on buffer)
+#
+# So order of operations when installing a network buffer on vif1.0
+# 1. find a free ifb and bring up the device
+# 2. redirect traffic from vif1.0 to ifb:
+#   2.1 add ingress qdisc to vif1.0 (to capture outgoing packets from guest)
+#   2.2 use tc filter command with actions mirred egress + redirect
+# 3. install plug_qdisc on ifb device, with which we can buffer/release
+#    guest's network output from vif1.0
+#
+#
+
+#============================================================================
+
+# Unlike other vif scripts, vif-common is not needed here as it executes vif
+#specific setup code such as renaming.
+dir=$(dirname "$0")
+. "$dir/xen-hotplug-common.sh"
+
+findCommand "$@"
+
+if [ "$command" != "setup" -a  "$command" != "teardown" ]
+then
+  echo "Invalid command: $command"
+  log err "Invalid command: $command"
+  exit 1
+fi
+
+evalVariables "$@"
+
+: ${vifname?}
+: ${XENBUS_PATH?}
+IFB=$(xenstore_read_default "$XENBUS_PATH/ifb" "")
+
+check_libnl_tools() {
+    if ! command -v nl-qdisc-list > /dev/null 2>&1; then
+        fatal "Unable to find nl-qdisc-list tool"
+    fi
+    if ! command -v nl-qdisc-add > /dev/null 2>&1; then
+        fatal "Unable to find nl-qdisc-add tool"
+    fi
+    if ! command -v nl-qdisc-delete > /dev/null 2>&1; then
+        fatal "Unable to find nl-qdisc-delete tool"
+    fi
+}
+
+# We only check for modules. We dont load them.
+# User/Admin is supposed to load ifb during boot time,
+# ensuring that there are enough free ifbs in the system.
+# Other modules will be loaded automatically by tc commands.
+check_modules() {
+    for m in ifb sch_plug sch_ingress act_mirred cls_u32
+    do
+	if ! modinfo $m > /dev/null 2>&1; then
+	    fatal "Unable to find $m kernel module"
+	fi
+    done
+}
+
+setup_ifb() {
+
+    for ifb in `ifconfig -a -s|egrep ^ifb|cut -d ' ' -f1`
+    do
+	local installed=`nl-qdisc-list -d $ifb`
+	[ -n "$installed" ] && continue
+	IFB="$ifb"
+	break
+    done
+
+    if [ -z "$IFB" ]
+    then
+	fatal "Unable to find a free IFB device for $vifname"
+    fi
+
+    do_or_die ip link set dev "$IFB" up
+}
+
+redirect_vif_traffic() {
+    local vif=$1
+    local ifb=$2
+
+    do_or_die tc qdisc add dev "$vif" ingress
+
+    tc filter add dev "$vif" parent ffff: proto ip prio 10 \
+	u32 match u32 0 0 action mirred egress redirect dev "$ifb" >/dev/null 2>&1
+
+    if [ $? -ne 0 ]
+    then
+	do_without_error tc qdisc del dev "$vif" ingress
+	fatal "Failed to redirect traffic from $vif to $ifb"
+    fi
+}
+
+add_plug_qdisc() {
+    local vif=$1
+    local ifb=$2
+
+    nl-qdisc-add --dev="$ifb" --parent root plug >/dev/null 2>&1
+    if [ $? -ne 0 ]
+    then
+	do_without_error tc qdisc del dev "$vif" ingress
+	fatal "Failed to add plug qdisc to $ifb"
+    fi
+
+    #set ifb buffering limit in bytes. Its okay if this command
+    nl-qdisc-add --dev="$ifb" --parent root \
+	--update plug --limit=10000000 >/dev/null 2>&1
+}
+
+teardown_netbuf() {
+    local vif=$1
+    local ifb=$2
+
+    if [ "$ifb" ]; then
+	do_without_error ip link set dev "$ifb" down
+	do_without_error nl-qdisc-delete --dev="$ifb" --parent root plug >/dev/null 2>&1
+	xenstore-rm -t "$XENBUS_PATH/ifb" 2>/dev/null || true
+    fi
+    do_without_error tc qdisc del dev "$vif" ingress
+    xenstore-rm -t "$XENBUS_PATH/hotplug-status" 2>/dev/null || true
+}
+
+xs_write_failed() {
+    local vif=$1
+    local ifb=$2
+    teardown_netbuf "$vifname" "$IFB"
+    fatal "failed to write ifb name to xenstore"
+}
+
+case "$command" in
+    setup)
+	check_libnl_tools
+	check_modules
+	setup_ifb
+	redirect_vif_traffic "$vifname" "$IFB"
+	add_plug_qdisc "$vifname" "$IFB"
+	#not using xenstore_write that automatically exits on error
+	# because we need to cleanup
+	_xenstore_write "$XENBUS_PATH/ifb" "$IFB" || xs_write_failed "$vifname" "$IFB"
+        ;;
+    teardown)
+        teardown_netbuf "$vifname" "$IFB"
+        ;;
+esac
+
+log debug "Successful remus-netbuf-setup $command for $vifname, ifb $IFB."
+
+if [ "$command" = "setup" ]
+then
+  success
+fi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
  2013-08-29 22:16 ` [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
  2013-08-29 22:16 ` [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts Shriram Rajagopalan
@ 2013-08-29 22:16 ` Shriram Rajagopalan
  2013-09-04 15:17   ` Ian Campbell
  2013-09-04 16:16   ` Ian Jackson
  2013-08-29 22:16 ` [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-08-29 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini

# HG changeset patch
# User Shriram Rajagopalan <rshriram@cs.ubc.ca>
# Date 1377812196 25200
# Node ID 3e8b0aa7cd4a945ec3efe14c969700c9e23ea2cc
# Parent  79b21d778550f74e550af791eca41d4ca152492a
tools/libxl: setup/teardown Remus network buffering

Setup/teardown remus network buffering for a given guest, when
libxl_domain_remus_start API is invoked.

This patch does the following during setup:
 a) call the hotplug script for each vif to setup its network buffer

 b) establish a dedicated remus context containing libnl related
    state (netlink sockets, qdisc caches, etc.,)

 c) Obtain handles to plug qdiscs installed on the IFB devices
    chosen by the hotplug scripts.

During teardown, the hotplug scripts are called to remove the IFB
devices. This is followed by releasing netlink resources.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/Makefile
--- a/tools/libxl/Makefile	Thu Aug 29 14:36:25 2013 -0700
+++ b/tools/libxl/Makefile	Thu Aug 29 14:36:36 2013 -0700
@@ -40,6 +40,13 @@ LIBXL_OBJS-y += libxl_blktap2.o
 else
 LIBXL_OBJS-y += libxl_noblktap2.o
 endif
+
+ifeq ($(CONFIG_REMUS_NETBUF),y)
+LIBXL_OBJS-y += libxl_netbuffer.o
+else
+LIBXL_OBJS-y += libxl_nonetbuffer.o
+endif
+
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o
 LIBXL_OBJS-$(CONFIG_IA64) += libxl_nocpuid.o libxl_noarch.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_noarch.o
diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl.c
--- a/tools/libxl/libxl.c	Thu Aug 29 14:36:25 2013 -0700
+++ b/tools/libxl/libxl.c	Thu Aug 29 14:36:36 2013 -0700
@@ -692,8 +692,9 @@ libxl_vminfo * libxl_list_vm(libxl_ctx *
     return ptr;
 }
 
-static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc);
+static void remus_replication_failure_cb(libxl__egc *egc,
+                                         libxl__domain_suspend_state *dss,
+                                         int rc);
 
 /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
 int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
@@ -712,18 +713,44 @@ int libxl_domain_remus_start(libxl_ctx *
 
     GCNEW(dss);
     dss->ao = ao;
-    dss->callback = remus_failover_cb;
+    dss->callback = remus_replication_failure_cb;
     dss->domid = domid;
     dss->fd = send_fd;
     /* TODO do something with recv_fd */
     dss->type = type;
     dss->live = 1;
     dss->debug = 0;
-    dss->remus = info;
 
     assert(info);
 
-    /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
+    GCNEW(dss->remus_ctx);
+
+    /* convenience shorthand */
+    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
+    remus_ctx->blackhole = info->blackhole;
+    remus_ctx->interval = info->interval;
+    remus_ctx->compression = info->compression;
+
+    /* Setup network buffering before invoking domain_suspend */
+    if (info->netbuf) {
+        if (info->netbufscript) {
+            remus_ctx->netbufscript =
+                libxl__strdup(gc, info->netbufscript);
+        } else {
+            remus_ctx->netbufscript =
+                libxl__sprintf(gc, "%s/remus-netbuf-setup",
+                              libxl__xen_script_dir_path());
+        }
+
+        if (libxl__remus_netbuf_setup(gc, domid, remus_ctx)) {
+            LOG(ERROR, "Remus: failed to setup network buffering"
+                " for guest with domid %u", domid);
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    /* TBD: enable disk buffering */
 
     /* Point of no return */
     libxl__domain_suspend(egc, dss);
@@ -733,8 +760,9 @@ int libxl_domain_remus_start(libxl_ctx *
     return AO_ABORT(rc);
 }
 
-static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+static void remus_replication_failure_cb(libxl__egc *egc,
+                                         libxl__domain_suspend_state *dss,
+                                         int rc)
 {
     STATE_AO_GC(dss->ao);
     /*
@@ -743,9 +771,10 @@ static void remus_failover_cb(libxl__egc
      * from sending checkpoints.
      */
 
-    /* TBD: Remus cleanup - i.e. detach qdisc, release other
-     * resources.
-     */
+    /* Teardown the network buffers and release netlink resources. */
+    if (dss->remus_ctx && dss->remus_ctx->netbuf_ctx)
+        libxl__remus_netbuf_teardown(gc, dss->domid, dss->remus_ctx);
+
     libxl__ao_complete(egc, ao, rc);
 }
 
diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c	Thu Aug 29 14:36:25 2013 -0700
+++ b/tools/libxl/libxl_dom.c	Thu Aug 29 14:36:36 2013 -0700
@@ -1259,7 +1259,7 @@ static void remus_checkpoint_dm_saved(li
     /* REMUS TODO: Wait for disk and memory ack, release network buffer */
     /* REMUS TODO: make this asynchronous */
     assert(!rc); /* REMUS TODO handle this error properly */
-    usleep(dss->interval * 1000);
+    usleep(dss->remus_ctx->interval * 1000);
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
 }
 
@@ -1277,7 +1277,6 @@ void libxl__domain_suspend(libxl__egc *e
     const libxl_domain_type type = dss->type;
     const int live = dss->live;
     const int debug = dss->debug;
-    const libxl_domain_remus_info *const r_info = dss->remus;
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->shs.callbacks.save.a;
 
@@ -1312,11 +1311,8 @@ void libxl__domain_suspend(libxl__egc *e
     dss->guest_responded = 0;
     dss->dm_savefile = libxl__device_model_savefile(gc, domid);
 
-    if (r_info != NULL) {
-        dss->interval = r_info->interval;
-        if (r_info->compression)
-            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
-    }
+    if (dss->remus_ctx && dss->remus_ctx->compression)
+        dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
 
     dss->xce = xc_evtchn_open(NULL, 0);
     if (dss->xce == NULL)
@@ -1335,7 +1331,7 @@ void libxl__domain_suspend(libxl__egc *e
     }
 
     memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
+    if (dss->remus_ctx != NULL) {
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
         callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_internal.h
--- a/tools/libxl/libxl_internal.h	Thu Aug 29 14:36:25 2013 -0700
+++ b/tools/libxl/libxl_internal.h	Thu Aug 29 14:36:36 2013 -0700
@@ -2242,6 +2242,29 @@ typedef struct libxl__logdirty_switch {
     libxl__ev_time timeout;
 } libxl__logdirty_switch;
 
+typedef struct libxl__remus_ctx {
+    /* checkpoint interval */
+    int interval;
+    int blackhole;
+    int compression;
+    /* Script to setup/teardown network buffers */
+    const char *netbufscript;
+    /* Opaque context containing network buffer related stuff */
+    void *netbuf_ctx;
+} libxl__remus_ctx;
+
+_hidden int libxl__remus_netbuf_setup(libxl__gc *gc, uint32_t domid,
+                                     libxl__remus_ctx *remus_ctx);
+
+_hidden int libxl__remus_netbuf_teardown(libxl__gc *gc, uint32_t domid,
+                                        libxl__remus_ctx *remus_ctx);
+
+_hidden int libxl__remus_netbuf_start_new_epoch(libxl__gc *gc, uint32_t domid,
+                                               libxl__remus_ctx *remus_ctx);
+
+_hidden int libxl__remus_netbuf_release_prev_epoch(libxl__gc *gc, uint32_t domid,
+                                                  libxl__remus_ctx *remus_ctx);
+
 struct libxl__domain_suspend_state {
     /* set by caller of libxl__domain_suspend */
     libxl__ao *ao;
@@ -2252,7 +2275,7 @@ struct libxl__domain_suspend_state {
     libxl_domain_type type;
     int live;
     int debug;
-    const libxl_domain_remus_info *remus;
+    libxl__remus_ctx *remus_ctx;
     /* private */
     xc_evtchn *xce; /* event channel handle */
     int suspend_eventchn;
@@ -2260,7 +2283,6 @@ struct libxl__domain_suspend_state {
     int xcflags;
     int guest_responded;
     const char *dm_savefile;
-    int interval; /* checkpoint interval (for Remus) */
     libxl__save_helper_state shs;
     libxl__logdirty_switch logdirty;
     /* private for libxl__domain_save_device_model */
diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/libxl/libxl_netbuffer.c	Thu Aug 29 14:36:36 2013 -0700
@@ -0,0 +1,434 @@
+/*
+ * Copyright (C) 2013
+ * Author Shriram Rajagopalan <rshriram@cs.ubc.ca>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+#include <netlink/cache.h>
+#include <netlink/socket.h>
+#include <netlink/attr.h>
+#include <netlink/route/link.h>
+#include <netlink/route/route.h>
+#include <netlink/route/qdisc.h>
+#include <netlink/route/qdisc/plug.h>
+
+typedef struct libxl__remus_netbuf_ctx {
+    struct rtnl_qdisc **netbuf_qdisc_list;
+    struct nl_sock *nlsock;
+    struct nl_cache *qdisc_cache;
+    char **vif_list;
+    char **ifb_list;
+    uint32_t num_netbufs;
+    uint32_t unused;
+} libxl__remus_netbuf_ctx;
+
+/* If the device has a vifname, then use that instead of
+ * the vifX.Y format.
+ */
+static char *get_vifname(libxl__gc *gc, uint32_t domid,
+                         libxl_device_nic *nic)
+{
+    char *vifname = NULL;
+    vifname = libxl__xs_read(gc, XBT_NULL,
+                             libxl__sprintf(gc,
+                                            "%s/backend/vif/%d/%d/vifname",
+                                            libxl__xs_get_dompath(gc, 0),
+                                            domid, nic->devid));
+    if (!vifname) {
+        vifname = (char *)libxl__device_nic_devname(gc, domid,
+                                                    nic->devid,
+                                                    nic->nictype);
+    }
+
+    return vifname;
+}
+
+static char **get_guest_vif_list(libxl__gc *gc, uint32_t domid,
+                                 int *num_vifs)
+{
+    libxl_device_nic *nics;
+    int nb, i = 0;
+    char **vif_list = NULL;
+
+    nics = libxl_device_nic_list(CTX, domid, &nb);
+    if (!nics) {
+        *num_vifs = 0;
+        return NULL;
+    }
+
+    vif_list = libxl__calloc(gc, nb, sizeof(char *));
+    for (i = 0; i < nb; ++i) {
+        vif_list[i] = get_vifname(gc, domid, &nics[i]);
+        libxl_device_nic_dispose(&nics[i]);
+    }
+    free(nics);
+
+    *num_vifs = nb;
+    return vif_list;
+}
+
+static void netbuf_script_child_death_cb(libxl__egc *egc,
+                                         libxl__ev_child *child,
+                                         pid_t pid, int status)
+{
+    /* No-op. hotplug-error will be read in setup/teardown_ifb */
+    return;
+}
+
+/* the script needs the following env & args
+ * $vifname
+ * $XENBUS_PATH (/libxl/<domid>/remus/netbuf/<devid>/)
+ * setup/teardown as command line arg.
+ * In return, the script writes the name of IFB device to be used for
+ * output buffering into XENBUS_PATH/ifb
+ *
+ * The setup and teardown ops are synchronous, as they are executed during
+ * the context of an asynchronous API call (libxl_domain_remus_start).
+ */
+static int libxl__exec_netbuf_script(libxl__gc *gc, uint32_t domid,
+                                     int devid, char *vif, char *script,
+                                     char *op)
+{
+    int arraysize, nr;
+    char **env = NULL, **args = NULL;
+    pid_t pid;
+    libxl__ev_child childw_out;
+
+    arraysize = 5; nr = 0;
+    GCNEW_ARRAY(env, arraysize);
+    env[nr++] = "vifname";
+    env[nr++] = vif;
+    env[nr++] = "XENBUS_PATH";
+    env[nr++] = GCSPRINTF("%s/remus/netbuf/%d",
+                          libxl__xs_libxl_path(gc, domid), devid);
+    env[nr++] = NULL;
+    assert(nr <= arraysize);
+
+    arraysize = 3; nr = 0;
+    GCNEW_ARRAY(args, arraysize);
+    args[nr++] = script;
+    args[nr++] = op;
+    args[nr++] = NULL;
+    assert(nr == arraysize);
+
+    libxl__ev_child_init(&childw_out);
+    LOG(DEBUG, "Calling netbuf script: %s %s", script, op);
+
+    /* Fork and exec netbuf script */
+    pid = libxl__ev_child_fork(gc, &childw_out, netbuf_script_child_death_cb);
+    if (pid == -1) {
+        LOG(ERROR, "unable to fork netbuf script %s", script);
+        return ERROR_FAIL;
+    }
+
+    if (!pid) {
+        /* child: Launch netbuf script */
+        libxl__exec(gc, -1, -1, -1, args[0], args, env);
+        /* notreached */
+        abort();
+    }
+
+    assert(libxl__ev_child_inuse(&childw_out));
+
+    return 0;
+}
+
+static int script_done_cb(libxl__gc *gc, uint32_t domid,
+                          const char *state, void *userdata)
+{
+    return 0; /* No-op callback */
+}
+
+static int libxl__netbuf_script_setup(libxl__gc *gc, uint32_t domid,
+                                      int devid, char *vif, char *script,
+                                      char **ifb)
+{
+    int rc;
+    char *out_path_base, *hotplug_error;
+
+    rc = libxl__exec_netbuf_script(gc, domid, devid, vif,
+                                   script, "setup");
+    if (rc) return rc;
+
+    out_path_base = GCSPRINTF("%s/remus/netbuf/%d",
+                              libxl__xs_libxl_path(gc, domid), devid);
+
+    /* Now wait for the result (XENBUS_PATH/hotplug-status).
+     * It doesnt matter what the result is. If the hotplug-status path
+     * appears, then we are good to go.
+     */
+    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
+                                   script,
+                                   /* path */
+                                   GCSPRINTF("%s/hotplug-status",
+                                             out_path_base),
+                                   NULL /* state */,
+                                   NULL, script_done_cb, NULL);
+    if (rc) {
+        LOG(ERROR, "unable to wait for %s setup to complete", script);
+        return ERROR_FAIL;
+    }
+
+    hotplug_error = libxl__xs_read(gc, XBT_NULL,
+                                   GCSPRINTF("%s/hotplug-error",
+                                             out_path_base));
+    if (hotplug_error) {
+        LOG(ERROR, "netbuf script %s setup failed: %s",
+            script, hotplug_error);
+        return ERROR_FAIL;
+    }
+
+    *ifb = libxl__xs_read(gc, XBT_NULL,
+                          GCSPRINTF("%s/remus/netbuf/%d/ifb",
+                                    libxl__xs_libxl_path(gc, domid),
+                                    devid));
+    if (!(*ifb)) {
+        LOG(ERROR, "Cannot get ifb dev name for domain %u dev %s",
+            domid, vif);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+static int libxl__netbuf_script_teardown(libxl__gc *gc, uint32_t domid,
+                                         int devid, char *vif,
+                                         char *script)
+{
+    /* Nothing to wait for during tear down */
+    return libxl__exec_netbuf_script(gc, domid, devid, vif,
+                                     script, "teardown");
+}
+
+/* Scan through the list of vifs belonging to domid and
+ * invoke the netbufscript to setup the IFB device & plug qdisc
+ * for each vif. Then scan through the list of IFB devices to obtain
+ * a handle on the plug qdisc installed on these IFB devices.
+ * Network output buffering is controlled via these qdiscs.
+ */
+int libxl__remus_netbuf_setup(libxl__gc *gc, uint32_t domid,
+			      libxl__remus_ctx *remus_ctx)
+{
+    int i, ret, ifindex, num_netbufs = 0;
+    struct rtnl_link *ifb = NULL;
+    struct rtnl_qdisc *qdisc = NULL;
+    libxl__remus_netbuf_ctx *netbuf_ctx = NULL;
+
+    /* If the domain backend is a stubdom, do nothing. We dont
+     * support stubdom network buffering yet.
+     */
+    if (libxl_get_stubdom_id(CTX, domid)) {
+        LOG(ERROR, "Network buffering is not supported with stubdoms");
+        return ERROR_FAIL;
+    }
+
+    netbuf_ctx = libxl__zalloc(gc, sizeof(libxl__remus_netbuf_ctx));
+    netbuf_ctx->vif_list = get_guest_vif_list(gc, domid, &num_netbufs);
+    if (!num_netbufs) return 0;
+
+    netbuf_ctx->ifb_list = libxl__calloc(gc, num_netbufs,
+                                         sizeof(char *));
+
+    /* convenience vars */
+    char **vif_list = netbuf_ctx->vif_list;
+    char **ifb_list = netbuf_ctx->ifb_list;
+
+    for (i = 0; i < num_netbufs; ++i) {
+
+        /* The setup script does the following
+         * find a free IFB to act as buffer for the vif.
+         * set up the plug qdisc on the IFB.
+         */
+        ret = libxl__netbuf_script_setup(gc, domid, i, vif_list[i],
+                                         (char *) remus_ctx->netbufscript,
+                                         &ifb_list[i]);
+        if (ret) {
+            LOG(ERROR, "Failed to setup ifb device for dev %s",
+                vif_list[i]);
+            return ERROR_FAIL;
+        }
+
+        LOG(DEBUG, "dev %s will be buffered at ifb %s", vif_list[i],
+            ifb_list[i]);
+    }
+
+    /* Now that we have brought up IFB devices with plug qdisc for
+     * each vif, lets get a netlink handle on the plug qdisc for use
+     * during checkpointing.
+     */
+    netbuf_ctx->nlsock = nl_socket_alloc();
+    if (!netbuf_ctx->nlsock) {
+        LOG(ERROR, "cannot allocate nl socket");
+        return ERROR_FAIL;
+    }
+
+    ret = nl_connect(netbuf_ctx->nlsock, NETLINK_ROUTE);
+    if (ret) {
+        LOG(ERROR, "failed to open netlink socket: %s",
+            nl_geterror(ret));
+        return ERROR_FAIL;
+    }
+
+    /* get list of all qdiscs installed on network devs. */
+    ret = rtnl_qdisc_alloc_cache(netbuf_ctx->nlsock,
+                                 &netbuf_ctx->qdisc_cache);
+    if (ret) {
+        LOG(ERROR, "failed to allocate qdisc cache: %s",
+            nl_geterror(ret));
+        goto end;
+    }
+
+    /* list of handles to plug qdiscs */
+    netbuf_ctx->netbuf_qdisc_list =
+        libxl__calloc(gc, num_netbufs,
+                      sizeof(struct rtnl_qdisc *));
+
+    ifb = rtnl_link_alloc();
+
+    for (i = 0; i < num_netbufs; ++i) {
+
+        /* get a handle to the IFB interface */
+        ret = rtnl_link_get_kernel(netbuf_ctx->nlsock, 0,
+                                   ifb_list[i], &ifb);
+        if (ret) {
+            LOG(ERROR, "cannot obtain handle for %s: %s", ifb_list[i],
+                nl_geterror(ret));
+            goto end;
+        }
+
+        ifindex = rtnl_link_get_ifindex(ifb);
+        if (!ifindex) {
+            LOG(ERROR, "interface %s has no index", ifb_list[i]);
+            goto end;
+        }
+
+        /* Get a reference to the root qdisc installed on the IFB, by
+         * querying the qdisc list we obtained earlier. The netbufscript
+         * sets up the plug qdisc as the root qdisc, so we dont have to
+         * search the entire qdisc tree on the IFB dev.
+
+         * There is no need to explicitly free this qdisc as its just a
+         * reference from the qdisc cache we allocated earlier.
+         */
+        qdisc = rtnl_qdisc_get_by_parent(netbuf_ctx->qdisc_cache, ifindex,
+                                         TC_H_ROOT);
+
+        if (qdisc) {
+            const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
+            /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
+            if (!tc_kind || strcmp(tc_kind, "plug")) {
+                LOG(ERROR, "plug qdisc is not installed on %s", ifb_list[i]);
+                goto end;
+            }
+            netbuf_ctx->netbuf_qdisc_list[i] = qdisc;
+        } else {
+            LOG(ERROR, "Cannot get qdisc handle from ifb %s", ifb_list[i]);
+            goto end;
+        }
+    }
+
+    rtnl_link_put(ifb);
+
+    netbuf_ctx->num_netbufs = num_netbufs;
+
+    /* Update remus_ctx to point to the newly setup netbuffer context */
+    remus_ctx->netbuf_ctx = netbuf_ctx;
+    return 0;
+
+ end:
+    if (ifb) rtnl_link_put(ifb);
+    if (netbuf_ctx->qdisc_cache) nl_cache_free(netbuf_ctx->qdisc_cache);
+    if (netbuf_ctx->nlsock) nl_close(netbuf_ctx->nlsock);
+    return ERROR_FAIL;
+}
+
+/* Note: This function will be called in the same gc context as
+ * libxl__remus_netbuf_setup, created during the libxl_domain_remus_start
+ * API call.
+ */
+int libxl__remus_netbuf_teardown(libxl__gc *gc, uint32_t domid,
+				 libxl__remus_ctx *remus_ctx)
+{
+    libxl__remus_netbuf_ctx *netbuf_ctx = remus_ctx->netbuf_ctx;
+    char **vif_list = NULL, **ifb_list = NULL;
+    int i;
+
+    if (!(remus_ctx && netbuf_ctx && netbuf_ctx->num_netbufs))
+        return 0;
+
+    nl_cache_free(netbuf_ctx->qdisc_cache);
+    nl_close(netbuf_ctx->nlsock);
+
+    vif_list = netbuf_ctx->vif_list;
+    ifb_list = netbuf_ctx->ifb_list;
+
+    for (i = 0; i < netbuf_ctx->num_netbufs; ++i)
+        libxl__netbuf_script_teardown(gc, domid, i, vif_list[i],
+                                      (char *) remus_ctx->netbufscript);
+
+    return 0;
+}
+
+#define TC_BUFFER_START 1
+#define TC_BUFFER_RELEASE 2
+static int remus_netbuf_op(libxl__gc *gc, uint32_t domid,
+			   libxl__remus_ctx *remus_ctx,
+			   int buffer_op)
+{
+    int i, ret;
+    libxl__remus_netbuf_ctx *netbuf_ctx = remus_ctx->netbuf_ctx;
+
+    for (i = 0; i < netbuf_ctx->num_netbufs; ++i) {
+        if (buffer_op == TC_BUFFER_START)
+            ret = rtnl_qdisc_plug_buffer(netbuf_ctx->netbuf_qdisc_list[i]);
+        else
+            ret = rtnl_qdisc_plug_release_one(netbuf_ctx->netbuf_qdisc_list[i]);
+
+        if (!ret)
+            ret = rtnl_qdisc_add(netbuf_ctx->nlsock,
+                                 netbuf_ctx->netbuf_qdisc_list[i],
+                                 NLM_F_REQUEST);
+        if (ret) {
+            LOG(ERROR, "Remus: cannot do netbuf op %s on %s:%s",
+                ((buffer_op == TC_BUFFER_START) ?
+                 "start_new_epoch" : "release_prev_epoch"),
+                netbuf_ctx->ifb_list[i], nl_geterror(ret));
+            return ERROR_FAIL;
+        }
+    }
+
+    return 0;
+}
+
+int libxl__remus_netbuf_start_new_epoch(libxl__gc *gc, uint32_t domid,
+					libxl__remus_ctx *remus_ctx)
+{
+    return remus_netbuf_op(gc, domid, remus_ctx, TC_BUFFER_START);
+}
+
+int libxl__remus_netbuf_release_prev_epoch(libxl__gc *gc, uint32_t domid,
+					   libxl__remus_ctx *remus_ctx)
+{
+    return remus_netbuf_op(gc, domid, remus_ctx, TC_BUFFER_RELEASE);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_nonetbuffer.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/libxl/libxl_nonetbuffer.c	Thu Aug 29 14:36:36 2013 -0700
@@ -0,0 +1,54 @@
+/*
+ * Copyright (C) 2013
+ * Author Shriram Rajagopalan <rshriram@cs.ubc.ca>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/* Remus network buffer related stubs */
+int libxl__remus_netbuf_setup(libxl__gc *gc, uint32_t domid,
+			      libxl__remus_ctx *remus_ctx)
+{
+    LOG(ERROR, "Remus: No support for network buffering");
+    return ERROR_FAIL;
+}
+
+int libxl__remus_netbuf_teardown(libxl__gc *gc, uint32_t domid,
+				 libxl__remus_ctx *remus_ctx)
+{
+    return 0; /* No point complaining in a teardown call. */
+}
+
+int libxl__remus_netbuf_start_new_epoch(libxl__gc *gc, uint32_t domid,
+					libxl__remus_ctx *remus_ctx)
+{
+    LOG(ERROR, "Remus: No support for network buffering");
+    return ERROR_FAIL;
+}
+
+int libxl__remus_netbuf_release_prev_epoch(libxl__gc *gc, uint32_t domid,
+					   libxl__remus_ctx *remus_ctx)
+{
+    LOG(ERROR, "Remus: No support for network buffering");
+    return ERROR_FAIL;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_types.idl
--- a/tools/libxl/libxl_types.idl	Thu Aug 29 14:36:25 2013 -0700
+++ b/tools/libxl/libxl_types.idl	Thu Aug 29 14:36:36 2013 -0700
@@ -526,6 +526,8 @@ libxl_domain_remus_info = Struct("domain
     ("interval",     integer),
     ("blackhole",    bool),
     ("compression",  bool),
+    ("netbuf",  bool),
+    ("netbufscript", string),
     ])
 
 libxl_event_type = Enumeration("event_type", [

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks
  2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
                   ` (2 preceding siblings ...)
  2013-08-29 22:16 ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
@ 2013-08-29 22:16 ` Shriram Rajagopalan
  2013-09-04 15:19   ` Ian Campbell
  2013-09-05 11:25   ` Ian Jackson
  2013-08-29 22:16 ` [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch Shriram Rajagopalan
  2013-09-04 13:21 ` [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
  5 siblings, 2 replies; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-08-29 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini

# HG changeset patch
# User Shriram Rajagopalan <rshriram@cs.ubc.ca>
# Date 1377813001 25200
# Node ID b8839e0b61c2e15b94a274cfddedf0814d2560de
# Parent  3e8b0aa7cd4a945ec3efe14c969700c9e23ea2cc
tools/libxl: Control network buffering in remus callbacks

This patch constitutes the core network buffering logic.
and does the following:
 a) create a new network buffer when the domain is suspended
    (remus_domain_suspend_callback)
 b) release the previous network buffer pertaining to the
    committed checkpoint (remus_domain_checkpoint_dm_saved)

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

diff -r 3e8b0aa7cd4a -r b8839e0b61c2 tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c	Thu Aug 29 14:36:36 2013 -0700
+++ b/tools/libxl/libxl_dom.c	Thu Aug 29 14:50:01 2013 -0700
@@ -1215,8 +1215,24 @@ int libxl__toolstack_save(uint32_t domid
 
 static int libxl__remus_domain_suspend_callback(void *data)
 {
-    /* REMUS TODO: Issue disk and network checkpoint reqs. */
-    return libxl__domain_suspend_common_callback(data);
+    /* REMUS TODO: Issue disk checkpoint reqs. */
+    libxl__save_helper_state *shs = data;
+    libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
+    bool is_suspended;
+    STATE_AO_GC(dss->ao);
+
+    is_suspended = !!libxl__domain_suspend_common_callback(data);
+
+    if (!remus_ctx->netbuf_ctx) return is_suspended;
+
+    if (is_suspended) {
+        if (libxl__remus_netbuf_start_new_epoch(gc, dss->domid,
+                                                remus_ctx))
+            return !is_suspended;
+    }
+
+    return is_suspended;
 }
 
 static int libxl__remus_domain_resume_callback(void *data)
@@ -1229,7 +1245,7 @@ static int libxl__remus_domain_resume_ca
     if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
         return 0;
 
-    /* REMUS TODO: Deal with disk. Start a new network output buffer */
+    /* REMUS TODO: Deal with disk. */
     return 1;
 }
 
@@ -1256,10 +1272,36 @@ static void libxl__remus_domain_checkpoi
 static void remus_checkpoint_dm_saved(libxl__egc *egc,
                                       libxl__domain_suspend_state *dss, int rc)
 {
-    /* REMUS TODO: Wait for disk and memory ack, release network buffer */
-    /* REMUS TODO: make this asynchronous */
-    assert(!rc); /* REMUS TODO handle this error properly */
-    usleep(dss->remus_ctx->interval * 1000);
+    /*
+     * REMUS TODO: Wait for disk and explicit memory ack (through restore
+     * callback from remote) before releasing network buffer.
+     */
+    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
+    struct timespec epoch;
+    int ret;
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to save device model. Terminating Remus..");
+        libxl__xc_domain_saverestore_async_callback_done(egc,
+                                                         &dss->shs, rc);
+        return;
+    }
+
+    if (remus_ctx->netbuf_ctx) {
+        ret = libxl__remus_netbuf_release_prev_epoch(gc, dss->domid,
+                                                     remus_ctx);
+        if (ret) {
+            libxl__xc_domain_saverestore_async_callback_done(egc,
+                                                             &dss->shs,
+                                                             ret);
+            return;
+        }
+    }
+
+    epoch.tv_sec = remus_ctx->interval / 1000; /* interval is in ms */
+    epoch.tv_nsec = remus_ctx->interval * 1000L * 1000L;
+    nanosleep(&epoch, 0);
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
 }

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch
  2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
                   ` (3 preceding siblings ...)
  2013-08-29 22:16 ` [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
@ 2013-08-29 22:16 ` Shriram Rajagopalan
  2013-09-04 15:24   ` Ian Campbell
  2013-09-04 13:21 ` [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
  5 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-08-29 22:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini

# HG changeset patch
# User Shriram Rajagopalan <rshriram@cs.ubc.ca>
# Date 1377814007 25200
# Node ID 66e6a3c645669e76e119edeed9271e6ae2a49bdb
# Parent  b8839e0b61c2e15b94a274cfddedf0814d2560de
tools/xl: Remus - Network buffering cmdline switch

Command line switch to 'xl remus' command, to enable network buffering.
Pass on this flag to libxl so that it can act accordingly.
Also update man pages to reflect the addition of a new option to
'xl remus' command.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

diff -r b8839e0b61c2 -r 66e6a3c64566 docs/man/xl.pod.1
--- a/docs/man/xl.pod.1	Thu Aug 29 14:50:01 2013 -0700
+++ b/docs/man/xl.pod.1	Thu Aug 29 15:06:47 2013 -0700
@@ -398,8 +398,7 @@ Print huge (!) amount of debug during th
 Enable Remus HA for domain. By default B<xl> relies on ssh as a transport
 mechanism between the two hosts.
 
-N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
-     There is no support for network or disk buffering at the moment.
+N.B: There is no support for disk buffering at the moment.
 
 B<OPTIONS>
 
@@ -418,6 +417,13 @@ Generally useful for debugging.
 
 Disable memory checkpoint compression.
 
+=item B<-n>
+
+Enable network output buffering.  The default script used to configure
+network buffering is /etc/xen/scripts/remus-netbuf-setup. If you wish to
+use a custom script, set the global variable "remus.default.netbufscript"
+in /etc/xen/xl.conf to point to your script.
+
 =item B<-s> I<sshcommand>
 
 Use <sshcommand> instead of ssh.  String will be passed to sh.
diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl.c
--- a/tools/libxl/xl.c	Thu Aug 29 14:50:01 2013 -0700
+++ b/tools/libxl/xl.c	Thu Aug 29 15:06:47 2013 -0700
@@ -46,6 +46,7 @@ char *default_vifscript = NULL;
 char *default_bridge = NULL;
 char *default_gatewaydev = NULL;
 char *default_vifbackend = NULL;
+char *default_remus_netbufscript = NULL;
 enum output_format default_output_format = OUTPUT_FORMAT_JSON;
 int claim_mode = 1;
 
@@ -177,6 +178,11 @@ static void parse_global_config(const ch
     if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
         claim_mode = l;
 
+    if (!xlu_cfg_get_string (config, "remus.default.netbufscript", &buf, 0)) {
+        free(default_remus_netbufscript);
+        default_remus_netbufscript = strdup(buf);
+    }
+
     xlu_cfg_destroy(config);
 }
 
diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl.h
--- a/tools/libxl/xl.h	Thu Aug 29 14:50:01 2013 -0700
+++ b/tools/libxl/xl.h	Thu Aug 29 15:06:47 2013 -0700
@@ -152,6 +152,7 @@ extern char *default_vifscript;
 extern char *default_bridge;
 extern char *default_gatewaydev;
 extern char *default_vifbackend;
+extern char *default_remus_netbufscript;
 extern char *blkdev_start;
 
 enum output_format {
diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c	Thu Aug 29 14:50:01 2013 -0700
+++ b/tools/libxl/xl_cmdimpl.c	Thu Aug 29 15:06:47 2013 -0700
@@ -7070,8 +7070,9 @@ int main_remus(int argc, char **argv)
     r_info.interval = 200;
     r_info.blackhole = 0;
     r_info.compression = 1;
-
-    SWITCH_FOREACH_OPT(opt, "bui:s:e", NULL, "remus", 2) {
+    r_info.netbuf = 0;
+
+    SWITCH_FOREACH_OPT(opt, "buni:s:e", NULL, "remus", 2) {
     case 'i':
         r_info.interval = atoi(optarg);
         break;
@@ -7081,6 +7082,9 @@ int main_remus(int argc, char **argv)
     case 'u':
         r_info.compression = 0;
         break;
+    case 'n':
+        r_info.netbuf = 1;
+        break;
     case 's':
         ssh_command = optarg;
         break;
@@ -7092,6 +7096,11 @@ int main_remus(int argc, char **argv)
     domid = find_domain(argv[optind]);
     host = argv[optind + 1];
 
+    if (r_info.netbuf) {
+        if (default_remus_netbufscript)
+            r_info.netbufscript = strdup(default_remus_netbufscript);
+    }
+
     if (r_info.blackhole) {
         send_fd = open("/dev/null", O_RDWR, 0644);
         if (send_fd < 0) {
@@ -7126,13 +7135,10 @@ int main_remus(int argc, char **argv)
     /* Point of no return */
     rc = libxl_domain_remus_start(ctx, &r_info, domid, send_fd, recv_fd, 0);
 
-    /* If we are here, it means backup has failed/domain suspend failed.
-     * Try to resume the domain and exit gracefully.
+    /* If we are here, it means remus setup/domain susppend/backup has
+     * failed. Try to resume the domain and exit gracefully.
      * TODO: Split-Brain check.
      */
-    fprintf(stderr, "remus sender: libxl_domain_suspend failed"
-            " (rc=%d)\n", rc);
-
     if (rc == ERROR_GUEST_TIMEDOUT)
         fprintf(stderr, "Failed to suspend domain at primary.\n");
     else {
diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl_cmdtable.c
--- a/tools/libxl/xl_cmdtable.c	Thu Aug 29 14:50:01 2013 -0700
+++ b/tools/libxl/xl_cmdtable.c	Thu Aug 29 15:06:47 2013 -0700
@@ -481,6 +481,7 @@ struct cmd_spec cmd_table[] = {
       "-i MS                   Checkpoint domain memory every MS milliseconds (def. 200ms).\n"
       "-b                      Replicate memory checkpoints to /dev/null (blackhole)\n"
       "-u                      Disable memory checkpoint compression.\n"
+      "-n                      Enable network output buffering.\n"
       "-s <sshcommand>         Use <sshcommand> instead of ssh.  String will be passed\n"
       "                        to sh. If empty, run <host> instead of \n"
       "                        ssh <host> xl migrate-receive -r [-e]\n"

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support
  2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
                   ` (4 preceding siblings ...)
  2013-08-29 22:16 ` [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch Shriram Rajagopalan
@ 2013-09-04 13:21 ` Shriram Rajagopalan
  2013-09-04 13:27   ` Ian Campbell
  5 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-09-04 13:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Ian Jackson, Ian Campbell, Stefano Stabellini


[-- Attachment #1.1: Type: text/plain, Size: 2151 bytes --]

On Thu, Aug 29, 2013 at 6:16 PM, Shriram Rajagopalan <rshriram@cs.ubc.ca>wrote:

> This patch series adds support for network buffering in the Remus
> codebase in libxl.
>
> Changes in V2:
> [1/5] Configure script will automatically enable/disable network
>       buffer support depending on the availability of the appropriate
>       libnl3 version. [If libnl3 is unavailable, a warning message will be
>       printed to let the user know that the feature has been disabled.]
>
>       use macros from pkg.m4 instead of pkg-config commands
>       removed redundant checks for libnl3 libraries.
>
> [3,4/5] - Minor nits.
>
> Version 1:
>
>  1/5 - changes to autoconf scripts to check for libnl3. Add linker flags
>        to libxl Makefile.
>
>  2/5 - External script to setup/teardown network buffering using libnl3's
> CLI.
>        This script will be invoked by libxl before starting Remus. The
> script's
>        main job is to bring up an IFB device with plug qdisc attached to
> it.
>        It then re-routes egress traffic from the guest's vif to the IFB
> device.
>
>  3/5 - Libxl code to invoke the external setup script, followed by
>        netlink related setup to obtain a handle on the output buffers
> attached
>        to each vif.
>
>  4/5 - Libxl interaction with network buffer module in the kernel via
> libnl3 API.
>
>  5/5 - xl cmdline switch to explicitly enable network buffering when
> starting remus.
>
>
> Few things to note:
> a) Based on previous email discussions, the setup/teardown task has been
> moved to a
>    hotplug style shell script which can be customized as desired, instead
> of
>    implementing it as C code inside libxl.
>
> b) Libnl3 is not available on NetBSD. Nor is it available on CentOS
> (Linux).
>    So I have made network buffering support an optional feature so that it
> can be
>    disabled if desired.
>
> c) NetBSD does not have libnl3. So I have put the setup script under
>    tools/hotplug/Linux folder.
>
> thanks
> shriram
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

ping!

shriram

[-- Attachment #1.2: Type: text/html, Size: 2809 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support
  2013-09-04 13:21 ` [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
@ 2013-09-04 13:27   ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2013-09-04 13:27 UTC (permalink / raw)
  To: rshriram; +Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Wed, 2013-09-04 at 09:21 -0400, Shriram Rajagopalan wrote:
> ping! 

It's in my backlog but I've not gotten to it yet, sorry.

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts
  2013-08-29 22:16 ` [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
@ 2013-09-04 13:40   ` Ian Campbell
  2013-10-09  2:14     ` Shriram Rajagopalan
  0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2013-09-04 13:40 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Thu, 2013-08-29 at 15:16 -0700, Shriram Rajagopalan wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@cs.ubc.ca>
> # Date 1377641625 25200
> # Node ID cc2b88cba1c67cc0e0a00c7e63a3ba97e14609cb
> # Parent  40b079bd57dea24c3b000f233ed123763483a4d5
> remus: add libnl3 dependency to autoconf scripts
> 
> Libnl3 is required for controlling Remus network buffering.
> This patch adds dependency on libnl3 (>= 3.2.8) to autoconf scripts.
> Also provide ability to configure tools without libnl3 support, that
> is without network buffering support.
> 
> Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> 
> diff -r 40b079bd57de -r cc2b88cba1c6 config/Tools.mk.in
> --- a/config/Tools.mk.in	Sun Aug 25 14:39:36 2013 -0700
> +++ b/config/Tools.mk.in	Tue Aug 27 15:13:45 2013 -0700
> @@ -35,6 +35,7 @@ PTHREAD_LDFLAGS     := @PTHREAD_LDFLAGS@
>  PTHREAD_LIBS        := @PTHREAD_LIBS@
>  
>  PTYFUNCS_LIBS       := @PTYFUNCS_LIBS@
> +LIBNL3_LIBS         := @LIBNL3_LIBS@
>  
>  # Download GIT repositories via HTTP or GIT's own protocol?
>  # GIT's protocol is faster and more robust, when it works at all (firewalls
> @@ -54,6 +55,7 @@ CONFIG_QEMU_TRAD    := @qemu_traditional
>  CONFIG_QEMU_XEN     := @qemu_xen@
>  CONFIG_XEND         := @xend@
>  CONFIG_BLKTAP1      := @blktap1@
> +CONFIG_REMUS_NETBUF := @remus_netbuf@
>  
>  #System options
>  ZLIB                := @zlib@
> diff -r 40b079bd57de -r cc2b88cba1c6 tools/configure.ac
> --- a/tools/configure.ac	Sun Aug 25 14:39:36 2013 -0700
> +++ b/tools/configure.ac	Tue Aug 27 15:13:45 2013 -0700
> @@ -208,4 +208,19 @@ AC_SUBST(libiconv)
>  # Checks for header files.
>  AC_CHECK_HEADERS([yajl/yajl_version.h sys/eventfd.h])
>  
> +# Check for libnl3 >=3.2.8. If present enable remus network buffering.
> +PKG_CHECK_EXISTS([libnl-3.0 >= 3.2.8 libnl-route-3.0 >= 3.2.8 libnl-cli-3.0 >= 3.2.8],
> +		[libnl3_lib="y"], [libnl3_lib="n"])
> +AC_CHECK_HEADER([netlink/route/qdisc/plug.h], [libnl3_dev="y"], [libnl3_dev="n"])

Aren't these a bit redundant? i.e. doesn't PKG_CHECK_EXISTS cover the
development headers? At least on Debian libnl-3-dev contains:
[...]
-rw-r--r-- root/root       770 2013-05-21 22:52 ./usr/include/libnl3/netlink/route/qdisc/plug.h
[...]
-rw-r--r-- root/root       239 2013-05-21 22:52 ./usr/lib/x86_64-linux-gnu/pkgconfig/libnl-3.0.pc
[...]

> +AS_IF([test "x$libnl3" = "xn" || test "x$libnl3_dev" = "xn"], [
> +	    AC_MSG_WARN([Disabling support for Remus network buffering.
> +	    Please install libnl3 libraries, command line tools and devel
> +	    headers - version 3.2.8 or higher])

This should be mentioned in README too.

> +	    AC_SUBST(remus_netbuf, [n])
> +	    ],[
> +	    LIBNL3_LIBS="-lnl-3 -lnl-route-3"

Since libnl3 uses pkg-cofnig, these should come from there I think.
There's probably a macro in pkg-config.m4 to help?

> +	    AC_SUBST(LIBNL3_LIBS)
> +	    AC_SUBST(remus_netbuf, [y])
> +])
> +
>  AC_OUTPUT()
> diff -r 40b079bd57de -r cc2b88cba1c6 tools/libxl/Makefile
> --- a/tools/libxl/Makefile	Sun Aug 25 14:39:36 2013 -0700
> +++ b/tools/libxl/Makefile	Tue Aug 27 15:13:45 2013 -0700
> @@ -20,7 +20,7 @@ LIBUUID_LIBS += -luuid
>  endif
>  
>  LIBXL_LIBS =
> -LIBXL_LIBS = $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenstore) $(LDLIBS_libblktapctl) $(PTYFUNCS_LIBS) $(LIBUUID_LIBS)
> +LIBXL_LIBS = $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenstore) $(LDLIBS_libblktapctl) $(PTYFUNCS_LIBS) $(LIBUUID_LIBS) $(LIBNL3_LIBS)

I think it might be time to break this long line and use
LIBXL_LIBS  = X Y Z
LIBXL_LIBS += A B C
(no need for one line per lib though)

>  
>  CFLAGS_LIBXL += $(CFLAGS_libxenctrl)
>  CFLAGS_LIBXL += $(CFLAGS_libxenguest)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts
  2013-08-29 22:16 ` [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts Shriram Rajagopalan
@ 2013-09-04 13:50   ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2013-09-04 13:50 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Thu, 2013-08-29 at 15:16 -0700, Shriram Rajagopalan wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@cs.ubc.ca>
> # Date 1377812185 25200
> # Node ID 79b21d778550f74e550af791eca41d4ca152492a
> # Parent  cc2b88cba1c67cc0e0a00c7e63a3ba97e14609cb
> tools/hotplug: Remus network buffering setup scripts
> 
> This patch introduces remus-netbuf-setup hotplug script responsible for
> setting up and tearing down the necessary infrastructure required for
> network output buffering in Remus.  This script is intended to be invoked
> by libxl for each guest interface, when starting or stopping Remus.
> 
> Apart from returning success/failure indication via the usual hotplug
> entires in xenstore, this script also writes to xenstore, the name of

"entries"

Also, if you haven't already in a later patch please can you update
docs/misc/xenstore-paths.markdown at some point in this series.

> the IFB device to be used to control the vif's network output.
> 
> The script relies on libnl3 command line utilities to perform various
> setup/teardown functions. The script is confined to Linux platforms only
> since NetBSD does not seem to have libnl3.
> 
> Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

> +
> +# We only check for modules. We dont load them.

"don't" (sorry, pedantic I know)

> +# User/Admin is supposed to load ifb during boot time,
> +# ensuring that there are enough free ifbs in the system.
> +# Other modules will be loaded automatically by tc commands.

Is there (going to be) a howto on the wiki or somewhere in this series?

> +check_modules() {
> +    for m in ifb sch_plug sch_ingress act_mirred cls_u32
> +    do
> +	if ! modinfo $m > /dev/null 2>&1; then
> +	    fatal "Unable to find $m kernel module"
> +	fi
> +    done
> +}
> +
> +setup_ifb() {
> +
> +    for ifb in `ifconfig -a -s|egrep ^ifb|cut -d ' ' -f1`
> +    do
> +	local installed=`nl-qdisc-list -d $ifb`
> +	[ -n "$installed" ] && continue
> +	IFB="$ifb"
> +	break
> +    done

This is a bit unfortunate, but I guess it is hard to do much else from
shell though I suppose.

Is there any locking in the script to avoid two instances racing in this
function? I didn't see it but maybe it's in common.sh or something.

> +
> +case "$command" in
> +    setup)
> +	check_libnl_tools
> +	check_modules
> +	setup_ifb
> +	redirect_vif_traffic "$vifname" "$IFB"
> +	add_plug_qdisc "$vifname" "$IFB"
> +	#not using xenstore_write that automatically exits on error
> +	# because we need to cleanup
> +	_xenstore_write "$XENBUS_PATH/ifb" "$IFB" || xs_write_failed "$vifname" "$IFB"
> +        ;;

Some tab vs space confusion here.

> +    teardown)
> +        teardown_netbuf "$vifname" "$IFB"
> +        ;;
> +esac
> +
> +log debug "Successful remus-netbuf-setup $command for $vifname, ifb $IFB."
> +
> +if [ "$command" = "setup" ]
> +then
> +  success
> +fi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-08-29 22:16 ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
@ 2013-09-04 15:17   ` Ian Campbell
  2013-10-14 14:23     ` Shriram Rajagopalan
  2013-09-04 16:16   ` Ian Jackson
  1 sibling, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2013-09-04 15:17 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, Roger Pau Monne,
	xen-devel

On Thu, 2013-08-29 at 15:16 -0700, Shriram Rajagopalan wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@cs.ubc.ca>
> # Date 1377812196 25200
> # Node ID 3e8b0aa7cd4a945ec3efe14c969700c9e23ea2cc
> # Parent  79b21d778550f74e550af791eca41d4ca152492a
> tools/libxl: setup/teardown Remus network buffering
> 
> Setup/teardown remus network buffering for a given guest, when
> libxl_domain_remus_start API is invoked.
> 
> This patch does the following during setup:
>  a) call the hotplug script for each vif to setup its network buffer
> 
>  b) establish a dedicated remus context containing libnl related
>     state (netlink sockets, qdisc caches, etc.,)
> 
>  c) Obtain handles to plug qdiscs installed on the IFB devices
>     chosen by the hotplug scripts.
> 
> During teardown, the hotplug scripts are called to remove the IFB
> devices. This is followed by releasing netlink resources.
> 
> Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

This could use feedback from Ian on the async bits and Roger on the
hotplug bits.

> 
> diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/Makefile
> --- a/tools/libxl/Makefile	Thu Aug 29 14:36:25 2013 -0700
> +++ b/tools/libxl/Makefile	Thu Aug 29 14:36:36 2013 -0700
> @@ -40,6 +40,13 @@ LIBXL_OBJS-y += libxl_blktap2.o
>  else
>  LIBXL_OBJS-y += libxl_noblktap2.o
>  endif
> +
> +ifeq ($(CONFIG_REMUS_NETBUF),y)
> +LIBXL_OBJS-y += libxl_netbuffer.o
> +else
> +LIBXL_OBJS-y += libxl_nonetbuffer.o
> +endif
> +
>  LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o
>  LIBXL_OBJS-$(CONFIG_IA64) += libxl_nocpuid.o libxl_noarch.o
>  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_noarch.o
> diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c	Thu Aug 29 14:36:25 2013 -0700
> +++ b/tools/libxl/libxl.c	Thu Aug 29 14:36:36 2013 -0700
> @@ -692,8 +692,9 @@ libxl_vminfo * libxl_list_vm(libxl_ctx *
>      return ptr;
>  }
>  
> -static void remus_failover_cb(libxl__egc *egc,
> -                              libxl__domain_suspend_state *dss, int rc);
> +static void remus_replication_failure_cb(libxl__egc *egc,
> +                                         libxl__domain_suspend_state *dss,
> +                                         int rc);
>  
>  /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
>  int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> @@ -712,18 +713,44 @@ int libxl_domain_remus_start(libxl_ctx *
>  
>      GCNEW(dss);
>      dss->ao = ao;
> -    dss->callback = remus_failover_cb;
> +    dss->callback = remus_replication_failure_cb;
>      dss->domid = domid;
>      dss->fd = send_fd;
>      /* TODO do something with recv_fd */
>      dss->type = type;
>      dss->live = 1;
>      dss->debug = 0;
> -    dss->remus = info;
>  
>      assert(info);
>  
> -    /* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
> +    GCNEW(dss->remus_ctx);
> +
> +    /* convenience shorthand */
> +    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
> +    remus_ctx->blackhole = info->blackhole;
> +    remus_ctx->interval = info->interval;
> +    remus_ctx->compression = info->compression;
> +
> +    /* Setup network buffering before invoking domain_suspend */
> +    if (info->netbuf) {
> +        if (info->netbufscript) {
> +            remus_ctx->netbufscript =
> +                libxl__strdup(gc, info->netbufscript);
> +        } else {
> +            remus_ctx->netbufscript =
> +                libxl__sprintf(gc, "%s/remus-netbuf-setup",
> +                              libxl__xen_script_dir_path());

GCSPRINTF would help shorten this line.

(we don't seem to have GCSTRDUP to help the other side of the else, oh
well)

> diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/tools/libxl/libxl_netbuffer.c	Thu Aug 29 14:36:36 2013 -0700
> @@ -0,0 +1,434 @@
> +/*
> + * Copyright (C) 2013
> + * Author Shriram Rajagopalan <rshriram@cs.ubc.ca>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +#include <netlink/cache.h>
> +#include <netlink/socket.h>
> +#include <netlink/attr.h>
> +#include <netlink/route/link.h>
> +#include <netlink/route/route.h>
> +#include <netlink/route/qdisc.h>
> +#include <netlink/route/qdisc/plug.h>
> +
> +typedef struct libxl__remus_netbuf_ctx {
> +    struct rtnl_qdisc **netbuf_qdisc_list;
> +    struct nl_sock *nlsock;
> +    struct nl_cache *qdisc_cache;
> +    char **vif_list;
> +    char **ifb_list;
> +    uint32_t num_netbufs;
> +    uint32_t unused;
> +} libxl__remus_netbuf_ctx;
> +
> +/* If the device has a vifname, then use that instead of
> + * the vifX.Y format.
> + */
> +static char *get_vifname(libxl__gc *gc, uint32_t domid,
> +                         libxl_device_nic *nic)
> +{
> +    char *vifname = NULL;
> +    vifname = libxl__xs_read(gc, XBT_NULL,
> +                             libxl__sprintf(gc,
> +                                            "%s/backend/vif/%d/%d/vifname",
> +                                            libxl__xs_get_dompath(gc, 0),
> +                                            domid, nic->devid));
> +    if (!vifname) {
> +        vifname = (char *)libxl__device_nic_devname(gc, domid,

Please don't cast away the const here. Either make this function return
const or dup this. Preferably const this function.

> +                                                    nic->devid,
> +                                                    nic->nictype);
> +    }
> +
> +    return vifname;
> +}
> +
> +static char **get_guest_vif_list(libxl__gc *gc, uint32_t domid,
> +                                 int *num_vifs)
> +{
> +    libxl_device_nic *nics;
> +    int nb, i = 0;
> +    char **vif_list = NULL;
> +
> +    nics = libxl_device_nic_list(CTX, domid, &nb);
> +    if (!nics) {
> +        *num_vifs = 0;
> +        return NULL;
> +    }
> +
> +    vif_list = libxl__calloc(gc, nb, sizeof(char *));
> +    for (i = 0; i < nb; ++i) {
> +        vif_list[i] = get_vifname(gc, domid, &nics[i]);
> +        libxl_device_nic_dispose(&nics[i]);
> +    }
> +    free(nics);
> +
> +    *num_vifs = nb;
> +    return vif_list;
> +}

I think rather than creating a list of char *'s you should just use the
array of libxl_device_nic's and pass that around in your context, pass
the individual elements to libxl__netbuf_script_setup etc and then let
them determine the name as necessary. That would seem to remove a bunch
of book keeping around this list.

> +static int libxl__netbuf_script_setup(libxl__gc *gc, uint32_t domid,
> +                                      int devid, char *vif, char *script,
> +                                      char **ifb)
> +{
> +    int rc;
> +    char *out_path_base, *hotplug_error;
> +
> +    rc = libxl__exec_netbuf_script(gc, domid, devid, vif,
> +                                   script, "setup");
> +    if (rc) return rc;
> +
> +    out_path_base = GCSPRINTF("%s/remus/netbuf/%d",
> +                              libxl__xs_libxl_path(gc, domid), devid);
> +
> +    /* Now wait for the result (XENBUS_PATH/hotplug-status).
> +     * It doesnt matter what the result is. If the hotplug-status path

"doesn't"

> +     * appears, then we are good to go.
> +     */
> +    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
> +                                   script,
> +                                   /* path */
> +                                   GCSPRINTF("%s/hotplug-status",
> +                                             out_path_base),
> +                                   NULL /* state */,
> +                                   NULL, script_done_cb, NULL);

Does this wait synchronously? Doesn't it need to be async, with the
remainder of this function in the done_cb?

This is one of those places where Ian J needs to comment I think.

> +/* Scan through the list of vifs belonging to domid and
> + * invoke the netbufscript to setup the IFB device & plug qdisc
> + * for each vif. Then scan through the list of IFB devices to obtain
> + * a handle on the plug qdisc installed on these IFB devices.
> + * Network output buffering is controlled via these qdiscs.
> + */
> +int libxl__remus_netbuf_setup(libxl__gc *gc, uint32_t domid,
> +			      libxl__remus_ctx *remus_ctx)
> +{
> +    int i, ret, ifindex, num_netbufs = 0;
> +    struct rtnl_link *ifb = NULL;
> +    struct rtnl_qdisc *qdisc = NULL;
> +    libxl__remus_netbuf_ctx *netbuf_ctx = NULL;
> +
> +    /* If the domain backend is a stubdom, do nothing. We dont

"don't" (is your apostrophe key broken? ;-) )

> +     * support stubdom network buffering yet.
> +     */
> +    if (libxl_get_stubdom_id(CTX, domid)) {
> +        LOG(ERROR, "Network buffering is not supported with stubdoms");
> +        return ERROR_FAIL;
> +    }

This just checks for a HVM stub device model I think, whereas AIUI you
are trying to test if the NICs are backed by a driver domains?

For that case I think you need to check backend_domid for each NIC.

> +
> +    netbuf_ctx = libxl__zalloc(gc, sizeof(libxl__remus_netbuf_ctx));
> +    netbuf_ctx->vif_list = get_guest_vif_list(gc, domid, &num_netbufs);
> +    if (!num_netbufs) return 0;
> +
> +    netbuf_ctx->ifb_list = libxl__calloc(gc, num_netbufs,
> +                                         sizeof(char *));
> +
> +    /* convenience vars */
> +    char **vif_list = netbuf_ctx->vif_list;
> +    char **ifb_list = netbuf_ctx->ifb_list;

const on these?

> +
> +    for (i = 0; i < num_netbufs; ++i) {
> +
> +        /* The setup script does the following
> +         * find a free IFB to act as buffer for the vif.
> +         * set up the plug qdisc on the IFB.
> +         */
> +        ret = libxl__netbuf_script_setup(gc, domid, i, vif_list[i],
> +                                         (char *) remus_ctx->netbufscript,

Are you casting another const away here? Please don't.
[...]

I assume all this libnl stuff is right.


> diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_types.idl
> --- a/tools/libxl/libxl_types.idl	Thu Aug 29 14:36:25 2013 -0700
> +++ b/tools/libxl/libxl_types.idl	Thu Aug 29 14:36:36 2013 -0700
> @@ -526,6 +526,8 @@ libxl_domain_remus_info = Struct("domain
>      ("interval",     integer),
>      ("blackhole",    bool),
>      ("compression",  bool),
> +    ("netbuf",  bool),
> +    ("netbufscript", string),

Please align the types as in the lines above.

Ian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks
  2013-08-29 22:16 ` [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
@ 2013-09-04 15:19   ` Ian Campbell
  2013-09-05 11:25   ` Ian Jackson
  1 sibling, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2013-09-04 15:19 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Thu, 2013-08-29 at 15:16 -0700, Shriram Rajagopalan wrote:
> # HG changeset patch
> # User Shriram Rajagopalan <rshriram@cs.ubc.ca>
> # Date 1377813001 25200
> # Node ID b8839e0b61c2e15b94a274cfddedf0814d2560de
> # Parent  3e8b0aa7cd4a945ec3efe14c969700c9e23ea2cc
> tools/libxl: Control network buffering in remus callbacks
> 
> This patch constitutes the core network buffering logic.
> and does the following:
>  a) create a new network buffer when the domain is suspended
>     (remus_domain_suspend_callback)
>  b) release the previous network buffer pertaining to the
>     committed checkpoint (remus_domain_checkpoint_dm_saved)
> 
> Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>

Looks ok to me, but I'd like Ian's opinion on the interactions with the
async stuff in the previous patch.

> 
> diff -r 3e8b0aa7cd4a -r b8839e0b61c2 tools/libxl/libxl_dom.c
> --- a/tools/libxl/libxl_dom.c	Thu Aug 29 14:36:36 2013 -0700
> +++ b/tools/libxl/libxl_dom.c	Thu Aug 29 14:50:01 2013 -0700
> @@ -1215,8 +1215,24 @@ int libxl__toolstack_save(uint32_t domid
>  
>  static int libxl__remus_domain_suspend_callback(void *data)
>  {
> -    /* REMUS TODO: Issue disk and network checkpoint reqs. */
> -    return libxl__domain_suspend_common_callback(data);
> +    /* REMUS TODO: Issue disk checkpoint reqs. */
> +    libxl__save_helper_state *shs = data;
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
> +    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
> +    bool is_suspended;
> +    STATE_AO_GC(dss->ao);
> +
> +    is_suspended = !!libxl__domain_suspend_common_callback(data);
> +
> +    if (!remus_ctx->netbuf_ctx) return is_suspended;
> +
> +    if (is_suspended) {
> +        if (libxl__remus_netbuf_start_new_epoch(gc, dss->domid,
> +                                                remus_ctx))
> +            return !is_suspended;
> +    }
> +
> +    return is_suspended;
>  }
>  
>  static int libxl__remus_domain_resume_callback(void *data)
> @@ -1229,7 +1245,7 @@ static int libxl__remus_domain_resume_ca
>      if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
>          return 0;
>  
> -    /* REMUS TODO: Deal with disk. Start a new network output buffer */
> +    /* REMUS TODO: Deal with disk. */
>      return 1;
>  }
>  
> @@ -1256,10 +1272,36 @@ static void libxl__remus_domain_checkpoi
>  static void remus_checkpoint_dm_saved(libxl__egc *egc,
>                                        libxl__domain_suspend_state *dss, int rc)
>  {
> -    /* REMUS TODO: Wait for disk and memory ack, release network buffer */
> -    /* REMUS TODO: make this asynchronous */
> -    assert(!rc); /* REMUS TODO handle this error properly */
> -    usleep(dss->remus_ctx->interval * 1000);
> +    /*
> +     * REMUS TODO: Wait for disk and explicit memory ack (through restore
> +     * callback from remote) before releasing network buffer.
> +     */
> +    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
> +    struct timespec epoch;
> +    int ret;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc) {
> +        LOG(ERROR, "Failed to save device model. Terminating Remus..");
> +        libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                         &dss->shs, rc);
> +        return;
> +    }
> +
> +    if (remus_ctx->netbuf_ctx) {
> +        ret = libxl__remus_netbuf_release_prev_epoch(gc, dss->domid,
> +                                                     remus_ctx);
> +        if (ret) {
> +            libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                             &dss->shs,
> +                                                             ret);
> +            return;
> +        }
> +    }
> +
> +    epoch.tv_sec = remus_ctx->interval / 1000; /* interval is in ms */
> +    epoch.tv_nsec = remus_ctx->interval * 1000L * 1000L;
> +    nanosleep(&epoch, 0);
>      libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
>  }
>  

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch
  2013-08-29 22:16 ` [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch Shriram Rajagopalan
@ 2013-09-04 15:24   ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2013-09-04 15:24 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Thu, 2013-08-29 at 15:16 -0700, Shriram Rajagopalan wrote:
> @@ -418,6 +417,13 @@ Generally useful for debugging.
>  
>  Disable memory checkpoint compression.
>  
> +=item B<-n>
> +
> +Enable network output buffering.  The default script used to configure
> +network buffering is /etc/xen/scripts/remus-netbuf-setup. If you wish to
> +use a custom script, set the global variable "remus.default.netbufscript"
> +in /etc/xen/xl.conf to point to your script.

Is this not the sort of thing you might want to control on a per domain
basis? Or perhaps on a per- "xl remus" granularity?

> +
>  =item B<-s> I<sshcommand>
>  
>  Use <sshcommand> instead of ssh.  String will be passed to sh.
> diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl.c
> --- a/tools/libxl/xl.c	Thu Aug 29 14:50:01 2013 -0700
> +++ b/tools/libxl/xl.c	Thu Aug 29 15:06:47 2013 -0700
> @@ -46,6 +46,7 @@ char *default_vifscript = NULL;
>  char *default_bridge = NULL;
>  char *default_gatewaydev = NULL;
>  char *default_vifbackend = NULL;
> +char *default_remus_netbufscript = NULL;
>  enum output_format default_output_format = OUTPUT_FORMAT_JSON;
>  int claim_mode = 1;
>  
> @@ -177,6 +178,11 @@ static void parse_global_config(const ch
>      if (!xlu_cfg_get_long (config, "claim_mode", &l, 0))
>          claim_mode = l;
>  
> +    if (!xlu_cfg_get_string (config, "remus.default.netbufscript", &buf, 0)) {
> +        free(default_remus_netbufscript);
> +        default_remus_netbufscript = strdup(buf);
> +    }

xlu_cfg_replace_string? I think the patterns you have copied predate
this function and/or were more copying which noone noticed.

(If you were feeling exceptionally keen you could update them all, not a
requirement though)

> +
>      xlu_cfg_destroy(config);
>  }
>  
> diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl.h
> --- a/tools/libxl/xl.h	Thu Aug 29 14:50:01 2013 -0700
> +++ b/tools/libxl/xl.h	Thu Aug 29 15:06:47 2013 -0700
> @@ -152,6 +152,7 @@ extern char *default_vifscript;
>  extern char *default_bridge;
>  extern char *default_gatewaydev;
>  extern char *default_vifbackend;
> +extern char *default_remus_netbufscript;
>  extern char *blkdev_start;
>  
>  enum output_format {
> diff -r b8839e0b61c2 -r 66e6a3c64566 tools/libxl/xl_cmdimpl.c
> --- a/tools/libxl/xl_cmdimpl.c	Thu Aug 29 14:50:01 2013 -0700
> +++ b/tools/libxl/xl_cmdimpl.c	Thu Aug 29 15:06:47 2013 -0700
> @@ -7070,8 +7070,9 @@ int main_remus(int argc, char **argv)
>      r_info.interval = 200;
>      r_info.blackhole = 0;
>      r_info.compression = 1;
> -
> -    SWITCH_FOREACH_OPT(opt, "bui:s:e", NULL, "remus", 2) {
> +    r_info.netbuf = 0;
> +
> +    SWITCH_FOREACH_OPT(opt, "buni:s:e", NULL, "remus", 2) {
>      case 'i':
>          r_info.interval = atoi(optarg);
>          break;
> @@ -7081,6 +7082,9 @@ int main_remus(int argc, char **argv)
>      case 'u':
>          r_info.compression = 0;
>          break;
> +    case 'n':
> +        r_info.netbuf = 1;
> +        break;
>      case 's':
>          ssh_command = optarg;
>          break;
> @@ -7092,6 +7096,11 @@ int main_remus(int argc, char **argv)
>      domid = find_domain(argv[optind]);
>      host = argv[optind + 1];
>  
> +    if (r_info.netbuf) {
> +        if (default_remus_netbufscript)
> +            r_info.netbufscript = strdup(default_remus_netbufscript);

Can netbufscript be const and then drop the strdup?

> +    }
> +
>      if (r_info.blackhole) {
>          send_fd = open("/dev/null", O_RDWR, 0644);
>          if (send_fd < 0) {
> @@ -7126,13 +7135,10 @@ int main_remus(int argc, char **argv)
>      /* Point of no return */
>      rc = libxl_domain_remus_start(ctx, &r_info, domid, send_fd, recv_fd, 0);
>  
> -    /* If we are here, it means backup has failed/domain suspend failed.
> -     * Try to resume the domain and exit gracefully.
> +    /* If we are here, it means remus setup/domain susppend/backup has

"suspend"

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-08-29 22:16 ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
  2013-09-04 15:17   ` Ian Campbell
@ 2013-09-04 16:16   ` Ian Jackson
  2013-09-04 17:22     ` Shriram Rajagopalan
  1 sibling, 1 reply; 30+ messages in thread
From: Ian Jackson @ 2013-09-04 16:16 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

Shriram Rajagopalan writes ("[Xen-devel] [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering"):
> tools/libxl: setup/teardown Remus network buffering

Thanks.

I'm afraid the event handling, particularly with respect to
subprocesses, is not yet right.

> diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_dom.c
> --- a/tools/libxl/libxl_dom.c	Thu Aug 29 14:36:25 2013 -0700
> +++ b/tools/libxl/libxl_dom.c	Thu Aug 29 14:36:36 2013 -0700
> @@ -1259,7 +1259,7 @@ static void remus_checkpoint_dm_saved(li
>      /* REMUS TODO: Wait for disk and memory ack, release network buffer */
>      /* REMUS TODO: make this asynchronous */
>      assert(!rc); /* REMUS TODO handle this error properly */
> -    usleep(dss->interval * 1000);
> +    usleep(dss->remus_ctx->interval * 1000);
>      libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);

Do you plan to fix this later (perhaps as part of a future series) ?

> diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/tools/libxl/libxl_netbuffer.c	Thu Aug 29 14:36:36 2013 -0700
> @@ -0,0 +1,434 @@
...
> +static void netbuf_script_child_death_cb(libxl__egc *egc,
> +                                         libxl__ev_child *child,
> +                                         pid_t pid, int status)
> +{
> +    /* No-op. hotplug-error will be read in setup/teardown_ifb */
> +    return;
> +}

This can't possibly be right.  You must always wait for your children
and you may not complete the ao until the child has finished.  This
should all be documented in libxl_internal.h.

Also I don't see where this other error handling code is.

> +    if (!pid) {
> +        /* child: Launch netbuf script */
> +        libxl__exec(gc, -1, -1, -1, args[0], args, env);
> +        /* notreached */
> +        abort();
> +    }
> +
> +    assert(libxl__ev_child_inuse(&childw_out));

What purpose does this serve ?  This can be false if the death_cb is
called from within libxl__ev_child_fork.

> +    rc = libxl__exec_netbuf_script(gc, domid, devid, vif,
> +                                   script, "setup");
> +    if (rc) return rc;

This definitely isn't right - it needs to be asynchronous, with a
callback.

> +    /* Now wait for the result (XENBUS_PATH/hotplug-status).
> +     * It doesnt matter what the result is. If the hotplug-status path
> +     * appears, then we are good to go.
> +     */
> +    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
> +                                   script,
> +                                   /* path */
> +                                   GCSPRINTF("%s/hotplug-status",
> +                                             out_path_base),
> +                                   NULL /* state */,
> +                                   NULL, script_done_cb, NULL);

This wait does not actually block the current execution.  It arranges
for the callback to be called.  You need to break this function apart.

> +static int libxl__netbuf_script_teardown(libxl__gc *gc, uint32_t domid,
> +                                         int devid, char *vif,
> +                                         char *script)
> +{
> +    /* Nothing to wait for during tear down */
> +    return libxl__exec_netbuf_script(gc, domid, devid, vif,
> +                                     script, "teardown");

And this one.

> +    for (i = 0; i < num_netbufs; ++i) {
> +
> +        /* The setup script does the following
> +         * find a free IFB to act as buffer for the vif.
> +         * set up the plug qdisc on the IFB.
> +         */
> +        ret = libxl__netbuf_script_setup(gc, domid, i, vif_list[i],
> +                                         (char *) remus_ctx->netbufscript,
> +                                         &ifb_list[i]);

Does this run a script per netbuf ?  Do they want to run in
parallel ?

> +int libxl__remus_netbuf_teardown(libxl__gc *gc, uint32_t domid,
> +				 libxl__remus_ctx *remus_ctx)
> +{
...
> +    for (i = 0; i < netbuf_ctx->num_netbufs; ++i)
> +        libxl__netbuf_script_teardown(gc, domid, i, vif_list[i],
> +                                      (char *) remus_ctx->netbufscript);

I think you should definitely do this asynchronously and in parallel.

What happens if I use "xl destroy" to destroy a domain which is being
handled by remus.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-09-04 16:16   ` Ian Jackson
@ 2013-09-04 17:22     ` Shriram Rajagopalan
  2013-10-09 16:32       ` Shriram Rajagopalan
  0 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-09-04 17:22 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 6678 bytes --]

On Wed, Sep 4, 2013 at 12:16 PM, Ian Jackson <Ian.Jackson@eu.citrix.com>wrote:

> Shriram Rajagopalan writes ("[Xen-devel] [PATCH 3 of 5 V2] tools/libxl:
> setup/teardown Remus network buffering"):
> > tools/libxl: setup/teardown Remus network buffering
>
> Thanks.
>
> I'm afraid the event handling, particularly with respect to
> subprocesses, is not yet right.
>
> > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_dom.c
> > --- a/tools/libxl/libxl_dom.c Thu Aug 29 14:36:25 2013 -0700
> > +++ b/tools/libxl/libxl_dom.c Thu Aug 29 14:36:36 2013 -0700
> > @@ -1259,7 +1259,7 @@ static void remus_checkpoint_dm_saved(li
> >      /* REMUS TODO: Wait for disk and memory ack, release network buffer
> */
> >      /* REMUS TODO: make this asynchronous */
> >      assert(!rc); /* REMUS TODO handle this error properly */
> > -    usleep(dss->interval * 1000);
> > +    usleep(dss->remus_ctx->interval * 1000);
> >      libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
>
> Do you plan to fix this later (perhaps as part of a future series) ?
>
>
Its actually fixed in the next patch in the same series.


>  > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
> > --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> > +++ b/tools/libxl/libxl_netbuffer.c   Thu Aug 29 14:36:36 2013 -0700
> > @@ -0,0 +1,434 @@
> ...
> > +static void netbuf_script_child_death_cb(libxl__egc *egc,
> > +                                         libxl__ev_child *child,
> > +                                         pid_t pid, int status)
> > +{
> > +    /* No-op. hotplug-error will be read in setup/teardown_ifb */
> > +    return;
> > +}
>
> This can't possibly be right.  You must always wait for your children
> and you may not complete the ao until the child has finished.  This
> should all be documented in libxl_internal.h.
>
>
Thats true. But this was not meant to be an ao (see explanation at
end of this email).


> Also I don't see where this other error handling code is.
>
> > +    if (!pid) {
> > +        /* child: Launch netbuf script */
> > +        libxl__exec(gc, -1, -1, -1, args[0], args, env);
> > +        /* notreached */
> > +        abort();
> > +    }
> > +
> > +    assert(libxl__ev_child_inuse(&childw_out));
>
> What purpose does this serve ?  This can be false if the death_cb is
> called from within libxl__ev_child_fork.
>
>
Sorry. This was copy pasted from code in device_hotplug(). I couldn't find
a simple fork & exec abstraction. The libxl_ev_child_fork seemed to be
the only option.


> > +    rc = libxl__exec_netbuf_script(gc, domid, devid, vif,
> > +                                   script, "setup");
> > +    if (rc) return rc;
>
> This definitely isn't right - it needs to be asynchronous, with a
> callback.
>
> > +    /* Now wait for the result (XENBUS_PATH/hotplug-status).
> > +     * It doesnt matter what the result is. If the hotplug-status path
> > +     * appears, then we are good to go.
> > +     */
> > +    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
> > +                                   script,
> > +                                   /* path */
> > +                                   GCSPRINTF("%s/hotplug-status",
> > +                                             out_path_base),
> > +                                   NULL /* state */,
> > +                                   NULL, script_done_cb, NULL);
>
> This wait does not actually block the current execution.


It actually does. Atleast based on the code in libxl_exec.c which uses
a select() call to wait on xenstore fds.


>  It arranges
> for the callback to be called.  You need to break this function apart.
>
> > +static int libxl__netbuf_script_teardown(libxl__gc *gc, uint32_t domid,
> > +                                         int devid, char *vif,
> > +                                         char *script)
> > +{
> > +    /* Nothing to wait for during tear down */
> > +    return libxl__exec_netbuf_script(gc, domid, devid, vif,
> > +                                     script, "teardown");
>
> And this one.
>
> > +    for (i = 0; i < num_netbufs; ++i) {
> > +
> > +        /* The setup script does the following
> > +         * find a free IFB to act as buffer for the vif.
> > +         * set up the plug qdisc on the IFB.
> > +         */
> > +        ret = libxl__netbuf_script_setup(gc, domid, i, vif_list[i],
> > +                                         (char *)
> remus_ctx->netbufscript,
> > +                                         &ifb_list[i]);
>
> Does this run a script per netbuf ?  Do they want to run in
> parallel ?
>
>
A script per netbuf. Running them in parallel is an overkill, since the
remus
setup is a one-time task and the average domain is expected to have few
interfaces.
Besides, the script has no waits. It just issues a bunch of shell commands
that don't block


>  > +int libxl__remus_netbuf_teardown(libxl__gc *gc, uint32_t domid,
> > +                              libxl__remus_ctx *remus_ctx)
> > +{
> ...
> > +    for (i = 0; i < netbuf_ctx->num_netbufs; ++i)
> > +        libxl__netbuf_script_teardown(gc, domid, i, vif_list[i],
> > +                                      (char *) remus_ctx->netbufscript);
>
> I think you should definitely do this asynchronously and in parallel.
>
>
See answer above for running in parallel. In terms of running this
asynchronously,
I had a comment in this file explaining why I decided to go the synchronous
route.
Basically, the setup and teardown are part of the remus API call, which is
already
asynchronous.

The remus external API (libxl_domain_remus_start) returns after starting
remus.
It goes like this:
  setup scripts
  suspend_domain (which basically launches save_domain helper process)
  return AO_IN_PROGRESS

The setup scripts are have no long running operations. They complete in
jiffy
(adding a bunch of tc rules to interfaces)

The AO callback in libxl.c (named "remus_replication_failure_cb" [beginning
of this patch]), is called when the backup fails, i.e. when the replication
operations fail (such as write() calls).

The AO callback calls the above teardown function.  I remember reading in
the
comments that an AO op should not be issued from an AO callback
(please correct me if I am wrong).

What happens if I use "xl destroy" to destroy a domain which is being
> handled by remus.
>
>
One of the replication steps (domain suspend or domain resume) would fail.
This would result in the AO callback being invoked, which in turn invokes
the
external scripts. The scripts fail silently if the vif interface cannot be
found,
but they go ahead and remove the IFB device, and so on.


> Thanks,
> Ian.
>
>

[-- Attachment #1.2: Type: text/html, Size: 9861 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks
  2013-08-29 22:16 ` [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
  2013-09-04 15:19   ` Ian Campbell
@ 2013-09-05 11:25   ` Ian Jackson
  1 sibling, 0 replies; 30+ messages in thread
From: Ian Jackson @ 2013-09-05 11:25 UTC (permalink / raw)
  To: Shriram Rajagopalan
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, Ian Campbell, xen-devel

Shriram Rajagopalan writes ("[Xen-devel] [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus	callbacks"):
> tools/libxl: Control network buffering in remus callbacks
...
> @@ -1256,10 +1272,36 @@ static void libxl__remus_domain_checkpoi
>  static void remus_checkpoint_dm_saved(libxl__egc *egc,
>                                        libxl__domain_suspend_state *dss, int rc)
>  {

I think there are some error handling problems in this function.

> -    /* REMUS TODO: Wait for disk and memory ack, release network buffer */
> -    /* REMUS TODO: make this asynchronous */
> -    assert(!rc); /* REMUS TODO handle this error properly */
> -    usleep(dss->remus_ctx->interval * 1000);
> +    /*
> +     * REMUS TODO: Wait for disk and explicit memory ack (through restore
> +     * callback from remote) before releasing network buffer.
> +     */
> +    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
> +    struct timespec epoch;
> +    int ret;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc) {
> +        LOG(ERROR, "Failed to save device model. Terminating Remus..");
> +        libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                         &dss->shs, rc);

Firstly, it would be much clearer if you used the "goto out" error
handling style, rather than calling ..._callback_done and return
separately each time.

Secondly, here you return rc, a libxl error code.

> +        ret = libxl__remus_netbuf_release_prev_epoch(gc, dss->domid,
> +                                                     remus_ctx);
> +        if (ret) {
> +            libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                             &dss->shs,
> +                                                             ret);

Here you return "ret" which is what exactly ?

> +            return;
...
>      libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);

And here the original function returns "1".

The specification for the callback is in xenguest.h's struct
save_callbacks, the "checkpoint" member.  It says the return value
must be 0 or 1.

So if there is a failure I think you need to stash the error code
somewhere because libxc won't pass it through for you.

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts
  2013-09-04 13:40   ` Ian Campbell
@ 2013-10-09  2:14     ` Shriram Rajagopalan
  2013-10-09  8:06       ` Ian Campbell
  0 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-10-09  2:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 4255 bytes --]

On Wed, Sep 4, 2013 at 6:40 AM, Ian Campbell <Ian.Campbell@citrix.com>wrote:

> On Thu, 2013-08-29 at 15:16 -0700, Shriram Rajagopalan wrote:
> > # HG changeset patch
> > # User Shriram Rajagopalan <rshriram@cs.ubc.ca>
> > # Date 1377641625 25200
> > # Node ID cc2b88cba1c67cc0e0a00c7e63a3ba97e14609cb
> > # Parent  40b079bd57dea24c3b000f233ed123763483a4d5
> > remus: add libnl3 dependency to autoconf scripts
> >
> > Libnl3 is required for controlling Remus network buffering.
> > This patch adds dependency on libnl3 (>= 3.2.8) to autoconf scripts.
> > Also provide ability to configure tools without libnl3 support, that
> > is without network buffering support.
> >
> > Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
> >
> > diff -r 40b079bd57de -r cc2b88cba1c6 config/Tools.mk.in
> > --- a/config/Tools.mk.in      Sun Aug 25 14:39:36 2013 -0700
> > +++ b/config/Tools.mk.in      Tue Aug 27 15:13:45 2013 -0700
> > @@ -35,6 +35,7 @@ PTHREAD_LDFLAGS     := @PTHREAD_LDFLAGS@
> >  PTHREAD_LIBS        := @PTHREAD_LIBS@
> >
> >  PTYFUNCS_LIBS       := @PTYFUNCS_LIBS@
> > +LIBNL3_LIBS         := @LIBNL3_LIBS@
> >
> >  # Download GIT repositories via HTTP or GIT's own protocol?
> >  # GIT's protocol is faster and more robust, when it works at all
> (firewalls
> > @@ -54,6 +55,7 @@ CONFIG_QEMU_TRAD    := @qemu_traditional
> >  CONFIG_QEMU_XEN     := @qemu_xen@
> >  CONFIG_XEND         := @xend@
> >  CONFIG_BLKTAP1      := @blktap1@
> > +CONFIG_REMUS_NETBUF := @remus_netbuf@
> >
> >  #System options
> >  ZLIB                := @zlib@
> > diff -r 40b079bd57de -r cc2b88cba1c6 tools/configure.ac
> > --- a/tools/configure.ac      Sun Aug 25 14:39:36 2013 -0700
> > +++ b/tools/configure.ac      Tue Aug 27 15:13:45 2013 -0700
> > @@ -208,4 +208,19 @@ AC_SUBST(libiconv)
> >  # Checks for header files.
> >  AC_CHECK_HEADERS([yajl/yajl_version.h sys/eventfd.h])
> >
> > +# Check for libnl3 >=3.2.8. If present enable remus network buffering.
> > +PKG_CHECK_EXISTS([libnl-3.0 >= 3.2.8 libnl-route-3.0 >= 3.2.8
> libnl-cli-3.0 >= 3.2.8],
> > +             [libnl3_lib="y"], [libnl3_lib="n"])
> > +AC_CHECK_HEADER([netlink/route/qdisc/plug.h], [libnl3_dev="y"],
> [libnl3_dev="n"])
>
> Aren't these a bit redundant? i.e. doesn't PKG_CHECK_EXISTS cover the
> development headers?


Unfortunately, libnl3 does not have a pkgconfig .pc file for its dev
headers. So there is no standard way
to check for these headers. Debian has libnl-3-dev. OpenSUSE has
libnl3-devel, etc.


> At least on Debian libnl-3-dev contains:
> [...]
> -rw-r--r-- root/root       770 2013-05-21 22:52
> ./usr/include/libnl3/netlink/route/qdisc/plug.h
> [...]
> -rw-r--r-- root/root       239 2013-05-21 22:52
> ./usr/lib/x86_64-linux-gnu/pkgconfig/libnl-3.0.pc
> [...]
>
> > +AS_IF([test "x$libnl3" = "xn" || test "x$libnl3_dev" = "xn"], [
> > +         AC_MSG_WARN([Disabling support for Remus network buffering.
> > +         Please install libnl3 libraries, command line tools and devel
> > +         headers - version 3.2.8 or higher])
>
> This should be mentioned in README too.
>
> > +         AC_SUBST(remus_netbuf, [n])
> > +         ],[
> > +         LIBNL3_LIBS="-lnl-3 -lnl-route-3"
>
> Since libnl3 uses pkg-cofnig, these should come from there I think.
> There's probably a macro in pkg-config.m4 to help?
>

None that I could find.


>
> > +         AC_SUBST(LIBNL3_LIBS)
> > +         AC_SUBST(remus_netbuf, [y])
> > +])
> > +
> >  AC_OUTPUT()
> > diff -r 40b079bd57de -r cc2b88cba1c6 tools/libxl/Makefile
> > --- a/tools/libxl/Makefile    Sun Aug 25 14:39:36 2013 -0700
> > +++ b/tools/libxl/Makefile    Tue Aug 27 15:13:45 2013 -0700
> > @@ -20,7 +20,7 @@ LIBUUID_LIBS += -luuid
> >  endif
> >
> >  LIBXL_LIBS =
> > -LIBXL_LIBS = $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest)
> $(LDLIBS_libxenstore) $(LDLIBS_libblktapctl) $(PTYFUNCS_LIBS)
> $(LIBUUID_LIBS)
> > +LIBXL_LIBS = $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest)
> $(LDLIBS_libxenstore) $(LDLIBS_libblktapctl) $(PTYFUNCS_LIBS)
> $(LIBUUID_LIBS) $(LIBNL3_LIBS)
>
> I think it might be time to break this long line and use
> LIBXL_LIBS  = X Y Z
> LIBXL_LIBS += A B C
> (no need for one line per lib though)
>
> >
> >  CFLAGS_LIBXL += $(CFLAGS_libxenctrl)
> >  CFLAGS_LIBXL += $(CFLAGS_libxenguest)
>
>
>

[-- Attachment #1.2: Type: text/html, Size: 6052 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts
  2013-10-09  2:14     ` Shriram Rajagopalan
@ 2013-10-09  8:06       ` Ian Campbell
  2013-10-09 16:27         ` Shriram Rajagopalan
  0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2013-10-09  8:06 UTC (permalink / raw)
  To: rshriram; +Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Tue, 2013-10-08 at 19:14 -0700, Shriram Rajagopalan wrote:

>         
>         > +# Check for libnl3 >=3.2.8. If present enable remus network
>         buffering.
>         > +PKG_CHECK_EXISTS([libnl-3.0 >= 3.2.8 libnl-route-3.0 >=
>         3.2.8 libnl-cli-3.0 >= 3.2.8],
>         > +             [libnl3_lib="y"], [libnl3_lib="n"])
>         > +AC_CHECK_HEADER([netlink/route/qdisc/plug.h],
>         [libnl3_dev="y"], [libnl3_dev="n"])
>         
>         
>         Aren't these a bit redundant? i.e. doesn't PKG_CHECK_EXISTS
>         cover the development headers? 

> Unfortunately, libnl3 does not have a pkgconfig .pc file for its dev
> headers. So there is no standard way to check for these headers.
> Debian has libnl-3-dev. OpenSUSE has libnl3-devel, etc.

I think you are confused about what pkg-config and .pc files are.

They are part of the "dev headers" and describe how to compile and link
against the library. You are entitled to assume that if pkg-config
returns that the pkg exists you can use it and link against it etc.

IOW the .pc file is exactly the standard way to check for these things.

>         > +         AC_SUBST(remus_netbuf, [n])
>         > +         ],[
>         > +         LIBNL3_LIBS="-lnl-3 -lnl-route-3"
>         
>         
>         Since libnl3 uses pkg-cofnig, these should come from there I
>         think.
>         There's probably a macro in pkg-config.m4 to help?

> None that I could find.

Looks like PKG_CHECK_MODULES sets <PKG>_CFLAGS and _LDFLAGS.

See
https://www.flameeyes.eu/autotools-mythbuster/pkgconfig/pkg_check_modules.html and https://developer.gnome.org/anjuta-build-tutorial/stable/library-autotools.html.en for info (first two hits on google for "pkg-config autoconf")

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts
  2013-10-09  8:06       ` Ian Campbell
@ 2013-10-09 16:27         ` Shriram Rajagopalan
  2013-10-09 16:36           ` Ian Campbell
  0 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-10-09 16:27 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 2932 bytes --]

On Wed, Oct 9, 2013 at 1:06 AM, Ian Campbell <Ian.Campbell@citrix.com>wrote:

> On Tue, 2013-10-08 at 19:14 -0700, Shriram Rajagopalan wrote:
>
> >
> >         > +# Check for libnl3 >=3.2.8. If present enable remus network
> >         buffering.
> >         > +PKG_CHECK_EXISTS([libnl-3.0 >= 3.2.8 libnl-route-3.0 >=
> >         3.2.8 libnl-cli-3.0 >= 3.2.8],
> >         > +             [libnl3_lib="y"], [libnl3_lib="n"])
> >         > +AC_CHECK_HEADER([netlink/route/qdisc/plug.h],
> >         [libnl3_dev="y"], [libnl3_dev="n"])
> >
> >
> >         Aren't these a bit redundant? i.e. doesn't PKG_CHECK_EXISTS
> >         cover the development headers?
>
> > Unfortunately, libnl3 does not have a pkgconfig .pc file for its dev
> > headers. So there is no standard way to check for these headers.
> > Debian has libnl-3-dev. OpenSUSE has libnl3-devel, etc.
>
> I think you are confused about what pkg-config and .pc files are.
>
> They are part of the "dev headers" and describe how to compile and link
> against the library. You are entitled to assume that if pkg-config
> returns that the pkg exists you can use it and link against it etc.
>
> IOW the .pc file is exactly the standard way to check for these things.
>
>
I am a little confused. I do know that if pkg exists, then
linking/compiling isn't an issue.

The issue with libnl3 is "compiling" - i.e., dev headers, that libxl uses
to invoke libnl API.
There is no pkg-config module for libnl3 development headers (Just like the
ones for
libraries: libnl-3.0, libnl-route-3.0, etc) [see
http://git.infradead.org/users/tgr/libnl.git/tree ]

It has package config files for libnl-3.0.pc (-lnl-3), libnl-route-3.0.pc
(-lnl-route-3)
but none for dev headers.

Do they get installed along with the libraries ? Don't think so (atleast on
debian & ubuntu).

root@localhost:~# apt-cache search libnl
libnl-3-200 - library for dealing with netlink sockets
libnl-3-dev - development library and headers for libnl-3
libnl-cli-3-200 - library for dealing with netlink sockets - cli helpers
libnl-cli-3-dev - development library and headers for libnl-cli-3
libnl-route-3-200 - library for dealing with netlink sockets - route
interface
libnl-route-3-dev - development library and headers for libnl-route-3


So what do you suggest ?

 >         > +         AC_SUBST(remus_netbuf, [n])
> >         > +         ],[
> >         > +         LIBNL3_LIBS="-lnl-3 -lnl-route-3"
> >
> >
> >         Since libnl3 uses pkg-cofnig, these should come from there I
> >         think.
> >         There's probably a macro in pkg-config.m4 to help?
>
> > None that I could find.
>
> Looks like PKG_CHECK_MODULES sets <PKG>_CFLAGS and _LDFLAGS.
>
> See
>
> https://www.flameeyes.eu/autotools-mythbuster/pkgconfig/pkg_check_modules.htmland
> https://developer.gnome.org/anjuta-build-tutorial/stable/library-autotools.html.enfor info (first two hits on google for "pkg-config autoconf")
>
> Ian.
>
>
>

[-- Attachment #1.2: Type: text/html, Size: 4644 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-09-04 17:22     ` Shriram Rajagopalan
@ 2013-10-09 16:32       ` Shriram Rajagopalan
  2013-10-14 16:30         ` [PATCH] libxl: Deprecate synchronous waiting for the device model Ian Jackson
  2013-10-14 16:34         ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering [and 1 more messages] Ian Jackson
  0 siblings, 2 replies; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-10-09 16:32 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 7299 bytes --]

On Wed, Sep 4, 2013 at 10:22 AM, Shriram Rajagopalan <rshriram@cs.ubc.ca>wrote:

> On Wed, Sep 4, 2013 at 12:16 PM, Ian Jackson <Ian.Jackson@eu.citrix.com>wrote:
>
>> Shriram Rajagopalan writes ("[Xen-devel] [PATCH 3 of 5 V2] tools/libxl:
>> setup/teardown Remus network buffering"):
>> > tools/libxl: setup/teardown Remus network buffering
>>
>> Thanks.
>>
>> I'm afraid the event handling, particularly with respect to
>> subprocesses, is not yet right.
>>
>> > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_dom.c
>> > --- a/tools/libxl/libxl_dom.c Thu Aug 29 14:36:25 2013 -0700
>> > +++ b/tools/libxl/libxl_dom.c Thu Aug 29 14:36:36 2013 -0700
>> > @@ -1259,7 +1259,7 @@ static void remus_checkpoint_dm_saved(li
>> >      /* REMUS TODO: Wait for disk and memory ack, release network
>> buffer */
>> >      /* REMUS TODO: make this asynchronous */
>> >      assert(!rc); /* REMUS TODO handle this error properly */
>> > -    usleep(dss->interval * 1000);
>> > +    usleep(dss->remus_ctx->interval * 1000);
>> >      libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs,
>> 1);
>>
>> Do you plan to fix this later (perhaps as part of a future series) ?
>>
>>
> Its actually fixed in the next patch in the same series.
>
>
>>  > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
>> > --- /dev/null Thu Jan 01 00:00:00 1970 +0000
>> > +++ b/tools/libxl/libxl_netbuffer.c   Thu Aug 29 14:36:36 2013 -0700
>> > @@ -0,0 +1,434 @@
>> ...
>> > +static void netbuf_script_child_death_cb(libxl__egc *egc,
>> > +                                         libxl__ev_child *child,
>> > +                                         pid_t pid, int status)
>> > +{
>> > +    /* No-op. hotplug-error will be read in setup/teardown_ifb */
>> > +    return;
>> > +}
>>
>> This can't possibly be right.  You must always wait for your children
>> and you may not complete the ao until the child has finished.  This
>> should all be documented in libxl_internal.h.
>>
>>
> Thats true. But this was not meant to be an ao (see explanation at
> end of this email).
>
>
>> Also I don't see where this other error handling code is.
>>
>> > +    if (!pid) {
>> > +        /* child: Launch netbuf script */
>> > +        libxl__exec(gc, -1, -1, -1, args[0], args, env);
>> > +        /* notreached */
>> > +        abort();
>> > +    }
>> > +
>> > +    assert(libxl__ev_child_inuse(&childw_out));
>>
>> What purpose does this serve ?  This can be false if the death_cb is
>> called from within libxl__ev_child_fork.
>>
>>
> Sorry. This was copy pasted from code in device_hotplug(). I couldn't find
> a simple fork & exec abstraction. The libxl_ev_child_fork seemed to be
> the only option.
>
>
>> > +    rc = libxl__exec_netbuf_script(gc, domid, devid, vif,
>> > +                                   script, "setup");
>> > +    if (rc) return rc;
>>
>> This definitely isn't right - it needs to be asynchronous, with a
>> callback.
>>
>> > +    /* Now wait for the result (XENBUS_PATH/hotplug-status).
>> > +     * It doesnt matter what the result is. If the hotplug-status path
>> > +     * appears, then we are good to go.
>> > +     */
>> > +    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
>> > +                                   script,
>> > +                                   /* path */
>> > +                                   GCSPRINTF("%s/hotplug-status",
>> > +                                             out_path_base),
>> > +                                   NULL /* state */,
>> > +                                   NULL, script_done_cb, NULL);
>>
>> This wait does not actually block the current execution.
>
>
> It actually does. Atleast based on the code in libxl_exec.c which uses
> a select() call to wait on xenstore fds.
>
>
>>  It arranges
>> for the callback to be called.  You need to break this function apart.
>>
>> > +static int libxl__netbuf_script_teardown(libxl__gc *gc, uint32_t domid,
>> > +                                         int devid, char *vif,
>> > +                                         char *script)
>> > +{
>> > +    /* Nothing to wait for during tear down */
>> > +    return libxl__exec_netbuf_script(gc, domid, devid, vif,
>> > +                                     script, "teardown");
>>
>> And this one.
>>
>> > +    for (i = 0; i < num_netbufs; ++i) {
>> > +
>> > +        /* The setup script does the following
>> > +         * find a free IFB to act as buffer for the vif.
>> > +         * set up the plug qdisc on the IFB.
>> > +         */
>> > +        ret = libxl__netbuf_script_setup(gc, domid, i, vif_list[i],
>> > +                                         (char *)
>> remus_ctx->netbufscript,
>> > +                                         &ifb_list[i]);
>>
>> Does this run a script per netbuf ?  Do they want to run in
>> parallel ?
>>
>>
> A script per netbuf. Running them in parallel is an overkill, since the
> remus
> setup is a one-time task and the average domain is expected to have few
> interfaces.
> Besides, the script has no waits. It just issues a bunch of shell commands
> that don't block
>
>
>>  > +int libxl__remus_netbuf_teardown(libxl__gc *gc, uint32_t domid,
>> > +                              libxl__remus_ctx *remus_ctx)
>> > +{
>> ...
>> > +    for (i = 0; i < netbuf_ctx->num_netbufs; ++i)
>> > +        libxl__netbuf_script_teardown(gc, domid, i, vif_list[i],
>> > +                                      (char *)
>> remus_ctx->netbufscript);
>>
>> I think you should definitely do this asynchronously and in parallel.
>>
>>
> See answer above for running in parallel. In terms of running this
> asynchronously,
> I had a comment in this file explaining why I decided to go the
> synchronous route.
>  Basically, the setup and teardown are part of the remus API call, which
> is already
> asynchronous.
>
> The remus external API (libxl_domain_remus_start) returns after starting
> remus.
> It goes like this:
>   setup scripts
>   suspend_domain (which basically launches save_domain helper process)
>   return AO_IN_PROGRESS
>
> The setup scripts are have no long running operations. They complete in
> jiffy
> (adding a bunch of tc rules to interfaces)
>
> The AO callback in libxl.c (named "remus_replication_failure_cb"
> [beginning
> of this patch]), is called when the backup fails, i.e. when the
> replication
> operations fail (such as write() calls).
>
> The AO callback calls the above teardown function.  I remember reading in
> the
> comments that an AO op should not be issued from an AO callback
> (please correct me if I am wrong).
>
> What happens if I use "xl destroy" to destroy a domain which is being
>> handled by remus.
>>
>>
> One of the replication steps (domain suspend or domain resume) would fail.
> This would result in the AO callback being invoked, which in turn invokes
> the
> external scripts. The scripts fail silently if the vif interface cannot be
> found,
> but they go ahead and remove the IFB device, and so on.
>
>
>> Thanks,
>> Ian.
>>
>>
>
Ian J, do you have any further thoughts related to my responses on your
concerns about
the event handling in this part of the code ?
Do you have suggestions on a cleaner way to go about the
libxl_ev_child_fork code block ?
With regard to the AO ops, do my answers make sense ?


thanks
shriram

[-- Attachment #1.2: Type: text/html, Size: 10956 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts
  2013-10-09 16:27         ` Shriram Rajagopalan
@ 2013-10-09 16:36           ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2013-10-09 16:36 UTC (permalink / raw)
  To: rshriram; +Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, xen-devel

On Wed, 2013-10-09 at 09:27 -0700, Shriram Rajagopalan wrote:

> The issue with libnl3 is "compiling" - i.e., dev headers, that libxl
> uses to invoke libnl API.

This is what pkg-config describes. pkg-config is about compiling and
linking, not runtime dependencies as you seem to be thinking.

> There is no pkg-config module for libnl3 development headers (Just
> like the ones for 
> libraries: libnl-3.0, libnl-route-3.0, etc) [see
> http://git.infradead.org/users/tgr/libnl.git/tree ]

> It has package config files for libnl-3.0.pc (-lnl-3),
> libnl-route-3.0.pc (-lnl-route-3)
> but none for dev headers.

The .pc is *part of* the dev headers, not the runtime library7.

> Do they get installed along with the libraries ? Don't think so
> (atleast on debian & ubuntu).

The .pc file is in the -dev package, alongside the headers. This is hte
pkg-config model.
        $ dpkg -L libnl-3-dev | grep \.pc
        /usr/lib/x86_64-linux-gnu/pkgconfig/libnl-3.0.pc
        
If pkg-config says libnl-3 exists to be compiled/linked against then by
definition the headers must be present.

Or are you trying to say that libnl3 is somehow a different thing to
libnl-3.0?

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-09-04 15:17   ` Ian Campbell
@ 2013-10-14 14:23     ` Shriram Rajagopalan
  2013-10-15 10:34       ` Ian Campbell
  0 siblings, 1 reply; 30+ messages in thread
From: Shriram Rajagopalan @ 2013-10-14 14:23 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Andrew Cooper, Stefano Stabellini, Ian Jackson, Roger Pau Monne,
	xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 4255 bytes --]

On Wed, Sep 4, 2013 at 8:17 AM, Ian Campbell <Ian.Campbell@citrix.com>wrote:

>
> > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
> > --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> > +++ b/tools/libxl/libxl_netbuffer.c   Thu Aug 29 14:36:36 2013 -0700
> > @@ -0,0 +1,434 @@
> > +/*
> > + * Copyright (C) 2013
> > + * Author Shriram Rajagopalan <rshriram@cs.ubc.ca>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU Lesser General Public License as
> published
> > + * by the Free Software Foundation; version 2.1 only. with the special
> > + * exception on linking described in file LICENSE.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU Lesser General Public License for more details.
> > + */
> > +
> > +#include "libxl_osdeps.h" /* must come before any other headers */
> > +
> > +#include "libxl_internal.h"
> > +
> > +#include <netlink/cache.h>
> > +#include <netlink/socket.h>
> > +#include <netlink/attr.h>
> > +#include <netlink/route/link.h>
> > +#include <netlink/route/route.h>
> > +#include <netlink/route/qdisc.h>
> > +#include <netlink/route/qdisc/plug.h>
> > +
> > +typedef struct libxl__remus_netbuf_ctx {
> > +    struct rtnl_qdisc **netbuf_qdisc_list;
> > +    struct nl_sock *nlsock;
> > +    struct nl_cache *qdisc_cache;
> > +    char **vif_list;
> > +    char **ifb_list;
> > +    uint32_t num_netbufs;
> > +    uint32_t unused;
> > +} libxl__remus_netbuf_ctx;
> > +
> > +/* If the device has a vifname, then use that instead of
> > + * the vifX.Y format.
> > + */
> > +static char *get_vifname(libxl__gc *gc, uint32_t domid,
> > +                         libxl_device_nic *nic)
> > +{
> > +    char *vifname = NULL;
> > +    vifname = libxl__xs_read(gc, XBT_NULL,
> > +                             libxl__sprintf(gc,
> > +
>  "%s/backend/vif/%d/%d/vifname",
> > +                                            libxl__xs_get_dompath(gc,
> 0),
> > +                                            domid, nic->devid));
> > +    if (!vifname) {
> > +        vifname = (char *)libxl__device_nic_devname(gc, domid,
>
> Please don't cast away the const here. Either make this function return
> const or dup this. Preferably const this function.
>
> > +                                                    nic->devid,
> > +                                                    nic->nictype);
> > +    }
> > +
> > +    return vifname;
> > +}
> > +
> > +static char **get_guest_vif_list(libxl__gc *gc, uint32_t domid,
> > +                                 int *num_vifs)
> > +{
> > +    libxl_device_nic *nics;
> > +    int nb, i = 0;
> > +    char **vif_list = NULL;
> > +
> > +    nics = libxl_device_nic_list(CTX, domid, &nb);
> > +    if (!nics) {
> > +        *num_vifs = 0;
> > +        return NULL;
> > +    }
> > +
> > +    vif_list = libxl__calloc(gc, nb, sizeof(char *));
> > +    for (i = 0; i < nb; ++i) {
> > +        vif_list[i] = get_vifname(gc, domid, &nics[i]);
> > +        libxl_device_nic_dispose(&nics[i]);
> > +    }
> > +    free(nics);
> > +
> > +    *num_vifs = nb;
> > +    return vif_list;
> > +}
>
> I think rather than creating a list of char *'s you should just use the
> array of libxl_device_nic's and pass that around in your context, pass
> the individual elements to libxl__netbuf_script_setup etc and then let
> them determine the name as necessary. That would seem to remove a bunch
> of book keeping around this list.
>
>
The list of nics returned by libxl__device_nic_list is not managed by the
gc.
So at every exit point in the main functions, there has to be extra code to
free up the invididual
nics in the list and the list itself.

In contrast, with the current approach, all allocations are under the
gc's purview.  The list of nics are released immediately after creating the
array of vifnames.
This seemed to eliminate redundant error handling logic and keeps the code
simpler.  I'll also add
checks for backend_domid!=0 to this function, to ensure that Remus works
only for domains
with no driver domain backed interfaces.

[-- Attachment #1.2: Type: text/html, Size: 5533 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] libxl: Deprecate synchronous waiting for the device model
  2013-10-09 16:32       ` Shriram Rajagopalan
@ 2013-10-14 16:30         ` Ian Jackson
  2013-11-04 16:50           ` Ian Campbell
  2013-10-14 16:34         ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering [and 1 more messages] Ian Jackson
  1 sibling, 1 reply; 30+ messages in thread
From: Ian Jackson @ 2013-10-14 16:30 UTC (permalink / raw)
  To: xen-devel; +Cc: rshriram, Andrew Cooper, Ian Campbell, Stefano Stabellini

commit a2e653f795e14cc7d2e83c8075311757564ed751
Author: Ian Jackson <ian.jackson@eu.citrix.com>
Date:   Mon Oct 14 17:26:01 2013 +0100

    libxl: Deprecate synchronous waiting for the device model
    
    libxl__wait_for_device_model blocks, with the ctx lock held, waiting
    for a response from the device model.  If the dm doesn't respond
    quickly (for example, because it has crashed), this may block the
    whole process.  Explain this in a comment, rename the function to
    libxl__wait_for_device_model_deprecated, and explain what to use
    instead.
    
    libxl__wait_for_offspring is the core implementation for the above.
    Its name leads people to think it might be generally useful for
    waiting for children, which is far from the case.  It only waits for
    xenstore.  Also it has the problems described above.  Explain this,
    rename it to libxl__xenstore_child_wait_deprecated, and explain what
    to use instead.
    
    Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 29e66f2..eba7c0d 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -846,7 +846,7 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
         state = libxl__xs_read(gc, XBT_NULL, path);
         if (state != NULL && !strcmp(state, "paused")) {
             libxl__qemu_traditional_cmd(gc, domid, "continue");
-            libxl__wait_for_device_model(gc, domid, "running",
+            libxl__wait_for_device_model_deprecated(gc, domid, "running",
                                          NULL, NULL, NULL);
         }
     }
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 16a92a4..610f0c5 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -1077,7 +1077,7 @@ static void devices_remove_callback(libxl__egc *egc,
     return;
 }
 
-int libxl__wait_for_device_model(libxl__gc *gc,
+int libxl__wait_for_device_model_deprecated(libxl__gc *gc,
                                  uint32_t domid, char *state,
                                  libxl__spawn_starting *spawning,
                                  int (*check_callback)(libxl__gc *gc,
@@ -1088,7 +1088,7 @@ int libxl__wait_for_device_model(libxl__gc *gc,
 {
     char *path;
     path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", domid);
-    return libxl__wait_for_offspring(gc, domid,
+    return libxl__xenstore_child_wait_deprecated(gc, domid,
                                      LIBXL_DEVICE_MODEL_START_TIMEOUT,
                                      "Device Model", path, state, spawning,
                                      check_callback, check_callback_userdata);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 6e2252a..709f820 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -955,7 +955,7 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
         LOG(DEBUG, "Saving device model state to %s", filename);
         libxl__qemu_traditional_cmd(gc, domid, "save");
-        libxl__wait_for_device_model(gc, domid, "paused", NULL, NULL, NULL);
+        libxl__wait_for_device_model_deprecated(gc, domid, "paused", NULL, NULL, NULL);
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
@@ -979,7 +979,7 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
         libxl__qemu_traditional_cmd(gc, domid, "continue");
-        libxl__wait_for_device_model(gc, domid, "running", NULL, NULL, NULL);
+        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
diff --git a/tools/libxl/libxl_exec.c b/tools/libxl/libxl_exec.c
index 7eddaef..b6efa0f 100644
--- a/tools/libxl/libxl_exec.c
+++ b/tools/libxl/libxl_exec.c
@@ -157,7 +157,7 @@ out:
     return rc ? SIGTERM : 0;
 }
 
-int libxl__wait_for_offspring(libxl__gc *gc,
+int libxl__xenstore_child_wait_deprecated(libxl__gc *gc,
                                  uint32_t domid,
                                  uint32_t timeout, char *what,
                                  char *path, char *state,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 165dc00..653079a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1223,7 +1223,19 @@ _hidden int libxl__spawn_record_pid(libxl__gc*, libxl__spawn_state*,
                                     pid_t innerchild);
 
 /*
- * libxl__wait_for_offspring - Wait for child state
+ * libxl__xenstore_child_wait_deprecated - Wait for daemonic child IPC
+ *
+ * This is a NOT function for waiting for ordinary child processes.
+ * If you want to run (fork/exec/wait) subprocesses from libxl:
+ *  - Make your libxl entrypoint use the ao machinery
+ *  - Use libxl__ev_fork, and use the callback programming style
+ *
+ * This function is intended for interprocess communication with a
+ * service process.  If the service process does not respond quickly,
+ * the whole caller may be blocked.  Therefore this function is
+ * deprecated.  This function is currently used only by
+ * libxl__wait_for_device_model_deprecated.
+ *
  * gc: allocation pool
  * domid: guest to work with
  * timeout: how many seconds to wait for the state to appear
@@ -1240,9 +1252,9 @@ _hidden int libxl__spawn_record_pid(libxl__gc*, libxl__spawn_state*,
  * in xenstore, and optionally for state in path.
  * If path appears and state matches, check_callback is called.
  * If check_callback returns > 0, waiting for path or state continues.
- * Otherwise libxl__wait_for_offspring returns.
+ * Otherwise libxl__xenstore_child_wait_deprecated returns.
  */
-_hidden int libxl__wait_for_offspring(libxl__gc *gc,
+_hidden int libxl__xenstore_child_wait_deprecated(libxl__gc *gc,
                                  uint32_t domid,
                                  uint32_t timeout, char *what,
                                  char *path, char *state,
@@ -1295,12 +1307,20 @@ _hidden int libxl__need_xenpv_qemu(libxl__gc *gc,
         int nr_consoles, libxl__device_console *consoles,
         int nr_vfbs, libxl_device_vfb *vfbs,
         int nr_disks, libxl_device_disk *disks);
-  /* Caller must either: pass starting_r==0, or on successful
-   * return pass *starting_r (which will be non-0) to
-   * libxl__confirm_device_model_startup or libxl__detach_device_model. */
-_hidden int libxl__wait_for_device_model(libxl__gc *gc,
+
+/*
+ * This function will cause the whole libxl process to hang
+ * if the device model does not respond.  It is deprecated.
+ *
+ * Instead of calling this function:
+ *  - Make your libxl entrypoint use the ao machinery
+ *  - Use libxl__ev_xswatch_register, and use the callback programming
+ *    style
+ */
+_hidden int libxl__wait_for_device_model_deprecated(libxl__gc *gc,
                                 uint32_t domid, char *state,
-                                libxl__spawn_starting *spawning,
+                                libxl__spawn_starting *spawning
+                                                    /* NULL allowed */,
                                 int (*check_callback)(libxl__gc *gc,
                                                       uint32_t domid,
                                                       const char *state,
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index b9960bb..df43895 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -823,7 +823,7 @@ static int qemu_pci_add_xenstore(libxl__gc *gc, uint32_t domid,
     }
 
     libxl__qemu_traditional_cmd(gc, domid, "pci-ins");
-    rc = libxl__wait_for_device_model(gc, domid, NULL, NULL,
+    rc = libxl__wait_for_device_model_deprecated(gc, domid, NULL, NULL,
                                       pci_ins_check, state);
     path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/parameter",
                           domid);
@@ -851,7 +851,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
         hvm = 1;
-        if (libxl__wait_for_device_model(gc, domid, "running",
+        if (libxl__wait_for_device_model_deprecated(gc, domid, "running",
                                          NULL, NULL, NULL) < 0) {
             return ERROR_FAIL;
         }
@@ -1137,7 +1137,7 @@ static int qemu_pci_remove_xenstore(libxl__gc *gc, uint32_t domid,
      * device-model for function 0 */
     if ( !force && (pcidev->vdevfn & 0x7) == 0 ) {
         libxl__qemu_traditional_cmd(gc, domid, "pci-rem");
-        if (libxl__wait_for_device_model(gc, domid, "pci-removed",
+        if (libxl__wait_for_device_model_deprecated(gc, domid, "pci-removed",
                                          NULL, NULL, NULL) < 0) {
             LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Device Model didn't respond in time");
             /* This depends on guest operating system acknowledging the
@@ -1179,7 +1179,7 @@ static int do_pci_remove(libxl__gc *gc, uint32_t domid,
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
         hvm = 1;
-        if (libxl__wait_for_device_model(gc, domid, "running",
+        if (libxl__wait_for_device_model_deprecated(gc, domid, "running",
                                          NULL, NULL, NULL) < 0)
             goto out_fail;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering [and 1 more messages]
  2013-10-09 16:32       ` Shriram Rajagopalan
  2013-10-14 16:30         ` [PATCH] libxl: Deprecate synchronous waiting for the device model Ian Jackson
@ 2013-10-14 16:34         ` Ian Jackson
  1 sibling, 0 replies; 30+ messages in thread
From: Ian Jackson @ 2013-10-14 16:34 UTC (permalink / raw)
  To: rshriram; +Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

Shriram Rajagopalan writes ("Re: [Xen-devel] [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering"):
> On Wed, Sep 4, 2013 at 12:16 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>     > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
>     > --- /dev/null Thu Jan 01 00:00:00 1970 +0000
>     > +++ b/tools/libxl/libxl_netbuffer.c   Thu Aug 29 14:36:36 2013 -0700
>     > @@ -0,0 +1,434 @@
>     ...
>     > +static void netbuf_script_child_death_cb(libxl__egc *egc,
>     > +                                         libxl__ev_child *child,
>     > +                                         pid_t pid, int status)
>     > +{
>     > +    /* No-op. hotplug-error will be read in setup/teardown_ifb */
>     > +    return;
>     > +}
> 
>     This can't possibly be right.  You must always wait for your children
>     and you may not complete the ao until the child has finished.  This
>     should all be documented in libxl_internal.h.
>
> Thats true. But this was not meant to be an ao (see explanation at
> end of this email).

It is not possible to fork/exec/wait in libxl without using the ao
machinery.  As the comment in libxl_internal.h says:
 /*
  * For making subprocesses.  This is the only permitted mechanism for
  * code in libxl to do so.
 ...
 _hidden pid_t libxl__ev_child_fork(libxl__gc *gc, ...

(This is due to awkwardness in the POSIX API, combined with libxl's
status as a library which needs to work inside a variety of other
programs and libraries with a variety of ways of handling child
processes.)

>     > +    /* Now wait for the result (XENBUS_PATH/hotplug-status).
>     > +     * It doesnt matter what the result is. If the hotplug-status path
>     > +     * appears, then we are good to go.
>     > +     */
>     > +    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
>     > +                                   script,
>     > +                                   /* path */
>     > +                                   GCSPRINTF("%s/hotplug-status",
>     > +                                             out_path_base),
>     > +                                   NULL /* state */,
>     > +                                   NULL, script_done_cb, NULL);
> 
>     This wait does not actually block the current execution.
...
> It actually does. Atleast based on the code in libxl_exec.c which uses
> a select() call to wait on xenstore fds.

Oh, I'm wrong.  But, libxl__wait_for_offspring is a special tool for
dealing with the daemonic device model process.  The comment should be
clearer.

>     Does this run a script per netbuf ?  Do they want to run in
>     parallel ?
> 
> A script per netbuf. Running them in parallel is an overkill, since
> the remus setup is a one-time task and the average domain is
> expected to have few interfaces.  Besides, the script has no
> waits. It just issues a bunch of shell commands that don't block

Fair enough.  But you may find it easier to run them in parallel
because perhaps then you can reuse more of the existing machinery for
running multiple ao subtasks.

> See answer above for running in parallel. In terms of running this
> asynchronously, I had a comment in this file explaining why I
> decided to go the synchronous route.  Basically, the setup and
> teardown are part of the remus API call, which is already
> asynchronous.
> 
> The remus external API (libxl_domain_remus_start) returns after
> starting remus.  It goes like this:
>   setup scripts
>   suspend_domain (which basically launches save_domain helper process)
>   return AO_IN_PROGRESS

Right.

> The setup scripts are have no long running operations. They complete in jiffy
> (adding a bunch of tc rules to interfaces)

But in libxl you can't wait for subprocesses (ie, you can't call
waitpid) other than via the ao machinery.

> The AO callback in libxl.c (named "remus_replication_failure_cb" [beginning 
> of this patch]), is called when the backup fails, i.e. when the replication 
> operations fail (such as write() calls).
> 
> The AO callback calls the above teardown function.  I remember reading in the
> comments that an AO op should not be issued from an AO callback 
> (please correct me if I am wrong).

You can't run an ao_how-taking libxl function from within libxl.  But
you may (and should) provide a plain-callback-style version of the
same operation from use within libxl.

To see some examples of how to do subfunctions in the asynchronous
style, see libxl_create.c - for example, its call to
libxl__bootloader_run.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering
  2013-10-14 14:23     ` Shriram Rajagopalan
@ 2013-10-15 10:34       ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2013-10-15 10:34 UTC (permalink / raw)
  To: rshriram
  Cc: Andrew Cooper, xen-devel, Roger Pau Monne, Ian Jackson,
	Stefano Stabellini

On Mon, 2013-10-14 at 07:23 -0700, Shriram Rajagopalan wrote:
> On Wed, Sep 4, 2013 at 8:17 AM, Ian Campbell <Ian.Campbell@citrix.com>
> wrote:

>         
>         > +static char **get_guest_vif_list(libxl__gc *gc, uint32_t domid,
>         > +                                 int *num_vifs)
>         > +{
>         > +    libxl_device_nic *nics;
>         > +    int nb, i = 0;
>         > +    char **vif_list = NULL;
>         > +
>         > +    nics = libxl_device_nic_list(CTX, domid, &nb);
>         > +    if (!nics) {
>         > +        *num_vifs = 0;
>         > +        return NULL;
>         > +    }
>         > +
>         > +    vif_list = libxl__calloc(gc, nb, sizeof(char *));
>         > +    for (i = 0; i < nb; ++i) {
>         > +        vif_list[i] = get_vifname(gc, domid, &nics[i]);
>         > +        libxl_device_nic_dispose(&nics[i]);
>         > +    }
>         > +    free(nics);
>         > +
>         > +    *num_vifs = nb;
>         > +    return vif_list;
>         > +}
>         
>         
>         I think rather than creating a list of char *'s you should just use the
>         array of libxl_device_nic's and pass that around in your context, pass
>         the individual elements to libxl__netbuf_script_setup etc and then let
>         them determine the name as necessary. That would seem to remove a bunch
>         of book keeping around this list.

> 
> The list of nics returned by libxl__device_nic_list is not managed by the gc.

Hrm, that's rather annoying. There is scope for making an interna
libxl__device_nic which does take a gc and having the external facing
libxl_device_nic_list pass in NOGC though.

> So at every exit point in the main functions, there has to be extra code to free up the invididual
> nics in the list and the list itself. 

Yes, that would be less than ideal.

> In contrast, with the current approach, all allocations are under the
> gc's purview.  The list of nics are released immediately after creating the array of vifnames.
> This seemed to eliminate redundant error handling logic and keeps the code simpler.  I'll also add 
> checks for backend_domid!=0 to this function, to ensure that Remus works only for domains
> with no driver domain backed interfaces.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] libxl: Deprecate synchronous waiting for the device model
  2013-10-14 16:30         ` [PATCH] libxl: Deprecate synchronous waiting for the device model Ian Jackson
@ 2013-11-04 16:50           ` Ian Campbell
  2013-11-04 17:03             ` Ian Jackson
  0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2013-11-04 16:50 UTC (permalink / raw)
  To: Ian Jackson; +Cc: rshriram, Andrew Cooper, Stefano Stabellini, xen-devel

On Mon, 2013-10-14 at 17:30 +0100, Ian Jackson wrote:
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 165dc00..653079a 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -1223,7 +1223,19 @@ _hidden int libxl__spawn_record_pid(libxl__gc*, libxl__spawn_state*,
>                                      pid_t innerchild);
>  
>  /*
> - * libxl__wait_for_offspring - Wait for child state
> + * libxl__xenstore_child_wait_deprecated - Wait for daemonic child IPC
> [...]
> This function is currently used only by
> + * libxl__wait_for_device_model_deprecated.

Could it become a static helper in tools/libxl/libxl_device.c (which is
where libxl__wait_for_device_model_deprecated seems to live) then? Tat
would get it out of the internal API immediately...

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] libxl: Deprecate synchronous waiting for the device model
  2013-11-04 16:50           ` Ian Campbell
@ 2013-11-04 17:03             ` Ian Jackson
  2013-11-12 16:58               ` [PATCH RESEND] " Ian Jackson
  0 siblings, 1 reply; 30+ messages in thread
From: Ian Jackson @ 2013-11-04 17:03 UTC (permalink / raw)
  To: Ian Campbell; +Cc: rshriram, Andrew Cooper, Stefano Stabellini, xen-devel

Ian Campbell writes ("Re: [PATCH] libxl: Deprecate synchronous waiting for the device model"):
> On Mon, 2013-10-14 at 17:30 +0100, Ian Jackson wrote:
> >  /*
> > - * libxl__wait_for_offspring - Wait for child state
> > + * libxl__xenstore_child_wait_deprecated - Wait for daemonic child IPC
> > [...]
> > This function is currently used only by
> > + * libxl__wait_for_device_model_deprecated.
> 
> Could it become a static helper in tools/libxl/libxl_device.c (which is
> where libxl__wait_for_device_model_deprecated seems to live) then? Tat
> would get it out of the internal API immediately...

At the moment its called in libxl_device.c but defined in
libxl_exec.c.  I'm not too keen on moving about something which we
intend to get rid of.

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH RESEND] libxl: Deprecate synchronous waiting for the device model
  2013-11-04 17:03             ` Ian Jackson
@ 2013-11-12 16:58               ` Ian Jackson
  2013-11-12 17:25                 ` Ian Campbell
  0 siblings, 1 reply; 30+ messages in thread
From: Ian Jackson @ 2013-11-12 16:58 UTC (permalink / raw)
  To: Ian Campbell, xen-devel, rshriram, Andrew Cooper, Stefano Stabellini

Ian Jackson writes ("Re: [PATCH] libxl: Deprecate synchronous waiting for the device model"):
> Ian Campbell writes ("Re: [PATCH] libxl: Deprecate synchronous waiting for the device model"):
> > On Mon, 2013-10-14 at 17:30 +0100, Ian Jackson wrote:
> > >  /*
> > > - * libxl__wait_for_offspring - Wait for child state
> > > + * libxl__xenstore_child_wait_deprecated - Wait for daemonic child IPC
> > > [...]
> > > This function is currently used only by
> > > + * libxl__wait_for_device_model_deprecated.
> > 
> > Could it become a static helper in tools/libxl/libxl_device.c (which is
> > where libxl__wait_for_device_model_deprecated seems to live) then? Tat
> > would get it out of the internal API immediately...
> 
> At the moment its called in libxl_device.c but defined in
> libxl_exec.c.  I'm not too keen on moving about something which we
> intend to get rid of.

Ping.


From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Mon, 14 Oct 2013 17:26:01 +0100
Subject: [PATCH] libxl: Deprecate synchronous waiting for the device model

libxl__wait_for_device_model blocks, with the ctx lock held, waiting
for a response from the device model.  If the dm doesn't respond
quickly (for example, because it has crashed), this may block the
whole process.  Explain this in a comment, rename the function to
libxl__wait_for_device_model_deprecated, and explain what to use
instead.

libxl__wait_for_offspring is the core implementation for the above.
Its name leads people to think it might be generally useful for
waiting for children, which is far from the case.  It only waits for
xenstore.  Also it has the problems described above.  Explain this,
rename it to libxl__xenstore_child_wait_deprecated, and explain what
to use instead.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl.c          |    2 +-
 tools/libxl/libxl_device.c   |    4 ++--
 tools/libxl/libxl_dom.c      |    4 ++--
 tools/libxl/libxl_exec.c     |    2 +-
 tools/libxl/libxl_internal.h |   36 ++++++++++++++++++++++++++++--------
 tools/libxl/libxl_pci.c      |    8 ++++----
 6 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index bede011..142e936 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -846,7 +846,7 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
         state = libxl__xs_read(gc, XBT_NULL, path);
         if (state != NULL && !strcmp(state, "paused")) {
             libxl__qemu_traditional_cmd(gc, domid, "continue");
-            libxl__wait_for_device_model(gc, domid, "running",
+            libxl__wait_for_device_model_deprecated(gc, domid, "running",
                                          NULL, NULL, NULL);
         }
     }
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 16a92a4..610f0c5 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -1077,7 +1077,7 @@ static void devices_remove_callback(libxl__egc *egc,
     return;
 }
 
-int libxl__wait_for_device_model(libxl__gc *gc,
+int libxl__wait_for_device_model_deprecated(libxl__gc *gc,
                                  uint32_t domid, char *state,
                                  libxl__spawn_starting *spawning,
                                  int (*check_callback)(libxl__gc *gc,
@@ -1088,7 +1088,7 @@ int libxl__wait_for_device_model(libxl__gc *gc,
 {
     char *path;
     path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", domid);
-    return libxl__wait_for_offspring(gc, domid,
+    return libxl__xenstore_child_wait_deprecated(gc, domid,
                                      LIBXL_DEVICE_MODEL_START_TIMEOUT,
                                      "Device Model", path, state, spawning,
                                      check_callback, check_callback_userdata);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 1812bdc..6cb39c1 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -965,7 +965,7 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
         LOG(DEBUG, "Saving device model state to %s", filename);
         libxl__qemu_traditional_cmd(gc, domid, "save");
-        libxl__wait_for_device_model(gc, domid, "paused", NULL, NULL, NULL);
+        libxl__wait_for_device_model_deprecated(gc, domid, "paused", NULL, NULL, NULL);
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
@@ -989,7 +989,7 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
         libxl__qemu_traditional_cmd(gc, domid, "continue");
-        libxl__wait_for_device_model(gc, domid, "running", NULL, NULL, NULL);
+        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
diff --git a/tools/libxl/libxl_exec.c b/tools/libxl/libxl_exec.c
index 7eddaef..b6efa0f 100644
--- a/tools/libxl/libxl_exec.c
+++ b/tools/libxl/libxl_exec.c
@@ -157,7 +157,7 @@ out:
     return rc ? SIGTERM : 0;
 }
 
-int libxl__wait_for_offspring(libxl__gc *gc,
+int libxl__xenstore_child_wait_deprecated(libxl__gc *gc,
                                  uint32_t domid,
                                  uint32_t timeout, char *what,
                                  char *path, char *state,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4f92522..d2301f8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1223,7 +1223,19 @@ _hidden int libxl__spawn_record_pid(libxl__gc*, libxl__spawn_state*,
                                     pid_t innerchild);
 
 /*
- * libxl__wait_for_offspring - Wait for child state
+ * libxl__xenstore_child_wait_deprecated - Wait for daemonic child IPC
+ *
+ * This is a NOT function for waiting for ordinary child processes.
+ * If you want to run (fork/exec/wait) subprocesses from libxl:
+ *  - Make your libxl entrypoint use the ao machinery
+ *  - Use libxl__ev_fork, and use the callback programming style
+ *
+ * This function is intended for interprocess communication with a
+ * service process.  If the service process does not respond quickly,
+ * the whole caller may be blocked.  Therefore this function is
+ * deprecated.  This function is currently used only by
+ * libxl__wait_for_device_model_deprecated.
+ *
  * gc: allocation pool
  * domid: guest to work with
  * timeout: how many seconds to wait for the state to appear
@@ -1240,9 +1252,9 @@ _hidden int libxl__spawn_record_pid(libxl__gc*, libxl__spawn_state*,
  * in xenstore, and optionally for state in path.
  * If path appears and state matches, check_callback is called.
  * If check_callback returns > 0, waiting for path or state continues.
- * Otherwise libxl__wait_for_offspring returns.
+ * Otherwise libxl__xenstore_child_wait_deprecated returns.
  */
-_hidden int libxl__wait_for_offspring(libxl__gc *gc,
+_hidden int libxl__xenstore_child_wait_deprecated(libxl__gc *gc,
                                  uint32_t domid,
                                  uint32_t timeout, char *what,
                                  char *path, char *state,
@@ -1295,12 +1307,20 @@ _hidden int libxl__need_xenpv_qemu(libxl__gc *gc,
         int nr_consoles, libxl__device_console *consoles,
         int nr_vfbs, libxl_device_vfb *vfbs,
         int nr_disks, libxl_device_disk *disks);
-  /* Caller must either: pass starting_r==0, or on successful
-   * return pass *starting_r (which will be non-0) to
-   * libxl__confirm_device_model_startup or libxl__detach_device_model. */
-_hidden int libxl__wait_for_device_model(libxl__gc *gc,
+
+/*
+ * This function will cause the whole libxl process to hang
+ * if the device model does not respond.  It is deprecated.
+ *
+ * Instead of calling this function:
+ *  - Make your libxl entrypoint use the ao machinery
+ *  - Use libxl__ev_xswatch_register, and use the callback programming
+ *    style
+ */
+_hidden int libxl__wait_for_device_model_deprecated(libxl__gc *gc,
                                 uint32_t domid, char *state,
-                                libxl__spawn_starting *spawning,
+                                libxl__spawn_starting *spawning
+                                                    /* NULL allowed */,
                                 int (*check_callback)(libxl__gc *gc,
                                                       uint32_t domid,
                                                       const char *state,
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 26ba3e6..c8eceb4 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -823,7 +823,7 @@ static int qemu_pci_add_xenstore(libxl__gc *gc, uint32_t domid,
     }
 
     libxl__qemu_traditional_cmd(gc, domid, "pci-ins");
-    rc = libxl__wait_for_device_model(gc, domid, NULL, NULL,
+    rc = libxl__wait_for_device_model_deprecated(gc, domid, NULL, NULL,
                                       pci_ins_check, state);
     path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/parameter",
                           domid);
@@ -851,7 +851,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
         hvm = 1;
-        if (libxl__wait_for_device_model(gc, domid, "running",
+        if (libxl__wait_for_device_model_deprecated(gc, domid, "running",
                                          NULL, NULL, NULL) < 0) {
             return ERROR_FAIL;
         }
@@ -1137,7 +1137,7 @@ static int qemu_pci_remove_xenstore(libxl__gc *gc, uint32_t domid,
      * device-model for function 0 */
     if ( !force && (pcidev->vdevfn & 0x7) == 0 ) {
         libxl__qemu_traditional_cmd(gc, domid, "pci-rem");
-        if (libxl__wait_for_device_model(gc, domid, "pci-removed",
+        if (libxl__wait_for_device_model_deprecated(gc, domid, "pci-removed",
                                          NULL, NULL, NULL) < 0) {
             LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Device Model didn't respond in time");
             /* This depends on guest operating system acknowledging the
@@ -1179,7 +1179,7 @@ static int do_pci_remove(libxl__gc *gc, uint32_t domid,
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
         hvm = 1;
-        if (libxl__wait_for_device_model(gc, domid, "running",
+        if (libxl__wait_for_device_model_deprecated(gc, domid, "running",
                                          NULL, NULL, NULL) < 0)
             goto out_fail;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH RESEND] libxl: Deprecate synchronous waiting for the device model
  2013-11-12 16:58               ` [PATCH RESEND] " Ian Jackson
@ 2013-11-12 17:25                 ` Ian Campbell
  2013-11-12 17:27                   ` Ian Jackson
  0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2013-11-12 17:25 UTC (permalink / raw)
  To: Ian Jackson; +Cc: rshriram, Andrew Cooper, Stefano Stabellini, xen-devel

On Tue, 2013-11-12 at 16:58 +0000, Ian Jackson wrote:
> Ian Jackson writes ("Re: [PATCH] libxl: Deprecate synchronous waiting for the device model"):
> > Ian Campbell writes ("Re: [PATCH] libxl: Deprecate synchronous waiting for the device model"):
> > > On Mon, 2013-10-14 at 17:30 +0100, Ian Jackson wrote:
> > > >  /*
> > > > - * libxl__wait_for_offspring - Wait for child state
> > > > + * libxl__xenstore_child_wait_deprecated - Wait for daemonic child IPC
> > > > [...]
> > > > This function is currently used only by
> > > > + * libxl__wait_for_device_model_deprecated.
> > > 
> > > Could it become a static helper in tools/libxl/libxl_device.c (which is
> > > where libxl__wait_for_device_model_deprecated seems to live) then? Tat
> > > would get it out of the internal API immediately...
> > 
> > At the moment its called in libxl_device.c but defined in
> > libxl_exec.c.  I'm not too keen on moving about something which we
> > intend to get rid of.
> 
> Ping.
> 
> 
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> Date: Mon, 14 Oct 2013 17:26:01 +0100
> Subject: [PATCH] libxl: Deprecate synchronous waiting for the device model
> 
> libxl__wait_for_device_model blocks, with the ctx lock held, waiting
> for a response from the device model.  If the dm doesn't respond
> quickly (for example, because it has crashed), this may block the
> whole process.  Explain this in a comment, rename the function to
> libxl__wait_for_device_model_deprecated, and explain what to use
> instead.
> 
> libxl__wait_for_offspring is the core implementation for the above.
> Its name leads people to think it might be generally useful for
> waiting for children, which is far from the case.  It only waits for
> xenstore.  Also it has the problems described above.  Explain this,
> rename it to libxl__xenstore_child_wait_deprecated, and explain what
> to use instead.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH RESEND] libxl: Deprecate synchronous waiting for the device model
  2013-11-12 17:25                 ` Ian Campbell
@ 2013-11-12 17:27                   ` Ian Jackson
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Jackson @ 2013-11-12 17:27 UTC (permalink / raw)
  To: Ian Campbell; +Cc: rshriram, Andrew Cooper, Stefano Stabellini, xen-devel

Ian Campbell writes ("Re: [PATCH RESEND] libxl: Deprecate synchronous waiting for the device model"):
> On Tue, 2013-11-12 at 16:58 +0000, Ian Jackson wrote:
> > From: Ian Jackson <ian.jackson@eu.citrix.com>
> > Date: Mon, 14 Oct 2013 17:26:01 +0100
> > Subject: [PATCH] libxl: Deprecate synchronous waiting for the device model
> > 
> > libxl__wait_for_device_model blocks, with the ctx lock held, waiting
> > for a response from the device model.  If the dm doesn't respond
> > quickly (for example, because it has crashed), this may block the
> > whole process.  Explain this in a comment, rename the function to
> > libxl__wait_for_device_model_deprecated, and explain what to use
> > instead.
> > 
> > libxl__wait_for_offspring is the core implementation for the above.
> > Its name leads people to think it might be generally useful for
> > waiting for children, which is far from the case.  It only waits for
> > xenstore.  Also it has the problems described above.  Explain this,
> > rename it to libxl__xenstore_child_wait_deprecated, and explain what
> > to use instead.
> > 
> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2013-11-12 17:27 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
2013-08-29 22:16 ` [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
2013-09-04 13:40   ` Ian Campbell
2013-10-09  2:14     ` Shriram Rajagopalan
2013-10-09  8:06       ` Ian Campbell
2013-10-09 16:27         ` Shriram Rajagopalan
2013-10-09 16:36           ` Ian Campbell
2013-08-29 22:16 ` [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts Shriram Rajagopalan
2013-09-04 13:50   ` Ian Campbell
2013-08-29 22:16 ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
2013-09-04 15:17   ` Ian Campbell
2013-10-14 14:23     ` Shriram Rajagopalan
2013-10-15 10:34       ` Ian Campbell
2013-09-04 16:16   ` Ian Jackson
2013-09-04 17:22     ` Shriram Rajagopalan
2013-10-09 16:32       ` Shriram Rajagopalan
2013-10-14 16:30         ` [PATCH] libxl: Deprecate synchronous waiting for the device model Ian Jackson
2013-11-04 16:50           ` Ian Campbell
2013-11-04 17:03             ` Ian Jackson
2013-11-12 16:58               ` [PATCH RESEND] " Ian Jackson
2013-11-12 17:25                 ` Ian Campbell
2013-11-12 17:27                   ` Ian Jackson
2013-10-14 16:34         ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering [and 1 more messages] Ian Jackson
2013-08-29 22:16 ` [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
2013-09-04 15:19   ` Ian Campbell
2013-09-05 11:25   ` Ian Jackson
2013-08-29 22:16 ` [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch Shriram Rajagopalan
2013-09-04 15:24   ` Ian Campbell
2013-09-04 13:21 ` [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
2013-09-04 13:27   ` Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.