All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal
@ 2014-09-25 14:58 David Scott
  2014-09-26 14:00 ` Ian Jackson
  2014-10-07 11:26 ` Jon Ludlam
  0 siblings, 2 replies; 4+ messages in thread
From: David Scott @ 2014-09-25 14:58 UTC (permalink / raw)
  To: xen-devel
  Cc: dev, ian.campbell, anil, Andrew.Cooper3, Jonathan.Ludlam,
	yanqiangjun, ian.jackson, David Scott, Paul.Durrant,
	john.liuqiming

Hvmloader uses the xenstore ring and then tries to reset it back
to its initial state before booting the guest. Occasionally xenstored
will read the ring while it is being zeroed and conclude it has
been corrupted. This prevents PV drivers from loading in the guest.

This patch updates the xenstore ring protocol definition, enabling
a server to advertise additional features to the guest. One such feature
is defined: the ability to cleanly reset the ring including the
higher-level protocol, like an enhanced RESET_WATCHES for rings.

This patch implements the ring reconnection features in oxenstored
and hvmloader, fixing the bug.

This patch also defines an 'invalid' xenstore packet type and uses this
to poison the ring over a reconnect. This will make diagnosing this
bug much easier in future.

Signed-off-by: David Scott <dave.scott@citrix.com>

---

Updates since v4
* reconnect now clears all ring state (including requests, watches and
  transactions) -- this is the API clients will want
* ocaml: hide the encoding of server features behind a type (thanks to
  Jon for mentioning this)
* docs: promise to advertise the feature before touching the queues.
  Now if the client has received a single reply, it will know that it's
  safe to check for the feature.

Updates since v3
* hvmloader: signal the event channel after requesting a reconnect
  (thanks to Zheng Li for diagnosing this)
* switch to using feature flags/bits instead of version numbers
* rename the 'closing' field to 'connection state'
* define an invalid packet type (0xffff, since 0x0 was already taken)
* on ring connection/reset, fill the input and output buffers with
  the invalid packet
* ocaml xs_ring_stubs.c: collapse both {read,write} functions into one,
  remove #define ERROR_FOO
* doc: write from the guest's point of view, weaken guarantees made by
  server (e.g. producer = consumer != 0 is valid after reconnect)
* doc: relate reconnection to the RESET_WATCHES command in the higher
  level protocol
* doc: be more specific about who must write what, when

Updates since v2
* hvmloader: use volatile for read of closing flag
* style improvements
* remove xenstore version #defines

Updates since v1
* remove unused client version and associated dead code
* treat 'closing' as a flag by using "!=0" rather than "==1"
* hvmloader: move function up and remove forward decl
* document the existing xenstore ring and the extention in misc/
---
 docs/misc/xenstore-ring.txt         |  116 +++++++++++++++++++++++++++++++++++
 tools/firmware/hvmloader/xenbus.c   |   48 +++++++++------
 tools/ocaml/libs/xb/xb.ml           |   21 +++++++
 tools/ocaml/libs/xb/xb.mli          |    4 +-
 tools/ocaml/libs/xb/xs_ring.ml      |   28 +++++++++
 tools/ocaml/libs/xb/xs_ring_stubs.c |  109 +++++++++++++++++++++++---------
 tools/ocaml/xenstored/connection.ml |   16 ++++-
 tools/ocaml/xenstored/process.ml    |   14 ++++-
 xen/include/public/io/xs_wire.h     |   13 +++-
 9 files changed, 316 insertions(+), 53 deletions(-)
 create mode 100644 docs/misc/xenstore-ring.txt

diff --git a/docs/misc/xenstore-ring.txt b/docs/misc/xenstore-ring.txt
new file mode 100644
index 0000000..16b4d0f
--- /dev/null
+++ b/docs/misc/xenstore-ring.txt
@@ -0,0 +1,116 @@
+The xenstore ring is a datastructure stored within a single 4KiB page
+shared between the xenstore server and the guest. The ring contains
+two queues of bytes -- one in each direction -- and some signalling
+information. The [xenstore protocol](xenstore.txt) is layered on top of
+the byte streams.
+
+The xenstore ring datastructure
+===============================
+
+The following table describes the ring structure where
+  - offsets and lengths are in bytes;
+  - "Input" is used to describe the data sent to the server; and
+  - "Output" is used to describe the data sent to the domain.
+
+Offset  Length  Description
+-----------------------------------------------------------------
+0       1024    Input data
+1024    1024    Output data
+2048    4       Input consumer offset
+2052    4       Input producer offset
+2056    4       Output consumer offset
+2060    4       Output producer offset
+2064    4       Server feature bitmap
+2068    4       Connection state
+
+The Input data and Output data are circular buffers. Each buffer is
+associated with a pair of free-running offsets labelled "consumer" and
+"producer".
+
+A "producer" offset is the offset in the byte stream of the next byte
+to be written modulo 2^32. A "consumer" offset is the offset in the byte
+stream of the next byte to be read modulo 2^32. Implementations must
+take care to handle wraparound properly when performing arithmetic with
+these values.
+
+The byte at offset 'x' in the byte stream will be stored at offset
+'x modulo 1024' in the circular buffer.
+
+Implementations may only overwrite previously-written data if it has
+been marked as 'consumed' by the relevant consumer pointer.
+
+When the guest domain is created, there is no outstanding Input or Output
+data. However
+
+  - guests must not assume that producer or consumer pointers start
+    at zero; and
+  - guests must not assume that unused bytes in either the Input or
+    Output data buffers has any particular value.
+
+A xenstore ring is always associated with an event channel. Whenever the
+ring structure is updated the event channel must be signalled. The
+guest and server are free to inspect the contents of the ring at any
+time, not only in response to an event channel event. This implies that
+updates must be ordered carefully to ensure consistency.
+
+The xenstore server may decide to advertise some features via the
+"Server feature bitmap". The server can start advertising features
+at any time by setting bits but it will never stop advertising features
+i.e. bits will never be cleared. The guest is not permitted to write to
+the server feature bitmap. The server features are offered to the guest;
+it is up to the guest whether to use them or not. The guest should ignore
+any unknown feature bits.
+
+The following features are defined:
+
+Mask    Description
+-----------------------------------------------------------------
+1       Ring reconnection (see the ring reconnection feature below)
+
+The "Connection state" field is used to request a ring close and reconnect.
+The "Connection state" field only contains valid data if the server has
+advertised the ring reconnection feature. If the feature has been advertised
+then the "Connection state" may take the following values:
+
+Value   Description
+-----------------------------------------------------------------
+0       Ring is connected
+1       Ring close and reconnect is in progress (see the "ring
+        reconnection feature" described below)
+
+The ring reconnection feature
+=============================
+
+The ring reconnection feature allows the guest to ask the server to
+reset the ring to a valid initial state i.e. one in which the Input
+and Output queues contain no data and there are no outstanding requests,
+watches or transactions.
+
+The ring reconnection feature is only available if the 'Ring reconnection'
+feature bit has been set by the server in the "Server feature bitmap".
+If a server supports ring reconnection, it will guarantee to advertise
+the feature before producing or consuming any data from the Input or Output
+queues.
+
+Assuming the server has advertised the feature, the guest can initiate
+a reconnection by setting the the Connection state to 1 ("Ring close
+and reconnect is in progress") and signalling the event channel.
+The guest must now ignore all fields except the Connection state and
+wait for it to be set to 0 ("Ring is connected")
+
+The server will guarantee to
+
+  - drop any partially read or written higher-level
+    [xenstore protocol](xenstore.txt) packets it may have;
+  - empty the Input and Output queues in the xenstore ring;
+  - discard any in-flight requests
+  - discard any watches associated with the connection
+  - discard any transactions associated with the connection
+  - set the Connection state to 0 ("Ring is connected"); and
+  - signal the event channel.
+
+From the point of view of the guest, the connection has been reset on a
+packet boundary.
+
+Note that only the guest may set the Connection state to 1 and only the
+server may set it back to 0.
diff --git a/tools/firmware/hvmloader/xenbus.c b/tools/firmware/hvmloader/xenbus.c
index 64c2176..f900a1e 100644
--- a/tools/firmware/hvmloader/xenbus.c
+++ b/tools/firmware/hvmloader/xenbus.c
@@ -37,6 +37,19 @@ static struct xenstore_domain_interface *rings; /* Shared ring with dom0 */
 static evtchn_port_t event;                     /* Event-channel to dom0 */
 static char payload[XENSTORE_PAYLOAD_MAX + 1];  /* Unmarshalling area */
 
+static void ring_wait(void)
+{
+    struct shared_info *shinfo = get_shared_info();
+    struct sched_poll poll;
+
+    memset(&poll, 0, sizeof(poll));
+    set_xen_guest_handle(poll.ports, &event);
+    poll.nr_ports = 1;
+
+    while ( !test_and_clear_bit(event, shinfo->evtchn_pending) )
+        hypercall_sched_op(SCHEDOP_poll, &poll);
+}
+
 /* Connect our xenbus client to the backend.
  * Call once, before any other xenbus actions. */
 void xenbus_setup(void)
@@ -61,14 +74,26 @@ void xenbus_setup(void)
 void xenbus_shutdown(void)
 {
     struct shared_info *shinfo = get_shared_info();
+    evtchn_send_t send;
 
     ASSERT(rings != NULL);
 
-    /* We zero out the whole ring -- the backend can handle this, and it's 
-     * not going to surprise any frontends since it's equivalent to never 
-     * having used the rings. */
-    memset(rings, 0, sizeof *rings);
-
+    if (rings->server_features & XENSTORE_SERVER_FEATURE_RECONNECTION) {
+        rings->connection = XENSTORE_RECONNECT;
+        send.port = event;
+        hypercall_event_channel_op(EVTCHNOP_send, &send);
+        while (*(volatile uint32_t*)&rings->connection == XENSTORE_RECONNECT)
+            ring_wait ();
+    } else {
+        /* If the backend reads the state while we're erasing it then the
+         * ring state will become corrupted, preventing guest frontends from
+         * connecting. This is rare. To help diagnose the failure, we fill
+         * the ring with XS_INVALID packets. */
+        memset(rings->req, 0xff, XENSTORE_RING_SIZE);
+        memset(rings->rsp, 0xff, XENSTORE_RING_SIZE);
+        rings->req_cons = rings->req_prod = 0;
+        rings->rsp_cons = rings->rsp_prod = 0;
+    }
     /* Clear the event-channel state too. */
     memset(shinfo->vcpu_info, 0, sizeof(shinfo->vcpu_info));
     memset(shinfo->evtchn_pending, 0, sizeof(shinfo->evtchn_pending));
@@ -77,19 +102,6 @@ void xenbus_shutdown(void)
     rings = NULL;
 }
 
-static void ring_wait(void)
-{
-    struct shared_info *shinfo = get_shared_info();
-    struct sched_poll poll;
-
-    memset(&poll, 0, sizeof(poll));
-    set_xen_guest_handle(poll.ports, &event);
-    poll.nr_ports = 1;
-
-    while ( !test_and_clear_bit(event, shinfo->evtchn_pending) )
-        hypercall_sched_op(SCHEDOP_poll, &poll);
-}
-
 /* Helper functions: copy data in and out of the ring */
 static void ring_write(const char *data, uint32_t len)
 {
diff --git a/tools/ocaml/libs/xb/xb.ml b/tools/ocaml/libs/xb/xb.ml
index 29d354d..50944b5 100644
--- a/tools/ocaml/libs/xb/xb.ml
+++ b/tools/ocaml/libs/xb/xb.ml
@@ -21,6 +21,10 @@ exception End_of_file
 exception Eagain
 exception Noent
 exception Invalid
+exception Reconnect
+
+let _ =
+  Callback.register_exception "Xb.Reconnect" Reconnect
 
 type backend_mmap =
 {
@@ -50,6 +54,19 @@ type t =
 let init_partial_in () = NoHdr
 	(Partial.header_size (), String.make (Partial.header_size()) '\000')
 
+let reconnect t = match t.backend with
+	| Fd _ ->
+		(* should never happen, so close the connection *)
+		raise End_of_file
+	| Xenmmap backend ->
+		Xs_ring.close backend.mmap;
+		backend.eventchn_notify ();
+		(* Clear our old connection state *)
+		Queue.clear t.pkt_in;
+		Queue.clear t.pkt_out;
+		t.partial_in <- init_partial_in ();
+		t.partial_out <- ""
+
 let queue con pkt = Queue.push pkt con.pkt_out
 
 let read_fd back con s len =
@@ -84,6 +101,7 @@ let write con s len =
 	| Fd backfd     -> write_fd backfd con s len
 	| Xenmmap backmmap -> write_mmap backmmap con s len
 
+(* NB: can throw Reconnect *)
 let output con =
 	(* get the output string from a string_of(packet) or partial_out *)
 	let s = if String.length con.partial_out > 0 then
@@ -102,6 +120,7 @@ let output con =
 	(* after sending one packet, partial is empty *)
 	con.partial_out = ""
 
+(* NB: can throw Reconnect *)
 let input con =
 	let newpacket = ref false in
 	let to_read =
@@ -145,6 +164,8 @@ let newcon backend = {
 let open_fd fd = newcon (Fd { fd = fd; })
 
 let open_mmap mmap notifyfct =
+	(* Advertise XENSTORE_SERVER_FEATURE_RECONNECTION *)
+	Xs_ring.set_server_features mmap (Xs_ring.Server_features.singleton Xs_ring.Server_feature.Reconnection);
 	newcon (Xenmmap {
 		mmap = mmap;
 		eventchn_notify = notifyfct;
diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli
index 58234ae..4e1f833 100644
--- a/tools/ocaml/libs/xb/xb.mli
+++ b/tools/ocaml/libs/xb/xb.mli
@@ -23,7 +23,7 @@ module Op :
       | Resume
       | Set_target
       | Restrict
-      | Invalid (* Not a valid wire operation *)
+      | Invalid
     val operation_c_mapping : operation array
     val size : int
     val array_search : 'a -> 'a array -> int
@@ -57,6 +57,7 @@ exception End_of_file
 exception Eagain
 exception Noent
 exception Invalid
+exception Reconnect
 type backend_mmap = {
   mmap : Xenmmap.mmap_interface;
   eventchn_notify : unit -> unit;
@@ -73,6 +74,7 @@ type t = {
   mutable partial_out : string;
 }
 val init_partial_in : unit -> partial_buf
+val reconnect : t -> unit
 val queue : t -> Packet.t -> unit
 val read_fd : backend_fd -> 'a -> string -> int -> int
 val read_mmap : backend_mmap -> 'a -> string -> int -> int
diff --git a/tools/ocaml/libs/xb/xs_ring.ml b/tools/ocaml/libs/xb/xs_ring.ml
index 9469609..48e06f4 100644
--- a/tools/ocaml/libs/xb/xs_ring.ml
+++ b/tools/ocaml/libs/xb/xs_ring.ml
@@ -14,5 +14,33 @@
  * GNU Lesser General Public License for more details.
  *)
 
+module Server_feature = struct
+	type t =
+	| Reconnection
+end
+
+module Server_features = Set.Make(struct
+	type t = Server_feature.t
+	let compare = compare
+end)
+
 external read: Xenmmap.mmap_interface -> string -> int -> int = "ml_interface_read"
 external write: Xenmmap.mmap_interface -> string -> int -> int = "ml_interface_write"
+
+external _internal_set_server_features: Xenmmap.mmap_interface -> int -> unit = "ml_interface_set_server_features" "noalloc"
+external _internal_get_server_features: Xenmmap.mmap_interface -> int = "ml_interface_get_server_features" "noalloc"
+
+
+let get_server_features mmap =
+	(* NB only one feature currently defined above *)
+	let x = _internal_get_server_features mmap in
+	if x = 0
+	then Server_features.empty
+	else Server_features.singleton Server_feature.Reconnection
+
+let set_server_features mmap set =
+	(* NB only one feature currently defined above *)
+	let x = if set = Server_features.empty then 0 else 1 in
+	_internal_set_server_features mmap x
+
+external close: Xenmmap.mmap_interface -> unit = "ml_interface_close" "noalloc"
diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c b/tools/ocaml/libs/xb/xs_ring_stubs.c
index 17656c2..fc9b0c5 100644
--- a/tools/ocaml/libs/xb/xs_ring_stubs.c
+++ b/tools/ocaml/libs/xb/xs_ring_stubs.c
@@ -36,22 +36,39 @@
 
 #define GET_C_STRUCT(a) ((struct mmap_interface *) a)
 
-static int xs_ring_read(struct mmap_interface *interface,
-                             char *buffer, int len)
+CAMLprim value ml_interface_read(value ml_interface,
+                                 value ml_buffer,
+                                 value ml_len)
 {
+	CAMLparam3(ml_interface, ml_buffer, ml_len);
+	CAMLlocal1(ml_result);
+
+	struct mmap_interface *interface = GET_C_STRUCT(ml_interface);
+	char *buffer = String_val(ml_buffer);
+	int len = Int_val(ml_len);
+	int result;
+
 	struct xenstore_domain_interface *intf = interface->addr;
 	XENSTORE_RING_IDX cons, prod; /* offsets only */
 	int to_read;
+	uint32_t connection;
 
 	cons = *(volatile uint32_t*)&intf->req_cons;
 	prod = *(volatile uint32_t*)&intf->req_prod;
+	connection = *(volatile uint32*)&intf->connection;
+
+	if (connection != XENSTORE_CONNECTED)
+		caml_raise_constant(*caml_named_value("Xb.Reconnect"));
+
 	xen_mb();
 
 	if ((prod - cons) > XENSTORE_RING_SIZE)
-	    return -1;
+		caml_failwith("bad connection");
 
-	if (prod == cons)
-		return 0;
+	if (prod == cons) {
+		result = 0;
+		goto exit;
+	}
 	cons = MASK_XENSTORE_IDX(cons);
 	prod = MASK_XENSTORE_IDX(prod);
 	if (prod > cons)
@@ -63,21 +80,41 @@ static int xs_ring_read(struct mmap_interface *interface,
 	memcpy(buffer, intf->req + cons, len);
 	xen_mb();
 	intf->req_cons += len;
-	return len;
+	result = len;
+exit:
+	ml_result = Val_int(result);
+	CAMLreturn(ml_result);
 }
 
-static int xs_ring_write(struct mmap_interface *interface,
-                              char *buffer, int len)
+CAMLprim value ml_interface_write(value ml_interface,
+                                  value ml_buffer,
+                                  value ml_len)
 {
+	CAMLparam3(ml_interface, ml_buffer, ml_len);
+	CAMLlocal1(ml_result);
+
+	struct mmap_interface *interface = GET_C_STRUCT(ml_interface);
+	char *buffer = String_val(ml_buffer);
+	int len = Int_val(ml_len);
+	int result;
+
 	struct xenstore_domain_interface *intf = interface->addr;
 	XENSTORE_RING_IDX cons, prod;
 	int can_write;
+	uint32_t connection;
 
 	cons = *(volatile uint32_t*)&intf->rsp_cons;
 	prod = *(volatile uint32_t*)&intf->rsp_prod;
+	connection = *(volatile uint32*)&intf->connection;
+
+	if (connection != XENSTORE_CONNECTED)
+		caml_raise_constant(*caml_named_value("Xb.Reconnect"));
+
 	xen_mb();
-	if ( (prod - cons) >= XENSTORE_RING_SIZE )
-		return 0;
+	if ( (prod - cons) >= XENSTORE_RING_SIZE ) {
+		result = 0;
+		goto exit;
+	}
 	if (MASK_XENSTORE_IDX(prod) >= MASK_XENSTORE_IDX(cons))
 		can_write = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod);
 	else 
@@ -87,31 +124,43 @@ static int xs_ring_write(struct mmap_interface *interface,
 	memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, len);
 	xen_mb();
 	intf->rsp_prod += len;
-	return len;
+	result = len;
+exit:
+	ml_result = Val_int(result);
+	CAMLreturn(ml_result);
 }
 
-CAMLprim value ml_interface_read(value interface, value buffer, value len)
+CAMLprim value ml_interface_set_server_features(value interface, value v)
 {
-	CAMLparam3(interface, buffer, len);
-	CAMLlocal1(result);
-	int res;
+	CAMLparam2(interface, v);
+	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
 
-	res = xs_ring_read(GET_C_STRUCT(interface),
-	                   String_val(buffer), Int_val(len));
-	if (res == -1)
-		caml_failwith("bad connection");
-	result = Val_int(res);
-	CAMLreturn(result);
+	intf->server_features = Int_val(v);
+
+	CAMLreturn(Val_unit);
+}
+
+CAMLprim value ml_interface_get_server_features(value interface)
+{
+	CAMLparam1(interface);
+	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
+
+	CAMLreturn(Val_int (intf->server_features));
 }
 
-CAMLprim value ml_interface_write(value interface, value buffer, value len)
+CAMLprim value ml_interface_close(value interface)
 {
-	CAMLparam3(interface, buffer, len);
-	CAMLlocal1(result);
-	int res;
-
-	res = xs_ring_write(GET_C_STRUCT(interface),
-	                    String_val(buffer), Int_val(len));
-	result = Val_int(res);
-	CAMLreturn(result);
+	CAMLparam1(interface);
+	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
+	int i;
+
+	intf->req_cons = intf->req_prod = intf->rsp_cons = intf->rsp_prod = 0;
+	/* Ensure the unused space is full of invalid xenstore packets. */
+	for (i = 0; i < XENSTORE_RING_SIZE; i++) {
+		intf->req[i] = 0xff; /* XS_INVALID = 0xffff */
+		intf->rsp[i] = 0xff;
+	}
+	xen_mb ();
+	intf->connection = XENSTORE_CONNECTED;
+	CAMLreturn(Val_unit);
 }
diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 47695f8..e2fa4e9 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -45,6 +45,20 @@ let mark_as_bad con =
 	|None -> ()
 	| Some domain -> Domain.mark_as_bad domain
 
+let initial_next_tid = 1
+
+let reconnect con =
+	Xenbus.Xb.reconnect con.xb;
+	(* dom is the same *)
+	Hashtbl.clear con.transactions;
+	con.next_tid <- initial_next_tid;
+	Hashtbl.clear con.watches;
+	(* anonid is the same *)
+	con.nb_watches <- 0;
+	con.stat_nb_ops <- 0;
+	(* perm is the same *)
+	()
+
 let get_path con =
 Printf.sprintf "/local/domain/%i/" (match con.dom with None -> 0 | Some d -> Domain.get_id d)
 
@@ -89,7 +103,7 @@ let create xbcon dom =
 	xb = xbcon;
 	dom = dom;
 	transactions = Hashtbl.create 5;
-	next_tid = 1;
+	next_tid = initial_next_tid;
 	watches = Hashtbl.create 8;
 	nb_watches = 0;
 	anonid = id;
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index 89db56c..0620585 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -377,7 +377,12 @@ let do_input store cons doms con =
 	let newpacket =
 		try
 			Connection.do_input con
-		with Failure exp ->
+		with Xenbus.Xb.Reconnect ->
+			info "%s requests a reconnect" (Connection.get_domstr con);
+			Connection.reconnect con;
+			info "%s reconnection complete" (Connection.get_domstr con);
+			false
+		| Failure exp ->
 			error "caught exception %s" exp;
 			error "got a bad client %s" (sprintf "%-8s" (Connection.get_domstr con));
 			Connection.mark_as_bad con;
@@ -407,6 +412,11 @@ let do_output store cons doms con =
 			         (Xenbus.Xb.Op.to_string ty) (sanitize_data data);*)
 			write_answer_log ~ty ~tid ~con ~data;
 		);
-		ignore (Connection.do_output con)
+		try
+			ignore (Connection.do_output con)
+		with Xenbus.Xb.Reconnect ->
+			info "%s requests a reconnect" (Connection.get_domstr con);
+			Connection.reconnect con;
+			info "%s reconnection complete" (Connection.get_domstr con)
 	)
 
diff --git a/xen/include/public/io/xs_wire.h b/xen/include/public/io/xs_wire.h
index 585f0c8..0a0cdbc 100644
--- a/xen/include/public/io/xs_wire.h
+++ b/xen/include/public/io/xs_wire.h
@@ -49,7 +49,9 @@ enum xsd_sockmsg_type
     XS_RESUME,
     XS_SET_TARGET,
     XS_RESTRICT,
-    XS_RESET_WATCHES
+    XS_RESET_WATCHES,
+
+    XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
 };
 
 #define XS_WRITE_NONE "NONE"
@@ -116,6 +118,8 @@ struct xenstore_domain_interface {
     char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
     XENSTORE_RING_IDX req_cons, req_prod;
     XENSTORE_RING_IDX rsp_cons, rsp_prod;
+    uint32_t server_features; /* Bitmap of features supported by the server */
+    uint32_t connection;
 };
 
 /* Violating this is very bad.  See docs/misc/xenstore.txt. */
@@ -125,6 +129,13 @@ struct xenstore_domain_interface {
 #define XENSTORE_ABS_PATH_MAX 3072
 #define XENSTORE_REL_PATH_MAX 2048
 
+/* The ability to reconnect a ring */
+#define XENSTORE_SERVER_FEATURE_RECONNECTION 1
+
+/* Valid values for the connection field */
+#define XENSTORE_CONNECTED 0 /* the steady-state */
+#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
+
 #endif /* _XS_WIRE_H */
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal
  2014-09-25 14:58 [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal David Scott
@ 2014-09-26 14:00 ` Ian Jackson
  2014-09-26 14:06   ` Dave Scott
  2014-10-07 11:26 ` Jon Ludlam
  1 sibling, 1 reply; 4+ messages in thread
From: Ian Jackson @ 2014-09-26 14:00 UTC (permalink / raw)
  To: David Scott
  Cc: dev, ian.campbell, anil, Andrew.Cooper3, Jonathan.Ludlam,
	yanqiangjun, Paul.Durrant, john.liuqiming, xen-devel

David Scott writes ("[PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal"):
> Hvmloader uses the xenstore ring and then tries to reset it back
> to its initial state before booting the guest. Occasionally xenstored
> will read the ring while it is being zeroed and conclude it has
> been corrupted. This prevents PV drivers from loading in the guest.
> 
> This patch updates the xenstore ring protocol definition, enabling
> a server to advertise additional features to the guest. One such feature
> is defined: the ability to cleanly reset the ring including the
> higher-level protocol, like an enhanced RESET_WATCHES for rings.
> 
> This patch implements the ring reconnection features in oxenstored
> and hvmloader, fixing the bug.
> 
> This patch also defines an 'invalid' xenstore packet type and uses this
> to poison the ring over a reconnect. This will make diagnosing this
> bug much easier in future.
> 
> Signed-off-by: David Scott <dave.scott@citrix.com>

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

I'm happy to take this from you without really reviewing the ocaml
code.

To be applied this needs a freeze exception.  You should make the case
(explaining why the feature is important and why it is low risk) to
Konrad.

Ian.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal
  2014-09-26 14:00 ` Ian Jackson
@ 2014-09-26 14:06   ` Dave Scott
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Scott @ 2014-09-26 14:06 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Zheng Li, Ian Campbell, Anil Madhavapeddy, Andrew Cooper,
	Jonathan Ludlam, yanqiangjun, Paul Durrant, john.liuqiming,
	xen-devel


On 26 Sep 2014, at 15:00, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:

> David Scott writes ("[PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal"):
>> Hvmloader uses the xenstore ring and then tries to reset it back
>> to its initial state before booting the guest. Occasionally xenstored
>> will read the ring while it is being zeroed and conclude it has
>> been corrupted. This prevents PV drivers from loading in the guest.
>> 
>> This patch updates the xenstore ring protocol definition, enabling
>> a server to advertise additional features to the guest. One such feature
>> is defined: the ability to cleanly reset the ring including the
>> higher-level protocol, like an enhanced RESET_WATCHES for rings.
>> 
>> This patch implements the ring reconnection features in oxenstored
>> and hvmloader, fixing the bug.
>> 
>> This patch also defines an 'invalid' xenstore packet type and uses this
>> to poison the ring over a reconnect. This will make diagnosing this
>> bug much easier in future.
>> 
>> Signed-off-by: David Scott <dave.scott@citrix.com>
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Thanks!

> I'm happy to take this from you without really reviewing the ocaml
> code.

Jon: could you take another look at the OCaml code? It’s slightly modified
from the previous one you reviewed. This version allows the Reconnect
exception to bubble up to the oxenstored connection handler and allows
oxenstored to reset all the state (ring, requests, watches, transactions
etc)
 
> To be applied this needs a freeze exception.  You should make the case
> (explaining why the feature is important and why it is low risk) to
> Konrad.

Understood.

Cheers,
Dave

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal
  2014-09-25 14:58 [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal David Scott
  2014-09-26 14:00 ` Ian Jackson
@ 2014-10-07 11:26 ` Jon Ludlam
  1 sibling, 0 replies; 4+ messages in thread
From: Jon Ludlam @ 2014-10-07 11:26 UTC (permalink / raw)
  To: David Scott, xen-devel
  Cc: dev, ian.campbell, anil, Andrew.Cooper3, Jonathan.Ludlam,
	yanqiangjun, ian.jackson, Paul.Durrant, john.liuqiming

Handling the exception higher up looks a fair bit nicer, and my previous
comment has been addressed, so it looks fine to me.

Reviewed-by: Jon Ludlam <jonathan.ludlam@citrix.com>


On 25/09/14 15:58, David Scott wrote:
> Hvmloader uses the xenstore ring and then tries to reset it back
> to its initial state before booting the guest. Occasionally xenstored
> will read the ring while it is being zeroed and conclude it has
> been corrupted. This prevents PV drivers from loading in the guest.
>
> This patch updates the xenstore ring protocol definition, enabling
> a server to advertise additional features to the guest. One such feature
> is defined: the ability to cleanly reset the ring including the
> higher-level protocol, like an enhanced RESET_WATCHES for rings.
>
> This patch implements the ring reconnection features in oxenstored
> and hvmloader, fixing the bug.
>
> This patch also defines an 'invalid' xenstore packet type and uses this
> to poison the ring over a reconnect. This will make diagnosing this
> bug much easier in future.
>
> Signed-off-by: David Scott <dave.scott@citrix.com>
>
> ---
>
> Updates since v4
> * reconnect now clears all ring state (including requests, watches and
>   transactions) -- this is the API clients will want
> * ocaml: hide the encoding of server features behind a type (thanks to
>   Jon for mentioning this)
> * docs: promise to advertise the feature before touching the queues.
>   Now if the client has received a single reply, it will know that it's
>   safe to check for the feature.
>
> Updates since v3
> * hvmloader: signal the event channel after requesting a reconnect
>   (thanks to Zheng Li for diagnosing this)
> * switch to using feature flags/bits instead of version numbers
> * rename the 'closing' field to 'connection state'
> * define an invalid packet type (0xffff, since 0x0 was already taken)
> * on ring connection/reset, fill the input and output buffers with
>   the invalid packet
> * ocaml xs_ring_stubs.c: collapse both {read,write} functions into one,
>   remove #define ERROR_FOO
> * doc: write from the guest's point of view, weaken guarantees made by
>   server (e.g. producer = consumer != 0 is valid after reconnect)
> * doc: relate reconnection to the RESET_WATCHES command in the higher
>   level protocol
> * doc: be more specific about who must write what, when
>
> Updates since v2
> * hvmloader: use volatile for read of closing flag
> * style improvements
> * remove xenstore version #defines
>
> Updates since v1
> * remove unused client version and associated dead code
> * treat 'closing' as a flag by using "!=0" rather than "==1"
> * hvmloader: move function up and remove forward decl
> * document the existing xenstore ring and the extention in misc/
> ---
>  docs/misc/xenstore-ring.txt         |  116 +++++++++++++++++++++++++++++++++++
>  tools/firmware/hvmloader/xenbus.c   |   48 +++++++++------
>  tools/ocaml/libs/xb/xb.ml           |   21 +++++++
>  tools/ocaml/libs/xb/xb.mli          |    4 +-
>  tools/ocaml/libs/xb/xs_ring.ml      |   28 +++++++++
>  tools/ocaml/libs/xb/xs_ring_stubs.c |  109 +++++++++++++++++++++++---------
>  tools/ocaml/xenstored/connection.ml |   16 ++++-
>  tools/ocaml/xenstored/process.ml    |   14 ++++-
>  xen/include/public/io/xs_wire.h     |   13 +++-
>  9 files changed, 316 insertions(+), 53 deletions(-)
>  create mode 100644 docs/misc/xenstore-ring.txt
>
> diff --git a/docs/misc/xenstore-ring.txt b/docs/misc/xenstore-ring.txt
> new file mode 100644
> index 0000000..16b4d0f
> --- /dev/null
> +++ b/docs/misc/xenstore-ring.txt
> @@ -0,0 +1,116 @@
> +The xenstore ring is a datastructure stored within a single 4KiB page
> +shared between the xenstore server and the guest. The ring contains
> +two queues of bytes -- one in each direction -- and some signalling
> +information. The [xenstore protocol](xenstore.txt) is layered on top of
> +the byte streams.
> +
> +The xenstore ring datastructure
> +===============================
> +
> +The following table describes the ring structure where
> +  - offsets and lengths are in bytes;
> +  - "Input" is used to describe the data sent to the server; and
> +  - "Output" is used to describe the data sent to the domain.
> +
> +Offset  Length  Description
> +-----------------------------------------------------------------
> +0       1024    Input data
> +1024    1024    Output data
> +2048    4       Input consumer offset
> +2052    4       Input producer offset
> +2056    4       Output consumer offset
> +2060    4       Output producer offset
> +2064    4       Server feature bitmap
> +2068    4       Connection state
> +
> +The Input data and Output data are circular buffers. Each buffer is
> +associated with a pair of free-running offsets labelled "consumer" and
> +"producer".
> +
> +A "producer" offset is the offset in the byte stream of the next byte
> +to be written modulo 2^32. A "consumer" offset is the offset in the byte
> +stream of the next byte to be read modulo 2^32. Implementations must
> +take care to handle wraparound properly when performing arithmetic with
> +these values.
> +
> +The byte at offset 'x' in the byte stream will be stored at offset
> +'x modulo 1024' in the circular buffer.
> +
> +Implementations may only overwrite previously-written data if it has
> +been marked as 'consumed' by the relevant consumer pointer.
> +
> +When the guest domain is created, there is no outstanding Input or Output
> +data. However
> +
> +  - guests must not assume that producer or consumer pointers start
> +    at zero; and
> +  - guests must not assume that unused bytes in either the Input or
> +    Output data buffers has any particular value.
> +
> +A xenstore ring is always associated with an event channel. Whenever the
> +ring structure is updated the event channel must be signalled. The
> +guest and server are free to inspect the contents of the ring at any
> +time, not only in response to an event channel event. This implies that
> +updates must be ordered carefully to ensure consistency.
> +
> +The xenstore server may decide to advertise some features via the
> +"Server feature bitmap". The server can start advertising features
> +at any time by setting bits but it will never stop advertising features
> +i.e. bits will never be cleared. The guest is not permitted to write to
> +the server feature bitmap. The server features are offered to the guest;
> +it is up to the guest whether to use them or not. The guest should ignore
> +any unknown feature bits.
> +
> +The following features are defined:
> +
> +Mask    Description
> +-----------------------------------------------------------------
> +1       Ring reconnection (see the ring reconnection feature below)
> +
> +The "Connection state" field is used to request a ring close and reconnect.
> +The "Connection state" field only contains valid data if the server has
> +advertised the ring reconnection feature. If the feature has been advertised
> +then the "Connection state" may take the following values:
> +
> +Value   Description
> +-----------------------------------------------------------------
> +0       Ring is connected
> +1       Ring close and reconnect is in progress (see the "ring
> +        reconnection feature" described below)
> +
> +The ring reconnection feature
> +=============================
> +
> +The ring reconnection feature allows the guest to ask the server to
> +reset the ring to a valid initial state i.e. one in which the Input
> +and Output queues contain no data and there are no outstanding requests,
> +watches or transactions.
> +
> +The ring reconnection feature is only available if the 'Ring reconnection'
> +feature bit has been set by the server in the "Server feature bitmap".
> +If a server supports ring reconnection, it will guarantee to advertise
> +the feature before producing or consuming any data from the Input or Output
> +queues.
> +
> +Assuming the server has advertised the feature, the guest can initiate
> +a reconnection by setting the the Connection state to 1 ("Ring close
> +and reconnect is in progress") and signalling the event channel.
> +The guest must now ignore all fields except the Connection state and
> +wait for it to be set to 0 ("Ring is connected")
> +
> +The server will guarantee to
> +
> +  - drop any partially read or written higher-level
> +    [xenstore protocol](xenstore.txt) packets it may have;
> +  - empty the Input and Output queues in the xenstore ring;
> +  - discard any in-flight requests
> +  - discard any watches associated with the connection
> +  - discard any transactions associated with the connection
> +  - set the Connection state to 0 ("Ring is connected"); and
> +  - signal the event channel.
> +
> +From the point of view of the guest, the connection has been reset on a
> +packet boundary.
> +
> +Note that only the guest may set the Connection state to 1 and only the
> +server may set it back to 0.
> diff --git a/tools/firmware/hvmloader/xenbus.c b/tools/firmware/hvmloader/xenbus.c
> index 64c2176..f900a1e 100644
> --- a/tools/firmware/hvmloader/xenbus.c
> +++ b/tools/firmware/hvmloader/xenbus.c
> @@ -37,6 +37,19 @@ static struct xenstore_domain_interface *rings; /* Shared ring with dom0 */
>  static evtchn_port_t event;                     /* Event-channel to dom0 */
>  static char payload[XENSTORE_PAYLOAD_MAX + 1];  /* Unmarshalling area */
>  
> +static void ring_wait(void)
> +{
> +    struct shared_info *shinfo = get_shared_info();
> +    struct sched_poll poll;
> +
> +    memset(&poll, 0, sizeof(poll));
> +    set_xen_guest_handle(poll.ports, &event);
> +    poll.nr_ports = 1;
> +
> +    while ( !test_and_clear_bit(event, shinfo->evtchn_pending) )
> +        hypercall_sched_op(SCHEDOP_poll, &poll);
> +}
> +
>  /* Connect our xenbus client to the backend.
>   * Call once, before any other xenbus actions. */
>  void xenbus_setup(void)
> @@ -61,14 +74,26 @@ void xenbus_setup(void)
>  void xenbus_shutdown(void)
>  {
>      struct shared_info *shinfo = get_shared_info();
> +    evtchn_send_t send;
>  
>      ASSERT(rings != NULL);
>  
> -    /* We zero out the whole ring -- the backend can handle this, and it's 
> -     * not going to surprise any frontends since it's equivalent to never 
> -     * having used the rings. */
> -    memset(rings, 0, sizeof *rings);
> -
> +    if (rings->server_features & XENSTORE_SERVER_FEATURE_RECONNECTION) {
> +        rings->connection = XENSTORE_RECONNECT;
> +        send.port = event;
> +        hypercall_event_channel_op(EVTCHNOP_send, &send);
> +        while (*(volatile uint32_t*)&rings->connection == XENSTORE_RECONNECT)
> +            ring_wait ();
> +    } else {
> +        /* If the backend reads the state while we're erasing it then the
> +         * ring state will become corrupted, preventing guest frontends from
> +         * connecting. This is rare. To help diagnose the failure, we fill
> +         * the ring with XS_INVALID packets. */
> +        memset(rings->req, 0xff, XENSTORE_RING_SIZE);
> +        memset(rings->rsp, 0xff, XENSTORE_RING_SIZE);
> +        rings->req_cons = rings->req_prod = 0;
> +        rings->rsp_cons = rings->rsp_prod = 0;
> +    }
>      /* Clear the event-channel state too. */
>      memset(shinfo->vcpu_info, 0, sizeof(shinfo->vcpu_info));
>      memset(shinfo->evtchn_pending, 0, sizeof(shinfo->evtchn_pending));
> @@ -77,19 +102,6 @@ void xenbus_shutdown(void)
>      rings = NULL;
>  }
>  
> -static void ring_wait(void)
> -{
> -    struct shared_info *shinfo = get_shared_info();
> -    struct sched_poll poll;
> -
> -    memset(&poll, 0, sizeof(poll));
> -    set_xen_guest_handle(poll.ports, &event);
> -    poll.nr_ports = 1;
> -
> -    while ( !test_and_clear_bit(event, shinfo->evtchn_pending) )
> -        hypercall_sched_op(SCHEDOP_poll, &poll);
> -}
> -
>  /* Helper functions: copy data in and out of the ring */
>  static void ring_write(const char *data, uint32_t len)
>  {
> diff --git a/tools/ocaml/libs/xb/xb.ml b/tools/ocaml/libs/xb/xb.ml
> index 29d354d..50944b5 100644
> --- a/tools/ocaml/libs/xb/xb.ml
> +++ b/tools/ocaml/libs/xb/xb.ml
> @@ -21,6 +21,10 @@ exception End_of_file
>  exception Eagain
>  exception Noent
>  exception Invalid
> +exception Reconnect
> +
> +let _ =
> +  Callback.register_exception "Xb.Reconnect" Reconnect
>  
>  type backend_mmap =
>  {
> @@ -50,6 +54,19 @@ type t =
>  let init_partial_in () = NoHdr
>  	(Partial.header_size (), String.make (Partial.header_size()) '\000')
>  
> +let reconnect t = match t.backend with
> +	| Fd _ ->
> +		(* should never happen, so close the connection *)
> +		raise End_of_file
> +	| Xenmmap backend ->
> +		Xs_ring.close backend.mmap;
> +		backend.eventchn_notify ();
> +		(* Clear our old connection state *)
> +		Queue.clear t.pkt_in;
> +		Queue.clear t.pkt_out;
> +		t.partial_in <- init_partial_in ();
> +		t.partial_out <- ""
> +
>  let queue con pkt = Queue.push pkt con.pkt_out
>  
>  let read_fd back con s len =
> @@ -84,6 +101,7 @@ let write con s len =
>  	| Fd backfd     -> write_fd backfd con s len
>  	| Xenmmap backmmap -> write_mmap backmmap con s len
>  
> +(* NB: can throw Reconnect *)
>  let output con =
>  	(* get the output string from a string_of(packet) or partial_out *)
>  	let s = if String.length con.partial_out > 0 then
> @@ -102,6 +120,7 @@ let output con =
>  	(* after sending one packet, partial is empty *)
>  	con.partial_out = ""
>  
> +(* NB: can throw Reconnect *)
>  let input con =
>  	let newpacket = ref false in
>  	let to_read =
> @@ -145,6 +164,8 @@ let newcon backend = {
>  let open_fd fd = newcon (Fd { fd = fd; })
>  
>  let open_mmap mmap notifyfct =
> +	(* Advertise XENSTORE_SERVER_FEATURE_RECONNECTION *)
> +	Xs_ring.set_server_features mmap (Xs_ring.Server_features.singleton Xs_ring.Server_feature.Reconnection);
>  	newcon (Xenmmap {
>  		mmap = mmap;
>  		eventchn_notify = notifyfct;
> diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli
> index 58234ae..4e1f833 100644
> --- a/tools/ocaml/libs/xb/xb.mli
> +++ b/tools/ocaml/libs/xb/xb.mli
> @@ -23,7 +23,7 @@ module Op :
>        | Resume
>        | Set_target
>        | Restrict
> -      | Invalid (* Not a valid wire operation *)
> +      | Invalid
>      val operation_c_mapping : operation array
>      val size : int
>      val array_search : 'a -> 'a array -> int
> @@ -57,6 +57,7 @@ exception End_of_file
>  exception Eagain
>  exception Noent
>  exception Invalid
> +exception Reconnect
>  type backend_mmap = {
>    mmap : Xenmmap.mmap_interface;
>    eventchn_notify : unit -> unit;
> @@ -73,6 +74,7 @@ type t = {
>    mutable partial_out : string;
>  }
>  val init_partial_in : unit -> partial_buf
> +val reconnect : t -> unit
>  val queue : t -> Packet.t -> unit
>  val read_fd : backend_fd -> 'a -> string -> int -> int
>  val read_mmap : backend_mmap -> 'a -> string -> int -> int
> diff --git a/tools/ocaml/libs/xb/xs_ring.ml b/tools/ocaml/libs/xb/xs_ring.ml
> index 9469609..48e06f4 100644
> --- a/tools/ocaml/libs/xb/xs_ring.ml
> +++ b/tools/ocaml/libs/xb/xs_ring.ml
> @@ -14,5 +14,33 @@
>   * GNU Lesser General Public License for more details.
>   *)
>  
> +module Server_feature = struct
> +	type t =
> +	| Reconnection
> +end
> +
> +module Server_features = Set.Make(struct
> +	type t = Server_feature.t
> +	let compare = compare
> +end)
> +
>  external read: Xenmmap.mmap_interface -> string -> int -> int = "ml_interface_read"
>  external write: Xenmmap.mmap_interface -> string -> int -> int = "ml_interface_write"
> +
> +external _internal_set_server_features: Xenmmap.mmap_interface -> int -> unit = "ml_interface_set_server_features" "noalloc"
> +external _internal_get_server_features: Xenmmap.mmap_interface -> int = "ml_interface_get_server_features" "noalloc"
> +
> +
> +let get_server_features mmap =
> +	(* NB only one feature currently defined above *)
> +	let x = _internal_get_server_features mmap in
> +	if x = 0
> +	then Server_features.empty
> +	else Server_features.singleton Server_feature.Reconnection
> +
> +let set_server_features mmap set =
> +	(* NB only one feature currently defined above *)
> +	let x = if set = Server_features.empty then 0 else 1 in
> +	_internal_set_server_features mmap x
> +
> +external close: Xenmmap.mmap_interface -> unit = "ml_interface_close" "noalloc"
> diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c b/tools/ocaml/libs/xb/xs_ring_stubs.c
> index 17656c2..fc9b0c5 100644
> --- a/tools/ocaml/libs/xb/xs_ring_stubs.c
> +++ b/tools/ocaml/libs/xb/xs_ring_stubs.c
> @@ -36,22 +36,39 @@
>  
>  #define GET_C_STRUCT(a) ((struct mmap_interface *) a)
>  
> -static int xs_ring_read(struct mmap_interface *interface,
> -                             char *buffer, int len)
> +CAMLprim value ml_interface_read(value ml_interface,
> +                                 value ml_buffer,
> +                                 value ml_len)
>  {
> +	CAMLparam3(ml_interface, ml_buffer, ml_len);
> +	CAMLlocal1(ml_result);
> +
> +	struct mmap_interface *interface = GET_C_STRUCT(ml_interface);
> +	char *buffer = String_val(ml_buffer);
> +	int len = Int_val(ml_len);
> +	int result;
> +
>  	struct xenstore_domain_interface *intf = interface->addr;
>  	XENSTORE_RING_IDX cons, prod; /* offsets only */
>  	int to_read;
> +	uint32_t connection;
>  
>  	cons = *(volatile uint32_t*)&intf->req_cons;
>  	prod = *(volatile uint32_t*)&intf->req_prod;
> +	connection = *(volatile uint32*)&intf->connection;
> +
> +	if (connection != XENSTORE_CONNECTED)
> +		caml_raise_constant(*caml_named_value("Xb.Reconnect"));
> +
>  	xen_mb();
>  
>  	if ((prod - cons) > XENSTORE_RING_SIZE)
> -	    return -1;
> +		caml_failwith("bad connection");
>  
> -	if (prod == cons)
> -		return 0;
> +	if (prod == cons) {
> +		result = 0;
> +		goto exit;
> +	}
>  	cons = MASK_XENSTORE_IDX(cons);
>  	prod = MASK_XENSTORE_IDX(prod);
>  	if (prod > cons)
> @@ -63,21 +80,41 @@ static int xs_ring_read(struct mmap_interface *interface,
>  	memcpy(buffer, intf->req + cons, len);
>  	xen_mb();
>  	intf->req_cons += len;
> -	return len;
> +	result = len;
> +exit:
> +	ml_result = Val_int(result);
> +	CAMLreturn(ml_result);
>  }
>  
> -static int xs_ring_write(struct mmap_interface *interface,
> -                              char *buffer, int len)
> +CAMLprim value ml_interface_write(value ml_interface,
> +                                  value ml_buffer,
> +                                  value ml_len)
>  {
> +	CAMLparam3(ml_interface, ml_buffer, ml_len);
> +	CAMLlocal1(ml_result);
> +
> +	struct mmap_interface *interface = GET_C_STRUCT(ml_interface);
> +	char *buffer = String_val(ml_buffer);
> +	int len = Int_val(ml_len);
> +	int result;
> +
>  	struct xenstore_domain_interface *intf = interface->addr;
>  	XENSTORE_RING_IDX cons, prod;
>  	int can_write;
> +	uint32_t connection;
>  
>  	cons = *(volatile uint32_t*)&intf->rsp_cons;
>  	prod = *(volatile uint32_t*)&intf->rsp_prod;
> +	connection = *(volatile uint32*)&intf->connection;
> +
> +	if (connection != XENSTORE_CONNECTED)
> +		caml_raise_constant(*caml_named_value("Xb.Reconnect"));
> +
>  	xen_mb();
> -	if ( (prod - cons) >= XENSTORE_RING_SIZE )
> -		return 0;
> +	if ( (prod - cons) >= XENSTORE_RING_SIZE ) {
> +		result = 0;
> +		goto exit;
> +	}
>  	if (MASK_XENSTORE_IDX(prod) >= MASK_XENSTORE_IDX(cons))
>  		can_write = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod);
>  	else 
> @@ -87,31 +124,43 @@ static int xs_ring_write(struct mmap_interface *interface,
>  	memcpy(intf->rsp + MASK_XENSTORE_IDX(prod), buffer, len);
>  	xen_mb();
>  	intf->rsp_prod += len;
> -	return len;
> +	result = len;
> +exit:
> +	ml_result = Val_int(result);
> +	CAMLreturn(ml_result);
>  }
>  
> -CAMLprim value ml_interface_read(value interface, value buffer, value len)
> +CAMLprim value ml_interface_set_server_features(value interface, value v)
>  {
> -	CAMLparam3(interface, buffer, len);
> -	CAMLlocal1(result);
> -	int res;
> +	CAMLparam2(interface, v);
> +	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
>  
> -	res = xs_ring_read(GET_C_STRUCT(interface),
> -	                   String_val(buffer), Int_val(len));
> -	if (res == -1)
> -		caml_failwith("bad connection");
> -	result = Val_int(res);
> -	CAMLreturn(result);
> +	intf->server_features = Int_val(v);
> +
> +	CAMLreturn(Val_unit);
> +}
> +
> +CAMLprim value ml_interface_get_server_features(value interface)
> +{
> +	CAMLparam1(interface);
> +	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
> +
> +	CAMLreturn(Val_int (intf->server_features));
>  }
>  
> -CAMLprim value ml_interface_write(value interface, value buffer, value len)
> +CAMLprim value ml_interface_close(value interface)
>  {
> -	CAMLparam3(interface, buffer, len);
> -	CAMLlocal1(result);
> -	int res;
> -
> -	res = xs_ring_write(GET_C_STRUCT(interface),
> -	                    String_val(buffer), Int_val(len));
> -	result = Val_int(res);
> -	CAMLreturn(result);
> +	CAMLparam1(interface);
> +	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
> +	int i;
> +
> +	intf->req_cons = intf->req_prod = intf->rsp_cons = intf->rsp_prod = 0;
> +	/* Ensure the unused space is full of invalid xenstore packets. */
> +	for (i = 0; i < XENSTORE_RING_SIZE; i++) {
> +		intf->req[i] = 0xff; /* XS_INVALID = 0xffff */
> +		intf->rsp[i] = 0xff;
> +	}
> +	xen_mb ();
> +	intf->connection = XENSTORE_CONNECTED;
> +	CAMLreturn(Val_unit);
>  }
> diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
> index 47695f8..e2fa4e9 100644
> --- a/tools/ocaml/xenstored/connection.ml
> +++ b/tools/ocaml/xenstored/connection.ml
> @@ -45,6 +45,20 @@ let mark_as_bad con =
>  	|None -> ()
>  	| Some domain -> Domain.mark_as_bad domain
>  
> +let initial_next_tid = 1
> +
> +let reconnect con =
> +	Xenbus.Xb.reconnect con.xb;
> +	(* dom is the same *)
> +	Hashtbl.clear con.transactions;
> +	con.next_tid <- initial_next_tid;
> +	Hashtbl.clear con.watches;
> +	(* anonid is the same *)
> +	con.nb_watches <- 0;
> +	con.stat_nb_ops <- 0;
> +	(* perm is the same *)
> +	()
> +
>  let get_path con =
>  Printf.sprintf "/local/domain/%i/" (match con.dom with None -> 0 | Some d -> Domain.get_id d)
>  
> @@ -89,7 +103,7 @@ let create xbcon dom =
>  	xb = xbcon;
>  	dom = dom;
>  	transactions = Hashtbl.create 5;
> -	next_tid = 1;
> +	next_tid = initial_next_tid;
>  	watches = Hashtbl.create 8;
>  	nb_watches = 0;
>  	anonid = id;
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index 89db56c..0620585 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -377,7 +377,12 @@ let do_input store cons doms con =
>  	let newpacket =
>  		try
>  			Connection.do_input con
> -		with Failure exp ->
> +		with Xenbus.Xb.Reconnect ->
> +			info "%s requests a reconnect" (Connection.get_domstr con);
> +			Connection.reconnect con;
> +			info "%s reconnection complete" (Connection.get_domstr con);
> +			false
> +		| Failure exp ->
>  			error "caught exception %s" exp;
>  			error "got a bad client %s" (sprintf "%-8s" (Connection.get_domstr con));
>  			Connection.mark_as_bad con;
> @@ -407,6 +412,11 @@ let do_output store cons doms con =
>  			         (Xenbus.Xb.Op.to_string ty) (sanitize_data data);*)
>  			write_answer_log ~ty ~tid ~con ~data;
>  		);
> -		ignore (Connection.do_output con)
> +		try
> +			ignore (Connection.do_output con)
> +		with Xenbus.Xb.Reconnect ->
> +			info "%s requests a reconnect" (Connection.get_domstr con);
> +			Connection.reconnect con;
> +			info "%s reconnection complete" (Connection.get_domstr con)
>  	)
>  
> diff --git a/xen/include/public/io/xs_wire.h b/xen/include/public/io/xs_wire.h
> index 585f0c8..0a0cdbc 100644
> --- a/xen/include/public/io/xs_wire.h
> +++ b/xen/include/public/io/xs_wire.h
> @@ -49,7 +49,9 @@ enum xsd_sockmsg_type
>      XS_RESUME,
>      XS_SET_TARGET,
>      XS_RESTRICT,
> -    XS_RESET_WATCHES
> +    XS_RESET_WATCHES,
> +
> +    XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
>  };
>  
>  #define XS_WRITE_NONE "NONE"
> @@ -116,6 +118,8 @@ struct xenstore_domain_interface {
>      char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
>      XENSTORE_RING_IDX req_cons, req_prod;
>      XENSTORE_RING_IDX rsp_cons, rsp_prod;
> +    uint32_t server_features; /* Bitmap of features supported by the server */
> +    uint32_t connection;
>  };
>  
>  /* Violating this is very bad.  See docs/misc/xenstore.txt. */
> @@ -125,6 +129,13 @@ struct xenstore_domain_interface {
>  #define XENSTORE_ABS_PATH_MAX 3072
>  #define XENSTORE_REL_PATH_MAX 2048
>  
> +/* The ability to reconnect a ring */
> +#define XENSTORE_SERVER_FEATURE_RECONNECTION 1
> +
> +/* Valid values for the connection field */
> +#define XENSTORE_CONNECTED 0 /* the steady-state */
> +#define XENSTORE_RECONNECT 1 /* guest has initiated a reconnect */
> +
>  #endif /* _XS_WIRE_H */
>  
>  /*

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-10-07 11:27 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-25 14:58 [PATCH v5] xenstore: extend the xenstore ring with a 'closing' signal David Scott
2014-09-26 14:00 ` Ian Jackson
2014-09-26 14:06   ` Dave Scott
2014-10-07 11:26 ` Jon Ludlam

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.