* [PATCH v2 0/6] More Oxenstored live update fixes
@ 2022-11-30 16:54 Andrew Cooper
  2022-11-30 16:54 ` [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain Andrew Cooper
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Andrew Cooper, Christian Lindig, David Scott, Edwin Torok, Rob Hoes

Patch 6 is already acked and queued for 4.18, but testing has identified it
was incomplete.  Specifically, the DOM_EXC virq needs handling across live
update, otherwise domain shutdown events go awry.

Therefore, this v2 series is presented with 5 patches of refactoring, leading
up to the virq correction in patch 6.

Andrew Cooper (5):
  tools/oxenstored: Style fixes to Domain
  tools/oxenstored: Bind the DOM_EXC VIRQ in Event.init()
  tools/oxenstored: Rename some 'port' variables to 'remote_port'
  tools/oxenstored: Implement Domain.rebind_evtchn
  tools/oxenstored: Rework Domain evtchn handling to use port_pair

Edwin Török (1):
  tools/oxenstored: Keep /dev/xen/evtchn open across live update

 tools/ocaml/xenstored/connections.ml |  9 +---
 tools/ocaml/xenstored/domain.ml      | 86 ++++++++++++++++++++--------------
 tools/ocaml/xenstored/domains.ml     | 31 ++++++-------
 tools/ocaml/xenstored/event.ml       | 20 ++++++--
 tools/ocaml/xenstored/process.ml     | 16 +++----
 tools/ocaml/xenstored/xenstored.ml   | 89 +++++++++++++++++++++++-------------
 6 files changed, 149 insertions(+), 102 deletions(-)

-- 
2.11.0




* [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain
  2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
@ 2022-11-30 16:54 ` Andrew Cooper
  2022-11-30 17:14   ` Edwin Torok
  2022-12-01 11:11   ` Christian Lindig
  2022-11-30 16:54 ` [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in Event.init() Andrew Cooper
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Andrew Cooper, Christian Lindig, David Scott, Edwin Torok, Rob Hoes

This file has some style problems so severe that they interfere with the
readability of the subsequent bugfix patches.

Fix these issues ahead of time, to make the subsequent changes more readable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Edwin Torok <edvin.torok@citrix.com>
CC: Rob Hoes <Rob.Hoes@citrix.com>

v2:
 * New
---
 tools/ocaml/xenstored/domain.ml | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
index 81cb59b8f1a2..ab08dcf37f62 100644
--- a/tools/ocaml/xenstored/domain.ml
+++ b/tools/ocaml/xenstored/domain.ml
@@ -57,17 +57,16 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 0.0
 let is_free_to_conflict = is_dom0
 
 let string_of_port = function
-| None -> "None"
-| Some x -> string_of_int (Xeneventchn.to_int x)
+	| None -> "None"
+	| Some x -> string_of_int (Xeneventchn.to_int x)
 
 let dump d chan =
 	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
 
-let notify dom = match dom.port with
-| None ->
-	warn "domain %d: attempt to notify on unknown port" dom.id
-| Some port ->
-	Event.notify dom.eventchn port
+let notify dom =
+	match dom.port with
+	| None -> warn "domain %d: attempt to notify on unknown port" dom.id
+	| Some port -> Event.notify dom.eventchn port
 
 let bind_interdomain dom =
 	begin match dom.port with
@@ -84,8 +83,7 @@ let close dom =
 	| None -> ()
 	| Some port -> Event.unbind dom.eventchn port
 	end;
-	Xenmmap.unmap dom.interface;
-	()
+	Xenmmap.unmap dom.interface
 
 let make id mfn remote_port interface eventchn = {
 	id = id;
-- 
2.11.0




* [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in Event.init()
  2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
  2022-11-30 16:54 ` [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain Andrew Cooper
@ 2022-11-30 16:54 ` Andrew Cooper
  2022-11-30 17:16   ` Edwin Torok
  2022-12-01 11:27   ` Christian Lindig
  2022-11-30 16:54 ` [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port' Andrew Cooper
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Andrew Cooper, Christian Lindig, David Scott, Edwin Torok, Rob Hoes

Xenstored always needs to bind the DOM_EXC VIRQ.

Instead of doing it shortly after the call to Event.init(), do it in the
init() call itself.  This removes the need for the field to be a mutable
option.

It will also simplify a future change to restore both parts from the live
update record, rather than re-initialising them from scratch.

Rename the field from virq_port (which could be any VIRQ) to its proper name.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Edwin Torok <edvin.torok@citrix.com>
CC: Rob Hoes <Rob.Hoes@citrix.com>

v2:
 * New.
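
For reviewers, a minimal standalone sketch of why binding in init() lets the
field stop being a mutable option.  Fake_evtchn is a hypothetical stand-in for
the real Xeneventchn bindings, and the Before/After module names and the
is_domexc helper are illustrative only; none of this is the code in the diff
below.

```ocaml
(* Hypothetical stand-in for the Xeneventchn bindings, for illustration only. *)
module Fake_evtchn = struct
  type handle = { mutable next_port : int }
  type t = int

  let init () = { next_port = 1 }

  (* Hand out the next free local port, roughly as a real bind would. *)
  let bind_dom_exc_virq h =
    let p = h.next_port in
    h.next_port <- p + 1;
    p
end

(* Before: the port is filled in after construction, so the field has to be
   a mutable option and every comparison has to wrap the port in Some. *)
module Before = struct
  type t = {
    handle : Fake_evtchn.handle;
    mutable virq_port : Fake_evtchn.t option;
  }

  let init () = { handle = Fake_evtchn.init (); virq_port = None }

  let bind_dom_exc_virq e =
    e.virq_port <- Some (Fake_evtchn.bind_dom_exc_virq e.handle)

  let is_domexc e port = Some port = e.virq_port
end

(* After: the VIRQ is bound inside init(), so the field is immutable and
   always holds a valid port once the record exists. *)
module After = struct
  type t = {
    handle : Fake_evtchn.handle;
    domexc : Fake_evtchn.t;
  }

  let init () =
    let handle = Fake_evtchn.init () in
    let domexc = Fake_evtchn.bind_dom_exc_virq handle in
    { handle; domexc }

  let is_domexc e port = port = e.domexc
end
```

With the After shape there is no window in which the record exists but the
VIRQ is unbound, which is what allows the two separate bind calls in
xenstored.ml to go away.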
---
 tools/ocaml/xenstored/event.ml     | 9 ++++++---
 tools/ocaml/xenstored/xenstored.ml | 4 +---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/ocaml/xenstored/event.ml b/tools/ocaml/xenstored/event.ml
index ccca90b6fc4f..a3be296374ff 100644
--- a/tools/ocaml/xenstored/event.ml
+++ b/tools/ocaml/xenstored/event.ml
@@ -17,12 +17,15 @@
 (**************** high level binding ****************)
 type t = {
 	handle: Xeneventchn.handle;
-	mutable virq_port: Xeneventchn.t option;
+	domexc: Xeneventchn.t;
 }
 
-let init () = { handle = Xeneventchn.init (); virq_port = None; }
+let init () =
+	let handle = Xeneventchn.init () in
+	let domexc = Xeneventchn.bind_dom_exc_virq handle in
+	{ handle; domexc }
+
 let fd eventchn = Xeneventchn.fd eventchn.handle
-let bind_dom_exc_virq eventchn = eventchn.virq_port <- Some (Xeneventchn.bind_dom_exc_virq eventchn.handle)
 let bind_interdomain eventchn domid port = Xeneventchn.bind_interdomain eventchn.handle domid port
 let unbind eventchn port = Xeneventchn.unbind eventchn.handle port
 let notify eventchn port = Xeneventchn.notify eventchn.handle port
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index c5dc7a28d082..55071b49eccb 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -397,7 +397,6 @@ let _ =
 	if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
 		let rwro = DB.from_file store domains cons Disk.xs_daemon_database in
 		info "Live reload: database loaded";
-		Event.bind_dom_exc_virq eventchn;
 		Process.LiveUpdate.completed ();
 		rwro
 	) else (
@@ -413,7 +412,6 @@ let _ =
 
 		if cf.domain_init then (
 			Connections.add_domain cons (Domains.create0 domains);
-			Event.bind_dom_exc_virq eventchn
 		);
 		rw_sock
 	) in
@@ -451,7 +449,7 @@ let _ =
 			let port = Event.pending eventchn in
 			debug "pending port %d" (Xeneventchn.to_int port);
 			finally (fun () ->
-				if Some port = eventchn.Event.virq_port then (
+				if port = eventchn.Event.domexc then (
 					let (notify, deaddom) = Domains.cleanup domains in
 					List.iter (Store.reset_permissions store) deaddom;
 					List.iter (Connections.del_domain cons) deaddom;
-- 
2.11.0




* [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port'
  2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
  2022-11-30 16:54 ` [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain Andrew Cooper
  2022-11-30 16:54 ` [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in Event.init() Andrew Cooper
@ 2022-11-30 16:54 ` Andrew Cooper
  2022-11-30 17:16   ` Edwin Torok
  2022-12-01 11:26   ` Christian Lindig
  2022-11-30 16:54 ` [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn Andrew Cooper
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Andrew Cooper, Christian Lindig, David Scott, Edwin Torok, Rob Hoes

This will make the logic clearer when we plumb local_port through these
functions.

While changing this, simplify the construct in Domains.create0 to separate the
remote port handling from the interface.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Edwin Torok <edvin.torok@citrix.com>
CC: Rob Hoes <Rob.Hoes@citrix.com>

v2:
 * New.
---
 tools/ocaml/xenstored/domains.ml   | 26 ++++++++++++--------------
 tools/ocaml/xenstored/process.ml   | 12 ++++++------
 tools/ocaml/xenstored/xenstored.ml |  8 ++++----
 3 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
index 17fe2fa25772..26018ac0dd3d 100644
--- a/tools/ocaml/xenstored/domains.ml
+++ b/tools/ocaml/xenstored/domains.ml
@@ -122,9 +122,9 @@ let cleanup doms =
 let resume _doms _domid =
 	()
 
-let create doms domid mfn port =
+let create doms domid mfn remote_port =
 	let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
-	let dom = Domain.make domid mfn port interface doms.eventchn in
+	let dom = Domain.make domid mfn remote_port interface doms.eventchn in
 	Hashtbl.add doms.table domid dom;
 	Domain.bind_interdomain dom;
 	dom
@@ -133,18 +133,16 @@ let xenstored_kva = ref ""
 let xenstored_port = ref ""
 
 let create0 doms =
-	let port, interface =
-		(
-			let port = Utils.read_file_single_integer !xenstored_port
-			and fd = Unix.openfile !xenstored_kva
-					       [ Unix.O_RDWR ] 0o600 in
-			let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED
-						  (Xenmmap.getpagesize()) 0 in
-			Unix.close fd;
-			port, interface
-		)
-		in
-	let dom = Domain.make 0 Nativeint.zero port interface doms.eventchn in
+	let remote_port = Utils.read_file_single_integer !xenstored_port in
+
+	let interface =
+		let fd = Unix.openfile !xenstored_kva [ Unix.O_RDWR ] 0o600 in
+		let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED (Xenmmap.getpagesize()) 0 in
+		Unix.close fd;
+		interface
+	in
+
+	let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
 	Hashtbl.add doms.table 0 dom;
 	Domain.bind_interdomain dom;
 	Domain.notify dom;
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index 72a79e9328dd..b2973aca2a82 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -558,10 +558,10 @@ let do_transaction_end con t domains cons data =
 let do_introduce con t domains cons data =
 	if not (Connection.is_dom0 con)
 	then raise Define.Permission_denied;
-	let (domid, mfn, port) =
+	let (domid, mfn, remote_port) =
 		match (split None '\000' data) with
-		| domid :: mfn :: port :: _ ->
-			int_of_string domid, Nativeint.of_string mfn, int_of_string port
+		| domid :: mfn :: remote_port :: _ ->
+			int_of_string domid, Nativeint.of_string mfn, int_of_string remote_port
 		| _                         -> raise Invalid_Cmd_Args;
 		in
 	let dom =
@@ -569,18 +569,18 @@ let do_introduce con t domains cons data =
 			let edom = Domains.find domains domid in
 			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
 				(* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
-				edom.remote_port <- port;
+				edom.remote_port <- remote_port;
 				Domain.bind_interdomain edom;
 			end;
 			edom
 		else try
-			let ndom = Domains.create domains domid mfn port in
+			let ndom = Domains.create domains domid mfn remote_port in
 			Connections.add_domain cons ndom;
 			Connections.fire_spec_watches (Transaction.get_root t) cons Store.Path.introduce_domain;
 			ndom
 		with _ -> raise Invalid_Cmd_Args
 	in
-	if (Domain.get_remote_port dom) <> port || (Domain.get_mfn dom) <> mfn then
+	if (Domain.get_remote_port dom) <> remote_port || (Domain.get_mfn dom) <> mfn then
 		raise Domain_not_match
 
 let do_release con t domains cons data =
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 55071b49eccb..1f11f576b515 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -167,10 +167,10 @@ let from_channel_f chan global_f socket_f domain_f watch_f store_f =
 					global_f ~rw
 				| "socket" :: fd :: [] ->
 					socket_f ~fd:(int_of_string fd)
-				| "dom" :: domid :: mfn :: port :: []->
+				| "dom" :: domid :: mfn :: remote_port :: []->
 					domain_f (int_of_string domid)
 					         (Nativeint.of_string mfn)
-					         (int_of_string port)
+					         (int_of_string remote_port)
 				| "watch" :: domid :: path :: token :: [] ->
 					watch_f (int_of_string domid)
 					        (unhexify path) (unhexify token)
@@ -209,10 +209,10 @@ let from_channel store cons doms chan =
 		else
 			warn "Ignoring invalid socket FD %d" fd
 	in
-	let domain_f domid mfn port =
+	let domain_f domid mfn remote_port =
 		let ndom =
 			if domid > 0 then
-				Domains.create doms domid mfn port
+				Domains.create doms domid mfn remote_port
 			else
 				Domains.create0 doms
 			in
-- 
2.11.0




* [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn
  2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
                   ` (2 preceding siblings ...)
  2022-11-30 16:54 ` [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port' Andrew Cooper
@ 2022-11-30 16:54 ` Andrew Cooper
  2022-11-30 17:15   ` Edwin Torok
  2022-12-01 11:20   ` Christian Lindig
  2022-11-30 16:54 ` [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair Andrew Cooper
  2022-11-30 16:54 ` [PATCH v2 6/6] tools/oxenstored: Keep /dev/xen/evtchn open across live update Andrew Cooper
  5 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Andrew Cooper, Christian Lindig, David Scott, Edwin Torok, Rob Hoes

Generally speaking, the event channel local/remote port is fixed for the
lifetime of the associated domain object.  The exception to this is a
secondary XS_INTRODUCE (defined to re-bind to a new event channel) which pokes
around at the domain object's internal state.

We need to refactor the evtchn handling to support live update, so start by
moving the relevant manipulation into Domain.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Edwin Torok <edvin.torok@citrix.com>
CC: Rob Hoes <Rob.Hoes@citrix.com>

Note: This change deliberately doesn't reuse Domain.bind_interdomain, which is
removed by the end of the refactoring.

v2:
 * New.
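
A minimal standalone sketch of the rebind sequence described in the commit
message, for anyone who wants to poke at it outside the tree.  Fake_event is a
hypothetical stand-in for Event/Xeneventchn which simply tracks which local
ports are bound, and the domain record is cut down to the fields involved.
Unlike Xen, the fake never reuses a port number.

```ocaml
(* Hypothetical stand-in for Event/Xeneventchn: records bound local ports. *)
module Fake_event = struct
  type t = { mutable next : int; bound : (int, int * int) Hashtbl.t }

  let create () = { next = 1; bound = Hashtbl.create 8 }

  (* Allocate a fresh local port bound to (domid, remote_port). *)
  let bind_interdomain e domid remote_port =
    let local = e.next in
    e.next <- local + 1;
    Hashtbl.replace e.bound local (domid, remote_port);
    local

  let unbind e local = Hashtbl.remove e.bound local
  let is_bound e local = Hashtbl.mem e.bound local
end

(* Cut-down domain record; the real one has more fields. *)
type domain = {
  id : int;
  eventchn : Fake_event.t;
  mutable remote_port : int;
  mutable port : int option; (* local port, None until first bound *)
}

(* Unbind any existing local port, bind to the new remote port, then record
   the new remote and local ports.  Keeping all of this inside one function
   means callers no longer touch the domain's fields directly. *)
let rebind_evtchn d remote_port =
  (match d.port with
   | None -> ()
   | Some p -> Fake_event.unbind d.eventchn p);
  let local = Fake_event.bind_interdomain d.eventchn d.id remote_port in
  d.remote_port <- remote_port;
  d.port <- Some local
```

The point of the sketch is only the ordering (unbind, bind, then update both
fields); the debug logging in the real function is omitted.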
---
 tools/ocaml/xenstored/domain.ml  | 12 ++++++++++++
 tools/ocaml/xenstored/process.ml |  6 ++----
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
index ab08dcf37f62..d59a9401e211 100644
--- a/tools/ocaml/xenstored/domain.ml
+++ b/tools/ocaml/xenstored/domain.ml
@@ -63,6 +63,18 @@ let string_of_port = function
 let dump d chan =
 	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
 
+let rebind_evtchn d remote_port =
+	begin match d.port with
+	| None -> ()
+	| Some p -> Event.unbind d.eventchn p
+	end;
+	let local = Event.bind_interdomain d.eventchn d.id remote_port in
+	debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
+	      d.id (string_of_port d.port) d.remote_port
+	      (Xeneventchn.to_int local) remote_port;
+	d.remote_port <- remote_port;
+	d.port <- Some (local)
+
 let notify dom =
 	match dom.port with
 	| None -> warn "domain %d: attempt to notify on unknown port" dom.id
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index b2973aca2a82..2ea940d7e2d5 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -567,11 +567,9 @@ let do_introduce con t domains cons data =
 	let dom =
 		if Domains.exist domains domid then
 			let edom = Domains.find domains domid in
-			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
+			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then
 				(* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
-				edom.remote_port <- remote_port;
-				Domain.bind_interdomain edom;
-			end;
+				Domain.rebind_evtchn edom remote_port;
 			edom
 		else try
 			let ndom = Domains.create domains domid mfn remote_port in
-- 
2.11.0




* [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair
  2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
                   ` (3 preceding siblings ...)
  2022-11-30 16:54 ` [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn Andrew Cooper
@ 2022-11-30 16:54 ` Andrew Cooper
  2022-11-30 17:17   ` Edwin Torok
  2022-12-01 11:59   ` Christian Lindig
  2022-11-30 16:54 ` [PATCH v2 6/6] tools/oxenstored: Keep /dev/xen/evtchn open across live update Andrew Cooper
  5 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Andrew Cooper, Christian Lindig, David Scott, Edwin Torok, Rob Hoes

Inter-domain event channels are always a pair of local and remote ports.
Right now the handling is asymmetric, caused by the fact that the evtchn is
bound after the associated Domain object is constructed.

First, move binding of the event channel into the Domain.make() constructor.
This means the local port no longer needs to be an option.  It also removes
the final callers of Domain.bind_interdomain.

Next, introduce a new port_pair type to encapsulate the fact that these two
should be updated together, and replace the previous port and remote_port
fields.  This refactoring also changes the Domain.get_port interface (removing
an option) so take the opportunity to name it get_local_port instead.

Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn has
been unbound, the same local port number can be reused for a different
purpose, so explicitly invalidate the ports to prevent their accidental misuse
in the future.

This also cleans up some of the debugging, to always print a port pair.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Edwin Torok <edvin.torok@citrix.com>
CC: Rob Hoes <Rob.Hoes@citrix.com>

v2:
 * New
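
A minimal standalone model of the port_pair idea from the commit message.
Ports are plain ints here rather than Xeneventchn.t, and the rebind helper and
cut-down domain record are hypothetical; the sketch only shows that both ports
change in a single assignment and that close leaves no stale port behind.

```ocaml
(* Local and remote port kept together; 0 plays the role of an invalid
   port, mirroring invalid_ports in the patch. *)
type port_pair = { local : int; remote : int }

let invalid_ports = { local = 0; remote = 0 }

let string_of_port_pair p = Printf.sprintf "(l %d, r %d)" p.local p.remote

(* Cut-down domain record; the real one has more fields. *)
type domain = { id : int; mutable ports : port_pair }

(* Both ports are replaced in one assignment, so they cannot drift apart. *)
let rebind d ~local ~remote = d.ports <- { local; remote }

(* Once unbound, the old local port number may be handed out again for a
   different purpose, so drop it rather than leave it lying around. *)
let close d = d.ports <- invalid_ports
```

Compared with two separate mutable fields, there is no intermediate state in
which the remote port has been updated but the local port has not.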
---
 tools/ocaml/xenstored/connections.ml |  9 +----
 tools/ocaml/xenstored/domain.ml      | 75 ++++++++++++++++++------------------
 tools/ocaml/xenstored/domains.ml     |  2 -
 3 files changed, 39 insertions(+), 47 deletions(-)

diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
index 7d68c583b43a..a80ae0bed2ce 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -48,9 +48,7 @@ let add_domain cons dom =
 	let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun () -> Domain.notify dom) in
 	let con = Connection.create xbcon (Some dom) in
 	Hashtbl.add cons.domains (Domain.get_id dom) con;
-	match Domain.get_port dom with
-	| Some p -> Hashtbl.add cons.ports p con;
-	| None -> ()
+	Hashtbl.add cons.ports (Domain.get_local_port dom) con
 
 let select ?(only_if = (fun _ -> true)) cons =
 	Hashtbl.fold (fun _ con (ins, outs) ->
@@ -97,10 +95,7 @@ let del_domain cons id =
 		let con = find_domain cons id in
 		Hashtbl.remove cons.domains id;
 		(match Connection.get_domain con with
-		 | Some d ->
-		   (match Domain.get_port d with
-		    | Some p -> Hashtbl.remove cons.ports p
-		    | None -> ())
+		 | Some d -> Hashtbl.remove cons.ports (Domain.get_local_port d)
 		 | None -> ());
 		del_watches cons con;
 		Connection.close con
diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
index d59a9401e211..ecdd65f3209a 100644
--- a/tools/ocaml/xenstored/domain.ml
+++ b/tools/ocaml/xenstored/domain.ml
@@ -19,14 +19,31 @@ open Printf
 let debug fmt = Logging.debug "domain" fmt
 let warn  fmt = Logging.warn  "domain" fmt
 
+(* An event channel port pair.  The remote port, and the local port it is
+   bound to. *)
+type port_pair =
+{
+	local: Xeneventchn.t;
+	remote: int;
+}
+
+(* Sentinel port_pair with both set to EVTCHN_INVALID *)
+let invalid_ports =
+{
+	local = Xeneventchn.of_int 0;
+	remote = 0
+}
+
+let string_of_port_pair p =
+	sprintf "(l %d, r %d)" (Xeneventchn.to_int p.local) p.remote
+
 type t =
 {
 	id: Xenctrl.domid;
 	mfn: nativeint;
 	interface: Xenmmap.mmap_interface;
 	eventchn: Event.t;
-	mutable remote_port: int;
-	mutable port: Xeneventchn.t option;
+	mutable ports: port_pair;
 	mutable bad_client: bool;
 	mutable io_credit: int; (* the rounds of ring process left to do, default is 0,
 	                           usually set to 1 when there is work detected, could
@@ -41,8 +58,8 @@ let is_dom0 d = d.id = 0
 let get_id domain = domain.id
 let get_interface d = d.interface
 let get_mfn d = d.mfn
-let get_remote_port d = d.remote_port
-let get_port d = d.port
+let get_remote_port d = d.ports.remote
+let get_local_port d = d.ports.local
 
 let is_bad_domain domain = domain.bad_client
 let mark_as_bad domain = domain.bad_client <- true
@@ -56,54 +73,36 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 0.0
 
 let is_free_to_conflict = is_dom0
 
-let string_of_port = function
-	| None -> "None"
-	| Some x -> string_of_int (Xeneventchn.to_int x)
-
 let dump d chan =
-	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
+	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.ports.remote
 
 let rebind_evtchn d remote_port =
-	begin match d.port with
-	| None -> ()
-	| Some p -> Event.unbind d.eventchn p
-	end;
+	Event.unbind d.eventchn d.ports.local;
 	let local = Event.bind_interdomain d.eventchn d.id remote_port in
-	debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
-	      d.id (string_of_port d.port) d.remote_port
-	      (Xeneventchn.to_int local) remote_port;
-	d.remote_port <- remote_port;
-	d.port <- Some (local)
+	let ports = { local; remote = remote_port } in
+	debug "domain %d rebind %s => %s"
+	      d.id (string_of_port_pair d.ports) (string_of_port_pair ports);
+	d.ports <- ports
 
 let notify dom =
-	match dom.port with
-	| None -> warn "domain %d: attempt to notify on unknown port" dom.id
-	| Some port -> Event.notify dom.eventchn port
-
-let bind_interdomain dom =
-	begin match dom.port with
-	| None -> ()
-	| Some port -> Event.unbind dom.eventchn port
-	end;
-	dom.port <- Some (Event.bind_interdomain dom.eventchn dom.id dom.remote_port);
-	debug "bound domain %d remote port %d to local port %s" dom.id dom.remote_port (string_of_port dom.port)
-
+	Event.notify dom.eventchn dom.ports.local
 
 let close dom =
-	debug "domain %d unbound port %s" dom.id (string_of_port dom.port);
-	begin match dom.port with
-	| None -> ()
-	| Some port -> Event.unbind dom.eventchn port
-	end;
+	debug "domain %d unbind %s" dom.id (string_of_port_pair dom.ports);
+	Event.unbind dom.eventchn dom.ports.local;
+	dom.ports <- invalid_ports;
 	Xenmmap.unmap dom.interface
 
-let make id mfn remote_port interface eventchn = {
+let make id mfn remote_port interface eventchn =
+	let local = Event.bind_interdomain eventchn id remote_port in
+	let ports = { local; remote = remote_port } in
+	debug "domain %d bind %s" id (string_of_port_pair ports);
+{
 	id = id;
 	mfn = mfn;
-	remote_port = remote_port;
+	ports;
 	interface = interface;
 	eventchn = eventchn;
-	port = None;
 	bad_client = false;
 	io_credit = 0;
 	conflict_credit = !Define.conflict_burst_limit;
diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
index 26018ac0dd3d..2ab0c5f4d8d0 100644
--- a/tools/ocaml/xenstored/domains.ml
+++ b/tools/ocaml/xenstored/domains.ml
@@ -126,7 +126,6 @@ let create doms domid mfn remote_port =
 	let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
 	let dom = Domain.make domid mfn remote_port interface doms.eventchn in
 	Hashtbl.add doms.table domid dom;
-	Domain.bind_interdomain dom;
 	dom
 
 let xenstored_kva = ref ""
@@ -144,7 +143,6 @@ let create0 doms =
 
 	let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
 	Hashtbl.add doms.table 0 dom;
-	Domain.bind_interdomain dom;
 	Domain.notify dom;
 	dom
 
-- 
2.11.0




* [PATCH v2 6/6] tools/oxenstored: Keep /dev/xen/evtchn open across live update
  2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
                   ` (4 preceding siblings ...)
  2022-11-30 16:54 ` [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair Andrew Cooper
@ 2022-11-30 16:54 ` Andrew Cooper
  5 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-11-30 16:54 UTC
  To: Xen-devel
  Cc: Edwin Török, Andrew Cooper, Christian Lindig,
	David Scott, Rob Hoes

From: Edwin Török <edvin.torok@citrix.com>

Closing the evtchn handle will unbind and free all local ports.  The new
xenstored would need to rebind all evtchns, which is work that we don't want
or need to be doing during the critical handover period.

However, it turns out that the Windows PV drivers also rebind their local port
across suspend/resume, leaving (o)xenstored with a stale idea of the
remote port to use.  In this case, reusing the established connection is the
only robust option.

Therefore:
 * Have oxenstored open /dev/xen/evtchn without CLOEXEC at start of day.
 * Extend the handover information with the evtchn fd, domexc virq local port,
   and the local port number for each domain connection.
 * Have (the new) oxenstored recover the open handle using Xeneventchn.fdopen,
   and use the provided local ports rather than trying to rebind them.

When this new information isn't present (i.e. live updating from an oxenstored
prior to this change), the best-effort status quo will have to do.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
CC: Edwin Torok <edvin.torok@citrix.com>
CC: Rob Hoes <Rob.Hoes@citrix.com>

v2:
 * Bind DOM_EXC virq for non-LU starts.  (Regression introduced between v2 and
   v3 of the original series)
 * Rebase over previous Event.init() virq cleanup
 * Preserve the DOM_EXC virq local port too
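
A minimal standalone sketch of the backward-compatible handling of "dom"
records described in the commit message.  The line layout follows the
"dom,%d,%nd,%d,%d" format written by Domain.dump in this patch; the dom_record
type and the parse_dom_line and local_port_of helpers are hypothetical names
for illustration, and bind stands in for Event.bind_interdomain.

```ocaml
(* Hypothetical record for one "dom" line of the handover dump. *)
type dom_record = {
  domid : int;
  mfn : nativeint;
  remote_port : int;
  local_port : int option; (* None when dumped by an older oxenstored *)
}

(* Parse "dom,<domid>,<mfn>,<remote>[,<local>]".  The trailing local port is
   optional so that dumps from before this change are still accepted. *)
let parse_dom_line line =
  match String.split_on_char ',' line with
  | "dom" :: domid :: mfn :: remote_port :: rest ->
    let local_port =
      match rest with
      | [] -> None
      | p :: _ -> Some (int_of_string p)
    in
    Some { domid = int_of_string domid;
           mfn = Nativeint.of_string mfn;
           remote_port = int_of_string remote_port;
           local_port }
  | _ -> None

(* Reuse the preserved local port when the record has one; otherwise fall
   back to binding afresh, which is the old best-effort behaviour. *)
let local_port_of ~bind r =
  match r.local_port with
  | Some p -> p
  | None -> bind r.domid r.remote_port
```

For example, "dom,3,1234,7,19" yields local_port = Some 19 and never calls
bind, while the older form "dom,3,1234,7" yields None and takes the rebind
path.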
---
 tools/ocaml/xenstored/domain.ml    | 13 ++++--
 tools/ocaml/xenstored/domains.ml   |  9 ++--
 tools/ocaml/xenstored/event.ml     | 20 +++++++--
 tools/ocaml/xenstored/process.ml   |  2 +-
 tools/ocaml/xenstored/xenstored.ml | 85 +++++++++++++++++++++++++-------------
 5 files changed, 90 insertions(+), 39 deletions(-)

diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
index ecdd65f3209a..c196edf6a059 100644
--- a/tools/ocaml/xenstored/domain.ml
+++ b/tools/ocaml/xenstored/domain.ml
@@ -74,7 +74,8 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 0.0
 let is_free_to_conflict = is_dom0
 
 let dump d chan =
-	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.ports.remote
+	fprintf chan "dom,%d,%nd,%d,%d\n"
+		d.id d.mfn d.ports.remote (Xeneventchn.to_int d.ports.local)
 
 let rebind_evtchn d remote_port =
 	Event.unbind d.eventchn d.ports.local;
@@ -93,8 +94,14 @@ let close dom =
 	dom.ports <- invalid_ports;
 	Xenmmap.unmap dom.interface
 
-let make id mfn remote_port interface eventchn =
-	let local = Event.bind_interdomain eventchn id remote_port in
+(* On clean start, local_port will be None, and we must bind the remote port
+   given.  On Live Update, the event channel is already bound, and both the
+   local and remote port numbers come from the transfer record. *)
+let make ?local_port ~remote_port id mfn interface eventchn =
+	let local = match local_port with
+		| None -> Event.bind_interdomain eventchn id remote_port
+		| Some p -> Xeneventchn.of_int p
+	in
 	let ports = { local; remote = remote_port } in
 	debug "domain %d bind %s" id (string_of_port_pair ports);
 {
diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
index 2ab0c5f4d8d0..b6c075c838ab 100644
--- a/tools/ocaml/xenstored/domains.ml
+++ b/tools/ocaml/xenstored/domains.ml
@@ -56,6 +56,7 @@ let exist doms id = Hashtbl.mem doms.table id
 let find doms id = Hashtbl.find doms.table id
 let number doms = Hashtbl.length doms.table
 let iter doms fct = Hashtbl.iter (fun _ b -> fct b) doms.table
+let eventchn doms = doms.eventchn
 
 let rec is_empty_queue q =
 	Queue.is_empty q ||
@@ -122,16 +123,16 @@ let cleanup doms =
 let resume _doms _domid =
 	()
 
-let create doms domid mfn remote_port =
+let create doms ?local_port ~remote_port domid mfn =
 	let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
-	let dom = Domain.make domid mfn remote_port interface doms.eventchn in
+	let dom = Domain.make ?local_port ~remote_port domid mfn interface doms.eventchn in
 	Hashtbl.add doms.table domid dom;
 	dom
 
 let xenstored_kva = ref ""
 let xenstored_port = ref ""
 
-let create0 doms =
+let create0 ?local_port doms =
 	let remote_port = Utils.read_file_single_integer !xenstored_port in
 
 	let interface =
@@ -141,7 +142,7 @@ let create0 doms =
 		interface
 	in
 
-	let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
+	let dom = Domain.make ?local_port ~remote_port 0 Nativeint.zero interface doms.eventchn in
 	Hashtbl.add doms.table 0 dom;
 	Domain.notify dom;
 	dom
diff --git a/tools/ocaml/xenstored/event.ml b/tools/ocaml/xenstored/event.ml
index a3be296374ff..629dc6041bb0 100644
--- a/tools/ocaml/xenstored/event.ml
+++ b/tools/ocaml/xenstored/event.ml
@@ -20,9 +20,18 @@ type t = {
 	domexc: Xeneventchn.t;
 }
 
-let init () =
-	let handle = Xeneventchn.init () in
-	let domexc = Xeneventchn.bind_dom_exc_virq handle in
+(* On clean start, both parameters will be None, and we must open the evtchn
+   handle and bind the DOM_EXC VIRQ.  On Live Update, the fd is preserved
+   across exec(), and the DOM_EXC VIRQ still bound. *)
+let init ?fd ?domexc_port () =
+	let handle = match fd with
+		| None -> Xeneventchn.init ~cloexec:false ()
+		| Some fd -> fd |> Utils.FD.of_int |> Xeneventchn.fdopen
+	in
+	let domexc = match domexc_port with
+		| None -> Xeneventchn.bind_dom_exc_virq handle
+		| Some p -> Xeneventchn.of_int p
+	in
 	{ handle; domexc }
 
 let fd eventchn = Xeneventchn.fd eventchn.handle
@@ -31,3 +40,8 @@ let unbind eventchn port = Xeneventchn.unbind eventchn.handle port
 let notify eventchn port = Xeneventchn.notify eventchn.handle port
 let pending eventchn = Xeneventchn.pending eventchn.handle
 let unmask eventchn port = Xeneventchn.unmask eventchn.handle port
+
+let dump e chan =
+	Printf.fprintf chan "evtchn-dev,%d,%d\n"
+		       (Utils.FD.to_int @@ Xeneventchn.fd e.handle)
+		       (Xeneventchn.to_int e.domexc)
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index 2ea940d7e2d5..ad2e0fa70f4a 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -572,7 +572,7 @@ let do_introduce con t domains cons data =
 				Domain.rebind_evtchn edom remote_port;
 			edom
 		else try
-			let ndom = Domains.create domains domid mfn remote_port in
+			let ndom = Domains.create ~remote_port domains domid mfn in
 			Connections.add_domain cons ndom;
 			Connections.fire_spec_watches (Transaction.get_root t) cons Store.Path.introduce_domain;
 			ndom
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 1f11f576b515..f526f4fb2310 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -144,7 +144,7 @@ exception Bad_format of string
 
 let dump_format_header = "$xenstored-dump-format"
 
-let from_channel_f chan global_f socket_f domain_f watch_f store_f =
+let from_channel_f chan global_f evtchn_f socket_f domain_f watch_f store_f =
 	let unhexify s = Utils.unhexify s in
 	let getpath s =
 		let u = Utils.unhexify s in
@@ -165,12 +165,19 @@ let from_channel_f chan global_f socket_f domain_f watch_f store_f =
 					(* there might be more parameters here,
 					   e.g. a RO socket from a previous version: ignore it *)
 					global_f ~rw
+				| "evtchn-dev" :: fd :: domexc_port :: [] ->
+					evtchn_f ~fd:(int_of_string fd)
+						 ~domexc_port:(int_of_string domexc_port)
 				| "socket" :: fd :: [] ->
 					socket_f ~fd:(int_of_string fd)
-				| "dom" :: domid :: mfn :: remote_port :: []->
-					domain_f (int_of_string domid)
-					         (Nativeint.of_string mfn)
-					         (int_of_string remote_port)
+				| "dom" :: domid :: mfn :: remote_port :: rest ->
+					let local_port = match rest with
+						  | [] -> None (* backward compat: old version didn't have it *)
+						  | local_port :: _ -> Some (int_of_string local_port) in
+					domain_f ?local_port
+						 ~remote_port:(int_of_string remote_port)
+						 (int_of_string domid)
+						 (Nativeint.of_string mfn)
 				| "watch" :: domid :: path :: token :: [] ->
 					watch_f (int_of_string domid)
 					        (unhexify path) (unhexify token)
@@ -189,10 +196,21 @@ let from_channel_f chan global_f socket_f domain_f watch_f store_f =
 	done;
 	info "Completed loading xenstore dump"
 
-let from_channel store cons doms chan =
+let from_channel store cons domains_init chan =
 	(* don't let the permission get on our way, full perm ! *)
 	let op = Store.get_ops store Perms.Connection.full_rights in
 	let rwro = ref (None) in
+	let doms = ref (None) in
+
+	let require_doms () =
+		match !doms with
+		| None ->
+			warn "No event channel file descriptor available in dump!";
+		        let domains = domains_init @@ Event.init () in
+		        doms := Some domains;
+		        domains
+		| Some d -> d
+	in
 	let global_f ~rw =
 		let get_listen_sock sockfd =
 			let fd = sockfd |> int_of_string |> Utils.FD.of_int in
@@ -201,6 +219,10 @@ let from_channel store cons doms chan =
 		in
 		rwro := get_listen_sock rw
 	in
+	let evtchn_f ~fd ~domexc_port =
+		let evtchn = Event.init ~fd ~domexc_port () in
+		doms := Some(domains_init evtchn)
+	in
 	let socket_f ~fd =
 		let ufd = Utils.FD.of_int fd in
 		let is_valid = try (Unix.fstat ufd).Unix.st_kind = Unix.S_SOCK with _ -> false in
@@ -209,12 +231,13 @@ let from_channel store cons doms chan =
 		else
 			warn "Ignoring invalid socket FD %d" fd
 	in
-	let domain_f domid mfn remote_port =
+	let domain_f ?local_port ~remote_port domid mfn =
+		let doms = require_doms () in
 		let ndom =
 			if domid > 0 then
-				Domains.create doms domid mfn remote_port
+				Domains.create ?local_port ~remote_port doms domid mfn
 			else
-				Domains.create0 doms
+				Domains.create0 ?local_port doms
 			in
 		Connections.add_domain cons ndom;
 		in
@@ -229,8 +252,8 @@ let from_channel store cons doms chan =
 		op.Store.write path value;
 		op.Store.setperms path perms
 		in
-	from_channel_f chan global_f socket_f domain_f watch_f store_f;
-	!rwro
+	from_channel_f chan global_f evtchn_f socket_f domain_f watch_f store_f;
+	!rwro, require_doms ()
 
 let from_file store cons doms file =
 	info "Loading xenstore dump from %s" file;
@@ -238,7 +261,7 @@ let from_file store cons doms file =
 	finally (fun () -> from_channel store doms cons channel)
 	        (fun () -> close_in channel)
 
-let to_channel store cons rw chan =
+let to_channel store cons (rw, evtchn) chan =
 	let hexify s = Utils.hexify s in
 
 	fprintf chan "%s\n" dump_format_header;
@@ -248,6 +271,9 @@ let to_channel store cons rw chan =
 		Utils.FD.to_int fd in
 	fprintf chan "global,%d\n" (fdopt rw);
 
+	(* dump evtchn device info *)
+	Event.dump evtchn chan;
+
 	(* dump connections related to domains: domid, mfn, eventchn port/ sockets, and watches *)
 	Connections.iter cons (fun con -> Connection.dump con chan);
 
@@ -367,7 +393,6 @@ let _ =
 	| None         -> () end;
 
 	let store = Store.create () in
-	let eventchn = Event.init () in
 	let next_frequent_ops = ref 0. in
 	let advance_next_frequent_ops () =
 		next_frequent_ops := (Unix.gettimeofday () +. !Define.conflict_max_history_seconds)
@@ -375,16 +400,8 @@ let _ =
 	let delay_next_frequent_ops_by duration =
 		next_frequent_ops := !next_frequent_ops +. duration
 	in
-	let domains = Domains.init eventchn advance_next_frequent_ops in
+	let domains_init eventchn = Domains.init eventchn advance_next_frequent_ops in
 
-	(* For things that need to be done periodically but more often
-	 * than the periodic_ops function *)
-	let frequent_ops () =
-		if Unix.gettimeofday () > !next_frequent_ops then (
-			History.trim ();
-			Domains.incr_conflict_credit domains;
-			advance_next_frequent_ops ()
-		) in
 	let cons = Connections.create () in
 
 	let quit = ref false in
@@ -393,14 +410,15 @@ let _ =
 	List.iter (fun path ->
 		Store.write store Perms.Connection.full_rights path "") Store.Path.specials;
 
-	let rw_sock =
+	let rw_sock, domains =
 	if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
-		let rwro = DB.from_file store domains cons Disk.xs_daemon_database in
+		let rw, domains = DB.from_file store domains_init cons Disk.xs_daemon_database in
 		info "Live reload: database loaded";
 		Process.LiveUpdate.completed ();
-		rwro
+		rw, domains
 	) else (
 		info "No live reload: regular startup";
+		let domains = domains_init @@ Event.init () in
 		if !Disk.enable then (
 			info "reading store from disk";
 			Disk.read store
@@ -413,9 +431,18 @@ let _ =
 		if cf.domain_init then (
 			Connections.add_domain cons (Domains.create0 domains);
 		);
-		rw_sock
+		rw_sock, domains
 	) in
 
+	(* For things that need to be done periodically but more often
+	 * than the periodic_ops function *)
+	let frequent_ops () =
+		if Unix.gettimeofday () > !next_frequent_ops then (
+			History.trim ();
+			Domains.incr_conflict_credit domains;
+			advance_next_frequent_ops ()
+		) in
+
 	(* required for xenstore-control to detect availability of live-update *)
 	let tool_path = Store.Path.of_string "/tool" in
 	if not (Store.path_exists store tool_path) then
@@ -430,8 +457,10 @@ let _ =
 	Sys.set_signal Sys.sigusr1 (Sys.Signal_handle (fun _ -> sigusr1_handler store));
 	Sys.set_signal Sys.sigpipe Sys.Signal_ignore;
 
+	let eventchn = Domains.eventchn domains in
+
 	if cf.activate_access_log then begin
-		let post_rotate () = DB.to_file store cons (None) Disk.xs_daemon_database in
+		let post_rotate () = DB.to_file store cons (None, eventchn) Disk.xs_daemon_database in
 		Logging.init_access_log post_rotate
 	end;
 
@@ -593,7 +622,7 @@ let _ =
 			live_update := Process.LiveUpdate.should_run cons;
 			if !live_update || !quit then begin
 				(* don't initiate live update if saving state fails *)
-				DB.to_file store cons (rw_sock) Disk.xs_daemon_database;
+				DB.to_file store cons (rw_sock, eventchn) Disk.xs_daemon_database;
 				quit := true;
 			end
 		with exc ->
-- 
2.11.0
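
For readers following the dump-format change above: the "dom" record gains an optional trailing local port, and records written by older versions (three fields only) must still load. A minimal standalone sketch of that match is below. It assumes the comma-separated layout written by Domain.dump; parse_dom_line is a hypothetical helper for illustration, not a function in xenstored.

```ocaml
(* Hypothetical helper: parse one "dom" record from a live-update dump.
   Returns (domid, mfn, remote_port, local_port option), or None when the
   line is not a "dom" record. *)
let parse_dom_line line =
  match String.split_on_char ',' line with
  | "dom" :: domid :: mfn :: remote_port :: rest ->
    let local_port = match rest with
      | [] -> None                        (* old dump: no local port field *)
      | p :: _ -> Some (int_of_string p)  (* new dump: local port present *)
    in
    Some (int_of_string domid,
          Nativeint.of_string mfn,
          int_of_string remote_port,
          local_port)
  | _ -> None

let () =
  (* Old-style record, as produced by "dom,%d,%nd,%d\n" *)
  assert (parse_dom_line "dom,5,1234,7" = Some (5, 1234n, 7, None));
  (* New-style record with a trailing local port *)
  assert (parse_dom_line "dom,5,1234,7,12" = Some (5, 1234n, 7, Some 12));
  (* Unrelated records are left for other handlers *)
  assert (parse_dom_line "socket,9" = None)
```

Matching on `rest` rather than on a fixed-length list is what keeps the old three-field records valid.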




* Re: [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain
  2022-11-30 16:54 ` [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain Andrew Cooper
@ 2022-11-30 17:14   ` Edwin Torok
  2022-12-01 11:11   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-11-30 17:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Christian Lindig, David Scott, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> This file has some style problems so severe that they interfere with the
> readability of the subsequent bugfix patches.
> 
> Fix these issues ahead of time, to make the subsequent changes more readable.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>


Reviewed-by: Edwin Török <edvin.torok@citrix.com>

> 
> v2:
> * New
> ---
> tools/ocaml/xenstored/domain.ml | 16 +++++++---------
> 1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index 81cb59b8f1a2..ab08dcf37f62 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -57,17 +57,16 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 0.0
> let is_free_to_conflict = is_dom0
> 
> let string_of_port = function
> -| None -> "None"
> -| Some x -> string_of_int (Xeneventchn.to_int x)
> + | None -> "None"
> + | Some x -> string_of_int (Xeneventchn.to_int x)

I would've expected ocp-indent to already do the right thing on this part.

> 
> let dump d chan =
> fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> 
> -let notify dom = match dom.port with
> -| None ->
> - warn "domain %d: attempt to notify on unknown port" dom.id
> -| Some port ->
> - Event.notify dom.eventchn port
> +let notify dom =
> + match dom.port with
> + | None -> warn "domain %d: attempt to notify on unknown port" dom.id
> + | Some port -> Event.notify dom.eventchn port

But yes, for this we'd need ocamlformat, not ocp-indent.

> 
> let bind_interdomain dom =
> begin match dom.port with
> @@ -84,8 +83,7 @@ let close dom =
> | None -> ()
> | Some port -> Event.unbind dom.eventchn port
> end;
> - Xenmmap.unmap dom.interface;
> - ()
> + Xenmmap.unmap dom.interface
> 
> let make id mfn remote_port interface eventchn = {
> id = id;
> -- 
> 2.11.0
> 



* Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn
  2022-11-30 16:54 ` [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn Andrew Cooper
@ 2022-11-30 17:15   ` Edwin Torok
  2022-12-01 11:20   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-11-30 17:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Christian Lindig, David Scott, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> Generally speaking, the event channel local/remote port is fixed for the
> lifetime of the associated domain object.  The exception to this is a
> secondary XS_INTRODUCE (defined to re-bind to a new event channel) which pokes
> around at the domain object's internal state.
> 
> We need to refactor the evtchn handling to support live update, so start by
> moving the relevant manipulation into Domain.
> 
> No practical change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>
> 
> Note: This change deliberately doesn't reuse Domain.bind_interdomain, which is
> removed by the end of the refactoring.


Reviewed-by: Edwin Török <edvin.torok@citrix.com>

> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/domain.ml  | 12 ++++++++++++
> tools/ocaml/xenstored/process.ml |  6 ++----
> 2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index ab08dcf37f62..d59a9401e211 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -63,6 +63,18 @@ let string_of_port = function
> let dump d chan =
> fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> 
> +let rebind_evtchn d remote_port =
> + begin match d.port with
> + | None -> ()
> + | Some p -> Event.unbind d.eventchn p
> + end;
> + let local = Event.bind_interdomain d.eventchn d.id remote_port in
> + debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
> +      d.id (string_of_port d.port) d.remote_port
> +      (Xeneventchn.to_int local) remote_port;
> + d.remote_port <- remote_port;
> + d.port <- Some (local)
> +
> let notify dom =
> match dom.port with
> | None -> warn "domain %d: attempt to notify on unknown port" dom.id
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index b2973aca2a82..2ea940d7e2d5 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -567,11 +567,9 @@ let do_introduce con t domains cons data =
> let dom =
> if Domains.exist domains domid then
> let edom = Domains.find domains domid in
> - if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
> + if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then
> (* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> - edom.remote_port <- remote_port;
> - Domain.bind_interdomain edom;
> - end;
> + Domain.rebind_evtchn edom remote_port;
> edom
> else try
> let ndom = Domains.create domains domid mfn remote_port in
> -- 
> 2.11.0
> 



* Re: [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port'
  2022-11-30 16:54 ` [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port' Andrew Cooper
@ 2022-11-30 17:16   ` Edwin Torok
  2022-12-01 11:26   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-11-30 17:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Christian Lindig, David Scott, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> This will make the logic clearer when we plumb local_port through these
> functions.
> 
> While changing this, simplify the construct in Domains.create0 to separate the
> remote port handling from the interface.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>

We've reviewed this change in-person:
Reviewed-by: Edwin Török <edvin.torok@citrix.com>


> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/domains.ml   | 26 ++++++++++++--------------
> tools/ocaml/xenstored/process.ml   | 12 ++++++------
> tools/ocaml/xenstored/xenstored.ml |  8 ++++----
> 3 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
> index 17fe2fa25772..26018ac0dd3d 100644
> --- a/tools/ocaml/xenstored/domains.ml
> +++ b/tools/ocaml/xenstored/domains.ml
> @@ -122,9 +122,9 @@ let cleanup doms =
> let resume _doms _domid =
> ()
> 
> -let create doms domid mfn port =
> +let create doms domid mfn remote_port =
> let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
> - let dom = Domain.make domid mfn port interface doms.eventchn in
> + let dom = Domain.make domid mfn remote_port interface doms.eventchn in
> Hashtbl.add doms.table domid dom;
> Domain.bind_interdomain dom;
> dom
> @@ -133,18 +133,16 @@ let xenstored_kva = ref ""
> let xenstored_port = ref ""
> 
> let create0 doms =
> - let port, interface =
> - (
> - let port = Utils.read_file_single_integer !xenstored_port
> - and fd = Unix.openfile !xenstored_kva
> -       [ Unix.O_RDWR ] 0o600 in
> - let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED
> -  (Xenmmap.getpagesize()) 0 in
> - Unix.close fd;
> - port, interface
> - )
> - in
> - let dom = Domain.make 0 Nativeint.zero port interface doms.eventchn in
> + let remote_port = Utils.read_file_single_integer !xenstored_port in
> +
> + let interface =
> + let fd = Unix.openfile !xenstored_kva [ Unix.O_RDWR ] 0o600 in
> + let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED (Xenmmap.getpagesize()) 0 in
> + Unix.close fd;
> + interface
> + in
> +
> + let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
> Hashtbl.add doms.table 0 dom;
> Domain.bind_interdomain dom;
> Domain.notify dom;
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index 72a79e9328dd..b2973aca2a82 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -558,10 +558,10 @@ let do_transaction_end con t domains cons data =
> let do_introduce con t domains cons data =
> if not (Connection.is_dom0 con)
> then raise Define.Permission_denied;
> - let (domid, mfn, port) =
> + let (domid, mfn, remote_port) =
> match (split None '\000' data) with
> - | domid :: mfn :: port :: _ ->
> - int_of_string domid, Nativeint.of_string mfn, int_of_string port
> + | domid :: mfn :: remote_port :: _ ->
> + int_of_string domid, Nativeint.of_string mfn, int_of_string remote_port
> | _                         -> raise Invalid_Cmd_Args;
> in
> let dom =
> @@ -569,18 +569,18 @@ let do_introduce con t domains cons data =
> let edom = Domains.find domains domid in
> if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
> (* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> - edom.remote_port <- port;
> + edom.remote_port <- remote_port;
> Domain.bind_interdomain edom;
> end;
> edom
> else try
> - let ndom = Domains.create domains domid mfn port in
> + let ndom = Domains.create domains domid mfn remote_port in
> Connections.add_domain cons ndom;
> Connections.fire_spec_watches (Transaction.get_root t) cons Store.Path.introduce_domain;
> ndom
> with _ -> raise Invalid_Cmd_Args
> in
> - if (Domain.get_remote_port dom) <> port || (Domain.get_mfn dom) <> mfn then
> + if (Domain.get_remote_port dom) <> remote_port || (Domain.get_mfn dom) <> mfn then
> raise Domain_not_match
> 
> let do_release con t domains cons data =
> diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
> index 55071b49eccb..1f11f576b515 100644
> --- a/tools/ocaml/xenstored/xenstored.ml
> +++ b/tools/ocaml/xenstored/xenstored.ml
> @@ -167,10 +167,10 @@ let from_channel_f chan global_f socket_f domain_f watch_f store_f =
> global_f ~rw
> | "socket" :: fd :: [] ->
> socket_f ~fd:(int_of_string fd)
> - | "dom" :: domid :: mfn :: port :: []->
> + | "dom" :: domid :: mfn :: remote_port :: []->
> domain_f (int_of_string domid)
>         (Nativeint.of_string mfn)
> -         (int_of_string port)
> +         (int_of_string remote_port)
> | "watch" :: domid :: path :: token :: [] ->
> watch_f (int_of_string domid)
>        (unhexify path) (unhexify token)
> @@ -209,10 +209,10 @@ let from_channel store cons doms chan =
> else
> warn "Ignoring invalid socket FD %d" fd
> in
> - let domain_f domid mfn port =
> + let domain_f domid mfn remote_port =
> let ndom =
> if domid > 0 then
> - Domains.create doms domid mfn port
> + Domains.create doms domid mfn remote_port
> else
> Domains.create0 doms
> in
> -- 
> 2.11.0
> 



* Re: [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in in Event.init()
  2022-11-30 16:54 ` [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in in Event.init() Andrew Cooper
@ 2022-11-30 17:16   ` Edwin Torok
  2022-12-01 11:27   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-11-30 17:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Christian Lindig, David Scott, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> Xenstored always needs to bind the DOM_EXC VIRQ.
> 
> Instead of doing it shortly after the call to Event.init(), do it in the
> init() call itself.  This removes the need for the field to be a mutable
> option.
> 
> It will also simplify a future change to restore both parts from the live
> update record, rather than re-initialising them from scratch.
> 
> Rename the field from virq_port (which could be any VIRQ) to its proper name.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>


Reviewed in person:

Reviewed-by: Edwin Török <edvin.torok@citrix.com>

> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/event.ml     | 9 ++++++---
> tools/ocaml/xenstored/xenstored.ml | 4 +---
> 2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/event.ml b/tools/ocaml/xenstored/event.ml
> index ccca90b6fc4f..a3be296374ff 100644
> --- a/tools/ocaml/xenstored/event.ml
> +++ b/tools/ocaml/xenstored/event.ml
> @@ -17,12 +17,15 @@
> (**************** high level binding ****************)
> type t = {
> handle: Xeneventchn.handle;
> - mutable virq_port: Xeneventchn.t option;
> + domexc: Xeneventchn.t;
> }
> 
> -let init () = { handle = Xeneventchn.init (); virq_port = None; }
> +let init () =
> + let handle = Xeneventchn.init () in
> + let domexc = Xeneventchn.bind_dom_exc_virq handle in
> + { handle; domexc }
> +
> let fd eventchn = Xeneventchn.fd eventchn.handle
> -let bind_dom_exc_virq eventchn = eventchn.virq_port <- Some (Xeneventchn.bind_dom_exc_virq eventchn.handle)
> let bind_interdomain eventchn domid port = Xeneventchn.bind_interdomain eventchn.handle domid port
> let unbind eventchn port = Xeneventchn.unbind eventchn.handle port
> let notify eventchn port = Xeneventchn.notify eventchn.handle port
> diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
> index c5dc7a28d082..55071b49eccb 100644
> --- a/tools/ocaml/xenstored/xenstored.ml
> +++ b/tools/ocaml/xenstored/xenstored.ml
> @@ -397,7 +397,6 @@ let _ =
> if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
> let rwro = DB.from_file store domains cons Disk.xs_daemon_database in
> info "Live reload: database loaded";
> - Event.bind_dom_exc_virq eventchn;
> Process.LiveUpdate.completed ();
> rwro
> ) else (
> @@ -413,7 +412,6 @@ let _ =
> 
> if cf.domain_init then (
> Connections.add_domain cons (Domains.create0 domains);
> - Event.bind_dom_exc_virq eventchn
> );
> rw_sock
> ) in
> @@ -451,7 +449,7 @@ let _ =
> let port = Event.pending eventchn in
> debug "pending port %d" (Xeneventchn.to_int port);
> finally (fun () ->
> - if Some port = eventchn.Event.virq_port then (
> + if port = eventchn.Event.domexc then (
> let (notify, deaddom) = Domains.cleanup domains in
> List.iter (Store.reset_permissions store) deaddom;
> List.iter (Connections.del_domain cons) deaddom;
> -- 
> 2.11.0
> 



* Re: [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair
  2022-11-30 16:54 ` [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair Andrew Cooper
@ 2022-11-30 17:17   ` Edwin Torok
  2022-12-01 11:59   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-11-30 17:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Christian Lindig, David Scott, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> Inter-domain event channels are always a pair of local and remote ports.
> Right now the handling is asymmetric, caused by the fact that the evtchn is
> bound after the associated Domain object is constructed.
> 
> First, move binding of the event channel into the Domain.make() constructor.
> This means the local port no longer needs to be an option.  It also removes
> the final callers of Domain.bind_interdomain.
> 
> Next, introduce a new port_pair type to encapsulate the fact that these two
> should be updated together, and replace the previous port and remote_port
> fields.  This refactoring also changes the Domain.get_port interface (removing
> an option) so take the opportunity to name it get_local_port instead.
> 
> Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn has
> been unbound, the same local port number can be reused for a different
> purpose, so explicitly invalidate the ports to prevent their accidental misuse
> in the future.
> 
> This also cleans up some of the debugging, to always print a port pair.

Reviewed in person. I've suggested using explicit labeled arguments where multiple integers with very close semantic meaning are passed as arguments (e.g. local vs remote port): it would be quite easy to accidentally swap them in the caller, leading to bugs.
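
To illustrate the labeled-argument point, here is a minimal standalone sketch. The type and function names are hypothetical and do not match the real Domain.make signature; the only point is that two plain ints can be swapped silently when passed positionally, while labels make each call site self-describing.

```ocaml
(* Hypothetical stand-in for a pair of event channel ports. *)
type ports = { local : int; remote : int }

(* Positional version: both arguments are plain ints, so a caller that
   swaps them still type-checks. *)
let make_positional local remote = { local; remote }

(* Labelled version: every call site names each port, and the order in
   which the arguments are written no longer matters. *)
let make_labelled ~local_port ~remote_port =
  { local = local_port; remote = remote_port }

let () =
  let a = make_labelled ~local_port:3 ~remote_port:7 in
  let b = make_labelled ~remote_port:7 ~local_port:3 in
  (* Same result regardless of argument order. *)
  assert (a = b);
  (* The positional form accepts swapped arguments without complaint. *)
  assert (make_positional 7 3 <> make_positional 3 7);
  Printf.printf "(l %d, r %d)\n" a.local a.remote
```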


Reviewed-by: Edwin Török <edvin.torok@citrix.com>

> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>
> 
> v2:
> * New
> ---
> tools/ocaml/xenstored/connections.ml |  9 +----
> tools/ocaml/xenstored/domain.ml      | 75 ++++++++++++++++++------------------
> tools/ocaml/xenstored/domains.ml     |  2 -
> 3 files changed, 39 insertions(+), 47 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
> index 7d68c583b43a..a80ae0bed2ce 100644
> --- a/tools/ocaml/xenstored/connections.ml
> +++ b/tools/ocaml/xenstored/connections.ml
> @@ -48,9 +48,7 @@ let add_domain cons dom =
> let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun () -> Domain.notify dom) in
> let con = Connection.create xbcon (Some dom) in
> Hashtbl.add cons.domains (Domain.get_id dom) con;
> - match Domain.get_port dom with
> - | Some p -> Hashtbl.add cons.ports p con;
> - | None -> ()
> + Hashtbl.add cons.ports (Domain.get_local_port dom) con
> 
> let select ?(only_if = (fun _ -> true)) cons =
> Hashtbl.fold (fun _ con (ins, outs) ->
> @@ -97,10 +95,7 @@ let del_domain cons id =
> let con = find_domain cons id in
> Hashtbl.remove cons.domains id;
> (match Connection.get_domain con with
> - | Some d ->
> -   (match Domain.get_port d with
> -    | Some p -> Hashtbl.remove cons.ports p
> -    | None -> ())
> + | Some d -> Hashtbl.remove cons.ports (Domain.get_local_port d)
> | None -> ());
> del_watches cons con;
> Connection.close con
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index d59a9401e211..ecdd65f3209a 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -19,14 +19,31 @@ open Printf
> let debug fmt = Logging.debug "domain" fmt
> let warn  fmt = Logging.warn  "domain" fmt
> 
> +(* An event channel port pair.  The remote port, and the local port it is
> +   bound to. *)
> +type port_pair =
> +{
> + local: Xeneventchn.t;
> + remote: int;
> +}
> +
> +(* Sentinal port_pair with both set to EVTCHN_INVALID *)
> +let invalid_ports =
> +{
> + local = Xeneventchn.of_int 0;
> + remote = 0
> +}
> +
> +let string_of_port_pair p =
> + sprintf "(l %d, r %d)" (Xeneventchn.to_int p.local) p.remote
> +
> type t =
> {
> id: Xenctrl.domid;
> mfn: nativeint;
> interface: Xenmmap.mmap_interface;
> eventchn: Event.t;
> - mutable remote_port: int;
> - mutable port: Xeneventchn.t option;
> + mutable ports: port_pair;
> mutable bad_client: bool;
> mutable io_credit: int; (* the rounds of ring process left to do, default is 0,
>                           usually set to 1 when there is work detected, could
> @@ -41,8 +58,8 @@ let is_dom0 d = d.id = 0
> let get_id domain = domain.id
> let get_interface d = d.interface
> let get_mfn d = d.mfn
> -let get_remote_port d = d.remote_port
> -let get_port d = d.port
> +let get_remote_port d = d.ports.remote
> +let get_local_port d = d.ports.local
> 
> let is_bad_domain domain = domain.bad_client
> let mark_as_bad domain = domain.bad_client <- true
> @@ -56,54 +73,36 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 0.0
> 
> let is_free_to_conflict = is_dom0
> 
> -let string_of_port = function
> - | None -> "None"
> - | Some x -> string_of_int (Xeneventchn.to_int x)
> -
> let dump d chan =
> - fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> + fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.ports.remote
> 
> let rebind_evtchn d remote_port =
> - begin match d.port with
> - | None -> ()
> - | Some p -> Event.unbind d.eventchn p
> - end;
> + Event.unbind d.eventchn d.ports.local;
> let local = Event.bind_interdomain d.eventchn d.id remote_port in
> - debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
> -      d.id (string_of_port d.port) d.remote_port
> -      (Xeneventchn.to_int local) remote_port;
> - d.remote_port <- remote_port;
> - d.port <- Some (local)
> + let ports = { local; remote = remote_port } in
> + debug "domain %d rebind %s => %s"
> +      d.id (string_of_port_pair d.ports) (string_of_port_pair ports);
> + d.ports <- ports
> 
> let notify dom =
> - match dom.port with
> - | None -> warn "domain %d: attempt to notify on unknown port" dom.id
> - | Some port -> Event.notify dom.eventchn port
> -
> -let bind_interdomain dom =
> - begin match dom.port with
> - | None -> ()
> - | Some port -> Event.unbind dom.eventchn port
> - end;
> - dom.port <- Some (Event.bind_interdomain dom.eventchn dom.id dom.remote_port);
> - debug "bound domain %d remote port %d to local port %s" dom.id dom.remote_port (string_of_port dom.port)
> -
> + Event.notify dom.eventchn dom.ports.local
> 
> let close dom =
> - debug "domain %d unbound port %s" dom.id (string_of_port dom.port);
> - begin match dom.port with
> - | None -> ()
> - | Some port -> Event.unbind dom.eventchn port
> - end;
> + debug "domain %d unbind %s" dom.id (string_of_port_pair dom.ports);
> + Event.unbind dom.eventchn dom.ports.local;
> + dom.ports <- invalid_ports;
> Xenmmap.unmap dom.interface
> 
> -let make id mfn remote_port interface eventchn = {
> +let make id mfn remote_port interface eventchn =
> + let local = Event.bind_interdomain eventchn id remote_port in
> + let ports = { local; remote = remote_port } in
> + debug "domain %d bind %s" id (string_of_port_pair ports);
> +{
> id = id;
> mfn = mfn;
> - remote_port = remote_port;
> + ports;
> interface = interface;
> eventchn = eventchn;
> - port = None;
> bad_client = false;
> io_credit = 0;
> conflict_credit = !Define.conflict_burst_limit;
> diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
> index 26018ac0dd3d..2ab0c5f4d8d0 100644
> --- a/tools/ocaml/xenstored/domains.ml
> +++ b/tools/ocaml/xenstored/domains.ml
> @@ -126,7 +126,6 @@ let create doms domid mfn remote_port =
> let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
> let dom = Domain.make domid mfn remote_port interface doms.eventchn in
> Hashtbl.add doms.table domid dom;
> - Domain.bind_interdomain dom;
> dom
> 
> let xenstored_kva = ref ""
> @@ -144,7 +143,6 @@ let create0 doms =
> 
> let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
> Hashtbl.add doms.table 0 dom;
> - Domain.bind_interdomain dom;
> Domain.notify dom;
> dom
> 
> -- 
> 2.11.0
> 



* Re: [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain
  2022-11-30 16:54 ` [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain Andrew Cooper
  2022-11-30 17:14   ` Edwin Torok
@ 2022-12-01 11:11   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Christian Lindig @ 2022-12-01 11:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> This file has some style problems so severe that they interfere with the
> readability of the subsequent bugfix patches.
> 
> Fix these issues ahead of time, to make the subsequent changes more readable.


Acked-by: Christian Lindig <christian.lindig@citrix.com>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn
  2022-11-30 16:54 ` [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn Andrew Cooper
  2022-11-30 17:15   ` Edwin Torok
@ 2022-12-01 11:20   ` Christian Lindig
  2022-12-01 12:10     ` Andrew Cooper
  1 sibling, 1 reply; 23+ messages in thread
From: Christian Lindig @ 2022-12-01 11:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> Generally speaking, the event channel local/remote port is fixed for the
> lifetime of the associated domain object.  The exception to this is a
> secondary XS_INTRODUCE (defined to re-bind to a new event channel) which pokes
> around at the domain object's internal state.
> 
> We need to refactor the evtchn handling to support live update, so start by
> moving the relevant manipulation into Domain.
> 
> No practical change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>

Acked-by: Christian Lindig <christian.lindig@citrix.com>

The code makes changes around if-expressions, and in the presence of sequential code it is easy to be misled by indentation about which parts are covered by an if and which are not. I would be more confident about this with automatic formatting (but I also believe this is correct).

— C




> Note: This change deliberately doesn't reuse Domain.bind_interdomain, which is
> removed by the end of the refactoring.
> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/domain.ml  | 12 ++++++++++++
> tools/ocaml/xenstored/process.ml |  6 ++----
> 2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index ab08dcf37f62..d59a9401e211 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -63,6 +63,18 @@ let string_of_port = function
> let dump d chan =
> 	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> 
> +let rebind_evtchn d remote_port =
> +	begin match d.port with
> +	| None -> ()
> +	| Some p -> Event.unbind d.eventchn p
> +	end;
> +	let local = Event.bind_interdomain d.eventchn d.id remote_port in
> +	debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
> +	      d.id (string_of_port d.port) d.remote_port
> +	      (Xeneventchn.to_int local) remote_port;
> +	d.remote_port <- remote_port;
> +	d.port <- Some (local)
> +
> let notify dom =
> 	match dom.port with
> 	| None -> warn "domain %d: attempt to notify on unknown port" dom.id
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index b2973aca2a82..2ea940d7e2d5 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -567,11 +567,9 @@ let do_introduce con t domains cons data =
> 	let dom =
> 		if Domains.exist domains domid then
> 			let edom = Domains.find domains domid in
> -			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
> +			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then
> 				(* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> -				edom.remote_port <- remote_port;
> -				Domain.bind_interdomain edom;
> -			end;
> +				Domain.rebind_evtchn edom remote_port;
> 			edom
> 		else try
> 			let ndom = Domains.create domains domid mfn remote_port in
> -- 
> 2.11.0
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port'
  2022-11-30 16:54 ` [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port' Andrew Cooper
  2022-11-30 17:16   ` Edwin Torok
@ 2022-12-01 11:26   ` Christian Lindig
  2022-12-01 12:02     ` Andrew Cooper
  1 sibling, 1 reply; 23+ messages in thread
From: Christian Lindig @ 2022-12-01 11:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> This will make the logic clearer when we plumb local_port through these
> functions.
> 
> While changing this, simplify the construct in Domains.create0 to separate the
> remote port handling from the interface.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>

Acked-by: Christian Lindig <christian.lindig@citrix.com>


> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/domains.ml   | 26 ++++++++++++--------------
> tools/ocaml/xenstored/process.ml   | 12 ++++++------
> tools/ocaml/xenstored/xenstored.ml |  8 ++++----
> 3 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
> index 17fe2fa25772..26018ac0dd3d 100644
> --- a/tools/ocaml/xenstored/domains.ml
> +++ b/tools/ocaml/xenstored/domains.ml
> @@ -122,9 +122,9 @@ let cleanup doms =
> let resume _doms _domid =
> 	()
> 
> -let create doms domid mfn port =
> +let create doms domid mfn remote_port =
> 	let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
> -	let dom = Domain.make domid mfn port interface doms.eventchn in
> +	let dom = Domain.make domid mfn remote_port interface doms.eventchn in
> 	Hashtbl.add doms.table domid dom;
> 	Domain.bind_interdomain dom;
> 	dom
> @@ -133,18 +133,16 @@ let xenstored_kva = ref ""
> let xenstored_port = ref ""
> 
> let create0 doms =
> -	let port, interface =
> -		(
> -			let port = Utils.read_file_single_integer !xenstored_port
> -			and fd = Unix.openfile !xenstored_kva
> -					       [ Unix.O_RDWR ] 0o600 in
> -			let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED
> -						  (Xenmmap.getpagesize()) 0 in
> -			Unix.close fd;
> -			port, interface
> -		)
> -		in
> -	let dom = Domain.make 0 Nativeint.zero port interface doms.eventchn in
> +	let remote_port = Utils.read_file_single_integer !xenstored_port in
> +
> +	let interface =
> +		let fd = Unix.openfile !xenstored_kva [ Unix.O_RDWR ] 0o600 in
> +		let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED (Xenmmap.getpagesize()) 0 in

Can we be sure that this never throws an exception, so that the close can't be missed? Otherwise Fun.protect (or an equivalent) should be used.
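
For illustration only, a minimal sketch of that Fun.protect pattern (available since OCaml 4.08). It is not the oxenstored code: a temporary file and the standard library's channels are hypothetical stand-ins for Unix.openfile and Xenmmap.mmap, and close_count exists only to make the cleanup observable.

```ocaml
(* Counts how often the cleanup ran, purely so the effect is observable. *)
let close_count = ref 0

(* Open [path], pass the channel to [f], and always close it afterwards,
   whether [f] returns normally or raises. *)
let with_input_file path f =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic; incr close_count)
    (fun () -> f ic)

(* Simulate the mapping step failing after the file was opened. *)
let outcome =
  let path = Filename.temp_file "fun_protect" ".tmp" in
  let result =
    try with_input_file path (fun _ic -> failwith "simulated mmap failure")
    with Failure msg -> msg
  in
  Sys.remove path;
  result
```

The exception still propagates to the caller, but the channel is closed first, so the descriptor cannot leak on the error path.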

> +		Unix.close fd;
> +		interface
> +	in
> +
> +	let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
> 	Hashtbl.add doms.table 0 dom;
> 	Domain.bind_interdomain dom;
> 	Domain.notify dom;
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index 72a79e9328dd..b2973aca2a82 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -558,10 +558,10 @@ let do_transaction_end con t domains cons data =
> let do_introduce con t domains cons data =
> 	if not (Connection.is_dom0 con)
> 	then raise Define.Permission_denied;
> -	let (domid, mfn, port) =
> +	let (domid, mfn, remote_port) =
> 		match (split None '\000' data) with
> -		| domid :: mfn :: port :: _ ->
> -			int_of_string domid, Nativeint.of_string mfn, int_of_string port
> +		| domid :: mfn :: remote_port :: _ ->
> +			int_of_string domid, Nativeint.of_string mfn, int_of_string remote_port
> 		| _                         -> raise Invalid_Cmd_Args;
> 		in
> 	let dom =
> @@ -569,18 +569,18 @@ let do_introduce con t domains cons data =
> 			let edom = Domains.find domains domid in
> 			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
> 				(* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> -				edom.remote_port <- port;
> +				edom.remote_port <- remote_port;
> 				Domain.bind_interdomain edom;
> 			end;
> 			edom
> 		else try
> -			let ndom = Domains.create domains domid mfn port in
> +			let ndom = Domains.create domains domid mfn remote_port in
> 			Connections.add_domain cons ndom;
> 			Connections.fire_spec_watches (Transaction.get_root t) cons Store.Path.introduce_domain;
> 			ndom
> 		with _ -> raise Invalid_Cmd_Args
> 	in
> -	if (Domain.get_remote_port dom) <> port || (Domain.get_mfn dom) <> mfn then
> +	if (Domain.get_remote_port dom) <> remote_port || (Domain.get_mfn dom) <> mfn then
> 		raise Domain_not_match
> 
> let do_release con t domains cons data =
> diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
> index 55071b49eccb..1f11f576b515 100644
> --- a/tools/ocaml/xenstored/xenstored.ml
> +++ b/tools/ocaml/xenstored/xenstored.ml
> @@ -167,10 +167,10 @@ let from_channel_f chan global_f socket_f domain_f watch_f store_f =
> 					global_f ~rw
> 				| "socket" :: fd :: [] ->
> 					socket_f ~fd:(int_of_string fd)
> -				| "dom" :: domid :: mfn :: port :: []->
> +				| "dom" :: domid :: mfn :: remote_port :: []->
> 					domain_f (int_of_string domid)
> 					         (Nativeint.of_string mfn)
> -					         (int_of_string port)
> +					         (int_of_string remote_port)
> 				| "watch" :: domid :: path :: token :: [] ->
> 					watch_f (int_of_string domid)
> 					        (unhexify path) (unhexify token)
> @@ -209,10 +209,10 @@ let from_channel store cons doms chan =
> 		else
> 			warn "Ignoring invalid socket FD %d" fd
> 	in
> -	let domain_f domid mfn port =
> +	let domain_f domid mfn remote_port =
> 		let ndom =
> 			if domid > 0 then
> -				Domains.create doms domid mfn port
> +				Domains.create doms domid mfn remote_port
> 			else
> 				Domains.create0 doms
> 			in
> -- 
> 2.11.0
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in in Event.init()
  2022-11-30 16:54 ` [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in in Event.init() Andrew Cooper
  2022-11-30 17:16   ` Edwin Torok
@ 2022-12-01 11:27   ` Christian Lindig
  1 sibling, 0 replies; 23+ messages in thread
From: Christian Lindig @ 2022-12-01 11:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> Xenstored always needs to bind the DOM_EXC VIRQ.
> 
> Instead of doing it shortly after the call to Event.init(), do it in the
> init() call itself.  This removes the need for the field to be a mutable
> option.
> 
> It will also simplify a future change to restore both parts from the live
> update record, rather than re-initialising them from scratch.
> 
> Rename the field from virq_port (which could be any VIRQ) to its proper name.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>
> 

Acked-by: Christian Lindig <christian.lindig@citrix.com>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair
  2022-11-30 16:54 ` [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair Andrew Cooper
  2022-11-30 17:17   ` Edwin Torok
@ 2022-12-01 11:59   ` Christian Lindig
  2022-12-01 14:22     ` Andrew Cooper
  1 sibling, 1 reply; 23+ messages in thread
From: Christian Lindig @ 2022-12-01 11:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes



> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> Inter-domain event channels are always a pair of local and remote ports.
> Right now the handling is asymmetric, caused by the fact that the evtchn is
> bound after the associated Domain object is constructed.
> 
> First, move binding of the event channel into the Domain.make() constructor.
> This means the local port no longer needs to be an option.  It also removes
> the final callers of Domain.bind_interdomain.
> 
> Next, introduce a new port_pair type to encapsulate the fact that these two
> should be updated together, and replace the previous port and remote_port
> fields.  This refactoring also changes the Domain.get_port interface (removing
> an option) so take the opportunity to name it get_local_port instead.
> 
> Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn has
> been unbound, the same local port number can be reused for a different
> purpose, so explicitly invalidate the ports to prevent their accidental misuse
> in the future.
> 
> This also cleans up some of the debugging, to always print a port pair.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Christian Lindig <christian.lindig@citrix.com>
> CC: David Scott <dave@recoil.org>
> CC: Edwin Torok <edvin.torok@citrix.com>
> CC: Rob Hoes <Rob.Hoes@citrix.com>

Acked-by: Christian Lindig <christian.lindig@citrix.com>
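
To illustrate the invalidation described in the quoted commit message, here is a self-contained sketch. It is not the oxenstored code: ports are plain ints instead of Xeneventchn.t, and bound, bind, unbind and the lowest-free-port allocation policy are hypothetical stand-ins for the event channel device.

```ocaml
(* Local and remote port are kept together so they are updated as a unit. *)
type port_pair = { local : int; remote : int }

(* Sentinel pair; 0 is treated as the invalid port, as in the quoted patch. *)
let invalid_ports = { local = 0; remote = 0 }

type domain = { id : int; mutable ports : port_pair }

(* Hypothetical model of the event channel device: the set of bound
   local ports.  New bindings get the lowest free port number. *)
let bound : (int, unit) Hashtbl.t = Hashtbl.create 8

let bind () =
  let rec first_free p = if Hashtbl.mem bound p then first_free (p + 1) else p in
  let p = first_free 1 in
  Hashtbl.replace bound p ();
  p

let unbind p = Hashtbl.remove bound p

(* Binding happens in the constructor, so no option type is needed. *)
let make id remote = { id; ports = { local = bind (); remote } }

(* After unbinding, the stale port number is overwritten so it cannot be
   mistaken for whatever reuses that number later. *)
let close d =
  unbind d.ports.local;
  d.ports <- invalid_ports

let d1 = make 1 10
let d1_port_before_close = d1.ports.local
let () = close d1
let d2 = make 2 11 (* reuses the local port number that d1 had *)
```

Without the assignment in close, d1 would still carry the port number that now belongs to d2, which is the accidental misuse the commit message warns about.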

> 
> v2:
> * New
> ---
> tools/ocaml/xenstored/connections.ml |  9 +----
> tools/ocaml/xenstored/domain.ml      | 75 ++++++++++++++++++------------------
> tools/ocaml/xenstored/domains.ml     |  2 -
> 3 files changed, 39 insertions(+), 47 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
> index 7d68c583b43a..a80ae0bed2ce 100644
> --- a/tools/ocaml/xenstored/connections.ml
> +++ b/tools/ocaml/xenstored/connections.ml
> @@ -48,9 +48,7 @@ let add_domain cons dom =
> 	let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun () -> Domain.notify dom) in
> 	let con = Connection.create xbcon (Some dom) in
> 	Hashtbl.add cons.domains (Domain.get_id dom) con;
> -	match Domain.get_port dom with
> -	| Some p -> Hashtbl.add cons.ports p con;
> -	| None -> ()
> +	Hashtbl.add cons.ports (Domain.get_local_port dom) con

I would prefer Hashtbl.replace. Hashtbl.add shadows an existing binding, which becomes visible again after Hashtbl.remove. When we are sure that we only have one binding per key, we should use replace instead of add.
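
A small sketch of the shadowing behaviour, using only the standard library Hashtbl. The key 5 and the strings are arbitrary placeholders, not values from oxenstored.

```ocaml
(* add twice, remove once: the older binding resurfaces. *)
let shadowed =
  let h = Hashtbl.create 8 in
  Hashtbl.add h 5 "old";
  Hashtbl.add h 5 "new";     (* hides "old" but keeps it in the table *)
  Hashtbl.remove h 5;        (* removes only the most recent binding *)
  Hashtbl.find_opt h 5

(* add then replace, remove once: nothing is left behind. *)
let replaced =
  let h = Hashtbl.create 8 in
  Hashtbl.add h 5 "old";
  Hashtbl.replace h 5 "new"; (* overwrites the current binding *)
  Hashtbl.remove h 5;
  Hashtbl.find_opt h 5
```

Here shadowed evaluates to Some "old" while replaced evaluates to None.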

> 
> let select ?(only_if = (fun _ -> true)) cons =
> 	Hashtbl.fold (fun _ con (ins, outs) ->
> @@ -97,10 +95,7 @@ let del_domain cons id =
> 		let con = find_domain cons id in
> 		Hashtbl.remove cons.domains id;
> 		(match Connection.get_domain con with
> -		 | Some d ->
> -		   (match Domain.get_port d with
> -		    | Some p -> Hashtbl.remove cons.ports p
> -		    | None -> ())
> +		 | Some d -> Hashtbl.remove cons.ports (Domain.get_local_port d)
> 		 | None -> ());
> 		del_watches cons con;
> 		Connection.close con
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index d59a9401e211..ecdd65f3209a 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -19,14 +19,31 @@ open Printf
> let debug fmt = Logging.debug "domain" fmt
> let warn  fmt = Logging.warn  "domain" fmt
> 
> +(* An event channel port pair.  The remote port, and the local port it is
> +   bound to. *)
> +type port_pair =
> +{
> +	local: Xeneventchn.t;
> +	remote: int;
> +}
> +
> +(* Sentinal port_pair with both set to EVTCHN_INVALID *)
> +let invalid_ports =
> +{
> +	local = Xeneventchn.of_int 0;
> +	remote = 0
> +}
> +
> +let string_of_port_pair p =
> +	sprintf "(l %d, r %d)" (Xeneventchn.to_int p.local) p.remote
> +
> type t =
> {
> 	id: Xenctrl.domid;
> 	mfn: nativeint;
> 	interface: Xenmmap.mmap_interface;
> 	eventchn: Event.t;
> -	mutable remote_port: int;
> -	mutable port: Xeneventchn.t option;
> +	mutable ports: port_pair;
> 	mutable bad_client: bool;
> 	mutable io_credit: int; (* the rounds of ring process left to do, default is 0,
> 	                           usually set to 1 when there is work detected, could
> @@ -41,8 +58,8 @@ let is_dom0 d = d.id = 0
> let get_id domain = domain.id
> let get_interface d = d.interface
> let get_mfn d = d.mfn
> -let get_remote_port d = d.remote_port
> -let get_port d = d.port
> +let get_remote_port d = d.ports.remote
> +let get_local_port d = d.ports.local
> 
> let is_bad_domain domain = domain.bad_client
> let mark_as_bad domain = domain.bad_client <- true
> @@ -56,54 +73,36 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 0.0
> 
> let is_free_to_conflict = is_dom0
> 
> -let string_of_port = function
> -	| None -> "None"
> -	| Some x -> string_of_int (Xeneventchn.to_int x)
> -
> let dump d chan =
> -	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> +	fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.ports.remote
> 
> let rebind_evtchn d remote_port =
> -	begin match d.port with
> -	| None -> ()
> -	| Some p -> Event.unbind d.eventchn p
> -	end;
> +	Event.unbind d.eventchn d.ports.local;
> 	let local = Event.bind_interdomain d.eventchn d.id remote_port in
> -	debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
> -	      d.id (string_of_port d.port) d.remote_port
> -	      (Xeneventchn.to_int local) remote_port;
> -	d.remote_port <- remote_port;
> -	d.port <- Some (local)
> +	let ports = { local; remote = remote_port } in
> +	debug "domain %d rebind %s => %s"
> +	      d.id (string_of_port_pair d.ports) (string_of_port_pair ports);
> +	d.ports <- ports
> 
> let notify dom =
> -	match dom.port with
> -	| None -> warn "domain %d: attempt to notify on unknown port" dom.id
> -	| Some port -> Event.notify dom.eventchn port
> -
> -let bind_interdomain dom =
> -	begin match dom.port with
> -	| None -> ()
> -	| Some port -> Event.unbind dom.eventchn port
> -	end;
> -	dom.port <- Some (Event.bind_interdomain dom.eventchn dom.id dom.remote_port);
> -	debug "bound domain %d remote port %d to local port %s" dom.id dom.remote_port (string_of_port dom.port)
> -
> +	Event.notify dom.eventchn dom.ports.local
> 
> let close dom =
> -	debug "domain %d unbound port %s" dom.id (string_of_port dom.port);
> -	begin match dom.port with
> -	| None -> ()
> -	| Some port -> Event.unbind dom.eventchn port
> -	end;
> +	debug "domain %d unbind %s" dom.id (string_of_port_pair dom.ports);
> +	Event.unbind dom.eventchn dom.ports.local;
> +	dom.ports <- invalid_ports;
> 	Xenmmap.unmap dom.interface
> 
> -let make id mfn remote_port interface eventchn = {
> +let make id mfn remote_port interface eventchn =
> +	let local = Event.bind_interdomain eventchn id remote_port in
> +	let ports = { local; remote = remote_port } in
> +	debug "domain %d bind %s" id (string_of_port_pair ports);
> +{
> 	id = id;
> 	mfn = mfn;
> -	remote_port = remote_port;
> +	ports;
> 	interface = interface;
> 	eventchn = eventchn;
> -	port = None;
> 	bad_client = false;
> 	io_credit = 0;
> 	conflict_credit = !Define.conflict_burst_limit;
> diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
> index 26018ac0dd3d..2ab0c5f4d8d0 100644
> --- a/tools/ocaml/xenstored/domains.ml
> +++ b/tools/ocaml/xenstored/domains.ml
> @@ -126,7 +126,6 @@ let create doms domid mfn remote_port =
> 	let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) mfn in
> 	let dom = Domain.make domid mfn remote_port interface doms.eventchn in
> 	Hashtbl.add doms.table domid dom;
> -	Domain.bind_interdomain dom;
> 	dom
> 
> let xenstored_kva = ref ""
> @@ -144,7 +143,6 @@ let create0 doms =
> 
> 	let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn in
> 	Hashtbl.add doms.table 0 dom;
> -	Domain.bind_interdomain dom;
> 	Domain.notify dom;
> 	dom
> 
> -- 
> 2.11.0
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port'
  2022-12-01 11:26   ` Christian Lindig
@ 2022-12-01 12:02     ` Andrew Cooper
  0 siblings, 0 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-12-01 12:02 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes

On 01/12/2022 11:26, Christian Lindig wrote:
>> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
>>
>> This will make the logic clearer when we plumb local_port through these
>> functions.
>>
>> While changing this, simplify the construct in Domains.create0 to separate the
>> remote port handling from the interface.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Christian Lindig <christian.lindig@citrix.com>
>> CC: David Scott <dave@recoil.org>
>> CC: Edwin Torok <edvin.torok@citrix.com>
>> CC: Rob Hoes <Rob.Hoes@citrix.com>
> Acked-by: Christian Lindig <christian.lindig@citrix.com>

Thanks.

>> diff --git a/tools/ocaml/xenstored/domains.ml b/tools/ocaml/xenstored/domains.ml
>> index 17fe2fa25772..26018ac0dd3d 100644
>> --- a/tools/ocaml/xenstored/domains.ml
>> +++ b/tools/ocaml/xenstored/domains.ml
>> @@ -133,18 +133,16 @@ let xenstored_kva = ref ""
>> let xenstored_port = ref ""
>>
>> let create0 doms =
>> -	let port, interface =
>> -		(
>> -			let port = Utils.read_file_single_integer !xenstored_port
>> -			and fd = Unix.openfile !xenstored_kva
>> -					       [ Unix.O_RDWR ] 0o600 in
>> -			let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED
>> -						  (Xenmmap.getpagesize()) 0 in
>> -			Unix.close fd;
>> -			port, interface
>> -		)
>> -		in
>> -	let dom = Domain.make 0 Nativeint.zero port interface doms.eventchn in
>> +	let remote_port = Utils.read_file_single_integer !xenstored_port in
>> +
>> +	let interface =
>> +		let fd = Unix.openfile !xenstored_kva [ Unix.O_RDWR ] 0o600 in
>> +		let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED (Xenmmap.getpagesize()) 0 in
> Can we be sure that this never throws an exception, so that the close can't be missed? Otherwise Fun.protect (or an equivalent) should be used.

This mess also depends on !xenstored_port and !xenstored_kva morphing
into something other than "" before Domains.create0 is called.

But this logic is also the penultimate unstable ABI in oxenstored, and
will be removed fully when we can bind /dev/xen/gntdev for OCaml and
replace the foreign mapping with "map grant 1" (also removing this as a
special case difference between dom0 and all other domains).


So I'm tempted to argue that I'm not actually changing the behaviour
here, and it's not worth fixing up logic this fragile when we're
intending to replace it anyway.  Edvin has patches IIRC, but they need
rebasing.

~Andrew

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn
  2022-12-01 11:20   ` Christian Lindig
@ 2022-12-01 12:10     ` Andrew Cooper
  2022-12-01 13:10       ` Christian Lindig
  2022-12-02  9:11       ` Edwin Torok
  0 siblings, 2 replies; 23+ messages in thread
From: Andrew Cooper @ 2022-12-01 12:10 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes

On 01/12/2022 11:20, Christian Lindig wrote:
>
>> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
>>
>> Generally speaking, the event channel local/remote port is fixed for the
>> lifetime of the associated domain object.  The exception to this is a
>> secondary XS_INTRODUCE (defined to re-bind to a new event channel) which pokes
>> around at the domain object's internal state.
>>
>> We need to refactor the evtchn handling to support live update, so start by
>> moving the relevant manipulation into Domain.
>>
>> No practical change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Christian Lindig <christian.lindig@citrix.com>
>> CC: David Scott <dave@recoil.org>
>> CC: Edwin Torok <edvin.torok@citrix.com>
>> CC: Rob Hoes <Rob.Hoes@citrix.com>
> Acked-by: Christian Lindig <christian.lindig@citrix.com>

Thanks.

> The code makes changes around if-expressions, and in the presence of sequential code it is easy to be misled by indentation about which parts are covered by an if and which are not. I would be more confident about this with automatic formatting (but I also believe this is correct).

I can keep the begin/end if you'd prefer.

Looking at the end result, it would actually shrink the patch, so is
probably worth doing anyway for clarity.  The net result is:

diff --git a/tools/ocaml/xenstored/process.ml
b/tools/ocaml/xenstored/process.ml
index b2973aca2a82..1c80e7198dbe 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -569,8 +569,7 @@ let do_introduce con t domains cons data =
                        let edom = Domains.find domains domid in
                        if (Domain.get_mfn edom) = mfn &&
(Connections.find_domain cons domid) != con then begin
                                (* Use XS_INTRODUCE for recreating the
xenbus event-channel. *)
-                               edom.remote_port <- remote_port;
-                               Domain.bind_interdomain edom;
+                               Domain.rebind_evtchn edom remote_port;
                        end;
                        edom
                else try

I'm happy to adjust on commit.

When I started this, I tried rearranging things to avoid the "if exists
then find" pattern, but quickly got into a mess, then realised that this
is (almost) a dead logic path... I've got no idea why this is supported;
looking through history, I can't find a case where a redundant
XS_INTRODUCE was ever used, but it's a common behaviour between C and O
so there was clearly some reason...

~Andrew

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn
  2022-12-01 12:10     ` Andrew Cooper
@ 2022-12-01 13:10       ` Christian Lindig
  2022-12-02  9:11       ` Edwin Torok
  1 sibling, 0 replies; 23+ messages in thread
From: Christian Lindig @ 2022-12-01 13:10 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes



> On 1 Dec 2022, at 12:10, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> I can keep the begin/end if you'd prefer.
> 
> Looking at the end result, it would actually shrink the patch, so is
> probably worth doing anyway for clarity.  The net result is:

I think keeping the begin/end is a good idea, as it keeps the patch small. I was mostly arguing for automated formatting because in OCaml the unfortunate difference in what constitutes the resulting expression in if vs. match has led to subtle bugs in the past.
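
A small sketch of that difference. The log list and the function names are made up for illustration, and the indentation in with_if is deliberately misleading.

```ocaml
let log = ref []
let record s = log := s :: !log

(* Without begin/end, only the first expression belongs to the if;
   the indentation below is deliberately misleading. *)
let with_if cond =
  log := [];
  if cond then
    record "rebind";
    record "after";
  List.rev !log

(* With begin/end, both expressions are guarded by the condition. *)
let with_if_block cond =
  log := [];
  if cond then begin
    record "rebind";
    record "after"
  end;
  List.rev !log

(* In a match arm, the sequence extends as far as possible, so both
   expressions belong to the Some case without extra delimiters. *)
let with_match opt =
  log := [];
  (match opt with
   | None -> ()
   | Some _ ->
     record "rebind";
     record "after");
  List.rev !log
```

with_if false still records "after", whereas with_if_block false and with_match None record nothing.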

— C

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair
  2022-12-01 11:59   ` Christian Lindig
@ 2022-12-01 14:22     ` Andrew Cooper
  2022-12-01 15:22       ` Edwin Torok
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Cooper @ 2022-12-01 14:22 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Xen-devel, David Scott, Edwin Torok, Rob Hoes

On 01/12/2022 11:59, Christian Lindig wrote:
>> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
>>
>> Inter-domain event channels are always a pair of local and remote ports.
>> Right now the handling is asymmetric, caused by the fact that the evtchn is
>> bound after the associated Domain object is constructed.
>>
>> First, move binding of the event channel into the Domain.make() constructor.
>> This means the local port no longer needs to be an option.  It also removes
>> the final callers of Domain.bind_interdomain.
>>
>> Next, introduce a new port_pair type to encapsulate the fact that these two
>> should be updated together, and replace the previous port and remote_port
>> fields.  This refactoring also changes the Domain.get_port interface (removing
>> an option) so take the opportunity to name it get_local_port instead.
>>
>> Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn has
>> been unbound, the same local port number can be reused for a different
>> purpose, so explicitly invalidate the ports to prevent their accidental misuse
>> in the future.
>>
>> This also cleans up some of the debugging, to always print a port pair.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Christian Lindig <christian.lindig@citrix.com>
>> CC: David Scott <dave@recoil.org>
>> CC: Edwin Torok <edvin.torok@citrix.com>
>> CC: Rob Hoes <Rob.Hoes@citrix.com>
> Acked-by: Christian Lindig <christian.lindig@citrix.com>

Thanks.

>
>> v2:
>> * New
>> ---
>> tools/ocaml/xenstored/connections.ml |  9 +----
>> tools/ocaml/xenstored/domain.ml      | 75 ++++++++++++++++++------------------
>> tools/ocaml/xenstored/domains.ml     |  2 -
>> 3 files changed, 39 insertions(+), 47 deletions(-)
>>
>> diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
>> index 7d68c583b43a..a80ae0bed2ce 100644
>> --- a/tools/ocaml/xenstored/connections.ml
>> +++ b/tools/ocaml/xenstored/connections.ml
>> @@ -48,9 +48,7 @@ let add_domain cons dom =
>> 	let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun () -> Domain.notify dom) in
>> 	let con = Connection.create xbcon (Some dom) in
>> 	Hashtbl.add cons.domains (Domain.get_id dom) con;
>> -	match Domain.get_port dom with
>> -	| Some p -> Hashtbl.add cons.ports p con;
>> -	| None -> ()
>> +	Hashtbl.add cons.ports (Domain.get_local_port dom) con
> I would prefer Hashtbl.replace. Hashtbl.add shadows an existing binding, which becomes visible again after Hashtbl.remove. When we are sure that we only have one binding per key, we should use replace instead of add.

That's surprising behaviour.  Presumably the add->replace suggestion
applies to the other hashtable here (cons.domains)?  And possibly
elsewhere too.

~Andrew

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair
  2022-12-01 14:22     ` Andrew Cooper
@ 2022-12-01 15:22       ` Edwin Torok
  0 siblings, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-12-01 15:22 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Christian Lindig, Xen-devel, David Scott, Rob Hoes



> On 1 Dec 2022, at 14:22, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> On 01/12/2022 11:59, Christian Lindig wrote:
>>> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
>>> 
>>> Inter-domain event channels are always a pair of local and remote ports.
>>> Right now the handling is asymmetric, caused by the fact that the evtchn is
>>> bound after the associated Domain object is constructed.
>>> 
>>> First, move binding of the event channel into the Domain.make() constructor.
>>> This means the local port no longer needs to be an option.  It also removes
>>> the final callers of Domain.bind_interdomain.
>>> 
>>> Next, introduce a new port_pair type to encapsulate the fact that these two
>>> should be updated together, and replace the previous port and remote_port
>>> fields.  This refactoring also changes the Domain.get_port interface (removing
>>> an option) so take the opportunity to name it get_local_port instead.
>>> 
>>> Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn has
>>> been unbound, the same local port number can be reused for a different
>>> purpose, so explicitly invalidate the ports to prevent their accidental misuse
>>> in the future.
>>> 
>>> This also cleans up some of the debugging, to always print a port pair.
>>> 
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> CC: Christian Lindig <christian.lindig@citrix.com>
>>> CC: David Scott <dave@recoil.org>
>>> CC: Edwin Torok <edvin.torok@citrix.com>
>>> CC: Rob Hoes <Rob.Hoes@citrix.com>
>> Acked-by: Christian Lindig <christian.lindig@citrix.com>
> 
> Thanks.
> 
>> 
>>> v2:
>>> * New
>>> ---
>>> tools/ocaml/xenstored/connections.ml |  9 +----
>>> tools/ocaml/xenstored/domain.ml      | 75 ++++++++++++++++++------------------
>>> tools/ocaml/xenstored/domains.ml     |  2 -
>>> 3 files changed, 39 insertions(+), 47 deletions(-)
>>> 
>>> diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
>>> index 7d68c583b43a..a80ae0bed2ce 100644
>>> --- a/tools/ocaml/xenstored/connections.ml
>>> +++ b/tools/ocaml/xenstored/connections.ml
>>> @@ -48,9 +48,7 @@ let add_domain cons dom =
>>> let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun () -> Domain.notify dom) in
>>> let con = Connection.create xbcon (Some dom) in
>>> Hashtbl.add cons.domains (Domain.get_id dom) con;
>>> - match Domain.get_port dom with
>>> - | Some p -> Hashtbl.add cons.ports p con;
>>> - | None -> ()
>>> + Hashtbl.add cons.ports (Domain.get_local_port dom) con
>> I would prefer Hashtbl.replace. Hashtbl.add shadows an existing binding which becomes visible again after Hashtbl.remove. When we are sure that we only have one binding per key, we should use replace instead of add.
> 
> That's surprising behaviour.  Presumably the add->replace suggestion
> applies to the other hashtable here (cons.domains)?  And possibly elsewhere
> too.


Yes:
* Hashtbl.add -> Hashtbl.replace
* Hashtbl.clear -> Hashtbl.reset

Using anything on the left is almost always an indication of a subtle bug. For example, Hashtbl.clear won't release the memory used by the buckets, which is only useful if you immediately refill the hashtable with lots of elements. Code should really always use Hashtbl.reset, but that was only introduced in OCaml 4.00.0, so older code won't have it.
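
A minimal sketch of the clear/reset difference, using only the standard library. The `bucket_counts` name and the table sizes are made up for illustration; nothing here is taken from oxenstored:

```ocaml
(* Fill two tables until their bucket arrays have grown, then compare
   what Hashtbl.clear and Hashtbl.reset leave behind. *)
let bucket_counts () =
  let fill tbl = for i = 1 to 10_000 do Hashtbl.replace tbl i i done in
  let tbl_cleared : (int, int) Hashtbl.t = Hashtbl.create 16 in
  let tbl_reset : (int, int) Hashtbl.t = Hashtbl.create 16 in
  fill tbl_cleared;
  fill tbl_reset;
  (* Number of buckets after the table has grown to hold 10_000 bindings. *)
  let grown = (Hashtbl.stats tbl_cleared).Hashtbl.num_buckets in
  Hashtbl.clear tbl_cleared;  (* empties the table, keeps the bucket array *)
  Hashtbl.reset tbl_reset;    (* empties the table, shrinks to initial size *)
  (grown,
   (Hashtbl.stats tbl_cleared).Hashtbl.num_buckets,
   (Hashtbl.stats tbl_reset).Hashtbl.num_buckets)
```

With the stdlib implementation, the second component should equal the first (the grown bucket array survives clear), while the third should drop back to the initial size.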

And the use of Hashtbl.add can lead to "space leaks" (eventually OOM) unless one really knows what they are doing (i.e. there are only a finite number of add calls ever).
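
To make the shadowing concrete, here is a small sketch. The `ports` table and its string values are hypothetical stand-ins, not the actual cons.ports table:

```ocaml
(* Hashtbl.add stacks a second binding on top of the first one. *)
let shadowing_demo () =
  let ports : (int, string) Hashtbl.t = Hashtbl.create 8 in
  Hashtbl.add ports 3 "old connection";
  Hashtbl.add ports 3 "new connection";      (* shadows, does not overwrite *)
  let n_after_add = Hashtbl.length ports in  (* counts both bindings *)
  Hashtbl.remove ports 3;                    (* drops only the newest binding *)
  (n_after_add, Hashtbl.find_opt ports 3)    (* the old binding is visible again *)

(* Hashtbl.replace keeps at most one binding per key. *)
let replace_demo () =
  let ports : (int, string) Hashtbl.t = Hashtbl.create 8 in
  Hashtbl.replace ports 3 "old connection";
  Hashtbl.replace ports 3 "new connection";  (* overwrites the current binding *)
  let n_after_replace = Hashtbl.length ports in
  Hashtbl.remove ports 3;
  (n_after_replace, Hashtbl.find_opt ports 3)
```

In the first function every repeated add for the same key grows the table, which is how an unbounded number of add calls turns into a space leak; in the second the table never holds more than one entry per key.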

In XAPI we have a "quality gate" that counts the number of problematic functions/etc, and makes it a hard build time failure if any new usages are introduced (and we strive to reduce that to 0).
I don't think these 2 Hashtbl calls are there yet, but they probably should be.

Best regards,
--Edwin




* Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn
  2022-12-01 12:10     ` Andrew Cooper
  2022-12-01 13:10       ` Christian Lindig
@ 2022-12-02  9:11       ` Edwin Torok
  1 sibling, 0 replies; 23+ messages in thread
From: Edwin Torok @ 2022-12-02  9:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Christian Lindig, Xen-devel, David Scott, Rob Hoes



> On 1 Dec 2022, at 12:10, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> On 01/12/2022 11:20, Christian Lindig wrote:
>> 
>>> On 30 Nov 2022, at 16:54, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
>>> 
>>> Generally speaking, the event channel local/remote port is fixed for the
>>> lifetime of the associated domain object.  The exception to this is a
>>> secondary XS_INTRODUCE (defined to re-bind to a new event channel) which pokes
>>> around at the domain object's internal state.
>>> 
>>> We need to refactor the evtchn handling to support live update, so start by
>>> moving the relevant manipulation into Domain.
>>> 
>>> No practical change.
>>> 
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> CC: Christian Lindig <christian.lindig@citrix.com>
>>> CC: David Scott <dave@recoil.org>
>>> CC: Edwin Torok <edvin.torok@citrix.com>
>>> CC: Rob Hoes <Rob.Hoes@citrix.com>
>> Acked-by: Christian Lindig <christian.lindig@citrix.com>
> 
> Thanks.
> 
>> The code makes changes around if-expressions and it is easy to get misled by indentation as to which parts are covered by an if and which are not in the presence of sequential code. I would be more confident about this with automatic formatting (but also believe this is correct).
> 
> I can keep the begin/end if you'd prefer.
> 
> Looking at the end result, it would actually shrink the patch, so is
> probably worth doing anyway for clarity.  The net result is:
> 
> diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
> index b2973aca2a82..1c80e7198dbe 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -569,8 +569,7 @@ let do_introduce con t domains cons data =
>                         let edom = Domains.find domains domid in
>                         if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
>                                 (* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> -                               edom.remote_port <- remote_port;
> -                               Domain.bind_interdomain edom;
> +                               Domain.rebind_evtchn edom remote_port;
>                         end;
>                         edom
>                 else try
> 
> I'm happy to adjust on commit.
> 
> When I started this, I tried rearranging things to avoid the "if exists
> then find" pattern, but quickly got into a mess, then realised that this
> is (almost) a dead logic path... I've got no idea why this is supported;
> looking through history, I can't find a case where a redundant
> XS_INTRODUCE was ever used, but it's a common behaviour between C and O
> so there was clearly some reason...


Currently the soft reset code in xenopsd uses it, but as you say there must've been another reason too (the soft reset code is a lot more recent than this).

Best regards,
--Edwin


end of thread, other threads:[~2022-12-02  9:12 UTC | newest]

Thread overview: 23+ messages
2022-11-30 16:54 [PATCH v2 0/6] More Oxenstored live update fixes Andrew Cooper
2022-11-30 16:54 ` [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain Andrew Cooper
2022-11-30 17:14   ` Edwin Torok
2022-12-01 11:11   ` Christian Lindig
2022-11-30 16:54 ` [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in in Event.init() Andrew Cooper
2022-11-30 17:16   ` Edwin Torok
2022-12-01 11:27   ` Christian Lindig
2022-11-30 16:54 ` [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port' Andrew Cooper
2022-11-30 17:16   ` Edwin Torok
2022-12-01 11:26   ` Christian Lindig
2022-12-01 12:02     ` Andrew Cooper
2022-11-30 16:54 ` [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn Andrew Cooper
2022-11-30 17:15   ` Edwin Torok
2022-12-01 11:20   ` Christian Lindig
2022-12-01 12:10     ` Andrew Cooper
2022-12-01 13:10       ` Christian Lindig
2022-12-02  9:11       ` Edwin Torok
2022-11-30 16:54 ` [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair Andrew Cooper
2022-11-30 17:17   ` Edwin Torok
2022-12-01 11:59   ` Christian Lindig
2022-12-01 14:22     ` Andrew Cooper
2022-12-01 15:22       ` Edwin Torok
2022-11-30 16:54 ` [PATCH v2 6/6] tools/oxenstored: Keep /dev/xen/evtchn open across live update Andrew Cooper
