All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] tools/ocaml/xenstored: simplify code
@ 2020-08-27 17:35 Edwin Török
  2020-08-27 17:35 ` [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references Edwin Török
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Edwin Török @ 2020-08-27 17:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

This refreshes the V3 patches to work with OCaml 4.02.
Upgrading to 4.06 will come as a separate series.

This patch is new in V4, the other patches were already acked in V3:
[PATCH v4 2/4] Map: backport find_opt/update from 4.06

A git tree with this and other series is available at:
https://gitlab.com/edwintorok/xen/-/compare/master...for-upstream

Edwin Török (4):
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  Map: backport find_opt/update from 4.06
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/xenstored/connection.ml  |  3 --
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/history.ml     | 14 ------
 tools/ocaml/xenstored/stdext.ml      | 21 +++++++++
 tools/ocaml/xenstored/store.ml       | 49 +++++++++----------
 tools/ocaml/xenstored/symbol.ml      | 70 +++++++---------------------
 tools/ocaml/xenstored/symbol.mli     | 22 +++------
 tools/ocaml/xenstored/trie.ml        | 61 +++++++++++-------------
 tools/ocaml/xenstored/trie.mli       | 26 +++++------
 tools/ocaml/xenstored/xenstored.ml   | 16 +------
 10 files changed, 110 insertions(+), 174 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  2020-08-27 17:35 [PATCH v4 0/4] tools/ocaml/xenstored: simplify code Edwin Török
@ 2020-08-27 17:35 ` Edwin Török
  2020-08-27 17:35 ` [PATCH v4 2/4] Map: backport find_opt/update from 4.06 Edwin Török
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Edwin Török @ 2020-08-27 17:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

The code here is attempting to reduce memory usage by sharing common
substrings in the tree: it replaces strings with ints, and keeps a
string->int map that gets manually garbage collected using a hand-rolled
mark and sweep algorithm.

This is unnecessary: OCaml already has a mark-and-sweep Garbage
Collector runtime, and sharing of common strings in tree nodes
can be achieved through Weak references: if the string hasn't been seen
yet it gets added to the Weak reference table, and if it has we use the
entry from the table instead, thus storing a string only once.
When the string is no longer referenced OCaml's GC will drop it from the
weak table: there is no need to manually do a mark-and-sweep, or to tell
OCaml when to drop it.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changes since V3:
* replace String.equal with (=) for compatibility with 4.02
---
 tools/ocaml/xenstored/connection.ml |  3 --
 tools/ocaml/xenstored/history.ml    | 14 ------
 tools/ocaml/xenstored/store.ml      | 11 ++---
 tools/ocaml/xenstored/symbol.ml     | 68 ++++++-----------------------
 tools/ocaml/xenstored/symbol.mli    | 21 ++-------
 tools/ocaml/xenstored/xenstored.ml  | 16 +------
 6 files changed, 24 insertions(+), 109 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 24750ada43..aa6dd95501 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -271,9 +271,6 @@ let has_more_work con =
 
 let incr_ops con = con.stat_nb_ops <- con.stat_nb_ops + 1
 
-let mark_symbols con =
-	Hashtbl.iter (fun _ t -> Store.mark_symbols (Transaction.get_store t)) con.transactions
-
 let stats con =
 	Hashtbl.length con.watches, con.stat_nb_ops
 
diff --git a/tools/ocaml/xenstored/history.ml b/tools/ocaml/xenstored/history.ml
index f39565bff5..029802bd15 100644
--- a/tools/ocaml/xenstored/history.ml
+++ b/tools/ocaml/xenstored/history.ml
@@ -22,20 +22,6 @@ type history_record = {
 
 let history : history_record list ref = ref []
 
-(* Called from periodic_ops to ensure we don't discard symbols that are still needed. *)
-(* There is scope for optimisation here, since in consecutive commits one commit's `after`
- * is the same thing as the next commit's `before`, but not all commits in history are
- * consecutive. *)
-let mark_symbols () =
-	(* There are gaps where dom0's commits are missing. Otherwise we could assume that
-	 * each element's `before` is the same thing as the next element's `after`
-	 * since the next element is the previous commit *)
-	List.iter (fun hist_rec ->
-			Store.mark_symbols hist_rec.before;
-			Store.mark_symbols hist_rec.after;
-		)
-		!history
-
 (* Keep only enough commit-history to protect the running transactions that we are still tracking *)
 (* There is scope for optimisation here, replacing List.filter with something more efficient,
  * probably on a different list-like structure. *)
diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index f299ec6461..45659a23ee 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -46,18 +46,18 @@ let add_child node child =
 
 let exists node childname =
 	let childname = Symbol.of_string childname in
-	List.exists (fun n -> n.name = childname) node.children
+	List.exists (fun n -> Symbol.equal n.name childname) node.children
 
 let find node childname =
 	let childname = Symbol.of_string childname in
-	List.find (fun n -> n.name = childname) node.children
+	List.find (fun n -> Symbol.equal n.name childname) node.children
 
 let replace_child node child nchild =
 	(* this is the on-steroid version of the filter one-replace one *)
 	let rec replace_one_in_list l =
 		match l with
 		| []                               -> []
-		| h :: tl when h.name = child.name -> nchild :: tl
+		| h :: tl when Symbol.equal h.name child.name -> nchild :: tl
 		| h :: tl                          -> h :: replace_one_in_list tl
 		in
 	{ node with children = (replace_one_in_list node.children) }
@@ -67,7 +67,7 @@ let del_childname node childname =
 	let rec delete_one_in_list l =
 		match l with
 		| []                        -> raise Not_found
-		| h :: tl when h.name = sym -> tl
+		| h :: tl when Symbol.equal h.name sym -> tl
 		| h :: tl                   -> h :: delete_one_in_list tl
 		in
 	{ node with children = (delete_one_in_list node.children) }
@@ -463,9 +463,6 @@ let copy store = {
 	quota = Quota.copy store.quota;
 }
 
-let mark_symbols store =
-	Node.recurse (fun node -> Symbol.mark_as_used node.Node.name) store.root
-
 let incr_transaction_coalesce store =
 	store.stat_transaction_coalesce <- store.stat_transaction_coalesce + 1
 let incr_transaction_abort store =
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 4420c6a4d7..2b41d120f6 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -14,63 +14,23 @@
  * GNU Lesser General Public License for more details.
  *)
 
-type t = int
+module WeakTable = Weak.Make(struct
+    type t = string
+    let equal (x:string) (y:string) = (x = y)
+    let hash = Hashtbl.hash
+end)
 
-type 'a record = { data: 'a; mutable garbage: bool }
-let int_string_tbl : (int,string record) Hashtbl.t = Hashtbl.create 1024
-let string_int_tbl : (string,int) Hashtbl.t = Hashtbl.create 1024
+type t = string
 
-let created_counter = ref 0
-let used_counter = ref 0
+let tbl = WeakTable.create 1024
 
-let count = ref 0
-let rec fresh () =
-	if Hashtbl.mem int_string_tbl !count
-	then begin
-		incr count;
-		fresh ()
-	end else
-		!count
+let of_string s = WeakTable.merge tbl s
+let to_string s = s
 
-let new_record v = { data=v; garbage=false }
-
-let of_string name =
-	if Hashtbl.mem string_int_tbl name
-	then begin
-		incr used_counter;
-		Hashtbl.find string_int_tbl name
-	end else begin
-		let i = fresh () in
-		incr created_counter;
-		Hashtbl.add string_int_tbl name i;
-		Hashtbl.add int_string_tbl i (new_record name);
-		i
-	end
-
-let to_string i =
-	(Hashtbl.find int_string_tbl i).data
-
-let mark_all_as_unused () =
-	Hashtbl.iter (fun _ v -> v.garbage <- true) int_string_tbl
-
-let mark_as_used symb =
-	let record1 = Hashtbl.find int_string_tbl symb in
-		record1.garbage <- false
-
-let garbage () =
-	let records = Hashtbl.fold (fun symb record accu ->
-		if record.garbage then (symb, record.data) :: accu else accu
-	) int_string_tbl [] in
-	let remove (int,string) =
-		Hashtbl.remove int_string_tbl int;
-		Hashtbl.remove string_int_tbl string
-	in
-	created_counter := 0;
-	used_counter := 0;
-	List.iter remove records
+let equal a b =
+  (* compare using physical equality, both members have to be part of the above weak table *)
+  a == b
 
 let stats () =
-	Hashtbl.length string_int_tbl
-
-let created () = !created_counter
-let used () = !used_counter
+  let len, entries, _, _, _, _ = WeakTable.stats tbl in
+  len, entries
diff --git a/tools/ocaml/xenstored/symbol.mli b/tools/ocaml/xenstored/symbol.mli
index c3c9f6e2f8..586ab57507 100644
--- a/tools/ocaml/xenstored/symbol.mli
+++ b/tools/ocaml/xenstored/symbol.mli
@@ -29,24 +29,11 @@ val of_string : string -> t
 val to_string : t -> string
 (** Convert a symbol into a string. *)
 
-(** {6 Garbage Collection} *)
-
-(** Symbols need to be regulary garbage collected. The following steps should be followed:
--     mark all the knowns symbols as unused (with [mark_all_as_unused]);
--     mark all the symbols really usefull as used (with [mark_as_used]); and
--     finally, call [garbage] *)
-
-val mark_all_as_unused : unit -> unit
-val mark_as_used : t -> unit
-val garbage : unit -> unit
+val equal: t -> t -> bool
+(** Compare two symbols for equality *)
 
 (** {6 Statistics } *)
 
-val stats : unit -> int
-(** Get the number of used symbols. *)
+val stats : unit -> int * int
+(** Get the table size and number of entries. *)
 
-val created : unit -> int
-(** Returns the number of symbols created since the last GC. *)
-
-val used : unit -> int
-(** Returns the number of existing symbols used since the last GC *)
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 5b96f1852a..f3e4697dea 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -376,18 +376,6 @@ let _ =
 
 	let periodic_ops now =
 		debug "periodic_ops starting";
-		(* we garbage collect the string->int dictionary after a sizeable amount of operations,
-		 * there's no need to be really fast even if we got loose
-		 * objects since names are often reuse.
-		 *)
-		if Symbol.created () > 1000 || Symbol.used () > 20000
-		then begin
-			Symbol.mark_all_as_unused ();
-			Store.mark_symbols store;
-			Connections.iter cons Connection.mark_symbols;
-			History.mark_symbols ();
-			Symbol.garbage ()
-		end;
 
 		(* scan all the xs rings as a safenet for ill-behaved clients *)
 		if !ring_scan_interval >= 0 && now > (!last_scan_time +. float !ring_scan_interval) then
@@ -405,11 +393,11 @@ let _ =
 			let (lanon, lanon_ops, lanon_watchs,
 			     ldom, ldom_ops, ldom_watchs) = Connections.stats cons in
 			let store_nodes, store_abort, store_coalesce = Store.stats store in
-			let symtbl_len = Symbol.stats () in
+			let symtbl_len, symtbl_entries = Symbol.stats () in
 
 			info "store stat: nodes(%d) t-abort(%d) t-coalesce(%d)"
 			     store_nodes store_abort store_coalesce;
-			info "sytbl stat: %d" symtbl_len;
+			info "sytbl stat: length(%d) entries(%d)" symtbl_len symtbl_entries;
 			info "  con stat: anonymous(%d, %d o, %d w) domains(%d, %d o, %d w)"
 			     lanon lanon_ops lanon_watchs ldom ldom_ops ldom_watchs;
 			info "  mem stat: minor(%.0f) promoted(%.0f) major(%.0f) heap(%d w, %d c) live(%d w, %d b) free(%d w, %d b)"
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 2/4] Map: backport find_opt/update from 4.06
  2020-08-27 17:35 [PATCH v4 0/4] tools/ocaml/xenstored: simplify code Edwin Török
  2020-08-27 17:35 ` [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references Edwin Török
@ 2020-08-27 17:35 ` Edwin Török
  2020-08-28  8:30   ` Christian Lindig
  2020-08-27 17:35 ` [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees Edwin Török
  2020-08-27 17:35 ` [PATCH v4 4/4] tools/ocaml/xenstored: use more efficient tries Edwin Török
  3 siblings, 1 reply; 9+ messages in thread
From: Edwin Török @ 2020-08-27 17:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

We are currently on OCaml 4.02 as minimum version.
To make the followup optimizations compile backport these functions from
OCaml 4.06.

This implementation is less efficient than the one in the 4.06 standard
library which has access to the internals of the Map.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
Changes since V3:
* this patch is new in V4
---
 tools/ocaml/xenstored/stdext.ml | 21 +++++++++++++++++++++
 tools/ocaml/xenstored/trie.ml   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/tools/ocaml/xenstored/stdext.ml b/tools/ocaml/xenstored/stdext.ml
index 4f2f3a2c8c..5bebe2aa27 100644
--- a/tools/ocaml/xenstored/stdext.ml
+++ b/tools/ocaml/xenstored/stdext.ml
@@ -44,6 +44,27 @@ let default d v =
 let maybe f v =
 	match v with None -> () | Some x -> f x
 
+module Map = struct
+module Make(Ord: Map.OrderedType) = struct
+
+include Map.Make(Ord)
+
+let find_opt k t =
+	(* avoid raising exceptions, they can be expensive *)
+	if mem k t then Some (find k t) else None
+
+let update k f t =
+  let r = find_opt k t in
+  let r' = f r in
+  match r, r' with
+  | None, None -> t
+  | Some _, None -> remove k t
+  | Some r, Some r' when r == r' -> t
+  | _, Some r' -> add k r' t
+
+end
+end
+
 module String = struct include String
 
 let of_char c = String.make 1 c
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index dc42535092..f513f4e608 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -13,6 +13,8 @@
  * GNU Lesser General Public License for more details.
  *)
 
+open Stdext
+
 module Node =
 struct
 	type ('a,'b) t =  {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees
  2020-08-27 17:35 [PATCH v4 0/4] tools/ocaml/xenstored: simplify code Edwin Török
  2020-08-27 17:35 ` [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references Edwin Török
  2020-08-27 17:35 ` [PATCH v4 2/4] Map: backport find_opt/update from 4.06 Edwin Török
@ 2020-08-27 17:35 ` Edwin Török
  2020-08-27 17:35 ` [PATCH v4 4/4] tools/ocaml/xenstored: use more efficient tries Edwin Török
  3 siblings, 0 replies; 9+ messages in thread
From: Edwin Török @ 2020-08-27 17:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

This changes the output of xenstore-ls to be sorted.
Previously the keys were listed in the order in which they were inserted
in.
docs/misc/xenstore.txt doesn't specify in what order keys are listed.

Map.update is used to retain semantics with replace_child:
only an existing child is replaced, if it wasn't part of the original
map we don't add it.
Similarly exception behaviour is retained for del_childname and related
functions.

Entries are stored in reverse sort order, so that upon Map.fold the
constructed list is sorted in ascending order and there is no need for a
List.rev.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changes since V3:
* none, repost after preceding commits fix OCaml 4.02 compatibility
---
 tools/ocaml/xenstored/store.ml   | 46 +++++++++++++++-----------------
 tools/ocaml/xenstored/symbol.ml  |  4 +++
 tools/ocaml/xenstored/symbol.mli |  3 +++
 3 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 45659a23ee..d9dfa36045 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -16,17 +16,19 @@
  *)
 open Stdext
 
+module SymbolMap = Map.Make(Symbol)
+
 module Node = struct
 
 type t = {
 	name: Symbol.t;
 	perms: Perms.Node.t;
 	value: string;
-	children: t list;
+	children: t SymbolMap.t;
 }
 
 let create _name _perms _value =
-	{ name = Symbol.of_string _name; perms = _perms; value = _value; children = []; }
+	{ name = Symbol.of_string _name; perms = _perms; value = _value; children = SymbolMap.empty; }
 
 let get_owner node = Perms.Node.get_owner node.perms
 let get_children node = node.children
@@ -42,38 +44,34 @@ let set_value node nvalue =
 let set_perms node nperms = { node with perms = nperms }
 
 let add_child node child =
-	{ node with children = child :: node.children }
+	let children = SymbolMap.add child.name child node.children in
+	{ node with children }
 
 let exists node childname =
 	let childname = Symbol.of_string childname in
-	List.exists (fun n -> Symbol.equal n.name childname) node.children
+	SymbolMap.mem childname node.children
 
 let find node childname =
 	let childname = Symbol.of_string childname in
-	List.find (fun n -> Symbol.equal n.name childname) node.children
+	SymbolMap.find childname node.children
 
 let replace_child node child nchild =
-	(* this is the on-steroid version of the filter one-replace one *)
-	let rec replace_one_in_list l =
-		match l with
-		| []                               -> []
-		| h :: tl when Symbol.equal h.name child.name -> nchild :: tl
-		| h :: tl                          -> h :: replace_one_in_list tl
-		in
-	{ node with children = (replace_one_in_list node.children) }
+	{ node with
+	  children = SymbolMap.update child.name
+		     (function None -> None | Some _ -> Some nchild)
+		     node.children
+	}
 
 let del_childname node childname =
 	let sym = Symbol.of_string childname in
-	let rec delete_one_in_list l =
-		match l with
-		| []                        -> raise Not_found
-		| h :: tl when Symbol.equal h.name sym -> tl
-		| h :: tl                   -> h :: delete_one_in_list tl
-		in
-	{ node with children = (delete_one_in_list node.children) }
+	{ node with children =
+		SymbolMap.update sym
+		  (function None -> raise Not_found | Some _ -> None)
+		  node.children
+	}
 
 let del_all_children node =
-	{ node with children = [] }
+	{ node with children = SymbolMap.empty }
 
 (* check if the current node can be accessed by the current connection with rperm permissions *)
 let check_perm node connection request =
@@ -87,7 +85,7 @@ let check_owner node connection =
 		raise Define.Permission_denied;
 	end
 
-let rec recurse fct node = fct node; List.iter (recurse fct) node.children
+let rec recurse fct node = fct node; SymbolMap.iter (fun _ -> recurse fct) node.children
 
 let unpack node = (Symbol.to_string node.name, node.perms, node.value)
 
@@ -321,7 +319,7 @@ let ls store perm path =
 				Node.check_perm cnode perm Perms.READ;
 				cnode.Node.children in
 			Path.apply store.root path do_ls in
-	List.rev (List.map (fun n -> Symbol.to_string n.Node.name) children)
+	SymbolMap.fold (fun k _ accu -> Symbol.to_string k :: accu) children []
 
 let getperms store perm path =
 	if path = [] then
@@ -350,7 +348,7 @@ let traversal root_node f =
 	let rec _traversal path node =
 		f path node;
 		let node_path = Path.of_path_and_name path (Symbol.to_string node.Node.name) in
-		List.iter (_traversal node_path) node.Node.children
+		SymbolMap.iter (fun _ -> _traversal node_path) node.Node.children
 		in
 	_traversal [] root_node
 
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 2b41d120f6..301639f16f 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,6 +31,10 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the above weak table *)
   a == b
 
+let compare a b =
+  if equal a b then 0
+  else -(String.compare a b)
+
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
   len, entries
diff --git a/tools/ocaml/xenstored/symbol.mli b/tools/ocaml/xenstored/symbol.mli
index 586ab57507..dd0f014796 100644
--- a/tools/ocaml/xenstored/symbol.mli
+++ b/tools/ocaml/xenstored/symbol.mli
@@ -32,6 +32,9 @@ val to_string : t -> string
 val equal: t -> t -> bool
 (** Compare two symbols for equality *)
 
+val compare: t -> t -> int
+(** Compare two symbols *)
+
 (** {6 Statistics } *)
 
 val stats : unit -> int * int
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 4/4] tools/ocaml/xenstored: use more efficient tries
  2020-08-27 17:35 [PATCH v4 0/4] tools/ocaml/xenstored: simplify code Edwin Török
                   ` (2 preceding siblings ...)
  2020-08-27 17:35 ` [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees Edwin Török
@ 2020-08-27 17:35 ` Edwin Török
  3 siblings, 0 replies; 9+ messages in thread
From: Edwin Török @ 2020-08-27 17:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

No functional change, just an optimization.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changes since V3:
* none, repost after previous commits fix compatibility with OCaml 4.02
---
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/symbol.ml      |  6 +--
 tools/ocaml/xenstored/trie.ml        | 59 ++++++++++++----------------
 tools/ocaml/xenstored/trie.mli       | 26 ++++++------
 4 files changed, 43 insertions(+), 50 deletions(-)

diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
index f02ef6b526..4983c7370b 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -21,7 +21,7 @@ type t = {
 	anonymous: (Unix.file_descr, Connection.t) Hashtbl.t;
 	domains: (int, Connection.t) Hashtbl.t;
 	ports: (Xeneventchn.t, Connection.t) Hashtbl.t;
-	mutable watches: (string, Connection.watch list) Trie.t;
+	mutable watches: Connection.watch list Trie.t;
 }
 
 let create () = {
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 301639f16f..72a84ebf80 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,9 +31,9 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the above weak table *)
   a == b
 
-let compare a b =
-  if equal a b then 0
-  else -(String.compare a b)
+(* the sort order is reversed here, so that Map.fold constructs a list
+   in ascending order *)
+let compare a b = String.compare b a
 
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index f513f4e608..ad2aed5123 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -15,24 +15,26 @@
 
 open Stdext
 
+module StringMap = Map.Make(String)
+
 module Node =
 struct
-	type ('a,'b) t =  {
-		key: 'a;
-		value: 'b option;
-		children: ('a,'b) t list;
+	type 'a t =  {
+		key: string;
+		value: 'a option;
+		children: 'a t StringMap.t;
 	}
 
 	let _create key value = {
 		key = key;
 		value = Some value;
-		children = [];
+		children = StringMap.empty;
 	}
 
 	let empty key = {
 		key = key;
 		value = None;
-		children = []
+		children = StringMap.empty;
 	}
 
 	let _get_key node = node.key
@@ -49,41 +51,31 @@ struct
 		{ node with children = children }
 
 	let _add_child node child =
-		{ node with children = child :: node.children }
+		{ node with children = StringMap.add child.key child node.children }
 end
 
-type ('a,'b) t = ('a,'b) Node.t list
+type 'a t = 'a Node.t StringMap.t
 
 let mem_node nodes key =
-	List.exists (fun n -> n.Node.key = key) nodes
+	StringMap.mem key nodes
 
 let find_node nodes key =
-	List.find (fun n -> n.Node.key = key) nodes
+	StringMap.find key nodes
 
 let replace_node nodes key node =
-	let rec aux = function
-		| []                            -> []
-		| h :: tl when h.Node.key = key -> node :: tl
-		| h :: tl                       -> h :: aux tl
-	in
-	aux nodes
+	StringMap.update key (function None -> None | Some _ -> Some node) nodes
 
 let remove_node nodes key =
-	let rec aux = function
-		| []                            -> raise Not_found
-		| h :: tl when h.Node.key = key -> tl
-		| h :: tl                       -> h :: aux tl
-	in
-	aux nodes
+	StringMap.update key (function None -> raise Not_found | Some _ -> None) nodes
 
-let create () = []
+let create () = StringMap.empty
 
 let rec iter f tree =
-	let aux node =
-		f node.Node.key node.Node.value;
+	let aux key node =
+		f key node.Node.value;
 		iter f node.Node.children
 	in
-	List.iter aux tree
+	StringMap.iter aux tree
 
 let rec map f tree =
 	let aux node =
@@ -94,13 +86,14 @@ let rec map f tree =
 		in
 		{ node with Node.value = value; Node.children = map f node.Node.children }
 	in
-	List.filter (fun n -> n.Node.value <> None || n.Node.children <> []) (List.map aux tree)
+	tree |> StringMap.map aux
+	|> StringMap.filter (fun _ n -> n.Node.value <> None || not (StringMap.is_empty n.Node.children) )
 
 let rec fold f tree acc =
-	let aux accu node =
-		fold f node.Node.children (f node.Node.key node.Node.value accu)
+	let aux key node accu =
+		fold f node.Node.children (f key node.Node.value accu)
 	in
-	List.fold_left aux acc tree
+	StringMap.fold aux tree acc
 
 (* return a sub-trie *)
 let rec sub_node tree = function
@@ -117,7 +110,7 @@ let rec sub_node tree = function
 
 let sub tree path =
 	try (sub_node tree path).Node.children
-	with Not_found -> []
+	with Not_found -> StringMap.empty
 
 let find tree path =
 	Node.get_value (sub_node tree path)
@@ -161,7 +154,7 @@ and set tree path value =
 				  replace_node tree h (set_node node t value)
 			  end else begin
 				  let node = Node.empty h in
-				  set_node node t value :: tree
+				  StringMap.add node.Node.key (set_node node t value) tree
 			  end
 
 let rec unset tree = function
@@ -176,7 +169,7 @@ let rec unset tree = function
 				  then Node.set_children (Node.empty h) children
 				  else Node.set_children node children
 			  in
-			  if children = [] && new_node.Node.value = None
+			  if StringMap.is_empty children && new_node.Node.value = None
 			  then remove_node tree h
 			  else replace_node tree h new_node
 		  end else
diff --git a/tools/ocaml/xenstored/trie.mli b/tools/ocaml/xenstored/trie.mli
index 5dc53c1cb1..27785154f5 100644
--- a/tools/ocaml/xenstored/trie.mli
+++ b/tools/ocaml/xenstored/trie.mli
@@ -15,46 +15,46 @@
 
 (** Basic Implementation of polymorphic tries (ie. prefix trees) *)
 
-type ('a, 'b) t
-(** The type of tries. ['a list] is the type of keys, ['b] the type of values.
+type 'a t
+(** The type of tries. ['a] the type of values.
 	Internally, a trie is represented as a labeled tree, where node contains values
-	of type ['a * 'b option]. *)
+	of type [string * 'a option]. *)
 
-val create : unit -> ('a,'b) t
+val create : unit -> 'a t
 (** Creates an empty trie. *)
 
-val mem : ('a,'b) t -> 'a list -> bool
+val mem : 'a t -> string list -> bool
 (** [mem t k] returns true if a value is associated with the key [k] in the trie [t].
 	Otherwise, it returns false. *)
 
-val find : ('a, 'b) t -> 'a list -> 'b
+val find : 'a t -> string list -> 'a
 (** [find t k] returns the value associated with the key [k] in the trie [t].
 	Returns [Not_found] if no values are associated with [k] in [t]. *)
 
-val set : ('a, 'b) t -> 'a list -> 'b -> ('a, 'b) t
+val set : 'a t -> string list -> 'a -> 'a t
 (** [set t k v] associates the value [v] with the key [k] in the trie [t]. *)
 
-val unset : ('a, 'b) t -> 'a list -> ('a, 'b) t
+val unset : 'a t -> string list -> 'a t
 (** [unset k v] removes the association of value [v] with the key [k] in the trie [t].
 	Moreover, it automatically clean the trie, ie. it removes recursively
 	every nodes of [t] containing no values and having no chil. *)
 
-val iter : ('a -> 'b option -> unit) -> ('a, 'b) t -> unit
+val iter : (string -> 'a option -> unit) -> 'a t -> unit
 (** [iter f t] applies the function [f] to every node of the trie [t].
 	As nodes of the trie [t] do not necessary contains a value, the second argument of
 	[f] is an option type. *)
 
-val iter_path : ('a -> 'b option -> unit) -> ('a, 'b) t -> 'a list -> unit
+val iter_path : (string -> 'a option -> unit) -> 'a t -> string list -> unit
 (** [iter_path f t p] iterates [f] over nodes associated with the path [p] in the trie [t].
 	If [p] is not a valid path of [t], it iterates on the longest valid prefix of [p]. *)
 
-val fold : ('a -> 'b option -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c
+val fold : (string -> 'a option -> 'c -> 'c) -> 'a t -> 'c -> 'c
 (** [fold f t x] fold [f] over every nodes of [t], with [x] as initial value. *)
 
-val map : ('b -> 'c option) -> ('a,'b) t -> ('a,'c) t
+val map : ('a -> 'b option) -> 'a t -> 'b t
 (** [map f t] maps [f] over every values stored in [t]. The return value of [f] is of type 'c option
 	as one may wants to remove value associated to a key. This function is not tail-recursive. *)
 
-val sub : ('a, 'b) t -> 'a list -> ('a,'b) t
+val sub : 'a t -> string list -> 'a t
 (** [sub t p] returns the sub-trie associated with the path [p] in the trie [t].
 	If [p] is not a valid path of [t], it returns an empty trie. *)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06
  2020-08-27 17:35 ` [PATCH v4 2/4] Map: backport find_opt/update from 4.06 Edwin Török
@ 2020-08-28  8:30   ` Christian Lindig
  2020-08-28 17:43     ` Edwin Torok
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Lindig @ 2020-08-28  8:30 UTC (permalink / raw)
  To: Edwin Torok, xen-devel; +Cc: David Scott, Ian Jackson, Wei Liu


________________________________________
From: Edwin Török <edvin.torok@citrix.com>
Sent: 27 August 2020 18:35
To: xen-devel@lists.xenproject.org
Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
Subject: [PATCH v4 2/4] Map: backport find_opt/update from 4.06

We are currently on OCaml 4.02 as minimum version.
To make the followup optimizations compile backport these functions from
OCaml 4.06.

This implementation is less efficient than the one in the 4.06 standard
library which has access to the internals of the Map.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
Changes since V3:
* this patch is new in V4
---
 tools/ocaml/xenstored/stdext.ml | 21 +++++++++++++++++++++
 tools/ocaml/xenstored/trie.ml   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/tools/ocaml/xenstored/stdext.ml b/tools/ocaml/xenstored/stdext.ml
index 4f2f3a2c8c..5bebe2aa27 100644
--- a/tools/ocaml/xenstored/stdext.ml
+++ b/tools/ocaml/xenstored/stdext.ml
@@ -44,6 +44,27 @@ let default d v =
 let maybe f v =
        match v with None -> () | Some x -> f x

+module Map = struct
+module Make(Ord: Map.OrderedType) = struct
+
+include Map.Make(Ord)
+
+let find_opt k t =
+       (* avoid raising exceptions, they can be expensive *)
+       if mem k t then Some (find k t) else None

I disagree with this argument. Exceptions in OCaml are cheap because they don't walk the stack and cut to the exception handler directly. Is there a reason why they could be expensive here? In any case, the code is correct.

+
+let update k f t =
+  let r = find_opt k t in
+  let r' = f r in
+  match r, r' with
+  | None, None -> t
+  | Some _, None -> remove k t
+  | Some r, Some r' when r == r' -> t
+  | _, Some r' -> add k r' t

This looks correct to me.

+
+end
+end
+
 module String = struct include String

 let of_char c = String.make 1 c
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index dc42535092..f513f4e608 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -13,6 +13,8 @@
  * GNU Lesser General Public License for more details.
  *)

+open Stdext
+
 module Node =
 struct
        type ('a,'b) t =  {
--
2.25.1

-- 
Acked-by: Christian Lindig <christian.lindig@citrix.com>


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06
  2020-08-28  8:30   ` Christian Lindig
@ 2020-08-28 17:43     ` Edwin Torok
  2020-09-01 12:39       ` Christian Lindig
  0 siblings, 1 reply; 9+ messages in thread
From: Edwin Torok @ 2020-08-28 17:43 UTC (permalink / raw)
  To: Christian Lindig, xen-devel; +Cc: Ian Jackson, dave, wl

[-- Attachment #1: Type: text/plain, Size: 8567 bytes --]

On Fri, 2020-08-28 at 10:30 +0200, Christian Lindig wrote:
> +let find_opt k t =
> 
> +       (* avoid raising exceptions, they can be expensive *)
> 
> +       if mem k t then Some (find k t) else None
> 
> 
> 
> I disagree with this argument. Exceptions in OCaml are cheap because
> they don't walk the stack and cut to the exception handler directly.
> Is there a reason why they could be expensive here? In any case, the
> code is correct.
> 

Interesting question, it is best to measure.

The answer depends on whether the key is found or not in the map.
I'll change the impl to the one with find+catch, I think we  might be
looking up keys that are present more often than those that are missing
(although the 4.06 series will make this moot, 4.06 version is faster
than both approaches, regardless whether key is present or not).

To sum up the measurements below:
If the key is not found then my approach is faster (takes only 80% of
time), if the key is found then find+catch is faster (again an approx
80% of the time taken).

I based my comment on the documentation of raise_notrace, which says:
"A faster version raise which does not record the backtrace.",
which implies that recording the backtrace has a measurable perf
impact.
One could argue that if performance matters backtraces should be turned
off in production, but I think the value of having backtraces when some
hard-to-reprodue bug occurs outweights any perf penalty.
We should try to use exceptions only for unexpected situations though,
not finding a value in a map doesn't qualify.

See the attachment for a small micro-benchmark:
$ dune exec --profile=release -- ./updatet.exe raise
Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'.
┌───────────────┬──────────┬────────────┐
│ Name          │ Time/Run │ Percentage │
├───────────────┼──────────┼────────────┤
│ raise         │  33.52ns │    100.00% │
│ raise_notrace │  19.16ns │     57.16% │
└───────────────┴──────────┴────────────┘

So raising with a backtrace is measurably slower, taking the backtrace
spends some CPU cycles.

$ dune exec --profile=release -- ./updatet.exe find-opt
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬────────────┐
│ Name                     │ Time/Run │ Percentage │
├──────────────────────────┼──────────┼────────────┤
│ find_opt 4.06:10         │  49.10ns │     24.06% │
│ find_opt 4.06:100        │ 115.38ns │     56.55% │
│ find_opt 4.06:1000       │ 161.27ns │     79.03% │
│ find_opt=mem+find:10     │  50.48ns │     24.74% │
│ find_opt=mem+find:100    │ 110.39ns │     54.10% │
│ find_opt=mem+find:1000   │ 162.48ns │     79.63% │
│ find_opt=find+catch:10   │  89.10ns │     43.67% │
│ find_opt=find+catch:100  │ 160.80ns │     78.80% │
│ find_opt=find+catch:1000 │ 204.04ns │    100.00% │
└──────────────────────────┴──────────┴────────────┘


4.06 and mem+find take 80% of the time of find+catch.

But of course if the key is actually found in the map then we have
this: 
edwin@edvin-tower:~/uex % dune exec --profile=release -- ./updatet.exe
find-opt-found
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬─────────┬────────────┐
│ Name                     │ Time/Run │ mWd/Run │ Percentage │
├──────────────────────────┼──────────┼─────────┼────────────┤
│ find_opt 4.06:10         │  38.38ns │   2.00w │     52.65% │
│ find_opt 4.06:100        │  20.66ns │   2.00w │     28.35% │
│ find_opt 4.06:1000       │  20.63ns │   2.00w │     28.30% │
│ find_opt=mem+find:10     │  72.89ns │   2.00w │    100.00% │
│ find_opt=mem+find:100    │  39.06ns │   2.00w │     53.59% │
│ find_opt=mem+find:1000   │  39.07ns │   2.00w │     53.60% │
│ find_opt=find+catch:10   │  49.54ns │   2.00w │     67.97% │
│ find_opt=find+catch:100  │  33.01ns │   2.00w │     45.29% │
│ find_opt=find+catch:1000 │  32.97ns │   2.00w │     45.23% │
└──────────────────────────┴──────────┴─────────┴────────────┘

In this case find+catch is faster.

And here is update for a key that is not present:
$ dune exec --profile=release -- ./updatet.exe update
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  79.96ns │  24.00w │     17.27% │
│ update 4.06:100                   │ 171.96ns │  48.00w │     37.15% │
│ update 4.06:1000                  │ 243.95ns │  66.00w │     52.70% │
│ update=find+catch+add/remove:10   │ 183.46ns │  24.00w │     39.63% │
│ update=find+catch+add/remove:100  │ 340.00ns │  48.00w │     73.45% │
│ update=find+catch+add/remove:1000 │ 462.89ns │  66.00w │    100.00% │
│ update=mem+find+add/remove:10     │ 126.06ns │  24.00w │     27.23% │
│ update=mem+find+add/remove:100    │ 274.79ns │  48.00w │     59.36% │
│ update=mem+find+add/remove:1000   │ 401.62ns │  66.00w │     86.76% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Here 4.06 is a clear win, and mem+add is faster than find+catch+add.
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  72.76ns │  24.00w │     31.25% │
│ update 4.06:100                   │ 164.69ns │  48.00w │     70.74% │
│ update 4.06:1000                  │ 232.79ns │  66.00w │    100.00% │
│ update=find+catch+add/remove:10   │ 133.24ns │  23.00w │     57.24% │
│ update=find+catch+add/remove:100  │ 118.76ns │  35.00w │     51.02% │
│ update=find+catch+add/remove:1000 │ 161.22ns │  59.00w │     69.26% │
│ update=mem+find+add/remove:10     │ 156.29ns │  23.00w │     67.14% │
│ update=mem+find+add/remove:100    │ 122.98ns │  35.00w │     52.83% │
│ update=mem+find+add/remove:1000   │ 161.53ns │  59.00w │     69.39% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Interestingly here the 4.06 implementation is actually slower and not
much difference between my other two.

Best regards,
--Edwin

[-- Attachment #2: dune --]
[-- Type: text/plain, Size: 53 bytes --]

(executable
 (name updatet)
 (libraries core_bench))

[-- Attachment #3: dune-project --]
[-- Type: text/plain, Size: 16 bytes --]

(lang dune 2.6)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: updatet.ml --]
[-- Type: text/x-ocaml; name="updatet.ml", Size: 3674 bytes --]

module StringMap = Map.Make (String)

let find_opt_back k t =
  (* avoid raising exceptions, they can be expensive *)
  if StringMap.mem k t then Some (StringMap.find k t) else None

let find_opt_raise k t = try Some (StringMap.find k t) with Not_found -> None

let update_raise k f t =
  let r = find_opt_raise k t in
  let r' = f r in
  match (r, r') with
  | None, None ->
      t
  | Some _, None ->
      StringMap.remove k t
  | Some r, Some r' when r == r' ->
      t
  | _, Some r' ->
      StringMap.add k r' t

let update k f t =
  let r = find_opt_back k t in
  let r' = f r in
  match (r, r') with
  | None, None ->
      t
  | Some _, None ->
      StringMap.remove k t
  | Some r, Some r' when r == r' ->
      t
  | _, Some r' ->
      StringMap.add k r' t

let do_raise () = raise Not_found

let do_raise_notrace () = raise_notrace Not_found

let dummy = ref 0

let wrap f () =
  try if Sys.opaque_identity !dummy = 0 then f () with Not_found -> ()

(* open here to make sure we use stdlib impl above *)
open! Core
open Core_bench

let pre = "/local/domain/0"

let test_map = List.init 1000 (fun i -> pre ^ string_of_int i)

let args = [10; 100; 1000]

let mk_map_bench ~name f =
  Bench.Test.create_indexed ~name ~args (fun len ->
      (* this runs before the benchmark *)
      let m =
        List.init len (fun i -> (pre ^ string_of_int i, "value"))
        |> Caml.List.to_seq |> StringMap.of_seq
      in
      Staged.stage (fun () ->
          (* this is the benchmarked function, the benchmark framework takes care of
             running it multiple times, and avoiding the compiler optimizing it away
             by "using" the result *)
          f m))

let () =
  Printexc.record_backtrace true ;
  let key = pre ^ "/nonexistent" in
  let keyf = pre ^ "5" in
  let f = function Some x -> None | None -> Some "value2" in
  Command.run
    (Command.group ~summary:"exception handling benchmarkls"
       [ ( "raise"
         , Bench.make_command
             [ Bench.Test.create ~name:"raise" (wrap do_raise)
             ; Bench.Test.create ~name:"raise_notrace" (wrap do_raise_notrace)
             ] )
       ; ( "find-opt"
         , Bench.make_command
             [ mk_map_bench ~name:"find_opt 4.06" (fun m ->
                   StringMap.find_opt key m)
             ; mk_map_bench ~name:"find_opt=mem+find" (fun m ->
                   find_opt_back key m)
             ; mk_map_bench ~name:"find_opt=find+catch" (fun m ->
                   find_opt_raise key m) ] )
       ; ( "find-opt-found"
         , Bench.make_command
             [ mk_map_bench ~name:"find_opt 4.06" (fun m ->
                   StringMap.find_opt keyf m)
             ; mk_map_bench ~name:"find_opt=mem+find" (fun m ->
                   find_opt_back keyf m)
             ; mk_map_bench ~name:"find_opt=find+catch" (fun m ->
                   find_opt_raise keyf m) ] )
       ; ( "update"
         , Bench.make_command
             [ mk_map_bench ~name:"update 4.06" (fun m ->
                   StringMap.update key f m)
             ; mk_map_bench ~name:"update=find+catch+add/remove" (fun m ->
                   update_raise key f m)
             ; mk_map_bench ~name:"update=mem+find+add/remove" (fun m ->
                   update key f m) ]

             )
       ; ( "update-found"
         , Bench.make_command
             [ mk_map_bench ~name:"update 4.06" (fun m ->
                   StringMap.update key f m)
             ; mk_map_bench ~name:"update=find+catch+add/remove" (fun m ->
                   update_raise keyf f m)
             ; mk_map_bench ~name:"update=mem+find+add/remove" (fun m ->
                   update keyf f m) ] ) ])


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06
  2020-08-28 17:43     ` Edwin Torok
@ 2020-09-01 12:39       ` Christian Lindig
  0 siblings, 0 replies; 9+ messages in thread
From: Christian Lindig @ 2020-09-01 12:39 UTC (permalink / raw)
  To: Edwin Torok, xen-devel; +Cc: Ian Jackson, dave, wl

Thanks for this interesting experiment. My recollection was correct that exceptions are very fast but only if you don't capture stack traces (which we we do capture). With backtraces the difference is not massive. It still does pay off to use exceptions when they are raised rarely as the it saves the lookup for mem(). And moving to OCaml 4.06 improves performance.

-- C
________________________________________
From: Edwin Torok
Sent: 28 August 2020 18:43
To: Christian Lindig; xen-devel@lists.xenproject.org
Cc: Ian Jackson; dave@recoil.org; wl@xen.org
Subject: Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06

On Fri, 2020-08-28 at 10:30 +0200, Christian Lindig wrote:
> +let find_opt k t =
>
> +       (* avoid raising exceptions, they can be expensive *)
>
> +       if mem k t then Some (find k t) else None
>
>
>
> I disagree with this argument. Exceptions in OCaml are cheap because
> they don't walk the stack and cut to the exception handler directly.
> Is there a reason why they could be expensive here? In any case, the
> code is correct.
>

Interesting question, it is best to measure.

The answer depends on whether the key is found or not in the map.
I'll change the impl to the one with find+catch, I think we  might be
looking up keys that are present more often than those that are missing
(although the 4.06 series will make this moot, 4.06 version is faster
than both approaches, regardless whether key is present or not).

To sum up the measurements below:
If the key is not found then my approach is faster (takes only 80% of
time), if the key is found then find+catch is faster (again an approx
80% of the time taken).

I based my comment on the documentation of raise_notrace, which says:
"A faster version raise which does not record the backtrace.",
which implies that recording the backtrace has a measurable perf
impact.
One could argue that if performance matters backtraces should be turned
off in production, but I think the value of having backtraces when some
hard-to-reprodue bug occurs outweights any perf penalty.
We should try to use exceptions only for unexpected situations though,
not finding a value in a map doesn't qualify.

See the attachment for a small micro-benchmark:
$ dune exec --profile=release -- ./updatet.exe raise
Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'.
┌───────────────┬──────────┬────────────┐
│ Name          │ Time/Run │ Percentage │
├───────────────┼──────────┼────────────┤
│ raise         │  33.52ns │    100.00% │
│ raise_notrace │  19.16ns │     57.16% │
└───────────────┴──────────┴────────────┘

So raising with a backtrace is measurably slower, taking the backtrace
spends some CPU cycles.

$ dune exec --profile=release -- ./updatet.exe find-opt
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬────────────┐
│ Name                     │ Time/Run │ Percentage │
├──────────────────────────┼──────────┼────────────┤
│ find_opt 4.06:10         │  49.10ns │     24.06% │
│ find_opt 4.06:100        │ 115.38ns │     56.55% │
│ find_opt 4.06:1000       │ 161.27ns │     79.03% │
│ find_opt=mem+find:10     │  50.48ns │     24.74% │
│ find_opt=mem+find:100    │ 110.39ns │     54.10% │
│ find_opt=mem+find:1000   │ 162.48ns │     79.63% │
│ find_opt=find+catch:10   │  89.10ns │     43.67% │
│ find_opt=find+catch:100  │ 160.80ns │     78.80% │
│ find_opt=find+catch:1000 │ 204.04ns │    100.00% │
└──────────────────────────┴──────────┴────────────┘


4.06 and mem+find take 80% of the time of find+catch.

But of course if the key is actually found in the map then we have
this:
edwin@edvin-tower:~/uex % dune exec --profile=release -- ./updatet.exe
find-opt-found
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬─────────┬────────────┐
│ Name                     │ Time/Run │ mWd/Run │ Percentage │
├──────────────────────────┼──────────┼─────────┼────────────┤
│ find_opt 4.06:10         │  38.38ns │   2.00w │     52.65% │
│ find_opt 4.06:100        │  20.66ns │   2.00w │     28.35% │
│ find_opt 4.06:1000       │  20.63ns │   2.00w │     28.30% │
│ find_opt=mem+find:10     │  72.89ns │   2.00w │    100.00% │
│ find_opt=mem+find:100    │  39.06ns │   2.00w │     53.59% │
│ find_opt=mem+find:1000   │  39.07ns │   2.00w │     53.60% │
│ find_opt=find+catch:10   │  49.54ns │   2.00w │     67.97% │
│ find_opt=find+catch:100  │  33.01ns │   2.00w │     45.29% │
│ find_opt=find+catch:1000 │  32.97ns │   2.00w │     45.23% │
└──────────────────────────┴──────────┴─────────┴────────────┘

In this case find+catch is faster.

And here is update for a key that is not present:
$ dune exec --profile=release -- ./updatet.exe update
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  79.96ns │  24.00w │     17.27% │
│ update 4.06:100                   │ 171.96ns │  48.00w │     37.15% │
│ update 4.06:1000                  │ 243.95ns │  66.00w │     52.70% │
│ update=find+catch+add/remove:10   │ 183.46ns │  24.00w │     39.63% │
│ update=find+catch+add/remove:100  │ 340.00ns │  48.00w │     73.45% │
│ update=find+catch+add/remove:1000 │ 462.89ns │  66.00w │    100.00% │
│ update=mem+find+add/remove:10     │ 126.06ns │  24.00w │     27.23% │
│ update=mem+find+add/remove:100    │ 274.79ns │  48.00w │     59.36% │
│ update=mem+find+add/remove:1000   │ 401.62ns │  66.00w │     86.76% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Here 4.06 is a clear win, and mem+add is faster than find+catch+add.
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  72.76ns │  24.00w │     31.25% │
│ update 4.06:100                   │ 164.69ns │  48.00w │     70.74% │
│ update 4.06:1000                  │ 232.79ns │  66.00w │    100.00% │
│ update=find+catch+add/remove:10   │ 133.24ns │  23.00w │     57.24% │
│ update=find+catch+add/remove:100  │ 118.76ns │  35.00w │     51.02% │
│ update=find+catch+add/remove:1000 │ 161.22ns │  59.00w │     69.26% │
│ update=mem+find+add/remove:10     │ 156.29ns │  23.00w │     67.14% │
│ update=mem+find+add/remove:100    │ 122.98ns │  35.00w │     52.83% │
│ update=mem+find+add/remove:1000   │ 161.53ns │  59.00w │     69.39% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Interestingly here the 4.06 implementation is actually slower and not
much difference between my other two.

Best regards,
--Edwin


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees
  2021-01-15 22:28 [PATCH v2 0/2] oxenstored build enhancements Edwin Török
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 9+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

This changes the output of xenstore-ls to be sorted.
Previously the keys were listed in the order in which they were inserted
in.
docs/misc/xenstore.txt doesn't specify in what order keys are listed.

Map.update is used to retain semantics with replace_child:
only an existing child is replaced, if it wasn't part of the original
map we don't add it.
Similarly exception behaviour is retained for del_childname and related
functions.

Entries are stored in reverse sort order, so that upon Map.fold the
constructed list is sorted in ascending order and there is no need for a
List.rev.

This changes the semantics and is not suitable as is for a backport.
It reveals bugs in buggy clients that depend on xenstore entry order,
however those clients should be fixed.
(We found one such bug in our internal testsuite where the first
 xenstore entry from a subtree was always dropped, and changing the
 listing order changed what key got dropped making the test fail)

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changed since V3:
* repost after XSA to avoid conflicts
---
 tools/ocaml/xenstored/store.ml   | 48 +++++++++++++++-----------------
 tools/ocaml/xenstored/symbol.ml  |  4 +++
 tools/ocaml/xenstored/symbol.mli |  3 ++
 3 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 9c226e4ef7..5f155f45eb 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -16,17 +16,19 @@
  *)
 open Stdext
 
+module SymbolMap = Map.Make(Symbol)
+
 module Node = struct
 
 type t = {
 	name: Symbol.t;
 	perms: Perms.Node.t;
 	value: string;
-	children: t list;
+	children: t SymbolMap.t;
 }
 
 let create _name _perms _value =
-	{ name = Symbol.of_string _name; perms = _perms; value = _value; children = []; }
+	{ name = Symbol.of_string _name; perms = _perms; value = _value; children = SymbolMap.empty; }
 
 let get_owner node = Perms.Node.get_owner node.perms
 let get_children node = node.children
@@ -42,38 +44,34 @@ let set_value node nvalue =
 let set_perms node nperms = { node with perms = nperms }
 
 let add_child node child =
-	{ node with children = child :: node.children }
+	let children = SymbolMap.add child.name child node.children in
+	{ node with children }
 
 let exists node childname =
 	let childname = Symbol.of_string childname in
-	List.exists (fun n -> Symbol.equal n.name childname) node.children
+	SymbolMap.mem childname node.children
 
 let find node childname =
 	let childname = Symbol.of_string childname in
-	List.find (fun n -> Symbol.equal n.name childname) node.children
+	SymbolMap.find childname node.children
 
 let replace_child node child nchild =
-	(* this is the on-steroid version of the filter one-replace one *)
-	let rec replace_one_in_list l =
-		match l with
-		| []                               -> []
-		| h :: tl when Symbol.equal h.name child.name -> nchild :: tl
-		| h :: tl                          -> h :: replace_one_in_list tl
-		in
-	{ node with children = (replace_one_in_list node.children) }
+	{ node with
+	  children = SymbolMap.update child.name
+		     (function None -> None | Some _ -> Some nchild)
+		     node.children
+	}
 
 let del_childname node childname =
 	let sym = Symbol.of_string childname in
-	let rec delete_one_in_list l =
-		match l with
-		| []                        -> raise Not_found
-		| h :: tl when Symbol.equal h.name sym -> tl
-		| h :: tl                   -> h :: delete_one_in_list tl
-		in
-	{ node with children = (delete_one_in_list node.children) }
+	{ node with children =
+		SymbolMap.update sym
+		  (function None -> raise Not_found | Some _ -> None)
+		  node.children
+	}
 
 let del_all_children node =
-	{ node with children = [] }
+	{ node with children = SymbolMap.empty }
 
 (* check if the current node can be accessed by the current connection with rperm permissions *)
 let check_perm node connection request =
@@ -87,12 +85,12 @@ let check_owner node connection =
 		raise Define.Permission_denied;
 	end
 
-let rec recurse fct node = fct node; List.iter (recurse fct) node.children
+let rec recurse fct node = fct node; SymbolMap.iter (fun _ -> recurse fct) node.children
 
 (** [recurse_map f tree] applies [f] on each node in the tree recursively *)
 let recurse_map f =
 	let rec walk node =
-		f { node with children = List.rev_map walk node.children |> List.rev }
+		  f { node with children = SymbolMap.map walk node.children }
 	in
 	walk
 
@@ -336,7 +334,7 @@ let ls store perm path =
 				Node.check_perm cnode perm Perms.READ;
 				cnode.Node.children in
 			Path.apply store.root path do_ls in
-	List.rev (List.map (fun n -> Symbol.to_string n.Node.name) children)
+	SymbolMap.fold (fun k _ accu -> Symbol.to_string k :: accu) children []
 
 let getperms store perm path =
 	if path = [] then (
@@ -366,7 +364,7 @@ let traversal root_node f =
 	let rec _traversal path node =
 		f path node;
 		let node_path = Path.of_path_and_name path (Symbol.to_string node.Node.name) in
-		List.iter (_traversal node_path) (List.rev node.Node.children)
+		SymbolMap.iter (fun _ -> _traversal node_path) node.Node.children
 		in
 	_traversal [] root_node
 
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 2b41d120f6..301639f16f 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,6 +31,10 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the above weak table *)
   a == b
 
+let compare a b =
+  if equal a b then 0
+  else -(String.compare a b)
+
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
   len, entries
diff --git a/tools/ocaml/xenstored/symbol.mli b/tools/ocaml/xenstored/symbol.mli
index 586ab57507..dd0f014796 100644
--- a/tools/ocaml/xenstored/symbol.mli
+++ b/tools/ocaml/xenstored/symbol.mli
@@ -32,6 +32,9 @@ val to_string : t -> string
 val equal: t -> t -> bool
 (** Compare two symbols for equality *)
 
+val compare: t -> t -> int
+(** Compare two symbols *)
+
 (** {6 Statistics } *)
 
 val stats : unit -> int * int
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-01-15 22:53 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-27 17:35 [PATCH v4 0/4] tools/ocaml/xenstored: simplify code Edwin Török
2020-08-27 17:35 ` [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references Edwin Török
2020-08-27 17:35 ` [PATCH v4 2/4] Map: backport find_opt/update from 4.06 Edwin Török
2020-08-28  8:30   ` Christian Lindig
2020-08-28 17:43     ` Edwin Torok
2020-09-01 12:39       ` Christian Lindig
2020-08-27 17:35 ` [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees Edwin Török
2020-08-27 17:35 ` [PATCH v4 4/4] tools/ocaml/xenstored: use more efficient tries Edwin Török
2021-01-15 22:28 [PATCH v2 0/2] oxenstored build enhancements Edwin Török
2021-01-15 22:28 ` [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees Edwin Török

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.