* [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication
@ 2019-01-15  9:27 Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 01/14] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
                   ` (14 more replies)
  0 siblings, 15 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Lars Kurth, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Version four of this patch series.

* Changes are primarily addressing feedback from the v3 series reviews.
  Many points are noted on the individual commit posts.

* The register ring interface uses Xen gfns as page identifiers,
  and the arguments no longer specify page granularity.

* Multi-level lock validation macros defined and applied.
  Locks renamed to improve readability.

* Hypercall argument struct checking is folded inline into the series,
  checks applied as types are introduced.

* argo-mac string boot parameter changed to argo-mac-permissive boolean

Feedback items that are remaining to be addressed have been noted with
comments in the commit message and at the location in the code.

Christopher Clark (14):
  argo: Introduce the Kconfig option to govern inclusion of Argo
  argo: introduce the argo_op hypercall boilerplate
  argo: define argo_dprintk for subsystem debugging
  argo: init, destroy and soft-reset, with enable command line opt
  errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  xen/arm: introduce guest_handle_for_field()
  argo: implement the register op
  argo: implement the unregister op
  argo: implement the sendv op; evtchn: expose send_guest_global_virq
  argo: implement the notify op
  xsm, argo: XSM control for argo register
  xsm, argo: XSM control for argo message send operation
  xsm, argo: XSM control for any access to argo by a domain
  xsm, argo: notify: don't describe rings that cannot be sent to

 docs/misc/xen-command-line.pandoc            |   30 +
 tools/flask/policy/modules/guest_features.te |    7 +
 xen/arch/x86/guest/hypercall_page.S          |    2 +-
 xen/arch/x86/hvm/hypercall.c                 |    3 +
 xen/arch/x86/hypercall.c                     |    3 +
 xen/arch/x86/pv/hypercall.c                  |    3 +
 xen/common/Kconfig                           |   19 +
 xen/common/Makefile                          |    3 +-
 xen/common/argo.c                            | 2213 ++++++++++++++++++++++++++
 xen/common/compat/argo.c                     |   62 +
 xen/common/domain.c                          |   12 +
 xen/common/event_channel.c                   |    2 +-
 xen/include/Makefile                         |    1 +
 xen/include/asm-arm/guest_access.h           |    3 +
 xen/include/public/argo.h                    |  287 ++++
 xen/include/public/errno.h                   |    2 +
 xen/include/public/xen.h                     |    4 +-
 xen/include/xen/argo.h                       |   44 +
 xen/include/xen/event.h                      |    7 +
 xen/include/xen/hypercall.h                  |    9 +
 xen/include/xen/sched.h                      |    5 +
 xen/include/xlat.lst                         |    8 +
 xen/include/xsm/dummy.h                      |   25 +
 xen/include/xsm/xsm.h                        |   31 +
 xen/xsm/dummy.c                              |    6 +
 xen/xsm/flask/hooks.c                        |   41 +-
 xen/xsm/flask/policy/access_vectors          |   16 +
 xen/xsm/flask/policy/security_classes        |    1 +
 28 files changed, 2841 insertions(+), 8 deletions(-)
 create mode 100644 xen/common/argo.c
 create mode 100644 xen/common/compat/argo.c
 create mode 100644 xen/include/public/argo.h
 create mode 100644 xen/include/xen/argo.h

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v4 01/14] argo: Introduce the Kconfig option to govern inclusion of Argo
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 02/14] argo: introduce the argo_op hypercall boilerplate Christopher Clark
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Defines CONFIG_ARGO when enabled. Default: disabled.

When the Kconfig option is enabled, the Argo hypercall implementation
will be included, allowing use of the hypervisor-mediated interdomain
communication mechanism.

Argo is implemented for x86 and ARM hardware platforms.

Availability of the option depends on EXPERT and Argo is currently an
experimental feature.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
(Jan's ack applies for committing together with at least one patch
 using the CONFIG_ARGO symbol.)

v3 added Jan's Ack
v2 #01 feedback, Jan: replace def_bool/prompt with bool
v1 #02 feedback, Jan: default Kconfig off, use EXPERT, fix whitespace

 xen/common/Kconfig | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 37f8505..5e1251e 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -200,6 +200,25 @@ config LATE_HWDOM
 
 	  If unsure, say N.
 
+config ARGO
+	bool "Argo: hypervisor-mediated interdomain communication" if EXPERT = "y"
+	---help---
+	  Enables a hypercall for domains to ask the hypervisor to perform
+	  data transfer of messages between domains.
+
+	  This allows communication channels to be established that do not
+	  require any shared memory between domains; the hypervisor is the
+	  entity that each domain interacts with. The hypervisor is able to
+	  enforce Mandatory Access Control policy over the communication.
+
+	  If XSM_FLASK is enabled, XSM policy can govern which domains may
+	  communicate via the Argo system.
+
+	  This feature does nothing if the "argo" boot parameter is not present.
+	  Argo is disabled at runtime by default.
+
+	  If unsure, say N.
+
 menu "Schedulers"
 	visible if EXPERT = "y"
 
-- 
2.7.4



* [PATCH v4 02/14] argo: introduce the argo_op hypercall boilerplate
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 01/14] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 03/14] argo: define argo_dprintk for subsystem debugging Christopher Clark
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Presence is gated upon CONFIG_ARGO.

Registers the hypercall previously reserved for this.
Takes 5 arguments, does nothing and returns -ENOSYS.

A compat ABI is avoided by using fixed-size types in the hypercall ops, so
HYPERCALL, rather than COMPAT_CALL, is the correct macro for the hypercall
tables.

Even though handles will be used for (up to) two of the arguments to the
hypercall, there will be no need for any XLAT_* translation functions
because the referenced data structures have been constructed to be exactly
the same size and bit pattern on both 32-bit and 64-bit guests, and padded
to be integer multiples of 32 bits in size. This means that the same
copy_to_guest and copy_from_guest logic can be relied upon to perform as
required without any further intervention. Testing communication with 32
and 64 bit guests has confirmed this works as intended.
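
As an illustrative sketch of the fixed-size approach described above (this is
not the actual public/argo.h layout; the struct and field names below are
hypothetical), a structure built only from fixed-width types with explicit
padding has the same size and field offsets for 32-bit and 64-bit guests, and
compile-time checks can confirm that:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical address structure, for illustration only.
 * Every member has a fixed width and the padding is explicit, so the
 * layout does not depend on the guest's word size.
 */
typedef struct example_addr {
    uint32_t port;       /* fixed 32-bit field */
    uint16_t domain_id;  /* fixed 16-bit field */
    uint16_t pad;        /* explicit padding up to a multiple of 32 bits */
} example_addr_t;

/* Compile-time checks: the build fails if the layout ever changes. */
_Static_assert(sizeof(example_addr_t) == 8, "unexpected struct size");
_Static_assert(offsetof(example_addr_t, domain_id) == 4, "unexpected offset");
_Static_assert(sizeof(example_addr_t) % sizeof(uint32_t) == 0,
               "size must be a multiple of 32 bits");

/* Runtime form of the same checks; returns 1 when the layout is as expected. */
static int example_addr_layout_ok(void)
{
    return sizeof(example_addr_t) == 8 &&
           offsetof(example_addr_t, domain_id) == 4 &&
           offsetof(example_addr_t, pad) == 6;
}
```

Because no member is a pointer or a long, the same bytes can be copied in
either direction without a translation step.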

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v2 Copyright line: add 2019
v2 feedback #3 Jan: drop "message" from argo_message_op
v2 feedback #3 Jan: add Acked-by
v1 feedback #15 Jan: handle upper-halves of hypercall args
v1 feedback #15 Jan: use unsigned where negative values impossible

 xen/arch/x86/guest/hypercall_page.S |  2 +-
 xen/arch/x86/hvm/hypercall.c        |  3 +++
 xen/arch/x86/hypercall.c            |  3 +++
 xen/arch/x86/pv/hypercall.c         |  3 +++
 xen/common/Makefile                 |  1 +
 xen/common/argo.c                   | 28 ++++++++++++++++++++++++++++
 xen/include/public/xen.h            |  2 +-
 xen/include/xen/hypercall.h         |  9 +++++++++
 8 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 xen/common/argo.c

diff --git a/xen/arch/x86/guest/hypercall_page.S b/xen/arch/x86/guest/hypercall_page.S
index fdd2e72..26afabf 100644
--- a/xen/arch/x86/guest/hypercall_page.S
+++ b/xen/arch/x86/guest/hypercall_page.S
@@ -59,7 +59,7 @@ DECLARE_HYPERCALL(sysctl)
 DECLARE_HYPERCALL(domctl)
 DECLARE_HYPERCALL(kexec_op)
 DECLARE_HYPERCALL(tmem_op)
-DECLARE_HYPERCALL(xc_reserved_op)
+DECLARE_HYPERCALL(argo_op)
 DECLARE_HYPERCALL(xenpmu_op)
 
 DECLARE_HYPERCALL(arch_0)
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 19d1263..b4eaac3 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -134,6 +134,9 @@ static const hypercall_table_t hvm_hypercall_table[] = {
 #ifdef CONFIG_TMEM
     HYPERCALL(tmem_op),
 #endif
+#ifdef CONFIG_ARGO
+    HYPERCALL(argo_op),
+#endif
     COMPAT_CALL(platform_op),
 #ifdef CONFIG_PV
     COMPAT_CALL(mmuext_op),
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 032de8f..93e7860 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -64,6 +64,9 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(domctl, 1),
     ARGS(kexec_op, 2),
     ARGS(tmem_op, 1),
+#ifdef CONFIG_ARGO
+    ARGS(argo_op, 5),
+#endif
     ARGS(xenpmu_op, 2),
 #ifdef CONFIG_HVM
     ARGS(hvm_op, 2),
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 5d11911..ed75053 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -77,6 +77,9 @@ const hypercall_table_t pv_hypercall_table[] = {
 #ifdef CONFIG_TMEM
     HYPERCALL(tmem_op),
 #endif
+#ifdef CONFIG_ARGO
+    HYPERCALL(argo_op),
+#endif
     HYPERCALL(xenpmu_op),
 #ifdef CONFIG_HVM
     HYPERCALL(hvm_op),
diff --git a/xen/common/Makefile b/xen/common/Makefile
index ffdfb74..8c65c6f 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,3 +1,4 @@
+obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
 obj-y += bsearch.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
diff --git a/xen/common/argo.c b/xen/common/argo.c
new file mode 100644
index 0000000..d69ad7c
--- /dev/null
+++ b/xen/common/argo.c
@@ -0,0 +1,28 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Derived from v4v, the version 2 of v2v.
+ *
+ * Copyright (c) 2010, Citrix Systems
+ * Copyright (c) 2018-2019 BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/errno.h>
+#include <xen/guest_access.h>
+
+long
+do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
+           XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
+           unsigned long arg4)
+{
+    return -ENOSYS;
+}
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 1a56871..b3f6491 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -118,7 +118,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_domctl               36
 #define __HYPERVISOR_kexec_op             37
 #define __HYPERVISOR_tmem_op              38
-#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
+#define __HYPERVISOR_argo_op              39
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
 
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea..e2f61d6 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -136,6 +136,15 @@ do_tmem_op(
     XEN_GUEST_HANDLE_PARAM(tmem_op_t) uops);
 #endif
 
+#ifdef CONFIG_ARGO
+extern long do_argo_op(
+    unsigned int cmd,
+    XEN_GUEST_HANDLE_PARAM(void) arg1,
+    XEN_GUEST_HANDLE_PARAM(void) arg2,
+    unsigned long arg3,
+    unsigned long arg4);
+#endif
+
 extern long
 do_xenoprof_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg);
 
-- 
2.7.4



* [PATCH v4 03/14] argo: define argo_dprintk for subsystem debugging
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 01/14] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 02/14] argo: introduce the argo_op hypercall boilerplate Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

A convenience for working on development of the argo subsystem:
setting a #define variable enables additional debug messages.
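
A minimal standalone sketch of the same pattern, assuming a hosted C
environment with fprintf in place of Xen's printk (the EXAMPLE_DEBUG and
example_dprintk names are hypothetical, and the standard __VA_ARGS__ form is
used here rather than the GNU named-variadic form in the patch):

```c
#include <assert.h>
#include <stdio.h>

/* Change this to #define EXAMPLE_DEBUG to enable the debug messages. */
#undef EXAMPLE_DEBUG

#ifdef EXAMPLE_DEBUG
/* The prefix is joined to the format by string literal concatenation. */
#define example_dprintk(...) fprintf(stderr, "example: " __VA_ARGS__)
#else
/* Expands to a no-op expression, so call sites still need a semicolon. */
#define example_dprintk(...) ((void)0)
#endif

/* Hypothetical caller: the debug line disappears unless EXAMPLE_DEBUG is set. */
static int example_add(int a, int b)
{
    example_dprintk("adding %d and %d\n", a, b);
    return a + b;
}
```

With the macro disabled, the arguments are discarded by the preprocessor and
are never evaluated, so debug call sites should not carry side effects.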

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v3 added Roger's Reviewed-by
v3 added Jan's Ack
v2 #03 feedback, Jan: fix ifdef/define confusion error
v1 #04 feedback, Jan: fix dprintk implementation

 xen/common/argo.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index d69ad7c..6f782f7 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -19,6 +19,15 @@
 #include <xen/errno.h>
 #include <xen/guest_access.h>
 
+/* Change this to #define ARGO_DEBUG here to enable more debug messages */
+#undef ARGO_DEBUG
+
+#ifdef ARGO_DEBUG
+#define argo_dprintk(format, args...) printk("argo: " format, ## args )
+#else
+#define argo_dprintk(format, ... ) ((void)0)
+#endif
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
-- 
2.7.4



* [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (2 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 03/14] argo: define argo_dprintk for subsystem debugging Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15 12:29   ` Roger Pau Monné
  2019-01-15  9:27 ` [PATCH v4 05/14] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Initialises basic data structures and performs teardown of argo state
for domain shutdown.

Inclusion of the Argo implementation is dependent on CONFIG_ARGO.

Introduces a new Xen command line parameter 'argo': bool to enable/disable
the argo hypercall. Defaults to disabled.

New headers:
  public/argo.h: with definitions of addresses and ring structure, including
  indexes for atomic update for communication between domain and hypervisor.

  xen/argo.h: to expose the hooks for integration into domain lifecycle:
    argo_init: per-domain init of argo data structures for domain_create.
    argo_destroy: teardown for domain_destroy and the error exit
                  path of domain_create.
    argo_soft_reset: reset of domain state for domain_soft_reset.

Adds two new fields to struct domain:
    rwlock_t argo_lock;
    struct argo_domain *argo;

In accordance with recent work on _domain_destroy, argo_destroy is
idempotent. It will tear down: all rings registered by this domain, all
rings where this domain is the single sender (ie. specified partner,
non-wildcard rings), and all pending notifications where this domain is
awaiting signal about available space in the rings of other domains.
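
The idempotence property can be sketched in isolation as follows. This is a
simplified model using malloc/free and hypothetical example_* names, not the
argo_destroy implementation in this patch, which also takes locks and walks
the ring and pending lists:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-domain state, standing in for struct argo_domain. */
struct example_state {
    unsigned int ring_count;
};

/* Hypothetical domain, standing in for the argo pointer in struct domain. */
struct example_domain {
    struct example_state *argo;
};

/* Allocate zeroed per-domain state; returns 0 on success, -1 on failure. */
static int example_init(struct example_domain *d)
{
    d->argo = calloc(1, sizeof(*d->argo));
    return d->argo ? 0 : -1;
}

/*
 * Idempotent teardown: safe to call when init never ran, and safe to call
 * more than once, because the pointer is cleared after the state is freed.
 */
static void example_destroy(struct example_domain *d)
{
    if ( !d->argo )
        return;

    free(d->argo);
    d->argo = NULL;
}
```

Clearing the pointer after freeing is what allows both the error exit path of
creation and the normal destroy path to call the same function.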

A count will be maintained of the number of rings that a domain has
registered in order to limit it below the fixed maximum limit defined here.

Macros are defined to verify the internal locking state within the argo
implementation. The macros are ASSERTed on entry to functions to validate
and document the required lock state prior to calling.
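
As a rough model of that idea (the lock type here is a plain state record
rather than Xen's rwlock_t, and all ex_ and EX_ names are hypothetical), a
validation macro combines the state of the outer and inner locks, and the
function asserts it on entry:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock state record: models who holds the lock, provides no exclusion. */
struct ex_rwlock {
    unsigned int readers;
    bool writer;
};

static bool ex_is_locked(const struct ex_rwlock *l)
{
    return l->writer || l->readers > 0;
}

static bool ex_is_write_locked(const struct ex_rwlock *l)
{
    return l->writer;
}

/* Global outer lock (L1) and a per-domain inner lock (rings_L2). */
static struct ex_rwlock ex_L1;

struct ex_domain {
    struct ex_rwlock rings_L2;
    unsigned int ring_count;
};

#define EX_LOCKING_Write_L1 (ex_is_write_locked(&ex_L1))

/* True for: any hold on rings_L2 together with any hold on L1, or W(L1). */
#define EX_LOCKING_Read_rings_L2(d) \
    ((ex_is_locked(&ex_L1) && ex_is_locked(&(d)->rings_L2)) || \
     EX_LOCKING_Write_L1)

/* The assertion documents and checks the lock state the caller must hold. */
static unsigned int ex_ring_count(const struct ex_domain *d)
{
    assert(EX_LOCKING_Read_rings_L2(d));
    return d->ring_count;
}
```

The write-locked outer lock satisfies the check on its own, mirroring the rule
that W(L1) implies the locks beneath it.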

The software license on the public header is the BSD license, standard
procedure for the public Xen headers. The public header was originally
posted under a GPL license at: [1]:
https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html

The following ACK by Lars Kurth is to confirm that only employees of
Citrix contributed to the header files in the series posted at
[1] and that thus the copyright of the files in question is fully owned by
Citrix. The ACK also confirms that Citrix is happy for the header files to
be published under a BSD license in this series (which is based on [1]).

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Lars Kurth <lars.kurth@citrix.com>
Reviewed-by: Ross Philipson <ross.philipson@oracle.com>

This version contains FIXMEs for 4.12:
 * Replace the hash function to get better distribution across buckets.
     - Don't use casts in the replacement function.
     - Drop the use of array_index_nospec.
 * since argo_destroy is in _domain_destroy, remove it from domain_kill
---
v3 #04 Andrew: use xzalloc for struct argo_domain in argo_init
v3 #04 Andrew: reference CONFIG_ARGO in the command line documentation
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #04 Andrew: don't truncate args do_argo_op printk
v3 #07 Jan: fix numeric entries in printk format strings
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld compat check for hypercall arg types
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 #04 Jan: reorder call to argo_init_domain in argo_init
v3 #04 Jan: ring_remove_mfns: zero count before freeing arrays
v3 #04 Jason/Roger: soft_reset: can assume reinit is ok if d->argo set
v3 #04 Roger: remove unused and confusing d->argo_lock
v3 #04 Roger: add simple inlines in xen/argo.h, drop ifdef CONFIG_ARGO
v3 #04 Roger: simpler return -EOPNOTSUPP in do_argo_op
v3 #04 Roger: add const to domain arg to ring_remove_info
v3 #04 Roger: use XFREE
v3 #04 Roger: newline fix in wildcard_pending_list_remove
v3 #04 Roger: mfn_mapping: void* instead of uint8_t*
v3 #04 Roger: drop npages struct member in argo_ring_info; use len
v3 #04 Roger/Jan: drop many fixed width types in internal structs
v3 #04 Jason/Jan: drop pad and fixed width type in pending_ent struct
v3 #04 Eric: moved ring_find_info from register op into this commit
v3 moved hash_index function, nospec include from register op to this commit
v3 moved XEN_ARGO_DOMID_ANY defn from register op into this commit
v3 added #include <xen/sched.h> to <xen/argo.h> for domain struct defn
v3 feedback #04 Roger: reorder #includes to alphabetical order
v3 Added Ross's Reviewed-by.

v2 rewrite locking explanation comment
v2 header copyright line now includes 2019
v2 self: use ring_info backpointer in pending_ent to maintain npending
v2 self: rename all_rings_remove_info to domain_rings_remove_all
v2 feedback Jan: drop cookie, implement teardown
v2 self: add npending to track number of pending entries per ring
v2 self: amend comment on locking; drop section comments
v2 cookie_eq: test low bits first and use likely on high bits
v2 self: OVERHAUL
v2 self: s/argo_pending_ent/pending_ent/g
v2 self: drop pending_remove_ent, inline at single call site
v1 feedback Roger, Jan: drop argo prefix on static functions
v2 #4 Lars: add Acked-by and details to commit message.
v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
v2 bugfix: xsm use in soft-reset prior to introduction
v2 feedback #9 Jan: drop 'message' from do_argo_message_op
v1 #5 feedback Paul: init/destroy unsigned, brackets and whitespace fixes
v1 #5 feedback Paul: Use mfn_eq for comparing mfns.
v1 #5 feedback Paul: init/destroy : use currd
v1 #6 (#5) feedback Jan: init/destroy: s/ENOSYS/EOPNOTSUPP/
v1 #6 feedback Paul: Folded patch 6 into patch 5.
v1 #6 feedback Jan: drop opt_argo_enabled initializer
v1 #6 feedback Jan: s/ENOSYS/EOPNOTSUPP/g and drop useless dprintk
v1. #5 feedback Paul: change the license on public header to BSD
      - ack from Lars at Citrix.
v1. self, Jan: drop unnecessary xen include from sched.h
v1. self, Jan: drop inclusion of public argo.h in private one
v1. self, Jan: add include of public argo.h to argo.c
v1. self, Jan: drop fwd decl of argo_domain in priv header
v1. Paul/self/Jan: add data structures to xlat.lst and compat/argo.h to Makefile
v1. self: removed allocation of event channel since switching to VIRQ
v1. self: drop types.h include from private argo.h
v1: reorder public argo include position
v1: #13 feedback Jan: public namespace: prefix with xen
v1: self: rename pending ent "id" to "domain_id"
v1: self: add domain_cookie to ent struct
v1. #15 feedback Jan: make cmd unsigned
v1. #15 feedback Jan: make i loop variable unsigned
v1: self: adjust dprintks in init, destroy
v1: #18 feedback Jan: meld max ring count limit
v1: self: use type not struct in public defn, affects compat gen header
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1: add comment explaining the 'magic' field
v1: self + Jan feedback: implement soft reset
v1: feedback #13 Roger: use ASSERT_UNREACHABLE

 docs/misc/xen-command-line.pandoc |  13 +
 xen/common/Makefile               |   2 +-
 xen/common/argo.c                 | 572 +++++++++++++++++++++++++++++++++++++-
 xen/common/compat/argo.c          |  23 ++
 xen/common/domain.c               |  12 +
 xen/include/Makefile              |   1 +
 xen/include/public/argo.h         |  64 +++++
 xen/include/xen/argo.h            |  44 +++
 xen/include/xen/sched.h           |   5 +
 xen/include/xlat.lst              |   2 +
 10 files changed, 736 insertions(+), 2 deletions(-)
 create mode 100644 xen/common/compat/argo.c
 create mode 100644 xen/include/public/argo.h
 create mode 100644 xen/include/xen/argo.h

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index a755a67..08c28f9 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -182,6 +182,19 @@ Permit Xen to use "Always Running APIC Timer" support on compatible hardware
 in combination with cpuidle.  This option is only expected to be useful for
 developers wishing Xen to fall back to older timing methods on newer hardware.
 
+### argo
+> `= <boolean>`
+
+> Default: `false`
+
+Enable the Argo hypervisor-mediated interdomain communication mechanism.
+
+Only available if Xen is compiled with `CONFIG_ARGO` enabled.
+
+This allows domains access to the Argo hypercall, which supports registration
+of memory rings with the hypervisor to receive messages, sending messages to
+other domains by hypercall and querying the ring status of other domains.
+
 ### asid (x86)
 > `= <boolean>`
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 8c65c6f..88b9b2f 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -70,7 +70,7 @@ obj-y += xmalloc_tlsf.o
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
 
-obj-$(CONFIG_COMPAT) += $(addprefix compat/,domain.o kernel.o memory.o multicall.o xlat.o)
+obj-$(CONFIG_COMPAT) += $(addprefix compat/,argo.o domain.o kernel.o memory.o multicall.o xlat.o)
 
 tmem-y := tmem.o tmem_xen.o tmem_control.o
 tmem-$(CONFIG_COMPAT) += compat/tmem_xen.o
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 6f782f7..1958fdc 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -16,8 +16,223 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
+#include <xen/argo.h>
+#include <xen/domain.h>
+#include <xen/domain_page.h>
 #include <xen/errno.h>
+#include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/nospec.h>
+#include <xen/sched.h>
+#include <xen/time.h>
+
+#include <public/argo.h>
+
+DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+
+/* Xen command line option to enable argo */
+static bool __read_mostly opt_argo_enabled;
+boolean_param("argo", opt_argo_enabled);
+
+typedef struct argo_ring_id
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    domid_t domain_id;
+} argo_ring_id;
+
+/* Data about a domain's own ring that it has registered */
+struct argo_ring_info
+{
+    /* next node in the hash, protected by rings_L2 */
+    struct hlist_node node;
+    /* this ring's id, protected by rings_L2 */
+    struct argo_ring_id id;
+    /* L3, the ring_info lock: protects the members of this struct below */
+    spinlock_t L3_lock;
+    /* length of the ring, protected by L3 */
+    unsigned int len;
+    /* number of pages translated into mfns, protected by L3 */
+    unsigned int nmfns;
+    /* cached tx pointer location, protected by L3 */
+    unsigned int tx_ptr;
+    /* mapped ring pages protected by L3 */
+    void **mfn_mapping;
+    /* list of mfns of guest ring, protected by L3 */
+    mfn_t *mfns;
+    /* list of struct pending_ent for this ring, protected by L3 */
+    struct hlist_head pending;
+    /* number of pending entries queued for this ring, protected by L3 */
+    unsigned int npending;
+};
+
+/* Data about a single-sender ring, held by the sender (partner) domain */
+struct argo_send_info
+{
+    /* next node in the hash, protected by send_L2 */
+    struct hlist_node node;
+    /* this ring's id, protected by send_L2 */
+    struct argo_ring_id id;
+};
+
+/* A space-available notification that is awaiting sufficient space */
+struct pending_ent
+{
+    /* List node within argo_ring_info's pending list */
+    struct hlist_node node;
+    /*
+     * List node within argo_domain's wildcard_pend_list. Only used if the
+     * ring is one with a wildcard partner (ie. that any domain may send to)
+     * to enable cancelling signals on wildcard rings on domain destroy.
+     */
+    struct hlist_node wildcard_node;
+    /*
+     * Pointer to the ring_info that this ent pertains to. Used to ensure that
+     * ring_info->npending is decremented when ents for wildcard rings are
+     * cancelled for domain destroy.
+     * Caution: Must hold the correct locks before accessing ring_info via this.
+     */
+    struct argo_ring_info *ring_info;
+    /* minimum ring space available that this signal is waiting upon */
+    unsigned int len;
+    /* domain to be notified when space is available */
+    domid_t domain_id;
+};
+
+/*
+ * The value of the argo element in a struct domain is
+ * protected by L1_global_argo_rwlock
+ */
+#define ARGO_HTABLE_SIZE 32
+struct argo_domain
+{
+    /* rings_L2 */
+    rwlock_t rings_L2_rwlock;
+    /*
+     * Hash table of argo_ring_info about rings this domain has registered.
+     * Protected by rings_L2.
+     */
+    struct hlist_head ring_hash[ARGO_HTABLE_SIZE];
+    /* Counter of rings registered by this domain. Protected by rings_L2. */
+    unsigned int ring_count;
+
+    /* send_L2 */
+    spinlock_t send_L2_lock;
+    /*
+     * Hash table of argo_send_info about rings other domains have registered
+     * for this domain to send to. Single partner, non-wildcard rings.
+     * Protected by send_L2.
+     */
+    struct hlist_head send_hash[ARGO_HTABLE_SIZE];
+
+    /* wildcard_L2 */
+    spinlock_t wildcard_L2_lock;
+    /*
+     * List of pending space-available signals for this domain about wildcard
+     * rings registered by other domains. Protected by wildcard_L2.
+     */
+    struct hlist_head wildcard_pend_list;
+};
+
+/*
+ * Locking is organized as follows:
+ *
+ * Terminology: R(<lock>) means taking a read lock on the specified lock;
+ *              W(<lock>) means taking a write lock on it.
+ *
+ * == L1 : The global read/write lock: L1_global_argo_rwlock
+ * Protects the argo elements of all struct domain *d in the system.
+ * It does not protect any of the elements of d->argo, only their
+ * addresses.
+ *
+ * By extension since the destruction of a domain with a non-NULL
+ * d->argo will need to free the d->argo pointer, holding W(L1)
+ * guarantees that no domain pointers that argo is interested in
+ * become invalid whilst this lock is held.
+ */
+
+static DEFINE_RWLOCK(L1_global_argo_rwlock); /* L1 */
+
+/*
+ * == rings_L2 : The per-domain ring hash lock: d->argo->rings_L2_rwlock
+ *
+ * Holding a read lock on rings_L2 protects the ring hash table and
+ * the elements in the hash_table d->argo->ring_hash, and
+ * the node and id fields in struct argo_ring_info in the
+ * hash table.
+ * Holding a write lock on rings_L2 protects all of the elements of all the
+ * struct argo_ring_info belonging to this domain.
+ *
+ * To take rings_L2 you must already have R(L1). W(L1) implies W(rings_L2) and
+ * L3.
+ *
+ * == L3 : The individual ring_info lock: ring_info->L3_lock
+ *
+ * Protects all the fields within the argo_ring_info, aside from the ones that
+ * rings_L2 already protects: node, id, lock.
+ *
+ * To acquire L3 you must already have R(rings_L2). W(rings_L2) implies L3.
+ *
+ * == send_L2 : The per-domain single-sender partner rings lock:
+ *              d->argo->send_L2_lock
+ *
+ * Protects the per-domain send hash table : d->argo->send_hash
+ * and the elements in the hash table, and the node and id fields
+ * in struct argo_send_info in the hash table.
+ *
+ * To take send_L2, you must already have R(L1). W(L1) implies send_L2.
+ * Do not attempt to acquire a rings_L2 on any domain after taking and while
+ * holding a send_L2 lock -- acquire the rings_L2 (if one is needed) beforehand.
+ *
+ * == wildcard_L2 : The per-domain wildcard pending list lock:
+ *                  d->argo->wildcard_L2_lock
+ *
+ * Protects the per-domain list of outstanding signals for space availability
+ * on wildcard rings.
+ *
+ * To take wildcard_L2, you must already have R(L1). W(L1) implies wildcard_L2.
+ * No other locks are acquired after obtaining wildcard_L2.
+ */
+
+/*
+ * Lock state validation macros
+ *
+ * These macros encode the logic to verify that the locking has adhered to the
+ * locking discipline above.
+ * eg. On entry to logic that requires holding at least R(rings_L2), this:
+ *      ASSERT(LOCKING_Read_rings_L2(d));
+ *
+ * checks that the lock state is sufficient, validating that one of the
+ * following must be true when executed:       R(rings_L2) && R(L1)
+ *                                        or:  W(rings_L2) && R(L1)
+ *                                        or:  W(L1)
+ */
+
+/* RAW macros here are only used to assist defining the other macros below */
+#define RAW_LOCKING_Read_L1 (rw_is_locked(&L1_global_argo_rwlock))
+#define RAW_LOCKING_Read_rings_L2(d) \
+    (rw_is_locked(&d->argo->rings_L2_rwlock) && RAW_LOCKING_Read_L1)
+
+/* The LOCKING macros defined below here are for use at verification points */
+#define LOCKING_Write_L1 (rw_is_write_locked(&L1_global_argo_rwlock))
+#define LOCKING_Read_L1 (RAW_LOCKING_Read_L1 || LOCKING_Write_L1)
+
+#define LOCKING_Write_rings_L2(d) \
+    ((RAW_LOCKING_Read_L1 && rw_is_write_locked(&d->argo->rings_L2_rwlock)) || \
+     LOCKING_Write_L1)
+
+#define LOCKING_Read_rings_L2(d) \
+    ((RAW_LOCKING_Read_L1 && rw_is_locked(&d->argo->rings_L2_rwlock)) || \
+     LOCKING_Write_rings_L2(d) || LOCKING_Write_L1)
+
+#define LOCKING_L3(d, r) \
+    ((RAW_LOCKING_Read_rings_L2(d) && spin_is_locked(&r->L3_lock)) || \
+     LOCKING_Write_rings_L2(d) || LOCKING_Write_L1)
+
+#define LOCKING_send_L2(d) \
+    ((RAW_LOCKING_Read_L1 && spin_is_locked(&d->argo->send_L2_lock)) || \
+     LOCKING_Write_L1)
 
 /* Change this to #define ARGO_DEBUG here to enable more debug messages */
 #undef ARGO_DEBUG
@@ -28,10 +243,365 @@
 #define argo_dprintk(format, ... ) ((void)0)
 #endif
 
+/*
+ * FIXME for 4.12:
+ *  * Replace this hash function to get better distribution across buckets.
+ *  * Don't use casts in the replacement function.
+ *  * Drop the use of array_index_nospec.
+ */
+/*
+ * This hash function is used to distribute rings within the per-domain
+ * hash tables (d->argo->ring_hash and d->argo->send_hash). The hash table
+ * will provide a struct if a match is found with an 'argo_ring_id' key:
+ * ie. the key is a (domain id, argo port, partner domain id) tuple.
+ * Since argo port number varies the most in expected use, and the Linux driver
+ * allocates at both the high and low ends, incorporate high and low bits to
+ * help with distribution.
+ * Apply array_index_nospec as a defensive measure since this operates
+ * on user-supplied input and the array size that it indexes into is known.
+ */
+static unsigned int
+hash_index(const struct argo_ring_id *id)
+{
+    unsigned int hash;
+
+    hash = (uint16_t)(id->aport >> 16);
+    hash ^= (uint16_t)id->aport;
+    hash ^= id->domain_id;
+    hash ^= id->partner_id;
+    hash &= (ARGO_HTABLE_SIZE - 1);
+
+    return array_index_nospec(hash, ARGO_HTABLE_SIZE);
+}
+
+static struct argo_ring_info *
+find_ring_info(const struct domain *d, const struct argo_ring_id *id)
+{
+    unsigned int ring_hash_index;
+    struct hlist_node *node;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    ring_hash_index = hash_index(id);
+
+    argo_dprintk("d->argo=%p, d->argo->ring_hash[%u]=%p id=%p\n",
+                 d->argo, ring_hash_index,
+                 d->argo->ring_hash[ring_hash_index].first, id);
+    argo_dprintk("id.aport=%x id.domain=vm%u id.partner_id=vm%u\n",
+                 id->aport, id->domain_id, id->partner_id);
+
+    hlist_for_each_entry(ring_info, node, &d->argo->ring_hash[ring_hash_index],
+                         node)
+    {
+        const struct argo_ring_id *cmpid = &ring_info->id;
+
+        if ( cmpid->aport == id->aport &&
+             cmpid->domain_id == id->domain_id &&
+             cmpid->partner_id == id->partner_id )
+        {
+            argo_dprintk("ring_info=%p\n", ring_info);
+            return ring_info;
+        }
+    }
+    argo_dprintk("no ring_info found\n");
+
+    return NULL;
+}
+
+static void
+ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    unsigned int i;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( !ring_info->mfn_mapping )
+        return;
+
+    for ( i = 0; i < ring_info->nmfns; i++ )
+    {
+        if ( !ring_info->mfn_mapping[i] )
+            continue;
+        if ( ring_info->mfns )
+            argo_dprintk(XENLOG_ERR "argo: unmapping page %"PRI_mfn" from %p\n",
+                         mfn_x(ring_info->mfns[i]),
+                         ring_info->mfn_mapping[i]);
+        unmap_domain_page_global(ring_info->mfn_mapping[i]);
+        ring_info->mfn_mapping[i] = NULL;
+    }
+}
+
+static void
+wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    ASSERT(LOCKING_Read_L1);
+
+    if ( d->argo )
+    {
+        spin_lock(&d->argo->wildcard_L2_lock);
+        hlist_del(&ent->wildcard_node);
+        spin_unlock(&d->argo->wildcard_L2_lock);
+    }
+    put_domain(d);
+}
+
+static void
+pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    struct hlist_node *node, *next;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+            wildcard_pending_list_remove(ent->domain_id, ent);
+        hlist_del(&ent->node);
+        xfree(ent);
+    }
+    ring_info->npending = 0;
+}
+
+static void
+wildcard_rings_pending_remove(struct domain *d)
+{
+    struct hlist_node *node, *next;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_Write_L1);
+
+    hlist_for_each_entry_safe(ent, node, next, &d->argo->wildcard_pend_list,
+                              node)
+    {
+        /*
+         * The ent->node deleted here, and the npending value decreased,
+         * belong to the ring_info of another domain, which is why this
+         * function requires holding W(L1):
+         * it implies the L3 lock that protects that ring_info struct.
+         */
+        ent->ring_info->npending--;
+        hlist_del(&ent->node);
+        hlist_del(&ent->wildcard_node);
+        xfree(ent);
+    }
+}
+
+static void
+ring_remove_mfns(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    unsigned int i;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    if ( !ring_info->mfns )
+        return;
+
+    if ( !ring_info->mfn_mapping )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    ring_unmap(d, ring_info);
+
+    for ( i = 0; i < ring_info->nmfns; i++ )
+        if ( !mfn_eq(ring_info->mfns[i], INVALID_MFN) )
+            put_page_and_type(mfn_to_page(ring_info->mfns[i]));
+
+    ring_info->nmfns = 0;
+    XFREE(ring_info->mfns);
+    XFREE(ring_info->mfn_mapping);
+}
+
+static void
+ring_remove_info(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    pending_remove_all(d, ring_info);
+    hlist_del(&ring_info->node);
+    ring_remove_mfns(d, ring_info);
+    xfree(ring_info);
+}
+
+static void
+domain_rings_remove_all(struct domain *d)
+{
+    unsigned int i;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; ++i )
+    {
+        struct hlist_node *node, *next;
+        struct argo_ring_info *ring_info;
+
+        hlist_for_each_entry_safe(ring_info, node, next,
+                                  &d->argo->ring_hash[i], node)
+            ring_remove_info(d, ring_info);
+    }
+    d->argo->ring_count = 0;
+}
+
+/*
+ * Tear down all rings of other domains where src_d domain is the partner.
+ * (ie. it is the single domain that can send to those rings.)
+ * This will also cancel any pending notifications about those rings.
+ */
+static void
+partner_rings_remove(struct domain *src_d)
+{
+    unsigned int i;
+
+    ASSERT(LOCKING_Write_L1);
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; ++i )
+    {
+        struct hlist_node *node, *next;
+        struct argo_send_info *send_info;
+
+        hlist_for_each_entry_safe(send_info, node, next,
+                                  &src_d->argo->send_hash[i], node)
+        {
+            struct argo_ring_info *ring_info;
+            struct domain *dst_d;
+
+            dst_d = get_domain_by_id(send_info->id.domain_id);
+            if ( dst_d )
+            {
+                ring_info = find_ring_info(dst_d, &send_info->id);
+                if ( ring_info )
+                {
+                    ring_remove_info(dst_d, ring_info);
+                    dst_d->argo->ring_count--;
+                }
+
+                put_domain(dst_d);
+            }
+
+            hlist_del(&send_info->node);
+            xfree(send_info);
+        }
+    }
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
            unsigned long arg4)
 {
-    return -ENOSYS;
+    long rc = -EFAULT;
+
+    argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
+                 (void *)arg1.p, (void *)arg2.p, arg3, arg4);
+
+    if ( unlikely(!opt_argo_enabled) )
+        return -EOPNOTSUPP;
+
+    switch ( cmd )
+    {
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    argo_dprintk("<-do_argo_op(%u)=%ld\n", cmd, rc);
+
+    return rc;
+}
+
+static void
+argo_domain_init(struct argo_domain *argo)
+{
+    unsigned int i;
+
+    rwlock_init(&argo->rings_L2_rwlock);
+    spin_lock_init(&argo->send_L2_lock);
+    spin_lock_init(&argo->wildcard_L2_lock);
+    argo->ring_count = 0;
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; ++i )
+    {
+        INIT_HLIST_HEAD(&argo->ring_hash[i]);
+        INIT_HLIST_HEAD(&argo->send_hash[i]);
+    }
+    INIT_HLIST_HEAD(&argo->wildcard_pend_list);
+}
+
+int
+argo_init(struct domain *d)
+{
+    struct argo_domain *argo;
+
+    if ( !opt_argo_enabled )
+    {
+        argo_dprintk("argo disabled, domid: %u\n", d->domain_id);
+        return 0;
+    }
+
+    argo_dprintk("init: domid: %u\n", d->domain_id);
+
+    argo = xzalloc(struct argo_domain);
+    if ( !argo )
+        return -ENOMEM;
+
+    argo_domain_init(argo);
+
+    write_lock(&L1_global_argo_rwlock);
+
+    d->argo = argo;
+
+    write_unlock(&L1_global_argo_rwlock);
+
+    return 0;
+}
+
+void
+argo_destroy(struct domain *d)
+{
+    BUG_ON(!d->is_dying);
+
+    write_lock(&L1_global_argo_rwlock);
+
+    argo_dprintk("destroy: domid %u d->argo=%p\n", d->domain_id, d->argo);
+
+    if ( d->argo )
+    {
+        domain_rings_remove_all(d);
+        partner_rings_remove(d);
+        wildcard_rings_pending_remove(d);
+        XFREE(d->argo);
+    }
+    write_unlock(&L1_global_argo_rwlock);
+}
+
+void
+argo_soft_reset(struct domain *d)
+{
+    write_lock(&L1_global_argo_rwlock);
+
+    argo_dprintk("soft reset d=%u d->argo=%p\n", d->domain_id, d->argo);
+
+    if ( d->argo )
+    {
+        domain_rings_remove_all(d);
+        partner_rings_remove(d);
+        wildcard_rings_pending_remove(d);
+
+        /*
+         * Since opt_argo_enabled cannot change at runtime, if d->argo is
+         * non-NULL then opt_argo_enabled must be true, and we can assume that
+         * init is allowed to proceed again here.
+         */
+        argo_domain_init(d->argo);
+    }
+
+    write_unlock(&L1_global_argo_rwlock);
 }
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
new file mode 100644
index 0000000..8edb9e8
--- /dev/null
+++ b/xen/common/compat/argo.c
@@ -0,0 +1,23 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/lib.h>
+
+#include <public/argo.h>
+
+#include <compat/argo.h>
+
+CHECK_argo_addr;
+CHECK_argo_ring;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index c623dae..bd344e5 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -32,6 +32,7 @@
 #include <xen/grant_table.h>
 #include <xen/xenoprof.h>
 #include <xen/irq.h>
+#include <xen/argo.h>
 #include <asm/debugger.h>
 #include <asm/p2m.h>
 #include <asm/processor.h>
@@ -277,6 +278,8 @@ static void _domain_destroy(struct domain *d)
 
     xfree(d->pbuf);
 
+    argo_destroy(d);
+
     rangeset_domain_destroy(d);
 
     free_cpumask_var(d->dirty_cpumask);
@@ -445,6 +448,9 @@ struct domain *domain_create(domid_t domid,
             goto fail;
         init_status |= INIT_gnttab;
 
+        if ( (err = argo_init(d)) != 0 )
+            goto fail;
+
         err = -ENOMEM;
 
         d->pbuf = xzalloc_array(char, DOMAIN_PBUF_SIZE);
@@ -694,6 +700,9 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
     return 0;
 }
 
+/*
+ * FIXME for 4.12: since argo_destroy is in _domain_destroy, remove it below.
+ */
 int domain_kill(struct domain *d)
 {
     int rc = 0;
@@ -717,6 +726,7 @@ int domain_kill(struct domain *d)
         if ( d->is_dying != DOMDYING_alive )
             return domain_kill(d);
         d->is_dying = DOMDYING_dying;
+        argo_destroy(d);
         evtchn_destroy(d);
         gnttab_release_mappings(d);
         tmem_destroy(d->tmem_client);
@@ -1175,6 +1185,8 @@ int domain_soft_reset(struct domain *d)
 
     grant_table_warn_active_grants(d);
 
+    argo_soft_reset(d);
+
     for_each_vcpu ( d, v )
     {
         set_xen_guest_handle(runstate_guest(v), NULL);
diff --git a/xen/include/Makefile b/xen/include/Makefile
index f7895e4..3d14532 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -5,6 +5,7 @@ ifneq ($(CONFIG_COMPAT),)
 compat-arch-$(CONFIG_X86) := x86_32
 
 headers-y := \
+    compat/argo.h \
     compat/callback.h \
     compat/elfnote.h \
     compat/event_channel.h \
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
new file mode 100644
index 0000000..530bb82
--- /dev/null
+++ b/xen/include/public/argo.h
@@ -0,0 +1,64 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Derived from v4v, the version 2 of v2v.
+ *
+ * Copyright (c) 2010, Citrix Systems
+ * Copyright (c) 2018-2019, BAE Systems
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __XEN_PUBLIC_ARGO_H__
+#define __XEN_PUBLIC_ARGO_H__
+
+#include "xen.h"
+
+#define XEN_ARGO_DOMID_ANY       DOMID_INVALID
+
+/* Fixed-width type for "argo port" number. Nothing to do with evtchns. */
+typedef uint32_t xen_argo_port_t;
+
+typedef struct xen_argo_addr
+{
+    xen_argo_port_t aport;
+    domid_t domain_id;
+    uint16_t pad;
+} xen_argo_addr_t;
+
+typedef struct xen_argo_ring
+{
+    /* Guests should use atomic operations to access rx_ptr */
+    uint32_t rx_ptr;
+    /* Guests should use atomic operations to access tx_ptr */
+    uint32_t tx_ptr;
+    /*
+     * Header space reserved for later use. Align the start of the ring to a
+     * multiple of the message slot size.
+     */
+    uint8_t reserved[56];
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t ring[];
+#elif defined(__GNUC__)
+    uint8_t ring[0];
+#endif
+} xen_argo_ring_t;
+
+#endif
diff --git a/xen/include/xen/argo.h b/xen/include/xen/argo.h
new file mode 100644
index 0000000..2ba7e5c
--- /dev/null
+++ b/xen/include/xen/argo.h
@@ -0,0 +1,44 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __XEN_ARGO_H__
+#define __XEN_ARGO_H__
+
+#include <xen/sched.h>
+
+#ifdef CONFIG_ARGO
+
+int argo_init(struct domain *d);
+void argo_destroy(struct domain *d);
+void argo_soft_reset(struct domain *d);
+
+#else /* !CONFIG_ARGO */
+
+static inline int argo_init(struct domain *d)
+{
+    return 0;
+}
+
+static inline void argo_destroy(struct domain *d)
+{
+}
+
+static inline void argo_soft_reset(struct domain *d)
+{
+}
+
+#endif
+
+#endif
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 4956a77..6e69afa 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -490,6 +490,11 @@ struct domain
         unsigned int guest_request_enabled       : 1;
         unsigned int guest_request_sync          : 1;
     } monitor;
+
+#ifdef CONFIG_ARGO
+    /* Argo interdomain communication support */
+    struct argo_domain *argo;
+#endif
 };
 
 /* Protect updates/reads (resp.) of domain_list and domain_hash. */
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 5273320..9f616e4 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -148,3 +148,5 @@
 ?	flask_setenforce		xsm/flask_op.h
 !	flask_sid_context		xsm/flask_op.h
 ?	flask_transition		xsm/flask_op.h
+?	argo_addr			argo.h
+?	argo_ring			argo.h
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v4 05/14] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (3 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 06/14] xen/arm: introduce guest_handle_for_field() Christopher Clark
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

EMSGSIZE: Argo's sendv operation will return EMSGSIZE when an excess amount
of data, across all iovs, has been supplied, exceeding either the statically
configured maximum size of a transmittable message, or the (variable) size
of the ring registered by the destination domain.
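
The size check described above can be sketched in plain C as follows. This
is a minimal illustration only, not the sendv implementation from this
series: struct iov_sketch, check_message_size() and its parameter names are
hypothetical, and only the numeric value 90 is taken from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Numeric value taken from this patch. */
#define XEN_EMSGSIZE 90

/* Hypothetical stand-in for an iov entry: only the length matters here. */
struct iov_sketch {
    uint32_t len;
};

/*
 * Sum the lengths of all iovs and compare the running total with both
 * limits. Returns 0 if the message fits, -XEN_EMSGSIZE otherwise.
 * max_msg_size and ring_len are hypothetical parameter names.
 */
static int check_message_size(const struct iov_sketch *iovs, size_t niov,
                              uint64_t max_msg_size, uint64_t ring_len)
{
    uint64_t total = 0;
    size_t i;

    for ( i = 0; i < niov; i++ )
    {
        total += iovs[i].len;
        /* Stop early: once a limit is exceeded the result cannot change. */
        if ( total > max_msg_size || total > ring_len )
            return -XEN_EMSGSIZE;
    }

    return 0;
}
```

Checking inside the loop keeps the running total bounded, so the 64-bit
accumulator cannot wrap even with a large number of iovs.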

ECONNREFUSED: Argo's register operation will return ECONNREFUSED if a ring
is being registered to communicate with a specific remote domain that does
exist but is not argo-enabled.

These codes are described by POSIX here:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
    EMSGSIZE     : "Message too large"
    ECONNREFUSED : "Connection refused".

The numeric values assigned to each are taken from Linux, as is the case
for the existing error codes.
    EMSGSIZE     : 90
    ECONNREFUSED : 111
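
For readers unfamiliar with the XEN_ERRNO(name, value) list style used in
errno.h, the following is a self-contained sketch of the same X-macro idea.
ERRNO_LIST_SKETCH, AS_ENUM, AS_CASE and errno_desc_sketch() are
illustrative names and are not part of the Xen headers; only the two
values and their POSIX descriptions come from this patch.

```c
#include <string.h>

/* Sketch of an X-macro list; only the two new entries are shown. */
#define ERRNO_LIST_SKETCH(X)                      \
    X(EMSGSIZE,     90,  "Message too large")     \
    X(ECONNREFUSED, 111, "Connection refused")

/* First expansion: turn each entry into an enum constant. */
#define AS_ENUM(name, value, desc) XEN_##name = value,
enum xen_errno_sketch { ERRNO_LIST_SKETCH(AS_ENUM) };
#undef AS_ENUM

/* Second expansion: map each value to its description string. */
static const char *errno_desc_sketch(int err)
{
    switch ( err )
    {
#define AS_CASE(name, value, desc) case XEN_##name: return desc;
    ERRNO_LIST_SKETCH(AS_CASE)
#undef AS_CASE
    default:
        return "unknown";
    }
}
```

The list is written once and expanded twice, which keeps the numeric
values and the descriptions from drifting apart.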

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/include/public/errno.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/public/errno.h b/xen/include/public/errno.h
index 305c112..e1d02fc 100644
--- a/xen/include/public/errno.h
+++ b/xen/include/public/errno.h
@@ -102,6 +102,7 @@ XEN_ERRNO(EILSEQ,	84)	/* Illegal byte sequence */
 XEN_ERRNO(ERESTART,	85)	/* Interrupted system call should be restarted */
 #endif
 XEN_ERRNO(ENOTSOCK,	88)	/* Socket operation on non-socket */
+XEN_ERRNO(EMSGSIZE,	90)	/* Message too large. */
 XEN_ERRNO(EOPNOTSUPP,	95)	/* Operation not supported on transport endpoint */
 XEN_ERRNO(EADDRINUSE,	98)	/* Address already in use */
 XEN_ERRNO(EADDRNOTAVAIL, 99)	/* Cannot assign requested address */
@@ -109,6 +110,7 @@ XEN_ERRNO(ENOBUFS,	105)	/* No buffer space available */
 XEN_ERRNO(EISCONN,	106)	/* Transport endpoint is already connected */
 XEN_ERRNO(ENOTCONN,	107)	/* Transport endpoint is not connected */
 XEN_ERRNO(ETIMEDOUT,	110)	/* Connection timed out */
+XEN_ERRNO(ECONNREFUSED,	111)	/* Connection refused */
 
 #undef XEN_ERRNO
 #endif /* XEN_ERRNO */
-- 
2.7.4



* [PATCH v4 06/14] xen/arm: introduce guest_handle_for_field()
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (4 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 05/14] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 07/14] argo: implement the register op Christopher Clark
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

ARM port of c/s bb544585: "introduce guest_handle_for_field()"

This helper turns a field of a GUEST_HANDLE into a GUEST_HANDLE.
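
A rough userspace illustration of what the helper does is given below. The
handle types here are simplified stand-ins (a struct wrapping one pointer)
rather than Xen's XEN_GUEST_HANDLE machinery, and every SKETCH_/sketch_/demo_
name is hypothetical; only the shape of the macro body follows this patch.

```c
#include <stdint.h>

/* Simplified stand-ins for Xen's handle types: a handle wraps a pointer. */
#define SKETCH_HANDLE(type) handle_##type
#define DEFINE_SKETCH_HANDLE(type) \
    typedef struct { type *p; } SKETCH_HANDLE(type)

/* Same shape as the macro added by this patch, using the stand-in type. */
#define sketch_handle_for_field(hnd, type, fld) \
    ((SKETCH_HANDLE(type)) { &(hnd).p->fld })

/* Hypothetical structure with two fields to take handles to. */
typedef struct demo_ring {
    uint32_t rx_ptr;
    uint32_t tx_ptr;
} demo_ring_t;

DEFINE_SKETCH_HANDLE(demo_ring_t);
DEFINE_SKETCH_HANDLE(uint32_t);

/* Derive a handle to the tx_ptr field from a handle to the whole struct. */
static SKETCH_HANDLE(uint32_t)
tx_ptr_handle(SKETCH_HANDLE(demo_ring_t) ring_hnd)
{
    return sketch_handle_for_field(ring_hnd, uint32_t, tx_ptr);
}
```

The result is a compound literal, so no storage needs to be declared by
the caller for the derived handle.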

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v3: Added Stefano's Reviewed-by
v2: Added Paul's Reviewed-by

 xen/include/asm-arm/guest_access.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 224d2a0..8997a1c 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -63,6 +63,9 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     _y;                                                     \
 })
 
+#define guest_handle_for_field(hnd, type, fld)          \
+    ((XEN_GUEST_HANDLE(type)) { &(hnd).p->fld })
+
 #define guest_handle_from_ptr(ptr, type)        \
     ((XEN_GUEST_HANDLE_PARAM(type)) { (type *)ptr })
 #define const_guest_handle_from_ptr(ptr, type)  \
-- 
2.7.4



* [PATCH v4 07/14] argo: implement the register op
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (5 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 06/14] xen/arm: introduce guest_handle_for_field() Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15 14:40   ` Roger Pau Monné
  2019-01-15  9:27 ` [PATCH v4 08/14] argo: implement the unregister op Christopher Clark
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The register op is used by a domain to register a region of memory for
receiving messages from either a specified other domain, or, if specifying a
wildcard, any domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.

Wildcard any-sender rings are disabled by default and registration will be
refused with EPERM unless they have been specifically enabled with the
argo-mac boot option introduced here. The reason why the default for
wildcard rings is 'deny' is that there is currently no means to protect the
ring from DoS by a noisy domain spamming the ring, affecting other domains'
ability to send to it. This will be addressed with XSM policy controls in
subsequent work.

Since denying access to any-sender rings is a significant functional
constraint, a new bootparam is provided to enable overriding this:
 "argo-mac" variable has allowed values: 'permissive' and 'enforcing'.
Even though this is a boolean variable, use these descriptive strings in
order to make it obvious to an administrator that this has potential
security impact.

The p2m type of the memory supplied by the guest for the ring must be
p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
is registered.

The xen_argo_gfn_t type is defined and is 64-bit on all architectures, which
helps to avoid the need for compat code to translate hypercall args.
This hypercall op and its interface currently only support 4K-sized pages.
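
As a small sketch of the two points above: a fixed 64-bit gfn type has the
same size for 32-bit and 64-bit guests, and with 4K pages the number of
gfns needed follows from the ring length. The typedef below is an assumed
definition consistent with the description, and the SKETCH_ macros and
ring_len_to_npages() are hypothetical names, not code from this patch.

```c
#include <stdint.h>

/* Assumption: a 64-bit fixed-width type, as the commit message describes. */
typedef uint64_t xen_argo_gfn_t;

#define SKETCH_PAGE_SHIFT 12                        /* 4K pages only */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/* Number of 4K pages needed to back a ring of len bytes (rounds up). */
static unsigned int ring_len_to_npages(uint32_t len)
{
    /* Widen before adding so that the round-up cannot overflow. */
    return (unsigned int)(((uint64_t)len + SKETCH_PAGE_SIZE - 1) >>
                          SKETCH_PAGE_SHIFT);
}

/* An array of gfns has the same layout for 32-bit and 64-bit callers. */
_Static_assert(sizeof(xen_argo_gfn_t) == 8, "gfn must be 64-bit everywhere");
```

_Static_assert requires a C11 compiler; on older toolchains the check
would need a different compile-time assertion idiom.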

array_index_nospec is used to guard the result of the ring id hash function.
This is out of an abundance of caution, since this is a very basic hash
function and it operates upon values supplied by the guest just before
being used as an array index.
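
The shape of that guarded hash can be tried out in userspace with the
sketch below. It is not the hypervisor code: SKETCH_HTABLE_SIZE is an
assumed power-of-two table size, struct ring_id_sketch is a stand-in for
the ring id, and clamp_index() is a plain bounds clamp that does NOT
provide the speculation barrier that array_index_nospec does.

```c
#include <stdint.h>

#define SKETCH_HTABLE_SIZE 32u   /* assumption: must be a power of two */

/* Hypothetical stand-in for the (port, partner, domain) ring id tuple. */
struct ring_id_sketch {
    uint32_t aport;
    uint16_t partner_id;
    uint16_t domain_id;
};

/*
 * Userspace stand-in for array_index_nospec(): an ordinary clamp.
 * The real helper also constrains the index under speculative
 * execution; this function only demonstrates the bounds contract.
 */
static unsigned int clamp_index(unsigned int index, unsigned int size)
{
    return index < size ? index : 0;
}

/* Fold the high and low halves of the port with both domain ids. */
static unsigned int hash_index_sketch(const struct ring_id_sketch *id)
{
    unsigned int hash;

    hash = (uint16_t)(id->aport >> 16);
    hash ^= (uint16_t)id->aport;
    hash ^= id->domain_id;
    hash ^= id->partner_id;
    hash &= SKETCH_HTABLE_SIZE - 1;

    return clamp_index(hash, SKETCH_HTABLE_SIZE);
}
```

Because the mask already bounds the value, the clamp never changes the
architectural result; in the hypervisor the extra step exists only as a
defence against speculative out-of-bounds indexing.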

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

-This version contains FIXMEs for 4.12:
 * find_ring_mfn: investigate using check_get_page_from_gfn()
   and rewrite this function using it or with adopted logic

 * shrink critical sections: move acquire/release of the global lock.
 * simplify the out label path when lock release has been moved.

 * - drop use of unsigned long type as hypercall args: not compat-friendly
 * - drop UL suffix on XEN_ARGO_REGISTER_FLAG_MASK
 * - guard XEN_ARGO_REGISTER_FLAG_MASK (perhaps framed by "#ifdef __XEN__")
 * - define XEN_ARGO_REGISTER_FLAG_MASK in terms of other flags defined

 * register_ring: pull write_unlock up above the cleanup actions above
   and add another label to absorb the two separate put_domain() calls on
   the error paths.
---
v3 #07 Jan: comment: minimum ring size is based on minimum-sized message
v3 #04 Andrew: reference CONFIG_ARGO in the command line documentation
v3 #07 Jan: register_ring: fold else, if into else-if to drop indent
v3 #07 Jan: remove no longer used guest_handle_is_aligned macros
v3 #07 Jan: remove dead code from find_ring_mfns
v3 #07 Jan: fix format string indentation in printks
v3 #07 Jan: remove redundant bounds check on npage in find_ring_mfns
v3 #08 self/Roger: improve dprintk output in find_ring_info like find_send_info
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #07 Jan: use array_index_nospec in ring_map_page
v3 #07 Jan: fix numeric entries in printk format strings
v3 #7 Jan: drop unneeded parentheses from ROUNDUP_MESSAGE defn
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #03 meld compat check for hypercall arg register struct
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback #07 Eric: fix header max ring size comment units
v3 feedback #04 Roger: mfn_mapping: void* instead of uint8_t*
v3 use %u for printing unsigned ints in find_ring_mfns
v3 feedback #04 Jan: uint32_t -> unsigned int for npage in register_ring
v3 feedback #04 Roger: drop npages struct member, calculate from len
v3 : register_ring: uint32_t -> unsigned int for private_tx_ptr
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
v3 feedback #07 Roger: use opt_argo_mac_permissive : a boolean opt
v3 feedback #04 Roger: reorder #includes to alphabetical order
v3 feedback #07 Roger: drop comment re: Intel EPT/AMD NPT for write-only mapping
v3 feedback #07 Roger: drop ptr arithmetic in update_tx_ptr, use ring struct cast
v3 feedback #07 Roger: drop newline in ring_map_page
v3 feedback #07 Roger: drop unneeded null check before xfree
v3 feedback #07 Roger: use return and drop out label in register_ring
v3 Stefano: add 4K page constraint to header file comment & commit msg
v3 Julien/Stefano: 4K granularity ok: use 64-bit gfns in register interface

v2 self: disallow ring resize via reregister
v2 feedback Jan: drop cookie, implement teardown
v2 feedback Jan: drop message from argo_message_op
v2 self: move hash_index function below locking comment
v2 self: OVERHAUL
v2 self/Jan: remove use of magic verification field and tidy up
v2 self: merge max and min ring size check clauses
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v2 feedback #9, Jan: use the argo-mac bootparam at point of introduction
v2 feedback #9, Jan: rename boot opt variable to comply with convention
v2 feedback #9, Jan: rename the argo_mac bootparam to argo-mac
v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v1 feedback Roger: s/pfn/gfn/ and retire always-64-bit type
v2. feedback Jan: document the argo-mac boot opt
v2. feedback Jan: simplify re-register, drop mappings
v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops

v1 #13 feedback, Jan: register op : s/ECONNREFUSED/ESRCH/
v1 #5 (#13) feedback Paul: register op: use currd in do_message_op
v1 #13 feedback, Paul: register op: use mfn_eq comparator
v1 #5 (#13) feedback Paul: register op: use currd in argo_register_ring
v1 #13 feedback Paul: register op: whitespace, unsigned, bounds check
v1 #13 feedback Paul: use of hex in limit constant definition
v1 #13 feedback Paul, register op: set nmfns on loop termination
v1 #13 feedback Paul: register op: do/while -> gotos, reindent
v1 argo_ring_map_page: drop uint32_t for unsigned int
v1. #13 feedback Julien: use page descriptors instead of gpfns.
   - adds ABI support for pages with different granularity.
v1 feedback #13, Paul: adjust log level of message
v1 feedback #13, Paul: use gprintk for guest-triggered warning
v1 feedback #13, Paul: gprintk and XENLOG_DEBUG for ring registration
v1 feedback #13, Paul: use gprintk for errs in argo_ring_map_page
v1 feedback #13, Paul: use ENOMEM if global mapping fails
v1 feedback Paul: overflow check before shift
v1: add define for copy_field_to_guest_errno
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: use copy_field_to_guest_errno
v1 feedback #13, Jan: argo_hash_fn: no inline, rename, change type
v1 feedback #13, Paul, Jan: EFAULT -> ENOMEM in argo_ring_map_page
v1 feedback #13, Jan: rename page var in argo_ring_map_page
v1 feedback #13, Jan: switch uint8_t* to void* and drop cast
v1 feedback #13, Jan: switch memory barrier to smp_wmb
v1 feedback #13, Jan: make 'ring' comment comply with single-line style
v1 feedback #13, Jan: use xzalloc_array, drop loop NULL init
v1 feedback #13, Jan: init bool with false rather than 0
v1 feedback #13 Jan: use __copy; define and use __copy_field_to_guest_errno
v1 feedback #13, Jan: use xzalloc, drop individual init zeroes
v1 feedback #13, Jan: prefix public namespace with xen
v1 feedback #13, Jan: blank line after op case in do_argo_message_op
v1 self: reflow comment in argo_ring_map_page to within 80 char len
v1 feedback #13, Roger: use true not 1 in assign to update_tx_ptr bool
v1 feedback #21, Jan: fold in the array_index_nospec hash function guards
v1 feedback #18, Jan: fold the max ring count limit into the series
v1 self: use unsigned long type for XEN_ARGO_REGISTER_FLAG_MASK
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1. feedback #13 Jan: add comment re: page alignment
v1. self: confirm ring magic presence in supplied page array
v1. feedback #13 Jan: add comment re: minimum ring size
v1. feedback #13 Roger: use ASSERT_UNREACHABLE
v1. feedback Roger: add comment to hash function
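
For reviewers: the size checks that register_ring applies, and the tx_ptr
recovery it performs on re-registration, can be reproduced outside the
hypervisor. The following is a minimal standalone sketch, not code from this
series. The slot size, maximum ring size and 4K page size are taken from the
patch; ASSUMED_RING_HDR_SIZE and ASSUMED_MSG_HDR_SIZE stand in for
sizeof(xen_argo_ring_t) and sizeof(struct xen_argo_ring_message_header) and
are assumptions, and the helper names npages_ring, ring_args_ok and
recover_tx_ptr are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch (4K pages only). */
#define PAGE_SHIFT              12
#define PAGE_SIZE               (1U << PAGE_SHIFT)
#define XEN_ARGO_MSG_SLOT_SIZE  0x10U
#define XEN_ARGO_MAX_RING_SIZE  0x1000000U

/* Assumptions: the header sizes are not defined in this patch. */
#define ASSUMED_RING_HDR_SIZE   64U
#define ASSUMED_MSG_HDR_SIZE    16U

/* Round x up to a multiple of a, where a is a power of two. */
#define ROUNDUP(x, a)           (((x) + ((a) - 1U)) & ~((a) - 1U))
#define ROUNDUP_MESSAGE(x)      ROUNDUP((x), XEN_ARGO_MSG_SLOT_SIZE)

/* Pages needed for the ring header plus a ring of len bytes. */
static unsigned int npages_ring(uint32_t len)
{
    assert(len <= XEN_ARGO_MAX_RING_SIZE);
    return ROUNDUP(ROUNDUP_MESSAGE(len) + ASSUMED_RING_HDR_SIZE, PAGE_SIZE)
           >> PAGE_SHIFT;
}

/* Mirrors the argument checks at the top of register_ring. */
static bool ring_args_ok(uint32_t len, unsigned int npage, uint16_t pad)
{
    /* One message header, one payload slot, one spare slot. */
    uint32_t min_len = ASSUMED_MSG_HDR_SIZE +
                       ROUNDUP_MESSAGE(1U) + ROUNDUP_MESSAGE(1U);

    return len >= min_len &&
           len <= XEN_ARGO_MAX_RING_SIZE &&
           len == ROUNDUP_MESSAGE(len) &&
           npages_ring(len) == npage &&
           pad == 0;
}

/*
 * Mirrors the tx_ptr sanity check: an out-of-range or misaligned tx_ptr is
 * replaced by the next aligned slot at or after rx_ptr, wrapping to 0.
 */
static uint32_t recover_tx_ptr(uint32_t tx_ptr, uint32_t rx_ptr, uint32_t len)
{
    if ( tx_ptr >= len || ROUNDUP_MESSAGE(tx_ptr) != tx_ptr )
    {
        tx_ptr = ROUNDUP_MESSAGE(rx_ptr);
        if ( tx_ptr >= len )
            tx_ptr = 0;
    }

    return tx_ptr;
}
```

With these assumed header sizes, a 4032-byte ring fits in one page together
with its header, whereas a 4048-byte ring needs two pages.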

 docs/misc/xen-command-line.pandoc |  17 ++
 xen/common/argo.c                 | 475 +++++++++++++++++++++++++++++++++++++-
 xen/common/compat/argo.c          |   1 +
 xen/include/public/argo.h         |  77 ++++++
 xen/include/xlat.lst              |   1 +
 5 files changed, 570 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 08c28f9..099c4b6 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -195,6 +195,23 @@ This allows domains access to the Argo hypercall, which supports registration
 of memory rings with the hypervisor to receive messages, sending messages to
 other domains by hypercall and querying the ring status of other domains.
 
+### argo-mac-permissive
+> `= <boolean>`
+
+> Default: `false`
+
+Constrain the access control applied to the Argo communication mechanism.
+
+Only available if Xen is compiled with `CONFIG_ARGO` enabled.
+
+When `false`, domains may not register rings that specify a wildcard for the
+sender, since such rings would accept messages sent by any domain.
+This protects rings and the services that use them against DoS by a
+malicious or buggy domain spamming the ring.
+
+When the boot option is set to `true`, this constraint is relaxed and
+wildcard any-sender rings are allowed to be registered.
+
 ### asid (x86)
 > `= <boolean>`
 
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 1958fdc..076ee6c 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -22,19 +22,36 @@
 #include <xen/errno.h>
 #include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/lib.h>
 #include <xen/nospec.h>
 #include <xen/sched.h>
 #include <xen/time.h>
 
 #include <public/argo.h>
 
+#define MAX_RINGS_PER_DOMAIN            128U
+
+/* All messages on the ring are padded to a multiple of the slot size. */
+#define ROUNDUP_MESSAGE(a) ROUNDUP((a), XEN_ARGO_MSG_SLOT_SIZE)
+
+/* Number of PAGEs needed to hold a ring of a given size in bytes */
+#define NPAGES_RING(ring_len) \
+    (ROUNDUP((ROUNDUP_MESSAGE(ring_len) + sizeof(xen_argo_ring_t)), PAGE_SIZE) \
+     >> PAGE_SHIFT)
+
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
 
 /* Xen command line option to enable argo */
 static bool __read_mostly opt_argo_enabled;
 boolean_param("argo", opt_argo_enabled);
 
+/* Xen command line option for conservative or relaxed access control */
+static bool __read_mostly opt_argo_mac_permissive;
+boolean_param("argo-mac-permissive", opt_argo_mac_permissive);
+
 typedef struct argo_ring_id
 {
     xen_argo_port_t aport;
@@ -304,7 +321,8 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
             return ring_info;
         }
     }
-    argo_dprintk("no ring_info found\n");
+    argo_dprintk("no ring_info found for ring(%u:%x %u)\n",
+                 id->domain_id, id->aport, id->partner_id);
 
     return NULL;
 }
@@ -332,6 +350,66 @@ ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
     }
 }
 
+static int
+ring_map_page(const struct domain *d, struct argo_ring_info *ring_info,
+              unsigned int i, void **out_ptr)
+{
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( i >= ring_info->nmfns )
+    {
+        gprintk(XENLOG_ERR,
+               "argo: ring (vm%u:%x vm%u) %p attempted to map page %u of %u\n",
+                ring_info->id.domain_id, ring_info->id.aport,
+                ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+        return -ENOMEM;
+    }
+    i = array_index_nospec(i, ring_info->nmfns);
+
+    if ( !ring_info->mfns || !ring_info->mfn_mapping )
+    {
+        ASSERT_UNREACHABLE();
+        ring_info->len = 0;
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfn_mapping[i] )
+    {
+        ring_info->mfn_mapping[i] = map_domain_page_global(ring_info->mfns[i]);
+        if ( !ring_info->mfn_mapping[i] )
+        {
+            gprintk(XENLOG_ERR, "argo: ring (vm%u:%x vm%u) %p attempted to map "
+                    "page %u of %u\n",
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+            return -ENOMEM;
+        }
+        argo_dprintk("mapping page %"PRI_mfn" to %p\n",
+                     mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+    }
+
+    if ( out_ptr )
+        *out_ptr = ring_info->mfn_mapping[i];
+
+    return 0;
+}
+
+static void
+update_tx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
+              uint32_t tx_ptr)
+{
+    xen_argo_ring_t *ringp;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+    ASSERT(ring_info->mfn_mapping[0]);
+
+    ring_info->tx_ptr = tx_ptr;
+    ringp = ring_info->mfn_mapping[0];
+
+    write_atomic(&ringp->tx_ptr, tx_ptr);
+    smp_wmb();
+}
+
 static void
 wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 {
@@ -492,11 +570,374 @@ partner_rings_remove(struct domain *src_d)
     }
 }
 
+/*
+ * FIXME for 4.12: investigate using check_get_page_from_gfn()
+ *                 and rewrite this function using it or with adopted logic
+ */
+static int
+find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
+{
+    p2m_type_t p2mt;
+    int ret = 0;
+
+#ifdef CONFIG_X86
+    *mfn = get_gfn_unshare(d, gfn_x(gfn), &p2mt);
+#else
+    *mfn = p2m_lookup(d, gfn, &p2mt);
+#endif
+
+    if ( !mfn_valid(*mfn) )
+        ret = -EINVAL;
+#ifdef CONFIG_X86
+    else if ( p2m_is_paging(p2mt) || (p2mt == p2m_ram_logdirty) )
+        ret = -EAGAIN;
+#endif
+    else if ( (p2mt != p2m_ram_rw) ||
+              !get_page_and_type(mfn_to_page(*mfn), d, PGT_writable_page) )
+        ret = -EINVAL;
+
+#ifdef CONFIG_X86
+    put_gfn(d, gfn_x(gfn));
+#endif
+
+    return ret;
+}
+
+static int
+find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
+               const unsigned int npage,
+               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
+               const unsigned int len)
+{
+    unsigned int i;
+    int ret = 0;
+    mfn_t *mfns;
+    void **mfn_mapping;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    if ( ring_info->mfns )
+    {
+        /* Ring already existed: drop the previous mapping. */
+        gprintk(XENLOG_INFO, "argo: vm%u re-register existing ring "
+                "(vm%u:%x vm%u) clears mapping\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.aport, ring_info->id.partner_id);
+
+        ring_remove_mfns(d, ring_info);
+        ASSERT(!ring_info->mfns);
+    }
+
+    mfns = xmalloc_array(mfn_t, npage);
+    if ( !mfns )
+        return -ENOMEM;
+
+    for ( i = 0; i < npage; i++ )
+        mfns[i] = INVALID_MFN;
+
+    mfn_mapping = xzalloc_array(void *, npage);
+    if ( !mfn_mapping )
+    {
+        xfree(mfns);
+        return -ENOMEM;
+    }
+
+    ring_info->mfns = mfns;
+    ring_info->mfn_mapping = mfn_mapping;
+
+    for ( i = 0; i < npage; i++ )
+    {
+        xen_argo_gfn_t argo_gfn;
+        mfn_t mfn;
+
+        ret = __copy_from_guest_offset(&argo_gfn, gfn_hnd, i, 1) ? -EFAULT : 0;
+        if ( ret )
+            break;
+
+        ret = find_ring_mfn(d, _gfn(argo_gfn), &mfn);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR, "argo: vm%u: invalid gfn %"PRI_gfn" "
+                    "r:(vm%u:%x vm%u) %p %u/%u\n",
+                    d->domain_id, gfn_x(_gfn(argo_gfn)),
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id, ring_info, i, npage);
+            break;
+        }
+
+        ring_info->mfns[i] = mfn;
+
+        argo_dprintk("%u: %"PRI_gfn" -> %"PRI_mfn"\n",
+                     i, gfn_x(_gfn(argo_gfn)), mfn_x(ring_info->mfns[i]));
+    }
+
+    ring_info->nmfns = i;
+
+    if ( ret )
+        ring_remove_mfns(d, ring_info);
+    else
+    {
+        ASSERT(ring_info->nmfns == NPAGES_RING(len));
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u ring (vm%u:%x vm%u) %p "
+                "mfn_mapping %p len %u nmfns %u\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.aport, ring_info->id.partner_id, ring_info,
+                ring_info->mfn_mapping, ring_info->len, ring_info->nmfns);
+    }
+
+    return ret;
+}
+
+/*
+ * FIXME for 4.12:
+ * * shrink critical sections: move acquire/release of the global lock.
+ * * simplify the out label path when lock release has been moved.
+ */
+static long
+register_ring(struct domain *currd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
+              unsigned int npage, bool fail_exist)
+{
+    xen_argo_register_ring_t reg;
+    struct argo_ring_id ring_id;
+    void *map_ringp;
+    xen_argo_ring_t *ringp;
+    struct argo_ring_info *ring_info;
+    struct argo_send_info *send_info = NULL;
+    struct domain *dst_d = NULL;
+    int ret = 0;
+    unsigned int private_tx_ptr;
+
+    ASSERT(currd == current->domain);
+
+    if ( copy_from_guest(&reg, reg_hnd, 1) )
+        return -EFAULT;
+
+    /*
+     * A ring must be large enough to transmit messages, so requires space for:
+     * * 1 message header, plus
+     * * 1 payload slot (payload is always rounded to a multiple of 16 bytes)
+     *   for the message payload to be written into, plus
+     * * 1 more slot, so that the ring cannot be filled to capacity with a
+     *   single minimum-size message -- see the logic in ringbuf_insert --
+     *   allowing for this ensures that there can be space remaining when a
+     *   message is present.
+     * The above determines the minimum acceptable ring size.
+     */
+    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
+                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
+         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
+         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
+         (NPAGES_RING(reg.len) != npage) ||
+         (reg.pad != 0) )
+        return -EINVAL;
+
+    ring_id.partner_id = reg.partner_id;
+    ring_id.aport = reg.aport;
+    ring_id.domain_id = currd->domain_id;
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
+    {
+        if ( !opt_argo_mac_permissive )
+        {
+            ret = -EPERM;
+            goto out_unlock;
+        }
+    }
+    else
+    {
+        dst_d = get_domain_by_id(reg.partner_id);
+        if ( !dst_d )
+        {
+            argo_dprintk("!dst_d, ESRCH\n");
+            ret = -ESRCH;
+            goto out_unlock;
+        }
+
+        if ( !dst_d->argo )
+        {
+            argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+            ret = -ECONNREFUSED;
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
+        send_info = xzalloc(struct argo_send_info);
+        if ( !send_info )
+        {
+            ret = -ENOMEM;
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+        send_info->id = ring_id;
+    }
+
+    write_lock(&currd->argo->rings_L2_rwlock);
+
+    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
+    {
+        ret = -ENOSPC;
+        goto out_unlock2;
+    }
+
+    ring_info = find_ring_info(currd, &ring_id);
+    if ( !ring_info )
+    {
+        ring_info = xzalloc(struct argo_ring_info);
+        if ( !ring_info )
+        {
+            ret = -ENOMEM;
+            goto out_unlock2;
+        }
+
+        spin_lock_init(&ring_info->L3_lock);
+
+        ring_info->id = ring_id;
+        INIT_HLIST_HEAD(&ring_info->pending);
+
+        hlist_add_head(&ring_info->node,
+                       &currd->argo->ring_hash[hash_index(&ring_info->id)]);
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+    }
+    else if ( ring_info->len )
+    {
+        /*
+         * If the caller specified that the ring must not already exist,
+         * fail at attempt to add a completed ring which already exists.
+         */
+        if ( fail_exist )
+        {
+            argo_dprintk("disallowed reregistration of existing ring\n");
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        if ( ring_info->len != reg.len )
+        {
+            /*
+             * Change of ring size could result in entries on the pending
+             * notifications list that will never trigger.
+             * Simple blunt solution: disallow ring resize for now.
+             * TODO: investigate enabling ring resize.
+             */
+            gprintk(XENLOG_ERR, "argo: vm%u attempted to change ring size "
+                    "(vm%u:%x vm%u)\n",
+                    currd->domain_id, ring_id.domain_id, ring_id.aport,
+                    ring_id.partner_id);
+            /*
+             * Could return EINVAL here, but if the ring didn't already
+             * exist then the arguments would have been valid, so: EEXIST.
+             */
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        gprintk(XENLOG_DEBUG,
+                "argo: vm%u re-registering existing ring (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+    }
+
+    ret = find_ring_mfns(currd, ring_info, npage, gfn_hnd, reg.len);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to find ring mfns (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+
+    /*
+     * The first page of the memory supplied for the ring has the xen_argo_ring
+     * structure at its head, which is where the ring indexes reside.
+     */
+    ret = ring_map_page(currd, ring_info, 0, &map_ringp);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to map ring mfn 0 (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+    ringp = map_ringp;
+
+    private_tx_ptr = read_atomic(&ringp->tx_ptr);
+
+    if ( (private_tx_ptr >= reg.len) ||
+         (ROUNDUP_MESSAGE(private_tx_ptr) != private_tx_ptr) )
+    {
+        /*
+         * Since the ring is a mess, attempt to flush the contents of it
+         * here by setting the tx_ptr to the next aligned message slot past
+         * the latest rx_ptr we have observed. Handle ring wrap correctly.
+         */
+        private_tx_ptr = ROUNDUP_MESSAGE(read_atomic(&ringp->rx_ptr));
+
+        if ( private_tx_ptr >= reg.len )
+            private_tx_ptr = 0;
+
+        update_tx_ptr(currd, ring_info, private_tx_ptr);
+    }
+
+    ring_info->tx_ptr = private_tx_ptr;
+    ring_info->len = reg.len;
+    currd->argo->ring_count++;
+
+    if ( send_info )
+    {
+        spin_lock(&dst_d->argo->send_L2_lock);
+
+        hlist_add_head(&send_info->node,
+                       &dst_d->argo->send_hash[hash_index(&send_info->id)]);
+
+        spin_unlock(&dst_d->argo->send_L2_lock);
+    }
+
+ out_unlock2:
+    if ( ret )
+        xfree(send_info);
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    /*
+     * FIXME for 4.12: pull this write_unlock up above the cleanup actions above
+     * and add another label to absorb the two separate put_domain() calls on
+     * the error paths.
+     */
+    write_unlock(&currd->argo->rings_L2_rwlock);
+
+ out_unlock:
+    read_unlock(&L1_global_argo_rwlock);
+
+    return ret;
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
            unsigned long arg4)
 {
+    struct domain *currd = current->domain;
     long rc = -EFAULT;
 
     argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
@@ -507,6 +948,38 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
+            guest_handle_cast(arg1, xen_argo_register_ring_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd =
+            guest_handle_cast(arg2, xen_argo_gfn_t);
+        /* arg3 is npage */
+        /* arg4 is flags */
+        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
+
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+        /*
+         * Check access to the whole array here so we can use the faster __copy
+         * operations to read each element later.
+         */
+        if ( unlikely(!guest_handle_okay(gfn_hnd, arg3)) )
+            break;
+        /* arg4: reserve currently-undefined bits, require zero.  */
+        if ( unlikely(arg4 & ~XEN_ARGO_REGISTER_FLAG_MASK) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = register_ring(currd, reg_hnd, gfn_hnd, arg3, fail_exist);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 8edb9e8..9437a7a 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -20,4 +20,5 @@
 #include <compat/argo.h>
 
 CHECK_argo_addr;
+CHECK_argo_register_ring;
 CHECK_argo_ring;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 530bb82..bd23373 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -33,9 +33,19 @@
 
 #define XEN_ARGO_DOMID_ANY       DOMID_INVALID
 
+/*
+ * The maximum size of an Argo ring is defined to be: 16MB
+ *  -- which is 0x1000000 bytes.
+ * A byte index into the ring is at most 24 bits.
+ */
+#define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
+
 /* Fixed-width type for "argo port" number. Nothing to do with evtchns. */
 typedef uint32_t xen_argo_port_t;
 
+/* gfn type: 64-bit on all architectures to aid avoiding a compat ABI */
+typedef uint64_t xen_argo_gfn_t;
+
 typedef struct xen_argo_addr
 {
     xen_argo_port_t aport;
@@ -61,4 +71,71 @@ typedef struct xen_argo_ring
 #endif
 } xen_argo_ring_t;
 
+typedef struct xen_argo_register_ring
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    uint16_t pad;
+    uint32_t len;
+} xen_argo_register_ring_t;
+
+/* Messages on the ring are padded to a multiple of this size. */
+#define XEN_ARGO_MSG_SLOT_SIZE 0x10
+
+struct xen_argo_ring_message_header
+{
+    uint32_t len;
+    xen_argo_addr_t source;
+    uint32_t message_type;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t data[];
+#elif defined(__GNUC__)
+    uint8_t data[0];
+#endif
+};
+
+/*
+ * Hypercall operations
+ */
+
+/* FIXME for 4.12:
+ * - drop use of unsigned long type as hypercall args: not compat-friendly
+ * - drop UL suffix on XEN_ARGO_REGISTER_FLAG_MASK
+ * - guard XEN_ARGO_REGISTER_FLAG_MASK (perhaps framed by "#ifdef __XEN__")
+ * - define XEN_ARGO_REGISTER_FLAG_MASK in terms of other flags defined
+ */
+/*
+ * XEN_ARGO_OP_register_ring
+ *
+ * Register a ring using the guest-supplied memory pages.
+ * Also used to reregister an existing ring (eg. after resume from hibernate).
+ *
+ * The first argument struct indicates the port number for the ring to register
+ * and the partner domain, if any, that is to be allowed to send to the ring.
+ * A wildcard (XEN_ARGO_DOMID_ANY) may be supplied instead of a partner domid,
+ * and if the hypervisor has wildcard sender rings enabled, this will allow
+ * any domain (XSM notwithstanding) to send to the ring.
+ *
+ * The second argument is an array of guest frame numbers and the third argument
+ * indicates the size of the array. This operation only supports 4K-sized pages.
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_register_ring_t)
+ * arg2: XEN_GUEST_HANDLE(xen_argo_gfn_t)
+ * arg3: unsigned long npages
+ * arg4: unsigned long flags
+ */
+#define XEN_ARGO_OP_register_ring     1
+
+/* Register op flags */
+/*
+ * Fail exist:
+ * If set, reject attempts to (re)register an existing established ring.
+ * If clear, reregistration occurs if the ring exists, with the new ring
+ * taking the place of the old, preserving tx_ptr if it remains valid.
+ */
+#define XEN_ARGO_REGISTER_FLAG_FAIL_EXIST  0x1
+
+/* Mask for all defined flags. unsigned long type so ok for both 32/64-bit */
+#define XEN_ARGO_REGISTER_FLAG_MASK 0x1UL
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9f616e4..9c9d33f 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -150,3 +150,4 @@
 ?	flask_transition		xsm/flask_op.h
 ?	argo_addr			argo.h
 ?	argo_ring			argo.h
+?	argo_register_ring		argo.h
-- 
2.7.4



* [PATCH v4 08/14] argo: implement the unregister op
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (6 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 07/14] argo: implement the register op Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15 15:03   ` Roger Pau Monné
  2019-01-15  9:27 ` [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Takes a single argument: a handle to the ring unregistration struct,
which specifies the port and partner domain id or wildcard.

The ring's entry is removed from the hashtable of registered rings;
any entries for pending notifications are removed; and the ring is
unmapped from Xen's address space.

If the ring had been registered to communicate with a single specified
domain (i.e. a non-wildcard ring), then the corresponding send_info entry
is removed from the partner domain's argo send_info hash table.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v3 #08 Jan: pull xfree out of exclusive critical sections in unregister_ring
v3 #08 Jan: rename send_find_info to find_send_info
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #08 Roger: use return and remove the out label in unregister_ring
v3 #08 Roger: better debug output in send_find_info
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld compat check for unregister_ring struct
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
v3 feedback #07 Roger: const the argo_ring_id structs in send_find_info
v2 feedback Jan: drop cookie, implement teardown
v2 feedback Jan: drop message from argo_message_op
v2 self: OVERHAUL
v2 self: reorder logic to shorten critical section
v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops
v1 feedback Roger, Jan: drop argo prefix on static functions
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 #5 (#14) feedback Paul: use currd in do_argo_message_op
v1 #5 (#14) feedback Paul: full use currd in argo_unregister_ring
v1 #13 (#14) feedback Paul: replace do/while with goto; reindent
v1 self: add blank lines in unregister case in do_argo_message_op
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: blank line after op case in do_argo_message_op
v1: #14 feedback Jan: replace domain id override with validation
v1: #18 feedback Jan: meld the ring count limit into the series
v1: feedback #15 Jan: verify zero in unused hypercall args
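
For reviewers: the argument contract of the unregister op can be summarised
in a few lines. The following standalone sketch is not code from this series.
The struct layout mirrors xen_argo_unregister_ring_t from the public header in
this patch, domid_t is assumed to be a 16-bit type, and check_unregister_args
is a hypothetical helper that applies the same checks as do_argo_op and
unregister_ring: the unused arguments must be NULL or zero, and pad must be
zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xen_argo_port_t;
typedef uint16_t domid_t;   /* assumption: matches Xen's 16-bit domid_t */

/* Same field layout as the struct added to public/argo.h by this patch. */
typedef struct xen_argo_unregister_ring
{
    xen_argo_port_t aport;
    domid_t partner_id;
    uint16_t pad;
} xen_argo_unregister_ring_t;

/*
 * Hypothetical helper: returns 0 if the arguments would pass the checks
 * made before the ring lookup, or -EINVAL otherwise. A NULL unreg is a
 * programming error here; the hypercall itself would fail the guest copy
 * with -EFAULT instead.
 */
static int check_unregister_args(const xen_argo_unregister_ring_t *unreg,
                                 const void *arg2, unsigned long arg3,
                                 unsigned long arg4)
{
    assert(unreg != NULL);

    /* arg2..arg4 are unused by this op and must be NULL / zero. */
    if ( arg2 != NULL || arg3 || arg4 )
        return -EINVAL;

    /* Padding is reserved and must be zero. */
    if ( unreg->pad )
        return -EINVAL;

    return 0;
}
```

A caller would fill in aport and partner_id (or the wildcard domid for a
wildcard ring), leave pad zero, and pass NULL, 0, 0 for the remaining
arguments.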

 xen/common/argo.c         | 118 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c  |   1 +
 xen/include/public/argo.h |  19 ++++++++
 xen/include/xlat.lst      |   1 +
 4 files changed, 139 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 076ee6c..3f95f80 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -43,6 +43,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
 
 /* Xen command line option to enable argo */
 static bool __read_mostly opt_argo_enabled;
@@ -327,6 +328,33 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
     return NULL;
 }
 
+static struct argo_send_info *
+find_send_info(const struct domain *d, const struct argo_ring_id *id)
+{
+    struct hlist_node *node;
+    struct argo_send_info *send_info;
+
+    ASSERT(LOCKING_send_L2(d));
+
+    hlist_for_each_entry(send_info, node, &d->argo->send_hash[hash_index(id)],
+                         node)
+    {
+        const struct argo_ring_id *cmpid = &send_info->id;
+
+        if ( cmpid->aport == id->aport &&
+             cmpid->domain_id == id->domain_id &&
+             cmpid->partner_id == id->partner_id )
+        {
+            argo_dprintk("send_info=%p\n", send_info);
+            return send_info;
+        }
+    }
+    argo_dprintk("no send_info found for ring(%u:%x %u)\n",
+                 id->domain_id, id->aport, id->partner_id);
+
+    return NULL;
+}
+
 static void
 ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
 {
@@ -695,6 +723,81 @@ find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
  * * simplify the out label path when lock release has been moved.
  */
 static long
+unregister_ring(struct domain *currd,
+                XEN_GUEST_HANDLE_PARAM(xen_argo_unregister_ring_t) unreg_hnd)
+{
+    xen_argo_unregister_ring_t unreg;
+    struct argo_ring_id ring_id;
+    struct argo_ring_info *ring_info;
+    struct argo_send_info *send_info = NULL;
+    struct domain *dst_d = NULL;
+    int ret = 0;
+
+    ASSERT(currd == current->domain);
+
+    if ( copy_from_guest(&unreg, unreg_hnd, 1) )
+        return -EFAULT;
+
+    if ( unreg.pad )
+        return -EINVAL;
+
+    ring_id.partner_id = unreg.partner_id;
+    ring_id.aport = unreg.aport;
+    ring_id.domain_id = currd->domain_id;
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    write_lock(&currd->argo->rings_L2_rwlock);
+
+    ring_info = find_ring_info(currd, &ring_id);
+    if ( ring_info )
+    {
+        ring_remove_info(currd, ring_info);
+        currd->argo->ring_count--;
+    }
+
+    dst_d = get_domain_by_id(ring_id.partner_id);
+    if ( dst_d )
+    {
+        if ( dst_d->argo )
+        {
+            spin_lock(&dst_d->argo->send_L2_lock);
+
+            send_info = find_send_info(dst_d, &ring_id);
+            if ( send_info )
+                hlist_del(&send_info->node);
+
+            spin_unlock(&dst_d->argo->send_L2_lock);
+
+        }
+        put_domain(dst_d);
+    }
+
+    write_unlock(&currd->argo->rings_L2_rwlock);
+
+    if ( send_info )
+        xfree(send_info);
+
+    if ( !ring_info )
+    {
+        argo_dprintk("ENOENT\n");
+        ret = -ENOENT;
+        goto out_unlock;
+    }
+
+ out_unlock:
+    read_unlock(&L1_global_argo_rwlock);
+
+    return ret;
+}
+
+static long
 register_ring(struct domain *currd,
               XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
@@ -980,6 +1083,21 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_OP_unregister_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_unregister_ring_t) unreg_hnd =
+            guest_handle_cast(arg1, xen_argo_unregister_ring_t);
+
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = unregister_ring(currd, unreg_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 9437a7a..6a1671c 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -22,3 +22,4 @@
 CHECK_argo_addr;
 CHECK_argo_register_ring;
 CHECK_argo_ring;
+CHECK_argo_unregister_ring;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index bd23373..3eabf83 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -79,6 +79,13 @@ typedef struct xen_argo_register_ring
     uint32_t len;
 } xen_argo_register_ring_t;
 
+typedef struct xen_argo_unregister_ring
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    uint16_t pad;
+} xen_argo_unregister_ring_t;
+
 /* Messages on the ring are padded to a multiple of this size. */
 #define XEN_ARGO_MSG_SLOT_SIZE 0x10
 
@@ -138,4 +145,16 @@ struct xen_argo_ring_message_header
 /* Mask for all defined flags. unsigned long type so ok for both 32/64-bit */
 #define XEN_ARGO_REGISTER_FLAG_MASK 0x1UL
 
+/*
+ * XEN_ARGO_OP_unregister_ring
+ *
+ * Unregister a previously-registered ring, ending communication.
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_unregister_ring_t)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_OP_unregister_ring     2
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9c9d33f..411c661 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -151,3 +151,4 @@
 ?	argo_addr			argo.h
 ?	argo_ring			argo.h
 ?	argo_register_ring		argo.h
+?	argo_unregister_ring		argo.h
-- 
2.7.4



* [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (7 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 08/14] argo: implement the unregister op Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15 15:49   ` Roger Pau Monné
  2019-01-15  9:27 ` [PATCH v4 10/14] argo: implement the notify op Christopher Clark
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The sendv operation performs a synchronous send of the buffers
contained in iovs to a remote domain's registered ring.

It takes:
 * A destination address (domid, port) for the ring to send to.
   It performs a most-specific match lookup, to allow for wildcard rings.
 * A source address, used to inform the destination of where to reply.
 * The address of an array of iovs containing the data to send.
 * The length of that array of iovs.
 * A 32-bit message type, available to communicate message context
   data (e.g. kernel-to-kernel, separate from the application data).
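
As a rough illustration of the most-specific match lookup (a standalone
sketch, not the hypervisor code: the sketch_ names, the flat array and the
wildcard marker value are assumptions made for this example), an exact
(port, sender) match is tried first and the wildcard ring is only used as
a fallback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wildcard marker; an arbitrary value chosen for this sketch. */
#define SKETCH_DOMID_ANY 0xFFFFU

/* Hypothetical, simplified ring identity: port plus permitted partner. */
struct sketch_ring {
    uint32_t aport;
    uint16_t partner_id;
};

/* Exact lookup: both the port and the partner must match. */
static const struct sketch_ring *
sketch_find(const struct sketch_ring *rings, size_t n,
            uint32_t aport, uint16_t partner_id)
{
    size_t i;

    assert(rings != NULL || n == 0);

    for ( i = 0; i < n; i++ )
        if ( rings[i].aport == aport && rings[i].partner_id == partner_id )
            return &rings[i];

    return NULL;
}

/* Most-specific match: prefer the sender-specific ring, else the wildcard. */
static const struct sketch_ring *
sketch_find_by_match(const struct sketch_ring *rings, size_t n,
                     uint32_t aport, uint16_t sender_id)
{
    const struct sketch_ring *ring = sketch_find(rings, n, aport, sender_id);

    if ( ring )
        return ring;

    return sketch_find(rings, n, aport, SKETCH_DOMID_ANY);
}
```

With rings registered for (port 5, any sender) and (port 5, sender 3), a
message from domain 3 lands on the second ring, while a message from any
other domain lands on the first.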

If insufficient space exists in the destination ring, it will return
-EAGAIN and Xen will notify the caller when sufficient space becomes
available.
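
The space check behind that -EAGAIN can be sketched as follows. This is a
simplified standalone model rather than the code in this patch: the
sketch_ names are hypothetical, the 16-byte slot and header sizes are
taken from the values used in this series, and both pointers are assumed
to be already sanitized to lie within the ring.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SLOT_SIZE 0x10U  /* mirrors XEN_ARGO_MSG_SLOT_SIZE */
#define SKETCH_HDR_SIZE  0x10U  /* assumed 16-byte message header */

/* Round a payload length up to a whole number of ring slots. */
static uint32_t sketch_roundup(uint32_t len)
{
    return (len + SKETCH_SLOT_SIZE - 1) & ~(SKETCH_SLOT_SIZE - 1);
}

/*
 * Returns 1 if a message with 'len' payload bytes fits right now, or 0 if
 * the sender would have to wait. The comparison is strict: a message may
 * never fill the ring completely, since rx_ptr == tx_ptr means "empty".
 */
static int sketch_fits(uint32_t ring_len, uint32_t rx_ptr, uint32_t tx_ptr,
                       uint32_t len)
{
    int64_t space;

    assert(rx_ptr < ring_len && tx_ptr < ring_len);

    if ( rx_ptr == tx_ptr )
        space = ring_len;                  /* empty ring */
    else
    {
        space = (int64_t)rx_ptr - tx_ptr;
        if ( space < 0 )
            space += ring_len;             /* free region wraps around */
    }

    return (int64_t)sketch_roundup(len) + SKETCH_HDR_SIZE < space;
}
```

For a 64-byte ring that is empty, a 32-byte payload fits (48 < 64) but a
48-byte payload does not (64 is not less than 64).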

Accesses to the ring indices are appropriately atomic. The rings are
mapped into Xen's private address space to write as needed and the
mappings are retained for later use.

Fixed-size types are used in some areas within this code where caution
around avoiding integer overflow is important.
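
A minimal sketch of the kind of overflow guard this refers to, assuming a
hypothetical per-message cap (SKETCH_MAX_MSG) and hypothetical error
values; it is not the iov_count function itself. Each length is checked
against the cap before it is added, and the running total is checked
after every addition, so the 32-bit sum cannot wrap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on total message size, chosen for this sketch. */
#define SKETCH_MAX_MSG 0x00FFFFE0U

/* Hypothetical error values standing in for -EINVAL and -EMSGSIZE. */
enum { SKETCH_OK = 0, SKETCH_EINVAL = -1, SKETCH_EMSGSIZE = -2 };

/*
 * Sum the lengths into a fixed-width total. The total is returned through
 * an out parameter so that a negative error code can never be mistaken
 * for a (huge) unsigned length.
 */
static int sketch_sum_lens(const uint32_t *lens, size_t n, uint32_t *total)
{
    uint32_t sum = 0;
    size_t i;

    /* sum <= cap and lens[i] <= cap, so sum + lens[i] cannot wrap. */
    assert(SKETCH_MAX_MSG <= UINT32_MAX / 2);

    for ( i = 0; i < n; i++ )
    {
        if ( lens[i] > SKETCH_MAX_MSG )
            return SKETCH_EINVAL;

        sum += lens[i];

        if ( sum > SKETCH_MAX_MSG )
            return SKETCH_EMSGSIZE;
    }

    *total = sum;

    return SKETCH_OK;
}
```

The out parameter is only written on success, so a caller that ignores
the return value does not pick up a partial sum.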

Notifications are sent to guests via VIRQ, and send_guest_global_virq is
exposed in this change to enable Argo to call it. VIRQ_ARGO_MESSAGE is
claimed from the VIRQ previously reserved for this purpose (#11).

The VIRQ notification method is used rather than sending events using
evtchn functions directly because:

* no current event channel type is an exact fit for the intended
  behaviour. ECS_IPI is closest, but it disallows migration to
  other VCPUs which is not necessarily a requirement for Argo.

* at the point of argo_init, allocation of an event channel is
  complicated by none of the guest VCPUs being initialized yet
  and the event channel logic expects that a valid event channel
  has a present VCPU.

* at the point of signalling a notification, the VIRQ logic is already
  defensive: if d->vcpu[0] is NULL, the notification is just silently
  dropped, whereas the evtchn_send logic is not so defensive: vcpu[0]
  must not be NULL, otherwise a null pointer dereference occurs.

Using a VIRQ removes the need for the guest to query to determine which
event channel notifications will be delivered on. This is also likely to
simplify establishing future L0/L1 nested hypervisor argo communication.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v3 #07 Jan: rename ring_find_info* to find_ring_info*
v3 #07 Jan: fix numeric entries in printk format strings
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld compat struct checking for hypercall args
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback #09 Eric: fix len & offset sanity check in memcpy_to_guest_ring
v3 feedback #04 Roger: newline fix in wildcard_pending_list_insert
v3 feedback #04 Roger: drop npages struct member, calculate from len
v3 #09 Roger: simplify EFAULT return in memcpy_to_guest_ring
v3 #09 Roger: add newline before return in get_sanitized_ring
v3 #09 Roger: replace while with for loop in iov_count
v3 #09 Roger: drop 0 in struct init in ringbuf_insert
v3 #09 Roger: comment for XEN_ARGO_MAXIOV: warn of stack overflow risk
v3 #09 Roger: simplify while loop: for instead in ringbuf_insert
v3 #09 Roger: drop out label for returns in ringbuf_insert
v3 #09 Roger: drop newline in pending_queue
v3 #09 Roger: replace second goto label with error path unlock in sendv
v3 #09 Jason: check iov_len vs MAX_ARGO_MESSAGE_SIZE in iov_count
v3 #09 Jason: check padding is zeroed in sendv op
v3 #09 Jason: memcpy_to_guest_ring: simpler code with better loop

v2 self: use ring_info backpointer in pending_ent to maintain npending
v2 feedback Jan: drop cookie, implement teardown
v2 self: pending_queue: reap stale ents when in need of space
v2 self: pending_requeue: reclaim ents for stale domains
v2.feedback Jan: only override sender domid if DOMID_ANY
v2 feedback Jan: drop message from argo_message_op
v2 self: check npending vs maximum limit
v2 self: get_sanitized_ring instead of get_rx_ptr
v2 feedback v1#13 Jan: remove double read from ringbuf insert, lower MAX_IOV
v2 self: make iov_count const
v2 self: iov_count : return EMSGSIZE for message too big
v2 self: OVERHAUL
v2 self: s/argo_pending_ent/pending_ent/g
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v1 feedback #13 Jan: drop guest_handle_okay when using copy_from_guest
    - reorder do_argo_op logic
v2 self: add _hnd suffix to iovs variable name to indicate guest handle type
v2 self: replace use of XEN_GUEST_HANDLE_NULL with two existing macros

v1 #15 feedback, Jan: sendv op : s/ECONNREFUSED/ESRCH/
v1 #5 (#15) feedback Paul: sendv: use currd in do_argo_message_op
v1 #13 (#15) feedback Paul: sendv op: do/while reindent only
v1 #13 (#15) feedback Paul: sendv op: do/while: argo_ringbuf_insert to goto style
v1 #13 (#15) feedback Paul: sendv op: do/while: reindent only again
v1 #13 (#15) feedback Paul: sendv op: do/while : goto
v1 #15 feedback Paul: sendv op: make page var: unsigned
v1 #15 feedback Paul: sendv op: new local var for PAGE_SIZE - offset
v1 #8 feedback Jan: XEN_GUEST_HANDLE : C89 compliance
v1 rebase after switching register op from pfns to page descriptors
v1 self: move iov DEFINE_XEN_GUEST_HANDLE out of public header into argo.c
v1 #13 (#15) feedback Paul: fix loglevel for guest-triggered messages
v1 : add compat xlat.lst entries
v1 self: switched notification to send_guest_global_virq instead of event
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: init len variable to satisfy ARM compiler initialized checking
v1 #13 feedback Jan: rename page var
v1:#14 feedback Jan: uint8_t* -> void*
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: blank line after case op in do_argo_message_op
v1: #15 feedback Jan: add comments explaining why the writes don't overrun
v1: self: add ASSERT to support comment that overrun cannot happen
v1: self: fail on short writes where guest manipulated the iov_lens
v1: self: rename ent id to domain_id
v1: self: add moan for iov rewrite
v1. feedback #15 Jan: require the pad bits are zero
v1. feedback #15 Jan: drop NULL check in argo_signal_domain as now using VIRQ
v1. self: store domain_cookie in pending ent
v1. feedback #15 Jan: use unsigned where possible
v1. feedback Jan: use handle type for iov_base in public iov interface
v1. self: log whenever visible error occurs
v1 feedback #15, Jan: drop unnecessary mb
v1 self: only update internal tx_ptr if able to return success
         and update the visible tx_ptr
v1 self: log on failure to map ring to update visible tx_ptr
v1 feedback #15 Jan: add comment re: notification size policy
v1 self/Roger? remove errant space after sizeof
v1. feedback #15 Jan: require iov pad be zero
v1. self: rename iov_base to iov_hnd for handle in public iov interface
v1: feedback #15 Jan: handle upper-halves of hypercall args; changes some
    types in function signatures to match.
v1: self: add dprintk to sendv
v1: self: add debug output to argo_iov_count
v1. feedback #14 Jan: blank line before return in argo_iov_count
v1 feedback #15 Jan: verify src id, not override

 xen/common/argo.c          | 635 +++++++++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c   |  19 ++
 xen/common/event_channel.c |   2 +-
 xen/include/public/argo.h  |  60 +++++
 xen/include/public/xen.h   |   2 +-
 xen/include/xen/event.h    |   7 +
 xen/include/xlat.lst       |   2 +
 7 files changed, 725 insertions(+), 2 deletions(-)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 3f95f80..5d5cf49 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -30,10 +30,15 @@
 #include <public/argo.h>
 
 #define MAX_RINGS_PER_DOMAIN            128U
+#define MAX_PENDING_PER_RING             32U
 
 /* All messages on the ring are padded to a multiple of the slot size. */
 #define ROUNDUP_MESSAGE(a) ROUNDUP((a), XEN_ARGO_MSG_SLOT_SIZE)
 
+/* The maximum size of a message that may be sent on the largest Argo ring. */
+#define MAX_ARGO_MESSAGE_SIZE ((XEN_ARGO_MAX_RING_SIZE) - \
+        (sizeof(struct xen_argo_ring_message_header)) - ROUNDUP_MESSAGE(1))
+
 /* Number of PAGEs needed to hold a ring of a given size in bytes */
 #define NPAGES_RING(ring_len) \
     (ROUNDUP((ROUNDUP_MESSAGE(ring_len) + sizeof(xen_argo_ring_t)), PAGE_SIZE) \
@@ -41,8 +46,10 @@
 
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
 
 /* Xen command line option to enable argo */
@@ -328,6 +335,28 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
     return NULL;
 }
 
+static struct argo_ring_info *
+find_ring_info_by_match(const struct domain *d, xen_argo_port_t aport,
+                        domid_t partner_id)
+{
+    struct argo_ring_id id;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    id.aport = aport;
+    id.domain_id = d->domain_id;
+    id.partner_id = partner_id;
+
+    ring_info = find_ring_info(d, &id);
+    if ( ring_info )
+        return ring_info;
+
+    id.partner_id = XEN_ARGO_DOMID_ANY;
+
+    return find_ring_info(d, &id);
+}
+
 static struct argo_send_info *
 find_send_info(const struct domain *d, const struct argo_ring_id *id)
 {
@@ -356,6 +385,14 @@ find_send_info(const struct domain *d, const struct argo_ring_id *id)
 }
 
 static void
+signal_domain(struct domain *d)
+{
+    argo_dprintk("signalling domid:%u\n", d->domain_id);
+
+    send_guest_global_virq(d, VIRQ_ARGO_MESSAGE);
+}
+
+static void
 ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
 {
     unsigned int i;
@@ -438,6 +475,389 @@ update_tx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
     smp_wmb();
 }
 
+static int
+memcpy_to_guest_ring(const struct domain *d, struct argo_ring_info *ring_info,
+                     unsigned int offset,
+                     const void *src, XEN_GUEST_HANDLE(uint8_t) src_hnd,
+                     unsigned int len)
+{
+    unsigned int mfns_index = offset >> PAGE_SHIFT;
+    void *dst;
+    int ret;
+    unsigned int src_offset = 0;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    offset &= ~PAGE_MASK;
+
+    if ( len + offset > XEN_ARGO_MAX_RING_SIZE )
+        return -EFAULT;
+
+    while ( len )
+    {
+        unsigned int head_len = min_t(unsigned int, len, PAGE_SIZE - offset);
+
+        ret = ring_map_page(d, ring_info, mfns_index, &dst);
+        if ( ret )
+            return ret;
+
+        if ( src )
+        {
+            memcpy(dst + offset, src + src_offset, head_len);
+            src_offset += head_len;
+        }
+        else
+        {
+            if ( copy_from_guest(dst + offset, src_hnd, head_len) )
+                return -EFAULT;
+
+            guest_handle_add_offset(src_hnd, head_len);
+        }
+
+        mfns_index++;
+        len -= head_len;
+        offset = 0;
+    }
+
+    return 0;
+}
+
+/*
+ * Use this with caution: rx_ptr is under guest control and may be bogus.
+ * See get_sanitized_ring for a safer alternative.
+ */
+static int
+get_rx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
+           uint32_t *rx_ptr)
+{
+    void *src;
+    xen_argo_ring_t *ringp;
+    int ret;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( !ring_info->nmfns || ring_info->nmfns < NPAGES_RING(ring_info->len) )
+        return -EINVAL;
+
+    ret = ring_map_page(d, ring_info, 0, &src);
+    if ( ret )
+        return ret;
+
+    ringp = (xen_argo_ring_t *)src;
+
+    *rx_ptr = read_atomic(&ringp->rx_ptr);
+
+    return 0;
+}
+
+/*
+ * get_sanitized_ring creates a modified copy of the ring pointers where
+ * the rx_ptr is rounded up to ensure it is aligned, and then ring
+ * wrap is handled. Simplifies safe use of the rx_ptr for available
+ * space calculation.
+ */
+static int
+get_sanitized_ring(const struct domain *d, xen_argo_ring_t *ring,
+                   struct argo_ring_info *ring_info)
+{
+    uint32_t rx_ptr;
+    int ret;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    ret = get_rx_ptr(d, ring_info, &rx_ptr);
+    if ( ret )
+        return ret;
+
+    ring->tx_ptr = ring_info->tx_ptr;
+
+    rx_ptr = ROUNDUP_MESSAGE(rx_ptr);
+    if ( rx_ptr >= ring_info->len )
+        rx_ptr = 0;
+
+    ring->rx_ptr = rx_ptr;
+
+    return 0;
+}
+
+/*
+ * iov_count returns its count on success via an out variable to avoid
+ * potential for a negative return value to be used incorrectly
+ * (eg. coerced into an unsigned variable resulting in a large incorrect value)
+ */
+static int
+iov_count(const xen_argo_iov_t *piov, unsigned long niov, uint32_t *count)
+{
+    uint32_t sum_iov_lens = 0;
+
+    if ( niov > XEN_ARGO_MAXIOV )
+        return -EINVAL;
+
+    for ( ; niov--; piov++ )
+    {
+        /* valid iovs must have the padding field set to zero */
+        if ( piov->pad )
+        {
+            argo_dprintk("invalid iov: padding is not zero\n");
+            return -EINVAL;
+        }
+
+        /* check each to protect sum against integer overflow */
+        if ( piov->iov_len > MAX_ARGO_MESSAGE_SIZE )
+        {
+            argo_dprintk("invalid iov_len: too big (%u)>%llu\n",
+                         piov->iov_len, MAX_ARGO_MESSAGE_SIZE);
+            return -EINVAL;
+        }
+
+        sum_iov_lens += piov->iov_len;
+
+        /*
+         * Again protect sum from integer overflow
+         * and ensure total msg size will be within bounds.
+         */
+        if ( sum_iov_lens > MAX_ARGO_MESSAGE_SIZE )
+        {
+            argo_dprintk("invalid iov series: total message too big\n");
+            return -EMSGSIZE;
+        }
+    }
+
+    *count = sum_iov_lens;
+
+    return 0;
+}
+
+static int
+ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
+               const struct argo_ring_id *src_id,
+               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
+               unsigned long niov, uint32_t message_type,
+               unsigned long *out_len)
+{
+    xen_argo_ring_t ring;
+    struct xen_argo_ring_message_header mh = { };
+    int32_t sp;
+    int32_t ret;
+    uint32_t len = 0;
+    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
+    xen_argo_iov_t *piov;
+    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
+       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
+    if ( ret )
+        return ret;
+
+    /*
+     * Obtain the total size of data to transmit -- sets the 'len' variable
+     * -- and sanity check that the iovs conform to size and number limits.
+     * Enforced below: no more than 'len' bytes of guest data
+     * (plus the message header) will be sent in this operation.
+     */
+    ret = iov_count(iovs, niov, &len);
+    if ( ret )
+        return ret;
+
+    /*
+     * Size bounds check against ring size and static maximum message limit.
+     * The message must not fill the ring; there must be at least one slot
+     * remaining so we can distinguish a full ring from an empty one.
+     */
+    if ( ((ROUNDUP_MESSAGE(len) +
+            sizeof(struct xen_argo_ring_message_header)) >= ring_info->len) ||
+         (len > MAX_ARGO_MESSAGE_SIZE) )
+        return -EMSGSIZE;
+
+    ret = get_sanitized_ring(d, &ring, ring_info);
+    if ( ret )
+        return ret;
+
+    argo_dprintk("ring.tx_ptr=%u ring.rx_ptr=%u ring len=%u"
+                 " ring_info->tx_ptr=%u\n",
+                 ring.tx_ptr, ring.rx_ptr, ring_info->len, ring_info->tx_ptr);
+
+    if ( ring.rx_ptr == ring.tx_ptr )
+        sp = ring_info->len;
+    else
+    {
+        sp = ring.rx_ptr - ring.tx_ptr;
+        if ( sp < 0 )
+            sp += ring_info->len;
+    }
+
+    /*
+     * Size bounds check against currently available space in the ring.
+     * Again: the message must not fill the ring leaving no space remaining.
+     */
+    if ( (ROUNDUP_MESSAGE(len) +
+            sizeof(struct xen_argo_ring_message_header)) >= sp )
+    {
+        argo_dprintk("EAGAIN\n");
+        return -EAGAIN;
+    }
+
+    mh.len = len + sizeof(struct xen_argo_ring_message_header);
+    mh.source.aport = src_id->aport;
+    mh.source.domain_id = src_id->domain_id;
+    mh.message_type = message_type;
+
+    /*
+     * For this copy to the guest ring, tx_ptr is always 16-byte aligned
+     * and the message header is 16 bytes long.
+     */
+    BUILD_BUG_ON(
+        sizeof(struct xen_argo_ring_message_header) != ROUNDUP_MESSAGE(1));
+
+    /*
+     * First data write into the destination ring: fixed size, message header.
+     * This cannot overrun because the available free space (value in 'sp')
+     * is checked above and must be at least this size.
+     */
+    ret = memcpy_to_guest_ring(d, ring_info,
+                               ring.tx_ptr + sizeof(xen_argo_ring_t),
+                               &mh, NULL_hnd, sizeof(mh));
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: failed to write message header to ring (vm%u:%x vm%u)\n",
+                ring_info->id.domain_id, ring_info->id.aport,
+                ring_info->id.partner_id);
+
+        return ret;
+    }
+
+    ring.tx_ptr += sizeof(mh);
+    if ( ring.tx_ptr == ring_info->len )
+        ring.tx_ptr = 0;
+
+    for ( piov = iovs; niov--; piov++ )
+    {
+        XEN_GUEST_HANDLE_64(uint8_t) buf_hnd = piov->iov_hnd;
+        uint32_t iov_len = piov->iov_len;
+
+        /* If no data is provided in this iov, moan and skip on to the next */
+        if ( !iov_len )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: no data iov_len=0 iov_hnd=%p ring (vm%u:%x vm%u)\n",
+                    buf_hnd.p, ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id);
+
+            continue;
+        }
+
+        if ( unlikely(!guest_handle_okay(buf_hnd, iov_len)) )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: bad iov handle [%p, %"PRIx32"] (vm%u:%x vm%u)\n",
+                    buf_hnd.p, iov_len,
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id);
+
+            return -EFAULT;
+        }
+
+        sp = ring_info->len - ring.tx_ptr;
+
+        /* Check: iov data size versus free space at the tail of the ring */
+        if ( iov_len > sp )
+        {
+            /*
+             * Second possible data write: ring-tail-wrap-write.
+             * Populate the ring tail and update the internal tx_ptr to handle
+             * wrapping at the end of ring.
+             * Size of data written here: sp
+             * which is the exact full amount of free space available at the
+             * tail of the ring, so this cannot overrun.
+             */
+            ret = memcpy_to_guest_ring(d, ring_info,
+                                       ring.tx_ptr + sizeof(xen_argo_ring_t),
+                                       NULL, buf_hnd, sp);
+            if ( ret )
+            {
+                gprintk(XENLOG_ERR,
+                        "argo: failed to copy {%p, %"PRIx32"} (vm%u:%x vm%u)\n",
+                        buf_hnd.p, sp,
+                        ring_info->id.domain_id, ring_info->id.aport,
+                        ring_info->id.partner_id);
+
+                return ret;
+            }
+
+            ring.tx_ptr = 0;
+            iov_len -= sp;
+            guest_handle_add_offset(buf_hnd, sp);
+
+            ASSERT(iov_len <= ring_info->len);
+        }
+
+        /*
+         * Third possible data write: all data remaining for this iov.
+         * Size of data written here: iov_len
+         *
+         * Case 1: if the ring-tail-wrap-write above was performed, then
+         *         iov_len has been decreased by 'sp' and ring.tx_ptr is zero.
+         *
+         *    We know from checking the result of iov_count:
+         *      len + sizeof(message_header) <= ring_info->len
+         *    We also know that len is the total of summing all iov_lens, so:
+         *       iov_len <= len
+         *    so by transitivity:
+         *       iov_len <= len <= (ring_info->len - sizeof(msgheader))
+         *    and therefore:
+         *       (iov_len + sizeof(msgheader) <= ring_info->len) &&
+         *       (ring.tx_ptr == 0)
+         *    so this write cannot overrun here.
+         *
+         * Case 2: ring-tail-wrap-write above was not performed
+         *    -> so iov_len is the guest-supplied value and: (iov_len <= sp)
+         *    ie. no more than the available space at the tail of the ring:
+         *        so this write cannot overrun.
+         */
+        ret = memcpy_to_guest_ring(d, ring_info,
+                                   ring.tx_ptr + sizeof(xen_argo_ring_t),
+                                   NULL, buf_hnd, iov_len);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: failed to copy [%p, %"PRIx32"] (vm%u:%x vm%u)\n",
+                    buf_hnd.p, iov_len, ring_info->id.domain_id,
+                    ring_info->id.aport, ring_info->id.partner_id);
+
+            return ret;
+        }
+
+        ring.tx_ptr += iov_len;
+
+        if ( ring.tx_ptr == ring_info->len )
+            ring.tx_ptr = 0;
+    }
+
+    ring.tx_ptr = ROUNDUP_MESSAGE(ring.tx_ptr);
+
+    if ( ring.tx_ptr >= ring_info->len )
+        ring.tx_ptr -= ring_info->len;
+
+    update_tx_ptr(d, ring_info, ring.tx_ptr);
+
+    /*
+     * At this point (and also on any error exit path from this function) it is
+     * possible to unmap the ring_info, ie:
+     *   ring_unmap(d, ring_info);
+     * but performance should be improved by not doing so, and retaining
+     * the mapping.
+     * An XSM policy control over level of confidentiality required
+     * versus performance cost could be added to decide that here.
+     */
+
+    *out_len = len;
+
+    return ret;
+}
+
 static void
 wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 {
@@ -458,6 +878,25 @@ wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 }
 
 static void
+wildcard_pending_list_insert(domid_t domain_id, struct pending_ent *ent)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    ASSERT(LOCKING_Read_L1);
+
+    if ( d->argo )
+    {
+        spin_lock(&d->argo->wildcard_L2_lock);
+        hlist_add_head(&ent->wildcard_node, &d->argo->wildcard_pend_list);
+        spin_unlock(&d->argo->wildcard_L2_lock);
+    }
+    put_domain(d);
+}
+
+static void
 pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
 {
     struct hlist_node *node, *next;
@@ -475,6 +914,66 @@ pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
     ring_info->npending = 0;
 }
 
+static int
+pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
+              domid_t src_id, unsigned int len)
+{
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( ring_info->npending >= MAX_PENDING_PER_RING )
+        return -ENOSPC;
+
+    ent = xmalloc(struct pending_ent);
+    if ( !ent )
+        return -ENOMEM;
+
+    ent->len = len;
+    ent->domain_id = src_id;
+    ent->ring_info = ring_info;
+
+    if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+        wildcard_pending_list_insert(src_id, ent);
+    hlist_add_head(&ent->node, &ring_info->pending);
+    ring_info->npending++;
+
+    return 0;
+}
+
+static int
+pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
+                domid_t src_id, unsigned int len)
+{
+    struct hlist_node *node;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    hlist_for_each_entry(ent, node, &ring_info->pending, node)
+    {
+        if ( ent->domain_id == src_id )
+        {
+            /*
+             * Reuse an existing queue entry for a notification rather than add
+             * another. If the existing entry is waiting for a smaller size than
+             * the current message then adjust the record to wait for the
+             * current (larger) size to be available before triggering a
+             * notification.
+             * This assists the waiting sender by ensuring that whenever a
+             * notification is triggered, there is sufficient space available
+             * for (at least) any one of the messages awaiting transmission.
+             */
+            if ( ent->len < len )
+                ent->len = len;
+
+            return 0;
+        }
+    }
+
+    return pending_queue(d, ring_info, src_id, len);
+}
+
 static void
 wildcard_rings_pending_remove(struct domain *d)
 {
@@ -1035,6 +1534,95 @@ register_ring(struct domain *currd,
     return ret;
 }
 
+static long
+sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
+      const xen_argo_addr_t *dst_addr,
+      XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd, unsigned long niov,
+      uint32_t message_type)
+{
+    struct domain *dst_d = NULL;
+    struct argo_ring_id src_id;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+    unsigned long len = 0;
+
+    ASSERT(src_d->domain_id == src_addr->domain_id);
+
+    argo_dprintk("sendv: (%u:%x)->(%u:%x) niov:%lu iov:%p type:%u\n",
+                 src_addr->domain_id, src_addr->aport,
+                 dst_addr->domain_id, dst_addr->aport,
+                 niov, iovs_hnd.p, message_type);
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !src_d->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    src_id.aport = src_addr->aport;
+    src_id.domain_id = src_d->domain_id;
+    src_id.partner_id = dst_addr->domain_id;
+
+    dst_d = get_domain_by_id(dst_addr->domain_id);
+    if ( !dst_d )
+    {
+        argo_dprintk("!dst_d, ESRCH\n");
+        ret = -ESRCH;
+        goto out_unlock;
+    }
+
+    if ( !dst_d->argo )
+    {
+        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+        ret = -ECONNREFUSED;
+        goto out_unlock;
+    }
+
+    read_lock(&dst_d->argo->rings_L2_rwlock);
+
+    ring_info = find_ring_info_by_match(dst_d, dst_addr->aport,
+                                        src_addr->domain_id);
+    if ( !ring_info )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u connection refused, src (vm%u:%x) dst (vm%u:%x)\n",
+                current->domain->domain_id, src_id.domain_id, src_id.aport,
+                dst_addr->domain_id, dst_addr->aport);
+
+        ret = -ECONNREFUSED;
+        read_unlock(&dst_d->argo->rings_L2_rwlock);
+        goto out_unlock;
+    }
+
+    spin_lock(&ring_info->L3_lock);
+
+    ret = ringbuf_insert(dst_d, ring_info, &src_id, iovs_hnd, niov,
+                         message_type, &len);
+    if ( ret == -EAGAIN )
+    {
+        argo_dprintk("ringbuf_insert failed, EAGAIN\n");
+        /* requeue to issue a notification when space is there */
+        ret = pending_requeue(dst_d, ring_info, src_addr->domain_id, len);
+    }
+
+    spin_unlock(&ring_info->L3_lock);
+
+    if ( ret >= 0 )
+        signal_domain(dst_d);
+
+    read_unlock(&dst_d->argo->rings_L2_rwlock);
+
+ out_unlock:
+    if ( dst_d )
+        put_domain(dst_d);
+
+    read_unlock(&L1_global_argo_rwlock);
+
+    return ( ret < 0 ) ? ret : len;
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
@@ -1098,6 +1686,53 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_OP_sendv:
+    {
+        xen_argo_send_addr_t send_addr;
+
+        XEN_GUEST_HANDLE_PARAM(xen_argo_send_addr_t) send_addr_hnd =
+            guest_handle_cast(arg1, xen_argo_send_addr_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd =
+            guest_handle_cast(arg2, xen_argo_iov_t);
+        /* arg3 is niov */
+        /* arg4 is message_type. Must be a 32-bit value. */
+
+        rc = copy_from_guest(&send_addr, send_addr_hnd, 1) ? -EFAULT : 0;
+        if ( rc )
+            break;
+
+        /*
+         * Check padding is zeroed. Reject niov above limit or message_types
+         * that are outside 32 bit range.
+         */
+        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
+                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        if ( send_addr.src.domain_id == XEN_ARGO_DOMID_ANY )
+            send_addr.src.domain_id = currd->domain_id;
+
+        /* No domain is currently authorized to send on behalf of another */
+        if ( unlikely(send_addr.src.domain_id != currd->domain_id) )
+        {
+            rc = -EPERM;
+            break;
+        }
+
+        /* Check access to the whole array so the faster __copy can be used. */
+        if ( unlikely(!guest_handle_okay(iovs_hnd, arg3)) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = sendv(currd, &send_addr.src, &send_addr.dst, iovs_hnd, arg3, arg4);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 6a1671c..6290ed6 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -23,3 +23,22 @@ CHECK_argo_addr;
 CHECK_argo_register_ring;
 CHECK_argo_ring;
 CHECK_argo_unregister_ring;
+
+/*
+ * Disable strict type checking in this compat validation macro for the
+ * following struct checks because it cannot handle fields within structs that
+ * have types that differ in the compat versus non-compat structs.
+ * Replace it with a field size check which is sufficient here.
+ */
+
+#undef CHECK_FIELD_COMMON_
+#define CHECK_FIELD_COMMON_(k, name, n, f) \
+static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
+{ \
+    BUILD_BUG_ON(offsetof(k xen_ ## n, f) != \
+                 offsetof(k compat_ ## n, f)); \
+    return sizeof(x->f) == sizeof(c->f); \
+}
+
+CHECK_argo_send_addr;
+CHECK_argo_iov;
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index f34d4f0..6fbe346 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -746,7 +746,7 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
     spin_unlock_irqrestore(&v->virq_lock, flags);
 }
 
-static void send_guest_global_virq(struct domain *d, uint32_t virq)
+void send_guest_global_virq(struct domain *d, uint32_t virq)
 {
     unsigned long flags;
     int port;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 3eabf83..c12a50f 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -46,6 +46,34 @@ typedef uint32_t xen_argo_port_t;
 /* gfn type: 64-bit on all architectures to aid avoiding a compat ABI */
 typedef uint64_t xen_argo_gfn_t;
 
+/*
+ * XEN_ARGO_MAXIOV : maximum number of iovs accepted in a single sendv.
+ * Caution is required if this value is increased: this determines the size of
+ * an array of xen_argo_iov_t structs on the hypervisor stack, so could cause
+ * stack overflow if the value is too large.
+ * The Linux Argo driver never passes more than two iovs.
+ *
+ * This value should also not exceed 128 to ensure that the total amount of data
+ * posted in a single Argo sendv operation cannot exceed 2^31 bytes, to reduce
+ * risk of integer overflow defects:
+ * Each argo iov can hold ~ 2^24 bytes, so XEN_ARGO_MAXIOV <= 2^(31-24),
+ * ie. keep XEN_ARGO_MAXIOV <= 128.
+ */
+#define XEN_ARGO_MAXIOV          8U
+
+DEFINE_XEN_GUEST_HANDLE(uint8_t);
+
+typedef struct xen_argo_iov
+{
+#ifdef XEN_GUEST_HANDLE_64
+    XEN_GUEST_HANDLE_64(uint8_t) iov_hnd;
+#else
+    uint64_t iov_hnd;
+#endif
+    uint32_t iov_len;
+    uint32_t pad;
+} xen_argo_iov_t;
+
 typedef struct xen_argo_addr
 {
     xen_argo_port_t aport;
@@ -53,6 +81,12 @@ typedef struct xen_argo_addr
     uint16_t pad;
 } xen_argo_addr_t;
 
+typedef struct xen_argo_send_addr
+{
+    xen_argo_addr_t src;
+    xen_argo_addr_t dst;
+} xen_argo_send_addr_t;
+
 typedef struct xen_argo_ring
 {
     /* Guests should use atomic operations to access rx_ptr */
@@ -157,4 +191,30 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_OP_unregister_ring     2
 
+/*
+ * XEN_ARGO_OP_sendv
+ *
+ * Send a list of buffers contained in iovs.
+ *
+ * The send address struct specifies the source and destination addresses
+ * for the message being sent, which are used to find the destination ring:
+ * Xen first looks for a most-specific match with a registered ring with
+ *  (id.addr == dst) and (id.partner == sending_domain) ;
+ * if that fails, it then looks for a wildcard match (aka multicast receiver)
+ * where (id.addr == dst) and (id.partner == DOMID_ANY).
+ *
+ * For each iov entry, send iov_len bytes from iov_hnd to the destination ring.
+ * If insufficient space exists in the destination ring, it will return -EAGAIN
+ * and Xen will notify the caller when sufficient space becomes available.
+ *
+ * The message type is a 32-bit data field available to communicate message
+ * context data (eg. kernel-to-kernel, rather than application layer).
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_send_addr_t) source and dest addresses
+ * arg2: XEN_GUEST_HANDLE(xen_argo_iov_t) iovs
+ * arg3: unsigned long niov
+ * arg4: unsigned long message type
+ */
+#define XEN_ARGO_OP_sendv               3
+
 #endif
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index b3f6491..b650aba 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -178,7 +178,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
 #define VIRQ_PCPU_STATE 9  /* G. (DOM0) PCPU state changed                   */
 #define VIRQ_MEM_EVENT  10 /* G. (DOM0) A memory event has occurred          */
-#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient                     */
+#define VIRQ_ARGO_MESSAGE 11 /* G. Argo interdomain message notification     */
 #define VIRQ_ENOMEM     12 /* G. (DOM0) Low on heap memory       */
 #define VIRQ_XENPMU     13 /* V.  PMC interrupt                              */
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index ebb879e..4650887 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -29,6 +29,13 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq);
 void send_global_virq(uint32_t virq);
 
 /*
+ * send_guest_global_virq:
+ *  @d:        Domain to which VIRQ should be sent
+ *  @virq:     Virtual IRQ number (VIRQ_*), must be global
+ */
+void send_guest_global_virq(struct domain *d, uint32_t virq);
+
+/*
  * sent_global_virq_handler: Set a global VIRQ handler.
  *  @d:        New target domain for this VIRQ
  *  @virq:     Virtual IRQ number (VIRQ_*), must be global
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 411c661..3723980 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -152,3 +152,5 @@
 ?	argo_ring			argo.h
 ?	argo_register_ring		argo.h
 ?	argo_unregister_ring		argo.h
+?	argo_iov			argo.h
+?	argo_send_addr			argo.h
-- 
2.7.4
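
The destination ring lookup order described in the XEN_ARGO_OP_sendv comment
in the patch above (most-specific match first, then wildcard) can be sketched
as follows. This is a standalone illustration and not code from the series:
struct ring_id, match_ring, find_ring and DOMID_ANY_SKETCH are hypothetical
names, the wildcard value is an assumption, and a linear array scan stands in
for whatever lookup structure the hypervisor actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XEN_ARGO_DOMID_ANY; the value is an assumption. */
#define DOMID_ANY_SKETCH 0x7FF4U

/* Simplified model of the identity of a registered ring. */
struct ring_id {
    uint32_t aport;      /* port the ring is registered on */
    uint16_t domain_id;  /* domain that registered the ring */
    uint16_t partner_id; /* permitted sender, or DOMID_ANY_SKETCH */
};

/* Return the first ring matching all three fields exactly, else NULL. */
static const struct ring_id *
match_ring(const struct ring_id *rings, size_t n, uint32_t aport,
           uint16_t dst_domain, uint16_t partner)
{
    for (size_t i = 0; i < n; i++) {
        if (rings[i].aport == aport &&
            rings[i].domain_id == dst_domain &&
            rings[i].partner_id == partner)
            return &rings[i];
    }
    return NULL;
}

/*
 * Look for a ring registered specifically for this sender first; only if
 * that fails, fall back to a wildcard ring on the same port.
 */
const struct ring_id *
find_ring(const struct ring_id *rings, size_t n, uint32_t aport,
          uint16_t dst_domain, uint16_t sender)
{
    const struct ring_id *r;

    assert(rings != NULL || n == 0);

    r = match_ring(rings, n, aport, dst_domain, sender);
    if (r == NULL)
        r = match_ring(rings, n, aport, dst_domain, DOMID_ANY_SKETCH);
    return r;
}
```

With both a wildcard ring and a single-sender ring registered on the same
port, a message from the named partner lands in the single-sender ring and
messages from every other domain land in the wildcard ring.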



* [PATCH v4 10/14] argo: implement the notify op
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (8 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15 16:17   ` Roger Pau Monné
  2019-01-15  9:27 ` [PATCH v4 11/14] xsm, argo: XSM control for argo register Christopher Clark
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The notify op queries for data about space availability in registered rings
and causes a notification to be sent when space becomes available.

The hypercall op populates a supplied data structure with information about
ring state, and if insufficient space is currently available in a given ring,
the hypervisor will record the domain's expressed interest and notify it
when it observes that space has become available.
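
As a rough illustration of the check described above, the following standalone
sketch models how the free payload space of a ring could be derived from its
indexes and how a request might then be classified. It follows the arithmetic
in this patch, but it is not the hypervisor code: payload_space and classify
are hypothetical names, HEADER_SIZE is an assumed value for
sizeof(struct xen_argo_ring_message_header), and locking, ring sanitization
and the pending queue are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_SIZE    0x10U /* XEN_ARGO_MSG_SLOT_SIZE in this series */
#define HEADER_SIZE  16U   /* assumed size of the ring message header */

/* Flag bits, mirroring the XEN_ARGO_RING_DATA_F_* values in this patch. */
#define F_EMPTY      (1U << 0)
#define F_EXISTS     (1U << 1)
#define F_PENDING    (1U << 2)
#define F_SUFFICIENT (1U << 3)
#define F_EMSGSIZE   (1U << 4)

/* Largest payload that could be inserted right now, in bytes. */
unsigned int payload_space(uint32_t len, uint32_t rx_ptr, uint32_t tx_ptr)
{
    int64_t space;

    if (len == 0)
        return 0;
    assert(rx_ptr < len && tx_ptr < len);

    /* Equal indexes always mean an empty ring, never a full one. */
    space = (int64_t)rx_ptr - (int64_t)tx_ptr;
    if (space <= 0)
        space += len;

    /* Reserve room for one header and the slot that must stay unfilled. */
    space -= HEADER_SIZE;
    space -= SLOT_SIZE;

    return space < 0 ? 0 : (unsigned int)space;
}

/* Flags that would be reported for an existing ring and a given request. */
unsigned int classify(uint32_t len, uint32_t rx_ptr, uint32_t tx_ptr,
                      uint32_t space_required)
{
    unsigned int flags = F_EXISTS;
    unsigned int max_message_size, avail;

    assert(len >= HEADER_SIZE + SLOT_SIZE);
    max_message_size = len - HEADER_SIZE - SLOT_SIZE;
    avail = payload_space(len, rx_ptr, tx_ptr);

    if (space_required > max_message_size)
        flags |= F_EMSGSIZE;   /* can never fit: do not queue interest */
    else if (avail >= space_required)
        flags |= F_SUFFICIENT; /* fits now */
    else
        flags |= F_PENDING;    /* interest recorded, notify later */

    if (avail == max_message_size)
        flags |= F_EMPTY;

    return flags;
}
```

For a 4096-byte ring with rx_ptr == tx_ptr, the sketch reports 4064 bytes of
payload space, which is also the largest message the ring can ever accept.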

Checks for free space occur when this notify op is invoked, so it may be
intentionally invoked with no data structure to populate
(ie. a NULL argument) to trigger such a check and consequent notifications.

Limit the maximum number of notify requests in a single operation to a
simple fixed limit of 256.
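
The argument handling implied by the two paragraphs above can be sketched as
follows. This is a simplified, standalone model rather than the patch code:
struct ring_data_sketch and check_notify_request are hypothetical names, a
plain pointer stands in for the guest handle, and the free-space check that
the hypervisor performs before inspecting the argument is not modelled. The
-EACCES return for an oversized request mirrors what the notify function in
this patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NOTIFY_COUNT 256U /* same limit as in this patch */

/* Hypothetical, fixed-size model of the xen_argo_ring_data header. */
struct ring_data_sketch {
    uint32_t nent; /* number of entries that follow */
    uint32_t pad;
};

/*
 * Decide how many entries a notify request asks to have processed.
 * A NULL request is valid and means "only trigger the pending checks".
 * Returns 0 on success, or -EACCES if the entry count exceeds the limit.
 */
int check_notify_request(const struct ring_data_sketch *req,
                         uint32_t *nent_out)
{
    assert(nent_out != NULL);
    *nent_out = 0;

    if (req == NULL)
        return 0; /* nothing to populate */

    if (req->nent > MAX_NOTIFY_COUNT)
        return -EACCES;

    *nent_out = req->nent;
    return 0;
}
```

A caller that only wants outstanding notifications to be re-evaluated passes
NULL and gets success with zero entries to process.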

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v3 #07 Jan: fix format string indentation in printks
v3 (general) Jan: drop fixed width types for ringbuf_payload_space
v3 #07 Jan: rename ring_find_info_by_match to find_ring_info_by_match
v3 #07 Jan: fix numeric entries in printk format strings
v3: ringbuf_payload_space: simpler return 0 if get_sanitized_ring fails
v3 #10 Roger: simplify ringbuf_payload_space for empty rings
v3 #10 Roger: ringbuf_payload_space: add comment to explain how ret < INT32_MAX
v3 #10 Roger: drop out label, use return -EFAULT in fill_ring_data
v3 #10 Roger: add newline in signal_domid
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld the compat hypercall arg checking
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 self: drop braces in foreach of notify_check_pending
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name

v2 feedback Jan: drop cookie, implement teardown
v2 notify: add flag to indicate ring is shared
v2 argument name for fill_ring_data arg is now currd
v2 self: check ring size vs request and flag error rather than queue signal
v2 feedback Jan: drop 'message' from 'argo_message_op'
v2 self: simplify signal_domid, drop unnecessary label + goto
v2 self: skip the cookie check in pending_cancel
v2 self: implement npending limit on number of pending entries
v1 feedback #16 Jan: sanitize_ring in ringbuf_payload_space
v2 self: inline fill_ring_data_array
v2 self: avoid retesting dst_d for put_domain
v2 self/Jan: remove use of magic verification field and tidy up
v1 feedback #16 Jan: remove testing of magic in guest-supplied structure
v2 self: s/argo_pending_ent/pending_ent/g
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v2 self: reduce indentation via goto out if arg NULL
v1 feedback #13 Jan: resolve checking of array handle and use of __copy

v1 #5 (#16) feedback Paul: notify op: use currd in do_argo_message_op
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify_check_pending
v1 #5 (#16) feedback Paul: notify op: use currd in argo_fill_ring_data_array
v1 #13 (#16) feedback Paul: notify op: do/while: reindent only
v1 #13 (#16) feedback Paul: notify op: do/while: goto
v1 : add compat xlat.lst entries
v1: add definition for copy_field_from_guest_errno
v1 #13 feedback Jan: make 'ring data' comment comply with single-line style
v1 feedback #13 Jan: use __copy; so define and use __copy_field_to_guest_errno
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: add blank line after case in do_argo_message_op
v1: self: rename ent id to domain_id
v1: self: ent id-> domain_id
v1: self: drop signal if domain_cookie mismatches
v1. feedback #15 Jan: make loop i unsigned
v1. self: drop unnecessary mb() in argo_notify_check_pending
v1. self: add blank line
v1 #16 feedback Jan: const domain arg to argo_fill_ring_data
v1. feedback #15 Jan: check unused hypercall args are zero
v1 feedback #16 Jan: add comment on space available signal policy
v1. feedback #16 Jan: move declr, drop braces, lower indent
v1. feedback #18 Jan: meld the resource limits into the main commit
v1. feedback #16 Jan: clarify use of magic field
v1. self: use single copy to read notify ring data struct
v1: argo_fill_ring_data: fix dprintk types for port field
v1: self: use %x for printing port as per other print sites
v1. feedback Jan: add comments explaining ring full vs empty
v1. following Jan: fix argo_ringbuf_payload_space calculation for empty ring

 xen/common/argo.c         | 352 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c  |  18 +++
 xen/include/public/argo.h |  67 +++++++++
 xen/include/xlat.lst      |   2 +
 4 files changed, 439 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 5d5cf49..d4aff05 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -30,6 +30,7 @@
 #include <public/argo.h>
 
 #define MAX_RINGS_PER_DOMAIN            128U
+#define MAX_NOTIFY_COUNT                256U
 #define MAX_PENDING_PER_RING             32U
 
 /* All messages on the ring are padded to a multiple of the slot size. */
@@ -49,6 +50,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
 
@@ -393,6 +396,18 @@ signal_domain(struct domain *d)
 }
 
 static void
+signal_domid(domid_t domain_id)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    signal_domain(d);
+    put_domain(d);
+}
+
+static void
 ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
 {
     unsigned int i;
@@ -580,6 +595,66 @@ get_sanitized_ring(const struct domain *d, xen_argo_ring_t *ring,
     return 0;
 }
 
+static unsigned int
+ringbuf_payload_space(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    xen_argo_ring_t ring;
+    unsigned int len;
+    int ret;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    len = ring_info->len;
+    if ( !len )
+        return 0;
+
+    if ( get_sanitized_ring(d, &ring, ring_info) )
+        return 0;
+
+    argo_dprintk("sanitized ringbuf_payload_space: tx_ptr=%u rx_ptr=%u\n",
+                 ring.tx_ptr, ring.rx_ptr);
+
+    /*
+     * rx_ptr == tx_ptr means that the ring has been emptied.
+     * See message size checking logic in the entry to ringbuf_insert which
+     * ensures that there is always one message slot of size ROUNDUP_MESSAGE(1)
+     * left available, preventing a ring from being entirely filled.
+     * This ensures that matching ring indexes always indicate an empty ring
+     * and never a full one.
+     */
+    ret = ring.rx_ptr - ring.tx_ptr;
+    if ( ret <= 0 )
+        ret += len;
+
+    /*
+     * In a sanitized ring, we can rely on:
+     *              (rx_ptr < ring_info->len)           &&
+     *              (tx_ptr < ring_info->len)           &&
+     *      (ring_info->len <= XEN_ARGO_MAX_RING_SIZE)
+     *
+     * and since: XEN_ARGO_MAX_RING_SIZE < INT32_MAX
+     * therefore right here: ret < INT32_MAX
+     * and we are safe to return it as an unsigned value from this function.
+     * The subtractions below cannot increase its value.
+     */
+
+    /*
+     * The maximum size payload for a message that will be accepted is:
+     * (the available space between the ring indexes)
+     *    minus (space for a message header)
+     *    minus (space for one message slot)
+     * since ringbuf_insert requires that one message slot be left
+     * unfilled, to avoid filling the ring to capacity and confusing a full
+     * ring with an empty one.
+     * Since the ring indexes are sanitized, the value in ret is aligned, so
+     * the simple subtraction here works to return the aligned value needed:
+     */
+    ret -= sizeof(struct xen_argo_ring_message_header);
+    ret -= ROUNDUP_MESSAGE(1);
+
+    return (ret < 0) ? 0 : ret;
+}
+
 /*
  * iov_count returns its count on success via an out variable to avoid
  * potential for a negative return value to be used incorrectly
@@ -914,6 +989,61 @@ pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
     ring_info->npending = 0;
 }
 
+static void
+pending_notify(struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_Read_L1);
+
+    hlist_for_each_entry_safe(ent, node, next, to_notify, node)
+    {
+        hlist_del(&ent->node);
+        signal_domid(ent->domain_id);
+        xfree(ent);
+    }
+}
+
+static void
+pending_find(const struct domain *d, struct argo_ring_info *ring_info,
+             unsigned int payload_space, struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    /*
+     * TODO: Current policy here is to signal _all_ of the waiting domains
+     *       interested in sending a message of size less than payload_space.
+     *
+     * This is likely to be suboptimal, since once one of them has added
+     * their message to the ring, there may well be insufficient room
+     * available for any of the others to transmit, meaning that they were
+     * woken in vain, which created extra work just to requeue their wait.
+     *
+     * Retain this simple policy for now since it at least avoids starving a
+     * domain of available space notifications because of a policy that only
+     * notified other domains instead. Improvement may be possible;
+     * investigation required.
+     */
+
+    spin_lock(&ring_info->L3_lock);
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( payload_space >= ent->len )
+        {
+            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+                wildcard_pending_list_remove(ent->domain_id, ent);
+            hlist_del(&ent->node);
+            ring_info->npending--;
+            hlist_add_head(&ent->node, to_notify);
+        }
+    }
+    spin_unlock(&ring_info->L3_lock);
+}
+
 static int
 pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
               domid_t src_id, unsigned int len)
@@ -975,6 +1105,28 @@ pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
 }
 
 static void
+pending_cancel(const struct domain *d, struct argo_ring_info *ring_info,
+               domid_t src_id)
+{
+    struct hlist_node *node, *next;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( ent->domain_id == src_id )
+        {
+            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+                wildcard_pending_list_remove(ent->domain_id, ent);
+            hlist_del(&ent->node);
+            xfree(ent);
+            ring_info->npending--;
+        }
+    }
+}
+
+static void
 wildcard_rings_pending_remove(struct domain *d)
 {
     struct hlist_node *node, *next;
@@ -1102,6 +1254,88 @@ partner_rings_remove(struct domain *src_d)
  *                 and rewrite this function using it or with adopted logic
  */
 static int
+fill_ring_data(const struct domain *currd,
+               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
+{
+    xen_argo_ring_data_ent_t ent;
+    struct domain *dst_d;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(currd == current->domain);
+    ASSERT(LOCKING_Read_L1);
+
+    if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
+        return -EFAULT;
+
+    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.aport=%x\n",
+                 ent.ring.domain_id, ent.ring.aport);
+
+    ent.flags = 0;
+
+    dst_d = get_domain_by_id(ent.ring.domain_id);
+    if ( dst_d )
+    {
+        if ( dst_d->argo )
+        {
+            read_lock(&dst_d->argo->rings_L2_rwlock);
+
+            ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
+                                                currd->domain_id);
+            if ( ring_info )
+            {
+                unsigned int space_avail;
+
+                ent.flags |= XEN_ARGO_RING_DATA_F_EXISTS;
+                ent.max_message_size = ring_info->len -
+                                   sizeof(struct xen_argo_ring_message_header) -
+                                   ROUNDUP_MESSAGE(1);
+
+                if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+                    ent.flags |= XEN_ARGO_RING_DATA_F_SHARED;
+
+                spin_lock(&ring_info->L3_lock);
+
+                space_avail = ringbuf_payload_space(dst_d, ring_info);
+
+                argo_dprintk("fill_ring_data: aport=%x space_avail=%u"
+                             " space_wanted=%u\n",
+                             ring_info->id.aport, space_avail,
+                             ent.space_required);
+
+                /* Do not queue a notification for an unachievable size */
+                if ( ent.space_required > ent.max_message_size )
+                    ent.flags |= XEN_ARGO_RING_DATA_F_EMSGSIZE;
+                else if ( space_avail >= ent.space_required )
+                {
+                    pending_cancel(dst_d, ring_info, currd->domain_id);
+                    ent.flags |= XEN_ARGO_RING_DATA_F_SUFFICIENT;
+                }
+                else
+                {
+                    pending_requeue(dst_d, ring_info, currd->domain_id,
+                                    ent.space_required);
+                    ent.flags |= XEN_ARGO_RING_DATA_F_PENDING;
+                }
+
+                spin_unlock(&ring_info->L3_lock);
+
+                if ( space_avail == ent.max_message_size )
+                    ent.flags |= XEN_ARGO_RING_DATA_F_EMPTY;
+
+            }
+            read_unlock(&dst_d->argo->rings_L2_rwlock);
+        }
+        put_domain(dst_d);
+    }
+
+    if ( __copy_field_to_guest(data_ent_hnd, &ent, flags) ||
+         __copy_field_to_guest(data_ent_hnd, &ent, max_message_size) )
+        return -EFAULT;
+
+    return 0;
+}
+
+static int
 find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
 {
     p2m_type_t p2mt;
@@ -1534,6 +1768,109 @@ register_ring(struct domain *currd,
     return ret;
 }
 
+static void
+notify_ring(const struct domain *d, struct argo_ring_info *ring_info,
+            struct hlist_head *to_notify)
+{
+    unsigned int space;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    spin_lock(&ring_info->L3_lock);
+
+    if ( ring_info->len )
+        space = ringbuf_payload_space(d, ring_info);
+    else
+        space = 0;
+
+    spin_unlock(&ring_info->L3_lock);
+
+    if ( space )
+        pending_find(d, ring_info, space, to_notify);
+}
+
+static void
+notify_check_pending(struct domain *d)
+{
+    unsigned int i;
+    HLIST_HEAD(to_notify);
+
+    ASSERT(LOCKING_Read_L1);
+
+    read_lock(&d->argo->rings_L2_rwlock);
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; i++ )
+    {
+        struct hlist_node *node, *next;
+        struct argo_ring_info *ring_info;
+
+        hlist_for_each_entry_safe(ring_info, node, next,
+                                  &d->argo->ring_hash[i], node)
+            notify_ring(d, ring_info, &to_notify);
+    }
+
+    read_unlock(&d->argo->rings_L2_rwlock);
+
+    if ( !hlist_empty(&to_notify) )
+        pending_notify(&to_notify);
+}
+
+static long
+notify(struct domain *currd,
+       XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd)
+{
+    XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) ent_hnd;
+    xen_argo_ring_data_t ring_data;
+    int ret = 0;
+
+    ASSERT(currd == current->domain);
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !currd->argo )
+    {
+        argo_dprintk("!d->argo, ENODEV\n");
+        ret = -ENODEV;
+        goto out;
+    }
+
+    notify_check_pending(currd);
+
+    if ( guest_handle_is_null(ring_data_hnd) )
+        goto out;
+
+    ret = copy_from_guest(&ring_data, ring_data_hnd, 1) ? -EFAULT : 0;
+    if ( ret )
+        goto out;
+
+    if ( ring_data.nent > MAX_NOTIFY_COUNT )
+    {
+        gprintk(XENLOG_ERR, "argo: notify entry count(%u) exceeds max(%u)\n",
+                ring_data.nent, MAX_NOTIFY_COUNT);
+        ret = -EACCES;
+        goto out;
+    }
+
+    ent_hnd = guest_handle_for_field(ring_data_hnd,
+                                     xen_argo_ring_data_ent_t, data[0]);
+    if ( unlikely(!guest_handle_okay(ent_hnd, ring_data.nent)) )
+    {
+        ret = -EFAULT;
+        goto out;
+    }
+
+    while ( !ret && ring_data.nent-- )
+    {
+        ret = fill_ring_data(currd, ent_hnd);
+        guest_handle_add_offset(ent_hnd, 1);
+    }
+
+ out:
+    read_unlock(&L1_global_argo_rwlock);
+
+    return ret;
+}
+
 static long
 sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
       const xen_argo_addr_t *dst_addr,
@@ -1733,6 +2070,21 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_OP_notify:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd =
+                   guest_handle_cast(arg1, xen_argo_ring_data_t);
+
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = notify(currd, ring_data_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 6290ed6..4fac597 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -41,4 +41,22 @@ static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
 }
 
 CHECK_argo_send_addr;
+CHECK_argo_ring_data_ent;
 CHECK_argo_iov;
+
+/*
+ * Disable sizeof type checking for the following struct checks because
+ * these structs have fields of types that differ in the compat vs non-compat
+ * structs with variable size which prevents the size check validation.
+ */
+
+#undef CHECK_FIELD_COMMON_
+#define CHECK_FIELD_COMMON_(k, name, n, f) \
+static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
+{ \
+    BUILD_BUG_ON(offsetof(k xen_ ## n, f) != \
+                 offsetof(k compat_ ## n, f)); \
+    return 1; \
+}
+
+CHECK_argo_ring_data;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index c12a50f..d2cb594 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
 /* Messages on the ring are padded to a multiple of this size. */
 #define XEN_ARGO_MSG_SLOT_SIZE 0x10
 
+/*
+ * Notify flags
+ */
+/* Ring is empty */
+#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
+/* Ring exists */
+#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
+/* Pending interrupt exists. Do not rely on this field - for profiling only */
+#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
+/* Sufficient space to queue space_required bytes exists */
+#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)
+/* Insufficient ring size for space_required bytes */
+#define XEN_ARGO_RING_DATA_F_EMSGSIZE    (1U << 4)
+/* Ring is shared, not unicast */
+#define XEN_ARGO_RING_DATA_F_SHARED      (1U << 5)
+
+typedef struct xen_argo_ring_data_ent
+{
+    xen_argo_addr_t ring;
+    uint16_t flags;
+    uint16_t pad;
+    uint32_t space_required;
+    uint32_t max_message_size;
+} xen_argo_ring_data_ent_t;
+
+typedef struct xen_argo_ring_data
+{
+    uint32_t nent;
+    uint32_t pad;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    xen_argo_ring_data_ent_t data[];
+#elif defined(__GNUC__)
+    xen_argo_ring_data_ent_t data[0];
+#endif
+} xen_argo_ring_data_t;
+
 struct xen_argo_ring_message_header
 {
     uint32_t len;
@@ -217,4 +253,35 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_OP_sendv               3
 
+/*
+ * XEN_ARGO_OP_notify
+ *
+ * Asks Xen for information about other rings in the system.
+ *
+ * ent->ring is the xen_argo_addr_t of the ring you want information on.
+ * Uses the same ring matching rules as XEN_ARGO_OP_sendv.
+ *
+ * ent->space_required : if this field is non-zero then Xen will check
+ * that there is space in the destination ring for this many bytes of payload.
+ * If the ring is too small for the requested space_required, it will set the
+ * XEN_ARGO_RING_DATA_F_EMSGSIZE flag on return.
+ * If sufficient space is available, it will set XEN_ARGO_RING_DATA_F_SUFFICIENT
+ * and CANCEL any pending notification for that ent->ring; otherwise it
+ * will schedule a notification event and the flag will not be set.
+ *
+ * These flags are set by Xen when notify replies:
+ * XEN_ARGO_RING_DATA_F_EMPTY      ring is empty
+ * XEN_ARGO_RING_DATA_F_PENDING    notify event is pending *don't rely on this*
+ * XEN_ARGO_RING_DATA_F_SUFFICIENT sufficient space for space_required is there
+ * XEN_ARGO_RING_DATA_F_EXISTS     ring exists
+ * XEN_ARGO_RING_DATA_F_EMSGSIZE   space_required too large for the ring size
+ * XEN_ARGO_RING_DATA_F_SHARED     ring is registered for wildcard partner
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_data_t) ring_data (may be NULL)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_OP_notify              4
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3723980..e45b60e 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -154,3 +154,5 @@
 ?	argo_unregister_ring		argo.h
 ?	argo_iov			argo.h
 ?	argo_send_addr			argo.h
+?	argo_ring_data_ent		argo.h
+?	argo_ring_data			argo.h
-- 
2.7.4



* [PATCH v4 11/14] xsm, argo: XSM control for argo register
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (9 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 10/14] argo: implement the notify op Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 12/14] xsm, argo: XSM control for argo message send operation Christopher Clark
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

XSM controls for argo ring registration with two distinct cases, where
the ring being registered is:

1) Single source:  registering a ring for communication to receive messages
                   from a specified single other domain.
   Default policy: allow.

2) Any source:     registering a ring for communication to receive messages
                   from any, or all, other domains (ie. wildcard).
   Default policy: deny, with runtime policy configuration via bootparam.

The existing argo-mac-permissive boot parameter indicates administrator
preference for either permissive or strict access control, which respectively
allows or denies registration of any-sender rings.

This commit modifies the signature of core XSM hook functions in order to
apply 'const' to arguments, needed in order for 'const' to be accepted in
signature of functions that invoke them.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 Daniel/Jan: add to the default xsm policy for the register op
v3 hoist opt_argo_mac_permissive check to allow default policy to match non-XSM
v3 was: Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
v3 Add Daniel's Acked-by
v3 feedback #07 Roger: use opt_argo_mac_permissive : a boolean opt
v2 feedback #9 Jan: refactor to use argo-mac bootparam at point of introduction
v1 feedback Paul: replace use of strncmp with strcmp
v1 feedback #16 Jan: apply const to function signatures
v1 feedback #14 Jan: add blank line before return in parse_argo_mac_param

 tools/flask/policy/modules/guest_features.te |  6 ++++++
 xen/common/argo.c                            | 15 +++++++++++----
 xen/include/xsm/dummy.h                      | 14 ++++++++++++++
 xen/include/xsm/xsm.h                        | 19 +++++++++++++++++++
 xen/xsm/dummy.c                              |  4 ++++
 xen/xsm/flask/hooks.c                        | 27 ++++++++++++++++++++++++---
 xen/xsm/flask/policy/access_vectors          | 11 +++++++++++
 xen/xsm/flask/policy/security_classes        |  1 +
 8 files changed, 90 insertions(+), 7 deletions(-)

diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index 9ac9780..d00769e 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -5,6 +5,12 @@ allow domain_type xen_t:xen tmem_op;
 # pmu_ctrl is for)
 allow domain_type xen_t:xen2 pmu_use;
 
+# Allow all domains:
+# to register single-sender (unicast) rings to partner with any domain; and
+# to register any-sender (wildcard) rings that can be sent to by any domain.
+allow domain_type xen_t:argo { register_any_source };
+allow domain_type domain_type:argo { register_single_source };
+
 # Allow guest console output to the serial console.  This is used by PV Linux
 # and stub domains for early boot output, so don't audit even when we deny it.
 # Without XSM, this is enabled only if the Xen was compiled in debug mode.
diff --git a/xen/common/argo.c b/xen/common/argo.c
index d4aff05..f748d8b 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -26,6 +26,7 @@
 #include <xen/nospec.h>
 #include <xen/sched.h>
 #include <xen/time.h>
+#include <xsm/xsm.h>
 
 #include <public/argo.h>
 
@@ -1584,11 +1585,10 @@ register_ring(struct domain *currd,
 
     if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
     {
-        if ( !opt_argo_mac_permissive )
-        {
-            ret = -EPERM;
+        ret = opt_argo_mac_permissive ? xsm_argo_register_any_source(currd) :
+                                        -EPERM;
+        if ( ret )
             goto out_unlock;
-        }
     }
     else
     {
@@ -1600,6 +1600,13 @@ register_ring(struct domain *currd,
             goto out_unlock;
         }
 
+        ret = xsm_argo_register_single_source(currd, dst_d);
+        if ( ret )
+        {
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
         if ( !dst_d->argo )
         {
             argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index a29d1ef..96118aa 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -720,6 +720,20 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 
 #endif /* CONFIG_X86 */
 
+#ifdef CONFIG_ARGO
+static XSM_INLINE int xsm_argo_register_single_source(struct domain *d,
+                                                      struct domain *t)
+{
+    return 0;
+}
+
+static XSM_INLINE int xsm_argo_register_any_source(struct domain *d)
+{
+    return 0;
+}
+
+#endif /* CONFIG_ARGO */
+
 #include <public/version.h>
 static XSM_INLINE int xsm_xen_version (XSM_DEFAULT_ARG uint32_t op)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3b192b5..e32a645 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -181,6 +181,11 @@ struct xsm_operations {
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
+#ifdef CONFIG_ARGO
+    int (*argo_register_single_source) (const struct domain *d,
+                                        const struct domain *t);
+    int (*argo_register_any_source) (const struct domain *d);
+#endif
 };
 
 #ifdef CONFIG_XSM
@@ -698,6 +703,20 @@ static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
     return xsm_ops->domain_resource_map(d);
 }
 
+#ifdef CONFIG_ARGO
+static inline int xsm_argo_register_single_source(const struct domain *d,
+                                                  const struct domain *t)
+{
+    return xsm_ops->argo_register_single_source(d, t);
+}
+
+static inline int xsm_argo_register_any_source(const struct domain *d)
+{
+    return xsm_ops->argo_register_any_source(d);
+}
+
+#endif /* CONFIG_ARGO */
+
 #endif /* XSM_NO_WRAPPERS */
 
 #ifdef CONFIG_MULTIBOOT
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 5701047..ed236b0 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -152,4 +152,8 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
+#ifdef CONFIG_ARGO
+    set_to_dummy_if_null(ops, argo_register_single_source);
+    set_to_dummy_if_null(ops, argo_register_any_source);
+#endif
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 96d31aa..fcb7487 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -36,13 +36,14 @@
 #include <objsec.h>
 #include <conditional.h>
 
-static u32 domain_sid(struct domain *dom)
+static u32 domain_sid(const struct domain *dom)
 {
     struct domain_security_struct *dsec = dom->ssid;
     return dsec->sid;
 }
 
-static u32 domain_target_sid(struct domain *src, struct domain *dst)
+static u32 domain_target_sid(const struct domain *src,
+                             const struct domain *dst)
 {
     struct domain_security_struct *ssec = src->ssid;
     struct domain_security_struct *dsec = dst->ssid;
@@ -58,7 +59,8 @@ static u32 evtchn_sid(const struct evtchn *chn)
     return chn->ssid.flask_sid;
 }
 
-static int domain_has_perm(struct domain *dom1, struct domain *dom2, 
+static int domain_has_perm(const struct domain *dom1,
+                           const struct domain *dom2,
                            u16 class, u32 perms)
 {
     u32 ssid, tsid;
@@ -1717,6 +1719,21 @@ static int flask_domain_resource_map(struct domain *d)
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__RESOURCE_MAP);
 }
 
+#ifdef CONFIG_ARGO
+static int flask_argo_register_single_source(const struct domain *d,
+                                             const struct domain *t)
+{
+    return domain_has_perm(d, t, SECCLASS_ARGO,
+                           ARGO__REGISTER_SINGLE_SOURCE);
+}
+
+static int flask_argo_register_any_source(const struct domain *d)
+{
+    return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
+                        ARGO__REGISTER_ANY_SOURCE, NULL);
+}
+#endif
+
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 int compat_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 
@@ -1851,6 +1868,10 @@ static struct xsm_operations flask_ops = {
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
+#ifdef CONFIG_ARGO
+    .argo_register_single_source = flask_argo_register_single_source,
+    .argo_register_any_source = flask_argo_register_any_source,
+#endif
 };
 
 void __init flask_init(const void *policy_buffer, size_t policy_size)
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 6fecfda..fb95c97 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -531,3 +531,14 @@ class version
 # Xen build id
     xen_build_id
 }
+
+# Class argo is used to describe the Argo interdomain communication system.
+class argo
+{
+    # Domain requesting registration of a communication ring
+    # to receive messages from a specific other domain.
+    register_single_source
+    # Domain requesting registration of a communication ring
+    # to receive messages from any other domain.
+    register_any_source
+}
diff --git a/xen/xsm/flask/policy/security_classes b/xen/xsm/flask/policy/security_classes
index cde4e1a..50ecbab 100644
--- a/xen/xsm/flask/policy/security_classes
+++ b/xen/xsm/flask/policy/security_classes
@@ -19,5 +19,6 @@ class event
 class grant
 class security
 class version
+class argo
 
 # FLASK
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v4 12/14] xsm, argo: XSM control for argo message send operation
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (10 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 11/14] xsm, argo: XSM control for argo register Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 13/14] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne


Default policy: allow.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 Daniel/Jan: add to the default xsm policy for the send op
v3 Add Daniel's Acked-by
v2: reordered commit sequence to after sendv implementation
v1 feedback Jan #16: apply const to function signatures
v1 version was: Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

 tools/flask/policy/modules/guest_features.te | 7 ++++---
 xen/common/argo.c                            | 8 ++++++++
 xen/include/xsm/dummy.h                      | 6 ++++++
 xen/include/xsm/xsm.h                        | 6 ++++++
 xen/xsm/dummy.c                              | 1 +
 xen/xsm/flask/hooks.c                        | 7 +++++++
 xen/xsm/flask/policy/access_vectors          | 2 ++
 7 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index d00769e..ca52257 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -6,10 +6,11 @@ allow domain_type xen_t:xen tmem_op;
 allow domain_type xen_t:xen2 pmu_use;
 
 # Allow all domains:
-# to register single-sender (unicast) rings to partner with any domain; and
-# to register any-sender (wildcard) rings that can be sent to by any domain.
+# to register single-sender (unicast) rings to partner with any domain;
+# to register any-sender (wildcard) rings that can be sent to by any domain;
+# and send messages to rings.
 allow domain_type xen_t:argo { register_any_source };
-allow domain_type domain_type:argo { register_single_source };
+allow domain_type domain_type:argo { send register_single_source };
 
 # Allow guest console output to the serial console.  This is used by PV Linux
 # and stub domains for early boot output, so don't audit even when we deny it.
diff --git a/xen/common/argo.c b/xen/common/argo.c
index f748d8b..dadcb88 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1924,6 +1924,14 @@ sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
         goto out_unlock;
     }
 
+    ret = xsm_argo_send(src_d, dst_d);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR, "argo: XSM REJECTED %i -> %i\n",
+                src_addr->domain_id, dst_addr->domain_id);
+        goto out_unlock;
+    }
+
     read_lock(&dst_d->argo->rings_L2_rwlock);
 
     ring_info = find_ring_info_by_match(dst_d, dst_addr->aport,
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 96118aa..7daf1f0 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -732,6 +732,12 @@ static XSM_INLINE int xsm_argo_register_any_source(struct domain *d)
     return 0;
 }
 
+static XSM_INLINE int xsm_argo_send(const struct domain *d,
+                                    const struct domain *t)
+{
+    return 0;
+}
+
 #endif /* CONFIG_ARGO */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index e32a645..7c69efe 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -185,6 +185,7 @@ struct xsm_operations {
     int (*argo_register_single_source) (const struct domain *d,
                                         const struct domain *t);
     int (*argo_register_any_source) (const struct domain *d);
+    int (*argo_send) (const struct domain *d, const struct domain *t);
 #endif
 };
 
@@ -715,6 +716,11 @@ static inline int xsm_argo_register_any_source(const struct domain *d)
     return xsm_ops->argo_register_any_source(d);
 }
 
+static inline int xsm_argo_send(const struct domain *d, const struct domain *t)
+{
+    return xsm_ops->argo_send(d, t);
+}
+
 #endif /* CONFIG_ARGO */
 
 #endif /* XSM_NO_WRAPPERS */
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index ed236b0..ffac774 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,5 +155,6 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 #ifdef CONFIG_ARGO
     set_to_dummy_if_null(ops, argo_register_single_source);
     set_to_dummy_if_null(ops, argo_register_any_source);
+    set_to_dummy_if_null(ops, argo_send);
 #endif
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index fcb7487..76c012c 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1732,6 +1732,12 @@ static int flask_argo_register_any_source(const struct domain *d)
     return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
                         ARGO__REGISTER_ANY_SOURCE, NULL);
 }
+
+static int flask_argo_send(const struct domain *d, const struct domain *t)
+{
+    return domain_has_perm(d, t, SECCLASS_ARGO, ARGO__SEND);
+}
+
 #endif
 
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
@@ -1871,6 +1877,7 @@ static struct xsm_operations flask_ops = {
 #ifdef CONFIG_ARGO
     .argo_register_single_source = flask_argo_register_single_source,
     .argo_register_any_source = flask_argo_register_any_source,
+    .argo_send = flask_argo_send,
 #endif
 };
 
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index fb95c97..f6c5377 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -541,4 +541,6 @@ class argo
     # Domain requesting registration of a communication ring
     # to receive messages from any other domain.
     register_any_source
+    # Domain sending a message to another domain.
+    send
 }
-- 
2.7.4



* [PATCH v4 13/14] xsm, argo: XSM control for any access to argo by a domain
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (11 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 12/14] xsm, argo: XSM control for argo message send operation Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15  9:27 ` [PATCH v4 14/14] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
  2019-01-15 16:34 ` [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

If XSM denies the enable permission for a domain, initialization of the
domain's argo data structure is inhibited. This prevents the domain from
receiving any messages or notifications and from accessing any of the
argo hypercall operations.
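
For illustration only, and not part of this patch: a reduced model of the
two places where the enable check is applied. The sketch_ names and the
struct are hypothetical stand-ins for struct domain, xsm_argo_enable(),
argo_init() and do_argo_op(), and the -EACCES denial value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct domain. */
struct sketch_domain {
    void *argo;        /* NULL until argo state is initialized */
    bool xsm_denies;   /* models the policy decision for this domain */
};

/* Stand-in for xsm_argo_enable(): 0 allows; -EACCES is an assumed denial. */
static int sketch_xsm_enable(const struct sketch_domain *d)
{
    return d->xsm_denies ? -EACCES : 0;
}

/* Mirrors argo_init(): a denied domain is left without argo state. */
static int sketch_argo_init(struct sketch_domain *d, bool opt_enabled,
                            void *state)
{
    assert(d != NULL);

    if ( !opt_enabled || sketch_xsm_enable(d) )
        return 0;      /* not an error: argo is simply unavailable */

    d->argo = state;
    return 0;
}

/* Mirrors the entry check in do_argo_op(). */
static int sketch_do_argo_op(const struct sketch_domain *d, bool opt_enabled)
{
    assert(d != NULL);

    if ( !opt_enabled || sketch_xsm_enable(d) )
        return -EOPNOTSUPP;

    return 0;          /* the real code would dispatch on cmd here */
}
```

A denied domain therefore still completes domain creation, but it has no
argo state to receive into and every hypercall returns -EOPNOTSUPP.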

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 Daniel/Jan: add to the default xsm policy for enable
v3 Add Daniel's Acked-by
v3 #04 Jason/Roger: soft_reset: can assume reinit is ok if d->argo set
v2 self: fix xsm use in soft-reset prior to introduction
v1 #5 (#17) feedback Paul: XSM control for any access: use currd
v1 #16 feedback Jan: apply const to function signatures

 tools/flask/policy/modules/guest_features.te |  4 ++--
 xen/common/argo.c                            | 11 ++++++-----
 xen/include/xsm/dummy.h                      |  5 +++++
 xen/include/xsm/xsm.h                        |  6 ++++++
 xen/xsm/dummy.c                              |  1 +
 xen/xsm/flask/hooks.c                        |  7 +++++++
 xen/xsm/flask/policy/access_vectors          |  3 +++
 7 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index ca52257..fe4835d 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -5,11 +5,11 @@ allow domain_type xen_t:xen tmem_op;
 # pmu_ctrl is for)
 allow domain_type xen_t:xen2 pmu_use;
 
-# Allow all domains:
+# Allow all domains to enable the Argo interdomain communication hypercall;
 # to register single-sender (unicast) rings to partner with any domain;
 # to register any-sender (wildcard) rings that can be sent to by any domain;
 # and send messages to rings.
-allow domain_type xen_t:argo { register_any_source };
+allow domain_type xen_t:argo { enable register_any_source };
 allow domain_type domain_type:argo { send register_single_source };
 
 # Allow guest console output to the serial console.  This is used by PV Linux
diff --git a/xen/common/argo.c b/xen/common/argo.c
index dadcb88..23b61bf 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1986,7 +1986,7 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
     argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
                  (void *)arg1.p, (void *)arg2.p, arg3, arg4);
 
-    if ( unlikely(!opt_argo_enabled) )
+    if ( unlikely(!opt_argo_enabled || xsm_argo_enable(currd)) )
         return -EOPNOTSUPP;
 
     switch (cmd)
@@ -2133,7 +2133,7 @@ argo_init(struct domain *d)
 {
     struct argo_domain *argo;
 
-    if ( !opt_argo_enabled )
+    if ( !opt_argo_enabled || xsm_argo_enable(d) )
     {
         argo_dprintk("argo disabled, domid: %u\n", d->domain_id);
         return 0;
@@ -2189,9 +2189,10 @@ argo_soft_reset(struct domain *d)
         wildcard_rings_pending_remove(d);
 
         /*
-         * Since opt_argo_enabled cannot change at runtime, if d->argo is true
-         * then opt_argo_enabled must be true, and we can assume that init
-         * is allowed to proceed again here.
+         * Since neither opt_argo_enabled nor xsm_argo_enable(d) can change at
+         * runtime, if d->argo is set then opt_argo_enabled must be true and
+         * xsm_argo_enable(d) must have returned success, so we can assume
+         * that init is allowed to proceed again here.
          */
         argo_domain_init(d->argo);
     }
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 7daf1f0..56d7865 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -721,6 +721,11 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 #endif /* CONFIG_X86 */
 
 #ifdef CONFIG_ARGO
+static XSM_INLINE int xsm_argo_enable(struct domain *d)
+{
+    return 0;
+}
+
 static XSM_INLINE int xsm_argo_register_single_source(struct domain *d,
                                                       struct domain *t)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7c69efe..8daffae 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -182,6 +182,7 @@ struct xsm_operations {
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
 #ifdef CONFIG_ARGO
+    int (*argo_enable) (const struct domain *d);
     int (*argo_register_single_source) (const struct domain *d,
                                         const struct domain *t);
     int (*argo_register_any_source) (const struct domain *d);
@@ -705,6 +706,11 @@ static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
 }
 
 #ifdef CONFIG_ARGO
+static inline int xsm_argo_enable(const struct domain *d)
+{
+    return xsm_ops->argo_enable(d);
+}
+
 static inline int xsm_argo_register_single_source(const struct domain *d,
                                                   const struct domain *t)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index ffac774..1fe0e74 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -153,6 +153,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
 #ifdef CONFIG_ARGO
+    set_to_dummy_if_null(ops, argo_enable);
     set_to_dummy_if_null(ops, argo_register_single_source);
     set_to_dummy_if_null(ops, argo_register_any_source);
     set_to_dummy_if_null(ops, argo_send);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 76c012c..3d00c74 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1720,6 +1720,12 @@ static int flask_domain_resource_map(struct domain *d)
 }
 
 #ifdef CONFIG_ARGO
+static int flask_argo_enable(const struct domain *d)
+{
+    return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
+                        ARGO__ENABLE, NULL);
+}
+
 static int flask_argo_register_single_source(const struct domain *d,
                                              const struct domain *t)
 {
@@ -1875,6 +1881,7 @@ static struct xsm_operations flask_ops = {
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
 #ifdef CONFIG_ARGO
+    .argo_enable = flask_argo_enable,
     .argo_register_single_source = flask_argo_register_single_source,
     .argo_register_any_source = flask_argo_register_any_source,
     .argo_send = flask_argo_send,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index f6c5377..e00448b 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -535,6 +535,9 @@ class version
 # Class argo is used to describe the Argo interdomain communication system.
 class argo
 {
+    # Enable initialization of a domain's argo subsystem and
+    # permission to access the argo hypercall operations.
+    enable
     # Domain requesting registration of a communication ring
     # to receive messages from a specific other domain.
     register_single_source
-- 
2.7.4



* [PATCH v4 14/14] xsm, argo: notify: don't describe rings that cannot be sent to
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (12 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 13/14] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
@ 2019-01-15  9:27 ` Christopher Clark
  2019-01-15 16:34 ` [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
  14 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15  9:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 #10 Roger: drop out label, use return -EFAULT in fill_ring_data
v3: Add Daniel's Acked-by

 xen/common/argo.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 23b61bf..ced8d47 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1261,6 +1261,7 @@ fill_ring_data(const struct domain *currd,
     xen_argo_ring_data_ent_t ent;
     struct domain *dst_d;
     struct argo_ring_info *ring_info;
+    int ret;
 
     ASSERT(currd == current->domain);
     ASSERT(LOCKING_Read_L1);
@@ -1278,6 +1279,17 @@ fill_ring_data(const struct domain *currd,
     {
         if ( dst_d->argo )
         {
+            /*
+             * Don't supply information about rings that a guest is not
+             * allowed to send to.
+             */
+            ret = xsm_argo_send(currd, dst_d);
+            if ( ret )
+            {
+                put_domain(dst_d);
+                return ret;
+            }
+
             read_lock(&dst_d->argo->rings_L2_rwlock);
 
             ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
-- 
2.7.4



* Re: [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-15  9:27 ` [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
@ 2019-01-15 12:29   ` Roger Pau Monné
  2019-01-15 12:42     ` Jan Beulich
                       ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 12:29 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 01:27:36AM -0800, Christopher Clark wrote:
> Initialises basic data structures and performs teardown of argo state
> for domain shutdown.
> 
> Inclusion of the Argo implementation is dependent on CONFIG_ARGO.
> 
> Introduces a new Xen command line parameter 'argo': bool to enable/disable
> the argo hypercall. Defaults to disabled.
> 
> New headers:
>   public/argo.h: with definitions of addresses and ring structure, including
>   indexes for atomic update for communication between domain and hypervisor.
> 
>   xen/argo.h: to expose the hooks for integration into domain lifecycle:
>     argo_init: per-domain init of argo data structures for domain_create.
>     argo_destroy: teardown for domain_destroy and the error exit
>                   path of domain_create.
>     argo_soft_reset: reset of domain state for domain_soft_reset.
> 
> Adds two new fields to struct domain:
>     rwlock_t argo_lock;
>     struct argo_domain *argo;
> 
> In accordance with recent work on _domain_destroy, argo_destroy is
> idempotent. It will tear down: all rings registered by this domain, all
> rings where this domain is the single sender (ie. specified partner,
> non-wildcard rings), and all pending notifications where this domain is
> awaiting signal about available space in the rings of other domains.
> 
> A count will be maintained of the number of rings that a domain has
> registered in order to limit it below the fixed maximum limit defined here.
> 
> Macros are defined to verify the internal locking state within the argo
> implementation. The macros are ASSERTed on entry to functions to validate
> and document the required lock state prior to calling.
> 
> The software license on the public header is the BSD license, standard
> procedure for the public Xen headers. The public header was originally
> posted under a GPL license at: [1]:
> https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html
> 
> The following ACK by Lars Kurth is to confirm that only people being
> employees of Citrix contributed to the header files in the series posted at
> [1] and that thus the copyright of the files in question is fully owned by
> Citrix. The ACK also confirms that Citrix is happy for the header files to
> be published under a BSD license in this series (which is based on [1]).

I've got some minor comments, and a couple of questions, but I think
the code is looking good.

> diff --git a/xen/common/argo.c b/xen/common/argo.c
> index 6f782f7..1958fdc 100644
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
> @@ -16,8 +16,223 @@
>   * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
>   */
>  
> +#include <xen/argo.h>
> +#include <xen/domain.h>
> +#include <xen/domain_page.h>
>  #include <xen/errno.h>
> +#include <xen/event.h>
>  #include <xen/guest_access.h>
> +#include <xen/nospec.h>
> +#include <xen/sched.h>
> +#include <xen/time.h>
> +
> +#include <public/argo.h>
> +
> +DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
> +DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
> +
> +/* Xen command line option to enable argo */
> +static bool __read_mostly opt_argo_enabled;
> +boolean_param("argo", opt_argo_enabled);
> +
> +typedef struct argo_ring_id
> +{
> +    xen_argo_port_t aport;
> +    domid_t partner_id;
> +    domid_t domain_id;
> +} argo_ring_id;
> +
> +/* Data about a domain's own ring that it has registered */
> +struct argo_ring_info
> +{
> +    /* next node in the hash, protected by rings_L2 */
> +    struct hlist_node node;
> +    /* this ring's id, protected by rings_L2 */
> +    struct argo_ring_id id;
> +    /* L3, the ring_info lock: protects the members of this struct below */
> +    spinlock_t L3_lock;
> +    /* length of the ring, protected by L3 */
> +    unsigned int len;
> +    /* number of pages translated into mfns, protected by L3 */
> +    unsigned int nmfns;
> +    /* cached tx pointer location, protected by L3 */
> +    unsigned int tx_ptr;
> +    /* mapped ring pages protected by L3 */
> +    void **mfn_mapping;
> +    /* list of mfns of guest ring, protected by L3 */
> +    mfn_t *mfns;
> +    /* list of struct pending_ent for this ring, protected by L3 */
> +    struct hlist_head pending;
> +    /* number of pending entries queued for this ring, protected by L3 */
> +    unsigned int npending;
> +};
> +
> +/* Data about a single-sender ring, held by the sender (partner) domain */
> +struct argo_send_info
> +{
> +    /* next node in the hash, protected by send_L2 */
> +    struct hlist_node node;
> +    /* this ring's id, protected by send_L2 */
> +    struct argo_ring_id id;
> +};
> +
> +/* A space-available notification that is awaiting sufficient space */
> +struct pending_ent
> +{
> +    /* List node within argo_ring_info's pending list */
> +    struct hlist_node node;
> +    /*
> +     * List node within argo_domain's wildcard_pend_list. Only used if the
> +     * ring is one with a wildcard partner (ie. that any domain may send to)
> +     * to enable cancelling signals on wildcard rings on domain destroy.
> +     */
> +    struct hlist_node wildcard_node;
> +    /*
> +     * Pointer to the ring_info that this ent pertains to. Used to ensure that
> +     * ring_info->npending is decremented when ents for wildcard rings are
> +     * cancelled for domain destroy.
> +     * Caution: Must hold the correct locks before accessing ring_info via this.
> +     */
> +    struct argo_ring_info *ring_info;
> +    /* minimum ring space available that this signal is waiting upon */
> +    unsigned int len;
> +    /* domain to be notified when space is available */
> +    domid_t domain_id;
> +};
> +
> +/*
> + * The value of the argo element in a struct domain is
> + * protected by L1_global_argo_rwlock
> + */
> +#define ARGO_HTABLE_SIZE 32
> +struct argo_domain
> +{
> +    /* rings_L2 */
> +    rwlock_t rings_L2_rwlock;
> +    /*
> +     * Hash table of argo_ring_info about rings this domain has registered.
> +     * Protected by rings_L2.
> +     */
> +    struct hlist_head ring_hash[ARGO_HTABLE_SIZE];
> +    /* Counter of rings registered by this domain. Protected by rings_L2. */
> +    unsigned int ring_count;
> +
> +    /* send_L2 */
> +    spinlock_t send_L2_lock;

Other locks are rw locks, while this is a spinlock. I guess that's
because there aren't many concurrent read-only accesses to
send_hash?

> +    /*
> +     * Hash table of argo_send_info about rings other domains have registered
> +     * for this domain to send to. Single partner, non-wildcard rings.
> +     * Protected by send_L2.
> +     */
> +    struct hlist_head send_hash[ARGO_HTABLE_SIZE];
> +
> +    /* wildcard_L2 */
> +    spinlock_t wildcard_L2_lock;
> +    /*
> +     * List of pending space-available signals for this domain about wildcard
> +     * rings registered by other domains. Protected by wildcard_L2.
> +     */
> +    struct hlist_head wildcard_pend_list;
> +};
> +
> +/*
> + * Locking is organized as follows:
> + *
> + * Terminology: R(<lock>) means taking a read lock on the specified lock;
> + *              W(<lock>) means taking a write lock on it.
> + *
> + * == L1 : The global read/write lock: L1_global_argo_rwlock
> + * Protects the argo elements of all struct domain *d in the system.
> + * It does not protect any of the elements of d->argo, only their
> + * addresses.

But if you W(L1), you can basically modify anything, in all d->argo
structs, so it does seem to protect the elements of d->argo when
write-locked?

> + * By extension since the destruction of a domain with a non-NULL
> + * d->argo will need to free the d->argo pointer, holding W(L1)
> + * guarantees that no domains pointers that argo is interested in
> + * become invalid whilst this lock is held.

AFAICT holding W(L1) guarantees not only that the pointers don't
change, but also that there are no changes at all to any of the data
contained in d->argo.

> + */
> +
> +static DEFINE_RWLOCK(L1_global_argo_rwlock); /* L1 */
> +
> +/*
> + * == rings_L2 : The per-domain ring hash lock: d->argo->rings_L2_rwlock
> + *
> + * Holding a read lock on rings_L2 protects the ring hash table and
> + * the elements in the hash_table d->argo->ring_hash, and
> + * the node and id fields in struct argo_ring_info in the
> + * hash table.
> + * Holding a write lock on rings_L2 protects all of the elements of all the
> + * struct argo_ring_info belonging to this domain.
> + *
> + * To take rings_L2 you must already have R(L1). W(L1) implies W(rings_L2) and
> + * L3.
> + *
> + * == L3 : The individual ring_info lock: ring_info->L3_lock
> + *
> + * Protects all the fields within the argo_ring_info, aside from the ones that
> + * rings_L2 already protects: node, id, lock.
> + *
> + * To acquire L3 you must already have R(rings_L2). W(rings_L2) implies L3.
> + *
> + * == send_L2 : The per-domain single-sender partner rings lock:
> + *              d->argo->send_L2_lock
> + *
> + * Protects the per-domain send hash table : d->argo->send_hash
> + * and the elements in the hash table, and the node and id fields
> + * in struct argo_send_info in the hash table.
> + *
> + * To take send_L2, you must already have R(L1). W(L1) implies send_L2.
> + * Do not attempt to acquire a rings_L2 on any domain after taking and while
> + * holding a send_L2 lock -- acquire the rings_L2 (if one is needed) beforehand.
> + *
> + * == wildcard_L2 : The per-domain wildcard pending list lock:
> + *                  d->argo->wildcard_L2_lock
> + *
> + * Protects the per-domain list of outstanding signals for space availability
> + * on wildcard rings.
> + *
> + * To take wildcard_L2, you must already have R(L1). W(L1) implies wildcard_L2.
> + * No other locks are acquired after obtaining wildcard_L2.
> + */
> +
> +/*
> + * Lock state validations macros
> + *
> + * These macros encode the logic to verify that the locking has adhered to the
> + * locking discipline above.
> + * eg. On entry to logic that requires holding at least R(rings_L2), this:
> + *      ASSERT(LOCKING_Read_rings_L2(d));
> + *
> + * checks that the lock state is sufficient, validating that one of the
> + * following must be true when executed:       R(rings_L2) && R(L1)
> + *                                        or:  W(rings_L2) && R(L1)
> + *                                        or:  W(L1)
> + */
> +
> +/* RAW macros here are only used to assist defining the other macros below */
> +#define RAW_LOCKING_Read_L1 (rw_is_locked(&L1_global_argo_rwlock))

Not sure whether it's relevant or not, but this macro would return
true as long as the lock is taken, regardless of whether it's read or
write locked. If you want to make sure it's only read-locked then you
will have to use:

rw_is_locked(&L1_global_argo_rwlock) &&
!rw_is_write_locked(&L1_global_argo_rwlock)

AFAICT.
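
As a minimal sketch of that distinction (a toy lock model written for
this illustration, not Xen's rwlock_t; every toy_* name is
hypothetical), a read-only check has to exclude the write-locked state
explicitly:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock state for illustration only; this is not Xen's rwlock_t. */
struct toy_rwlock {
    unsigned int readers; /* number of read holders */
    bool writer;          /* true while write-locked */
};

/* Models the behaviour described above: true for read OR write holders. */
static bool toy_rw_is_locked(const struct toy_rwlock *l)
{
    return l->readers > 0 || l->writer;
}

static bool toy_rw_is_write_locked(const struct toy_rwlock *l)
{
    return l->writer;
}

/* True only when the lock is held for reading and not for writing. */
static bool toy_rw_is_read_locked_only(const struct toy_rwlock *l)
{
    return toy_rw_is_locked(l) && !toy_rw_is_write_locked(l);
}

static void toy_lock_demo(void)
{
    struct toy_rwlock w = { .readers = 0, .writer = true };

    /* A write-locked lock passes the plain "is locked" test... */
    assert(toy_rw_is_locked(&w));
    /* ...but fails the stricter read-only test. */
    assert(!toy_rw_is_read_locked_only(&w));
}
```

Whether the stricter form is wanted depends on the caller: for a "hold
at least a read lock" assertion, the looser test is sufficient.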

> +#define RAW_LOCKING_Read_rings_L2(d) \
> +    (rw_is_locked(&d->argo->rings_L2_rwlock) && RAW_LOCKING_Read_L1)
> +
> +/* The LOCKING macros defined below here are for use at verification points */
> +#define LOCKING_Write_L1 (rw_is_write_locked(&L1_global_argo_rwlock))
> +#define LOCKING_Read_L1 (RAW_LOCKING_Read_L1 || LOCKING_Write_L1)

You can drop the LOCKING_Write_L1 here, since with the current macros
RAW_LOCKING_Read_L1 will return true regardless of whether the lock is
read or write locked.

> +
> +#define LOCKING_Write_rings_L2(d) \
> +    ((RAW_LOCKING_Read_L1 && rw_is_write_locked(&d->argo->rings_L2_rwlock)) || \

For safety you need parentheses around d here:

rw_is_write_locked(&(d)->argo->rings_L2_rwlock)

And also in the macros below, same applies to r.
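
To illustrate why the parentheses matter, here is a small standalone
sketch. The toy_* types and the RING_COUNT macro are made up for this
example and are not part of the patch; the point is only that a macro
argument may be an expression rather than a plain identifier:

```c
#include <assert.h>

struct toy_argo { int ring_count; };
struct toy_domain { struct toy_argo *argo; };

/*
 * The unparenthesized form is shown only in this comment, because it
 * does not compile for the call made below:
 *     #define RING_COUNT_BAD(d) (d->argo->ring_count)
 * RING_COUNT_BAD(doms + 1) would expand to (doms + 1->argo->ring_count).
 */
#define RING_COUNT(d) ((d)->argo->ring_count)

static int toy_second_ring_count(struct toy_domain *doms)
{
    /* The macro argument is an expression, not a plain identifier. */
    return RING_COUNT(doms + 1);
}

static void toy_macro_demo(void)
{
    struct toy_argo a0 = { 3 }, a1 = { 7 };
    struct toy_domain doms[2] = { { &a0 }, { &a1 } };

    assert(toy_second_ring_count(doms) == 7);
}
```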

> +     LOCKING_Write_L1)
> +
> +#define LOCKING_Read_rings_L2(d) \
> +    ((RAW_LOCKING_Read_L1 && rw_is_locked(&d->argo->rings_L2_rwlock)) || \
> +     LOCKING_Write_rings_L2(d) || LOCKING_Write_L1)
> +
> +#define LOCKING_L3(d, r) \
> +    ((RAW_LOCKING_Read_rings_L2(d) && spin_is_locked(&r->L3_lock)) || \
> +     LOCKING_Write_rings_L2(d) || LOCKING_Write_L1)
> +
> +#define LOCKING_send_L2(d) \
> +    ((RAW_LOCKING_Read_L1 && spin_is_locked(&d->argo->send_L2_lock)) || \
> +     LOCKING_Write_L1)
>  
>  /* Change this to #define ARGO_DEBUG here to enable more debug messages */
>  #undef ARGO_DEBUG
> @@ -28,10 +243,365 @@
>  #define argo_dprintk(format, ... ) ((void)0)
>  #endif
>  
> +/* 
> + * FIXME for 4.12:
> + *  * Replace this hash function to get better distribution across buckets.
> + *  * Don't use casts in the replacement function.
> + *  * Drop the use of array_index_nospec.
> + */
> +/*
> + * This hash function is used to distribute rings within the per-domain
> + * hash tables (d->argo->ring_hash and d->argo_send_hash). The hash table
> + * will provide a struct if a match is found with a 'argo_ring_id' key:
> + * ie. the key is a (domain id, argo port, partner domain id) tuple.
> + * Since argo port number varies the most in expected use, and the Linux driver
> + * allocates at both the high and low ends, incorporate high and low bits to
> + * help with distribution.
> + * Apply array_index_nospec as a defensive measure since this operates
> + * on user-supplied input and the array size that it indexes into is known.
> + */
> +static unsigned int
> +hash_index(const struct argo_ring_id *id)
> +{
> +    unsigned int hash;
> +
> +    hash = (uint16_t)(id->aport >> 16);
> +    hash ^= (uint16_t)id->aport;
> +    hash ^= id->domain_id;
> +    hash ^= id->partner_id;
> +    hash &= (ARGO_HTABLE_SIZE - 1);
> +
> +    return array_index_nospec(hash, ARGO_HTABLE_SIZE);
> +}
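
For reference, a standalone sketch of the same XOR-fold, useful for
checking bucket distribution outside the hypervisor. The toy_ring_id
field types are assumptions (32-bit port, 16-bit domain ids), and
array_index_nospec is omitted because it is Xen-internal:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_HTABLE_SIZE 32u /* same value as ARGO_HTABLE_SIZE above */

/* Assumed layout for illustration; not copied from the patch. */
struct toy_ring_id {
    uint32_t aport;
    uint16_t domain_id;
    uint16_t partner_id;
};

static unsigned int toy_hash_index(const struct toy_ring_id *id)
{
    unsigned int hash;

    hash = (uint16_t)(id->aport >> 16); /* high half of the port */
    hash ^= (uint16_t)id->aport;        /* low half of the port */
    hash ^= id->domain_id;
    hash ^= id->partner_id;

    return hash & (TOY_HTABLE_SIZE - 1);
}

static void toy_hash_demo(void)
{
    struct toy_ring_id a = { 0x12345678, 1, 2 };
    struct toy_ring_id b = { 0x00010001, 0, 0 };
    struct toy_ring_id c = { 0, 0, 0 };

    assert(toy_hash_index(&a) == 15);
    /* Equal high and low port halves cancel out: b and c collide. */
    assert(toy_hash_index(&b) == toy_hash_index(&c));
}
```

The collision in the demo is the kind of weak distribution that the
FIXME above refers to.
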
> +
> +static struct argo_ring_info *
> +find_ring_info(const struct domain *d, const struct argo_ring_id *id)
> +{
> +    unsigned int ring_hash_index;
> +    struct hlist_node *node;
> +    struct argo_ring_info *ring_info;
> +
> +    ASSERT(LOCKING_Read_rings_L2(d));
> +
> +    ring_hash_index = hash_index(id);
> +
> +    argo_dprintk("d->argo=%p, d->argo->ring_hash[%u]=%p id=%p\n",
> +                 d->argo, ring_hash_index,
> +                 d->argo->ring_hash[ring_hash_index].first, id);
> +    argo_dprintk("id.aport=%x id.domain=vm%u id.partner_id=vm%u\n",
> +                 id->aport, id->domain_id, id->partner_id);
> +
> +    hlist_for_each_entry(ring_info, node, &d->argo->ring_hash[ring_hash_index],
> +                         node)
> +    {
> +        const struct argo_ring_id *cmpid = &ring_info->id;
> +
> +        if ( cmpid->aport == id->aport &&
> +             cmpid->domain_id == id->domain_id &&
> +             cmpid->partner_id == id->partner_id )
> +        {
> +            argo_dprintk("ring_info=%p\n", ring_info);
> +            return ring_info;
> +        }
> +    }
> +    argo_dprintk("no ring_info found\n");
> +
> +    return NULL;
> +}
> +
> +static void
> +ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
> +{
> +    unsigned int i;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    if ( !ring_info->mfn_mapping )
> +        return;
> +
> +    for ( i = 0; i < ring_info->nmfns; i++ )
> +    {
> +        if ( !ring_info->mfn_mapping[i] )
> +            continue;
> +        if ( ring_info->mfns )
> +            argo_dprintk(XENLOG_ERR "argo: unmapping page %"PRI_mfn" from %p\n",
> +                         mfn_x(ring_info->mfns[i]),
> +                         ring_info->mfn_mapping[i]);

Is it actually possible to have a mapped page without a matching mfn
stored in the mfns array? That would imply there's no reference
taken on such a mapped page, which could be dangerous. I think you
might want to add an ASSERT(ring_info->mfns) instead of the current
if condition?

(Maybe I'm missing something here).

>  long
>  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
>             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
>             unsigned long arg4)
>  {
> -    return -ENOSYS;
> +    long rc = -EFAULT;
> +
> +    argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
> +                 (void *)arg1.p, (void *)arg2.p, arg3, arg4);
> +
> +    if ( unlikely(!opt_argo_enabled) )
> +        return -EOPNOTSUPP;

I think this should return -ENOSYS: a hypervisor built with
CONFIG_ARGO but without argo enabled on the command line shouldn't
behave differently from a hypervisor built without CONFIG_ARGO.

> +
> +    switch (cmd)
> +    {
> +    default:
> +        rc = -EOPNOTSUPP;
> +        break;
> +    }
> +
> +    argo_dprintk("<-do_argo_op(%u)=%ld\n", cmd, rc);
> +
> +    return rc;
> +}
> +
> +static void
> +argo_domain_init(struct argo_domain *argo)
> +{
> +    unsigned int i;
> +
> +    rwlock_init(&argo->rings_L2_rwlock);
> +    spin_lock_init(&argo->send_L2_lock);
> +    spin_lock_init(&argo->wildcard_L2_lock);
> +    argo->ring_count = 0;

No need to set ring_count to 0: since you allocate the struct with
xzalloc, it's already zeroed.

In the argo_soft_reset case domain_rings_remove_all should have
already set ring_count to 0.
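
A minimal userspace sketch of the allocation point, using calloc() as
a stand-in for xzalloc (both return zero-filled memory; the
toy_argo_domain struct is a cut-down assumption, not the real
argo_domain):

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down stand-in for struct argo_domain, for illustration only. */
struct toy_argo_domain {
    unsigned int ring_count;
};

/* calloc() plays the role of xzalloc() here: memory comes back zeroed. */
static struct toy_argo_domain *toy_argo_alloc(void)
{
    return calloc(1, sizeof(struct toy_argo_domain));
}

static void toy_alloc_demo(void)
{
    struct toy_argo_domain *argo = toy_argo_alloc();

    assert(argo != NULL);
    /* No explicit "argo->ring_count = 0" was needed to get here. */
    assert(argo->ring_count == 0);
    free(argo);
}
```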

> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 4956a77..6e69afa 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -490,6 +490,11 @@ struct domain
>          unsigned int guest_request_enabled       : 1;
>          unsigned int guest_request_sync          : 1;
>      } monitor;
> +
> +#ifdef CONFIG_ARGO
> +    /* Argo interdomain communication support */
> +    struct argo_domain *argo;

I'm likely missing something, but argo_domain is declared in argo.c,
don't you need a forward declaration here for this to build?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-15 12:29   ` Roger Pau Monné
@ 2019-01-15 12:42     ` Jan Beulich
  2019-01-15 14:16       ` Roger Pau Monné
  2019-01-15 14:15     ` Ian Jackson
  2019-01-16  1:07     ` Christopher Clark
  2 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2019-01-15 12:42 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

>>> On 15.01.19 at 13:29, <roger.pau@citrix.com> wrote:
> On Tue, Jan 15, 2019 at 01:27:36AM -0800, Christopher Clark wrote:
>>  long
>>  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
>>             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
>>             unsigned long arg4)
>>  {
>> -    return -ENOSYS;
>> +    long rc = -EFAULT;
>> +
>> +    argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
>> +                 (void *)arg1.p, (void *)arg2.p, arg3, arg4);
>> +
>> +    if ( unlikely(!opt_argo_enabled) )
>> +        return -EOPNOTSUPP;
> 
> I think this should return -ENOSYS, an hypervisor built with
> CONFIG_ARGO but without argo enabled on the command line shouldn't
> behave differently than an hypervisor build without CONFIG_ARGO.

We've been there before, and there appears to be disagreement.
I support the use of -EOPNOTSUPP here.

>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -490,6 +490,11 @@ struct domain
>>          unsigned int guest_request_enabled       : 1;
>>          unsigned int guest_request_sync          : 1;
>>      } monitor;
>> +
>> +#ifdef CONFIG_ARGO
>> +    /* Argo interdomain communication support */
>> +    struct argo_domain *argo;
> 
> I'm likely missing something, but argo_domain is declared in argo.c,
> don't you need a forward declaration here for this to build?

That would be needed in C++ (iirc), but not in C, where such
forward declarations are needed solely when the type is used
as a function parameter, as otherwise that parameter's type
ends up having scope local to the declared function.
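
A small standalone sketch of the C behaviour described here, with
made-up toy_* names: the member declaration itself introduces the
struct tag at file scope, and the full definition can follow later or
live in a different file.

```c
#include <assert.h>
#include <stddef.h>

/* No declaration of struct toy_argo_domain exists before this point. */
struct toy_domain {
    int domain_id;
    struct toy_argo_domain *argo; /* pointer to a not-yet-defined type */
};

/* The definition may come later, or in another translation unit. */
struct toy_argo_domain {
    unsigned int ring_count;
};

static unsigned int toy_ring_count(const struct toy_domain *d)
{
    return d->argo ? d->argo->ring_count : 0;
}

static void toy_fwd_decl_demo(void)
{
    struct toy_argo_domain argo = { 4 };
    struct toy_domain d = { 1, &argo };

    assert(toy_ring_count(&d) == 4);
}
```

Only code that dereferences the pointer, such as toy_ring_count(),
needs to see the complete type.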

Jan




* Re: [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-15 12:29   ` Roger Pau Monné
  2019-01-15 12:42     ` Jan Beulich
@ 2019-01-15 14:15     ` Ian Jackson
  2019-01-16  1:07     ` Christopher Clark
  2 siblings, 0 replies; 42+ messages in thread
From: Ian Jackson @ 2019-01-15 14:15 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, Daniel Smith, Andrew Cooper,
	Jason Andryuk, Tim (Xen.org),
	George Dunlap, Rich Persaud, Christopher Clark, Julien Grall,
	Paul Durrant, Jan Beulich, xen-devel, James McKenzie,
	Eric Chanudet

Roger Pau Monne writes ("Re: [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt"):
> On Tue, Jan 15, 2019 at 01:27:36AM -0800, Christopher Clark wrote:
> > +    /* cached tx pointer location, protected by L3 */

I have not been following this in detail.  But I saw this go past and
I wanted to comment on the lock handling issue, but stepping back a
bit.

I applaud these detailed descriptions, found in these comments, of
what is protected by which lock and what the rules are.  Much
of the existing hypervisor code is much less explicit and this is a
recurrent source of bugs.  What I see here is much more like how
things ought to be.

We reviewers/maintainers/committers should be careful that this
attention is properly rewarded.  Under the circumstances I would
probably support a freeze exception, although of course the final
decision is with Juergen.

I'll leave the detailed commentary to others.

Regards,
Ian.


* Re: [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-15 12:42     ` Jan Beulich
@ 2019-01-15 14:16       ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 14:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

On Tue, Jan 15, 2019 at 05:42:16AM -0700, Jan Beulich wrote:
> >>> On 15.01.19 at 13:29, <roger.pau@citrix.com> wrote:
> > On Tue, Jan 15, 2019 at 01:27:36AM -0800, Christopher Clark wrote:
> >>  long
> >>  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> >>             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
> >>             unsigned long arg4)
> >>  {
> >> -    return -ENOSYS;
> >> +    long rc = -EFAULT;
> >> +
> >> +    argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
> >> +                 (void *)arg1.p, (void *)arg2.p, arg3, arg4);
> >> +
> >> +    if ( unlikely(!opt_argo_enabled) )
> >> +        return -EOPNOTSUPP;
> > 
> > I think this should return -ENOSYS, an hypervisor built with
> > CONFIG_ARGO but without argo enabled on the command line shouldn't
> > behave differently than an hypervisor build without CONFIG_ARGO.
> 
> We've been there before, and there appears to be disagreement.
> I support the use of -EOPNOTSUPP here.

I withdraw my comment then.

> >> --- a/xen/include/xen/sched.h
> >> +++ b/xen/include/xen/sched.h
> >> @@ -490,6 +490,11 @@ struct domain
> >>          unsigned int guest_request_enabled       : 1;
> >>          unsigned int guest_request_sync          : 1;
> >>      } monitor;
> >> +
> >> +#ifdef CONFIG_ARGO
> >> +    /* Argo interdomain communication support */
> >> +    struct argo_domain *argo;
> > 
> > I'm likely missing something, but argo_domain is declared in argo.c,
> > don't you need a forward declaration here for this to build?
> 
> That would be needed in C++ (iirc), but not in C, where such
> forward declarations are needed solely when the type is used
> as a function parameter, as otherwise that parameter's type
> ends up having scope local to the declared function.

Oh, OK, sorry for the noise.

Thanks, Roger.


* Re: [PATCH v4 07/14] argo: implement the register op
  2019-01-15  9:27 ` [PATCH v4 07/14] argo: implement the register op Christopher Clark
@ 2019-01-15 14:40   ` Roger Pau Monné
  2019-01-15 22:37     ` Christopher Clark
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 14:40 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 01:27:39AM -0800, Christopher Clark wrote:
> The register op is used by a domain to register a region of memory for
> receiving messages from either a specified other domain, or, if specifying a
> wildcard, any domain.
> 
> This operation creates a mapping within Xen's private address space that
> will remain resident for the lifetime of the ring. In subsequent commits,
> the hypervisor will use this mapping to copy data from a sending domain into
> this registered ring, making it accessible to the domain that registered the
> ring to receive data.
> 
> Wildcard any-sender rings are default disabled and registration will be
> refused with EPERM unless they have been specifically enabled with the
> argo-mac boot option introduced here. The reason why the default for
  ^ nit: argo-mac-permissive

> wildcard rings is 'deny' is that there is currently no means to protect the
> ring from DoS by a noisy domain spamming the ring, affecting other domains
> ability to send to it. This will be addressed with XSM policy controls in
> subsequent work.
> 
> Since denying access to any-sender rings is a significant functional
> constraint, a new bootparam is provided to enable overriding this:
>  "argo-mac" variable has allowed values: 'permissive' and 'enforcing'.
> Even though this is a boolean variable, use these descriptive strings in
> order to make it obvious to an administrator that this has potential
> security impact.
> 
> The p2m type of the memory supplied by the guest for the ring must be
> p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
> is registered.
> 
> xen_argo_gfn_t type is defined and is 64-bit on all architectures which
> assists with avoiding the need for compat code to translate hypercall args.
> This hypercall op and its interface currently only supports 4K-sized pages.
> 
> array_index_nospec is used to guard the result of the ring id hash function.
> This is out of an abundance of caution, since this is a very basic hash
> function and it operates upon values supplied by the guest just before
> being used as an array index.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
> 
> -This version contains FIXMEs for 4.12:
>  * find_ring_mfn: investigate using check_get_page_from_gfn()
>    and rewrite this function using it or with adopted logic
> 
>  * shrink critical sections: move acquire/release of the global lock.
>  * simplify the out label path when lock release has been moved.
> 
>  * - drop use of unsigned long type as hypercall args: not compat-friendly
>  * - drop UL suffix on XEN_ARGO_REGISTER_FLAG_MASK
>  * - guard XEN_ARGO_REGISTER_FLAG_MASK (perhaps framed by "#ifdef __XEN__")
>  * - define XEN_ARGO_REGISTER_FLAG_MASK in terms of other flags defined
> 
>  * register_ring: pull write_unlock up above the cleanup actions above
>    and add another label to aborb the two separate put_domain() calls on
>    the error paths.

Thanks. Would you agree to add a FIXME to look into using vmap to map
the ring pages into contiguous virtual address space, which would
simplify access to the rings? That would likely apply to the code in
ring_map_page, and IMO doesn't need to be done for 4.12; it can be
left for later if there are time constraints.

The rest LGTM.

Roger.


* Re: [PATCH v4 08/14] argo: implement the unregister op
  2019-01-15  9:27 ` [PATCH v4 08/14] argo: implement the unregister op Christopher Clark
@ 2019-01-15 15:03   ` Roger Pau Monné
  2019-01-17  6:40     ` Christopher Clark
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 15:03 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 01:27:40AM -0800, Christopher Clark wrote:
> Takes a single argument: a handle to the ring unregistration struct,
> which specifies the port and partner domain id or wildcard.
> 
> The ring's entry is removed from the hashtable of registered rings;
> any entries for pending notifications are removed; and the ring is
> unmapped from Xen's address space.
> 
> If the ring had been registered to communicate with a single specified
> domain (ie. a non-wildcard ring) then the partner domain state is removed
> from the partner domain's argo send_info hash table.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

Thanks, LGTM. I just have one question below.

> ---
> v3 #08 Jan: pull xfree out of exclusive critical sections in unregister_ring
> v3 #08 Jan: rename send_find_info to find_send_info
> v3 #07 Jan: rename ring_find_info to find_ring_info
> v3 #08 Roger: use return and remove the out label in unregister_ring
> v3 #08 Roger: better debug output in send_find_info
> v3 #10 Roger: move find functions to top of file and drop prototypes
> v3 #04 Jan: meld compat check for unregister_ring struct
> v3 #04 Roger/Jan: make lock names clearer and assert their state
> v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
> v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
> v3 feedback #07 Roger: const the argo_ring_id structs in send_find_info
> v2 feedback Jan: drop cookie, implement teardown
> v2 feedback Jan: drop message from argo_message_op
> v2 self: OVERHAUL
> v2 self: reorder logic to shorten critical section
> v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops
> v1 feedback Roger, Jan: drop argo prefix on static functions
> v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
> v1 #5 (#14) feedback Paul: use currd in do_argo_message_op
> v1 #5 (#14) feedback Paul: full use currd in argo_unregister_ring
> v1 #13 (#14) feedback Paul: replace do/while with goto; reindent
> v1 self: add blank lines in unregister case in do_argo_message_op
> v1: #13 feedback Jan: public namespace: prefix with xen
> v1: #13 feedback Jan: blank line after op case in do_argo_message_op
> v1: #14 feedback Jan: replace domain id override with validation
> v1: #18 feedback Jan: meld the ring count limit into the series
> v1: feedback #15 Jan: verify zero in unused hypercall args
> 
>  xen/common/argo.c         | 118 ++++++++++++++++++++++++++++++++++++++++++++++
>  xen/common/compat/argo.c  |   1 +
>  xen/include/public/argo.h |  19 ++++++++
>  xen/include/xlat.lst      |   1 +
>  4 files changed, 139 insertions(+)
> 
> diff --git a/xen/common/argo.c b/xen/common/argo.c
> index 076ee6c..3f95f80 100644
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
> @@ -43,6 +43,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
> +DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
>  
>  /* Xen command line option to enable argo */
>  static bool __read_mostly opt_argo_enabled;
> @@ -327,6 +328,33 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
>      return NULL;
>  }
>  
> +static struct argo_send_info *
> +find_send_info(const struct domain *d, const struct argo_ring_id *id)
> +{
> +    struct hlist_node *node;
> +    struct argo_send_info *send_info;
> +
> +    ASSERT(LOCKING_send_L2(d));
> +
> +    hlist_for_each_entry(send_info, node, &d->argo->send_hash[hash_index(id)],
> +                         node)
> +    {
> +        const struct argo_ring_id *cmpid = &send_info->id;
> +
> +        if ( cmpid->aport == id->aport &&
> +             cmpid->domain_id == id->domain_id &&
> +             cmpid->partner_id == id->partner_id )
> +        {
> +            argo_dprintk("send_info=%p\n", send_info);
> +            return send_info;
> +        }
> +    }
> +    argo_dprintk("no send_info found for ring(%u:%x %u)\n",
> +                 id->domain_id, id->aport, id->partner_id);
> +
> +    return NULL;
> +}
> +
>  static void
>  ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
>  {
> @@ -695,6 +723,81 @@ find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
>   * * simplify the out label path when lock release has been moved.
>   */
>  static long
> +unregister_ring(struct domain *currd,
> +                XEN_GUEST_HANDLE_PARAM(xen_argo_unregister_ring_t) unreg_hnd)
> +{
> +    xen_argo_unregister_ring_t unreg;
> +    struct argo_ring_id ring_id;
> +    struct argo_ring_info *ring_info;
> +    struct argo_send_info *send_info = NULL;
> +    struct domain *dst_d = NULL;
> +    int ret = 0;
> +
> +    ASSERT(currd == current->domain);
> +
> +    if ( copy_from_guest(&unreg, unreg_hnd, 1) )
> +        return -EFAULT;
> +
> +    if ( unreg.pad )
> +        return -EINVAL;
> +
> +    ring_id.partner_id = unreg.partner_id;
> +    ring_id.aport = unreg.aport;
> +    ring_id.domain_id = currd->domain_id;
> +
> +    read_lock(&L1_global_argo_rwlock);
> +
> +    if ( !currd->argo )
> +    {
> +        ret = -ENODEV;
> +        goto out_unlock;
> +    }
> +
> +    write_lock(&currd->argo->rings_L2_rwlock);
> +
> +    ring_info = find_ring_info(currd, &ring_id);
> +    if ( ring_info )
> +    {
> +        ring_remove_info(currd, ring_info);
> +        currd->argo->ring_count--;
> +    }
> +
> +    dst_d = get_domain_by_id(ring_id.partner_id);
> +    if ( dst_d )
> +    {
> +        if ( dst_d->argo )
> +        {
> +            spin_lock(&dst_d->argo->send_L2_lock);
> +
> +            send_info = find_send_info(dst_d, &ring_id);
> +            if ( send_info )
> +                hlist_del(&send_info->node);
> +
> +            spin_unlock(&dst_d->argo->send_L2_lock);
> +
> +        }
> +        put_domain(dst_d);
> +    }

Can you actually find a send_info if find_ring_info returns NULL?

The ring id in that send_info would then be stale, pointing to a
non-existent ring?

Thanks, Roger.


* Re: [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-15  9:27 ` [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
@ 2019-01-15 15:49   ` Roger Pau Monné
  2019-01-15 16:10     ` Jan Beulich
  2019-01-17  6:48     ` Christopher Clark
  0 siblings, 2 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 15:49 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 01:27:41AM -0800, Christopher Clark wrote:
> sendv operation is invoked to perform a synchronous send of buffers
> contained in iovs to a remote domain's registered ring.
> 
> It takes:
>  * A destination address (domid, port) for the ring to send to.
>    It performs a most-specific match lookup, to allow for wildcard.
>  * A source address, used to inform the destination of where to reply.
>  * The address of an array of iovs containing the data to send
>  * .. and the length of that array of iovs
>  * and a 32-bit message type, available to communicate message context
>    data (eg. kernel-to-kernel, separate from the application data).
> 
> If insufficient space exists in the destination ring, it will return
> -EAGAIN and Xen will notify the caller when sufficient space becomes
> available.
> 
> Accesses to the ring indices are appropriately atomic. The rings are
> mapped into Xen's private address space to write as needed and the
> mappings are retained for later use.
> 
> Fixed-size types are used in some areas within this code where caution
> around avoiding integer overflow is important.
> 
> Notifications are sent to guests via VIRQ and send_guest_global_virq is
> exposed in the change to enable argo to call it. VIRQ_ARGO_MESSAGE is
> claimed from the VIRQ previously reserved for this purpose (#11).
> 
> The VIRQ notification method is used rather than sending events using
> evtchn functions directly because:
> 
> * no current event channel type is an exact fit for the intended
>   behaviour. ECS_IPI is closest, but it disallows migration to
>   other VCPUs which is not necessarily a requirement for Argo.
> 
> * at the point of argo_init, allocation of an event channel is
>   complicated by none of the guest VCPUs being initialized yet
>   and the event channel logic expects that a valid event channel
>   has a present VCPU.
> 
> * at the point of signalling a notification, the VIRQ logic is already
>   defensive: if d->vcpu[0] is NULL, the notification is just silently
>   dropped, whereas the evtchn_send logic is not so defensive: vcpu[0]
>   must not be NULL, otherwise a null pointer dereference occurs.
> 
> Using a VIRQ removes the need for the guest to query to determine which
> event channel notifications will be delivered on. This is also likely to
> simplify establishing future L0/L1 nested hypervisor argo communication.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

LGTM, one question below and one comment.

> +static int
> +ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
> +               const struct argo_ring_id *src_id,
> +               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
> +               unsigned long niov, uint32_t message_type,
> +               unsigned long *out_len)
> +{
> +    xen_argo_ring_t ring;
> +    struct xen_argo_ring_message_header mh = { };
> +    int32_t sp;
> +    int32_t ret;
> +    uint32_t len = 0;
> +    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
> +    xen_argo_iov_t *piov;
> +    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
> +       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
> +    if ( ret )
> +        return ret;
> +
> +    /*
> +     * Obtain the total size of data to transmit -- sets the 'len' variable
> +     * -- and sanity check that the iovs conform to size and number limits.
> +     * Enforced below: no more than 'len' bytes of guest data
> +     * (plus the message header) will be sent in this operation.
> +     */
> +    ret = iov_count(iovs, niov, &len);
> +    if ( ret )
> +        return ret;
> +
> +    /*
> +     * Size bounds check against ring size and static maximum message limit.
> +     * The message must not fill the ring; there must be at least one slot
> +     * remaining so we can distinguish a full ring from an empty one.
> +     */

NB: I think if you didn't wrap the ring indexes (so always increasing
them) you could always identify an empty ring from a full ring, and
you wouldn't require always having at least one empty slot, unless I'm
missing something.
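
To make the point concrete, here is a minimal sketch of free-running
(never wrapped) indices. It is not the Argo ring layout: the struct, the
demo_ names and the power-of-two size are all assumptions made purely for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring, not the Argo layout; size must be a power of two. */
#define DEMO_RING_SIZE 8u

struct demo_ring {
    uint32_t prod;  /* total bytes ever written; never wrapped */
    uint32_t cons;  /* total bytes ever read; never wrapped */
    uint8_t data[DEMO_RING_SIZE];
};

/* Unsigned subtraction stays correct when the counters overflow. */
static uint32_t demo_ring_used(const struct demo_ring *r)
{
    uint32_t used = r->prod - r->cons;

    assert(used <= DEMO_RING_SIZE);
    return used;
}

static bool demo_ring_empty(const struct demo_ring *r)
{
    return demo_ring_used(r) == 0;
}

/* Full means every slot is in use; no slot has to be kept free. */
static bool demo_ring_full(const struct demo_ring *r)
{
    return demo_ring_used(r) == DEMO_RING_SIZE;
}

/* Returns false when there is no space; the index is masked only on access. */
static bool demo_ring_put(struct demo_ring *r, uint8_t byte)
{
    if ( demo_ring_full(r) )
        return false;

    r->data[r->prod & (DEMO_RING_SIZE - 1)] = byte;
    r->prod++;
    return true;
}

/* Returns false when the ring is empty. */
static bool demo_ring_get(struct demo_ring *r, uint8_t *byte)
{
    if ( demo_ring_empty(r) )
        return false;

    *byte = r->data[r->cons & (DEMO_RING_SIZE - 1)];
    r->cons++;
    return true;
}
```

With this scheme prod == cons always means empty, and prod - cons equal to
the ring size always means full, so the two states cannot be confused. The
cost is that both sides must agree to mask the index on every access.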

> +static int
> +pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
> +                domid_t src_id, unsigned int len)
> +{
> +    struct hlist_node *node;
> +    struct pending_ent *ent;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    hlist_for_each_entry(ent, node, &ring_info->pending, node)
> +    {
> +        if ( ent->domain_id == src_id )
> +        {
> +            /*
> +             * Reuse an existing queue entry for a notification rather than add
> +             * another. If the existing entry is waiting for a smaller size than
> +             * the current message then adjust the record to wait for the
> +             * current (larger) size to be available before triggering a
> +             * notification.
> +             * This assists the waiting sender by ensuring that whenever a
> +             * notification is triggered, there is sufficient space available
> +             * for (at least) any one of the messages awaiting transmission.
> +             */
> +            if ( ent->len < len )
> +                ent->len = len;

Nit:

ent->len = max(ent->len, len);

> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
> index b3f6491..b650aba 100644
> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -178,7 +178,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
>  #define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
>  #define VIRQ_PCPU_STATE 9  /* G. (DOM0) PCPU state changed                   */
>  #define VIRQ_MEM_EVENT  10 /* G. (DOM0) A memory event has occurred          */
> -#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient                     */
> +#define VIRQ_ARGO_MESSAGE 11 /* G. Argo interdomain message notification     */

Nit: VIRQ_ARGO would be enough IMO, since there are no other argo
related VIRQs.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-15 15:49   ` Roger Pau Monné
@ 2019-01-15 16:10     ` Jan Beulich
  2019-01-15 16:19       ` Roger Pau Monné
  2019-01-17  6:48     ` Christopher Clark
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2019-01-15 16:10 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

>>> On 15.01.19 at 16:49, <roger.pau@citrix.com> wrote:
> On Tue, Jan 15, 2019 at 01:27:41AM -0800, Christopher Clark wrote:
>> +static int
>> +pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
>> +                domid_t src_id, unsigned int len)
>> +{
>> +    struct hlist_node *node;
>> +    struct pending_ent *ent;
>> +
>> +    ASSERT(LOCKING_L3(d, ring_info));
>> +
>> +    hlist_for_each_entry(ent, node, &ring_info->pending, node)
>> +    {
>> +        if ( ent->domain_id == src_id )
>> +        {
>> +            /*
>> +             * Reuse an existing queue entry for a notification rather than add
>> +             * another. If the existing entry is waiting for a smaller size than
>> +             * the current message then adjust the record to wait for the
>> +             * current (larger) size to be available before triggering a
>> +             * notification.
>> +             * This assists the waiting sender by ensuring that whenever a
>> +             * notification is triggered, there is sufficient space available
>> +             * for (at least) any one of the messages awaiting transmission.
>> +             */
>> +            if ( ent->len < len )
>> +                ent->len = len;
> 
> Nit:
> 
> ent->len = max(ent->len, len);

I don't think use of max() should be a requirement in cases where
one of the items compared is also the value to update. I'm not
even convinced it helps readability of the sources, let alone the
quality of generated code.

Jan



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-15  9:27 ` [PATCH v4 10/14] argo: implement the notify op Christopher Clark
@ 2019-01-15 16:17   ` Roger Pau Monné
  2019-01-17  6:54     ` Christopher Clark
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 16:17 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> Queries for data about space availability in registered rings and
> causes notification to be sent when space has become available.
> 
> The hypercall op populates a supplied data structure with information about
> ring state, and if insufficient space is currently available in a given ring,
> the hypervisor will record the domain's expressed interest and notify it
> when it observes that space has become available.
> 
> Checks for free space occur when this notify op is invoked, so it may be
> intentionally invoked with no data structure to populate
> (ie. a NULL argument) to trigger such a check and consequent notifications.
> 
> Limit the maximum number of notify requests in a single operation to a
> simple fixed limit of 256.

LGTM, I have one comment on the public interface, the other comment is
purely cosmetic.

>  static int
> +fill_ring_data(const struct domain *currd,
> +               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
> +{
> +    xen_argo_ring_data_ent_t ent;
> +    struct domain *dst_d;
> +    struct argo_ring_info *ring_info;
> +
> +    ASSERT(currd == current->domain);
> +    ASSERT(LOCKING_Read_L1);
> +
> +    if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
> +        return -EFAULT;
> +
> +    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.aport=%x\n",
> +                 ent.ring.domain_id, ent.ring.aport);
> +
> +    ent.flags = 0;
> +
> +    dst_d = get_domain_by_id(ent.ring.domain_id);
> +    if ( dst_d )
> +    {
> +        if ( dst_d->argo )
> +        {
> +            read_lock(&dst_d->argo->rings_L2_rwlock);
> +
> +            ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
> +                                                currd->domain_id);
> +            if ( ring_info )
> +            {

Nit: there's a lot of nested conditions here, which push the
indentation to the right. It might be better from a readability PoV to
return early. For example:

if ( !dst_d || !dst_d->argo )
    goto out;

...

if ( !ring_info )
{
    read_unlock(&dst_d->argo->rings_L2_rwlock);
    goto out;
}

...

out:
if ( dst_d )
    put_domain(dst_d);

if ( copy_to_....

In order to prevent so much space wastage due to indentation.
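
A self-contained sketch of the same shape, in case it helps. The types and
the demo_ helpers below are hypothetical stand-ins, not the Xen API; the
counters exist only so that the cleanup on every path can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Xen types; none of this is the real API. */
struct demo_domain {
    int refcount;    /* models the reference taken by get_domain_by_id() */
    int has_argo;    /* models d->argo being non-NULL */
    int has_ring;    /* models find_ring_info_by_match() succeeding */
    int lock_depth;  /* models the rings_L2 read lock being held */
};

static struct demo_domain *demo_get_domain(struct demo_domain *d)
{
    if ( d )
        d->refcount++;
    return d;
}

static void demo_put_domain(struct demo_domain *d)
{
    assert(d->refcount > 0);
    d->refcount--;
}

/* Returns 1 if a ring was found, 0 otherwise. */
static int demo_fill_ring_data(struct demo_domain *candidate)
{
    int found = 0;
    struct demo_domain *dst_d = demo_get_domain(candidate);

    if ( !dst_d || !dst_d->has_argo )
        goto out;

    dst_d->lock_depth++;        /* stands in for read_lock() */

    if ( !dst_d->has_ring )
    {
        dst_d->lock_depth--;    /* stands in for read_unlock() */
        goto out;
    }

    found = 1;                  /* the main body stays at one indent level */
    dst_d->lock_depth--;

 out:
    if ( dst_d )
        demo_put_domain(dst_d);

    return found;
}
```

Every exit funnels through the single out label, so the reference is
dropped exactly once whichever check fails.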

> diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> index c12a50f..d2cb594 100644
> --- a/xen/include/public/argo.h
> +++ b/xen/include/public/argo.h
> @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
>  /* Messages on the ring are padded to a multiple of this size. */
>  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
>  
> +/*
> + * Notify flags
> + */
> +/* Ring is empty */
> +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> +/* Ring exists */
> +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
> +/* Sufficient space to queue space_required bytes exists */
> +#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)

I would reword this as:

"Sufficient space to queue space_required bytes might exist"

Because AFAICT as soon as the hypervisor drops the L3 lock the space
available might change, so the recipient of the notification or the
return from the hypercall shouldn't expect that there _must_ be
space_required available space on the ring.
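
A small model of the race, as a sketch only. The struct and the demo_
functions are hypothetical and do not correspond to the hypercall
interface; steal_after_query simply stands in for a competing sender that
gets in once the lock has been dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: 'space' may shrink between a query and a send. */
struct demo_ring_state {
    unsigned int space;              /* bytes currently free */
    unsigned int steal_after_query;  /* bytes a competing sender will take */
};

/* Models the notify query: the answer is only a snapshot. */
static bool demo_query_sufficient(struct demo_ring_state *r, unsigned int len)
{
    bool sufficient = r->space >= len;

    /* The lock is dropped here, so another sender may consume space. */
    if ( r->steal_after_query <= r->space )
        r->space -= r->steal_after_query;
    else
        r->space = 0;
    r->steal_after_query = 0;

    return sufficient;
}

/* Models the send: it rechecks the space itself and may still fail. */
static int demo_send(struct demo_ring_state *r, unsigned int len)
{
    assert(r != NULL);

    if ( r->space < len )
        return -EAGAIN;

    r->space -= len;
    return 0;
}

/* A caller treats the flag as a hint and keeps handling -EAGAIN. */
static int demo_try_send(struct demo_ring_state *r, unsigned int len)
{
    if ( !demo_query_sufficient(r, len) )
        return -EAGAIN;

    return demo_send(r, len);
}
```

In this model a sender that sees the flag set can still get -EAGAIN from
the send that follows, which is why the comment should say "might".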

Thanks, Roger.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-15 16:10     ` Jan Beulich
@ 2019-01-15 16:19       ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 16:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

On Tue, Jan 15, 2019 at 09:10:41AM -0700, Jan Beulich wrote:
> >>> On 15.01.19 at 16:49, <roger.pau@citrix.com> wrote:
> > On Tue, Jan 15, 2019 at 01:27:41AM -0800, Christopher Clark wrote:
> >> +static int
> >> +pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
> >> +                domid_t src_id, unsigned int len)
> >> +{
> >> +    struct hlist_node *node;
> >> +    struct pending_ent *ent;
> >> +
> >> +    ASSERT(LOCKING_L3(d, ring_info));
> >> +
> >> +    hlist_for_each_entry(ent, node, &ring_info->pending, node)
> >> +    {
> >> +        if ( ent->domain_id == src_id )
> >> +        {
> >> +            /*
> >> +             * Reuse an existing queue entry for a notification rather than add
> >> +             * another. If the existing entry is waiting for a smaller size than
> >> +             * the current message then adjust the record to wait for the
> >> +             * current (larger) size to be available before triggering a
> >> +             * notification.
> >> +             * This assists the waiting sender by ensuring that whenever a
> >> +             * notification is triggered, there is sufficient space available
> >> +             * for (at least) any one of the messages awaiting transmission.
> >> +             */
> >> +            if ( ent->len < len )
> >> +                ent->len = len;
> > 
> > Nit:
> > 
> > ent->len = max(ent->len, len);
> 
> I don't think use of max() should be a requirement in cases where
> one of the items compared is also the value to update. I'm not
> even convinced it helps readability of the sources, let alone the
> quality of generated code.

Then disregard the comment. It's likely I got used to this style and
find it easier to read.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication
  2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (13 preceding siblings ...)
  2019-01-15  9:27 ` [PATCH v4 14/14] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
@ 2019-01-15 16:34 ` Roger Pau Monné
  2019-01-15 22:39   ` Christopher Clark
  14 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-15 16:34 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Lars Kurth, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 01:27:32AM -0800, Christopher Clark wrote:
> Version four of this patch series.
> 
> * Changes are primarily addressing feedback from the v3 series reviews.
>   Many points noted on the individual commit posts.
> 
> * Register ring interfaces uses Xen gfns as page identifiers,
>   and the arguments no longer specify page granularity.
> 
> * Multi-level lock validation macros defined and applied.
>   Locks renamed to improve readability.
> 
> * Hypercall argument struct checking is folded inline into the series,
>   checks applied as types are introduced.
> 
> * argo-mac string boot parameter changed to argo-mac-permissive boolean
> 
> Feedback items that are remaining to be addressed have been noted with
> comments in the commit message and at the location in the code.

Thanks. I've made some comments on the patches, but overall this LGTM.
Thanks for improving the locking names and the comments.

I think my only request would be to add the usage of vmap in order to
map the rings in a FIXME, so it's not forgotten. That can likely be
done after 4.12 if there are time constraints (and maybe more
important issues to solve).

Thanks, Roger.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 07/14] argo: implement the register op
  2019-01-15 14:40   ` Roger Pau Monné
@ 2019-01-15 22:37     ` Christopher Clark
  0 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15 22:37 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 6:41 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Jan 15, 2019 at 01:27:39AM -0800, Christopher Clark wrote:
> > The register op is used by a domain to register a region of memory for
> > receiving messages from either a specified other domain, or, if specifying a
> > wildcard, any domain.
> >
> > This operation creates a mapping within Xen's private address space that
> > will remain resident for the lifetime of the ring. In subsequent commits,
> > the hypervisor will use this mapping to copy data from a sending domain into
> > this registered ring, making it accessible to the domain that registered the
> > ring to receive data.
> >
> > Wildcard any-sender rings are default disabled and registration will be
> > refused with EPERM unless they have been specifically enabled with the
> > argo-mac boot option introduced here. The reason why the default for
>   ^ nit: argo-mac-permissive

ack, thanks - fixed here and below.

>
> > wildcard rings is 'deny' is that there is currently no means to protect the
> > ring from DoS by a noisy domain spamming the ring, affecting other domains'
> > ability to send to it. This will be addressed with XSM policy controls in
> > subsequent work.
> >
> > Since denying access to any-sender rings is a significant functional
> > constraint, a new bootparam is provided to enable overriding this:
> >  "argo-mac" variable has allowed values: 'permissive' and 'enforcing'.
> > Even though this is a boolean variable, use these descriptive strings in
> > order to make it obvious to an administrator that this has potential
> > security impact.
> >
> > The p2m type of the memory supplied by the guest for the ring must be
> > p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
> > is registered.
> >
> > xen_argo_gfn_t type is defined and is 64-bit on all architectures which
> > assists with avoiding the need for compat code to translate hypercall args.
> > This hypercall op and its interface currently only supports 4K-sized pages.
> >
> > array_index_nospec is used to guard the result of the ring id hash function.
> > This is out of an abundance of caution, since this is a very basic hash
> > function and it operates upon values supplied by the guest just before
> > being used as an array index.
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
> >
> > -This version contains FIXMEs for 4.12:
> >  * find_ring_mfn: investigate using check_get_page_from_gfn()
> >    and rewrite this function using it or with adopted logic
> >
> >  * shrink critical sections: move acquire/release of the global lock.
> >  * simplify the out label path when lock release has been moved.
> >
> >  * - drop use of unsigned long type as hypercall args: not compat-friendly
> >  * - drop UL suffix on XEN_ARGO_REGISTER_FLAG_MASK
> >  * - guard XEN_ARGO_REGISTER_FLAG_MASK (perhaps framed by "#ifdef __XEN__")
> >  * - define XEN_ARGO_REGISTER_FLAG_MASK in terms of other flags defined
> >
> >  * register_ring: pull write_unlock up above the cleanup actions above
> >    and add another label to absorb the two separate put_domain() calls on
> >    the error paths.
>
> Thanks, would you agree to add a FIXME to look into using vmap in
> order to map the ring pages into contiguous virtual address space in
> order to simplify access to the rings? That would likely apply to the
> code in ring_map_page, and IMO doesn't need to be done for 4.12, can
> be left for later if there are time constraints.

Ack - agreed, done.

> The rest LGTM.

thanks!

Christopher

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication
  2019-01-15 16:34 ` [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
@ 2019-01-15 22:39   ` Christopher Clark
  0 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-15 22:39 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Lars Kurth, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 8:34 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Jan 15, 2019 at 01:27:32AM -0800, Christopher Clark wrote:
> > Version four of this patch series.
> >
> > * Changes are primarily addressing feedback from the v3 series reviews.
> >   Many points noted on the individual commit posts.
> >
> > * Register ring interfaces uses Xen gfns as page identifiers,
> >   and the arguments no longer specify page granularity.
> >
> > * Multi-level lock validation macros defined and applied.
> >   Locks renamed to improve readability.
> >
> > * Hypercall argument struct checking is folded inline into the series,
> >   checks applied as types are introduced.
> >
> > * argo-mac string boot parameter changed to argo-mac-permissive boolean
> >
> > Feedback items that are remaining to be addressed have been noted with
> > comments in the commit message and at the location in the code.
>
> Thanks. I've made some comments on the patches, but overall this LGTM.
> Thanks for improving the locking names and the comments.

You're welcome - thanks for the detailed reviews and feedback.

> I think my only request would be to add the usage of vmap in order to
> map the rings in a FIXME, so it's not forgotten. That can likely be
> done after 4.12 if there are time constraints (and maybe more
> important issues to solve).

Ack, I've added this and it will be included in the next posting.

thanks,

Christopher

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-15 12:29   ` Roger Pau Monné
  2019-01-15 12:42     ` Jan Beulich
  2019-01-15 14:15     ` Ian Jackson
@ 2019-01-16  1:07     ` Christopher Clark
  2 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-16  1:07 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 4:29 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Jan 15, 2019 at 01:27:36AM -0800, Christopher Clark wrote:
> > Initialises basic data structures and performs teardown of argo state
> > for domain shutdown.

> > +
> > +/*
> > + * The value of the argo element in a struct domain is
> > + * protected by L1_global_argo_rwlock
> > + */
> > +#define ARGO_HTABLE_SIZE 32
> > +struct argo_domain
> > +{
> > +    /* rings_L2 */
> > +    rwlock_t rings_L2_rwlock;
> > +    /*
> > +     * Hash table of argo_ring_info about rings this domain has registered.
> > +     * Protected by rings_L2.
> > +     */
> > +    struct hlist_head ring_hash[ARGO_HTABLE_SIZE];
> > +    /* Counter of rings registered by this domain. Protected by rings_L2. */
> > +    unsigned int ring_count;
> > +
> > +    /* send_L2 */
> > +    spinlock_t send_L2_lock;
>
> Other locks are rw locks, while this is a spinlock, I guess that's
> because there aren't many concurrent read-only accesses to
> send_hash?

Yes, that's correct. The only places that need to take this lock need
to take it exclusively, for updating the protected data structure, so
there's no call for or benefit to using a rw lock for this one.

>
> > +    /*
> > +     * Hash table of argo_send_info about rings other domains have registered
> > +     * for this domain to send to. Single partner, non-wildcard rings.
> > +     * Protected by send_L2.
> > +     */
> > +    struct hlist_head send_hash[ARGO_HTABLE_SIZE];
> > +
> > +    /* wildcard_L2 */
> > +    spinlock_t wildcard_L2_lock;
> > +    /*
> > +     * List of pending space-available signals for this domain about wildcard
> > +     * rings registered by other domains. Protected by wildcard_L2.
> > +     */
> > +    struct hlist_head wildcard_pend_list;
> > +};
> > +
> > +/*
> > + * Locking is organized as follows:
> > + *
> > + * Terminology: R(<lock>) means taking a read lock on the specified lock;
> > + *              W(<lock>) means taking a write lock on it.
> > + *
> > + * == L1 : The global read/write lock: L1_global_argo_rwlock
> > + * Protects the argo elements of all struct domain *d in the system.
> > + * It does not protect any of the elements of d->argo, only their
> > + * addresses.
>
> But if you W(L1), you can basically modify anything, in all d->argo
> structs, so it does seem to protect the elements of d->argo when
> write-locked?

ack, that is correct and this comment isn't clear enough about what it
is supposed to say, so I've just rewritten it. Pasting the new version
of the comment about the L1 lock here:

 * == L1 : The global read/write lock: L1_global_argo_rwlock
 * Protects the argo elements of all struct domain *d in the system.
 *
 * R(L1) does not protect any of the elements of d->argo; it protects their
 * addresses. W(L1) protects those and more since it implies W on all the lower
 * level locks - see the notes on those locks below.
 *
 * The destruction of an argo-enabled domain, which must have a non-NULL d->argo
 * pointer, will need to free that d->argo pointer, which requires W(L1).
 * Since holding R(L1) will block acquiring W(L1), it will ensure that
 * no domain pointers that argo is interested in become invalid while either
 * W(L1) or R(L1) are held.

>
> > + * By extension since the destruction of a domain with a non-NULL
> > + * d->argo will need to free the d->argo pointer, holding W(L1)
> > + * guarantees that no domains pointers that argo is interested in
> > + * become invalid whilst this lock is held.
>
> AFAICT holding W(L1) guarantees not only that pointers don't change,
> but that there are no changes at all in any of the d->argo contained
> data.

Ack. In the terminology used in the comments: W(L1) implies
W(rings_L2), W(send_L2) and W(wildcard_L2) on all domains, and implies
L3 on all rings. ie. holding W(L1) grants the same access as if you
held all of those lower level locks, which is effectively to all Argo
data structures.

> > +/*
> > + * Lock state validations macros
> > + *
> > + * These macros encode the logic to verify that the locking has adhered to the
> > + * locking discipline above.
> > + * eg. On entry to logic that requires holding at least R(rings_L2), this:
> > + *      ASSERT(LOCKING_Read_rings_L2(d));
> > + *
> > + * checks that the lock state is sufficient, validating that one of the
> > + * following must be true when executed:       R(rings_L2) && R(L1)
> > + *                                        or:  W(rings_L2) && R(L1)
> > + *                                        or:  W(L1)
> > + */
> > +
> > +/* RAW macros here are only used to assist defining the other macros below */
> > +#define RAW_LOCKING_Read_L1 (rw_is_locked(&L1_global_argo_rwlock))
>
> Not sure whether it's relevant or not, but this macro would return
> true as long as the lock is taken, regardless of whether it's read or
> write locked. If you want to make sure it's only read-locked then you
> will have to use:
>
> rw_is_locked(&L1_global_argo_rwlock) &&
> !rw_is_write_locked(&L1_global_argo_rwlock)
>
> AFAICT.

Thanks - you're right, and in practice the macros don't need that
distinction about only-read locking, which is helpful...
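
To spell out the two checks being compared, here is a toy model. The enum
and the demo_ functions are hypothetical and only mirror the behaviour
described above; they are not Xen's rwlock implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock model; it only mirrors the behaviour described above. */
enum demo_rw_state { DEMO_UNLOCKED, DEMO_READ_LOCKED, DEMO_WRITE_LOCKED };

/* Like rw_is_locked as described: true for either kind of holder. */
static bool demo_rw_is_locked(enum demo_rw_state s)
{
    return s != DEMO_UNLOCKED;
}

static bool demo_rw_is_write_locked(enum demo_rw_state s)
{
    return s == DEMO_WRITE_LOCKED;
}

/* "At least read": passes for read-locked and write-locked alike. */
static bool demo_locking_read(enum demo_rw_state s)
{
    return demo_rw_is_locked(s);
}

/* The stricter "read-locked only" form suggested in the review. */
static bool demo_locking_read_only(enum demo_rw_state s)
{
    return demo_rw_is_locked(s) && !demo_rw_is_write_locked(s);
}

/* Invariant: the strict check never passes where the lenient one fails. */
static void demo_check_invariant(enum demo_rw_state s)
{
    assert(!demo_locking_read_only(s) || demo_locking_read(s));
}
```

The verification points only need the lenient form, since a write lock
grants everything a read lock does.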

> > +#define RAW_LOCKING_Read_rings_L2(d) \
> > +    (rw_is_locked(&d->argo->rings_L2_rwlock) && RAW_LOCKING_Read_L1)
> > +
> > +/* The LOCKING macros defined below here are for use at verification points */
> > +#define LOCKING_Write_L1 (rw_is_write_locked(&L1_global_argo_rwlock))
> > +#define LOCKING_Read_L1 (RAW_LOCKING_Read_L1 || LOCKING_Write_L1)
>
> You can drop the LOCKING_Write_L1 here, since with the current macros
> RAW_LOCKING_Read_L1 will return true regardless of whether the lock is
> read or write locked.
>
> > +
> > +#define LOCKING_Write_rings_L2(d) \
> > +    ((RAW_LOCKING_Read_L1 && rw_is_write_locked(&d->argo->rings_L2_rwlock)) || \
>
> For safety you need parentheses around d here:
>
> rw_is_write_locked(&(d)->argo->rings_L2_rwlock)
>
> And also in the macros below, same applies to r.

Ack to all the above and so I've rewritten the macros -- and handily,
the RAW ones are just not needed. New versions (hopefully my mail
client doesn't shred the pasting here; have reflowed to shorten lines
a bit):

/*
 * The LOCKING macros defined below here are for use at verification points.
 */
#define LOCKING_Write_L1 (rw_is_write_locked(&L1_global_argo_rwlock))
/*
 * While LOCKING_Read_L1 will return true even if the lock is write-locked,
 * that's OK because everywhere that a Read lock is needed with these
 * macros, holding a Write lock there instead is OK too: we're checking that
 * _at least_ the specified level of locks are held.
 */
#define LOCKING_Read_L1 (rw_is_locked(&L1_global_argo_rwlock))

#define LOCKING_Write_rings_L2(d) \
    ((LOCKING_Read_L1 && \
        rw_is_write_locked(&(d)->argo->rings_L2_rwlock)) || \
     LOCKING_Write_L1)
/*
 * Skip checking LOCKING_Write_rings_L2(d) within this LOCKING_Read_rings_L2
 * definition because the first clause that is testing R(L1) && R(L2) will
 * also return true if R(L1) && W(L2) is true, because of the way that
 * rw_is_locked behaves. This results in a slightly shorter and faster
 * implementation.
 */
#define LOCKING_Read_rings_L2(d) \
    ((LOCKING_Read_L1 && rw_is_locked(&(d)->argo->rings_L2_rwlock)) || \
     LOCKING_Write_L1)
/*
 * Skip checking LOCKING_Write_L1 within this LOCKING_L3 definition because
 * LOCKING_Write_rings_L2(d) will return true for that condition.
 */
#define LOCKING_L3(d, r) \
    ((LOCKING_Read_L1 && rw_is_locked(&(d)->argo->rings_L2_rwlock) \
      && spin_is_locked(&(r)->L3_lock)) || LOCKING_Write_rings_L2(d))

#define LOCKING_send_L2(d) \
    ((LOCKING_Read_L1 && spin_is_locked(&(d)->argo->send_L2_lock)) || \
     LOCKING_Write_L1)
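
On the parentheses around d and r, a minimal illustration of what they
guard against. The demo_ types and macros are hypothetical and much
simpler than the real ones; only the expansion behaviour matters here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct domain and d->argo. */
struct demo_argo { int rings_locked; };
struct demo_domain { struct demo_argo *argo; };

/*
 * Unparenthesised form, shown only in this comment:
 *     #define DEMO_LOCKED_UNSAFE(d) (d->argo->rings_locked)
 * DEMO_LOCKED_UNSAFE(doms + 1) would expand to
 *     (doms + 1->argo->rings_locked)
 * where '->' binds to the literal 1, so it does not compile.
 */

/* Parenthesised form: the whole argument is evaluated before '->'. */
#define DEMO_LOCKED_SAFE(d) ((d)->argo->rings_locked)

/* Pointer arithmetic as the macro argument. */
static int demo_second_is_locked(struct demo_domain *doms)
{
    assert(doms != NULL);
    return DEMO_LOCKED_SAFE(doms + 1);
}

/* A conditional expression as the macro argument. */
static int demo_pick_is_locked(struct demo_domain *doms, int pick)
{
    assert(doms != NULL);
    return DEMO_LOCKED_SAFE(pick ? &doms[1] : &doms[0]);
}
```

Callers of the LOCKING macros mostly pass a plain identifier today, but
the parentheses keep the macros correct if that ever changes.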

> >
> > +/*
> > + * FIXME for 4.12:
> > + *  * Replace this hash function to get better distribution across buckets.
> > + *  * Don't use casts in the replacement function.
> > + *  * Drop the use of array_index_nospec.
> > + */
> > +/*
> > + * This hash function is used to distribute rings within the per-domain
> > + * hash tables (d->argo->ring_hash and d->argo->send_hash). The hash table
> > + * will provide a struct if a match is found with an 'argo_ring_id' key:
> > + * ie. the key is a (domain id, argo port, partner domain id) tuple.
> > + * Since argo port number varies the most in expected use, and the Linux driver
> > + * allocates at both the high and low ends, incorporate high and low bits to
> > + * help with distribution.
> > + * Apply array_index_nospec as a defensive measure since this operates
> > + * on user-supplied input and the array size that it indexes into is known.
> > + */
> > +static unsigned int
> > +hash_index(const struct argo_ring_id *id)
> > +{
> > +    unsigned int hash;
> > +
> > +    hash = (uint16_t)(id->aport >> 16);
> > +    hash ^= (uint16_t)id->aport;
> > +    hash ^= id->domain_id;
> > +    hash ^= id->partner_id;
> > +    hash &= (ARGO_HTABLE_SIZE - 1);
> > +
> > +    return array_index_nospec(hash, ARGO_HTABLE_SIZE);
> > +}
> > +
> > +static struct argo_ring_info *
> > +find_ring_info(const struct domain *d, const struct argo_ring_id *id)
> > +{
> > +    unsigned int ring_hash_index;
> > +    struct hlist_node *node;
> > +    struct argo_ring_info *ring_info;
> > +
> > +    ASSERT(LOCKING_Read_rings_L2(d));
> > +
> > +    ring_hash_index = hash_index(id);
> > +
> > +    argo_dprintk("d->argo=%p, d->argo->ring_hash[%u]=%p id=%p\n",
> > +                 d->argo, ring_hash_index,
> > +                 d->argo->ring_hash[ring_hash_index].first, id);
> > +    argo_dprintk("id.aport=%x id.domain=vm%u id.partner_id=vm%u\n",
> > +                 id->aport, id->domain_id, id->partner_id);
> > +
> > +    hlist_for_each_entry(ring_info, node, &d->argo->ring_hash[ring_hash_index],
> > +                         node)
> > +    {
> > +        const struct argo_ring_id *cmpid = &ring_info->id;
> > +
> > +        if ( cmpid->aport == id->aport &&
> > +             cmpid->domain_id == id->domain_id &&
> > +             cmpid->partner_id == id->partner_id )
> > +        {
> > +            argo_dprintk("ring_info=%p\n", ring_info);
> > +            return ring_info;
> > +        }
> > +    }
> > +    argo_dprintk("no ring_info found\n");
> > +
> > +    return NULL;
> > +}
> > +
> > +static void
> > +ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
> > +{
> > +    unsigned int i;
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    if ( !ring_info->mfn_mapping )
> > +        return;
> > +
> > +    for ( i = 0; i < ring_info->nmfns; i++ )
> > +    {
> > +        if ( !ring_info->mfn_mapping[i] )
> > +            continue;
> > +        if ( ring_info->mfns )
> > +            argo_dprintk(XENLOG_ERR "argo: unmapping page %"PRI_mfn" from %p\n",
> > +                         mfn_x(ring_info->mfns[i]),
> > +                         ring_info->mfn_mapping[i]);
>
> Is it actually possible to have a mapped page without a matching mfn
> stored in the mfns array? That would imply there's no reference
> taken on such mapped page, which could be dangerous? I think you might
> want to add an ASSERT(ring_info->mfns) instead of the current if
> condition?
>
> (Maybe I'm missing something here).

I don't think you've missed anything - it looks right to me, so I've
dropped the if, made the printk unconditional and added two ASSERTs:
One is before the loop:
    ASSERT(!ring_info->nmfns || ring_info->mfns);

and another within the loop, after the continue:
        ASSERT(!mfn_eq(ring_info->mfns[i], INVALID_MFN));
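
For reference, the shape of the loop with both checks in place can be
modelled outside Xen as below. This is only a sketch: the struct, the
mfn type, the INVALID_MFN_MODEL constant and ring_unmap_model() are
stand-ins invented for illustration, assert() stands in for Xen's
ASSERT(), and the real unmap call is replaced by clearing the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Xen types; names are hypothetical, for illustration only. */
typedef uint64_t mfn_model_t;
#define INVALID_MFN_MODEL UINT64_MAX

struct ring_info_model {
    unsigned int nmfns;       /* number of frames backing the ring */
    mfn_model_t *mfns;        /* frame numbers, one per page */
    void **mfn_mapping;       /* per-page mapping, NULL if not mapped */
};

/* Unmaps every mapped page; returns how many pages were unmapped. */
static unsigned int ring_unmap_model(struct ring_info_model *ri)
{
    unsigned int i, unmapped = 0;

    if ( !ri->mfn_mapping )
        return 0;

    /* Before the loop: a ring with pages must have its mfns array. */
    assert(!ri->nmfns || ri->mfns);

    for ( i = 0; i < ri->nmfns; i++ )
    {
        if ( !ri->mfn_mapping[i] )
            continue;

        /* After the continue: a mapped page must have a valid frame. */
        assert(ri->mfns[i] != INVALID_MFN_MODEL);

        ri->mfn_mapping[i] = NULL;  /* the real code unmaps the page here */
        unmapped++;
    }

    return unmapped;
}
```

The first assertion is written so that an empty ring (nmfns == 0) with
no mfns array still passes.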

>
> >  long
> >  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> >             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
> >             unsigned long arg4)
> >  {
> > -    return -ENOSYS;
> > +    long rc = -EFAULT;
> > +
> > +    argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
> > +                 (void *)arg1.p, (void *)arg2.p, arg3, arg4);
> > +
> > +    if ( unlikely(!opt_argo_enabled) )
> > +        return -EOPNOTSUPP;
>
> I think this should return -ENOSYS: a hypervisor built with
> CONFIG_ARGO but without argo enabled on the command line shouldn't
> behave differently from a hypervisor built without CONFIG_ARGO.

I've left this unchanged as -EOPNOTSUPP after the later discussion in
this thread.

>
> > +
> > +    switch (cmd)
> > +    {
> > +    default:
> > +        rc = -EOPNOTSUPP;
> > +        break;
> > +    }
> > +
> > +    argo_dprintk("<-do_argo_op(%u)=%ld\n", cmd, rc);
> > +
> > +    return rc;
> > +}
> > +
> > +static void
> > +argo_domain_init(struct argo_domain *argo)
> > +{
> > +    unsigned int i;
> > +
> > +    rwlock_init(&argo->rings_L2_rwlock);
> > +    spin_lock_init(&argo->send_L2_lock);
> > +    spin_lock_init(&argo->wildcard_L2_lock);
> > +    argo->ring_count = 0;
>
> No need to set ring_count to 0, since you allocate the struct with
> xzalloc it's going to be zeroed already.
>
> In the argo_soft_reset case domain_rings_remove_all should have
> already set ring_count to 0.

ack, done, thanks

Christopher

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 08/14] argo: implement the unregister op
  2019-01-15 15:03   ` Roger Pau Monné
@ 2019-01-17  6:40     ` Christopher Clark
  0 siblings, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-17  6:40 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 7:07 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Jan 15, 2019 at 01:27:40AM -0800, Christopher Clark wrote:
> > Takes a single argument: a handle to the ring unregistration struct,
> > which specifies the port and partner domain id or wildcard.
> >
> > The ring's entry is removed from the hashtable of registered rings;
> > any entries for pending notifications are removed; and the ring is
> > unmapped from Xen's address space.
> >
> > If the ring had been registered to communicate with a single specified
> > domain (ie. a non-wildcard ring) then the partner domain state is removed
> > from the partner domain's argo send_info hash table.
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
>
> Thanks, LGTM. I just have one question below.
>
> > ---
> > v3 #08 Jan: pull xfree out of exclusive critical sections in unregister_ring
> > v3 #08 Jan: rename send_find_info to find_send_info
> > v3 #07 Jan: rename ring_find_info to find_ring_info
> > v3 #08 Roger: use return and remove the out label in unregister_ring
> > v3 #08 Roger: better debug output in send_find_info
> > v3 #10 Roger: move find functions to top of file and drop prototypes
> > v3 #04 Jan: meld compat check for unregister_ring struct
> > v3 #04 Roger/Jan: make lock names clearer and assert their state
> > v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
> > v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
> > v3 feedback #07 Roger: const the argo_ring_id structs in send_find_info
> > v2 feedback Jan: drop cookie, implement teardown
> > v2 feedback Jan: drop message from argo_message_op
> > v2 self: OVERHAUL
> > v2 self: reorder logic to shorten critical section
> > v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops
> > v1 feedback Roger, Jan: drop argo prefix on static functions
> > v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
> > v1 #5 (#14) feedback Paul: use currd in do_argo_message_op
> > v1 #5 (#14) feedback Paul: full use currd in argo_unregister_ring
> > v1 #13 (#14) feedback Paul: replace do/while with goto; reindent
> > v1 self: add blank lines in unregister case in do_argo_message_op
> > v1: #13 feedback Jan: public namespace: prefix with xen
> > v1: #13 feedback Jan: blank line after op case in do_argo_message_op
> > v1: #14 feedback Jan: replace domain id override with validation
> > v1: #18 feedback Jan: meld the ring count limit into the series
> > v1: feedback #15 Jan: verify zero in unused hypercall args
> >
> >  xen/common/argo.c         | 118 ++++++++++++++++++++++++++++++++++++++++++++++
> >  xen/common/compat/argo.c  |   1 +
> >  xen/include/public/argo.h |  19 ++++++++
> >  xen/include/xlat.lst      |   1 +
> >  4 files changed, 139 insertions(+)
> >
> > diff --git a/xen/common/argo.c b/xen/common/argo.c
> > index 076ee6c..3f95f80 100644
> > --- a/xen/common/argo.c
> > +++ b/xen/common/argo.c
> > @@ -43,6 +43,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
> >  DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
> >  DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
> >  DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
> > +DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
> >
> >  /* Xen command line option to enable argo */
> >  static bool __read_mostly opt_argo_enabled;
> > @@ -327,6 +328,33 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
> >      return NULL;
> >  }
> >
> > +static struct argo_send_info *
> > +find_send_info(const struct domain *d, const struct argo_ring_id *id)
> > +{
> > +    struct hlist_node *node;
> > +    struct argo_send_info *send_info;
> > +
> > +    ASSERT(LOCKING_send_L2(d));
> > +
> > +    hlist_for_each_entry(send_info, node, &d->argo->send_hash[hash_index(id)],
> > +                         node)
> > +    {
> > +        const struct argo_ring_id *cmpid = &send_info->id;
> > +
> > +        if ( cmpid->aport == id->aport &&
> > +             cmpid->domain_id == id->domain_id &&
> > +             cmpid->partner_id == id->partner_id )
> > +        {
> > +            argo_dprintk("send_info=%p\n", send_info);
> > +            return send_info;
> > +        }
> > +    }
> > +    argo_dprintk("no send_info found for ring(%u:%x %u)\n",
> > +                 id->domain_id, id->aport, id->partner_id);
> > +
> > +    return NULL;
> > +}
> > +
> >  static void
> >  ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
> >  {
> > @@ -695,6 +723,81 @@ find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
> >   * * simplify the out label path when lock release has been moved.
> >   */
> >  static long
> > +unregister_ring(struct domain *currd,
> > +                XEN_GUEST_HANDLE_PARAM(xen_argo_unregister_ring_t) unreg_hnd)
> > +{
> > +    xen_argo_unregister_ring_t unreg;
> > +    struct argo_ring_id ring_id;
> > +    struct argo_ring_info *ring_info;
> > +    struct argo_send_info *send_info = NULL;
> > +    struct domain *dst_d = NULL;
> > +    int ret = 0;
> > +
> > +    ASSERT(currd == current->domain);
> > +
> > +    if ( copy_from_guest(&unreg, unreg_hnd, 1) )
> > +        return -EFAULT;
> > +
> > +    if ( unreg.pad )
> > +        return -EINVAL;
> > +
> > +    ring_id.partner_id = unreg.partner_id;
> > +    ring_id.aport = unreg.aport;
> > +    ring_id.domain_id = currd->domain_id;
> > +
> > +    read_lock(&L1_global_argo_rwlock);
> > +
> > +    if ( !currd->argo )
> > +    {
> > +        ret = -ENODEV;
> > +        goto out_unlock;
> > +    }
> > +
> > +    write_lock(&currd->argo->rings_L2_rwlock);
> > +
> > +    ring_info = find_ring_info(currd, &ring_id);
> > +    if ( ring_info )
> > +    {
> > +        ring_remove_info(currd, ring_info);
> > +        currd->argo->ring_count--;
> > +    }
> > +
> > +    dst_d = get_domain_by_id(ring_id.partner_id);
> > +    if ( dst_d )
> > +    {
> > +        if ( dst_d->argo )
> > +        {
> > +            spin_lock(&dst_d->argo->send_L2_lock);
> > +
> > +            send_info = find_send_info(dst_d, &ring_id);
> > +            if ( send_info )
> > +                hlist_del(&send_info->node);
> > +
> > +            spin_unlock(&dst_d->argo->send_L2_lock);
> > +
> > +        }
> > +        put_domain(dst_d);
> > +    }
>
> Can you actually find send_info if ring_info returns NULL?
>
> The ringid in send_info would then be stale, and point to a
> non-existing ring?

Your observation is correct, and it means that the above logic can be
simplified a bit, so I have done so. It can also skip the send_info
lookup when unregistering a wildcard ring, as determined by the
partner_id.
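
As a rough sketch of that decision only (not the code in the next
version of the patch): the send_info lookup is needed only when a ring
was actually found and its partner is a specific domain. The names
plan_unregister(), struct unreg_plan and DOMID_ANY_MODEL are
hypothetical, and the wildcard value is a placeholder assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for the wildcard partner id; the value is an assumption. */
#define DOMID_ANY_MODEL 0x7FF4U

/* Which steps the unregister path needs to perform. */
struct unreg_plan {
    bool remove_ring;       /* remove ring_info and drop ring_count */
    bool lookup_send_info;  /* look in the partner's send_info table */
};

static struct unreg_plan plan_unregister(bool ring_found, uint16_t partner_id)
{
    struct unreg_plan plan = { false, false };

    /* No ring registered: no send_info can refer to it either. */
    if ( !ring_found )
        return plan;

    plan.remove_ring = true;

    /* A wildcard ring has no single partner holding send_info for it. */
    plan.lookup_send_info = (partner_id != DOMID_ANY_MODEL);

    return plan;
}
```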

Christopher


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-15 15:49   ` Roger Pau Monné
  2019-01-15 16:10     ` Jan Beulich
@ 2019-01-17  6:48     ` Christopher Clark
  2019-01-17 10:53       ` Roger Pau Monné
  1 sibling, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-17  6:48 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 7:49 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Jan 15, 2019 at 01:27:41AM -0800, Christopher Clark wrote:
> > sendv operation is invoked to perform a synchronous send of buffers
> > contained in iovs to a remote domain's registered ring.
> >
> > It takes:
> >  * A destination address (domid, port) for the ring to send to.
> >    It performs a most-specific match lookup, to allow for wildcard.
> >  * A source address, used to inform the destination of where to reply.
> >  * The address of an array of iovs containing the data to send
> >  * .. and the length of that array of iovs
> >  * and a 32-bit message type, available to communicate message context
> >    data (eg. kernel-to-kernel, separate from the application data).
> >
> > If insufficient space exists in the destination ring, it will return
> > -EAGAIN and Xen will notify the caller when sufficient space becomes
> > available.
> >
> > Accesses to the ring indices are appropriately atomic. The rings are
> > mapped into Xen's private address space to write as needed and the
> > mappings are retained for later use.
> >
> > Fixed-size types are used in some areas within this code where caution
> > around avoiding integer overflow is important.
> >
> > Notifications are sent to guests via VIRQ and send_guest_global_virq is
> > exposed in the change to enable argo to call it. VIRQ_ARGO_MESSAGE is
> > claimed from the VIRQ previously reserved for this purpose (#11).
> >
> > The VIRQ notification method is used rather than sending events using
> > evtchn functions directly because:
> >
> > * no current event channel type is an exact fit for the intended
> >   behaviour. ECS_IPI is closest, but it disallows migration to
> >   other VCPUs which is not necessarily a requirement for Argo.
> >
> > * at the point of argo_init, allocation of an event channel is
> >   complicated by none of the guest VCPUs being initialized yet
> >   and the event channel logic expects that a valid event channel
> >   has a present VCPU.
> >
> > * at the point of signalling a notification, the VIRQ logic is already
> >   defensive: if d->vcpu[0] is NULL, the notification is just silently
> >   dropped, whereas the evtchn_send logic is not so defensive: vcpu[0]
> >   must not be NULL, otherwise a null pointer dereference occurs.
> >
> > Using a VIRQ removes the need for the guest to query to determine which
> > event channel notifications will be delivered on. This is also likely to
> > simplify establishing future L0/L1 nested hypervisor argo communication.
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
>
> LGTM, one question below and one comment.
>
> > +static int
> > +ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
> > +               const struct argo_ring_id *src_id,
> > +               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
> > +               unsigned long niov, uint32_t message_type,
> > +               unsigned long *out_len)
> > +{
> > +    xen_argo_ring_t ring;
> > +    struct xen_argo_ring_message_header mh = { };
> > +    int32_t sp;
> > +    int32_t ret;
> > +    uint32_t len = 0;
> > +    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
> > +    xen_argo_iov_t *piov;
> > +    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
> > +       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
> > +    if ( ret )
> > +        return ret;
> > +
> > +    /*
> > +     * Obtain the total size of data to transmit -- sets the 'len' variable
> > +     * -- and sanity check that the iovs conform to size and number limits.
> > +     * Enforced below: no more than 'len' bytes of guest data
> > +     * (plus the message header) will be sent in this operation.
> > +     */
> > +    ret = iov_count(iovs, niov, &len);
> > +    if ( ret )
> > +        return ret;
> > +
> > +    /*
> > +     * Size bounds check against ring size and static maximum message limit.
> > +     * The message must not fill the ring; there must be at least one slot
> > +     * remaining so we can distinguish a full ring from an empty one.
> > +     */
>
> NB: I think if you didn't wrap the ring indexes (so always increasing
> them) you could always identify an empty ring from a full ring, and
> you wouldn't require always having at least one empty slot, unless I'm
> missing something.

I haven't yet worked this one through to be sure: the rx_ptr is under
guest control, so a guest might be able to force the tx_ptr to
increment at some high rate. But I can see that it might be possible,
yes.

> > diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
> > index b3f6491..b650aba 100644
> > --- a/xen/include/public/xen.h
> > +++ b/xen/include/public/xen.h
> > @@ -178,7 +178,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
> >  #define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
> >  #define VIRQ_PCPU_STATE 9  /* G. (DOM0) PCPU state changed                   */
> >  #define VIRQ_MEM_EVENT  10 /* G. (DOM0) A memory event has occurred          */
> > -#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient                     */
> > +#define VIRQ_ARGO_MESSAGE 11 /* G. Argo interdomain message notification     */
>
> Nit: VIRQ_ARGO would be enough IMO, since there are no other argo
> related VIRQs.

Thanks for catching that one - I'd meant to do that when Jan had
suggested dropping 'message' from the hypercall name (which is done)
but I missed this one. Now done.

thanks,

Christopher


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-15 16:17   ` Roger Pau Monné
@ 2019-01-17  6:54     ` Christopher Clark
  2019-01-17 11:12       ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-17  6:54 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > Queries for data about space availability in registered rings and
> > causes notification to be sent when space has become available.
> >
> > The hypercall op populates a supplied data structure with information about
> > ring state, and if insufficient space is currently available in a given ring,
> > the hypervisor will record the domain's expressed interest and notify it
> > when it observes that space has become available.
> >
> > Checks for free space occur when this notify op is invoked, so it may be
> > intentionally invoked with no data structure to populate
> > (ie. a NULL argument) to trigger such a check and consequent notifications.
> >
> > Limit the maximum number of notify requests in a single operation to a
> > simple fixed limit of 256.
>
> LGTM, I have one comment on the public interface, the other comment is
> purely cosmetic.
>
> >  static int
> > +fill_ring_data(const struct domain *currd,
> > +               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
> > +{
> > +    xen_argo_ring_data_ent_t ent;
> > +    struct domain *dst_d;
> > +    struct argo_ring_info *ring_info;
> > +
> > +    ASSERT(currd == current->domain);
> > +    ASSERT(LOCKING_Read_L1);
> > +
> > +    if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
> > +        return -EFAULT;
> > +
> > +    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.aport=%x\n",
> > +                 ent.ring.domain_id, ent.ring.aport);
> > +
> > +    ent.flags = 0;
> > +
> > +    dst_d = get_domain_by_id(ent.ring.domain_id);
> > +    if ( dst_d )
> > +    {
> > +        if ( dst_d->argo )
> > +        {
> > +            read_lock(&dst_d->argo->rings_L2_rwlock);
> > +
> > +            ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
> > +                                                currd->domain_id);
> > +            if ( ring_info )
> > +            {
>
> Nit: there's a lot of nested conditions here, which push the
> indentation to the right. It might be better from a readability PoV to
> return early. For example:
>
> if ( !dst_d || !dst_d->argo)
>     goto out;
>
> ...
>
> if ( !ring_info )
> {
>     read_unlock(&dst_d->argo->rings_L2_rwlock);
>     goto out;
> }
>
> ...
>
> out:
> if ( dst_d )
>     put_domain(dst_d);
>
> if ( copy_to_....
>
> In order to prevent so much space wastage due to indentation.

Thanks, yes - done.

>
> > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > index c12a50f..d2cb594 100644
> > --- a/xen/include/public/argo.h
> > +++ b/xen/include/public/argo.h
> > @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
> >  /* Messages on the ring are padded to a multiple of this size. */
> >  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
> >
> > +/*
> > + * Notify flags
> > + */
> > +/* Ring is empty */
> > +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> > +/* Ring exists */
> > +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> > +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> > +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
> > +/* Sufficient space to queue space_required bytes exists */
> > +#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)
>
> I would reword this as:
>
> "Sufficient space to queue space_required bytes might exists"
>
> Because AFAICT as soon as the hypervisor drops the L3 lock the space
> available might change, so the recipient of the notification or the
> return from the hypercall shouldn't expect that there _must_ be
> space_required available space on the ring.

ack. does this look ok? -:

+ * Sufficient space to queue space_required bytes has become available.
+ * If messages have been sent, it may not still be available.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-17  6:48     ` Christopher Clark
@ 2019-01-17 10:53       ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-17 10:53 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Wed, Jan 16, 2019 at 10:48:52PM -0800, Christopher Clark wrote:
> On Tue, Jan 15, 2019 at 7:49 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Tue, Jan 15, 2019 at 01:27:41AM -0800, Christopher Clark wrote:
> > > +    /*
> > > +     * Size bounds check against ring size and static maximum message limit.
> > > +     * The message must not fill the ring; there must be at least one slot
> > > +     * remaining so we can distinguish a full ring from an empty one.
> > > +     */
> >
> > NB: I think if you didn't wrap the ring indexes (so always increasing
> > them) you could always identify an empty ring from a full ring, and
> > you wouldn't require always having at least one empty slot, unless I'm
> > missing something.
> 
> I haven't yet worked this one through to be sure: the rx_ptr is under
> guest control, so a guest might be able to force the tx_ptr to
> increment at some high rate. But I can see that it might be possible,
> yes.

AFAICT this is part of the public protocol, so there's not much that can
be done to change it now I guess?

Shared memory rings used by most of the PV devices don't wrap the ring
indexes to the size of the ring, and the consumer is always chasing
the producer index. This avoids losing one slot. Anyway I thought it
was worth mentioning it for reference.
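
The scheme described above can be sketched as follows. This is a
generic illustration with made-up names (struct free_running_ring,
ring_put, ring_get, RING_SLOTS), not the Argo ring layout and not the
PV ring macros: the indexes only ever increase, the slot is selected
by masking, and the difference between the two indexes gives the fill
level, so empty (0) and full (RING_SLOTS) are distinct without
reserving a slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8U  /* must be a power of two for the masking below */

struct free_running_ring {
    uint32_t prod;           /* total slots ever produced, never wrapped */
    uint32_t cons;           /* total slots ever consumed, never wrapped */
    int slots[RING_SLOTS];
};

/* Unsigned subtraction stays correct when the 32-bit counters overflow. */
static uint32_t ring_used(const struct free_running_ring *r)
{
    return r->prod - r->cons;
}

static bool ring_empty(const struct free_running_ring *r)
{
    return ring_used(r) == 0;
}

static bool ring_full(const struct free_running_ring *r)
{
    return ring_used(r) == RING_SLOTS;
}

static bool ring_put(struct free_running_ring *r, int value)
{
    if ( ring_full(r) )
        return false;

    r->slots[r->prod & (RING_SLOTS - 1)] = value;
    r->prod++;

    return true;
}

static bool ring_get(struct free_running_ring *r, int *value)
{
    if ( ring_empty(r) )
        return false;

    *value = r->slots[r->cons & (RING_SLOTS - 1)];
    r->cons++;

    return true;
}
```

The sketch assumes both indexes are trusted. Where the consumer index
is written by a guest, the producer would additionally have to check
that prod - cons does not exceed RING_SLOTS before relying on it.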

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-17  6:54     ` Christopher Clark
@ 2019-01-17 11:12       ` Roger Pau Monné
  2019-01-17 12:04         ` Jan Beulich
  2019-01-17 21:44         ` Christopher Clark
  0 siblings, 2 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-17 11:12 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > index c12a50f..d2cb594 100644
> > > --- a/xen/include/public/argo.h
> > > +++ b/xen/include/public/argo.h
> > > @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
> > >  /* Messages on the ring are padded to a multiple of this size. */
> > >  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
> > >
> > > +/*
> > > + * Notify flags
> > > + */
> > > +/* Ring is empty */
> > > +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> > > +/* Ring exists */
> > > +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> > > +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> > > +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)

Regarding this flag, I've just noticed while looking at the code that
it doesn't seem to relate to interrupts?

From its usage in fill_ring_data I would write the following
description:

"Likely not enough space to queue a message of `space_required`
size."

And then XEN_ARGO_RING_DATA_F_PENDING is simply the opposite of
XEN_ARGO_RING_DATA_F_SUFFICIENT, at which point having only one of
those would be enough?

AFAICT you cannot get a xen_argo_ring_data_ent_t with both
XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
at the same time?

> > > +/* Sufficient space to queue space_required bytes exists */
> > > +#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)
> >
> > I would reword this as:
> >
> > "Sufficient space to queue space_required bytes might exists"
> >
> > Because AFAICT as soon as the hypervisor drops the L3 lock the space
> > available might change, so the recipient of the notification or the
> > return from the hypercall shouldn't expect that there _must_ be
> > space_required available space on the ring.
> 
> ack. does this look ok? -:
> 
> + * Sufficient space to queue space_required bytes has become available.
> + * If messages have been sent, it may not still be available.

I think my suggestion was shorter and clearer, but I'm not a native
speaker so if you think the above is better and no one else complains
that's fine for me.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-17 11:12       ` Roger Pau Monné
@ 2019-01-17 12:04         ` Jan Beulich
  2019-01-17 21:44         ` Christopher Clark
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2019-01-17 12:04 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

>>> On 17.01.19 at 12:12, <roger.pau@citrix.com> wrote:
> On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
>> On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>> > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
>> > > +/* Sufficient space to queue space_required bytes exists */
>> > > +#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)
>> >
>> > I would reword this as:
>> >
>> > "Sufficient space to queue space_required bytes might exists"
>> >
>> > Because AFAICT as soon as the hypervisor drops the L3 lock the space
>> > available might change, so the recipient of the notification or the
>> > return from the hypercall shouldn't expect that there _must_ be
>> > space_required available space on the ring.
>> 
>> ack. does this look ok? -:
>> 
>> + * Sufficient space to queue space_required bytes has become available.
>> + * If messages have been sent, it may not still be available.
> 
> I think my suggestion was shorter and clearer, but I'm not a native
> speaker so if you think the above is better and no one else complains
> that's fine for me.

FWIW I'd be fine with either, but with a small adjustment in each case:
In yours the trailing "s" looks wrong. In Christopher's I'd suggest it to
be "If further messages have been sent, ...".

Jan



* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-17 11:12       ` Roger Pau Monné
  2019-01-17 12:04         ` Jan Beulich
@ 2019-01-17 21:44         ` Christopher Clark
  2019-01-18  9:44           ` Roger Pau Monné
  1 sibling, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-17 21:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > > index c12a50f..d2cb594 100644
> > > > --- a/xen/include/public/argo.h
> > > > +++ b/xen/include/public/argo.h
> > > > @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
> > > >  /* Messages on the ring are padded to a multiple of this size. */
> > > >  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
> > > >
> > > > +/*
> > > > + * Notify flags
> > > > + */
> > > > +/* Ring is empty */
> > > > +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> > > > +/* Ring exists */
> > > > +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> > > > +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> > > > +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
>
> Regarding this flag, I've just noticed while looking at the code that
> it doesn't seem to relate to interrupts?

It might not seem that way, but I think it does, because it indicates
that the hypervisor has just queued up a signal (via VIRQ) for later:
the logic in fill_ring_data has observed that there wasn't enough
space available in the ring for the requested space_required supplied
in the notify call, so it has added a new entry to the ring's
pending_ent list, which will cause a signal to be triggered to the
domain (ie. a VIRQ) later when enough space has been observed as being
available.

Now, the "len" value stored in that pending_ent can be changed later,
depending on the size of messages that the domain attempts to send to
the same ring in the meantime, which I think is why the comment notes
not to depend upon that flag.
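
To make the sequence above concrete, here is a minimal sketch in C. It is a
simplified model under stated assumptions, not the code from the patch: the
struct ring_model type, its fields, and the model_* functions are hypothetical
names invented for illustration, and a single pending slot stands in for the
ring's pending_ent list. Only the flag values are taken from the quoted header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as quoted from the patch. */
#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)

/* Hypothetical, simplified stand-in for the per-ring state. */
struct ring_model {
    uint32_t space_avail;   /* bytes currently free on the ring */
    uint32_t pending_len;   /* size a queued wakeup waits for; 0 = none */
};

/* Returns the flags a notify query would report in this model. */
static uint16_t model_fill_ring_data(struct ring_model *r,
                                     uint32_t space_required)
{
    uint16_t flags = XEN_ARGO_RING_DATA_F_EXISTS;

    assert(r != NULL);

    if ( space_required <= r->space_avail )
        return flags | XEN_ARGO_RING_DATA_F_SUFFICIENT;

    /* Not enough room: record the request so a VIRQ can be raised later. */
    r->pending_len = space_required;
    return flags | XEN_ARGO_RING_DATA_F_PENDING;
}

/* Called when the ring owner consumes data; returns 1 if a VIRQ is due. */
static int model_space_freed(struct ring_model *r, uint32_t bytes)
{
    assert(r != NULL);

    r->space_avail += bytes;
    if ( r->pending_len && r->pending_len <= r->space_avail )
    {
        r->pending_len = 0;
        return 1;
    }
    return 0;
}
```

In this model the PENDING bit only says that a wakeup was recorded at query
time; whether and when the VIRQ fires depends on later activity on the ring.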

> From its usage in fill_ring_data I would write the following
> description:
>
> "Likely not enough space to queue a message of `space_required`
> size."
>
> And then XEN_ARGO_RING_DATA_F_PENDING is completely orthogonal to
> XEN_ARGO_RING_DATA_F_SUFFICIENT, at which point having only one of
> those would be enough?

Given the above, where I do think that the PENDING flag is an
indicator of a queued interrupt, I think there's some merit to keeping
them separate, rather than committing to the client that it is always
one or the other. It actually looks like the call to pending_requeue
is ignoring the potential for an error value (eg ENOSPC or ENOMEM)
there, where the flag should not be set, and possibly the errno should
be returned to the caller.

> AFAICT you cannot get a xen_argo_ring_data_ent_t with both
> XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
> at the same time?

right, but there is a case where you can get one with neither bit set.
It looks a bit clearer for the caller to have the explicit separate
bits because it can avoid having to check a third flag first to see
how to interpret a combined one.
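
As an illustration of the caller's side, a minimal sketch in C follows. The
enum space_state type and classify_space() are hypothetical names used only
for this example; the two flag values are the ones quoted from the patch. The
assertion encodes the point agreed above, that both bits are never reported
together.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as quoted from the patch. */
#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)

/* Hypothetical caller-side view of one ring data entry. */
enum space_state {
    SPACE_SUFFICIENT,   /* space was observed at query time */
    SPACE_VIRQ_QUEUED,  /* no space yet, a signal has been queued */
    SPACE_NEITHER       /* neither bit set: look at the other flags */
};

static enum space_state classify_space(uint16_t flags)
{
    /* PENDING and SUFFICIENT are not expected to be set together. */
    assert(!((flags & XEN_ARGO_RING_DATA_F_PENDING) &&
             (flags & XEN_ARGO_RING_DATA_F_SUFFICIENT)));

    if ( flags & XEN_ARGO_RING_DATA_F_SUFFICIENT )
        return SPACE_SUFFICIENT;
    if ( flags & XEN_ARGO_RING_DATA_F_PENDING )
        return SPACE_VIRQ_QUEUED;
    return SPACE_NEITHER;
}
```

With separate bits each outcome is read directly from its own flag, without
first consulting a third flag to decide what a combined bit means.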

>
> > > > +/* Sufficient space to queue space_required bytes exists */
> > > > +#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)
> > >
> > > I would reword this as:
> > >
> > > "Sufficient space to queue space_required bytes might exists"
> > >
> > > Because AFAICT as soon as the hypervisor drops the L3 lock the space
> > > available might change, so the recipient of the notification or the
> > > return from the hypercall shouldn't expect that there _must_ be
> > > space_required available space on the ring.
> >
> > ack. does this look ok? -:
> >
> > + * Sufficient space to queue space_required bytes has become available.
> > + * If messages have been sent, it may not still be available.
>
> I think my suggestion was shorter and clearer, but I'm not a native
> speaker so if you think the above is better and no one else complains
> that's fine for me.

ok, Jan's edit to yours looks good and a single line is nicer so:
"Sufficient space to queue space_required bytes might exist"
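
For a caller, "might exist" means the flag is a hint and the send itself can
still fail. A minimal sketch in C, assuming a hypothetical model where the
ring is just a byte counter, model_query() and model_send() are invented names,
and a full ring is reported as -EAGAIN (an assumption of this model):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value as quoted from the patch. */
#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)

/* Hypothetical model: the ring is reduced to its free byte count. */
struct ring_model {
    uint32_t space_avail;
};

/* Snapshot of the space at query time; may be stale once returned. */
static uint16_t model_query(const struct ring_model *r,
                            uint32_t space_required)
{
    assert(r != NULL);
    return ( space_required <= r->space_avail )
           ? XEN_ARGO_RING_DATA_F_SUFFICIENT : 0;
}

/* Any sender can consume space between a query and the caller's send. */
static int model_send(struct ring_model *r, uint32_t len)
{
    assert(r != NULL);
    if ( len > r->space_avail )
        return -EAGAIN;   /* assumed error for "no room right now" */
    r->space_avail -= len;
    return 0;
}
```

A sender that sees SUFFICIENT should therefore still check the result of the
send and be prepared to wait and retry.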

Christopher


* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-17 21:44         ` Christopher Clark
@ 2019-01-18  9:44           ` Roger Pau Monné
  2019-01-18 23:54             ` Christopher Clark
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-18  9:44 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Thu, Jan 17, 2019 at 01:44:32PM -0800, Christopher Clark wrote:
> On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > > > index c12a50f..d2cb594 100644
> > > > > --- a/xen/include/public/argo.h
> > > > > +++ b/xen/include/public/argo.h
> > > > > @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
> > > > >  /* Messages on the ring are padded to a multiple of this size. */
> > > > >  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
> > > > >
> > > > > +/*
> > > > > + * Notify flags
> > > > > + */
> > > > > +/* Ring is empty */
> > > > > +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> > > > > +/* Ring exists */
> > > > > +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> > > > > +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> > > > > +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
> >
> > Regarding this flag, I've just noticed while looking at the code that
> > it doesn't seem to relate to interrupts?
> 
> It might not seem that way, but I think it does, because it indicates
> that the hypervisor has just queued up a signal (via VIRQ) for later:
> the logic in fill_ring_data has observed that there wasn't enough
> space available in the ring for the requested space_required supplied
> in the notify call, so it has added a new entry to the ring's
> pending_ent list, which will cause a signal to be triggered to the
> domain (ie. a VIRQ) later when enough space has been observed as being
> available.

Oh, I think I was getting confused by the wording of the comment, here
"pending interrupt" means that the caller should expect an interrupt at
some point in the future when there's enough free space on the ring?

To me "pending interrupt" means there's an interrupt set by the
hypervisor which has not yet been serviced by the caller.

> Now, the "len" value stored in that pending_ent can be changed later,
> depending on the size of messages that the domain attempts to send to
> the same ring in the meantime, which I think is why the comment notes
> not to depend upon that flag.
> 
> > From its usage in fill_ring_data I would write the following
> > description:
> >
> > "Likely not enough space to queue a message of `space_required`
> > size."
> >
> > And then XEN_ARGO_RING_DATA_F_PENDING is completely orthogonal to
> > XEN_ARGO_RING_DATA_F_SUFFICIENT, at which point having only one of
> > those would be enough?
> 
> Given the above, where I do think that the PENDING flag is an
> indicator of queued interrupt, I think there's some merit to keeping
> them separate, rather than committing to the client that it is always
> one or the other. It actually looks like the call to pending_requeue
> is ignoring the potential for an error value (eg ENOSPC or ENOMEM)
> there, where the flag should not be set, and possibly the errno should
> be returned to the caller.

Yes, you should propagate the errors from pending_requeue to the
caller.

> > AFAICT you cannot get a xen_argo_ring_data_ent_t with both
> > XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
> > at the same time?
> 
> right, but there is a case where you can get one with neither bit set.

Yes, that's right. But you would then get the
XEN_ARGO_RING_DATA_F_EMSGSIZE flag set or the ring simply doesn't
exist.

> It looks a bit clearer for the caller to have the explicit separate
> bits because it can avoid having to check a third flag first to see
> how to interpret a combined one.

There are three possible situations, which are mutually exclusive:

1. Message is bigger than the max message size supported by the ring:
   set EMSGSIZE
2. Message fits based on the current available space on the ring:
   don't set any flags.
3. Message doesn't fit based on the current available space on the
   ring: set NOTIFY.

So that would leave the following set of flags:

/* Ring is empty. */
#define XEN_ARGO_RING_EMPTY       (1U << 0)
/* Ring exists. */
#define XEN_ARGO_RING_EXISTS      (1U << 1)
/*
 * Not enough ring space available for the requested size, caller set
 * to receive a notification via VIRQ_ARGO when enough free space
 * might be available.
 */
#define XEN_ARGO_RING_NOTIFY      (1U << 2)
/* Requested size exceeds maximum ring message size. */
#define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
/* Ring is shared, not unicast. */
#define XEN_ARGO_RING_SHARED      (1U << 4)

Note that I've also removed the _DATA_F_, I think it's not especially
helpful, and shorter names are easier to read.

I think the above is clearer and should be able to convey the
same set of information using one flag less, which is always better
IMO. That being said, I don't know the users of this interface anyway,
so if you think the original proposal is better I'm not going to
oppose.
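
A minimal sketch in C of how the three situations would map onto this flag
set. The function model_notify_flags() and its parameters are hypothetical
names for illustration, and the ring is assumed to exist; the flag names and
values are the ones listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Flag set as proposed above. */
#define XEN_ARGO_RING_EMPTY       (1U << 0)
#define XEN_ARGO_RING_EXISTS      (1U << 1)
#define XEN_ARGO_RING_NOTIFY      (1U << 2)
#define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
#define XEN_ARGO_RING_SHARED      (1U << 4)

/*
 * Hypothetical helper: pick the flags for one existing ring from the
 * requested size, the ring's maximum message size and its free space.
 */
static uint16_t model_notify_flags(uint32_t space_required,
                                   uint32_t max_msg_size,
                                   uint32_t space_avail)
{
    uint16_t flags = XEN_ARGO_RING_EXISTS;

    if ( space_required > max_msg_size )
        flags |= XEN_ARGO_RING_EMSGSIZE;   /* situation 1 */
    else if ( space_required > space_avail )
        flags |= XEN_ARGO_RING_NOTIFY;     /* situation 3 */
    /* situation 2: message fits, no extra flag */

    /* The situations are mutually exclusive. */
    assert(!((flags & XEN_ARGO_RING_EMSGSIZE) &&
             (flags & XEN_ARGO_RING_NOTIFY)));
    return flags;
}
```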

Thanks, Roger.


* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-18  9:44           ` Roger Pau Monné
@ 2019-01-18 23:54             ` Christopher Clark
  2019-01-18 23:59               ` Christopher Clark
  2019-01-19 12:06               ` Roger Pau Monné
  0 siblings, 2 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-18 23:54 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Fri, Jan 18, 2019 at 1:44 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Thu, Jan 17, 2019 at 01:44:32PM -0800, Christopher Clark wrote:
> > On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > > > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > > > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > > > > index c12a50f..d2cb594 100644
> > > > > > --- a/xen/include/public/argo.h
> > > > > > +++ b/xen/include/public/argo.h
> > > > > > @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
> > > > > >  /* Messages on the ring are padded to a multiple of this size. */
> > > > > >  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
> > > > > >
> > > > > > +/*
> > > > > > + * Notify flags
> > > > > > + */
> > > > > > +/* Ring is empty */
> > > > > > +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> > > > > > +/* Ring exists */
> > > > > > +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> > > > > > +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> > > > > > +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
> > >
> > > Regarding this flag, I've just noticed while looking at the code that
> > > it doesn't seem to relate to interrupts?
> >
> > It might not seem that way, but I think it does, because it indicates
> > that the hypervisor has just queued up a signal (via VIRQ) for later:
> > the logic in fill_ring_data has observed that there wasn't enough
> > space available in the ring for the requested space_required supplied
> > in the notify call, so it has added a new entry to the ring's
> > pending_ent list, which will cause a signal to be triggered to the
> > domain (ie. a VIRQ) later when enough space has been observed as being
> > available.
>
> Oh, I think I was getting confused by the wording of the comment, here
> "pending interrupt" means that the caller should expect an interrupt at
> some point in the future when there's enough free space on the ring?

Yes, that's right.

> To me "pending interrupt" means there's an interrupt set by the
> hypervisor which has not yet been serviced by the caller.

OK, I could see that is a reasonable interpretation too. Do you have a
term that you would prefer for this?

>
> > Now, the "len" value stored in that pending_ent can be changed later,
> > depending on the size of messages that the domain attempts to send to
> > the same ring in the meantime, which I think is why the comment notes
> > not to depend upon that flag.
> >
> > > From its usage in fill_ring_data I would write the following
> > > description:
> > >
> > > "Likely not enough space to queue a message of `space_required`
> > > size."
> > >
> > > And then XEN_ARGO_RING_DATA_F_PENDING is completely orthogonal to
> > > XEN_ARGO_RING_DATA_F_SUFFICIENT, at which point having only one of
> > > those would be enough?
> >
> > Given the above, where I do think that the PENDING flag is an
> > indicator of queued interrupt, I think there's some merit to keeping
> > them separate, rather than committing to the client that it is always
> > one or the other. It actually looks like the call to pending_requeue
> > is ignoring the potential for an error value (eg ENOSPC or ENOMEM)
> > there, where the flag should not be set, and possibly the errno should
> > be returned to the caller.
>
> Yes, you should propagate the errors from pending_requeue to the
> caller.

ack, done.

>
> > > AFAICT you cannot get a xen_argo_ring_data_ent_t with both
> > > XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
> > > at the same time?
> >
> > right, but there is a case where you can get one with neither bit set.
>
> Yes, that's right. But you would then get the
> XEN_ARGO_RING_DATA_F_EMSGSIZE flag set or the ring simply doesn't
> exist.
>
> > It looks a bit clearer for the caller to have the explicit separate
> > bits because it can avoid having to check a third flag first to see
> > how to interpret a combined one.
>
> There are three possible situations, which are mutually exclusive:
>
> 1. Message is bigger than the max message size supported by the ring:
>    set EMSGSIZE
> 2. Message fits based on the current available space on the ring:
>    don't set any flags.
> 3. Message doesn't fit based on the current available space on the
>    ring: set NOTIFY.

Unfortunately, given the new error checking (added for my "ack, done." above),
now there is a fourth condition. Situation 3 is described more fully as:

3. Message doesn't fit based on the current available space on the
ring, and a VIRQ is queued for when space is available: set NOTIFY.

New Situation 4 is:

4. Message doesn't fit based on the current available space on the ring,
but Xen can't queue up a VIRQ for later because memory allocation to
add an entry for that failed. Don't set NOTIFY.

We ought to enable the guest to distinguish Situation 2 from Situation 4
-- which I think points to keeping the separate flags.
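
A small sketch in C of the concern, assuming the flags are the only thing the
guest looks at for this ring (the hypercall return value is left out of this
model). The enum situation type and the two flags_* functions are hypothetical
names for illustration; the flag values are the ones quoted in this thread.

```c
#include <assert.h>

/* Single-flag scheme proposed in the review. */
#define XEN_ARGO_RING_NOTIFY             (1U << 2)
/* Separate-flag scheme from the patch. */
#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)

/* Situations 2, 3 and 4 as numbered in this thread. */
enum situation {
    SIT_FITS = 2,            /* message fits now */
    SIT_VIRQ_QUEUED = 3,     /* does not fit, VIRQ queued */
    SIT_REQUEUE_FAILED = 4   /* does not fit, VIRQ could not be queued */
};

/* Space-related bits under the single-flag scheme. */
static unsigned int flags_single(enum situation s)
{
    assert(s >= SIT_FITS && s <= SIT_REQUEUE_FAILED);
    return ( s == SIT_VIRQ_QUEUED ) ? XEN_ARGO_RING_NOTIFY : 0;
}

/* Space-related bits under the separate-flag scheme. */
static unsigned int flags_separate(enum situation s)
{
    assert(s >= SIT_FITS && s <= SIT_REQUEUE_FAILED);
    switch ( s )
    {
    case SIT_FITS:
        return XEN_ARGO_RING_DATA_F_SUFFICIENT;
    case SIT_VIRQ_QUEUED:
        return XEN_ARGO_RING_DATA_F_PENDING;
    default:
        return 0;
    }
}
```

In this model flags_single() returns 0 for both Situation 2 and Situation 4,
while flags_separate() returns different values for them.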

> So that would leave the following set of flags:
>
> /* Ring is empty. */
> #define XEN_ARGO_RING_EMPTY       (1U << 0)
> /* Ring exists. */
> #define XEN_ARGO_RING_EXISTS      (1U << 1)
> /*
>  * Not enough ring space available for the requested size, caller set
>  * to receive a notification via VIRQ_ARGO when enough free space
>  * might be available.
>  */
> #define XEN_ARGO_RING_NOTIFY      (1U << 2)
> /* Requested size exceeds maximum ring message size. */
> #define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
> /* Ring is shared, not unicast. */
> #define XEN_ARGO_RING_SHARED      (1U << 4)
>
> Note that I've also removed the _DATA_F_, I think it's not especially
> helpful, and shorter names are easier to read.

Ack - done.

>
> I think the above is clearer and should be able to convey the
> same set of information using one flag less, which is always better
> IMO. That being said, I don't know the users of this interface anyway,
> so if you think the original proposal is better I'm not going to
> oppose.

ok -- let me know your view given the description of Situation 4 above.
I've kept it unchanged for the time being.

Christopher


* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-18 23:54             ` Christopher Clark
@ 2019-01-18 23:59               ` Christopher Clark
  2019-01-19 12:06               ` Roger Pau Monné
  1 sibling, 0 replies; 42+ messages in thread
From: Christopher Clark @ 2019-01-18 23:59 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Fri, Jan 18, 2019 at 3:54 PM Christopher Clark
<christopher.w.clark@gmail.com> wrote:
>
> On Fri, Jan 18, 2019 at 1:44 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Thu, Jan 17, 2019 at 01:44:32PM -0800, Christopher Clark wrote:
> > > On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > > > > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > >
> > > > > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:

> New Situation 4 is:
>
> 4. Message doesn't fit based on the current available space on the ring,
> but Xen can't queue up a VIRQ for later because memory allocation to
> add an entry for that failed. Don't set NOTIFY.

I should add: there's an additional error condition for Situation 4 besides
memory allocation failure: the ring could have reached the maximum
number of pending space_available signals in its list, which would cause
ENOSPC as another possible error value returned there.
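
A minimal sketch in C of the two failure paths, as a simplified model rather
than the code in the series: struct pending_ent and struct pending_list here
are hypothetical layouts, MODEL_MAX_PENDING is an invented limit, malloc()
stands in for the hypervisor allocator, and growing "len" to the larger
request is an assumption about how an existing entry gets updated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_PENDING 4U   /* hypothetical per-ring limit */

/* Hypothetical entry: one waiting domain and the space it asked for. */
struct pending_ent {
    struct pending_ent *next;
    uint16_t domain_id;
    uint32_t len;
};

struct pending_list {
    struct pending_ent *head;
    unsigned int count;
};

/* Returns 0 on success, -ENOSPC if the list is full, -ENOMEM on OOM. */
static int model_pending_requeue(struct pending_list *l,
                                 uint16_t domain_id, uint32_t len)
{
    struct pending_ent *ent;

    assert(l != NULL);

    /* An existing entry for this domain is updated in place. */
    for ( ent = l->head; ent; ent = ent->next )
        if ( ent->domain_id == domain_id )
        {
            if ( ent->len < len )
                ent->len = len;
            return 0;
        }

    if ( l->count >= MODEL_MAX_PENDING )
        return -ENOSPC;

    ent = malloc(sizeof(*ent));
    if ( !ent )
        return -ENOMEM;

    ent->domain_id = domain_id;
    ent->len = len;
    ent->next = l->head;
    l->head = ent;
    l->count++;
    return 0;
}
```

In either failure case no entry is queued, so no later signal should be
promised to the caller.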

>
> We ought to enable the guest to distinguish Situation 2 from Situation 4
> -- which I think points to keeping the separate flags.
>
> > So that would leave the following set of flags:
> >
> > /* Ring is empty. */
> > #define XEN_ARGO_RING_EMPTY       (1U << 0)
> > /* Ring exists. */
> > #define XEN_ARGO_RING_EXISTS      (1U << 1)
> > /*
> >  * Not enough ring space available for the requested size, caller set
> >  * to receive a notification via VIRQ_ARGO when enough free space
> >  * might be available.
> >  */
> > #define XEN_ARGO_RING_NOTIFY      (1U << 2)
> > /* Requested size exceeds maximum ring message size. */
> > #define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
> > /* Ring is shared, not unicast. */
> > #define XEN_ARGO_RING_SHARED      (1U << 4)
> >
> > Note that I've also removed the _DATA_F_, I think it's not especially
> > helpful, and shorter names are easier to read.
>
> Ack - done.
>
> >
> > I think the above is clearer and should be able to convey the
> > same set of information using one flag less, which is always better
> > IMO. That being said, I don't know the users of this interface anyway,
> > so if you think the original proposal is better I'm not going to
> > oppose.
>
> ok -- let me know your view given the description of Situation 4 above.
> I've kept it unchanged for the time being.
>
> Christopher


* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-18 23:54             ` Christopher Clark
  2019-01-18 23:59               ` Christopher Clark
@ 2019-01-19 12:06               ` Roger Pau Monné
  2019-01-21  1:59                 ` Christopher Clark
  1 sibling, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-19 12:06 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Fri, Jan 18, 2019 at 03:54:14PM -0800, Christopher Clark wrote:
> On Fri, Jan 18, 2019 at 1:44 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Thu, Jan 17, 2019 at 01:44:32PM -0800, Christopher Clark wrote:
> > > On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > > > > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > >
> > > > > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > > > > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > > > > > index c12a50f..d2cb594 100644
> > > > > > > --- a/xen/include/public/argo.h
> > > > > > > +++ b/xen/include/public/argo.h
> > > > > > > @@ -123,6 +123,42 @@ typedef struct xen_argo_unregister_ring
> > > > > > >  /* Messages on the ring are padded to a multiple of this size. */
> > > > > > >  #define XEN_ARGO_MSG_SLOT_SIZE 0x10
> > > > > > >
> > > > > > > +/*
> > > > > > > + * Notify flags
> > > > > > > + */
> > > > > > > +/* Ring is empty */
> > > > > > > +#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
> > > > > > > +/* Ring exists */
> > > > > > > +#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
> > > > > > > +/* Pending interrupt exists. Do not rely on this field - for profiling only */
> > > > > > > +#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
> > > >
> > > > Regarding this flag, I've just noticed while looking at the code that
> > > > it doesn't seem to relate to interrupts?
> > >
> > > It might not seem that way, but I think it does, because it indicates
> > > that the hypervisor has just queued up a signal (via VIRQ) for later:
> > > the logic in fill_ring_data has observed that there wasn't enough
> > > space available in the ring for the requested space_required supplied
> > > in the notify call, so it has added a new entry to the ring's
> > > pending_ent list, which will cause a signal to be triggered to the
> > > domain (ie. a VIRQ) later when enough space has been observed as being
> > > available.
> >
> > Oh, I think I was getting confused by the wording of the comment, here
> > "pending interrupt" means that the caller should expect an interrupt at
> > some point in the future when there's enough free space on the ring?
> 
> Yes, that's right.
> 
> > To me "pending interrupt" means there's an interrupt set by the
> > hypervisor which has not yet been serviced by the caller.
> 
> OK, I could see that is a reasonable interpretation too. Do you have a
> term that you would prefer for this?

My proposal was 'notify', but I'm quite bad at naming things TBH.

> >
> > > Now, the "len" value stored in that pending_ent can be changed later,
> > > depending on the size of messages that the domain attempts to send to
> > > the same ring in the meantime, which I think is why the comment notes
> > > not to depend upon that flag.
> > >
> > > > From its usage in fill_ring_data I would write the following
> > > > description:
> > > >
> > > > "Likely not enough space to queue a message of `space_required`
> > > > size."
> > > >
> > > > And then XEN_ARGO_RING_DATA_F_PENDING is completely orthogonal to
> > > > XEN_ARGO_RING_DATA_F_SUFFICIENT, at which point having only one of
> > > > those would be enough?
> > >
> > > Given the above, where I do think that the PENDING flag is an
> > > indicator of queued interrupt, I think there's some merit to keeping
> > > them separate, rather than committing to the client that it is always
> > > one or the other. It actually looks like the call to pending_requeue
> > > is ignoring the potential for an error value (eg ENOSPC or ENOMEM)
> > > there, where the flag should not be set, and possibly the errno should
> > > be returned to the caller.
> >
> > Yes, you should propagate the errors from pending_requeue to the
> > caller.
> 
> ack, done.
> 
> >
> > > > AFAICT you cannot get a xen_argo_ring_data_ent_t with both
> > > > XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
> > > > at the same time?
> > >
> > > right, but there is a case where you can get one with neither bit set.
> >
> > Yes, that's right. But you would then get the
> > XEN_ARGO_RING_DATA_F_EMSGSIZE flag set or the ring simply doesn't
> > exist.
> >
> > > It looks a bit clearer for the caller to have the explicit separate
> > > bits because it can avoid having to check a third flag first to see
> > > how to interpret a combined one.
> >
> > There are three possible situations, which are mutually exclusive:
> >
> > 1. Message is bigger than the max message size supported by the ring:
> >    set EMSGSIZE
> > 2. Message fits based on the current available space on the ring:
> >    don't set any flags.
> > 3. Message doesn't fit based on the current available space on the
> >    ring: set NOTIFY.
> 
> Unfortunately, given the new error checking (added for my "ack, done." above),
> now there is a fourth condition. Situation 3 is described more fully as:
> 
> 3. Message doesn't fit based on the current available space on the
> ring, and a VIRQ is queued for when space is available: set NOTIFY.
> 
> New Situation 4 is:
> 
> 4. Message doesn't fit based on the current available space on the ring,
> but Xen can't queue up a VIRQ for later because memory allocation to
> add an entry for that failed. Don't set NOTIFY.
> 
> We ought to enable the guest to distinguish Situation 2 from Situation 4
> -- which I think points to keeping the separate flags.

But situation 4 is going to return an error code from the hypercall
(ENOSPC?), at which point you will be able to differentiate it?

In fact I think XEN_ARGO_RING_EMSGSIZE could be removed also, and the
hypercall made return E2BIG?

> 
> > So that would leave the following set of flags:
> >
> > /* Ring is empty. */
> > #define XEN_ARGO_RING_EMPTY       (1U << 0)
> > /* Ring exists. */
> > #define XEN_ARGO_RING_EXISTS      (1U << 1)
> > /*
> >  * Not enough ring space available for the requested size, caller set
> >  * to receive a notification via VIRQ_ARGO when enough free space
> >  * might be available.
> >  */
> > #define XEN_ARGO_RING_NOTIFY      (1U << 2)
> > /* Requested size exceeds maximum ring message size. */
> > #define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
> > /* Ring is shared, not unicast. */
> > #define XEN_ARGO_RING_SHARED      (1U << 4)
> >
> > Note that I've also removed the _DATA_F_, I think it's not especially
> > helpful, and shorter names are easier to read.
> 
> Ack - done.
> 
> >
> > I think the above is clearer and should be able to convey the
> > same set of information using one flag less, which is always better
> > IMO. That being said, I don't know the users of this interface anyway,
> > so if you think the original proposal is better I'm not going to
> > oppose.
> 
> ok -- let me know your view given the description of Situation 4 above.
> I've kept it unchanged for the time being.

As said in my previous reply, my comments were recommendations but I
don't have a strong opinion because I don't know the users of the
interface. My PoV is that adding flags for errors seems like
duplicating error codes, and duplicating the places the caller has to
check for errors (in your proposal return value from hypercall and
flags field can both contain errors).

For example if the message is bigger than the ring size, the hypercall
is going to set the XEN_ARGO_RING_EMSGSIZE flag and return 0? I think
you could avoid the XEN_ARGO_RING_EMSGSIZE flag and just make the
hypercall return E2BIG.
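
Roughly what I have in mind, as a sketch in C only: model_notify_query() and
its parameters are hypothetical names, a single existing ring is assumed, and
the flag values are the ones from my earlier list. The oversized request is
reported through the return value, so the caller has one place to check for
errors and the flags only describe ring state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define XEN_ARGO_RING_EXISTS      (1U << 1)
#define XEN_ARGO_RING_NOTIFY      (1U << 2)

/*
 * Hypothetical query for one existing ring.
 * Returns 0 and fills *flags, or -E2BIG if the request can never fit.
 */
static int model_notify_query(uint32_t space_required,
                              uint32_t max_msg_size,
                              uint32_t space_avail,
                              uint16_t *flags)
{
    assert(flags != NULL);

    *flags = 0;
    if ( space_required > max_msg_size )
        return -E2BIG;            /* error code instead of EMSGSIZE flag */

    *flags = XEN_ARGO_RING_EXISTS;
    if ( space_required > space_avail )
        *flags |= XEN_ARGO_RING_NOTIFY;
    return 0;
}
```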

Thanks, Roger.


* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-19 12:06               ` Roger Pau Monné
@ 2019-01-21  1:59                 ` Christopher Clark
  2019-01-21  8:21                   ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Christopher Clark @ 2019-01-21  1:59 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Sat, Jan 19, 2019 at 4:06 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Fri, Jan 18, 2019 at 03:54:14PM -0800, Christopher Clark wrote:
> > On Fri, Jan 18, 2019 at 1:44 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Thu, Jan 17, 2019 at 01:44:32PM -0800, Christopher Clark wrote:
> > > > On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > > > > > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > > > > > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > > > > > > index c12a50f..d2cb594 100644
> > > > > > > > --- a/xen/include/public/argo.h
> > > > > > > > +++ b/xen/include/public/argo.h

> > > > > AFAICT you cannot get a xen_argo_ring_data_ent_t with both
> > > > > XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
> > > > > at the same time?
> > >
> > > There are three possible situations, which are mutually exclusive:
> > >
> > > 1. Message is bigger than the max message size supported by the ring:
> > >    set EMSGSIZE
> > > 2. Message fits based on the current available space on the ring:
> > >    don't set any flags.
> > > 3. Message doesn't fit based on the current available space on the
> > >    ring: set NOTIFY.
> >
> > Unfortunately, given the new error checking (added for my "ack, done." above),
> > now there is a fourth condition. Situation 3 is described more fully as:
> >
> > 3. Message doesn't fit based on the current available space on the
> > ring, and a VIRQ is queued for when space is available: set NOTIFY.
> >
> > New Situation 4 is:
> >
> > 4. Message doesn't fit based on the current available space on the ring,
> > but Xen can't queue up a VIRQ for later because memory allocation to
> > add an entry for that failed. Don't set NOTIFY.
> >
> > We ought to enable the guest to distinguish Situation 2 from Situation 4
> > -- which I think points to keeping the separate flags.
>
> But situation 4 is going to return an error code from the hypercall
> (ENOSPC?), at which point you will be able to differentiate it?

Ack, ok. Since ENOSPC aborts the return of any further data about that ring
(or subsequent rings that were queried in the same op), yes, it's distinct.

> In fact I think XEN_ARGO_RING_EMSGSIZE could be removed also, and the
> hypercall made return E2BIG?

This is the query interface for the sender to a ring to discover the
receiver's ring size, and it's an interface for querying about multiple
rings in the same operation; it may not know an individual ring size
beforehand, or that a given payload size will exceed it.

Returning E2BIG would abort the loop (in notify, that calls fill_ring_data)
and not actually return the state to the caller indicating the size of the
maximum acceptable message size that it needs to avoid that error.
Instead, we're using the bit in the per-ring response to indicate that
(non-serious) condition and allowing the loop to continue and provide data
about all the other rings in the request, including maximum message sizes.
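
A minimal sketch of the behaviour described above, not the code from the
patch: the struct layout, `fill_one()` and `query_rings()` are hypothetical
names invented for illustration, and only the two flag values are taken from
the proposal quoted in this thread. It shows an oversize request being
reported per ring while the loop carries on and still returns every ring's
maximum message size.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values as listed in the proposal quoted in this thread. */
#define XEN_ARGO_RING_EXISTS      (1U << 1)
#define XEN_ARGO_RING_EMSGSIZE    (1U << 3)

/* Hypothetical, simplified per-ring query entry (not the real ABI struct). */
struct ring_query_ent {
    uint32_t space_required;   /* in:  payload size the sender wants to send */
    uint32_t max_message_size; /* out: largest message the ring accepts */
    uint32_t flags;            /* out: XEN_ARGO_RING_* bits */
};

/* Hypothetical per-ring step; ring_max stands in for the ring's real limit. */
static void fill_one(struct ring_query_ent *ent, uint32_t ring_max)
{
    ent->flags = XEN_ARGO_RING_EXISTS;
    ent->max_message_size = ring_max;

    /* Oversize is reported in the entry; it does not abort the query. */
    if (ent->space_required > ring_max)
        ent->flags |= XEN_ARGO_RING_EMSGSIZE;
}

/* Visit every entry, so one oversize request cannot hide later rings. */
static int query_rings(struct ring_query_ent *ents, const uint32_t *ring_max,
                       size_t n)
{
    for (size_t i = 0; i < n; i++)
        fill_one(&ents[i], ring_max[i]);

    return 0; /* the query itself succeeded */
}
```

Had the first oversize entry produced an E2BIG return instead, the caller
would have learned neither that ring's maximum message size nor anything
about the rings after it.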

> > > So that would leave the following set of flags:
> > >
> > > /* Ring is empty. */
> > > #define XEN_ARGO_RING_EMPTY       (1U << 0)
> > > /* Ring exists. */
> > > #define XEN_ARGO_RING_EXISTS      (1U << 1)
> > > /*
> > >  * Not enough ring space available for the requested size, caller set
> > >  * to receive a notification via VIRQ_ARGO when enough free space
> > >  * might be available.
> > >  */
> > > #define XEN_ARGO_RING_NOTIFY      (1U << 2)
> > > /* Requested size exceeds maximum ring message size. */
> > > #define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
> > > /* Ring is shared, not unicast. */
> > > #define XEN_ARGO_RING_SHARED      (1U << 4)
> > >
> > >
> > > I think the above is clearer and should be able to convey the
> > > same set of information using one flag less, which is always better
> > > IMO. That being said, I don't know the users of this interface anyway,
> > > so if you think the original proposal is better I'm not going to
> > > oppose.

I've checked both the Linux (prototype Argo) and Windows (v4v) drivers and
of the original flags, the SUFFICIENT flag is used, while the PENDING one is
not, which fits with its description of being for profiling only; so given
that, I'll keep the SUFFICIENT flag, but will drop the PENDING one, leaving
it to be inferred from the other state returned, as requested.
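
As a rough illustration of "inferred from the other state returned", a
guest-side helper might look like the sketch below. The `RING_*` bit values
and the `space_pending()` name are hypothetical and are not the values from
the header; the inference also assumes the hypercall itself returned
success, since the ENOSPC case discussed earlier in this message is reported
through the return code instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define RING_EXISTS     (1U << 0)
#define RING_SUFFICIENT (1U << 1)
#define RING_EMSGSIZE   (1U << 2)

/*
 * With no PENDING flag, "a notification is pending" is taken to mean:
 * the ring exists, the message could fit in principle, but there is
 * not enough free space for it right now.
 */
static bool space_pending(uint32_t flags)
{
    return (flags & RING_EXISTS) &&
           !(flags & RING_EMSGSIZE) &&
           !(flags & RING_SUFFICIENT);
}
```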

Christopher


* Re: [PATCH v4 10/14] argo: implement the notify op
  2019-01-21  1:59                 ` Christopher Clark
@ 2019-01-21  8:21                   ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2019-01-21  8:21 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Sun, Jan 20, 2019 at 05:59:36PM -0800, Christopher Clark wrote:
> On Sat, Jan 19, 2019 at 4:06 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Fri, Jan 18, 2019 at 03:54:14PM -0800, Christopher Clark wrote:
> > > On Fri, Jan 18, 2019 at 1:44 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Thu, Jan 17, 2019 at 01:44:32PM -0800, Christopher Clark wrote:
> > > > > On Thu, Jan 17, 2019 at 3:12 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 16, 2019 at 10:54:48PM -0800, Christopher Clark wrote:
> > > > > > > On Tue, Jan 15, 2019 at 8:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jan 15, 2019 at 01:27:42AM -0800, Christopher Clark wrote:
> > > > > > > > > diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
> > > > > > > > > index c12a50f..d2cb594 100644
> > > > > > > > > --- a/xen/include/public/argo.h
> > > > > > > > > +++ b/xen/include/public/argo.h
> 
> > > > > > AFAICT you cannot get a xen_argo_ring_data_ent_t with both
> > > > > > XEN_ARGO_RING_DATA_F_PENDING and XEN_ARGO_RING_DATA_F_SUFFICIENT set
> > > > > > at the same time?
> > > >
> > > > There are three possible situations, which are mutually exclusive:
> > > >
> > > > 1. Message is bigger than the max message size supported by the ring:
> > > >    set EMSGSIZE
> > > > 2. Message fits based on the current available space on the ring:
> > > >    don't set any flags.
> > > > 3. Message doesn't fit based on the current available space on the
> > > >    ring: set NOTIFY.
> > >
> > > Unfortunately, given the new error checking (added for my "ack, done." above),
> > > now there is a fourth condition. Situation 3 is described more fully as:
> > >
> > > 3. Message doesn't fit based on the current available space on the
> > > ring, and a VIRQ is queued for when space is available: set NOTIFY.
> > >
> > > New Situation 4 is:
> > >
> > > 4. Message doesn't fit based on the current available space on the ring,
> > > but Xen can't queue up a VIRQ for later because memory allocation to
> > > add an entry for that failed. Don't set NOTIFY.
> > >
> > > We ought to enable the guest to distinguish Situation 2 from Situation 4
> > > -- which I think points to keeping the separate flags.
> >
> > But situation 4 is going to return an error code from the hypercall
> > (ENOSPC?), at which point you will be able to differentiate it?
> 
> Ack, ok. Since ENOSPC aborts the return of any further data about that ring
> (or subsequent rings that were queried in the same op), yes, it's distinct.
> 
> > In fact I think XEN_ARGO_RING_EMSGSIZE could be removed also, and the
> > hypercall made return E2BIG?
> 
> This is the query interface for the sender to a ring to discover the
> receiver's ring size, and it's an interface for querying about multiple
> rings in the same operation; it may not know an individual ring size
> beforehand, or that a given payload size will exceed it.
> 
> Returning E2BIG would abort the loop (in notify, that calls fill_ring_data)
> and not actually return the state to the caller indicating the size of the
> maximum acceptable message size that it needs to avoid that error.
> Instead, we're using the bit in the per-ring response to indicate that
> (non-serious) condition and allowing the loop to continue and provide data
> about all the other rings in the request, including maximum message sizes.

Right, I've been thinking about this and since this is a status
request hypercall it might make sense to return some of what would be
errors (if this was a write to the ring) as flags, and leave the
return error code to be used only for errors that actually prevent the
hypervisor from successfully executing the status hypercall. I leave
it up to you to decide what's worth putting in the flags field or
returning as an error code.
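
From the caller's side, that split could be handled roughly as in the
sketch below. `count_oversize()` is a hypothetical helper, `rc` stands for
whatever the status hypercall returned, and the flag value is the one from
the list quoted in this thread. The return code is checked first, and the
per-ring flags are only trusted once the query as a whole has succeeded.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag value as listed in the proposal quoted in this thread. */
#define XEN_ARGO_RING_EMSGSIZE    (1U << 3)

/*
 * Returns the negative error code unchanged if the query itself failed;
 * otherwise returns how many rings reported the requested size as too big.
 */
static int count_oversize(int rc, const uint32_t *flags, size_t n)
{
    int count = 0;

    if (rc < 0)
        return rc; /* flags are not meaningful when the query failed */

    for (size_t i = 0; i < n; i++)
        if (flags[i] & XEN_ARGO_RING_EMSGSIZE)
            count++;

    return count;
}
```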

> > > > So that would leave the following set of flags:
> > > >
> > > > /* Ring is empty. */
> > > > #define XEN_ARGO_RING_EMPTY       (1U << 0)
> > > > /* Ring exists. */
> > > > #define XEN_ARGO_RING_EXISTS      (1U << 1)
> > > > /*
> > > >  * Not enough ring space available for the requested size, caller set
> > > >  * to receive a notification via VIRQ_ARGO when enough free space
> > > >  * might be available.
> > > >  */
> > > > #define XEN_ARGO_RING_NOTIFY      (1U << 2)
> > > > /* Requested size exceeds maximum ring message size. */
> > > > #define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
> > > > /* Ring is shared, not unicast. */
> > > > #define XEN_ARGO_RING_SHARED      (1U << 4)
> > > >
> > > >
> > > > I think the above is clearer and should be able to convey the
> > > > same set of information using one flag less, which is always better
> > > > IMO. That being said, I don't know the users of this interface anyway,
> > > > so if you think the original proposal is better I'm not going to
> > > > oppose.
> 
> I've checked both the Linux (prototype Argo) and Windows (v4v) drivers and
> of the original flags, the SUFFICIENT flag is used, while the PENDING one is
> not, which fits with its description of being for profiling only; so given
> that, I'll keep the SUFFICIENT flag, but will drop the PENDING one, leaving
> it to be inferred from the other state returned, as requested.

Ack, as said above, I leave it up to you to decide what flags to use. At
the end of the day there are already clients for this interface, so I
assume the flags are functional for the needs of the clients.

Thanks, Roger.


Thread overview: 42+ messages
2019-01-15  9:27 [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Christopher Clark
2019-01-15  9:27 ` [PATCH v4 01/14] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
2019-01-15  9:27 ` [PATCH v4 02/14] argo: introduce the argo_op hypercall boilerplate Christopher Clark
2019-01-15  9:27 ` [PATCH v4 03/14] argo: define argo_dprintk for subsystem debugging Christopher Clark
2019-01-15  9:27 ` [PATCH v4 04/14] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
2019-01-15 12:29   ` Roger Pau Monné
2019-01-15 12:42     ` Jan Beulich
2019-01-15 14:16       ` Roger Pau Monné
2019-01-15 14:15     ` Ian Jackson
2019-01-16  1:07     ` Christopher Clark
2019-01-15  9:27 ` [PATCH v4 05/14] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
2019-01-15  9:27 ` [PATCH v4 06/14] xen/arm: introduce guest_handle_for_field() Christopher Clark
2019-01-15  9:27 ` [PATCH v4 07/14] argo: implement the register op Christopher Clark
2019-01-15 14:40   ` Roger Pau Monné
2019-01-15 22:37     ` Christopher Clark
2019-01-15  9:27 ` [PATCH v4 08/14] argo: implement the unregister op Christopher Clark
2019-01-15 15:03   ` Roger Pau Monné
2019-01-17  6:40     ` Christopher Clark
2019-01-15  9:27 ` [PATCH v4 09/14] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
2019-01-15 15:49   ` Roger Pau Monné
2019-01-15 16:10     ` Jan Beulich
2019-01-15 16:19       ` Roger Pau Monné
2019-01-17  6:48     ` Christopher Clark
2019-01-17 10:53       ` Roger Pau Monné
2019-01-15  9:27 ` [PATCH v4 10/14] argo: implement the notify op Christopher Clark
2019-01-15 16:17   ` Roger Pau Monné
2019-01-17  6:54     ` Christopher Clark
2019-01-17 11:12       ` Roger Pau Monné
2019-01-17 12:04         ` Jan Beulich
2019-01-17 21:44         ` Christopher Clark
2019-01-18  9:44           ` Roger Pau Monné
2019-01-18 23:54             ` Christopher Clark
2019-01-18 23:59               ` Christopher Clark
2019-01-19 12:06               ` Roger Pau Monné
2019-01-21  1:59                 ` Christopher Clark
2019-01-21  8:21                   ` Roger Pau Monné
2019-01-15  9:27 ` [PATCH v4 11/14] xsm, argo: XSM control for argo register Christopher Clark
2019-01-15  9:27 ` [PATCH v4 12/14] xsm, argo: XSM control for argo message send operation Christopher Clark
2019-01-15  9:27 ` [PATCH v4 13/14] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
2019-01-15  9:27 ` [PATCH v4 14/14] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
2019-01-15 16:34 ` [PATCH v4 00/14] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
2019-01-15 22:39   ` Christopher Clark
