* [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
@ 2019-01-21  9:59 Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 01/15] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
                   ` (15 more replies)
  0 siblings, 16 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, Daniel De Graaf, James McKenzie, Eric Chanudet,
	Roger Pau Monne

Version five of this patch series:

* Changes are primarily addressing feedback from the v4 series reviews.
  Many points are noted in the individual commit posts.

* Critical sections have been shrunk, with allocations and frees
  pulled outside where possible, reordering logic within hypercall ops.

* A new ring hash function has been implemented, derived from the djb2
  string hash function.

* Flags returned by the notify op have been simplified.

* Now uses a single argo boot parameter, taking a list:
  - top level boolean to enable/disable Argo
  - mac-permissive option to enable/disable wildcard rings
  - command line doc edit: no "CONFIG_ARGO" but refers to build config

* Switched to use the standard list data structures used by Xen's
  common code.

* Further removal of uses of fixed-width types.

* Added a new patch to add Argo to the MAINTAINERS file.

Christopher Clark (15):
  argo: Introduce the Kconfig option to govern inclusion of Argo
  argo: introduce the argo_op hypercall boilerplate
  argo: define argo_dprintk for subsystem debugging
  argo: init, destroy and soft-reset, with enable command line opt
  errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  xen/arm: introduce guest_handle_for_field()
  argo: implement the register op
  argo: implement the unregister op
  argo: implement the sendv op; evtchn: expose send_guest_global_virq
  argo: implement the notify op
  xsm, argo: XSM control for argo register
  xsm, argo: XSM control for argo message send operation
  xsm, argo: XSM control for any access to argo by a domain
  xsm, argo: notify: don't describe rings that cannot be sent to
  MAINTAINERS: add new section for Argo and self as maintainer

 MAINTAINERS                                  |    8 +
 docs/misc/xen-command-line.pandoc            |   22 +
 tools/flask/policy/modules/guest_features.te |    7 +
 xen/arch/x86/guest/hypercall_page.S          |    2 +-
 xen/arch/x86/hvm/hypercall.c                 |    3 +
 xen/arch/x86/hypercall.c                     |    3 +
 xen/arch/x86/pv/hypercall.c                  |    3 +
 xen/common/Kconfig                           |   19 +
 xen/common/Makefile                          |    3 +-
 xen/common/argo.c                            | 2281 ++++++++++++++++++++++++++
 xen/common/compat/argo.c                     |   62 +
 xen/common/domain.c                          |    9 +
 xen/common/event_channel.c                   |    2 +-
 xen/include/Makefile                         |    1 +
 xen/include/asm-arm/guest_access.h           |    3 +
 xen/include/public/argo.h                    |  280 ++++
 xen/include/public/errno.h                   |    2 +
 xen/include/public/xen.h                     |    4 +-
 xen/include/xen/argo.h                       |   44 +
 xen/include/xen/event.h                      |    7 +
 xen/include/xen/hypercall.h                  |    9 +
 xen/include/xen/sched.h                      |    5 +
 xen/include/xlat.lst                         |    8 +
 xen/include/xsm/dummy.h                      |   25 +
 xen/include/xsm/xsm.h                        |   31 +
 xen/xsm/dummy.c                              |    6 +
 xen/xsm/flask/hooks.c                        |   41 +-
 xen/xsm/flask/policy/access_vectors          |   16 +
 xen/xsm/flask/policy/security_classes        |    1 +
 29 files changed, 2899 insertions(+), 8 deletions(-)
 create mode 100644 xen/common/argo.c
 create mode 100644 xen/common/compat/argo.c
 create mode 100644 xen/include/public/argo.h
 create mode 100644 xen/include/xen/argo.h

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v5 01/15] argo: Introduce the Kconfig option to govern inclusion of Argo
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 02/15] argo: introduce the argo_op hypercall boilerplate Christopher Clark
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Defines CONFIG_ARGO when enabled. Default: disabled.

When the Kconfig option is enabled, the Argo hypercall implementation
will be included, allowing use of the hypervisor-mediated interdomain
communication mechanism.

Argo is implemented for x86 and ARM hardware platforms.

Availability of the option depends on EXPERT and Argo is currently an
experimental feature.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v3 added Jan's Ack
v2 #01 feedback, Jan: replace def_bool/prompt with bool
v1 #02 feedback, Jan: default Kconfig off, use EXPERT, fix whitespace

 xen/common/Kconfig | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index a79cd40..0438462 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -202,6 +202,25 @@ config LATE_HWDOM
 
 	  If unsure, say N.
 
+config ARGO
+	bool "Argo: hypervisor-mediated interdomain communication" if EXPERT = "y"
+	---help---
+	  Enables a hypercall for domains to ask the hypervisor to perform
+	  data transfer of messages between domains.
+
+	  This allows communication channels to be established that do not
+	  require any shared memory between domains; the hypervisor is the
+	  entity that each domain interacts with. The hypervisor is able to
+	  enforce Mandatory Access Control policy over the communication.
+
+	  If XSM_FLASK is enabled, XSM policy can govern which domains may
+	  communicate via the Argo system.
+
+	  This feature does nothing if the "argo" boot parameter is not present.
+	  Argo is disabled at runtime by default.
+
+	  If unsure, say N.
+
 menu "Schedulers"
 	visible if EXPERT = "y"
 
-- 
2.7.4



* [PATCH v5 02/15] argo: introduce the argo_op hypercall boilerplate
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 01/15] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 03/15] argo: define argo_dprintk for subsystem debugging Christopher Clark
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Presence is gated upon CONFIG_ARGO.

Registers the hypercall previously reserved for this.
Takes 5 arguments, does nothing and returns -ENOSYS.

The hypercall ops will use fixed-size types, avoiding the need for a compat
ABI, so HYPERCALL, rather than COMPAT_CALL, is the correct macro for the
hypercall tables.

Even though handles will be used for (up to) two of the arguments to the
hypercall, there will be no need for any XLAT_* translation functions
because the referenced data structures have been constructed to be exactly
the same size and bit pattern on both 32-bit and 64-bit guests, and padded
to be integer multiples of 32 bits in size. This means that the same
copy_to_guest and copy_from_guest logic can be relied upon to perform as
required without any further intervention. Testing communication with 32
and 64 bit guests has confirmed this works as intended.
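
As an illustration of the layout property described above, here is a minimal
sketch. The structures are hypothetical and are not the Argo ABI; the point is
only that fixed-width fields plus explicit padding give a size and field
offsets that do not depend on the guest word size, and that this can be
checked at compile time.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical structures, NOT the real Argo ABI: every field has a fixed
 * width and padding is explicit, so 32-bit and 64-bit builds agree on the
 * layout.
 */
struct example_addr {
    uint32_t port;
    uint16_t domain_id;
    uint16_t pad;           /* explicit padding, no compiler-inserted hole */
};

struct example_header {
    uint32_t len;
    uint32_t message_type;
    struct example_addr source;
};

/* Compile-time checks: sizes and offsets are independent of word size. */
_Static_assert(sizeof(struct example_addr) == 8, "addr is 8 bytes");
_Static_assert(sizeof(struct example_header) == 16, "header is 16 bytes");
_Static_assert(offsetof(struct example_header, source) == 8,
               "no hidden padding before source");
_Static_assert(sizeof(struct example_header) % sizeof(uint32_t) == 0,
               "size is a multiple of 32 bits");
```

With a layout like this, the same copy logic can serve both guest widths,
because there is nothing to translate.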

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v2 Copyright line: add 2019
v2 feedback #3 Jan: drop "message" from argo_message_op
v2 feedback #3 Jan: add Acked-by
v1 feedback #15 Jan: handle upper-halves of hypercall args
v1 feedback #15 Jan: use unsigned where negative values impossible

 xen/arch/x86/guest/hypercall_page.S |  2 +-
 xen/arch/x86/hvm/hypercall.c        |  3 +++
 xen/arch/x86/hypercall.c            |  3 +++
 xen/arch/x86/pv/hypercall.c         |  3 +++
 xen/common/Makefile                 |  1 +
 xen/common/argo.c                   | 28 ++++++++++++++++++++++++++++
 xen/include/public/xen.h            |  2 +-
 xen/include/xen/hypercall.h         |  9 +++++++++
 8 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 xen/common/argo.c

diff --git a/xen/arch/x86/guest/hypercall_page.S b/xen/arch/x86/guest/hypercall_page.S
index fdd2e72..26afabf 100644
--- a/xen/arch/x86/guest/hypercall_page.S
+++ b/xen/arch/x86/guest/hypercall_page.S
@@ -59,7 +59,7 @@ DECLARE_HYPERCALL(sysctl)
 DECLARE_HYPERCALL(domctl)
 DECLARE_HYPERCALL(kexec_op)
 DECLARE_HYPERCALL(tmem_op)
-DECLARE_HYPERCALL(xc_reserved_op)
+DECLARE_HYPERCALL(argo_op)
 DECLARE_HYPERCALL(xenpmu_op)
 
 DECLARE_HYPERCALL(arch_0)
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 19d1263..b4eaac3 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -134,6 +134,9 @@ static const hypercall_table_t hvm_hypercall_table[] = {
 #ifdef CONFIG_TMEM
     HYPERCALL(tmem_op),
 #endif
+#ifdef CONFIG_ARGO
+    HYPERCALL(argo_op),
+#endif
     COMPAT_CALL(platform_op),
 #ifdef CONFIG_PV
     COMPAT_CALL(mmuext_op),
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 032de8f..93e7860 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -64,6 +64,9 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(domctl, 1),
     ARGS(kexec_op, 2),
     ARGS(tmem_op, 1),
+#ifdef CONFIG_ARGO
+    ARGS(argo_op, 5),
+#endif
     ARGS(xenpmu_op, 2),
 #ifdef CONFIG_HVM
     ARGS(hvm_op, 2),
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 5d11911..ed75053 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -77,6 +77,9 @@ const hypercall_table_t pv_hypercall_table[] = {
 #ifdef CONFIG_TMEM
     HYPERCALL(tmem_op),
 #endif
+#ifdef CONFIG_ARGO
+    HYPERCALL(argo_op),
+#endif
     HYPERCALL(xenpmu_op),
 #ifdef CONFIG_HVM
     HYPERCALL(hvm_op),
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 56fc201..59ac7de 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,3 +1,4 @@
+obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
 obj-y += bsearch.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
diff --git a/xen/common/argo.c b/xen/common/argo.c
new file mode 100644
index 0000000..d69ad7c
--- /dev/null
+++ b/xen/common/argo.c
@@ -0,0 +1,28 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Derived from v4v, the version 2 of v2v.
+ *
+ * Copyright (c) 2010, Citrix Systems
+ * Copyright (c) 2018-2019 BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/errno.h>
+#include <xen/guest_access.h>
+
+long
+do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
+           XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
+           unsigned long arg4)
+{
+    return -ENOSYS;
+}
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 1a56871..b3f6491 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -118,7 +118,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_domctl               36
 #define __HYPERVISOR_kexec_op             37
 #define __HYPERVISOR_tmem_op              38
-#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
+#define __HYPERVISOR_argo_op              39
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
 
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea..e2f61d6 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -136,6 +136,15 @@ do_tmem_op(
     XEN_GUEST_HANDLE_PARAM(tmem_op_t) uops);
 #endif
 
+#ifdef CONFIG_ARGO
+extern long do_argo_op(
+    unsigned int cmd,
+    XEN_GUEST_HANDLE_PARAM(void) arg1,
+    XEN_GUEST_HANDLE_PARAM(void) arg2,
+    unsigned long arg3,
+    unsigned long arg4);
+#endif
+
 extern long
 do_xenoprof_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg);
 
-- 
2.7.4



* [PATCH v5 03/15] argo: define argo_dprintk for subsystem debugging
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 01/15] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 02/15] argo: introduce the argo_op hypercall boilerplate Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

A convenience for working on development of the argo subsystem:
setting a #define variable enables additional debug messages.
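
For readers unfamiliar with the pattern, a minimal userspace sketch follows.
The names EXAMPLE_DEBUG and example_dprintk are hypothetical, printf stands in
for printk, and the sketch uses the standard C99 __VA_ARGS__ form rather than
the GNU named-variadic form used in the patch.

```c
#include <stdio.h>

/* Change this to #define EXAMPLE_DEBUG to enable the debug messages. */
#undef EXAMPLE_DEBUG

#ifdef EXAMPLE_DEBUG
/* The format must be a string literal so it can be joined to the prefix. */
#define example_dprintk(...) printf("example: " __VA_ARGS__)
#else
/* Disabled: expands to a no-op, so the arguments are never evaluated. */
#define example_dprintk(...) ((void)0)
#endif
```

One consequence of the disabled form is that arguments with side effects are
not evaluated, so debug calls should not be relied upon to change state.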

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
v3 added Roger's Reviewed-by
v3 added Jan's Ack
v2 #03 feedback, Jan: fix ifdef/define confusion error
v1 #04 feedback, Jan: fix dprintk implementation

 xen/common/argo.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index d69ad7c..6f782f7 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -19,6 +19,15 @@
 #include <xen/errno.h>
 #include <xen/guest_access.h>
 
+/* Change this to #define ARGO_DEBUG here to enable more debug messages */
+#undef ARGO_DEBUG
+
+#ifdef ARGO_DEBUG
+#define argo_dprintk(format, args...) printk("argo: " format, ## args )
+#else
+#define argo_dprintk(format, ... ) ((void)0)
+#endif
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
-- 
2.7.4



* [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (2 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 03/15] argo: define argo_dprintk for subsystem debugging Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21 17:55   ` Roger Pau Monné
  2019-01-21  9:59 ` [PATCH v5 05/15] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, James McKenzie, Eric Chanudet, Roger Pau Monne

Initialises basic data structures and performs teardown of argo state
for domain shutdown.

Inclusion of the Argo implementation is dependent on CONFIG_ARGO.

Introduces a new Xen command line parameter 'argo': bool to enable/disable
the argo hypercall. Defaults to disabled.
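
A minimal userspace sketch of how such a comma-separated option list can be
walked is shown below. It mirrors the loop shape of parse_argo in this patch,
but example_parse_bool is a hypothetical stand-in for Xen's parse_bool, and
the spellings it accepts are an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the token [s, s + len) is exactly the given word. */
static bool token_is(const char *s, size_t len, const char *word)
{
    return strlen(word) == len && strncmp(s, word, len) == 0;
}

/*
 * Hypothetical stand-in for Xen's parse_bool(): returns 1 or 0 for a
 * recognised token and -1 otherwise. The accepted spellings are assumed.
 */
static int example_parse_bool(const char *s, const char *e)
{
    size_t len = (size_t)(e - s);

    if ( token_is(s, len, "1") || token_is(s, len, "on") ||
         token_is(s, len, "true") )
        return 1;
    if ( token_is(s, len, "0") || token_is(s, len, "off") ||
         token_is(s, len, "false") )
        return 0;
    return -1;
}

/* Walk a comma-separated list; the last valid boolean wins. */
static int example_parse_argo(const char *s, bool *enabled)
{
    const char *ss;
    int val, rc = 0;

    do {
        /* Find the end of the current token: next comma or end of string. */
        ss = strchr(s, ',');
        if ( !ss )
            ss = strchr(s, '\0');

        val = example_parse_bool(s, ss);
        if ( val >= 0 )
            *enabled = (val != 0);
        else
            rc = -1;            /* remember the error, keep parsing */

        s = ss + 1;
    } while ( *ss );

    return rc;
}
```

Parsing continues after an unrecognised token, so one bad entry reports an
error without discarding the valid entries around it.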

New headers:
  public/argo.h: with definitions of addresses and ring structure, including
  indexes for atomic update for communication between domain and hypervisor.

  xen/argo.h: to expose the hooks for integration into domain lifecycle:
    argo_init: per-domain init of argo data structures for domain_create.
    argo_destroy: teardown for domain_destroy and the error exit
                  path of domain_create.
    argo_soft_reset: reset of domain state for domain_soft_reset.

Adds a new field to struct domain: struct argo_domain *argo;

In accordance with recent work on _domain_destroy, argo_destroy is
idempotent. It will tear down: all rings registered by this domain, all
rings where this domain is the single sender (ie. specified partner,
non-wildcard rings), and all pending notifications where this domain is
awaiting signal about available space in the rings of other domains.
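
The idempotency property can be sketched as follows. The types and function
names are hypothetical simplifications, and the real teardown of rings and
pending notifications is elided; the sketch only shows why a second call is
harmless.

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct example_argo_domain {
    unsigned int ring_count;
};

struct example_domain {
    struct example_argo_domain *argo;
};

/* Allocate zeroed per-domain state; returns 0 on success, -1 on failure. */
static int example_argo_init(struct example_domain *d)
{
    d->argo = calloc(1, sizeof(*d->argo));
    return d->argo ? 0 : -1;
}

/* Safe to call any number of times, including on a domain never set up. */
static void example_argo_destroy(struct example_domain *d)
{
    if ( !d->argo )
        return;                 /* already torn down, or never initialised */

    /* Ring and pending-notification teardown would happen here. */
    free(d->argo);
    d->argo = NULL;             /* makes the next call a no-op */
}
```

Clearing the pointer after freeing is what lets both domain_destroy and the
error exit path of domain_create call the same function safely.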

A count will be maintained of the number of rings that a domain has
registered in order to limit it below the fixed maximum limit defined here.

Macros are defined to verify the internal locking state within the argo
implementation. The macros are ASSERTed on entry to functions to validate
and document the required lock state prior to calling.

The hash function for the hashtables that hold ring state is derived from
the string hashing function djb2 (http://www.cse.yorku.ca/~oz/hash.html)
by Daniel J. Bernstein. Basic testing with a limited number of domains and
ports has shown reasonable distribution for the table size.
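
For reference, a minimal sketch of a djb2-style hash over ring id fields is
given below. The djb2 step (hash * 33 + value, seeded with 5381) is the
published algorithm; the way the three fields are folded in and the function
names are assumptions for illustration, not necessarily what this patch does.

```c
#include <stdint.h>

/* Power of two, mirroring ARGO_HASHTABLE_SIZE in this patch. */
#define EXAMPLE_TABLE_SIZE 32u

/* Classic djb2 step: hash = hash * 33 + value. */
static uint32_t djb2_step(uint32_t hash, uint32_t value)
{
    return ((hash << 5) + hash) + value;
}

/* Hypothetical: fold the three ring id fields into a bucket index. */
static unsigned int example_ring_hash(uint32_t aport, uint16_t domain_id,
                                      uint16_t partner_id)
{
    uint32_t hash = 5381;       /* djb2 initial value */

    hash = djb2_step(hash, aport);
    hash = djb2_step(hash, domain_id);
    hash = djb2_step(hash, partner_id);

    /* Reduce to a bucket index; valid because the size is a power of two. */
    return hash & (EXAMPLE_TABLE_SIZE - 1);
}
```

Unsigned arithmetic is used throughout so that overflow wraps in a defined
way, and the mask keeps the result within the table bounds.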

The software license on the public header is the BSD license, standard
procedure for the public Xen headers. The public header was originally
posted under a GPL license at: [1]:
https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html

The following ACK by Lars Kurth confirms that only Citrix employees
contributed to the header files in the series posted at [1], and that the
copyright of the files in question is therefore fully owned by Citrix. The
ACK also confirms that Citrix is happy for the header files to be published
under a BSD license in this series (which is based on [1]).

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Lars Kurth <lars.kurth@citrix.com>
Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
---
v4 Jan: amend the command line doc text referring to build configuration
v4 feedback: use standard data structures as per common code
v4 Jan: replace hash_index with djb2-derived hash algorithm
v4 Andrew: switch argo command line option to list argo=<bool>
v4: removed note to remove argo_destroy from domain_kill (test shows issue)
v4 #04 Roger: drop unneeded init of ring_count in argo_domain_init
v4 #04 Roger: replace if (ring_info->mfns) with ASSERTs in ring_unmap
v4 #04 Roger: rewrite the locking verification macros
v4 #04 Roger: make L1 lock description comment clearer about R(L1) and W(L1)
v4 Andrew: fix split of dprintk in ring_map_info across v4 commits

v3 #04 Andrew: use xzalloc for struct argo_domain in argo_init
v3 #04 Andrew: reference CONFIG_ARGO in the command line documentation
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #04 Andrew: don't truncate args do_argo_op printk
v3 #07 Jan: fix numeric entries in printk format strings
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld compat check for hypercall arg types
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 #04 Jan: reorder call to argo_init_domain in argo_init
v3 #04 Jan: ring_remove_mfns: zero count before freeing arrays
v3 #04 Jason/Roger: soft_reset: can assume reinit is ok if d->argo set
v3 #04 Roger: remove unused and confusing d->argo_lock
v3 #04 Roger: add simple inlines in xen/argo.h, drop ifdef CONFIG_ARGO
v3 #04 Roger: simpler return -EOPNOTSUPP in do_argo_op
v3 #04 Roger: add const to domain arg to ring_remove_info
v3 #04 Roger: use XFREE
v3 #04 Roger: newline fix in wildcard_pending_list_remove
v3 #04 Roger: mfn_mapping: void* instead of uint8_t*
v3 #04 Roger: drop npages struct member in argo_ring_info; use len
v3 #04 Roger/Jan: drop many fixed width types in internal structs
v3 #04 Jason/Jan: drop pad and fixed width type in pending_ent struct
v3 #04 Eric: moved ring_find_info from register op into this commit
v3 moved hash_index function, nospec include from register op to this commit
v3 moved XEN_ARGO_DOMID_ANY defn from register op into this commit
v3 added #include <xen/sched.h> to <xen/argo.h> for domain struct defn
v3 feedback #04 Roger: reorder #includes to alphabetical order
v3 Added Ross's Reviewed-by.

v2 rewrite locking explanation comment
v2 header copyright line now includes 2019
v2 self: use ring_info backpointer in pending_ent to maintain npending
v2 self: rename all_rings_remove_info to domain_rings_remove_all
v2 feedback Jan: drop cookie, implement teardown
v2 self: add npending to track number of pending entries per ring
v2 self: amend comment on locking; drop section comments
v2 cookie_eq: test low bits first and use likely on high bits
v2 self: OVERHAUL
v2 self: s/argo_pending_ent/pending_ent/g
v2 self: drop pending_remove_ent, inline at single call site
v1 feedback Roger, Jan: drop argo prefix on static functions
v2 #4 Lars: add Acked-by and details to commit message.
v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
v2 bugfix: xsm use in soft-reset prior to introduction
v2 feedback #9 Jan: drop 'message' from do_argo_message_op
v1 #5 feedback Paul: init/destroy unsigned, brackets and whitespace fixes
v1 #5 feedback Paul: Use mfn_eq for comparing mfns.
v1 #5 feedback Paul: init/destroy : use currd
v1 #6 (#5) feedback Jan: init/destroy: s/ENOSYS/EOPNOTSUPP/
v1 #6 feedback Paul: Folded patch 6 into patch 5.
v1 #6 feedback Jan: drop opt_argo_enabled initializer
v1 #6 feedback Jan: s/ENOSYS/EOPNOTSUPP/g and drop useless dprintk
v1. #5 feedback Paul: change the license on public header to BSD
- ack from Lars at Citrix.
v1. self, Jan: drop unnecessary xen include from sched.h
v1. self, Jan: drop inclusion of public argo.h in private one
v1. self, Jan: add include of public argo.h to argo.c
v1. self, Jan: drop fwd decl of argo_domain in priv header
v1. Paul/self/Jan: add data structures to xlat.lst and compat/argo.h to Makefile
v1. self: removed allocation of event channel since switching to VIRQ
v1. self: drop types.h include from private argo.h
v1: reorder public argo include position
v1: #13 feedback Jan: public namespace: prefix with xen
v1: self: rename pending ent "id" to "domain_id"
v1: self: add domain_cookie to ent struct
v1. #15 feedback Jan: make cmd unsigned
v1. #15 feedback Jan: make i loop variable unsigned
v1: self: adjust dprintks in init, destroy
v1: #18 feedback Jan: meld max ring count limit
v1: self: use type not struct in public defn, affects compat gen header
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1: add comment explaining the 'magic' field
v1: self + Jan feedback: implement soft reset
v1: feedback #13 Roger: use ASSERT_UNREACHABLE

 docs/misc/xen-command-line.pandoc |  15 +
 xen/common/Makefile               |   2 +-
 xen/common/argo.c                 | 617 +++++++++++++++++++++++++++++++++++++-
 xen/common/compat/argo.c          |  23 ++
 xen/common/domain.c               |   9 +
 xen/include/Makefile              |   1 +
 xen/include/public/argo.h         |  64 ++++
 xen/include/xen/argo.h            |  44 +++
 xen/include/xen/sched.h           |   5 +
 xen/include/xlat.lst              |   2 +
 10 files changed, 780 insertions(+), 2 deletions(-)
 create mode 100644 xen/common/compat/argo.c
 create mode 100644 xen/include/public/argo.h
 create mode 100644 xen/include/xen/argo.h

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index d39bcee..93f41bc 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -182,6 +182,21 @@ Permit Xen to use "Always Running APIC Timer" support on compatible hardware
 in combination with cpuidle.  This option is only expected to be useful for
 developers wishing Xen to fall back to older timing methods on newer hardware.
 
+### argo
+    = List of [ <bool> ]
+
+Controls for the Argo hypervisor-mediated interdomain communication service.
+
+The functionality that this option controls is only available when Xen has been
+compiled with the build setting for Argo enabled in the build configuration.
+
+Argo is an interdomain communication mechanism, where Xen acts as the central
+point of authority.  Guests may register memory rings to receive messages,
+query the status of other domains, and send messages by hypercall, all subject
+to appropriate auditing by Xen.
+
+*   An overall boolean acts as a global control.  Argo is disabled by default.
+
 ### asid (x86)
 > `= <boolean>`
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 59ac7de..75af32e 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -71,7 +71,7 @@ lzo-y := lzo
 lzo-$(CONFIG_TMEM) :=
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma $(lzo-y) unlzo unlz4 earlycpio,$(n).init.o)
 
-obj-$(CONFIG_COMPAT) += $(addprefix compat/,domain.o kernel.o memory.o multicall.o xlat.o)
+obj-$(CONFIG_COMPAT) += $(addprefix compat/,argo.o domain.o kernel.o memory.o multicall.o xlat.o)
 
 tmem-y := tmem.o tmem_xen.o tmem_control.o
 tmem-$(CONFIG_COMPAT) += compat/tmem_xen.o
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 6f782f7..12b3ec2 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -16,8 +16,255 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
+#include <xen/argo.h>
+#include <xen/domain.h>
+#include <xen/domain_page.h>
 #include <xen/errno.h>
+#include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/nospec.h>
+#include <xen/sched.h>
+#include <xen/time.h>
+
+#include <public/argo.h>
+
+DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+
+static bool __read_mostly opt_argo;
+
+static int __init parse_argo(const char *s)
+{
+    const char *ss;
+    int val, rc = 0;
+
+    do {
+        ss = strchr(s, ',');
+        if ( !ss )
+            ss = strchr(s, '\0');
+
+        if ( (val = parse_bool(s, ss)) >= 0 )
+            opt_argo = val;
+        else
+            rc = -EINVAL;
+
+        s = ss + 1;
+    } while ( *ss );
+
+    return rc;
+}
+custom_param("argo", parse_argo);
+
+typedef struct argo_ring_id
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    domid_t domain_id;
+} argo_ring_id;
+
+/* Data about a domain's own ring that it has registered */
+struct argo_ring_info
+{
+    /* next node in the hash, protected by rings_L2 */
+    struct list_head node;
+    /* this ring's id, protected by rings_L2 */
+    struct argo_ring_id id;
+    /* L3, the ring_info lock: protects the members of this struct below */
+    spinlock_t L3_lock;
+    /* length of the ring, protected by L3 */
+    unsigned int len;
+    /* number of pages translated into mfns, protected by L3 */
+    unsigned int nmfns;
+    /* cached tx pointer location, protected by L3 */
+    unsigned int tx_ptr;
+    /* mapped ring pages protected by L3 */
+    void **mfn_mapping;
+    /* list of mfns of guest ring, protected by L3 */
+    mfn_t *mfns;
+    /* list of struct pending_ent for this ring, protected by L3 */
+    struct list_head pending;
+    /* number of pending entries queued for this ring, protected by L3 */
+    unsigned int npending;
+};
+
+/* Data about a single-sender ring, held by the sender (partner) domain */
+struct argo_send_info
+{
+    /* next node in the hash, protected by send_L2 */
+    struct list_head node;
+    /* this ring's id, protected by send_L2 */
+    struct argo_ring_id id;
+};
+
+/* A space-available notification that is awaiting sufficient space */
+struct pending_ent
+{
+    /* List node within argo_ring_info's pending list */
+    struct list_head node;
+    /*
+     * List node within argo_domain's wildcard_pend_list. Only used if the
+     * ring is one with a wildcard partner (ie. that any domain may send to)
+     * to enable cancelling signals on wildcard rings on domain destroy.
+     */
+    struct list_head wildcard_node;
+    /*
+     * Pointer to the ring_info that this ent pertains to. Used to ensure that
+     * ring_info->npending is decremented when ents for wildcard rings are
+     * cancelled for domain destroy.
+     * Caution: Must hold the correct locks before accessing ring_info via this.
+     */
+    struct argo_ring_info *ring_info;
+    /* minimum ring space available that this signal is waiting upon */
+    unsigned int len;
+    /* domain to be notified when space is available */
+    domid_t domain_id;
+};
+
+/*
+ * The value of the argo element in a struct domain is
+ * protected by L1_global_argo_rwlock
+ */
+#define ARGO_HASHTABLE_SIZE 32
+struct argo_domain
+{
+    /* rings_L2 */
+    rwlock_t rings_L2_rwlock;
+    /*
+     * Hash table of argo_ring_info about rings this domain has registered.
+     * Protected by rings_L2.
+     */
+    struct list_head ring_hash[ARGO_HASHTABLE_SIZE];
+    /* Counter of rings registered by this domain. Protected by rings_L2. */
+    unsigned int ring_count;
+
+    /* send_L2 */
+    spinlock_t send_L2_lock;
+    /*
+     * Hash table of argo_send_info about rings other domains have registered
+     * for this domain to send to. Single partner, non-wildcard rings.
+     * Protected by send_L2.
+     */
+    struct list_head send_hash[ARGO_HASHTABLE_SIZE];
+
+    /* wildcard_L2 */
+    spinlock_t wildcard_L2_lock;
+    /*
+     * List of pending space-available signals for this domain about wildcard
+     * rings registered by other domains. Protected by wildcard_L2.
+     */
+    struct list_head wildcard_pend_list;
+};
+
+/*
+ * Locking is organized as follows:
+ *
+ * Terminology: R(<lock>) means taking a read lock on the specified lock;
+ *              W(<lock>) means taking a write lock on it.
+ *
+ * == L1 : The global read/write lock: L1_global_argo_rwlock
+ * Protects the argo elements of all struct domain *d in the system.
+ *
+ * R(L1) does not protect any of the elements of d->argo; it protects their
+ * addresses. W(L1) protects those and more since it implies W on all the lower
+ * level locks - see the notes on those locks below.
+ *
+ * The destruction of an argo-enabled domain, which must have a non-NULL d->argo
+ * pointer, will need to free that d->argo pointer, which requires W(L1).
+ * Since holding R(L1) will block acquiring W(L1), it will ensure that
+ * no domain pointers that argo is interested in become invalid while either
+ * W(L1) or R(L1) is held.
+ */
+
+static DEFINE_RWLOCK(L1_global_argo_rwlock); /* L1 */
+
+/*
+ * == rings_L2 : The per-domain ring hash lock: d->argo->rings_L2_rwlock
+ *
+ * Holding a read lock on rings_L2 protects the ring hash table and
+ * the elements in the hash_table d->argo->ring_hash, and
+ * the node and id fields in struct argo_ring_info in the
+ * hash table.
+ * Holding a write lock on rings_L2 protects all of the elements of all the
+ * struct argo_ring_info belonging to this domain.
+ *
+ * To take rings_L2 you must already have R(L1). W(L1) implies W(rings_L2) and
+ * L3.
+ *
+ * == L3 : The individual ring_info lock: ring_info->L3_lock
+ *
+ * Protects all the fields within the argo_ring_info, aside from the ones that
+ * rings_L2 already protects: node, id, lock.
+ *
+ * To acquire L3 you must already have R(rings_L2). W(rings_L2) implies L3.
+ *
+ * == send_L2 : The per-domain single-sender partner rings lock:
+ *              d->argo->send_L2_lock
+ *
+ * Protects the per-domain send hash table : d->argo->send_hash
+ * and the elements in the hash table, and the node and id fields
+ * in struct argo_send_info in the hash table.
+ *
+ * To take send_L2, you must already have R(L1). W(L1) implies send_L2.
+ * Do not attempt to acquire a rings_L2 on any domain after taking and while
+ * holding a send_L2 lock -- acquire the rings_L2 (if one is needed) beforehand.
+ *
+ * == wildcard_L2 : The per-domain wildcard pending list lock:
+ *                  d->argo->wildcard_L2_lock
+ *
+ * Protects the per-domain list of outstanding signals for space availability
+ * on wildcard rings.
+ *
+ * To take wildcard_L2, you must already have R(L1). W(L1) implies wildcard_L2.
+ * No other locks are acquired after obtaining wildcard_L2.
+ */
+
+/*
+ * Lock state validation macros
+ *
+ * These macros encode the logic to verify that the locking has adhered to the
+ * locking discipline above.
+ * eg. On entry to logic that requires holding at least R(rings_L2), this:
+ *      ASSERT(LOCKING_Read_rings_L2(d));
+ *
+ * checks that the lock state is sufficient, validating that one of the
+ * following must be true when executed:       R(rings_L2) && R(L1)
+ *                                        or:  W(rings_L2) && R(L1)
+ *                                        or:  W(L1)
+ *
+ * The LOCKING macros defined below here are for use at verification points.
+ */
+#define LOCKING_Write_L1 (rw_is_write_locked(&L1_global_argo_rwlock))
+/*
+ * While LOCKING_Read_L1 will return true even if the lock is write-locked,
+ * that's OK because everywhere that a Read lock is needed with these macros,
+ * holding a Write lock there instead is OK too: we're checking that _at least_
+ * the specified level of locks are held.
+ */
+#define LOCKING_Read_L1 (rw_is_locked(&L1_global_argo_rwlock))
+
+#define LOCKING_Write_rings_L2(d) \
+    ((LOCKING_Read_L1 && rw_is_write_locked(&(d)->argo->rings_L2_rwlock)) || \
+     LOCKING_Write_L1)
+/*
+ * Skip checking LOCKING_Write_rings_L2(d) within this LOCKING_Read_rings_L2
+ * definition because the first clause that is testing R(L1) && R(L2) will also
+ * return true if R(L1) && W(L2) is true, because of the way that rw_is_locked
+ * behaves. This results in a slightly shorter and faster implementation.
+ */
+#define LOCKING_Read_rings_L2(d) \
+    ((LOCKING_Read_L1 && rw_is_locked(&(d)->argo->rings_L2_rwlock)) || \
+     LOCKING_Write_L1)
+/*
+ * Skip checking LOCKING_Write_L1 within this LOCKING_L3 definition because
+ * LOCKING_Write_rings_L2(d) will return true for that condition.
+ */
+#define LOCKING_L3(d, r) \
+    ((LOCKING_Read_L1 && rw_is_locked(&(d)->argo->rings_L2_rwlock) \
+      && spin_is_locked(&(r)->L3_lock)) || LOCKING_Write_rings_L2(d))
+
+#define LOCKING_send_L2(d) \
+    ((LOCKING_Read_L1 && spin_is_locked(&(d)->argo->send_L2_lock)) || \
+     LOCKING_Write_L1)
 
 /* Change this to #define ARGO_DEBUG here to enable more debug messages */
 #undef ARGO_DEBUG
@@ -28,10 +275,378 @@
 #define argo_dprintk(format, ... ) ((void)0)
 #endif
 
+/*
+ * This hash function is used to distribute rings within the per-domain
+ * hash tables (d->argo->ring_hash and d->argo->send_hash). The hash table
+ * will provide a struct if a match is found with an 'argo_ring_id' key:
+ * ie. the key is a (domain id, argo port, partner domain id) tuple.
+ * The algorithm approximates the string hashing function 'djb2'.
+ */
+static unsigned int
+hash_index(const struct argo_ring_id *id)
+{
+    unsigned int hash = 5381; /* prime constant from djb2 */
+
+    /* For each input: hash = hash * 33 + <new input character value> */
+    hash = ((hash << 5) + hash) +  (id->aport            & 0xff);
+    hash = ((hash << 5) + hash) + ((id->aport      >> 8) & 0xff);
+    hash = ((hash << 5) + hash) + ((id->aport     >> 16) & 0xff);
+    hash = ((hash << 5) + hash) + ((id->aport     >> 24) & 0xff);
+    hash = ((hash << 5) + hash) +  (id->domain_id        & 0xff);
+    hash = ((hash << 5) + hash) + ((id->domain_id  >> 8) & 0xff);
+    hash = ((hash << 5) + hash) +  (id->partner_id       & 0xff);
+    hash = ((hash << 5) + hash) + ((id->partner_id >> 8) & 0xff);
+
+    /*
+     * Since ARGO_HASHTABLE_SIZE is small, use higher-order bits of the
+     * hash to contribute to the lower-order bits before masking off.
+     */
+    return (hash ^ (hash >> 15)) & (ARGO_HASHTABLE_SIZE - 1);
+}
+
+static struct argo_ring_info *
+find_ring_info(const struct domain *d, const struct argo_ring_id *id)
+{
+    struct list_head *cursor, *bucket;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    /* List is not modified here. Search and return the match if found. */
+    bucket = &d->argo->ring_hash[hash_index(id)];
+
+    for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )
+    {
+        struct argo_ring_info *ring_info =
+            list_entry(cursor, struct argo_ring_info, node);
+        const struct argo_ring_id *cmpid = &ring_info->id;
+
+        if ( cmpid->aport == id->aport &&
+             cmpid->domain_id == id->domain_id &&
+             cmpid->partner_id == id->partner_id )
+        {
+            argo_dprintk("found ring_info for ring(%u:%x %u)\n",
+                         id->domain_id, id->aport, id->partner_id);
+            return ring_info;
+        }
+    }
+    argo_dprintk("no ring_info for ring(%u:%x %u)\n",
+                 id->domain_id, id->aport, id->partner_id);
+
+    return NULL;
+}
+
+static void
+ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    unsigned int i;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( !ring_info->mfn_mapping )
+        return;
+
+    ASSERT(!ring_info->nmfns || ring_info->mfns);
+
+    for ( i = 0; i < ring_info->nmfns; i++ )
+    {
+        if ( !ring_info->mfn_mapping[i] )
+            continue;
+
+        ASSERT(!mfn_eq(ring_info->mfns[i], INVALID_MFN));
+        argo_dprintk(XENLOG_ERR "argo: unmapping page %"PRI_mfn" from %p\n",
+                     mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+
+        unmap_domain_page_global(ring_info->mfn_mapping[i]);
+        ring_info->mfn_mapping[i] = NULL;
+    }
+}
+
+static void
+wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    ASSERT(LOCKING_Read_L1);
+
+    if ( d->argo )
+    {
+        spin_lock(&d->argo->wildcard_L2_lock);
+        list_del(&ent->wildcard_node);
+        spin_unlock(&d->argo->wildcard_L2_lock);
+    }
+    put_domain(d);
+}
+
+static void
+pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    struct list_head *ring_pending = &ring_info->pending;
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    /* Delete all pending notifications from this ring's list. */
+    while ( !list_empty(ring_pending) )
+    {
+        ent = list_entry(ring_pending->next, struct pending_ent, node);
+
+        /* For wildcard rings, remove each from their wildcard list too. */
+        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+            wildcard_pending_list_remove(ent->domain_id, ent);
+        list_del(&ent->node);
+        xfree(ent);
+    }
+    ring_info->npending = 0;
+}
+
+static void
+wildcard_rings_pending_remove(struct domain *d)
+{
+    struct list_head *wildcard_head;
+
+    ASSERT(LOCKING_Write_L1);
+
+    /* Delete all pending signals to the domain about wildcard rings. */
+    wildcard_head = &d->argo->wildcard_pend_list;
+
+    while ( !list_empty(wildcard_head) )
+    {
+        struct pending_ent *ent =
+            list_entry(wildcard_head->next, struct pending_ent, wildcard_node);
+
+        /*
+         * The ent->node deleted here, and the npending value decreased,
+         * belong to the ring_info of another domain, which is why this
+         * function requires holding W(L1):
+         * it implies the L3 lock that protects that ring_info struct.
+         */
+        ent->ring_info->npending--;
+        list_del(&ent->node);
+        list_del(&ent->wildcard_node);
+        xfree(ent);
+    }
+}
+
+static void
+ring_remove_mfns(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    unsigned int i;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    if ( !ring_info->mfns )
+        return;
+
+    if ( !ring_info->mfn_mapping )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    ring_unmap(d, ring_info);
+
+    for ( i = 0; i < ring_info->nmfns; i++ )
+        if ( !mfn_eq(ring_info->mfns[i], INVALID_MFN) )
+            put_page_and_type(mfn_to_page(ring_info->mfns[i]));
+
+    ring_info->nmfns = 0;
+    XFREE(ring_info->mfns);
+    XFREE(ring_info->mfn_mapping);
+}
+
+static void
+ring_remove_info(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    pending_remove_all(d, ring_info);
+    list_del(&ring_info->node);
+    ring_remove_mfns(d, ring_info);
+    xfree(ring_info);
+}
+
+static void
+domain_rings_remove_all(struct domain *d)
+{
+    unsigned int i;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
+    {
+        struct list_head *bucket = &d->argo->ring_hash[i];
+
+        while ( !list_empty(bucket) )
+        {
+            ring_info = list_entry(bucket->next, struct argo_ring_info, node);
+            ring_remove_info(d, ring_info);
+        }
+    }
+    d->argo->ring_count = 0;
+}
+
+/*
+ * Tear down all rings of other domains where src_d domain is the partner.
+ * (ie. it is the single domain that can send to those rings.)
+ * This will also cancel any pending notifications about those rings.
+ */
+static void
+partner_rings_remove(struct domain *src_d)
+{
+    unsigned int i;
+    struct argo_send_info *send_info;
+    struct argo_ring_info *ring_info;
+    struct domain *dst_d;
+
+    ASSERT(LOCKING_Write_L1);
+
+    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
+    {
+        struct list_head *bucket = &src_d->argo->send_hash[i];
+
+        /* Remove all ents from the send list. Take each off their ring list. */
+        while ( !list_empty(bucket) )
+        {
+            send_info = list_entry(bucket->next, struct argo_send_info, node);
+
+            dst_d = get_domain_by_id(send_info->id.domain_id);
+            if ( dst_d && dst_d->argo )
+            {
+                ring_info = find_ring_info(dst_d, &send_info->id);
+                if ( ring_info )
+                {
+                    ring_remove_info(dst_d, ring_info);
+                    dst_d->argo->ring_count--;
+                }
+                else
+                    ASSERT_UNREACHABLE();
+            }
+            else
+                ASSERT_UNREACHABLE();
+
+            if ( dst_d )
+                put_domain(dst_d);
+
+            list_del(&send_info->node);
+            xfree(send_info);
+        }
+    }
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
            unsigned long arg4)
 {
-    return -ENOSYS;
+    long rc = -EFAULT;
+
+    argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
+                 (void *)arg1.p, (void *)arg2.p, arg3, arg4);
+
+    if ( unlikely(!opt_argo) )
+        return -EOPNOTSUPP;
+
+    switch ( cmd )
+    {
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    argo_dprintk("<-do_argo_op(%u)=%ld\n", cmd, rc);
+
+    return rc;
+}
+
+static void
+argo_domain_init(struct argo_domain *argo)
+{
+    unsigned int i;
+
+    rwlock_init(&argo->rings_L2_rwlock);
+    spin_lock_init(&argo->send_L2_lock);
+    spin_lock_init(&argo->wildcard_L2_lock);
+
+    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
+    {
+        INIT_LIST_HEAD(&argo->ring_hash[i]);
+        INIT_LIST_HEAD(&argo->send_hash[i]);
+    }
+    INIT_LIST_HEAD(&argo->wildcard_pend_list);
+}
+
+int
+argo_init(struct domain *d)
+{
+    struct argo_domain *argo;
+
+    if ( !opt_argo )
+    {
+        argo_dprintk("argo disabled, domid: %u\n", d->domain_id);
+        return 0;
+    }
+
+    argo_dprintk("init: domid: %u\n", d->domain_id);
+
+    argo = xzalloc(struct argo_domain);
+    if ( !argo )
+        return -ENOMEM;
+
+    argo_domain_init(argo);
+
+    write_lock(&L1_global_argo_rwlock);
+
+    d->argo = argo;
+
+    write_unlock(&L1_global_argo_rwlock);
+
+    return 0;
+}
+
+void
+argo_destroy(struct domain *d)
+{
+    BUG_ON(!d->is_dying);
+
+    write_lock(&L1_global_argo_rwlock);
+
+    argo_dprintk("destroy: domid %u d->argo=%p\n", d->domain_id, d->argo);
+
+    if ( d->argo )
+    {
+        domain_rings_remove_all(d);
+        partner_rings_remove(d);
+        wildcard_rings_pending_remove(d);
+        XFREE(d->argo);
+    }
+
+    write_unlock(&L1_global_argo_rwlock);
+}
+
+void
+argo_soft_reset(struct domain *d)
+{
+    write_lock(&L1_global_argo_rwlock);
+
+    argo_dprintk("soft reset d=%u d->argo=%p\n", d->domain_id, d->argo);
+
+    if ( d->argo )
+    {
+        domain_rings_remove_all(d);
+        partner_rings_remove(d);
+        wildcard_rings_pending_remove(d);
+
+        /*
+         * Since opt_argo cannot change at runtime, if d->argo is non-NULL then
+         * opt_argo must be true, and we can assume that init is allowed to
+         * proceed again here.
+         */
+        argo_domain_init(d->argo);
+    }
+
+    write_unlock(&L1_global_argo_rwlock);
 }
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
new file mode 100644
index 0000000..8edb9e8
--- /dev/null
+++ b/xen/common/compat/argo.c
@@ -0,0 +1,23 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/lib.h>
+
+#include <public/argo.h>
+
+#include <compat/argo.h>
+
+CHECK_argo_addr;
+CHECK_argo_ring;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index c623dae..7470cd9 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -32,6 +32,7 @@
 #include <xen/grant_table.h>
 #include <xen/xenoprof.h>
 #include <xen/irq.h>
+#include <xen/argo.h>
 #include <asm/debugger.h>
 #include <asm/p2m.h>
 #include <asm/processor.h>
@@ -277,6 +278,8 @@ static void _domain_destroy(struct domain *d)
 
     xfree(d->pbuf);
 
+    argo_destroy(d);
+
     rangeset_domain_destroy(d);
 
     free_cpumask_var(d->dirty_cpumask);
@@ -445,6 +448,9 @@ struct domain *domain_create(domid_t domid,
             goto fail;
         init_status |= INIT_gnttab;
 
+        if ( (err = argo_init(d)) != 0 )
+            goto fail;
+
         err = -ENOMEM;
 
         d->pbuf = xzalloc_array(char, DOMAIN_PBUF_SIZE);
@@ -717,6 +723,7 @@ int domain_kill(struct domain *d)
         if ( d->is_dying != DOMDYING_alive )
             return domain_kill(d);
         d->is_dying = DOMDYING_dying;
+        argo_destroy(d);
         evtchn_destroy(d);
         gnttab_release_mappings(d);
         tmem_destroy(d->tmem_client);
@@ -1175,6 +1182,8 @@ int domain_soft_reset(struct domain *d)
 
     grant_table_warn_active_grants(d);
 
+    argo_soft_reset(d);
+
     for_each_vcpu ( d, v )
     {
         set_xen_guest_handle(runstate_guest(v), NULL);
diff --git a/xen/include/Makefile b/xen/include/Makefile
index f7895e4..3d14532 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -5,6 +5,7 @@ ifneq ($(CONFIG_COMPAT),)
 compat-arch-$(CONFIG_X86) := x86_32
 
 headers-y := \
+    compat/argo.h \
     compat/callback.h \
     compat/elfnote.h \
     compat/event_channel.h \
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
new file mode 100644
index 0000000..530bb82
--- /dev/null
+++ b/xen/include/public/argo.h
@@ -0,0 +1,64 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Derived from v4v, the version 2 of v2v.
+ *
+ * Copyright (c) 2010, Citrix Systems
+ * Copyright (c) 2018-2019, BAE Systems
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __XEN_PUBLIC_ARGO_H__
+#define __XEN_PUBLIC_ARGO_H__
+
+#include "xen.h"
+
+#define XEN_ARGO_DOMID_ANY       DOMID_INVALID
+
+/* Fixed-width type for "argo port" number. Nothing to do with evtchns. */
+typedef uint32_t xen_argo_port_t;
+
+typedef struct xen_argo_addr
+{
+    xen_argo_port_t aport;
+    domid_t domain_id;
+    uint16_t pad;
+} xen_argo_addr_t;
+
+typedef struct xen_argo_ring
+{
+    /* Guests should use atomic operations to access rx_ptr */
+    uint32_t rx_ptr;
+    /* Guests should use atomic operations to access tx_ptr */
+    uint32_t tx_ptr;
+    /*
+     * Header space reserved for later use. Align the start of the ring to a
+     * multiple of the message slot size.
+     */
+    uint8_t reserved[56];
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t ring[];
+#elif defined(__GNUC__)
+    uint8_t ring[0];
+#endif
+} xen_argo_ring_t;
+
+#endif
diff --git a/xen/include/xen/argo.h b/xen/include/xen/argo.h
new file mode 100644
index 0000000..2ba7e5c
--- /dev/null
+++ b/xen/include/xen/argo.h
@@ -0,0 +1,44 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __XEN_ARGO_H__
+#define __XEN_ARGO_H__
+
+#include <xen/sched.h>
+
+#ifdef CONFIG_ARGO
+
+int argo_init(struct domain *d);
+void argo_destroy(struct domain *d);
+void argo_soft_reset(struct domain *d);
+
+#else /* !CONFIG_ARGO */
+
+static inline int argo_init(struct domain *d)
+{
+    return 0;
+}
+
+static inline void argo_destroy(struct domain *d)
+{
+}
+
+static inline void argo_soft_reset(struct domain *d)
+{
+}
+
+#endif
+
+#endif
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 4956a77..6e69afa 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -490,6 +490,11 @@ struct domain
         unsigned int guest_request_enabled       : 1;
         unsigned int guest_request_sync          : 1;
     } monitor;
+
+#ifdef CONFIG_ARGO
+    /* Argo interdomain communication support */
+    struct argo_domain *argo;
+#endif
 };
 
 /* Protect updates/reads (resp.) of domain_list and domain_hash. */
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 5273320..9f616e4 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -148,3 +148,5 @@
 ?	flask_setenforce		xsm/flask_op.h
 !	flask_sid_context		xsm/flask_op.h
 ?	flask_transition		xsm/flask_op.h
+?	argo_addr			argo.h
+?	argo_ring			argo.h
-- 
2.7.4
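
A minimal standalone sketch of the hash_index() logic from the patch above,
reproduced outside of Xen to show how the djb2-style mixing and the final
fold select a bucket. struct demo_ring_id and DEMO_HASHTABLE_SIZE are
hypothetical stand-ins for struct argo_ring_id and ARGO_HASHTABLE_SIZE; the
field widths (32-bit port, 16-bit domain ids) are assumed from the shifts
used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to be a power of two so that the final mask is a valid modulo. */
#define DEMO_HASHTABLE_SIZE 32

/* Hypothetical stand-in for struct argo_ring_id. */
struct demo_ring_id {
    uint32_t aport;
    uint16_t domain_id;
    uint16_t partner_id;
};

/* One djb2 step: hash = hash * 33 + (low byte of the input). */
static unsigned int demo_djb2_step(unsigned int hash, unsigned int byte)
{
    return ((hash << 5) + hash) + (byte & 0xff);
}

/* Feed each byte of the key, least significant first, then fold and mask. */
static unsigned int demo_hash_index(const struct demo_ring_id *id)
{
    unsigned int hash = 5381; /* djb2 starting constant */
    unsigned int i;

    assert(id);

    for ( i = 0; i < 4; i++ )
        hash = demo_djb2_step(hash, id->aport >> (8 * i));
    for ( i = 0; i < 2; i++ )
        hash = demo_djb2_step(hash, id->domain_id >> (8 * i));
    for ( i = 0; i < 2; i++ )
        hash = demo_djb2_step(hash, id->partner_id >> (8 * i));

    /* Let higher-order bits influence the low bits that survive the mask. */
    return (hash ^ (hash >> 15)) & (DEMO_HASHTABLE_SIZE - 1);
}
```

Calling demo_hash_index() on two ids with identical fields always yields the
same bucket, and the result is always below DEMO_HASHTABLE_SIZE, which is
what the per-domain hash table lookups rely on.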


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v5 05/15] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (3 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 06/15] xen/arm: introduce guest_handle_for_field() Christopher Clark
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

EMSGSIZE: Argo's sendv operation will return EMSGSIZE when an excess amount
of data, across all iovs, has been supplied, exceeding either the statically
configured maximum size of a transmittable message, or the (variable) size
of the ring registered by the destination domain.

ECONNREFUSED: Argo's register operation will return ECONNREFUSED if a ring
is being registered to communicate with a specific remote domain that does
exist but is not argo-enabled.

These codes are described by POSIX here:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
    EMSGSIZE     : "Message too large"
    ECONNREFUSED : "Connection refused".

The numeric values assigned to each are taken from Linux, as is the case
for the existing error codes.
    EMSGSIZE     : 90
    ECONNREFUSED : 111
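
The errno header is built from XEN_ERRNO(name, value) entries that the
includer expands as needed. The following is a reduced sketch of that
X-macro pattern, not the header itself: DEMO_ERRNO_LIST, the DEMO_ prefix
and both helper functions are hypothetical names, and the list form used
here is an assumption chosen to keep the example in a single file.

```c
#include <assert.h>

/*
 * Hypothetical list macro: the two codes added by this patch plus one
 * existing neighbour, with the numeric values quoted above.
 */
#define DEMO_ERRNO_LIST(X) \
    X(EMSGSIZE,      90)   \
    X(EOPNOTSUPP,    95)   \
    X(ECONNREFUSED, 111)

/* First expansion: an enum of prefixed constants. */
enum demo_errno {
#define DEMO_AS_ENUM(name, value) DEMO_##name = value,
    DEMO_ERRNO_LIST(DEMO_AS_ENUM)
#undef DEMO_AS_ENUM
};

/* Second expansion of the same list: map a value back to its name. */
static const char *demo_errno_name(int err)
{
    switch ( err )
    {
#define DEMO_AS_CASE(name, value) case value: return #name;
    DEMO_ERRNO_LIST(DEMO_AS_CASE)
#undef DEMO_AS_CASE
    default:
        return "unknown";
    }
}

/* Hypercall-style return value: the negated positive error code. */
static long demo_errno_to_rc(enum demo_errno err)
{
    assert(err > 0);
    return -(long)err;
}
```

With this shape, adding a new code is a single line in the list, and every
expansion of the list picks it up without further edits.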

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/include/public/errno.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/public/errno.h b/xen/include/public/errno.h
index 305c112..e1d02fc 100644
--- a/xen/include/public/errno.h
+++ b/xen/include/public/errno.h
@@ -102,6 +102,7 @@ XEN_ERRNO(EILSEQ,	84)	/* Illegal byte sequence */
 XEN_ERRNO(ERESTART,	85)	/* Interrupted system call should be restarted */
 #endif
 XEN_ERRNO(ENOTSOCK,	88)	/* Socket operation on non-socket */
+XEN_ERRNO(EMSGSIZE,	90)	/* Message too large. */
 XEN_ERRNO(EOPNOTSUPP,	95)	/* Operation not supported on transport endpoint */
 XEN_ERRNO(EADDRINUSE,	98)	/* Address already in use */
 XEN_ERRNO(EADDRNOTAVAIL, 99)	/* Cannot assign requested address */
@@ -109,6 +110,7 @@ XEN_ERRNO(ENOBUFS,	105)	/* No buffer space available */
 XEN_ERRNO(EISCONN,	106)	/* Transport endpoint is already connected */
 XEN_ERRNO(ENOTCONN,	107)	/* Transport endpoint is not connected */
 XEN_ERRNO(ETIMEDOUT,	110)	/* Connection timed out */
+XEN_ERRNO(ECONNREFUSED,	111)	/* Connection refused */
 
 #undef XEN_ERRNO
 #endif /* XEN_ERRNO */
-- 
2.7.4



* [PATCH v5 06/15] xen/arm: introduce guest_handle_for_field()
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (4 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 05/15] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 07/15] argo: implement the register op Christopher Clark
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

ARM port of c/s bb544585: "introduce guest_handle_for_field()"

This helper turns a field of a GUEST_HANDLE into a GUEST_HANDLE.
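
A simplified sketch of what such a helper does, written against plain
pointers so that it can be compiled outside of Xen. DEMO_DEFINE_HANDLE,
DEMO_HANDLE, demo_handle_for_field and demo_addr_t are hypothetical
stand-ins; real guest handles refer to guest memory and are read with the
guest access functions, not dereferenced directly as done here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle type: a struct wrapping a typed pointer. */
#define DEMO_DEFINE_HANDLE(name, type) \
    typedef struct { type *p; } demo_handle_##name
#define DEMO_HANDLE(name) demo_handle_##name

/* Build a handle to one field of the struct that 'hnd' refers to. */
#define demo_handle_for_field(hnd, type, fld) \
    ((DEMO_HANDLE(type)) { &(hnd).p->fld })

/* Hypothetical struct, shaped like xen_argo_addr for illustration. */
typedef struct demo_addr {
    uint32_t aport;
    uint16_t domain_id;
    uint16_t pad;
} demo_addr_t;

DEMO_DEFINE_HANDLE(demo_addr_t, demo_addr_t);
DEMO_DEFINE_HANDLE(uint32_t, uint32_t);

/* Derive a handle to the aport field, then read through it. */
static uint32_t demo_read_port(DEMO_HANDLE(demo_addr_t) addr_hnd)
{
    DEMO_HANDLE(uint32_t) port_hnd =
        demo_handle_for_field(addr_hnd, uint32_t, aport);

    assert(port_hnd.p != NULL);
    return *port_hnd.p; /* plain dereference: valid only in this demo */
}
```

The derived handle keeps the field's type, so later accesses through it are
type-checked against the field and not against the enclosing struct.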

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v3: Added Stefano's Reviewed-by
v2: Added Paul's Reviewed-by

 xen/include/asm-arm/guest_access.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 224d2a0..8997a1c 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -63,6 +63,9 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     _y;                                                     \
 })
 
+#define guest_handle_for_field(hnd, type, fld)          \
+    ((XEN_GUEST_HANDLE(type)) { &(hnd).p->fld })
+
 #define guest_handle_from_ptr(ptr, type)        \
     ((XEN_GUEST_HANDLE_PARAM(type)) { (type *)ptr })
 #define const_guest_handle_from_ptr(ptr, type)  \
-- 
2.7.4



* [PATCH v5 07/15] argo: implement the register op
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (5 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 06/15] xen/arm: introduce guest_handle_for_field() Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-22  9:59   ` Roger Pau Monné
  2019-01-21  9:59 ` [PATCH v5 08/15] argo: implement the unregister op Christopher Clark
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The register op is used by a domain to register a region of memory for
receiving messages from either a specified other domain, or, if specifying a
wildcard, any domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.

Wildcard any-sender rings are disabled by default and registration will be
refused with EPERM unless they have been specifically enabled with the
new mac-permissive flag that is added to the argo boot option here. The
reason why the default for wildcard rings is 'deny' is that there is
currently no means to protect the ring from DoS by a noisy domain
spamming the ring, affecting other domains' ability to send to it. This
will be addressed with XSM policy controls in subsequent work.

Since denying access to any-sender rings is a significant functional
constraint, the new option "mac-permissive" for the argo bootparam
enables overriding this. eg: "argo=1,mac-permissive=1"
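
A rough sketch of how a comma-separated option value such as
"1,mac-permissive=1" can be split into a top-level boolean and a named
sub-option. This is not the parser added by the patch: demo_parse_argo,
demo_parse_bool and struct demo_argo_opts are hypothetical, and only the
literal values "0" and "1" are accepted here, which is an assumption made
to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result of parsing the value that follows "argo=". */
struct demo_argo_opts {
    bool enabled;
    bool mac_permissive;
};

/* Accept only "0" or "1"; return -1 for anything else. */
static int demo_parse_bool(const char *s, size_t len)
{
    if ( len == 1 && s[0] == '1' )
        return 1;
    if ( len == 1 && s[0] == '0' )
        return 0;
    return -1;
}

/*
 * Walk the comma-separated items. A bare boolean sets 'enabled';
 * "mac-permissive=<bool>" sets 'mac_permissive'.
 * Returns 0 on success, -1 on the first malformed item.
 */
static int demo_parse_argo(const char *s, struct demo_argo_opts *opts)
{
    static const char key[] = "mac-permissive=";
    const size_t keylen = sizeof(key) - 1;

    assert(s && opts);

    while ( *s )
    {
        const char *end = strchr(s, ',');
        size_t len = end ? (size_t)(end - s) : strlen(s);
        int val;

        if ( len > keylen && !strncmp(s, key, keylen) )
        {
            val = demo_parse_bool(s + keylen, len - keylen);
            if ( val < 0 )
                return -1;
            opts->mac_permissive = val;
        }
        else
        {
            val = demo_parse_bool(s, len);
            if ( val < 0 )
                return -1;
            opts->enabled = val;
        }

        /* Step over this item and the comma that follows it, if any. */
        s += len;
        if ( *s == ',' )
            s++;
    }

    return 0;
}
```

Passing "1,mac-permissive=1" sets both fields, while an unrecognised item
such as "bogus" makes the function return -1 and leaves later items
unparsed.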

The p2m type of the memory supplied by the guest for the ring must be
p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
is registered.

The xen_argo_gfn_t type is defined as 64-bit on all architectures, which
helps avoid the need for compat code to translate hypercall args.
This hypercall op and its interface currently only support 4K-sized pages.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v4 v3#07 Jan: shrink critical sections in register_ring
v4 v3#07 Jan: revise register flag MASK in header, note 32-bitness of args
v4 feedback: use standard data structures per common code, not loop macros
v4 Andrew: use the single argo command line option list
v4 #07 Jan: rewrite find_ring_mfn to use check_get_page_from_gfn
v4 #07 Roger: add FIXME to ring_map_page for vmap contiguous ring mapping

v3 #07 Jan: comment: minimum ring size is based on minimum-sized message
v3 #04 Andrew: reference CONFIG_ARGO in the command line documentation
v3 #07 Jan: register_ring: fold else, if into else-if to drop indent
v3 #07 Jan: remove no longer used guest_handle_is_aligned macros
v3 #07 Jan: remove dead code from find_ring_mfns
v3 #07 Jan: fix format string indentation in printks
v3 #07 Jan: remove redundant bounds check on npage in find_ring_mfns
v3 #08 self/Roger: improve dprintk output in find_ring_info like find_send_info
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #07 Jan: use array_index_nospec in ring_map_page
v3 #07 Jan: fix numeric entries in printk format strings
v3 #7 Jan: drop unneeded parentheses from ROUNDUP_MESSAGE defn
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #03 meld compat check for hypercall arg register struct
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback #07 Eric: fix header max ring size comment units
v3 feedback #04 Roger: mfn_mapping: void* instead of uint8_t*
v3 use %u for printing unsigned ints in find_ring_mfns
v3 feedback #04 Jan: uint32_t -> unsigned int for npage in register_ring
v3 feedback #04 Roger: drop npages struct member, calculate from len
v3 : register_ring: uint32_t -> unsigned int for private_tx_ptr
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
v3 feedback #07 Roger: use opt_argo_mac_permissive : a boolean opt
v3 feedback #04 Roger: reorder #includes to alphabetical order
v3 feedback #07 Roger: drop comment re: Intel EPT/AMD NPT for write-only mapping
v3 feedback #07 Roger: drop ptr arithmetic in update_tx_ptr, use ring struct cast
v3 feedback #07 Roger: drop newline in ring_map_page
v3 feedback #07 Roger: drop unneeded null check before xfree
v3 feedback #07 Roger: use return and drop out label in register_ring
v3 Stefano: add 4K page constraint to header file comment & commit msg
v3 Julien/Stefano: 4K granularity ok: use 64-bit gfns in register interface

v2 self: disallow ring resize via reregister
v2 feedback Jan: drop cookie, implement teardown
v2 feedback Jan: drop message from argo_message_op
v2 self: move hash_index function below locking comment
v2 self: OVERHAUL
v2 self/Jan: remove use of magic verification field and tidy up
v2 self: merge max and min ring size check clauses
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v2 feedback #9, Jan: use the argo-mac bootparam at point of introduction
v2 feedback #9, Jan: rename boot opt variable to comply with convention
v2 feedback #9, Jan: rename the argo_mac bootparam to argo-mac
v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v1 feedback Roger: s/pfn/gfn/ and retire always-64-bit type
v2. feedback Jan: document the argo-mac boot opt
v2. feedback Jan: simplify re-register, drop mappings
v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops

v1 #13 feedback, Jan: register op : s/ECONNREFUSED/ESRCH/
v1 #5 (#13) feedback Paul: register op: use currd in do_message_op
v1 #13 feedback, Paul: register op: use mfn_eq comparator
v1 #5 (#13) feedback Paul: register op: use currd in argo_register_ring
v1 #13 feedback Paul: register op: whitespace, unsigned, bounds check
v1 #13 feedback Paul: use of hex in limit constant definition
v1 #13 feedback Paul, register op: set nmfns on loop termination
v1 #13 feedback Paul: register op: do/while -> gotos, reindent
v1 argo_ring_map_page: drop uint32_t for unsigned int
v1. #13 feedback Julien: use page descriptors instead of gpfns.
   - adds ABI support for pages with different granularity.
v1 feedback #13, Paul: adjust log level of message
v1 feedback #13, Paul: use gprintk for guest-triggered warning
v1 feedback #13, Paul: gprintk and XENLOG_DEBUG for ring registration
v1 feedback #13, Paul: use gprintk for errs in argo_ring_map_page
v1 feedback #13, Paul: use ENOMEM if global mapping fails
v1 feedback Paul: overflow check before shift
v1: add define for copy_field_to_guest_errno
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: use copy_field_to_guest_errno
v1 feedback #13, Jan: argo_hash_fn: no inline, rename, change type
v1 feedback #13, Paul, Jan: EFAULT -> ENOMEM in argo_ring_map_page
v1 feedback #13, Jan: rename page var in argo_ring_map_page
v1 feedback #13, Jan: switch uint8_t* to void* and drop cast
v1 feedback #13, Jan: switch memory barrier to smp_wmb
v1 feedback #13, Jan: make 'ring' comment comply with single-line style
v1 feedback #13, Jan: use xzalloc_array, drop loop NULL init
v1 feedback #13, Jan: init bool with false rather than 0
v1 feedback #13 Jan: use __copy; define and use __copy_field_to_guest_errno
v1 feedback #13, Jan: use xzalloc, drop individual init zeroes
v1 feedback #13, Jan: prefix public namespace with xen
v1 feedback #13, Jan: blank line after op case in do_argo_message_op
v1 self: reflow comment in argo_ring_map_page to within 80 char len
v1 feedback #13, Roger: use true not 1 in assign to update_tx_ptr bool
v1 feedback #21, Jan: fold in the array_index_nospec hash function guards
v1 feedback #18, Jan: fold the max ring count limit into the series
v1 self: use unsigned long type for XEN_ARGO_REGISTER_FLAG_MASK
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1. feedback #13 Jan: add comment re: page alignment
v1. self: confirm ring magic presence in supplied page array
v1. feedback #13 Jan: add comment re: minimum ring size
v1. feedback #13 Roger: use ASSERT_UNREACHABLE
v1. feedback Roger: add comment to hash function

 docs/misc/xen-command-line.pandoc |   9 +-
 xen/common/argo.c                 | 467 ++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c          |   1 +
 xen/include/public/argo.h         |  73 ++++++
 xen/include/xlat.lst              |   1 +
 5 files changed, 550 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 93f41bc..0f8c338 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -183,7 +183,7 @@ in combination with cpuidle.  This option is only expected to be useful for
 developers wishing Xen to fall back to older timing methods on newer hardware.
 
 ### argo
-    = List of [ <bool> ]
+    = List of [ <bool>, mac-permissive=<bool> ]
 
 Controls for the Argo hypervisor-mediated interdomain communication service.
 
@@ -197,6 +197,13 @@ to appropriate auditing by Xen.
 
 *   An overall boolean acts as a global control.  Argo is disabled by default.
 
+*   The `mac-permissive` boolean controls whether wildcard receive rings may be
+    registered (`mac-permissive=1`) or may not be registered
+    (`mac-permissive=0`).
+
+    This option is disabled by default, to protect domains from a DoS by a
+    buggy or malicious other domain spamming the ring.
+
 ### asid (x86)
 > `= <boolean>`
 
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 12b3ec2..a7ec0e0 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -22,16 +22,30 @@
 #include <xen/errno.h>
 #include <xen/event.h>
 #include <xen/guest_access.h>
+#include <xen/lib.h>
 #include <xen/nospec.h>
 #include <xen/sched.h>
 #include <xen/time.h>
 
 #include <public/argo.h>
 
+#define MAX_RINGS_PER_DOMAIN            128U
+
+/* All messages on the ring are padded to a multiple of the slot size. */
+#define ROUNDUP_MESSAGE(a) ROUNDUP((a), XEN_ARGO_MSG_SLOT_SIZE)
+
+/* Number of PAGEs needed to hold a ring of a given size in bytes */
+#define NPAGES_RING(ring_len) \
+    (ROUNDUP((ROUNDUP_MESSAGE(ring_len) + sizeof(xen_argo_ring_t)), PAGE_SIZE) \
+     >> PAGE_SHIFT)
+
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
 
 static bool __read_mostly opt_argo;
+static bool __read_mostly opt_argo_mac_permissive;
 
 static int __init parse_argo(const char *s)
 {
@@ -45,6 +59,8 @@ static int __init parse_argo(const char *s)
 
         if ( (val = parse_bool(s, ss)) >= 0 )
             opt_argo = val;
+        else if ( (val = parse_boolean("mac-permissive", s, ss)) >= 0 )
+            opt_argo_mac_permissive = val;
         else
             rc = -EINVAL;
 
@@ -361,6 +377,74 @@ ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
     }
 }
 
+static int
+ring_map_page(const struct domain *d, struct argo_ring_info *ring_info,
+              unsigned int i, void **out_ptr)
+{
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    /*
+     * FIXME: Investigate using vmap to create a single contiguous virtual
+     * address space mapping of the ring instead of using the array of single
+     * page mappings.
+     * Affects logic in memcpy_to_guest_ring, the mfn_mapping array data
+     * structure, and places where ring mappings are added or removed.
+     */
+
+    if ( i >= ring_info->nmfns )
+    {
+        gprintk(XENLOG_ERR,
+               "argo: ring (vm%u:%x vm%u) %p attempted to map page %u of %u\n",
+                ring_info->id.domain_id, ring_info->id.aport,
+                ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+        return -ENOMEM;
+    }
+    i = array_index_nospec(i, ring_info->nmfns);
+
+    if ( !ring_info->mfns || !ring_info->mfn_mapping)
+    {
+        ASSERT_UNREACHABLE();
+        ring_info->len = 0;
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfn_mapping[i] )
+    {
+        ring_info->mfn_mapping[i] = map_domain_page_global(ring_info->mfns[i]);
+        if ( !ring_info->mfn_mapping[i] )
+        {
+            gprintk(XENLOG_ERR, "argo: ring (vm%u:%x vm%u) %p attempted to map "
+                    "page %u of %u\n",
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+            return -ENOMEM;
+        }
+        argo_dprintk("mapping page %"PRI_mfn" to %p\n",
+                     mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+    }
+
+    if ( out_ptr )
+        *out_ptr = ring_info->mfn_mapping[i];
+
+    return 0;
+}
+
+static void
+update_tx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
+              uint32_t tx_ptr)
+{
+    xen_argo_ring_t *ringp;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+    ASSERT(ring_info->mfn_mapping[0]);
+
+    ring_info->tx_ptr = tx_ptr;
+    ringp = ring_info->mfn_mapping[0];
+
+    write_atomic(&ringp->tx_ptr, tx_ptr);
+    smp_wmb();
+}
+
 static void
 wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 {
@@ -537,11 +621,362 @@ partner_rings_remove(struct domain *src_d)
     }
 }
 
+static int
+find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
+{
+    struct page_info *page;
+    p2m_type_t p2mt;
+    int ret;
+
+    ret = check_get_page_from_gfn(d, gfn, false, &p2mt, &page);
+    if ( unlikely(ret) )
+        return ret;
+
+    *mfn = page_to_mfn(page);
+    if ( !mfn_valid(*mfn) )
+        ret = -EINVAL;
+#ifdef CONFIG_X86
+    else if ( p2mt == p2m_ram_logdirty )
+        ret = -EAGAIN;
+#endif
+    else if ( (p2mt != p2m_ram_rw) ||
+              !get_page_and_type(page, d, PGT_writable_page) )
+        ret = -EINVAL;
+
+    put_page(page);
+
+    return ret;
+}
+
+static int
+find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
+               const unsigned int npage,
+               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
+               const unsigned int len)
+{
+    unsigned int i;
+    int ret = 0;
+    mfn_t *mfns;
+    void **mfn_mapping;
+
+    ASSERT(LOCKING_Write_rings_L2(d));
+
+    if ( ring_info->mfns )
+    {
+        /* Ring already existed: drop the previous mapping. */
+        gprintk(XENLOG_INFO, "argo: vm%u re-register existing ring "
+                "(vm%u:%x vm%u) clears mapping\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.aport, ring_info->id.partner_id);
+
+        ring_remove_mfns(d, ring_info);
+        ASSERT(!ring_info->mfns);
+    }
+
+    mfns = xmalloc_array(mfn_t, npage);
+    if ( !mfns )
+        return -ENOMEM;
+
+    for ( i = 0; i < npage; i++ )
+        mfns[i] = INVALID_MFN;
+
+    mfn_mapping = xzalloc_array(void *, npage);
+    if ( !mfn_mapping )
+    {
+        xfree(mfns);
+        return -ENOMEM;
+    }
+
+    ring_info->mfns = mfns;
+    ring_info->mfn_mapping = mfn_mapping;
+
+    for ( i = 0; i < npage; i++ )
+    {
+        xen_argo_gfn_t argo_gfn;
+        mfn_t mfn;
+
+        ret = __copy_from_guest_offset(&argo_gfn, gfn_hnd, i, 1) ? -EFAULT : 0;
+        if ( ret )
+            break;
+
+        ret = find_ring_mfn(d, _gfn(argo_gfn), &mfn);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR, "argo: vm%u: invalid gfn %"PRI_gfn" "
+                    "r:(vm%u:%x vm%u) %p %u/%u\n",
+                    d->domain_id, gfn_x(_gfn(argo_gfn)),
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id, ring_info, i, npage);
+            break;
+        }
+
+        ring_info->mfns[i] = mfn;
+
+        argo_dprintk("%u: %"PRI_gfn" -> %"PRI_mfn"\n",
+                     i, gfn_x(_gfn(argo_gfn)), mfn_x(ring_info->mfns[i]));
+    }
+
+    ring_info->nmfns = i;
+
+    if ( ret )
+        ring_remove_mfns(d, ring_info);
+    else
+    {
+        ASSERT(ring_info->nmfns == NPAGES_RING(len));
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u ring (vm%u:%x vm%u) %p "
+                "mfn_mapping %p len %u nmfns %u\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.aport, ring_info->id.partner_id, ring_info,
+                ring_info->mfn_mapping, ring_info->len, ring_info->nmfns);
+    }
+
+    return ret;
+}
+
+static long
+register_ring(struct domain *currd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
+              unsigned int npage, bool fail_exist)
+{
+    xen_argo_register_ring_t reg;
+    struct argo_ring_id ring_id;
+    void *map_ringp;
+    xen_argo_ring_t *ringp;
+    struct argo_ring_info *ring_info, *new_ring_info = NULL;
+    struct argo_send_info *send_info = NULL;
+    struct domain *dst_d = NULL;
+    int ret = 0;
+    unsigned int private_tx_ptr;
+
+    ASSERT(currd == current->domain);
+
+    if ( copy_from_guest(&reg, reg_hnd, 1) )
+        return -EFAULT;
+
+    /*
+     * A ring must be large enough to transmit messages, so requires space for:
+     * * 1 message header, plus
+     * * 1 payload slot (payload is always rounded to a multiple of 16 bytes)
+     *   for the message payload to be written into, plus
+     * * 1 more slot, so that the ring cannot be filled to capacity with a
+     *   single minimum-size message -- see the logic in ringbuf_insert --
+     *   allowing for this ensures that there can be space remaining when a
+     *   message is present.
+     * The above determines the minimum acceptable ring size.
+     */
+    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
+                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
+         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
+         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
+         (NPAGES_RING(reg.len) != npage) ||
+         (reg.pad != 0) )
+        return -EINVAL;
+
+    ring_id.partner_id = reg.partner_id;
+    ring_id.aport = reg.aport;
+    ring_id.domain_id = currd->domain_id;
+
+    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
+    {
+        if ( !opt_argo_mac_permissive )
+            return -EPERM;
+    }
+    else
+    {
+        dst_d = get_domain_by_id(reg.partner_id);
+        if ( !dst_d )
+        {
+            argo_dprintk("!dst_d, ESRCH\n");
+            return -ESRCH;
+        }
+
+        send_info = xzalloc(struct argo_send_info);
+        if ( !send_info )
+        {
+            ret = -ENOMEM;
+            goto out;
+        }
+        send_info->id = ring_id;
+    }
+
+    /*
+     * Common case is that the ring doesn't already exist, so do the alloc here
+     * before picking up any locks.
+     */
+    new_ring_info = xzalloc(struct argo_ring_info);
+    if ( !new_ring_info )
+    {
+        ret = -ENOMEM;
+        goto out;
+    }
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( dst_d && !dst_d->argo )
+    {
+        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+        ret = -ECONNREFUSED;
+        goto out_unlock;
+    }
+
+    write_lock(&currd->argo->rings_L2_rwlock);
+
+    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
+    {
+        ret = -ENOSPC;
+        goto out_unlock2;
+    }
+
+    ring_info = find_ring_info(currd, &ring_id);
+    if ( !ring_info )
+    {
+        ring_info = new_ring_info;
+        new_ring_info = NULL;
+
+        spin_lock_init(&ring_info->L3_lock);
+
+        ring_info->id = ring_id;
+        INIT_LIST_HEAD(&ring_info->pending);
+
+        list_add(&ring_info->node,
+                 &currd->argo->ring_hash[hash_index(&ring_info->id)]);
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+    }
+    else if ( ring_info->len )
+    {
+        /*
+         * If the caller specified that the ring must not already exist,
+         * fail at attempt to add a completed ring which already exists.
+         */
+        if ( fail_exist )
+        {
+            argo_dprintk("disallowed reregistration of existing ring\n");
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        if ( ring_info->len != reg.len )
+        {
+            /*
+             * Change of ring size could result in entries on the pending
+             * notifications list that will never trigger.
+             * Simple blunt solution: disallow ring resize for now.
+             * TODO: investigate enabling ring resize.
+             */
+            gprintk(XENLOG_ERR, "argo: vm%u attempted to change ring size "
+                    "(vm%u:%x vm%u)\n",
+                    currd->domain_id, ring_id.domain_id, ring_id.aport,
+                    ring_id.partner_id);
+            /*
+             * Could return EINVAL here, but if the ring didn't already
+             * exist then the arguments would have been valid, so: EEXIST.
+             */
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        gprintk(XENLOG_DEBUG,
+                "argo: vm%u re-registering existing ring (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+    }
+
+    ret = find_ring_mfns(currd, ring_info, npage, gfn_hnd, reg.len);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to find ring mfns (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+
+    /*
+     * The first page of the memory supplied for the ring has the xen_argo_ring
+     * structure at its head, which is where the ring indexes reside.
+     */
+    ret = ring_map_page(currd, ring_info, 0, &map_ringp);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to map ring mfn 0 (vm%u:%x vm%u)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.aport,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+    ringp = map_ringp;
+
+    private_tx_ptr = read_atomic(&ringp->tx_ptr);
+
+    if ( (private_tx_ptr >= reg.len) ||
+         (ROUNDUP_MESSAGE(private_tx_ptr) != private_tx_ptr) )
+    {
+        /*
+         * Since the ring is a mess, attempt to flush the contents of it
+         * here by setting the tx_ptr to the next aligned message slot past
+         * the latest rx_ptr we have observed. Handle ring wrap correctly.
+         */
+        private_tx_ptr = ROUNDUP_MESSAGE(read_atomic(&ringp->rx_ptr));
+
+        if ( private_tx_ptr >= reg.len )
+            private_tx_ptr = 0;
+
+        update_tx_ptr(currd, ring_info, private_tx_ptr);
+    }
+
+    ring_info->tx_ptr = private_tx_ptr;
+    ring_info->len = reg.len;
+    currd->argo->ring_count++;
+
+    if ( send_info )
+    {
+        spin_lock(&dst_d->argo->send_L2_lock);
+
+        list_add(&send_info->node,
+                 &dst_d->argo->send_hash[hash_index(&send_info->id)]);
+
+        spin_unlock(&dst_d->argo->send_L2_lock);
+    }
+
+ out_unlock2:
+    write_unlock(&currd->argo->rings_L2_rwlock);
+
+ out_unlock:
+    read_unlock(&L1_global_argo_rwlock);
+
+ out:
+    if ( dst_d )
+        put_domain(dst_d);
+
+    if ( ret )
+        xfree(send_info);
+
+    xfree(new_ring_info);
+
+    return ret;
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
            unsigned long arg4)
 {
+    struct domain *currd = current->domain;
     long rc = -EFAULT;
 
     argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
@@ -552,6 +987,38 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
+            guest_handle_cast(arg1, xen_argo_register_ring_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd =
+            guest_handle_cast(arg2, xen_argo_gfn_t);
+        /* arg3 is npage */
+        /* arg4 is flags */
+        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
+
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+        /*
+         * Check access to the whole array here so we can use the faster __copy
+         * operations to read each element later.
+         */
+        if ( unlikely(!guest_handle_okay(gfn_hnd, arg3)) )
+            break;
+        /* arg4: reserve currently-undefined bits, require zero.  */
+        if ( unlikely(arg4 & ~XEN_ARGO_REGISTER_FLAG_MASK) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = register_ring(currd, reg_hnd, gfn_hnd, arg3, fail_exist);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 8edb9e8..9437a7a 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -20,4 +20,5 @@
 #include <compat/argo.h>
 
 CHECK_argo_addr;
+CHECK_argo_register_ring;
 CHECK_argo_ring;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 530bb82..f822756 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -33,9 +33,19 @@
 
 #define XEN_ARGO_DOMID_ANY       DOMID_INVALID
 
+/*
+ * The maximum size of an Argo ring is defined to be: 16MB
+ *  -- which is 0x1000000 bytes.
+ * A byte index into the ring is at most 24 bits.
+ */
+#define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
+
 /* Fixed-width type for "argo port" number. Nothing to do with evtchns. */
 typedef uint32_t xen_argo_port_t;
 
+/* gfn type: 64-bit on all architectures to aid avoiding a compat ABI */
+typedef uint64_t xen_argo_gfn_t;
+
 typedef struct xen_argo_addr
 {
     xen_argo_port_t aport;
@@ -61,4 +71,67 @@ typedef struct xen_argo_ring
 #endif
 } xen_argo_ring_t;
 
+typedef struct xen_argo_register_ring
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    uint16_t pad;
+    uint32_t len;
+} xen_argo_register_ring_t;
+
+/* Messages on the ring are padded to a multiple of this size. */
+#define XEN_ARGO_MSG_SLOT_SIZE 0x10
+
+struct xen_argo_ring_message_header
+{
+    uint32_t len;
+    xen_argo_addr_t source;
+    uint32_t message_type;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t data[];
+#elif defined(__GNUC__)
+    uint8_t data[0];
+#endif
+};
+
+/*
+ * Hypercall operations
+ */
+
+/*
+ * XEN_ARGO_OP_register_ring
+ *
+ * Register a ring using the guest-supplied memory pages.
+ * Also used to reregister an existing ring (eg. after resume from hibernate).
+ *
+ * The first argument struct indicates the port number for the ring to register
+ * and the partner domain, if any, that is to be allowed to send to the ring.
+ * A wildcard (XEN_ARGO_DOMID_ANY) may be supplied instead of a partner domid,
+ * and if the hypervisor has wildcard sender rings enabled, this will allow
+ * any domain (XSM notwithstanding) to send to the ring.
+ *
+ * The second argument is an array of guest frame numbers and the third argument
+ * indicates the size of the array. This operation only supports 4K-sized pages.
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_register_ring_t)
+ * arg2: XEN_GUEST_HANDLE(xen_argo_gfn_t)
+ * arg3: unsigned long npages
+ * arg4: unsigned long flags (32-bit value)
+ */
+#define XEN_ARGO_OP_register_ring     1
+
+/* Register op flags */
+/*
+ * Fail exist:
+ * If set, reject attempts to (re)register an existing established ring.
+ * If clear, reregistration occurs if the ring exists, with the new ring
+ * taking the place of the old, preserving tx_ptr if it remains valid.
+ */
+#define XEN_ARGO_REGISTER_FLAG_FAIL_EXIST  0x1
+
+#ifdef __XEN__
+/* Mask for all defined flags. */
+#define XEN_ARGO_REGISTER_FLAG_MASK XEN_ARGO_REGISTER_FLAG_FAIL_EXIST
+#endif
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9f616e4..9c9d33f 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -150,3 +150,4 @@
 ?	flask_transition		xsm/flask_op.h
 ?	argo_addr			argo.h
 ?	argo_ring			argo.h
+?	argo_register_ring		argo.h
-- 
2.7.4



* [PATCH v5 08/15] argo: implement the unregister op
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (6 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 07/15] argo: implement the register op Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-22 11:02   ` Roger Pau Monné
  2019-01-21  9:59 ` [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The unregister op takes a single argument: a handle to the ring
unregistration struct, which specifies the port and either the partner
domain id or the wildcard.

The ring's entry is removed from the hashtable of registered rings;
any entries for pending notifications are removed; and the ring is
unmapped from Xen's address space.

If the ring had been registered to communicate with a single specified
domain (i.e. a non-wildcard ring) then the corresponding send_info entry is
removed from the partner domain's argo send_info hash table.
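
The teardown order described above can be modelled in a few lines of
standalone C. This is a simplified sketch under stated assumptions, not the
hypervisor implementation: it uses plain singly linked lists instead of Xen's
hash tables, takes no locks, leaves freeing to the caller, and the names
struct ring_id, struct entry, list_take(), unregister_sketch() and the
wildcard_id parameter are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Identity of a ring: owner domain, argo port, permitted partner. */
struct ring_id {
    uint32_t aport;
    uint16_t domain_id;
    uint16_t partner_id;
};

/* One list node; stands in for both ring_info and send_info entries. */
struct entry {
    struct ring_id id;
    struct entry *next;
};

static bool id_eq(const struct ring_id *a, const struct ring_id *b)
{
    return a->aport == b->aport &&
           a->domain_id == b->domain_id &&
           a->partner_id == b->partner_id;
}

/* Unlink and return the entry matching id, or NULL if there is none. */
static struct entry *list_take(struct entry **head, const struct ring_id *id)
{
    struct entry **pp;

    for ( pp = head; *pp; pp = &(*pp)->next )
        if ( id_eq(&(*pp)->id, id) )
        {
            struct entry *e = *pp;

            *pp = e->next;
            e->next = NULL;
            return e;
        }

    return NULL;
}

/*
 * Remove the ring from the owner's list, then, for non-wildcard rings only,
 * remove the matching entry from the partner's send list.
 * Returns 0 on success, or -1 if the ring was not registered.
 */
static int unregister_sketch(struct entry **rings,
                             struct entry **partner_sends,
                             const struct ring_id *id,
                             uint16_t wildcard_id)
{
    if ( !list_take(rings, id) )
        return -1;

    if ( id->partner_id != wildcard_id )
        list_take(partner_sends, id);

    return 0;
}
```

A second unregistration of the same ring fails in this model, mirroring the
-ENOENT return in the patch when no ring_info is found.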

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
The logic in unregister_ring got pretty heavily reordered in this version
of the patch, to shrink the critical sections. I'm happy with the result.
I've added a couple of ASSERT_UNREACHABLEs where it seems appropriate.

v4 # Jan: shrink the critical sections in unregister
v4 : use standard data structures as per common code
v4 #08 Roger: skip send_info lookup for wildcard rings
v4: add ASSERT_UNREACHABLE for missing sender domain or send_info
v4: reduce indentation by using goto
v4: add unlikely to currd->argo check
v4 #08 Jan: move put_domain outside L2 critical section
v4: include ring data in debug output when ring not found

v3 #08 Jan: pull xfree out of exclusive critical sections in unregister_ring
v3 #08 Jan: rename send_find_info to find_send_info
v3 #07 Jan: rename ring_find_info to find_ring_info
v3 #08 Roger: use return and remove the out label in unregister_ring
v3 #08 Roger: better debug output in send_find_info
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld compat check for unregister_ring struct
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name
v3 feedback #07 Roger: const the argo_ring_id structs in send_find_info
v2 feedback Jan: drop cookie, implement teardown
v2 feedback Jan: drop message from argo_message_op
v2 self: OVERHAUL
v2 self: reorder logic to shorten critical section
v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops
v1 feedback Roger, Jan: drop argo prefix on static functions
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 #5 (#14) feedback Paul: use currd in do_argo_message_op
v1 #5 (#14) feedback Paul: full use currd in argo_unregister_ring
v1 #13 (#14) feedback Paul: replace do/while with goto; reindent
v1 self: add blank lines in unregister case in do_argo_message_op
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: blank line after op case in do_argo_message_op
v1: #14 feedback Jan: replace domain id override with validation
v1: #18 feedback Jan: meld the ring count limit into the series
v1: feedback #15 Jan: verify zero in unused hypercall args

 xen/common/argo.c         | 126 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c  |   1 +
 xen/include/public/argo.h |  19 +++++++
 xen/include/xlat.lst      |   1 +
 4 files changed, 147 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index a7ec0e0..e4cd446 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -43,6 +43,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
 
 static bool __read_mostly opt_argo;
 static bool __read_mostly opt_argo_mac_permissive;
@@ -351,6 +352,37 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
     return NULL;
 }
 
+static struct argo_send_info *
+find_send_info(const struct domain *d, const struct argo_ring_id *id)
+{
+    struct list_head *cursor, *bucket;
+
+    ASSERT(LOCKING_send_L2(d));
+
+    /* List is not modified here. Search and return the match if found. */
+    bucket = &d->argo->send_hash[hash_index(id)];
+
+    for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )
+    {
+        struct argo_send_info *send_info =
+            list_entry(cursor, struct argo_send_info, node);
+        const struct argo_ring_id *cmpid = &send_info->id;
+
+        if ( cmpid->aport == id->aport &&
+             cmpid->domain_id == id->domain_id &&
+             cmpid->partner_id == id->partner_id )
+        {
+            argo_dprintk("found send_info for ring(%u:%x %u)\n",
+                         id->domain_id, id->aport, id->partner_id);
+            return send_info;
+        }
+    }
+    argo_dprintk("no send_info for ring(%u:%x %u)\n",
+                 id->domain_id, id->aport, id->partner_id);
+
+    return NULL;
+}
+
 static void
 ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
 {
@@ -735,6 +767,85 @@ find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
 }
 
 static long
+unregister_ring(struct domain *currd,
+                XEN_GUEST_HANDLE_PARAM(xen_argo_unregister_ring_t) unreg_hnd)
+{
+    xen_argo_unregister_ring_t unreg;
+    struct argo_ring_id ring_id;
+    struct argo_ring_info *ring_info = NULL;
+    struct argo_send_info *send_info = NULL;
+    struct domain *dst_d = NULL;
+
+    ASSERT(currd == current->domain);
+
+    if ( copy_from_guest(&unreg, unreg_hnd, 1) )
+        return -EFAULT;
+
+    if ( unreg.pad )
+        return -EINVAL;
+
+    ring_id.partner_id = unreg.partner_id;
+    ring_id.aport = unreg.aport;
+    ring_id.domain_id = currd->domain_id;
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( unlikely(!currd->argo) )
+    {
+        read_unlock(&L1_global_argo_rwlock);
+        return -ENODEV;
+    }
+
+    write_lock(&currd->argo->rings_L2_rwlock);
+
+    ring_info = find_ring_info(currd, &ring_id);
+    if ( !ring_info )
+        goto out;
+
+    ring_remove_info(currd, ring_info);
+    currd->argo->ring_count--;
+
+    if ( ring_id.partner_id == XEN_ARGO_DOMID_ANY )
+        goto out;
+
+    dst_d = get_domain_by_id(ring_id.partner_id);
+    if ( !dst_d || !dst_d->argo )
+    {
+        ASSERT_UNREACHABLE();
+        goto out;
+    }
+
+    spin_lock(&dst_d->argo->send_L2_lock);
+
+    send_info = find_send_info(dst_d, &ring_id);
+    if ( send_info )
+        list_del(&send_info->node);
+    else
+        ASSERT_UNREACHABLE();
+
+    spin_unlock(&dst_d->argo->send_L2_lock);
+
+ out:
+    write_unlock(&currd->argo->rings_L2_rwlock);
+
+    read_unlock(&L1_global_argo_rwlock);
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    xfree(send_info);
+
+    if ( !ring_info )
+    {
+        argo_dprintk("unregister_ring: no ring_info found for ring(%u:%x %u)\n",
+                     ring_id.domain_id, ring_id.aport, ring_id.partner_id);
+        return -ENOENT;
+    }
+
+    return 0;
+}
+
+static long
 register_ring(struct domain *currd,
               XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
@@ -1019,6 +1130,21 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_OP_unregister_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_unregister_ring_t) unreg_hnd =
+            guest_handle_cast(arg1, xen_argo_unregister_ring_t);
+
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = unregister_ring(currd, unreg_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 9437a7a..6a1671c 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -22,3 +22,4 @@
 CHECK_argo_addr;
 CHECK_argo_register_ring;
 CHECK_argo_ring;
+CHECK_argo_unregister_ring;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index f822756..2371510 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -79,6 +79,13 @@ typedef struct xen_argo_register_ring
     uint32_t len;
 } xen_argo_register_ring_t;
 
+typedef struct xen_argo_unregister_ring
+{
+    xen_argo_port_t aport;
+    domid_t partner_id;
+    uint16_t pad;
+} xen_argo_unregister_ring_t;
+
 /* Messages on the ring are padded to a multiple of this size. */
 #define XEN_ARGO_MSG_SLOT_SIZE 0x10
 
@@ -134,4 +141,16 @@ struct xen_argo_ring_message_header
 #define XEN_ARGO_REGISTER_FLAG_MASK XEN_ARGO_REGISTER_FLAG_FAIL_EXIST
 #endif
 
+/*
+ * XEN_ARGO_OP_unregister_ring
+ *
+ * Unregister a previously-registered ring, ending communication.
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_unregister_ring_t)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_OP_unregister_ring     2
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9c9d33f..411c661 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -151,3 +151,4 @@
 ?	argo_addr			argo.h
 ?	argo_ring			argo.h
 ?	argo_register_ring		argo.h
+?	argo_unregister_ring		argo.h
-- 
2.7.4



* [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (7 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 08/15] argo: implement the unregister op Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-22 12:08   ` Roger Pau Monné
  2019-01-21  9:59 ` [PATCH v5 10/15] argo: implement the notify op Christopher Clark
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

sendv operation is invoked to perform a synchronous send of buffers
contained in iovs to a remote domain's registered ring.

It takes:
 * A destination address (domid, port) for the ring to send to.
   The lookup is most-specific-match first, so that a ring registered
   for this sender is preferred over a wildcard ring.
 * A source address, used to inform the destination of where to reply.
 * The address of an array of iovs containing the data to send
 * .. and the length of that array of iovs
 * and a 32-bit message type, available to communicate message context
   data (eg. kernel-to-kernel, separate from the application data).
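
The most-specific match lookup described above can be sketched as
follows. This is only an illustration: it scans a flat array rather than
the per-domain hash table used in the patch, and the names ring_id,
find_exact, find_by_match and DOMID_ANY are hypothetical stand-ins, not
the Xen definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XEN_ARGO_DOMID_ANY; the value is illustrative. */
#define DOMID_ANY 0x7FF4u

/* Simplified ring identity: (owner domain, port, permitted partner). */
struct ring_id {
    uint32_t aport;
    uint16_t domain_id;
    uint16_t partner_id;
};

/* Exact lookup over a flat array (assumption: the patch uses a hash table). */
static const struct ring_id *
find_exact(const struct ring_id *rings, size_t n, uint32_t aport,
           uint16_t domain_id, uint16_t partner_id)
{
    size_t i;

    assert(rings != NULL || n == 0);

    for ( i = 0; i < n; i++ )
        if ( rings[i].aport == aport &&
             rings[i].domain_id == domain_id &&
             rings[i].partner_id == partner_id )
            return &rings[i];

    return NULL;
}

/* Prefer a ring registered for this sender, else fall back to a wildcard. */
static const struct ring_id *
find_by_match(const struct ring_id *rings, size_t n, uint32_t aport,
              uint16_t domain_id, uint16_t sender_id)
{
    const struct ring_id *r = find_exact(rings, n, aport, domain_id, sender_id);

    return r ? r : find_exact(rings, n, aport, domain_id, DOMID_ANY);
}
```

With both a wildcard ring and a sender-specific ring registered on the
same port, a send from that sender lands on the specific ring, while any
other sender falls through to the wildcard ring.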

If insufficient space exists in the destination ring, it will return
-EAGAIN and Xen will notify the caller when sufficient space becomes
available.
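
As a rough sketch of the space check behind the -EAGAIN return, assuming
a 16-byte slot size and a 16-byte message header as in this series: the
helper names below (roundup_slot, ring_free_space, check_send) are
hypothetical, and the real code additionally sanitizes the
guest-controlled rx_ptr before using it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SLOT_SIZE   0x10u  /* mirrors XEN_ARGO_MSG_SLOT_SIZE */
#define HEADER_SIZE 16u    /* assumed message header size: one slot */

/* Round a byte count up to a whole number of slots. */
static uint32_t roundup_slot(uint32_t n)
{
    return (n + SLOT_SIZE - 1) & ~(SLOT_SIZE - 1);
}

/* Free bytes between the writer (tx_ptr) and the reader (rx_ptr). */
static uint32_t ring_free_space(uint32_t rx_ptr, uint32_t tx_ptr,
                                uint32_t ring_len)
{
    assert(rx_ptr < ring_len && tx_ptr < ring_len);

    if ( rx_ptr == tx_ptr )
        return ring_len;                    /* empty ring */

    return rx_ptr > tx_ptr ? rx_ptr - tx_ptr
                           : ring_len - (tx_ptr - rx_ptr);
}

/*
 * A message may never fill the ring completely, so that a full ring can
 * be distinguished from an empty one: hence '>=' rather than '>'.
 */
static int check_send(uint32_t rx_ptr, uint32_t tx_ptr, uint32_t ring_len,
                      uint32_t payload_len)
{
    uint32_t needed = roundup_slot(payload_len) + HEADER_SIZE;

    if ( needed >= ring_len )
        return -EMSGSIZE;                   /* can never fit in this ring */

    if ( needed >= ring_free_space(rx_ptr, tx_ptr, ring_len) )
        return -EAGAIN;                     /* retry after a notification */

    return 0;
}
```

A sender that receives -EAGAIN is expected to wait for the notification
and retry, while -EMSGSIZE indicates the message cannot fit in that ring
at all.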

Accesses to the ring indices are appropriately atomic. The rings are
mapped into Xen's private address space to write as needed and the
mappings are retained for later use.

Notifications are sent to guests via VIRQ, and send_guest_global_virq is
exposed in this change to enable Argo to call it. VIRQ_ARGO is claimed
from the VIRQ previously reserved for this purpose (#11).

The VIRQ notification method is used rather than sending events using
evtchn functions directly because:

* no current event channel type is an exact fit for the intended
  behaviour. ECS_IPI is closest, but it disallows migration to
  other VCPUs which is not necessarily a requirement for Argo.

* at the point of argo_init, allocation of an event channel is
  complicated by none of the guest VCPUs being initialized yet
  and the event channel logic expects that a valid event channel
  has a present VCPU.

* at the point of signalling a notification, the VIRQ logic is already
  defensive: if d->vcpu[0] is NULL, the notification is just silently
  dropped, whereas the evtchn_send logic is not so defensive: vcpu[0]
  must not be NULL, otherwise a null pointer dereference occurs.

Using a VIRQ removes the need for the guest to query to determine which
event channel notifications will be delivered on. This is also likely to
simplify establishing future L0/L1 nested hypervisor argo communication.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v4 Jan: remove use of fixed-width types from iov_count, ringbuf_insert
v4 #07 Jan: shrink critical sections in sendv
v3 #07 Jan: header: note 32-bitness of hypercall message type arg
v4 : use standard data structures as per common code
v4 self: bugfix memcpy_to_guest_ring: head_len must check (offset + len)
v4 #09 Roger: drop MESSAGE from VIRQ_ARGO_MESSAGE

v3 #07 Jan: rename ring_find_info* to find_ring_info*
v3 #07 Jan: fix numeric entries in printk format strings
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld compat struct checking for hypercall args
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 feedback #09 Eric: fix len & offset sanity check in memcpy_to_guest_ring
v3 feedback #04 Roger: newline fix in wildcard_pending_list_insert
v3 feedback #04 Roger: drop npages struct member, calculate from len
v3 #09 Roger: simplify EFAULT return in memcpy_to_guest_ring
v3 #09 Roger: add newline before return in get_sanitized_ring
v3 #09 Roger: replace while with for loop in iov_count
v3 #09 Roger: drop 0 in struct init in ringbuf_insert
v3 #09 Roger: comment for XEN_ARGO_MAXIOV: warn of stack overflow risk
v3 #09 Roger: simplify while loop: for instead in ringbuf_insert
v3 #09 Roger: drop out label for returns in ringbuf_insert
v3 #09 Roger: drop newline in pending_queue
v3 #09 Roger: replace second goto label with error path unlock in sendv
v3 #09 Jason: check iov_len vs MAX_ARGO_MESSAGE_SIZE in iov_count
v3 #09 Jason: check padding is zeroed in sendv op
v3 #09 Jason: memcpy_to_guest_ring: simpler code with better loop

v2 self: use ring_info backpointer in pending_ent to maintain npending
v2 feedback Jan: drop cookie, implement teardown
v2 self: pending_queue: reap stale ents when in need of space
v2 self: pending_requeue: reclaim ents for stale domains
v2.feedback Jan: only override sender domid if DOMID_ANY
v2 feedback Jan: drop message from argo_message_op
v2 self: check npending vs maximum limit
v2 self: get_sanitized_ring instead of get_rx_ptr
v2 feedback v1#13 Jan: remove double read from ringbuf insert, lower MAX_IOV
v2 self: make iov_count const
v2 self: iov_count : return EMSGSIZE for message too big
v2 self: OVERHAUL
v2 self: s/argo_pending_ent/pending_ent/g
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v1 feedback #13 Jan: drop guest_handle_okay when using copy_from_guest
    - reorder do_argo_op logic
v2 self: add _hnd suffix to iovs variable name to indicate guest handle type
v2 self: replace use of XEN_GUEST_HANDLE_NULL with two existing macros

v1 #15 feedback, Jan: sendv op : s/ECONNREFUSED/ESRCH/
v1 #5 (#15) feedback Paul: sendv: use currd in do_argo_message_op
v1 #13 (#15) feedback Paul: sendv op: do/while reindent only
v1 #13 (#15) feedback Paul: sendv op: do/while: argo_ringbuf_insert to goto style
v1 #13 (#15) feedback Paul: sendv op: do/while: reindent only again
v1 #13 (#15) feedback Paul: sendv op: do/while : goto
v1 #15 feedback Paul: sendv op: make page var: unsigned
v1 #15 feedback Paul: sendv op: new local var for PAGE_SIZE - offset
v1 #8 feedback Jan: XEN_GUEST_HANDLE : C89 compliance
v1 rebase after switching register op from pfns to page descriptors
v1 self: move iov DEFINE_XEN_GUEST_HANDLE out of public header into argo.c
v1 #13 (#15) feedback Paul: fix loglevel for guest-triggered messages
v1 : add compat xlat.lst entries
v1 self: switched notification to send_guest_global_virq instead of event
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: init len variable to satisfy ARM compiler initialized checking
v1 #13 feedback Jan: rename page var
v1:#14 feedback Jan: uint8_t* -> void*
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: blank line after case op in do_argo_message_op
v1: #15 feedback Jan: add comments explaining why the writes don't overrun
v1: self: add ASSERT to support comment that overrun cannot happen
v1: self: fail on short writes where guest manipulated the iov_lens
v1: self: rename ent id to domain_id
v1: self: add moan for iov rewrite
v1. feedback #15 Jan: require the pad bits are zero
v1. feedback #15 Jan: drop NULL check in argo_signal_domain as now using VIRQ
v1. self: store domain_cookie in pending ent
v1. feedback #15 Jan: use unsigned where possible
v1. feedback Jan: use handle type for iov_base in public iov interface
v1. self: log whenever visible error occurs
v1 feedback #15, Jan: drop unnecessary mb
v1 self: only update internal tx_ptr if able to return success
         and update the visible tx_ptr
v1 self: log on failure to map ring to update visible tx_ptr
v1 feedback #15 Jan: add comment re: notification size policy
v1 self/Roger? remove errant space after sizeof
v1. feedback #15 Jan: require iov pad be zero
v1. self: rename iov_base to iov_hnd for handle in public iov interface
v1: feedback #15 Jan: handle upper-halves of hypercall args; changes some
    types in function signatures to match.
v1: self: add dprintk to sendv
v1: self: add debug output to argo_iov_count
v1. feedback #14 Jan: blank line before return in argo_iov_count
v1 feedback #15 Jan: verify src id, not override

 xen/common/argo.c          | 636 +++++++++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c   |  19 ++
 xen/common/event_channel.c |   2 +-
 xen/include/public/argo.h  |  60 +++++
 xen/include/public/xen.h   |   2 +-
 xen/include/xen/event.h    |   7 +
 xen/include/xlat.lst       |   2 +
 7 files changed, 726 insertions(+), 2 deletions(-)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index e4cd446..518aff7 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -30,10 +30,15 @@
 #include <public/argo.h>
 
 #define MAX_RINGS_PER_DOMAIN            128U
+#define MAX_PENDING_PER_RING             32U
 
 /* All messages on the ring are padded to a multiple of the slot size. */
 #define ROUNDUP_MESSAGE(a) ROUNDUP((a), XEN_ARGO_MSG_SLOT_SIZE)
 
+/* The maximum size of a message that may be sent on the largest Argo ring. */
+#define MAX_ARGO_MESSAGE_SIZE ((XEN_ARGO_MAX_RING_SIZE) - \
+        (sizeof(struct xen_argo_ring_message_header)) - ROUNDUP_MESSAGE(1))
+
 /* Number of PAGEs needed to hold a ring of a given size in bytes */
 #define NPAGES_RING(ring_len) \
     (ROUNDUP((ROUNDUP_MESSAGE(ring_len) + sizeof(xen_argo_ring_t)), PAGE_SIZE) \
@@ -41,8 +46,10 @@
 
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
 
 static bool __read_mostly opt_argo;
@@ -352,6 +359,28 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
     return NULL;
 }
 
+static struct argo_ring_info *
+find_ring_info_by_match(const struct domain *d, xen_argo_port_t aport,
+                        domid_t partner_id)
+{
+    struct argo_ring_id id;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    id.aport = aport;
+    id.domain_id = d->domain_id;
+    id.partner_id = partner_id;
+
+    ring_info = find_ring_info(d, &id);
+    if ( ring_info )
+        return ring_info;
+
+    id.partner_id = XEN_ARGO_DOMID_ANY;
+
+    return find_ring_info(d, &id);
+}
+
 static struct argo_send_info *
 find_send_info(const struct domain *d, const struct argo_ring_id *id)
 {
@@ -384,6 +413,14 @@ find_send_info(const struct domain *d, const struct argo_ring_id *id)
 }
 
 static void
+signal_domain(struct domain *d)
+{
+    argo_dprintk("signalling domid:%u\n", d->domain_id);
+
+    send_guest_global_virq(d, VIRQ_ARGO);
+}
+
+static void
 ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
 {
     unsigned int i;
@@ -477,6 +514,390 @@ update_tx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
     smp_wmb();
 }
 
+static int
+memcpy_to_guest_ring(const struct domain *d, struct argo_ring_info *ring_info,
+                     unsigned int offset,
+                     const void *src, XEN_GUEST_HANDLE(uint8_t) src_hnd,
+                     unsigned int len)
+{
+    unsigned int mfns_index = offset >> PAGE_SHIFT;
+    void *dst;
+    int ret;
+    unsigned int src_offset = 0;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    offset &= ~PAGE_MASK;
+
+    if ( len + offset > XEN_ARGO_MAX_RING_SIZE )
+        return -EFAULT;
+
+    while ( len )
+    {
+        unsigned int head_len = (offset + len) > PAGE_SIZE ? PAGE_SIZE - offset
+                                                           : len;
+
+        ret = ring_map_page(d, ring_info, mfns_index, &dst);
+        if ( ret )
+            return ret;
+
+        if ( src )
+        {
+            memcpy(dst + offset, src + src_offset, head_len);
+            src_offset += head_len;
+        }
+        else
+        {
+            if ( copy_from_guest(dst + offset, src_hnd, head_len) )
+                return -EFAULT;
+
+            guest_handle_add_offset(src_hnd, head_len);
+        }
+
+        mfns_index++;
+        len -= head_len;
+        offset = 0;
+    }
+
+    return 0;
+}
+
+/*
+ * Use this with caution: rx_ptr is under guest control and may be bogus.
+ * See get_sanitized_ring for a safer alternative.
+ */
+static int
+get_rx_ptr(const struct domain *d, struct argo_ring_info *ring_info,
+           uint32_t *rx_ptr)
+{
+    void *src;
+    xen_argo_ring_t *ringp;
+    int ret;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( !ring_info->nmfns || ring_info->nmfns < NPAGES_RING(ring_info->len) )
+        return -EINVAL;
+
+    ret = ring_map_page(d, ring_info, 0, &src);
+    if ( ret )
+        return ret;
+
+    ringp = (xen_argo_ring_t *)src;
+
+    *rx_ptr = read_atomic(&ringp->rx_ptr);
+
+    return 0;
+}
+
+/*
+ * get_sanitized_ring creates a modified copy of the ring pointers where
+ * the rx_ptr is rounded up to ensure it is aligned, and then ring
+ * wrap is handled. Simplifies safe use of the rx_ptr for available
+ * space calculation.
+ */
+static int
+get_sanitized_ring(const struct domain *d, xen_argo_ring_t *ring,
+                   struct argo_ring_info *ring_info)
+{
+    uint32_t rx_ptr;
+    int ret;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    ret = get_rx_ptr(d, ring_info, &rx_ptr);
+    if ( ret )
+        return ret;
+
+    ring->tx_ptr = ring_info->tx_ptr;
+
+    rx_ptr = ROUNDUP_MESSAGE(rx_ptr);
+    if ( rx_ptr >= ring_info->len )
+        rx_ptr = 0;
+
+    ring->rx_ptr = rx_ptr;
+
+    return 0;
+}
+
+/*
+ * iov_count returns its count on success via an out variable to avoid
+ * potential for a negative return value to be used incorrectly
+ * (eg. coerced into an unsigned variable resulting in a large incorrect value)
+ */
+static int
+iov_count(const xen_argo_iov_t *piov, unsigned long niov,
+          unsigned int *count)
+{
+    unsigned int sum_iov_lens = 0;
+
+    if ( niov > XEN_ARGO_MAXIOV )
+        return -EINVAL;
+
+    for ( ; niov--; piov++ )
+    {
+        /* valid iovs must have the padding field set to zero */
+        if ( piov->pad )
+        {
+            argo_dprintk("invalid iov: padding is not zero\n");
+            return -EINVAL;
+        }
+
+        /* check each to protect sum against integer overflow */
+        if ( piov->iov_len > MAX_ARGO_MESSAGE_SIZE )
+        {
+            argo_dprintk("invalid iov_len: too big (%u)>%llu\n",
+                         piov->iov_len, MAX_ARGO_MESSAGE_SIZE);
+            return -EINVAL;
+        }
+
+        sum_iov_lens += piov->iov_len;
+
+        /*
+         * Again protect sum from integer overflow
+         * and ensure total msg size will be within bounds.
+         */
+        if ( sum_iov_lens > MAX_ARGO_MESSAGE_SIZE )
+        {
+            argo_dprintk("invalid iov series: total message too big\n");
+            return -EMSGSIZE;
+        }
+    }
+
+    *count = sum_iov_lens;
+
+    return 0;
+}
+
+static int
+ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
+               const struct argo_ring_id *src_id,
+               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
+               unsigned long niov, uint32_t message_type,
+               unsigned long *out_len)
+{
+    xen_argo_ring_t ring;
+    struct xen_argo_ring_message_header mh = { };
+    int sp, ret;
+    unsigned int len = 0;
+    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
+    xen_argo_iov_t *piov;
+    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
+       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
+    if ( ret )
+        return ret;
+
+    /*
+     * Obtain the total size of data to transmit -- sets the 'len' variable
+     * -- and sanity check that the iovs conform to size and number limits.
+     * Enforced below: no more than 'len' bytes of guest data
+     * (plus the message header) will be sent in this operation.
+     */
+    ret = iov_count(iovs, niov, &len);
+    if ( ret )
+        return ret;
+
+    /*
+     * Size bounds check against ring size and static maximum message limit.
+     * The message must not fill the ring; there must be at least one slot
+     * remaining so we can distinguish a full ring from an empty one.
+     */
+    if ( ((ROUNDUP_MESSAGE(len) +
+            sizeof(struct xen_argo_ring_message_header)) >= ring_info->len) ||
+         (len > MAX_ARGO_MESSAGE_SIZE) )
+        return -EMSGSIZE;
+
+    ret = get_sanitized_ring(d, &ring, ring_info);
+    if ( ret )
+        return ret;
+
+    argo_dprintk("ring.tx_ptr=%u ring.rx_ptr=%u ring len=%u"
+                 " ring_info->tx_ptr=%u\n",
+                 ring.tx_ptr, ring.rx_ptr, ring_info->len, ring_info->tx_ptr);
+
+    if ( ring.rx_ptr == ring.tx_ptr )
+        sp = ring_info->len;
+    else
+    {
+        sp = ring.rx_ptr - ring.tx_ptr;
+        if ( sp < 0 )
+            sp += ring_info->len;
+    }
+
+    /*
+     * Size bounds check against currently available space in the ring.
+     * Again: the message must not fill the ring leaving no space remaining.
+     */
+    if ( (ROUNDUP_MESSAGE(len) +
+            sizeof(struct xen_argo_ring_message_header)) >= sp )
+    {
+        argo_dprintk("EAGAIN\n");
+        return -EAGAIN;
+    }
+
+    mh.len = len + sizeof(struct xen_argo_ring_message_header);
+    mh.source.aport = src_id->aport;
+    mh.source.domain_id = src_id->domain_id;
+    mh.message_type = message_type;
+
+    /*
+     * For this copy to the guest ring, tx_ptr is always 16-byte aligned
+     * and the message header is 16 bytes long.
+     */
+    BUILD_BUG_ON(
+        sizeof(struct xen_argo_ring_message_header) != ROUNDUP_MESSAGE(1));
+
+    /*
+     * First data write into the destination ring: fixed size, message header.
+     * This cannot overrun because the available free space (value in 'sp')
+     * is checked above and must be at least this size.
+     */
+    ret = memcpy_to_guest_ring(d, ring_info,
+                               ring.tx_ptr + sizeof(xen_argo_ring_t),
+                               &mh, NULL_hnd, sizeof(mh));
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: failed to write message header to ring (vm%u:%x vm%u)\n",
+                ring_info->id.domain_id, ring_info->id.aport,
+                ring_info->id.partner_id);
+
+        return ret;
+    }
+
+    ring.tx_ptr += sizeof(mh);
+    if ( ring.tx_ptr == ring_info->len )
+        ring.tx_ptr = 0;
+
+    for ( piov = iovs; niov--; piov++ )
+    {
+        XEN_GUEST_HANDLE_64(uint8_t) buf_hnd = piov->iov_hnd;
+        unsigned int iov_len = piov->iov_len;
+
+        /* If no data is provided in this iov, moan and skip on to the next */
+        if ( !iov_len )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: no data iov_len=0 iov_hnd=%p ring (vm%u:%x vm%u)\n",
+                    buf_hnd.p, ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id);
+
+            continue;
+        }
+
+        if ( unlikely(!guest_handle_okay(buf_hnd, iov_len)) )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: bad iov handle [%p, %u] (vm%u:%x vm%u)\n",
+                    buf_hnd.p, iov_len,
+                    ring_info->id.domain_id, ring_info->id.aport,
+                    ring_info->id.partner_id);
+
+            return -EFAULT;
+        }
+
+        sp = ring_info->len - ring.tx_ptr;
+
+        /* Check: iov data size versus free space at the tail of the ring */
+        if ( iov_len > sp )
+        {
+            /*
+             * Second possible data write: ring-tail-wrap-write.
+             * Populate the ring tail and update the internal tx_ptr to handle
+             * wrapping at the end of ring.
+             * Size of data written here: sp
+             * which is the exact full amount of free space available at the
+             * tail of the ring, so this cannot overrun.
+             */
+            ret = memcpy_to_guest_ring(d, ring_info,
+                                       ring.tx_ptr + sizeof(xen_argo_ring_t),
+                                       NULL, buf_hnd, sp);
+            if ( ret )
+            {
+                gprintk(XENLOG_ERR,
+                        "argo: failed to copy {%p, %d} (vm%u:%x vm%u)\n",
+                        buf_hnd.p, sp,
+                        ring_info->id.domain_id, ring_info->id.aport,
+                        ring_info->id.partner_id);
+
+                return ret;
+            }
+
+            ring.tx_ptr = 0;
+            iov_len -= sp;
+            guest_handle_add_offset(buf_hnd, sp);
+
+            ASSERT(iov_len <= ring_info->len);
+        }
+
+        /*
+         * Third possible data write: all data remaining for this iov.
+         * Size of data written here: iov_len
+         *
+         * Case 1: if the ring-tail-wrap-write above was performed, then
+         *         iov_len has been decreased by 'sp' and ring.tx_ptr is zero.
+         *
+         *    We know from checking the result of iov_count:
+         *      len + sizeof(message_header) <= ring_info->len
+         *    We also know that len is the total of summing all iov_lens, so:
+         *       iov_len <= len
+         *    so by transitivity:
+         *       iov_len <= len <= (ring_info->len - sizeof(msgheader))
+         *    and therefore:
+         *       (iov_len + sizeof(msgheader) <= ring_info->len) &&
+         *       (ring.tx_ptr == 0)
+         *    so this write cannot overrun here.
+         *
+         * Case 2: ring-tail-wrap-write above was not performed
+         *    -> so iov_len is the guest-supplied value and: (iov_len <= sp)
+         *    ie. less than available space at the tail of the ring:
+         *        so this write cannot overrun.
+         */
+        ret = memcpy_to_guest_ring(d, ring_info,
+                                   ring.tx_ptr + sizeof(xen_argo_ring_t),
+                                   NULL, buf_hnd, iov_len);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: failed to copy [%p, %u] (vm%u:%x vm%u)\n",
+                    buf_hnd.p, iov_len, ring_info->id.domain_id,
+                    ring_info->id.aport, ring_info->id.partner_id);
+
+            return ret;
+        }
+
+        ring.tx_ptr += iov_len;
+
+        if ( ring.tx_ptr == ring_info->len )
+            ring.tx_ptr = 0;
+    }
+
+    ring.tx_ptr = ROUNDUP_MESSAGE(ring.tx_ptr);
+
+    if ( ring.tx_ptr >= ring_info->len )
+        ring.tx_ptr -= ring_info->len;
+
+    update_tx_ptr(d, ring_info, ring.tx_ptr);
+
+    /*
+     * At this point (and also on any error exit path from this function) it is
+     * possible to unmap the ring_info, ie:
+     *   ring_unmap(d, ring_info);
+     * but performance should be improved by not doing so, and retaining
+     * the mapping.
+     * An XSM policy control over level of confidentiality required
+     * versus performance cost could be added to decide that here.
+     */
+
+    *out_len = len;
+
+    return ret;
+}
+
 static void
 wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 {
@@ -497,6 +918,25 @@ wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 }
 
 static void
+wildcard_pending_list_insert(domid_t domain_id, struct pending_ent *ent)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    ASSERT(LOCKING_Read_L1);
+
+    if ( d->argo )
+    {
+        spin_lock(&d->argo->wildcard_L2_lock);
+        list_add(&ent->wildcard_node, &d->argo->wildcard_pend_list);
+        spin_unlock(&d->argo->wildcard_L2_lock);
+    }
+    put_domain(d);
+}
+
+static void
 pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
 {
     struct list_head *ring_pending = &ring_info->pending;
@@ -518,6 +958,70 @@ pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
     ring_info->npending = 0;
 }
 
+static int
+pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
+              domid_t src_id, unsigned int len)
+{
+    struct pending_ent *ent;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    if ( ring_info->npending >= MAX_PENDING_PER_RING )
+        return -ENOSPC;
+
+    ent = xmalloc(struct pending_ent);
+    if ( !ent )
+        return -ENOMEM;
+
+    ent->len = len;
+    ent->domain_id = src_id;
+    ent->ring_info = ring_info;
+
+    if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+        wildcard_pending_list_insert(src_id, ent);
+    list_add(&ent->node, &ring_info->pending);
+    ring_info->npending++;
+
+    return 0;
+}
+
+static int
+pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
+                domid_t src_id, unsigned int len)
+{
+    struct list_head *cursor, *head;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    /* List structure is not modified here. Update len in a match if found. */
+    head = &ring_info->pending;
+
+    for ( cursor = head->next; cursor != head; cursor = cursor->next )
+    {
+        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
+
+        if ( ent->domain_id == src_id )
+        {
+            /*
+             * Reuse an existing queue entry for a notification rather than add
+             * another. If the existing entry is waiting for a smaller size than
+             * the current message then adjust the record to wait for the
+             * current (larger) size to be available before triggering a
+             * notification.
+             * This assists the waiting sender by ensuring that whenever a
+             * notification is triggered, there is sufficient space available
+             * for (at least) any one of the messages awaiting transmission.
+             */
+            if ( ent->len < len )
+                ent->len = len;
+
+            return 0;
+        }
+    }
+
+    return pending_queue(d, ring_info, src_id, len);
+}
+
 static void
 wildcard_rings_pending_remove(struct domain *d)
 {
@@ -1082,6 +1586,91 @@ register_ring(struct domain *currd,
     return ret;
 }
 
+static long
+sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
+      const xen_argo_addr_t *dst_addr,
+      XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd, unsigned long niov,
+      uint32_t message_type)
+{
+    struct domain *dst_d = NULL;
+    struct argo_ring_id src_id;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+    unsigned long len = 0;
+
+    ASSERT(src_d->domain_id == src_addr->domain_id);
+
+    argo_dprintk("sendv: (%u:%x)->(%u:%x) niov:%lu iov:%p type:%u\n",
+                 src_addr->domain_id, src_addr->aport,
+                 dst_addr->domain_id, dst_addr->aport,
+                 niov, iovs_hnd.p, message_type);
+
+    src_id.aport = src_addr->aport;
+    src_id.domain_id = src_d->domain_id;
+    src_id.partner_id = dst_addr->domain_id;
+
+    dst_d = get_domain_by_id(dst_addr->domain_id);
+    if ( !dst_d )
+        return -ESRCH;
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !src_d->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( !dst_d->argo )
+    {
+        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+        ret = -ECONNREFUSED;
+        goto out_unlock;
+    }
+
+    read_lock(&dst_d->argo->rings_L2_rwlock);
+
+    ring_info = find_ring_info_by_match(dst_d, dst_addr->aport,
+                                        src_id.domain_id);
+    if ( !ring_info )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u connection refused, src (vm%u:%x) dst (vm%u:%x)\n",
+                current->domain->domain_id, src_id.domain_id, src_id.aport,
+                dst_addr->domain_id, dst_addr->aport);
+
+        ret = -ECONNREFUSED;
+    }
+    else
+    {
+        spin_lock(&ring_info->L3_lock);
+
+        ret = ringbuf_insert(dst_d, ring_info, &src_id, iovs_hnd, niov,
+                             message_type, &len);
+        if ( ret == -EAGAIN )
+        {
+            argo_dprintk("argo_ringbuf_sendv failed, EAGAIN\n");
+            /* requeue to issue a notification when space is there */
+            ret = pending_requeue(dst_d, ring_info, src_id.domain_id, len);
+        }
+
+        spin_unlock(&ring_info->L3_lock);
+    }
+
+    read_unlock(&dst_d->argo->rings_L2_rwlock);
+
+ out_unlock:
+    read_unlock(&L1_global_argo_rwlock);
+
+    if ( ret >= 0 )
+        signal_domain(dst_d);
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    return ( ret < 0 ) ? ret : len;
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
@@ -1145,6 +1734,53 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_OP_sendv:
+    {
+        xen_argo_send_addr_t send_addr;
+
+        XEN_GUEST_HANDLE_PARAM(xen_argo_send_addr_t) send_addr_hnd =
+            guest_handle_cast(arg1, xen_argo_send_addr_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd =
+            guest_handle_cast(arg2, xen_argo_iov_t);
+        /* arg3 is niov */
+        /* arg4 is message_type. Must be a 32-bit value. */
+
+        rc = copy_from_guest(&send_addr, send_addr_hnd, 1) ? -EFAULT : 0;
+        if ( rc )
+            break;
+
+        /*
+         * Check padding is zeroed. Reject niov above limit or message_types
+         * that are outside 32 bit range.
+         */
+        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
+                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        if ( send_addr.src.domain_id == XEN_ARGO_DOMID_ANY )
+            send_addr.src.domain_id = currd->domain_id;
+
+        /* No domain is currently authorized to send on behalf of another */
+        if ( unlikely(send_addr.src.domain_id != currd->domain_id) )
+        {
+            rc = -EPERM;
+            break;
+        }
+
+        /* Check whole array access here to allow faster __copy ops later. */
+        if ( unlikely(!guest_handle_okay(iovs_hnd, arg3)) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = sendv(currd, &send_addr.src, &send_addr.dst, iovs_hnd, arg3, arg4);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 6a1671c..6290ed6 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -23,3 +23,22 @@ CHECK_argo_addr;
 CHECK_argo_register_ring;
 CHECK_argo_ring;
 CHECK_argo_unregister_ring;
+
+/*
+ * Disable strict type checking in this compat validation macro for the
+ * following struct checks because it cannot handle fields within structs that
+ * have types that differ in the compat versus non-compat structs.
+ * Replace it with a field size check which is sufficient here.
+ */
+
+#undef CHECK_FIELD_COMMON_
+#define CHECK_FIELD_COMMON_(k, name, n, f) \
+static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
+{ \
+    BUILD_BUG_ON(offsetof(k xen_ ## n, f) != \
+                 offsetof(k compat_ ## n, f)); \
+    return sizeof(x->f) == sizeof(c->f); \
+}
+
+CHECK_argo_send_addr;
+CHECK_argo_iov;
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index f34d4f0..6fbe346 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -746,7 +746,7 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
     spin_unlock_irqrestore(&v->virq_lock, flags);
 }
 
-static void send_guest_global_virq(struct domain *d, uint32_t virq)
+void send_guest_global_virq(struct domain *d, uint32_t virq)
 {
     unsigned long flags;
     int port;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 2371510..a28454a 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -46,6 +46,34 @@ typedef uint32_t xen_argo_port_t;
 /* gfn type: 64-bit on all architectures to aid avoiding a compat ABI */
 typedef uint64_t xen_argo_gfn_t;
 
+/*
+ * XEN_ARGO_MAXIOV : maximum number of iovs accepted in a single sendv.
+ * Caution is required if this value is increased: this determines the size of
+ * an array of xen_argo_iov_t structs on the hypervisor stack, so could cause
+ * stack overflow if the value is too large.
+ * The Linux Argo driver never passes more than two iovs.
+ *
+ * This value should also not exceed 128 to ensure that the total amount of data
+ * posted in a single Argo sendv operation cannot exceed 2^31 bytes, to reduce
+ * risk of integer overflow defects:
+ * Each argo iov can hold ~ 2^24 bytes, so XEN_ARGO_MAXIOV <= 2^(31-24),
+ * ie. keep XEN_ARGO_MAXIOV <= 128.
+ */
+#define XEN_ARGO_MAXIOV          8U
+
+DEFINE_XEN_GUEST_HANDLE(uint8_t);
+
+typedef struct xen_argo_iov
+{
+#ifdef XEN_GUEST_HANDLE_64
+    XEN_GUEST_HANDLE_64(uint8_t) iov_hnd;
+#else
+    uint64_t iov_hnd;
+#endif
+    uint32_t iov_len;
+    uint32_t pad;
+} xen_argo_iov_t;
+
 typedef struct xen_argo_addr
 {
     xen_argo_port_t aport;
@@ -53,6 +81,12 @@ typedef struct xen_argo_addr
     uint16_t pad;
 } xen_argo_addr_t;
 
+typedef struct xen_argo_send_addr
+{
+    xen_argo_addr_t src;
+    xen_argo_addr_t dst;
+} xen_argo_send_addr_t;
+
 typedef struct xen_argo_ring
 {
     /* Guests should use atomic operations to access rx_ptr */
@@ -153,4 +187,30 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_OP_unregister_ring     2
 
+/*
+ * XEN_ARGO_OP_sendv
+ *
+ * Send a list of buffers contained in iovs.
+ *
+ * The send address struct specifies the source and destination addresses
+ * for the message being sent, which are used to find the destination ring:
+ * Xen first looks for a most-specific match with a registered ring with
+ *  (id.addr == dst) and (id.partner == sending_domain) ;
+ * if that fails, it then looks for a wildcard match (aka multicast receiver)
+ * where (id.addr == dst) and (id.partner == DOMID_ANY).
+ *
+ * For each iov entry, send iov_len bytes from iov_hnd to the destination ring.
+ * If insufficient space exists in the destination ring, it will return -EAGAIN
+ * and Xen will notify the caller when sufficient space becomes available.
+ *
+ * The message type is a 32-bit data field available to communicate message
+ * context data (eg. kernel-to-kernel, rather than application layer).
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_send_addr_t) source and dest addresses
+ * arg2: XEN_GUEST_HANDLE(xen_argo_iov_t) iovs
+ * arg3: unsigned long niov
+ * arg4: unsigned long message type (32-bit value)
+ */
+#define XEN_ARGO_OP_sendv               3
+
 #endif
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index b3f6491..ccdffc0 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -178,7 +178,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
 #define VIRQ_PCPU_STATE 9  /* G. (DOM0) PCPU state changed                   */
 #define VIRQ_MEM_EVENT  10 /* G. (DOM0) A memory event has occurred          */
-#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient                     */
+#define VIRQ_ARGO       11 /* G. Argo interdomain message notification       */
 #define VIRQ_ENOMEM     12 /* G. (DOM0) Low on heap memory       */
 #define VIRQ_XENPMU     13 /* V.  PMC interrupt                              */
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index ebb879e..4650887 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -29,6 +29,13 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq);
 void send_global_virq(uint32_t virq);
 
 /*
+ * send_guest_global_virq:
+ *  @d:        Domain to which VIRQ should be sent
+ *  @virq:     Virtual IRQ number (VIRQ_*), must be global
+ */
+void send_guest_global_virq(struct domain *d, uint32_t virq);
+
+/*
  * sent_global_virq_handler: Set a global VIRQ handler.
  *  @d:        New target domain for this VIRQ
  *  @virq:     Virtual IRQ number (VIRQ_*), must be global
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 411c661..3723980 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -152,3 +152,5 @@
 ?	argo_ring			argo.h
 ?	argo_register_ring		argo.h
 ?	argo_unregister_ring		argo.h
+?	argo_iov			argo.h
+?	argo_send_addr			argo.h
-- 
2.7.4
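
The destination ring lookup described in the XEN_ARGO_OP_sendv comment
(most-specific match first, then wildcard) can be sketched as below.
This is a simplified illustration under stated assumptions, not the
hypervisor code: struct ring_id, find_ring() and the flat array are
hypothetical stand-ins for the per-domain hash table and
find_ring_info_by_match(), and SKETCH_DOMID_ANY is a placeholder value
chosen for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder wildcard value, an assumption made for this sketch only. */
#define SKETCH_DOMID_ANY 0x7ff4U

/* Hypothetical, simplified ring identity: port plus permitted partner. */
struct ring_id {
    uint32_t aport;
    uint16_t partner_id;
};

/*
 * Look for the most specific match first (port and sending domain), then
 * fall back to a wildcard ring registered on the same port.
 * Returns the index of the matching ring, or -1 if none is registered.
 */
static int find_ring(const struct ring_id *rings, size_t nrings,
                     uint32_t aport, uint16_t sender)
{
    size_t i;

    assert(rings != NULL || nrings == 0);

    for ( i = 0; i < nrings; i++ )
        if ( rings[i].aport == aport && rings[i].partner_id == sender )
            return (int)i;

    for ( i = 0; i < nrings; i++ )
        if ( rings[i].aport == aport &&
             rings[i].partner_id == SKETCH_DOMID_ANY )
            return (int)i;

    return -1;
}
```

With a wildcard ring and a partner-specific ring both registered on the
same port, the partner-specific ring wins for that partner and every
other sender falls through to the wildcard ring. A result of -1
corresponds to the -ECONNREFUSED path in sendv.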


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v5 10/15] argo: implement the notify op
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (8 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-22 14:09   ` Roger Pau Monné
  2019-01-21  9:59 ` [PATCH v5 11/15] xsm, argo: XSM control for argo register Christopher Clark
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Query for data about space availability in registered rings and
cause a notification to be sent when space has become available.

The hypercall op populates a supplied data structure with information about
ring state and if insufficient space is currently available in a given ring,
the hypervisor will record the domain's expressed interest and notify it
when it observes that space has become available.

Checks for free space occur when this notify op is invoked, so it may be
intentionally invoked with no data structure to populate
(ie. a NULL argument) to trigger such a check and consequent notifications.

Limit the maximum number of notify requests in a single operation to a
simple fixed limit of 256.
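
As an illustration of how a caller might act on the flags returned for
one entry, here is a minimal sketch. The flag values are the ones this
patch adds to public/argo.h; interpret_flags() and the outcome names are
hypothetical, and the sketch assumes the hypercall itself returned
success, so that the flags were written back.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as added to public/argo.h by this patch. */
#define XEN_ARGO_RING_EMPTY       (1U << 0)
#define XEN_ARGO_RING_EXISTS      (1U << 1)
#define XEN_ARGO_RING_SUFFICIENT  (1U << 2)
#define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
#define XEN_ARGO_RING_SHARED      (1U << 4)

/* Hypothetical outcome names, used only in this sketch. */
enum notify_outcome {
    NOTIFY_NO_RING,     /* no matching ring is registered */
    NOTIFY_TOO_LARGE,   /* space_required can never fit in this ring */
    NOTIFY_SEND_NOW,    /* enough space might be available right now */
    NOTIFY_WAIT         /* Xen recorded interest and will signal later */
};

static enum notify_outcome interpret_flags(uint16_t flags)
{
    /* fill_ring_data sets at most one of these two flags per entry. */
    assert(!((flags & XEN_ARGO_RING_EMSGSIZE) &&
             (flags & XEN_ARGO_RING_SUFFICIENT)));

    if ( !(flags & XEN_ARGO_RING_EXISTS) )
        return NOTIFY_NO_RING;
    if ( flags & XEN_ARGO_RING_EMSGSIZE )
        return NOTIFY_TOO_LARGE;
    if ( flags & XEN_ARGO_RING_SUFFICIENT )
        return NOTIFY_SEND_NOW;

    return NOTIFY_WAIT;
}
```

XEN_ARGO_RING_EMPTY and XEN_ARGO_RING_SHARED are informational and do
not change the outcome in this sketch.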

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v4 #10 Roger: consolidate notify flags; infer pending notify if needed
v4 bugfix: take L3 before accessing ring_info in fill_ring_data
v4 #10 Roger: shorten notify flag names: drop _DATA_F
v4 #10 self/Roger: fill_ring_data: check pending_requeue error code
v4 : use standard data structures as per common code
v4 #10 Roger: lower indentation in fill_ring_data by using goto
v4 #10 Roger: reword the XEN_ARGO_RING_DATA_F_SUFFICIENT comment
v4 fix location of a FIXME that was incorrectly moved by this later commit

v3 #07 Jan: fix format string indentation in printks
v3 (general) Jan: drop fixed width types for ringbuf_payload_space
v3 #07 Jan: rename ring_find_info_by_match to find_ring_info_by_match
v3 #07 Jan: fix numeric entries in printk format strings
v3: ringbuf_payload_space: simpler return 0 if get_sanitized_ring fails
v3 #10 Roger: simplify ringbuf_payload_space for empty rings
v3 #10 Roger: ringbuf_payload_space: add comment to explain how ret < INT32_MAX
v3 #10 Roger: drop out label, use return -EFAULT in fill_ring_data
v3 #10 Roger: add newline in signal_domid
v3 #10 Roger: move find functions to top of file and drop prototypes
v3 #04 Jan: meld the compat hypercall arg checking
v3 #04 Roger/Jan: make lock names clearer and assert their state
v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
v3 self: drop braces in foreach of notify_check_pending
v3 feedback Roger/Jan: ASSERT currd is current->domain or use 'd' variable name

v2 feedback Jan: drop cookie, implement teardown
v2 notify: add flag to indicate ring is shared
v2 argument name for fill_ring_data arg is now currd
v2 self: check ring size vs request and flag error rather than queue signal
v2 feedback Jan: drop 'message' from 'argo_message_op'
v2 self: simplify signal_domid, drop unnecessary label + goto
v2 self: skip the cookie check in pending_cancel
v2 self: implement npending limit on number of pending entries
v1 feedback #16 Jan: sanitize_ring in ringbuf_payload_space
v2 self: inline fill_ring_data_array
v2 self: avoid retesting dst_d for put_domain
v2 self/Jan: remove use of magic verification field and tidy up
v1 feedback #16 Jan: remove testing of magic in guest-supplied structure
v2 self: s/argo_pending_ent/pending_ent/g
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v2 self: reduce indentation via goto out if arg NULL
v1 feedback #13 Jan: resolve checking of array handle and use of __copy
v1 #5 (#16) feedback Paul: notify op: use currd in do_argo_message_op
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify_check_pending
v1 #5 (#16) feedback Paul: notify op: use currd in argo_fill_ring_data_array
v1 #13 (#16) feedback Paul: notify op: do/while: reindent only
v1 #13 (#16) feedback Paul: notify op: do/while: goto
v1 : add compat xlat.lst entries
v1: add definition for copy_field_from_guest_errno
v1 #13 feedback Jan: make 'ring data' comment comply with single-line style
v1 feedback #13 Jan: use __copy; so define and use __copy_field_to_guest_errno
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: add blank line after case in do_argo_message_op
v1: self: rename ent id to domain_id
v1: self: ent id-> domain_id
v1: self: drop signal if domain_cookie mismatches
v1. feedback #15 Jan: make loop i unsigned
v1. self: drop unnecessary mb() in argo_notify_check_pending
v1. self: add blank line
v1 #16 feedback Jan: const domain arg to +argo_fill_ring_data
v1. feedback #15 Jan: check unused hypercall args are zero
v1 feedback #16 Jan: add comment on space available signal policy
v1. feedback #16 Jan: move declr, drop braces, lower indent
v1. feedback #18 Jan: meld the resource limits into the main commit
v1. feedback #16 Jan: clarify use of magic field
v1. self: use single copy to read notify ring data struct
v1: argo_fill_ring_data: fix dprintk types for port field
v1: self: use %x for printing port as per other print sites
v1. feedback Jan: add comments explaining ring full vs empty
v1. following Jan: fix argo_ringbuf_payload_space calculation for empty ring
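
For reviewers, the arithmetic in ringbuf_payload_space can be exercised
outside the hypervisor with the sketch below. It is a standalone
approximation, not the patch code: payload_space() takes the already
sanitized indexes as plain arguments, SLOT_SIZE mirrors
XEN_ARGO_MSG_SLOT_SIZE, and HEADER_SIZE is an assumed 16 bytes for
sizeof(struct xen_argo_ring_message_header).

```c
#include <assert.h>

#define SLOT_SIZE    0x10U  /* mirrors XEN_ARGO_MSG_SLOT_SIZE */
#define HEADER_SIZE  16U    /* assumed size of the message header */

static unsigned int payload_space(unsigned int len, unsigned int rx_ptr,
                                  unsigned int tx_ptr)
{
    int ret;

    if ( !len )
        return 0;

    /* Sanitized indexes are always inside the ring. */
    assert(rx_ptr < len && tx_ptr < len);

    /* Free bytes from the producer index round to the consumer index. */
    ret = (int)rx_ptr - (int)tx_ptr;
    if ( ret <= 0 )
        ret += (int)len;

    /* Reserve one header plus the slot that is always left unfilled. */
    ret -= (int)HEADER_SIZE;
    ret -= (int)SLOT_SIZE;

    return (ret < 0) ? 0 : (unsigned int)ret;
}
```

For an empty ring (rx_ptr == tx_ptr) the result equals len minus the
header minus one slot, which is the same value fill_ring_data reports as
max_message_size and compares against to set XEN_ARGO_RING_EMPTY.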

 xen/common/argo.c         | 371 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c  |  18 +++
 xen/include/public/argo.h |  64 ++++++++
 xen/include/xlat.lst      |   2 +
 4 files changed, 455 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 518aff7..4b43bdd 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -30,6 +30,7 @@
 #include <public/argo.h>
 
 #define MAX_RINGS_PER_DOMAIN            128U
+#define MAX_NOTIFY_COUNT                256U
 #define MAX_PENDING_PER_RING             32U
 
 /* All messages on the ring are padded to a multiple of the slot size. */
@@ -49,6 +50,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
 
@@ -421,6 +424,18 @@ signal_domain(struct domain *d)
 }
 
 static void
+signal_domid(domid_t domain_id)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    signal_domain(d);
+    put_domain(d);
+}
+
+static void
 ring_unmap(const struct domain *d, struct argo_ring_info *ring_info)
 {
     unsigned int i;
@@ -620,6 +635,66 @@ get_sanitized_ring(const struct domain *d, xen_argo_ring_t *ring,
     return 0;
 }
 
+static unsigned int
+ringbuf_payload_space(const struct domain *d, struct argo_ring_info *ring_info)
+{
+    xen_argo_ring_t ring;
+    unsigned int len;
+    int ret;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    len = ring_info->len;
+    if ( !len )
+        return 0;
+
+    if ( get_sanitized_ring(d, &ring, ring_info) )
+        return 0;
+
+    argo_dprintk("sanitized ringbuf_payload_space: tx_ptr=%u rx_ptr=%u\n",
+                 ring.tx_ptr, ring.rx_ptr);
+
+    /*
+     * rx_ptr == tx_ptr means that the ring has been emptied.
+     * See message size checking logic in the entry to ringbuf_insert which
+     * ensures that there is always one message slot of size ROUNDUP_MESSAGE(1)
+     * left available, preventing a ring from being entirely filled.
+     * This ensures that matching ring indexes always indicate an empty ring
+     * and never a full one.
+     */
+    ret = ring.rx_ptr - ring.tx_ptr;
+    if ( ret <= 0 )
+        ret += len;
+
+    /*
+     * In a sanitized ring, we can rely on:
+     *              (rx_ptr < ring_info->len)           &&
+     *              (tx_ptr < ring_info->len)           &&
+     *      (ring_info->len <= XEN_ARGO_MAX_RING_SIZE)
+     *
+     * and since: XEN_ARGO_MAX_RING_SIZE < INT32_MAX
+     * therefore right here: ret < INT32_MAX
+     * and we are safe to return it as an unsigned value from this function.
+     * The subtractions below cannot increase its value.
+     */
+
+    /*
+     * The maximum size payload for a message that will be accepted is:
+     * (the available space between the ring indexes)
+     *    minus (space for a message header)
+     *    minus (space for one message slot)
+     * since ringbuf_insert requires that one message slot be left
+     * unfilled, to avoid filling the ring to capacity and confusing a full
+     * ring with an empty one.
+     * Since the ring indexes are sanitized, the value in ret is aligned, so
+     * the simple subtraction here works to return the aligned value needed:
+     */
+    ret -= sizeof(struct xen_argo_ring_message_header);
+    ret -= ROUNDUP_MESSAGE(1);
+
+    return (ret < 0) ? 0 : ret;
+}
+
 /*
  * iov_count returns its count on success via an out variable to avoid
  * potential for a negative return value to be used incorrectly
@@ -958,6 +1033,71 @@ pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
     ring_info->npending = 0;
 }
 
+static void
+pending_notify(struct list_head *to_notify)
+{
+    ASSERT(LOCKING_Read_L1);
+
+    /* Sending signals for all ents in this list, draining until it is empty. */
+    while ( !list_empty(to_notify) )
+    {
+        struct pending_ent *ent =
+            list_entry(to_notify->next, struct pending_ent, node);
+
+        list_del(&ent->node);
+        signal_domid(ent->domain_id);
+        xfree(ent);
+    }
+}
+
+static void
+pending_find(const struct domain *d, struct argo_ring_info *ring_info,
+             unsigned int payload_space, struct list_head *to_notify)
+{
+    struct list_head *cursor, *pending_head;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    /*
+     * TODO: Current policy here is to signal _all_ of the waiting domains
+     *       interested in sending a message of size less than payload_space.
+     *
+     * This is likely to be suboptimal, since once one of them has added
+     * their message to the ring, there may well be insufficient room
+     * available for any of the others to transmit, meaning that they were
+     * woken in vain, which created extra work just to requeue their wait.
+     *
+     * Retain this simple policy for now since it at least avoids starving a
+     * domain of available space notifications because of a policy that only
+     * notified other domains instead. Improvement may be possible;
+     * investigation required.
+     */
+    spin_lock(&ring_info->L3_lock);
+
+    /* Remove matching ents from the ring list, and add them to "to_notify" */
+    pending_head = &ring_info->pending;
+    cursor = pending_head->next;
+
+    while ( cursor != pending_head )
+    {
+        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
+
+        cursor = cursor->next;
+
+        if ( payload_space >= ent->len )
+        {
+            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+                wildcard_pending_list_remove(ent->domain_id, ent);
+
+            list_del(&ent->node);
+            ring_info->npending--;
+            list_add(&ent->node, to_notify);
+        }
+    }
+
+    spin_unlock(&ring_info->L3_lock);
+}
+
 static int
 pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
               domid_t src_id, unsigned int len)
@@ -1023,6 +1163,36 @@ pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
 }
 
 static void
+pending_cancel(const struct domain *d, struct argo_ring_info *ring_info,
+               domid_t src_id)
+{
+    struct list_head *cursor, *pending_head;
+
+    ASSERT(LOCKING_L3(d, ring_info));
+
+    /* Remove all ents where domain_id matches src_id from the ring's list. */
+    pending_head = &ring_info->pending;
+    cursor = pending_head->next;
+
+    while ( cursor != pending_head )
+    {
+        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
+
+        cursor = cursor->next;
+
+        if ( ent->domain_id == src_id )
+        {
+            /* For wildcard rings, remove each from their wildcard list too. */
+            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+                wildcard_pending_list_remove(ent->domain_id, ent);
+            list_del(&ent->node);
+            xfree(ent);
+            ring_info->npending--;
+        }
+    }
+}
+
+static void
 wildcard_rings_pending_remove(struct domain *d)
 {
     struct list_head *wildcard_head;
@@ -1158,6 +1328,86 @@ partner_rings_remove(struct domain *src_d)
 }
 
 static int
+fill_ring_data(const struct domain *currd,
+               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
+{
+    xen_argo_ring_data_ent_t ent;
+    struct domain *dst_d;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+
+    ASSERT(currd == current->domain);
+    ASSERT(LOCKING_Read_L1);
+
+    if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
+        return -EFAULT;
+
+    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.aport=%x\n",
+                 ent.ring.domain_id, ent.ring.aport);
+
+    ent.flags = 0;
+
+    dst_d = get_domain_by_id(ent.ring.domain_id);
+    if ( !dst_d || !dst_d->argo )
+        goto out;
+
+    read_lock(&dst_d->argo->rings_L2_rwlock);
+
+    ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
+                                        currd->domain_id);
+    if ( ring_info )
+    {
+        unsigned int space_avail;
+
+        ent.flags |= XEN_ARGO_RING_EXISTS;
+
+        spin_lock(&ring_info->L3_lock);
+
+        ent.max_message_size = ring_info->len -
+                                   sizeof(struct xen_argo_ring_message_header) -
+                                   ROUNDUP_MESSAGE(1);
+
+        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
+            ent.flags |= XEN_ARGO_RING_SHARED;
+
+        space_avail = ringbuf_payload_space(dst_d, ring_info);
+
+        argo_dprintk("fill_ring_data: aport=%x space_avail=%u"
+                     " space_wanted=%u\n",
+                     ring_info->id.aport, space_avail, ent.space_required);
+
+        /* Do not queue a notification for an unachievable size */
+        if ( ent.space_required > ent.max_message_size )
+            ent.flags |= XEN_ARGO_RING_EMSGSIZE;
+        else if ( space_avail >= ent.space_required )
+        {
+            pending_cancel(dst_d, ring_info, currd->domain_id);
+            ent.flags |= XEN_ARGO_RING_SUFFICIENT;
+        }
+        else
+            ret = pending_requeue(dst_d, ring_info, currd->domain_id,
+                                  ent.space_required);
+
+        spin_unlock(&ring_info->L3_lock);
+
+        if ( space_avail == ent.max_message_size )
+            ent.flags |= XEN_ARGO_RING_EMPTY;
+
+    }
+    read_unlock(&dst_d->argo->rings_L2_rwlock);
+
+ out:
+    if ( dst_d )
+        put_domain(dst_d);
+
+    if ( !ret && (__copy_field_to_guest(data_ent_hnd, &ent, flags) ||
+                  __copy_field_to_guest(data_ent_hnd, &ent, max_message_size)) )
+        return -EFAULT;
+
+    return ret;
+}
+
+static int
 find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
 {
     struct page_info *page;
@@ -1586,6 +1836,112 @@ register_ring(struct domain *currd,
     return ret;
 }
 
+static void
+notify_ring(const struct domain *d, struct argo_ring_info *ring_info,
+            struct list_head *to_notify)
+{
+    unsigned int space;
+
+    ASSERT(LOCKING_Read_rings_L2(d));
+
+    spin_lock(&ring_info->L3_lock);
+
+    if ( ring_info->len )
+        space = ringbuf_payload_space(d, ring_info);
+    else
+        space = 0;
+
+    spin_unlock(&ring_info->L3_lock);
+
+    if ( space )
+        pending_find(d, ring_info, space, to_notify);
+}
+
+static void
+notify_check_pending(struct domain *d)
+{
+    unsigned int i;
+    LIST_HEAD(to_notify);
+
+    ASSERT(LOCKING_Read_L1);
+
+    read_lock(&d->argo->rings_L2_rwlock);
+
+    /* Walk all rings, call notify_ring on each to populate to_notify list */
+    for ( i = 0; i < ARGO_HASHTABLE_SIZE; i++ )
+    {
+        struct list_head *cursor, *bucket = &d->argo->ring_hash[i];
+        struct argo_ring_info *ring_info;
+
+        for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )
+        {
+            ring_info = list_entry(cursor, struct argo_ring_info, node);
+            notify_ring(d, ring_info, &to_notify);
+        }
+    }
+
+    read_unlock(&d->argo->rings_L2_rwlock);
+
+    if ( !list_empty(&to_notify) )
+        pending_notify(&to_notify);
+}
+
+static long
+notify(struct domain *currd,
+       XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd)
+{
+    XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) ent_hnd;
+    xen_argo_ring_data_t ring_data;
+    int ret = 0;
+
+    ASSERT(currd == current->domain);
+
+    read_lock(&L1_global_argo_rwlock);
+
+    if ( !currd->argo )
+    {
+        argo_dprintk("!d->argo, ENODEV\n");
+        ret = -ENODEV;
+        goto out;
+    }
+
+    notify_check_pending(currd);
+
+    if ( guest_handle_is_null(ring_data_hnd) )
+        goto out;
+
+    ret = copy_from_guest(&ring_data, ring_data_hnd, 1) ? -EFAULT : 0;
+    if ( ret )
+        goto out;
+
+    if ( ring_data.nent > MAX_NOTIFY_COUNT )
+    {
+        gprintk(XENLOG_ERR, "argo: notify entry count(%u) exceeds max(%u)\n",
+                ring_data.nent, MAX_NOTIFY_COUNT);
+        ret = -EACCES;
+        goto out;
+    }
+
+    ent_hnd = guest_handle_for_field(ring_data_hnd,
+                                     xen_argo_ring_data_ent_t, data[0]);
+    if ( unlikely(!guest_handle_okay(ent_hnd, ring_data.nent)) )
+    {
+        ret = -EFAULT;
+        goto out;
+    }
+
+    while ( !ret && ring_data.nent-- )
+    {
+        ret = fill_ring_data(currd, ent_hnd);
+        guest_handle_add_offset(ent_hnd, 1);
+    }
+
+ out:
+    read_unlock(&L1_global_argo_rwlock);
+
+    return ret;
+}
+
 static long
 sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
       const xen_argo_addr_t *dst_addr,
@@ -1781,6 +2137,21 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_OP_notify:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd =
+                   guest_handle_cast(arg1, xen_argo_ring_data_t);
+
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = notify(currd, ring_data_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
index 6290ed6..4fac597 100644
--- a/xen/common/compat/argo.c
+++ b/xen/common/compat/argo.c
@@ -41,4 +41,22 @@ static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
 }
 
 CHECK_argo_send_addr;
+CHECK_argo_ring_data_ent;
 CHECK_argo_iov;
+
+/*
+ * Disable the sizeof checking for the following struct checks because
+ * these structs have variable-size fields whose types differ between the
+ * compat and non-compat structs, which prevents the size check validation.
+ */
+
+#undef CHECK_FIELD_COMMON_
+#define CHECK_FIELD_COMMON_(k, name, n, f) \
+static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
+{ \
+    BUILD_BUG_ON(offsetof(k xen_ ## n, f) != \
+                 offsetof(k compat_ ## n, f)); \
+    return 1; \
+}
+
+CHECK_argo_ring_data;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index a28454a..d2fdd03 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -123,6 +123,40 @@ typedef struct xen_argo_unregister_ring
 /* Messages on the ring are padded to a multiple of this size. */
 #define XEN_ARGO_MSG_SLOT_SIZE 0x10
 
+/*
+ * Notify flags
+ */
+/* Ring is empty */
+#define XEN_ARGO_RING_EMPTY       (1U << 0)
+/* Ring exists */
+#define XEN_ARGO_RING_EXISTS      (1U << 1)
+/* Sufficient space to queue space_required bytes might exist */
+#define XEN_ARGO_RING_SUFFICIENT  (1U << 2)
+/* Insufficient ring size for space_required bytes */
+#define XEN_ARGO_RING_EMSGSIZE    (1U << 3)
+/* Ring is shared, not unicast */
+#define XEN_ARGO_RING_SHARED      (1U << 4)
+
+typedef struct xen_argo_ring_data_ent
+{
+    xen_argo_addr_t ring;
+    uint16_t flags;
+    uint16_t pad;
+    uint32_t space_required;
+    uint32_t max_message_size;
+} xen_argo_ring_data_ent_t;
+
+typedef struct xen_argo_ring_data
+{
+    uint32_t nent;
+    uint32_t pad;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    xen_argo_ring_data_ent_t data[];
+#elif defined(__GNUC__)
+    xen_argo_ring_data_ent_t data[0];
+#endif
+} xen_argo_ring_data_t;
+
 struct xen_argo_ring_message_header
 {
     uint32_t len;
@@ -213,4 +247,34 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_OP_sendv               3
 
+/*
+ * XEN_ARGO_OP_notify
+ *
+ * Asks Xen for information about other rings in the system.
+ *
+ * ent->ring is the xen_argo_addr_t of the ring you want information on.
+ * Uses the same ring matching rules as XEN_ARGO_OP_sendv.
+ *
+ * ent->space_required : if this field is non-zero then Xen will check
+ * that there is space in the destination ring for this many bytes of payload.
+ * If the ring is too small for the requested space_required, it will set the
+ * XEN_ARGO_RING_EMSGSIZE flag on return.
+ * If sufficient space is available, it will set XEN_ARGO_RING_SUFFICIENT
+ * and CANCEL any pending notification for that ent->ring; otherwise it
+ * will schedule a notification event and the flag will not be set.
+ *
+ * These flags are set by Xen when notify replies:
+ * XEN_ARGO_RING_EMPTY      ring is empty
+ * XEN_ARGO_RING_SUFFICIENT sufficient space for space_required is there
+ * XEN_ARGO_RING_EXISTS     ring exists
+ * XEN_ARGO_RING_EMSGSIZE   space_required too large for the ring size
+ * XEN_ARGO_RING_SHARED     ring is registered for wildcard partner
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_data_t) ring_data (may be NULL)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_OP_notify              4
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3723980..e45b60e 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -154,3 +154,5 @@
 ?	argo_unregister_ring		argo.h
 ?	argo_iov			argo.h
 ?	argo_send_addr			argo.h
+?	argo_ring_data_ent		argo.h
+?	argo_ring_data			argo.h
-- 
2.7.4



* [PATCH v5 11/15] xsm, argo: XSM control for argo register
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (9 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 10/15] argo: implement the notify op Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 12/15] xsm, argo: XSM control for argo message send operation Christopher Clark
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

XSM controls for argo ring registration with two distinct cases, where
the ring being registered is:

1) Single source:  registering a ring for communication to receive messages
                   from a specified single other domain.
   Default policy: allow.

2) Any source:     registering a ring for communication to receive messages
                   from any, or all, other domains (i.e. wildcard).
   Default policy: deny, with runtime policy configuration via bootparam.
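
The two cases above can be sketched as a small standalone decision
function. This is only a model under stated assumptions: struct ex_policy,
EX_DOMID_ANY and ex_register_check() are hypothetical stand-ins, the
placeholder value of EX_DOMID_ANY is not the real XEN_ARGO_DOMID_ANY, and
the lookup of the partner domain done by the real register_ring() is
omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* ASSUMPTION: placeholder value, not the real XEN_ARGO_DOMID_ANY. */
#define EX_DOMID_ANY 0x7fffU

/* Hypothetical bundle of the inputs that decide a registration. */
struct ex_policy {
    bool mac_permissive;       /* models opt_argo_mac_permissive */
    int any_source_result;     /* result of the any-source XSM hook */
    int single_source_result;  /* result of the single-source XSM hook */
};

static int ex_register_check(const struct ex_policy *p,
                             unsigned int partner_id)
{
    if ( partner_id == EX_DOMID_ANY )
    {
        /* Wildcard ring: the boot parameter is consulted before XSM. */
        if ( !p->mac_permissive )
            return -EPERM;

        return p->any_source_result;
    }

    /* Single-source ring: only the XSM hook decides. */
    return p->single_source_result;
}
```

With mac_permissive left false, a wildcard registration is refused with
-EPERM without consulting the hook, which matches the "default deny"
behaviour described for case 2.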

This commit also modifies the signatures of core XSM hook functions to
apply 'const' to their arguments. This is needed so that the functions
that invoke them can themselves take 'const' arguments.
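
A minimal sketch of why the qualifier has to reach the hooks, using
hypothetical names (struct ex_domain, ex_hook, ex_caller) rather than the
real Xen types:

```c
#include <assert.h>

struct ex_domain { int id; };

/* Hook declared with a const-qualified argument. */
static int ex_hook(const struct ex_domain *d)
{
    return d->id < 0 ? -1 : 0;
}

/*
 * A caller that itself takes a const pointer can invoke the hook directly.
 * Had ex_hook() taken 'struct ex_domain *', this call would discard the
 * qualifier and draw a compiler diagnostic unless a cast were added.
 */
static int ex_caller(const struct ex_domain *d)
{
    return ex_hook(d);
}
```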

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 Daniel/Jan: add to the default xsm policy for the register op
v3 hoist opt_argo_mac_permissive check to allow default policy to match non-XSM
v3 was: Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
v3 Add Daniel's Acked-by ; note minor changes required for v4
v3 feedback #07 Roger: use opt_argo_mac_permissive : a boolean opt
v2 feedback #9 Jan: refactor to use argo-mac bootparam at point of introduction
v1 feedback Paul: replace use of strncmp with strcmp
v1 feedback #16 Jan: apply const to function signatures
v1 feedback #14 Jan: add blank line before return in parse_argo_mac_param

 tools/flask/policy/modules/guest_features.te |  6 ++++++
 xen/common/argo.c                            | 11 +++++++++--
 xen/include/xsm/dummy.h                      | 14 ++++++++++++++
 xen/include/xsm/xsm.h                        | 19 +++++++++++++++++++
 xen/xsm/dummy.c                              |  4 ++++
 xen/xsm/flask/hooks.c                        | 27 ++++++++++++++++++++++++---
 xen/xsm/flask/policy/access_vectors          | 11 +++++++++++
 xen/xsm/flask/policy/security_classes        |  1 +
 8 files changed, 88 insertions(+), 5 deletions(-)

diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index 9ac9780..d00769e 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -5,6 +5,12 @@ allow domain_type xen_t:xen tmem_op;
 # pmu_ctrl is for)
 allow domain_type xen_t:xen2 pmu_use;
 
+# Allow all domains:
+# to register single-sender (unicast) rings to partner with any domain; and
+# to register any-sender (wildcard) rings that can be sent to by any domain.
+allow domain_type xen_t:argo { register_any_source };
+allow domain_type domain_type:argo { register_single_source };
+
 # Allow guest console output to the serial console.  This is used by PV Linux
 # and stub domains for early boot output, so don't audit even when we deny it.
 # Without XSM, this is enabled only if the Xen was compiled in debug mode.
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 4b43bdd..7061fd6 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -26,6 +26,7 @@
 #include <xen/nospec.h>
 #include <xen/sched.h>
 #include <xen/time.h>
+#include <xsm/xsm.h>
 
 #include <public/argo.h>
 
@@ -1645,8 +1646,10 @@ register_ring(struct domain *currd,
 
     if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
     {
-        if ( !opt_argo_mac_permissive )
-            return -EPERM;
+        ret = opt_argo_mac_permissive ? xsm_argo_register_any_source(currd) :
+                                        -EPERM;
+        if ( ret )
+            return ret;
     }
     else
     {
@@ -1657,6 +1660,10 @@ register_ring(struct domain *currd,
             return -ESRCH;
         }
 
+        ret = xsm_argo_register_single_source(currd, dst_d);
+        if ( ret )
+            goto out;
+
         send_info = xzalloc(struct argo_send_info);
         if ( !send_info )
         {
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index a29d1ef..96118aa 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -720,6 +720,20 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 
 #endif /* CONFIG_X86 */
 
+#ifdef CONFIG_ARGO
+static XSM_INLINE int xsm_argo_register_single_source(struct domain *d,
+                                                      struct domain *t)
+{
+    return 0;
+}
+
+static XSM_INLINE int xsm_argo_register_any_source(struct domain *d)
+{
+    return 0;
+}
+
+#endif /* CONFIG_ARGO */
+
 #include <public/version.h>
 static XSM_INLINE int xsm_xen_version (XSM_DEFAULT_ARG uint32_t op)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3b192b5..e32a645 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -181,6 +181,11 @@ struct xsm_operations {
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
+#ifdef CONFIG_ARGO
+    int (*argo_register_single_source) (const struct domain *d,
+                                        const struct domain *t);
+    int (*argo_register_any_source) (const struct domain *d);
+#endif
 };
 
 #ifdef CONFIG_XSM
@@ -698,6 +703,20 @@ static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
     return xsm_ops->domain_resource_map(d);
 }
 
+#ifdef CONFIG_ARGO
+static inline int xsm_argo_register_single_source(const struct domain *d,
+                                                  const struct domain *t)
+{
+    return xsm_ops->argo_register_single_source(d, t);
+}
+
+static inline int xsm_argo_register_any_source(const struct domain *d)
+{
+    return xsm_ops->argo_register_any_source(d);
+}
+
+#endif /* CONFIG_ARGO */
+
 #endif /* XSM_NO_WRAPPERS */
 
 #ifdef CONFIG_MULTIBOOT
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 5701047..ed236b0 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -152,4 +152,8 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
+#ifdef CONFIG_ARGO
+    set_to_dummy_if_null(ops, argo_register_single_source);
+    set_to_dummy_if_null(ops, argo_register_any_source);
+#endif
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 96d31aa..fcb7487 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -36,13 +36,14 @@
 #include <objsec.h>
 #include <conditional.h>
 
-static u32 domain_sid(struct domain *dom)
+static u32 domain_sid(const struct domain *dom)
 {
     struct domain_security_struct *dsec = dom->ssid;
     return dsec->sid;
 }
 
-static u32 domain_target_sid(struct domain *src, struct domain *dst)
+static u32 domain_target_sid(const struct domain *src,
+                             const struct domain *dst)
 {
     struct domain_security_struct *ssec = src->ssid;
     struct domain_security_struct *dsec = dst->ssid;
@@ -58,7 +59,8 @@ static u32 evtchn_sid(const struct evtchn *chn)
     return chn->ssid.flask_sid;
 }
 
-static int domain_has_perm(struct domain *dom1, struct domain *dom2, 
+static int domain_has_perm(const struct domain *dom1,
+                           const struct domain *dom2,
                            u16 class, u32 perms)
 {
     u32 ssid, tsid;
@@ -1717,6 +1719,21 @@ static int flask_domain_resource_map(struct domain *d)
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__RESOURCE_MAP);
 }
 
+#ifdef CONFIG_ARGO
+static int flask_argo_register_single_source(const struct domain *d,
+                                             const struct domain *t)
+{
+    return domain_has_perm(d, t, SECCLASS_ARGO,
+                           ARGO__REGISTER_SINGLE_SOURCE);
+}
+
+static int flask_argo_register_any_source(const struct domain *d)
+{
+    return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
+                        ARGO__REGISTER_ANY_SOURCE, NULL);
+}
+#endif
+
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 int compat_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 
@@ -1851,6 +1868,10 @@ static struct xsm_operations flask_ops = {
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
+#ifdef CONFIG_ARGO
+    .argo_register_single_source = flask_argo_register_single_source,
+    .argo_register_any_source = flask_argo_register_any_source,
+#endif
 };
 
 void __init flask_init(const void *policy_buffer, size_t policy_size)
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 6fecfda..fb95c97 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -531,3 +531,14 @@ class version
 # Xen build id
     xen_build_id
 }
+
+# Class argo is used to describe the Argo interdomain communication system.
+class argo
+{
+    # Domain requesting registration of a communication ring
+    # to receive messages from a specific other domain.
+    register_single_source
+    # Domain requesting registration of a communication ring
+    # to receive messages from any other domain.
+    register_any_source
+}
diff --git a/xen/xsm/flask/policy/security_classes b/xen/xsm/flask/policy/security_classes
index cde4e1a..50ecbab 100644
--- a/xen/xsm/flask/policy/security_classes
+++ b/xen/xsm/flask/policy/security_classes
@@ -19,5 +19,6 @@ class event
 class grant
 class security
 class version
+class argo
 
 # FLASK
-- 
2.7.4



* [PATCH v5 12/15] xsm, argo: XSM control for argo message send operation
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (10 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 11/15] xsm, argo: XSM control for argo register Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 13/15] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

Default policy: allow.
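
The check added to sendv() has one detail worth spelling out: the
destination domain reference has already been taken when the hook runs, so
a rejection must drop it before returning. The sketch below models that
with a plain counter. All ex_ names are hypothetical, -EACCES is only a
placeholder for whatever the active XSM module returns, and the final
ex_put_domain() on the success path is an assumption of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct ex_domain {
    int id;
    int refcnt;     /* models the reference held on the destination */
    int deny_send;  /* nonzero: policy rejects sends to this domain */
};

static struct ex_domain *ex_get_domain(struct ex_domain *d)
{
    if ( d )
        d->refcnt++;

    return d;
}

static void ex_put_domain(struct ex_domain *d)
{
    assert(d->refcnt > 0);
    d->refcnt--;
}

static int ex_xsm_send(const struct ex_domain *src,
                       const struct ex_domain *dst)
{
    (void)src;
    return dst->deny_send ? -EACCES : 0;
}

static int ex_sendv(struct ex_domain *src, struct ex_domain *dst_lookup)
{
    struct ex_domain *dst = ex_get_domain(dst_lookup);
    int ret;

    if ( !dst )
        return -ESRCH;

    ret = ex_xsm_send(src, dst);
    if ( ret )
    {
        ex_put_domain(dst);  /* drop the reference before bailing out */
        return ret;
    }

    /* ... message delivery would happen here ... */

    ex_put_domain(dst);
    return 0;
}
```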

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 Daniel/Jan: add to the default xsm policy for the send op
v3 Add Daniel's Acked-by
v2: reordered commit sequence to after sendv implementation
v1 feedback Jan #16: apply const to function signatures
v1 version was: Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

 tools/flask/policy/modules/guest_features.te |  7 ++++---
 xen/common/argo.c                            | 11 +++++++++++
 xen/include/xsm/dummy.h                      |  6 ++++++
 xen/include/xsm/xsm.h                        |  6 ++++++
 xen/xsm/dummy.c                              |  1 +
 xen/xsm/flask/hooks.c                        |  7 +++++++
 xen/xsm/flask/policy/access_vectors          |  2 ++
 7 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index d00769e..ca52257 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -6,10 +6,11 @@ allow domain_type xen_t:xen tmem_op;
 allow domain_type xen_t:xen2 pmu_use;
 
 # Allow all domains:
-# to register single-sender (unicast) rings to partner with any domain; and
-# to register any-sender (wildcard) rings that can be sent to by any domain.
+# to register single-sender (unicast) rings to partner with any domain;
+# to register any-sender (wildcard) rings that can be sent to by any domain;
+# and send messages to rings.
 allow domain_type xen_t:argo { register_any_source };
-allow domain_type domain_type:argo { register_single_source };
+allow domain_type domain_type:argo { send register_single_source };
 
 # Allow guest console output to the serial console.  This is used by PV Linux
 # and stub domains for early boot output, so don't audit even when we deny it.
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 7061fd6..77070f4 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1976,6 +1976,17 @@ sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
     if ( !dst_d )
         return -ESRCH;
 
+    ret = xsm_argo_send(src_d, dst_d);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR, "argo: XSM REJECTED %i -> %i\n",
+                src_d->domain_id, dst_d->domain_id);
+
+        put_domain(dst_d);
+
+        return ret;
+    }
+
     read_lock(&L1_global_argo_rwlock);
 
     if ( !src_d->argo )
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 96118aa..7daf1f0 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -732,6 +732,12 @@ static XSM_INLINE int xsm_argo_register_any_source(struct domain *d)
     return 0;
 }
 
+static XSM_INLINE int xsm_argo_send(const struct domain *d,
+                                    const struct domain *t)
+{
+    return 0;
+}
+
 #endif /* CONFIG_ARGO */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index e32a645..7c69efe 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -185,6 +185,7 @@ struct xsm_operations {
     int (*argo_register_single_source) (const struct domain *d,
                                         const struct domain *t);
     int (*argo_register_any_source) (const struct domain *d);
+    int (*argo_send) (const struct domain *d, const struct domain *t);
 #endif
 };
 
@@ -715,6 +716,11 @@ static inline xsm_argo_register_any_source(const struct domain *d)
     return xsm_ops->argo_register_any_source(d);
 }
 
+static inline int xsm_argo_send(const struct domain *d, const struct domain *t)
+{
+    return xsm_ops->argo_send(d, t);
+}
+
 #endif /* CONFIG_ARGO */
 
 #endif /* XSM_NO_WRAPPERS */
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index ed236b0..ffac774 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,5 +155,6 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 #ifdef CONFIG_ARGO
     set_to_dummy_if_null(ops, argo_register_single_source);
     set_to_dummy_if_null(ops, argo_register_any_source);
+    set_to_dummy_if_null(ops, argo_send);
 #endif
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index fcb7487..76c012c 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1732,6 +1732,12 @@ static int flask_argo_register_any_source(const struct domain *d)
     return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
                         ARGO__REGISTER_ANY_SOURCE, NULL);
 }
+
+static int flask_argo_send(const struct domain *d, const struct domain *t)
+{
+    return domain_has_perm(d, t, SECCLASS_ARGO, ARGO__SEND);
+}
+
 #endif
 
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
@@ -1871,6 +1877,7 @@ static struct xsm_operations flask_ops = {
 #ifdef CONFIG_ARGO
     .argo_register_single_source = flask_argo_register_single_source,
     .argo_register_any_source = flask_argo_register_any_source,
+    .argo_send = flask_argo_send,
 #endif
 };
 
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index fb95c97..f6c5377 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -541,4 +541,6 @@ class argo
     # Domain requesting registration of a communication ring
     # to receive messages from any other domain.
     register_any_source
+    # Domain sending a message to another domain.
+    send
 }
-- 
2.7.4



* [PATCH v5 13/15] xsm, argo: XSM control for any access to argo by a domain
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (11 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 12/15] xsm, argo: XSM control for argo message send operation Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 14/15] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

When XSM policy denies this permission, initialization of the domain's argo
data structure is inhibited. This prevents the domain from receiving any
messages or notifications and from accessing any of the argo hypercall
operations.
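
A minimal sketch of the two gates this patch adds, one at per-domain init
and one at hypercall entry. The ex_ types and functions are hypothetical
stand-ins for the Xen ones, and -EACCES is a placeholder for the hook's
failure value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct ex_argo { int rings; };

struct ex_domain {
    int id;
    bool xsm_denied;        /* models a policy lacking 'enable' */
    struct ex_argo *argo;   /* stays NULL when Argo is not initialised */
    struct ex_argo storage; /* backing store, to avoid allocation here */
};

static bool ex_opt_argo = true;  /* models the 'argo' boot parameter */

static int ex_xsm_enable(const struct ex_domain *d)
{
    return d->xsm_denied ? -EACCES : 0;
}

/* Per-domain init: silently skipped when disabled or denied. */
static int ex_argo_init(struct ex_domain *d)
{
    if ( !ex_opt_argo || ex_xsm_enable(d) )
        return 0;

    d->storage.rings = 0;
    d->argo = &d->storage;

    return 0;
}

/* Hypercall entry: the same gate, reported as "not supported". */
static long ex_do_argo_op(const struct ex_domain *currd)
{
    if ( !ex_opt_argo || ex_xsm_enable(currd) )
        return -EOPNOTSUPP;

    return 0;
}
```

Note that init reports success even when access is denied: the domain is
still created, it just has no argo state, so there is nothing for other
domains to send to.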

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 Daniel/Jan: add to the default xsm policy for enable
v3 Add Daniel's Acked-by
v3 #04 Jason/Roger: soft_reset: can assume reinit is ok if d->argo set
v2 self: fix xsm use in soft-reset prior to introduction
v1 #5 (#17) feedback Paul: XSM control for any access: use currd
v1 #16 feedback Jan: apply const to function signatures

 tools/flask/policy/modules/guest_features.te |  4 ++--
 xen/common/argo.c                            | 10 +++++-----
 xen/include/xsm/dummy.h                      |  5 +++++
 xen/include/xsm/xsm.h                        |  6 ++++++
 xen/xsm/dummy.c                              |  1 +
 xen/xsm/flask/hooks.c                        |  7 +++++++
 xen/xsm/flask/policy/access_vectors          |  3 +++
 7 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index ca52257..fe4835d 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -5,11 +5,11 @@ allow domain_type xen_t:xen tmem_op;
 # pmu_ctrl is for)
 allow domain_type xen_t:xen2 pmu_use;
 
-# Allow all domains:
+# Allow all domains to enable the Argo interdomain communication hypercall;
 # to register single-sender (unicast) rings to partner with any domain;
 # to register any-sender (wildcard) rings that can be sent to by any domain;
 # and send messages to rings.
-allow domain_type xen_t:argo { register_any_source };
+allow domain_type xen_t:argo { enable register_any_source };
 allow domain_type domain_type:argo { send register_single_source };
 
 # Allow guest console output to the serial console.  This is used by PV Linux
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 77070f4..4631f66 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -2056,7 +2056,7 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
     argo_dprintk("->do_argo_op(%u,%p,%p,%lu,0x%lx)\n", cmd,
                  (void *)arg1.p, (void *)arg2.p, arg3, arg4);
 
-    if ( unlikely(!opt_argo) )
+    if ( unlikely(!opt_argo || xsm_argo_enable(currd)) )
         return -EOPNOTSUPP;
 
     switch (cmd)
@@ -2202,7 +2202,7 @@ argo_init(struct domain *d)
 {
     struct argo_domain *argo;
 
-    if ( !opt_argo )
+    if ( !opt_argo || xsm_argo_enable(d) )
     {
         argo_dprintk("argo disabled, domid: %u\n", d->domain_id);
         return 0;
@@ -2259,9 +2259,9 @@ argo_soft_reset(struct domain *d)
         wildcard_rings_pending_remove(d);
 
         /*
-         * Since opt_argo cannot change at runtime, if d->argo is true then
-         * opt_argo must be true, and we can assume that init is allowed to
-         * proceed again here.
+         * Since neither opt_argo nor the result of xsm_argo_enable(d) can
+         * change at runtime, if d->argo is set then opt_argo is true and
+         * xsm_argo_enable(d) permits access, so init may proceed again here.
          */
         argo_domain_init(d->argo);
     }
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 7daf1f0..56d7865 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -721,6 +721,11 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 #endif /* CONFIG_X86 */
 
 #ifdef CONFIG_ARGO
+static XSM_INLINE int xsm_argo_enable(struct domain *d)
+{
+    return 0;
+}
+
 static XSM_INLINE int xsm_argo_register_single_source(struct domain *d,
                                                       struct domain *t)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7c69efe..8daffae 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -182,6 +182,7 @@ struct xsm_operations {
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
 #ifdef CONFIG_ARGO
+    int (*argo_enable) (const struct domain *d);
     int (*argo_register_single_source) (const struct domain *d,
                                         const struct domain *t);
     int (*argo_register_any_source) (const struct domain *d);
@@ -705,6 +706,11 @@ static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
 }
 
 #ifdef CONFIG_ARGO
+static inline int xsm_argo_enable(const struct domain *d)
+{
+    return xsm_ops->argo_enable(d);
+}
+
 static inline int xsm_argo_register_single_source(const struct domain *d,
                                                   const struct domain *t)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index ffac774..1fe0e74 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -153,6 +153,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
 #ifdef CONFIG_ARGO
+    set_to_dummy_if_null(ops, argo_enable);
     set_to_dummy_if_null(ops, argo_register_single_source);
     set_to_dummy_if_null(ops, argo_register_any_source);
     set_to_dummy_if_null(ops, argo_send);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 76c012c..3d00c74 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1720,6 +1720,12 @@ static int flask_domain_resource_map(struct domain *d)
 }
 
 #ifdef CONFIG_ARGO
+static int flask_argo_enable(const struct domain *d)
+{
+    return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
+                        ARGO__ENABLE, NULL);
+}
+
 static int flask_argo_register_single_source(const struct domain *d,
                                              const struct domain *t)
 {
@@ -1875,6 +1881,7 @@ static struct xsm_operations flask_ops = {
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
 #ifdef CONFIG_ARGO
+    .argo_enable = flask_argo_enable,
     .argo_register_single_source = flask_argo_register_single_source,
     .argo_register_any_source = flask_argo_register_any_source,
     .argo_send = flask_argo_send,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index f6c5377..e00448b 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -535,6 +535,9 @@ class version
 # Class argo is used to describe the Argo interdomain communication system.
 class argo
 {
+    # Enable initialization of a domain's argo subsystem and
+    # permission to access the argo hypercall operations.
+    enable
     # Domain requesting registration of a communication ring
     # to receive messages from a specific other domain.
     register_single_source
-- 
2.7.4



* [PATCH v5 14/15] xsm, argo: notify: don't describe rings that cannot be sent to
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (12 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 13/15] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21  9:59 ` [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer Christopher Clark
  2019-01-22 14:17 ` [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
  15 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
v3 #10 Roger: drop out label, use return -EFAULT in fill_ring_data
v3: Add Daniel's Acked-by

 xen/common/argo.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 4631f66..c58aba9 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1352,6 +1352,17 @@ fill_ring_data(const struct domain *currd,
     if ( !dst_d || !dst_d->argo )
         goto out;
 
+    /*
+     * Don't supply information about rings that a guest is not
+     * allowed to send to.
+     */
+    ret = xsm_argo_send(currd, dst_d);
+    if ( ret )
+    {
+        put_domain(dst_d);
+        return ret;
+    }
+
     read_lock(&dst_d->argo->rings_L2_rwlock);
 
     ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
-- 
2.7.4



* [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (13 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 14/15] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
@ 2019-01-21  9:59 ` Christopher Clark
  2019-01-21 10:58   ` Wei Liu
  2019-01-21 11:07   ` Jan Beulich
  2019-01-22 14:17 ` [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
  15 siblings, 2 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-21  9:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---

 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 96a0518..c4f5316 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -158,6 +158,14 @@ S:	Supported
 F:	xen/arch/x86/hvm/svm/
 F:	xen/arch/x86/cpu/vpmu_amd.c
 
+ARGO
+M:  Christopher Clark <christopher.w.clark@gmail.com>
+S:  Maintained
+F:  xen/include/public/argo.h
+F:  xen/include/xen/argo.h
+F:  xen/common/argo.c
+F:  xen/common/compat/argo.c
+
 ARINC653 SCHEDULER
 M:	Josh Whitehead <josh.whitehead@dornerworks.com>
 M:	Robert VanVossen <robert.vanvossen@dornerworks.com>
-- 
2.7.4



* Re: [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer
  2019-01-21  9:59 ` [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer Christopher Clark
@ 2019-01-21 10:58   ` Wei Liu
  2019-01-21 11:07   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Wei Liu @ 2019-01-21 10:58 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet, Roger Pau Monne

On Mon, Jan 21, 2019 at 01:59:55AM -0800, Christopher Clark wrote:
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
> ---
> 
>  MAINTAINERS | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 96a0518..c4f5316 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -158,6 +158,14 @@ S:	Supported
>  F:	xen/arch/x86/hvm/svm/
>  F:	xen/arch/x86/cpu/vpmu_amd.c
>  
> +ARGO
> +M:  Christopher Clark <christopher.w.clark@gmail.com>
> +S:  Maintained
> +F:  xen/include/public/argo.h
> +F:  xen/include/xen/argo.h
> +F:  xen/common/argo.c
> +F:  xen/common/compat/argo.c
> +

Use tab for indentation please.

Wei.


* Re: [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer
  2019-01-21  9:59 ` [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer Christopher Clark
  2019-01-21 10:58   ` Wei Liu
@ 2019-01-21 11:07   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2019-01-21 11:07 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	ross.philipson, Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, xen-devel,
	eric chanudet, Roger Pau Monne

>>> On 21.01.19 at 10:59, <christopher.w.clark@gmail.com> wrote:
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -158,6 +158,14 @@ S:	Supported
>  F:	xen/arch/x86/hvm/svm/
>  F:	xen/arch/x86/cpu/vpmu_amd.c
>  
> +ARGO
> +M:  Christopher Clark <christopher.w.clark@gmail.com>
> +S:  Maintained
> +F:  xen/include/public/argo.h
> +F:  xen/include/xen/argo.h
> +F:  xen/common/argo.c
> +F:  xen/common/compat/argo.c

Please follow the whitespace model of adjacent entries, all the more
so in light of this recent patch:
https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg00196.html
(I can't really tell why I didn't apply it yet)

Jan




* Re: [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-21  9:59 ` [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
@ 2019-01-21 17:55   ` Roger Pau Monné
  2019-01-31  4:06     ` Christopher Clark
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-21 17:55 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, James McKenzie, Eric Chanudet

On Mon, Jan 21, 2019 at 01:59:44AM -0800, Christopher Clark wrote:
> Initialises basic data structures and performs teardown of argo state
> for domain shutdown.
> 
> Inclusion of the Argo implementation is dependent on CONFIG_ARGO.
> 
> Introduces a new Xen command line parameter 'argo': bool to enable/disable
> the argo hypercall. Defaults to disabled.
> 
> New headers:
>   public/argo.h: with definitions of addresses and ring structure, including
>   indexes for atomic update for communication between domain and hypervisor.
> 
>   xen/argo.h: to expose the hooks for integration into domain lifecycle:
>     argo_init: per-domain init of argo data structures for domain_create.
>     argo_destroy: teardown for domain_destroy and the error exit
>                   path of domain_create.
>     argo_soft_reset: reset of domain state for domain_soft_reset.
> 
> Adds a new field to struct domain: struct argo_domain *argo;
> 
> In accordance with recent work on _domain_destroy, argo_destroy is
> idempotent. It will tear down: all rings registered by this domain, all
> rings where this domain is the single sender (ie. specified partner,
> non-wildcard rings), and all pending notifications where this domain is
> awaiting signal about available space in the rings of other domains.
> 
> A count will be maintained of the number of rings that a domain has
> registered in order to limit it below the fixed maximum limit defined here.
> 
> Macros are defined to verify the internal locking state within the argo
> implementation. The macros are ASSERTed on entry to functions to validate
> and document the required lock state prior to calling.
> 
> The hash function for the hashtables that hold ring state is derived from
> the string hashing function djb2 (http://www.cse.yorku.ca/~oz/hash.html)
> by Daniel J. Bernstein. Basic testing with a limited number of domains and
> ports has shown reasonable distribution for the table size.
> 
> The software license on the public header is the BSD license, standard
> procedure for the public Xen headers. The public header was originally
> posted under a GPL license at: [1]:
> https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html
> 
> The following ACK by Lars Kurth is to confirm that only people being
> employees of Citrix contributed to the header files in the series posted at
> [1] and that thus the copyright of the files in question is fully owned by
> Citrix. The ACK also confirms that Citrix is happy for the header files to
> be published under a BSD license in this series (which is based on [1]).
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
> Acked-by: Lars Kurth <lars.kurth@citrix.com>
> Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
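
For readers unfamiliar with djb2, the sketch below shows the general shape
of such a hash applied to a ring identifier. It is only an illustration:
the series' actual hash_index is not quoted in this reply, and the struct
layout, the byte order in which fields are fed in, and DEMO_HASHTABLE_SIZE
are all assumptions made for the example.

```c
#include <stdint.h>

/* Assumed table size, for illustration only. */
#define DEMO_HASHTABLE_SIZE 32u

/* Hypothetical stand-in for the ring identifier. */
struct demo_ring_id {
    uint32_t aport;
    uint16_t partner_id;
    uint16_t domain_id;
};

/* One djb2 round: hash * 33 + next input byte. */
static uint32_t djb2_step(uint32_t hash, uint8_t byte)
{
    return hash * 33u + byte;
}

/* Feed each field into djb2 one byte at a time, low byte first. */
static unsigned int demo_hash_index(const struct demo_ring_id *id)
{
    uint32_t hash = 5381; /* djb2 starting value */
    unsigned int i;

    for ( i = 0; i < 4; i++ )
        hash = djb2_step(hash, (id->aport >> (8 * i)) & 0xff);
    for ( i = 0; i < 2; i++ )
        hash = djb2_step(hash, (id->domain_id >> (8 * i)) & 0xff);
    for ( i = 0; i < 2; i++ )
        hash = djb2_step(hash, (id->partner_id >> (8 * i)) & 0xff);

    /* Reduce to a bucket index. */
    return hash % DEMO_HASHTABLE_SIZE;
}
```

The same identifier always lands in the same bucket, which is the property
the register and lookup paths rely on.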

Thanks.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I've got some nits below, but they're purely cosmetic changes to make
the code cleaner.

> ---
> v4 Jan: amend the command line doc text referring to build configuration
> v4 feedback: use standard data structures as per common code
> v4 Jan: replace hash_index with djb2-derived hash algorithm
> v4 Andrew: switch argo command line option to list argo=<bool>
> v4: removed note to remove argo_destroy from domain_kill (test shows issue)
> v4 #04 Roger: drop unneeded init of ring_count in argo_domain_init
> v4 #04 Roger: replace if (ring_info->mfns) with ASSERTs in ring_unmap
> v4 #04 Roger: rewrite the locking verification macros
> v4 #04 Roger: make L1 lock description comment clearer about R(L1) and W(L1)
> v4 Andrew: fix split of dprintk in ring_map_info across v4 commits
> 
> v3 #04 Andrew: use xzalloc for struct argo_domain in argo_init
> v3 #04 Andrew: reference CONFIG_ARGO in the command line documentation
> v3 #07 Jan: rename ring_find_info to find_ring_info
> v3 #04 Andrew: don't truncate args do_argo_op printk
> v3 #07 Jan: fix numeric entries in printk format strings
> v3 #10 Roger: move find functions to top of file and drop prototypes
> v3 #04 Jan: meld compat check for hypercall arg types
> v3 #04 Roger/Jan: make lock names clearer and assert their state
> v3 #04 Jan: port -> aport with type; distinguish argo port from evtchn
> v3 #04 Jan: reorder call to argo_init_domain in argo_init
> v3 #04 Jan: ring_remove_mfns: zero count before freeing arrays
> v3 #04 Jason/Roger: soft_reset: can assume reinit is ok if d->argo set
> v3 #04 Roger: remove unused and confusing d->argo_lock
> v3 #04 Roger: add simple inlines in xen/argo.h, drop ifdef CONFIG_ARGO
> v3 #04 Roger: simpler return -EOPNOTSUPP in do_argo_op
> v3 #04 Roger: add const to domain arg to ring_remove_info
> v3 #04 Roger: use XFREE
> v3 #04 Roger: newline fix in wildcard_pending_list_remove
> v3 #04 Roger: mfn_mapping: void* instead of uint8_t*
> v3 #04 Roger: drop npages struct member in argo_ring_info; use len
> v3 #04 Roger/Jan: drop many fixed width types in internal structs
> v3 #04 Jason/Jan: drop pad and fixed width type in pending_ent struct
> v3 #04 Eric: moved ring_find_info from register op into this commit
> v3 moved hash_index function, nospec include from register op to this commit
> v3 moved XEN_ARGO_DOMID_ANY defn from register op into this commit
> v3 added #include <xen/sched.h> to <xen/argo.h> for domain struct defn
> v3 feedback #04 Roger: reorder #includes to alphabetical order
> v3 Added Ross's Reviewed-by.
> 
> v2 rewrite locking explanation comment
> v2 header copyright line now includes 2019
> v2 self: use ring_info backpointer in pending_ent to maintain npending
> v2 self: rename all_rings_remove_info to domain_rings_remove_all
> v2 feedback Jan: drop cookie, implement teardown
> v2 self: add npending to track number of pending entries per ring
> v2 self: amend comment on locking; drop section comments
> v2 cookie_eq: test low bits first and use likely on high bits
> v2 self: OVERHAUL
> v2 self: s/argo_pending_ent/pending_ent/g
> v2 self: drop pending_remove_ent, inline at single call site
> v1 feedback Roger, Jan: drop argo prefix on static functions
> v2 #4 Lars: add Acked-by and details to commit message.
> v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
> v2 bugfix: xsm use in soft-reset prior to introduction
> v2 feedback #9 Jan: drop 'message' from do_argo_message_op
> v1 #5 feedback Paul: init/destroy unsigned, brackets and whitespace fixes
> v1 #5 feedback Paul: Use mfn_eq for comparing mfns.
> v1 #5 feedback Paul: init/destroy : use currd
> v1 #6 (#5) feedback Jan: init/destroy: s/ENOSYS/EOPNOTSUPP/
> v1 #6 feedback Paul: Folded patch 6 into patch 5.
> v1 #6 feedback Jan: drop opt_argo_enabled initializer
> v1 $6 feedback Jan: s/ENOSYS/EOPNOTSUPP/g and drop useless dprintk
> v1. #5 feedback Paul: change the license on public header to BSD
> - ack from Lars at Citrix.
> v1. self, Jan: drop unnecessary xen include from sched.h
> v1. self, Jan: drop inclusion of public argo.h in private one
> v1. self, Jan: add include of public argo.h to argo.c
> v1. self, Jan: drop fwd decl of argo_domain in priv header
> v1. Paul/self/Jan: add data structures to xlat.lst and compat/argo.h to Makefile
> v1. self: removed allocation of event channel since switching to VIRQ
> v1. self: drop types.h include from private argo.h
> v1: reorder public argo include position
> v1: #13 feedback Jan: public namespace: prefix with xen
> v1: self: rename pending ent "id" to "domain_id"
> v1: self: add domain_cookie to ent struct
> v1. #15 feedback Jan: make cmd unsigned
> v1. #15 feedback Jan: make i loop variable unsigned
> v1: self: adjust dprintks in init, destroy
> v1: #18 feedback Jan: meld max ring count limit
> v1: self: use type not struct in public defn, affects compat gen header
> v1: feedback #15 Jan: handle upper-halves of hypercall args
> v1: add comment explaining the 'magic' field
> v1: self + Jan feedback: implement soft reset
> v1: feedback #13 Roger: use ASSERT_UNREACHABLE
> 
>  docs/misc/xen-command-line.pandoc |  15 +
>  xen/common/Makefile               |   2 +-
>  xen/common/argo.c                 | 617 +++++++++++++++++++++++++++++++++++++-
>  xen/common/compat/argo.c          |  23 ++
>  xen/common/domain.c               |   9 +
>  xen/include/Makefile              |   1 +
>  xen/include/public/argo.h         |  64 ++++
>  xen/include/xen/argo.h            |  44 +++
>  xen/include/xen/sched.h           |   5 +
>  xen/include/xlat.lst              |   2 +
>  10 files changed, 780 insertions(+), 2 deletions(-)
>  create mode 100644 xen/common/compat/argo.c
>  create mode 100644 xen/include/public/argo.h
>  create mode 100644 xen/include/xen/argo.h
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index d39bcee..93f41bc 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -182,6 +182,21 @@ Permit Xen to use "Always Running APIC Timer" support on compatible hardware
>  in combination with cpuidle.  This option is only expected to be useful for
>  developers wishing Xen to fall back to older timing methods on newer hardware.
>  
> +### argo
> +    = List of [ <bool> ]
> +
> +Controls for the Argo hypervisor-mediated interdomain communication service.
> +
> +The functionality that this option controls is only available when Xen has been
> +compiled with the build setting for Argo enabled in the build configuration.
> +
> +Argo is an interdomain communication mechanism, where Xen acts as the central
> +point of authority.  Guests may register memory rings to receive messages,
> +query the status of other domains, and send messages by hypercall, all subject
> +to appropriate auditing by Xen.
> +
> +*   An overall boolean acts as a global control.  Argo is disabled by default.

I'm not sure it's worth adding a list item for the boolean; I would
just add the "Argo is disabled by default" to the first paragraph.

[...]
> +static struct argo_ring_info *
> +find_ring_info(const struct domain *d, const struct argo_ring_id *id)
> +{
> +    struct list_head *cursor, *bucket;
> +
> +    ASSERT(LOCKING_Read_rings_L2(d));
> +
> +    /* List is not modified here. Search and return the match if found. */
> +    bucket = &d->argo->ring_hash[hash_index(id)];
> +
> +    for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )

Why are you open-coding list_for_each here?

You might also consider using list_for_each_entry, so that you can
avoid the list_entry call below.

> +    {
> +        struct argo_ring_info *ring_info =
> +            list_entry(cursor, struct argo_ring_info, node);
> +        const struct argo_ring_id *cmpid = &ring_info->id;
> +
> +        if ( cmpid->aport == id->aport &&
> +             cmpid->domain_id == id->domain_id &&
> +             cmpid->partner_id == id->partner_id )
> +        {
> +            argo_dprintk("found ring_info for ring(%u:%x %u)\n",
> +                         id->domain_id, id->aport, id->partner_id);
> +            return ring_info;
> +        }
> +    }
> +    argo_dprintk("no ring_info for ring(%u:%x %u)\n",
> +                 id->domain_id, id->aport, id->partner_id);
> +
> +    return NULL;
> +}
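
To illustrate the list_for_each_entry suggestion, here is a minimal
standalone sketch of the same lookup. The list helpers are simplified
stand-ins written for this example, not Xen's headers: in particular
demo_list_for_each_entry takes the entry type explicitly, whereas the real
macro infers it with typeof. The struct names are hypothetical.

```c
#include <stddef.h>

/* Minimal stand-ins for the helpers found in a Linux-style list.h. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified variant: the entry type is passed in rather than inferred. */
#define demo_list_for_each_entry(pos, type, head, member)       \
    for ( (pos) = list_entry((head)->next, type, member);       \
          &(pos)->member != (head);                             \
          (pos) = list_entry((pos)->member.next, type, member) )

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

struct demo_ring_id { unsigned int aport, domain_id, partner_id; };
struct demo_ring_info { struct list_head node; struct demo_ring_id id; };

/* Search one bucket; the cursor is the entry itself, so no list_entry call. */
static struct demo_ring_info *
demo_find_ring(struct list_head *bucket, const struct demo_ring_id *id)
{
    struct demo_ring_info *ring_info;

    demo_list_for_each_entry(ring_info, struct demo_ring_info, bucket, node)
    {
        if ( ring_info->id.aport == id->aport &&
             ring_info->id.domain_id == id->domain_id &&
             ring_info->id.partner_id == id->partner_id )
            return ring_info;
    }

    return NULL;
}
```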
[...]
> +static void
> +pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
> +{
> +    struct list_head *ring_pending = &ring_info->pending;
> +    struct pending_ent *ent;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    /* Delete all pending notifications from this ring's list. */
> +    while ( !list_empty(ring_pending) )

Nit: you could use list_first_entry_or_null, which combines the
list_empty and list_entry calls.

> +    {
> +        ent = list_entry(ring_pending->next, struct pending_ent, node);
> +
> +        /* For wildcard rings, remove each from their wildcard list too. */
> +        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +            wildcard_pending_list_remove(ent->domain_id, ent);
> +        list_del(&ent->node);
> +        xfree(ent);
> +    }
> +    ring_info->npending = 0;
> +}
> +
> +static void
> +wildcard_rings_pending_remove(struct domain *d)
> +{
> +    struct list_head *wildcard_head;
> +
> +    ASSERT(LOCKING_Write_L1);
> +
> +    /* Delete all pending signals to the domain about wildcard rings. */
> +    wildcard_head = &d->argo->wildcard_pend_list;
> +
> +    while ( !list_empty(wildcard_head) )
> +    {
> +        struct pending_ent *ent =
> +            list_entry(wildcard_head->next, struct pending_ent, node);

Same here regarding the usage of list_first_entry_or_null.

> +
> +        /*
> +         * The ent->node deleted here, and the npending value decreased,
> +         * belong to the ring_info of another domain, which is why this
> +         * function requires holding W(L1):
> +         * it implies the L3 lock that protects that ring_info struct.
> +         */
> +        ent->ring_info->npending--;
> +        list_del(&ent->node);
> +        list_del(&ent->wildcard_node);
> +        xfree(ent);
> +    }
> +}
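
A sketch of what the list_first_entry_or_null form of these drain loops
could look like. The helpers below are minimal stand-ins written for this
example (using malloc/free instead of Xen's allocator), and the struct is a
hypothetical reduction of pending_ent.

```c
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_empty(head) ((head)->next == (head))
/* First entry of the list, or NULL when the list is empty. */
#define list_first_entry_or_null(head, type, member) \
    (list_empty(head) ? NULL : list_entry((head)->next, type, member))

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

struct demo_pending_ent { struct list_head node; unsigned int domain_id; };

/* Append one heap-allocated entry; returns 0 on success, -1 on failure. */
static int demo_pending_add(struct list_head *pending, unsigned int domain_id)
{
    struct demo_pending_ent *ent = malloc(sizeof(*ent));

    if ( !ent )
        return -1;
    ent->domain_id = domain_id;
    list_add_tail(&ent->node, pending);
    return 0;
}

/* Unlink and free every entry; returns the number removed. */
static unsigned int demo_pending_drain(struct list_head *pending)
{
    struct demo_pending_ent *ent;
    unsigned int removed = 0;

    while ( (ent = list_first_entry_or_null(pending, struct demo_pending_ent,
                                            node)) != NULL )
    {
        list_del(&ent->node);
        free(ent);
        removed++;
    }

    return removed;
}
```

The emptiness test and the entry lookup collapse into the loop condition,
so the body only has to unlink and free.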
[...]
> +static void
> +domain_rings_remove_all(struct domain *d)
> +{
> +    unsigned int i;
> +    struct argo_ring_info *ring_info;
> +
> +    ASSERT(LOCKING_Write_rings_L2(d));
> +
> +    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
> +    {
> +        struct list_head *bucket = &d->argo->ring_hash[i];
> +
> +        while ( !list_empty(bucket) )
> +        {
> +            ring_info = list_entry(bucket->next, struct argo_ring_info, node);

list_first_entry_or_null

> +            ring_remove_info(d, ring_info);
> +        }
> +    }
> +    d->argo->ring_count = 0;
> +}
> +
> +/*
> + * Tear down all rings of other domains where src_d domain is the partner.
> + * (ie. it is the single domain that can send to those rings.)
> + * This will also cancel any pending notifications about those rings.
> + */
> +static void
> +partner_rings_remove(struct domain *src_d)
> +{
> +    unsigned int i;
> +    struct argo_send_info *send_info;
> +    struct argo_ring_info *ring_info;
> +    struct domain *dst_d;
> +
> +    ASSERT(LOCKING_Write_L1);
> +
> +    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
> +    {
> +        struct list_head *cursor, *bucket = &src_d->argo->send_hash[i];
> +
> +        /* Remove all ents from the send list. Take each off their ring list. */
> +        for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )

Another open-coded version of list_for_each, see my comments on the
instances above.

> +        {
> +            send_info = list_entry(cursor, struct argo_send_info, node);

send_info should be defined here to reduce its scope.

> +
> +            dst_d = get_domain_by_id(send_info->id.domain_id);
> +            if ( dst_d && dst_d->argo )
> +            {
> +                ring_info = find_ring_info(dst_d, &send_info->id);

ring_info should be defined here.

> +                if ( ring_info )
> +                {
> +                    ring_remove_info(dst_d, ring_info);
> +                    dst_d->argo->ring_count--;
> +                }
> +                else
> +                    ASSERT_UNREACHABLE();
> +            }
> +            else
> +                ASSERT_UNREACHABLE();
> +
> +            if ( dst_d )
> +                put_domain(dst_d);
> +
> +            list_del(&send_info->node);
> +            xfree(send_info);
> +        }
> +    }
> +}
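
One point worth keeping in mind if this loop is converted: each entry is
unlinked and freed inside the body, and as far as the quoted hunk shows the
loop then advances by reading cursor->next from the entry just freed. A
plain list_for_each would do the same, so the _safe variant, which caches
the next pointer before the body runs, looks like the better fit. The sketch
below uses minimal stand-in helpers and a hypothetical struct, with
malloc/free in place of Xen's allocator.

```c
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_empty(head) ((head)->next == (head))
/* Caches the successor in n, so pos may be deleted inside the body. */
#define list_for_each_safe(pos, n, head)                        \
    for ( (pos) = (head)->next, (n) = (pos)->next;              \
          (pos) != (head);                                      \
          (pos) = (n), (n) = (pos)->next )

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

struct demo_send_info { struct list_head node; unsigned int domain_id; };

/* Append one heap-allocated entry; returns 0 on success, -1 on failure. */
static int demo_send_add(struct list_head *bucket, unsigned int domain_id)
{
    struct demo_send_info *send_info = malloc(sizeof(*send_info));

    if ( !send_info )
        return -1;
    send_info->domain_id = domain_id;
    list_add_tail(&send_info->node, bucket);
    return 0;
}

/* Remove and free every entry in the bucket; returns the number removed. */
static unsigned int demo_send_remove_all(struct list_head *bucket)
{
    struct list_head *cursor, *next;
    unsigned int removed = 0;

    list_for_each_safe(cursor, next, bucket)
    {
        /* Declared in the loop body to keep its scope narrow. */
        struct demo_send_info *send_info =
            list_entry(cursor, struct demo_send_info, node);

        list_del(&send_info->node);
        free(send_info);
        removed++;
    }

    return removed;
}
```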


* Re: [PATCH v5 07/15] argo: implement the register op
  2019-01-21  9:59 ` [PATCH v5 07/15] argo: implement the register op Christopher Clark
@ 2019-01-22  9:59   ` Roger Pau Monné
  2019-01-31  4:08     ` Christopher Clark
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-22  9:59 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Mon, Jan 21, 2019 at 01:59:47AM -0800, Christopher Clark wrote:
> The register op is used by a domain to register a region of memory for
> receiving messages from either a specified other domain, or, if specifying a
> wildcard, any domain.
> 
> This operation creates a mapping within Xen's private address space that
> will remain resident for the lifetime of the ring. In subsequent commits,
> the hypervisor will use this mapping to copy data from a sending domain into
> this registered ring, making it accessible to the domain that registered the
> ring to receive data.
> 
> Wildcard any-sender rings are default disabled and registration will be
> refused with EPERM unless they have been specifically enabled with the
> new mac-permissive flag that is added to the argo boot option here. The
> reason why the default for wildcard rings is 'deny' is that there is
> currently no means to protect the ring from DoS by a noisy domain
> spamming the ring, affecting other domains ability to send to it. This
> will be addressed with XSM policy controls in subsequent work.
> 
> Since denying access to any-sender rings is a significant functional
> constraint, the new option "mac-permissive" for the argo bootparam
> enables overriding this. eg: "argo=1,mac-permissive=1"
> 
> The p2m type of the memory supplied by the guest for the ring must be
> p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
> is registered.
> 
> xen_argo_gfn_t type is defined and is 64-bit on all architectures which
> assists with avoiding the need for compat code to translate hypercall args.
> This hypercall op and its interface currently only supports 4K-sized pages.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Just some nits that can be taken care of later.

> +static int
> +find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
> +               const unsigned int npage,
> +               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
> +               const unsigned int len)
> +{
> +    unsigned int i;
> +    int ret = 0;
> +    mfn_t *mfns;
> +    void **mfn_mapping;
> +
> +    ASSERT(LOCKING_Write_rings_L2(d));
> +
> +    if ( ring_info->mfns )
> +    {
> +        /* Ring already existed: drop the previous mapping. */
> +        gprintk(XENLOG_INFO, "argo: vm%u re-register existing ring "
> +                "(vm%u:%x vm%u) clears mapping\n",
> +                d->domain_id, ring_info->id.domain_id,
> +                ring_info->id.aport, ring_info->id.partner_id);
> +
> +        ring_remove_mfns(d, ring_info);
> +        ASSERT(!ring_info->mfns);
> +    }
> +
> +    mfns = xmalloc_array(mfn_t, npage);
> +    if ( !mfns )
> +        return -ENOMEM;
> +
> +    for ( i = 0; i < npage; i++ )
> +        mfns[i] = INVALID_MFN;
> +
> +    mfn_mapping = xzalloc_array(void *, npage);
> +    if ( !mfn_mapping )
> +    {
> +        xfree(mfns);
> +        return -ENOMEM;
> +    }
> +
> +    ring_info->mfns = mfns;
> +    ring_info->mfn_mapping = mfn_mapping;
> +
> +    for ( i = 0; i < npage; i++ )
> +    {
> +        xen_argo_gfn_t argo_gfn;
> +        mfn_t mfn;
> +
> +        ret = __copy_from_guest_offset(&argo_gfn, gfn_hnd, i, 1) ? -EFAULT : 0;
> +        if ( ret )
> +            break;
> +
> +        ret = find_ring_mfn(d, _gfn(argo_gfn), &mfn);
> +        if ( ret )
> +        {
> +            gprintk(XENLOG_ERR, "argo: vm%u: invalid gfn %"PRI_gfn" "
> +                    "r:(vm%u:%x vm%u) %p %u/%u\n",
> +                    d->domain_id, gfn_x(_gfn(argo_gfn)),
> +                    ring_info->id.domain_id, ring_info->id.aport,
> +                    ring_info->id.partner_id, ring_info, i, npage);
> +            break;
> +        }
> +
> +        ring_info->mfns[i] = mfn;
> +
> +        argo_dprintk("%u: %"PRI_gfn" -> %"PRI_mfn"\n",
> +                     i, gfn_x(_gfn(argo_gfn)), mfn_x(ring_info->mfns[i]));
> +    }
> +
> +    ring_info->nmfns = i;
> +
> +    if ( ret )
> +        ring_remove_mfns(d, ring_info);
> +    else
> +    {
> +        ASSERT(ring_info->nmfns == NPAGES_RING(len));
> +
> +        gprintk(XENLOG_DEBUG, "argo: vm%u ring (vm%u:%x vm%u) %p "

Nit: this likely wants to be an argo_dprintk?

> +                "mfn_mapping %p len %u nmfns %u\n",
> +                d->domain_id, ring_info->id.domain_id,
> +                ring_info->id.aport, ring_info->id.partner_id, ring_info,
> +                ring_info->mfn_mapping, ring_info->len, ring_info->nmfns);
> +    }
> +
> +    return ret;
> +}
> +
> +static long
> +register_ring(struct domain *currd,
> +              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
> +              XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
> +              unsigned int npage, bool fail_exist)
> +{
> +    xen_argo_register_ring_t reg;
> +    struct argo_ring_id ring_id;
> +    void *map_ringp;
> +    xen_argo_ring_t *ringp;
> +    struct argo_ring_info *ring_info, *new_ring_info = NULL;
> +    struct argo_send_info *send_info = NULL;
> +    struct domain *dst_d = NULL;
> +    int ret = 0;
> +    unsigned int private_tx_ptr;
> +
> +    ASSERT(currd == current->domain);
> +
> +    if ( copy_from_guest(&reg, reg_hnd, 1) )
> +        return -EFAULT;
> +
> +    /*
> +     * A ring must be large enough to transmit messages, so requires space for:
> +     * * 1 message header, plus
> +     * * 1 payload slot (payload is always rounded to a multiple of 16 bytes)
> +     *   for the message payload to be written into, plus
> +     * * 1 more slot, so that the ring cannot be filled to capacity with a
> +     *   single minimum-size message -- see the logic in ringbuf_insert --
> +     *   allowing for this ensures that there can be space remaining when a
> +     *   message is present.
> +     * The above determines the minimum acceptable ring size.
> +     */
> +    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
> +                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
> +         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
> +         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
> +         (NPAGES_RING(reg.len) != npage) ||
> +         (reg.pad != 0) )
> +        return -EINVAL;
> +
> +    ring_id.partner_id = reg.partner_id;
> +    ring_id.aport = reg.aport;
> +    ring_id.domain_id = currd->domain_id;
> +
> +    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
> +    {
> +        if ( !opt_argo_mac_permissive )
> +            return -EPERM;
> +    }
> +    else
> +    {
> +        dst_d = get_domain_by_id(reg.partner_id);
> +        if ( !dst_d )
> +        {
> +            argo_dprintk("!dst_d, ESRCH\n");
> +            return -ESRCH;
> +        }
> +
> +        send_info = xzalloc(struct argo_send_info);
> +        if ( !send_info )
> +        {
> +            ret = -ENOMEM;
> +            goto out;
> +        }
> +        send_info->id = ring_id;
> +    }
> +
> +    /*
> +     * Common case is that the ring doesn't already exist, so do the alloc here
> +     * before picking up any locks.
> +     */
> +    new_ring_info = xzalloc(struct argo_ring_info);
> +    if ( !new_ring_info )
> +    {
> +        ret = -ENOMEM;
> +        goto out;
> +    }
> +
> +    read_lock(&L1_global_argo_rwlock);
> +
> +    if ( !currd->argo )
> +    {
> +        ret = -ENODEV;
> +        goto out_unlock;
> +    }
> +
> +    if ( dst_d && !dst_d->argo )
> +    {
> +        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
> +        ret = -ECONNREFUSED;
> +        goto out_unlock;
> +    }
> +
> +    write_lock(&currd->argo->rings_L2_rwlock);
> +
> +    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
> +    {
> +        ret = -ENOSPC;
> +        goto out_unlock2;
> +    }
> +
> +    ring_info = find_ring_info(currd, &ring_id);
> +    if ( !ring_info )
> +    {
> +        ring_info = new_ring_info;
> +        new_ring_info = NULL;
> +
> +        spin_lock_init(&ring_info->L3_lock);
> +
> +        ring_info->id = ring_id;
> +        INIT_LIST_HEAD(&ring_info->pending);
> +
> +        list_add(&ring_info->node,
> +                 &currd->argo->ring_hash[hash_index(&ring_info->id)]);
> +
> +        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%u)\n",
> +                currd->domain_id, ring_id.domain_id, ring_id.aport,
> +                ring_id.partner_id);
> +    }
> +    else if ( ring_info->len )
> +    {
> +        /*
> +         * If the caller specified that the ring must not already exist,
> +         * fail at attempt to add a completed ring which already exists.
> +         */
> +        if ( fail_exist )
> +        {
> +            argo_dprintk("disallowed reregistration of existing ring\n");

And this should likely be gprintk with error type?

I think the pattern of using gprintk for error messages and
argo_dprintk for verbose information is correct, but there are a
couple of oddities that can be fixed later.
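
As a rough standalone illustration of that pattern: error messages go
through an always-on path, while verbose tracing goes through a macro that
compiles to nothing unless debugging is enabled. Everything named demo_* and
the DEMO_ARGO_DEBUG switch are made up for this sketch; the series defines
its own argo_dprintk in an earlier patch, and fprintf stands in for the
hypervisor's printk family.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical build-time switch for verbose tracing. */
#ifndef DEMO_ARGO_DEBUG
#define DEMO_ARGO_DEBUG 0
#endif

#if DEMO_ARGO_DEBUG
#define demo_dprintk(...) fprintf(stderr, "argo: " __VA_ARGS__)
#else
#define demo_dprintk(...) ((void)0)
#endif

/* Errors are always reported, regardless of the debug switch. */
#define demo_error(...) fprintf(stderr, "argo error: " __VA_ARGS__)

/* Mirrors the shape of the re-registration check in the quoted hunk. */
static int demo_reregister_check(int ring_exists, int fail_exist)
{
    if ( ring_exists && fail_exist )
    {
        /* A refusal the guest should hear about: error path. */
        demo_error("disallowed reregistration of existing ring\n");
        return -EEXIST;
    }

    /* Routine progress information: debug-only path. */
    demo_dprintk("re-registering existing ring\n");
    return 0;
}
```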

> +            ret = -EEXIST;
> +            goto out_unlock2;
> +        }
> +
> +        if ( ring_info->len != reg.len )
> +        {
> +            /*
> +             * Change of ring size could result in entries on the pending
> +             * notifications list that will never trigger.
> +             * Simple blunt solution: disallow ring resize for now.
> +             * TODO: investigate enabling ring resize.
> +             */
> +            gprintk(XENLOG_ERR, "argo: vm%u attempted to change ring size "
> +                    "(vm%u:%x vm%u)\n",
> +                    currd->domain_id, ring_id.domain_id, ring_id.aport,
> +                    ring_id.partner_id);
> +            /*
> +             * Could return EINVAL here, but if the ring didn't already
> +             * exist then the arguments would have been valid, so: EEXIST.
> +             */
> +            ret = -EEXIST;
> +            goto out_unlock2;
> +        }
> +
> +        gprintk(XENLOG_DEBUG,
> +                "argo: vm%u re-registering existing ring (vm%u:%x vm%u)\n",
> +                currd->domain_id, ring_id.domain_id, ring_id.aport,
> +                ring_id.partner_id);

Again, this would be better as an argo_dprintk IMO.

[...]
> @@ -552,6 +987,38 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
>  
>      switch (cmd)
>      {
> +    case XEN_ARGO_OP_register_ring:
> +    {
> +        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
> +            guest_handle_cast(arg1, xen_argo_register_ring_t);
> +        XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd =
> +            guest_handle_cast(arg2, xen_argo_gfn_t);
> +        /* arg3 is npage */
> +        /* arg4 is flags */
> +        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;

Nit: I would add a:

BUILD_BUG_ON(!IS_ALIGNED(XEN_ARGO_MAX_RING_SIZE, PAGE_SIZE));

> +        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
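
The reason for the suggested BUILD_BUG_ON: the check shifts the maximum ring
size down to a page count, so if the maximum were not page-aligned the shift
would round down and a ring of exactly the maximum size would be rejected.
A standalone sketch of the same idea using C11 static_assert follows; the
DEMO_* constants, including the maximum size, are assumed values for
illustration and are not taken from the series.

```c
#include <assert.h> /* static_assert (C11) */

#define DEMO_PAGE_SHIFT     12
#define DEMO_PAGE_SIZE      (1ul << DEMO_PAGE_SHIFT)
#define DEMO_MAX_RING_SIZE  0x1000000ul /* assumed value */
#define DEMO_IS_ALIGNED(val, align) (((val) & ((align) - 1)) == 0)

/* Fails the build if the maximum is not a whole number of pages. */
static_assert(DEMO_IS_ALIGNED(DEMO_MAX_RING_SIZE, DEMO_PAGE_SIZE),
              "maximum ring size must be page-aligned");

/* With the alignment guaranteed, the shift loses no information. */
static int demo_npage_ok(unsigned long npage)
{
    return npage <= (DEMO_MAX_RING_SIZE >> DEMO_PAGE_SHIFT);
}
```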

Thanks, Roger.


* Re: [PATCH v5 08/15] argo: implement the unregister op
  2019-01-21  9:59 ` [PATCH v5 08/15] argo: implement the unregister op Christopher Clark
@ 2019-01-22 11:02   ` Roger Pau Monné
  0 siblings, 0 replies; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-22 11:02 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Mon, Jan 21, 2019 at 01:59:48AM -0800, Christopher Clark wrote:
> Takes a single argument: a handle to the ring unregistration struct,
> which specifies the port and partner domain id or wildcard.
> 
> The ring's entry is removed from the hashtable of registered rings;
> any entries for pending notifications are removed; and the ring is
> unmapped from Xen's address space.
> 
> If the ring had been registered to communicate with a single specified
> domain (ie. a non-wildcard ring) then the partner domain state is removed
> from the partner domain's argo send_info hash table.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

LGTM:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Just one nit about the open-coded list_for_each.

> diff --git a/xen/common/argo.c b/xen/common/argo.c
> index a7ec0e0..e4cd446 100644
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
> @@ -43,6 +43,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_gfn_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
>  DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
> +DEFINE_XEN_GUEST_HANDLE(xen_argo_unregister_ring_t);
>  
>  static bool __read_mostly opt_argo;
>  static bool __read_mostly opt_argo_mac_permissive;
> @@ -351,6 +352,37 @@ find_ring_info(const struct domain *d, const struct argo_ring_id *id)
>      return NULL;
>  }
>  
> +static struct argo_send_info *
> +find_send_info(const struct domain *d, const struct argo_ring_id *id)
> +{
> +    struct list_head *cursor, *bucket;
> +
> +    ASSERT(LOCKING_send_L2(d));
> +
> +    /* List is not modified here. Search and return the match if found. */
> +    bucket = &d->argo->send_hash[hash_index(id)];
> +
> +    for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )

list_for_each

Thanks, Roger.


* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-21  9:59 ` [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
@ 2019-01-22 12:08   ` Roger Pau Monné
  2019-01-31  4:10     ` Christopher Clark
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-22 12:08 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
> sendv operation is invoked to perform a synchronous send of buffers
> contained in iovs to a remote domain's registered ring.
> 
> It takes:
>  * A destination address (domid, port) for the ring to send to.
>    It performs a most-specific match lookup, to allow for wildcard.
>  * A source address, used to inform the destination of where to reply.
>  * The address of an array of iovs containing the data to send
>  * .. and the length of that array of iovs
>  * and a 32-bit message type, available to communicate message context
>    data (eg. kernel-to-kernel, separate from the application data).
> 
> If insufficient space exists in the destination ring, it will return
> -EAGAIN and Xen will notify the caller when sufficient space becomes
> available.
> 
> Accesses to the ring indices are appropriately atomic. The rings are
> mapped into Xen's private address space to write as needed and the
> mappings are retained for later use.
> 
> Notifications are sent to guests via VIRQ and send_guest_global_virq is
> exposed in the change to enable argo to call it. VIRQ_ARGO_MESSAGE is
                                                   ^ VIRQ_ARGO
> claimed from the VIRQ previously reserved for this purpose (#11).
> 
> The VIRQ notification method is used rather than sending events using
> evtchn functions directly because:
> 
> * no current event channel type is an exact fit for the intended
>   behaviour. ECS_IPI is closest, but it disallows migration to
>   other VCPUs which is not necessarily a requirement for Argo.
> 
> * at the point of argo_init, allocation of an event channel is
>   complicated by none of the guest VCPUs being initialized yet
>   and the event channel logic expects that a valid event channel
>   has a present VCPU.

IMO if you wanted to use event channels, those _must_ be set up by the
guest, i.e. the guest argo driver would load, allocate an event channel,
and then tell the hypervisor which event channel should be used for
argo notifications.

> +static int
> +memcpy_to_guest_ring(const struct domain *d, struct argo_ring_info *ring_info,
> +                     unsigned int offset,
> +                     const void *src, XEN_GUEST_HANDLE(uint8_t) src_hnd,
> +                     unsigned int len)
> +{
> +    unsigned int mfns_index = offset >> PAGE_SHIFT;
> +    void *dst;
> +    int ret;
> +    unsigned int src_offset = 0;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    offset &= ~PAGE_MASK;
> +
> +    if ( len + offset > XEN_ARGO_MAX_RING_SIZE )
> +        return -EFAULT;
> +
> +    while ( len )
> +    {
> +        unsigned int head_len = (offset + len) > PAGE_SIZE ? PAGE_SIZE - offset
> +                                                           : len;

IMO that would be clearer as:

head_len = min(PAGE_SIZE - offset, len);

But anyway, this should go away when you move to using vmap.
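
To illustrate why the two forms are interchangeable, here is a minimal
standalone sketch. It assumes 4 KiB pages and an offset already reduced
to within a page (as done by "offset &= ~PAGE_MASK" above); the helper
names are hypothetical and min() is spelled out by hand so it builds
outside the Xen tree.

```c
#include <assert.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* The form used in the patch: stop at the page end, or copy all of len. */
static unsigned int head_len_ternary(unsigned int offset, unsigned int len)
{
    return (offset + len) > PAGE_SIZE ? PAGE_SIZE - offset : len;
}

/* The suggested form: the smaller of "room left in the page" and len. */
static unsigned int head_len_min(unsigned int offset, unsigned int len)
{
    unsigned int room;

    assert(offset < PAGE_SIZE);  /* offset is an in-page offset */
    room = PAGE_SIZE - offset;

    return room < len ? room : len;
}
```

Both helpers return 96 for offset 4000 and len 200, and return len
unchanged whenever the copy fits in the current page.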

[...]
> +static int
> +ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
> +               const struct argo_ring_id *src_id,
> +               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
> +               unsigned long niov, uint32_t message_type,
> +               unsigned long *out_len)
> +{
> +    xen_argo_ring_t ring;
> +    struct xen_argo_ring_message_header mh = { };
> +    int sp, ret;
> +    unsigned int len = 0;
> +    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
> +    xen_argo_iov_t *piov;
> +    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
> +       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
> +    if ( ret )
> +        return ret;
> +
> +    /*
> +     * Obtain the total size of data to transmit -- sets the 'len' variable
> +     * -- and sanity check that the iovs conform to size and number limits.
> +     * Enforced below: no more than 'len' bytes of guest data
> +     * (plus the message header) will be sent in this operation.
> +     */
> +    ret = iov_count(iovs, niov, &len);
> +    if ( ret )
> +        return ret;
> +
> +    /*
> +     * Size bounds check against ring size and static maximum message limit.
> +     * The message must not fill the ring; there must be at least one slot
> +     * remaining so we can distinguish a full ring from an empty one.
> +     */
> +    if ( ((ROUNDUP_MESSAGE(len) +
> +            sizeof(struct xen_argo_ring_message_header)) >= ring_info->len) ||
> +         (len > MAX_ARGO_MESSAGE_SIZE) )

len is already checked to be <= MAX_ARGO_MESSAGE_SIZE in iov_count,
where it gets set, so this check is redundant.

> +        return -EMSGSIZE;
> +
> +    ret = get_sanitized_ring(d, &ring, ring_info);
> +    if ( ret )
> +        return ret;
> +
> +    argo_dprintk("ring.tx_ptr=%u ring.rx_ptr=%u ring len=%u"
> +                 " ring_info->tx_ptr=%u\n",
> +                 ring.tx_ptr, ring.rx_ptr, ring_info->len, ring_info->tx_ptr);
> +
> +    if ( ring.rx_ptr == ring.tx_ptr )
> +        sp = ring_info->len;
> +    else
> +    {
> +        sp = ring.rx_ptr - ring.tx_ptr;
> +        if ( sp < 0 )
> +            sp += ring_info->len;
> +    }
> +
> +    /*
> +     * Size bounds check against currently available space in the ring.
> +     * Again: the message must not fill the ring leaving no space remaining.
> +     */
> +    if ( (ROUNDUP_MESSAGE(len) +
> +            sizeof(struct xen_argo_ring_message_header)) >= sp )
> +    {
> +        argo_dprintk("EAGAIN\n");
> +        return -EAGAIN;
> +    }
> +
> +    mh.len = len + sizeof(struct xen_argo_ring_message_header);
> +    mh.source.aport = src_id->aport;
> +    mh.source.domain_id = src_id->domain_id;
> +    mh.message_type = message_type;
> +
> +    /*
> +     * For this copy to the guest ring, tx_ptr is always 16-byte aligned
> +     * and the message header is 16 bytes long.
> +     */
> +    BUILD_BUG_ON(
> +        sizeof(struct xen_argo_ring_message_header) != ROUNDUP_MESSAGE(1));
> +
> +    /*
> +     * First data write into the destination ring: fixed size, message header.
> +     * This cannot overrun because the available free space (value in 'sp')
> +     * is checked above and must be at least this size.
> +     */
> +    ret = memcpy_to_guest_ring(d, ring_info,
> +                               ring.tx_ptr + sizeof(xen_argo_ring_t),
> +                               &mh, NULL_hnd, sizeof(mh));
> +    if ( ret )
> +    {
> +        gprintk(XENLOG_ERR,
> +                "argo: failed to write message header to ring (vm%u:%x vm%u)\n",
> +                ring_info->id.domain_id, ring_info->id.aport,
> +                ring_info->id.partner_id);
> +
> +        return ret;
> +    }
> +
> +    ring.tx_ptr += sizeof(mh);
> +    if ( ring.tx_ptr == ring_info->len )
> +        ring.tx_ptr = 0;
> +
> +    for ( piov = iovs; niov--; piov++ )
> +    {
> +        XEN_GUEST_HANDLE_64(uint8_t) buf_hnd = piov->iov_hnd;
> +        unsigned int iov_len = piov->iov_len;
> +
> +        /* If no data is provided in this iov, moan and skip on to the next */
> +        if ( !iov_len )
> +        {
> +            gprintk(XENLOG_ERR,

This should likely be XENLOG_WARNING or XENLOG_INFO, since it's not an error?

> +                    "argo: no data iov_len=0 iov_hnd=%p ring (vm%u:%x vm%u)\n",
> +                    buf_hnd.p, ring_info->id.domain_id, ring_info->id.aport,
> +                    ring_info->id.partner_id);
> +
> +            continue;
> +        }
> +
> +        if ( unlikely(!guest_handle_okay(buf_hnd, iov_len)) )
> +        {
> +            gprintk(XENLOG_ERR,
> +                    "argo: bad iov handle [%p, %u] (vm%u:%x vm%u)\n",
> +                    buf_hnd.p, iov_len,
> +                    ring_info->id.domain_id, ring_info->id.aport,
> +                    ring_info->id.partner_id);
> +
> +            return -EFAULT;
> +        }
> +
> +        sp = ring_info->len - ring.tx_ptr;
> +
> +        /* Check: iov data size versus free space at the tail of the ring */
> +        if ( iov_len > sp )
> +        {
> +            /*
> +             * Second possible data write: ring-tail-wrap-write.
> +             * Populate the ring tail and update the internal tx_ptr to handle
> +             * wrapping at the end of ring.
> +             * Size of data written here: sp
> +             * which is the exact full amount of free space available at the
> +             * tail of the ring, so this cannot overrun.
> +             */
> +            ret = memcpy_to_guest_ring(d, ring_info,
> +                                       ring.tx_ptr + sizeof(xen_argo_ring_t),
> +                                       NULL, buf_hnd, sp);
> +            if ( ret )
> +            {
> +                gprintk(XENLOG_ERR,
> +                        "argo: failed to copy {%p, %d} (vm%u:%x vm%u)\n",
> +                        buf_hnd.p, sp,
> +                        ring_info->id.domain_id, ring_info->id.aport,
> +                        ring_info->id.partner_id);
> +
> +                return ret;
> +            }
> +
> +            ring.tx_ptr = 0;
> +            iov_len -= sp;
> +            guest_handle_add_offset(buf_hnd, sp);
> +
> +            ASSERT(iov_len <= ring_info->len);
> +        }
> +
> +        /*
> +         * Third possible data write: all data remaining for this iov.
> +         * Size of data written here: iov_len
> +         *
> +         * Case 1: if the ring-tail-wrap-write above was performed, then
> +         *         iov_len has been decreased by 'sp' and ring.tx_ptr is zero.
> +         *
> +         *    We know from checking the result of iov_count:
> +         *      len + sizeof(message_header) <= ring_info->len
> +         *    We also know that len is the total of summing all iov_lens, so:
> +         *       iov_len <= len
> +         *    so by transitivity:
> +         *       iov_len <= len <= (ring_info->len - sizeof(msgheader))
> +         *    and therefore:
> +         *       (iov_len + sizeof(msgheader) <= ring_info->len) &&
> +         *       (ring.tx_ptr == 0)
> +         *    so this write cannot overrun here.
> +         *
> +         * Case 2: ring-tail-wrap-write above was not performed
> +         *    -> so iov_len is the guest-supplied value and: (iov_len <= sp)
> +         *    ie. less than available space at the tail of the ring:
> +         *        so this write cannot overrun.
> +         */
> +        ret = memcpy_to_guest_ring(d, ring_info,
> +                                   ring.tx_ptr + sizeof(xen_argo_ring_t),
> +                                   NULL, buf_hnd, iov_len);
> +        if ( ret )
> +        {
> +            gprintk(XENLOG_ERR,
> +                    "argo: failed to copy [%p, %u] (vm%u:%x vm%u)\n",
> +                    buf_hnd.p, iov_len, ring_info->id.domain_id,
> +                    ring_info->id.aport, ring_info->id.partner_id);
> +
> +            return ret;
> +        }
> +
> +        ring.tx_ptr += iov_len;
> +
> +        if ( ring.tx_ptr == ring_info->len )
> +            ring.tx_ptr = 0;
> +    }
> +
> +    ring.tx_ptr = ROUNDUP_MESSAGE(ring.tx_ptr);
> +
> +    if ( ring.tx_ptr >= ring_info->len )
> +        ring.tx_ptr -= ring_info->len;

You seem to handle the wrapping after each possible write, so I think
the above is not needed? Maybe it should be an assert instead?
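
One way to explore that question is a small standalone sketch of the
final step. It assumes ROUNDUP_MESSAGE rounds up to a multiple of 16
(consistent with the BUILD_BUG_ON above, but redefined locally here)
and that the ring length is itself a multiple of 16; finish_tx_ptr is a
hypothetical helper name.

```c
#define MSG_SLOT 16u  /* assumption: message slots are 16 bytes */
#define ROUNDUP_MESSAGE(x) (((x) + MSG_SLOT - 1) & ~(MSG_SLOT - 1))

/*
 * Final step of the insert: round tx_ptr up to the next slot boundary,
 * then wrap it if the rounding reached the end of the ring.
 */
static unsigned int finish_tx_ptr(unsigned int tx_ptr, unsigned int ring_len)
{
    tx_ptr = ROUNDUP_MESSAGE(tx_ptr);

    if ( tx_ptr >= ring_len )
        tx_ptr -= ring_len;

    return tx_ptr;
}
```

Under these assumptions, a ring of length 64 with tx_ptr at 60 after the
last write rounds up to 64, which the per-write "== ring_info->len"
checks never saw, so the rounding step itself may be what the final
wrap covers.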

> +
> +    update_tx_ptr(d, ring_info, ring.tx_ptr);
> +
> +    /*
> +     * At this point (and also on an error exit paths from this function) it is
> +     * possible to unmap the ring_info, ie:
> +     *   ring_unmap(d, ring_info);
> +     * but performance should be improved by not doing so, and retaining
> +     * the mapping.
> +     * An XSM policy control over level of confidentiality required
> +     * versus performance cost could be added to decide that here.
> +     */
> +
> +    *out_len = len;
> +
> +    return ret;
> +}
> +
>  static void
>  wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
>  {
> @@ -497,6 +918,25 @@ wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
>  }
>  
>  static void
> +wildcard_pending_list_insert(domid_t domain_id, struct pending_ent *ent)
> +{
> +    struct domain *d = get_domain_by_id(domain_id);
> +
> +    if ( !d )
> +        return;
> +
> +    ASSERT(LOCKING_Read_L1);
> +
> +    if ( d->argo )
> +    {
> +        spin_lock(&d->argo->wildcard_L2_lock);
> +        list_add(&ent->wildcard_node, &d->argo->wildcard_pend_list);
> +        spin_unlock(&d->argo->wildcard_L2_lock);
> +    }
> +    put_domain(d);
> +}
> +
> +static void
>  pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
>  {
>      struct list_head *ring_pending = &ring_info->pending;
> @@ -518,6 +958,70 @@ pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
>      ring_info->npending = 0;
>  }
>  
> +static int
> +pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
> +              domid_t src_id, unsigned int len)
> +{
> +    struct pending_ent *ent;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    if ( ring_info->npending >= MAX_PENDING_PER_RING )
> +        return -ENOSPC;
> +
> +    ent = xmalloc(struct pending_ent);
> +    if ( !ent )
> +        return -ENOMEM;
> +
> +    ent->len = len;
> +    ent->domain_id = src_id;
> +    ent->ring_info = ring_info;
> +
> +    if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +        wildcard_pending_list_insert(src_id, ent);
> +    list_add(&ent->node, &ring_info->pending);
> +    ring_info->npending++;
> +
> +    return 0;
> +}
> +
> +static int
> +pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
> +                domid_t src_id, unsigned int len)
> +{
> +    struct list_head *cursor, *head;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    /* List structure is not modified here. Update len in a match if found. */
> +    head = &ring_info->pending;
> +
> +    for ( cursor = head->next; cursor != head; cursor = cursor->next )

list_for_each_entry

>  long
>  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
>             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
> @@ -1145,6 +1734,53 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
>          break;
>      }
>  
> +    case XEN_ARGO_OP_sendv:
> +    {
> +        xen_argo_send_addr_t send_addr;
> +
> +        XEN_GUEST_HANDLE_PARAM(xen_argo_send_addr_t) send_addr_hnd =
> +            guest_handle_cast(arg1, xen_argo_send_addr_t);
> +        XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd =
> +            guest_handle_cast(arg2, xen_argo_iov_t);
> +        /* arg3 is niov */
> +        /* arg4 is message_type. Must be a 32-bit value. */
> +
> +        rc = copy_from_guest(&send_addr, send_addr_hnd, 1) ? -EFAULT : 0;
> +        if ( rc )
> +            break;
> +
> +        /*
> +         * Check padding is zeroed. Reject niov above limit or message_types
> +         * that are outside 32 bit range.
> +         */
> +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
> +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )

arg4 & ~(GB(4) - 1)

Is clearer IMO, or:

arg4 > UINT32_MAX
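
As a quick illustration that the mask and the comparison reject the same
values, here is a minimal standalone sketch. It uses uint64_t in place
of the hypercall's unsigned long argument (an assumption made so the
example behaves the same on every build) and hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask form: true when any bit above bit 31 is set. */
static bool exceeds_32bit_mask(uint64_t arg)
{
    return (arg & ~UINT64_C(0xffffffff)) != 0;
}

/* Comparison form: true when the value does not fit in 32 bits. */
static bool exceeds_32bit_cmp(uint64_t arg)
{
    return arg > UINT32_MAX;
}
```

Both return false for 0xffffffff and true for 0x100000000. Where
unsigned long is only 32 bits wide, neither form can ever trigger.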

> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        if ( send_addr.src.domain_id == XEN_ARGO_DOMID_ANY )
> +            send_addr.src.domain_id = currd->domain_id;
> +
> +        /* No domain is currently authorized to send on behalf of another */
> +        if ( unlikely(send_addr.src.domain_id != currd->domain_id) )
> +        {
> +            rc = -EPERM;
> +            break;
> +        }
> +
> +        /*
> +         * Check access to the whole array here so we can use the faster __copy
> +         * operations to read each element later.
> +         */
> +        if ( unlikely(!guest_handle_okay(iovs_hnd, arg3)) )

You need to set rc to -EFAULT here, because the call to copy_from_guest
has set it to 0.

Alternatively you can change the call above to be:

if ( copy_from_guest(&send_addr, send_addr_hnd, 1) )
    return -EFAULT;

So rc doesn't get set to 0 on success.
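
The pattern can be reproduced in a minimal standalone sketch. The two
boolean parameters are hypothetical stand-ins for the results of
copy_from_guest() and guest_handle_okay(), and the sketch assumes the
failing branch only leaves the switch without assigning rc.

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors the flow under review: rc is 0 after a successful copy. */
static int sendv_checks_as_posted(bool copy_fails, bool handle_ok)
{
    int rc;

    rc = copy_fails ? -EFAULT : 0;
    if ( rc )
        return rc;

    if ( !handle_ok )
        return rc;  /* rc is still 0: the failure is reported as success */

    return 0;
}

/* Same flow with rc set explicitly on the handle check failure. */
static int sendv_checks_fixed(bool copy_fails, bool handle_ok)
{
    if ( copy_fails )
        return -EFAULT;

    if ( !handle_ok )
        return -EFAULT;

    return 0;
}
```

With a good copy and a bad handle, the first helper returns 0 and the
second returns -EFAULT.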

With those taken care of:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

* Re: [PATCH v5 10/15] argo: implement the notify op
  2019-01-21  9:59 ` [PATCH v5 10/15] argo: implement the notify op Christopher Clark
@ 2019-01-22 14:09   ` Roger Pau Monné
  2019-01-31  4:12     ` Christopher Clark
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-22 14:09 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Julien Grall, Tim Deegan,
	Daniel Smith, Rich Persaud, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Mon, Jan 21, 2019 at 01:59:50AM -0800, Christopher Clark wrote:
> Queries for data about space availability in registered rings and
> causes notification to be sent when space has become available.
> 
> The hypercall op populates a supplied data structure with information about
> ring state and if insufficient space is currently available in a given ring,
> the hypervisor will record the domain's expressed interest and notify it
> when it observes that space has become available.
> 
> Checks for free space occur when this notify op is invoked, so it may be
> intentionally invoked with no data structure to populate
> (ie. a NULL argument) to trigger such a check and consequent notifications.
> 
> Limit the maximum number of notify requests in a single operation to a
> simple fixed limit of 256.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

LGTM, but I would like to see the open-coded versions of the list_
macros fixed:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> diff --git a/xen/common/argo.c b/xen/common/argo.c
> index 518aff7..4b43bdd 100644
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
[...]
> +static void
> +pending_notify(struct list_head *to_notify)
> +{
> +    ASSERT(LOCKING_Read_L1);
> +
> +    /* Sending signals for all ents in this list, draining until it is empty. */
> +    while ( !list_empty(to_notify) )
> +    {
> +        struct pending_ent *ent =
> +            list_entry(to_notify->next, struct pending_ent, node);

list_first_entry_or_null

> +
> +        list_del(&ent->node);
> +        signal_domid(ent->domain_id);
> +        xfree(ent);
> +    }
> +}
> +
> +static void
> +pending_find(const struct domain *d, struct argo_ring_info *ring_info,
> +             unsigned int payload_space, struct list_head *to_notify)
> +{
> +    struct list_head *cursor, *pending_head;
> +
> +    ASSERT(LOCKING_Read_rings_L2(d));
> +
> +    /*
> +     * TODO: Current policy here is to signal _all_ of the waiting domains
> +     *       interested in sending a message of size less than payload_space.
> +     *
> +     * This is likely to be suboptimal, since once one of them has added
> +     * their message to the ring, there may well be insufficient room
> +     * available for any of the others to transmit, meaning that they were
> +     * woken in vain, which created extra work just to requeue their wait.
> +     *
> +     * Retain this simple policy for now since it at least avoids starving a
> +     * domain of available space notifications because of a policy that only
> +     * notified other domains instead. Improvement may be possible;
> +     * investigation required.
> +     */
> +    spin_lock(&ring_info->L3_lock);
> +
> +    /* Remove matching ents from the ring list, and add them to "to_notify" */
> +    pending_head = &ring_info->pending;
> +    cursor = pending_head->next;
> +
> +    while ( cursor != pending_head )
> +    {
> +        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
> +
> +        cursor = cursor->next;

list_for_each_entry_safe?

> +
> +        if ( payload_space >= ent->len )
> +        {
> +            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +                wildcard_pending_list_remove(ent->domain_id, ent);
> +
> +            list_del(&ent->node);
> +            ring_info->npending--;
> +            list_add(&ent->node, to_notify);
> +        }
> +    }
> +
> +    spin_unlock(&ring_info->L3_lock);
> +}
> +
>  static int
>  pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
>                domid_t src_id, unsigned int len)
> @@ -1023,6 +1163,36 @@ pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
>  }
>  
>  static void
> +pending_cancel(const struct domain *d, struct argo_ring_info *ring_info,
> +               domid_t src_id)
> +{
> +    struct list_head *cursor, *pending_head;
> +
> +    ASSERT(LOCKING_L3(d, ring_info));
> +
> +    /* Remove all ents where domain_id matches src_id from the ring's list. */
> +    pending_head = &ring_info->pending;
> +    cursor = pending_head->next;
> +
> +    while ( cursor != pending_head )
> +    {
> +        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
> +
> +        cursor = cursor->next;

list_for_each_entry_safe

> +
> +        if ( ent->domain_id == src_id )
> +        {
> +            /* For wildcard rings, remove each from their wildcard list too. */
> +            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +                wildcard_pending_list_remove(ent->domain_id, ent);
> +            list_del(&ent->node);
> +            xfree(ent);
> +            ring_info->npending--;
> +        }
> +    }
> +}
> +
> +static void
>  wildcard_rings_pending_remove(struct domain *d)
>  {
>      struct list_head *wildcard_head;
> @@ -1158,6 +1328,86 @@ partner_rings_remove(struct domain *src_d)
>  }
>  
>  static int
> +fill_ring_data(const struct domain *currd,
> +               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
> +{
> +    xen_argo_ring_data_ent_t ent;
> +    struct domain *dst_d;
> +    struct argo_ring_info *ring_info;
> +    int ret = 0;
> +
> +    ASSERT(currd == current->domain);
> +    ASSERT(LOCKING_Read_L1);
> +
> +    if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
> +        return -EFAULT;
> +
> +    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.aport=%x\n",
> +                 ent.ring.domain_id, ent.ring.aport);
> +
> +    ent.flags = 0;
> +
> +    dst_d = get_domain_by_id(ent.ring.domain_id);
> +    if ( !dst_d || !dst_d->argo )
> +        goto out;
> +
> +    read_lock(&dst_d->argo->rings_L2_rwlock);
> +
> +    ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
> +                                        currd->domain_id);
> +    if ( ring_info )
> +    {
> +        unsigned int space_avail;
> +
> +        ent.flags |= XEN_ARGO_RING_EXISTS;
> +
> +        spin_lock(&ring_info->L3_lock);
> +
> +        ent.max_message_size = ring_info->len -
> +                                   sizeof(struct xen_argo_ring_message_header) -
> +                                   ROUNDUP_MESSAGE(1);
> +
> +        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> +            ent.flags |= XEN_ARGO_RING_SHARED;
> +
> +        space_avail = ringbuf_payload_space(dst_d, ring_info);
> +
> +        argo_dprintk("fill_ring_data: aport=%x space_avail=%u"
> +                     " space_wanted=%u\n",
> +                     ring_info->id.aport, space_avail, ent.space_required);
> +
> +        /* Do not queue a notification for an unachievable size */
> +        if ( ent.space_required > ent.max_message_size )
> +            ent.flags |= XEN_ARGO_RING_EMSGSIZE;
> +        else if ( space_avail >= ent.space_required )
> +        {
> +            pending_cancel(dst_d, ring_info, currd->domain_id);
> +            ent.flags |= XEN_ARGO_RING_SUFFICIENT;
> +        }
> +        else
> +            ret = pending_requeue(dst_d, ring_info, currd->domain_id,
> +                                  ent.space_required);
> +
> +        spin_unlock(&ring_info->L3_lock);
> +
> +        if ( space_avail == ent.max_message_size )
> +            ent.flags |= XEN_ARGO_RING_EMPTY;
> +
> +    }
> +    read_unlock(&dst_d->argo->rings_L2_rwlock);
> +
> + out:
> +    if ( dst_d )
> +        put_domain(dst_d);
> +
> +    if ( !ret && (__copy_field_to_guest(data_ent_hnd, &ent, flags) ||
> +                  __copy_field_to_guest(data_ent_hnd, &ent, max_message_size)) )
> +        return -EFAULT;
> +
> +    return ret;
> +}
> +
> +static int
>  find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
>  {
>      struct page_info *page;
> @@ -1586,6 +1836,112 @@ register_ring(struct domain *currd,
>      return ret;
>  }
>  
> +static void
> +notify_ring(const struct domain *d, struct argo_ring_info *ring_info,
> +            struct list_head *to_notify)
> +{
> +    unsigned int space;
> +
> +    ASSERT(LOCKING_Read_rings_L2(d));
> +
> +    spin_lock(&ring_info->L3_lock);
> +
> +    if ( ring_info->len )
> +        space = ringbuf_payload_space(d, ring_info);
> +    else
> +        space = 0;
> +
> +    spin_unlock(&ring_info->L3_lock);
> +
> +    if ( space )
> +        pending_find(d, ring_info, space, to_notify);
> +}
> +
> +static void
> +notify_check_pending(struct domain *d)
> +{
> +    unsigned int i;
> +    LIST_HEAD(to_notify);
> +
> +    ASSERT(LOCKING_Read_L1);
> +
> +    read_lock(&d->argo->rings_L2_rwlock);
> +
> +    /* Walk all rings, call notify_ring on each to populate to_notify list */
> +    for ( i = 0; i < ARGO_HASHTABLE_SIZE; i++ )
> +    {
> +        struct list_head *cursor, *bucket = &d->argo->ring_hash[i];
> +        struct argo_ring_info *ring_info;
> +
> +        for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )

list_for_each_entry

Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (14 preceding siblings ...)
  2019-01-21  9:59 ` [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer Christopher Clark
@ 2019-01-22 14:17 ` Roger Pau Monné
  2019-01-31  4:05   ` Christopher Clark
  15 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-22 14:17 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> Version five of this patch series:
> 
> * Changes are primarily addressing feedback from the v4 series reviews.
>   Many points noted on the individual commit posts.
> 
> * Critical sections have been shrunk, with allocations and frees
>   pulled outside where possible, reordering logic within hypercall ops.
> 
> * A new ring hash function implemented, derived from the djb2 string
>   hash function.
> 
> * Flags returned by the notify op have been simplified.
> 
> * Now uses a single argo boot parameter, taking a list:
>   - top level boolean to enable/disable Argo
>   - mac-permissive option to enable/disable wildcard rings
>   - command line doc edit: no "CONFIG_ARGO" but refers to build config
> 
> * Switched to use the standard list data structures used by Xen's
>   common code.

AFAIK this was not requested by any reviewer, so I wonder why you made
such a change, all the more so since you open-coded some of the list_
macros instead of just doing a s/hlist_/list_/ replacement.

I'm fine with using list instead of hlist, but I don't understand why
you decided to open-code list_for_each and list_for_each_safe instead
of using the macros provided by Xen. Is there an issue with those
macros?
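
For reference, here is a minimal standalone sketch of what the macro
form of a removal loop such as pending_cancel could look like. The list
type, helpers and macro below are simplified stand-ins written for this
example (they are not copied from Xen's list.h), and cancel_for is a
hypothetical name.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Iterate over entries while allowing removal of the current one. */
#define list_for_each_entry_safe(pos, n, head, member)                      \
    for ( (pos) = list_entry((head)->next, __typeof__(*(pos)), member),     \
          (n) = list_entry((pos)->member.next, __typeof__(*(pos)), member); \
          &(pos)->member != (head);                                         \
          (pos) = (n),                                                      \
          (n) = list_entry((n)->member.next, __typeof__(*(n)), member) )

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

/* Insert entry directly after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

struct pending_ent {
    struct list_head node;
    unsigned int domain_id;
};

/* Unlink every entry queued by src_id; returns how many were removed. */
static unsigned int cancel_for(struct list_head *head, unsigned int src_id)
{
    struct pending_ent *ent, *next;
    unsigned int removed = 0;

    list_for_each_entry_safe(ent, next, head, node)
    {
        if ( ent->domain_id == src_id )
        {
            list_del(&ent->node);
            removed++;
        }
    }

    return removed;
}
```

The macro saves the next entry before the loop body runs, which is the
same thing the open-coded "cursor = cursor->next" ahead of list_del
achieves by hand.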

I've made a couple of minor comments, but I think the current status
is good, and fixing those minor comments is going to be trivial.

Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-01-22 14:17 ` [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
@ 2019-01-31  4:05   ` Christopher Clark
  2019-01-31 13:39     ` Roger Pau Monné
  0 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-31  4:05 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> > Version five of this patch series:
> >
> > * Changes are primarily addressing feedback from the v4 series reviews.
> >   Many points noted on the individual commit posts.
> >
> > * Critical sections have been shrunk, with allocations and frees
> >   pulled outside where possible, reordering logic within hypercall ops.
> >
> > * A new ring hash function implemented, derived from the djb2 string
> >   hash function.
> >
> > * Flags returned by the notify op have been simplified.
> >
> > * Now uses a single argo boot parameter, taking a list:
> >   - top level boolean to enable/disable Argo
> >   - mac-permissive option to enable/disable wildcard rings
> >   - command line doc edit: no "CONFIG_ARGO" but refers to build config
> >
> > * Switched to use the standard list data structures used by Xen's
> >   common code.
>
> AFAIK this was not requested by any reviewer, so I wonder why you made
> such change. The more that you open coded some of the list_ macros
> instead of just doing a s/hlist_/list_/ replacement.
> I'm fine with using list instead of hlist,

At your request, v7 replaces open coding with Xen's list macros. The
hlist macros were not used by any of the common code in Xen.

> but I don't understand why
> you decided to open code list_for_each and list_for_each_safe instead
> of using the macros provided by Xen. Is there an issue with such
> macros?

As discussed offline:

- Using Xen's list macros will expedite Argo's merge for Xen 4.12
- List macros in Xen list.h originated in Linux list.h and have diverged
- OpenXT has use cases for measured launch and nested virtualization,
  which influence downstream performance and security requirements for
  Argo and Xen
- OpenXT can temporarily patch Xen 4.12 for downstream use

> I've made a couple of minor comments, but I think the current status
> is good, and fixing those minor comments is going to be trivial.

Ack, thanks. Hopefully v7 looks good.

Christopher

* Re: [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-21 17:55   ` Roger Pau Monné
@ 2019-01-31  4:06     ` Christopher Clark
  2019-01-31 10:28       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-31  4:06 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, James McKenzie, Eric Chanudet

On Mon, Jan 21, 2019 at 9:55 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jan 21, 2019 at 01:59:44AM -0800, Christopher Clark wrote:
> > Initialises basic data structures and performs teardown of argo state
> > for domain shutdown.
> >
> > Inclusion of the Argo implementation is dependent on CONFIG_ARGO.
> >
> > Introduces a new Xen command line parameter 'argo': bool to enable/disable
> > the argo hypercall. Defaults to disabled.
> >
> > New headers:
> > >   public/argo.h: with definitions of addresses and ring structure, including
> >   indexes for atomic update for communication between domain and hypervisor.
> >
> >   xen/argo.h: to expose the hooks for integration into domain lifecycle:
> >     argo_init: per-domain init of argo data structures for domain_create.
> >     argo_destroy: teardown for domain_destroy and the error exit
> >                   path of domain_create.
> >     argo_soft_reset: reset of domain state for domain_soft_reset.
> >
> > Adds a new field to struct domain: struct argo_domain *argo;
> >
> > In accordance with recent work on _domain_destroy, argo_destroy is
> > idempotent. It will tear down: all rings registered by this domain, all
> > rings where this domain is the single sender (ie. specified partner,
> > non-wildcard rings), and all pending notifications where this domain is
> > awaiting signal about available space in the rings of other domains.
> >
> > A count will be maintained of the number of rings that a domain has
> > registered in order to limit it below the fixed maximum limit defined here.
> >
> > Macros are defined to verify the internal locking state within the argo
> > implementation. The macros are ASSERTed on entry to functions to validate
> > and document the required lock state prior to calling.
> >
> > The hash function for the hashtables that hold ring state is derived from
> > the string hashing function djb2 (http://www.cse.yorku.ca/~oz/hash.html)
> > by Daniel J. Bernstein. Basic testing with a limited number of domains and
> > ports has shown reasonable distribution for the table size.
> >
> > The software license on the public header is the BSD license, standard
> > procedure for the public Xen headers. The public header was originally
> > posted under a GPL license at: [1]:
> > https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html
> >
> > The following ACK by Lars Kurth is to confirm that only people being
> > employees of Citrix contributed to the header files in the series posted at
> > [1] and that thus the copyright of the files in question is fully owned by
> > Citrix. The ACK also confirms that Citrix is happy for the header files to
> > be published under a BSD license in this series (which is based on [1]).
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
> > Acked-by: Lars Kurth <lars.kurth@citrix.com>
> > Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
>
> Thanks.
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
> I've got some nits below, but it's purely cosmetic changes to make the
> code cleaner.
>
> > ---

> >
> > diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> > index d39bcee..93f41bc 100644
> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -182,6 +182,21 @@ Permit Xen to use "Always Running APIC Timer" support on compatible hardware
> >  in combination with cpuidle.  This option is only expected to be useful for
> >  developers wishing Xen to fall back to older timing methods on newer hardware.
> >
> > +### argo
> > +    = List of [ <bool> ]
> > +
> > +Controls for the Argo hypervisor-mediated interdomain communication service.
> > +
> > +The functionality that this option controls is only available when Xen has been
> > +compiled with the build setting for Argo enabled in the build configuration.
> > +
> > > +Argo is an interdomain communication mechanism, where Xen acts as the central
> > > +point of authority.  Guests may register memory rings to receive messages,
> > +query the status of other domains, and send messages by hypercall, all subject
> > +to appropriate auditing by Xen.
> > +
> > +*   An overall boolean acts as a global control.  Argo is disabled by default.
>
> I'm not sure it's worth adding a list item for the boolean, I would
> just add the "Argo is disabled by default" to the first paragraph.

ack, thanks, done.

>
> [...]
> > +static struct argo_ring_info *
> > +find_ring_info(const struct domain *d, const struct argo_ring_id *id)
> > +{
> > +    struct list_head *cursor, *bucket;
> > +
> > +    ASSERT(LOCKING_Read_rings_L2(d));
> > +
> > +    /* List is not modified here. Search and return the match if found. */
> > +    bucket = &d->argo->ring_hash[hash_index(id)];
> > +
> > +    for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )
>
> Why are you open-coding list_for_each here?

see response to cover letter

> You might also consider using list_for_each_entry, so that you can
> avoid the list_entry call below.

Ack, have done that.
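
To illustrate the difference, here is a standalone sketch of the typed-iteration form, not the code in v7. The list helpers are simplified stand-ins written for this example (Xen's list.h is the real source, and its list_for_each_entry infers the entry type with typeof, whereas this sketch takes the type as an explicit argument to stay within standard C). All sketch_ names are hypothetical.

```c
#include <stddef.h>

/* Minimal stand-in for the doubly linked list node. */
struct list_head { struct list_head *next, *prev; };

#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Typed iteration: 'pos' is the containing struct, not the raw list node. */
#define sketch_list_for_each_entry(pos, head, type, member)             \
    for ( (pos) = sketch_container_of((head)->next, type, member);      \
          &(pos)->member != (head);                                     \
          (pos) = sketch_container_of((pos)->member.next, type, member) )

static void sketch_list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

/* Insert 'n' directly after 'head'. */
static void sketch_list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Simplified stand-in for struct argo_ring_info. */
struct sketch_ring_info {
    struct list_head node;
    unsigned int aport, domain_id, partner_id;
};

/* Search one hash bucket; no list_entry call is needed in the loop body. */
static struct sketch_ring_info *
sketch_find_ring(struct list_head *bucket, unsigned int aport,
                 unsigned int domain_id, unsigned int partner_id)
{
    struct sketch_ring_info *ring_info;

    sketch_list_for_each_entry(ring_info, bucket,
                               struct sketch_ring_info, node)
    {
        if ( ring_info->aport == aport &&
             ring_info->domain_id == domain_id &&
             ring_info->partner_id == partner_id )
            return ring_info;
    }

    return NULL;
}
```

Compared with the open-coded cursor loop, the macro hides both the termination test and the conversion from list node to containing structure.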

>
> > +    {
> > +        struct argo_ring_info *ring_info =
> > +            list_entry(cursor, struct argo_ring_info, node);
> > +        const struct argo_ring_id *cmpid = &ring_info->id;
> > +
> > +        if ( cmpid->aport == id->aport &&
> > +             cmpid->domain_id == id->domain_id &&
> > +             cmpid->partner_id == id->partner_id )
> > +        {
> > +            argo_dprintk("found ring_info for ring(%u:%x %u)\n",
> > +                         id->domain_id, id->aport, id->partner_id);
> > +            return ring_info;
> > +        }
> > +    }
> > +    argo_dprintk("no ring_info for ring(%u:%x %u)\n",
> > +                 id->domain_id, id->aport, id->partner_id);
> > +
> > +    return NULL;
> > +}
> [...]
> > +static void
> > +pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
> > +{
> > +    struct list_head *ring_pending = &ring_info->pending;
> > +    struct pending_ent *ent;
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    /* Delete all pending notifications from this ring's list. */
> > +    while ( !list_empty(ring_pending) )
>
> Nit: you could use list_first_entry_or_null that joins the list_empty
> and list_entry calls.

There are no existing users of list_first_entry_or_null anywhere in Xen,
and applying it to this loop results in an assignment within the
while condition, which also appears to be a very rare construct within Xen,
so I just used the list_for_each_entry_safe macro.

>
> > +    {
> > +        ent = list_entry(ring_pending->next, struct pending_ent, node);
> > +
> > +        /* For wildcard rings, remove each from their wildcard list too. */
> > +        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> > +            wildcard_pending_list_remove(ent->domain_id, ent);
> > +        list_del(&ent->node);
> > +        xfree(ent);
> > +    }
> > +    ring_info->npending = 0;
> > +}
> > +
> > +static void
> > +wildcard_rings_pending_remove(struct domain *d)
> > +{
> > +    struct list_head *wildcard_head;
> > +
> > +    ASSERT(LOCKING_Write_L1);
> > +
> > +    /* Delete all pending signals to the domain about wildcard rings. */
> > +    wildcard_head = &d->argo->wildcard_pend_list;
> > +
> > +    while ( !list_empty(wildcard_head) )
> > +    {
> > +        struct pending_ent *ent =
> > +            list_entry(wildcard_head->next, struct pending_ent, node);
>
> Same here regarding the usage of list_first_entry_or_null.

list_for_each_entry_safe.
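
As a rough illustration of why the _safe variant suits these teardown loops: it caches the next entry before the loop body runs, so the current entry can be unlinked and freed. The sketch below is standalone, with hypothetical sketch_ helpers standing in for Xen's list.h (the real macro infers the type with typeof; here the type is passed explicitly), and malloc/free standing in for xmalloc/xfree.

```c
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* 'tmp' caches the next entry so that 'pos' may be unlinked and freed. */
#define sketch_list_for_each_entry_safe(pos, tmp, head, type, member)      \
    for ( (pos) = sketch_container_of((head)->next, type, member),         \
          (tmp) = sketch_container_of((pos)->member.next, type, member);   \
          &(pos)->member != (head);                                        \
          (pos) = (tmp),                                                   \
          (tmp) = sketch_container_of((tmp)->member.next, type, member) )

static void sketch_list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void sketch_list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void sketch_list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Simplified stand-in for struct pending_ent. */
struct sketch_pending_ent {
    struct list_head node;
    unsigned int domain_id;
};

/* Unlink and free every entry; returns how many were removed. */
static unsigned int sketch_pending_remove_all(struct list_head *pending)
{
    struct sketch_pending_ent *ent, *tmp;
    unsigned int removed = 0;

    sketch_list_for_each_entry_safe(ent, tmp, pending,
                                    struct sketch_pending_ent, node)
    {
        sketch_list_del(&ent->node);
        free(ent);
        removed++;
    }

    return removed;
}
```

This replaces the while ( !list_empty(...) ) plus list_entry pattern without introducing an assignment inside the loop condition.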

>
> > +
> > +        /*
> > +         * The ent->node deleted here, and the npending value decreased,
> > +         * belong to the ring_info of another domain, which is why this
> > +         * function requires holding W(L1):
> > +         * it implies the L3 lock that protects that ring_info struct.
> > +         */
> > +        ent->ring_info->npending--;
> > +        list_del(&ent->node);
> > +        list_del(&ent->wildcard_node);
> > +        xfree(ent);
> > +    }
> > +}
> [...]
> > +static void
> > +domain_rings_remove_all(struct domain *d)
> > +{
> > +    unsigned int i;
> > +    struct argo_ring_info *ring_info;
> > +
> > +    ASSERT(LOCKING_Write_rings_L2(d));
> > +
> > +    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
> > +    {
> > +        struct list_head *bucket = &d->argo->ring_hash[i];
> > +
> > +        while ( !list_empty(bucket) )
> > +        {
> > +            ring_info = list_entry(bucket->next, struct argo_ring_info, node);
>
> list_first_entry_or_null
>
> > +            ring_remove_info(d, ring_info);
> > +        }
> > +    }
> > +    d->argo->ring_count = 0;
> > +}
> > +
> > +/*
> > + * Tear down all rings of other domains where src_d domain is the partner.
> > + * (ie. it is the single domain that can send to those rings.)
> > + * This will also cancel any pending notifications about those rings.
> > + */
> > +static void
> > +partner_rings_remove(struct domain *src_d)
> > +{
> > +    unsigned int i;
> > +    struct argo_send_info *send_info;
> > +    struct argo_ring_info *ring_info;
> > +    struct domain *dst_d;
> > +
> > +    ASSERT(LOCKING_Write_L1);
> > +
> > +    for ( i = 0; i < ARGO_HASHTABLE_SIZE; ++i )
> > +    {
> > +        struct list_head *cursor, *bucket = &src_d->argo->send_hash[i];
> > +
> > +        /* Remove all ents from the send list. Take each off their ring list. */
> > +        for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )
>
> Another open-coded version of list_for_each, see my comments on the
> instances above.

list_for_each_entry_safe

>
> > +        {
> > +            send_info = list_entry(cursor, struct argo_send_info, node);
>
> send_info should be defined here to reduce its scope.

send_info is now the iterator in the list_for_each_entry_safe macro, so its
declaration is left where it is, since that is now required.

>
> > +
> > +            dst_d = get_domain_by_id(send_info->id.domain_id);
> > +            if ( dst_d && dst_d->argo )
> > +            {
> > +                ring_info = find_ring_info(dst_d, &send_info->id);
>
> ring_info should be defined here.

ack, done.

>
> > +                if ( ring_info )
> > +                {
> > +                    ring_remove_info(dst_d, ring_info);
> > +                    dst_d->argo->ring_count--;
> > +                }
> > +                else
> > +                    ASSERT_UNREACHABLE();
> > +            }
> > +            else
> > +                ASSERT_UNREACHABLE();
> > +
> > +            if ( dst_d )
> > +                put_domain(dst_d);
> > +
> > +            list_del(&send_info->node);
> > +            xfree(send_info);
> > +        }
> > +    }
> > +}

thanks,

Christopher

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 07/15] argo: implement the register op
  2019-01-22  9:59   ` Roger Pau Monné
@ 2019-01-31  4:08     ` Christopher Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-31  4:08 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 22, 2019 at 1:59 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jan 21, 2019 at 01:59:47AM -0800, Christopher Clark wrote:
> > The register op is used by a domain to register a region of memory for
> > receiving messages from either a specified other domain, or, if specifying a
> > wildcard, any domain.
> >
> > This operation creates a mapping within Xen's private address space that
> > will remain resident for the lifetime of the ring. In subsequent commits,
> > the hypervisor will use this mapping to copy data from a sending domain into
> > this registered ring, making it accessible to the domain that registered the
> > ring to receive data.
> >
> > Wildcard any-sender rings are default disabled and registration will be
> > refused with EPERM unless they have been specifically enabled with the
> > new mac-permissive flag that is added to the argo boot option here. The
> > reason why the default for wildcard rings is 'deny' is that there is
> > currently no means to protect the ring from DoS by a noisy domain
> > spamming the ring, affecting other domains ability to send to it. This
> > will be addressed with XSM policy controls in subsequent work.
> >
> > Since denying access to any-sender rings is a significant functional
> > constraint, the new option "mac-permissive" for the argo bootparam
> > enables overriding this. eg: "argo=1,mac-permissive=1"
> >
> > The p2m type of the memory supplied by the guest for the ring must be
> > p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
> > is registered.
> >
> > xen_argo_gfn_t type is defined and is 64-bit on all architectures which
> > assists with avoiding the need for compat code to translate hypercall args.
> > This hypercall op and its interface currently only supports 4K-sized pages.
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Just some nits that can be taken care of later.
>
> > +static int
> > +find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
> > +               const unsigned int npage,
> > +               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
> > +               const unsigned int len)
> > +{
> > +    unsigned int i;
> > +    int ret = 0;
> > +    mfn_t *mfns;
> > +    void **mfn_mapping;
> > +
> > +    ASSERT(LOCKING_Write_rings_L2(d));
> > +
> > +    if ( ring_info->mfns )
> > +    {
> > +        /* Ring already existed: drop the previous mapping. */
> > +        gprintk(XENLOG_INFO, "argo: vm%u re-register existing ring "
> > +                "(vm%u:%x vm%u) clears mapping\n",
> > +                d->domain_id, ring_info->id.domain_id,
> > +                ring_info->id.aport, ring_info->id.partner_id);
> > +
> > +        ring_remove_mfns(d, ring_info);
> > +        ASSERT(!ring_info->mfns);
> > +    }
> > +
> > +    mfns = xmalloc_array(mfn_t, npage);
> > +    if ( !mfns )
> > +        return -ENOMEM;
> > +
> > +    for ( i = 0; i < npage; i++ )
> > +        mfns[i] = INVALID_MFN;
> > +
> > +    mfn_mapping = xzalloc_array(void *, npage);
> > +    if ( !mfn_mapping )
> > +    {
> > +        xfree(mfns);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    ring_info->mfns = mfns;
> > +    ring_info->mfn_mapping = mfn_mapping;
> > +
> > +    for ( i = 0; i < npage; i++ )
> > +    {
> > +        xen_argo_gfn_t argo_gfn;
> > +        mfn_t mfn;
> > +
> > +        ret = __copy_from_guest_offset(&argo_gfn, gfn_hnd, i, 1) ? -EFAULT : 0;
> > +        if ( ret )
> > +            break;
> > +
> > +        ret = find_ring_mfn(d, _gfn(argo_gfn), &mfn);
> > +        if ( ret )
> > +        {
> > +            gprintk(XENLOG_ERR, "argo: vm%u: invalid gfn %"PRI_gfn" "
> > +                    "r:(vm%u:%x vm%u) %p %u/%u\n",
> > +                    d->domain_id, gfn_x(_gfn(argo_gfn)),
> > +                    ring_info->id.domain_id, ring_info->id.aport,
> > +                    ring_info->id.partner_id, ring_info, i, npage);
> > +            break;
> > +        }
> > +
> > +        ring_info->mfns[i] = mfn;
> > +
> > +        argo_dprintk("%u: %"PRI_gfn" -> %"PRI_mfn"\n",
> > +                     i, gfn_x(_gfn(argo_gfn)), mfn_x(ring_info->mfns[i]));
> > +    }
> > +
> > +    ring_info->nmfns = i;
> > +
> > +    if ( ret )
> > +        ring_remove_mfns(d, ring_info);
> > +    else
> > +    {
> > +        ASSERT(ring_info->nmfns == NPAGES_RING(len));
> > +
> > +        gprintk(XENLOG_DEBUG, "argo: vm%u ring (vm%u:%x vm%u) %p "
>
> Nit: this likely wants to be an argo_dprintk?

There are not many instances in the Argo code where
gprintk(XENLOG_DEBUG, ...) is used, but it is intentional here:
argo_dprintk needs a recompile to enable it, whereas gprintk does not.

Ring registration is non-datapath and the message is potentially useful
when diagnosing a deployed system.
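
A small sketch of the distinction, under stated assumptions: sketch_dprintk models a print that is compiled out unless a build-time define is set, and sketch_gprintk models a print that is filtered by a threshold that can change at run time. The names, the SKETCH_ARGO_DEBUG define and the level enum are all hypothetical; this is not Xen's printk machinery.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical compile-time switch: without it the call vanishes entirely. */
#ifdef SKETCH_ARGO_DEBUG
#define sketch_dprintk(...) printf("argo: " __VA_ARGS__)
#else
#define sketch_dprintk(...) ((void)0)
#endif

enum sketch_loglvl {
    SKETCH_LOG_ERR = 1,
    SKETCH_LOG_WARNING,
    SKETCH_LOG_INFO,
    SKETCH_LOG_DEBUG
};

/* Runtime threshold: can be raised on a deployed system without a rebuild. */
static enum sketch_loglvl sketch_log_threshold = SKETCH_LOG_WARNING;

/* Returns the number of characters printed, or 0 if the message is filtered. */
static int sketch_gprintk(enum sketch_loglvl lvl, const char *fmt, ...)
{
    va_list args;
    int n;

    if ( lvl > sketch_log_threshold )
        return 0;

    va_start(args, fmt);
    n = vprintf(fmt, args);
    va_end(args);

    return n;
}
```

With the default threshold a SKETCH_LOG_DEBUG message is dropped; raising sketch_log_threshold makes it appear, which is the property wanted for a non-datapath message such as ring registration.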

>
> > +                "mfn_mapping %p len %u nmfns %u\n",
> > +                d->domain_id, ring_info->id.domain_id,
> > +                ring_info->id.aport, ring_info->id.partner_id, ring_info,
> > +                ring_info->mfn_mapping, ring_info->len, ring_info->nmfns);
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> > +static long
> > +register_ring(struct domain *currd,
> > +              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
> > +              XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
> > +              unsigned int npage, bool fail_exist)
> > +{
> > +    xen_argo_register_ring_t reg;
> > +    struct argo_ring_id ring_id;
> > +    void *map_ringp;
> > +    xen_argo_ring_t *ringp;
> > +    struct argo_ring_info *ring_info, *new_ring_info = NULL;
> > +    struct argo_send_info *send_info = NULL;
> > +    struct domain *dst_d = NULL;
> > +    int ret = 0;
> > +    unsigned int private_tx_ptr;
> > +
> > +    ASSERT(currd == current->domain);
> > +
> > +    if ( copy_from_guest(&reg, reg_hnd, 1) )
> > +        return -EFAULT;
> > +
> > +    /*
> > +     * A ring must be large enough to transmit messages, so requires space for:
> > +     * * 1 message header, plus
> > +     * * 1 payload slot (payload is always rounded to a multiple of 16 bytes)
> > +     *   for the message payload to be written into, plus
> > +     * * 1 more slot, so that the ring cannot be filled to capacity with a
> > +     *   single minimum-size message -- see the logic in ringbuf_insert --
> > +     *   allowing for this ensures that there can be space remaining when a
> > +     *   message is present.
> > +     * The above determines the minimum acceptable ring size.
> > +     */
> > +    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
> > +                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
> > +         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
> > +         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
> > +         (NPAGES_RING(reg.len) != npage) ||
> > +         (reg.pad != 0) )
> > +        return -EINVAL;
> > +
> > +    ring_id.partner_id = reg.partner_id;
> > +    ring_id.aport = reg.aport;
> > +    ring_id.domain_id = currd->domain_id;
> > +
> > +    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
> > +    {
> > +        if ( !opt_argo_mac_permissive )
> > +            return -EPERM;
> > +    }
> > +    else
> > +    {
> > +        dst_d = get_domain_by_id(reg.partner_id);
> > +        if ( !dst_d )
> > +        {
> > +            argo_dprintk("!dst_d, ESRCH\n");
> > +            return -ESRCH;
> > +        }
> > +
> > +        send_info = xzalloc(struct argo_send_info);
> > +        if ( !send_info )
> > +        {
> > +            ret = -ENOMEM;
> > +            goto out;
> > +        }
> > +        send_info->id = ring_id;
> > +    }
> > +
> > +    /*
> > +     * Common case is that the ring doesn't already exist, so do the alloc here
> > +     * before picking up any locks.
> > +     */
> > +    new_ring_info = xzalloc(struct argo_ring_info);
> > +    if ( !new_ring_info )
> > +    {
> > +        ret = -ENOMEM;
> > +        goto out;
> > +    }
> > +
> > +    read_lock(&L1_global_argo_rwlock);
> > +
> > +    if ( !currd->argo )
> > +    {
> > +        ret = -ENODEV;
> > +        goto out_unlock;
> > +    }
> > +
> > +    if ( dst_d && !dst_d->argo )
> > +    {
> > +        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
> > +        ret = -ECONNREFUSED;
> > +        goto out_unlock;
> > +    }
> > +
> > +    write_lock(&currd->argo->rings_L2_rwlock);
> > +
> > +    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
> > +    {
> > +        ret = -ENOSPC;
> > +        goto out_unlock2;
> > +    }
> > +
> > +    ring_info = find_ring_info(currd, &ring_id);
> > +    if ( !ring_info )
> > +    {
> > +        ring_info = new_ring_info;
> > +        new_ring_info = NULL;
> > +
> > +        spin_lock_init(&ring_info->L3_lock);
> > +
> > +        ring_info->id = ring_id;
> > +        INIT_LIST_HEAD(&ring_info->pending);
> > +
> > +        list_add(&ring_info->node,
> > +                 &currd->argo->ring_hash[hash_index(&ring_info->id)]);
> > +
> > +        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%u)\n",
> > +                currd->domain_id, ring_id.domain_id, ring_id.aport,
> > +                ring_id.partner_id);
> > +    }
> > +    else if ( ring_info->len )
> > +    {
> > +        /*
> > +         * If the caller specified that the ring must not already exist,
> > +         * fail at attempt to add a completed ring which already exists.
> > +         */
> > +        if ( fail_exist )
> > +        {
> > +            argo_dprintk("disallowed reregistration of existing ring\n");
>
> And this should likely be gprintk with error type?

ack, yes, thanks.

>
> I think the pattern of using gprintk for error messages and
> argo_dprintk for verbose information is correct, but there are a
> couple of oddities that can be fixed later.
>
> > +            ret = -EEXIST;
> > +            goto out_unlock2;
> > +        }
> > +
> > +        if ( ring_info->len != reg.len )
> > +        {
> > +            /*
> > +             * Change of ring size could result in entries on the pending
> > +             * notifications list that will never trigger.
> > +             * Simple blunt solution: disallow ring resize for now.
> > +             * TODO: investigate enabling ring resize.
> > +             */
> > +            gprintk(XENLOG_ERR, "argo: vm%u attempted to change ring size "
> > +                    "(vm%u:%x vm%u)\n",
> > +                    currd->domain_id, ring_id.domain_id, ring_id.aport,
> > +                    ring_id.partner_id);
> > +            /*
> > +             * Could return EINVAL here, but if the ring didn't already
> > +             * exist then the arguments would have been valid, so: EEXIST.
> > +             */
> > +            ret = -EEXIST;
> > +            goto out_unlock2;
> > +        }
> > +
> > +        gprintk(XENLOG_DEBUG,
> > +                "argo: vm%u re-registering existing ring (vm%u:%x vm%u)\n",
> > +                currd->domain_id, ring_id.domain_id, ring_id.aport,
> > +                ring_id.partner_id);
>
> This again would better be argo_dprintk IMO.

Similar to the above: the message is useful on a deployed system, not just
during Argo development.

>
> [...]
> > @@ -552,6 +987,38 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> >
> >      switch (cmd)
> >      {
> > +    case XEN_ARGO_OP_register_ring:
> > +    {
> > +        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
> > +            guest_handle_cast(arg1, xen_argo_register_ring_t);
> > +        XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd =
> > +            guest_handle_cast(arg2, xen_argo_gfn_t);
> > +        /* arg3 is npage */
> > +        /* arg4 is flags */
> > +        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
>
> Nit: I would add a:
>
> BUILD_BUG_ON(!IS_ALIGNED(XEN_ARGO_MAX_RING_SIZE, PAGE_SIZE));

ack
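
For readers unfamiliar with the idiom, a compile-time check of this kind can be sketched in standard C11 as below. The constants are assumptions for the example (4 KiB pages and a 16 MiB maximum ring size), and SKETCH_IS_ALIGNED is a hypothetical stand-in for Xen's IS_ALIGNED; _Static_assert plays the role of BUILD_BUG_ON.

```c
/* Assumed values for this sketch; the real ones come from Xen's headers. */
#define SKETCH_PAGE_SIZE     4096u
#define SKETCH_MAX_RING_SIZE 0x1000000u

/* True when 'val' is a multiple of 'align' (align must be a power of two). */
#define SKETCH_IS_ALIGNED(val, align) (((val) & ((align) - 1)) == 0)

/* Fails the build, rather than the running system, if the invariant breaks. */
_Static_assert(SKETCH_IS_ALIGNED(SKETCH_MAX_RING_SIZE, SKETCH_PAGE_SIZE),
               "maximum ring size must be a whole number of pages");
```

The check costs nothing at run time and documents the assumption next to the code that relies on it.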

thanks,

Christopher

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-22 12:08   ` Roger Pau Monné
@ 2019-01-31  4:10     ` Christopher Clark
  2019-01-31 10:18       ` Roger Pau Monné
  0 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-01-31  4:10 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 22, 2019 at 4:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
> > sendv operation is invoked to perform a synchronous send of buffers
> > contained in iovs to a remote domain's registered ring.
> >
> > It takes:
> >  * A destination address (domid, port) for the ring to send to.
> >    It performs a most-specific match lookup, to allow for wildcard.
> >  * A source address, used to inform the destination of where to reply.
> >  * The address of an array of iovs containing the data to send
> >  * .. and the length of that array of iovs
> >  * and a 32-bit message type, available to communicate message context
> >    data (eg. kernel-to-kernel, separate from the application data).
> >
> > If insufficient space exists in the destination ring, it will return
> > -EAGAIN and Xen will notify the caller when sufficient space becomes
> > available.
> >
> > Accesses to the ring indices are appropriately atomic. The rings are
> > mapped into Xen's private address space to write as needed and the
> > mappings are retained for later use.
> >
> > Notifications are sent to guests via VIRQ and send_guest_global_virq is
> > exposed in the change to enable argo to call it. VIRQ_ARGO_MESSAGE is
>                                                    ^ VIRQ_ARGO
> > claimed from the VIRQ previously reserved for this purpose (#11).
> >
> > The VIRQ notification method is used rather than sending events using
> > evtchn functions directly because:
> >
> > * no current event channel type is an exact fit for the intended
> >   behaviour. ECS_IPI is closest, but it disallows migration to
> >   other VCPUs which is not necessarily a requirement for Argo.
> >
> > * at the point of argo_init, allocation of an event channel is
> >   complicated by none of the guest VCPUs being initialized yet
> >   and the event channel logic expects that a valid event channel
> >   has a present VCPU.
>
> IMO iff you wanted to use event channels those _must_ be setup by the
> guest, ie: the guest argo driver would load, allocate an event channel
> and then tell the hypervisor about the event channel that should be
> used for argo notifications.
>
> > +static int
> > +memcpy_to_guest_ring(const struct domain *d, struct argo_ring_info *ring_info,
> > +                     unsigned int offset,
> > +                     const void *src, XEN_GUEST_HANDLE(uint8_t) src_hnd,
> > +                     unsigned int len)
> > +{
> > +    unsigned int mfns_index = offset >> PAGE_SHIFT;
> > +    void *dst;
> > +    int ret;
> > +    unsigned int src_offset = 0;
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    offset &= ~PAGE_MASK;
> > +
> > +    if ( len + offset > XEN_ARGO_MAX_RING_SIZE )
> > +        return -EFAULT;
> > +
> > +    while ( len )
> > +    {
> > +        unsigned int head_len = (offset + len) > PAGE_SIZE ? PAGE_SIZE - offset
> > +                                                           : len;
>
> IMO that would be clearer as:
>
> head_len = min(PAGE_SIZE - offset, len);

You're right that the calculated result should be the same, but I've left
this unchanged because I think the reason for using that value (ie. intent)
is clearer in the form it has:
it's not about trying to find the smallest amount of data to write,
it's about only writing up to the PAGE_SIZE boundary, starting at offset.
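
A standalone sketch of that intent, assuming 4 KiB pages: each step copies from the current in-page offset up to the page boundary, or to the end of the data if that comes first. The sketch_ names are hypothetical and the actual copy is omitted; only the chunking arithmetic is shown.

```c
#define SKETCH_PAGE_SIZE 4096u

/*
 * Bytes to copy in this step: stop at the page boundary, or at the end of
 * the data if it ends earlier. Requires offset < SKETCH_PAGE_SIZE.
 */
static unsigned int sketch_head_len(unsigned int offset, unsigned int len)
{
    return (offset + len) > SKETCH_PAGE_SIZE ? SKETCH_PAGE_SIZE - offset
                                             : len;
}

/* Walk 'len' bytes starting at in-page 'offset'; count the chunks copied. */
static unsigned int sketch_count_chunks(unsigned int offset, unsigned int len)
{
    unsigned int chunks = 0;

    while ( len )
    {
        unsigned int head_len = sketch_head_len(offset, len);

        len -= head_len;
        offset = 0; /* every later chunk starts at the top of the next page */
        chunks++;
    }

    return chunks;
}
```

For example, 200 bytes starting at offset 4000 are split into 96 bytes that finish the first page and 104 bytes at the start of the next.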

>
> But anyway, this should go away when you move to using vmap.
>
> [...]
> > +static int
> > +ringbuf_insert(const struct domain *d, struct argo_ring_info *ring_info,
> > +               const struct argo_ring_id *src_id,
> > +               XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd,
> > +               unsigned long niov, uint32_t message_type,
> > +               unsigned long *out_len)
> > +{
> > +    xen_argo_ring_t ring;
> > +    struct xen_argo_ring_message_header mh = { };
> > +    int sp, ret;
> > +    unsigned int len = 0;
> > +    xen_argo_iov_t iovs[XEN_ARGO_MAXIOV];
> > +    xen_argo_iov_t *piov;
> > +    XEN_GUEST_HANDLE(uint8_t) NULL_hnd =
> > +       guest_handle_from_param(guest_handle_from_ptr(NULL, uint8_t), uint8_t);
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    ret = __copy_from_guest(iovs, iovs_hnd, niov) ? -EFAULT : 0;
> > +    if ( ret )
> > +        return ret;
> > +
> > +    /*
> > +     * Obtain the total size of data to transmit -- sets the 'len' variable
> > +     * -- and sanity check that the iovs conform to size and number limits.
> > +     * Enforced below: no more than 'len' bytes of guest data
> > +     * (plus the message header) will be sent in this operation.
> > +     */
> > +    ret = iov_count(iovs, niov, &len);
> > +    if ( ret )
> > +        return ret;
> > +
> > +    /*
> > +     * Size bounds check against ring size and static maximum message limit.
> > +     * The message must not fill the ring; there must be at least one slot
> > +     * remaining so we can distinguish a full ring from an empty one.
> > +     */
> > +    if ( ((ROUNDUP_MESSAGE(len) +
> > +            sizeof(struct xen_argo_ring_message_header)) >= ring_info->len) ||
> > +         (len > MAX_ARGO_MESSAGE_SIZE) )
>
> len is already checked to be <= MAX_ARGO_MESSAGE_SIZE in iov_count
> where it gets set, this is redundant.

ack, removed, thanks.
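
What remains in ringbuf_insert is the check against the ring size and the check against the currently free space. A standalone sketch of that arithmetic follows, assuming the 16-byte header and 16-byte payload rounding stated in the quoted comments; the sketch_ names are hypothetical and none of the guest-memory handling is modelled.

```c
#define SKETCH_HEADER_SIZE        16u
#define SKETCH_ROUNDUP_MESSAGE(x) (((x) + 15u) & ~15u)

/* Free bytes in a ring of 'ring_len' given the receive and transmit indices. */
static int sketch_ring_space(unsigned int rx_ptr, unsigned int tx_ptr,
                             unsigned int ring_len)
{
    int sp;

    if ( rx_ptr == tx_ptr )
        return (int)ring_len; /* empty ring: all of it is free */

    sp = (int)rx_ptr - (int)tx_ptr;
    if ( sp < 0 )
        sp += (int)ring_len;  /* transmit index is ahead: wrap around */

    return sp;
}

/*
 * A message fits only if some space is left over afterwards; otherwise a
 * full ring could not be told apart from an empty one (rx_ptr == tx_ptr).
 */
static int sketch_message_fits(unsigned int len, int space)
{
    return (int)(SKETCH_ROUNDUP_MESSAGE(len) + SKETCH_HEADER_SIZE) < space;
}
```

The strict comparison is also why the minimum ring size is one header plus two payload slots: a 1-byte message needs 32 bytes, so a ring must offer more than 32 bytes of space to accept it.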

>
> > +        return -EMSGSIZE;
> > +
> > +    ret = get_sanitized_ring(d, &ring, ring_info);
> > +    if ( ret )
> > +        return ret;
> > +
> > +    argo_dprintk("ring.tx_ptr=%u ring.rx_ptr=%u ring len=%u"
> > +                 " ring_info->tx_ptr=%u\n",
> > +                 ring.tx_ptr, ring.rx_ptr, ring_info->len, ring_info->tx_ptr);
> > +
> > +    if ( ring.rx_ptr == ring.tx_ptr )
> > +        sp = ring_info->len;
> > +    else
> > +    {
> > +        sp = ring.rx_ptr - ring.tx_ptr;
> > +        if ( sp < 0 )
> > +            sp += ring_info->len;
> > +    }
> > +
> > +    /*
> > +     * Size bounds check against currently available space in the ring.
> > +     * Again: the message must not fill the ring leaving no space remaining.
> > +     */
> > +    if ( (ROUNDUP_MESSAGE(len) +
> > +            sizeof(struct xen_argo_ring_message_header)) >= sp )
> > +    {
> > +        argo_dprintk("EAGAIN\n");
> > +        return -EAGAIN;
> > +    }
> > +
> > +    mh.len = len + sizeof(struct xen_argo_ring_message_header);
> > +    mh.source.aport = src_id->aport;
> > +    mh.source.domain_id = src_id->domain_id;
> > +    mh.message_type = message_type;
> > +
> > +    /*
> > +     * For this copy to the guest ring, tx_ptr is always 16-byte aligned
> > +     * and the message header is 16 bytes long.
> > +     */
> > +    BUILD_BUG_ON(
> > +        sizeof(struct xen_argo_ring_message_header) != ROUNDUP_MESSAGE(1));
> > +
> > +    /*
> > +     * First data write into the destination ring: fixed size, message header.
> > +     * This cannot overrun because the available free space (value in 'sp')
> > +     * is checked above and must be at least this size.
> > +     */
> > +    ret = memcpy_to_guest_ring(d, ring_info,
> > +                               ring.tx_ptr + sizeof(xen_argo_ring_t),
> > +                               &mh, NULL_hnd, sizeof(mh));
> > +    if ( ret )
> > +    {
> > +        gprintk(XENLOG_ERR,
> > +                "argo: failed to write message header to ring (vm%u:%x vm%u)\n",
> > +                ring_info->id.domain_id, ring_info->id.aport,
> > +                ring_info->id.partner_id);
> > +
> > +        return ret;
> > +    }
> > +
> > +    ring.tx_ptr += sizeof(mh);
> > +    if ( ring.tx_ptr == ring_info->len )
> > +        ring.tx_ptr = 0;
> > +
> > +    for ( piov = iovs; niov--; piov++ )
> > +    {
> > +        XEN_GUEST_HANDLE_64(uint8_t) buf_hnd = piov->iov_hnd;
> > +        unsigned int iov_len = piov->iov_len;
> > +
> > +        /* If no data is provided in this iov, moan and skip on to the next */
> > +        if ( !iov_len )
> > +        {
> > +            gprintk(XENLOG_ERR,
>
> This should likely be WARN or INFO, since it's not an error?

Yes, changed to WARNING, ack.

>
> > +                    "argo: no data iov_len=0 iov_hnd=%p ring (vm%u:%x vm%u)\n",
> > +                    buf_hnd.p, ring_info->id.domain_id, ring_info->id.aport,
> > +                    ring_info->id.partner_id);
> > +
> > +            continue;
> > +        }
> > +
> > +        if ( unlikely(!guest_handle_okay(buf_hnd, iov_len)) )
> > +        {
> > +            gprintk(XENLOG_ERR,
> > +                    "argo: bad iov handle [%p, %u] (vm%u:%x vm%u)\n",
> > +                    buf_hnd.p, iov_len,
> > +                    ring_info->id.domain_id, ring_info->id.aport,
> > +                    ring_info->id.partner_id);
> > +
> > +            return -EFAULT;
> > +        }
> > +
> > +        sp = ring_info->len - ring.tx_ptr;
> > +
> > +        /* Check: iov data size versus free space at the tail of the ring */
> > +        if ( iov_len > sp )
> > +        {
> > +            /*
> > +             * Second possible data write: ring-tail-wrap-write.
> > +             * Populate the ring tail and update the internal tx_ptr to handle
> > +             * wrapping at the end of ring.
> > +             * Size of data written here: sp
> > +             * which is the exact full amount of free space available at the
> > +             * tail of the ring, so this cannot overrun.
> > +             */
> > +            ret = memcpy_to_guest_ring(d, ring_info,
> > +                                       ring.tx_ptr + sizeof(xen_argo_ring_t),
> > +                                       NULL, buf_hnd, sp);
> > +            if ( ret )
> > +            {
> > +                gprintk(XENLOG_ERR,
> > +                        "argo: failed to copy {%p, %d} (vm%u:%x vm%u)\n",
> > +                        buf_hnd.p, sp,
> > +                        ring_info->id.domain_id, ring_info->id.aport,
> > +                        ring_info->id.partner_id);
> > +
> > +                return ret;
> > +            }
> > +
> > +            ring.tx_ptr = 0;
> > +            iov_len -= sp;
> > +            guest_handle_add_offset(buf_hnd, sp);
> > +
> > +            ASSERT(iov_len <= ring_info->len);
> > +        }
> > +
> > +        /*
> > +         * Third possible data write: all data remaining for this iov.
> > +         * Size of data written here: iov_len
> > +         *
> > +         * Case 1: if the ring-tail-wrap-write above was performed, then
> > +         *         iov_len has been decreased by 'sp' and ring.tx_ptr is zero.
> > +         *
> > +         *    We know from checking the result of iov_count:
> > +         *      len + sizeof(message_header) <= ring_info->len
> > +         *    We also know that len is the total of summing all iov_lens, so:
> > +         *       iov_len <= len
> > +         *    so by transitivity:
> > +         *       iov_len <= len <= (ring_info->len - sizeof(msgheader))
> > +         *    and therefore:
> > +         *       (iov_len + sizeof(msgheader) <= ring_info->len) &&
> > +         *       (ring.tx_ptr == 0)
> > +         *    so this write cannot overrun here.
> > +         *
> > +         * Case 2: ring-tail-wrap-write above was not performed
> > +         *    -> so iov_len is the guest-supplied value and: (iov_len <= sp)
> > +         *    ie. less than available space at the tail of the ring:
> > +         *        so this write cannot overrun.
> > +         */
> > +        ret = memcpy_to_guest_ring(d, ring_info,
> > +                                   ring.tx_ptr + sizeof(xen_argo_ring_t),
> > +                                   NULL, buf_hnd, iov_len);
> > +        if ( ret )
> > +        {
> > +            gprintk(XENLOG_ERR,
> > +                    "argo: failed to copy [%p, %u] (vm%u:%x vm%u)\n",
> > +                    buf_hnd.p, iov_len, ring_info->id.domain_id,
> > +                    ring_info->id.aport, ring_info->id.partner_id);
> > +
> > +            return ret;
> > +        }
> > +
> > +        ring.tx_ptr += iov_len;
> > +
> > +        if ( ring.tx_ptr == ring_info->len )
> > +            ring.tx_ptr = 0;
> > +    }
> > +
> > +    ring.tx_ptr = ROUNDUP_MESSAGE(ring.tx_ptr);
> > +
> > +    if ( ring.tx_ptr >= ring_info->len )
> > +        ring.tx_ptr -= ring_info->len;
>
> You seem to handle the wrapping after each possible write, so I think
> the above is not needed? Maybe it should be an assert instead?

The wrap handling is necessary because of the ROUNDUP_MESSAGE
immediately above it.

I've added a new comment to make it a bit clearer:

Finished writing data from all iovs into the ring: now need to
round up tx_ptr to align to the next message boundary, and then
wrap if necessary.
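
As a minimal standalone sketch of why the wrap is still needed after the
round-up: it assumes 16-byte message slots (matching the BUILD_BUG_ON on the
header size) and a ring length that is a multiple of the slot size. All
sketch_* names are hypothetical and are not the ones used in the patch.

```c
#include <assert.h>

/* Assumed slot size: the patch checks that the message header is 16 bytes. */
#define SKETCH_MSG_SLOT 16u

/* Round an offset up to the next multiple of the slot size. */
static unsigned int sketch_roundup(unsigned int x)
{
    return (x + (SKETCH_MSG_SLOT - 1)) & ~(SKETCH_MSG_SLOT - 1);
}

/*
 * Advance tx_ptr past 'written' bytes that fit at the tail of the ring,
 * align it to the next message boundary, then wrap if necessary.
 */
static unsigned int sketch_advance_tx(unsigned int tx_ptr,
                                      unsigned int written,
                                      unsigned int ring_len)
{
    /* Assumption of this sketch: the write did not run past the ring end. */
    assert(tx_ptr + written <= ring_len);

    tx_ptr += written;
    if ( tx_ptr == ring_len )
        tx_ptr = 0;

    /* The round-up can land exactly on ring_len, hence the second wrap. */
    tx_ptr = sketch_roundup(tx_ptr);
    if ( tx_ptr >= ring_len )
        tx_ptr -= ring_len;

    return tx_ptr;
}
```

With a 64-byte ring, writing 10 bytes at offset 48 leaves tx_ptr at 58, which
rounds up to 64 and so has to wrap to 0 even though the per-write wrap check
did not trigger.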

>
> > +
> > +    update_tx_ptr(d, ring_info, ring.tx_ptr);
> > +
> > +    /*
> > +     * At this point (and also on an error exit paths from this function) it is
> > +     * possible to unmap the ring_info, ie:
> > +     *   ring_unmap(d, ring_info);
> > +     * but performance should be improved by not doing so, and retaining
> > +     * the mapping.
> > +     * An XSM policy control over level of confidentiality required
> > +     * versus performance cost could be added to decide that here.
> > +     */
> > +
> > +    *out_len = len;
> > +
> > +    return ret;
> > +}
> > +
> >  static void
> >  wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
> >  {
> > @@ -497,6 +918,25 @@ wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
> >  }
> >
> >  static void
> > +wildcard_pending_list_insert(domid_t domain_id, struct pending_ent *ent)
> > +{
> > +    struct domain *d = get_domain_by_id(domain_id);
> > +
> > +    if ( !d )
> > +        return;
> > +
> > +    ASSERT(LOCKING_Read_L1);
> > +
> > +    if ( d->argo )
> > +    {
> > +        spin_lock(&d->argo->wildcard_L2_lock);
> > +        list_add(&ent->wildcard_node, &d->argo->wildcard_pend_list);
> > +        spin_unlock(&d->argo->wildcard_L2_lock);
> > +    }
> > +    put_domain(d);
> > +}
> > +
> > +static void
> >  pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
> >  {
> >      struct list_head *ring_pending = &ring_info->pending;
> > @@ -518,6 +958,70 @@ pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
> >      ring_info->npending = 0;
> >  }
> >
> > +static int
> > +pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
> > +              domid_t src_id, unsigned int len)
> > +{
> > +    struct pending_ent *ent;
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    if ( ring_info->npending >= MAX_PENDING_PER_RING )
> > +        return -ENOSPC;
> > +
> > +    ent = xmalloc(struct pending_ent);
> > +    if ( !ent )
> > +        return -ENOMEM;
> > +
> > +    ent->len = len;
> > +    ent->domain_id = src_id;
> > +    ent->ring_info = ring_info;
> > +
> > +    if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> > +        wildcard_pending_list_insert(src_id, ent);
> > +    list_add(&ent->node, &ring_info->pending);
> > +    ring_info->npending++;
> > +
> > +    return 0;
> > +}
> > +
> > +static int
> > +pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
> > +                domid_t src_id, unsigned int len)
> > +{
> > +    struct list_head *cursor, *head;
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    /* List structure is not modified here. Update len in a match if found. */
> > +    head = &ring_info->pending;
> > +
> > +    for ( cursor = head->next; cursor != head; cursor = cursor->next )
>
> list_for_each_entry

ack

>
> >  long
> >  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> >             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
> > @@ -1145,6 +1734,53 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> >          break;
> >      }
> >
> > +    case XEN_ARGO_OP_sendv:
> > +    {
> > +        xen_argo_send_addr_t send_addr;
> > +
> > +        XEN_GUEST_HANDLE_PARAM(xen_argo_send_addr_t) send_addr_hnd =
> > +            guest_handle_cast(arg1, xen_argo_send_addr_t);
> > +        XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd =
> > +            guest_handle_cast(arg2, xen_argo_iov_t);
> > +        /* arg3 is niov */
> > +        /* arg4 is message_type. Must be a 32-bit value. */
> > +
> > +        rc = copy_from_guest(&send_addr, send_addr_hnd, 1) ? -EFAULT : 0;
> > +        if ( rc )
> > +            break;
> > +
> > +        /*
> > +         * Check padding is zeroed. Reject niov above limit or message_types
> > +         * that are outside 32 bit range.
> > +         */
> > +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
> > +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
>
> arg4 & (GB(4) - 1)
>
> Is clearer IMO, or:
>
> arg4 > UINT32_MAX

I've left the code unchanged, as the mask constant is used in multiple
places elsewhere in Xen. UINT32_MAX is only used as a threshold value.

>
> > +        {
> > +            rc = -EINVAL;
> > +            break;
> > +        }
> > +
> > +        if ( send_addr.src.domain_id == XEN_ARGO_DOMID_ANY )
> > +            send_addr.src.domain_id = currd->domain_id;
> > +
> > +        /* No domain is currently authorized to send on behalf of another */
> > +        if ( unlikely(send_addr.src.domain_id != currd->domain_id) )
> > +        {
> > +            rc = -EPERM;
> > +            break;
> > +        }
> > +
> > +        /*
> > +         * Check access to the whole array here so we can use the faster __copy
> > +         * operations to read each element later.
> > +         */
> > +        if ( unlikely(!guest_handle_okay(iovs_hnd, arg3)) )
>
> You need to set rc to EFAULT here, because the call to copy_from_guest
> has set it to 0.

ack.
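
For reference, a reduced sketch of the problem and the fix. The two boolean
parameters are hypothetical stand-ins for the results of copy_from_guest and
guest_handle_okay; this is not the patch code, only the control-flow shape.

```c
#include <errno.h>
#include <stdbool.h>

/* Buggy shape: the first check leaves rc at 0, so the later break returns 0. */
static long sketch_sendv_buggy(bool copy_fails, bool handle_ok)
{
    long rc;

    do {
        rc = copy_fails ? -EFAULT : 0;
        if ( rc )
            break;

        if ( !handle_ok )
            break;              /* rc is still 0: failure looks like success */

        rc = 0;                 /* stands in for the result of the send */
    } while ( 0 );

    return rc;
}

/* Fixed shape: set rc explicitly on the failing path before leaving. */
static long sketch_sendv_fixed(bool copy_fails, bool handle_ok)
{
    long rc;

    do {
        rc = copy_fails ? -EFAULT : 0;
        if ( rc )
            break;

        if ( !handle_ok )
        {
            rc = -EFAULT;
            break;
        }

        rc = 0;                 /* stands in for the result of the send */
    } while ( 0 );

    return rc;
}
```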

>
> Alternatively you can change the call above to be:
>
> if ( copy_from_guest(&send_addr, send_addr_hnd, 1) )
>     return -EFAULT;
>
> So rc doesn't get set to 0 on success.

> With those taken care of:
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

* Re: [PATCH v5 10/15] argo: implement the notify op
  2019-01-22 14:09   ` Roger Pau Monné
@ 2019-01-31  4:12     ` Christopher Clark
  0 siblings, 0 replies; 44+ messages in thread
From: Christopher Clark @ 2019-01-31  4:12 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Julien Grall, Tim Deegan,
	Daniel Smith, Rich Persaud, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Tue, Jan 22, 2019 at 6:14 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jan 21, 2019 at 01:59:50AM -0800, Christopher Clark wrote:
> > Queries for data about space availability in registered rings and
> > causes notification to be sent when space has become available.
> >
> > The hypercall op populates a supplied data structure with information about
> > ring state and if insufficient space is currently available in a given ring,
> > the hypervisor will record the domain's expressed interest and notify it
> > when it observes that space has become available.
> >
> > Checks for free space occur when this notify op is invoked, so it may be
> > intentionally invoked with no data structure to populate
> > (ie. a NULL argument) to trigger such a check and consequent notifications.
> >
> > Limit the maximum number of notify requests in a single operation to a
> > simple fixed limit of 256.
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
>
> LGTM, but I would like to see the open-coded versions of the list_
> macros fixed:
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
> > diff --git a/xen/common/argo.c b/xen/common/argo.c
> > index 518aff7..4b43bdd 100644
> > --- a/xen/common/argo.c
> > +++ b/xen/common/argo.c
> [...]
> > +static void
> > +pending_notify(struct list_head *to_notify)
> > +{
> > +    ASSERT(LOCKING_Read_L1);
> > +
> > +    /* Sending signals for all ents in this list, draining until it is empty. */
> > +    while ( !list_empty(to_notify) )
> > +    {
> > +        struct pending_ent *ent =
> > +            list_entry(to_notify->next, struct pending_ent, node);
>
> list_first_entry_or_null

Same as in the earlier message: list_first_entry_or_null is not used by
Xen, so I've used list_for_each_entry_safe here instead.
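
A minimal standalone sketch of draining a list with a safe iterator follows.
It uses a tiny hypothetical list implementation (the sketch_* names) rather
than Xen's list.h, and free() in place of xfree(), so it only illustrates the
iteration pattern and not the actual patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical circular doubly-linked list, modelled on the usual list_head. */
struct sketch_list { struct sketch_list *next, *prev; };

static void sketch_list_init(struct sketch_list *h)
{
    h->next = h->prev = h;
}

static void sketch_list_add(struct sketch_list *n, struct sketch_list *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void sketch_list_del(struct sketch_list *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

#define sketch_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Safe walk: 'tmp' holds the next entry, so 'pos' may be freed in the body. */
#define sketch_for_each_entry_safe(pos, tmp, head, type, member)        \
    for ( pos = sketch_entry((head)->next, type, member),               \
          tmp = sketch_entry(pos->member.next, type, member);           \
          &pos->member != (head);                                       \
          pos = tmp, tmp = sketch_entry(tmp->member.next, type, member) )

struct sketch_ent {
    struct sketch_list node;
    unsigned int domain_id;
};

/* Drain the list, freeing each entry; returns how many were removed. */
static unsigned int sketch_drain(struct sketch_list *head)
{
    struct sketch_ent *ent, *tmp;
    unsigned int n = 0;

    sketch_for_each_entry_safe(ent, tmp, head, struct sketch_ent, node)
    {
        sketch_list_del(&ent->node);
        free(ent);
        n++;
    }

    return n;
}
```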

>
> > +
> > +        list_del(&ent->node);
> > +        signal_domid(ent->domain_id);
> > +        xfree(ent);
> > +    }
> > +}
> > +
> > +static void
> > +pending_find(const struct domain *d, struct argo_ring_info *ring_info,
> > +             unsigned int payload_space, struct list_head *to_notify)
> > +{
> > +    struct list_head *cursor, *pending_head;
> > +
> > +    ASSERT(LOCKING_Read_rings_L2(d));
> > +
> > +    /*
> > +     * TODO: Current policy here is to signal _all_ of the waiting domains
> > +     *       interested in sending a message of size less than payload_space.
> > +     *
> > +     * This is likely to be suboptimal, since once one of them has added
> > +     * their message to the ring, there may well be insufficient room
> > +     * available for any of the others to transmit, meaning that they were
> > +     * woken in vain, which created extra work just to requeue their wait.
> > +     *
> > +     * Retain this simple policy for now since it at least avoids starving a
> > +     * domain of available space notifications because of a policy that only
> > +     * notified other domains instead. Improvement may be possible;
> > +     * investigation required.
> > +     */
> > +    spin_lock(&ring_info->L3_lock);
> > +
> > +    /* Remove matching ents from the ring list, and add them to "to_notify" */
> > +    pending_head = &ring_info->pending;
> > +    cursor = pending_head->next;
> > +
> > +    while ( cursor != pending_head )
> > +    {
> > +        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
> > +
> > +        cursor = cursor->next;
>
> list_for_each_entry_safe?

ack

>
> > +
> > +        if ( payload_space >= ent->len )
> > +        {
> > +            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> > +                wildcard_pending_list_remove(ent->domain_id, ent);
> > +
> > +            list_del(&ent->node);
> > +            ring_info->npending--;
> > +            list_add(&ent->node, to_notify);
> > +        }
> > +    }
> > +
> > +    spin_unlock(&ring_info->L3_lock);
> > +}
> > +
> >  static int
> >  pending_queue(const struct domain *d, struct argo_ring_info *ring_info,
> >                domid_t src_id, unsigned int len)
> > @@ -1023,6 +1163,36 @@ pending_requeue(const struct domain *d, struct argo_ring_info *ring_info,
> >  }
> >
> >  static void
> > +pending_cancel(const struct domain *d, struct argo_ring_info *ring_info,
> > +               domid_t src_id)
> > +{
> > +    struct list_head *cursor, *pending_head;
> > +
> > +    ASSERT(LOCKING_L3(d, ring_info));
> > +
> > +    /* Remove all ents where domain_id matches src_id from the ring's list. */
> > +    pending_head = &ring_info->pending;
> > +    cursor = pending_head->next;
> > +
> > +    while ( cursor != pending_head )
> > +    {
> > +        struct pending_ent *ent = list_entry(cursor, struct pending_ent, node);
> > +
> > +        cursor = cursor->next;
>
> list_for_each_entry_safe

ack

>
> > +
> > +        if ( ent->domain_id == src_id )
> > +        {
> > +            /* For wildcard rings, remove each from their wildcard list too. */
> > +            if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> > +                wildcard_pending_list_remove(ent->domain_id, ent);
> > +            list_del(&ent->node);
> > +            xfree(ent);
> > +            ring_info->npending--;
> > +        }
> > +    }
> > +}
> > +
> > +static void
> >  wildcard_rings_pending_remove(struct domain *d)
> >  {
> >      struct list_head *wildcard_head;
> > @@ -1158,6 +1328,86 @@ partner_rings_remove(struct domain *src_d)
> >  }
> >
> >  static int
> > +fill_ring_data(const struct domain *currd,
> > +               XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
> > +{
> > +    xen_argo_ring_data_ent_t ent;
> > +    struct domain *dst_d;
> > +    struct argo_ring_info *ring_info;
> > +    int ret = 0;
> > +
> > +    ASSERT(currd == current->domain);
> > +    ASSERT(LOCKING_Read_L1);
> > +
> > +    if ( __copy_from_guest(&ent, data_ent_hnd, 1) )
> > +        return -EFAULT;
> > +
> > +    argo_dprintk("fill_ring_data: ent.ring.domain=%u,ent.ring.aport=%x\n",
> > +                 ent.ring.domain_id, ent.ring.aport);
> > +
> > +    ent.flags = 0;
> > +
> > +    dst_d = get_domain_by_id(ent.ring.domain_id);
> > +    if ( !dst_d || !dst_d->argo )
> > +        goto out;
> > +
> > +    read_lock(&dst_d->argo->rings_L2_rwlock);
> > +
> > +    ring_info = find_ring_info_by_match(dst_d, ent.ring.aport,
> > +                                        currd->domain_id);
> > +    if ( ring_info )
> > +    {
> > +        unsigned int space_avail;
> > +
> > +        ent.flags |= XEN_ARGO_RING_EXISTS;
> > +
> > +        spin_lock(&ring_info->L3_lock);
> > +
> > +        ent.max_message_size = ring_info->len -
> > +                                   sizeof(struct xen_argo_ring_message_header) -
> > +                                   ROUNDUP_MESSAGE(1);
> > +
> > +        if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY )
> > +            ent.flags |= XEN_ARGO_RING_SHARED;
> > +
> > +        space_avail = ringbuf_payload_space(dst_d, ring_info);
> > +
> > +        argo_dprintk("fill_ring_data: aport=%x space_avail=%u"
> > +                     " space_wanted=%u\n",
> > +                     ring_info->id.aport, space_avail, ent.space_required);
> > +
> > +        /* Do not queue a notification for an unachievable size */
> > +        if ( ent.space_required > ent.max_message_size )
> > +            ent.flags |= XEN_ARGO_RING_EMSGSIZE;
> > +        else if ( space_avail >= ent.space_required )
> > +        {
> > +            pending_cancel(dst_d, ring_info, currd->domain_id);
> > +            ent.flags |= XEN_ARGO_RING_SUFFICIENT;
> > +        }
> > +        else
> > +            ret = pending_requeue(dst_d, ring_info, currd->domain_id,
> > +                                  ent.space_required);
> > +
> > +        spin_unlock(&ring_info->L3_lock);
> > +
> > +        if ( space_avail == ent.max_message_size )
> > +            ent.flags |= XEN_ARGO_RING_EMPTY;
> > +
> > +    }
> > +    read_unlock(&dst_d->argo->rings_L2_rwlock);
> > +
> > + out:
> > +    if ( dst_d )
> > +        put_domain(dst_d);
> > +
> > +    if ( !ret && (__copy_field_to_guest(data_ent_hnd, &ent, flags) ||
> > +                  __copy_field_to_guest(data_ent_hnd, &ent, max_message_size)) )
> > +        return -EFAULT;
> > +
> > +    return ret;
> > +}
> > +
> > +static int
> >  find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
> >  {
> >      struct page_info *page;
> > @@ -1586,6 +1836,112 @@ register_ring(struct domain *currd,
> >      return ret;
> >  }
> >
> > +static void
> > +notify_ring(const struct domain *d, struct argo_ring_info *ring_info,
> > +            struct list_head *to_notify)
> > +{
> > +    unsigned int space;
> > +
> > +    ASSERT(LOCKING_Read_rings_L2(d));
> > +
> > +    spin_lock(&ring_info->L3_lock);
> > +
> > +    if ( ring_info->len )
> > +        space = ringbuf_payload_space(d, ring_info);
> > +    else
> > +        space = 0;
> > +
> > +    spin_unlock(&ring_info->L3_lock);
> > +
> > +    if ( space )
> > +        pending_find(d, ring_info, space, to_notify);
> > +}
> > +
> > +static void
> > +notify_check_pending(struct domain *d)
> > +{
> > +    unsigned int i;
> > +    LIST_HEAD(to_notify);
> > +
> > +    ASSERT(LOCKING_Read_L1);
> > +
> > +    read_lock(&d->argo->rings_L2_rwlock);
> > +
> > +    /* Walk all rings, call notify_ring on each to populate to_notify list */
> > +    for ( i = 0; i < ARGO_HASHTABLE_SIZE; i++ )
> > +    {
> > +        struct list_head *cursor, *bucket = &d->argo->ring_hash[i];
> > +        struct argo_ring_info *ring_info;
> > +
> > +        for ( cursor = bucket->next; cursor != bucket; cursor = cursor->next )
>
> list_for_each_entry

list_for_each_entry_safe

thanks

Christopher

* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-31  4:10     ` Christopher Clark
@ 2019-01-31 10:18       ` Roger Pau Monné
  2019-01-31 10:35         ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-31 10:18 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet

On Wed, Jan 30, 2019 at 08:10:28PM -0800, Christopher Clark wrote:
> On Tue, Jan 22, 2019 at 4:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
> > >  do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> > >             XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
> > > @@ -1145,6 +1734,53 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
> > >          break;
> > >      }
> > >
> > > +    case XEN_ARGO_OP_sendv:
> > > +    {
> > > +        xen_argo_send_addr_t send_addr;
> > > +
> > > +        XEN_GUEST_HANDLE_PARAM(xen_argo_send_addr_t) send_addr_hnd =
> > > +            guest_handle_cast(arg1, xen_argo_send_addr_t);
> > > +        XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs_hnd =
> > > +            guest_handle_cast(arg2, xen_argo_iov_t);
> > > +        /* arg3 is niov */
> > > +        /* arg4 is message_type. Must be a 32-bit value. */
> > > +
> > > +        rc = copy_from_guest(&send_addr, send_addr_hnd, 1) ? -EFAULT : 0;
> > > +        if ( rc )
> > > +            break;
> > > +
> > > +        /*
> > > +         * Check padding is zeroed. Reject niov above limit or message_types
> > > +         * that are outside 32 bit range.
> > > +         */
> > > +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
> > > +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
> >
> > arg4 & (GB(4) - 1)
> >
> > Is clearer IMO, or:
> >
> > arg4 > UINT32_MAX
> 
> I've left the code unchanged, as the mask constant is used multiple
> places elsewhere in Xen. UINT32_MAX is only used as a threshold value.

The fact that other parts of the code could be improved is not an
excuse to follow suit. I'm having a hard time believing that you find
"arg4 & ~0xffffffffUL" easier to read than "arg4 & ~(GB(4) - 1)" or
even "arg4 >= GB(4)".

IMO it's much more likely to miss an 'f' in the first construct, and
thus get the value wrong and introduce a bug.
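
To illustrate, here is a small standalone sketch of the mask form next to an
equivalent threshold check. The function names are made up for the example,
and uint64_t stands in for unsigned long on a 64-bit build.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask form, as in the patch: true if any bit above bit 31 is set. */
static bool sketch_out_of_range_mask(uint64_t v)
{
    return (v & ~UINT64_C(0xffffffff)) != 0;
}

/* Threshold form: same result, with no run of 'f' digits to miscount. */
static bool sketch_out_of_range_cmp(uint64_t v)
{
    return v > UINT32_MAX;
}
```

Both functions agree for every input; the difference is only in how easy the
constant is to get wrong when typing or reviewing it.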

Anyway, this is your code, so I'm not going to insist.

Roger.

* Re: [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt
  2019-01-31  4:06     ` Christopher Clark
@ 2019-01-31 10:28       ` Jan Beulich
  0 siblings, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2019-01-31 10:28 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	ross.philipson, Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, Tim Deegan, xen-devel,
	eric chanudet, Roger Pau Monne

>>> On 31.01.19 at 05:06, <christopher.w.clark@gmail.com> wrote:
> On Mon, Jan 21, 2019 at 9:55 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>> On Mon, Jan 21, 2019 at 01:59:44AM -0800, Christopher Clark wrote:
>> > +static void
>> > +pending_remove_all(const struct domain *d, struct argo_ring_info *ring_info)
>> > +{
>> > +    struct list_head *ring_pending = &ring_info->pending;
>> > +    struct pending_ent *ent;
>> > +
>> > +    ASSERT(LOCKING_L3(d, ring_info));
>> > +
>> > +    /* Delete all pending notifications from this ring's list. */
>> > +    while ( !list_empty(ring_pending) )
>>
>> Nit: you could use list_first_entry_or_null that joins the list_empty
>> and list_entry calls.
> 
> There are no existing users of list_first_entry_or_null anywhere in Xen,
> and applying it to this loop results in an assignment within the
> while condition, which also appears to be very rare construct within Xen,
> so I just used the list_for_each_entry_safe macro.

I'm not fully following why lack of use of a construct elsewhere in the
tree would be a reason not to use it here. If you were to use only
constructs already in use in common code, you should also not have
had a need to e.g. introduce for Arm (and use) guest_handle_for_field().

Jan


* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-31 10:18       ` Roger Pau Monné
@ 2019-01-31 10:35         ` Jan Beulich
  2019-01-31 11:00           ` Roger Pau Monné
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2019-01-31 10:35 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	ross.philipson, Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

>>> On 31.01.19 at 11:18, <roger.pau@citrix.com> wrote:
> On Wed, Jan 30, 2019 at 08:10:28PM -0800, Christopher Clark wrote:
>> On Tue, Jan 22, 2019 at 4:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>> > On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
>> > > +        /*
>> > > +         * Check padding is zeroed. Reject niov above limit or message_types
>> > > +         * that are outside 32 bit range.
>> > > +         */
>> > > +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
>> > > +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
>> >
>> > arg4 & (GB(4) - 1)
>> >
>> > Is clearer IMO, or:
>> >
>> > arg4 > UINT32_MAX
>> 
>> I've left the code unchanged, as the mask constant is used multiple
>> places elsewhere in Xen. UINT32_MAX is only used as a threshold value.
> 
> The fact that others parts of the code could be improved is not an
> excuse to follow suit. I'm having a hard time believing that you find
> "arg4 & ~0xffffffffUL" easier to read than "arg4 & ~(GB(4) - 1)" or
> even "arg4 >= GB(4)".
> 
> IMO it's much more likely to miss an 'f' in the first construct, and
> thus get the value wrong and introduce a bug.

I agree with this last statement, but I'm having trouble seeing how
message _type_ is related to a size construct like GB(4). I see
only UINT32_MAX as a viable alternative for something that's not
expressing the size of anything.

Jan


* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-31 10:35         ` Jan Beulich
@ 2019-01-31 11:00           ` Roger Pau Monné
  2019-02-03 17:56             ` Christopher Clark
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-31 11:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	ross.philipson, Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Christopher Clark,
	Rich Persaud, James McKenzie, George Dunlap, Julien Grall,
	Paul Durrant, xen-devel, eric chanudet

On Thu, Jan 31, 2019 at 03:35:23AM -0700, Jan Beulich wrote:
> >>> On 31.01.19 at 11:18, <roger.pau@citrix.com> wrote:
> > On Wed, Jan 30, 2019 at 08:10:28PM -0800, Christopher Clark wrote:
> >> On Tue, Jan 22, 2019 at 4:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >> > On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
> >> > > +        /*
> >> > > +         * Check padding is zeroed. Reject niov above limit or message_types
> >> > > +         * that are outside 32 bit range.
> >> > > +         */
> >> > > +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
> >> > > +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
> >> >
> >> > arg4 & (GB(4) - 1)
> >> >
> >> > Is clearer IMO, or:
> >> >
> >> > arg4 > UINT32_MAX
> >> 
> >> I've left the code unchanged, as the mask constant is used multiple
> >> places elsewhere in Xen. UINT32_MAX is only used as a threshold value.
> > 
> > The fact that others parts of the code could be improved is not an
> > excuse to follow suit. I'm having a hard time believing that you find
> > "arg4 & ~0xffffffffUL" easier to read than "arg4 & ~(GB(4) - 1)" or
> > even "arg4 >= GB(4)".
> > 
> > IMO it's much more likely to miss an 'f' in the first construct, and
> > thus get the value wrong and introduce a bug.
> 
> I agree with this last statement, but I'm having trouble to see how
> message _type_ is related to a size construct like GB(4) is. I see
> only UINT32_MAX as a viable alternative for something that's not
> expressing the size of anything.

I've suggested the GB construct as an alternative because the comment
above mentions the 32bit range. IMO anything that avoids using
0xffffffffUL is fine.

Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-01-31  4:05   ` Christopher Clark
@ 2019-01-31 13:39     ` Roger Pau Monné
  2019-02-03 18:04       ` Christopher Clark
  0 siblings, 1 reply; 44+ messages in thread
From: Roger Pau Monné @ 2019-01-31 13:39 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

On Wed, Jan 30, 2019 at 08:05:30PM -0800, Christopher Clark wrote:
> On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> > > Version five of this patch series:
> > >
> > > * Changes are primarily addressing feedback from the v4 series reviews.
> > >   Many points noted on the invididual commit posts.
> > >
> > > * Critical sections have been shrunk, with allocations and frees
> > >   pulled outside where possible, reordering logic within hypercall ops.
> > >
> > > * A new ring hash function implemented, derived from the djb2 string
> > >   hash function.
> > >
> > > * Flags returned by the notify op have been simplified.
> > >
> > > * Now uses a single argo boot parameter, taking a list:
> > >   - top level boolean to enable/disable Argo
> > >   - mac-permissive option to enable/disable wildcard rings
> > >   - command line doc edit: no "CONFIG_ARGO" but refers to build config
> > >
> > > * Switched to use the standard list data structures used by Xen's
> > >   common code.
> >
> > AFAIK this was not requested by any reviewer, so I wonder why you made
> > such change. The more that you open coded some of the list_ macros
> > instead of just doing a s/hlist_/list_/ replacement.
> > I'm fine with using list instead of hlist,
> 
> At your request, v7 replaces open coding with Xen's list macros. The
> hlist macros were not used by any of the common code in Xen.
> 
> > but I don't understand why
> > you decided to open code list_for_each and list_for_each_safe instead
> > of using the macros provided by Xen. Is there an issue with such
> > macros?
> 
> As discussed offline:
> 
> - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> - List macros in Xen list.h originated in Linux list.h and have diverged
> - OpenXT has use cases for measured launch and nested virtualization,
>   which influence downstream performance and security requirements for
>   Argo and Xen
> - OpenXT can temporarily patch Xen 4.12 for downstream use
> 
> > I've made a couple of minor comments, but I think the current status
> > is good, and fixing those minor comments is going to be trivial.
> 
> Ack, thanks. Hopefully v7 looks good.

As a note, the common flow of interactions usually involves the
contributor replying to the comments made by the reviewer in order to
try to reach an agreement before sending a new version.

There are comments from v5 that haven't been fixed in v7 (the mask
usage and list_first_entry_or_null for example) and the reply to the
reviewer's comment was sent at the same time as v7, leaving no time
for further discussion (and for reaching an agreement suitable to both
parties) before sending v7.

Roger.

* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-01-31 11:00           ` Roger Pau Monné
@ 2019-02-03 17:56             ` Christopher Clark
  2019-02-04  9:30               ` Roger Pau Monné
  0 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-02-03 17:56 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Ross Philipson, Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, Jan Beulich,
	xen-devel, eric chanudet

On Thu, Jan 31, 2019 at 3:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Thu, Jan 31, 2019 at 03:35:23AM -0700, Jan Beulich wrote:
> > >>> On 31.01.19 at 11:18, <roger.pau@citrix.com> wrote:
> > > On Wed, Jan 30, 2019 at 08:10:28PM -0800, Christopher Clark wrote:
> > >> On Tue, Jan 22, 2019 at 4:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >> > On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
> > >> > > +        /*
> > >> > > +         * Check padding is zeroed. Reject niov above limit or message_types
> > >> > > +         * that are outside 32 bit range.
> > >> > > +         */
> > >> > > +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
> > >> > > +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
> > >> >
> > >> > arg4 & ~(GB(4) - 1)
> > >> >
> > >> > Is clearer IMO, or:
> > >> >
> > >> > arg4 > UINT32_MAX
> > >>
> > >> I've left the code unchanged, as the mask constant is used in multiple
> > >> places elsewhere in Xen. UINT32_MAX is only used as a threshold value.
> > >
> > > The fact that other parts of the code could be improved is not an
> > > excuse to follow suit. I'm having a hard time believing that you find
> > > "arg4 & ~0xffffffffUL" easier to read than "arg4 & ~(GB(4) - 1)" or
> > > even "arg4 >= GB(4)".


Below, I propose an alternative way of achieving our correctness and
readability goals.

On the topic of readability, this self-contained definition
does stand out: ~0xffffffffUL,
encouraging caution and careful counting of 'f's. However, no other
source files are involved, making the code independent of changes in
(macro) definitions in other files.

In comparison, to understand GB, I have to find the external definition,
and then parse this:

#define GB(_gb)     (_AC(_gb, ULL) << 30)

(which seems to have a different type? ULL vs UL?) and then find and
understand this, in another file:

#ifdef __ASSEMBLY__
#define _AC(X,Y)    X
#define _AT(T,X)    X
#else
#define __AC(X,Y)   (X##Y)
#define _AC(X,Y)    __AC(X,Y)
#define _AT(T,X)    ((T)(X))
#endif

so I'm saying: it's at least somewhat arguable which is easier to understand.
Regardless, I think there's a better option than either.

> > > IMO it's much more likely to miss an 'f' in the first construct, and
> > > thus get the value wrong and introduce a bug.
> >
> > I agree with this last statement, but I'm having trouble to see how
> > message _type_ is related to a size construct like GB(4) is. I see
> > only UINT32_MAX as a viable alternative for something that's not
> > expressing the size of anything.
>
> I've suggested the GB construct as an alternative because the comment
> above mentions the 32bit range. IMO anything that avoids using
> 0xffffffffUL is fine.

Jan and Andrew have employed a useful technique in recent changes where such a
test was required.  This could work:

(arg4 != (uint32_t)arg4)

It is self-contained, readable and clearly expresses the intent of the check
being performed. I have tested a series with this applied, and have it ready
to post if you approve.
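
For comparison, here is a minimal standalone sketch of the forms discussed
in this thread. It is an illustration only, not code from the series: the
helper names are hypothetical, and uint64_t stands in for the hypercall
argument, which is assumed to be 64 bits wide. The last helper reproduces
the "missed 'f'" mistake to show how it changes the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Open-coded mask: true if any bit above bit 31 is set. */
static inline bool exceeds_u32_mask(uint64_t v)
{
    return (v & ~UINT64_C(0xffffffff)) != 0;
}

/* Threshold comparison against UINT32_MAX. */
static inline bool exceeds_u32_cmp(uint64_t v)
{
    return v > UINT32_MAX;
}

/* Truncating cast: true if the value does not survive a round trip. */
static inline bool exceeds_u32_cast(uint64_t v)
{
    return v != (uint32_t)v;
}

/* The same mask with one 'f' missing (seven instead of eight). */
static inline bool exceeds_u32_mask_typo(uint64_t v)
{
    return (v & ~UINT64_C(0xfffffff)) != 0;
}

/* The three correct forms should agree for any input. */
static inline void check_forms_agree(uint64_t v)
{
    assert(exceeds_u32_mask(v) == exceeds_u32_cmp(v));
    assert(exceeds_u32_cmp(v) == exceeds_u32_cast(v));
}
```

With these definitions, 0x10000000 fits in 32 bits and is accepted by the
mask, comparison and cast forms, but the mistyped mask rejects it.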

Christopher

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-01-31 13:39     ` Roger Pau Monné
@ 2019-02-03 18:04       ` Christopher Clark
  2019-02-04 10:07         ` Roger Pau Monné
  0 siblings, 1 reply; 44+ messages in thread
From: Christopher Clark @ 2019-02-03 18:04 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

On Thu, Jan 31, 2019 at 5:39 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Jan 30, 2019 at 08:05:30PM -0800, Christopher Clark wrote:
> > On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> > > > Version five of this patch series:
> > > >
> > > > * Changes are primarily addressing feedback from the v4 series reviews.
> > > >   Many points noted on the individual commit posts.
> > > >
> > > > * Critical sections have been shrunk, with allocations and frees
> > > >   pulled outside where possible, reordering logic within hypercall ops.
> > > >
> > > > * A new ring hash function implemented, derived from the djb2 string
> > > >   hash function.
> > > >
> > > > * Flags returned by the notify op have been simplified.
> > > >
> > > > * Now uses a single argo boot parameter, taking a list:
> > > >   - top level boolean to enable/disable Argo
> > > >   - mac-permissive option to enable/disable wildcard rings
> > > >   - command line doc edit: no "CONFIG_ARGO" but refers to build config
> > > >
> > > > * Switched to use the standard list data structures used by Xen's
> > > >   common code.
> > >
> > > AFAIK this was not requested by any reviewer, so I wonder why you made
> > > such change. The more that you open coded some of the list_ macros
> > > instead of just doing a s/hlist_/list_/ replacement.
> > > I'm fine with using list instead of hlist,
> >
> > At your request, v7 replaces open coding with Xen's list macros. The
> > hlist macros were not used by any of the common code in Xen.
> >
> > > but I don't understand why
> > > you decided to open code list_for_each and list_for_each_safe instead
> > > of using the macros provided by Xen. Is there an issue with such
> > > macros?
> >
> > As discussed offline:
> >
> > - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> > - List macros in Xen list.h originated in Linux list.h and have diverged
> > - OpenXT has use cases for measured launch and nested virtualization,
> >   which influence downstream performance and security requirements for
> >   Argo and Xen
> > - OpenXT can temporarily patch Xen 4.12 for downstream use
> >
> > > I've made a couple of minor comments, but I think the current status
> > > is good, and fixing those minor comments is going to be trivial.
> >
> > Ack, thanks. Hopefully v7 looks good.
>
> As a note, the common flow of interactions usually involves the
> contributor replying to the comments made by the reviewer in order to
> try to reach an agreement before sending a new version.

Yes, v7 was sent to address Jan and Julien's review comments in parallel
with our ongoing discussion on v5 macros. v7 also provided a checkpoint
for Argo testers to maximize test coverage as the series converges into
a Xen 4.12 merge candidate for Juergen. It addressed:

 - Jan's v6 review comments
 - Julien's v1 review comment
 - most of your xen-devel and offline review comments

> There are comments from v5 that haven't been fixed in v7
> (the mask usage and list_first_entry_or_null for example)
> and the reply to the reviewer's comment was sent at the same time as
> v7, leaving no time for further discussion (and for reaching an
> agreement suitable to both parties) before sending v7.

Code changes from our ongoing discussion will be addressed in v8. A
proposal to address mask usage has been put forward in the parallel
thread. Your proposed usage of list_first_entry_or_null will be made in
v8, subject to the previous offline discussion about list macros
(duplicated here for convenience):

> > As discussed offline:
> >
> > - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> > - List macros in Xen list.h originated in Linux list.h and have diverged
> > - OpenXT has use cases for measured launch and nested virtualization,
> >   which influence downstream performance and security requirements for
> >   Argo and Xen
> > - OpenXT can temporarily patch Xen 4.12 for downstream use
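
To make the open-coded versus macro distinction concrete, here is a small
standalone sketch. It is an assumption-laden illustration: the list type
and macros below are simplified stand-ins written in the style of a
Linux-derived list.h, not Xen's actual header, and struct pending and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a circular doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

static inline void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static inline void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Recover the containing structure from an embedded list node. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* First entry of the list, or NULL if the list is empty. */
#define list_first_entry_or_null(head, type, member) \
    ((head)->next != (head) ? list_entry((head)->next, type, member) : NULL)

/* Hypothetical element type for the example. */
struct pending {
    int id;
    struct list_head node;
};

/* Open-coded: the emptiness test and pointer arithmetic are spelled out. */
static inline struct pending *first_open_coded(struct list_head *h)
{
    if ( h->next == h )
        return NULL;

    return (struct pending *)((char *)h->next -
                              offsetof(struct pending, node));
}

/* Macro-based: the same lookup expressed through the list helper. */
static inline struct pending *first_with_macro(struct list_head *h)
{
    return list_first_entry_or_null(h, struct pending, node);
}

/* Both forms should return the same element for any list state. */
static inline void check_first(struct list_head *h)
{
    assert(first_open_coded(h) == first_with_macro(h));
}
```

Both functions return the same pointer; the macro form keeps the empty-list
check and the container arithmetic in one shared definition.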

Christopher

* Re: [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2019-02-03 17:56             ` Christopher Clark
@ 2019-02-04  9:30               ` Roger Pau Monné
  0 siblings, 0 replies; 44+ messages in thread
From: Roger Pau Monné @ 2019-02-04  9:30 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Ross Philipson, Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, Jan Beulich,
	xen-devel, eric chanudet

On Sun, Feb 03, 2019 at 09:56:26AM -0800, Christopher Clark wrote:
> On Thu, Jan 31, 2019 at 3:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Thu, Jan 31, 2019 at 03:35:23AM -0700, Jan Beulich wrote:
> > > >>> On 31.01.19 at 11:18, <roger.pau@citrix.com> wrote:
> > > > On Wed, Jan 30, 2019 at 08:10:28PM -0800, Christopher Clark wrote:
> > > >> On Tue, Jan 22, 2019 at 4:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >> > On Mon, Jan 21, 2019 at 01:59:49AM -0800, Christopher Clark wrote:
> > > >> > > +        /*
> > > >> > > +         * Check padding is zeroed. Reject niov above limit or message_types
> > > >> > > +         * that are outside 32 bit range.
> > > >> > > +         */
> > > >> > > +        if ( unlikely(send_addr.src.pad || send_addr.dst.pad ||
> > > >> > > +                      (arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
> > > >> >
> > > >> > arg4 & ~(GB(4) - 1)
> > > >> >
> > > >> > Is clearer IMO, or:
> > > >> >
> > > >> > arg4 > UINT32_MAX
> > > >>
> > > >> I've left the code unchanged, as the mask constant is used in multiple
> > > >> places elsewhere in Xen. UINT32_MAX is only used as a threshold value.
> > > >
> > > > The fact that other parts of the code could be improved is not an
> > > > excuse to follow suit. I'm having a hard time believing that you find
> > > > "arg4 & ~0xffffffffUL" easier to read than "arg4 & ~(GB(4) - 1)" or
> > > > even "arg4 >= GB(4)".
> 
> 
> Below, I propose an alternative way of achieving our correctness and
> readability goals.
> 
> On the topic of readability, this self-contained definition
> does stand out: ~0xffffffffUL,
> encouraging caution and careful counting of 'f's. However, no other
> source files are involved, making the code independent of changes in
> (macro) definitions in other files.
> 
> In comparison, to understand GB, I have to find the external definition,
> and then parse this:
> 
> #define GB(_gb)     (_AC(_gb, ULL) << 30)
> 
> (which seems to have a different type? ULL vs UL?) and then find and
> understand this, in another file:
> 
> #ifdef __ASSEMBLY__
> #define _AC(X,Y)    X
> #define _AT(T,X)    X
> #else
> #define __AC(X,Y)   (X##Y)
> #define _AC(X,Y)    __AC(X,Y)
> #define _AT(T,X)    ((T)(X))
> #endif
> 
> so I'm saying: it's at least somewhat arguable which is easier to understand.
> Regardless, I think there's a better option than either.
> 
> > > > IMO it's much more likely to miss an 'f' in the first construct, and
> > > > thus get the value wrong and introduce a bug.
> > >
> > > I agree with this last statement, but I'm having trouble to see how
> > > message _type_ is related to a size construct like GB(4) is. I see
> > > only UINT32_MAX as a viable alternative for something that's not
> > > expressing the size of anything.
> >
> > I've suggested the GB construct as an alternative because the comment
> > above mentions the 32bit range. IMO anything that avoids using
> > 0xffffffffUL is fine.
> 
> Jan and Andrew have employed a useful technique in recent changes where such a
> test was required.  This could work:
> 
> (arg4 != (uint32_t)arg4)
> 
> It is self-contained, readable and clearly expresses the intent of the check
> being performed. I have tested a series with this applied, and have it ready
> to post if you approve.

Yes, that's fine. As said in v7 anything that doesn't involve an
open-coded mask is fine with me.

Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-03 18:04       ` Christopher Clark
@ 2019-02-04 10:07         ` Roger Pau Monné
  2019-02-04 18:22           ` Stefano Stabellini
                             ` (3 more replies)
  0 siblings, 4 replies; 44+ messages in thread
From: Roger Pau Monné @ 2019-02-04 10:07 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Rich Persaud,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

On Sun, Feb 03, 2019 at 10:04:29AM -0800, Christopher Clark wrote:
> On Thu, Jan 31, 2019 at 5:39 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Wed, Jan 30, 2019 at 08:05:30PM -0800, Christopher Clark wrote:
> > > On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> > > > > Version five of this patch series:
> > > > >
> > > > > * Changes are primarily addressing feedback from the v4 series reviews.
> > > > >   Many points noted on the individual commit posts.
> > > > >
> > > > > * Critical sections have been shrunk, with allocations and frees
> > > > >   pulled outside where possible, reordering logic within hypercall ops.
> > > > >
> > > > > * A new ring hash function implemented, derived from the djb2 string
> > > > >   hash function.
> > > > >
> > > > > * Flags returned by the notify op have been simplified.
> > > > >
> > > > > * Now uses a single argo boot parameter, taking a list:
> > > > >   - top level boolean to enable/disable Argo
> > > > >   - mac-permissive option to enable/disable wildcard rings
> > > > >   - command line doc edit: no "CONFIG_ARGO" but refers to build config
> > > > >
> > > > > * Switched to use the standard list data structures used by Xen's
> > > > >   common code.
> > > >
> > > > AFAIK this was not requested by any reviewer, so I wonder why you made
> > > > such change. The more that you open coded some of the list_ macros
> > > > instead of just doing a s/hlist_/list_/ replacement.
> > > > I'm fine with using list instead of hlist,
> > >
> > > At your request, v7 replaces open coding with Xen's list macros. The
> > > hlist macros were not used by any of the common code in Xen.
> > >
> > > > but I don't understand why
> > > > you decided to open code list_for_each and list_for_each_safe instead
> > > > of using the macros provided by Xen. Is there an issue with such
> > > > macros?
> > >
> > > As discussed offline:
> > >
> > > - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> > > - List macros in Xen list.h originated in Linux list.h and have diverged
> > > - OpenXT has use cases for measured launch and nested virtualization,
> > >   which influence downstream performance and security requirements for
> > >   Argo and Xen
> > > - OpenXT can temporarily patch Xen 4.12 for downstream use
> > >
> > > > I've made a couple of minor comments, but I think the current status
> > > > is good, and fixing those minor comments is going to be trivial.
> > >
> > > Ack, thanks. Hopefully v7 looks good.
> >
> > As a note, the common flow of interactions usually involves the
> > contributor replying to the comments made by the reviewer in order to
> > try to reach an agreement before sending a new version.
> 
> Yes, v7 was sent to address Jan and Julien's review comments in parallel
> with our ongoing discussion on v5 macros. v7 also provided a checkpoint
> for Argo testers to maximize test coverage as the series converges into
> a Xen 4.12 merge candidate for Juergen. It addressed:
> 
>  - Jan's v6 review comments
>  - Julien's v1 review comment
>  - most of your xen-devel and offline review comments

I think it will benefit the community to give this review in public,
so other reviewers know what's going on. IMO getting this private
review makes it harder for me (as a reviewer) to know the motivation
of some of the changes between versions, and likely also makes it
harder for you since you have to keep track of comments from multiple
sources on different channels.

Is there anything that prevents those people from making the review
comments publicly on xen-devel?

We should very much try to fix that so everyone can make review
comments on the public mailing list.

> > There are comments from v5 that haven't been fixed in v7
> > (the mask usage and list_first_entry_or_null for example)
> > and the reply to the reviewer's comment was sent at the same time as
> > v7, leaving no time for further discussion (and for reaching an
> > agreement suitable to both parties) before sending v7.
> 
> Code changes from our ongoing discussion will be addressed in v8. A
> proposal to address mask usage has been put forward in the parallel
> thread. Your proposed usage of list_first_entry_or_null will be made in
> v8, subject to the previous offline discussion about list macros
> (duplicated here for convenience):
> 
> > > As discussed offline:
> > >
> > > - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> > > - List macros in Xen list.h originated in Linux list.h and have diverged
> > > - OpenXT has use cases for measured launch and nested virtualization,
> > >   which influence downstream performance and security requirements for
> > >   Argo and Xen

FWIW, I don't see the connection between nested virtualization or
measured launch and the list macros. I think a little bit more context
would be helpful here in order to understand the issue.

> > > - OpenXT can temporarily patch Xen 4.12 for downstream use

Patching the macros for OpenXT is perfectly fine, but it would be
better to understand and fix the problem upstream if possible.

How are you patching the macros?

What are you trying to achieve by patching them?

Thanks, Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-04 10:07         ` Roger Pau Monné
@ 2019-02-04 18:22           ` Stefano Stabellini
  2019-02-13 10:31             ` Rich Persaud
  2019-02-06 10:31           ` Roger Pau Monné
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 44+ messages in thread
From: Stefano Stabellini @ 2019-02-04 18:22 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Christopher Clark,
	Rich Persaud, Tim Deegan, Daniel Smith, Julien Grall,
	Paul Durrant, Jan Beulich, xen-devel, Daniel De Graaf,
	James McKenzie, Eric Chanudet

On Mon, 4 Feb 2019, Roger Pau Monné wrote:
> > Yes, v7 was sent to address Jan and Julien's review comments in parallel
> > with our ongoing discussion on v5 macros. v7 also provided a checkpoint
> > for Argo testers to maximize test coverage as the series converges into
> > a Xen 4.12 merge candidate for Juergen. It addressed:
> > 
> >  - Jan's v6 review comments
> >  - Julien's v1 review comment
> >  - most of your xen-devel and offline review comments
> 
> I think it will benefit the community to give this review in public,
> so other reviewers know what's going on. IMO getting this private
> review makes it harder for me (as a reviewer) to know the motivation
> of some of the changes between versions, and likely also makes it
> harder for you since you have to keep track of comments from multiple
> sources on different channels.

There is one more reason to require public comments which I have only
learned recently: for safety certifications we need to keep a record of
all review comments and patches that address them for traceability.


* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-04 10:07         ` Roger Pau Monné
  2019-02-04 18:22           ` Stefano Stabellini
@ 2019-02-06 10:31           ` Roger Pau Monné
  2019-02-06 10:36           ` Roger Pau Monné
  2019-02-13 10:43           ` Rich Persaud
  3 siblings, 0 replies; 44+ messages in thread
From: Roger Pau Monné @ 2019-02-06 10:31 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Lars Kurth, Julien Grall, Stefano Stabellini,
	Wei Liu, Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Christopher Clark,
	Tim Deegan, Daniel Smith, Rich Persaud, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

Gentle ping on the questions below.

On Mon, Feb 04, 2019 at 11:07:12AM +0100, Roger Pau Monné wrote:
> On Sun, Feb 03, 2019 at 10:04:29AM -0800, Christopher Clark wrote:
> > On Thu, Jan 31, 2019 at 5:39 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Jan 30, 2019 at 08:05:30PM -0800, Christopher Clark wrote:
> > > > On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> > > > As discussed offline:
> > > >
> > > > - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> > > > - List macros in Xen list.h originated in Linux list.h and have diverged
> > > > - OpenXT has use cases for measured launch and nested virtualization,
> > > >   which influence downstream performance and security requirements for
> > > >   Argo and Xen
> 
> FWIW, I don't see the connection between nested virtualization or
> measured launch and the list macros. I think a little bit more context
> would be helpful here in order to understand the issue.
> 
> > > > - OpenXT can temporarily patch Xen 4.12 for downstream use
> 
> Patching the macros for OpenXT is perfectly fine, but it would be
> better to understand and fix the problem upstream if possible.
> 
> How are you patching the macros?
> 
> What are you trying to achieve by patching them?
> 
> Thanks, Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-04 10:07         ` Roger Pau Monné
  2019-02-04 18:22           ` Stefano Stabellini
  2019-02-06 10:31           ` Roger Pau Monné
@ 2019-02-06 10:36           ` Roger Pau Monné
  2019-02-13 10:43           ` Rich Persaud
  3 siblings, 0 replies; 44+ messages in thread
From: Roger Pau Monné @ 2019-02-06 10:36 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Julien Grall,
	Tim Deegan, Daniel Smith, Rich Persaud, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet

Wrong 'To:' field in the previous email, sorry.

Gentle ping on the questions below.

On Mon, Feb 04, 2019 at 11:07:12AM +0100, Roger Pau Monné wrote:
> On Sun, Feb 03, 2019 at 10:04:29AM -0800, Christopher Clark wrote:
> > On Thu, Jan 31, 2019 at 5:39 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Jan 30, 2019 at 08:05:30PM -0800, Christopher Clark wrote:
> > > > On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
> > > > As discussed offline:
> > > >
> > > > - Using Xen's list macros will expedite Argo's merge for Xen 4.12
> > > > - List macros in Xen list.h originated in Linux list.h and have diverged
> > > > - OpenXT has use cases for measured launch and nested virtualization,
> > > >   which influence downstream performance and security requirements for
> > > >   Argo and Xen
> 
> FWIW, I don't see the connection between nested virtualization or
> measured launch and the list macros. I think a little bit more context
> would be helpful here in order to understand the issue.
> 
> > > > - OpenXT can temporarily patch Xen 4.12 for downstream use
> 
> Patching the macros for OpenXT is perfectly fine, but it would be
> better to understand and fix the problem upstream if possible.
> 
> How are you patching the macros?
> 
> What are you trying to achieve by patching them?
> 
> Thanks, Roger.

* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-04 18:22           ` Stefano Stabellini
@ 2019-02-13 10:31             ` Rich Persaud
  2019-02-13 22:19               ` Stefano Stabellini
  0 siblings, 1 reply; 44+ messages in thread
From: Rich Persaud @ 2019-02-13 10:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Lars Kurth, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Christopher Clark, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich, xen-devel,
	Daniel De Graaf, James McKenzie, Eric Chanudet,
	Roger Pau Monné


> On Feb 4, 2019, at 13:22, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Mon, 4 Feb 2019, Roger Pau Monné wrote:
>>> Yes, v7 was sent to address Jan and Julien's review comments in parallel
>>> with our ongoing discussion on v5 macros. v7 also provided a checkpoint
>>> for Argo testers to maximize test coverage as the series converges into
>>> a Xen 4.12 merge candidate for Juergen. It addressed:
>>> 
>>> - Jan's v6 review comments
>>> - Julien's v1 review comment
>>> - most of your xen-devel and offline review comments
>> 
>> I think it will benefit the community to give this review in public,
>> so other reviewers know what's going on. IMO getting this private
>> review makes it harder for me (as a reviewer) to know the motivation
>> of some of the changes between versions, and likely also makes it
>> harder for you since you have to keep track of comments from multiple
>> sources on different channels.
> 
> There is one more reason to require public comments which I have only
> learned recently: for safety certifications we need to keep a record of
> all review comments and patches that address them for traceability.

Do you mean:

(A) all _merged_ patches and their review comments

 or

(B) all comments and patches (merged or not) that address them

i.e. would the certification process be seeking traceability of safety-impacting patches (code, scenario A) or decisions (including decisions to leave code unchanged, scenario B)?

If you mean (B), would we need an update to the Xen Security Problem Response Process [1]?  e.g. public archive of all comments from pre-disclosure discussion, along with content hashes stored immutably?  

Rich

[1] https://www.xenproject.org/security-policy.html




* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-04 10:07         ` Roger Pau Monné
                             ` (2 preceding siblings ...)
  2019-02-06 10:36           ` Roger Pau Monné
@ 2019-02-13 10:43           ` Rich Persaud
  3 siblings, 0 replies; 44+ messages in thread
From: Rich Persaud @ 2019-02-13 10:43 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Christopher Clark,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet



>>>>> On Feb 4, 2019, at 05:07, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>> 
>>>>>> On Sun, Feb 03, 2019 at 10:04:29AM -0800, Christopher Clark wrote:
>>>>>> On Thu, Jan 31, 2019 at 5:39 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>> 
>>>>>> On Wed, Jan 30, 2019 at 08:05:30PM -0800, Christopher Clark wrote:
>>>>>> On Tue, Jan 22, 2019 at 6:19 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>> 
>>>>>> On Mon, Jan 21, 2019 at 01:59:40AM -0800, Christopher Clark wrote:
>>>>>> Version five of this patch series:
>>>>>> 
>>>>>> * Changes are primarily addressing feedback from the v4 series reviews.
>>>>>> Many points noted on the individual commit posts.
>>>>>> 
>>>>>> * Critical sections have been shrunk, with allocations and frees
>>>>>> pulled outside where possible, reordering logic within hypercall ops.
>>>>>> 
>>>>>> * A new ring hash function implemented, derived from the djb2 string
>>>>>> hash function.
>>>>>> 
>>>>>> * Flags returned by the notify op have been simplified.
>>>>>> 
>>>>>> * Now uses a single argo boot parameter, taking a list:
>>>>>> - top level boolean to enable/disable Argo
>>>>>> - mac-permissive option to enable/disable wildcard rings
>>>>>> - command line doc edit: no "CONFIG_ARGO" but refers to build config
>>>>>> 
>>>>>> * Switched to use the standard list data structures used by Xen's
>>>>>> common code.
>>>>> 
>>>>> AFAIK this was not requested by any reviewer, so I wonder why you made
>>>>> this change, all the more since you open coded some of the list_ macros
>>>>> instead of just doing a s/hlist_/list_/ replacement.
>>>>> I'm fine with using list instead of hlist,
>>>> 
>>>> At your request, v7 replaces open coding with Xen's list macros. The
>>>> hlist macros were not used by any of the common code in Xen.
>>>> 
>>>>> but I don't understand why
>>>>> you decided to open code list_for_each and list_for_each_safe instead
>>>>> of using the macros provided by Xen. Is there an issue with such
>>>>> macros?
>>>> 
>>>> As discussed offline:
>>>> 
>>>> - Using Xen's list macros will expedite Argo's merge for Xen 4.12
>>>> - List macros in Xen list.h originated in Linux list.h and have diverged
>>>> - OpenXT has use cases for measured launch and nested virtualization,
>>>> which influence downstream performance and security requirements for
>>>> Argo and Xen
>>>> - OpenXT can temporarily patch Xen 4.12 for downstream use
>>>> 
>>>>> I've made a couple of minor comments, but I think the current status
>>>>> is good, and fixing those minor comments is going to be trivial.
>>>> 
>>>> Ack, thanks. Hopefully v7 looks good.
>>> 
>>> As a note, the common flow of interactions usually involves the
>>> contributor replying to the comments made by the reviewer in order to
>>> try to reach an agreement before sending a new version.
>> 
>> Yes, v7 was sent to address Jan and Julien's review comments in parallel
>> with our ongoing discussion on v5 macros. v7 also provided a checkpoint
>> for Argo testers to maximize test coverage as the series converges into
>> a Xen 4.12 merge candidate for Juergen. It addressed:
>> 
>> - Jan's v6 review comments
>> - Julien's v1 review comment
>> - most of your xen-devel and offline review comments
> 
> I think it will benefit the community to give this review in public,
> so other reviewers know what's going on. IMO getting this private
> review makes it harder for me (as a reviewer) to know the motivation
> of some of the changes between versions, and likely also makes it
> harder for you since you have to keep track of comments from multiple
> sources on different channels.
> 
> Is there anything that prevents those people from making the review
> comments publicly on xen-devel?
> 
> We should very much try to fix that so everyone can make review
> comments on the public mailing list.

I've advocated for open-source principles in several large organizations.  At XenSource and Citrix, we created organizational separation between the OSS Xen dev team and product teams.  I don't know if that structure remains today, but it was once helpful in reducing conflict between public OSS and private product roadmaps.

The separation between server and client Xen product teams was less ideal, which eventually led to OpenXT.  Six years after v4v was posted to xen-devel, Xen Argo is the first step to possible reunification, a small chance at reversal, via public open-source, of architectural and resource fragmentation that took place privately.

Like QubesOS, OpenXT (and predecessor Citrix XenClient) development is spread across many open-source projects, including Xen, enabling user workflows that balance hardware-assisted security with usability.  Spanning ecosystems, OpenXT is:

- unbundling OSS capabilities, e.g. TrenchBoot and coreboot for launch integrity
- moving code upstream (Argo, stubdom, blktap, Qemu, OpenEmbedded meta-virt)
- refactoring for peer & downstream derivatives, on client devices and beyond

To achieve this cross-community integration, we work with many stakeholders amid competing priorities for limited dev resources.  It has taken six years to turn the ship from a Xen separation which began within one organization. This progress was accrued across multiple organizations and policies.

Argo was improved by the Xen upstreaming effort.  Future Xen code contributions will benefit from the lessons learned.


>>> There are comments from v5 that haven't been fixed in v7
>>> (the mask usage and list_first_entry_or_null for example)
>>> and the reply to the reviewer's comment was sent at the same time as
>>> v7, leaving no time for further discussion (and for reaching an
>>> agreement suitable to both parties) before sending v7.
>> 
>> Code changes from our ongoing discussion will be addressed in v8. A
>> proposal to address mask usage has been put forward in the parallel
>> thread. Your proposed usage of list_first_entry_or_null will be made in
>> v8, subject to the previous offline discussion about list macros
>> (duplicated here for convenience):
>> 
>>>> As discussed offline:
>>>> 
>>>> - Using Xen's list macros will expedite Argo's merge for Xen 4.12
>>>> - List macros in Xen list.h originated in Linux list.h and have diverged
>>>> - OpenXT has use cases for measured launch and nested virtualization,
>>>> which influence downstream performance and security requirements for
>>>> Argo and Xen
> 
> FWIW, I don't see the connection between nested virtualization or
> measured launch and the list macros.

The issue most relevant to xen-devel is the divergence of list macros in Xen list.h from their origin in Linux list.h.  Since this issue is independent of Argo, I've started a separate thread on "macro supply chains" [1].


> I think a little bit more context
> would be helpful here in order to understand the issue.

For more context on nested virtualization, see Ian Pratt's talk [2] on AX, uXen and nested Hyper-V.  If a production system is designed to meet performance and security requirements that are delivered by multiple hypervisors, which could be open-source (e.g. Xen or uXen), proprietary (e.g. Hyper-V) or in firmware (e.g. AX on HP laptops), then a measured launch increases the level of assurance that the system is booted with validated hypervisors that can cooperate to meet those requirements.  

For more context on measured launch, see the boot integrity talks from PSEC 2018 [3].


>>>> - OpenXT can temporarily patch Xen 4.12 for downstream use
> 
> Patching the macros for OpenXT is perfectly fine, but it would be
> better to understand and fix the problem upstream if possible.
> 
> How are you patching the macros?
> 
> What are you trying to achieve by patching them?

This will be determined during an upcoming OpenXT development, testing and certification cycle, when upstream Xen Argo is evaluated in the context of OpenXT and derivative use cases.

Rich

[1] Macro supply chains
https://lists.xen.org/archives/html/xen-devel/2019-02/msg00832.html

[2] "Hypervisor Security — Lessons Learned", Ian Pratt, 2018
https://www.platformsecuritysummit.com/2018/speaker/pratt/

[3] Boot Integrity presentations, 2018
https://www.platformsecuritysummit.com/2018/topic/boot/



* Re: [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication
  2019-02-13 10:31             ` Rich Persaud
@ 2019-02-13 22:19               ` Stefano Stabellini
  0 siblings, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2019-02-13 22:19 UTC (permalink / raw)
  To: Rich Persaud
  Cc: Juergen Gross, Lars Kurth, Stefano Stabellini, Wei Liu,
	Ross Philipson, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Jason Andryuk, Ian Jackson, Christopher Clark,
	Tim Deegan, Daniel Smith, Julien Grall, Paul Durrant,
	Jan Beulich, xen-devel, Daniel De Graaf, James McKenzie,
	Eric Chanudet, Roger Pau Monné

On Wed, 13 Feb 2019, Rich Persaud wrote:
> On Feb 4, 2019, at 13:22, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
>       On Mon, 4 Feb 2019, Roger Pau Monné wrote:
>                   Yes, v7 was sent to address Jan and Julien's review comments in parallel
>                   with our ongoing discussion on v5 macros. v7 also provided a checkpoint
>                   for Argo testers to maximize test coverage as the series converges into
>                   a Xen 4.12 merge candidate for Juergen. It addressed:
> 
>                   - Jan's v6 review comments
>                   - Julien's v1 review comment
>                   - most of your xen-devel and offline review comments
> 
>             I think it will benefit the community to give this review in public,
>             so other reviewers know what's going on. IMO getting this private
>             review makes it harder for me (as a reviewer) to know the motivation
>             of some of the changes between versions, and likely also makes it
>             harder for you since you have to keep track of comments from multiple
>             sources on different channels.
> 
>       There is one more reason to require public comments which I have only
>       learned recently: for safety certifications we need to keep a record of
>       all review comments and patches that address them for traceability.
> 
> 
> Do you mean:
> 
> (A) all _merged_ patches and their review comments
> 
>  or
> 
> (B) all comments and patches (merged or not) that address them
> 
> i.e. would the certification process be seeking traceability of safety-impacting patches (code, scenario A) or decisions
> (including decisions to leave code unchanged, scenario B)?

I meant (A), however, I don't know specifically if anything from (B) is
also required.


> If you mean (B), would we need an update to the Xen Security Problem Response Process [1]?  e.g. public archive of all comments
> from pre-disclosure discussion, along with content hashes stored immutably?  
> Rich
> 
> [1] https://www.xenproject.org/security-policy.html

I don't think the archives of the pre-disclosure security discussions
need to be public, but they probably need to be available.


end of thread, other threads:[~2019-02-13 22:19 UTC | newest]

Thread overview: 44+ messages
2019-01-21  9:59 [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Christopher Clark
2019-01-21  9:59 ` [PATCH v5 01/15] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
2019-01-21  9:59 ` [PATCH v5 02/15] argo: introduce the argo_op hypercall boilerplate Christopher Clark
2019-01-21  9:59 ` [PATCH v5 03/15] argo: define argo_dprintk for subsystem debugging Christopher Clark
2019-01-21  9:59 ` [PATCH v5 04/15] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
2019-01-21 17:55   ` Roger Pau Monné
2019-01-31  4:06     ` Christopher Clark
2019-01-31 10:28       ` Jan Beulich
2019-01-21  9:59 ` [PATCH v5 05/15] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
2019-01-21  9:59 ` [PATCH v5 06/15] xen/arm: introduce guest_handle_for_field() Christopher Clark
2019-01-21  9:59 ` [PATCH v5 07/15] argo: implement the register op Christopher Clark
2019-01-22  9:59   ` Roger Pau Monné
2019-01-31  4:08     ` Christopher Clark
2019-01-21  9:59 ` [PATCH v5 08/15] argo: implement the unregister op Christopher Clark
2019-01-22 11:02   ` Roger Pau Monné
2019-01-21  9:59 ` [PATCH v5 09/15] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
2019-01-22 12:08   ` Roger Pau Monné
2019-01-31  4:10     ` Christopher Clark
2019-01-31 10:18       ` Roger Pau Monné
2019-01-31 10:35         ` Jan Beulich
2019-01-31 11:00           ` Roger Pau Monné
2019-02-03 17:56             ` Christopher Clark
2019-02-04  9:30               ` Roger Pau Monné
2019-01-21  9:59 ` [PATCH v5 10/15] argo: implement the notify op Christopher Clark
2019-01-22 14:09   ` Roger Pau Monné
2019-01-31  4:12     ` Christopher Clark
2019-01-21  9:59 ` [PATCH v5 11/15] xsm, argo: XSM control for argo register Christopher Clark
2019-01-21  9:59 ` [PATCH v5 12/15] xsm, argo: XSM control for argo message send operation Christopher Clark
2019-01-21  9:59 ` [PATCH v5 13/15] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
2019-01-21  9:59 ` [PATCH v5 14/15] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
2019-01-21  9:59 ` [PATCH v5 15/15] MAINTAINERS: add new section for Argo and self as maintainer Christopher Clark
2019-01-21 10:58   ` Wei Liu
2019-01-21 11:07   ` Jan Beulich
2019-01-22 14:17 ` [PATCH v5 00/15] Argo: hypervisor-mediated interdomain communication Roger Pau Monné
2019-01-31  4:05   ` Christopher Clark
2019-01-31 13:39     ` Roger Pau Monné
2019-02-03 18:04       ` Christopher Clark
2019-02-04 10:07         ` Roger Pau Monné
2019-02-04 18:22           ` Stefano Stabellini
2019-02-13 10:31             ` Rich Persaud
2019-02-13 22:19               ` Stefano Stabellini
2019-02-06 10:31           ` Roger Pau Monné
2019-02-06 10:36           ` Roger Pau Monné
2019-02-13 10:43           ` Rich Persaud
