* [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication
@ 2018-12-20  6:38 Christopher Clark
  2018-12-20  6:38 ` [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
                   ` (17 more replies)
  0 siblings, 18 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Jason Andryuk, Daniel Smith, Chris Patterson, James McKenzie,
	Rich Persaud, Jan Beulich, Stefano Stabellini, Tim Deegan,
	Jean Guyader, Lars Kurth, Ross Philipson, Konrad Rzeszutek Wilk,
	Paul Durrant, Juergen Gross, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Julien Grall, Daniel De Graaf,
	Eric Chanudet, Roger Pau Monne

This is the second version of the patch series that implements the Argo
hypervisor-mediated interdomain communication mechanism as an experimental
feature for incorporation into the Xen hypervisor.

Feedback from version one has been incorporated, with notable changes:

* VIRQs are used exclusively for signal delivery
  (i.e. not raw event channels or an ISA IRQ in this version).

* The get_config operation has been dropped from the series since the raw
  event channels are not in use and the channel number need not be queried.

* The memory pages supplied by the guest in the register ring operation
  are described using values that indicate page address and granularity
  rather than guest frame number.

* Xen's soft reset operation has been implemented.

* The compat machinery for validating hypercall argument structures
  has been enabled.

* 'goto' has been reintroduced by popular demand, and indentation
  decreased.

* Comments have been added where reviews have identified queries;
  commit messages are expanded.

* Xen-prefixes have been applied throughout the public header,
  and then everywhere else to match.

* The public header has been relicensed under the terms of the
  BSD License, with sign-off to be confirmed by Citrix with this series.

* Several patches melded, folding related work together.

and more.

Thank you to Paul, Jan, Julien and Roger for the extensive and helpful
reviews of the first version of this series.

Christopher

===
v1 cover letter:

Relevant to the ARM deadline for inclusion in the Xen 4.12 release,
there are very few and only minor ARM-specific changes in this series.

This is derived from the v4v work of XenClient, retained in the OpenXT
Project and developed further by Bromium in uxen. It has benefitted from
and been improved by previous rounds of review in this Xen community,
and is the combined work of a series of Xen engineers that have
preceded the efforts of the current submission.

The motivation for this feature continues to be that a non-networking,
non-shared memory, hypervisor-mediated communication mechanism between
domains concurrently executing on the same hypervisor has attractive
properties for use cases that value strong mechanisms for policy
enforcement and isolation.

In this series, Argo is made optional for inclusion via Kconfig. When
included, it defaults to disabled and requires a Xen boot parameter to
enable it.  It has XSM integration for access control over
domain-to-domain communication, and a second boot parameter governs the
level of permissiveness over shared communication rings when using the
non-XSM/Flask default.
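
For illustration, enabling Argo with the relaxed access-control default uses
the two boot parameters named above (parameter names are taken from patches
4 and 9 of this series; treat this as a sketch, not a tested command line):

    Xen command line additions:  argo=1 argo_mac=permissive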

Design documentation can be found on the Xen wiki, at:
https://wiki.xenproject.org/wiki/Argo:_Hypervisor-Mediated_Exchange_(HMX)_for_Xen

and it will be updated to correspond to the submission here in the coming days.

Argo has recently been discussed on the Xen x86 Community Call, minutes:
https://docs.google.com/document/d/1VUPdWwd1raDOPhjReVVkmb6YoQB3X5oU12E4ExjO1n0/edit#heading=h.mz1wjb9vekjn

In (very) short, Argo is implemented by a new hypercall with five operations:
    * register ring
    * unregister ring
    * sendv
    * notify
    * get config

Ring registration is performed by a domain to provide a region of memory
for receiving messages from one or many other domains. A domain can
issue a send operation to send messages to another domain's ring. The
data is transferred synchronously by the hypervisor. There is no shared
memory between domains, allowing for increased confidence by the domain
that the memory accesses in the registered ring conform to the expected
protocol. The hypervisor is able to enforce access control policy over
the communication.
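
As an illustrative sketch of how those operations relate to the single
hypercall entry point added by this series (the function shape follows the
later patches; the op numbers here are placeholders, not the ABI):

    long
    do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                       XEN_GUEST_HANDLE_PARAM(void) arg2,
                       unsigned long arg3, unsigned long arg4)
    {
        long rc;

        switch ( cmd )
        {
        case 1: /* register ring   (placeholder op number) */
        case 2: /* unregister ring (placeholder op number) */
        case 3: /* sendv           (placeholder op number) */
        case 4: /* notify          (placeholder op number) */
        case 5: /* get config      (placeholder op number) */
        default:
            /* stub: each operation is implemented by later patches */
            rc = -EOPNOTSUPP;
            break;
        }

        return rc;
    }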

== Naming

v4v lives on in the Bromium uxen codebase. It is not the same
implementation as this one; it doesn't have quite the same properties, and
I don't expect the two to converge (though I do hope continued
cross-pollination will happen). Given that, this feature needs to be
describable with a different name.

It's also a complex enough system, with design details that matter and
affect important properties of it, that a generic term (e.g. "message
rings") is not sufficient.

Xen's name originates from Xenia, the ancient Greek concept of
hospitality. Argo is the ship from Greek mythology that provided secure
transport for the mission to obtain the Golden Fleece. This feature aims
to provide secure transport.

With this series, I'm proposing that this work shall use the name: argo.
(It is short, pronounceable, and unique within Xen's context, so it is
acceptable in code, and material artefacts will be discoverable with a
search engine.)

Valued feedback was given in review prior to this posting about whether it
was ok to name aspects of the implementation 'argo'. I took this seriously,
and spent significant time looking at how to reduce the level of argo-ness
in this implementation. This version does incorporate changes from that
effort, but in general my view is that use of the name in the code assists
its clarity, so much of that naming has been retained.

The term "Hypervisor-Mediated data eXchange (HMX)" was introduced in a
presentation at the Platform Security Summit 2018, to describe the
general, hypervisor-agnostic, capability of data transfer between
domains performed by the hypervisor. It is viewable at:

  https://www.platformsecuritysummit.com/2018/speaker/clark/

Argo conforms to HMX as described, as does Hyper-V's message-sending
primitive.

== Future items

The Linux device driver used to test this software is derived from the
OpenXT v4v Linux device driver, available at:
    https://github.com/OpenXT/v4v
The Argo version of the driver is not yet ready to publish (focus has been
on the hypervisor code to this point). A Linux device driver suitable for
inclusion in Xen will be submitted for a future Xen release and
incorporation into OpenXT.

This submission does not include a firewall for constraining
domain-to-domain communication. The XSM hooks added currently provide
granularity of control at domain-to-domain level. We intend to extend
this to provide finer-grained access control in a future submission, but
the current implementation should provide sufficient isolation for some
use cases.

Communication between VMs at different levels of nesting in a
multi-hypervisor system is of strong interest and will inform near-term
enhancements.

Optimization of notification delivery to VMs is a known area for improvement.
* uxen's v4v uses an edge-triggered interrupt to reduce VMEXIT load.
* delivering extended notification data via a dedicated registered ring
  will allow a guest to avoid a search to identify notification causes.

Additional items will be noted on the Xen wiki.

== Credits

Contributors to the design and implementation of this software include:
James McKenzie, Jean Guyader, Ross Philipson, Christopher Clark

with the support of the OpenXT Project.

Thanks are due for the helpful reviews of earlier revisions by
Tim Deegan, Jan Beulich, Ian Campbell and Eric Chanudet.


Christopher Clark (18):
  argo: Introduce the Kconfig option to govern inclusion of Argo
  argo: introduce the argo_message_op hypercall boilerplate
  argo: define argo_dprintk for subsystem debugging
  argo: init, destroy and soft-reset, with enable command line opt
  xen: add simple errno-returning macros for copy from guest
  xen: add XEN_GUEST_HANDLE_NULL macros for null XEN_GUEST_HANDLE
  errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  xen/arm: introduce guest_handle_for_field()
  xsm, argo: XSM control for argo register; add argo_mac bootparam
  xsm, argo: XSM control for argo message send operation
  argo: implement the register op
  argo: implement the unregister op
  argo: implement the sendv op; evtchn: expose send_guest_global_virq
  argo: implement the notify op
  xsm, argo: XSM control for any access to argo by a domain
  xsm, argo: notify: don't describe rings that cannot be sent to
  argo: validate hypercall arg structures via compat machinery
  argo: unmap rings on suspend; signal ring-owners on resume

 xen/arch/x86/guest/hypercall_page.S   |    2 +-
 xen/arch/x86/hvm/hypercall.c          |    3 +
 xen/arch/x86/hypercall.c              |    3 +
 xen/arch/x86/pv/hypercall.c           |    3 +
 xen/common/Kconfig                    |   20 +
 xen/common/Makefile                   |    3 +-
 xen/common/argo.c                     | 2296 +++++++++++++++++++++++++++++++++
 xen/common/compat/argo.c              |   60 +
 xen/common/domain.c                   |   29 +
 xen/common/event_channel.c            |    2 +-
 xen/include/Makefile                  |    1 +
 xen/include/asm-arm/guest_access.h    |   23 +
 xen/include/asm-x86/guest_access.h    |   20 +
 xen/include/public/argo.h             |  284 ++++
 xen/include/public/errno.h            |    2 +
 xen/include/public/xen.h              |    8 +-
 xen/include/xen/argo.h                |   25 +
 xen/include/xen/event.h               |    7 +
 xen/include/xen/guest_access.h        |    3 +
 xen/include/xen/hypercall.h           |    9 +
 xen/include/xen/sched.h               |    6 +
 xen/include/xlat.lst                  |    7 +
 xen/include/xsm/dummy.h               |   26 +
 xen/include/xsm/xsm.h                 |   31 +
 xen/xsm/dummy.c                       |    6 +
 xen/xsm/flask/hooks.c                 |   41 +-
 xen/xsm/flask/policy/access_vectors   |   16 +
 xen/xsm/flask/policy/security_classes |    1 +
 28 files changed, 2929 insertions(+), 8 deletions(-)
 create mode 100644 xen/common/argo.c
 create mode 100644 xen/common/compat/argo.c
 create mode 100644 xen/include/public/argo.h
 create mode 100644 xen/include/xen/argo.h

-- 
2.7.4



* [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
@ 2018-12-20  6:38 ` Christopher Clark
  2018-12-20  9:00   ` Jan Beulich
  2018-12-20  6:38 ` [PATCH v2 02/18] argo: introduce the argo_message_op hypercall boilerplate Christopher Clark
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

Change the default to disabled.
Make the prompt depend on EXPERT.
Fix tab-based indentation.

 xen/common/Kconfig | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 68132a3..b5a1b29 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -200,6 +200,26 @@ config LATE_HWDOM
 
 	  If unsure, say N.
 
+config ARGO
+	def_bool n
+	prompt "Argo: hypervisor-mediated interdomain communication" if EXPERT = "y"
+	---help---
+	  Enables a hypercall for domains to ask the hypervisor to perform
+	  data transfer of messages between domains.
+
+	  This allows communication channels to be established that do not
+	  require any shared memory between domains; the hypervisor is the
+	  entity that each domain interacts with. The hypervisor is able to
+	  enforce Mandatory Access Control policy over the communication.
+
+	  If XSM_FLASK is enabled, XSM policy can govern which domains may
+	  communicate via the Argo system.
+
+	  This feature does nothing if the "argo" boot parameter is not present.
+	  Argo is disabled at runtime by default.
+
+	  If unsure, say N.
+
 menu "Schedulers"
 	visible if EXPERT = "y"
 
-- 
2.7.4



* [PATCH v2 02/18] argo: introduce the argo_message_op hypercall boilerplate
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
  2018-12-20  6:38 ` [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
@ 2018-12-20  6:38 ` Christopher Clark
  2018-12-20 15:18   ` Jan Beulich
  2018-12-20  6:39 ` [PATCH v2 03/18] argo: define argo_dprintk for subsystem debugging Christopher Clark
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Presence is gated upon CONFIG_ARGO.

Registers the hypercall previously reserved for this.
Takes 5 arguments, does nothing and returns -ENOSYS.

A compat ABI will be avoided by using fixed-size types in hypercall ops, so
HYPERCALL, rather than COMPAT_CALL, is the correct macro for the hypercall
tables.

Even though handles will be used for (up to) two of the arguments to the
hypercall, there will be no need for any XLAT_* translation functions
because the referenced data structures have been constructed to be exactly
the same size and bit pattern on both 32-bit and 64-bit guests, and padded
to be integer multiples of 32 bits in size. This means that the same
copy_to_guest and copy_from_guest logic can be relied upon to perform as
required without any further intervention. Testing communication with 32
and 64 bit guests has confirmed this works as intended.
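
For illustration, the address structure added to the public header later in
this series shows the approach: fixed-width fields plus explicit padding, so
the layout is identical for 32-bit and 64-bit guests (comments added here):

    typedef struct xen_argo_addr
    {
        uint32_t port;
        domid_t domain_id;  /* domid_t is a uint16_t */
        uint16_t pad;       /* explicit padding to a 32-bit multiple */
    } xen_argo_addr_t;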

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

---
Changes since v1:

v1 feedback #15 Jan: handle upper-halves of hypercall args
v1 feedback #15 Jan: use unsigned where negative values impossible

 xen/arch/x86/guest/hypercall_page.S |  2 +-
 xen/arch/x86/hvm/hypercall.c        |  3 +++
 xen/arch/x86/hypercall.c            |  3 +++
 xen/arch/x86/pv/hypercall.c         |  3 +++
 xen/common/Makefile                 |  1 +
 xen/common/argo.c                   | 28 ++++++++++++++++++++++++++++
 xen/include/public/xen.h            |  2 +-
 xen/include/xen/hypercall.h         |  9 +++++++++
 8 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 xen/common/argo.c

diff --git a/xen/arch/x86/guest/hypercall_page.S b/xen/arch/x86/guest/hypercall_page.S
index fdd2e72..6c56d66 100644
--- a/xen/arch/x86/guest/hypercall_page.S
+++ b/xen/arch/x86/guest/hypercall_page.S
@@ -59,7 +59,7 @@ DECLARE_HYPERCALL(sysctl)
 DECLARE_HYPERCALL(domctl)
 DECLARE_HYPERCALL(kexec_op)
 DECLARE_HYPERCALL(tmem_op)
-DECLARE_HYPERCALL(xc_reserved_op)
+DECLARE_HYPERCALL(argo_message_op)
 DECLARE_HYPERCALL(xenpmu_op)
 
 DECLARE_HYPERCALL(arch_0)
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 19d1263..ee3c9f1 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -134,6 +134,9 @@ static const hypercall_table_t hvm_hypercall_table[] = {
 #ifdef CONFIG_TMEM
     HYPERCALL(tmem_op),
 #endif
+#ifdef CONFIG_ARGO
+    HYPERCALL(argo_message_op),
+#endif
     COMPAT_CALL(platform_op),
 #ifdef CONFIG_PV
     COMPAT_CALL(mmuext_op),
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 032de8f..7da7e89 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -64,6 +64,9 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(domctl, 1),
     ARGS(kexec_op, 2),
     ARGS(tmem_op, 1),
+#ifdef CONFIG_ARGO
+    ARGS(argo_message_op, 5),
+#endif
     ARGS(xenpmu_op, 2),
 #ifdef CONFIG_HVM
     ARGS(hvm_op, 2),
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 5d11911..c3fd555 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -77,6 +77,9 @@ const hypercall_table_t pv_hypercall_table[] = {
 #ifdef CONFIG_TMEM
     HYPERCALL(tmem_op),
 #endif
+#ifdef CONFIG_ARGO
+    HYPERCALL(argo_message_op),
+#endif
     HYPERCALL(xenpmu_op),
 #ifdef CONFIG_HVM
     HYPERCALL(hvm_op),
diff --git a/xen/common/Makefile b/xen/common/Makefile
index ffdfb74..8c65c6f 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,3 +1,4 @@
+obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
 obj-y += bsearch.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
diff --git a/xen/common/argo.c b/xen/common/argo.c
new file mode 100644
index 0000000..cefd64f
--- /dev/null
+++ b/xen/common/argo.c
@@ -0,0 +1,28 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Derived from v4v, the version 2 of v2v.
+ *
+ * Copyright (c) 2010, Citrix Systems
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/errno.h>
+#include <xen/guest_access.h>
+
+long
+do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
+                   XEN_GUEST_HANDLE_PARAM(void) arg2,
+                   unsigned long arg3, unsigned long arg4)
+{
+    return -ENOSYS;
+}
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 68ee098..0a27546 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -118,7 +118,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_domctl               36
 #define __HYPERVISOR_kexec_op             37
 #define __HYPERVISOR_tmem_op              38
-#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
+#define __HYPERVISOR_argo_message_op      39
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
 
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea..1dba964 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -136,6 +136,15 @@ do_tmem_op(
     XEN_GUEST_HANDLE_PARAM(tmem_op_t) uops);
 #endif
 
+#ifdef CONFIG_ARGO
+extern long do_argo_message_op(
+    unsigned int cmd,
+    XEN_GUEST_HANDLE_PARAM(void) arg1,
+    XEN_GUEST_HANDLE_PARAM(void) arg2,
+    unsigned long arg3,
+    unsigned long arg4);
+#endif
+
 extern long
 do_xenoprof_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg);
 
-- 
2.7.4



* [PATCH v2 03/18] argo: define argo_dprintk for subsystem debugging
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
  2018-12-20  6:38 ` [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
  2018-12-20  6:38 ` [PATCH v2 02/18] argo: introduce the argo_message_op hypercall boilerplate Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20 15:20   ` Jan Beulich
  2018-12-20  6:39 ` [PATCH v2 04/18] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

A convenience for working on development of the argo subsystem:
toggling a local #define variable turns on just the debug messages
in this subsystem.
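
For reference, a call site added by a later patch in this series looks like:

    argo_dprintk("init: domid: %d\n", d->domain_id);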

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #04 feedback, Jan:
    remove do-while from definition
    fully parenthesize the macro expansion
    remove code snippet from the commit message
    add #define ARGO_DEBUG 0 to the file

Correct plural comment to singular

 xen/common/argo.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index cefd64f..98f40b5 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -19,6 +19,19 @@
 #include <xen/errno.h>
 #include <xen/guest_access.h>
 
+/*
+ * Debug
+ */
+
+/* Set ARGO_DEBUG to 1 here to enable more debug messages */
+#define ARGO_DEBUG 0
+
+#if ARGO_DEBUG
+#define argo_dprintk(format, args...) printk("argo: " format, ## args )
+#else
+#define argo_dprintk(format, ... ) ((void)0)
+#endif
+
 long
 do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                    XEN_GUEST_HANDLE_PARAM(void) arg2,
-- 
2.7.4



* [PATCH v2 04/18] argo: init, destroy and soft-reset, with enable command line opt
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (2 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 03/18] argo: define argo_dprintk for subsystem debugging Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20 14:41   ` Lars Kurth
  2018-12-20  6:39 ` [PATCH v2 05/18] xen: add simple errno-returning macros for copy from guest Christopher Clark
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Lars Kurth, Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Initialises basic data structures and performs teardown of argo state
for domain shutdown.

Adds a Xen command line parameter 'argo': bool to enable/disable.
=> defaults to disabled.

Introduces headers:
  <public/argo.h> with definitions of addresses and ring structure, including
  indexes for atomic update for communication between domain and hypervisor,
  and <xen/argo.h> to expose the hooks for integration into domain lifecycle
  stages.

If CONFIG_ARGO is enabled:

Adds: per-domain init of argo data structures to domain_create by calling
argo_init; teardown via argo_destroy into domain_destroy and the error exit
path of domain_create; and reset of domain state in argo_soft_reset.

In accordance with recent work on _domain_destroy, argo_destroy is
idempotent.

Adds two new fields to struct domain:
    rwlock_t argo_lock;
    struct argo_domain *argo;

The software license on the public header is the BSD license, standard
procedure for the public Xen headers.

A count will be maintained of the number of rings that a domain has
registered, in order to enforce the fixed maximum limit defined here.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #5 feedback Paul: init/destroy unsigned, brackets and whitespace fixes
v1 #5 feedback Paul: Use mfn_eq for comparing mfns.
v1 #5 feedback Paul: init/destroy : use currd
v1 #6 (#5) feedback Jan: init/destroy: s/ENOSYS/EOPNOTSUPP/
v1 #6 feedback Paul: Folded patch 6 into patch 5.
v1 #6 feedback Jan: drop opt_argo_enabled initializer
v1 #6 feedback Jan: s/ENOSYS/EOPNOTSUPP/g and drop useless dprintk
v1 #5 feedback Paul: change the license on public header to BSD
v1 drop unnecessary xen include from sched.h
v1 drop inclusion of public argo.h in private one
v1 add include of public argo.h to argo.c
v1 drop fwd decl of argo_domain in priv header
v1 Paul/Jan: add data structures to xlat.lst and compat/argo.h to Makefile
v1. removed allocation of event channel since switching to VIRQ
v1. drop types.h include from private argo.h
v1: reorder public argo include position
v1: #13 feedback Jan: public namespace: prefix with xen
v1: rename pending ent "id" to "domain_id"
v1: add domain_cookie to ent struct
v1. #15 feedback Jan: make cmd unsigned
v1. #15 feedback Jan: make i loop variable unsigned
v1: adjust dprintks in init, destroy
v1: #18 feedback Jan: meld max ring count limit
v1: use type not struct in public defn, affects compat gen header
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1: add comment explaining the 'magic' field
v1: via Jan feedback: implement soft reset
v1: feedback #13 Roger: use ASSERT_UNREACHABLE

 xen/common/argo.c         | 331 +++++++++++++++++++++++++++++++++++++++++++++-
 xen/common/domain.c       |  20 +++
 xen/include/Makefile      |   1 +
 xen/include/public/argo.h |  70 ++++++++++
 xen/include/xen/argo.h    |  23 ++++
 xen/include/xen/sched.h   |   6 +
 xen/include/xlat.lst      |   3 +
 7 files changed, 453 insertions(+), 1 deletion(-)
 create mode 100644 xen/include/public/argo.h
 create mode 100644 xen/include/xen/argo.h

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 98f40b5..2d2f8a8 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -17,7 +17,110 @@
  */
 
 #include <xen/errno.h>
+#include <xen/sched.h>
+#include <xen/domain.h>
+#include <xen/argo.h>
+#include <xen/event.h>
+#include <xen/domain_page.h>
 #include <xen/guest_access.h>
+#include <xen/time.h>
+#include <public/argo.h>
+
+#define ARGO_MAX_RINGS_PER_DOMAIN       128U
+
+DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+
+/* Xen command line option to enable argo */
+static bool __read_mostly opt_argo_enabled;
+boolean_param("argo", opt_argo_enabled);
+
+struct argo_pending_ent
+{
+    struct hlist_node node;
+    domid_t domain_id;
+    uint16_t pad;
+    uint32_t len;
+    uint64_t domain_cookie;
+};
+
+struct argo_ring_info
+{
+    /* next node in the hash, protected by L2 */
+    struct hlist_node node;
+    /* this ring's id, protected by L2 */
+    xen_argo_ring_id_t id;
+    /* used to confirm sender id, protected by L2 */
+    uint64_t partner_cookie;
+    /* L3 */
+    spinlock_t lock;
+    /* cached length of the ring (from ring->len), protected by L3 */
+    uint32_t len;
+    /* number of pages in the ring, protected by L3 */
+    uint32_t npage;
+    /* number of pages translated into mfns, protected by L3 */
+    uint32_t nmfns;
+    /* cached tx pointer location, protected by L3 */
+    uint32_t tx_ptr;
+    /* mapped ring pages protected by L3 */
+    uint8_t **mfn_mapping;
+    /* list of mfns of guest ring, protected by L3 */
+    mfn_t *mfns;
+    /* list of struct argo_pending_ent for this ring, protected by L3 */
+    struct hlist_head pending;
+};
+
+/*
+ * The value of the argo element in a struct domain is
+ * protected by the global lock argo_lock: L1
+ */
+#define ARGO_HTABLE_SIZE 32
+struct argo_domain
+{
+    /* L2 */
+    rwlock_t lock;
+    /* protected by L2 */
+    struct hlist_head ring_hash[ARGO_HTABLE_SIZE];
+    /* id cookie, written only at init, so readable with R(L1) */
+    uint64_t domain_cookie;
+    /* counter of rings registered by this domain, protected by L2 */
+    uint32_t ring_count;
+};
+
+/*
+ * locks
+ */
+
+/*
+ * locking is organized as follows:
+ *
+ * L1 : The global lock: argo_lock
+ * Protects the argo elements of all struct domain *d in the system.
+ * It does not protect any of the elements of d->argo, only their
+ * addresses.
+ * By extension since the destruction of a domain with a non-NULL
+ * d->argo will need to free the d->argo pointer, holding this lock
+ * guarantees that no domains pointers that argo is interested in
+ * become invalid whilst this lock is held.
+ */
+
+static DEFINE_RWLOCK(argo_lock); /* L1 */
+
+/*
+ * L2 : The per-domain lock: d->argo->lock
+ * Holding a read lock on L2 protects the hash table and
+ * the elements in the hash_table d->argo->ring_hash, and
+ * the node and id fields in struct argo_ring_info in the
+ * hash table.
+ * Holding a write lock on L2 protects all of the elements of
+ * struct argo_ring_info.
+ * To take L2 you must already have R(L1). W(L1) implies W(L2) and L3.
+ *
+ * L3 : The ringinfo lock: argo_ring_info *ringinfo; ringinfo->lock
+ * Protects len, tx_ptr, the guest ring, the guest ring_data and
+ * the pending list.
+ * To acquire L3 you must already have R(L2). W(L2) implies L3.
+ */
 
 /*
  * Debug
@@ -32,10 +135,236 @@
 #define argo_dprintk(format, ... ) ((void)0)
 #endif
 
+/*
+ * ring buffer
+ */
+
+/* caller must have L3 or W(L2) */
+static void
+argo_ring_unmap(struct argo_ring_info *ring_info)
+{
+    unsigned int i;
+
+    if ( !ring_info->mfn_mapping )
+        return;
+
+    for ( i = 0; i < ring_info->nmfns; i++ )
+    {
+        if ( !ring_info->mfn_mapping[i] )
+            continue;
+        if ( ring_info->mfns )
+            argo_dprintk("unmapping page %"PRI_mfn" from %p\n",
+                         mfn_x(ring_info->mfns[i]),
+                         ring_info->mfn_mapping[i]);
+        unmap_domain_page_global(ring_info->mfn_mapping[i]);
+        ring_info->mfn_mapping[i] = NULL;
+    }
+}
+
+/*
+ * pending
+ */
+static void
+argo_pending_remove_ent(struct argo_pending_ent *ent)
+{
+    hlist_del(&ent->node);
+    xfree(ent);
+}
+
+static void
+argo_pending_remove_all(struct argo_ring_info *ring_info)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *pending_ent;
+
+    hlist_for_each_entry_safe(pending_ent, node, next, &ring_info->pending,
+                              node)
+        argo_pending_remove_ent(pending_ent);
+}
+
+static void argo_ring_remove_mfns(const struct domain *d,
+                                  struct argo_ring_info *ring_info)
+{
+    unsigned int i;
+
+    ASSERT(rw_is_write_locked(&d->argo->lock));
+
+    if ( !ring_info->mfns )
+        return;
+
+    if ( !ring_info->mfn_mapping )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    argo_ring_unmap(ring_info);
+
+    for ( i = 0; i < ring_info->nmfns; i++ )
+        if ( !mfn_eq(ring_info->mfns[i], INVALID_MFN) )
+            put_page_and_type(mfn_to_page(ring_info->mfns[i]));
+
+    xfree(ring_info->mfns);
+    ring_info->mfns = NULL;
+    ring_info->npage = 0;
+    xfree(ring_info->mfn_mapping);
+    ring_info->mfn_mapping = NULL;
+    ring_info->nmfns = 0;
+}
+
+static void
+argo_ring_remove_info(struct domain *d, struct argo_ring_info *ring_info)
+{
+    ASSERT(rw_is_write_locked(&d->argo->lock));
+
+    /* Holding W(L2) so do not need to acquire L3 */
+    argo_pending_remove_all(ring_info);
+    hlist_del(&ring_info->node);
+    argo_ring_remove_mfns(d, ring_info);
+    xfree(ring_info);
+}
+
 long
 do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                    XEN_GUEST_HANDLE_PARAM(void) arg2,
                    unsigned long arg3, unsigned long arg4)
 {
-    return -ENOSYS;
+    struct domain *currd = current->domain;
+    long rc = -EFAULT;
+
+    argo_dprintk("->do_argo_message_op(%u,%p,%p,%d,%d)\n", cmd,
+                 (void *)arg1.p, (void *)arg2.p, (int) arg3, (int) arg4);
+
+    if ( unlikely(!opt_argo_enabled) )
+    {
+        rc = -EOPNOTSUPP;
+        return rc;
+    }
+
+    domain_lock(currd);
+
+    switch (cmd)
+    {
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+    domain_unlock(currd);
+
+    argo_dprintk("<-do_argo_message_op(%u)=%ld\n", cmd, rc);
+
+    return rc;
+}
+
+void
+argo_init_domain_state(struct argo_domain *argo)
+{
+    unsigned int i;
+
+    rwlock_init(&argo->lock);
+    argo->ring_count = 0;
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; ++i )
+        INIT_HLIST_HEAD(&argo->ring_hash[i]);
+
+    argo->domain_cookie = (uint64_t)NOW();
+}
+
+int
+argo_init(struct domain *d)
+{
+    struct argo_domain *argo;
+
+    if ( !opt_argo_enabled )
+    {
+        argo_dprintk("argo disabled, domid: %d\n", d->domain_id);
+        return 0;
+    }
+
+    argo_dprintk("init: domid: %d\n", d->domain_id);
+
+    argo = xmalloc(struct argo_domain);
+    if ( !argo )
+        return -ENOMEM;
+
+    argo_init_domain_state(argo);
+
+    write_lock(&argo_lock);
+
+    d->argo = argo;
+
+    write_unlock(&argo_lock);
+
+    return 0;
+}
+
+void
+argo_unregister_all_rings(struct domain *d)
+{
+    unsigned int i;
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; ++i )
+    {
+        struct hlist_node *node, *next;
+        struct argo_ring_info *ring_info;
+
+        hlist_for_each_entry_safe(ring_info, node, next,
+                                  &d->argo->ring_hash[i], node)
+            argo_ring_remove_info(d, ring_info);
+    }
+}
+
+void
+argo_destroy(struct domain *d)
+{
+    BUG_ON(!d->is_dying);
+
+    write_lock(&argo_lock);
+
+    argo_dprintk("destroy: domid %d d->argo=%p\n", d->domain_id, d->argo);
+
+    if ( d->argo )
+    {
+        argo_unregister_all_rings(d);
+
+        d->argo->domain_cookie = 0;
+        xfree(d->argo);
+        d->argo = NULL;
+    }
+    write_unlock(&argo_lock);
+
+    /*
+     * This (dying) domain's domid may be recorded as the authorized sender
+     * to rings registered by other domains, and those rings are not
+     * unregistered here.
+     * If a later domain is created that has the same domid as this one, the
+     * domain_cookie will differ, which ensures that the new domain cannot
+     * use the inherited authorizations to transmit that were issued to this
+     * domain.
+     */
+}
+
+void
+argo_soft_reset(struct domain *d)
+{
+    write_lock(&argo_lock);
+
+    argo_dprintk("soft reset d=%d d->argo=%p\n", d->domain_id, d->argo);
+
+    if ( d->argo )
+    {
+        argo_unregister_all_rings(d);
+
+        if ( !opt_argo_enabled || xsm_argo_enable(d) )
+        {
+            d->argo->domain_cookie = 0;
+            xfree(d->argo);
+            d->argo = NULL;
+        }
+        else
+            argo_init_domain_state(d->argo);
+    }
+
+    write_unlock(&argo_lock);
 }
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 78cc524..601c663 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -32,6 +32,7 @@
 #include <xen/grant_table.h>
 #include <xen/xenoprof.h>
 #include <xen/irq.h>
+#include <xen/argo.h>
 #include <asm/debugger.h>
 #include <asm/p2m.h>
 #include <asm/processor.h>
@@ -277,6 +278,10 @@ static void _domain_destroy(struct domain *d)
 
     xfree(d->pbuf);
 
+#ifdef CONFIG_ARGO
+    argo_destroy(d);
+#endif
+
     rangeset_domain_destroy(d);
 
     free_cpumask_var(d->dirty_cpumask);
@@ -376,6 +381,9 @@ struct domain *domain_create(domid_t domid,
     spin_lock_init(&d->hypercall_deadlock_mutex);
     INIT_PAGE_LIST_HEAD(&d->page_list);
     INIT_PAGE_LIST_HEAD(&d->xenpage_list);
+#ifdef CONFIG_ARGO
+    rwlock_init(&d->argo_lock);
+#endif
 
     spin_lock_init(&d->node_affinity_lock);
     d->node_affinity = NODE_MASK_ALL;
@@ -445,6 +453,11 @@ struct domain *domain_create(domid_t domid,
             goto fail;
         init_status |= INIT_gnttab;
 
+#ifdef CONFIG_ARGO
+        if ( (err = argo_init(d)) != 0 )
+            goto fail;
+#endif
+
         err = -ENOMEM;
 
         d->pbuf = xzalloc_array(char, DOMAIN_PBUF_SIZE);
@@ -717,6 +730,9 @@ int domain_kill(struct domain *d)
         if ( d->is_dying != DOMDYING_alive )
             return domain_kill(d);
         d->is_dying = DOMDYING_dying;
+#ifdef CONFIG_ARGO
+        argo_destroy(d);
+#endif
         evtchn_destroy(d);
         gnttab_release_mappings(d);
         tmem_destroy(d->tmem_client);
@@ -1172,6 +1188,10 @@ int domain_soft_reset(struct domain *d)
 
     grant_table_warn_active_grants(d);
 
+#ifdef CONFIG_ARGO
+    argo_soft_reset(d);
+#endif
+
     for_each_vcpu ( d, v )
     {
         set_xen_guest_handle(runstate_guest(v), NULL);
diff --git a/xen/include/Makefile b/xen/include/Makefile
index f7895e4..3d14532 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -5,6 +5,7 @@ ifneq ($(CONFIG_COMPAT),)
 compat-arch-$(CONFIG_X86) := x86_32
 
 headers-y := \
+    compat/argo.h \
     compat/callback.h \
     compat/elfnote.h \
     compat/event_channel.h \
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
new file mode 100644
index 0000000..a32fb2d
--- /dev/null
+++ b/xen/include/public/argo.h
@@ -0,0 +1,70 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Derived from v4v, the version 2 of v2v.
+ *
+ * Copyright (c) 2010, Citrix Systems
+ * Copyright (c) 2018, BAE Systems
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __XEN_PUBLIC_ARGO_H__
+#define __XEN_PUBLIC_ARGO_H__
+
+#include "xen.h"
+
+typedef struct xen_argo_addr
+{
+    uint32_t port;
+    domid_t domain_id;
+    uint16_t pad;
+} xen_argo_addr_t;
+
+typedef struct xen_argo_ring_id
+{
+    xen_argo_addr_t addr;
+    domid_t partner;
+    uint16_t pad;
+} xen_argo_ring_id_t;
+
+typedef struct xen_argo_ring
+{
+    /*
+     * Contents of the 'magic' field are inspected to verify that they contain
+     * an expected value before the hypervisor will perform writes into this
+     * structure in guest-supplied memory.
+     */
+    uint64_t magic;
+    xen_argo_ring_id_t id;
+    uint32_t len;
+    /* Guests should use atomic operations to access rx_ptr */
+    uint32_t rx_ptr;
+    /* Guests should use atomic operations to access tx_ptr */
+    uint32_t tx_ptr;
+    uint8_t reserved[32];
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t ring[];
+#elif defined(__GNUC__)
+    uint8_t ring[0];
+#endif
+} xen_argo_ring_t;
+
+#endif
diff --git a/xen/include/xen/argo.h b/xen/include/xen/argo.h
new file mode 100644
index 0000000..29d32a9
--- /dev/null
+++ b/xen/include/xen/argo.h
@@ -0,0 +1,23 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __XEN_ARGO_H__
+#define __XEN_ARGO_H__
+
+int argo_init(struct domain *d);
+void argo_destroy(struct domain *d);
+void argo_soft_reset(struct domain *d);
+
+#endif
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 0309c1f..f5ac0e8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -490,6 +490,12 @@ struct domain
         unsigned int guest_request_enabled       : 1;
         unsigned int guest_request_sync          : 1;
     } monitor;
+
+#ifdef CONFIG_ARGO
+    /* Argo interdomain communication support */
+    rwlock_t argo_lock;
+    struct argo_domain *argo;
+#endif
 };
 
 /* Protect updates/reads (resp.) of domain_list and domain_hash. */
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 5273320..33e4fd1 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -148,3 +148,6 @@
 ?	flask_setenforce		xsm/flask_op.h
 !	flask_sid_context		xsm/flask_op.h
 ?	flask_transition		xsm/flask_op.h
+?	argo_addr			argo.h
+?	argo_ring_id			argo.h
+?	argo_ring			argo.h
-- 
2.7.4



* [PATCH v2 05/18] xen: add simple errno-returning macros for copy from guest
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (3 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 04/18] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 06/18] xen: add XEN_GUEST_HANDLE_NULL macros for null XEN_GUEST_HANDLE Christopher Clark
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Adds: copy_from_guest_errno(ptr, hnd, nr)
 and: copy_from_guest_offset_errno(ptr, hnd, off, nr)

Inputs are identical to copy_from_guest and copy_from_guest_offset:

    - ptr: pointer to local memory to write into
    - hnd: guest handle to read from
    -  nr: number of records to copy
    - off: offset from the source handle to start the copy from.

The new versions perform the same work as the existing copy_from_guest and
copy_from_guest_offset, but differ in return value when the function does
not succeed in performing a complete copy: the new functions return an errno
value on error, rather than the number of bytes that could not be copied.

This is to allow for tidier error-handling flow at call sites in the
common case, e.g. to enable this sequence:

    ret = copy_from_guest_errno(&iov, iovs, 1);
    if ( ret )
        goto out;
    ...

instead of the prior:

    if ( copy_from_guest(&iov, iovs, 1) )
    {
        ret = -EFAULT;
        goto out;
    }
        ...

or assigning ret a default value of -EFAULT at point of declaration.

In (almost?) all cases the result of copy_from_guest must be checked,
and the error code, if one is to be generated, is (always?) EFAULT.
This change moves that check and error code translation into common code.

This errno-returning function interface originates from Bromium's uxen and
an additional motivation for introducing this is to simplify comparison and
maintenance of common code between argo and v4v.

Applied to both x86 and ARM headers.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

---
v1 #7 feedback, Paul: corrected and significantly simplified implementation

The need for this interface was questioned during the first series review
(which wasn't much helped by the implementation or commit message that was
provided in that patch, sorry) -- it wasn't the primary focus of the series,
so wasn't forefront in attention beforehand. However, I maintain that this
interface is better for nearly all the call sites using the guest access
macros, and will simplify code using them -- that plus easing work across
the uxen and Xen codebases together provides reasonable justification for
inclusion in the common code.

 xen/include/asm-arm/guest_access.h | 3 +++
 xen/include/asm-x86/guest_access.h | 3 +++
 xen/include/xen/guest_access.h     | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 224d2a0..8722858 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -97,6 +97,9 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     typeof(*(ptr)) *_d = (ptr);                         \
     raw_copy_from_guest(_d, _s+(off), sizeof(*_d)*(nr));\
 })
+#define copy_from_guest_offset_errno(ptr, hnd, off, nr)  \
+    (copy_from_guest_offset((ptr), (hnd), (off), (nr)) ? \
+        -EFAULT : 0)
 
 /* Copy sub-field of a structure to guest context via a guest handle. */
 #define copy_field_to_guest(hnd, ptr, field) ({         \
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index ca700c9..9399480 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -100,6 +100,9 @@
     typeof(*(ptr)) *_d = (ptr);                         \
     raw_copy_from_guest(_d, _s+(off), sizeof(*_d)*(nr));\
 })
+#define copy_from_guest_offset_errno(ptr, hnd, off, nr)  \
+    (copy_from_guest_offset((ptr), (hnd), (off), (nr)) ? \
+        -EFAULT : 0)
 
 #define clear_guest_offset(hnd, off, nr) ({    \
     void *_d = (hnd).p;                        \
diff --git a/xen/include/xen/guest_access.h b/xen/include/xen/guest_access.h
index 09989df..4a8ea1f 100644
--- a/xen/include/xen/guest_access.h
+++ b/xen/include/xen/guest_access.h
@@ -17,6 +17,9 @@
 #define copy_from_guest(ptr, hnd, nr)                   \
     copy_from_guest_offset(ptr, hnd, 0, nr)
 
+#define copy_from_guest_errno(ptr, hnd, nr)             \
+    (copy_from_guest_offset(ptr, hnd, 0, nr) ? -EFAULT : 0)
+
 #define clear_guest(hnd, nr)                            \
     clear_guest_offset(hnd, 0, nr)
 
-- 
2.7.4



* [PATCH v2 06/18] xen: add XEN_GUEST_HANDLE_NULL macros for null XEN_GUEST_HANDLE
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (4 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 05/18] xen: add simple errno-returning macros for copy from guest Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 07/18] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

This is to assist code clarity and abstraction conformance, enabling:

    var = XEN_GUEST_HANDLE_NULL(type);

instead of an abstraction violation, where structure is assumed:

    var = ((XEN_GUEST_HANDLE(type)){(void *)0});
or:
    var.p = NULL;

and for convenience when supplying a suitably typed NULL value to a function
that takes an argument with a XEN_GUEST_HANDLE type, e.g.:

    my_function(foo, bar, XEN_GUEST_HANDLE_NULL(type));

C89-compliance for the public header prohibits use of compound literals
so a declaration is required prior to the point of using the macro:

    DECLARE_XEN_GUEST_HANDLE_NULL(type);

This will define a const variable at the given static or auto scope which is
then utilized to supply the NULL-with-correct-type value when
XEN_GUEST_HANDLE_NULL is used.
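
A minimal combined sketch, assuming a guest-handle type 'foo' has already
been set up with DEFINE_XEN_GUEST_HANDLE (the type name is illustrative):

    /* Declare the typed NULL value once, before first use: */
    DECLARE_XEN_GUEST_HANDLE_NULL(foo);

    /* ... then, wherever a null handle of that type is needed: */
    XEN_GUEST_HANDLE(foo) h = XEN_GUEST_HANDLE_NULL(foo);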

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v1 #8 feedback Jan:
    parenthesize XEN_GUEST_HANDLE_NULL
    header requires C89 compliance

 xen/include/public/xen.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 0a27546..5f4f760 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -982,6 +982,10 @@ typedef struct {
 #define XEN_GUEST_HANDLE_64(name) XEN_GUEST_HANDLE(name)
 #endif
 
+#define DECLARE_XEN_GUEST_HANDLE_NULL(type) \
+    XEN_GUEST_HANDLE(type) const __guest_handle_ ## type ## _NULL = {(type *)0}
+#define XEN_GUEST_HANDLE_NULL(type) (__guest_handle_ ## type ## _NULL)
+
 #ifndef __ASSEMBLY__
 struct xenctl_bitmap {
     XEN_GUEST_HANDLE_64(uint8) bitmap;
-- 
2.7.4



* [PATCH v2 07/18] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (5 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 06/18] xen: add XEN_GUEST_HANDLE_NULL macros for null XEN_GUEST_HANDLE Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20 15:22   ` Jan Beulich
  2018-12-20  6:39 ` [PATCH v2 08/18] xen/arm: introduce guest_handle_for_field() Christopher Clark
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

EMSGSIZE: Argo's sendv operation will return EMSGSIZE when an excess amount
of data, across all iovs, has been supplied, exceeding either the statically
configured maximum size of a transmittable message, or the (variable) size
of the ring registered by the destination domain.

ECONNREFUSED: Argo's register operation will return ECONNREFUSED if a ring
is being registered to communicate with a specific remote domain that does
exist but is not argo-enabled.

These codes are described by POSIX here:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
    EMSGSIZE     : "Message too large"
    ECONNREFUSED : "Connection refused".

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
 xen/include/public/errno.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/public/errno.h b/xen/include/public/errno.h
index 305c112..e1d02fc 100644
--- a/xen/include/public/errno.h
+++ b/xen/include/public/errno.h
@@ -102,6 +102,7 @@ XEN_ERRNO(EILSEQ,	84)	/* Illegal byte sequence */
 XEN_ERRNO(ERESTART,	85)	/* Interrupted system call should be restarted */
 #endif
 XEN_ERRNO(ENOTSOCK,	88)	/* Socket operation on non-socket */
+XEN_ERRNO(EMSGSIZE,	90)	/* Message too large. */
 XEN_ERRNO(EOPNOTSUPP,	95)	/* Operation not supported on transport endpoint */
 XEN_ERRNO(EADDRINUSE,	98)	/* Address already in use */
 XEN_ERRNO(EADDRNOTAVAIL, 99)	/* Cannot assign requested address */
@@ -109,6 +110,7 @@ XEN_ERRNO(ENOBUFS,	105)	/* No buffer space available */
 XEN_ERRNO(EISCONN,	106)	/* Transport endpoint is already connected */
 XEN_ERRNO(ENOTCONN,	107)	/* Transport endpoint is not connected */
 XEN_ERRNO(ETIMEDOUT,	110)	/* Connection timed out */
+XEN_ERRNO(ECONNREFUSED,	111)	/* Connection refused */
 
 #undef XEN_ERRNO
 #endif /* XEN_ERRNO */
-- 
2.7.4



* [PATCH v2 08/18] xen/arm: introduce guest_handle_for_field()
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (6 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 07/18] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 09/18] xsm, argo: XSM control for argo register; add argo_mac bootparam Christopher Clark
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Ross Philipson, Jason Andryuk, Daniel Smith,
	Rich Persaud, James McKenzie, Julien Grall, Paul Durrant,
	Eric Chanudet

ARM port of c/s bb544585: "introduce guest_handle_for_field()"

This helper turns a field of a GUEST_HANDLE into a GUEST_HANDLE.
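
A usage sketch with hypothetical names: assuming 'hnd' is a
XEN_GUEST_HANDLE(some_struct_t) and some_struct_t has a member 'fld' of
type member_t, a handle to just that member is obtained with:

    XEN_GUEST_HANDLE(member_t) fld_hnd =
        guest_handle_for_field(hnd, member_t, fld);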

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
---
v1 10 feedback Paul: improve commit message, add R-by

 xen/include/asm-arm/guest_access.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 8722858..729f71e 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -63,6 +63,9 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     _y;                                                     \
 })
 
+#define guest_handle_for_field(hnd, type, fld)          \
+    ((XEN_GUEST_HANDLE(type)) { &(hnd).p->fld })
+
 #define guest_handle_from_ptr(ptr, type)        \
     ((XEN_GUEST_HANDLE_PARAM(type)) { (type *)ptr })
 #define const_guest_handle_from_ptr(ptr, type)  \
-- 
2.7.4



* [PATCH v2 09/18] xsm, argo: XSM control for argo register; add argo_mac bootparam
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (7 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 08/18] xen/arm: introduce guest_handle_for_field() Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20 15:29   ` Jan Beulich
  2018-12-20  6:39 ` [PATCH v2 10/18] xsm, argo: XSM control for argo message send operation Christopher Clark
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	Daniel De Graaf, James McKenzie, Eric Chanudet, Roger Pau Monne

XSM hooks implement distinct permissions for these two distinct cases of
Argo ring registration:

* Single source:  registering a ring for communication to receive messages
                  from a specified single other domain.
  Default policy: allow.

* Any source:     registering a ring for communication to receive messages
                  from any, or all, other domains (i.e. wildcard).
  Default policy: deny, with runtime policy configuration via new bootparam.

The default for wildcard rings is 'deny' because there is currently no
means other than XSM to protect such a ring from denial of service by a
noisy domain spamming it, which would affect other domains' ability to
send to it. Using XSM at least allows per-domain control over access to
the send permission, to limit communication to domains that can be trusted.
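
For illustration only (the domain type names are invented and no policy
change is part of this patch), a Flask policy could then grant these
permissions selectively with rules along the lines of:

    # permit client_t domains to register rings accepting messages
    # only from server_t domains:
    allow client_t server_t:argo register_single_source;

    # permit server_t domains to register wildcard (any-sender) rings;
    # the target is the hypervisor, matching the flask hook below:
    allow server_t xen_t:argo register_any_source;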

Since denying access to any-sender rings unless a flask XSM policy is active
would prevent many users from using a key Argo feature, also introduce a
boot parameter that can override this constraint:

   The "argo_mac" parameter accepts two values: 'permissive' and 'enforcing'.

Even though this is a boolean variable, use these descriptive strings in
order to make it obvious to an administrator that this has potential
security impact.
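
As an illustrative example, a host that wants wildcard rings usable without
a Flask policy loaded could boot Xen with:

    argo=1 argo_mac=permissive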

Also modifies the signatures of some core XSM hook functions to apply 'const'
to their arguments, which is needed for 'const' to be accepted in the
signatures of the functions that invoke them.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v1 feedback: replace use of strncmp with strcmp
v1 feedback #16 Jan: apply const to function signatures
v1 feedback #14 Jan: add blank line before return in parse_argo_mac_param

 xen/common/argo.c                     | 16 ++++++++++++++++
 xen/include/xsm/dummy.h               | 15 +++++++++++++++
 xen/include/xsm/xsm.h                 | 19 +++++++++++++++++++
 xen/xsm/dummy.c                       |  4 ++++
 xen/xsm/flask/hooks.c                 | 27 ++++++++++++++++++++++++---
 xen/xsm/flask/policy/access_vectors   | 11 +++++++++++
 xen/xsm/flask/policy/security_classes |  1 +
 7 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 2d2f8a8..abfc1f0 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -35,6 +35,22 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
 static bool __read_mostly opt_argo_enabled;
 boolean_param("argo", opt_argo_enabled);
 
+/* Xen command line option for conservative or relaxed access control */
+bool __read_mostly argo_mac_bootparam_enforcing = true;
+
+static int __init parse_argo_mac_param(const char *s)
+{
+    if ( !strcmp(s, "enforcing") )
+        argo_mac_bootparam_enforcing = true;
+    else if ( !strcmp(s, "permissive") )
+        argo_mac_bootparam_enforcing = false;
+    else
+        return -EINVAL;
+
+    return 0;
+}
+custom_param("argo_mac", parse_argo_mac_param);
+
 struct argo_pending_ent
 {
     struct hlist_node node;
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index a29d1ef..55113c3 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -720,6 +720,21 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 
 #endif /* CONFIG_X86 */
 
+#ifdef CONFIG_ARGO
+static XSM_INLINE int xsm_argo_register_single_source(struct domain *d,
+                                                      struct domain *t)
+{
+    return 0;
+}
+
+static XSM_INLINE int xsm_argo_register_any_source(struct domain *d,
+                                                   bool strict)
+{
+    return strict ? -EPERM : 0;
+}
+
+#endif /* CONFIG_ARGO */
+
 #include <public/version.h>
 static XSM_INLINE int xsm_xen_version (XSM_DEFAULT_ARG uint32_t op)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3b192b5..e775a6d 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -181,6 +181,11 @@ struct xsm_operations {
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
+#ifdef CONFIG_ARGO
+    int (*argo_register_single_source) (const struct domain *d,
+                                        const struct domain *t);
+    int (*argo_register_any_source) (const struct domain *d);
+#endif
 };
 
 #ifdef CONFIG_XSM
@@ -698,6 +703,20 @@ static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
     return xsm_ops->domain_resource_map(d);
 }
 
+#ifdef CONFIG_ARGO
+static inline int xsm_argo_register_single_source(const struct domain *d,
+                                              const struct domain *t)
+{
+    return xsm_ops->argo_register_single_source(d, t);
+}
+
+static inline int xsm_argo_register_any_source(const struct domain *d, bool strict)
+{
+    return xsm_ops->argo_register_any_source(d);
+}
+
+#endif /* CONFIG_ARGO */
+
 #endif /* XSM_NO_WRAPPERS */
 
 #ifdef CONFIG_MULTIBOOT
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 5701047..ed236b0 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -152,4 +152,8 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
+#ifdef CONFIG_ARGO
+    set_to_dummy_if_null(ops, argo_register_single_source);
+    set_to_dummy_if_null(ops, argo_register_any_source);
+#endif
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 96d31aa..fcb7487 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -36,13 +36,14 @@
 #include <objsec.h>
 #include <conditional.h>
 
-static u32 domain_sid(struct domain *dom)
+static u32 domain_sid(const struct domain *dom)
 {
     struct domain_security_struct *dsec = dom->ssid;
     return dsec->sid;
 }
 
-static u32 domain_target_sid(struct domain *src, struct domain *dst)
+static u32 domain_target_sid(const struct domain *src,
+                             const struct domain *dst)
 {
     struct domain_security_struct *ssec = src->ssid;
     struct domain_security_struct *dsec = dst->ssid;
@@ -58,7 +59,8 @@ static u32 evtchn_sid(const struct evtchn *chn)
     return chn->ssid.flask_sid;
 }
 
-static int domain_has_perm(struct domain *dom1, struct domain *dom2, 
+static int domain_has_perm(const struct domain *dom1,
+                           const struct domain *dom2,
                            u16 class, u32 perms)
 {
     u32 ssid, tsid;
@@ -1717,6 +1719,21 @@ static int flask_domain_resource_map(struct domain *d)
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__RESOURCE_MAP);
 }
 
+#ifdef CONFIG_ARGO
+static int flask_argo_register_single_source(const struct domain *d,
+                                             const struct domain *t)
+{
+    return domain_has_perm(d, t, SECCLASS_ARGO,
+                           ARGO__REGISTER_SINGLE_SOURCE);
+}
+
+static int flask_argo_register_any_source(const struct domain *d)
+{
+    return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
+                        ARGO__REGISTER_ANY_SOURCE, NULL);
+}
+#endif
+
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 int compat_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
 
@@ -1851,6 +1868,10 @@ static struct xsm_operations flask_ops = {
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
+#ifdef CONFIG_ARGO
+    .argo_register_single_source = flask_argo_register_single_source,
+    .argo_register_any_source = flask_argo_register_any_source,
+#endif
 };
 
 void __init flask_init(const void *policy_buffer, size_t policy_size)
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 6fecfda..fb95c97 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -531,3 +531,14 @@ class version
 # Xen build id
     xen_build_id
 }
+
+# Class argo is used to describe the Argo interdomain communication system.
+class argo
+{
+    # Domain requesting registration of a communication ring
+    # to receive messages from a specific other domain.
+    register_single_source
+    # Domain requesting registration of a communication ring
+    # to receive messages from any other domain.
+    register_any_source
+}
diff --git a/xen/xsm/flask/policy/security_classes b/xen/xsm/flask/policy/security_classes
index cde4e1a..50ecbab 100644
--- a/xen/xsm/flask/policy/security_classes
+++ b/xen/xsm/flask/policy/security_classes
@@ -19,5 +19,6 @@ class event
 class grant
 class security
 class version
+class argo
 
 # FLASK
-- 
2.7.4



* [PATCH v2 10/18] xsm, argo: XSM control for argo message send operation
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (8 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 09/18] xsm, argo: XSM control for argo register; add argo_mac bootparam Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 11/18] argo: implement the register op Christopher Clark
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel, Paul Durrant, Daniel De Graaf
  Cc: Ross Philipson, Jason Andryuk, Daniel Smith, James McKenzie,
	Rich Persaud, Eric Chanudet

Default policy: allow.
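
For illustration (domain type names invented), a Flask policy grants the new
permission per pair of domain types with a rule such as:

    allow client_t server_t:argo send;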

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
v1 version was:
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
but this has been modified since v1 based on Jan's feedback to apply const
to function signatures.

=> Paul, does the Reviewed-by still stand?

 xen/include/xsm/dummy.h             | 6 ++++++
 xen/include/xsm/xsm.h               | 6 ++++++
 xen/xsm/dummy.c                     | 1 +
 xen/xsm/flask/hooks.c               | 7 +++++++
 xen/xsm/flask/policy/access_vectors | 2 ++
 5 files changed, 22 insertions(+)

diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 55113c3..05d10b5 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -733,6 +733,12 @@ static XSM_INLINE int xsm_argo_register_any_source(struct domain *d,
     return strict ? -EPERM : 0;
 }
 
+static XSM_INLINE int xsm_argo_send(const struct domain *d,
+                                    const struct domain *t)
+{
+    return 0;
+}
+
 #endif /* CONFIG_ARGO */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index e775a6d..4d4a60c 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -185,6 +185,7 @@ struct xsm_operations {
     int (*argo_register_single_source) (const struct domain *d,
                                         const struct domain *t);
     int (*argo_register_any_source) (const struct domain *d);
+    int (*argo_send) (const struct domain *d, const struct domain *t);
 #endif
 };
 
@@ -715,6 +716,11 @@ static inline int xsm_argo_register_any_source(const struct domain *d, bool strict)
     return xsm_ops->argo_register_any_source(d);
 }
 
+static inline int xsm_argo_send(const struct domain *d, const struct domain *t)
+{
+    return xsm_ops->argo_send(d, t);
+}
+
 #endif /* CONFIG_ARGO */
 
 #endif /* XSM_NO_WRAPPERS */
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index ed236b0..ffac774 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,5 +155,6 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 #ifdef CONFIG_ARGO
     set_to_dummy_if_null(ops, argo_register_single_source);
     set_to_dummy_if_null(ops, argo_register_any_source);
+    set_to_dummy_if_null(ops, argo_send);
 #endif
 }
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index fcb7487..76c012c 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1732,6 +1732,12 @@ static int flask_argo_register_any_source(const struct domain *d)
     return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
                         ARGO__REGISTER_ANY_SOURCE, NULL);
 }
+
+static int flask_argo_send(const struct domain *d, const struct domain *t)
+{
+    return domain_has_perm(d, t, SECCLASS_ARGO, ARGO__SEND);
+}
+
 #endif
 
 long do_flask_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_flask_op);
@@ -1871,6 +1877,7 @@ static struct xsm_operations flask_ops = {
 #ifdef CONFIG_ARGO
     .argo_register_single_source = flask_argo_register_single_source,
     .argo_register_any_source = flask_argo_register_any_source,
+    .argo_send = flask_argo_send,
 #endif
 };
 
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index fb95c97..f6c5377 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -541,4 +541,6 @@ class argo
     # Domain requesting registration of a communication ring
     # to receive messages from any other domain.
     register_any_source
+    # Domain sending a message to another domain.
+    send
 }
-- 
2.7.4



* [PATCH v2 11/18] argo: implement the register op
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (9 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 10/18] xsm, argo: XSM control for argo message send operation Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20 11:20   ` Julien Grall
  2018-12-20  6:39 ` [PATCH v2 12/18] argo: implement the unregister op Christopher Clark
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Used by a domain to register a region of memory for receiving messages from
either a specified other domain, or, if specifying a wildcard, any domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.

In this code, the p2m type of the memory supplied by the guest for the ring
must be p2m_ram_rw, which is a conservative choice made to defer the need to
reason about the other p2m types with this commit.

The xen_argo_page_descr_t type is introduced as a page descriptor, conveying
both the physical address of the start of the page and its granularity. The
smallest page granularity is assumed to be 4096 bytes and the lower twelve
bits of the type are used to indicate an enumerated page size.
An argo_pfn_t type is introduced as a pfn type that is 64-bit on all
architectures, to help avoid the need to add a compat ABI.
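
As a rough guest-side sketch of populating the descriptor array for a ring
backed by 4K pages (illustrative only: the hypercall wrapper name and the
page address array are hypothetical), per the interface documented in the
public header below:

    xen_argo_page_descr_t descrs[NPAGES];
    unsigned int i;

    for ( i = 0; i < NPAGES; i++ )
        /* 4K-aligned physical address of each ring page, plus size code */
        descrs[i] = ring_page_physaddr[i] | XEN_ARGO_PAGE_DESCR_SIZE_4K;

    /* 'ring' points at the xen_argo_ring header at the start of page 0;
     * arg1: ring, arg2: descriptors, arg3: npages, arg4: flags */
    rc = argo_hypercall(XEN_ARGO_MESSAGE_OP_register_ring,
                        ring, descrs, NPAGES, 0);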

copy_field_to_guest_errno is added to the guest access headers, performing
the same operation as copy_field_to_guest but returning -EFAULT if the copy
is incomplete. This simplifies code at the call sites.
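
For instance, a call site that previously had to translate a non-zero
copy_field_to_guest return value into -EFAULT itself reduces to:

    ret = copy_field_to_guest_errno(ring_hnd, &ring, id);
    if ( ret )
        goto out;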

Uses array_index_nospec to guard the result of the hash function.
This is out of an abundance of caution: it is a very basic hash function,
chosen more for its bucket distribution properties (clustering related
rings) than for cryptographic strength or uniformity of output, and it
operates upon values supplied by the guest just before they are used as an
array index.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #13 feedback, Jan: register op : s/ECONNREFUSED/ESRCH/
v1 #5 (#13) feedback Paul: register op: use currd in do_message_op
v1 #13 feedback, Paul: register op: use mfn_eq comparator
v1 #5 (#13) feedback Paul: register op: use currd in argo_register_ring
v1 #13 feedback Paul: register op: whitespace, unsigned, bounds check
v1 #13 feedback Paul: use of hex in limit constant definition
v1 #13 feedback Paul, register op: set nmfns on loop termination
v1 #13 feedback Paul: register op: do/while -> gotos, reindent
v1 argo_ring_map_page: drop uint32_t for unsigned int
v1. #13 feedback Julien: use page descriptors instead of gpfns.
   - adds ABI support for pages with different granularity.
v1 feedback #13, Paul: adjust log level of message
v1 feedback #13, Paul: use gprintk for guest-triggered warning
v1 feedback #13, Paul: gprintk and XENLOG_DEBUG for ring registration
v1 feedback #13, Paul: use gprintk for errs in argo_ring_map_page
v1 feedback #13, Paul: use ENOMEM if global mapping fails
v1 feedback Paul: overflow check before shift
v1: add define for copy_field_to_guest_errno
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: use copy_field_to_guest_errno
v1 feedback #13, Jan: argo_hash_fn: no inline, rename, change type
v1 feedback #13, Paul, Jan: EFAULT -> ENOMEM in argo_ring_map_page
v1 feedback #13, Jan: rename page var in argo_ring_map_page
v1 feedback #13, Jan: switch uint8_t* to void* and drop cast
v1 feedback #13, Jan: switch memory barrier to smp_wmb
v1 feedback #13, Jan: make 'ring' comment comply with single-line style
v1 feedback #13, Jan: use xzalloc_array, drop loop NULL init
v1 feedback #13, Jan: init bool with false rather than 0
v1 feedback #13 Jan: use __copy; define and use __copy_field_to_guest_errno
v1 feedback #13, Jan: use xzalloc, drop individual init zeroes
v1 feedback #13, Jan: prefix public namespace with xen
v1 feedback #13, Jan: blank line after op case in do_argo_message_op
v1 self: reflow comment in argo_ring_map_page to within 80 char len
v1 feedback #13, Roger: use true not 1 in assign to update_tx_ptr bool
v1 feedback #21, Jan: fold in the array_index_nospec hash function guards
v1 feedback #18, Jan: fold the max ring count limit into the series
v1 self: use unsigned long type for XEN_ARGO_REGISTER_FLAG_MASK
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1. feedback #13 Jan: add comment re: page alignment
v1. self: confirm ring magic presence in supplied page array
v1. feedback #13 Jan: add comment re: minimum ring size
v1. feedback #13 Roger: use ASSERT_UNREACHABLE
v1. feedback Roger: add comment to hash function

 xen/common/argo.c                  | 621 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/guest_access.h |  12 +
 xen/include/asm-x86/guest_access.h |  12 +
 xen/include/public/argo.h          |  71 +++++
 4 files changed, 716 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index abfc1f0..81f8341 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -23,14 +23,19 @@
 #include <xen/event.h>
 #include <xen/domain_page.h>
 #include <xen/guest_access.h>
+#include <xen/nospec.h>
 #include <xen/time.h>
 #include <public/argo.h>
 
 #define ARGO_MAX_RINGS_PER_DOMAIN       128U
 
+DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
 
+/* pfn type: 64-bit on all architectures */
+typedef uint64_t argo_pfn_t;
+
 /* Xen command line option to enable argo */
 static bool __read_mostly opt_argo_enabled;
 boolean_param("argo", opt_argo_enabled);
@@ -104,6 +109,31 @@ struct argo_domain
 };
 
 /*
+ * This hash function is used to distribute rings within the per-domain
+ * hash table (d->argo->ring_hash). The hash table will provide a
+ * 'ring_info' struct if a match is found with a 'xen_argo_ring_id' key:
+ * ie. the key is a (domain id, port, partner domain id) tuple.
+ * There aren't many hash table buckets, and this doesn't need to be
+ * cryptographically robust. Since port number varies the most in
+ * expected use, and the Linux driver allocates at both the high and
+ * low ends, incorporate high and low bits to help with distribution.
+ */
+static unsigned int
+argo_hash(const struct xen_argo_ring_id *id)
+{
+    unsigned int ret;
+
+    ret = (uint16_t)(id->addr.port >> 16);
+    ret ^= (uint16_t)id->addr.port;
+    ret ^= id->addr.domain_id;
+    ret ^= id->partner;
+
+    ret &= (ARGO_HTABLE_SIZE - 1);
+
+    return ret;
+}
+
+/*
  * locks
  */
 
@@ -177,6 +207,84 @@ argo_ring_unmap(struct argo_ring_info *ring_info)
     }
 }
 
+/* caller must have L3 or W(L2) */
+static int
+argo_ring_map_page(struct argo_ring_info *ring_info, unsigned int i,
+                   void **out_ptr)
+{
+    if ( i >= ring_info->nmfns )
+    {
+        gprintk(XENLOG_ERR,
+            "argo: ring (vm%u:%x vm%d) %p attempted to map page  %u of %u\n",
+            ring_info->id.addr.domain_id, ring_info->id.addr.port,
+            ring_info->id.partner, ring_info, i, ring_info->nmfns);
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfns || !ring_info->mfn_mapping)
+    {
+        ASSERT_UNREACHABLE();
+        ring_info->len = 0;
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfn_mapping[i] )
+    {
+        /*
+         * TODO:
+         * The first page of the ring contains the ring indices, so both read
+         * and write access to the page is required by the hypervisor, but
+         * read-access is not needed for this mapping for the remainder of the
+         * ring.
+         * Since this mapping will remain resident in Xen's address space for
+         * the lifetime of the ring, and following the principle of least
+         * privilege, it could be preferable to:
+         *  # add a XSM check to determine what policy is wanted here
+         *  # depending on the XSM query, optionally create this mapping as
+         *    _write-only_ on platforms that can support it.
+         *    (eg. Intel EPT/AMD NPT).
+         */
+        ring_info->mfn_mapping[i] = map_domain_page_global(ring_info->mfns[i]);
+
+        if ( !ring_info->mfn_mapping[i] )
+        {
+            gprintk(XENLOG_ERR,
+                "argo: ring (vm%u:%x vm%d) %p attempted to map page %u of %u\n",
+                ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                ring_info->id.partner, ring_info, i, ring_info->nmfns);
+            return -ENOMEM;
+        }
+        argo_dprintk("mapping page %"PRI_mfn" to %p\n",
+               mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+    }
+
+    if ( out_ptr )
+        *out_ptr = ring_info->mfn_mapping[i];
+
+    return 0;
+}
+
+/* caller must have L3 or W(L2) */
+static int
+argo_update_tx_ptr(struct argo_ring_info *ring_info, uint32_t tx_ptr)
+{
+    void *dst;
+    uint32_t *p;
+    int ret;
+
+    ret = argo_ring_map_page(ring_info, 0, &dst);
+    if ( ret )
+        return ret;
+
+    ring_info->tx_ptr = tx_ptr;
+
+    p = dst + offsetof(xen_argo_ring_t, tx_ptr);
+    write_atomic(p, tx_ptr);
+    smp_wmb();
+
+    return 0;
+}
+
 /*
  * pending
  */
@@ -240,6 +348,488 @@ argo_ring_remove_info(struct domain *d, struct argo_ring_info *ring_info)
     xfree(ring_info);
 }
 
+/* ring */
+
+static int
+argo_find_ring_mfn(struct domain *d, argo_pfn_t pfn, mfn_t *mfn)
+{
+    p2m_type_t p2mt;
+    int ret = 0;
+
+#ifdef CONFIG_X86
+    *mfn = get_gfn_unshare(d, pfn, &p2mt);
+#else
+    *mfn = p2m_lookup(d, _gfn(pfn), &p2mt);
+#endif
+
+    if ( !mfn_valid(*mfn) )
+        ret = -EINVAL;
+#ifdef CONFIG_X86
+    else if ( p2m_is_paging(p2mt) || (p2mt == p2m_ram_logdirty) )
+        ret = -EAGAIN;
+#endif
+    else if ( (p2mt != p2m_ram_rw) ||
+              !get_page_and_type(mfn_to_page(*mfn), d, PGT_writable_page) )
+        ret = -EINVAL;
+
+#ifdef CONFIG_X86
+    put_gfn(d, pfn);
+#endif
+
+    return ret;
+}
+
+static int
+argo_find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
+                    uint32_t npage,
+                    XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd,
+                    uint32_t len)
+{
+    unsigned int i;
+    int ret = 0;
+
+    /*
+     * first bounds check on npage here also serves as an overflow check
+     * before left shifting it
+     */
+    if ( (unlikely(npage > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT))) ||
+         ((npage << PAGE_SHIFT) < len) )
+        return -EINVAL;
+
+    if ( ring_info->mfns )
+    {
+        /*
+         * Ring already existed. Check if it's the same ring,
+         * i.e. same number of pages and all translated gpfns still
+         * translating to the same mfns
+         */
+        if ( ring_info->npage != npage )
+            i = ring_info->nmfns + 1; /* forces re-register below */
+        else
+        {
+            for ( i = 0; i < ring_info->nmfns; i++ )
+            {
+                xen_argo_page_descr_t pg_descr;
+                argo_pfn_t pfn;
+                mfn_t mfn;
+
+                ret = copy_from_guest_offset_errno(&pg_descr, pg_descr_hnd,
+                                                   i, 1);
+                if ( ret )
+                    break;
+
+                /* Implementation currently only supports handling 4K pages */
+                if ( (pg_descr & XEN_ARGO_PAGE_DESCR_SIZE_MASK) !=
+                        XEN_ARGO_PAGE_DESCR_SIZE_4K )
+                {
+                    ret = -EINVAL;
+                    break;
+                }
+                pfn = pg_descr >> PAGE_SHIFT;
+
+                ret = argo_find_ring_mfn(d, pfn, &mfn);
+                if ( ret )
+                    break;
+
+                if ( !mfn_eq(mfn, ring_info->mfns[i]) )
+                    break;
+            }
+        }
+        if ( i != ring_info->nmfns )
+        {
+            /* Re-register is standard procedure after resume */
+            gprintk(XENLOG_INFO,
+        "argo: vm%u re-register existing ring (vm%u:%x vm%d) clears MFN list\n",
+                    current->domain->domain_id, ring_info->id.addr.domain_id,
+                    ring_info->id.addr.port, ring_info->id.partner);
+
+            argo_ring_remove_mfns(d, ring_info);
+            ASSERT(!ring_info->mfns);
+        }
+    }
+
+    if ( !ring_info->mfns )
+    {
+        mfn_t *mfns;
+        uint8_t **mfn_mapping;
+
+        mfns = xmalloc_array(mfn_t, npage);
+        if ( !mfns )
+            return -ENOMEM;
+
+        for ( i = 0; i < npage; i++ )
+            mfns[i] = INVALID_MFN;
+
+        mfn_mapping = xzalloc_array(uint8_t *, npage);
+        if ( !mfn_mapping )
+        {
+            xfree(mfns);
+            return -ENOMEM;
+        }
+
+        ring_info->npage = npage;
+        ring_info->mfns = mfns;
+        ring_info->mfn_mapping = mfn_mapping;
+    }
+    ASSERT(ring_info->npage == npage);
+
+    if ( ring_info->nmfns == ring_info->npage )
+        return 0;
+
+    for ( i = ring_info->nmfns; i < ring_info->npage; i++ )
+    {
+        xen_argo_page_descr_t pg_descr;
+        argo_pfn_t pfn;
+        mfn_t mfn;
+
+        ret = copy_from_guest_offset_errno(&pg_descr, pg_descr_hnd, i, 1);
+        if ( ret )
+            break;
+
+        /* Implementation currently only supports handling 4K pages */
+        if ( (pg_descr & XEN_ARGO_PAGE_DESCR_SIZE_MASK) !=
+                XEN_ARGO_PAGE_DESCR_SIZE_4K )
+        {
+            ret = -EINVAL;
+            break;
+        }
+        pfn = pg_descr >> PAGE_SHIFT;
+
+        ret = argo_find_ring_mfn(d, pfn, &mfn);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+        "argo: vm%u: invalid gpfn %"PRI_xen_pfn" r:(vm%u:%x vm%d) %p %d/%d\n",
+                    d->domain_id, pfn, ring_info->id.addr.domain_id,
+                    ring_info->id.addr.port, ring_info->id.partner,
+                    ring_info, i, ring_info->npage);
+            break;
+        }
+
+        ring_info->mfns[i] = mfn;
+
+        argo_dprintk("%d: %"PRI_xen_pfn" -> %"PRI_mfn"\n",
+               i, pfn, mfn_x(ring_info->mfns[i]));
+    }
+
+    ring_info->nmfns = i;
+
+    if ( ret )
+        argo_ring_remove_mfns(d, ring_info);
+    else
+    {
+        ASSERT(ring_info->nmfns == ring_info->npage);
+
+        gprintk(XENLOG_DEBUG,
+        "argo: vm%u ring (vm%u:%x vm%d) %p mfn_mapping %p npage %d nmfns %d\n",
+                current->domain->domain_id,
+                ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                ring_info->id.partner, ring_info, ring_info->mfn_mapping,
+                ring_info->npage, ring_info->nmfns);
+    }
+
+    return ret;
+}
+
+static struct argo_ring_info *
+argo_ring_find_info(const struct domain *d, const struct xen_argo_ring_id *id)
+{
+    unsigned int hash;
+    struct hlist_node *node;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    hash = array_index_nospec(argo_hash(id), ARGO_HTABLE_SIZE);
+
+    argo_dprintk("d->argo=%p, d->argo->ring_hash[%u]=%p id=%p\n",
+                 d->argo, hash, d->argo->ring_hash[hash].first, id);
+    argo_dprintk("id.addr.port=%x id.addr.domain=vm%u"
+                 " id.addr.partner=vm%d\n",
+                 id->addr.port, id->addr.domain_id, id->partner);
+
+    hlist_for_each_entry(ring_info, node, &d->argo->ring_hash[hash], node)
+    {
+        xen_argo_ring_id_t *cmpid = &ring_info->id;
+
+        if ( cmpid->addr.port == id->addr.port &&
+             cmpid->addr.domain_id == id->addr.domain_id &&
+             cmpid->partner == id->partner )
+        {
+            argo_dprintk("ring_info=%p\n", ring_info);
+            return ring_info;
+        }
+    }
+    argo_dprintk("no ring_info found\n");
+
+    return NULL;
+}
+
+static int
+argo_verify_ring_magic(struct argo_ring_info *ring_info)
+{
+    void *dst;
+    xen_argo_ring_t *ring;
+    int ret;
+
+    ret = argo_ring_map_page(ring_info, 0, &dst);
+    if ( ret )
+        return ret;
+
+    ring = dst;
+    mb();
+
+    if ( ring->magic != XEN_ARGO_RING_MAGIC )
+        return -EINVAL;
+
+    return 0;
+}
+
+static long
+argo_register_ring(struct domain *currd,
+                   XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd,
+                   XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd,
+                   uint32_t npage, bool fail_exist)
+{
+    struct xen_argo_ring ring;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+    bool update_tx_ptr = false;
+    uint64_t dst_domain_cookie = 0;
+
+    /*
+     * Verify the alignment of the ring data structure supplied with the
+     * understanding that the ring handle supplied points to the same memory as
+     * the first entry in the array of pages provided via pg_descr_hnd, where
+     * the head of the ring will reside.
+     * See argo_update_tx_ptr where the location of the tx_ptr is accessed at a
+     * fixed offset from head of the first page in the mfn array.
+     */
+    if ( !(guest_handle_is_aligned(ring_hnd, ~PAGE_MASK)) )
+        return -EINVAL;
+
+    read_lock(&argo_lock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( copy_from_guest(&ring, ring_hnd, 1) )
+    {
+        ret = -EFAULT;
+        goto out_unlock;
+    }
+
+    if ( ring.magic != XEN_ARGO_RING_MAGIC )
+    {
+        ret = -EINVAL;
+        goto out_unlock;
+    }
+
+    /*
+     * A ring must be large enough to transmit messages, which requires room for
+     * at least:
+     *  * one message header, and
+     *  * one payload slot (payload is always rounded to a multiple of 16 bytes)
+     * and the ring does not allow filling to capacity with a single message --
+     * see logic in argo_ringbuf_insert -- so there must be space remaining when
+     * a single message is present. This determines minimum ring size.
+     * In addition, the ring size must be aligned with the payload rounding.
+     */
+    if ( (ring.len < (sizeof(struct xen_argo_ring_message_header)
+                      + XEN_ARGO_ROUNDUP(1) + XEN_ARGO_ROUNDUP(1))) ||
+         (XEN_ARGO_ROUNDUP(ring.len) != ring.len) )
+    {
+        ret = -EINVAL;
+        goto out_unlock;
+    }
+
+    if ( ring.len > XEN_ARGO_MAX_RING_SIZE )
+    {
+        ret = -EINVAL;
+        goto out_unlock;
+    }
+
+    if ( ring.id.partner == XEN_ARGO_DOMID_ANY )
+    {
+        ret = xsm_argo_register_any_source(currd,
+                                           argo_mac_bootparam_enforcing);
+        if ( ret )
+            goto out_unlock;
+    }
+    else
+    {
+        struct domain *dst_d = get_domain_by_id(ring.id.partner);
+
+        if ( !dst_d )
+        {
+            argo_dprintk("!dst_d, ESRCH\n");
+            ret = -ESRCH;
+            goto out_unlock;
+        }
+
+        ret = xsm_argo_register_single_source(currd, dst_d);
+        if ( ret )
+        {
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
+        if ( !dst_d->argo )
+        {
+            argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+            ret = -ECONNREFUSED;
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
+        dst_domain_cookie = dst_d->argo->domain_cookie;
+
+        put_domain(dst_d);
+    }
+
+    ring.id.addr.domain_id = currd->domain_id;
+    ret = __copy_field_to_guest_errno(ring_hnd, &ring, id);
+    if ( ret )
+        goto out_unlock;
+
+    /*
+     * No need for a lock yet, because only we know about this ring.
+     * Set the tx pointer if it looks bogus (we don't unconditionally reset
+     * it because this might be a re-register after S4).
+     */
+
+    if ( ring.tx_ptr >= ring.len ||
+         XEN_ARGO_ROUNDUP(ring.tx_ptr) != ring.tx_ptr )
+    {
+        /*
+         * Since the ring is a mess, attempt to flush the contents of it
+         * here by setting the tx_ptr to the next aligned message slot past
+         * the latest rx_ptr we have observed. Handle ring wrap correctly.
+         */
+        ring.tx_ptr = XEN_ARGO_ROUNDUP(ring.rx_ptr);
+
+        if ( ring.tx_ptr >= ring.len )
+            ring.tx_ptr = 0;
+
+        /* ring.tx_ptr will be written back to the guest ring below. */
+        update_tx_ptr = true;
+    }
+
+    /* W(L2) protects all the elements of the domain's ring_info */
+    write_lock(&currd->argo->lock);
+
+    if ( currd->argo->ring_count >= ARGO_MAX_RINGS_PER_DOMAIN )
+    {
+        ret = -ENOSPC;
+        goto out_unlock2;
+    }
+
+    ring_info = argo_ring_find_info(currd, &ring.id);
+    if ( !ring_info )
+    {
+        unsigned int hash;
+
+        ring_info = xzalloc(struct argo_ring_info);
+        if ( !ring_info )
+        {
+            ret = -ENOMEM;
+            goto out_unlock2;
+        }
+
+        spin_lock_init(&ring_info->lock);
+
+        ring_info->partner_cookie = dst_domain_cookie;
+        ring_info->id = ring.id;
+
+        INIT_HLIST_HEAD(&ring_info->pending);
+
+        hash = array_index_nospec(argo_hash(&ring_info->id),
+                                  ARGO_HTABLE_SIZE);
+        hlist_add_head(&ring_info->node, &currd->argo->ring_hash[hash]);
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%d)\n",
+                currd->domain_id, ring.id.addr.domain_id,
+                ring.id.addr.port, ring.id.partner);
+    }
+    else
+    {
+        /*
+         * If the caller specified that the ring must not already exist,
+         * fail at attempt to add a completed ring which already exists.
+         */
+        if ( fail_exist && ring_info->len )
+        {
+            ret = -EEXIST;
+            goto out_unlock2;
+        }
+
+        gprintk(XENLOG_DEBUG,
+            "argo: vm%u re-registering existing ring (vm%u:%x vm%d)\n",
+             currd->domain_id, ring.id.addr.domain_id,
+             ring.id.addr.port, ring.id.partner);
+    }
+
+    /* Since we hold W(L2), there is no need to take L3 here */
+    ring_info->tx_ptr = ring.tx_ptr;
+
+    ret = argo_find_ring_mfns(currd, ring_info, npage, pg_descr_hnd,
+                              ring.len);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+            "argo: vm%u failed to find ring mfns (vm%u:%x vm%d)\n",
+             currd->domain_id, ring.id.addr.domain_id,
+             ring.id.addr.port, ring.id.partner);
+
+        goto out_unlock2;
+    }
+
+    /*
+     * Safety check to confirm that the memory supplied is intended for
+     * use as a ring. This will map the first page of the ring.
+     */
+    ret = argo_verify_ring_magic(ring_info);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+            "argo: vm%u register memory mismatch (vm%u:%x vm%d)\n",
+             currd->domain_id, ring.id.addr.domain_id,
+             ring.id.addr.port, ring.id.partner);
+
+        argo_ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+
+    if ( update_tx_ptr )
+    {
+        ret = argo_update_tx_ptr(ring_info, ring.tx_ptr);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+                "argo: vm%u failed to write tx_ptr (vm%u:%x vm%d)\n",
+                 currd->domain_id, ring.id.addr.domain_id,
+                 ring.id.addr.port, ring.id.partner);
+
+            argo_ring_remove_info(currd, ring_info);
+            goto out_unlock2;
+        }
+    }
+
+    ring_info->len = ring.len;
+    currd->argo->ring_count++;
+
+ out_unlock2:
+    write_unlock(&currd->argo->lock);
+
+ out_unlock:
+    read_unlock(&argo_lock);
+
+    return ret;
+}
+
 long
 do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                    XEN_GUEST_HANDLE_PARAM(void) arg2,
@@ -261,6 +851,37 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_MESSAGE_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd =
+            guest_handle_cast(arg1, xen_argo_ring_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd =
+            guest_handle_cast(arg2, xen_argo_page_descr_t);
+        /* arg3 is npage */
+        /* arg4 is flags */
+        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
+
+        if ( unlikely(!guest_handle_okay(ring_hnd, 1)) )
+            break;
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+        if ( unlikely(!guest_handle_okay(pg_descr_hnd, arg3)) )
+            break;
+        /* arg4: reserve currently-undefined bits, require zero.  */
+        if ( unlikely(arg4 & ~XEN_ARGO_REGISTER_FLAG_MASK) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = argo_register_ring(currd, ring_hnd, pg_descr_hnd, arg3,
+                                fail_exist);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 729f71e..5456d81 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -29,6 +29,8 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
 /* Is the guest handle a NULL reference? */
 #define guest_handle_is_null(hnd)        ((hnd).p == NULL)
 
+#define guest_handle_is_aligned(hnd, mask) (!((uintptr_t)(hnd).p & (mask)))
+
 /* Offset the given guest handle into the array it refers to. */
 #define guest_handle_add_offset(hnd, nr) ((hnd).p += (nr))
 #define guest_handle_subtract_offset(hnd, nr) ((hnd).p -= (nr))
@@ -112,6 +114,11 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     raw_copy_to_guest(_d, _s, sizeof(*_s));             \
 })
 
+/* Errno-returning variant of copy_field_to_guest */
+#define copy_field_to_guest_errno(hnd, ptr, field)      \
+    (copy_field_to_guest((hnd), (ptr), field) ?         \
+        -EFAULT : 0)
+
 /* Copy sub-field of a structure from guest context via a guest handle. */
 #define copy_field_from_guest(ptr, hnd, field) ({       \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
@@ -151,6 +158,11 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     __raw_copy_to_guest(_d, _s, sizeof(*_s));           \
 })
 
+/* Errno-returning variant of __copy_field_to_guest */
+#define __copy_field_to_guest_errno(hnd, ptr, field)    \
+    (__copy_field_to_guest((hnd), (ptr), field) ?       \
+        -EFAULT : 0)
+
 #define __copy_field_from_guest(ptr, hnd, field) ({     \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
     typeof(&(ptr)->field) _d = &(ptr)->field;           \
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index 9399480..9176150 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -41,6 +41,8 @@
 /* Is the guest handle a NULL reference? */
 #define guest_handle_is_null(hnd)        ((hnd).p == NULL)
 
+#define guest_handle_is_aligned(hnd, mask) (!((uintptr_t)(hnd).p & (mask)))
+
 /* Offset the given guest handle into the array it refers to. */
 #define guest_handle_add_offset(hnd, nr) ((hnd).p += (nr))
 #define guest_handle_subtract_offset(hnd, nr) ((hnd).p -= (nr))
@@ -117,6 +119,11 @@
     raw_copy_to_guest(_d, _s, sizeof(*_s));             \
 })
 
+/* Errno-returning variant of copy_field_to_guest */
+#define copy_field_to_guest_errno(hnd, ptr, field)      \
+    (copy_field_to_guest((hnd), (ptr), field) ?         \
+        -EFAULT : 0)
+
 /* Copy sub-field of a structure from guest context via a guest handle. */
 #define copy_field_from_guest(ptr, hnd, field) ({       \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
@@ -162,6 +169,11 @@
     __raw_copy_to_guest(_d, _s, sizeof(*_s));           \
 })
 
+/* Errno-returning variant of __copy_field_to_guest */
+#define __copy_field_to_guest_errno(hnd, ptr, field)    \
+    (__copy_field_to_guest((hnd), (ptr), field) ?       \
+        -EFAULT : 0)
+
 #define __copy_field_from_guest(ptr, hnd, field) ({     \
     const typeof(&(ptr)->field) _s = &(hnd).p->field;   \
     typeof(&(ptr)->field) _d = &(ptr)->field;           \
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index a32fb2d..e73faea 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -31,6 +31,27 @@
 
 #include "xen.h"
 
+#define XEN_ARGO_RING_MAGIC      0xbd67e163e7777f2fULL
+#define XEN_ARGO_DOMID_ANY       DOMID_INVALID
+
+/*
+ * The maximum size of an Argo ring is defined to be: 16MB
+ *  -- which is 0x1000000 bytes.
+ * A byte index into the ring is at most 24 bits.
+ */
+#define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
+
+/*
+ * Page descriptor: encoding both page address and size in a 64-bit value.
+ * Intended to allow ABI to support use of different granularity pages.
+ * example of how to populate:
+ * xen_argo_page_descr_t pg_desc =
+ *      (physaddr & PAGE_MASK) | XEN_ARGO_PAGE_DESCR_SIZE_4K;
+ */
+typedef uint64_t xen_argo_page_descr_t;
+#define XEN_ARGO_PAGE_DESCR_SIZE_MASK   0x0000000000000fffULL
+#define XEN_ARGO_PAGE_DESCR_SIZE_4K     0
+
 typedef struct xen_argo_addr
 {
     uint32_t port;
@@ -67,4 +88,54 @@ typedef struct xen_argo_ring
 #endif
 } xen_argo_ring_t;
 
+/*
+ * Messages on the ring are padded to 128 bits.
+ * Len here refers to the exact length of the data, not including the
+ * 128-bit message header. The message occupies
+ * ((len + 0xf) & ~0xf) + sizeof(struct xen_argo_ring_message_header) bytes.
+ * The typeof(a) cast ensures the mask does not truncate any high-order bits.
+ */
+#define XEN_ARGO_ROUNDUP(a) (((a) + 0xf) & ~(typeof(a))0xf)
+
+struct xen_argo_ring_message_header
+{
+    uint32_t len;
+    xen_argo_addr_t source;
+    uint32_t message_type;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t data[];
+#elif defined(__GNUC__)
+    uint8_t data[0];
+#endif
+};
+
+/*
+ * Hypercall operations
+ */
+
+/*
+ * XEN_ARGO_MESSAGE_OP_register_ring
+ *
+ * Register a ring using the indicated memory.
+ * Also used to reregister an existing ring (eg. after resume from sleep).
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_t)
+ * arg2: XEN_GUEST_HANDLE(xen_argo_page_descr_t)
+ * arg3: unsigned long npages
+ * arg4: unsigned long flags
+ */
+#define XEN_ARGO_MESSAGE_OP_register_ring     1
+
+/* Register op flags */
+/*
+ * Fail exist:
+ * If set, reject attempts to (re)register an existing established ring.
+ * If clear, reregistration occurs if the ring exists, with the new ring
+ * taking the place of the old, preserving tx_ptr if it remains valid.
+ */
+#define XEN_ARGO_REGISTER_FLAG_FAIL_EXIST  0x1
+
+/* Mask for all defined flags. unsigned long type so ok for both 32/64-bit */
+#define XEN_ARGO_REGISTER_FLAG_MASK 0x1UL
+
 #endif
-- 
2.7.4



* [PATCH v2 12/18] argo: implement the unregister op
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (10 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 11/18] argo: implement the register op Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 13/18] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Takes a single argument: a handle to the registered ring.

The ring's entry is removed from the hashtable of registered rings;
any entries for pending notifications are removed; and the ring is
unmapped from Xen's address space.
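
A guest-side sketch of the call, per the interface documented in the public
header (the hypercall wrapper name and ring id variable are hypothetical):

    xen_argo_ring_t ring = {
        .magic = XEN_ARGO_RING_MAGIC,
        .id = registered_ring_id,   /* the id supplied at registration */
    };

    rc = argo_hypercall(XEN_ARGO_MESSAGE_OP_unregister_ring,
                        &ring, NULL, 0, 0);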

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #5 (#14) feedback Paul: use currd in do_argo_message_op
v1 #5 (#14) feedback Paul: full use currd in argo_unregister_ring
v1 #13 (#14) feedback Paul: replace do/while with goto; reindent
v1 self: add blank lines in unregister case in do_argo_message_op
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: blank line after op case in do_argo_message_op
v1: #14 feedback Jan: replace domain id override with validation
v1: #18 feedback Jan: meld the ring count limit into the series
v1: feedback #15 Jan: verify zero in unused hypercall args

 xen/common/argo.c         | 76 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/argo.h | 12 ++++++++
 2 files changed, 88 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 81f8341..cbb17a3 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -565,6 +565,65 @@ argo_ring_find_info(const struct domain *d, const struct xen_argo_ring_id *id)
     return NULL;
 }
 
+static long
+argo_unregister_ring(struct domain *currd,
+                     XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd)
+{
+    struct xen_argo_ring ring;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+
+    read_lock(&argo_lock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out;
+    }
+
+    ret = copy_from_guest_errno(&ring, ring_hnd, 1);
+    if ( ret )
+        goto out;
+
+    if ( ring.magic != XEN_ARGO_RING_MAGIC )
+    {
+        argo_dprintk(
+            "ring.magic(%"PRIx64") != XEN_ARGO_RING_MAGIC(%llx), EINVAL\n",
+            ring.magic, XEN_ARGO_RING_MAGIC);
+        ret = -EINVAL;
+        goto out;
+    }
+
+    if ( unlikely(ring.id.addr.domain_id != currd->domain_id) )
+    {
+        ret = -EPERM;
+        goto out;
+    }
+
+    write_lock(&currd->argo->lock);
+
+    ring_info = argo_ring_find_info(currd, &ring.id);
+    if ( ring_info )
+    {
+        argo_ring_remove_info(currd, ring_info);
+        currd->argo->ring_count--;
+    }
+
+    write_unlock(&currd->argo->lock);
+
+    if ( !ring_info )
+    {
+        argo_dprintk("ENOENT\n");
+        ret = -ENOENT;
+        goto out;
+    }
+
+ out:
+    read_unlock(&argo_lock);
+
+    return ret;
+}
+
 static int
 argo_verify_ring_magic(struct argo_ring_info *ring_info)
 {
@@ -882,6 +941,23 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_MESSAGE_OP_unregister_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd =
+            guest_handle_cast(arg1, xen_argo_ring_t);
+
+        if ( unlikely(!guest_handle_okay(ring_hnd, 1)) )
+            break;
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = argo_unregister_ring(currd, ring_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index e73faea..24696e2 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -138,4 +138,16 @@ struct xen_argo_ring_message_header
 /* Mask for all defined flags. unsigned long type so ok for both 32/64-bit */
 #define XEN_ARGO_REGISTER_FLAG_MASK 0x1UL
 
+/*
+ * XEN_ARGO_MESSAGE_OP_unregister_ring
+ *
+ * Unregister a previously-registered ring, ending communication.
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_t)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_MESSAGE_OP_unregister_ring     2
+
 #endif
-- 
2.7.4



* [PATCH v2 13/18] argo: implement the sendv op; evtchn: expose send_guest_global_virq
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (11 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 12/18] argo: implement the unregister op Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 14/18] argo: implement the notify op Christopher Clark
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The sendv operation is invoked to perform a synchronous send of the buffers
described by an array of iovs to a remote domain's registered ring.

It takes:
 * A destination address (domid, port) for the ring to send to.
   It performs a most-specific match lookup, to allow for wildcard.
 * A source address, used to inform the destination of where to reply.
 * The address of an array of iovs containing the data to send
 * .. and the length of that array of iovs
 * and a 32-bit message type, available to communicate message context
   data (eg. kernel-to-kernel, separate from the application data).

If insufficient space exists in the destination ring, it will return
-EAGAIN and Xen will notify the caller when sufficient space becomes
available.

Accesses to the ring indices are appropriately atomic. The rings are
mapped into Xen's private address space to write as needed and the
mappings are retained for later use.

When locating the destination ring, a check is performed via a cookie
installed at ring registration time, to ensure that the source domain
is the same as it was when the ring was registered.

Fixed-size types are used in some areas within this code where caution
around avoiding integer overflow is important.

Notifications are sent to guests via VIRQ and send_guest_global_virq is
exposed in the change to enable argo to call it. VIRQ_ARGO_MESSAGE is
claimed from the VIRQ previously reserved for this purpose (#11).

After consideration, the VIRQ notification method has been selected
rather than sending events using evtchn functions directly because:

* no current event channel type is an exact fit for the intended
  behaviour. ECS_IPI is closest, but it disallows migration to
  other VCPUs which is not necessarily a requirement for Argo.

* at the point of argo_init, allocation of an event channel is
  complicated by none of the guest VCPUs being initialized yet
  and the event channel logic expects that a valid event channel
  has a present VCPU.

* at the point of signalling a notification, the VIRQ logic is already
  defensive: if d->vcpu[0] is NULL, the notification is just silently
  dropped, whereas the evtchn_send logic is not so defensive: vcpu[0]
  must not be NULL, otherwise a null pointer dereference occurs.

Using a VIRQ removes the need for the guest to query to determine which
event channel notifications will be delivered on. This is also likely to
simplify establishing future L0/L1 nested hypervisor argo communication.
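
As an illustrative sketch of the guest side (not part of this patch), a
guest binds the VIRQ through the regular event channel interface and then
receives Argo notifications on the returned port:

    struct evtchn_bind_virq bind = {
        .virq = VIRQ_ARGO_MESSAGE,
        .vcpu = 0,
    };

    if ( HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, &bind) == 0 )
        /* notifications now arrive as events on bind.port */
        argo_setup_event_handler(bind.port);   /* hypothetical helper */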

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #15 feedback, Jan: sendv op : s/ECONNREFUSED/ESRCH/ if ! dest dom
v1 #5 (#15) feedback Paul: sendv: use currd in do_argo_message_op
v1 #13 (#15) feedback Paul: sendv op: do/while -> goto; reindent
v1 #15 feedback Paul: sendv op: make page var: unsigned
v1 #15 feedback Paul: sendv op: new local var for PAGE_SIZE - offset
v1 #8 feedback Jan: XEN_GUEST_HANDLE : C89 compliance
v1 rebase after switching register op from pfns to page descriptors
v1 self: move iov DEFINE_XEN_GUEST_HANDLE out of public header into argo.c
v1 #13 (#15) feedback Paul: fix loglevel for guest-triggered messages
v1 : add compat xlat.lst entries
v1 self: switched notification to send_guest_global_virq instead of event
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: init len variable to satisfy ARM compiler initialized checking
v1 #13 feedback Jan: rename page var
v1:#14 feedback Jan: uint8_t* -> void*
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: blank line after case op in do_argo_message_op
v1: #15 feedback Jan: add comments explaining why the writes don't overrun
v1: self: add ASSERT to support comment that overrun cannot happen
v1: self: fail on short writes where guest manipulated the iov_lens
v1: self: rename ent id to domain_id
v1: self: add moan for iov rewrite
v1. feedback #15 Jan: require the pad bits are zero
v1. feedback #15 Jan: drop NULL check in argo_signal_domain as now using VIRQ
v1. self: store domain_cookie in pending ent
v1. feedback #15 Jan: use unsigned where possible
v1. feedback Jan: use handle type for iov_base in public iov interface
v1. self: log whenever visible error occurs
v1 feedback #15, Jan: drop unnecessary mb
v1 self: only update internal tx_ptr if able to return success
         and update the visible tx_ptr
v1 self: log on failure to map ring to update visible tx_ptr
v1 feedback #15 Jan: add comment re: notification size policy
v1 self/Roger? remove errant space after sizeof
v1. feedback #15 Jan: require iov pad be zero
v1. self: rename iov_base to iov_hnd for handle in public iov interface
v1: feedback #15 Jan: handle upper-halves of hypercall args; changes some
    types in function signatures to match.
v1: self: add dprintk to sendv
v1: self: add debug output to argo_iov_count
v1. feedback #14 Jan: blank line before return in argo_iov_count
v1 feedback #15 Jan: verify src id, not override

 xen/common/argo.c          | 746 +++++++++++++++++++++++++++++++++++++++++++++
 xen/common/event_channel.c |   2 +-
 xen/include/public/argo.h  |  64 ++++
 xen/include/public/xen.h   |   2 +-
 xen/include/xen/event.h    |   7 +
 xen/include/xlat.lst       |   2 +
 6 files changed, 821 insertions(+), 2 deletions(-)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index cbb17a3..ed50415 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -25,13 +25,17 @@
 #include <xen/guest_access.h>
 #include <xen/nospec.h>
 #include <xen/time.h>
+#include <xsm/xsm.h>
 #include <public/argo.h>
 
 #define ARGO_MAX_RINGS_PER_DOMAIN       128U
 
 DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DECLARE_XEN_GUEST_HANDLE_NULL(uint8_t);
 
 /* pfn type: 64-bit on all architectures */
 typedef uint64_t argo_pfn_t;
@@ -182,6 +186,18 @@ static DEFINE_RWLOCK(argo_lock); /* L1 */
 #endif
 
 /*
+ * notification to guests
+ */
+
+static void
+argo_signal_domain(struct domain *d)
+{
+    argo_dprintk("signalling domid:%d\n", d->domain_id);
+
+    send_guest_global_virq(d, VIRQ_ARGO_MESSAGE);
+}
+
+/*
  * ring buffer
  */
 
@@ -285,6 +301,519 @@ argo_update_tx_ptr(struct argo_ring_info *ring_info, uint32_t tx_ptr)
     return 0;
 }
 
+static int
+argo_memcpy_to_guest_ring(struct argo_ring_info *ring_info,
+                          uint32_t offset,
+                          const void *src,
+                          XEN_GUEST_HANDLE(uint8_t) src_hnd,
+                          uint32_t len)
+{
+    unsigned int mfns_index = offset >> PAGE_SHIFT;
+    void *dst;
+    int ret;
+    unsigned int src_offset = 0;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    offset &= ~PAGE_MASK;
+
+    if ( (len > XEN_ARGO_MAX_RING_SIZE) || (offset > XEN_ARGO_MAX_RING_SIZE) )
+        return -EFAULT;
+
+    while ( (offset + len) > PAGE_SIZE )
+    {
+        unsigned int head_len = PAGE_SIZE - offset;
+
+        ret = argo_ring_map_page(ring_info, mfns_index, &dst);
+        if ( ret )
+            return ret;
+
+        if ( src )
+        {
+            memcpy(dst + offset, src + src_offset, head_len);
+            src_offset += head_len;
+        }
+        else
+        {
+            ret = copy_from_guest_errno(dst + offset, src_hnd, head_len);
+            if ( ret )
+                return ret;
+
+            guest_handle_add_offset(src_hnd, head_len);
+        }
+
+        mfns_index++;
+        len -= head_len;
+        offset = 0;
+    }
+
+    ret = argo_ring_map_page(ring_info, mfns_index, &dst);
+    if ( ret )
+    {
+        argo_dprintk("argo: ring (vm%u:%x vm%d) %p attempted to map page"
+               " %d of %d\n", ring_info->id.addr.domain_id,
+               ring_info->id.addr.port, ring_info->id.partner, ring_info,
+               mfns_index, ring_info->nmfns);
+        return ret;
+    }
+
+    if ( src )
+        memcpy(dst + offset, src + src_offset, len);
+    else
+        ret = copy_from_guest_errno(dst + offset, src_hnd, len);
+
+    return ret;
+}
+
+static int
+argo_ringbuf_get_rx_ptr(struct argo_ring_info *ring_info, uint32_t *rx_ptr)
+{
+    void *src;
+    xen_argo_ring_t *ringp;
+    int ret;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    if ( !ring_info->nmfns || ring_info->nmfns < ring_info->npage )
+        return -EINVAL;
+
+    ret = argo_ring_map_page(ring_info, 0, &src);
+    if ( ret )
+        return ret;
+
+    ringp = (xen_argo_ring_t *)src;
+
+    *rx_ptr = read_atomic(&ringp->rx_ptr);
+
+    return 0;
+}
+
+/*
+ * argo_sanitize_ring creates a modified copy of the ring pointers
+ * where the rx_ptr is rounded up to ensure it is aligned, and then
+ * ring wrap is handled. Simplifies safe use of the rx_ptr for
+ * available space calculation.
+ */
+static void
+argo_sanitize_ring(xen_argo_ring_t *ring,
+                   const struct argo_ring_info *ring_info)
+{
+    uint32_t rx_ptr = ring->rx_ptr;
+
+    ring->tx_ptr = ring_info->tx_ptr;
+    ring->len = ring_info->len;
+
+    rx_ptr = XEN_ARGO_ROUNDUP(rx_ptr);
+    if ( rx_ptr >= ring_info->len )
+        rx_ptr = 0;
+
+    ring->rx_ptr = rx_ptr;
+}
+
+/*
+ * argo_iov_count returns its count on success via an out variable
+ * to avoid potential for a negative return value to be used incorrectly
+ * (eg. coerced into an unsigned variable resulting in a large incorrect value)
+ */
+static int
+argo_iov_count(XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs, unsigned long niov,
+               uint32_t *count)
+{
+    xen_argo_iov_t iov;
+    uint32_t sum_iov_lens = 0;
+    int ret;
+
+    if ( niov > XEN_ARGO_MAXIOV )
+        return -EINVAL;
+
+    while ( niov-- )
+    {
+        ret = copy_from_guest_errno(&iov, iovs, 1);
+        if ( ret )
+            return ret;
+
+        /* valid iovs must have the padding field set to zero */
+        if ( iov.pad )
+        {
+            argo_dprintk("invalid iov: padding is not zero\n");
+            return -EINVAL;
+        }
+
+        /* check each to protect sum against integer overflow */
+        if ( iov.iov_len > XEN_ARGO_MAX_RING_SIZE )
+        {
+            argo_dprintk("invalid iov_len: too big (%u)>%llu\n",
+                         iov.iov_len, XEN_ARGO_MAX_RING_SIZE);
+            return -EINVAL;
+        }
+
+        sum_iov_lens += iov.iov_len;
+
+        /*
+         * Again protect sum from integer overflow
+         * and ensure total msg size will be within bounds.
+         */
+        if ( sum_iov_lens > XEN_ARGO_MAX_MSG_SIZE )
+        {
+            argo_dprintk("invalid iov series: total message too big\n");
+            return -EINVAL;
+        }
+
+        guest_handle_add_offset(iovs, 1);
+    }
+
+    *count = sum_iov_lens;
+
+    return 0;
+}
+
+static int
+argo_ringbuf_insert(struct domain *d,
+                    struct argo_ring_info *ring_info,
+                    const struct xen_argo_ring_id *src_id,
+                    XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs,
+                    unsigned long niov, uint32_t message_type,
+                    unsigned long *out_len)
+{
+    xen_argo_ring_t ring;
+    struct xen_argo_ring_message_header mh = { 0 };
+    int32_t sp;
+    int32_t ret = 0;
+    uint32_t len = 0;
+    uint32_t iov_len;
+    uint32_t sum_iov_len = 0;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    /*
+     * Obtain the total size of data to transmit -- sets the 'len' variable
+     * -- and sanity check that the iovs conform to size and number limits.
+     * Reads each iov in the array from the guest for the first time
+     * (nb: just reading the iov structs, not the actual data to transmit).
+     * Enforced below: Once the value 'len' has been determined, no more than
+     * 'len' bytes of guest data (plus the message header) will be sent in this
+     * operation.
+     *
+     * len is used to determine that sufficient space exists in the destination
+     * ring for the message -- aborting the send with EAGAIN if not --
+     * to enable populating the message size field in message header, and for
+     * bounds checking while performing the data transmission.
+     */
+    ret = argo_iov_count(iovs, niov, &len);
+    if ( ret )
+        goto out;
+
+    if ( ((XEN_ARGO_ROUNDUP(len) +
+            sizeof(struct xen_argo_ring_message_header)) >= ring_info->len) ||
+         (len > XEN_ARGO_MAX_MSG_SIZE) )
+    {
+        ret = -EMSGSIZE;
+        goto out;
+    }
+
+    ret = argo_ringbuf_get_rx_ptr(ring_info, &ring.rx_ptr);
+    if ( ret )
+        goto out;
+
+    argo_sanitize_ring(&ring, ring_info);
+
+    argo_dprintk("ring.tx_ptr=%d ring.rx_ptr=%d ring.len=%d"
+                 " ring_info->tx_ptr=%d\n",
+                 ring.tx_ptr, ring.rx_ptr, ring.len, ring_info->tx_ptr);
+
+    if ( ring.rx_ptr == ring.tx_ptr )
+        sp = ring_info->len;
+    else
+    {
+        sp = ring.rx_ptr - ring.tx_ptr;
+        if ( sp < 0 )
+            sp += ring.len;
+    }
+
+    if ( (XEN_ARGO_ROUNDUP(len) +
+            sizeof(struct xen_argo_ring_message_header)) >= sp )
+    {
+        argo_dprintk("EAGAIN\n");
+        ret = -EAGAIN;
+        goto out;
+    }
+
+    mh.len = len + sizeof(struct xen_argo_ring_message_header);
+    mh.source.port = src_id->addr.port;
+    mh.source.domain_id = src_id->addr.domain_id;
+    mh.message_type = message_type;
+
+    /*
+     * For this copy to the guest ring, tx_ptr is always 16-byte aligned
+     * and the message header is 16 bytes long.
+     */
+    BUILD_BUG_ON(
+        sizeof(struct xen_argo_ring_message_header) != XEN_ARGO_ROUNDUP(1));
+
+    /*
+     * First data write into the destination ring: fixed size, message header.
+     * This cannot overrun because the available free space (value in 'sp')
+     * is checked above and must be at least this size.
+     */
+    ret = argo_memcpy_to_guest_ring(ring_info,
+                                    ring.tx_ptr + sizeof(xen_argo_ring_t),
+                                    &mh,
+                                    XEN_GUEST_HANDLE_NULL(uint8_t),
+                                    sizeof(mh));
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: failed to write message header to ring (vm%u:%x vm%d)\n",
+                ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                ring_info->id.partner);
+
+        goto out;
+    }
+
+    ring.tx_ptr += sizeof(mh);
+    if ( ring.tx_ptr == ring_info->len )
+        ring.tx_ptr = 0;
+
+    while ( niov-- )
+    {
+        XEN_GUEST_HANDLE_64(uint8_t) buf_hnd;
+        xen_argo_iov_t iov;
+
+        /*
+         * This is the second read of the iov from the guest
+         * -- see comments inline below.
+         */
+        ret = copy_from_guest_errno(&iov, iovs, 1);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: failed to re-read iov (vm%u:%x vm%d)\n",
+                    ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            goto out;
+        }
+
+        /* Reserve the padding bits: require that they must be zero */
+        if ( iov.pad != 0 )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: iov.pad reserved bits != 0 ring (vm%u:%x vm%d)\n",
+                    ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            ret = -EINVAL;
+            goto out;
+        }
+
+        buf_hnd = iov.iov_hnd;
+        iov_len = iov.iov_len;
+
+        /* If no data is provided in this iov, moan and skip on to the next */
+        if ( !iov_len )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: no data iov_len=0 iov_hnd=%p ring (vm%u:%x vm%d)\n",
+                    buf_hnd.p,
+                    ring_info->id.addr.domain_id,
+                    ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            guest_handle_add_offset(iovs, 1);
+            continue;
+        }
+
+        /*
+         * The iov lens could have been modified since the first read above but
+         * each is checked again for continued conformance against
+         * XEN_ARGO_MAX_MSG_SIZE here:
+         */
+        if ( iov_len > XEN_ARGO_MAX_MSG_SIZE )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: iov len %"PRIx32" too big, ring (vm%u:%x vm%d)\n",
+                    iov_len,
+                    ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            ret = -EINVAL;
+            goto out;
+        }
+
+        /*
+         * and the running total of data processed ('sum_iov_len') is checked
+         * against 'len', which we counted at the beginning with the first iov
+         * read, so the total data provided cannot exceed that limit:
+         * if it does, transmission is aborted.
+         */
+        sum_iov_len += iov_len;
+        if ( sum_iov_len > len )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: len increased %"PRIx32":%"PRIx32" (vm%u:%x vm%d)\n",
+                    len, sum_iov_len,
+                    ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            ret = -EINVAL;
+            goto out;
+        }
+
+        if ( unlikely(!guest_handle_okay(buf_hnd, iov_len)) )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: bad iov handle [%p, %"PRIx32"] (vm%u:%x vm%d)\n",
+                    buf_hnd.p, iov_len,
+                    ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            ret = -EFAULT;
+            goto out;
+        }
+
+        sp = ring.len - ring.tx_ptr;
+
+        /* Check: iov data size versus free space at the tail of the ring */
+        if ( iov_len > sp )
+        {
+            /*
+             * Second possible data write: ring-tail-wrap-write.
+             * Populate the ring tail and update the internal tx_ptr to handle
+             * wrapping at the end of ring.
+             * Size of data written here: sp
+             * which is the exact full amount of free space available at the
+             * tail of the ring, so this cannot overrun.
+             */
+            ret = argo_memcpy_to_guest_ring(ring_info,
+                                            ring.tx_ptr +
+                                                sizeof(xen_argo_ring_t),
+                                            NULL, buf_hnd, sp);
+            if ( ret )
+            {
+                gprintk(XENLOG_ERR,
+                        "argo: failed to copy {%p, %"PRIx32"} (vm%u:%x vm%d)\n",
+                        buf_hnd.p, sp,
+                        ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                        ring_info->id.partner);
+
+                goto out;
+            }
+
+            ring.tx_ptr = 0;
+            iov_len -= sp;
+            guest_handle_add_offset(buf_hnd, sp);
+
+            ASSERT(iov_len <= ring.len);
+        }
+
+        /*
+         * Third possible data write: all data remaining for this iov.
+         * Size of data written here: iov_len
+         *
+         * Case 1: if the ring-tail-wrap-write above was performed, then
+         *         iov_len has been decreased by 'sp' and ring.tx_ptr is zero.
+         *
+         *    We know from the first pass of iov_len counting:
+         *      len + sizeof(message_header) <= ring.len
+         *    We also know that the running total, sum_iov_len (which has been
+         *    incremented by each iov_len when they are read the second time)
+         *    cannot exceed len here -- it is bounds checked above -- so both
+         *    these must be true:
+         *       (iov_len <= sum_iov_len) && (sum_iov_len <= len)
+         *    so by transitivity:
+         *       iov_len <= sum_iov_len <= len <= (ring.len - sizeof(msgheader))
+         *    and therefore:
+         *       (iov_len + sizeof(msgheader) <= ring.len) && (ring.tx_ptr == 0)
+         *    so this write cannot overrun here.
+         *
+         * Case 2: ring-tail-wrap-write above was not performed
+         *    -> so iov_len is the guest-supplied value and: (iov_len <= sp)
+         *    ie. less than available space at the tail of the ring:
+         *        so this write cannot overrun.
+         */
+
+        ret = argo_memcpy_to_guest_ring(ring_info,
+                                        ring.tx_ptr + sizeof(xen_argo_ring_t),
+                                        NULL, buf_hnd, iov_len);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: failed to copy [%p, %"PRIx32"] (vm%u:%x vm%d)\n",
+                    buf_hnd.p, iov_len,
+                    ring_info->id.addr.domain_id, ring_info->id.addr.port,
+                    ring_info->id.partner);
+
+            goto out;
+        }
+
+        ring.tx_ptr += iov_len;
+
+        if ( ring.tx_ptr == ring_info->len )
+            ring.tx_ptr = 0;
+
+        guest_handle_add_offset(iovs, 1);
+    }
+
+    /*
+     * If the guest decided to lower the values in its copy of the iov_lens
+     * between the first read by the hypervisor and the second, a) it's being
+     * rude and b) the effect is that we have performed a short write: the
+     * private ring.tx_ptr will have been updated correctly for size of data
+     * written but the length written in the message header will not match the
+     * tx_ptr increment or the length of data actually copied into the ring.
+     *
+     * A short write could also occur if a bad iov was introduced, such as one
+     * with an iov_len exceeding XEN_ARGO_MAX_MSG_SIZE, or with a data pointer
+     * that turned out to be invalid, triggering an early exit from the iov
+     * processing loop above -- those cases would 'goto out' above.
+     *
+     * So: check the two summed iov lengths and if they mismatch, return error
+     * and do not update the guest-visible tx_ptr (ie. count this as abort).
+     */
+    if ( sum_iov_len != len )
+    {
+        gprintk(XENLOG_ERR,
+          "argo: iov modified: sum_iov_len(%u) != len(%u) ring(vm%u:%x vm%d)\n",
+                sum_iov_len, len,
+                ring_info->id.addr.domain_id,
+                ring_info->id.addr.port, ring_info->id.partner);
+
+        ret = -EINVAL;
+        goto out;
+    }
+
+    ring.tx_ptr = XEN_ARGO_ROUNDUP(ring.tx_ptr);
+
+    if ( ring.tx_ptr >= ring_info->len )
+        ring.tx_ptr -= ring_info->len;
+
+    ret = argo_update_tx_ptr(ring_info, ring.tx_ptr);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: failed to update tx_ptr ring(vm%u:%x vm%d)\n",
+                ring_info->id.addr.domain_id,
+                ring_info->id.addr.port, ring_info->id.partner);
+        goto out;
+    }
+
+ out:
+    /*
+     * At this point it is possible to unmap the ring_info, ie:
+     *   argo_ring_unmap(ring_info);
+     * but performance should be improved by not doing so, and retaining
+     * the mapping.
+     * An XSM policy control over level of confidentiality required
+     * versus performance cost could be added to decide that here.
+     * See the similar comment in argo_ring_map_page re: write-only mappings.
+     */
+
+    if ( !ret )
+        *out_len = len;
+
+    return ret;
+}
+
 /*
  * pending
  */
@@ -306,6 +835,61 @@ argo_pending_remove_all(struct argo_ring_info *ring_info)
         argo_pending_remove_ent(pending_ent);
 }
 
+static int
+argo_pending_queue(struct argo_ring_info *ring_info, domid_t src_id,
+                   uint64_t src_cookie, unsigned int len)
+{
+    struct argo_pending_ent *ent;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    ent = xmalloc(struct argo_pending_ent);
+
+    if ( !ent )
+        return -ENOMEM;
+
+    ent->len = len;
+    ent->domain_id = src_id;
+    ent->domain_cookie = src_cookie;
+
+    hlist_add_head(&ent->node, &ring_info->pending);
+
+    return 0;
+}
+
+static int
+argo_pending_requeue(struct argo_ring_info *ring_info, domid_t src_id,
+                     uint64_t src_cookie, unsigned int len)
+{
+    struct hlist_node *node;
+    struct argo_pending_ent *ent;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    hlist_for_each_entry(ent, node, &ring_info->pending, node)
+    {
+        if ( (ent->domain_id == src_id) &&
+             (ent->domain_cookie == src_cookie) )
+        {
+            /*
+             * Reuse an existing queue entry for a notification rather than add
+             * another. If the existing entry is waiting for a smaller size than
+             * the current message then adjust the record to wait for the
+             * current (larger) size to be available before triggering a
+             * notification.
+             * This assists the waiting sender by ensuring that whenever a
+             * notification is triggered, there is sufficient space available
+             * for (at least) any one of the messages awaiting transmission.
+             */
+            if ( ent->len < len )
+                ent->len = len;
+            return 0;
+        }
+    }
+
+    return argo_pending_queue(ring_info, src_id, src_cookie, len);
+}
+
 static void argo_ring_remove_mfns(const struct domain *d,
                                   struct argo_ring_info *ring_info)
 {
@@ -565,6 +1149,28 @@ argo_ring_find_info(const struct domain *d, const struct xen_argo_ring_id *id)
     return NULL;
 }
 
+static struct argo_ring_info *
+argo_ring_find_info_by_match(const struct domain *d, uint32_t port,
+                             domid_t partner_id, uint64_t partner_cookie)
+{
+    xen_argo_ring_id_t id;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    id.addr.port = port;
+    id.addr.domain_id = d->domain_id;
+    id.partner = partner_id;
+
+    ring_info = argo_ring_find_info(d, &id);
+    if ( ring_info && (partner_cookie == ring_info->partner_cookie) )
+        return ring_info;
+
+    id.partner = XEN_ARGO_DOMID_ANY;
+
+    return argo_ring_find_info(d, &id);
+}
+
 static long
 argo_unregister_ring(struct domain *currd,
                      XEN_GUEST_HANDLE_PARAM(xen_argo_ring_t) ring_hnd)
@@ -889,6 +1495,112 @@ argo_register_ring(struct domain *currd,
     return ret;
 }
 
+/*
+ * io
+ */
+
+static long
+argo_sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
+           const xen_argo_addr_t *dst_addr,
+           XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs,
+           unsigned long niov, uint32_t message_type)
+{
+    struct domain *dst_d = NULL;
+    struct xen_argo_ring_id src_id;
+    struct argo_ring_info *ring_info;
+    int ret = 0;
+    unsigned long len = 0;
+
+    ASSERT(src_d->domain_id == src_addr->domain_id);
+
+    argo_dprintk("sendv: (%d:%x)->(%d:%x) niov:%lu iov:%p type:%u\n",
+                 src_addr->domain_id, src_addr->port,
+                 dst_addr->domain_id, dst_addr->port,
+                 niov, iovs.p, message_type);
+
+    read_lock(&argo_lock);
+
+    if ( !src_d->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    src_id.addr.pad = 0;
+    src_id.addr.port = src_addr->port;
+    src_id.addr.domain_id = src_d->domain_id;
+    src_id.partner = dst_addr->domain_id;
+
+    dst_d = get_domain_by_id(dst_addr->domain_id);
+    if ( !dst_d )
+    {
+        argo_dprintk("!dst_d, ESRCH\n");
+        ret = -ESRCH;
+        goto out_unlock;
+    }
+
+    if ( !dst_d->argo )
+    {
+        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+        ret = -ECONNREFUSED;
+        goto out_unlock;
+    }
+
+    ret = xsm_argo_send(src_d, dst_d);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR, "argo: XSM REJECTED %i -> %i\n",
+                src_addr->domain_id, dst_addr->domain_id);
+        goto out_unlock;
+    }
+
+    read_lock(&dst_d->argo->lock);
+
+    ring_info = argo_ring_find_info_by_match(dst_d, dst_addr->port,
+                                             src_addr->domain_id,
+                                             src_d->argo->domain_cookie);
+    if ( !ring_info )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u connection refused, src (vm%u:%x) dst (vm%u:%x)\n",
+                current->domain->domain_id,
+                src_id.addr.domain_id, src_id.addr.port,
+                dst_addr->domain_id, dst_addr->port);
+
+        ret = -ECONNREFUSED;
+        goto out_unlock2;
+    }
+
+    spin_lock(&ring_info->lock);
+
+    ret = argo_ringbuf_insert(dst_d, ring_info, &src_id,
+                              iovs, niov, message_type, &len);
+    if ( ret == -EAGAIN )
+    {
+        argo_dprintk("argo_ringbuf_sendv failed, EAGAIN\n");
+        /* requeue to issue a notification when space is there */
+        if ( argo_pending_requeue(ring_info, src_addr->domain_id,
+                                  src_d->argo->domain_cookie, len) )
+             ret = -ENOMEM;
+    }
+
+    spin_unlock(&ring_info->lock);
+
+    if ( ret >= 0 )
+        argo_signal_domain(dst_d);
+
+ out_unlock2:
+    read_unlock(&dst_d->argo->lock);
+
+ out_unlock:
+    if ( dst_d )
+        put_domain(dst_d);
+
+    read_unlock(&argo_lock);
+
+    return ( ret < 0 ) ? ret : len;
+}
+
 long
 do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
                    XEN_GUEST_HANDLE_PARAM(void) arg2,
@@ -958,6 +1670,40 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_MESSAGE_OP_sendv:
+    {
+        xen_argo_send_addr_t send_addr;
+
+        XEN_GUEST_HANDLE_PARAM(xen_argo_send_addr_t) send_addr_hnd =
+            guest_handle_cast(arg1, xen_argo_send_addr_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_iov_t) iovs =
+            guest_handle_cast(arg2, xen_argo_iov_t);
+        /* arg3 is niov */
+        /* arg4 is message_type */
+
+        if ( unlikely(!guest_handle_okay(send_addr_hnd, 1)) )
+            break;
+        rc = copy_from_guest_errno(&send_addr, send_addr_hnd, 1);
+        if ( rc )
+            break;
+
+        if ( unlikely((arg3 > XEN_ARGO_MAXIOV) || (arg4 & ~0xffffffffUL)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        if ( unlikely(send_addr.src.domain_id != currd->domain_id) )
+        {
+            rc = -EPERM;
+            break;
+        }
+
+        rc = argo_sendv(currd, &send_addr.src, &send_addr.dst,
+                        iovs, arg3, arg4);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index f34d4f0..6fbe346 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -746,7 +746,7 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
     spin_unlock_irqrestore(&v->virq_lock, flags);
 }
 
-static void send_guest_global_virq(struct domain *d, uint32_t virq)
+void send_guest_global_virq(struct domain *d, uint32_t virq)
 {
     unsigned long flags;
     int port;
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 24696e2..d075930 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -42,6 +42,33 @@
 #define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
 
 /*
+ * XEN_ARGO_MAXIOV : maximum number of iovs accepted in a single sendv.
+ * Rationale for the value:
+ * The Linux argo driver never passes more than two iovs.
+ * Linux defines UIO_MAXIOV as 1024.
+ * POSIX mandates at least 16 -- not that this is a POSIX API of course.
+ *
+ * Limit the total amount of data posted in a single argo operation to
+ * no more than 2^31 bytes to reduce risk of integer overflow defects.
+ * Each argo iov can hold ~ 2^24 bytes, so set ARGO_MAXIOV to 2^(31-24),
+ * minus one to enable simple efficient bounds checking via masking: 127.
+ */
+#define XEN_ARGO_MAXIOV          127U
+
+DEFINE_XEN_GUEST_HANDLE(uint8_t);
+
+typedef struct xen_argo_iov
+{
+#ifdef XEN_GUEST_HANDLE_64
+    XEN_GUEST_HANDLE_64(uint8_t) iov_hnd;
+#else
+    uint64_t iov_hnd;
+#endif
+    uint32_t iov_len;
+    uint32_t pad;
+} xen_argo_iov_t;
+
+/*
  * Page descriptor: encoding both page address and size in a 64-bit value.
  * Intended to allow ABI to support use of different granularity pages.
  * example of how to populate:
@@ -59,6 +86,12 @@ typedef struct xen_argo_addr
     uint16_t pad;
 } xen_argo_addr_t;
 
+typedef struct xen_argo_send_addr
+{
+    xen_argo_addr_t src;
+    xen_argo_addr_t dst;
+} xen_argo_send_addr_t;
+
 typedef struct xen_argo_ring_id
 {
     xen_argo_addr_t addr;
@@ -150,4 +183,35 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_MESSAGE_OP_unregister_ring     2
 
+/*
+ * XEN_ARGO_MESSAGE_OP_sendv
+ *
+ * Send a list of buffers contained in iovs.
+ *
+ * The send address struct specifies the source and destination addresses
+ * for the message being sent, which are used to find the destination ring:
+ * Xen first looks for a most-specific match with a registered ring with
+ *  (id.addr == dst) and (id.partner == sending_domain) ;
+ * if that fails, it then looks for a wildcard match (aka multicast receiver)
+ * where (id.addr == dst) and (id.partner == DOMID_ANY).
+ *
+ * For each iov entry, send iov_len bytes from iov_base to the destination ring.
+ * If insufficient space exists in the destination ring, it will return -EAGAIN
+ * and Xen will notify the caller when sufficient space becomes available.
+ *
+ * The message type is a 32-bit data field available to communicate message
+ * context data (eg. kernel-to-kernel, rather than application layer).
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_send_addr_t) source and dest addresses
+ * arg2: XEN_GUEST_HANDLE(xen_argo_iov_t) iovs
+ * arg3: unsigned long niov
+ * arg4: unsigned long message type
+ */
+#define XEN_ARGO_MESSAGE_OP_sendv               5
+
+/* The maximum size of a guest message that may be sent on an Argo ring. */
+#define XEN_ARGO_MAX_MSG_SIZE ((XEN_ARGO_MAX_RING_SIZE) - \
+        (sizeof(struct xen_argo_ring_message_header)) - \
+        XEN_ARGO_ROUNDUP(1))
+
 #endif
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 5f4f760..efd65c4 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -178,7 +178,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
 #define VIRQ_PCPU_STATE 9  /* G. (DOM0) PCPU state changed                   */
 #define VIRQ_MEM_EVENT  10 /* G. (DOM0) A memory event has occurred          */
-#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient                     */
+#define VIRQ_ARGO_MESSAGE 11 /* G. Argo interdomain message notification     */
 #define VIRQ_ENOMEM     12 /* G. (DOM0) Low on heap memory       */
 #define VIRQ_XENPMU     13 /* V.  PMC interrupt                              */
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index ebb879e..4650887 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -29,6 +29,13 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq);
 void send_global_virq(uint32_t virq);
 
 /*
+ * send_guest_global_virq:
+ *  @d:        Domain to which VIRQ should be sent
+ *  @virq:     Virtual IRQ number (VIRQ_*), must be global
+ */
+void send_guest_global_virq(struct domain *d, uint32_t virq);
+
+/*
  * sent_global_virq_handler: Set a global VIRQ handler.
  *  @d:        New target domain for this VIRQ
  *  @virq:     Virtual IRQ number (VIRQ_*), must be global
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 33e4fd1..742b546 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -151,3 +151,5 @@
 ?	argo_addr			argo.h
 ?	argo_ring_id			argo.h
 ?	argo_ring			argo.h
+?	argo_iov			argo.h
+?	argo_send_addr			argo.h
-- 
2.7.4



* [PATCH v2 14/18] argo: implement the notify op
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (12 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 13/18] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 15/18] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

This operation queries for data about space availability in registered rings
and causes a notification to be sent when space has become available.

The hypercall op populates a supplied data structure with information about
ring state, and if insufficient space is currently available in a given ring,
the hypervisor will record the domain's expressed interest and notify it
when it observes that space has become available.

Checks for free space occur when this notify op is invoked, so it may be
intentionally invoked with no data structure to populate
(ie. a NULL argument) to trigger such a check and consequent notifications.

copy_field_from_guest_errno is added for guest access, performing the same
operation as copy_field_from_guest, but returning -EFAULT if the copy is
incomplete. Added to common code to simplify code at call sites.

The maximum number of notify requests accepted in a single operation is
capped at a simple fixed limit of 256.
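
For illustration, a minimal guest-side sketch of the notify op follows. It is not part of the series: argo_message_op() is a hypothetical hypercall wrapper, the include paths are assumed, and real code would live in the guest's argo driver.

#include <stdint.h>
#include <string.h>
#include <errno.h>
#include "argo.h"   /* public header; include path assumed */

/* Hypothetical guest hypercall wrapper, mirroring do_argo_message_op's args */
extern long argo_message_op(unsigned int cmd, void *arg1, void *arg2,
                            unsigned long arg3, unsigned long arg4);

/* Query one ring; ask to be notified if space_required bytes are not free. */
static long example_notify(domid_t dst_domid, uint32_t dst_port,
                           uint32_t space_required)
{
    /* one header plus one entry; uint64_t backing keeps the buffer aligned */
    uint64_t buf[(sizeof(xen_argo_ring_data_t) +
                  sizeof(xen_argo_ring_data_ent_t) + 7) / 8];
    xen_argo_ring_data_t *rd = (xen_argo_ring_data_t *)buf;
    long rc;

    memset(buf, 0, sizeof(buf));
    rd->magic = XEN_ARGO_RING_DATA_MAGIC;   /* checked before Xen writes back */
    rd->nent = 1;
    rd->data[0].ring.domain_id = dst_domid;
    rd->data[0].ring.port = dst_port;
    rd->data[0].space_required = space_required;

    /* arg2 must be NULL and arg3/arg4 zero for this op */
    rc = argo_message_op(XEN_ARGO_MESSAGE_OP_notify, rd, NULL, 0, 0);
    if ( rc )
        return rc;

    if ( rd->data[0].flags & XEN_ARGO_RING_DATA_F_SUFFICIENT )
        return 0;       /* enough space: the sendv can proceed now */

    /* otherwise Xen signals VIRQ_ARGO_MESSAGE once space becomes available */
    return -EAGAIN;
}

If the SUFFICIENT flag is not set, Xen has recorded the sender's interest and will raise VIRQ_ARGO_MESSAGE when it observes that space for space_required bytes has become available.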

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #5 (#16) feedback Paul: notify op: use currd in do_argo_message_op
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify_check_pending
v1 #5 (#16) feedback Paul: notify op: use currd in argo_fill_ring_data_array
v1 #13 (#16) feedback Paul: notify op: do/while: reindent only
v1 #13 (#16) feedback Paul: notify op: do/while: goto
v1 : add compat xlat.lst entries
v1: add definition for copy_field_from_guest_errno
v1 #13 feedback Jan: make 'ring data' comment comply with single-line style
v1 feedback #13 Jan: use __copy; so define and use __copy_field_to_guest_errno
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: add blank line after case in do_argo_message_op
v1: self: rename ent id to domain_id
v1: self: ent id-> domain_id
v1: self: drop signal if domain_cookie mismatches
v1. feedback #15 Jan: make loop i unsigned
v1. self: drop unnecessary mb() in argo_notify_check_pending
v1. self: add blank line
v1 #16 feedback Jan: const domain arg to +argo_fill_ring_data
v1. feedback #15 Jan: check unused hypercall args are zero
v1 feedback #16 Jan: add comment on space available signal policy
v1. feedback #16 Jan: move declr, drop braces, lower indent
v1. feedback #18 Jan: meld the resource limits into the main commit
v1. feedback #16 Jan: clarify use of magic field
v1. self: use single copy to read notify ring data struct
v1: argo_fill_ring_data: fix dprintk types for port field
v1: self: use %x for printing port as per other print sites
v1. feedback Jan: add comments explaining ring full vs empty
v1. following Jan: fix argo_ringbuf_payload_space calculation for empty ring

 xen/common/argo.c                  | 384 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/guest_access.h |   5 +
 xen/include/asm-x86/guest_access.h |   5 +
 xen/include/public/argo.h          |  67 +++++++
 xen/include/xlat.lst               |   2 +
 5 files changed, 463 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index ed50415..6fbd0a6 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -29,12 +29,15 @@
 #include <public/argo.h>
 
 #define ARGO_MAX_RINGS_PER_DOMAIN       128U
+#define ARGO_MAX_NOTIFY_COUNT           256U
 
 DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t);
 DECLARE_XEN_GUEST_HANDLE_NULL(uint8_t);
 
 /* pfn type: 64-bit on all architectures */
@@ -137,6 +140,10 @@ argo_hash(const struct xen_argo_ring_id *id)
     return ret;
 }
 
+static struct argo_ring_info *
+argo_ring_find_info_by_match(const struct domain *d, uint32_t port,
+                             domid_t partner_id, uint64_t partner_cookie);
+
 /*
  * locks
  */
@@ -197,6 +204,28 @@ argo_signal_domain(struct domain *d)
     send_guest_global_virq(d, VIRQ_ARGO_MESSAGE);
 }
 
+static void
+argo_signal_domid(domid_t domain_id, uint64_t domain_cookie)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    if ( !d->argo )
+        goto out;
+    /*
+     * The caller holds R(L1) which ensures that d->argo is stable.
+     * Since the domain_cookie is never modified while d->argo is valid
+     * we do not need to acquire R(L2) to read the cookie here.
+     */
+    if ( d->argo->domain_cookie == domain_cookie )
+        argo_signal_domain(d);
+
+ out:
+    put_domain(d);
+}
+
 /*
  * ring buffer
  */
@@ -388,6 +417,60 @@ argo_ringbuf_get_rx_ptr(struct argo_ring_info *ring_info, uint32_t *rx_ptr)
     return 0;
 }
 
+static uint32_t
+argo_ringbuf_payload_space(struct domain *d, struct argo_ring_info *ring_info)
+{
+    xen_argo_ring_t ring;
+    int32_t ret;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    ring.len = ring_info->len;
+    if ( !ring.len )
+        return 0;
+
+    ring.tx_ptr = ring_info->tx_ptr;
+
+    if ( argo_ringbuf_get_rx_ptr(ring_info, &ring.rx_ptr) )
+        return 0;
+
+    argo_dprintk("argo_ringbuf_payload_space: tx_ptr=%d rx_ptr=%d\n",
+                 ring.tx_ptr, ring.rx_ptr);
+
+    /*
+     * rx_ptr == tx_ptr means that the ring has been emptied, so return
+     * the maximum payload size that can be accepted -- see message size
+     * checking logic in the entry to argo_ringbuf_insert which ensures that
+     * there is always one message slot (of size XEN_ARGO_ROUNDUP(1)) left
+     * available, preventing a ring from being entirely filled. This ensures
+     * that matching ring indexes always indicate an empty ring and not a
+     * full one.
+     * The subtraction here will not underflow due to minimum size constraints
+     * enforced on ring size elsewhere.
+     */
+    if ( ring.rx_ptr == ring.tx_ptr )
+        return ring.len - sizeof(struct xen_argo_ring_message_header)
+                        - XEN_ARGO_ROUNDUP(1);
+
+    ret = ring.rx_ptr - ring.tx_ptr;
+    if ( ret < 0 )
+        ret += ring.len;
+
+    /*
+     * The maximum size payload for a message that will be accepted is:
+     * (the available space between the ring indexes)
+     *    minus (space for a message header)
+     *    minus (space for one message slot)
+     * since argo_ringbuf_insert requires that one message slot be left
+     * unfilled, to avoid filling the ring to capacity and confusing a full
+     * ring with an empty one.
+     */
+    ret -= sizeof(struct xen_argo_ring_message_header);
+    ret -= XEN_ARGO_ROUNDUP(1);
+
+    return (ret < 0) ? 0 : ret;
+}
+
 /*
  * argo_sanitize_ring creates a modified copy of the ring pointers
  * where the rx_ptr is rounded up to ensure it is aligned, and then
@@ -835,6 +918,58 @@ argo_pending_remove_all(struct argo_ring_info *ring_info)
         argo_pending_remove_ent(pending_ent);
 }
 
+static void
+argo_pending_notify(struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *ent;
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    hlist_for_each_entry_safe(ent, node, next, to_notify, node)
+    {
+        hlist_del(&ent->node);
+        argo_signal_domid(ent->domain_id, ent->domain_cookie);
+        xfree(ent);
+    }
+}
+
+static void
+argo_pending_find(const struct domain *d, struct argo_ring_info *ring_info,
+                  uint32_t payload_space, struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *ent;
+
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    /*
+     * TODO: Current policy here is to signal _all_ of the waiting domains
+     *       interested in sending a message of size less than payload_space.
+     *
+     * This is likely to be suboptimal, since once one of them has added
+     * their message to the ring, there may well be insufficient room
+     * available for any of the others to transmit, meaning that they were
+     * woken in vain, which created extra work just to requeue their wait.
+     *
+     * Retain this simple policy for now since it at least avoids starving a
+     * domain of available space notifications because of a policy that only
+     * notified other domains instead. Improvement may be possible;
+     * investigation required.
+     */
+
+    spin_lock(&ring_info->lock);
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( payload_space >= ent->len )
+        {
+            hlist_del(&ent->node);
+            hlist_add_head(&ent->node, to_notify);
+        }
+    }
+    spin_unlock(&ring_info->lock);
+}
+
 static int
 argo_pending_queue(struct argo_ring_info *ring_info, domid_t src_id,
                    uint64_t src_cookie, unsigned int len)
@@ -890,6 +1025,26 @@ argo_pending_requeue(struct argo_ring_info *ring_info, domid_t src_id,
     return argo_pending_queue(ring_info, src_id, src_cookie, len);
 }
 
+static void
+argo_pending_cancel(struct argo_ring_info *ring_info, domid_t src_id,
+                    uint64_t src_cookie)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *ent;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( (ent->domain_id == src_id) &&
+             (ent->domain_cookie == src_cookie) )
+        {
+            hlist_del(&ent->node);
+            xfree(ent);
+        }
+    }
+}
+
 static void argo_ring_remove_mfns(const struct domain *d,
                                   struct argo_ring_info *ring_info)
 {
@@ -932,6 +1087,110 @@ argo_ring_remove_info(struct domain *d, struct argo_ring_info *ring_info)
     xfree(ring_info);
 }
 
+/* ring data */
+
+static int
+argo_fill_ring_data(const struct domain *src_d,
+                    XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
+{
+    xen_argo_ring_data_ent_t ent;
+    domid_t src_id;
+    struct domain *dst_d;
+    struct argo_ring_info *ring_info;
+    int ret;
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    ret = copy_from_guest_errno(&ent, data_ent_hnd, 1);
+    if ( ret )
+        return ret;
+
+    argo_dprintk("argo_fill_ring_data: ent.ring.domain=%u,ent.ring.port=%x\n",
+                 ent.ring.domain_id, ent.ring.port);
+
+    src_id = src_d->domain_id;
+    ent.flags = 0;
+
+    dst_d = get_domain_by_id(ent.ring.domain_id);
+
+    if ( dst_d && dst_d->argo )
+    {
+        read_lock(&dst_d->argo->lock);
+
+        ring_info = argo_ring_find_info_by_match(dst_d, ent.ring.port, src_id,
+                                                 src_d->argo->domain_cookie);
+
+        if ( ring_info )
+        {
+            uint32_t space_avail;
+
+            ent.flags |= XEN_ARGO_RING_DATA_F_EXISTS;
+            ent.max_message_size =
+                ring_info->len - sizeof(struct xen_argo_ring_message_header) -
+                XEN_ARGO_ROUNDUP(1);
+
+            spin_lock(&ring_info->lock);
+
+            space_avail = argo_ringbuf_payload_space(dst_d, ring_info);
+
+            argo_dprintk("argo_fill_ring_data: port=%x space_avail=%u"
+                         " space_wanted=%u\n",
+                         ring_info->id.addr.port, space_avail,
+                         ent.space_required);
+
+            if ( space_avail >= ent.space_required )
+            {
+                argo_pending_cancel(ring_info, src_id,
+                                    src_d->argo->domain_cookie);
+                ent.flags |= XEN_ARGO_RING_DATA_F_SUFFICIENT;
+            }
+            else
+            {
+                argo_pending_requeue(ring_info, src_id,
+                                     src_d->argo->domain_cookie,
+                                     ent.space_required);
+                ent.flags |= XEN_ARGO_RING_DATA_F_PENDING;
+            }
+
+            spin_unlock(&ring_info->lock);
+
+            if ( space_avail == ent.max_message_size )
+                ent.flags |= XEN_ARGO_RING_DATA_F_EMPTY;
+
+        }
+        read_unlock(&dst_d->argo->lock);
+    }
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    ret = __copy_field_to_guest_errno(data_ent_hnd, &ent, flags);
+    if ( ret )
+        return ret;
+    ret = __copy_field_to_guest_errno(data_ent_hnd, &ent, max_message_size);
+    if ( ret )
+        return ret;
+
+    return 0;
+}
+
+static int
+argo_fill_ring_data_array(struct domain *currd, int nent,
+                          XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_hnd)
+{
+    int ret = 0;
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    while ( !ret && nent-- )
+    {
+        ret = argo_fill_ring_data(currd, data_hnd);
+        guest_handle_add_offset(data_hnd, 1);
+    }
+
+    return ret;
+}
+
 /* ring */
 
 static int
@@ -1499,6 +1758,116 @@ argo_register_ring(struct domain *currd,
  * io
  */
 
+static void
+argo_notify_ring(struct domain *d, struct argo_ring_info *ring_info,
+                struct hlist_head *to_notify)
+{
+    uint32_t space;
+
+    ASSERT(rw_is_locked(&argo_lock));
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    spin_lock(&ring_info->lock);
+
+    if ( ring_info->len )
+        space = argo_ringbuf_payload_space(d, ring_info);
+    else
+        space = 0;
+
+    spin_unlock(&ring_info->lock);
+
+    if ( space )
+        argo_pending_find(d, ring_info, space, to_notify);
+}
+
+static void
+argo_notify_check_pending(struct domain *currd)
+{
+    unsigned int i;
+    HLIST_HEAD(to_notify);
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    read_lock(&currd->argo->lock);
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; i++ )
+    {
+        struct hlist_node *node, *next;
+        struct argo_ring_info *ring_info;
+
+        hlist_for_each_entry_safe(ring_info, node, next,
+                                  &currd->argo->ring_hash[i], node)
+        {
+            argo_notify_ring(currd, ring_info, &to_notify);
+        }
+    }
+
+    read_unlock(&currd->argo->lock);
+
+    if ( !hlist_empty(&to_notify) )
+        argo_pending_notify(&to_notify);
+}
+
+static long
+argo_notify(struct domain *currd,
+            XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd)
+{
+    XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) ent_hnd;
+    xen_argo_ring_data_t ring_data;
+    int ret = 0;
+
+    read_lock(&argo_lock);
+
+    if ( !currd->argo )
+    {
+        argo_dprintk("!d->argo, ENODEV\n");
+        ret = -ENODEV;
+        goto out;
+    }
+
+    argo_notify_check_pending(currd);
+
+    if ( !guest_handle_is_null(ring_data_hnd) )
+    {
+        ret = copy_from_guest_errno(&ring_data, ring_data_hnd, 1);
+        if ( ret )
+            goto out;
+
+        /*
+         * Before performing a hypervisor write into guest memory, validate
+         * that it is memory that the guest expects these writes into by
+         * checking that the 'magic' field contains the expected value.
+         */
+        if ( ring_data.magic != XEN_ARGO_RING_DATA_MAGIC )
+        {
+            gprintk(XENLOG_ERR,
+                "argo: incorrect ring_data.magic(%"PRIx64") vs (%llx)\n",
+                ring_data.magic, XEN_ARGO_RING_DATA_MAGIC);
+            ret = -EINVAL;
+            goto out;
+        }
+
+        if ( ring_data.nent > ARGO_MAX_NOTIFY_COUNT )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: notify entry count(%u) exceeds max(%u)\n",
+                    ring_data.nent, ARGO_MAX_NOTIFY_COUNT);
+            ret = -EACCES;
+            goto out;
+        }
+
+        ent_hnd = guest_handle_for_field(ring_data_hnd,
+                                         xen_argo_ring_data_ent_t, data[0]);
+
+        ret = argo_fill_ring_data_array(currd, ring_data.nent, ent_hnd);
+    }
+
+ out:
+    read_unlock(&argo_lock);
+
+    return ret;
+}
+
 static long
 argo_sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
            const xen_argo_addr_t *dst_addr,
@@ -1704,6 +2073,21 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }
 
+    case XEN_ARGO_MESSAGE_OP_notify:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd =
+                   guest_handle_cast(arg1, xen_argo_ring_data_t);
+
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = argo_notify(currd, ring_data_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 5456d81..fc73572 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -126,6 +126,11 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     raw_copy_from_guest(_d, _s, sizeof(*_d));           \
 })
 
+/* Errno-returning variant of copy_field_from_guest */
+#define copy_field_from_guest_errno(ptr, hnd, field)    \
+    (copy_field_from_guest((ptr), (hnd), field) ?       \
+        -EFAULT : 0)
+
 /*
  * Pre-validate a guest handle.
  * Allows use of faster __copy_* functions.
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index 9176150..09b137a 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -131,6 +131,11 @@
     raw_copy_from_guest(_d, _s, sizeof(*_d));           \
 })
 
+/* Errno-returning variant of copy_field_from_guest */
+#define copy_field_from_guest_errno(ptr, hnd, field)    \
+    (copy_field_from_guest((ptr), (hnd), field) ?       \
+        -EFAULT : 0)
+
 /*
  * Pre-validate a guest handle.
  * Allows use of faster __copy_* functions.
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index d075930..517f615 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -32,6 +32,7 @@
 #include "xen.h"
 
 #define XEN_ARGO_RING_MAGIC      0xbd67e163e7777f2fULL
+#define XEN_ARGO_RING_DATA_MAGIC 0xcce4d30fbc82e92aULL
 #define XEN_ARGO_DOMID_ANY       DOMID_INVALID
 
 /*
@@ -130,6 +131,45 @@ typedef struct xen_argo_ring
  */
 #define XEN_ARGO_ROUNDUP(a) (((a) + 0xf) & ~(typeof(a))0xf)
 
+/*
+ * Notify flags
+ */
+/* Ring is empty */
+#define XEN_ARGO_RING_DATA_F_EMPTY       (1U << 0)
+/* Ring exists */
+#define XEN_ARGO_RING_DATA_F_EXISTS      (1U << 1)
+/* Pending interrupt exists. Do not rely on this field - for profiling only */
+#define XEN_ARGO_RING_DATA_F_PENDING     (1U << 2)
+/* Sufficient space to queue space_required bytes exists */
+#define XEN_ARGO_RING_DATA_F_SUFFICIENT  (1U << 3)
+
+typedef struct xen_argo_ring_data_ent
+{
+    xen_argo_addr_t ring;
+    uint16_t flags;
+    uint16_t pad;
+    uint32_t space_required;
+    uint32_t max_message_size;
+} xen_argo_ring_data_ent_t;
+
+typedef struct xen_argo_ring_data
+{
+    /*
+     * Contents of the 'magic' field are inspected to verify that they contain
+     * an expected value before the hypervisor will perform writes into this
+     * structure in guest-supplied memory.
+     */
+    uint64_t magic;
+    uint32_t nent;
+    uint32_t pad;
+    uint64_t reserved[4];
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    xen_argo_ring_data_ent_t data[];
+#elif defined(__GNUC__)
+    xen_argo_ring_data_ent_t data[0];
+#endif
+} xen_argo_ring_data_t;
+
 struct xen_argo_ring_message_header
 {
     uint32_t len;
@@ -209,6 +249,33 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_MESSAGE_OP_sendv               5
 
+/*
+ * XEN_ARGO_MESSAGE_OP_notify
+ *
+ * Asks Xen for information about other rings in the system.
+ *
+ * ent->ring is the xen_argo_addr_t of the ring you want information on.
+ * Uses the same ring matching rules as XEN_ARGO_MESSAGE_OP_sendv.
+ *
+ * ent->space_required : if this field is non-zero then Xen will check
+ * that there is space in the destination ring for this many bytes of payload.
+ * If sufficient space is available, it will set XEN_ARGO_RING_DATA_F_SUFFICIENT
+ * and CANCEL any pending notification for that ent->ring; otherwise it
+ * will schedule a notification event and the flag will not be set.
+ *
+ * These flags are set by Xen when notify replies:
+ * XEN_ARGO_RING_DATA_F_EMPTY      ring is empty
+ * XEN_ARGO_RING_DATA_F_PENDING    notify event is pending *don't rely on this*
+ * XEN_ARGO_RING_DATA_F_SUFFICIENT sufficient space for space_required is there
+ * XEN_ARGO_RING_DATA_F_EXISTS     ring exists
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_data_t) ring_data (may be NULL)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_MESSAGE_OP_notify              4
+
 /* The maximum size of a guest message that may be sent on an Argo ring. */
 #define XEN_ARGO_MAX_MSG_SIZE ((XEN_ARGO_MAX_RING_SIZE) - \
         (sizeof(struct xen_argo_ring_message_header)) - \
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 742b546..53f2f50 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -153,3 +153,5 @@
 ?	argo_ring			argo.h
 ?	argo_iov			argo.h
 ?	argo_send_addr			argo.h
+?	argo_ring_data_ent		argo.h
+?	argo_ring_data			argo.h
-- 
2.7.4



* [PATCH v2 15/18] xsm, argo: XSM control for any access to argo by a domain
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (13 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 14/18] argo: implement the notify op Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 16/18] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel, Daniel De Graaf
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

This control will inhibit initialization of the domain's argo data structure,
preventing the domain from receiving any messages or notifications and from
accessing any of the argo hypercall operations.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1 #5 (#17) feedback Paul: XSM control for any access: use currd
v1 #16 feedback Jan: apply const to function signatures

 xen/common/argo.c                   | 4 ++--
 xen/include/xsm/dummy.h             | 5 +++++
 xen/include/xsm/xsm.h               | 6 ++++++
 xen/xsm/dummy.c                     | 1 +
 xen/xsm/flask/hooks.c               | 7 +++++++
 xen/xsm/flask/policy/access_vectors | 3 +++
 6 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 6fbd0a6..1f16872 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1981,7 +1981,7 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
     argo_dprintk("->do_argo_message_op(%u,%p,%p,%d,%d)\n", cmd,
                  (void *)arg1.p, (void *)arg2.p, (int) arg3, (int) arg4);
 
-    if ( unlikely(!opt_argo_enabled) )
+    if ( unlikely(!opt_argo_enabled || xsm_argo_enable(currd)) )
     {
         rc = -EOPNOTSUPP;
         return rc;
@@ -2119,7 +2119,7 @@ argo_init(struct domain *d)
 {
     struct argo_domain *argo;
 
-    if ( !opt_argo_enabled )
+    if ( !opt_argo_enabled || xsm_argo_enable(d) )
     {
         argo_dprintk("argo disabled, domid: %d\n", d->domain_id);
         return 0;
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 05d10b5..91a21c3 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -721,6 +721,11 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
 #endif /* CONFIG_X86 */
 
 #ifdef CONFIG_ARGO
+static XSM_INLINE int xsm_argo_enable(struct domain *d)
+{
+    return 0;
+}
+
 static XSM_INLINE int xsm_argo_register_single_source(struct domain *d,
                                                       struct domain *t)
 {
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 4d4a60c..e300ebc 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -182,6 +182,7 @@ struct xsm_operations {
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
 #ifdef CONFIG_ARGO
+    int (*argo_enable) (const struct domain *d);
     int (*argo_register_single_source) (const struct domain *d,
                                         const struct domain *t);
     int (*argo_register_any_source) (const struct domain *d);
@@ -705,6 +706,11 @@ static inline int xsm_domain_resource_map(xsm_default_t def, struct domain *d)
 }
 
 #ifdef CONFIG_ARGO
+static inline int xsm_argo_enable(const struct domain *d)
+{
+    return xsm_ops->argo_enable(d);
+}
+
 static inline xsm_argo_register_single_source(const struct domain *d,
                                               const struct domain *t)
 {
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index ffac774..1fe0e74 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -153,6 +153,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
 #ifdef CONFIG_ARGO
+    set_to_dummy_if_null(ops, argo_enable);
     set_to_dummy_if_null(ops, argo_register_single_source);
     set_to_dummy_if_null(ops, argo_register_any_source);
     set_to_dummy_if_null(ops, argo_send);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 76c012c..3d00c74 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1720,6 +1720,12 @@ static int flask_domain_resource_map(struct domain *d)
 }
 
 #ifdef CONFIG_ARGO
+static int flask_argo_enable(const struct domain *d)
+{
+    return avc_has_perm(domain_sid(d), SECINITSID_XEN, SECCLASS_ARGO,
+                        ARGO__ENABLE, NULL);
+}
+
 static int flask_argo_register_single_source(const struct domain *d,
                                              const struct domain *t)
 {
@@ -1875,6 +1881,7 @@ static struct xsm_operations flask_ops = {
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
 #ifdef CONFIG_ARGO
+    .argo_enable = flask_argo_enable,
     .argo_register_single_source = flask_argo_register_single_source,
     .argo_register_any_source = flask_argo_register_any_source,
     .argo_send = flask_argo_send,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index f6c5377..e00448b 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -535,6 +535,9 @@ class version
 # Class argo is used to describe the Argo interdomain communication system.
 class argo
 {
+    # Enable initialization of a domain's argo subsystem and grant
+    # permission to access the argo hypercall operations.
+    enable
     # Domain requesting registration of a communication ring
     # to receive messages from a specific other domain.
     register_single_source
-- 
2.7.4



* [PATCH v2 16/18] xsm, argo: notify: don't describe rings that cannot be sent to
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (14 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 15/18] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 17/18] argo: validate hypercall arg structures via compat machinery Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 18/18] argo: unmap rings on suspend; signal ring-owners on resume Christopher Clark
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel, Daniel De Graaf
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
 xen/common/argo.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 1f16872..921aaf3 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1115,6 +1115,17 @@ argo_fill_ring_data(const struct domain *src_d,
 
     if ( dst_d && dst_d->argo )
     {
+        /*
+         * Don't supply information about rings that a guest is not
+         * allowed to send to.
+         */
+        ret = xsm_argo_send(src_d, dst_d);
+        if ( ret )
+        {
+            put_domain(dst_d);
+            return ret;
+        }
+
         read_lock(&dst_d->argo->lock);
 
         ring_info = argo_ring_find_info_by_match(dst_d, ent.ring.port, src_id,
-- 
2.7.4



* [PATCH v2 17/18] argo: validate hypercall arg structures via compat machinery
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (15 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 16/18] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  2018-12-20  6:39 ` [PATCH v2 18/18] argo: unmap rings on suspend; signal ring-owners on resume Christopher Clark
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Argo doesn't use the compat hypercall machinery or argument translation, but
it can use some of that infrastructure for validating the hypercall argument
structures, to ensure that the struct sizes, offsets and compositions don't
vary between 32-bit and 64-bit guests; so add that here in a new dedicated
source file.

Some of the argo hypercall argument structures contain elements that are
hypercall argument structure types themselves, and the standard compat
structure validation does not handle this, since the types differ in compat
vs. non-compat versions; so for some of the tests the exact-type-match check
is replaced with a weaker, but still sufficient, sizeof check.

There are also hypercall argument structures that contain elements without a
fixed size (a variable length array as the last element), so the size check
has to be disabled as well when validating those structures; coverage of the
element offsets is still retained.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
This is a new patch introduced in version 2 of the series.

These checks could be introduced incrementally in multiple previous commits as
the data structures are added with each hypercall op, but this commit can stand
or fall on its own and the macro redefinition needed to override aspects of the
checking warrants review. This commit does add compile-time coverage of the
hypercall data structures (as requested).
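
As a rough standalone illustration of the kind of per-field check involved
(the structs here are hypothetical, not the argo argument types): the field
offsets must match at build time, while the exact-type match is relaxed to a
size comparison.

/* Minimal sketch, compilable on its own; not part of the patch. */
#include <stddef.h>
#include <stdint.h>

#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

typedef struct { uint64_t addr; uint32_t port; uint32_t pad; } xen_example_t;
typedef struct { uint64_t addr; uint32_t port; uint32_t pad; } compat_example_t;

static inline int check_example_addr(xen_example_t *x, compat_example_t *c)
{
    /* The field offset must be identical in the two layouts. */
    BUILD_BUG_ON(offsetof(xen_example_t, addr) !=
                 offsetof(compat_example_t, addr));
    /* Weaker than an exact type match, but sufficient: same field size. */
    return sizeof(x->addr) == sizeof(c->addr);
}

int main(void)
{
    xen_example_t x = { 0 };
    compat_example_t c = { 0 };

    return !check_example_addr(&x, &c);
}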

 xen/common/Makefile      |  2 +-
 xen/common/compat/argo.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+), 1 deletion(-)
 create mode 100644 xen/common/compat/argo.c

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 8c65c6f..88b9b2f 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -70,7 +70,7 @@ obj-y += xmalloc_tlsf.o
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
 
-obj-$(CONFIG_COMPAT) += $(addprefix compat/,domain.o kernel.o memory.o multicall.o xlat.o)
+obj-$(CONFIG_COMPAT) += $(addprefix compat/,argo.o domain.o kernel.o memory.o multicall.o xlat.o)
 
 tmem-y := tmem.o tmem_xen.o tmem_control.o
 tmem-$(CONFIG_COMPAT) += compat/tmem_xen.o
diff --git a/xen/common/compat/argo.c b/xen/common/compat/argo.c
new file mode 100644
index 0000000..209c4fd
--- /dev/null
+++ b/xen/common/compat/argo.c
@@ -0,0 +1,60 @@
+/******************************************************************************
+ * Argo : Hypervisor-Mediated data eXchange
+ *
+ * Copyright (c) 2018, BAE Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/types.h>
+#include <xen/lib.h>
+#include <public/argo.h>
+#include <compat/argo.h>
+
+CHECK_argo_addr;
+
+/*
+ * Disable strict type checking in this compat validation macro for the
+ * following struct checks because it cannot handle fields within structs that
+ * have types that differ in the compat versus non-compat structs.
+ * Replace it with a field size check which is sufficient here.
+ */
+
+#undef CHECK_FIELD_COMMON_
+#define CHECK_FIELD_COMMON_(k, name, n, f) \
+static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
+{ \
+    BUILD_BUG_ON(offsetof(k xen_ ## n, f) != \
+                 offsetof(k compat_ ## n, f)); \
+    return sizeof(x->f) == sizeof(c->f); \
+}
+
+CHECK_argo_ring_id;
+CHECK_argo_send_addr;
+CHECK_argo_ring_data_ent;
+CHECK_argo_iov;
+
+/*
+ * Disable sizeof type checking for the following struct checks because
+ * these structs have fields with variable size that the size check
+ * cannot validate.
+ */
+
+#undef CHECK_FIELD_COMMON_
+#define CHECK_FIELD_COMMON_(k, name, n, f) \
+static inline int __maybe_unused name(k xen_ ## n *x, k compat_ ## n *c) \
+{ \
+    BUILD_BUG_ON(offsetof(k xen_ ## n, f) != \
+                 offsetof(k compat_ ## n, f)); \
+    return 1; \
+}
+
+CHECK_argo_ring;
+CHECK_argo_ring_data;
-- 
2.7.4



* [PATCH v2 18/18] argo: unmap rings on suspend; signal ring-owners on resume
  2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
                   ` (16 preceding siblings ...)
  2018-12-20  6:39 ` [PATCH v2 17/18] argo: validate hypercall arg structures via compat machinery Christopher Clark
@ 2018-12-20  6:39 ` Christopher Clark
  17 siblings, 0 replies; 28+ messages in thread
From: Christopher Clark @ 2018-12-20  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Rich Persaud, Tim Deegan,
	Daniel Smith, Julien Grall, Paul Durrant, Jan Beulich,
	James McKenzie, Eric Chanudet, Roger Pau Monne

The hypervisor data structures for the argo rings are populated with mfns
which cannot be assumed to map to the same guest frame numbers across entry
to and exit from host power state S4 (hibernate to disk).

Tear down all the rings during suspend to stop argo operations from
performing any further writes into the registered set of mfns for the rings.

In order to support guests reestablishing their registered rings on resume,
signal each guest that has a registered ring so that the guest may
re-register each ring with the current mappings for its guest frame numbers.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
---
Changes since v1:

v1. feedback #15 Jan: make i unsigned
v1. self: fix indentation and blank lines
v1. feedback #25 Jan: add rationale to commit message

re: v1 feedback #25 Jan: soft-reset is now not a problem for the resume logic
in this commit since handling of soft reset was added to an earlier commit in
this version 2 series.
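
For context, a hypothetical sketch of the guest-side handling this enables is
below: on receiving the argo wakeup signal after resume, the guest walks the
rings it tracks and re-registers each one so that Xen picks up the current
guest frame mappings. The names tracked_rings and argo_register_ring are
illustrative assumptions, not part of this series or of any existing driver.

/* Hypothetical guest driver fragment; sketch only. */
struct guest_ring {
    struct guest_ring *next;
    void *ring_buf;                             /* guest-side ring memory */
};

extern struct guest_ring *tracked_rings;        /* assumed driver state */
extern int argo_register_ring(void *ring_buf);  /* assumed wrapper around the
                                                   argo register operation */

static void argo_wakeup_virq_handler(void)
{
    struct guest_ring *r;

    /* Re-register every ring with its current guest frame mappings. */
    for ( r = tracked_rings; r; r = r->next )
        argo_register_ring(r->ring_buf);
}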

 xen/common/argo.c      | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++
 xen/common/domain.c    |  9 +++++++
 xen/include/xen/argo.h |  2 ++
 3 files changed, 83 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index 921aaf3..624ed8a 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -1076,6 +1076,16 @@ static void argo_ring_remove_mfns(const struct domain *d,
 }
 
 static void
+argo_ring_reset(struct domain *d, struct argo_ring_info *ring_info)
+{
+    ASSERT(rw_is_write_locked(&d->argo->lock));
+
+    argo_ring_remove_mfns(d, ring_info);
+    ring_info->len = 0;
+    ring_info->tx_ptr = 0;
+}
+
+static void
 argo_ring_remove_info(struct domain *d, struct argo_ring_info *ring_info)
 {
     ASSERT(rw_is_write_locked(&d->argo->lock));
@@ -2222,3 +2232,65 @@ argo_soft_reset(struct domain *d)
 
     write_unlock(&argo_lock);
 }
+
+void
+argo_shutdown_for_suspend(struct domain *d)
+{
+    unsigned int i;
+
+    if ( !d )
+        return;
+
+    if ( get_domain(d) )
+    {
+        read_lock(&argo_lock);
+
+        if ( d->argo )
+        {
+            write_lock(&d->argo->lock);
+
+            for ( i = 0; i < ARGO_HTABLE_SIZE; i++ )
+            {
+                struct hlist_node *node, *next;
+                struct argo_ring_info *ring_info;
+
+                hlist_for_each_entry_safe(ring_info, node,
+                                          next, &d->argo->ring_hash[i], node)
+                    argo_ring_reset(d, ring_info);
+            }
+
+            write_unlock(&d->argo->lock);
+        }
+
+        read_unlock(&argo_lock);
+
+        put_domain(d);
+    }
+}
+
+void
+argo_resume(struct domain *d)
+{
+    bool send_wakeup;
+
+    if ( !d )
+        return;
+
+    if ( !get_domain(d) )
+        return;
+
+    read_lock(&argo_lock);
+
+    read_lock(&d->argo->lock);
+
+    send_wakeup = ( d->argo->ring_count > 0 );
+
+    read_unlock(&d->argo->lock);
+
+    if ( send_wakeup )
+        argo_signal_domain(d);
+
+    read_unlock(&argo_lock);
+
+    put_domain(d);
+}
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 601c663..e194a42 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -89,6 +89,11 @@ static void __domain_finalise_shutdown(struct domain *d)
         if ( !v->paused_for_shutdown )
             return;
 
+#ifdef CONFIG_ARGO
+    if ( d->shutdown_code == SHUTDOWN_suspend )
+        argo_shutdown_for_suspend(d);
+#endif
+
     d->is_shut_down = 1;
     if ( (d->shutdown_code == SHUTDOWN_suspend) && d->suspend_evtchn )
         evtchn_send(d, d->suspend_evtchn);
@@ -857,6 +862,10 @@ void domain_resume(struct domain *d)
     spin_unlock(&d->shutdown_lock);
 
     domain_unpause(d);
+
+#ifdef CONFIG_ARGO
+    argo_resume(d);
+#endif
 }
 
 int vcpu_start_shutdown_deferral(struct vcpu *v)
diff --git a/xen/include/xen/argo.h b/xen/include/xen/argo.h
index 29d32a9..e3dc19b 100644
--- a/xen/include/xen/argo.h
+++ b/xen/include/xen/argo.h
@@ -18,6 +18,8 @@
 
 int argo_init(struct domain *d);
 void argo_destroy(struct domain *d);
+void argo_shutdown_for_suspend(struct domain *d);
+void argo_resume(struct domain *d);
 void argo_soft_reset(struct domain *d);
 
 #endif
-- 
2.7.4



* Re: [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo
  2018-12-20  6:38 ` [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
@ 2018-12-20  9:00   ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-20  9:00 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, xen-devel,
	eric chanudet, Roger Pau Monne

>>> On 20.12.18 at 07:38, <christopher.w.clark@gmail.com> wrote:
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -200,6 +200,26 @@ config LATE_HWDOM
>  
>  	  If unsure, say N.
>  
> +config ARGO
> +	def_bool n
> +	prompt "Argo: hypervisor-mediated interdomain communication" if EXPERT = "y"

The common (and slightly shorter) way to express this is

config ARGO
	bool "Argo: hypervisor-mediated interdomain communication" if EXPERT = "y"

Jan




* Re: [PATCH v2 11/18] argo: implement the register op
  2018-12-20  6:39 ` [PATCH v2 11/18] argo: implement the register op Christopher Clark
@ 2018-12-20 11:20   ` Julien Grall
  2018-12-21  1:17     ` Christopher Clark
  0 siblings, 1 reply; 28+ messages in thread
From: Julien Grall @ 2018-12-20 11:20 UTC (permalink / raw)
  To: Christopher Clark, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Tim Deegan, Daniel Smith,
	Rich Persaud, Paul Durrant, Jan Beulich, James McKenzie,
	Eric Chanudet, Roger Pau Monne

Hi Christopher,

On 12/20/18 6:39 AM, Christopher Clark wrote:
> Used by a domain to register a region of memory for receiving messages from
> either a specified other domain, or, if specifying a wildcard, any domain.
> 
> This operation creates a mapping within Xen's private address space that
> will remain resident for the lifetime of the ring. In subsequent commits,
> the hypervisor will use this mapping to copy data from a sending domain into
> this registered ring, making it accessible to the domain that registered the
> ring to receive data.
> 
> In this code, the p2m type of the memory supplied by the guest for the ring
> must be p2m_ram_rw, which is a conservative choice made to defer the need to
> reason about the other p2m types with this commit.
> 
> xen_argo_page_descr_t type is introduced as a page descriptor, to convey
> both the physical address of the start of the page and its granularity. The
> smallest granularity page is assumed to be 4096 bytes and the lower twelve
> bits of the type are used to indicate an enumerated page size.
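
A rough standalone sketch of that descriptor layout, where the mask and the
enumerator value are illustrative assumptions rather than the definitions
used by the patch:

#include <stdint.h>
#include <stdio.h>

typedef uint64_t page_descr_t;

#define DESCR_SIZE_MASK  0xfffULL   /* low 12 bits: enumerated page size */
#define DESCR_SIZE_4K    0x0ULL     /* assumed enumerator for 4K pages   */

int main(void)
{
    page_descr_t descr = 0x12345000ULL | DESCR_SIZE_4K;
    uint64_t addr = descr & ~DESCR_SIZE_MASK;          /* start of the page */
    unsigned int size_code = descr & DESCR_SIZE_MASK;  /* granularity code  */

    printf("addr=%#llx size_code=%u\n", (unsigned long long)addr, size_code);
    return 0;
}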

I haven't seen any reply from you on my concern with this approach (see 
[1]).

For convenience, I will duplicate the message here.

If you let the user choose the granularity then, I believe, you will
prevent the hypervisor from doing some optimizations.

For instance, if the guest supplies only 4KB pages but the hypervisor
uses 64KB pages, there is no way to easily map them contiguously in the
hypervisor (e.g. using vmap).

Is there a particular reason to allow the ring buffer to be
non-contiguous in the guest physical address space?

Depending on the answer, there are different ways to handle that:
1) Request the guest to allocate memory in 64KB (on Arm) chunks and
pass the base address of each chunk
2) Request the guest to allocate the buffer contiguously and pass the
base address and size

Cheers,

[1] https://lists.xen.org/archives/html/xen-devel/2018-12/msg01038.html

-- 
Julien Grall


* Re: [PATCH v2 04/18] argo: init, destroy and soft-reset, with enable command line opt
  2018-12-20  6:39 ` [PATCH v2 04/18] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
@ 2018-12-20 14:41   ` Lars Kurth
  0 siblings, 0 replies; 28+ messages in thread
From: Lars Kurth @ 2018-12-20 14:41 UTC (permalink / raw)
  To: Christopher Clark, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, Daniel Smith, Andrew Cooper,
	Jason Andryuk, Tim (Xen.org),
	George Dunlap, Rich Persaud, James McKenzie, Julien Grall,
	Paul Durrant, Jan Beulich, Ian Jackson, Eric Chanudet,
	Roger Pau Monne



On 20/12/2018, 06:39, "Christopher Clark" <christopher.w.clark@gmail.com> wrote:

    The software license on the public header is the BSD license, standard
    procedure for the public Xen headers.
    
Thank you for posting the patch. I think it is important to note that these headers were originally posted under a GPL license at 
[1] https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html 

The following ACK is to confirm that only Citrix employees contributed to the header files in the series posted at [1], 
and thus that the copyright of the files in question is fully owned by Citrix. The ACK also confirms that Citrix is happy for 
the header files to be published under a BSD license in this series (which is based on [1]).

Acked-by: Lars Kurth <lars.kurth@citrix.com>


* Re: [PATCH v2 02/18] argo: introduce the argo_message_op hypercall boilerplate
  2018-12-20  6:38 ` [PATCH v2 02/18] argo: introduce the argo_message_op hypercall boilerplate Christopher Clark
@ 2018-12-20 15:18   ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-20 15:18 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, xen-devel,
	eric chanudet, Roger Pau Monne

>>> On 20.12.18 at 07:38, <christopher.w.clark@gmail.com> wrote:
> Presence is gated upon CONFIG_ARGO.
> 
> Registers the hypercall previously reserved for this.
> Takes 5 arguments, does nothing and returns -ENOSYS.
> 
> Will be avoiding a compat ABI by using fixed-size types in hypercall ops so
> HYPERCALL, rather than COMPAT_CALL, is the correct macro for the hypercall
> tables.
> 
> Even though handles will be used for (up to) two of the arguments to the
> hypercall, there will be no need for any XLAT_* translation functions
> because the referenced data structures have been constructed to be exactly
> the same size and bit pattern on both 32-bit and 64-bit guests, and padded
> to be integer multiples of 32 bits in size. This means that the same
> copy_to_guest and copy_from_guest logic can be relied upon to perform as
> required without any further intervention. Testing communication with 32
> and 64 bit guests has confirmed this works as intended.
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
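
A minimal standalone sketch of the struct layout rule described above, using
a hypothetical structure rather than one of the argo argument types: only
fixed-width types, explicitly padded to a multiple of 32 bits, so that the
size and field offsets are identical for 32-bit and 64-bit guests and no
XLAT_* translation is required.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t base;    /* no pointers or longs, whose width differs per ABI */
    uint32_t len;
    uint32_t pad;     /* explicit padding keeps the size a 32-bit multiple */
} example_arg_t;

int main(void)
{
    /* Prints the same values whether built as a 32-bit or 64-bit binary. */
    printf("sizeof=%zu offsetof(len)=%zu\n",
           sizeof(example_arg_t), offsetof(example_arg_t, len));
    return 0;
}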

Acked-by: Jan Beulich <jbeulich@suse.com>
with one further question:

> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -118,7 +118,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
>  #define __HYPERVISOR_domctl               36
>  #define __HYPERVISOR_kexec_op             37
>  #define __HYPERVISOR_tmem_op              38
> -#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
> +#define __HYPERVISOR_argo_message_op      39

Is "message op" really appropriate? I.e. wouldn't
__HYPERVISOR_argo_op be a better fit considering that this is
not just about message exchange, but also configuration etc?

Jan




* Re: [PATCH v2 03/18] argo: define argo_dprintk for subsystem debugging
  2018-12-20  6:39 ` [PATCH v2 03/18] argo: define argo_dprintk for subsystem debugging Christopher Clark
@ 2018-12-20 15:20   ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-20 15:20 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, xen-devel,
	eric chanudet, Roger Pau Monne

>>> On 20.12.18 at 07:39, <christopher.w.clark@gmail.com> wrote:
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
> @@ -19,6 +19,19 @@
>  #include <xen/errno.h>
>  #include <xen/guest_access.h>
>  
> +/*
> + * Debug
> + */
> +
> +/* Set ARGO_DEBUG to 1 here to enable more debug messages */
> +#define ARGO_DEBUG 0
> +
> +#ifdef ARGO_DEBUG

Well, either the way to switch between modes is replacing the
0 by a 1 above (in which case you mean #if here) or you mean
#ifdef here and want to switch by toggling between #define
and #undef.
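
A minimal standalone sketch of the difference, assuming the 0/1 toggle is
the intended style:

#include <stdio.h>

#define ARGO_DEBUG 0

int main(void)
{
#ifdef ARGO_DEBUG
    /* Taken even though the value is 0: #ifdef only tests definedness. */
    puts("#ifdef ARGO_DEBUG is true");
#endif

#if ARGO_DEBUG
    puts("#if ARGO_DEBUG is only true when the value is non-zero");
#else
    /* With a 0/1 toggle, #if is the form that compiles the code out. */
    puts("debug path compiled out");
#endif
    return 0;
}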

Jan




* Re: [PATCH v2 07/18] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI
  2018-12-20  6:39 ` [PATCH v2 07/18] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
@ 2018-12-20 15:22   ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-20 15:22 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, xen-devel,
	eric chanudet, Roger Pau Monne

>>> On 20.12.18 at 07:39, <christopher.w.clark@gmail.com> wrote:
> EMSGSIZE: Argo's sendv operation will return EMSGSIZE when an excess amount
> of data, across all iovs, has been supplied, exceeding either the statically
> configured maximum size of a transmittable message, or the (variable) size
> of the ring registered by the destination domain.
> 
> ECONNREFUSED: Argo's register operation will return ECONNREFUSED if a ring
> is being registered to communicate with a specific remote domain that does
> exist but is not argo-enabled.
> 
> These codes are described by POSIX here:
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html 
>     EMSGSIZE     : "Message too large"
>     ECONNREFUSED : "Connection refused".
> 
> Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>

Almost - what's still missing is the (simple) rationale behind the
values chosen. With a sentence added saying they come from
Linux like the rest of the numbers,
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH v2 09/18] xsm, argo: XSM control for argo register; add argo_mac bootparam
  2018-12-20  6:39 ` [PATCH v2 09/18] xsm, argo: XSM control for argo register; add argo_mac bootparam Christopher Clark
@ 2018-12-20 15:29   ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2018-12-20 15:29 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, ross.philipson,
	Jason Andryuk, Daniel Smith, Andrew Cooper,
	Konrad Rzeszutek Wilk, Ian Jackson, Rich Persaud, James McKenzie,
	George Dunlap, Julien Grall, Paul Durrant, xen-devel,
	Daniel de Graaf, eric chanudet, Roger Pau Monne

>>> On 20.12.18 at 07:39, <christopher.w.clark@gmail.com> wrote:
> --- a/xen/common/argo.c
> +++ b/xen/common/argo.c
> @@ -35,6 +35,22 @@ DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
>  static bool __read_mostly opt_argo_enabled;
>  boolean_param("argo", opt_argo_enabled);
>  
> +/* Xen command line option for conservative or relaxed access control */
> +bool __read_mostly argo_mac_bootparam_enforcing = true;

Please can you follow our naming convention, which would
make this something like opt_argo_mac_enforcing?

And then - except in the parsing function, the variable is never
used.

> +static int __init parse_argo_mac_param(const char *s)
> +{
> +    if ( !strcmp(s, "enforcing") )
> +        argo_mac_bootparam_enforcing = true;
> +    else if ( !strcmp(s, "permissive") )
> +        argo_mac_bootparam_enforcing = false;
> +    else
> +        return -EINVAL;
> +
> +    return 0;
> +}
> +custom_param("argo_mac", parse_argo_mac_param);

New command line options need to be accompanied by an
addition to docs/misc/xen-command-line.markdown.

Also we've more or less settled on using dashes instead of
underscores in new command line options.
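
For illustration, one shape the option could take after those adjustments (a
sketch based on the code quoted above, not the posted patch; a matching
docs/misc/xen-command-line.markdown entry would still be needed):

static bool __read_mostly opt_argo_mac_enforcing = true;

static int __init parse_argo_mac_param(const char *s)
{
    if ( !strcmp(s, "enforcing") )
        opt_argo_mac_enforcing = true;
    else if ( !strcmp(s, "permissive") )
        opt_argo_mac_enforcing = false;
    else
        return -EINVAL;

    return 0;
}
custom_param("argo-mac", parse_argo_mac_param);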

Jan




* Re: [PATCH v2 11/18] argo: implement the register op
  2018-12-20 11:20   ` Julien Grall
@ 2018-12-21  1:17     ` Christopher Clark
  2018-12-21 12:21       ` Julien Grall
  0 siblings, 1 reply; 28+ messages in thread
From: Christopher Clark @ 2018-12-21  1:17 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Tim Deegan, Daniel Smith,
	Rich Persaud, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet, Roger Pau Monne

On Thu, Dec 20, 2018 at 3:20 AM Julien Grall <julien.grall@arm.com> wrote:
>
> Hi Christopher,
>
> On 12/20/18 6:39 AM, Christopher Clark wrote:
> > Used by a domain to register a region of memory for receiving messages from
> > either a specified other domain, or, if specifying a wildcard, any domain.
> >
> > This operation creates a mapping within Xen's private address space that
> > will remain resident for the lifetime of the ring. In subsequent commits,
> > the hypervisor will use this mapping to copy data from a sending domain into
> > this registered ring, making it accessible to the domain that registered the
> > ring to receive data.
> >
> > In this code, the p2m type of the memory supplied by the guest for the ring
> > must be p2m_ram_rw, which is a conservative choice made to defer the need to
> > reason about the other p2m types with this commit.
> >
> > xen_argo_page_descr_t type is introduced as a page descriptor, to convey
> > both the physical address of the start of the page and its granularity. The
> > smallest granularity page is assumed to be 4096 bytes and the lower twelve
> > bits of the type are used to indicate an enumerated page size.
>
> I haven't seen any reply from you on my concern with this approach (see
> [1]).
>
> For convenience, I will duplicate the message here.

Hi Julien,

Thanks for the reminder.

> If you let the user choose the granularity then, I believe, you will
> prevent the hypervisor from doing some optimizations.

OK, let's work through this then.

> For instance, if the guest supplies only 4KB pages but the hypervisor
> uses 64KB pages, there is no way to easily map them contiguously in the
> hypervisor (e.g. using vmap).

Right. So with the matrix:

4K guest, 4K xen : fine.
4K guest, 64K xen : contiguous guest physical chunks or region required.
64K guest, 4K xen : weird? seems doable.
64K guest, 64K xen : fine (with some work).

as you note, the 4K guest, 64K hypervisor case is the one that
raises the question.

> Is there a particular reason to allow the ring buffer to be
> non-contiguous in the guest physical address space?

It hasn't been a necessary restriction up to this point, and isn't
so on the platforms we're deploying on, so my preference is not to
introduce it as an additional requirement if it can be avoided. It
allows us to use vmalloc (rather than kmalloc) on Linux, which is
helpful.

There can be high turnover in ring registration for a server with
many short-lived connections. While the rings are not necessarily
large -- the default is 128K in the current Linux driver, though
clients can change what they use -- contiguous memory regions are a
more limited resource for the kernel to manage, and avoiding
pressure on that contiguous region allocator when it isn't necessary
is preferable.

We also do not want to disincentivize a server that is seeking to
improve performance from registering larger rings -- so allowing
non-contiguous regions fits with that.

I'd have to study the Linux driver further to say whether there
are stronger additional requirements that I'm not currently aware
of, but I don't know of any at the moment.

> Depending on the answer, there are different ways to handle that:
> 1) Request the guest to allocate memory in 64KB (on Arm) chunks and
> pass the base address of each chunk
> 2) Request the guest to allocate the buffer contiguously and pass the
> base address and size

I understand that #2 would avoid the need to describe a contiguous
allocation of memory as a series of chunks; but I think #1 is the
option I would select. Do you think that would be acceptable?

Thanks again for your interest in this stuff.

Christopher


* Re: [PATCH v2 11/18] argo: implement the register op
  2018-12-21  1:17     ` Christopher Clark
@ 2018-12-21 12:21       ` Julien Grall
  0 siblings, 0 replies; 28+ messages in thread
From: Julien Grall @ 2018-12-21 12:21 UTC (permalink / raw)
  To: Christopher Clark
  Cc: Stefano Stabellini, Wei Liu, Ross Philipson,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Jason Andryuk, Ian Jackson, Tim Deegan, Daniel Smith,
	Rich Persaud, Paul Durrant, Jan Beulich, xen-devel,
	James McKenzie, Eric Chanudet, Roger Pau Monne

Hi Christopher,

On 21/12/2018 01:17, Christopher Clark wrote:
> On Thu, Dec 20, 2018 at 3:20 AM Julien Grall <julien.grall@arm.com> wrote:
>>
>> Hi Christopher,
>>
>> On 12/20/18 6:39 AM, Christopher Clark wrote:
>>> Used by a domain to register a region of memory for receiving messages from
>>> either a specified other domain, or, if specifying a wildcard, any domain.
>>>
>>> This operation creates a mapping within Xen's private address space that
>>> will remain resident for the lifetime of the ring. In subsequent commits,
>>> the hypervisor will use this mapping to copy data from a sending domain into
>>> this registered ring, making it accessible to the domain that registered the
>>> ring to receive data.
>>>
>>> In this code, the p2m type of the memory supplied by the guest for the ring
>>> must be p2m_ram_rw, which is a conservative choice made to defer the need to
>>> reason about the other p2m types with this commit.
>>>
>>> xen_argo_page_descr_t type is introduced as a page descriptor, to convey
>>> both the physical address of the start of the page and its granularity. The
>>> smallest granularity page is assumed to be 4096 bytes and the lower twelve
>>> bits of the type are used to indicate an enumerated page size.
>>
>> I haven't seen any reply from you on my concern with this approach (see
>> [1]).
>>
>> For convenience, I will duplicate the message here.
> 
> Hi Julien,
> 
> Thanks for the reminder.
> 
>> If you let the user choose the granularity then, I believe, you will
>> prevent the hypervisor from doing some optimizations.
> 
> OK, let's work through this then.
> 
>> For instance, if the guest supplies only 4KB pages but the hypervisor
>> uses 64KB pages, there is no way to easily map them contiguously in the
>> hypervisor (e.g. using vmap).
> 
> Right. So with the matrix:
> 
> 4K guest, 4K xen : fine.
> 4K guest, 64K xen : contiguous guest physical chunks or region required.
> 64K guest, 4K xen : weird? seems doable.

It is not weird: 64KB splits nicely into 16 4KB chunks. Actually, Linux
upstream already has all the support needed to run with 64KB pages on
current Xen.

> 64K guest, 64K xen : fine (with some work).
> 
> as you note, the 4K guest, 64K hypervisor case is the one that
> raises the question.

That's correct. To generalize: the problem will happen whenever the guest
page size is smaller than the Xen page size.

> 
>> Is there a particular reason to allow the ring buffer to be
>> non-contiguous in the guest physical address space?
> 
> It hasn't been a necessary restriction up to this point, and isn't
> so on the platforms we're deploying on, so my preference is not to
> introduce it as an additional requirement if it can be avoided. It
> allows us to use vmalloc (rather than kmalloc) on Linux, which is
> helpful.

vmalloc might be an issue on Arm if we request 64KB chunks of physical
memory, although I don't know the vmalloc implementation well enough to say
whether this can be addressed.

> 
> There can be high turnover in ring registration for a server with
> many short-lived connections. While the rings are not necessarily
> large -- the default is 128K in the current Linux driver, though
> clients can change what they use -- contiguous memory regions are a
> more limited resource for the kernel to manage, and avoiding
> pressure on that contiguous region allocator when it isn't necessary
> is preferable.
> 
> We also do not want to disincentivize a server that is seeking to
> improve performance from registering larger rings -- so allowing
> non-contiguous regions fits with that.
> 
> I'd have to study the Linux driver further to say whether there
> are stronger additional requirements that I'm not currently aware
> of, but I don't know of any at the moment.

Thank you for the detailed explanation. So I think my option 1) below would
suit you best here.

> 
>> Depending on the answer, there are different ways to handle that:
>> 1) Request the guest to allocate memory in 64KB (on Arm) chunks and
>> pass the base address of each chunk
>> 2) Request the guest to allocate the buffer contiguously and pass the
>> base address and size
> 
> I understand that #2 would avoid the need to describe a contiguous
> allocation of memory as a series of chunks; but I think #1 is the
> option I would select. Do you think that would be acceptable?

1) is a good option for me. I forgot to mention the base address would need to 
be aligned to 64KB.

Cheers,

-- 
Julien Grall


end of thread, other threads:[~2018-12-21 12:37 UTC | newest]

Thread overview: 28+ messages
2018-12-20  6:38 [PATCH v2 00/18] Argo: hypervisor-mediated interdomain communication Christopher Clark
2018-12-20  6:38 ` [PATCH v2 01/18] argo: Introduce the Kconfig option to govern inclusion of Argo Christopher Clark
2018-12-20  9:00   ` Jan Beulich
2018-12-20  6:38 ` [PATCH v2 02/18] argo: introduce the argo_message_op hypercall boilerplate Christopher Clark
2018-12-20 15:18   ` Jan Beulich
2018-12-20  6:39 ` [PATCH v2 03/18] argo: define argo_dprintk for subsystem debugging Christopher Clark
2018-12-20 15:20   ` Jan Beulich
2018-12-20  6:39 ` [PATCH v2 04/18] argo: init, destroy and soft-reset, with enable command line opt Christopher Clark
2018-12-20 14:41   ` Lars Kurth
2018-12-20  6:39 ` [PATCH v2 05/18] xen: add simple errno-returning macros for copy from guest Christopher Clark
2018-12-20  6:39 ` [PATCH v2 06/18] xen: add XEN_GUEST_HANDLE_NULL macros for null XEN_GUEST_HANDLE Christopher Clark
2018-12-20  6:39 ` [PATCH v2 07/18] errno: add POSIX error codes EMSGSIZE, ECONNREFUSED to the ABI Christopher Clark
2018-12-20 15:22   ` Jan Beulich
2018-12-20  6:39 ` [PATCH v2 08/18] xen/arm: introduce guest_handle_for_field() Christopher Clark
2018-12-20  6:39 ` [PATCH v2 09/18] xsm, argo: XSM control for argo register; add argo_mac bootparam Christopher Clark
2018-12-20 15:29   ` Jan Beulich
2018-12-20  6:39 ` [PATCH v2 10/18] xsm, argo: XSM control for argo message send operation Christopher Clark
2018-12-20  6:39 ` [PATCH v2 11/18] argo: implement the register op Christopher Clark
2018-12-20 11:20   ` Julien Grall
2018-12-21  1:17     ` Christopher Clark
2018-12-21 12:21       ` Julien Grall
2018-12-20  6:39 ` [PATCH v2 12/18] argo: implement the unregister op Christopher Clark
2018-12-20  6:39 ` [PATCH v2 13/18] argo: implement the sendv op; evtchn: expose send_guest_global_virq Christopher Clark
2018-12-20  6:39 ` [PATCH v2 14/18] argo: implement the notify op Christopher Clark
2018-12-20  6:39 ` [PATCH v2 15/18] xsm, argo: XSM control for any access to argo by a domain Christopher Clark
2018-12-20  6:39 ` [PATCH v2 16/18] xsm, argo: notify: don't describe rings that cannot be sent to Christopher Clark
2018-12-20  6:39 ` [PATCH v2 17/18] argo: validate hypercall arg structures via compat machinery Christopher Clark
2018-12-20  6:39 ` [PATCH v2 18/18] argo: unmap rings on suspend; signal ring-owners on resume Christopher Clark
