From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Xen-devel <xen-devel@lists.xenproject.org>
Cc: "Lars Kurth" <lars.kurth@citrix.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>, "Wei Liu" <wl@xen.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"George Dunlap" <George.Dunlap@eu.citrix.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Tim Deegan" <tim@xen.org>, "Jan Beulich" <JBeulich@suse.com>,
	"Ian Jackson" <ian.jackson@citrix.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: [Xen-devel] [PATCH 4/4] docs/sphinx: Technical Debt
Date: Thu, 3 Oct 2019 21:56:23 +0100
Message-ID: <20191003205623.20839-4-andrew.cooper3@citrix.com>
In-Reply-To: <20191003205623.20839-1-andrew.cooper3@citrix.com>

This identifies various areas of technical debt which either need to be, or
are being, worked on, along with enough clarifying details for people to
follow.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
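Note for reviewers (illustrative only, not part of the commit): the
:xen-cs: role configured in docs/conf.py can be modelled in a few lines of
Python.  This is a rough model of how sphinx.ext.extlinks is expected to
treat a (URL pattern, prefix) pair, not Sphinx's actual code, and
expand_role() is a hypothetical helper name.

```python
# Same mapping as the extlinks entry this patch adds to docs/conf.py.
extlinks = {
    'xen-cs':
        ('https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=%s',
         'Xen c/s '),
}


def expand_role(role, target):
    """Hypothetical helper: return (url, link_text) for :role:`target`.

    Assumes the (URL pattern, prefix) tuple form used in this patch: the
    target is substituted into the pattern, and the link text is the
    prefix followed by the target.
    """
    url_pattern, prefix = extlinks[role]
    return url_pattern % target, prefix + target


url, text = expand_role('xen-cs', '6a2de353a9')
print(text, '->', url)
```

So :xen-cs:`6a2de353a9` in tech-debt.rst should render as a "Xen c/s
6a2de353a9" link pointing at the gitweb commitdiff page for that changeset.
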
CC: Lars Kurth <lars.kurth@citrix.com>
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Tim Deegan <tim@xen.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
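
Note for reviewers (illustrative only, not part of the commit): for anyone
unfamiliar with PDX, the compression described in the CONFIG_PDX section can
be sketched as below.  It is a simplified model rather than Xen's
implementation.  It assumes 4k pages, a single contiguous run of unused MFN
bits, and a made-up layout with 1GiB of RAM at 0 and 1GiB at 64GiB.  All
names are hypothetical.

```python
def make_pdx_codec(bottom_bits, hole_bits):
    """Build MFN<->PDX converters.

    The `hole_bits` bits directly above the low `bottom_bits` bits are
    assumed to be zero in every valid MFN, so they can be squeezed out.
    """
    bottom_mask = (1 << bottom_bits) - 1
    hole_mask = ((1 << hole_bits) - 1) << bottom_bits

    def mfn_to_pdx(mfn):
        # A valid MFN never has bits set inside the hole.
        assert mfn & hole_mask == 0, "MFN has bits set inside the hole"
        top = mfn >> (bottom_bits + hole_bits)
        return (top << bottom_bits) | (mfn & bottom_mask)

    def pdx_to_mfn(pdx):
        # Reinsert the hole between the top and bottom parts.
        top = pdx >> bottom_bits
        return (top << (bottom_bits + hole_bits)) | (pdx & bottom_mask)

    return mfn_to_pdx, pdx_to_mfn


# Assumed layout: MFNs 0..0x3ffff and 0x1000000..0x103ffff are RAM, so MFN
# bits 18-23 are never set and can be removed.
mfn_to_pdx, pdx_to_mfn = make_pdx_codec(bottom_bits=18, hole_bits=6)

last_mfn = 0x103ffff
print(hex(last_mfn + 1), "frametable entries when indexed by MFN")
print(hex(mfn_to_pdx(last_mfn) + 1), "frametable entries when indexed by PDX")
```

The extra masking and shifting on every conversion is the cost referred to
in the text, and is what a system with RAM packed from 0 could compile out.
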
---
 docs/conf.py            |  11 +++-
 docs/index.rst          |   8 +++
 docs/misc/tech-debt.rst | 130 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/tech-debt.rst

diff --git a/docs/conf.py b/docs/conf.py
index 50e41501db..0d2227f52e 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -53,7 +53,7 @@
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = []
+extensions = ['sphinx.ext.extlinks']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -192,3 +192,12 @@
 
 # A list of files that should not be packed into the epub file.
 epub_exclude_files = ['search.html']
+
+
+# -- Configuration for external links ----------------------------------------
+
+extlinks = {
+    'xen-cs':
+        ('https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=%s',
+         'Xen c/s '),
+}
diff --git a/docs/index.rst b/docs/index.rst
index b8ab13178c..0a2af2db9d 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -59,3 +59,11 @@ Miscellanea
 .. toctree::
 
    glossary
+
+Unsorted
+--------
+
+.. toctree::
+   :maxdepth: 2
+
+   misc/tech-debt
diff --git a/docs/misc/tech-debt.rst b/docs/misc/tech-debt.rst
new file mode 100644
index 0000000000..172ba3bd51
--- /dev/null
+++ b/docs/misc/tech-debt.rst
@@ -0,0 +1,130 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Technical Debt
+==============
+
+Hypervisor
+----------
+
+CONFIG_PDX
+~~~~~~~~~~
+
+Xen uses the term MFN for Machine Frame Number, which is synonymous with
+Linux's PFN, and maps linearly to system/host/machine physical addresses.
+
+For every page of RAM, a ``struct page_info`` is needed for tracking purposes.
+In the simple case, the frametable is an array of ``struct page_info[]``
+indexed by MFN.
+
+However, this is inefficient when a system has banks of RAM spread out in the
+address space, as a large amount of space is wasted on frametable entries for
+non-existent frames.  This wastes both virtual address space and RAM.
+
+As a consequence, Xen has a compression scheme known as PDX which removes
+unused bits out of the middle of MFNs, to make a more tightly packed Page
+inDeX, which in turn reduces the size of the frametable for the system.
+
+At the moment, PDX compression is unconditionally used.
+
+However, PDX compression does come with a cost in terms of the complexity to
+convert between PFNs and pages, which is a common operation in Xen.
+
+Typically, ARM32 systems do have RAM banks in discrete locations, and want to
+use PDX compression, while typically ARM64 and x86 systems have RAM packed
+from 0 with no holes.
+
+The goal of this work is to have ``CONFIG_PDX`` selected by ARM32 only.  This
+requires slightly untangling the memory management code in ARM and x86 to give
+it a clean compile boundary where PDX conversions are used.
+
+
+Waitqueue infrastructure
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Livepatching safety in Xen depends on all CPUs rendezvousing on the return to
+guest path, with no stack frame.  The vCPU waitqueue infrastructure undermines
+this safety by copying a stack frame sideways, and ``longjmp()``\-ing away.
+
+Waitqueues are only used by the introspection/mem_event/paging infrastructure,
+where the design of the rings causes some problems.  There is a single 4k page
+used for the ring, which serves both synchronous requests, and lossless async
+requests.  In practice, introspecting an 11-vcpu guest is sufficient to cause
+the waitqueue infrastructure to start to be used.
+
+A better ring design would be to have a slot per vcpu for synchronous
+requests (simplifies producing and consuming of requests), and a multipage
+ring buffer (of negotiable size) with lossy semantics for async requests.
+
+A design such as this would guarantee that Xen never has to block waiting for
+userspace to create enough space on the ring for a vcpu to write state out.
+
+.. note::
+
+   There are other aspects of the existing ring infrastructure which are
+   driving a redesign, but these don't relate directly to the waitqueue
+   infrastructure and livepatching safety.
+
+   The most serious problem is that the ring infrastructure is GFN based,
+   which leaves the guest either able to mess with the ring, or a shattered
+   host superpage where the ring used to be, and the guest balloon driver able
+   to prevent the introspection agent from connecting/reconnecting the ring.
+
+As there are multiple compelling reasons to redesign the ring infrastructure,
+the plan is to introduce the new ring ABI, deprecate and remove the old ABI,
+and simply delete the waitqueue infrastructure at that point, rather than try
+to redesign livepatching from scratch in an attempt to cope with unwinding old
+stack frames.
+
+
+Dom0
+----
+
+Remove xenstored's dependencies on unstable interfaces
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Various xenstored implementations use libxc for two purposes.  It would be a
+substantial advantage to move xenstored onto entirely stable interfaces, which
+would disconnect it from the internals of libxc.
+
+1. Foreign mapping of the store ring
+
+   This is obsolete since :xen-cs:`6a2de353a9` (2012) which allocated grant
+   entries instead, to allow xenstored to function as a stub-domain without dom0
+   permissions.  :xen-cs:`38eeb3864d` dropped foreign mapping for cxenstored.
+   However, there are no OCaml bindings for libxengnttab.
+
+   Work Items:
+
+   * Minimal ``tools/ocaml/libs/xg/`` binding for ``tools/libs/gnttab/``.
+   * Replicate :xen-cs:`38eeb3864d` for oxenstored as well.
+
+2. Figuring out which domain(s) have gone away
+
+   Currently, the handling of domains is asymmetric.
+
+   * When a domain is created, the toolstack explicitly sends an
+     ``XS_INTRODUCE(domid, store mfn, store evtchn)`` message to xenstored, to
+     cause xenstored to connect to the guest ring, and fire the
+     ``@introduceDomain`` watch.
+
+   * When a domain is destroyed, Xen fires ``VIRQ_DOM_EXC`` which is bound by
+     xenstored, rather than the toolstack.  xenstored updates its idea of the
+     status of domains, and fires the ``@releaseDomain`` watch.
+
+     Xenstored uses ``xc_domain_getinfo()`` to work out which domain(s) have gone
+     away, and only cares about the shutdown status.
+
+     Furthermore, ``@releaseDomain`` (like ``VIRQ_DOM_EXC``) is a single-bit
+     message, which requires all listeners to evaluate whether the message applies
+     to them or not.  This results in a flurry of ``xc_domain_getinfo()`` calls
+     from multiple entities in the system, which all serialise on the domctl lock
+     in Xen.
+
+     Work Items:
+
+     * Figure out how shutdown status can be expressed in a stable way from Xen.
+     * Figure out if ``VIRQ_DOM_EXC`` and ``@releaseDomain`` can be extended
+       or superseded to carry at least a domid, to make domain shutdown scale
+       better.
+     * Figure out if ``VIRQ_DOM_EXC`` would better be bound by the toolstack,
+       rather than xenstored.
-- 
2.11.0
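
Note for reviewers (illustrative only, not part of the patch): the ring
design proposed in the waitqueue section can be modelled as below.  This is
a toy model of the idea, not an existing or planned Xen ABI.  The class and
method names are hypothetical, and two details are assumptions: a vcpu has
at most one synchronous request outstanding, and "lossy" means the oldest
async entry is overwritten.

```python
from collections import deque


class VmEventRing:
    """Hypothetical model of the proposed ring; not an existing Xen ABI."""

    def __init__(self, nr_vcpus, async_capacity):
        # One dedicated slot per vcpu for synchronous requests.
        self.sync_slots = [None] * nr_vcpus
        # Bounded buffer for async requests; a full deque silently
        # discards its oldest entry on append.
        self.async_ring = deque(maxlen=async_capacity)
        self.dropped = 0

    def put_sync(self, vcpu_id, request):
        # Assumption: at most one outstanding synchronous request per
        # vcpu, so its slot is always free and the producer never waits.
        assert self.sync_slots[vcpu_id] is None
        self.sync_slots[vcpu_id] = request

    def take_sync(self, vcpu_id):
        # Consumer side: fetch and clear the vcpu's slot.
        request, self.sync_slots[vcpu_id] = self.sync_slots[vcpu_id], None
        return request

    def put_async(self, request):
        # Never blocks: when full, count the loss and overwrite.
        if len(self.async_ring) == self.async_ring.maxlen:
            self.dropped += 1
        self.async_ring.append(request)


ring = VmEventRing(nr_vcpus=2, async_capacity=2)
ring.put_sync(1, "sync request from vcpu 1")
for event in range(3):
    ring.put_async(event)
print(ring.take_sync(1), list(ring.async_ring), ring.dropped)
```

Neither producer path can block waiting for the consumer to free up space,
which is the property that would let the waitqueue infrastructure be
deleted.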


