All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Durrant <pdurrant@amazon.com>
To: <xen-devel@lists.xenproject.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Julien Grall <julien@xen.org>, Wei Liu <wl@xen.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Paul Durrant <pdurrant@amazon.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>
Subject: [Xen-devel] [PATCH v4 2/2] docs/designs: Add a design document for migration of xenstore data
Date: Wed, 29 Jan 2020 14:47:02 +0000	[thread overview]
Message-ID: <20200129144702.1543-3-pdurrant@amazon.com> (raw)
In-Reply-To: <20200129144702.1543-1-pdurrant@amazon.com>

This patch details proposes extra migration data and xenstore protocol
extensions to support non-cooperative live migration of guests.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>

v4:
 - Drop the restrictions on special paths

v3:
 - New in v3
---
 docs/designs/xenstore-migration.md | 121 +++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)
 create mode 100644 docs/designs/xenstore-migration.md

diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
new file mode 100644
index 0000000000..991236e201
--- /dev/null
+++ b/docs/designs/xenstore-migration.md
@@ -0,0 +1,121 @@
+# Xenstore Migration
+
+## Background
+
+The design for *Non-Cooperative Migration of Guests*[1] explains that extra
+save records are required in the migrations stream to allow a guest running
+PV drivers to be migrated without its co-operation. Moreover the save
+records must include details of registered xenstore watches as well as
+content; information that cannot currently be recovered from `xenstored`,
+and hence some extension to the xenstore protocol[2] will also be required.
+
+The *libxenlight Domain Image Format* specification[3] already defines a
+record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
+transferring xenstore data pertaining to the domain directly as it is
+specified such that keys are relative to the path
+`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
+define at least one new save record type.
+
+## Proposal
+
+### New Save Record
+
+A new mandatory record type should be defined within the libxenlight Domain
+Image Format:
+
+`0x00000007: DOMAIN_XENSTORE_DATA`
+
+The format of each of these new records should be as follows:
+
+
+```
+0     1     2     3     4     5     6     7 octet
++------------------------+------------------------+
+| type                   | record specific data   |
++------------------------+                        |
+...
++-------------------------------------------------+
+```
+
+
+| Field | Description |
+|---|---|
+| `type` | 0x00000000: invalid |
+|        | 0x00000001: node data |
+|        | 0x00000002: watch data |
+|        | 0x00000003 - 0xFFFFFFFF: reserved for future use |
+
+
+where data is always in the form of a NUL separated and terminated tuple
+as follows
+
+
+**node data**
+
+
+`<path>|<value>|<perm-as-string>|`
+
+
+`<path>` is considered relative to the domain path `/local/domain/$domid`
+and hence must not begin with `/`.
+`<path>` and `<value>` should be suitable to formulate a `WRITE` operation
+to the receiving xenstore and `<perm-as-string>` should be similarly suitable
+to formulate a subsequent `SET_PERMS` operation.
+
+**watch data**
+
+
+`<path>|<token>|`
+
+`<path>` again is considered relative and, together with `<token>`, should
+be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
+
+
+### Protocol Extension
+
+The `WATCH` operation does not allow specification of a `<domid>`; it is
+assumed that the watch pertains to the domain that owns the shared ring
+over which the operation is passed. Hence, for the tool-stack to be able
+to register a watch on behalf of a domain a new operation is needed:
+
+```
+ADD_DOMAIN_WATCHES      <domid>|<watch>|+
+
+Adds watches on behalf of the specified domain.
+
+<watch> is a NUL separated tuple of <path>|<token>. The semantics of this
+operation are identical to the domain issuing WATCH <path>|<token>| for
+each <watch>.
+```
+
+The watch information for a domain also needs to be extracted from the
+sending xenstored so the following operation is also needed:
+
+```
+GET_DOMAIN_WATCHES      <domid>|<index>   <gencnt>|<watch>|* 
+
+Gets the list of watches that are currently registered for the domain.
+
+<watch> is a NUL separated tuple of <path>|<token>. The sub-list returned
+will start at <index> into the the overall list of watches and may be
+truncated such that the returned data fits within XENSTORE_PAYLOAD_MAX.
+If <index> is beyond the end of the overall list then the returned sub-
+list will be empty. If the value of <gencnt> changes then it indicates
+that the overall watch list has changed and thus it may be necessary
+to re-issue the operation for previous values of <index>.
+```
+
+It may also be desirable to state in the protocol specification that
+the `INTRODUCE` operation should not clear the `<mfn>` specified such that
+a `RELEASE` operation followed by an `INTRODUCE` operation form an
+idempotent pair. The current implementation of *C xentored* does this
+(in the `domain_conn_reset()` function) but this could be dropped as this
+behaviour is not currently specified and the page will always be zeroed
+for a newly created domain.
+
+
+* * *
+
+[1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md
+[2] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/xenstore.txt
+[3] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/specs/libxl-migration-stream.pandoc
-- 
2.20.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  parent reply	other threads:[~2020-01-29 14:47 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-29 14:47 [Xen-devel] [PATCH v4 0/2] docs: Migration design documents Paul Durrant
2020-01-29 14:47 ` [Xen-devel] [PATCH v4 1/2] docs/designs: Add a design document for non-cooperative live migration Paul Durrant
2020-01-29 19:47   ` Andrew Cooper
2020-01-30  9:15     ` Durrant, Paul
2020-01-30 11:48       ` Andrew Cooper
2020-01-29 14:47 ` Paul Durrant [this message]
2020-01-29 21:23   ` [Xen-devel] [PATCH v4 2/2] docs/designs: Add a design document for migration of xenstore data Marek Marczykowski-Górecki
2020-01-30  8:45     ` Durrant, Paul
2020-01-30 10:27       ` Marek Marczykowski-Górecki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200129144702.1543-3-pdurrant@amazon.com \
    --to=pdurrant@amazon.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=julien@xen.org \
    --cc=konrad.wilk@oracle.com \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.