xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Tiejun Chen <tiejun.chen@intel.com>
To: xen-devel@lists.xen.org
Cc: Wei Liu <wei.liu2@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: [v10][PATCH 10/16] tools: introduce some new parameters to set rdm policy
Date: Mon, 20 Jul 2015 14:16:57 +0800	[thread overview]
Message-ID: <1437373023-14884-11-git-send-email-tiejun.chen@intel.com> (raw)
In-Reply-To: <1437373023-14884-1-git-send-email-tiejun.chen@intel.com>

This patch introduces user configurable parameters to specify RDM
resource and according policies,

Global RDM parameter:
    rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
    pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

Global RDM parameter, "strategy", allows user to specify reserved regions
explicitly, Currently, using 'host' to include all reserved regions reported
on this platform which is good to handle hotplug scenario. In the future
this parameter may be further extended to allow specifying random regions,
e.g. even those belonging to another platform as a preparation for live
migration with passthrough devices. By default this isn't set so we don't
check all rdms. Instead, we just check rdm specific to a given device if
you're assigning this kind of device. Note this option is not recommended
unless you can make sure any conflict does exist.

'strict/relaxed' policy decides how to handle conflict when reserving RDM
regions in pfn space. If conflict exists, 'strict' means an immediate error
so VM can't keep running, while 'relaxed' allows moving forward with a
warning message thrown out.

Default per-device RDM policy is same as default global RDM policy as being
'relaxed'. And the per-device policy would override the global policy like
others.

CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
---
v9 ~ v10:

* Nothing is changed.

v8:

* One minimal code style change

v7:

* Need to rename some parameters:
  In the xl rdm config parsing, `reserve=' should be `policy='.
  In the xl pci config parsing, `rdm_reserve=' should be `rdm_policy='.
  The type `libxl_rdm_reserve_flag' should be `libxl_rdm_policy'.
  The field name `reserve' in `libxl_rdm_reserve' should be `policy'.

v6:

* Some rename to make our policy reasonable
  "type" -> "strategy"
  "none" -> "ignore"
* Don't expose "ignore" in xl level and just keep that as a default.
  And then sync docs and the patch head description

v5:

* Just make sure the per-device plicy always override the global policy,
  and so cleanup some associated comments and the patch head description.
* A little change to follow one bit, XEN_DOMCTL_DEV_RDM_RELAXED.
* Improve all descriptions in doc.
* Make all rdm variables specific to .hvm

v4:

* No need to define init_val for libxl_rdm_reserve_type since its just zero
* Grab those changes to xl/libxlu to as a final patch

 docs/man/xl.cfg.pod.5        | 81 ++++++++++++++++++++++++++++++++++++++++++++
 docs/misc/vtd.txt            | 24 +++++++++++++
 tools/libxl/libxl_create.c   |  7 ++++
 tools/libxl/libxl_internal.h |  2 ++
 tools/libxl/libxl_pci.c      |  9 +++++
 tools/libxl/libxl_types.idl  | 18 ++++++++++
 6 files changed, 141 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index a3e0e2e..6c55a8b 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -655,6 +655,79 @@ assigned slave device.
 
 =back
 
+=item B<rdm="RDM_RESERVATION_STRING">
+
+(HVM/x86 only) Specifies information about Reserved Device Memory (RDM),
+which is necessary to enable robust device passthrough. One example of RDM
+is reported through ACPI Reserved Memory Region Reporting (RMRR) structure
+on x86 platform.
+
+B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
+
+=over 4
+
+=item B<KEY=VALUE>
+
+Possible B<KEY>s are:
+
+=over 4
+
+=item B<strategy="STRING">
+
+Currently there is only one valid type:
+
+"host" means all reserved device memory on this platform should be checked to
+reserve regions in this VM's guest address space. This global rdm parameter
+allows user to specify reserved regions explicitly, and using "host" includes
+all reserved regions reported on this platform, which is useful when doing
+hotplug.
+
+By default this isn't set so we don't check all rdms. Instead, we just check
+rdm specific to a given device if you're assigning this kind of device. Note
+this option is not recommended unless you can make sure any conflict does exist.
+
+For example, you're trying to set "memory = 2800" to allocate memory to one
+given VM but the platform owns two RDM regions like,
+
+Device A [sbdf_A]: RMRR region_A: base_addr ac6d3000 end_address ac6e6fff
+Device B [sbdf_B]: RMRR region_B: base_addr ad800000 end_address afffffff
+
+In this conflict case,
+
+#1. If B<strategy> is set to "host", for example,
+
+rdm = "strategy=host,policy=strict" or rdm = "strategy=host,policy=relaxed"
+
+It means all conflicts will be handled according to the policy
+introduced by B<policy> as described below.
+
+#2. If B<strategy> is not set at all, but
+
+pci = [ 'sbdf_A, rdm_policy=xxxxx' ]
+
+It means only one conflict of region_A will be handled according to the policy
+introduced by B<rdm_policy="STRING"> as described inside pci options.
+
+=item B<policy="STRING">
+
+Specifies how to deal with conflicts when reserving reserved device
+memory in guest address space.
+
+When that conflict is unsolved,
+
+"strict" means VM can't be created, or the associated device can't be
+attached in the case of hotplug.
+
+"relaxed" allows VM to be created but may cause VM to crash if
+pass-through device accesses RDM. For exampl,e Windows IGD GFX driver
+always accessed RDM regions so it leads to VM crash.
+
+Note this may be overridden by rdm_policy option in PCI device configuration.
+
+=back
+
+=back
+
 =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]>
 
 Specifies the host PCI devices to passthrough to this guest. Each B<PCI_SPEC_STRING>
@@ -717,6 +790,14 @@ dom0 without confirmation.  Please use with care.
 D0-D3hot power management states for the PCI device. False (0) by
 default.
 
+=item B<rdm_policy="STRING">
+
+(HVM/x86 only) This is same as policy option inside the rdm option but
+just specific to a given device. Therefore the default is "relaxed" as
+same as policy option as well.
+
+Note this would override global B<rdm> option.
+
 =back
 
 =back
diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt
index 9af0e99..88b2102 100644
--- a/docs/misc/vtd.txt
+++ b/docs/misc/vtd.txt
@@ -111,6 +111,30 @@ in the config file:
 To override for a specific device:
 	pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
 
+RDM, 'reserved device memory', for PCI Device Passthrough
+---------------------------------------------------------
+
+There are some devices the BIOS controls, for e.g. USB devices to perform
+PS2 emulation. The regions of memory used for these devices are marked
+reserved in the e820 map. When we turn on DMA translation, DMA to those
+regions will fail. Hence BIOS uses RMRR to specify these regions along with
+devices that need to access these regions. OS is expected to setup
+identity mappings for these regions for these devices to access these regions.
+
+While creating a VM we should reserve them in advance, and avoid any conflicts.
+So we introduce user configurable parameters to specify RDM resource and
+according policies,
+
+To enable this globally, add "rdm" in the config file:
+
+    rdm = "strategy=host, policy=relaxed"   (default policy is "relaxed")
+
+Or just for a specific device:
+
+    pci = [ '01:00.0,rdm_policy=relaxed', '03:00.0,rdm_policy=strict' ]
+
+For all the options available to RDM, see xl.cfg(5).
+
 
 Caveat on Conventional PCI Device Passthrough
 ---------------------------------------------
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f366a09..f75d4f1 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -105,6 +105,12 @@ static int sched_params_valid(libxl__gc *gc,
     return 1;
 }
 
+void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
+{
+    if (b_info->u.hvm.rdm.policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
+        b_info->u.hvm.rdm.policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+}
+
 int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
@@ -384,6 +390,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 
         libxl_defbool_setdefault(&b_info->u.hvm.gfx_passthru, false);
 
+        libxl__rdm_setdefault(gc, b_info);
         break;
     case LIBXL_DOMAIN_TYPE_PV:
         libxl_defbool_setdefault(&b_info->u.pv.e820_host, false);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index d52589e..d397143 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1154,6 +1154,8 @@ _hidden int libxl__device_vtpm_setdefault(libxl__gc *gc, libxl_device_vtpm *vtpm
 _hidden int libxl__device_vfb_setdefault(libxl__gc *gc, libxl_device_vfb *vfb);
 _hidden int libxl__device_vkb_setdefault(libxl__gc *gc, libxl_device_vkb *vkb);
 _hidden int libxl__device_pci_setdefault(libxl__gc *gc, libxl_device_pci *pci);
+_hidden void libxl__rdm_setdefault(libxl__gc *gc,
+                                   libxl_domain_build_info *b_info);
 
 _hidden const char *libxl__device_nic_devname(libxl__gc *gc,
                                               uint32_t domid,
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 632c15e..1ebdce7 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -988,6 +988,12 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
 
 out:
     if (!libxl_is_stubdom(ctx, domid, NULL)) {
+        if (pcidev->rdm_policy == LIBXL_RDM_RESERVE_POLICY_STRICT) {
+            flag &= ~XEN_DOMCTL_DEV_RDM_RELAXED;
+        } else if (pcidev->rdm_policy != LIBXL_RDM_RESERVE_POLICY_RELAXED) {
+            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "unknown rdm check flag.");
+            return ERROR_FAIL;
+        }
         rc = xc_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev), flag);
         if (rc < 0 && (hvm || errno != ENOSYS)) {
             LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_assign_device failed");
@@ -1040,6 +1046,9 @@ static int libxl__device_pci_reset(libxl__gc *gc, unsigned int domain, unsigned
 
 int libxl__device_pci_setdefault(libxl__gc *gc, libxl_device_pci *pci)
 {
+    /* We'd like to force reserve rdm specific to a device by default.*/
+    if (pci->rdm_policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
+        pci->rdm_policy = LIBXL_RDM_RESERVE_POLICY_STRICT;
     return 0;
 }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index e1632fa..47dd83a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -76,6 +76,17 @@ libxl_domain_type = Enumeration("domain_type", [
     (2, "PV"),
     ], init_val = "LIBXL_DOMAIN_TYPE_INVALID")
 
+libxl_rdm_reserve_strategy = Enumeration("rdm_reserve_strategy", [
+    (0, "ignore"),
+    (1, "host"),
+    ])
+
+libxl_rdm_reserve_policy = Enumeration("rdm_reserve_policy", [
+    (-1, "invalid"),
+    (0, "strict"),
+    (1, "relaxed"),
+    ], init_val = "LIBXL_RDM_RESERVE_POLICY_INVALID")
+
 libxl_channel_connection = Enumeration("channel_connection", [
     (0, "UNKNOWN"),
     (1, "PTY"),
@@ -369,6 +380,11 @@ libxl_vnode_info = Struct("vnode_info", [
     ("vcpus", libxl_bitmap), # vcpus in this node
     ])
 
+libxl_rdm_reserve = Struct("rdm_reserve", [
+    ("strategy",    libxl_rdm_reserve_strategy),
+    ("policy",      libxl_rdm_reserve_policy),
+    ])
+
 libxl_domain_build_info = Struct("domain_build_info",[
     ("max_vcpus",       integer),
     ("avail_vcpus",     libxl_bitmap),
@@ -467,6 +483,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        # See libxl_ms_vm_genid_generate()
                                        ("ms_vm_genid",      libxl_ms_vm_genid),
                                        ("serial_list",      libxl_string_list),
+                                       ("rdm", libxl_rdm_reserve),
                                        ])),
                  ("pv", Struct(None, [("kernel", string),
                                       ("slack_memkb", MemKB),
@@ -542,6 +559,7 @@ libxl_device_pci = Struct("device_pci", [
     ("power_mgmt", bool),
     ("permissive", bool),
     ("seize", bool),
+    ("rdm_policy",      libxl_rdm_reserve_policy),
     ])
 
 libxl_device_dtdev = Struct("device_dtdev", [
-- 
1.9.1

  parent reply	other threads:[~2015-07-20  6:16 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-20  6:16 [v10][PATCH 00/16] Fix RMRR Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 01/16] xen: introduce XENMEM_reserved_device_memory_map Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 02/16] xen/vtd: create RMRR mapping Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 04/16] xen: enable XENMEM_memory_map in hvm Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 05/16] hvmloader: get guest memory map into memory_map[] Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs Tiejun Chen
2015-07-20 11:00   ` George Dunlap
2015-07-20 11:30   ` Jan Beulich
2015-07-20 12:52     ` George Dunlap
2015-07-20 14:06       ` Chen, Tiejun
2015-07-20 14:10         ` George Dunlap
2015-07-20 14:16           ` Jan Beulich
2015-07-20 14:32             ` Chen, Tiejun
2015-07-20 15:00               ` Jan Beulich
2015-07-21  0:53                 ` Chen, Tiejun
2015-07-21  6:18                   ` Jan Beulich
2015-07-21  9:42                   ` George Dunlap
2015-07-20  6:16 ` [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table Tiejun Chen
2015-07-20 11:56   ` Jan Beulich
2015-07-20 14:35     ` Chen, Tiejun
2015-07-20 15:03       ` Jan Beulich
2015-07-20 13:00   ` George Dunlap
2015-07-20 13:23     ` Chen, Tiejun
2015-07-20 13:50       ` George Dunlap
2015-07-20 13:57         ` Chen, Tiejun
2015-07-20  6:16 ` [v10][PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map Tiejun Chen
2015-07-20  6:16 ` [v10][PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy Tiejun Chen
2015-07-20  6:16 ` Tiejun Chen [this message]
2015-07-20  6:16 ` [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM Tiejun Chen
2015-07-20 13:32   ` Ian Jackson
2015-07-20 14:40     ` Chen, Tiejun
2015-07-20 14:53       ` Ian Jackson
2015-07-20 15:08         ` Chen, Tiejun
2015-07-20 15:24           ` Ian Jackson
2015-07-20 15:38             ` Ian Campbell
2015-07-21  6:44               ` Chen, Tiejun
2015-07-21  6:45             ` Chen, Tiejun
2015-07-21  6:38     ` Chen, Tiejun
2015-07-21 10:48       ` Ian Jackson
2015-07-21 11:12         ` Chen, Tiejun
2015-07-21 10:41     ` Ian Jackson
2015-07-21 11:04       ` Chen, Tiejun
2015-07-21 11:11         ` Ian Jackson
2015-07-21 11:23           ` Chen, Tiejun
2015-07-21 11:27             ` Ian Jackson
2015-07-21 11:45               ` Chen, Tiejun
2015-07-21 12:33                 ` Ian Jackson
2015-07-21 13:29                   ` Chen, Tiejun
2015-07-21 15:09                     ` Ian Jackson
2015-07-21 15:42                       ` Chen, Tiejun
2015-07-21 15:57                         ` Ian Jackson
2015-07-21 15:57                           ` Ian Jackson
2015-07-22  0:33                           ` Chen, Tiejun
2015-07-22  8:43                             ` Ian Campbell
2015-07-22  9:18                               ` Chen, Tiejun
2015-07-22 10:28                                 ` Ian Jackson
2015-07-22 10:51                                 ` Ian Campbell
2015-07-21 13:41                   ` Chen, Tiejun
2015-07-21 15:10                     ` Ian Jackson
2015-07-21 15:22                       ` Chen, Tiejun
2015-07-21 15:31                         ` Ian Jackson
2015-07-21 12:04           ` Chen, Tiejun
2015-07-21 12:34             ` Ian Jackson
2015-07-20  6:16 ` [v10][PATCH 12/16] tools: introduce a new parameter to set a predefined rdm boundary Tiejun Chen
2015-07-20  6:17 ` [v10][PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest Tiejun Chen
2015-07-20 13:34   ` Ian Jackson
2015-07-20  6:17 ` [v10][PATCH 14/16] xen/vtd: enable USB device assignment Tiejun Chen
2015-07-20  6:17 ` [v10][PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr Tiejun Chen
2015-07-20  6:17 ` [v10][PATCH 16/16] tools: parse to enable new rdm policy parameters Tiejun Chen
2015-07-20 13:43   ` Ian Jackson
2015-07-20 13:53     ` Chen, Tiejun
2015-07-20 10:37 ` [v10][PATCH 00/16] Fix RMRR George Dunlap
2015-07-20 12:39   ` Chen, Tiejun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1437373023-14884-11-git-send-email-tiejun.chen@intel.com \
    --to=tiejun.chen@intel.com \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).