All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
@ 2015-12-17 20:22 Boris Ostrovsky
  2015-12-17 20:22 ` [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure Boris Ostrovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Boris Ostrovsky @ 2015-12-17 20:22 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, keir; +Cc: Boris Ostrovsky, xen-devel

v2:
* Expanded npfec structure to include page's precense indicator (I took the
  easy way out and didn't implement VMX's fine-grained status since I couldn't
  figure out how to properly do it in SVM)
* s/mmio/mmcfg for some operations. I left mmio_ro_emulate_ctxt as is because
  it is used for non-MMCFG accesses
* Properly initialized hvm_emulate_ctxt (set data pointer after
  hvm_emulate_prepare())
* Added error handling to _hvm_emulate_one()
* Tested for page presence (and whether it is being written) before trying to
  emulate hvm access
* Added generic unhandleable_rwx() to x86_emulate.c
* Left non-MMCFG RO accesses unhandled (we havent't encountered those accesses
  yet with PVH dom0 and it's probably better to fault on them and investigate
  whether they are guest's issues).
* Code style updates

Boris Ostrovsky (2):
  x86/mm: Add information about faulted page's presence to npfec
    structure
  x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers

 xen/arch/x86/hvm/emulate.c             |   34 ++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c                 |   15 +++++++++
 xen/arch/x86/hvm/svm/svm.c             |    3 +-
 xen/arch/x86/hvm/vmx/vmx.c             |    5 ++-
 xen/arch/x86/mm.c                      |   53 +++++++++++---------------------
 xen/arch/x86/x86_emulate/x86_emulate.c |   10 ++++++
 xen/arch/x86/x86_emulate/x86_emulate.h |   13 ++++++++
 xen/include/asm-x86/hvm/emulate.h      |    4 ++
 xen/include/asm-x86/mm.h               |   17 ++++++++++
 xen/include/xen/mm.h                   |    1 +
 10 files changed, 118 insertions(+), 37 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure
  2015-12-17 20:22 [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Boris Ostrovsky
@ 2015-12-17 20:22 ` Boris Ostrovsky
  2015-12-18 10:15   ` Jan Beulich
  2015-12-17 20:22 ` [PATCH v2 2/2] x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers Boris Ostrovsky
  2015-12-18 10:11 ` [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Jan Beulich
  2 siblings, 1 reply; 12+ messages in thread
From: Boris Ostrovsky @ 2015-12-17 20:22 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, keir; +Cc: Boris Ostrovsky, xen-devel

This is provided explicitly in SVM and implicitly in VMX (when neither of
the three EPT_EFFECTIVE_* bits is set).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/svm.c |    3 ++-
 xen/arch/x86/hvm/vmx/vmx.c |    5 ++++-
 xen/include/xen/mm.h       |    1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 0078100..a66d854 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1491,7 +1491,8 @@ static void svm_do_nested_pgfault(struct vcpu *v,
     struct npfec npfec = {
         .read_access = !(pfec & PFEC_insn_fetch),
         .write_access = !!(pfec & PFEC_write_access),
-        .insn_fetch = !!(pfec & PFEC_insn_fetch)
+        .insn_fetch = !!(pfec & PFEC_insn_fetch),
+        .present = !!(pfec & PFEC_page_present),
     };
 
     /* These bits are mutually exclusive */
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2922673..7ea1882 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2746,7 +2746,10 @@ static void ept_handle_violation(unsigned long qualification, paddr_t gpa)
         .read_access = !!(qualification & EPT_READ_VIOLATION) ||
                        !!(qualification & EPT_WRITE_VIOLATION),
         .write_access = !!(qualification & EPT_WRITE_VIOLATION),
-        .insn_fetch = !!(qualification & EPT_EXEC_VIOLATION)
+        .insn_fetch = !!(qualification & EPT_EXEC_VIOLATION),
+        .present = !!(qualification & (EPT_EFFECTIVE_READ |
+                                       EPT_EFFECTIVE_WRITE |
+                                       EPT_EFFECTIVE_EXEC))
     };
 
     if ( tb_init_done )
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 5d4b64b..a795dd6 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -159,6 +159,7 @@ struct npfec {
     unsigned int read_access:1;
     unsigned int write_access:1;
     unsigned int insn_fetch:1;
+    unsigned int present:1;
     unsigned int gla_valid:1;
     unsigned int kind:2;  /* npfec_kind_t */
 };
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers
  2015-12-17 20:22 [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Boris Ostrovsky
  2015-12-17 20:22 ` [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure Boris Ostrovsky
@ 2015-12-17 20:22 ` Boris Ostrovsky
  2015-12-18 10:28   ` Jan Beulich
  2015-12-18 10:11 ` [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Jan Beulich
  2 siblings, 1 reply; 12+ messages in thread
From: Boris Ostrovsky @ 2015-12-17 20:22 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, keir; +Cc: Boris Ostrovsky, xen-devel

Commit 9256f66c1606 ("x86/PCI: intercept all PV Dom0 MMCFG writes")
added intercepts for writes to RO MMCFG space from PV dom0.

Similar functionality is needed by dom0s in HVM containers (such as
PVH and, in the future, HVMlite).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/emulate.c             |   34 ++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c                 |   15 +++++++++
 xen/arch/x86/mm.c                      |   53 +++++++++++---------------------
 xen/arch/x86/x86_emulate/x86_emulate.c |   10 ++++++
 xen/arch/x86/x86_emulate/x86_emulate.h |   13 ++++++++
 xen/include/asm-x86/hvm/emulate.h      |    4 ++
 xen/include/asm-x86/mm.h               |   17 ++++++++++
 7 files changed, 111 insertions(+), 35 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index e1017b5..6f6ec61 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1778,6 +1778,40 @@ int hvm_emulate_one_no_write(
     return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops_no_write);
 }
 
+int hvm_emulate_one_mmcfg(unsigned seg, unsigned bdf, unsigned long gla)
+{
+    static const struct x86_emulate_ops hvm_emulate_ops_mmcfg = {
+        .read       = unhandleable_rwx,
+        .insn_fetch = hvmemul_insn_fetch,
+        .write      = mmcfg_intercept_write,
+    };
+    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = {
+        .cr2 = gla,
+        .seg = seg,
+        .bdf = bdf
+    };
+    struct hvm_emulate_ctxt ctxt;
+    int rc;
+
+    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+    ctxt.ctxt.data = &mmio_ro_ctxt;
+    rc = _hvm_emulate_one(&ctxt, &hvm_emulate_ops_mmcfg);
+    switch ( rc )
+    {
+    case X86EMUL_UNHANDLEABLE:
+        hvm_dump_emulation_state(XENLOG_G_WARNING "MMCFG", &ctxt);
+        break;
+    case X86EMUL_EXCEPTION:
+        if ( ctxt.exn_pending )
+            hvm_inject_trap(&ctxt.trap);
+        /* fallthrough */
+    default:
+        hvm_emulate_writeback(&ctxt);
+    }
+
+    return rc;
+}
+
 void hvm_mem_access_emulate_one(enum emul_kind kind, unsigned int trapnr,
     unsigned int errcode)
 {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 08cef1f..7cc057b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3115,6 +3115,21 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out_put_gfn;
     }
 
+    if ( (p2mt == p2m_mmio_direct) && is_hardware_domain(currd) &&
+         npfec.write_access && npfec.present )
+    {
+        unsigned int seg, bdf;
+        const unsigned long *ro_map;
+
+        if ( pci_mmcfg_decode(mfn_x(mfn), &seg, &bdf) &&
+             ((ro_map = pci_get_ro_map(seg)) == NULL || !test_bit(bdf, ro_map)) &&
+             (hvm_emulate_one_mmcfg(seg, bdf, gla) == X86EMUL_OKAY) )
+        {
+            rc = 1;
+            goto out_put_gfn;
+        }
+    }
+
     /* If we fell through, the vcpu will retry now that access restrictions have
      * been removed. It may fault again if the p2m entry type still requires so.
      * Otherwise, this is an error condition. */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 92df36f..a7767f8 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5243,31 +5243,14 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
  * fault handling for read-only MMIO pages
  */
 
-struct mmio_ro_emulate_ctxt {
-    struct x86_emulate_ctxt ctxt;
-    unsigned long cr2;
-    unsigned int seg, bdf;
-};
-
-static int mmio_ro_emulated_read(
-    enum x86_segment seg,
-    unsigned long offset,
-    void *p_data,
-    unsigned int bytes,
-    struct x86_emulate_ctxt *ctxt)
-{
-    return X86EMUL_UNHANDLEABLE;
-}
-
-static int mmio_ro_emulated_write(
+int mmio_ro_emulated_write(
     enum x86_segment seg,
     unsigned long offset,
     void *p_data,
     unsigned int bytes,
     struct x86_emulate_ctxt *ctxt)
 {
-    struct mmio_ro_emulate_ctxt *mmio_ro_ctxt =
-        container_of(ctxt, struct mmio_ro_emulate_ctxt, ctxt);
+    struct mmio_ro_emulate_ctxt *mmio_ro_ctxt = ctxt->data;
 
     /* Only allow naturally-aligned stores at the original %cr2 address. */
     if ( ((bytes | offset) & (bytes - 1)) || offset != mmio_ro_ctxt->cr2 )
@@ -5281,20 +5264,19 @@ static int mmio_ro_emulated_write(
 }
 
 static const struct x86_emulate_ops mmio_ro_emulate_ops = {
-    .read       = mmio_ro_emulated_read,
+    .read       = unhandleable_rwx,
     .insn_fetch = ptwr_emulated_read,
     .write      = mmio_ro_emulated_write,
 };
 
-static int mmio_intercept_write(
+int mmcfg_intercept_write(
     enum x86_segment seg,
     unsigned long offset,
     void *p_data,
     unsigned int bytes,
     struct x86_emulate_ctxt *ctxt)
 {
-    struct mmio_ro_emulate_ctxt *mmio_ctxt =
-        container_of(ctxt, struct mmio_ro_emulate_ctxt, ctxt);
+    struct mmio_ro_emulate_ctxt *mmio_ctxt = ctxt->data;
 
     /*
      * Only allow naturally-aligned stores no wider than 4 bytes to the
@@ -5303,7 +5285,7 @@ static int mmio_intercept_write(
     if ( ((bytes | offset) & (bytes - 1)) || bytes > 4 ||
          offset != mmio_ctxt->cr2 )
     {
-        MEM_LOG("mmio_intercept: bad write (cr2=%lx, addr=%lx, bytes=%u)",
+        MEM_LOG("mmcfg_intercept: bad write (cr2=%lx, addr=%lx, bytes=%u)",
                 mmio_ctxt->cr2, offset, bytes);
         return X86EMUL_UNHANDLEABLE;
     }
@@ -5318,10 +5300,10 @@ static int mmio_intercept_write(
     return X86EMUL_OKAY;
 }
 
-static const struct x86_emulate_ops mmio_intercept_ops = {
-    .read       = mmio_ro_emulated_read,
+static const struct x86_emulate_ops mmcfg_intercept_ops = {
+    .read       = unhandleable_rwx,
     .insn_fetch = ptwr_emulated_read,
-    .write      = mmio_intercept_write,
+    .write      = mmcfg_intercept_write,
 };
 
 /* Check if guest is trying to modify a r/o MMIO page. */
@@ -5331,12 +5313,13 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     l1_pgentry_t pte;
     unsigned long mfn;
     unsigned int addr_size = is_pv_32bit_vcpu(v) ? 32 : BITS_PER_LONG;
-    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = {
-        .ctxt.regs = regs,
-        .ctxt.addr_size = addr_size,
-        .ctxt.sp_size = addr_size,
-        .ctxt.swint_emulate = x86_swint_emulate_none,
-        .cr2 = addr
+    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = addr };
+    struct x86_emulate_ctxt ctxt = {
+        .regs = regs,
+        .addr_size = addr_size,
+        .sp_size = addr_size,
+        .swint_emulate = x86_swint_emulate_none,
+        .data = &mmio_ro_ctxt
     };
     const unsigned long *ro_map;
     int rc;
@@ -5366,9 +5349,9 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     if ( pci_mmcfg_decode(mfn, &mmio_ro_ctxt.seg, &mmio_ro_ctxt.bdf) &&
          ((ro_map = pci_get_ro_map(mmio_ro_ctxt.seg)) == NULL ||
           !test_bit(mmio_ro_ctxt.bdf, ro_map)) )
-        rc = x86_emulate(&mmio_ro_ctxt.ctxt, &mmio_intercept_ops);
+        rc = x86_emulate(&ctxt, &mmcfg_intercept_ops);
     else
-        rc = x86_emulate(&mmio_ro_ctxt.ctxt, &mmio_ro_emulate_ops);
+        rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
 
     return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
 }
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index f1454ce..dde37ba 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1444,6 +1444,16 @@ static int inject_swint(enum x86_swint_type type,
     return ops->inject_hw_exception(fault_type, error_code, ctxt);
 }
 
+int unhandleable_rwx(
+    enum x86_segment seg,
+    unsigned long offset,
+    void *p_data,
+    unsigned int bytes,
+    struct x86_emulate_ctxt *ctxt)
+{
+    return X86EMUL_UNHANDLEABLE;
+}
+
 int
 x86_emulate(
     struct x86_emulate_ctxt *ctxt,
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index cfac09b..d1754b9 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -430,6 +430,9 @@ struct x86_emulate_ctxt
         } flags;
         uint8_t byte;
     } retire;
+
+    /* Caller data that can be used by x86_emulate_ops' routines. */
+    void *data;
 };
 
 struct x86_emulate_stub {
@@ -463,4 +466,14 @@ void *
 decode_register(
     uint8_t modrm_reg, struct cpu_user_regs *regs, int highbyte_regs);
 
+/* Unhadleable read, write or instruction fetch */
+int
+unhandleable_rwx(
+    enum x86_segment seg,
+    unsigned long offset,
+    void *p_data,
+    unsigned int bytes,
+    struct x86_emulate_ctxt *ctxt);
+
+
 #endif /* __X86_EMULATE_H__ */
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index 49134b5..2f39ff8 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -57,6 +57,10 @@ void hvm_emulate_writeback(
 struct segment_register *hvmemul_get_seg_reg(
     enum x86_segment seg,
     struct hvm_emulate_ctxt *hvmemul_ctxt);
+int hvm_emulate_one_mmcfg(
+    unsigned int seg,
+    unsigned int bdf,
+    unsigned long gla);
 
 int hvmemul_do_pio_buffer(uint16_t port,
                           unsigned int size,
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 67b34c6..773ba5d 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -7,6 +7,7 @@
 #include <xen/spinlock.h>
 #include <asm/io.h>
 #include <asm/uaccess.h>
+#include <asm/x86_emulate.h>
 
 /*
  * Per-page-frame information.
@@ -488,6 +489,22 @@ void memguard_unguard_range(void *p, unsigned long l);
 void memguard_guard_stack(void *p);
 void memguard_unguard_stack(void *p);
 
+struct mmio_ro_emulate_ctxt {
+        unsigned long cr2;
+        unsigned int seg, bdf;
+};
+
+extern int mmio_ro_emulate_write(enum x86_segment seg,
+                                 unsigned long offset,
+                                 void *p_data,
+                                 unsigned int bytes,
+                                 struct x86_emulate_ctxt *ctxt);
+extern int mmcfg_intercept_write(enum x86_segment seg,
+                                 unsigned long offset,
+                                 void *p_data,
+                                 unsigned int bytes,
+                                 struct x86_emulate_ctxt *ctxt);
+
 int  ptwr_do_page_fault(struct vcpu *, unsigned long,
                         struct cpu_user_regs *);
 int  mmio_ro_do_page_fault(struct vcpu *, unsigned long,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
  2015-12-17 20:22 [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Boris Ostrovsky
  2015-12-17 20:22 ` [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure Boris Ostrovsky
  2015-12-17 20:22 ` [PATCH v2 2/2] x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers Boris Ostrovsky
@ 2015-12-18 10:11 ` Jan Beulich
  2015-12-18 13:02   ` Andrew Cooper
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2015-12-18 10:11 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: andrew.cooper3, keir, xen-devel

>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
> * Left non-MMCFG RO accesses unhandled (we havent't encountered those accesses
>   yet with PVH dom0 and it's probably better to fault on them and investigate
>   whether they are guest's issues).

I seriously question this being a good approach, since I can't see why
what is appropriate here for PV shouldn't also be appropriate for PVH
or HVMlite. Did you indeed try with at least one of the device kinds
that get marked r/o (AMD IOMMU, PCI serial card used by hypervisor,
EHCI device with debug port used by hypervisor, "vga=keep" used)
at present?

Jan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure
  2015-12-17 20:22 ` [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure Boris Ostrovsky
@ 2015-12-18 10:15   ` Jan Beulich
  2015-12-20  6:49     ` Tian, Kevin
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2015-12-18 10:15 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: andrew.cooper3, Kevin Tian, keir, Jun Nakajima, xen-devel

>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
> This is provided explicitly in SVM and implicitly in VMX (when neither of
> the three EPT_EFFECTIVE_* bits is set).
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Retaining full context since you failed to Cc the VMX maintainers,
which I've no done.

Jan

> ---
>  xen/arch/x86/hvm/svm/svm.c |    3 ++-
>  xen/arch/x86/hvm/vmx/vmx.c |    5 ++++-
>  xen/include/xen/mm.h       |    1 +
>  3 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 0078100..a66d854 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1491,7 +1491,8 @@ static void svm_do_nested_pgfault(struct vcpu *v,
>      struct npfec npfec = {
>          .read_access = !(pfec & PFEC_insn_fetch),
>          .write_access = !!(pfec & PFEC_write_access),
> -        .insn_fetch = !!(pfec & PFEC_insn_fetch)
> +        .insn_fetch = !!(pfec & PFEC_insn_fetch),
> +        .present = !!(pfec & PFEC_page_present),
>      };
>  
>      /* These bits are mutually exclusive */
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 2922673..7ea1882 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2746,7 +2746,10 @@ static void ept_handle_violation(unsigned long 
> qualification, paddr_t gpa)
>          .read_access = !!(qualification & EPT_READ_VIOLATION) ||
>                         !!(qualification & EPT_WRITE_VIOLATION),
>          .write_access = !!(qualification & EPT_WRITE_VIOLATION),
> -        .insn_fetch = !!(qualification & EPT_EXEC_VIOLATION)
> +        .insn_fetch = !!(qualification & EPT_EXEC_VIOLATION),
> +        .present = !!(qualification & (EPT_EFFECTIVE_READ |
> +                                       EPT_EFFECTIVE_WRITE |
> +                                       EPT_EFFECTIVE_EXEC))
>      };
>  
>      if ( tb_init_done )
> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> index 5d4b64b..a795dd6 100644
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -159,6 +159,7 @@ struct npfec {
>      unsigned int read_access:1;
>      unsigned int write_access:1;
>      unsigned int insn_fetch:1;
> +    unsigned int present:1;
>      unsigned int gla_valid:1;
>      unsigned int kind:2;  /* npfec_kind_t */
>  };
> -- 
> 1.7.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers
  2015-12-17 20:22 ` [PATCH v2 2/2] x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers Boris Ostrovsky
@ 2015-12-18 10:28   ` Jan Beulich
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2015-12-18 10:28 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: andrew.cooper3, keir, xen-devel

>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
> Commit 9256f66c1606 ("x86/PCI: intercept all PV Dom0 MMCFG writes")
> added intercepts for writes to RO MMCFG space from PV dom0.
> 
> Similar functionality is needed by dom0s in HVM containers (such as
> PVH and, in the future, HVMlite).
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Looks okay leaving aside the question of r/o device handling. A few
minor remarks:

> @@ -463,4 +466,14 @@ void *
>  decode_register(
>      uint8_t modrm_reg, struct cpu_user_regs *regs, int highbyte_regs);
>  
> +/* Unhadleable read, write or instruction fetch */

Unhandleable

> +int
> +unhandleable_rwx(

x86emul_unhandleable_rwx() or some such, making it clear what
component it belongs to. I also doubt the x here really makes much
sense - the insn_fetch field can't usefully point to that function.

> +    enum x86_segment seg,
> +    unsigned long offset,
> +    void *p_data,
> +    unsigned int bytes,
> +    struct x86_emulate_ctxt *ctxt);
> +
> +
>  #endif /* __X86_EMULATE_H__ */

One blank line only please.

Jan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
  2015-12-18 10:11 ` [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Jan Beulich
@ 2015-12-18 13:02   ` Andrew Cooper
  2015-12-18 13:53     ` Boris Ostrovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Cooper @ 2015-12-18 13:02 UTC (permalink / raw)
  To: Jan Beulich, Boris Ostrovsky; +Cc: keir, xen-devel

On 18/12/15 10:11, Jan Beulich wrote:
>>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
>> * Left non-MMCFG RO accesses unhandled (we havent't encountered those accesses
>>   yet with PVH dom0 and it's probably better to fault on them and investigate
>>   whether they are guest's issues).
> I seriously question this being a good approach,

I concur.  All accesses should be terminated somehow, even if this is
the hypervisor dropping writes and completing reads with ~0's.

This is the expected behaviour from the x86 architecture.

~Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
  2015-12-18 13:02   ` Andrew Cooper
@ 2015-12-18 13:53     ` Boris Ostrovsky
  2015-12-18 14:16       ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Ostrovsky @ 2015-12-18 13:53 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich; +Cc: keir, xen-devel

On 12/18/2015 08:02 AM, Andrew Cooper wrote:
> On 18/12/15 10:11, Jan Beulich wrote:
>>>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
>>> * Left non-MMCFG RO accesses unhandled (we havent't encountered those accesses
>>>    yet with PVH dom0 and it's probably better to fault on them and investigate
>>>    whether they are guest's issues).
>> I seriously question this being a good approach,
> I concur.  All accesses should be terminated somehow, even if this is
> the hypervisor dropping writes and completing reads with ~0's.
>
> This is the expected behaviour from the x86 architecture.

OK, I'll add that in.

To answer Jan's question, I did try 'vga=keep' and that worked fine. (I 
couldn't test AMD IOMMU because we don't support PVH there and I don't 
have the other HW that he mentioned).

-boris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
  2015-12-18 13:53     ` Boris Ostrovsky
@ 2015-12-18 14:16       ` Jan Beulich
  2015-12-18 14:30         ` Boris Ostrovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2015-12-18 14:16 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: Andrew Cooper, keir, xen-devel

>>> On 18.12.15 at 14:53, <boris.ostrovsky@oracle.com> wrote:
> On 12/18/2015 08:02 AM, Andrew Cooper wrote:
>> On 18/12/15 10:11, Jan Beulich wrote:
>>>>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
>>>> * Left non-MMCFG RO accesses unhandled (we havent't encountered those 
> accesses
>>>>    yet with PVH dom0 and it's probably better to fault on them and 
> investigate
>>>>    whether they are guest's issues).
>>> I seriously question this being a good approach,
>> I concur.  All accesses should be terminated somehow, even if this is
>> the hypervisor dropping writes and completing reads with ~0's.
>>
>> This is the expected behaviour from the x86 architecture.
> 
> OK, I'll add that in.
> 
> To answer Jan's question, I did try 'vga=keep' and that worked fine.

Interesting. Does your system not load/use a DRM driver for the
card? Or is the driver so minimalistic that no extended config
space write ever occurs?

Jan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
  2015-12-18 14:16       ` Jan Beulich
@ 2015-12-18 14:30         ` Boris Ostrovsky
  2015-12-18 14:36           ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Ostrovsky @ 2015-12-18 14:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, keir, xen-devel

On 12/18/2015 09:16 AM, Jan Beulich wrote:
>>>> On 18.12.15 at 14:53, <boris.ostrovsky@oracle.com> wrote:
>> On 12/18/2015 08:02 AM, Andrew Cooper wrote:
>>> On 18/12/15 10:11, Jan Beulich wrote:
>>>>>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
>>>>> * Left non-MMCFG RO accesses unhandled (we havent't encountered those
>> accesses
>>>>>     yet with PVH dom0 and it's probably better to fault on them and
>> investigate
>>>>>     whether they are guest's issues).
>>>> I seriously question this being a good approach,
>>> I concur.  All accesses should be terminated somehow, even if this is
>>> the hypervisor dropping writes and completing reads with ~0's.
>>>
>>> This is the expected behaviour from the x86 architecture.
>> OK, I'll add that in.
>>
>> To answer Jan's question, I did try 'vga=keep' and that worked fine.
> Interesting. Does your system not load/use a DRM driver for the
> card? Or is the driver so minimalistic that no extended config
> space write ever occurs?

It does load the driver:

0b:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200e 
[Pilot] ServerEngines (SEP1) (rev 04)
     Subsystem: Intel Corporation Device 0101
     Kernel driver in use: mgag200
..
root@ovs104> lsmod | grep mgag
mgag200                45056  1
ttm                    90112  1 mgag200
drm_kms_helper        122880  1 mgag200
drm                   335872  4 mgag200,ttm,drm_kms_helper
i2c_algo_bit           16384  2 igb,mgag200

but I don't know what it does to the config space. I configured my 
kernel for a fairly minimal configuration so perhaps I disabled 
something that limits what the driver does.

-boris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s
  2015-12-18 14:30         ` Boris Ostrovsky
@ 2015-12-18 14:36           ` Jan Beulich
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2015-12-18 14:36 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: Andrew Cooper, keir, xen-devel

>>> On 18.12.15 at 15:30, <boris.ostrovsky@oracle.com> wrote:
> On 12/18/2015 09:16 AM, Jan Beulich wrote:
>>>>> On 18.12.15 at 14:53, <boris.ostrovsky@oracle.com> wrote:
>>> On 12/18/2015 08:02 AM, Andrew Cooper wrote:
>>>> On 18/12/15 10:11, Jan Beulich wrote:
>>>>>>>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
>>>>>> * Left non-MMCFG RO accesses unhandled (we havent't encountered those
>>> accesses
>>>>>>     yet with PVH dom0 and it's probably better to fault on them and
>>> investigate
>>>>>>     whether they are guest's issues).
>>>>> I seriously question this being a good approach,
>>>> I concur.  All accesses should be terminated somehow, even if this is
>>>> the hypervisor dropping writes and completing reads with ~0's.
>>>>
>>>> This is the expected behaviour from the x86 architecture.
>>> OK, I'll add that in.
>>>
>>> To answer Jan's question, I did try 'vga=keep' and that worked fine.
>> Interesting. Does your system not load/use a DRM driver for the
>> card? Or is the driver so minimalistic that no extended config
>> space write ever occurs?
> 
> It does load the driver:
> 
> 0b:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200e 
> [Pilot] ServerEngines (SEP1) (rev 04)
>      Subsystem: Intel Corporation Device 0101
>      Kernel driver in use: mgag200
> ..
> root@ovs104> lsmod | grep mgag
> mgag200                45056  1
> ttm                    90112  1 mgag200
> drm_kms_helper        122880  1 mgag200
> drm                   335872  4 mgag200,ttm,drm_kms_helper
> i2c_algo_bit           16384  2 igb,mgag200
> 
> but I don't know what it does to the config space. I configured my 
> kernel for a fairly minimal configuration so perhaps I disabled 
> something that limits what the driver does.

For the MGA driver I'm not really surprised; I expected you would
be using one of the more common ones (radeon, nouveau, or i915).

Jan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure
  2015-12-18 10:15   ` Jan Beulich
@ 2015-12-20  6:49     ` Tian, Kevin
  0 siblings, 0 replies; 12+ messages in thread
From: Tian, Kevin @ 2015-12-20  6:49 UTC (permalink / raw)
  To: Jan Beulich, Boris Ostrovsky
  Cc: andrew.cooper3, keir, Nakajima, Jun, xen-devel

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, December 18, 2015 6:15 PM
> 
> >>> On 17.12.15 at 21:22, <boris.ostrovsky@oracle.com> wrote:
> > This is provided explicitly in SVM and implicitly in VMX (when neither of
> > the three EPT_EFFECTIVE_* bits is set).
> >
> > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Retaining full context since you failed to Cc the VMX maintainers,
> which I've no done.
> 
> Jan
> 

Acked-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-12-20  6:49 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-17 20:22 [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Boris Ostrovsky
2015-12-17 20:22 ` [PATCH v2 1/2] x86/mm: Add information about faulted page's presence to npfec structure Boris Ostrovsky
2015-12-18 10:15   ` Jan Beulich
2015-12-20  6:49     ` Tian, Kevin
2015-12-17 20:22 ` [PATCH v2 2/2] x86/PCI: Intercept Dom0 MMCFG from dom0s in HVM containers Boris Ostrovsky
2015-12-18 10:28   ` Jan Beulich
2015-12-18 10:11 ` [PATCH v2 0/2] Support of RO MMCFG access for PVH/HVMlite dom0s Jan Beulich
2015-12-18 13:02   ` Andrew Cooper
2015-12-18 13:53     ` Boris Ostrovsky
2015-12-18 14:16       ` Jan Beulich
2015-12-18 14:30         ` Boris Ostrovsky
2015-12-18 14:36           ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.