* Re: [PATCH v3 1/2] x86: add p2m_mmio_write_dm
  2014-09-03 21:53 ` [PATCH v3 1/2] x86: add p2m_mmio_write_dm Wei Ye
@ 2014-09-03  9:54   ` Paul Durrant
  2014-09-03 10:03     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2014-09-03  9:54 UTC (permalink / raw)
  To: Wei Ye, xen-devel
  Cc: Kevin Tian, Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	donald.d.dugger, Stefano Stabellini, zhiyuan.lv, JBeulich,
	yang.z.zhang, Ian Jackson

> -----Original Message-----
> From: Wei Ye [mailto:wei.ye@intel.com]
> Sent: 03 September 2014 22:53
> To: xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson; Stefano
> Stabellini; donald.d.dugger@intel.com; Kevin Tian; yang.z.zhang@intel.com;
> zhiyuan.lv@intel.com; konrad.wilk@oracle.com; Keir (Xen.org); Tim
> (Xen.org); Wei Ye
> Subject: [PATCH v3 1/2] x86: add p2m_mmio_write_dm
> 
> Add a new p2m type, p2m_mmio_write_dm. Pages of this type are read
> only, and writes go to the device model for emulation, just like
> MMIO.
> 

I realise this is a bit beyond the scope of what you're doing, but it occurs to me that a p2m_mmio_read_dm type might be equally useful when emulating MMIO that has read side effects but no write side effects. It may save a few VMEXITs here and there.

> Signed-off-by: Wei Ye <wei.ye@intel.com>
> ---
>  xen/arch/x86/hvm/hvm.c    |    3 ++-
>  xen/arch/x86/mm/p2m-ept.c |    1 +
>  xen/arch/x86/mm/p2m-pt.c  |    1 +
>  xen/include/asm-x86/p2m.h |    1 +
>  4 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 17ff011..adc0f93 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2739,7 +2739,8 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>       * to the mmio handler.
>       */
>      if ( (p2mt == p2m_mmio_dm) ||
> -         (access_w && (p2mt == p2m_ram_ro)) )
> +         (access_w && ((p2mt == p2m_ram_ro) ||
> +         (p2mt == p2m_mmio_write_dm))) )

Personally I would have put the two p2m_mmio checks next to each other, but it doesn't really matter.

  Paul

>      {
>          put_gfn(p2m->domain, gfn);
> 
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 15c6e83..e21a92d 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -136,6 +136,7 @@ static void ept_p2m_type_to_flags(ept_entry_t
> *entry, p2m_type_t type, p2m_acces
>              entry->x = 0;
>              break;
>          case p2m_grant_map_ro:
> +        case p2m_mmio_write_dm:
>              entry->r = 1;
>              entry->w = entry->x = 0;
>              break;
> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index 085ab6f..6a18606 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -94,6 +94,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t,
> mfn_t mfn)
>      default:
>          return flags | _PAGE_NX_BIT;
>      case p2m_grant_map_ro:
> +    case p2m_mmio_write_dm:
>          return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
>      case p2m_ram_ro:
>      case p2m_ram_logdirty:
> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
> index 0ddbadb..523c7a9 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -71,6 +71,7 @@ typedef enum {
>      p2m_ram_shared = 12,          /* Shared or sharable memory */
>      p2m_ram_broken = 13,          /* Broken page, access cause domain crash */
>      p2m_map_foreign  = 14,        /* ram pages from foreign domain */
> +    p2m_mmio_write_dm = 15,       /* Read-only; writes go to the device
> model */
>  } p2m_type_t;
> 
>  /*
> --
> 1.7.9.5

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/2] x86: add p2m_mmio_write_dm
  2014-09-03  9:54   ` Paul Durrant
@ 2014-09-03 10:03     ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2014-09-03 10:03 UTC (permalink / raw)
  To: Paul Durrant, Wei Ye, xen-devel
  Cc: Kevin Tian, Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	donald.d.dugger, Stefano Stabellini, zhiyuan.lv, yang.z.zhang,
	Ian Jackson

>>> On 03.09.14 at 11:54, <Paul.Durrant@citrix.com> wrote:
>> From: Wei Ye [mailto:wei.ye@intel.com]
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2739,7 +2739,8 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>       * to the mmio handler.
>>       */
>>      if ( (p2mt == p2m_mmio_dm) ||
>> -         (access_w && (p2mt == p2m_ram_ro)) )
>> +         (access_w && ((p2mt == p2m_ram_ro) ||
>> +         (p2mt == p2m_mmio_write_dm))) )
> 
> Personally I would have put the two p2m_mmio checks next to each other, but 
> it doesn't really matter.

At the very least the indentation would need adjustment if the
ordering is to be kept as it is now.

Jan


* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-03 21:53 ` [PATCH v3 2/2] ioreq-server: write protected range and forwarding Wei Ye
@ 2014-09-03 10:11   ` Paul Durrant
  2014-09-04 23:10     ` Tian, Kevin
  2014-09-03 13:17   ` Jan Beulich
  1 sibling, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2014-09-03 10:11 UTC (permalink / raw)
  To: Wei Ye, xen-devel
  Cc: Kevin Tian, Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	donald.d.dugger, Stefano Stabellini, zhiyuan.lv, JBeulich,
	yang.z.zhang, Ian Jackson

> -----Original Message-----
> From: Wei Ye [mailto:wei.ye@intel.com]
> Sent: 03 September 2014 22:53
> To: xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson; Stefano
> Stabellini; donald.d.dugger@intel.com; Kevin Tian; yang.z.zhang@intel.com;
> zhiyuan.lv@intel.com; konrad.wilk@oracle.com; Keir (Xen.org); Tim
> (Xen.org); Wei Ye
> Subject: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
> 
> Extend the ioreq-server to write protect pages and forward the
> write access to the corresponding device model.
> 

Using rangesets did make the patch somewhat smaller then :-)

> Signed-off-by: Wei Ye <wei.ye@intel.com>
> ---
>  tools/libxc/xc_domain.c          |   33 +++++++++++++++++++++++++++
>  tools/libxc/xenctrl.h            |   18 +++++++++++++++
>  xen/arch/x86/hvm/hvm.c           |   46
> ++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/domain.h |    2 +-
>  xen/include/public/hvm/hvm_op.h  |    1 +
>  5 files changed, 99 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index 37ed141..34e6309 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -1485,6 +1485,39 @@ int
> xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch, domid_t
> domid,
>      return rc;
>  }
> 
> +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch, domid_t
> domid,
> +                                        ioservid_t id, uint16_t set,
> +                                        uint64_t start, uint64_t end)
> +{

Given what I said about the notion of MMIO ranges that need read emulation but no write emulation, and the ranges you're concerned about which require write emulation but no read emulation, I wonder whether this could collapse into the existing xc_hvm_map_io_range_to_ioreq_server() by adding an extra parameter to say whether, for a given range, you want read emulation, write emulation or both.

> +    DECLARE_HYPERCALL;
> +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> +    int rc = -1;
> +
> +    if ( arg == NULL )
> +    {
> +        errno = -EFAULT;
> +        return -1;
> +    }
> +
> +    hypercall.op = __HYPERVISOR_hvm_op;
> +    if ( set )
> +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> +    else
> +        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
> +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> +
> +    arg->domid = domid;
> +    arg->id = id;
> +    arg->type = HVMOP_IO_RANGE_WP;
> +    arg->start = start;
> +    arg->end = end;
> +
> +    rc = do_xen_hypercall(xch, &hypercall);
> +
> +    xc_hypercall_buffer_free(xch, arg);
> +    return rc;
> +}
> +
>  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
>                                  domid_t domid,
>                                  ioservid_t id)
> diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> index b55d857..b261602 100644
> --- a/tools/libxc/xenctrl.h
> +++ b/tools/libxc/xenctrl.h
> @@ -1943,6 +1943,24 @@ int
> xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
>                                            uint8_t function);
> 
>  /**
> + * This function registers/deregisters a set of pages to be write protected.
> + *
> + * @parm xch a handle to an open hypervisor interface.
> + * @parm domid the domain id to be serviced
> + * @parm id the IOREQ Server id.
> + * @parm set whether to register or deregister: 1 to register, 0 to
> deregister
> + * @parm start start of range
> + * @parm end end of range (inclusive).
> + * @return 0 on success, -1 on failure.
> + */
> +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> +                                        domid_t domid,
> +                                        ioservid_t id,
> +                                        uint16_t set,
> +                                        uint64_t start,
> +                                        uint64_t end);
> +
> +/**
>   * This function destroys an IOREQ Server.
>   *
>   * @parm xch a handle to an open hypervisor interface.
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index adc0f93..f8c4db8 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -853,6 +853,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct
> hvm_ioreq_server *s,
>                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
>                        (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
>                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> +                      (i == HVMOP_IO_RANGE_WP) ? "write protection" :

Continuing along my train of thought from above, perhaps the rangesets should now be

HVMOP_IO_RANGE_PORT_READ
HVMOP_IO_RANGE_PORT_WRITE
HVMOP_IO_RANGE_MEMORY_READ
HVMOP_IO_RANGE_MEMORY_WRITE
HVMOP_IO_RANGE_PCI

>                        "");
>          if ( rc )
>              goto fail;
> @@ -1138,6 +1139,37 @@ static int hvm_get_ioreq_server_info(struct
> domain *d, ioservid_t id,
>      return rc;
>  }
> 
> +static int hvm_change_p2m_type_ioreq_server(struct domain *d, uint16_t
> set,
> +                                            uint64_t start, uint64_t end)
> +{
> +    int rc = -EINVAL;
> +    uint64_t gpfn_s, gpfn_e, gpfn;
> +    p2m_type_t ot, nt;
> +
> +    if ( set )
> +    {
> +        ot = p2m_ram_rw;
> +        nt = p2m_mmio_write_dm;
> +    }
> +    else
> +    {
> +        ot = p2m_mmio_write_dm;
> +        nt = p2m_ram_rw;
> +    }
> +
> +    gpfn_s = start >> PAGE_SHIFT;
> +    gpfn_e = end >> PAGE_SHIFT;
> +
> +    for ( gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
> +    {
> +        rc = p2m_change_type_one(d, gpfn, ot, nt);
> +        if ( rc )
> +            break;
> +    }
> +
> +    return rc;
> +}
> +
>  static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t
> id,
>                                              uint32_t type, uint64_t start, uint64_t end)
>  {
> @@ -1163,6 +1195,7 @@ static int
> hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>              case HVMOP_IO_RANGE_PORT:
>              case HVMOP_IO_RANGE_MEMORY:
>              case HVMOP_IO_RANGE_PCI:
> +            case HVMOP_IO_RANGE_WP:
>                  r = s->range[type];
>                  break;
> 
> @@ -1180,6 +1213,10 @@ static int
> hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>                  break;
> 
>              rc = rangeset_add_range(r, start, end);
> +
> +            if ( type == HVMOP_IO_RANGE_WP )
> +                rc = hvm_change_p2m_type_ioreq_server(d, 1, start, end);
> +

Here, for memory ranges, you'd then change the p2m type to p2m_mmio_write_dm, p2m_mmio_read_dm or p2m_mmio_dm depending on whether a range was being added to the MEMORY_WRITE set, the MEMORY_READ set or both. Obviously you'd need to be careful about ranges with distinct semantics that fall into the same page, but that could be dealt with. E.g. if you had two ranges in the same page, one needing full emulation and one needing write-only emulation, then you could set the p2m type to p2m_mmio_dm and handle the read emulation for the second range directly in hvm_send_assist_req_to_ioreq_server().

  Paul

>              break;
>          }
>      }
> @@ -1214,6 +1251,7 @@ static int
> hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>              case HVMOP_IO_RANGE_PORT:
>              case HVMOP_IO_RANGE_MEMORY:
>              case HVMOP_IO_RANGE_PCI:
> +            case HVMOP_IO_RANGE_WP:
>                  r = s->range[type];
>                  break;
> 
> @@ -1231,6 +1269,10 @@ static int
> hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>                  break;
> 
>              rc = rangeset_remove_range(r, start, end);
> +
> +            if ( type == HVMOP_IO_RANGE_WP )
> +                rc = hvm_change_p2m_type_ioreq_server(d, 0, start, end);
> +
>              break;
>          }
>      }
> @@ -2361,6 +2403,10 @@ static struct hvm_ioreq_server
> *hvm_select_ioreq_server(struct domain *d,
>              if ( rangeset_contains_range(r, addr, end) )
>                  return s;
> 
> +            r = s->range[HVMOP_IO_RANGE_WP];
> +            if ( rangeset_contains_range(r, addr, end) )
> +                return s;
> +
>              break;
>          case IOREQ_TYPE_PCI_CONFIG:
>              if ( rangeset_contains_singleton(r, addr >> 32) )
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-
> x86/hvm/domain.h
> index 291a2e0..b194f9a 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
>      evtchn_port_t    ioreq_evtchn;
>  };
> 
> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
>  #define MAX_NR_IO_RANGES  256
> 
>  struct hvm_ioreq_server {
> diff --git a/xen/include/public/hvm/hvm_op.h
> b/xen/include/public/hvm/hvm_op.h
> index eeb0a60..aedb97f 100644
> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -323,6 +323,7 @@ struct xen_hvm_io_range {
>  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
>  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
>  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
> +# define HVMOP_IO_RANGE_WP     3 /* Write protection range */
>      uint64_aligned_t start, end; /* IN - inclusive start and end of range */
>  };
>  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
> --
> 1.7.9.5


* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-03 21:53 ` [PATCH v3 2/2] ioreq-server: write protected range and forwarding Wei Ye
  2014-09-03 10:11   ` Paul Durrant
@ 2014-09-03 13:17   ` Jan Beulich
  2014-09-04  0:31     ` Ye, Wei
  2014-09-10  5:32     ` Ye, Wei
  1 sibling, 2 replies; 19+ messages in thread
From: Jan Beulich @ 2014-09-03 13:17 UTC (permalink / raw)
  To: Wei Ye
  Cc: kevin.tian, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, donald.d.dugger, xen-devel, Paul.Durrant,
	zhiyuan.lv, yang.z.zhang

>>> On 03.09.14 at 23:53, <wei.ye@intel.com> wrote:
> +static int hvm_change_p2m_type_ioreq_server(struct domain *d, uint16_t set,
> +                                            uint64_t start, uint64_t end)
> +{
> +    int rc = -EINVAL;
> +    uint64_t gpfn_s, gpfn_e, gpfn;
> +    p2m_type_t ot, nt;
> +
> +    if ( set )
> +    {
> +        ot = p2m_ram_rw;
> +        nt = p2m_mmio_write_dm;
> +    }
> +    else
> +    {
> +        ot = p2m_mmio_write_dm;
> +        nt = p2m_ram_rw;
> +    }
> +
> +    gpfn_s = start >> PAGE_SHIFT;
> +    gpfn_e = end >> PAGE_SHIFT;

Considering that the first really ought to be PFN_DOWN() - is the latter
really correct? I'd rather expect that to be PFN_UP()...

> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
>      evtchn_port_t    ioreq_evtchn;
>  };
>  
> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
>  #define MAX_NR_IO_RANGES  256

So in the end you didn't even find it necessary to bump the limit
on the number of ranges per domain? Iirc that was your major
objection against using existing infrastructure.

Jan


* [PATCH v3 0/2] Extend ioreq-server to support page write protection
@ 2014-09-03 21:53 Wei Ye
  2014-09-03 21:53 ` [PATCH v3 1/2] x86: add p2m_mmio_write_dm Wei Ye
  2014-09-03 21:53 ` [PATCH v3 2/2] ioreq-server: write protected range and forwarding Wei Ye
  0 siblings, 2 replies; 19+ messages in thread
From: Wei Ye @ 2014-09-03 21:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, donald.d.dugger, Paul.Durrant, zhiyuan.lv, JBeulich,
	yang.z.zhang, Wei Ye

Changes from v1:
 - Change the new p2m type name from p2m_ram_wp to p2m_mmio_write_dm. This
   means that we treat the pages as a special MMIO range instead of RAM.
 - Move macros to the C file, since only that file uses them.
 - Address various comments from Jan.

Changes from v2:
 - Remove the execute attribute from the new p2m type p2m_mmio_write_dm.
 - Use the existing rangeset for keeping the write-protected page ranges
   instead of introducing a hash table.
 - Some code style fixes.

ioreq-server is proposed to forward PIO and MMIO requests to multiple device
models according to the I/O range. The XenGT driver (Intel Graphics
Virtualization Technology; please refer to
https://01.org/xen/blogs/srclarkx/2013/graphics-virtualization-xengt)
resides in Dom0 as a virtual graphics device model, and also needs to trap
and emulate the guest's write operations to some specific memory pages, such
as the pages used by the guest graphics driver as the PPGTT (per-process
graphics translation table). We add a new p2m type, "p2m_mmio_write_dm", to
trap page write operations. Pages of this new p2m type are read-only;
writes will go to the device model via the ioreq-server.

Wei Ye (2):
  x86: add p2m_mmio_write_dm
  ioreq-server: write protected range and forwarding

 tools/libxc/xc_domain.c          |   30 ++++++++++++++++++++++
 tools/libxc/xenctrl.h            |   18 ++++++++++++++
 xen/arch/x86/hvm/hvm.c           |   51 ++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/mm/p2m-ept.c        |    1 +
 xen/arch/x86/mm/p2m-pt.c         |    1 +
 xen/include/asm-x86/hvm/domain.h |    2 +-
 xen/include/asm-x86/p2m.h        |    1 +
 xen/include/public/hvm/hvm_op.h  |    1 +
 8 files changed, 102 insertions(+), 3 deletions(-)

-- 
1.7.9.5


* [PATCH v3 1/2] x86: add p2m_mmio_write_dm
  2014-09-03 21:53 [PATCH v3 0/2] Extend ioreq-server to support page write protection Wei Ye
@ 2014-09-03 21:53 ` Wei Ye
  2014-09-03  9:54   ` Paul Durrant
  2014-09-03 21:53 ` [PATCH v3 2/2] ioreq-server: write protected range and forwarding Wei Ye
  1 sibling, 1 reply; 19+ messages in thread
From: Wei Ye @ 2014-09-03 21:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, donald.d.dugger, Paul.Durrant, zhiyuan.lv, JBeulich,
	yang.z.zhang, Wei Ye

Add a new p2m type, p2m_mmio_write_dm. Pages of this type are read
only, and writes go to the device model for emulation, just like
MMIO.

Signed-off-by: Wei Ye <wei.ye@intel.com>
---
 xen/arch/x86/hvm/hvm.c    |    3 ++-
 xen/arch/x86/mm/p2m-ept.c |    1 +
 xen/arch/x86/mm/p2m-pt.c  |    1 +
 xen/include/asm-x86/p2m.h |    1 +
 4 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 17ff011..adc0f93 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2739,7 +2739,8 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
      * to the mmio handler.
      */
     if ( (p2mt == p2m_mmio_dm) || 
-         (access_w && (p2mt == p2m_ram_ro)) )
+         (access_w && ((p2mt == p2m_ram_ro) ||
+         (p2mt == p2m_mmio_write_dm))) )
     {
         put_gfn(p2m->domain, gfn);
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 15c6e83..e21a92d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -136,6 +136,7 @@ static void ept_p2m_type_to_flags(ept_entry_t *entry, p2m_type_t type, p2m_acces
             entry->x = 0;
             break;
         case p2m_grant_map_ro:
+        case p2m_mmio_write_dm:
             entry->r = 1;
             entry->w = entry->x = 0;
             break;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 085ab6f..6a18606 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -94,6 +94,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn)
     default:
         return flags | _PAGE_NX_BIT;
     case p2m_grant_map_ro:
+    case p2m_mmio_write_dm:
         return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
     case p2m_ram_ro:
     case p2m_ram_logdirty:
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0ddbadb..523c7a9 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -71,6 +71,7 @@ typedef enum {
     p2m_ram_shared = 12,          /* Shared or sharable memory */
     p2m_ram_broken = 13,          /* Broken page, access cause domain crash */
     p2m_map_foreign  = 14,        /* ram pages from foreign domain */
+    p2m_mmio_write_dm = 15,       /* Read-only; writes go to the device model */
 } p2m_type_t;
 
 /*
-- 
1.7.9.5


* [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-03 21:53 [PATCH v3 0/2] Extend ioreq-server to support page write protection Wei Ye
  2014-09-03 21:53 ` [PATCH v3 1/2] x86: add p2m_mmio_write_dm Wei Ye
@ 2014-09-03 21:53 ` Wei Ye
  2014-09-03 10:11   ` Paul Durrant
  2014-09-03 13:17   ` Jan Beulich
  1 sibling, 2 replies; 19+ messages in thread
From: Wei Ye @ 2014-09-03 21:53 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, donald.d.dugger, Paul.Durrant, zhiyuan.lv, JBeulich,
	yang.z.zhang, Wei Ye

Extend the ioreq-server to write protect pages and forward the
write access to the corresponding device model.

Signed-off-by: Wei Ye <wei.ye@intel.com>
---
 tools/libxc/xc_domain.c          |   33 +++++++++++++++++++++++++++
 tools/libxc/xenctrl.h            |   18 +++++++++++++++
 xen/arch/x86/hvm/hvm.c           |   46 ++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/domain.h |    2 +-
 xen/include/public/hvm/hvm_op.h  |    1 +
 5 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 37ed141..34e6309 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1485,6 +1485,39 @@ int xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch, domid_t domid,
     return rc;
 }
 
+int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch, domid_t domid,
+                                        ioservid_t id, uint16_t set,
+                                        uint64_t start, uint64_t end)
+{
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
+    int rc = -1;
+
+    if ( arg == NULL )
+    {
+        errno = -EFAULT;
+        return -1;
+    }
+
+    hypercall.op = __HYPERVISOR_hvm_op;
+    if ( set )
+        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
+    else 
+        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->domid = domid;
+    arg->id = id;
+    arg->type = HVMOP_IO_RANGE_WP;
+    arg->start = start;
+    arg->end = end;
+
+    rc = do_xen_hypercall(xch, &hypercall);
+
+    xc_hypercall_buffer_free(xch, arg);
+    return rc;
+}
+
 int xc_hvm_destroy_ioreq_server(xc_interface *xch,
                                 domid_t domid,
                                 ioservid_t id)
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index b55d857..b261602 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1943,6 +1943,24 @@ int xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
                                           uint8_t function);
 
 /**
+ * This function registers/deregisters a set of pages to be write protected.
+ *
+ * @parm xch a handle to an open hypervisor interface.
+ * @parm domid the domain id to be serviced
+ * @parm id the IOREQ Server id.
+ * @parm set whether to register or deregister: 1 to register, 0 to deregister
+ * @parm start start of range
+ * @parm end end of range (inclusive).
+ * @return 0 on success, -1 on failure.
+ */
+int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
+                                        domid_t domid,
+                                        ioservid_t id,
+                                        uint16_t set,
+                                        uint64_t start,
+                                        uint64_t end);
+
+/**
  * This function destroys an IOREQ Server.
  *
  * @parm xch a handle to an open hypervisor interface.
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index adc0f93..f8c4db8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -853,6 +853,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
                       (i == HVMOP_IO_RANGE_PORT) ? "port" :
                       (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
                       (i == HVMOP_IO_RANGE_PCI) ? "pci" :
+                      (i == HVMOP_IO_RANGE_WP) ? "write protection" :
                       "");
         if ( rc )
             goto fail;
@@ -1138,6 +1139,37 @@ static int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
     return rc;
 }
 
+static int hvm_change_p2m_type_ioreq_server(struct domain *d, uint16_t set,
+                                            uint64_t start, uint64_t end)
+{
+    int rc = -EINVAL;
+    uint64_t gpfn_s, gpfn_e, gpfn;
+    p2m_type_t ot, nt;
+
+    if ( set )
+    {
+        ot = p2m_ram_rw;
+        nt = p2m_mmio_write_dm;
+    }
+    else
+    {
+        ot = p2m_mmio_write_dm;
+        nt = p2m_ram_rw;
+    }
+
+    gpfn_s = start >> PAGE_SHIFT;
+    gpfn_e = end >> PAGE_SHIFT;
+
+    for ( gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
+    {
+        rc = p2m_change_type_one(d, gpfn, ot, nt);
+        if ( rc )
+            break;
+    }
+    
+    return rc;
+}
+
 static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                                             uint32_t type, uint64_t start, uint64_t end)
 {
@@ -1163,6 +1195,7 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
             case HVMOP_IO_RANGE_PORT:
             case HVMOP_IO_RANGE_MEMORY:
             case HVMOP_IO_RANGE_PCI:
+            case HVMOP_IO_RANGE_WP:
                 r = s->range[type];
                 break;
 
@@ -1180,6 +1213,10 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
                 break;
 
             rc = rangeset_add_range(r, start, end);
+
+            if ( type == HVMOP_IO_RANGE_WP )
+                rc = hvm_change_p2m_type_ioreq_server(d, 1, start, end);
+
             break;
         }
     }
@@ -1214,6 +1251,7 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
             case HVMOP_IO_RANGE_PORT:
             case HVMOP_IO_RANGE_MEMORY:
             case HVMOP_IO_RANGE_PCI:
+            case HVMOP_IO_RANGE_WP:
                 r = s->range[type];
                 break;
 
@@ -1231,6 +1269,10 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
                 break;
 
             rc = rangeset_remove_range(r, start, end);
+
+            if ( type == HVMOP_IO_RANGE_WP )
+                rc = hvm_change_p2m_type_ioreq_server(d, 0, start, end);
+
             break;
         }
     }
@@ -2361,6 +2403,10 @@ static struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
             if ( rangeset_contains_range(r, addr, end) )
                 return s;
 
+            r = s->range[HVMOP_IO_RANGE_WP];
+            if ( rangeset_contains_range(r, addr, end) )
+                return s;
+
             break;
         case IOREQ_TYPE_PCI_CONFIG:
             if ( rangeset_contains_singleton(r, addr >> 32) )
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index 291a2e0..b194f9a 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
     evtchn_port_t    ioreq_evtchn;
 };
 
-#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
+#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
 #define MAX_NR_IO_RANGES  256
 
 struct hvm_ioreq_server {
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index eeb0a60..aedb97f 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -323,6 +323,7 @@ struct xen_hvm_io_range {
 # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
 # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
 # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
+# define HVMOP_IO_RANGE_WP     3 /* Write protection range */
     uint64_aligned_t start, end; /* IN - inclusive start and end of range */
 };
 typedef struct xen_hvm_io_range xen_hvm_io_range_t;
-- 
1.7.9.5


* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-03 13:17   ` Jan Beulich
@ 2014-09-04  0:31     ` Ye, Wei
  2014-09-10  5:32     ` Ye, Wei
  1 sibling, 0 replies; 19+ messages in thread
From: Ye, Wei @ 2014-09-04  0:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, Dugger, Donald D, xen-devel, Paul.Durrant, Lv,
	Zhiyuan, Zhang, Yang Z



> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Jan Beulich
> Sent: Wednesday, September 3, 2014 9:17 PM
> To: Ye, Wei
> Cc: Tian, Kevin; keir@xen.org; ian.campbell@citrix.com;
> stefano.stabellini@eu.citrix.com; tim@xen.org; ian.jackson@eu.citrix.com;
> Dugger, Donald D; xen-devel@lists.xen.org; Paul.Durrant@citrix.com; Lv,
> Zhiyuan; Zhang, Yang Z
> Subject: Re: [Xen-devel] [PATCH v3 2/2] ioreq-server: write protected range
> and forwarding
> 
> >>> On 03.09.14 at 23:53, <wei.ye@intel.com> wrote:
> > --- a/xen/include/asm-x86/hvm/domain.h
> > +++ b/xen/include/asm-x86/hvm/domain.h
> > @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
> >      evtchn_port_t    ioreq_evtchn;
> >  };
> >
> > -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> > +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
> >  #define MAX_NR_IO_RANGES  256
> 
> So in the end you didn't even find it necessary to bump the limit on the
> number of ranges per domain? Iirc that was your major objection against
> using existing infrastructure.
> 
I ran an experiment to collect the total number of PPGTT pages that need to be protected at one time on the HWS platform. The result shows that 512 pages need to be protected. Though that is larger than the rangeset limit, fortunately most of the pages are contiguous, so many pages get merged into a single range and the total range count ends up much lower than the limit.
And thanks for your reminder to use the existing rangeset.

Regards
Wei


> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-03 10:11   ` Paul Durrant
@ 2014-09-04 23:10     ` Tian, Kevin
  2014-09-05  0:44       ` Ye, Wei
  0 siblings, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2014-09-04 23:10 UTC (permalink / raw)
  To: Paul Durrant, Ye, Wei, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Wednesday, September 03, 2014 3:12 AM
> 
> > -----Original Message-----
> > From: Wei Ye [mailto:wei.ye@intel.com]
> > Sent: 03 September 2014 22:53
> > To: xen-devel@lists.xen.org
> > Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson; Stefano
> > Stabellini; donald.d.dugger@intel.com; Kevin Tian; yang.z.zhang@intel.com;
> > zhiyuan.lv@intel.com; konrad.wilk@oracle.com; Keir (Xen.org); Tim
> > (Xen.org); Wei Ye
> > Subject: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
> >
> > Extend the ioreq-server to write protect pages and forward the
> > write access to the corresponding device model.
> >
> 
> Using rangesets did make the patch somewhat smaller then :-)
> 
> > Signed-off-by: Wei Ye <wei.ye@intel.com>
> > ---
> >  tools/libxc/xc_domain.c          |   33
> +++++++++++++++++++++++++++
> >  tools/libxc/xenctrl.h            |   18 +++++++++++++++
> >  xen/arch/x86/hvm/hvm.c           |   46
> > ++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-x86/hvm/domain.h |    2 +-
> >  xen/include/public/hvm/hvm_op.h  |    1 +
> >  5 files changed, 99 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> > index 37ed141..34e6309 100644
> > --- a/tools/libxc/xc_domain.c
> > +++ b/tools/libxc/xc_domain.c
> > @@ -1485,6 +1485,39 @@ int
> > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch, domid_t
> > domid,
> >      return rc;
> >  }
> >
> > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch, domid_t
> > domid,
> > +                                        ioservid_t id, uint16_t set,
> > +                                        uint64_t start, uint64_t
> end)
> > +{
> 
> Given what I said about the notion of MMIO ranges that need read emulation
> but no write emulation, and the ranges you're concerned about which require
> write emulation but no read emulation, I wonder whether this could collapse
> into the existing xc_hvm_map_io_range_to_ioreq_server() by adding an extra
> parameter to say whether, for a given range, you want read emulation, write
> emulation or both.	

Agreed. Adding a parameter to indicate r/w permission seems cleaner.

> 
> > +    DECLARE_HYPERCALL;
> > +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> > +    int rc = -1;
> > +
> > +    if ( arg == NULL )
> > +    {
> > +        errno = -EFAULT;
> > +        return -1;
> > +    }
> > +
> > +    hypercall.op = __HYPERVISOR_hvm_op;
> > +    if ( set )
> > +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> > +    else
> > +        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
> > +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> > +
> > +    arg->domid = domid;
> > +    arg->id = id;
> > +    arg->type = HVMOP_IO_RANGE_WP;
> > +    arg->start = start;
> > +    arg->end = end;
> > +
> > +    rc = do_xen_hypercall(xch, &hypercall);
> > +
> > +    xc_hypercall_buffer_free(xch, arg);
> > +    return rc;
> > +}
> > +
> >  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
> >                                  domid_t domid,
> >                                  ioservid_t id)
> > diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> > index b55d857..b261602 100644
> > --- a/tools/libxc/xenctrl.h
> > +++ b/tools/libxc/xenctrl.h
> > @@ -1943,6 +1943,24 @@ int
> > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> >                                            uint8_t function);
> >
> >  /**
> > + * This function registers/deregisters a set of pages to be write protected.
> > + *
> > + * @parm xch a handle to an open hypervisor interface.
> > + * @parm domid the domain id to be serviced
> > + * @parm id the IOREQ Server id.
> > + * @parm set whether registers or deregisters: 1 for register, 0 for
> > deregister
> > + * @parm start start of range
> > + * @parm end end of range (inclusive).
> > + * @return 0 on success, -1 on failure.
> > + */
> > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > +                                        domid_t domid,
> > +                                        ioservid_t id,
> > +                                        uint16_t set,
> > +                                        uint64_t start,
> > +                                        uint64_t end);
> > +
> > +/**
> >   * This function destroys an IOREQ Server.
> >   *
> >   * @parm xch a handle to an open hypervisor interface.
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index adc0f93..f8c4db8 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -853,6 +853,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct
> > hvm_ioreq_server *s,
> >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> "memory" :
> >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> protection" :
> 
> Continuing along my train of thought from above, perhaps the rangesets
> should now be
> 
> HVMOP_IO_RANGE_PORT_READ
> HVMOP_IO_RANGE_PORT_WRITE
> HVMOP_IO_RANGE_MEMORY_READ
> HVMOP_IO_RANGE_MEMORY_WRITE
> HVMOP_IO_RANGE_PCI

This looks cleaner. Once we start considering permissions, WP becomes
just an attribute that can be attached to either a port or a memory resource,
so better not to define a new type here.

For now, READ can simply fail until we see a real usage.

One side effect, though, is that for the default r/w-emulated case we would
double the number of rangesets here.

> 
> >                        "");
> >          if ( rc )
> >              goto fail;
> > @@ -1138,6 +1139,37 @@ static int hvm_get_ioreq_server_info(struct
> > domain *d, ioservid_t id,
> >      return rc;
> >  }
> >
> > +static int hvm_change_p2m_type_ioreq_server(struct domain *d, uint16_t
> > set,
> > +                                            uint64_t start,
> uint64_t end)
> > +{
> > +    int rc = -EINVAL;
> > +    uint64_t gpfn_s, gpfn_e, gpfn;
> > +    p2m_type_t ot, nt;
> > +
> > +    if ( set )
> > +    {
> > +        ot = p2m_ram_rw;
> > +        nt = p2m_mmio_write_dm;
> > +    }
> > +    else
> > +    {
> > +        ot = p2m_mmio_write_dm;
> > +        nt = p2m_ram_rw;
> > +    }
> > +
> > +    gpfn_s = start >> PAGE_SHIFT;
> > +    gpfn_e = end >> PAGE_SHIFT;
> > +
> > +    for ( gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
> > +    {
> > +        rc = p2m_change_type_one(d, gpfn, ot, nt);
> > +        if ( !rc )
> > +            break;
> > +    }
> > +
> > +    return rc;
> > +}
> > +
> >  static int hvm_map_io_range_to_ioreq_server(struct domain *d,
> ioservid_t
> > id,
> >                                              uint32_t type,
> uint64_t start, uint64_t end)
> >  {
> > @@ -1163,6 +1195,7 @@ static int
> > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> >              case HVMOP_IO_RANGE_PORT:
> >              case HVMOP_IO_RANGE_MEMORY:
> >              case HVMOP_IO_RANGE_PCI:
> > +            case HVMOP_IO_RANGE_WP:
> >                  r = s->range[type];
> >                  break;
> >
> > @@ -1180,6 +1213,10 @@ static int
> > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> >                  break;
> >
> >              rc = rangeset_add_range(r, start, end);
> > +
> > +            if ( type == HVMOP_IO_RANGE_WP )
> > +                rc = hvm_change_p2m_type_ioreq_server(d, 1, start,
> end);
> > +
> 
> Here, for memory ranges,  you'd then change the p2m types to
> p2m_mmio_write_dm, p2m_mmio_read_dm or p2m_mmio_dm depending
> on whether the a range was being added to the MEMORY_WRITE set, the
> MEMORY_READ set or both. Obviously you'd need to be careful about ranges
> with distinct semantics that fall into the same page - but that could be dealt
> with. E.g. if you had two ranges in the same page, one needing full emulation
> and one needing write-only emulation then you could set the p2m type to
> p2m_mmio_dm and handle the read emulation for the second range directly
> in hvm_send_assist_req_to_ioreq_server().

I prefer not to introduce such complexity now. Let's just focus on existing usages,
i.e. either rw-emulation or w-emulation. We can reserve the type in the hypercall
path, but p2m_mmio_read_dm can wait until a real usage is required.

> 
>   Paul
> 
> >              break;
> >          }
> >      }
> > @@ -1214,6 +1251,7 @@ static int
> > hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
> >              case HVMOP_IO_RANGE_PORT:
> >              case HVMOP_IO_RANGE_MEMORY:
> >              case HVMOP_IO_RANGE_PCI:
> > +            case HVMOP_IO_RANGE_WP:
> >                  r = s->range[type];
> >                  break;
> >
> > @@ -1231,6 +1269,10 @@ static int
> > hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
> >                  break;
> >
> >              rc = rangeset_remove_range(r, start, end);
> > +
> > +            if ( type == HVMOP_IO_RANGE_WP )
> > +                rc = hvm_change_p2m_type_ioreq_server(d, 0, start,
> end);
> > +
> >              break;
> >          }
> >      }
> > @@ -2361,6 +2403,10 @@ static struct hvm_ioreq_server
> > *hvm_select_ioreq_server(struct domain *d,
> >              if ( rangeset_contains_range(r, addr, end) )
> >                  return s;
> >
> > +            r = s->range[HVMOP_IO_RANGE_WP];
> > +            if ( rangeset_contains_range(r, addr, end) )
> > +                return s;
> > +
> >              break;
> >          case IOREQ_TYPE_PCI_CONFIG:
> >              if ( rangeset_contains_singleton(r, addr >> 32) )
> > diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-
> > x86/hvm/domain.h
> > index 291a2e0..b194f9a 100644
> > --- a/xen/include/asm-x86/hvm/domain.h
> > +++ b/xen/include/asm-x86/hvm/domain.h
> > @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
> >      evtchn_port_t    ioreq_evtchn;
> >  };
> >
> > -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> > +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
> >  #define MAX_NR_IO_RANGES  256
> >
> >  struct hvm_ioreq_server {
> > diff --git a/xen/include/public/hvm/hvm_op.h
> > b/xen/include/public/hvm/hvm_op.h
> > index eeb0a60..aedb97f 100644
> > --- a/xen/include/public/hvm/hvm_op.h
> > +++ b/xen/include/public/hvm/hvm_op.h
> > @@ -323,6 +323,7 @@ struct xen_hvm_io_range {
> >  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
> >  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
> >  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func
> range */
> > +# define HVMOP_IO_RANGE_WP     3 /* Write protection range */
> >      uint64_aligned_t start, end; /* IN - inclusive start and end of range */
> >  };
> >  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
> > --
> > 1.7.9.5
> 


* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-04 23:10     ` Tian, Kevin
@ 2014-09-05  0:44       ` Ye, Wei
  2014-09-05  9:01         ` Paul Durrant
  0 siblings, 1 reply; 19+ messages in thread
From: Ye, Wei @ 2014-09-05  0:44 UTC (permalink / raw)
  To: Tian, Kevin, Paul Durrant, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson



> -----Original Message-----
> From: Tian, Kevin
> Sent: Friday, September 5, 2014 7:11 AM
> To: Paul Durrant; Ye, Wei; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini; Dugger,
> Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir (Xen.org);
> Tim (Xen.org)
> Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> forwarding
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Wednesday, September 03, 2014 3:12 AM
> >
> > > -----Original Message-----
> > > From: Wei Ye [mailto:wei.ye@intel.com]
> > > Sent: 03 September 2014 22:53
> > > To: xen-devel@lists.xen.org
> > > Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson;
> > > Stefano Stabellini; donald.d.dugger@intel.com; Kevin Tian;
> > > yang.z.zhang@intel.com; zhiyuan.lv@intel.com;
> > > konrad.wilk@oracle.com; Keir (Xen.org); Tim (Xen.org); Wei Ye
> > > Subject: [PATCH v3 2/2] ioreq-server: write protected range and
> > > forwarding
> > >
> > > Extend the ioreq-server to write protect pages and forward the write
> > > access to the corresponding device model.
> > >
> >
> > Using rangesets did make the patch somewhat smaller then :-)
> >
> > > Signed-off-by: Wei Ye <wei.ye@intel.com>
> > > ---
> > >  tools/libxc/xc_domain.c          |   33
> > +++++++++++++++++++++++++++
> > >  tools/libxc/xenctrl.h            |   18 +++++++++++++++
> > >  xen/arch/x86/hvm/hvm.c           |   46
> > > ++++++++++++++++++++++++++++++++++++++
> > >  xen/include/asm-x86/hvm/domain.h |    2 +-
> > >  xen/include/public/hvm/hvm_op.h  |    1 +
> > >  5 files changed, 99 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index
> > > 37ed141..34e6309 100644
> > > --- a/tools/libxc/xc_domain.c
> > > +++ b/tools/libxc/xc_domain.c
> > > @@ -1485,6 +1485,39 @@ int
> > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch, domid_t
> > > domid,
> > >      return rc;
> > >  }
> > >
> > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> domid_t
> > > domid,
> > > +                                        ioservid_t id, uint16_t set,
> > > +                                        uint64_t start, uint64_t
> > end)
> > > +{
> >
> > Given what I said about the notion of MMIO ranges that need read
> > emulation but no write emulation, and the ranges you're concerned
> > about which require write emulation but no read emulation, I wonder
> > whether this could collapse into the existing
> > xc_hvm_map_io_range_to_ioreq_server() by adding an extra parameter
> to say whether, for a given range, you want read emulation, write
> > emulation or both.
> 
> Agree. Add a parameter to indicate r/w permission seems cleaner.
> 
> >
> > > +    DECLARE_HYPERCALL;
> > > +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> > > +    int rc = -1;
> > > +
> > > +    if ( arg == NULL )
> > > +    {
> > > +        errno = -EFAULT;
> > > +        return -1;
> > > +    }
> > > +
> > > +    hypercall.op = __HYPERVISOR_hvm_op;
> > > +    if ( set )
> > > +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> > > +    else
> > > +        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
> > > +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> > > +
> > > +    arg->domid = domid;
> > > +    arg->id = id;
> > > +    arg->type = HVMOP_IO_RANGE_WP;
> > > +    arg->start = start;
> > > +    arg->end = end;
> > > +
> > > +    rc = do_xen_hypercall(xch, &hypercall);
> > > +
> > > +    xc_hypercall_buffer_free(xch, arg);
> > > +    return rc;
> > > +}
> > > +
> > >  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
> > >                                  domid_t domid,
> > >                                  ioservid_t id) diff --git
> > > a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h index
> > > b55d857..b261602 100644
> > > --- a/tools/libxc/xenctrl.h
> > > +++ b/tools/libxc/xenctrl.h
> > > @@ -1943,6 +1943,24 @@ int
> > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> > >                                            uint8_t function);
> > >
> > >  /**
> > > + * This function registers/deregisters a set of pages to be write
> protected.
> > > + *
> > > + * @parm xch a handle to an open hypervisor interface.
> > > + * @parm domid the domain id to be serviced
> > > + * @parm id the IOREQ Server id.
> > > + * @parm set whether registers or deregisters: 1 for register, 0
> > > + for
> > > deregister
> > > + * @parm start start of range
> > > + * @parm end end of range (inclusive).
> > > + * @return 0 on success, -1 on failure.
> > > + */
> > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > > +                                        domid_t domid,
> > > +                                        ioservid_t id,
> > > +                                        uint16_t set,
> > > +                                        uint64_t start,
> > > +                                        uint64_t end);
> > > +
> > > +/**
> > >   * This function destroys an IOREQ Server.
> > >   *
> > >   * @parm xch a handle to an open hypervisor interface.
> > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
> > > adc0f93..f8c4db8 100644
> > > --- a/xen/arch/x86/hvm/hvm.c
> > > +++ b/xen/arch/x86/hvm/hvm.c
> > > @@ -853,6 +853,7 @@ static int
> > > hvm_ioreq_server_alloc_rangesets(struct
> > > hvm_ioreq_server *s,
> > >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> > >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> > "memory" :
> > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> > protection" :
> >
> > Continuing along my train of thought from above, perhaps the rangesets
> > should now be
> >
> > HVMOP_IO_RANGE_PORT_READ
> > HVMOP_IO_RANGE_PORT_WRITE
> > HVMOP_IO_RANGE_MEMORY_READ
> > HVMOP_IO_RANGE_MEMORY_WRITE
> > HVMOP_IO_RANGE_PCI
> 
> this looks cleaner. Once we start considering the permission, WP becomes
> only an attribute which can be attached to either port or memory resource.
> So better not define a new type here.
> 

Do we also need a type HVMOP_IO_RANGE_MEMORY_BOTH? Otherwise, to set up
both read and write emulation, the user needs to call
xc_hvm_map_io_range_to_ioreq_server() twice: once with
HVMOP_IO_RANGE_MEMORY_READ and once with HVMOP_IO_RANGE_MEMORY_WRITE.

Wei.

> for now READ can simply fail until we see a real usage.
> 
> One side-effect though, is for default r/w emulated case we would double
> rangesets here.
> 
> >
> > >                        "");
> > >          if ( rc )
> > >              goto fail;
> > > @@ -1138,6 +1139,37 @@ static int hvm_get_ioreq_server_info(struct
> > > domain *d, ioservid_t id,
> > >      return rc;
> > >  }
> > >
> > > +static int hvm_change_p2m_type_ioreq_server(struct domain *d,
> > > +uint16_t
> > > set,
> > > +                                            uint64_t start,
> > uint64_t end)
> > > +{
> > > +    int rc = -EINVAL;
> > > +    uint64_t gpfn_s, gpfn_e, gpfn;
> > > +    p2m_type_t ot, nt;
> > > +
> > > +    if ( set )
> > > +    {
> > > +        ot = p2m_ram_rw;
> > > +        nt = p2m_mmio_write_dm;
> > > +    }
> > > +    else
> > > +    {
> > > +        ot = p2m_mmio_write_dm;
> > > +        nt = p2m_ram_rw;
> > > +    }
> > > +
> > > +    gpfn_s = start >> PAGE_SHIFT;
> > > +    gpfn_e = end >> PAGE_SHIFT;
> > > +
> > > +    for ( gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
> > > +    {
> > > +        rc = p2m_change_type_one(d, gpfn, ot, nt);
> > > +        if ( !rc )
> > > +            break;
> > > +    }
> > > +
> > > +    return rc;
> > > +}
> > > +
> > >  static int hvm_map_io_range_to_ioreq_server(struct domain *d,
> > ioservid_t
> > > id,
> > >                                              uint32_t type,
> > uint64_t start, uint64_t end)
> > >  {
> > > @@ -1163,6 +1195,7 @@ static int
> > > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > >              case HVMOP_IO_RANGE_PORT:
> > >              case HVMOP_IO_RANGE_MEMORY:
> > >              case HVMOP_IO_RANGE_PCI:
> > > +            case HVMOP_IO_RANGE_WP:
> > >                  r = s->range[type];
> > >                  break;
> > >
> > > @@ -1180,6 +1213,10 @@ static int
> > > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > >                  break;
> > >
> > >              rc = rangeset_add_range(r, start, end);
> > > +
> > > +            if ( type == HVMOP_IO_RANGE_WP )
> > > +                rc = hvm_change_p2m_type_ioreq_server(d, 1, start,
> > end);
> > > +
> >
> > Here, for memory ranges,  you'd then change the p2m types to
> > p2m_mmio_write_dm, p2m_mmio_read_dm or p2m_mmio_dm
> depending on
> > whether the a range was being added to the MEMORY_WRITE set, the
> > MEMORY_READ set or both. Obviously you'd need to be careful about
> > ranges with distinct semantics that fall into the same page - but that
> > could be dealt with. E.g. if you had two ranges in the same page, one
> > needing full emulation and one needing write-only emulation then you
> > could set the p2m type to p2m_mmio_dm and handle the read emulation
> > for the second range directly in hvm_send_assist_req_to_ioreq_server().
> 
> I prefer to not introducing such complexity now. Just focus on existing usages,
> i.e. either rw-emulation or w-emulation. We can reserve the type in
> hypercall path, but p2m_mmio_read_dm can wait until a real usage is
> required.
> 
> >
> >   Paul
> >
> > >              break;
> > >          }
> > >      }
> > > @@ -1214,6 +1251,7 @@ static int
> > > hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t
> id,
> > >              case HVMOP_IO_RANGE_PORT:
> > >              case HVMOP_IO_RANGE_MEMORY:
> > >              case HVMOP_IO_RANGE_PCI:
> > > +            case HVMOP_IO_RANGE_WP:
> > >                  r = s->range[type];
> > >                  break;
> > >
> > > @@ -1231,6 +1269,10 @@ static int
> > > hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t
> id,
> > >                  break;
> > >
> > >              rc = rangeset_remove_range(r, start, end);
> > > +
> > > +            if ( type == HVMOP_IO_RANGE_WP )
> > > +                rc = hvm_change_p2m_type_ioreq_server(d, 0, start,
> > end);
> > > +
> > >              break;
> > >          }
> > >      }
> > > @@ -2361,6 +2403,10 @@ static struct hvm_ioreq_server
> > > *hvm_select_ioreq_server(struct domain *d,
> > >              if ( rangeset_contains_range(r, addr, end) )
> > >                  return s;
> > >
> > > +            r = s->range[HVMOP_IO_RANGE_WP];
> > > +            if ( rangeset_contains_range(r, addr, end) )
> > > +                return s;
> > > +
> > >              break;
> > >          case IOREQ_TYPE_PCI_CONFIG:
> > >              if ( rangeset_contains_singleton(r, addr >> 32) ) diff
> > > --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-
> > > x86/hvm/domain.h index 291a2e0..b194f9a 100644
> > > --- a/xen/include/asm-x86/hvm/domain.h
> > > +++ b/xen/include/asm-x86/hvm/domain.h
> > > @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
> > >      evtchn_port_t    ioreq_evtchn;
> > >  };
> > >
> > > -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> > > +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
> > >  #define MAX_NR_IO_RANGES  256
> > >
> > >  struct hvm_ioreq_server {
> > > diff --git a/xen/include/public/hvm/hvm_op.h
> > > b/xen/include/public/hvm/hvm_op.h index eeb0a60..aedb97f 100644
> > > --- a/xen/include/public/hvm/hvm_op.h
> > > +++ b/xen/include/public/hvm/hvm_op.h
> > > @@ -323,6 +323,7 @@ struct xen_hvm_io_range {
> > >  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
> > >  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
> > >  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func
> > range */
> > > +# define HVMOP_IO_RANGE_WP     3 /* Write protection range */
> > >      uint64_aligned_t start, end; /* IN - inclusive start and end of
> > > range */  };  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
> > > --
> > > 1.7.9.5
> >


* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-05  0:44       ` Ye, Wei
@ 2014-09-05  9:01         ` Paul Durrant
  2014-09-10  6:10           ` Ye, Wei
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2014-09-05  9:01 UTC (permalink / raw)
  To: Ye, Wei, Kevin Tian, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson

> -----Original Message-----
> From: Ye, Wei [mailto:wei.ye@intel.com]
> Sent: 05 September 2014 01:44
> To: Kevin Tian; Paul Durrant; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> (Xen.org); Tim (Xen.org)
> Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> forwarding
> 
> 
> 
> > -----Original Message-----
> > From: Tian, Kevin
> > Sent: Friday, September 5, 2014 7:11 AM
> > To: Paul Durrant; Ye, Wei; xen-devel@lists.xen.org
> > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger,
> > Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> (Xen.org);
> > Tim (Xen.org)
> > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > forwarding
> >
> > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > Sent: Wednesday, September 03, 2014 3:12 AM
> > >
> > > > -----Original Message-----
> > > > From: Wei Ye [mailto:wei.ye@intel.com]
> > > > Sent: 03 September 2014 22:53
> > > > To: xen-devel@lists.xen.org
> > > > Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson;
> > > > Stefano Stabellini; donald.d.dugger@intel.com; Kevin Tian;
> > > > yang.z.zhang@intel.com; zhiyuan.lv@intel.com;
> > > > konrad.wilk@oracle.com; Keir (Xen.org); Tim (Xen.org); Wei Ye
> > > > Subject: [PATCH v3 2/2] ioreq-server: write protected range and
> > > > forwarding
> > > >
> > > > Extend the ioreq-server to write protect pages and forward the write
> > > > access to the corresponding device model.
> > > >
> > >
> > > Using rangesets did make the patch somewhat smaller then :-)
> > >
> > > > Signed-off-by: Wei Ye <wei.ye@intel.com>
> > > > ---
> > > >  tools/libxc/xc_domain.c          |   33
> > > +++++++++++++++++++++++++++
> > > >  tools/libxc/xenctrl.h            |   18 +++++++++++++++
> > > >  xen/arch/x86/hvm/hvm.c           |   46
> > > > ++++++++++++++++++++++++++++++++++++++
> > > >  xen/include/asm-x86/hvm/domain.h |    2 +-
> > > >  xen/include/public/hvm/hvm_op.h  |    1 +
> > > >  5 files changed, 99 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index
> > > > 37ed141..34e6309 100644
> > > > --- a/tools/libxc/xc_domain.c
> > > > +++ b/tools/libxc/xc_domain.c
> > > > @@ -1485,6 +1485,39 @@ int
> > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> domid_t
> > > > domid,
> > > >      return rc;
> > > >  }
> > > >
> > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > domid_t
> > > > domid,
> > > > +                                        ioservid_t id, uint16_t set,
> > > > +                                        uint64_t start, uint64_t
> > > end)
> > > > +{
> > >
> > > Given what I said about the notion of MMIO ranges that need read
> > > emulation but no write emulation, and the ranges you're concerned
> > > about which require write emulation but no read emulation, I wonder
> > > whether this could collapse into the existing
> > > xc_hvm_map_io_range_to_ioreq_server() by adding an extra parameter
> > to say whether, for a given range, you want read emulation, write
> > > emulation or both.
> >
> > Agree. Add a parameter to indicate r/w permission seems cleaner.
> >
> > >
> > > > +    DECLARE_HYPERCALL;
> > > > +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> > > > +    int rc = -1;
> > > > +
> > > > +    if ( arg == NULL )
> > > > +    {
> > > > +        errno = -EFAULT;
> > > > +        return -1;
> > > > +    }
> > > > +
> > > > +    hypercall.op = __HYPERVISOR_hvm_op;
> > > > +    if ( set )
> > > > +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> > > > +    else
> > > > +        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
> > > > +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> > > > +
> > > > +    arg->domid = domid;
> > > > +    arg->id = id;
> > > > +    arg->type = HVMOP_IO_RANGE_WP;
> > > > +    arg->start = start;
> > > > +    arg->end = end;
> > > > +
> > > > +    rc = do_xen_hypercall(xch, &hypercall);
> > > > +
> > > > +    xc_hypercall_buffer_free(xch, arg);
> > > > +    return rc;
> > > > +}
> > > > +
> > > >  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
> > > >                                  domid_t domid,
> > > >                                  ioservid_t id) diff --git
> > > > a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h index
> > > > b55d857..b261602 100644
> > > > --- a/tools/libxc/xenctrl.h
> > > > +++ b/tools/libxc/xenctrl.h
> > > > @@ -1943,6 +1943,24 @@ int
> > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> > > >                                            uint8_t function);
> > > >
> > > >  /**
> > > > + * This function registers/deregisters a set of pages to be write
> > protected.
> > > > + *
> > > > + * @parm xch a handle to an open hypervisor interface.
> > > > + * @parm domid the domain id to be serviced
> > > > + * @parm id the IOREQ Server id.
> > > > + * @parm set whether registers or deregisters: 1 for register, 0
> > > > + for
> > > > deregister
> > > > + * @parm start start of range
> > > > + * @parm end end of range (inclusive).
> > > > + * @return 0 on success, -1 on failure.
> > > > + */
> > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > > > +                                        domid_t domid,
> > > > +                                        ioservid_t id,
> > > > +                                        uint16_t set,
> > > > +                                        uint64_t start,
> > > > +                                        uint64_t end);
> > > > +
> > > > +/**
> > > >   * This function destroys an IOREQ Server.
> > > >   *
> > > >   * @parm xch a handle to an open hypervisor interface.
> > > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
> > > > adc0f93..f8c4db8 100644
> > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > @@ -853,6 +853,7 @@ static int
> > > > hvm_ioreq_server_alloc_rangesets(struct
> > > > hvm_ioreq_server *s,
> > > >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> > > >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> > > "memory" :
> > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> > > protection" :
> > >
> > > Continuing along my train of thought from above, perhaps the rangesets
> > > should now be
> > >
> > > HVMOP_IO_RANGE_PORT_READ
> > > HVMOP_IO_RANGE_PORT_WRITE
> > > HVMOP_IO_RANGE_MEMORY_READ
> > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > HVMOP_IO_RANGE_PCI
> >
> > this looks cleaner. Once we start considering the permission, WP becomes
> > only an attribute which can be attached to either port or memory resource.
> > So better not define a new type here.
> >
> 
> Shall we also need a type of HVMOP_IO_RANGE_MEMORY_BOTH?
> Otherwise, if want to set
> both r/w emulation, user needs to call the interface
> xc_hvm_map_io_range_to_ioreq_server() twice. One
> is with the parameter HVMOP_IO_RANGE_MEMORY_READ and the other
> with parameter
> HVMOP_IO_RANGE_MEMORY_WRITE.
> 

That's true, so messing with the types directly is probably not the correct answer. Instead, break the direct association between type and rangeset index; then hvmop_map_io_range_to_ioreq_server() can call hvm_map_io_range_to_ioreq_server() multiple times if necessary.
You could declare the array in struct hvm_ioreq_server as:

struct rangeset	*range[NR_IO_RANGE_TYPES * 2];

and then calculate index as something like:

if (flags & READ)
  index = type;

if (flags & WRITE)
  index = type + NR_IO_RANGE_TYPES;

Yes, it does mean double the number of rangesets if you want both read and write, but that's a very small amount of extra memory. If necessary it could be avoided by having a third set of rangesets for READ|WRITE but it would mean potentially having to do two lookups in hvm_select_ioreq_server().

  Paul


> Wei.
> 
> > for now READ can simply fail until we see a real usage.
> >
> > One side effect, though, is that for the default r/w emulated case we
> > would double the rangesets here.
> >
> > >
> > > >                        "");
> > > >          if ( rc )
> > > >              goto fail;
> > > > @@ -1138,6 +1139,37 @@ static int hvm_get_ioreq_server_info(struct
> > > > domain *d, ioservid_t id,
> > > >      return rc;
> > > >  }
> > > >
> > > > +static int hvm_change_p2m_type_ioreq_server(struct domain *d,
> > > > +uint16_t
> > > > set,
> > > > +                                            uint64_t start,
> > > uint64_t end)
> > > > +{
> > > > +    int rc = -EINVAL;
> > > > +    uint64_t gpfn_s, gpfn_e, gpfn;
> > > > +    p2m_type_t ot, nt;
> > > > +
> > > > +    if ( set )
> > > > +    {
> > > > +        ot = p2m_ram_rw;
> > > > +        nt = p2m_mmio_write_dm;
> > > > +    }
> > > > +    else
> > > > +    {
> > > > +        ot = p2m_mmio_write_dm;
> > > > +        nt = p2m_ram_rw;
> > > > +    }
> > > > +
> > > > +    gpfn_s = start >> PAGE_SHIFT;
> > > > +    gpfn_e = end >> PAGE_SHIFT;
> > > > +
> > > > +    for ( gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
> > > > +    {
> > > > +        rc = p2m_change_type_one(d, gpfn, ot, nt);
> > > > +        if ( !rc )
> > > > +            break;
> > > > +    }
> > > > +
> > > > +    return rc;
> > > > +}
> > > > +
> > > >  static int hvm_map_io_range_to_ioreq_server(struct domain *d,
> > > ioservid_t
> > > > id,
> > > >                                              uint32_t type,
> > > uint64_t start, uint64_t end)
> > > >  {
> > > > @@ -1163,6 +1195,7 @@ static int
> > > > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > > >              case HVMOP_IO_RANGE_PORT:
> > > >              case HVMOP_IO_RANGE_MEMORY:
> > > >              case HVMOP_IO_RANGE_PCI:
> > > > +            case HVMOP_IO_RANGE_WP:
> > > >                  r = s->range[type];
> > > >                  break;
> > > >
> > > > @@ -1180,6 +1213,10 @@ static int
> > > > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > > >                  break;
> > > >
> > > >              rc = rangeset_add_range(r, start, end);
> > > > +
> > > > +            if ( type == HVMOP_IO_RANGE_WP )
> > > > +                rc = hvm_change_p2m_type_ioreq_server(d, 1, start,
> > > end);
> > > > +
> > >
> > > Here, for memory ranges,  you'd then change the p2m types to
> > > p2m_mmio_write_dm, p2m_mmio_read_dm or p2m_mmio_dm
> > depending on
> > > whether the a range was being added to the MEMORY_WRITE set, the
> > > MEMORY_READ set or both. Obviously you'd need to be careful about
> > > ranges with distinct semantics that fall into the same page - but that
> > > could be dealt with. E.g. if you had two ranges in the same page, one
> > > needing full emulation and one needing write-only emulation then you
> > > could set the p2m type to p2m_mmio_dm and handle the read emulation
> > > for the second range directly in
> hvm_send_assist_req_to_ioreq_server().
> >
> > I prefer not to introduce such complexity now. Just focus on existing
> > usages, i.e. either rw-emulation or w-emulation. We can reserve the type
> > in the hypercall path, but p2m_mmio_read_dm can wait until a real usage
> > is required.
> >
> > >
> > >   Paul
> > >
> > > >              break;
> > > >          }
> > > >      }
> > > > @@ -1214,6 +1251,7 @@ static int
> > > > hvm_unmap_io_range_from_ioreq_server(struct domain *d,
> ioservid_t
> > id,
> > > >              case HVMOP_IO_RANGE_PORT:
> > > >              case HVMOP_IO_RANGE_MEMORY:
> > > >              case HVMOP_IO_RANGE_PCI:
> > > > +            case HVMOP_IO_RANGE_WP:
> > > >                  r = s->range[type];
> > > >                  break;
> > > >
> > > > @@ -1231,6 +1269,10 @@ static int
> > > > hvm_unmap_io_range_from_ioreq_server(struct domain *d,
> ioservid_t
> > id,
> > > >                  break;
> > > >
> > > >              rc = rangeset_remove_range(r, start, end);
> > > > +
> > > > +            if ( type == HVMOP_IO_RANGE_WP )
> > > > +                rc = hvm_change_p2m_type_ioreq_server(d, 0, start,
> > > end);
> > > > +
> > > >              break;
> > > >          }
> > > >      }
> > > > @@ -2361,6 +2403,10 @@ static struct hvm_ioreq_server
> > > > *hvm_select_ioreq_server(struct domain *d,
> > > >              if ( rangeset_contains_range(r, addr, end) )
> > > >                  return s;
> > > >
> > > > +            r = s->range[HVMOP_IO_RANGE_WP];
> > > > +            if ( rangeset_contains_range(r, addr, end) )
> > > > +                return s;
> > > > +
> > > >              break;
> > > >          case IOREQ_TYPE_PCI_CONFIG:
> > > >              if ( rangeset_contains_singleton(r, addr >> 32) ) diff
> > > > --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-
> > > > x86/hvm/domain.h index 291a2e0..b194f9a 100644
> > > > --- a/xen/include/asm-x86/hvm/domain.h
> > > > +++ b/xen/include/asm-x86/hvm/domain.h
> > > > @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
> > > >      evtchn_port_t    ioreq_evtchn;
> > > >  };
> > > >
> > > > -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> > > > +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
> > > >  #define MAX_NR_IO_RANGES  256
> > > >
> > > >  struct hvm_ioreq_server {
> > > > diff --git a/xen/include/public/hvm/hvm_op.h
> > > > b/xen/include/public/hvm/hvm_op.h index eeb0a60..aedb97f 100644
> > > > --- a/xen/include/public/hvm/hvm_op.h
> > > > +++ b/xen/include/public/hvm/hvm_op.h
> > > > @@ -323,6 +323,7 @@ struct xen_hvm_io_range {
> > > >  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
> > > >  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
> > > >  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func
> > > range */
> > > > +# define HVMOP_IO_RANGE_WP     3 /* Write protection range */
> > > >      uint64_aligned_t start, end; /* IN - inclusive start and end of
> > > > range */  };  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
> > > > --
> > > > 1.7.9.5
> > >

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-03 13:17   ` Jan Beulich
  2014-09-04  0:31     ` Ye, Wei
@ 2014-09-10  5:32     ` Ye, Wei
  2014-09-10  9:19       ` Jan Beulich
  1 sibling, 1 reply; 19+ messages in thread
From: Ye, Wei @ 2014-09-10  5:32 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, Dugger, Donald D, xen-devel, Paul.Durrant, Lv,
	Zhiyuan, Zhang, Yang Z



> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Jan Beulich
> Sent: Wednesday, September 3, 2014 9:17 PM
> To: Ye, Wei
> Cc: Tian, Kevin; keir@xen.org; ian.campbell@citrix.com;
> stefano.stabellini@eu.citrix.com; tim@xen.org; ian.jackson@eu.citrix.com;
> Dugger, Donald D; xen-devel@lists.xen.org; Paul.Durrant@citrix.com; Lv,
> Zhiyuan; Zhang, Yang Z
> Subject: Re: [Xen-devel] [PATCH v3 2/2] ioreq-server: write protected range
> and forwarding
> 
> >>> On 03.09.14 at 23:53, <wei.ye@intel.com> wrote:
> > +static int hvm_change_p2m_type_ioreq_server(struct domain *d,
> uint16_t set,
> > +                                            uint64_t start, uint64_t
> > +end) {
> > +    int rc = -EINVAL;
> > +    uint64_t gpfn_s, gpfn_e, gpfn;
> > +    p2m_type_t ot, nt;
> > +
> > +    if ( set )
> > +    {
> > +        ot = p2m_ram_rw;
> > +        nt = p2m_mmio_write_dm;
> > +    }
> > +    else
> > +    {
> > +        ot = p2m_mmio_write_dm;
> > +        nt = p2m_ram_rw;
> > +    }
> > +
> > +    gpfn_s = start >> PAGE_SHIFT;
> > +    gpfn_e = end >> PAGE_SHIFT;
> 
> Considering that the first really ought to be PFN_DOWN() - is the latter really
> correct? I'd rather expect that to be PFN_UP()...
> 
I think the latter one should also be PFN_DOWN. Consider a range of [0x0000, 0x1000]:
then only the first pfn, 0, should be changed. Note the following loop:
for (gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
gpfn_e is included. If PFN_UP were used to calculate gpfn_e, there would be a wrong
p2m type change to pfn 1.

Wei
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-05  9:01         ` Paul Durrant
@ 2014-09-10  6:10           ` Ye, Wei
  2014-09-10 21:02             ` Tian, Kevin
  2014-09-11 14:45             ` Paul Durrant
  0 siblings, 2 replies; 19+ messages in thread
From: Ye, Wei @ 2014-09-10  6:10 UTC (permalink / raw)
  To: Paul Durrant, Tian, Kevin, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson



> -----Original Message-----
> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Friday, September 5, 2014 5:02 PM
> To: Ye, Wei; Tian, Kevin; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini; Dugger,
> Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir (Xen.org);
> Tim (Xen.org)
> Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> forwarding
> 
> > -----Original Message-----
> > From: Ye, Wei [mailto:wei.ye@intel.com]
> > Sent: 05 September 2014 01:44
> > To: Kevin Tian; Paul Durrant; xen-devel@lists.xen.org
> > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> > Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com;
> > Keir (Xen.org); Tim (Xen.org)
> > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > forwarding
> >
> >
> >
> > > -----Original Message-----
> > > From: Tian, Kevin
> > > Sent: Friday, September 5, 2014 7:11 AM
> > > To: Paul Durrant; Ye, Wei; xen-devel@lists.xen.org
> > > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano
> > > Stabellini;
> > Dugger,
> > > Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> > (Xen.org);
> > > Tim (Xen.org)
> > > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > > forwarding
> > >
> > > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > > Sent: Wednesday, September 03, 2014 3:12 AM
> > > >
> > > > > -----Original Message-----
> > > > > From: Wei Ye [mailto:wei.ye@intel.com]
> > > > > Sent: 03 September 2014 22:53
> > > > > To: xen-devel@lists.xen.org
> > > > > Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson;
> > > > > Stefano Stabellini; donald.d.dugger@intel.com; Kevin Tian;
> > > > > yang.z.zhang@intel.com; zhiyuan.lv@intel.com;
> > > > > konrad.wilk@oracle.com; Keir (Xen.org); Tim (Xen.org); Wei Ye
> > > > > Subject: [PATCH v3 2/2] ioreq-server: write protected range and
> > > > > forwarding
> > > > >
> > > > > Extend the ioreq-server to write protect pages and forward the
> > > > > write access to the corresponding device model.
> > > > >
> > > >
> > > > Using rangesets did make the patch somewhat smaller then :-)
> > > >
> > > > > Signed-off-by: Wei Ye <wei.ye@intel.com>
> > > > > ---
> > > > >  tools/libxc/xc_domain.c          |   33
> > > > +++++++++++++++++++++++++++
> > > > >  tools/libxc/xenctrl.h            |   18 +++++++++++++++
> > > > >  xen/arch/x86/hvm/hvm.c           |   46
> > > > > ++++++++++++++++++++++++++++++++++++++
> > > > >  xen/include/asm-x86/hvm/domain.h |    2 +-
> > > > >  xen/include/public/hvm/hvm_op.h  |    1 +
> > > > >  5 files changed, 99 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> > > > > index
> > > > > 37ed141..34e6309 100644
> > > > > --- a/tools/libxc/xc_domain.c
> > > > > +++ b/tools/libxc/xc_domain.c
> > > > > @@ -1485,6 +1485,39 @@ int
> > > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> > domid_t
> > > > > domid,
> > > > >      return rc;
> > > > >  }
> > > > >
> > > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > > domid_t
> > > > > domid,
> > > > > +                                        ioservid_t id, uint16_t set,
> > > > > +                                        uint64_t start,
> > > > > + uint64_t
> > > > end)
> > > > > +{
> > > >
> > > > Given what I said about the notion of MMIO ranges that need read
> > > > emulation but no write emulation, and the ranges you're concerned
> > > > about which require write emulation but no read emulation, I
> > > > wonder whether this could collapse into the existing
> > > > xc_hvm_map_io_range_to_ioreq_server() by adding an extra
> parameter
> > > to say whether, for a given range, you want read emulation, write
> > > > emulation or both.
> > >
> > > Agree. Add a parameter to indicate r/w permission seems cleaner.
> > >
> > > >
> > > > > +    DECLARE_HYPERCALL;
> > > > > +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> > > > > +    int rc = -1;
> > > > > +
> > > > > +    if ( arg == NULL )
> > > > > +    {
> > > > > +        errno = -EFAULT;
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > > +    hypercall.op = __HYPERVISOR_hvm_op;
> > > > > +    if ( set )
> > > > > +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> > > > > +    else
> > > > > +        hypercall.arg[0] =
> HVMOP_unmap_io_range_from_ioreq_server;
> > > > > +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> > > > > +
> > > > > +    arg->domid = domid;
> > > > > +    arg->id = id;
> > > > > +    arg->type = HVMOP_IO_RANGE_WP;
> > > > > +    arg->start = start;
> > > > > +    arg->end = end;
> > > > > +
> > > > > +    rc = do_xen_hypercall(xch, &hypercall);
> > > > > +
> > > > > +    xc_hypercall_buffer_free(xch, arg);
> > > > > +    return rc;
> > > > > +}
> > > > > +
> > > > >  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
> > > > >                                  domid_t domid,
> > > > >                                  ioservid_t id) diff --git
> > > > > a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h index
> > > > > b55d857..b261602 100644
> > > > > --- a/tools/libxc/xenctrl.h
> > > > > +++ b/tools/libxc/xenctrl.h
> > > > > @@ -1943,6 +1943,24 @@ int
> > > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> > > > >                                            uint8_t function);
> > > > >
> > > > >  /**
> > > > > + * This function registers/deregisters a set of pages to be
> > > > > + write
> > > protected.
> > > > > + *
> > > > > + * @parm xch a handle to an open hypervisor interface.
> > > > > + * @parm domid the domain id to be serviced
> > > > > + * @parm id the IOREQ Server id.
> > > > > + * @parm set whether registers or deregisters: 1 for register,
> > > > > + 0 for
> > > > > deregister
> > > > > + * @parm start start of range
> > > > > + * @parm end end of range (inclusive).
> > > > > + * @return 0 on success, -1 on failure.
> > > > > + */
> > > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > > > > +                                        domid_t domid,
> > > > > +                                        ioservid_t id,
> > > > > +                                        uint16_t set,
> > > > > +                                        uint64_t start,
> > > > > +                                        uint64_t end);
> > > > > +
> > > > > +/**
> > > > >   * This function destroys an IOREQ Server.
> > > > >   *
> > > > >   * @parm xch a handle to an open hypervisor interface.
> > > > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > > > > index
> > > > > adc0f93..f8c4db8 100644
> > > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > > @@ -853,6 +853,7 @@ static int
> > > > > hvm_ioreq_server_alloc_rangesets(struct
> > > > > hvm_ioreq_server *s,
> > > > >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> > > > >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> > > > "memory" :
> > > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> > > > protection" :
> > > >
> > > > Continuing along my train of thought from above, perhaps the
> > > > rangesets should now be
> > > >
> > > > HVMOP_IO_RANGE_PORT_READ
> > > > HVMOP_IO_RANGE_PORT_WRITE
> > > > HVMOP_IO_RANGE_MEMORY_READ
> > > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > > HVMOP_IO_RANGE_PCI
> > >
> > > this looks cleaner. Once we start considering the permission, WP
> > > becomes only an attribute which can be attached to either port or
> memory resource.
> > > So better not define a new type here.
> > >
> >
> > Shall we also need a type of HVMOP_IO_RANGE_MEMORY_BOTH?
> > Otherwise, if want to set
> > both r/w emulation, user needs to call the interface
> > xc_hvm_map_io_range_to_ioreq_server() twice. One is with the
> parameter
> > HVMOP_IO_RANGE_MEMORY_READ and the other with parameter
> > HVMOP_IO_RANGE_MEMORY_WRITE.
> >
> 
> That's true, so messing with the types directly is probably not the correct
> answer. Instead, break direct association between type and rangeset index
> and then hvmop_map_io_range_to_ioreq_server() can call
> hvm_map_io_range_to_ioreq_server() multiple times if necessary.
> You could declare the array in struct hvm_ioreq_server as:
> 
> struct rangeset	*range[NR_IO_RANGE_TYPES * 2];
> 
> and then calculate index as something like:
> 
> if (flags & READ)
>   index = type;
> 
> if (flags & WRITE)
>   index = type + NR_IO_RANGE_TYPES;
> 
> Yes, it does mean double the number of rangesets if you want both read and
> write, but that's a very small amount of extra memory. If necessary it could
> be avoided by having a third set of rangesets for READ|WRITE but it would
> mean potentially having to do two lookups in hvm_select_ioreq_server().
> 
>   Paul
> 
I'm a little bit confused about write protection for a true mmio range. If the true
mmio range is readable but writes are protected, then where can the data be read from
if reads are not forwarded to the device model? The current write protection for normal
pages, which treats writes as mmio, is a special case: reads come from the normal page,
but writes go to the device model for emulation. Likewise, we could write directly to the
normal page but forward reads to the device model. So I think write protection or read
protection is meaningful only for normal page protection, not for port or mmio resources.
So, doubling the rangesets to split READ|WRITE sets for existing resources is not very reasonable, am I right?

Wei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-10  5:32     ` Ye, Wei
@ 2014-09-10  9:19       ` Jan Beulich
  2014-09-10  9:30         ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2014-09-10  9:19 UTC (permalink / raw)
  To: Wei Ye
  Cc: Kevin Tian, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, Donald D Dugger, xen-devel, Paul.Durrant,
	Zhiyuan Lv, Yang Z Zhang

>>> On 10.09.14 at 07:32, <wei.ye@intel.com> wrote:

> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>> Sent: Wednesday, September 3, 2014 9:17 PM
>> To: Ye, Wei
>> Cc: Tian, Kevin; keir@xen.org; ian.campbell@citrix.com;
>> stefano.stabellini@eu.citrix.com; tim@xen.org; ian.jackson@eu.citrix.com;
>> Dugger, Donald D; xen-devel@lists.xen.org; Paul.Durrant@citrix.com; Lv,
>> Zhiyuan; Zhang, Yang Z
>> Subject: Re: [Xen-devel] [PATCH v3 2/2] ioreq-server: write protected range
>> and forwarding
>> 
>> >>> On 03.09.14 at 23:53, <wei.ye@intel.com> wrote:
>> > +static int hvm_change_p2m_type_ioreq_server(struct domain *d,
>> uint16_t set,
>> > +                                            uint64_t start, uint64_t
>> > +end) {
>> > +    int rc = -EINVAL;
>> > +    uint64_t gpfn_s, gpfn_e, gpfn;
>> > +    p2m_type_t ot, nt;
>> > +
>> > +    if ( set )
>> > +    {
>> > +        ot = p2m_ram_rw;
>> > +        nt = p2m_mmio_write_dm;
>> > +    }
>> > +    else
>> > +    {
>> > +        ot = p2m_mmio_write_dm;
>> > +        nt = p2m_ram_rw;
>> > +    }
>> > +
>> > +    gpfn_s = start >> PAGE_SHIFT;
>> > +    gpfn_e = end >> PAGE_SHIFT;
>> 
>> Considering that the first really ought to be PFN_DOWN() - is the latter 
> really
>> correct? I'd rather expect that to be PFN_UP()...
>> 
> I think the latter one should also be PFN_DOWN. Consider a range of
> [0x0000, 0x1000]:
> then only the first pfn, 0, should be changed. Note the following loop:
> for (gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
> gpfn_e is included. If PFN_UP were used to calculate gpfn_e, there would be
> a wrong p2m type change
> to pfn 1.

But then (i.e. if the range is inclusive) it still needs to be
PFN_UP(end - 1).

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-10  9:19       ` Jan Beulich
@ 2014-09-10  9:30         ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2014-09-10  9:30 UTC (permalink / raw)
  To: Wei Ye, Jan Beulich
  Cc: Kevin Tian, keir, ian.campbell, stefano.stabellini, tim,
	ian.jackson, Donald D Dugger, xen-devel, Paul.Durrant,
	Zhiyuan Lv, Yang Z Zhang

>>> On 10.09.14 at 11:19, <JBeulich@suse.com> wrote:
>>>> On 10.09.14 at 07:32, <wei.ye@intel.com> wrote:
> 
>> 
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>>> Sent: Wednesday, September 3, 2014 9:17 PM
>>> To: Ye, Wei
>>> Cc: Tian, Kevin; keir@xen.org; ian.campbell@citrix.com;
>>> stefano.stabellini@eu.citrix.com; tim@xen.org; ian.jackson@eu.citrix.com;
>>> Dugger, Donald D; xen-devel@lists.xen.org; Paul.Durrant@citrix.com; Lv,
>>> Zhiyuan; Zhang, Yang Z
>>> Subject: Re: [Xen-devel] [PATCH v3 2/2] ioreq-server: write protected range
>>> and forwarding
>>> 
>>> >>> On 03.09.14 at 23:53, <wei.ye@intel.com> wrote:
>>> > +static int hvm_change_p2m_type_ioreq_server(struct domain *d,
>>> uint16_t set,
>>> > +                                            uint64_t start, uint64_t
>>> > +end) {
>>> > +    int rc = -EINVAL;
>>> > +    uint64_t gpfn_s, gpfn_e, gpfn;
>>> > +    p2m_type_t ot, nt;
>>> > +
>>> > +    if ( set )
>>> > +    {
>>> > +        ot = p2m_ram_rw;
>>> > +        nt = p2m_mmio_write_dm;
>>> > +    }
>>> > +    else
>>> > +    {
>>> > +        ot = p2m_mmio_write_dm;
>>> > +        nt = p2m_ram_rw;
>>> > +    }
>>> > +
>>> > +    gpfn_s = start >> PAGE_SHIFT;
>>> > +    gpfn_e = end >> PAGE_SHIFT;
>>> 
>>> Considering that the first really ought to be PFN_DOWN() - is the latter 
>> really
>>> correct? I'd rather expect that to be PFN_UP()...
>>> 
>> I think the latter one should also be PFN_DOWN. Consider a range of
>> [0x0000, 0x1000]:
>> then only the first pfn, 0, should be changed. Note the following loop:
>> for (gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
>> gpfn_e is included. If PFN_UP were used to calculate gpfn_e, there would
>> be a wrong p2m type change
>> to pfn 1.
> 
> But then (i.e. if the range is inclusive) it still needs to be

Of course this should read "... is not inclusive) ..."

Jan

> PFN_UP(end - 1).
> 
> Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-10  6:10           ` Ye, Wei
@ 2014-09-10 21:02             ` Tian, Kevin
  2014-09-11 14:38               ` Paul Durrant
  2014-09-11 14:45             ` Paul Durrant
  1 sibling, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2014-09-10 21:02 UTC (permalink / raw)
  To: Ye, Wei, Paul Durrant, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson

> From: Ye, Wei
> Sent: Tuesday, September 09, 2014 11:10 PM
> > > > > > adc0f93..f8c4db8 100644
> > > > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > > > @@ -853,6 +853,7 @@ static int
> > > > > > hvm_ioreq_server_alloc_rangesets(struct
> > > > > > hvm_ioreq_server *s,
> > > > > >                        (i == HVMOP_IO_RANGE_PORT) ?
> "port" :
> > > > > >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> > > > > "memory" :
> > > > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> > > > > protection" :
> > > > >
> > > > > Continuing along my train of thought from above, perhaps the
> > > > > rangesets should now be
> > > > >
> > > > > HVMOP_IO_RANGE_PORT_READ
> > > > > HVMOP_IO_RANGE_PORT_WRITE
> > > > > HVMOP_IO_RANGE_MEMORY_READ
> > > > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > > > HVMOP_IO_RANGE_PCI
> > > >
> > > > this looks cleaner. Once we start considering the permission, WP
> > > > becomes only an attribute which can be attached to either port or
> > memory resource.
> > > > So better not define a new type here.
> > > >
> > >
> > > Shall we also need a type of HVMOP_IO_RANGE_MEMORY_BOTH?
> > > Otherwise, if want to set
> > > both r/w emulation, user needs to call the interface
> > > xc_hvm_map_io_range_to_ioreq_server() twice. One is with the
> > parameter
> > > HVMOP_IO_RANGE_MEMORY_READ and the other with parameter
> > > HVMOP_IO_RANGE_MEMORY_WRITE.
> > >
> >
> > That's true, so messing with the types directly is probably not the correct
> > answer. Instead, break direct association between type and rangeset index
> > and then hvmop_map_io_range_to_ioreq_server() can call
> > hvm_map_io_range_to_ioreq_server() multiple times if necessary.
> > You could declare the array in struct hvm_ioreq_server as:
> >
> > struct rangeset	*range[NR_IO_RANGE_TYPES * 2];
> >
> > and then calculate index as something like:
> >
> > if (flags & READ)
> >   index = type;
> >
> > if (flags & WRITE)
> >   index = type + NR_IO_RANGE_TYPES;
> >
> > Yes, it does mean double the number of rangesets if you want both read and
> > write, but that's a very small amount of extra memory. If necessary it could
> > be avoided by having a third set of rangesets for READ|WRITE but it would
> > mean potentially having to do two lookups in hvm_select_ioreq_server().
> >
> >   Paul
> >
> I'm a little bit confused about write protection for a true mmio range. If the
> true mmio range is readable but writes are protected, then where can the data
> be read from if reads are not forwarded to the device model? The current write
> protection for normal pages, which treats writes as mmio, is a special case:
> reads come from the normal page, but writes go to the device model for
> emulation. Likewise, we could write directly to the normal page but forward
> reads to the device model. So I think write protection or read protection is
> meaningful only for normal page protection, not for port or mmio resources.
> So, doubling the rangesets to split READ|WRITE sets for existing resources is
> not very reasonable, am I right?
> 
> Wei

Ideally there could be such a usage with a special device assignment style. For
example, have all the writes from the VM forwarded to an agent in Dom0, so
writes can be logged and replayed for introspection or high availability purposes,
while serving most reads from the device directly for performance reasons when
there are no side effects.

However, it's only an ideal case, so the change we make now is likely incomplete,
and more changes will be required to make it complete when the
ideal case becomes real. Given that, maybe it's better to just introduce
a new type, specifically for this write-protected memory page usage:

>>HVMOP_IO_RANGE_WP_MEMORY

It has effect only on normal memory (add a check for that), w/o impact on
existing interfaces.

Paul, how about your thoughts?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-10 21:02             ` Tian, Kevin
@ 2014-09-11 14:38               ` Paul Durrant
  2014-09-16 21:25                 ` Tian, Kevin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Durrant @ 2014-09-11 14:38 UTC (permalink / raw)
  To: Kevin Tian, Ye, Wei, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 10 September 2014 14:02
> To: Ye, Wei; Paul Durrant; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> (Xen.org); Tim (Xen.org)
> Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> forwarding
> 
> > From: Ye, Wei
> > Sent: Tuesday, September 09, 2014 11:10 PM
> > > > > > > adc0f93..f8c4db8 100644
> > > > > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > > > > @@ -853,6 +853,7 @@ static int
> > > > > > > hvm_ioreq_server_alloc_rangesets(struct
> > > > > > > hvm_ioreq_server *s,
> > > > > > >                        (i == HVMOP_IO_RANGE_PORT) ?
> > "port" :
> > > > > > >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> > > > > > "memory" :
> > > > > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> > > > > > protection" :
> > > > > >
> > > > > > Continuing along my train of thought from above, perhaps the
> > > > > > rangesets should now be
> > > > > >
> > > > > > HVMOP_IO_RANGE_PORT_READ
> > > > > > HVMOP_IO_RANGE_PORT_WRITE
> > > > > > HVMOP_IO_RANGE_MEMORY_READ
> > > > > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > > > > HVMOP_IO_RANGE_PCI
> > > > >
> > > > > this looks cleaner. Once we start considering the permission, WP
> > > > > becomes only an attribute which can be attached to either port or
> > > memory resource.
> > > > > So better not define a new type here.
> > > > >
> > > >
> > > > Shall we also need a type of HVMOP_IO_RANGE_MEMORY_BOTH?
> > > > Otherwise, if want to set
> > > > both r/w emulation, user needs to call the interface
> > > > xc_hvm_map_io_range_to_ioreq_server() twice. One is with the
> > > parameter
> > > > HVMOP_IO_RANGE_MEMORY_READ and the other with parameter
> > > > HVMOP_IO_RANGE_MEMORY_WRITE.
> > > >
> > >
> > > That's true, so messing with the types directly is probably not the correct
> > > answer. Instead, break direct association between type and rangeset
> index
> > > and then hvmop_map_io_range_to_ioreq_server() can call
> > > hvm_map_io_range_to_ioreq_server() multiple times if necessary.
> > > You could declare the array in struct hvm_ioreq_server as:
> > >
> > > struct rangeset	*range[NR_IO_RANGE_TYPES * 2];
> > >
> > > and then calculate index as something like:
> > >
> > > if (flags & READ)
> > >   index = type;
> > >
> > > if (flags & WRITE)
> > >   index = type + NR_IO_RANGE_TYPES;
> > >
> > > Yes, it does mean double the number of rangesets if you want both read
> and
> > > write, but that's a very small amount of extra memory. If necessary it
> could
> > > be avoided by having a third set of rangesets for READ|WRITE but it
> would
> > > mean potentially having to do two lookups in hvm_select_ioreq_server().
> > >
> > >   Paul
> > >
> > I'm a little bit confused about write protection for a true mmio range. If
> > the true mmio range is readable but writes are protected, then where can
> > the data be read from if reads are not forwarded to the device model? The
> > current write protection for normal pages, which treats writes as mmio, is
> > a special case: reads come from the normal page, but writes go to the
> > device model for emulation. Likewise, we could write directly to the
> > normal page but forward reads to the device model. So I think write
> > protection or read protection is meaningful only for normal page
> > protection, not for port or mmio resources. So, doubling the rangesets to
> > split READ|WRITE sets for existing resources is not very reasonable, am I
> > right?
> 
> Ideally there could be a use for such a special device-assignment style.
> For example, all writes from the VM could be forwarded to an agent in
> Dom0, so writes can be logged and replayed for introspection or
> high-availability purposes, while most reads go to the device directly
> for performance, if there are no side effects.
> 
> However, that is only an ideal case, so the change we make now is likely
> incomplete, and more changes would be required to complete it when the
> ideal case becomes real. Given that, maybe it's better to just introduce
> a new type specifically for this write-protected memory page usage:
> 
> >>HVMOP_IO_RANGE_WP_MEMORY
> 
> It would take effect only on normal memory (add a check for that), with
> no impact on existing interfaces.
> 
> Paul, what are your thoughts?
>

I really don't see it as a special case. You can either emulate read, write or both. Clearly if you're not emulating both then there has to be a page in the p2m so that the non-emulated case can be handled. Current MMIO is simply the emulate-both case for a range which has no memory backing it. If we get the new additional code right then it would be possible for an emulator registering an MMIO range to add pages into the guest for that range, and then only emulate read or write accesses if it doesn't need to handle both.

  Paul

 
> Thanks
> Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-10  6:10           ` Ye, Wei
  2014-09-10 21:02             ` Tian, Kevin
@ 2014-09-11 14:45             ` Paul Durrant
  1 sibling, 0 replies; 19+ messages in thread
From: Paul Durrant @ 2014-09-11 14:45 UTC (permalink / raw)
  To: Ye, Wei, Kevin Tian, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson

> -----Original Message-----
> From: Ye, Wei [mailto:wei.ye@intel.com]
> Sent: 09 September 2014 23:10
> To: Paul Durrant; Kevin Tian; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> (Xen.org); Tim (Xen.org)
> Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> forwarding
> 
> 
> 
> > -----Original Message-----
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Friday, September 5, 2014 5:02 PM
> > To: Ye, Wei; Tian, Kevin; xen-devel@lists.xen.org
> > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger,
> > Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> (Xen.org);
> > Tim (Xen.org)
> > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > forwarding
> >
> > > -----Original Message-----
> > > From: Ye, Wei [mailto:wei.ye@intel.com]
> > > Sent: 05 September 2014 01:44
> > > To: Kevin Tian; Paul Durrant; xen-devel@lists.xen.org
> > > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> > > Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com;
> > > Keir (Xen.org); Tim (Xen.org)
> > > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > > forwarding
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Tian, Kevin
> > > > Sent: Friday, September 5, 2014 7:11 AM
> > > > To: Paul Durrant; Ye, Wei; xen-devel@lists.xen.org
> > > > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano
> > > > Stabellini;
> > > Dugger,
> > > > Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> > > (Xen.org);
> > > > Tim (Xen.org)
> > > > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > > > forwarding
> > > >
> > > > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > > > Sent: Wednesday, September 03, 2014 3:12 AM
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Wei Ye [mailto:wei.ye@intel.com]
> > > > > > Sent: 03 September 2014 22:53
> > > > > > To: xen-devel@lists.xen.org
> > > > > > Cc: JBeulich@suse.com; Ian Campbell; Paul Durrant; Ian Jackson;
> > > > > > Stefano Stabellini; donald.d.dugger@intel.com; Kevin Tian;
> > > > > > yang.z.zhang@intel.com; zhiyuan.lv@intel.com;
> > > > > > konrad.wilk@oracle.com; Keir (Xen.org); Tim (Xen.org); Wei Ye
> > > > > > Subject: [PATCH v3 2/2] ioreq-server: write protected range and
> > > > > > forwarding
> > > > > >
> > > > > > Extend the ioreq-server to write protect pages and forward the
> > > > > > write access to the corresponding device model.
> > > > > >
> > > > >
> > > > > Using rangesets did make the patch somewhat smaller then :-)
> > > > >
> > > > > > Signed-off-by: Wei Ye <wei.ye@intel.com>
> > > > > > ---
> > > > > >  tools/libxc/xc_domain.c          |   33 +++++++++++++++++++++++++++
> > > > > >  tools/libxc/xenctrl.h            |   18 +++++++++++++++
> > > > > >  xen/arch/x86/hvm/hvm.c           |   46 ++++++++++++++++++++++++++++++++++++++
> > > > > >  xen/include/asm-x86/hvm/domain.h |    2 +-
> > > > > >  xen/include/public/hvm/hvm_op.h  |    1 +
> > > > > >  5 files changed, 99 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> > > > > > index 37ed141..34e6309 100644
> > > > > > --- a/tools/libxc/xc_domain.c
> > > > > > +++ b/tools/libxc/xc_domain.c
> > > > > > @@ -1485,6 +1485,39 @@ int xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch, domid_t domid,
> > > > > >      return rc;
> > > > > >  }
> > > > > >
> > > > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch, domid_t domid,
> > > > > > +                                        ioservid_t id, uint16_t set,
> > > > > > +                                        uint64_t start, uint64_t end)
> > > > > > +{
> > > > >
> > > > > Given what I said about the notion of MMIO ranges that need read
> > > > > emulation but no write emulation, and the ranges you're concerned
> > > > > about which require write emulation but no read emulation, I
> > > > > wonder whether this could collapse into the existing
> > > > > xc_hvm_map_io_range_to_ioreq_server() by adding an extra
> > parameter
> > > > to say whether, for a given range, you want read emulation, write
> > > > > emulation or both.
> > > >
> > > > Agree. Add a parameter to indicate r/w permission seems cleaner.
> > > >
> > > > >
> > > > > > +    DECLARE_HYPERCALL;
> > > > > > +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> > > > > > +    int rc = -1;
> > > > > > +
> > > > > > +    if ( arg == NULL )
> > > > > > +    {
> > > > > > +        errno = -EFAULT;
> > > > > > +        return -1;
> > > > > > +    }
> > > > > > +
> > > > > > +    hypercall.op = __HYPERVISOR_hvm_op;
> > > > > > +    if ( set )
> > > > > > +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> > > > > > +    else
> > > > > > +        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
> > > > > > +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> > > > > > +
> > > > > > +    arg->domid = domid;
> > > > > > +    arg->id = id;
> > > > > > +    arg->type = HVMOP_IO_RANGE_WP;
> > > > > > +    arg->start = start;
> > > > > > +    arg->end = end;
> > > > > > +
> > > > > > +    rc = do_xen_hypercall(xch, &hypercall);
> > > > > > +
> > > > > > +    xc_hypercall_buffer_free(xch, arg);
> > > > > > +    return rc;
> > > > > > +}
> > > > > > +
> > > > > >  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
> > > > > >                                  domid_t domid,
> > > > > >                                  ioservid_t id)
> > > > > > diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> > > > > > index b55d857..b261602 100644
> > > > > > --- a/tools/libxc/xenctrl.h
> > > > > > +++ b/tools/libxc/xenctrl.h
> > > > > > @@ -1943,6 +1943,24 @@ int
> > > > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> > > > > >                                            uint8_t function);
> > > > > >
> > > > > >  /**
> > > > > > + * This function registers/deregisters a set of pages to be write protected.
> > > > > > + *
> > > > > > + * @parm xch a handle to an open hypervisor interface.
> > > > > > + * @parm domid the domain id to be serviced
> > > > > > + * @parm id the IOREQ Server id.
> > > > > > + * @parm set whether to register or deregister: 1 for register, 0 for deregister
> > > > > > + * @parm start start of range
> > > > > > + * @parm end end of range (inclusive).
> > > > > > + * @return 0 on success, -1 on failure.
> > > > > > + */
> > > > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > > > > > +                                        domid_t domid,
> > > > > > +                                        ioservid_t id,
> > > > > > +                                        uint16_t set,
> > > > > > +                                        uint64_t start,
> > > > > > +                                        uint64_t end);
> > > > > > +
> > > > > > +/**
> > > > > >   * This function destroys an IOREQ Server.
> > > > > >   *
> > > > > >   * @parm xch a handle to an open hypervisor interface.
> > > > > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > > > > > index adc0f93..f8c4db8 100644
> > > > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > > > @@ -853,6 +853,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
> > > > > >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> > > > > >                        (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
> > > > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write protection" :
> > > > >
> > > > > Continuing along my train of thought from above, perhaps the
> > > > > rangesets should now be
> > > > >
> > > > > HVMOP_IO_RANGE_PORT_READ
> > > > > HVMOP_IO_RANGE_PORT_WRITE
> > > > > HVMOP_IO_RANGE_MEMORY_READ
> > > > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > > > HVMOP_IO_RANGE_PCI
> > > >
> > > > this looks cleaner. Once we start considering the permission, WP
> > > > becomes only an attribute which can be attached to either port or
> > memory resource.
> > > > So better not define a new type here.
> > > >
> > >
> > > Do we also need a type HVMOP_IO_RANGE_MEMORY_BOTH? Otherwise,
> > > to set up both read and write emulation, the user needs to call
> > > xc_hvm_map_io_range_to_ioreq_server() twice: once with the parameter
> > > HVMOP_IO_RANGE_MEMORY_READ and once with the parameter
> > > HVMOP_IO_RANGE_MEMORY_WRITE.
> > >
> >
> > That's true, so messing with the types directly is probably not the correct
> > answer. Instead, break direct association between type and rangeset index
> > and then hvmop_map_io_range_to_ioreq_server() can call
> > hvm_map_io_range_to_ioreq_server() multiple times if necessary.
> > You could declare the array in struct hvm_ioreq_server as:
> >
> > struct rangeset	*range[NR_IO_RANGE_TYPES * 2];
> >
> > and then calculate index as something like:
> >
> > if (flags & READ)
> >   index = type;
> >
> > if (flags & WRITE)
> >   index = type + NR_IO_RANGE_TYPES;
> >
> > Yes, it does mean double the number of rangesets if you want both read and
> > write, but that's a very small amount of extra memory. If necessary it could
> > be avoided by having a third set of rangesets for READ|WRITE but it would
> > mean potentially having to do two lookups in hvm_select_ioreq_server().
> >
> >   Paul
> >
> I'm a little confused about applying write protection to a true MMIO range.
> If the true MMIO range is readable but writes are protected, then where can
> the data be read from if reads are not forwarded to the device model?

Clearly the emulator would need to populate that page in the guest p2m if it's not emulating both read and write. But that's what you're already doing in your use-case, isn't it?

> The current write protection for normal pages, which treats writes as MMIO,
> is a special case: reads come from the normal page, but writes go to the
> device model for emulation.

Yes, but why is that a special case? All you're essentially doing is populating pages in an MMIO area and then setting protection so you get the intercepts you want. There's already precedent for populating the p2m of an MMIO range in the stdvga device model in QEMU (where the VRAM is exposed as a BAR).

> And we can also write directly to the normal page but forward reads to the
> device model. So I think write protection or read protection is meaningful
> only for normal page protection, not for port or MMIO resources.

But essentially the only difference between MMIO and RAM is whether there are any entries in the p2m.

  Paul

> So, doubling the rangesets to split READ|WRITE sets for existing resources
> is not very reasonable, am I right?
> 
> Wei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 2/2] ioreq-server: write protected range and forwarding
  2014-09-11 14:38               ` Paul Durrant
@ 2014-09-16 21:25                 ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2014-09-16 21:25 UTC (permalink / raw)
  To: Paul Durrant, Ye, Wei, xen-devel
  Cc: Keir (Xen.org), Ian Campbell, Zhang, Yu C, Tim (Xen.org),
	Dugger, Donald D, Stefano Stabellini, Lv, Zhiyuan, JBeulich,
	Zhang, Yang Z, Ian Jackson

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Thursday, September 11, 2014 7:39 AM
> 
> > -----Original Message-----
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 10 September 2014 14:02
> > To: Ye, Wei; Paul Durrant; xen-devel@lists.xen.org
> > Cc: JBeulich@suse.com; Ian Campbell; Ian Jackson; Stefano Stabellini;
> > Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@oracle.com; Keir
> > (Xen.org); Tim (Xen.org)
> > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > forwarding
> >
> > > From: Ye, Wei
> > > Sent: Tuesday, September 09, 2014 11:10 PM
> > > > > > > > adc0f93..f8c4db8 100644
> > > > > > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > > > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > > > > > @@ -853,6 +853,7 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
> > > > > > > >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> > > > > > > >                        (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
> > > > > > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > > > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write protection" :
> > > > > > >
> > > > > > > Continuing along my train of thought from above, perhaps the
> > > > > > > rangesets should now be
> > > > > > >
> > > > > > > HVMOP_IO_RANGE_PORT_READ
> > > > > > > HVMOP_IO_RANGE_PORT_WRITE
> > > > > > > HVMOP_IO_RANGE_MEMORY_READ
> > > > > > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > > > > > HVMOP_IO_RANGE_PCI
> > > > > >
> > > > > > this looks cleaner. Once we start considering the permission, WP
> > > > > > becomes only an attribute which can be attached to either port or
> > > > memory resource.
> > > > > > So better not define a new type here.
> > > > > >
> > > > >
> > > > > Do we also need a type HVMOP_IO_RANGE_MEMORY_BOTH? Otherwise,
> > > > > to set up both read and write emulation, the user needs to call
> > > > > xc_hvm_map_io_range_to_ioreq_server() twice: once with the parameter
> > > > > HVMOP_IO_RANGE_MEMORY_READ and once with the parameter
> > > > > HVMOP_IO_RANGE_MEMORY_WRITE.
> > > > >
> > > >
> > > > That's true, so messing with the types directly is probably not the correct
> > > > answer. Instead, break direct association between type and rangeset index
> > > > and then hvmop_map_io_range_to_ioreq_server() can call
> > > > hvm_map_io_range_to_ioreq_server() multiple times if necessary.
> > > > You could declare the array in struct hvm_ioreq_server as:
> > > >
> > > > struct rangeset	*range[NR_IO_RANGE_TYPES * 2];
> > > >
> > > > and then calculate index as something like:
> > > >
> > > > if (flags & READ)
> > > >   index = type;
> > > >
> > > > if (flags & WRITE)
> > > >   index = type + NR_IO_RANGE_TYPES;
> > > >
> > > > Yes, it does mean double the number of rangesets if you want both read and
> > > > write, but that's a very small amount of extra memory. If necessary it could
> > > > be avoided by having a third set of rangesets for READ|WRITE but it would
> > > > mean potentially having to do two lookups in hvm_select_ioreq_server().
> > > >
> > > >   Paul
> > > >
> > > I'm a little confused about applying write protection to a true MMIO
> > > range. If the true MMIO range is readable but writes are protected,
> > > then where can the data be read from if reads are not forwarded to
> > > the device model? The current write protection for normal pages,
> > > which treats writes as MMIO, is a special case: reads come from the
> > > normal page, but writes go to the device model for emulation. We
> > > could equally write directly to the normal page but forward reads to
> > > the device model. So I think write protection or read protection is
> > > meaningful only for normal page protection, not for port or MMIO
> > > resources.
> > > So, doubling the rangesets to split READ|WRITE sets for existing
> > > resources is not very reasonable, am I right?
> > >
> > > Wei
> >
> > Ideally there could be a use for such a special device-assignment style.
> > For example, all writes from the VM could be forwarded to an agent in
> > Dom0, so writes can be logged and replayed for introspection or
> > high-availability purposes, while most reads go to the device directly
> > for performance, if there are no side effects.
> >
> > However, that is only an ideal case, so the change we make now is likely
> > incomplete, and more changes would be required to complete it when the
> > ideal case becomes real. Given that, maybe it's better to just introduce
> > a new type specifically for this write-protected memory page usage:
> >
> > >>HVMOP_IO_RANGE_WP_MEMORY
> >
> > It would take effect only on normal memory (add a check for that), with
> > no impact on existing interfaces.
> >
> > Paul, what are your thoughts?
> >
> 
> I really don't see it as a special case. You can either emulate read, write or
> both. Clearly if you're not emulating both then there has to be a page in the
> p2m so that the non-emulated case can be handled. Current MMIO is simply
> the emulate-both case for a range which has no memory backing it. If we get
> the new additional code right then it would be possible for an emulator
> registering an MMIO range to add pages into the guest for that range, and
> then only emulate read or write accesses if it doesn't need to handle both.
> 

So let's still pursue the interface change you suggested. However, because
Wei has left Intel, Yu will take over this work; please bear with some delay
while he ramps up.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-09-16 21:25 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-03 21:53 [PATCH v3 0/2] Extend ioreq-server to support page write protection Wei Ye
2014-09-03 21:53 ` [PATCH v3 1/2] x86: add p2m_mmio_write_dm Wei Ye
2014-09-03  9:54   ` Paul Durrant
2014-09-03 10:03     ` Jan Beulich
2014-09-03 21:53 ` [PATCH v3 2/2] ioreq-server: write protected range and forwarding Wei Ye
2014-09-03 10:11   ` Paul Durrant
2014-09-04 23:10     ` Tian, Kevin
2014-09-05  0:44       ` Ye, Wei
2014-09-05  9:01         ` Paul Durrant
2014-09-10  6:10           ` Ye, Wei
2014-09-10 21:02             ` Tian, Kevin
2014-09-11 14:38               ` Paul Durrant
2014-09-16 21:25                 ` Tian, Kevin
2014-09-11 14:45             ` Paul Durrant
2014-09-03 13:17   ` Jan Beulich
2014-09-04  0:31     ` Ye, Wei
2014-09-10  5:32     ` Ye, Wei
2014-09-10  9:19       ` Jan Beulich
2014-09-10  9:30         ` Jan Beulich
