All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] RFC: V4V (v3)
@ 2012-08-03 19:50 Jean Guyader
  2012-08-03 19:50 ` [PATCH 1/5] xen: add ssize_t Jean Guyader
                   ` (5 more replies)
  0 siblings, 6 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-03 19:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 2341 bytes --]

v3 changes:
        - Switch to event channel
                - Allocated a unbound event channel
                  per domain.
                - Add a new v4v call to share the
                  event channel port.
        - Public headers with actual type definition
        - Align all the v4v type to 64 bits
        - Modify v4v MAGIC numbers because we won't
          but backward compatible anymore
        - Merge insert and insertv
        - Merge send and sendv
        - Turn all the lock prerequisite from comment
          to ASSERT()
        - Make use or write_atomic instead of volatile pointers
        - Merge v4v_memcpy_to_guest_ring and
          v4v_memcpy_to_guest_ring_from_guest
                - Introduce copy_from_guest_maybe that can take
                  a void * and a handle as src address.
        - TODO:
                - Add libv4v userspace code

v2 changes:
        - Cleanup plugin header
        - Include basic access control
        - Use guest_handle_for_field


Jan Beulich (1):
  xen: Introduce guest_handle_for_field

Jean Guyader (4):
  xen: add ssize_t
  xen: virq, remove VIRQ_XC_RESERVED
  xen: events, exposes evtchn_alloc_unbound_domain
  xen: Add V4V implementation

 xen/arch/x86/hvm/hvm.c             |    9 +-
 xen/arch/x86/x86_32/entry.S        |    2 +
 xen/arch/x86/x86_64/compat/entry.S |    2 +
 xen/arch/x86/x86_64/entry.S        |    2 +
 xen/common/Makefile                |    1 +
 xen/common/domain.c                |   13 +-
 xen/common/event_channel.c         |   33 +-
 xen/common/v4v.c                   | 1895 ++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/types.h        |    1 +
 xen/include/asm-x86/guest_access.h |    3 +
 xen/include/asm-x86/types.h        |    6 +
 xen/include/public/v4v.h           |  291 ++++++
 xen/include/public/xen.h           |    3 +-
 xen/include/xen/event.h            |    2 +
 xen/include/xen/sched.h            |    4 +
 xen/include/xen/v4v.h              |  134 +++
 xen/include/xen/v4v_utils.h        |  276 ++++++
 17 files changed, 2665 insertions(+), 12 deletions(-)
 create mode 100644 xen/common/v4v.c
 create mode 100644 xen/include/public/v4v.h
 create mode 100644 xen/include/xen/v4v.h
 create mode 100644 xen/include/xen/v4v_utils.h

-- 
1.7.9.5


[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 1/5] xen: add ssize_t
  2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
@ 2012-08-03 19:50 ` Jean Guyader
  2012-08-06  8:08   ` Jan Beulich
  2012-08-03 19:50 ` [PATCH 2/5] xen: Introduce guest_handle_for_field Jean Guyader
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-03 19:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 182 bytes --]


Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
---
 xen/include/asm-arm/types.h |    1 +
 xen/include/asm-x86/types.h |    6 ++++++
 2 files changed, 7 insertions(+)


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-xen-add-ssize_t.patch --]
[-- Type: text/x-patch; name="0001-xen-add-ssize_t.patch", Size: 852 bytes --]

diff --git a/xen/include/asm-arm/types.h b/xen/include/asm-arm/types.h
index 48864f9..d2c5612 100644
--- a/xen/include/asm-arm/types.h
+++ b/xen/include/asm-arm/types.h
@@ -35,6 +35,7 @@ typedef u64 paddr_t;
 #define PRIpaddr "016llx"
 
 typedef unsigned long size_t;
+typedef long ssize_t;
 
 typedef char bool_t;
 #define test_and_set_bool(b)   xchg(&(b), 1)
diff --git a/xen/include/asm-x86/types.h b/xen/include/asm-x86/types.h
index 1c4c5d5..bb7ffc2 100644
--- a/xen/include/asm-x86/types.h
+++ b/xen/include/asm-x86/types.h
@@ -59,6 +59,12 @@ typedef char bool_t;
 #define test_and_set_bool(b)   xchg(&(b), 1)
 #define test_and_clear_bool(b) xchg(&(b), 0)
 
+#if defined(__i386__)
+typedef int ssize_t;
+#else /* __x86_64 */
+typedef long ssize_t;
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __X86_TYPES_H__ */

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 2/5] xen: Introduce guest_handle_for_field
  2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
  2012-08-03 19:50 ` [PATCH 1/5] xen: add ssize_t Jean Guyader
@ 2012-08-03 19:50 ` Jean Guyader
  2012-08-03 19:50 ` [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED Jean Guyader
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-03 19:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Jean Guyader, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 252 bytes --]


This helper turns a field of a GUEST_HANDLE in
a GUEST_HANDLE.

From: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
---
 xen/include/asm-x86/guest_access.h |    3 +++
 1 file changed, 3 insertions(+)


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0002-xen-Introduce-guest_handle_for_field.patch --]
[-- Type: text/x-patch; name="0002-xen-Introduce-guest_handle_for_field.patch", Size: 547 bytes --]

diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index 2b429c2..e3ac1d6 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -51,6 +51,9 @@
     (XEN_GUEST_HANDLE(type)) { _x };            \
 })
 
+#define guest_handle_for_field(hnd, type, fld)          \
+    ((XEN_GUEST_HANDLE(type)) { &(hnd).p->fld })
+
 #define guest_handle_from_ptr(ptr, type)        \
     ((XEN_GUEST_HANDLE(type)) { (type *)ptr })
 #define const_guest_handle_from_ptr(ptr, type)  \

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
  2012-08-03 19:50 ` [PATCH 1/5] xen: add ssize_t Jean Guyader
  2012-08-03 19:50 ` [PATCH 2/5] xen: Introduce guest_handle_for_field Jean Guyader
@ 2012-08-03 19:50 ` Jean Guyader
  2012-08-06  8:10   ` Jan Beulich
  2012-08-03 19:50 ` [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain Jean Guyader
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-03 19:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 257 bytes --]


VIRQ_XC_RESERVED was reserved for V4V but we have switched
to event channels so this place holder is no longer required.

Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
---
 xen/include/public/xen.h |    1 -
 1 file changed, 1 deletion(-)


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0003-xen-virq-remove-VIRQ_XC_RESERVED.patch --]
[-- Type: text/x-patch; name="0003-xen-virq-remove-VIRQ_XC_RESERVED.patch", Size: 667 bytes --]

diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index b2f6c50..b19425b 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -157,7 +157,6 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 #define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
 #define VIRQ_PCPU_STATE 9  /* G. (DOM0) PCPU state changed                   */
 #define VIRQ_MEM_EVENT  10 /* G. (DOM0) A memory event has occured           */
-#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient                     */
 #define VIRQ_ENOMEM     12 /* G. (DOM0) Low on heap memory       */
 
 /* Architecture-specific VIRQ definitions. */

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
                   ` (2 preceding siblings ...)
  2012-08-03 19:50 ` [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED Jean Guyader
@ 2012-08-03 19:50 ` Jean Guyader
  2012-08-06  8:19   ` Jan Beulich
  2012-08-09 10:06   ` Tim Deegan
  2012-08-03 19:50 ` [PATCH 5/5] xen: Add V4V implementation Jean Guyader
  2012-08-04 13:24 ` [PATCH 0/5] RFC: V4V (v3) Jean Guyader
  5 siblings, 2 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-03 19:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 338 bytes --]


Exposes evtchn_alloc_unbound_domain to the rest of
Xen so we can create allocated unbound evtchn within Xen.

Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
---
 xen/common/event_channel.c |   33 +++++++++++++++++++++++++++------
 xen/include/xen/event.h    |    2 ++
 2 files changed, 29 insertions(+), 6 deletions(-)


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0004-xen-events-exposes-evtchn_alloc_unbound_domain.patch --]
[-- Type: text/x-patch; name="0004-xen-events-exposes-evtchn_alloc_unbound_domain.patch", Size: 2538 bytes --]

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 53777f8..067365b 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -51,6 +51,8 @@
 
 #define consumer_is_xen(e) (!!(e)->xen_consumer)
 
+static long __evtchn_close(struct domain *d, int port);
+
 /*
  * The function alloc_unbound_xen_event_channel() allows an arbitrary
  * notifier function to be specified. However, very few unique functions
@@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
 {
     struct evtchn *chn;
     struct domain *d;
-    int            port;
+    evtchn_port_t  port;
     domid_t        dom = alloc->dom;
-    long           rc;
+    int            rc;
 
     rc = rcu_lock_target_domain_by_id(dom, &d);
     if ( rc )
         return rc;
 
-    spin_lock(&d->event_lock);
+    rc = evtchn_alloc_unbound_domain(d, &port);
+    if ( rc )
+        ERROR_EXIT_DOM(rc, d);
 
-    if ( (port = get_free_port(d)) < 0 )
-        ERROR_EXIT_DOM(port, d);
     chn = evtchn_from_port(d, port);
 
     rc = xsm_evtchn_unbound(d, chn, alloc->remote_dom);
@@ -186,12 +188,31 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
     alloc->port = port;
 
  out:
-    spin_unlock(&d->event_lock);
     rcu_unlock_domain(d);
 
     return rc;
 }
 
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port)
+{
+    struct evtchn *chn;
+    int            free_port = 0;
+
+    spin_lock(&d->event_lock);
+
+    if ( (free_port = get_free_port(d)) < 0 )
+        goto out;
+    chn = evtchn_from_port(d, free_port);
+    chn->state = ECS_UNBOUND;
+    chn->u.unbound.remote_domid = DOMID_INVALID;
+    *port = free_port;
+    /* Everything is fine, returns 0 */
+    free_port = 0;
+
+ out:
+    spin_unlock(&d->event_lock);
+    return free_port;
+}
 
 static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
 {
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 71c3e92..d89b0c1 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -69,6 +69,8 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq);
 /* Notify remote end of a Xen-attached event channel.*/
 void notify_via_xen_event_channel(struct domain *ld, int lport);
 
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port);
+
 /* Internal event channel object accessors */
 #define bucket_from_port(d,p) \
     ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET])

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 5/5] xen: Add V4V implementation
  2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
                   ` (3 preceding siblings ...)
  2012-08-03 19:50 ` [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain Jean Guyader
@ 2012-08-03 19:50 ` Jean Guyader
  2012-08-06  8:45   ` Jan Beulich
  2012-08-09 10:38   ` [PATCH 5/5] xen: Add V4V implementation Tim Deegan
  2012-08-04 13:24 ` [PATCH 0/5] RFC: V4V (v3) Jean Guyader
  5 siblings, 2 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-03 19:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 1046 bytes --]


Setup of v4v domains a domain gets created and cleanup
when a domain die. Wire up the v4v hypercall.

Include v4v internal and public headers.

Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
---
 xen/arch/x86/hvm/hvm.c             |    9 +-
 xen/arch/x86/x86_32/entry.S        |    2 +
 xen/arch/x86/x86_64/compat/entry.S |    2 +
 xen/arch/x86/x86_64/entry.S        |    2 +
 xen/common/Makefile                |    1 +
 xen/common/domain.c                |   13 +-
 xen/common/v4v.c                   | 1895 ++++++++++++++++++++++++++++++++++++
 xen/include/public/v4v.h           |  291 ++++++
 xen/include/public/xen.h           |    2 +-
 xen/include/xen/sched.h            |    4 +
 xen/include/xen/v4v.h              |  134 +++
 xen/include/xen/v4v_utils.h        |  276 ++++++
 12 files changed, 2626 insertions(+), 5 deletions(-)
 create mode 100644 xen/common/v4v.c
 create mode 100644 xen/include/public/v4v.h
 create mode 100644 xen/include/xen/v4v.h
 create mode 100644 xen/include/xen/v4v_utils.h


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0005-xen-Add-V4V-implementation.patch --]
[-- Type: text/x-patch; name="0005-xen-Add-V4V-implementation.patch", Size: 76616 bytes --]

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 22c136b..2671069 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3125,7 +3125,8 @@ static hvm_hypercall_t *hvm_hypercall32_table[NR_hypercalls] = {
     HYPERCALL(set_timer_op),
     HYPERCALL(hvm_op),
     HYPERCALL(sysctl),
-    HYPERCALL(tmem_op)
+    HYPERCALL(tmem_op),
+    HYPERCALL(v4v_op)
 };
 
 #else /* defined(__x86_64__) */
@@ -3210,7 +3211,8 @@ static hvm_hypercall_t *hvm_hypercall64_table[NR_hypercalls] = {
     HYPERCALL(set_timer_op),
     HYPERCALL(hvm_op),
     HYPERCALL(sysctl),
-    HYPERCALL(tmem_op)
+    HYPERCALL(tmem_op),
+    HYPERCALL(v4v_op)
 };
 
 #define COMPAT_CALL(x)                                        \
@@ -3227,7 +3229,8 @@ static hvm_hypercall_t *hvm_hypercall32_table[NR_hypercalls] = {
     COMPAT_CALL(set_timer_op),
     HYPERCALL(hvm_op),
     HYPERCALL(sysctl),
-    HYPERCALL(tmem_op)
+    HYPERCALL(tmem_op),
+    HYPERCALL(v4v_op)
 };
 
 #endif /* defined(__x86_64__) */
diff --git a/xen/arch/x86/x86_32/entry.S b/xen/arch/x86/x86_32/entry.S
index 2982679..5b1f55b 100644
--- a/xen/arch/x86/x86_32/entry.S
+++ b/xen/arch/x86/x86_32/entry.S
@@ -700,6 +700,7 @@ ENTRY(hypercall_table)
         .long do_domctl
         .long do_kexec_op
         .long do_tmem_op
+        .long do_v4v_op
         .rept __HYPERVISOR_arch_0-((.-hypercall_table)/4)
         .long do_ni_hypercall
         .endr
@@ -748,6 +749,7 @@ ENTRY(hypercall_args_table)
         .byte 1 /* do_domctl            */
         .byte 2 /* do_kexec_op          */
         .byte 1 /* do_tmem_op           */
+        .byte 5 /* do_v4v_op		*/
         .rept __HYPERVISOR_arch_0-(.-hypercall_args_table)
         .byte 0 /* do_ni_hypercall      */
         .endr
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index f49ff2d..6b838d3 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -414,6 +414,7 @@ ENTRY(compat_hypercall_table)
         .quad do_domctl
         .quad compat_kexec_op
         .quad do_tmem_op
+        .quad do_v4v_op
         .rept __HYPERVISOR_arch_0-((.-compat_hypercall_table)/8)
         .quad compat_ni_hypercall
         .endr
@@ -462,6 +463,7 @@ ENTRY(compat_hypercall_args_table)
         .byte 1 /* do_domctl                */
         .byte 2 /* compat_kexec_op          */
         .byte 1 /* do_tmem_op               */
+        .byte 5 /* do_v4v_op		    */
         .rept __HYPERVISOR_arch_0-(.-compat_hypercall_args_table)
         .byte 0 /* compat_ni_hypercall      */
         .endr
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 997bc94..e6a7fdd 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -707,6 +707,7 @@ ENTRY(hypercall_table)
         .quad do_domctl
         .quad do_kexec_op
         .quad do_tmem_op
+        .quad do_v4v_op
         .rept __HYPERVISOR_arch_0-((.-hypercall_table)/8)
         .quad do_ni_hypercall
         .endr
@@ -755,6 +756,7 @@ ENTRY(hypercall_args_table)
         .byte 1 /* do_domctl            */
         .byte 2 /* do_kexec             */
         .byte 1 /* do_tmem_op           */
+        .byte 5 /* do_v4v_op		*/
         .rept __HYPERVISOR_arch_0-(.-hypercall_args_table)
         .byte 0 /* do_ni_hypercall      */
         .endr
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 9eba8bc..fe3c72c 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -45,6 +45,7 @@ obj-y += tmem_xen.o
 obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += lzo.o
+obj-y += v4v.o
 
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo,$(n).init.o)
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 4c5d241..1600f45 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -195,7 +195,8 @@ struct domain *domain_create(
 {
     struct domain *d, **pd;
     enum { INIT_xsm = 1u<<0, INIT_watchdog = 1u<<1, INIT_rangeset = 1u<<2,
-           INIT_evtchn = 1u<<3, INIT_gnttab = 1u<<4, INIT_arch = 1u<<5 };
+           INIT_evtchn = 1u<<3, INIT_gnttab = 1u<<4, INIT_arch = 1u<<5,
+           INIT_v4v = 1u<<6 };
     int init_status = 0;
     int poolid = CPUPOOLID_NONE;
 
@@ -305,6 +306,13 @@ struct domain *domain_create(
         spin_unlock(&domlist_update_lock);
     }
 
+    if ( !is_idle_domain(d) )
+    {
+        if ( v4v_init(d) != 0 )
+            goto fail;
+        init_status |= INIT_v4v;
+    }
+
     return d;
 
  fail:
@@ -313,6 +321,8 @@ struct domain *domain_create(
     xfree(d->mem_event);
     if ( init_status & INIT_arch )
         arch_domain_destroy(d);
+    if ( init_status & INIT_v4v )
+	v4v_destroy(d);
     if ( init_status & INIT_gnttab )
         grant_table_destroy(d);
     if ( init_status & INIT_evtchn )
@@ -466,6 +476,7 @@ int domain_kill(struct domain *d)
         domain_pause(d);
         d->is_dying = DOMDYING_dying;
         spin_barrier(&d->domain_lock);
+        v4v_destroy(d);
         evtchn_destroy(d);
         gnttab_release_mappings(d);
         tmem_destroy(d->tmem);
diff --git a/xen/common/v4v.c b/xen/common/v4v.c
new file mode 100644
index 0000000..8b96b38
--- /dev/null
+++ b/xen/common/v4v.c
@@ -0,0 +1,1895 @@
+/******************************************************************************
+ * V4V
+ *
+ * Version 2 of v2v (Virtual-to-Virtual)
+ *
+ * Copyright (c) 2010, Citrix Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/config.h>
+#include <xen/mm.h>
+#include <xen/compat.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/sched.h>
+#include <xen/domain.h>
+#include <xen/v4v.h>
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <asm/paging.h>
+#include <asm/p2m.h>
+#include <xen/keyhandler.h>
+#include <xen/v4v_utils.h>
+
+#ifdef V4V_DEBUG
+#define v4v_dprintk(format, args...)            \
+    do {                                        \
+        printk("%s:%d " format,                 \
+               __FILE__, __LINE__, ## args );   \
+    } while ( 1 == 0 )
+#else
+#define v4v_dprintk(format, ... ) (void)0
+#endif
+
+
+struct list_head viprules = LIST_HEAD_INIT(viprules);
+static DEFINE_RWLOCK(viprules_lock);
+
+DEFINE_XEN_GUEST_HANDLE(uint8_t);
+static struct v4v_ring_info *v4v_ring_find_info(struct domain *d,
+                                                v4v_ring_id_t *id);
+
+static struct v4v_ring_info *v4v_ring_find_info_by_addr(struct domain *d,
+                                                        struct v4v_addr *a,
+                                                        domid_t p);
+
+typedef struct internal_v4v_iov
+{
+    XEN_GUEST_HANDLE(v4v_iov_t) guest_iov;
+    v4v_iov_t                   *xen_iov;
+} internal_v4v_iov_t;
+
+/*
+ * locks
+ */
+
+/*
+ * locking is organized as follows:
+ *
+ * the global lock v4v_lock: L1 protects the v4v elements
+ * of all struct domain *d in the system, it does not
+ * protect any of the elements of d->v4v, just their
+ * addresses. By extension since the destruction of
+ * a domain with a non-NULL d->v4v will need to free
+ * the d->v4v pointer, holding this lock gauruntees
+ * that no domains pointers in which v4v is interested
+ * become invalid whilst this lock is held.
+ */
+
+static DEFINE_RWLOCK(v4v_lock); /* L1 */
+
+/*
+ * the lock d->v4v->lock: L2:  Read on protects the hash table and
+ * the elements in the hash_table d->v4v->ring_hash, and
+ * the node and id fields in struct v4v_ring_info in the
+ * hash table. Write on L2 protects all of the elements of
+ * struct v4v_ring_info. To take L2 you must already have R(L1)
+ * W(L1) implies W(L2) and L3
+ *
+ * the lock v4v_ring_info *ringinfo; ringinfo->lock: L3:
+ * protects len,tx_ptr the guest ring, the
+ * guest ring_data and the pending list. To take L3 you must
+ * already have R(L2). W(L2) implies L3
+ */
+
+
+/*
+ * Debugs
+ */
+
+#ifdef V4V_DEBUG
+static void
+v4v_hexdump(void *_p, int len)
+{
+    uint8_t *buf = (uint8_t *)_p;
+    int i, j;
+
+    for ( i = 0; i < len; i += 16 )
+    {
+        printk(KERN_ERR "%p:", &buf[i]);
+        for ( j = 0; j < 16; ++j )
+        {
+            int k = i + j;
+            if ( k < len )
+                printk(" %02x", buf[k]);
+            else
+                printk("   ");
+        }
+        printk(" ");
+
+        for ( j = 0; j < 16; ++j )
+        {
+            int k = i + j;
+            if ( k < len )
+                printk("%c", ((buf[k] > 32) && (buf[k] < 127)) ? buf[k] : '.');
+            else
+                printk(" ");
+        }
+        printk("\n");
+    }
+}
+#endif
+
+
+/*
+ * Event channel
+ */
+
+static void
+v4v_signal_domain(struct domain *d)
+{
+    v4v_dprintk("send guest VIRQ_V4V domid:%d\n", d->domain_id);
+
+    evtchn_send(d, d->v4v->evtchn_port);
+}
+
+static void
+v4v_signal_domid(domid_t id)
+{
+    struct domain *d = get_domain_by_id(id);
+    if ( !d )
+        return;
+    v4v_signal_domain(d);
+    put_domain(d);
+}
+
+
+/*
+ * ring buffer
+ */
+
+static void
+v4v_ring_unmap(struct v4v_ring_info *ring_info)
+{
+    int i;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    for ( i = 0; i < ring_info->npage; ++i )
+    {
+        if ( !ring_info->mfn_mapping[i] )
+            continue;
+        v4v_dprintk("unmapping page %p from %p\n",
+                    (void*)mfn_x(ring_info->mfns[i]),
+                    ring_info->mfn_mapping[i]);
+
+        unmap_domain_page(ring_info->mfn_mapping[i]);
+        ring_info->mfn_mapping[i] = NULL;
+    }
+}
+
+static uint8_t *
+v4v_ring_map_page(struct v4v_ring_info *ring_info, int i)
+{
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    if ( i >= ring_info->npage )
+        return NULL;
+    if ( ring_info->mfn_mapping[i] )
+        return ring_info->mfn_mapping[i];
+    ring_info->mfn_mapping[i] = map_domain_page(mfn_x(ring_info->mfns[i]));
+
+    v4v_dprintk("mapping page %p to %p\n",
+                (void *)mfn_x(ring_info->mfns[i]),
+                ring_info->mfn_mapping[i]);
+    return ring_info->mfn_mapping[i];
+}
+
+static int
+v4v_memcpy_from_guest_ring(void *_dst, struct v4v_ring_info *ring_info,
+                           uint32_t offset, uint32_t len)
+{
+    int page = offset >> PAGE_SHIFT;
+    uint8_t *src;
+    uint8_t *dst = _dst;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    offset &= PAGE_SIZE - 1;
+
+    while ( (offset + len) > PAGE_SIZE )
+    {
+        src = v4v_ring_map_page(ring_info, page);
+
+        if ( !src )
+            return -EFAULT;
+
+        v4v_dprintk("memcpy(%p,%p+%d,%d)\n",
+                    dst, src, offset,
+                    (int)(PAGE_SIZE - offset));
+        memcpy(dst, src + offset, PAGE_SIZE - offset);
+
+        page++;
+        len -= PAGE_SIZE - offset;
+        dst += PAGE_SIZE - offset;
+        offset = 0;
+    }
+
+    src = v4v_ring_map_page(ring_info, page);
+    if ( !src )
+        return -EFAULT;
+
+    v4v_dprintk("memcpy(%p,%p+%d,%d)\n", dst, src, offset, len);
+    memcpy(dst, src + offset, len);
+
+    return 0;
+}
+
+static int
+v4v_update_tx_ptr(struct v4v_ring_info *ring_info, uint32_t tx_ptr)
+{
+    uint8_t *dst = v4v_ring_map_page(ring_info, 0);
+    uint32_t *p = (uint32_t *)(dst + offsetof(v4v_ring_t, tx_ptr));
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    if ( !dst )
+        return -EFAULT;
+    write_atomic(p, tx_ptr);
+    mb();
+    return 0;
+}
+
+static int
+v4v_copy_from_guest_maybe(void *dst, void *src,
+                          XEN_GUEST_HANDLE(uint8_t) src_hnd,
+                          uint32_t len)
+{
+    int rc = 0;
+
+    if ( src )
+        memcpy(dst, src, len);
+    else
+        rc = copy_from_guest(dst, src_hnd, len);
+    return rc;
+}
+
+static int
+v4v_memcpy_to_guest_ring(struct v4v_ring_info *ring_info,
+                         uint32_t offset,
+                         void *src,
+                         XEN_GUEST_HANDLE(uint8_t) src_hnd,
+                         uint32_t len)
+{
+    int page = offset >> PAGE_SHIFT;
+    uint8_t *dst;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    offset &= PAGE_SIZE - 1;
+
+    while ( (offset + len) > PAGE_SIZE )
+    {
+        dst = v4v_ring_map_page(ring_info, page);
+        if ( !dst )
+            return -EFAULT;
+
+        if ( v4v_copy_from_guest_maybe(dst + offset, src, src_hnd,
+                                       PAGE_SIZE - offset) )
+            return -EFAULT;
+
+        page++;
+        len -= PAGE_SIZE - offset;
+        if ( src )
+            src += (PAGE_SIZE - offset);
+        else
+            guest_handle_add_offset(src_hnd, PAGE_SIZE - offset);
+        offset = 0;
+    }
+
+    dst = v4v_ring_map_page(ring_info, page);
+    if ( !dst )
+        return -EFAULT;
+
+    if ( v4v_copy_from_guest_maybe(dst + offset, src, src_hnd, len) )
+        return -EFAULT;
+
+    return 0;
+}
+
+static int
+v4v_ringbuf_get_rx_ptr(struct domain *d, struct v4v_ring_info *ring_info,
+                        uint32_t * rx_ptr)
+{
+    v4v_ring_t *ringp;
+
+    if ( ring_info->npage == 0 )
+        return -1;
+
+    ringp = map_domain_page(mfn_x(ring_info->mfns[0]));
+
+    v4v_dprintk("v4v_ringbuf_payload_space: mapped %p to %p\n",
+                (void *)mfn_x(ring_info->mfns[0]), ringp);
+    if ( !ringp )
+        return -1;
+
+    write_atomic(rx_ptr, ringp->rx_ptr);
+    mb();
+
+    unmap_domain_page(mfn_x(ring_info->mfns[0]));
+    return 0;
+}
+
+uint32_t
+v4v_ringbuf_payload_space(struct domain * d, struct v4v_ring_info * ring_info)
+{
+    v4v_ring_t ring;
+    int32_t ret;
+
+    ring.tx_ptr = ring_info->tx_ptr;
+    ring.len = ring_info->len;
+
+    if ( v4v_ringbuf_get_rx_ptr(d, ring_info, &ring.rx_ptr) )
+        return 0;
+
+    v4v_dprintk("v4v_ringbuf_payload_space:tx_ptr=%d rx_ptr=%d\n",
+                (int)ring.tx_ptr, (int)ring.rx_ptr);
+    if ( ring.rx_ptr == ring.tx_ptr )
+        return ring.len - sizeof (struct v4v_ring_message_header);
+
+    ret = ring.rx_ptr - ring.tx_ptr;
+    if ( ret < 0 )
+        ret += ring.len;
+
+    ret -= sizeof (struct v4v_ring_message_header);
+    ret -= V4V_ROUNDUP(1);
+
+    return (ret < 0) ? 0 : ret;
+}
+
+static unsigned long
+v4v_iov_copy(v4v_iov_t *iov, internal_v4v_iov_t *iovs)
+{
+    if (iovs->xen_iov == NULL)
+    {
+        return copy_from_guest(iov, iovs->guest_iov, 1);
+    }
+    else
+    {
+        *iov = *(iovs->xen_iov);
+        return 0;
+    }
+}
+
+static void
+v4v_iov_add_offset(internal_v4v_iov_t *iovs, int offset)
+{
+    if (iovs->xen_iov == NULL)
+        guest_handle_add_offset(iovs->guest_iov, offset);
+    else
+        iovs->xen_iov += offset;
+}
+
+static size_t
+v4v_iov_count(internal_v4v_iov_t iovs, int niov)
+{
+    v4v_iov_t iov;
+    size_t ret = 0;
+
+    while ( niov-- )
+    {
+        if ( v4v_iov_copy(&iov, &iovs) )
+            return -EFAULT;
+
+        ret += iov.iov_len;
+        v4v_iov_add_offset(&iovs, 1);
+    }
+
+    return ret;
+}
+
+static ssize_t
+v4v_ringbuf_insertv(struct domain *d,
+                    struct v4v_ring_info *ring_info,
+                    v4v_ring_id_t *src_id, uint32_t proto,
+                    internal_v4v_iov_t iovs, uint32_t niov,
+                    size_t len)
+{
+    v4v_ring_t ring;
+    struct v4v_ring_message_header mh = { 0 };
+    int32_t sp;
+    int32_t happy_ret;
+    int32_t ret = 0;
+    XEN_GUEST_HANDLE(uint8_t) empty_hnd = { 0 };
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    happy_ret = len;
+
+    if ( (V4V_ROUNDUP(len) + sizeof (struct v4v_ring_message_header) ) >=
+            ring_info->len)
+        return -EMSGSIZE;
+
+    do
+    {
+        if ( (ret = v4v_memcpy_from_guest_ring(&ring, ring_info, 0,
+                                               sizeof (ring))) )
+            break;
+
+        ring.tx_ptr = ring_info->tx_ptr;
+        ring.len = ring_info->len;
+
+        v4v_dprintk("ring.tx_ptr=%d ring.rx_ptr=%d ring.len=%d ring_info->tx_ptr=%d\n",
+                    ring.tx_ptr, ring.rx_ptr, ring.len, ring_info->tx_ptr);
+
+        if ( ring.rx_ptr == ring.tx_ptr )
+            sp = ring_info->len;
+        else
+        {
+            sp = ring.rx_ptr - ring.tx_ptr;
+            if ( sp < 0 )
+                sp += ring.len;
+        }
+
+        if ( (V4V_ROUNDUP(len) + sizeof (struct v4v_ring_message_header)) >= sp )
+        {
+            v4v_dprintk("EAGAIN\n");
+            ret = -EAGAIN;
+            break;
+        }
+
+        mh.len = len + sizeof (struct v4v_ring_message_header);
+        mh.source = src_id->addr;
+        mh.protocol = proto;
+
+        if ( (ret = v4v_memcpy_to_guest_ring(ring_info,
+                                             ring.tx_ptr + sizeof (v4v_ring_t),
+                                             &mh, empty_hnd,
+                                             sizeof (mh))) )
+            break;
+
+        ring.tx_ptr += sizeof (mh);
+        if ( ring.tx_ptr == ring_info->len )
+            ring.tx_ptr = 0;
+
+        while ( niov-- )
+        {
+            XEN_GUEST_HANDLE(uint8_t) buf_hnd;
+            v4v_iov_t iov;
+
+            if ( v4v_iov_copy(&iov, &iovs) )
+            {
+                ret = -EFAULT;
+                break;
+            }
+
+            buf_hnd.p = (uint8_t *)iov.iov_base; //FIXME
+            len = iov.iov_len;
+
+            if ( unlikely(!guest_handle_okay(buf_hnd, len)) )
+            {
+                ret = -EFAULT;
+                break;
+            }
+
+            sp = ring.len - ring.tx_ptr;
+
+            if ( len > sp )
+            {
+                ret = v4v_memcpy_to_guest_ring(ring_info,
+                        ring.tx_ptr + sizeof (v4v_ring_t),
+                        NULL, buf_hnd, sp);
+                if ( ret )
+                    break;
+
+                ring.tx_ptr = 0;
+                len -= sp;
+                guest_handle_add_offset(buf_hnd, sp);
+            }
+
+            ret = v4v_memcpy_to_guest_ring(ring_info,
+                    ring.tx_ptr + sizeof (v4v_ring_t),
+                    NULL, buf_hnd, len);
+            if ( ret )
+                break;
+
+            ring.tx_ptr += len;
+
+            if ( ring.tx_ptr == ring_info->len )
+                ring.tx_ptr = 0;
+
+            v4v_iov_add_offset(&iovs, 1);
+        }
+        if ( ret )
+            break;
+
+        ring.tx_ptr = V4V_ROUNDUP(ring.tx_ptr);
+
+        if ( ring.tx_ptr >= ring_info->len )
+            ring.tx_ptr -= ring_info->len;
+
+        mb();
+        ring_info->tx_ptr = ring.tx_ptr;
+        if ( (ret = v4v_update_tx_ptr(ring_info, ring.tx_ptr)) )
+            break;
+    }
+    while ( 0 );
+
+    v4v_ring_unmap(ring_info);
+
+    return ret ? ret : happy_ret;
+}
+
+
+
+/* pending */
+static void
+v4v_pending_remove_ent(struct v4v_pending_ent *ent)
+{
+    hlist_del(&ent->node);
+    xfree(ent);
+}
+
+static void
+v4v_pending_remove_all(struct v4v_ring_info *info)
+{
+    struct hlist_node *node, *next;
+    struct v4v_pending_ent *pending_ent;
+
+    ASSERT(spin_is_locked(&info->lock));
+    hlist_for_each_entry_safe(pending_ent, node, next, &info->pending,
+            node) v4v_pending_remove_ent(pending_ent);
+}
+
+static void
+v4v_pending_notify(struct domain *caller_d, struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct v4v_pending_ent *pending_ent;
+
+    ASSERT(rw_is_locked(&v4v_lock));
+
+    hlist_for_each_entry_safe(pending_ent, node, next, to_notify, node)
+    {
+        hlist_del(&pending_ent->node);
+        v4v_signal_domid(pending_ent->id);
+        xfree(pending_ent);
+    }
+
+}
+
+static void
+v4v_pending_find(struct domain *d, struct v4v_ring_info *ring_info,
+                 uint32_t payload_space, struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct v4v_pending_ent *ent;
+
+    ASSERT(rw_is_locked(&d->v4v->lock));
+
+    spin_lock(&ring_info->lock);
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( payload_space >= ent->len )
+        {
+            hlist_del(&ent->node);
+            hlist_add_head(&ent->node, to_notify);
+        }
+    }
+    spin_unlock(&ring_info->lock);
+}
+
+/*caller must have L3 */
+static int
+v4v_pending_queue(struct v4v_ring_info *ring_info, domid_t src_id, int len)
+{
+    struct v4v_pending_ent *ent = xmalloc(struct v4v_pending_ent);
+
+    if ( !ent )
+    {
+        v4v_dprintk("ENOMEM\n");
+        return -ENOMEM;
+    }
+
+    ent->len = len;
+    ent->id = src_id;
+
+    hlist_add_head(&ent->node, &ring_info->pending);
+
+    return 0;
+}
+
+/* L3 */
+static int
+v4v_pending_requeue(struct v4v_ring_info *ring_info, domid_t src_id, int len)
+{
+    struct hlist_node *node;
+    struct v4v_pending_ent *ent;
+
+    hlist_for_each_entry(ent, node, &ring_info->pending, node)
+    {
+        if ( ent->id == src_id )
+        {
+            if ( ent->len < len )
+                ent->len = len;
+            return 0;
+        }
+    }
+
+    return v4v_pending_queue(ring_info, src_id, len);
+}
+
+
+/* L3 */
+static void
+v4v_pending_cancel(struct v4v_ring_info *ring_info, domid_t src_id)
+{
+    struct hlist_node *node, *next;
+    struct v4v_pending_ent *ent;
+
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( ent->id == src_id)
+        {
+            hlist_del(&ent->node);
+            xfree(ent);
+        }
+    }
+}
+
+/*
+ * ring data
+ */
+
+/*Caller should hold R(L1)*/
+static int
+v4v_fill_ring_data(struct domain *src_d,
+                   XEN_GUEST_HANDLE(v4v_ring_data_ent_t) data_ent_hnd)
+{
+    v4v_ring_data_ent_t ent;
+    struct domain *dst_d;
+    struct v4v_ring_info *ring_info;
+
+    if ( copy_from_guest(&ent, data_ent_hnd, 1) )
+    {
+        v4v_dprintk("EFAULT\n");
+        return -EFAULT;
+    }
+
+    v4v_dprintk("v4v_fill_ring_data: ent.ring.domain=%d,ent.ring.port=%u\n",
+                (int)ent.ring.domain, (int)ent.ring.port);
+
+    ent.flags = 0;
+
+    dst_d = get_domain_by_id(ent.ring.domain);
+
+    if ( dst_d && dst_d->v4v )
+    {
+        read_lock(&dst_d->v4v->lock);
+        ring_info = v4v_ring_find_info_by_addr(dst_d, &ent.ring,
+                                               src_d->domain_id);
+
+        if ( ring_info )
+        {
+            uint32_t space_avail;
+
+            ent.flags |= V4V_RING_DATA_F_EXISTS;
+            ent.max_message_size =
+                ring_info->len - sizeof (struct v4v_ring_message_header) -
+                V4V_ROUNDUP(1);
+            spin_lock(&ring_info->lock);
+
+            space_avail = v4v_ringbuf_payload_space(dst_d, ring_info);
+
+            if ( space_avail >= ent.space_required )
+            {
+                v4v_pending_cancel(ring_info, src_d->domain_id);
+                ent.flags |= V4V_RING_DATA_F_SUFFICIENT;
+            }
+            else
+            {
+                v4v_pending_requeue(ring_info, src_d->domain_id,
+                        ent.space_required);
+                ent.flags |= V4V_RING_DATA_F_PENDING;
+            }
+
+            spin_unlock(&ring_info->lock);
+
+            if ( space_avail == ent.max_message_size )
+                ent.flags |= V4V_RING_DATA_F_EMPTY;
+
+        }
+        read_unlock(&dst_d->v4v->lock);
+    }
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    if ( copy_field_to_guest(data_ent_hnd, &ent, flags) )
+    {
+        v4v_dprintk("EFAULT\n");
+        return -EFAULT;
+    }
+    return 0;
+}
+
+/*Called should hold no more than R(L1) */
+static int
+v4v_fill_ring_datas(struct domain *d, int nent,
+                     XEN_GUEST_HANDLE(v4v_ring_data_ent_t) data_ent_hnd)
+{
+    int ret = 0;
+
+    read_lock(&v4v_lock);
+    while ( !ret && nent-- )
+    {
+        ret = v4v_fill_ring_data(d, data_ent_hnd);
+        guest_handle_add_offset(data_ent_hnd, 1);
+    }
+    read_unlock(&v4v_lock);
+    return ret;
+}
+
+/*
+ * ring
+ */
+static int
+v4v_find_ring_mfns(struct domain *d, struct v4v_ring_info *ring_info,
+                   uint32_t npage, XEN_GUEST_HANDLE(v4v_pfn_t) pfn_hnd)
+{
+    int i,j;
+    mfn_t *mfns;
+    uint8_t **mfn_mapping;
+    unsigned long mfn;
+    struct page_info *page;
+    int ret = 0;
+
+    if ( (npage << PAGE_SHIFT) < ring_info->len )
+    {
+        v4v_dprintk("EINVAL\n");
+        return -EINVAL;
+    }
+
+    mfns = xmalloc_array(mfn_t, npage);
+    if ( !mfns )
+    {
+        v4v_dprintk("ENOMEM\n");
+        return -ENOMEM;
+    }
+
+    mfn_mapping = xmalloc_array(uint8_t *, npage);
+    if ( !mfn_mapping )
+    {
+        xfree(mfns);
+        return -ENOMEM;
+    }
+
+    for ( i = 0; i < npage; ++i )
+    {
+        unsigned long pfn;
+        p2m_type_t p2mt;
+
+        if ( copy_from_guest_offset(&pfn, pfn_hnd, i, 1) )
+        {
+            ret = -EFAULT;
+            v4v_dprintk("EFAULT\n");
+            break;
+        }
+
+        mfn = mfn_x(get_gfn(d, pfn, &p2mt));
+        if ( !mfn_valid(mfn) )
+        {
+            printk(KERN_ERR "v4v domain %d passed invalid mfn %"PRI_mfn" ring %p seq %d\n",
+                    d->domain_id, mfn, ring_info, i);
+            ret = -EINVAL;
+            break;
+        }
+        page = mfn_to_page(mfn);
+        if ( !get_page_and_type(page, d, PGT_writable_page) )
+        {
+            printk(KERN_ERR "v4v domain %d passed wrong type mfn %"PRI_mfn" ring %p seq %d\n",
+                    d->domain_id, mfn, ring_info, i);
+            ret = -EINVAL;
+            break;
+        }
+        mfns[i] = _mfn(mfn);
+        v4v_dprintk("v4v_find_ring_mfns: %d: %lx -> %lx\n",
+                    i, (unsigned long)pfn, (unsigned long)mfn_x(mfns[i]));
+        mfn_mapping[i] = NULL;
+        put_gfn(d, pfn);
+    }
+
+    if ( !ret )
+    {
+        ring_info->npage = npage;
+        ring_info->mfns = mfns;
+        ring_info->mfn_mapping = mfn_mapping;
+    }
+    else
+    {
+        j = i;
+        for ( i = 0; i < j; ++i )
+            if ( mfn_x(mfns[i]) != 0 )
+                put_page_and_type(mfn_to_page(mfn_x(mfns[i])));
+        xfree(mfn_mapping);
+        xfree(mfns);
+    }
+    return ret;
+}
+
+
+static struct v4v_ring_info *
+v4v_ring_find_info(struct domain *d, v4v_ring_id_t *id)
+{
+    uint16_t hash;
+    struct hlist_node *node;
+    struct v4v_ring_info *ring_info;
+
+    ASSERT(rw_is_locked(&d->v4v->lock));
+
+    hash = v4v_hash_fn(id);
+
+    v4v_dprintk("ring_find_info: d->v4v=%p, d->v4v->ring_hash[%d]=%p id=%p\n",
+                d->v4v, (int)hash, d->v4v->ring_hash[hash].first, id);
+    v4v_dprintk("ring_find_info: id.addr.port=%d id.addr.domain=%d id.addr.partner=%d\n",
+                id->addr.port, id->addr.domain, id->partner);
+
+    hlist_for_each_entry(ring_info, node, &d->v4v->ring_hash[hash], node)
+    {
+        v4v_ring_id_t *cmpid = &ring_info->id;
+
+        if ( cmpid->addr.port == id->addr.port &&
+             cmpid->addr.domain == id->addr.domain &&
+             cmpid->partner == id->partner)
+        {
+            v4v_dprintk("ring_find_info: ring_info=%p\n", ring_info);
+            return ring_info;
+        }
+    }
+    v4v_dprintk("ring_find_info: no ring_info found\n");
+    return NULL;
+}
+
+static struct v4v_ring_info *
+v4v_ring_find_info_by_addr(struct domain *d, struct v4v_addr *a, domid_t p)
+{
+    v4v_ring_id_t id;
+    struct v4v_ring_info *ret;
+
+    ASSERT(rw_is_locked(&d->v4v->lock));
+
+    if ( !a )
+        return NULL;
+
+    id.addr.port = a->port;
+    id.addr.domain = d->domain_id;
+    id.partner = p;
+
+    ret = v4v_ring_find_info(d, &id);
+    if ( ret )
+        return ret;
+
+    id.partner = V4V_DOMID_ANY;
+
+    return v4v_ring_find_info(d, &id);
+}
+
+static void v4v_ring_remove_mfns(struct domain *d, struct v4v_ring_info *ring_info)
+{
+    int i;
+
+    ASSERT(rw_is_write_locked(&d->v4v->lock));
+
+    if ( ring_info->mfns )
+    {
+        for ( i = 0; i < ring_info->npage; ++i )
+            if ( mfn_x(ring_info->mfns[i]) != 0 )
+                put_page_and_type(mfn_to_page(mfn_x(ring_info->mfns[i])));
+        xfree(ring_info->mfns);
+    }
+    if ( ring_info->mfn_mapping )
+        xfree(ring_info->mfn_mapping);
+    ring_info->mfns = NULL;
+
+}
+
+static void
+v4v_ring_remove_info(struct domain *d, struct v4v_ring_info *ring_info)
+{
+    ASSERT(rw_is_write_locked(&d->v4v->lock));
+
+    spin_lock(&ring_info->lock);
+
+    v4v_pending_remove_all(ring_info);
+    hlist_del(&ring_info->node);
+    v4v_ring_remove_mfns(d, ring_info);
+
+    spin_unlock(&ring_info->lock);
+
+    xfree(ring_info);
+}
+
+/* Call from guest to unpublish a ring */
+static long
+v4v_ring_remove(struct domain *d, XEN_GUEST_HANDLE(v4v_ring_t) ring_hnd)
+{
+    struct v4v_ring ring;
+    struct v4v_ring_info *ring_info;
+    int ret = 0;
+
+    read_lock(&v4v_lock);
+
+    do
+    {
+        if ( !d->v4v )
+        {
+            v4v_dprintk("EINVAL\n");
+            ret = -EINVAL;
+            break;
+        }
+
+        if ( copy_from_guest(&ring, ring_hnd, 1) )
+        {
+            v4v_dprintk("EFAULT\n");
+            ret = -EFAULT;
+            break;
+        }
+
+        if ( ring.magic != V4V_RING_MAGIC )
+        {
+            v4v_dprintk("ring.magic(%lx) != V4V_RING_MAGIC(%lx), EINVAL\n",
+                    ring.magic, V4V_RING_MAGIC);
+            ret = -EINVAL;
+            break;
+        }
+
+        ring.id.addr.domain = d->domain_id;
+
+        write_lock(&d->v4v->lock);
+        ring_info = v4v_ring_find_info(d, &ring.id);
+
+        if ( ring_info )
+            v4v_ring_remove_info(d, ring_info);
+
+        write_unlock(&d->v4v->lock);
+
+        if ( !ring_info )
+        {
+            v4v_dprintk("ENOENT\n");
+            ret = -ENOENT;
+            break;
+        }
+
+    }
+    while ( 0 );
+
+    read_unlock(&v4v_lock);
+    return ret;
+}
+
+/* call from guest to publish a ring */
+static long
+v4v_ring_add(struct domain *d, XEN_GUEST_HANDLE(v4v_ring_t) ring_hnd,
+             uint32_t npage, XEN_GUEST_HANDLE(v4v_pfn_t) pfn_hnd)
+{
+    struct v4v_ring ring;
+    struct v4v_ring_info *ring_info;
+    int need_to_insert = 0;
+    int ret = 0;
+
+    if ( (long)ring_hnd.p & (PAGE_SIZE - 1) )
+    {
+        v4v_dprintk("EINVAL\n");
+        return -EINVAL;
+    }
+
+    read_lock(&v4v_lock);
+    do
+    {
+        if ( !d->v4v )
+        {
+            v4v_dprintk(" !d->v4v, EINVAL\n");
+            ret = -EINVAL;
+            break;
+        }
+
+        if ( copy_from_guest(&ring, ring_hnd, 1) )
+        {
+            v4v_dprintk(" copy_from_guest failed, EFAULT\n");
+            ret = -EFAULT;
+            break;
+        }
+
+        if ( ring.magic != V4V_RING_MAGIC )
+        {
+            v4v_dprintk("ring.magic(%lx) != V4V_RING_MAGIC(%lx), EINVAL\n",
+                        ring.magic, V4V_RING_MAGIC);
+            ret = -EINVAL;
+            break;
+        }
+
+        if ( (ring.len <
+                    (sizeof (struct v4v_ring_message_header) + V4V_ROUNDUP(1) +
+                     V4V_ROUNDUP(1))) || (V4V_ROUNDUP(ring.len) != ring.len) )
+        {
+            v4v_dprintk("EINVAL\n");
+            ret = -EINVAL;
+            break;
+        }
+
+        ring.id.addr.domain = d->domain_id;
+        if ( copy_field_to_guest(ring_hnd, &ring, id) )
+        {
+            v4v_dprintk("EFAULT\n");
+            ret = -EFAULT;
+            break;
+        }
+
+        /*
+         * no need for a lock yet, because only we know about this
+         * set the tx pointer if it looks bogus (we don't reset it
+         * because this might be a re-register after S4)
+         */
+        if ( (ring.tx_ptr >= ring.len)
+                || (V4V_ROUNDUP(ring.tx_ptr) != ring.tx_ptr) )
+        {
+            ring.tx_ptr = ring.rx_ptr;
+        }
+        copy_field_to_guest(ring_hnd, &ring, tx_ptr);
+
+        read_lock(&d->v4v->lock);
+        ring_info = v4v_ring_find_info(d, &ring.id);
+
+        if ( !ring_info )
+        {
+            read_unlock(&d->v4v->lock);
+            ring_info = xmalloc(struct v4v_ring_info);
+            if ( !ring_info )
+            {
+                v4v_dprintk("ENOMEM\n");
+                ret = -ENOMEM;
+                break;
+            }
+            need_to_insert++;
+            spin_lock_init(&ring_info->lock);
+            INIT_HLIST_HEAD(&ring_info->pending);
+            ring_info->mfns = NULL;
+
+        }
+        else
+        {
+            /*
+             * Ring info already existed. If mfn list was already
+             * populated remove the MFN's from list and then add
+             * the new list.
+             */
+            printk(KERN_INFO "v4v: dom%d ring already registered\n",
+                    current->domain->domain_id);
+            ret = -EEXIST;
+            break;
+        }
+
+        spin_lock(&ring_info->lock);
+        ring_info->id = ring.id;
+        ring_info->len = ring.len;
+        ring_info->tx_ptr = ring.tx_ptr;
+        ring_info->ring = ring_hnd;
+        if ( ring_info->mfns )
+            xfree(ring_info->mfns);
+        ret = v4v_find_ring_mfns(d, ring_info, npage, pfn_hnd);
+        spin_unlock(&ring_info->lock);
+        if ( ret )
+            break;
+
+        if ( !need_to_insert )
+        {
+            read_unlock(&d->v4v->lock);
+        }
+        else
+        {
+            uint16_t hash = v4v_hash_fn(&ring.id);
+            write_lock(&d->v4v->lock);
+            hlist_add_head(&ring_info->node, &d->v4v->ring_hash[hash]);
+            write_unlock(&d->v4v->lock);
+        }
+    }
+    while ( 0 );
+
+    read_unlock(&v4v_lock);
+    return ret;
+}
+
+
+/*
+ * io
+ */
+
+static void
+v4v_notify_ring(struct domain *d, struct v4v_ring_info *ring_info,
+                struct hlist_head *to_notify)
+{
+    uint32_t space;
+
+    ASSERT(rw_is_locked(&v4v_lock));
+    ASSERT(rw_is_locked(&d->v4v->lock));
+
+    spin_lock(&ring_info->lock);
+    space = v4v_ringbuf_payload_space(d, ring_info);
+    spin_unlock(&ring_info->lock);
+
+    v4v_pending_find(d, ring_info, space, to_notify);
+}
+
+/*notify hypercall*/
+static long
+v4v_notify(struct domain *d,
+           XEN_GUEST_HANDLE(v4v_ring_data_t) ring_data_hnd)
+{
+    v4v_ring_data_t ring_data;
+    HLIST_HEAD(to_notify);
+    int i;
+    int ret = 0;
+
+    read_lock(&v4v_lock);
+
+    if ( !d->v4v )
+    {
+        read_unlock(&v4v_lock);
+        v4v_dprintk("!d->v4v, ENODEV\n");
+        return -ENODEV;
+    }
+
+    read_lock(&d->v4v->lock);
+    for ( i = 0; i < V4V_HTABLE_SIZE; ++i )
+    {
+        struct hlist_node *node, *next;
+        struct v4v_ring_info *ring_info;
+
+        hlist_for_each_entry_safe(ring_info, node,
+                next, &d->v4v->ring_hash[i],
+                node)
+        {
+            v4v_notify_ring(d, ring_info, &to_notify);
+        }
+    }
+    read_unlock(&d->v4v->lock);
+
+    if ( !hlist_empty(&to_notify) )
+        v4v_pending_notify(d, &to_notify);
+
+    do
+    {
+        if ( !guest_handle_is_null(ring_data_hnd) )
+        {
+            /* Quick sanity check on ring_data_hnd */
+            if ( copy_field_from_guest(&ring_data, ring_data_hnd, magic) )
+            {
+                v4v_dprintk("copy_field_from_guest failed\n");
+                ret = -EFAULT;
+                break;
+            }
+
+            if ( ring_data.magic != V4V_RING_DATA_MAGIC )
+            {
+                v4v_dprintk("ring.magic(%lx) != V4V_RING_MAGIC(%lx), EINVAL\n",
+                        ring_data.magic, V4V_RING_MAGIC);
+                ret = -EINVAL;
+                break;
+            }
+
+            if ( copy_from_guest(&ring_data, ring_data_hnd, 1) )
+            {
+                v4v_dprintk("copy_from_guest failed\n");
+                ret = -EFAULT;
+                break;
+            }
+
+            {
+                XEN_GUEST_HANDLE(v4v_ring_data_ent_t) ring_data_ent_hnd;
+                ring_data_ent_hnd =
+                    guest_handle_for_field(ring_data_hnd, v4v_ring_data_ent_t, data[0]);
+                ret = v4v_fill_ring_datas(d, ring_data.nent, ring_data_ent_hnd);
+            }
+        }
+    }
+    while ( 0 );
+
+    read_unlock(&v4v_lock);
+
+    return ret;
+}
+
+#ifdef V4V_DEBUG
+void
+v4v_viptables_print_rule(struct v4v_viptables_rule_node *node)
+{
+    v4v_viptables_rule_t *rule;
+
+    if ( node == NULL )
+    {
+        printk("(null)\n");
+        return;
+    }
+
+    rule = &node->rule;
+
+    if ( rule->accept == 1 )
+        printk("ACCEPT");
+    else
+        printk("REJECT");
+
+    printk(" ");
+
+    if ( rule->src.domain == DOMID_ANY )
+        printk("*");
+    else
+        printk("%i", rule->src.domain);
+
+    printk(":");
+
+    if ( rule->src.port == -1 )
+        printk("*");
+    else
+        printk("%i", rule->src.port);
+
+    printk(" -> ");
+
+    if ( rule->dst.domain == DOMID_ANY )
+        printk("*");
+    else
+        printk("%i", rule->dst.domain);
+
+    printk(":");
+
+    if ( rule->dst.port == -1 )
+        printk("*");
+    else
+        printk("%i", rule->dst.port);
+
+    printk("\n");
+}
+#endif /* V4V_DEBUG */
+
+int
+v4v_viptables_add(struct domain *src_d,
+                  XEN_GUEST_HANDLE(v4v_viptables_rule_t) rule,
+                  int32_t position)
+{
+    struct v4v_viptables_rule_node* new = NULL;
+    struct list_head* tmp;
+
+    ASSERT(rw_is_write_locked(&viprules_lock));
+
+    /* First rule is n.1 */
+    position--;
+
+    new = xmalloc(struct v4v_viptables_rule_node);
+    if ( new == NULL )
+        return -ENOMEM;
+
+    if ( copy_from_guest(&new->rule, rule, 1) )
+    {
+        xfree(new);
+        return -EFAULT;
+    }
+
+#ifdef V4V_DEBUG
+    printk(KERN_ERR "VIPTables: ");
+    v4v_viptables_print_rule(new);
+#endif /* V4V_DEBUG */
+
+    tmp = &viprules;
+    while ( position != 0 && tmp->next != &viprules )
+    {
+        tmp = tmp->next;
+        position--;
+    }
+    list_add(&new->list, tmp);
+
+    return 0;
+}
+
+int
+v4v_viptables_del(struct domain *src_d,
+                  XEN_GUEST_HANDLE(v4v_viptables_rule_t) rule,
+                  int32_t position)
+{
+    struct list_head *tmp = NULL;
+    struct list_head *next = NULL;
+    struct v4v_viptables_rule_node *node;
+
+    ASSERT(rw_is_write_locked(&viprules_lock));
+
+    if ( position != -1 )
+    {
+        /* We want to delete the rule number <position> */
+        tmp = &viprules;
+        while ( position != 0 && tmp->next != &viprules )
+        {
+            tmp = tmp->next;
+            position--;
+        }
+    }
+    else if ( !guest_handle_is_null(rule) )
+    {
+        struct v4v_viptables_rule r;
+
+        if ( copy_field_from_guest(&r, rule, src) ||
+             copy_field_from_guest(&r, rule, dst) ||
+             copy_field_from_guest(&r, rule, accept) )
+        {
+            return -EFAULT;
+        }
+
+        list_for_each(tmp, &viprules)
+        {
+            node = list_entry(tmp, struct v4v_viptables_rule_node, list);
+
+            if ( (node->rule.src.domain == r.src.domain) &&
+                 (node->rule.src.port   == r.src.port)   &&
+                 (node->rule.dst.domain == r.dst.domain) &&
+                 (node->rule.dst.port   == r.dst.port))
+            {
+                position = 0;
+                break;
+            }
+        }
+    }
+    else
+    {
+        /* We want to flush the rules! */
+        printk(KERN_ERR "VIPTables: flushing rules\n");
+        list_for_each_safe(tmp, next, &viprules)
+        {
+            node = list_entry(tmp, struct v4v_viptables_rule_node, list);
+            list_del(tmp);
+            xfree(node);
+        }
+    }
+
+    if ( position == 0 && tmp != &viprules )
+    {
+        node = list_entry(tmp, struct v4v_viptables_rule_node, list);
+#ifdef V4V_DEBUG
+        printk(KERN_ERR "VIPTables: deleting rule: ");
+        v4v_viptables_print_rule(node);
+#endif /* V4V_DEBUG */
+        list_del(tmp);
+        xfree(node);
+    }
+
+    return 0;
+}
+
+static size_t
+v4v_viptables_list(struct domain *src_d,
+                   XEN_GUEST_HANDLE(v4v_viptables_list_t) list_hnd)
+{
+    struct list_head *ptr;
+    struct v4v_viptables_rule_node *node;
+    struct v4v_viptables_list rules_list;
+    uint32_t nbrules;
+    XEN_GUEST_HANDLE(v4v_viptables_rule_t) guest_rules;
+
+    ASSERT(rw_is_locked(&viprules_lock));
+
+    memset(&rules_list, 0, sizeof (rules_list));
+    if ( copy_from_guest(&rules_list, list_hnd, 1) )
+        return -EFAULT;
+
+    ptr = viprules.next;
+    while ( rules_list.start_rule != 0 && ptr->next != &viprules )
+    {
+        ptr = ptr->next;
+        rules_list.start_rule--;
+    }
+
+    if ( rules_list.nb_rules == 0 )
+        return -EINVAL;
+
+    guest_rules = guest_handle_for_field(list_hnd, v4v_viptables_rule_t, rules[0]);
+
+    nbrules = 0;
+    while ( nbrules < rules_list.nb_rules && ptr != &viprules )
+    {
+        node = list_entry(ptr, struct v4v_viptables_rule_node, list);
+
+        if ( !guest_handle_okay(guest_rules, 1) )
+            break;
+
+        if ( copy_to_guest(guest_rules, &node->rule, 1) )
+            break;
+
+        guest_handle_add_offset(guest_rules, 1);
+
+        nbrules++;
+        ptr = ptr->next;
+    }
+
+    rules_list.nb_rules = nbrules;
+    if ( copy_field_to_guest(list_hnd, &rules_list, nb_rules) )
+        return -EFAULT;
+
+    return 0;
+}
+
+static size_t
+v4v_viptables_check(v4v_addr_t * src, v4v_addr_t * dst)
+{
+    struct list_head *ptr;
+    struct v4v_viptables_rule_node *node;
+    size_t ret = 0; /* Defaulting to ACCEPT */
+
+    read_lock(&viprules_lock);
+
+    list_for_each(ptr, &viprules)
+    {
+        node = list_entry(ptr, struct v4v_viptables_rule_node, list);
+
+        if ( (node->rule.src.domain == V4V_DOMID_ANY ||
+              node->rule.src.domain == src->domain) &&
+             (node->rule.src.port == V4V_PORT_ANY ||
+              node->rule.src.port == src->port) &&
+             (node->rule.dst.domain == V4V_DOMID_ANY ||
+              node->rule.dst.domain == dst->domain) &&
+             (node->rule.dst.port == V4V_PORT_ANY ||
+              node->rule.dst.port == dst->port) )
+        {
+            ret = !node->rule.accept;
+            break;
+        }
+    }
+
+    read_unlock(&viprules_lock);
+    return ret;
+}
+
+/*
+ * Hypercall to do the send
+ */
+static size_t
+v4v_sendv(struct domain *src_d, v4v_addr_t * src_addr,
+          v4v_addr_t * dst_addr, uint32_t proto,
+          internal_v4v_iov_t iovs, size_t niov)
+{
+    struct domain *dst_d;
+    v4v_ring_id_t src_id;
+    struct v4v_ring_info *ring_info;
+    int ret = 0;
+
+    if ( !dst_addr )
+    {
+        v4v_dprintk("!dst_addr, EINVAL\n");
+        return -EINVAL;
+    }
+
+    read_lock(&v4v_lock);
+    if ( !src_d->v4v )
+    {
+        read_unlock(&v4v_lock);
+        v4v_dprintk("!src_d->v4v, EINVAL\n");
+        return -EINVAL;
+    }
+
+    src_id.addr.port = src_addr->port;
+    src_id.addr.domain = src_d->domain_id;
+    src_id.partner = dst_addr->domain;
+
+    dst_d = get_domain_by_id(dst_addr->domain);
+    if ( !dst_d )
+    {
+        read_unlock(&v4v_lock);
+        v4v_dprintk("!dst_d, ECONNREFUSED\n");
+        return -ECONNREFUSED;
+    }
+
+    if ( v4v_viptables_check(src_addr, dst_addr) != 0 )
+    {
+        read_unlock(&v4v_lock);
+        gdprintk(XENLOG_WARNING,
+                 "V4V: VIPTables REJECTED %i:%i -> %i:%i\n",
+                 src_addr->domain, src_addr->port,
+                 dst_addr->domain, dst_addr->port);
+        return -ECONNREFUSED;
+    }
+
+    do
+    {
+        if ( !dst_d->v4v )
+        {
+            v4v_dprintk("dst_d->v4v, ECONNREFUSED\n");
+            ret = -ECONNREFUSED;
+            break;
+        }
+
+        read_lock(&dst_d->v4v->lock);
+        ring_info =
+            v4v_ring_find_info_by_addr(dst_d, dst_addr, src_addr->domain);
+
+        if ( !ring_info )
+        {
+            ret = -ECONNREFUSED;
+            v4v_dprintk(" !ring_info, ECONNREFUSED\n");
+        }
+        else
+        {
+            size_t len = v4v_iov_count(iovs, niov);
+
+            if ( len < 0 )
+            {
+                ret = len;
+                break;
+            }
+
+            spin_lock(&ring_info->lock);
+            ret =
+                v4v_ringbuf_insertv(dst_d, ring_info, &src_id, proto, iovs,
+                        niov, len);
+            if ( ret == -EAGAIN )
+            {
+                v4v_dprintk("v4v_ringbuf_insertv failed, EAGAIN\n");
+                /* Schedule a wake up on the event channel when space is there */
+                if ( v4v_pending_requeue(ring_info, src_d->domain_id, len) )
+                {
+                    v4v_dprintk("v4v_pending_requeue failed, ENOMEM\n");
+                    ret = -ENOMEM;
+                }
+            }
+            spin_unlock(&ring_info->lock);
+
+            if ( ret >= 0 )
+            {
+                v4v_signal_domain(dst_d);
+            }
+
+        }
+        read_unlock(&dst_d->v4v->lock);
+
+    }
+    while ( 0 );
+
+    put_domain(dst_d);
+    read_unlock(&v4v_lock);
+    return ret;
+}
+
+static void
+v4v_info(struct domain *d, v4v_info_t *info)
+{
+    read_lock(&d->v4v->lock);
+    info->ring_magic = V4V_RING_MAGIC;
+    info->data_magic = V4V_RING_DATA_MAGIC;
+    info->evtchn = d->v4v->evtchn_port;
+    read_unlock(&d->v4v->lock);
+}
+
+/*
+ * hypercall glue
+ */
+long
+do_v4v_op(int cmd, XEN_GUEST_HANDLE(void) arg1,
+          XEN_GUEST_HANDLE(void) arg2,
+          uint32_t arg3, uint32_t arg4)
+{
+    struct domain *d = current->domain;
+    long rc = -EFAULT;
+
+    v4v_dprintk("->do_v4v_op(%d,%p,%p,%d,%d)\n", cmd,
+                (void *)arg1.p, (void *)arg2.p, (int) arg3, (int) arg4);
+
+    domain_lock(d);
+    switch (cmd)
+    {
+        case V4VOP_register_ring:
+            {
+                XEN_GUEST_HANDLE(v4v_ring_t) ring_hnd =
+                    guest_handle_cast(arg1, v4v_ring_t);
+                XEN_GUEST_HANDLE(v4v_pfn_t) pfn_hnd =
+                    guest_handle_cast(arg2, v4v_pfn_t);
+                uint32_t npage = arg3;
+                if ( unlikely(!guest_handle_okay(ring_hnd, 1)) )
+                    goto out;
+                if ( unlikely(!guest_handle_okay(pfn_hnd, npage)) )
+                    goto out;
+                rc = v4v_ring_add(d, ring_hnd, npage, pfn_hnd);
+                break;
+            }
+        case V4VOP_unregister_ring:
+            {
+                XEN_GUEST_HANDLE(v4v_ring_t) ring_hnd =
+                    guest_handle_cast(arg1, v4v_ring_t);
+                if ( unlikely(!guest_handle_okay(ring_hnd, 1)) )
+                    goto out;
+                rc = v4v_ring_remove(d, ring_hnd);
+                break;
+            }
+        case V4VOP_send:
+            {
+                uint32_t len = arg3;
+                uint32_t protocol = arg4;
+                v4v_iov_t iov;
+                internal_v4v_iov_t iovs;
+                XEN_GUEST_HANDLE(v4v_send_addr_t) addr_hnd =
+                    guest_handle_cast(arg1, v4v_send_addr_t);
+                v4v_send_addr_t addr;
+
+                if ( unlikely(!guest_handle_okay(addr_hnd, 1)) )
+                    goto out;
+                if ( copy_from_guest(&addr, addr_hnd, 1) )
+                    goto out;
+
+                iov.iov_base = (uint64_t)arg2.p; // FIXME
+                iov.iov_len = len;
+                iovs.xen_iov = &iov;
+                rc = v4v_sendv(d, &addr.src, &addr.dst, protocol, iovs, 1);
+                break;
+            }
+        case V4VOP_sendv:
+            {
+                internal_v4v_iov_t iovs;
+                uint32_t niov = arg3;
+                uint32_t protocol = arg4;
+                XEN_GUEST_HANDLE(v4v_send_addr_t) addr_hnd =
+                    guest_handle_cast(arg1, v4v_send_addr_t);
+                v4v_send_addr_t addr;
+
+                memset(&iovs, 0, sizeof (iovs));
+                iovs.guest_iov = guest_handle_cast(arg2, v4v_iov_t);
+
+                if ( unlikely(!guest_handle_okay(addr_hnd, 1)) )
+                    goto out;
+                if ( copy_from_guest(&addr, addr_hnd, 1) )
+                    goto out;
+
+                if ( unlikely(!guest_handle_okay(iovs.guest_iov, niov)) )
+                    goto out;
+
+                rc = v4v_sendv(d, &addr.src, &addr.dst, protocol, iovs, niov);
+                break;
+            }
+        case V4VOP_notify:
+            {
+                XEN_GUEST_HANDLE(v4v_ring_data_t) ring_data_hnd =
+                    guest_handle_cast(arg1, v4v_ring_data_t);
+                rc = v4v_notify(d, ring_data_hnd);
+                break;
+            }
+        case V4VOP_viptables_add:
+            {
+                uint32_t position = arg3;
+                XEN_GUEST_HANDLE(v4v_viptables_rule_t) rule_hnd =
+                    guest_handle_cast(arg1, v4v_viptables_rule_t);
+                rc = -EPERM;
+                if ( !IS_PRIV(d) )
+                    goto out;
+
+                write_lock(&viprules_lock);
+                rc = v4v_viptables_add(d, rule_hnd, position);
+                write_unlock(&viprules_lock);
+                break;
+            }
+        case V4VOP_viptables_del:
+            {
+                uint32_t position = arg3;
+                XEN_GUEST_HANDLE(v4v_viptables_rule_t) rule_hnd =
+                    guest_handle_cast(arg1, v4v_viptables_rule_t);
+                rc = -EPERM;
+                if ( !IS_PRIV(d) )
+                    goto out;
+
+                write_lock(&viprules_lock);
+                rc = v4v_viptables_del(d, rule_hnd, position);
+                write_unlock(&viprules_lock);
+                break;
+            }
+        case V4VOP_viptables_list:
+            {
+                XEN_GUEST_HANDLE(v4v_viptables_list_t) rules_list_hnd =
+                    guest_handle_cast(arg1, v4v_viptables_list_t);
+                rc = -EPERM;
+                if ( !IS_PRIV(d) )
+                    goto out;
+
+                rc = -EFAULT;
+                if ( unlikely(!guest_handle_okay(rules_list_hnd, 1)) )
+                    goto out;
+
+                read_lock(&viprules_lock);
+                rc = v4v_viptables_list(d, rules_list_hnd);
+                read_unlock(&viprules_lock);
+                break;
+            }
+        case V4VOP_info:
+            {
+                XEN_GUEST_HANDLE(v4v_info_t) info_hnd =
+                    guest_handle_cast(arg1, v4v_info_t);
+                v4v_info_t info;
+
+                if ( unlikely(!guest_handle_okay(info_hnd, 1)) )
+                    goto out;
+                v4v_info(d, &info);
+                if ( copy_to_guest(info_hnd, &info, 1) )
+                    goto out;
+                rc = 0;
+                break;
+            }
+        default:
+            rc = -ENOSYS;
+            break;
+    }
+out:
+    domain_unlock(d);
+    v4v_dprintk("<-do_v4v_op()=%d\n", (int)rc);
+    return rc;
+}
+
+/*
+ * init
+ */
+
+void
+v4v_destroy(struct domain *d)
+{
+    int i;
+
+    BUG_ON(!d->is_dying);
+    write_lock(&v4v_lock);
+
+    v4v_dprintk("d->v=%p\n", d->v4v);
+
+    if ( d->v4v )
+    {
+        for ( i = 0; i < V4V_HTABLE_SIZE; ++i )
+        {
+            struct hlist_node *node, *next;
+            struct v4v_ring_info *ring_info;
+
+            hlist_for_each_entry_safe(ring_info, node,
+                    next, &d->v4v->ring_hash[i],
+                    node)
+            {
+                v4v_ring_remove_info(d, ring_info);
+            }
+        }
+    }
+
+    d->v4v = NULL;
+    write_unlock(&v4v_lock);
+}
+
+int
+v4v_init(struct domain *d)
+{
+    struct v4v_domain *v4v;
+    evtchn_port_t port;
+    struct evtchn *chn;
+    int i;
+    int rc;
+
+    v4v = xmalloc(struct v4v_domain);
+    if ( !v4v )
+        return -ENOMEM;
+
+    rc = evtchn_alloc_unbound_domain(d, &port);
+    if ( rc )
+        return rc;
+
+    chn = evtchn_from_port(d, port);
+    chn->u.unbound.remote_domid = d->domain_id;
+
+    rwlock_init(&v4v->lock);
+
+    v4v->evtchn_port = port;
+    for ( i = 0; i < V4V_HTABLE_SIZE; ++i )
+        INIT_HLIST_HEAD(&v4v->ring_hash[i]);
+
+    write_lock(&v4v_lock);
+    d->v4v = v4v;
+    write_unlock(&v4v_lock);
+
+    return 0;
+}
+
+
+/*
+ * debug
+ */
+
+static void
+dump_domain_ring(struct domain *d, struct v4v_ring_info *ring_info)
+{
+    uint32_t rx_ptr;
+
+    printk(KERN_ERR "  ring: domid=%d port=0x%08x partner=%d npage=%d\n",
+           (int)d->domain_id, (int)ring_info->id.addr.port,
+           (int)ring_info->id.partner, (int)ring_info->npage);
+
+    if ( v4v_ringbuf_get_rx_ptr(d, ring_info, &rx_ptr) )
+    {
+        printk(KERN_ERR "   Failed to read rx_ptr\n");
+        return;
+    }
+
+    printk(KERN_ERR "   tx_ptr=%d rx_ptr=%d len=%d\n",
+           (int)ring_info->tx_ptr, (int)rx_ptr, (int)ring_info->len);
+}
+
+static void
+dump_domain(struct domain *d)
+{
+    int i;
+
+    printk(KERN_ERR " domain %d:\n", (int)d->domain_id);
+
+    read_lock(&d->v4v->lock);
+
+    for ( i = 0; i < V4V_HTABLE_SIZE; ++i )
+    {
+        struct hlist_node *node;
+        struct v4v_ring_info *ring_info;
+
+        hlist_for_each_entry(ring_info, node, &d->v4v->ring_hash[i], node)
+            dump_domain_ring(d, ring_info);
+    }
+
+    printk(KERN_ERR "  event channel: %d\n",  d->v4v->evtchn_port);
+    read_unlock(&d->v4v->lock);
+
+    printk(KERN_ERR "\n");
+    v4v_signal_domain(d);
+}
+
+static void
+dump_state(unsigned char key)
+{
+    struct domain *d;
+
+    printk(KERN_ERR "\n\nV4V:\n");
+    read_lock(&v4v_lock);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain(d)
+        dump_domain(d);
+
+    rcu_read_unlock(&domlist_read_lock);
+
+    read_unlock(&v4v_lock);
+}
+
+struct keyhandler v4v_info_keyhandler =
+{
+    .diagnostic = 1,
+    .u.fn = dump_state,
+    .desc = "dump v4v states and interupt"
+};
+
+static int __init
+setup_dump_rings(void)
+{
+    register_keyhandler('4', &v4v_info_keyhandler);
+    return 0;
+}
+
+__initcall(setup_dump_rings);
+
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/v4v.h b/xen/include/public/v4v.h
new file mode 100644
index 0000000..1f1c156
--- /dev/null
+++ b/xen/include/public/v4v.h
@@ -0,0 +1,291 @@
+/******************************************************************************
+ * V4V
+ *
+ * Version 2 of v2v (Virtual-to-Virtual)
+ *
+ * Copyright (c) 2010, Citrix Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __XEN_PUBLIC_V4V_H__
+#define __XEN_PUBLIC_V4V_H__
+
+#include "xen.h"
+#include "event_channel.h"
+
+/*
+ * Structure definitions
+ */
+
+#define V4V_RING_MAGIC          0xA822f72bb0b9d8cc
+#define V4V_RING_DATA_MAGIC	0x45fe852220b801d4
+
+#define V4V_PROTO_DGRAM		0x3c2c1db8
+#define V4V_PROTO_STREAM 	0x70f6a8e5
+
+#define V4V_DOMID_ANY           0x7fffU
+#define V4V_PORT_ANY            0
+
+typedef struct v4v_iov
+{
+    uint64_t iov_base;
+    uint64_t iov_len;
+} v4v_iov_t;
+
+typedef struct v4v_addr
+{
+    uint32_t port;
+    domid_t domain;
+    uint16_t pad;
+} v4v_addr_t;
+
+typedef struct v4v_ring_id
+{
+    v4v_addr_t addr;
+    domid_t partner;
+    uint16_t pad;
+} v4v_ring_id_t;
+
+typedef uint64_t v4v_pfn_t;
+
+typedef struct
+{
+    v4v_addr_t src;
+    v4v_addr_t dst;
+} v4v_send_addr_t;
+
+/*
+ * v4v_ring
+ * id: xen only looks at this during register/unregister
+ *     and will fill in id.addr.domain
+ * rx_ptr: rx pointer, modified by domain
+ * tx_ptr: tx pointer, modified by xen
+ *
+ */
+struct v4v_ring
+{
+    uint64_t magic;
+    v4v_ring_id_t id;
+    uint32_t len;
+    uint32_t rx_ptr;
+    uint32_t tx_ptr;
+    uint8_t reserved[32];
+    uint8_t ring[0];
+};
+typedef struct v4v_ring v4v_ring_t;
+
+#define V4V_RING_DATA_F_EMPTY       (1U << 0) /* Ring is empty */
+#define V4V_RING_DATA_F_EXISTS      (1U << 1) /* Ring exists */
+#define V4V_RING_DATA_F_PENDING     (1U << 2) /* Pending interrupt exists - do not
+                                               * rely on this field - for
+                                               * profiling only */
+#define V4V_RING_DATA_F_SUFFICIENT  (1U << 3) /* Sufficient space to queue
+                                               * space_required bytes exists */
+
+typedef struct v4v_ring_data_ent
+{
+    v4v_addr_t ring;
+    uint16_t flags;
+    uint16_t pad;
+    uint32_t space_required;
+    uint32_t max_message_size;
+} v4v_ring_data_ent_t;
+
+typedef struct v4v_ring_data
+{
+    uint64_t magic;
+    uint32_t nent;
+    uint32_t pad;
+    uint64_t reserved[4];
+    v4v_ring_data_ent_t data[0];
+} v4v_ring_data_t;
+
+struct v4v_info
+{
+    uint64_t ring_magic;
+    uint64_t data_magic;
+    evtchn_port_t evtchn;
+};
+typedef struct v4v_info v4v_info_t;
+
+#define V4V_ROUNDUP(a) (((a) +0xf ) & ~0xf)
+/*
+ * Messages on the ring are padded to 128 bits
+ * Len here refers to the exact length of the data not including the
+ * 128 bit header. The message uses
+ * ((len +0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
+ */
+
+#define V4V_SHF_SYN		(1 << 0)
+#define V4V_SHF_ACK		(1 << 1)
+#define V4V_SHF_RST		(1 << 2)
+
+#define V4V_SHF_PING		(1 << 8)
+#define V4V_SHF_PONG		(1 << 9)
+
+struct v4v_stream_header
+{
+    uint32_t flags;
+    uint32_t conid;
+};
+
+struct v4v_ring_message_header
+{
+    uint32_t len;
+    uint32_t pad0;
+    v4v_addr_t source;
+    uint32_t protocol;
+    uint32_t pad1;
+    uint8_t data[0];
+};
+
+typedef struct v4v_viptables_rule
+{
+    v4v_addr_t src;
+    v4v_addr_t dst;
+    uint32_t accept;
+    uint32_t pad;
+} v4v_viptables_rule_t;
+
+typedef struct v4v_viptables_list
+{
+    uint32_t start_rule;
+    uint32_t nb_rules;
+    struct v4v_viptables_rule rules[0];
+} v4v_viptables_list_t;
+
+/*
+ * HYPERCALLS
+ */
+
+#define V4VOP_register_ring 	1
+/*
+ * Registers a ring with Xen, if a ring with the same v4v_ring_id exists,
+ * this ring takes its place, registration will not change tx_ptr
+ * unless it is invalid
+ *
+ * do_v4v_op(V4VOP_unregister_ring,
+ *           v4v_ring, XEN_GUEST_HANDLE(v4v_pfn),
+ *           npage, 0)
+ */
+
+
+#define V4VOP_unregister_ring 	2
+/*
+ * Unregister a ring.
+ *
+ * v4v_hypercall(V4VOP_send, v4v_ring, NULL, 0, 0)
+ */
+
+#define V4VOP_send 		3
+/*
+ * Sends len bytes of buf to dst, giving src as the source address (xen will
+ * ignore src->domain and put your domain in the actually message), xen
+ * first looks for a ring with id.addr==dst and id.partner==sending_domain
+ * if that fails it looks for id.addr==dst and id.partner==DOMID_ANY.
+ * protocol is the 32 bit protocol number used from the message
+ * most likely V4V_PROTO_DGRAM or STREAM. If insufficient space exists
+ * it will return -EAGAIN and xen will twing the V4V_INTERRUPT when
+ * sufficient space becomes available
+ *
+ * v4v_hypercall(V4VOP_send,
+ *               v4v_send_addr_t addr,
+ *               void* buf,
+ *               uint32_t len,
+ *               uint32_t protocol)
+ */
+
+
+#define V4VOP_notify 		4
+/* Asks xen for information about other rings in the system
+ *
+ * ent->ring is the v4v_addr_t of the ring you want information on
+ * the same matching rules are used as for V4VOP_send.
+ *
+ * ent->space_required  if this field is not null xen will check
+ * that there is space in the destination ring for this many bytes
+ * of payload. If there is it will set the V4V_RING_DATA_F_SUFFICIENT
+ * and CANCEL any pending interrupt for that ent->ring, if insufficient
+ * space is available it will schedule an interrupt and the flag will
+ * not be set.
+ *
+ * The flags are set by xen when notify replies
+ * V4V_RING_DATA_F_EMPTY	ring is empty
+ * V4V_RING_DATA_F_PENDING	interrupt is pending - don't rely on this
+ * V4V_RING_DATA_F_SUFFICIENT	sufficient space for space_required is there
+ * V4V_RING_DATA_F_EXISTS	ring exists
+ *
+ * v4v_hypercall(V4VOP_notify,
+ *               XEN_GUEST_HANDLE(v4v_ring_data_ent) ent,
+ *               NULL, nent, 0)
+ */
+
+#define V4VOP_sendv		5
+/*
+ * Identical to V4VOP_send except rather than buf and len it takes
+ * an array of v4v_iov and a length of the array.
+ *
+ * v4v_hypercall(V4VOP_sendv,
+ *               v4v_send_addr_t addr,
+ *               v4v_iov iov,
+ *               uint32_t niov,
+ *               uint32_t protocol)
+ */
+
+#define V4VOP_viptables_add     6
+/*
+ * Insert a filtering rules after a given position.
+ *
+ * v4v_hypercall(V4VOP_viptables_add,
+ *               v4v_viptables_rule_t rule,
+ *               NULL,
+ *               uint32_t position, 0)
+ */
+
+#define V4VOP_viptables_del     7
+/*
+ * Delete a filtering rules at a position or the rule
+ * that matches "rule".
+ *
+ * v4v_hypercall(V4VOP_viptables_del,
+ *               v4v_viptables_rule_t rule,
+ *               NULL,
+ *               uint32_t position, 0)
+ */
+
+#define V4VOP_viptables_list    8
+/*
+ * Delete a filtering rules at a position or the rule
+ * that matches "rule".
+ *
+ * v4v_hypercall(V4VOP_viptables_list,
+ *               v4v_vitpables_list_t list,
+ *               NULL, 0, 0)
+ */
+
+#define V4VOP_info              9
+/*
+ * v4v_hypercall(V4VOP_info,
+ *               XEN_GUEST_HANDLE(v4v_info_t) info,
+ *               NULL, 0, 0)
+ */
+
+#endif /* __XEN_PUBLIC_V4V_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index b19425b..868d119 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -99,7 +99,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 #define __HYPERVISOR_domctl               36
 #define __HYPERVISOR_kexec_op             37
 #define __HYPERVISOR_tmem_op              38
-#define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
+#define __HYPERVISOR_v4v_op               39
 
 /* Architecture-specific hypercall definitions. */
 #define __HYPERVISOR_arch_0               48
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 53804c8..296de52 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -23,6 +23,7 @@
 #include <public/sysctl.h>
 #include <public/vcpu.h>
 #include <public/mem_event.h>
+#include <xen/v4v.h>
 
 #ifdef CONFIG_COMPAT
 #include <compat/vcpu.h>
@@ -350,6 +351,9 @@ struct domain
     nodemask_t node_affinity;
     unsigned int last_alloc_node;
     spinlock_t node_affinity_lock;
+
+    /* v4v */
+    struct v4v_domain *v4v;
 };
 
 struct domain_setup_info
diff --git a/xen/include/xen/v4v.h b/xen/include/xen/v4v.h
new file mode 100644
index 0000000..cba5ea7
--- /dev/null
+++ b/xen/include/xen/v4v.h
@@ -0,0 +1,134 @@
+/******************************************************************************
+ * V4V
+ *
+ * Version 2 of v2v (Virtual-to-Virtual)
+ *
+ * Copyright (c) 2010, Citrix Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __V4V_PRIVATE_H__
+#define __V4V_PRIVATE_H__
+
+#include <xen/config.h>
+#include <xen/types.h>
+#include <xen/spinlock.h>
+#include <xen/smp.h>
+#include <xen/shared.h>
+#include <xen/list.h>
+#include <public/v4v.h>
+
+#define V4V_HTABLE_SIZE 32
+
+/*
+ * Handlers
+ */
+
+DEFINE_XEN_GUEST_HANDLE (v4v_iov_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_addr_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_send_addr_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_pfn_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_ring_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_ent_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_info_t);
+
+DEFINE_XEN_GUEST_HANDLE (v4v_viptables_rule_t);
+DEFINE_XEN_GUEST_HANDLE (v4v_viptables_list_t);
+
+/*
+ * Helper functions
+ */
+
+static inline uint16_t
+v4v_hash_fn (v4v_ring_id_t *id)
+{
+    uint16_t ret;
+    ret = (uint16_t) (id->addr.port >> 16);
+    ret ^= (uint16_t) id->addr.port;
+    ret ^= id->addr.domain;
+    ret ^= id->partner;
+
+    ret &= (V4V_HTABLE_SIZE-1);
+
+    return ret;
+}
+
+struct v4v_pending_ent
+{
+    struct hlist_node node;
+    domid_t id;
+    uint32_t len;
+};
+
+
+struct v4v_ring_info
+{
+    /* next node in the hash, protected by L2  */
+    struct hlist_node node;
+    /* this ring's id, protected by L2 */
+    v4v_ring_id_t id;
+    /* L3 */
+    spinlock_t lock;
+    /* cached length of the ring (from ring->len), protected by L3 */
+    uint32_t len;
+    uint32_t npage;
+    /* cached tx pointer location, protected by L3 */
+    uint32_t tx_ptr;
+    /* guest ring, protected by L3 */
+    XEN_GUEST_HANDLE(v4v_ring_t) ring;
+    /* mapped ring pages protected by L3*/
+    uint8_t **mfn_mapping;
+    /* list of mfns of guest ring */
+    mfn_t *mfns;
+    /* list of struct v4v_pending_ent for this ring, L3 */
+    struct hlist_head pending;
+};
+
+/*
+ * The value of the v4v element in a struct domain is
+ * protected by the global lock L1
+ */
+struct v4v_domain
+{
+    /* L2 */
+    rwlock_t lock;
+    /* event channel */
+    evtchn_port_t evtchn_port;
+    /* protected by L2 */
+    struct hlist_head ring_hash[V4V_HTABLE_SIZE];
+};
+
+typedef struct v4v_viptables_rule_node
+{
+    struct list_head list;
+    v4v_viptables_rule_t rule;
+} v4v_viptables_rule_node_t;
+
+void v4v_destroy(struct domain *d);
+int v4v_init(struct domain *d);
+long do_v4v_op (int cmd,
+                XEN_GUEST_HANDLE (void) arg1,
+                XEN_GUEST_HANDLE (void) arg2,
+                uint32_t arg3,
+                uint32_t arg4);
+
+#endif /* __V4V_PRIVATE_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/v4v_utils.h b/xen/include/xen/v4v_utils.h
new file mode 100644
index 0000000..67b2d77
--- /dev/null
+++ b/xen/include/xen/v4v_utils.h
@@ -0,0 +1,276 @@
+/******************************************************************************
+ * V4V
+ *
+ * Version 2 of v2v (Virtual-to-Virtual)
+ *
+ * Copyright (c) 2010, Citrix Systems
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __V4V_UTILS_H__
+# define __V4V_UTILS_H__
+
+/* Compiler specific hacks */
+#if defined(__GNUC__)
+# define V4V_UNUSED __attribute__ ((unused))
+# ifndef __STRICT_ANSI__
+#  define V4V_INLINE inline
+# else
+#  define V4V_INLINE
+# endif
+#else /* !__GNUC__ */
+# define V4V_UNUSED
+# define V4V_INLINE
+#endif
+
+
+/*
+ * Utility functions
+ */
+
+static V4V_INLINE uint32_t
+v4v_ring_bytes_to_read (volatile struct v4v_ring *r)
+{
+    int32_t ret;
+    ret = r->tx_ptr - r->rx_ptr;
+    if (ret >= 0)
+        return ret;
+    return (uint32_t) (r->len + ret);
+}
+
+
+/*
+ * Copy at most t bytes of the next message in the ring, into the buffer
+ * at _buf, setting from and protocol if they are not NULL, returns
+ * the actual length of the message, or -1 if there is nothing to read
+ */
+V4V_UNUSED static V4V_INLINE ssize_t
+v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol,
+              void *_buf, size_t t, int consume)
+{
+    volatile struct v4v_ring_message_header *mh;
+    /* unnecessary cast from void * required by MSVC compiler */
+    uint8_t *buf = (uint8_t *) _buf;
+    uint32_t btr = v4v_ring_bytes_to_read (r);
+    uint32_t rxp = r->rx_ptr;
+    uint32_t bte;
+    uint32_t len;
+    ssize_t ret;
+
+
+    if (btr < sizeof (*mh))
+        return -1;
+
+    /*
+     * Becuase the message_header is 128 bits long and the ring is 128 bit
+     * aligned, we're gaurunteed never to wrap
+     */
+    mh = (volatile struct v4v_ring_message_header *) &r->ring[r->rx_ptr];
+
+    len = mh->len;
+    if (btr < len)
+        return -1;
+
+#if defined(__GNUC__)
+    if (from)
+        *from = mh->source;
+#else
+    /* MSVC can't do the above */
+    if (from)
+	memcpy((void *) from, (void *) &(mh->source), sizeof(struct v4v_addr));
+#endif
+
+    if (protocol)
+        *protocol = mh->protocol;
+
+    rxp += sizeof (*mh);
+    if (rxp == r->len)
+        rxp = 0;
+    len -= sizeof (*mh);
+    ret = len;
+
+    bte = r->len - rxp;
+
+    if (bte < len)
+    {
+        if (t < bte)
+        {
+            if (buf)
+            {
+                memcpy (buf, (void *) &r->ring[rxp], t);
+                buf += t;
+            }
+
+            rxp = 0;
+            len -= bte;
+            t = 0;
+        }
+        else
+        {
+            if (buf)
+            {
+                memcpy (buf, (void *) &r->ring[rxp], bte);
+                buf += bte;
+            }
+            rxp = 0;
+            len -= bte;
+            t -= bte;
+        }
+    }
+
+    if (buf && t)
+        memcpy (buf, (void *) &r->ring[rxp], (t < len) ? t : len);
+
+
+    rxp += V4V_ROUNDUP (len);
+    if (rxp == r->len)
+        rxp = 0;
+
+    mb ();
+
+    if (consume)
+        r->rx_ptr = rxp;
+
+    return ret;
+}
+
+static V4V_INLINE void
+v4v_memcpy_skip (void *_dst, const void *_src, size_t len, size_t *skip)
+{
+    const uint8_t *src =  (const uint8_t *) _src;
+    uint8_t *dst = (uint8_t *) _dst;
+
+    if (!*skip)
+    {
+        memcpy (dst, src, len);
+        return;
+    }
+
+    if (*skip >= len)
+    {
+        *skip -= len;
+        return;
+    }
+
+    src += *skip;
+    dst += *skip;
+    len -= *skip;
+    *skip = 0;
+
+    memcpy (dst, src, len);
+}
+
+/*
+ * Copy at most t bytes of the next message in the ring, into the buffer
+ * at _buf, skipping skip bytes, setting from and protocol if they are not
+ * NULL, returns the actual length of the message, or -1 if there is
+ * nothing to read
+ */
+static ssize_t
+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
+                     uint32_t * protocol, void *_buf, size_t t, int consume,
+                     size_t skip) V4V_UNUSED;
+
+V4V_INLINE static ssize_t
+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
+                     uint32_t * protocol, void *_buf, size_t t, int consume,
+                     size_t skip)
+{
+    volatile struct v4v_ring_message_header *mh;
+    /* unnecessary cast from void * required by MSVC compiler */
+    uint8_t *buf = (uint8_t *) _buf;
+    uint32_t btr = v4v_ring_bytes_to_read (r);
+    uint32_t rxp = r->rx_ptr;
+    uint32_t bte;
+    uint32_t len;
+    ssize_t ret;
+
+    buf -= skip;
+
+    if (btr < sizeof (*mh))
+        return -1;
+
+    /*
+     * Becuase the message_header is 128 bits long and the ring is 128 bit
+     * aligned, we're gaurunteed never to wrap
+     */
+    mh = (volatile struct v4v_ring_message_header *) &r->ring[r->rx_ptr];
+
+    len = mh->len;
+    if (btr < len)
+        return -1;
+
+#if defined(__GNUC__)
+    if (from)
+        *from = mh->source;
+#else
+    /* MSVC can't do the above */
+    if (from)
+	memcpy((void *) from, (void *) &(mh->source), sizeof(struct v4v_addr));
+#endif
+
+    if (protocol)
+        *protocol = mh->protocol;
+
+    rxp += sizeof (*mh);
+    if (rxp == r->len)
+        rxp = 0;
+    len -= sizeof (*mh);
+    ret = len;
+
+    bte = r->len - rxp;
+
+    if (bte < len)
+    {
+        if (t < bte)
+        {
+            if (buf)
+            {
+                v4v_memcpy_skip (buf, (void *) &r->ring[rxp], t, &skip);
+                buf += t;
+            }
+
+            rxp = 0;
+            len -= bte;
+            t = 0;
+        }
+        else
+        {
+            if (buf)
+            {
+                v4v_memcpy_skip (buf, (void *) &r->ring[rxp], bte,
+                        &skip);
+                buf += bte;
+            }
+            rxp = 0;
+            len -= bte;
+            t -= bte;
+        }
+    }
+
+    if (buf && t)
+        v4v_memcpy_skip (buf, (void *) &r->ring[rxp], (t < len) ? t : len,
+                         &skip);
+
+
+    rxp += V4V_ROUNDUP (len);
+    if (rxp == r->len)
+        rxp = 0;
+
+    mb ();
+
+    if (consume)
+        r->rx_ptr = rxp;
+
+    return ret;
+}
+
+#endif /* !__V4V_UTILS_H__ */

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/5] RFC: V4V (v3)
  2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
                   ` (4 preceding siblings ...)
  2012-08-03 19:50 ` [PATCH 5/5] xen: Add V4V implementation Jean Guyader
@ 2012-08-04 13:24 ` Jean Guyader
  5 siblings, 0 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-04 13:24 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

On 3 August 2012 20:50, Jean Guyader <jean.guyader@citrix.com> wrote:
> v3 changes:
>         - Switch to event channel
>                 - Allocated a unbound event channel
>                   per domain.
>                 - Add a new v4v call to share the
>                   event channel port.
>         - Public headers with actual type definition
>         - Align all the v4v type to 64 bits
>         - Modify v4v MAGIC numbers because we won't
>           but backward compatible anymore
>         - Merge insert and insertv
>         - Merge send and sendv
>         - Turn all the lock prerequisite from comment
>           to ASSERT()
>         - Make use or write_atomic instead of volatile pointers
>         - Merge v4v_memcpy_to_guest_ring and
>           v4v_memcpy_to_guest_ring_from_guest
>                 - Introduce copy_from_guest_maybe that can take
>                   a void * and a handle as src address.
            - Replace 6 arguments hypercalls with 5 arguments hypercalls.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-03 19:50 ` [PATCH 1/5] xen: add ssize_t Jean Guyader
@ 2012-08-06  8:08   ` Jan Beulich
  2012-08-06 14:47     ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2012-08-06  8:08 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:

Without finally explaining why you need this type in the first place,
I'll continue to NAK this patch. (This is made even worse by the fact
taht the two inline functions in patch 5 that make use of the type
appear to be unused.)

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-03 19:50 ` [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED Jean Guyader
@ 2012-08-06  8:10   ` Jan Beulich
  2012-08-06 14:46     ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2012-08-06  8:10 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
> VIRQ_XC_RESERVED was reserved for V4V but we have switched
> to event channels so this place holder is no longer required.

I'm fine with this change, but is a future re-use of the value indeed
not going to cause problems on XenServer (or wherever else this
is patch set coming from)?

Jan

> Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
> ---
>  xen/include/public/xen.h |    1 -
>  1 file changed, 1 deletion(-)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-03 19:50 ` [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain Jean Guyader
@ 2012-08-06  8:19   ` Jan Beulich
  2012-08-09 10:06   ` Tim Deegan
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2012-08-06  8:19 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>--- a/xen/common/event_channel.c
>+++ b/xen/common/event_channel.c
>@@ -51,6 +51,8 @@
> 
> #define consumer_is_xen(e) (!!(e)->xen_consumer)
> 
>+static long __evtchn_close(struct domain *d, int port);

What is this needed for?

>+
> /*
>  * The function alloc_unbound_xen_event_channel() allows an arbitrary
>  * notifier function to be specified. However, very few unique functions
>@@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     struct evtchn *chn;
>     struct domain *d;
>-    int            port;
>+    evtchn_port_t  port;
>     domid_t        dom = alloc->dom;
>-    long           rc;
>+    int            rc;
> 
>     rc = rcu_lock_target_domain_by_id(dom, &d);
>     if ( rc )
>         return rc;
> 
>-    spin_lock(&d->event_lock);
>+    rc = evtchn_alloc_unbound_domain(d, &port);
>+    if ( rc )
>+        ERROR_EXIT_DOM(rc, d);
> 
>-    if ( (port = get_free_port(d)) < 0 )
>-        ERROR_EXIT_DOM(port, d);
>     chn = evtchn_from_port(d, port);
> 
>     rc = xsm_evtchn_unbound(d, chn, alloc->remote_dom);
>@@ -186,12 +188,31 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>     alloc->port = port;
> 
>  out:
>-    spin_unlock(&d->event_lock);
>     rcu_unlock_domain(d);
> 
>     return rc;
> }
> 
>+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port)
>+{
>+    struct evtchn *chn;
>+    int            free_port = 0;
>+
>+    spin_lock(&d->event_lock);
>+
>+    if ( (free_port = get_free_port(d)) < 0 )
>+        goto out;
>+    chn = evtchn_from_port(d, free_port);

The code below this is not really a plain breakout from the
function above:

>+    chn->state = ECS_UNBOUND;

The equivalent to this ought to be removed from the original
function as being redundant.

>+    chn->u.unbound.remote_domid = DOMID_INVALID;

The single caller here will immediately overwrite this value. It
would seem more clean to simply pass in the intended value,
and eliminate the corresponding code from the caller too.

Jan

>+    *port = free_port;
>+    /* Everything is fine, returns 0 */
>+    free_port = 0;
>+
>+ out:
>+    spin_unlock(&d->event_lock);
>+    return free_port;
>+}
> 
> static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
> {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-03 19:50 ` [PATCH 5/5] xen: Add V4V implementation Jean Guyader
@ 2012-08-06  8:45   ` Jan Beulich
  2012-08-23 11:57     ` Jean Guyader
  2012-09-01 20:56     ` Jean Guyader
  2012-08-09 10:38   ` [PATCH 5/5] xen: Add V4V implementation Tim Deegan
  1 sibling, 2 replies; 50+ messages in thread
From: Jan Beulich @ 2012-08-06  8:45 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>--- /dev/null
>+++ b/xen/include/public/v4v.h
>...
>+#define V4V_DOMID_ANY           0x7fffU

I think I asked this before - why not use one of the pre-existing
DOMID values? And if there is a good reason, it should be said
here in a comment, to avoid the same question being asked
later again.

>...
>+typedef uint64_t v4v_pfn_t;

We already have xen_pfn_t, so why do you need yet another
flavor?

>...
>+struct v4v_info
>+{
>+    uint64_t ring_magic;
>+    uint64_t data_magic;
>+    evtchn_port_t evtchn;

Missing padding at the end?

>+};
>+typedef struct v4v_info v4v_info_t;
>+
>+#define V4V_ROUNDUP(a) (((a) +0xf ) & ~0xf)

Doesn't seem to belong here. Or is the subsequent comment
actually related to this (in which case it should be moved ahead
of the definition and made match it).

>+/*
>+ * Messages on the ring are padded to 128 bits
>+ * Len here refers to the exact length of the data not including the
>+ * 128 bit header. The message uses
>+ * ((len +0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
>+ */
>...
>+/*
>+ * HYPERCALLS
>+ */
>...

In the block below here, please get the naming (do_v4v_op()
vs v4v_hypercall()) and the use of newlines (either always one
or always two between individual hypercall descriptions)
consistent. Hmm, even the descriptions don't seem to always
match the definitions (not really obvious because apparently
again the descriptions follow the definitions, whereas the
opposite is the usual way to arrange things).

>--- /dev/null
>+++ b/xen/include/xen/v4v_utils.h
>...
>+/* Compiler specific hacks */
>+#if defined(__GNUC__)
>+# define V4V_UNUSED __attribute__ ((unused))
>+# ifndef __STRICT_ANSI__
>+#  define V4V_INLINE inline
>+# else
>+#  define V4V_INLINE
>+# endif
>+#else /* !__GNUC__ */
>+# define V4V_UNUSED
>+# define V4V_INLINE
>+#endif

This suggests the header is really intended to be public?

>...
>+static V4V_INLINE uint32_t
>+v4v_ring_bytes_to_read (volatile struct v4v_ring *r)

No space between function name and opening parenthesis
(throughout this file).

>...
>+V4V_UNUSED static V4V_INLINE ssize_t

V4V_UNUSED? Doesn't make sense in conjunction with
V4V_INLINE, at least as long as you're using GNU extensions
anyway (see above as to the disposition of the header).

>+v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol,
>+              void *_buf, size_t t, int consume)

Dead functions shouldn't be placed here.

>...
>+static ssize_t
>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>+                     size_t skip) V4V_UNUSED;
>+
>+V4V_INLINE static ssize_t
>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>+                     size_t skip)
>+{

What's the point of having a declaration followed immediately by
a definition? Also, the function is dead too.

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-06  8:10   ` Jan Beulich
@ 2012-08-06 14:46     ` Jean Guyader
  2012-08-06 14:49       ` Andrew Cooper
  2012-08-06 14:56       ` Ian Campbell
  0 siblings, 2 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-06 14:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Stefano Stabellini, Ross Philipson, Jean Guyader, xen-devel

On 6 August 2012 09:10, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>> VIRQ_XC_RESERVED was reserved for V4V but we have switched
>> to event channels so this place holder is no longer required.
>
> I'm fine with this change, but is a future re-use of the value indeed
> not going to cause problems on XenServer (or wherever else this
> is patch set coming from)?
>

That may need to be confirmed but I don't think XenServer is using v4v
yet, their plan
is to pick it up from upstream. So removing this VIRQ should be fine.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-06  8:08   ` Jan Beulich
@ 2012-08-06 14:47     ` Jean Guyader
  2012-08-09  9:51       ` Tim Deegan
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-06 14:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jean Guyader, xen-devel

On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>
> Without finally explaining why you need this type in the first place,
> I'll continue to NAK this patch. (This is made even worse by the fact
> taht the two inline functions in patch 5 that make use of the type
> appear to be unused.)
>

Understood. I'll switch to use long instead of ssize_t in my
forthcoming patch serie.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-06 14:46     ` Jean Guyader
@ 2012-08-06 14:49       ` Andrew Cooper
  2012-08-06 14:56       ` Ian Campbell
  1 sibling, 0 replies; 50+ messages in thread
From: Andrew Cooper @ 2012-08-06 14:49 UTC (permalink / raw)
  To: Jean Guyader
  Cc: Jean Guyader (3P),
	xen-devel, Ross Philipson, Jan Beulich, Stefano Stabellini

On 06/08/12 15:46, Jean Guyader wrote:
> On 6 August 2012 09:10, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>> VIRQ_XC_RESERVED was reserved for V4V but we have switched
>>> to event channels so this place holder is no longer required.
>> I'm fine with this change, but is a future re-use of the value indeed
>> not going to cause problems on XenServer (or wherever else this
>> is patch set coming from)?
>>
> That may need to be confirmed but I don't think XenServer is using v4v
> yet, their plan
> is to pick it up from upstream. So removing this VIRQ should be fine.
>
> Jean

Yes - we don't use it, but are interested when XenClient can
successfully upstream it.

~Andrew

>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-06 14:46     ` Jean Guyader
  2012-08-06 14:49       ` Andrew Cooper
@ 2012-08-06 14:56       ` Ian Campbell
  2012-08-06 15:01         ` Jean Guyader
  1 sibling, 1 reply; 50+ messages in thread
From: Ian Campbell @ 2012-08-06 14:56 UTC (permalink / raw)
  To: Jean Guyader
  Cc: Jean Guyader (3P),
	xen-devel, Ross Philipson, Jan Beulich, Stefano Stabellini

On Mon, 2012-08-06 at 15:46 +0100, Jean Guyader wrote:
> On 6 August 2012 09:10, Jan Beulich <JBeulich@suse.com> wrote:
> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
> >> VIRQ_XC_RESERVED was reserved for V4V but we have switched
> >> to event channels so this place holder is no longer required.
> >
> > I'm fine with this change, but is a future re-use of the value indeed
> > not going to cause problems on XenServer (or wherever else this
> > is patch set coming from)?
> >
> 
> That may need to be confirmed but I don't think XenServer is using v4v
> yet

I think Jan probably meant XenClient (i.e. that being the place where
v4v is already deployed).

There's no harm in keeping this # reserved indefinitely, with a suitable
comment, I think? The only reason not to would be if this address space
was limited, but I don't think that is the case with VIRQs

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-06 14:56       ` Ian Campbell
@ 2012-08-06 15:01         ` Jean Guyader
  2012-08-06 15:13           ` Ian Campbell
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-06 15:01 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Jean Guyader (3P),
	xen-devel, Ross Philipson, Jan Beulich, Stefano Stabellini

On 6 August 2012 15:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2012-08-06 at 15:46 +0100, Jean Guyader wrote:
>> On 6 August 2012 09:10, Jan Beulich <JBeulich@suse.com> wrote:
>> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>> >> VIRQ_XC_RESERVED was reserved for V4V but we have switched
>> >> to event channels so this place holder is no longer required.
>> >
>> > I'm fine with this change, but is a future re-use of the value indeed
>> > not going to cause problems on XenServer (or wherever else this
>> > is patch set coming from)?
>> >
>>
>> That may need to be confirmed but I don't think XenServer is using v4v
>> yet
>
> I think Jan probably meant XenClient (i.e. that being the place where
> v4v is already deployed).
>
> There's no harm in keeping this # reserved indefinitely, with a suitable
> comment, I think? The only reason not to would be if this address space
> was limited, but I don't think that is the case with VIRQs
>
>

I think if XenClient rebase to a new version of Xen we will probably
use the version of
v4v that comes with it and we will not try to rebase the old code on
the newer Xen.

But if you think we should keep it I don't mind.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-06 15:01         ` Jean Guyader
@ 2012-08-06 15:13           ` Ian Campbell
  2012-08-06 15:46             ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Ian Campbell @ 2012-08-06 15:13 UTC (permalink / raw)
  To: Jean Guyader
  Cc: Jean Guyader (3P),
	xen-devel, Ross Philipson, Jan Beulich, Stefano Stabellini

On Mon, 2012-08-06 at 16:01 +0100, Jean Guyader wrote:
> On 6 August 2012 15:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2012-08-06 at 15:46 +0100, Jean Guyader wrote:
> >> On 6 August 2012 09:10, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
> >> >> VIRQ_XC_RESERVED was reserved for V4V but we have switched
> >> >> to event channels so this place holder is no longer required.
> >> >
> >> > I'm fine with this change, but is a future re-use of the value indeed
> >> > not going to cause problems on XenServer (or wherever else this
> >> > is patch set coming from)?
> >> >
> >>
> >> That may need to be confirmed but I don't think XenServer is using v4v
> >> yet
> >
> > I think Jan probably meant XenClient (i.e. that being the place where
> > v4v is already deployed).
> >
> > There's no harm in keeping this # reserved indefinitely, with a suitable
> > comment, I think? The only reason not to would be if this address space
> > was limited, but I don't think that is the case with VIRQs
> >
> >
> 
> I think if XenClient rebase to a new version of Xen we will probably
> use the version of
> v4v that comes with it and we will not try to rebase the old code on
> the newer Xen.

I think Jan's concern was if a current client runs on some future
version of Xen which has reused that VIRQ for something else, some sort
of weirdness would probably ensue?

Probably not as bad for a VIRQ as reusing a hypercall number...


> 
> But if you think we should keep it I don't mind.
> 
> Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED
  2012-08-06 15:13           ` Ian Campbell
@ 2012-08-06 15:46             ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2012-08-06 15:46 UTC (permalink / raw)
  To: Ian Campbell, Jean Guyader
  Cc: xen-devel, Ross Philipson, Jean Guyader (3P), Stefano Stabellini

>>> On 06.08.12 at 17:13, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2012-08-06 at 16:01 +0100, Jean Guyader wrote:
>> On 6 August 2012 15:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Mon, 2012-08-06 at 15:46 +0100, Jean Guyader wrote:
>> >> On 6 August 2012 09:10, Jan Beulich <JBeulich@suse.com> wrote:
>> >> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>> >> >> VIRQ_XC_RESERVED was reserved for V4V but we have switched
>> >> >> to event channels so this place holder is no longer required.
>> >> >
>> >> > I'm fine with this change, but is a future re-use of the value indeed
>> >> > not going to cause problems on XenServer (or wherever else this
>> >> > is patch set coming from)?
>> >> >
>> >>
>> >> That may need to be confirmed but I don't think XenServer is using v4v
>> >> yet
>> >
>> > I think Jan probably meant XenClient (i.e. that being the place where
>> > v4v is already deployed).
>> >
>> > There's no harm in keeping this # reserved indefinitely, with a suitable
>> > comment, I think? The only reason not to would be if this address space
>> > was limited, but I don't think that is the case with VIRQs
>> >
>> >
>> 
>> I think if XenClient rebase to a new version of Xen we will probably
>> use the version of
>> v4v that comes with it and we will not try to rebase the old code on
>> the newer Xen.
> 
> I think Jan's concern was if a current client runs on some future
> version of Xen which has reused that VIRQ for something else, some sort
> of weirdness would probably ensue?

That was exactly the point of my inquiry.

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-06 14:47     ` Jean Guyader
@ 2012-08-09  9:51       ` Tim Deegan
  2012-08-09 10:19         ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2012-08-09  9:51 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Jean Guyader, Jan Beulich, xen-devel

At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
> >
> > Without finally explaining why you need this type in the first place,
> > I'll continue to NAK this patch. (This is made even worse by the fact
> > taht the two inline functions in patch 5 that make use of the type
> > appear to be unused.)
> >
> 
> Understood. I'll switch to use long instead of ssize_t in my
> forthcoming patch serie.

Please use an explicitly 64-bit type - AFAICS you're holding the sum of
some 64-bit length fields.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-03 19:50 ` [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain Jean Guyader
  2012-08-06  8:19   ` Jan Beulich
@ 2012-08-09 10:06   ` Tim Deegan
  2012-08-09 10:23     ` Ian Campbell
  1 sibling, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2012-08-09 10:06 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

At 20:50 +0100 on 03 Aug (1344027053), Jean Guyader wrote:
> 
> Exposes evtchn_alloc_unbound_domain to the rest of
> Xen so we can create allocated unbound evtchn within Xen.
> 
> Signed-off-by: Jean Guyader <jean.guyader@citrix.com>

> @@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>  {
>      struct evtchn *chn;
>      struct domain *d;
> -    int            port;
> +    evtchn_port_t  port;
>      domid_t        dom = alloc->dom;
> -    long           rc;
> +    int            rc;

The function returns long; if you're tidying this up to be an int, might
as well change the return type too.

>  
>      rc = rcu_lock_target_domain_by_id(dom, &d);
>      if ( rc )
>          return rc;
>  
> -    spin_lock(&d->event_lock);
> +    rc = evtchn_alloc_unbound_domain(d, &port);

This moves some of the setting of channel state before the xsm hook.
Also, the state changes lower down in this function are no longer under
the event_lock. :(

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-09  9:51       ` Tim Deegan
@ 2012-08-09 10:19         ` Jean Guyader
  2012-08-09 10:39           ` Jan Beulich
  2012-08-09 10:59           ` Tim Deegan
  0 siblings, 2 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-09 10:19 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Jean Guyader, Jan Beulich, xen-devel

On 9 August 2012 10:51, Tim Deegan <tim@xen.org> wrote:
> At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
>> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
>> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>> >
>> > Without finally explaining why you need this type in the first place,
>> > I'll continue to NAK this patch. (This is made even worse by the fact
>> > taht the two inline functions in patch 5 that make use of the type
>> > appear to be unused.)
>> >
>>
>> Understood. I'll switch to use long instead of ssize_t in my
>> forthcoming patch serie.
>
> Please use an explicitly 64-bit type - AFAICS you're holding the sum of
> some 64-bit length fields.
>

Ok but ssize_t is kind of a funny one. It should accept everything
that size_t can accept + negative values.

The linux kernel is defining ssize_t as int for 32b and long for 64b.
I could do the same then check for the
size of the copy and if it's bigger than MAX_INT|MAX_LONG return -EMSGSIZE.

http://lxr.free-electrons.com/source/include/asm-generic/posix_types.h#L68
(look for __kernel_ssize_t)

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-09 10:06   ` Tim Deegan
@ 2012-08-09 10:23     ` Ian Campbell
  2012-08-09 10:35       ` Tim Deegan
  0 siblings, 1 reply; 50+ messages in thread
From: Ian Campbell @ 2012-08-09 10:23 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Jean Guyader (3P), xen-devel

On Thu, 2012-08-09 at 11:06 +0100, Tim Deegan wrote:
> At 20:50 +0100 on 03 Aug (1344027053), Jean Guyader wrote:
> > 
> > Exposes evtchn_alloc_unbound_domain to the rest of
> > Xen so we can create allocated unbound evtchn within Xen.
> > 
> > Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
> 
> > @@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >  {
> >      struct evtchn *chn;
> >      struct domain *d;
> > -    int            port;
> > +    evtchn_port_t  port;
> >      domid_t        dom = alloc->dom;
> > -    long           rc;
> > +    int            rc;
> 
> The function returns long; if you're tidying this up to be an int, might
> as well change the return type too.

I'm not sure if this is relevant but Jan just sent a patch to "make all
(native) hypercalls consistently have "long" return type". I
think/suspect this rc here turns into the result of the hypercall?

Jan's patch was motivated by something to do with sign extension when a
hypercall's int return is written to the long in the multicall arg
struct which causes strangeness. Perhaps not totally relevant to
evtchn_alloc which is unlikely to be in a MC.

Ian.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-09 10:23     ` Ian Campbell
@ 2012-08-09 10:35       ` Tim Deegan
  2012-08-09 10:40         ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2012-08-09 10:35 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jean Guyader (3P), xen-devel

At 11:23 +0100 on 09 Aug (1344511426), Ian Campbell wrote:
> On Thu, 2012-08-09 at 11:06 +0100, Tim Deegan wrote:
> > At 20:50 +0100 on 03 Aug (1344027053), Jean Guyader wrote:
> > > 
> > > Exposes evtchn_alloc_unbound_domain to the rest of
> > > Xen so we can create allocated unbound evtchn within Xen.
> > > 
> > > Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
> > 
> > > @@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > >  {
> > >      struct evtchn *chn;
> > >      struct domain *d;
> > > -    int            port;
> > > +    evtchn_port_t  port;
> > >      domid_t        dom = alloc->dom;
> > > -    long           rc;
> > > +    int            rc;
> > 
> > The function returns long; if you're tidying this up to be an int, might
> > as well change the return type too.
> 
> I'm not sure if this is relevant but Jan just sent a patch to "make all
> (native) hypercalls consistently have "long" return type". I
> think/suspect this rc here turns into the result of the hypercall?
> 
> Jan's patch was motivated by something to do with sign extension when a
> hypercall's int return is written to the long in the multicall arg
> struct which causes strangeness. Perhaps not totally relevant to
> evtchn_alloc which is unlikely to be in a MC.

Yes, this eventually ends up in a hypercall handler, but s/long/int/
here doesn't cause problems because
 - rc is only ever set to an 'int' value here so we can't lose data
   from the type being too narrow; and
 - Those int values get cast up to long (either in here or in the
   caller) directly, which will sign-extend the.

It really doesn't matter whether this function returns an int or a long,
but it's a bit untidy to change it half-way. 

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-03 19:50 ` [PATCH 5/5] xen: Add V4V implementation Jean Guyader
  2012-08-06  8:45   ` Jan Beulich
@ 2012-08-09 10:38   ` Tim Deegan
  2012-08-10 16:51     ` Jean Guyader
  1 sibling, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2012-08-09 10:38 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

Hi,

This looks pretty good; I think you've addressed almost all my comments
except for one, which is really a design decision raether than an
implementation one.  As I said last time: 

] And what about protocol?  Protocol seems to have ended up as a bit of a
] second-class citizen in v4v; it's defined, and indeed required, but not
] used for routing or for acccess control, so all traffic to a given port
] _on every protocol_ ends up on the same ring. 
] 
] This is the inverse of the TCP/IP namespace that you're copying, where
] protocol demux happens before port demux.  And I think it will bite
] someone if you ever, for example, want to send ICMP or GRE over a v4v
] channel.

I see Jan has some comments on the detail; all I have to add is this:

At 20:50 +0100 on 03 Aug (1344027054), Jean Guyader wrote:
> +
> +
> +struct list_head viprules = LIST_HEAD_INIT(viprules);
> +static DEFINE_RWLOCK(viprules_lock);

Please add comments describing this lock and where it comes in the
locking order relative to the locks below -- it looks like it's always
taken after any other locks but please make it clear.

> +/*
> + * Structure definitions
> + */
> +
> +#define V4V_RING_MAGIC          0xA822f72bb0b9d8cc
> +#define V4V_RING_DATA_MAGIC	0x45fe852220b801d4

Thanks for getting rid of the #ifdefs here, but you need to keep the
'ULL' suffix so this will compile properly on 32-bit.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-09 10:19         ` Jean Guyader
@ 2012-08-09 10:39           ` Jan Beulich
  2012-08-09 10:48             ` Jean Guyader
  2012-08-09 10:59           ` Tim Deegan
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2012-08-09 10:39 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Tim Deegan, Jean Guyader, xen-devel

>>> On 09.08.12 at 12:19, Jean Guyader <jean.guyader@gmail.com> wrote:
> On 9 August 2012 10:51, Tim Deegan <tim@xen.org> wrote:
>> At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
>>> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
>>> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>> >
>>> > Without finally explaining why you need this type in the first place,
>>> > I'll continue to NAK this patch. (This is made even worse by the fact
>>> > taht the two inline functions in patch 5 that make use of the type
>>> > appear to be unused.)
>>> >
>>>
>>> Understood. I'll switch to use long instead of ssize_t in my
>>> forthcoming patch serie.
>>
>> Please use an explicitly 64-bit type - AFAICS you're holding the sum of
>> some 64-bit length fields.
>>
> 
> Ok but ssize_t is kind of a funny one. It should accept everything
> that size_t can accept + negative values.

No. It's the same relation as between e.g. "signed int" and
"unsigned int". Value preserving conversions are only guaranteed
for non-negative values fitting both types.

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-09 10:35       ` Tim Deegan
@ 2012-08-09 10:40         ` Jean Guyader
  2012-08-09 23:25           ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-09 10:40 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Ian Campbell, Jean Guyader (3P), xen-devel

On 9 August 2012 11:35, Tim Deegan <tim@xen.org> wrote:
> At 11:23 +0100 on 09 Aug (1344511426), Ian Campbell wrote:
>> On Thu, 2012-08-09 at 11:06 +0100, Tim Deegan wrote:
>> > At 20:50 +0100 on 03 Aug (1344027053), Jean Guyader wrote:
>> > >
>> > > Exposes evtchn_alloc_unbound_domain to the rest of
>> > > Xen so we can create allocated unbound evtchn within Xen.
>> > >
>> > > Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
>> >
>> > > @@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>> > >  {
>> > >      struct evtchn *chn;
>> > >      struct domain *d;
>> > > -    int            port;
>> > > +    evtchn_port_t  port;
>> > >      domid_t        dom = alloc->dom;
>> > > -    long           rc;
>> > > +    int            rc;
>> >
>> > The function returns long; if you're tidying this up to be an int, might
>> > as well change the return type too.
>>
>> I'm not sure if this is relevant but Jan just sent a patch to "make all
>> (native) hypercalls consistently have "long" return type". I
>> think/suspect this rc here turns into the result of the hypercall?
>>
>> Jan's patch was motivated by something to do with sign extension when a
>> hypercall's int return is written to the long in the multicall arg
>> struct which causes strangeness. Perhaps not totally relevant to
>> evtchn_alloc which is unlikely to be in a MC.
>
> Yes, this eventually ends up in a hypercall handler, but s/long/int/
> here doesn't cause problems because
>  - rc is only ever set to an 'int' value here so we can't lose data
>    from the type being too narrow; and
>  - Those int values get cast up to long (either in here or in the
>    caller) directly, which will sign-extend the.
>
> It really doesn't matter whether this function returns an int or a long,
> but it's a bit untidy to change it half-way.
>

The main reason why I changed it only base ERROR_EXIT_DOM expects an int based
on the format string. I guess I could cast the long in int for the
call to ERROR_EXIT_DOM
but that doesn't really look nice either.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-09 10:39           ` Jan Beulich
@ 2012-08-09 10:48             ` Jean Guyader
  2012-08-09 13:02               ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-09 10:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim Deegan, Jean Guyader, xen-devel

On 9 August 2012 11:39, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 09.08.12 at 12:19, Jean Guyader <jean.guyader@gmail.com> wrote:
>> On 9 August 2012 10:51, Tim Deegan <tim@xen.org> wrote:
>>> At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
>>>> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
>>>> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>>> >
>>>> > Without finally explaining why you need this type in the first place,
>>>> > I'll continue to NAK this patch. (This is made even worse by the fact
>>>> > taht the two inline functions in patch 5 that make use of the type
>>>> > appear to be unused.)
>>>> >
>>>>
>>>> Understood. I'll switch to use long instead of ssize_t in my
>>>> forthcoming patch serie.
>>>
>>> Please use an explicitly 64-bit type - AFAICS you're holding the sum of
>>> some 64-bit length fields.
>>>
>>
>> Ok but ssize_t is kind of a funny one. It should accept everything
>> that size_t can accept + negative values.
>
> No. It's the same relation as between e.g. "signed int" and
> "unsigned int". Value preserving conversions are only guaranteed
> for non-negative values fitting both types.
>

ssize_t is a *signed* type, I was wrong by saying that it should
accept all the range of a size_t, it allows only
a subset of it. read/write used ssize_t as a return type.

>From man 2 read:
If count is zero, read() returns zero and has no other results. If
count is greater than SSIZE_MAX, the result is unspecified.

Would it be ok to claim the same thing here? i.e. if count > INT_MAX
the result is unspecified.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-09 10:19         ` Jean Guyader
  2012-08-09 10:39           ` Jan Beulich
@ 2012-08-09 10:59           ` Tim Deegan
  2012-08-09 11:08             ` Jean Guyader
  1 sibling, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2012-08-09 10:59 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Jan Beulich, Jean Guyader, xen-devel

At 11:19 +0100 on 09 Aug (1344511142), Jean Guyader wrote:
> On 9 August 2012 10:51, Tim Deegan <tim@xen.org> wrote:
> > At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
> >> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
> >> >
> >> > Without finally explaining why you need this type in the first place,
> >> > I'll continue to NAK this patch. (This is made even worse by the fact
> >> > taht the two inline functions in patch 5 that make use of the type
> >> > appear to be unused.)
> >> >
> >>
> >> Understood. I'll switch to use long instead of ssize_t in my
> >> forthcoming patch serie.
> >
> > Please use an explicitly 64-bit type - AFAICS you're holding the sum of
> > some 64-bit length fields.
> >
> 
> Ok but ssize_t is kind of a funny one. It should accept everything
> that size_t can accept + negative values.

Given that you start with user-supplied 64-bit iov lengths, there is no
such type. :)

And now that I look at it, your handling of sizes is pretty confused.
v4v_ringbuf_insertv() returns a ssize_t, but calculates it internally as
an int32_t, which is _smaller_ than the size_t it starts with (which is
the sum of some guest-supplied uint64_ts).  And then v4v_sendv() puts
that ssize_t into an int again before returning it as a size_t, which
d_v4v_op() casts as a long to give to the guest.  Whee!

Can you please make that lot consistent, and then audit the whole path
from user-supplied iovec lists to actual copies and make sure that there
are no overflows?

I think that explicitly limiting your sum-of-iovec-lengths to 31 bits
would be perfectly reasonable (1 hypercall per 2GB copy), and would
avoid a lot of pain here.  As long as it was documented in the
interface, of course!

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-09 10:59           ` Tim Deegan
@ 2012-08-09 11:08             ` Jean Guyader
  0 siblings, 0 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-09 11:08 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Jan Beulich, Jean Guyader, xen-devel

On 9 August 2012 11:59, Tim Deegan <tim@xen.org> wrote:
> At 11:19 +0100 on 09 Aug (1344511142), Jean Guyader wrote:
>> On 9 August 2012 10:51, Tim Deegan <tim@xen.org> wrote:
>> > At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
>> >> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
>> >> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>> >> >
>> >> > Without finally explaining why you need this type in the first place,
>> >> > I'll continue to NAK this patch. (This is made even worse by the fact
>> >> > taht the two inline functions in patch 5 that make use of the type
>> >> > appear to be unused.)
>> >> >
>> >>
>> >> Understood. I'll switch to use long instead of ssize_t in my
>> >> forthcoming patch serie.
>> >
>> > Please use an explicitly 64-bit type - AFAICS you're holding the sum of
>> > some 64-bit length fields.
>> >
>>
>> Ok but ssize_t is kind of a funny one. It should accept everything
>> that size_t can accept + negative values.
>
> Given that you start with user-supplied 64-bit iov lengths, there is no
> such type. :)
>
> And now that I look at it, your handling of sizes is pretty confused.
> v4v_ringbuf_insertv() returns a ssize_t, but calculates it internally as
> an int32_t, which is _smaller_ than the size_t it starts with (which is
> the sum of some guest-supplied uint64_ts).  And then v4v_sendv() puts
> that ssize_t into an int again before returning it as a size_t, which
> d_v4v_op() casts as a long to give to the guest.  Whee!
>
> Can you please make that lot consistent, and then audit the whole path
> from user-supplied iovec lists to actual copies and make sure that there
> are no overflows?
>
> I think that explicitly limiting your sum-of-iovec-lengths to 31 bits
> would be perfectly reasonable (1 hypercall per 2GB copy), and would
> avoid a lot of pain here.  As long as it was documented in the
> interface, of course!
>

Ok fine that works for me. I'll do that for the next version.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/5] xen: add ssize_t
  2012-08-09 10:48             ` Jean Guyader
@ 2012-08-09 13:02               ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2012-08-09 13:02 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Tim Deegan, Jean Guyader, xen-devel

>>> On 09.08.12 at 12:48, Jean Guyader <jean.guyader@gmail.com> wrote:
> On 9 August 2012 11:39, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 09.08.12 at 12:19, Jean Guyader <jean.guyader@gmail.com> wrote:
>>> On 9 August 2012 10:51, Tim Deegan <tim@xen.org> wrote:
>>>> At 15:47 +0100 on 06 Aug (1344268059), Jean Guyader wrote:
>>>>> On 6 August 2012 09:08, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> >>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>>>> >
>>>>> > Without finally explaining why you need this type in the first place,
>>>>> > I'll continue to NAK this patch. (This is made even worse by the fact
>>>>> > taht the two inline functions in patch 5 that make use of the type
>>>>> > appear to be unused.)
>>>>> >
>>>>>
>>>>> Understood. I'll switch to use long instead of ssize_t in my
>>>>> forthcoming patch serie.
>>>>
>>>> Please use an explicitly 64-bit type - AFAICS you're holding the sum of
>>>> some 64-bit length fields.
>>>>
>>>
>>> Ok but ssize_t is kind of a funny one. It should accept everything
>>> that size_t can accept + negative values.
>>
>> No. It's the same relation as between e.g. "signed int" and
>> "unsigned int". Value preserving conversions are only guaranteed
>> for non-negative values fitting both types.
>>
> 
> ssize_t is a *signed* type, I was wrong by saying that it should
> accept all the range of a size_t, it allows only
> a subset of it. read/write used ssize_t as a return type.
> 
> From man 2 read:
> If count is zero, read() returns zero and has no other results. If
> count is greater than SSIZE_MAX, the result is unspecified.
> 
> Would it be ok to claim the same thing here? i.e. if count > INT_MAX
> the result is unspecified.

Sure, except that I think you wanted to use long and hence
LONG_MAX.

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-09 10:40         ` Jean Guyader
@ 2012-08-09 23:25           ` Jean Guyader
  2012-08-10  7:35             ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-09 23:25 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Tim (Xen.org), Ian Campbell, xen-devel

[-- Attachment #1: Type: text/plain, Size: 2331 bytes --]

On 09/08 11:40, Jean Guyader wrote:
> On 9 August 2012 11:35, Tim Deegan <tim@xen.org> wrote:
> > At 11:23 +0100 on 09 Aug (1344511426), Ian Campbell wrote:
> >> On Thu, 2012-08-09 at 11:06 +0100, Tim Deegan wrote:
> >> > At 20:50 +0100 on 03 Aug (1344027053), Jean Guyader wrote:
> >> > >
> >> > > Exposes evtchn_alloc_unbound_domain to the rest of
> >> > > Xen so we can create allocated unbound evtchn within Xen.
> >> > >
> >> > > Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
> >> >
> >> > > @@ -161,18 +163,18 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >> > >  {
> >> > >      struct evtchn *chn;
> >> > >      struct domain *d;
> >> > > -    int            port;
> >> > > +    evtchn_port_t  port;
> >> > >      domid_t        dom = alloc->dom;
> >> > > -    long           rc;
> >> > > +    int            rc;
> >> >
> >> > The function returns long; if you're tidying this up to be an int, might
> >> > as well change the return type too.
> >>
> >> I'm not sure if this is relevant but Jan just sent a patch to "make all
> >> (native) hypercalls consistently have "long" return type". I
> >> think/suspect this rc here turns into the result of the hypercall?
> >>
> >> Jan's patch was motivated by something to do with sign extension when a
> >> hypercall's int return is written to the long in the multicall arg
> >> struct which causes strangeness. Perhaps not totally relevant to
> >> evtchn_alloc which is unlikely to be in a MC.
> >
> > Yes, this eventually ends up in a hypercall handler, but s/long/int/
> > here doesn't cause problems because
> >  - rc is only ever set to an 'int' value here so we can't lose data
> >    from the type being too narrow; and
> >  - Those int values get cast up to long (either in here or in the
> >    caller) directly, which will sign-extend the.
> >
> > It really doesn't matter whether this function returns an int or a long,
> > but it's a bit untidy to change it half-way.
> >
> 
> The main reason why I changed it only base ERROR_EXIT_DOM expects an int based
> on the format string. I guess I could cast the long in int for the
> call to ERROR_EXIT_DOM
> but that doesn't really look nice either.
> 

Hi,

Here is a new version that should address the comments from Tim and Jan.

Signed-off-by: Jean Guyader <jean.guyader@citrix.com>

Jean

[-- Attachment #2: evtchn_alloc_unbound_domain.patch --]
[-- Type: text/x-diff, Size: 2881 bytes --]

commit c43dbcee9c4e9d65520f9a562b39e8e6455efc36
Author: Jean Guyader <jean.guyader@citrix.com>
Date:   Thu Aug 2 16:19:23 2012 +0100

    xen: events, exposes evtchn_alloc_unbound_domain
    
    Exposes evtchn_alloc_unbound_domain to the rest of
    Xen so we can create allocated unbound evtchn within Xen.

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 53777f8..880395e 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -159,9 +159,8 @@ static int get_free_port(struct domain *d)
 
 static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
 {
-    struct evtchn *chn;
     struct domain *d;
-    int            port;
+    evtchn_port_t  port;
     domid_t        dom = alloc->dom;
     long           rc;
 
@@ -169,26 +168,47 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
     if ( rc )
         return rc;
 
+    rc = evtchn_alloc_unbound_domain(d, &port,
+            alloc->remote_dom == DOMID_SELF ? current->domain->domain_id
+                                            : alloc->remote_dom);
+    if ( rc )
+        ERROR_EXIT_DOM((int)rc, d);
+
+    alloc->port = port;
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port,
+                                domid_t remote_domid)
+{
+    struct evtchn *chn;
+    int           rc;
+    int           free_port;
+
     spin_lock(&d->event_lock);
 
-    if ( (port = get_free_port(d)) < 0 )
-        ERROR_EXIT_DOM(port, d);
-    chn = evtchn_from_port(d, port);
+    rc = free_port = get_free_port(d);
+    if ( free_port < 0 )
+        goto out;
 
-    rc = xsm_evtchn_unbound(d, chn, alloc->remote_dom);
+    chn = evtchn_from_port(d, free_port);
+    rc = xsm_evtchn_unbound(d, chn, remote_domid);
     if ( rc )
         goto out;
 
     chn->state = ECS_UNBOUND;
-    if ( (chn->u.unbound.remote_domid = alloc->remote_dom) == DOMID_SELF )
-        chn->u.unbound.remote_domid = current->domain->domain_id;
+    chn->u.unbound.remote_domid = remote_domid;
 
-    alloc->port = port;
+    *port = free_port;
+    /* Everything is fine, returns 0 */
+    rc = 0;
 
  out:
     spin_unlock(&d->event_lock);
-    rcu_unlock_domain(d);
-
     return rc;
 }
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 71c3e92..1a0c832 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -69,6 +69,9 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq);
 /* Notify remote end of a Xen-attached event channel.*/
 void notify_via_xen_event_channel(struct domain *ld, int lport);
 
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port,
+                                domid_t remote_domid);
+
 /* Internal event channel object accessors */
 #define bucket_from_port(d,p) \
     ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET])

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-09 23:25           ` Jean Guyader
@ 2012-08-10  7:35             ` Jan Beulich
  2012-08-10  7:51               ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2012-08-10  7:35 UTC (permalink / raw)
  To: Jean Guyader, Jean Guyader; +Cc: Tim (Xen.org), Ian Campbell, xen-devel

>>> On 10.08.12 at 01:25, Jean Guyader <jean.guyader@citrix.com> wrote:
>--- a/xen/common/event_channel.c
>+++ b/xen/common/event_channel.c
>@@ -159,9 +159,8 @@ static int get_free_port(struct domain *d)
> 
> static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>-    struct evtchn *chn;
>     struct domain *d;
>-    int            port;
>+    evtchn_port_t  port;
>     domid_t        dom = alloc->dom;
>     long           rc;
> 
>@@ -169,26 +168,47 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>     if ( rc )
>         return rc;
> 
>+    rc = evtchn_alloc_unbound_domain(d, &port,

Any reason you can't pass &alloc->port here directly?

>+            alloc->remote_dom == DOMID_SELF ? current->domain->domain_id
>+                                            : alloc->remote_dom);

Any reason this can't/shouldn't be done in the called function?

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-10  7:35             ` Jan Beulich
@ 2012-08-10  7:51               ` Jean Guyader
  2012-08-10  7:57                 ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-10  7:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim (Xen.org), Ian Campbell, Jean Guyader, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1097 bytes --]

On 10/08 08:35, Jan Beulich wrote:
> >>> On 10.08.12 at 01:25, Jean Guyader <jean.guyader@citrix.com> wrote:
> >--- a/xen/common/event_channel.c
> >+++ b/xen/common/event_channel.c
> >@@ -159,9 +159,8 @@ static int get_free_port(struct domain *d)
> > 
> > static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > {
> >-    struct evtchn *chn;
> >     struct domain *d;
> >-    int            port;
> >+    evtchn_port_t  port;
> >     domid_t        dom = alloc->dom;
> >     long           rc;
> > 
> >@@ -169,26 +168,47 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >     if ( rc )
> >         return rc;
> > 
> >+    rc = evtchn_alloc_unbound_domain(d, &port,
> 
> Any reason you can't pass &alloc->port here directly?
> 
> >+            alloc->remote_dom == DOMID_SELF ? current->domain->domain_id
> >+                                            : alloc->remote_dom);
> 
> Any reason this can't/shouldn't be done in the called function?
> 

No specific reason for both of those thing. Here is a new version based
on your comments.

Thanks for reviewing,
Jean

[-- Attachment #2: evtchn_alloc_unbound_domain.patch --]
[-- Type: text/x-diff, Size: 2808 bytes --]

commit 208384d74852df9ae26294236d79e33967a75afa
Author: Jean Guyader <jean.guyader@citrix.com>
Date:   Thu Aug 2 16:19:23 2012 +0100

    xen: events, exposes evtchn_alloc_unbound_domain
    
    Exposes evtchn_alloc_unbound_domain to the rest of
    Xen so we can create allocated unbound evtchn within Xen.

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 53777f8..fd626bf 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -159,36 +159,53 @@ static int get_free_port(struct domain *d)
 
 static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
 {
-    struct evtchn *chn;
     struct domain *d;
-    int            port;
-    domid_t        dom = alloc->dom;
     long           rc;
 
-    rc = rcu_lock_target_domain_by_id(dom, &d);
+    rc = rcu_lock_target_domain_by_id(alloc->dom, &d);
     if ( rc )
         return rc;
 
+    rc = evtchn_alloc_unbound_domain(d, &alloc->port, alloc->remote_dom);
+    if ( rc )
+        ERROR_EXIT_DOM((int)rc, d);
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port,
+                                domid_t remote_domid)
+{
+    struct evtchn *chn;
+    int           rc;
+    int           free_port;
+
     spin_lock(&d->event_lock);
 
-    if ( (port = get_free_port(d)) < 0 )
-        ERROR_EXIT_DOM(port, d);
-    chn = evtchn_from_port(d, port);
+    rc = free_port = get_free_port(d);
+    if ( free_port < 0 )
+        goto out;
 
-    rc = xsm_evtchn_unbound(d, chn, alloc->remote_dom);
+    chn = evtchn_from_port(d, free_port);
+    rc = xsm_evtchn_unbound(d, chn, remote_domid);
     if ( rc )
         goto out;
 
     chn->state = ECS_UNBOUND;
-    if ( (chn->u.unbound.remote_domid = alloc->remote_dom) == DOMID_SELF )
+    if ( (chn->u.unbound.remote_domid = remote_domid) == DOMID_SELF )
         chn->u.unbound.remote_domid = current->domain->domain_id;
 
-    alloc->port = port;
+    chn->u.unbound.remote_domid = remote_domid;
+
+    *port = free_port;
+    /* Everything is fine, returns 0 */
+    rc = 0;
 
  out:
     spin_unlock(&d->event_lock);
-    rcu_unlock_domain(d);
-
     return rc;
 }
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 71c3e92..1a0c832 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -69,6 +69,9 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq);
 /* Notify remote end of a Xen-attached event channel.*/
 void notify_via_xen_event_channel(struct domain *ld, int lport);
 
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port,
+                                domid_t remote_domid);
+
 /* Internal event channel object accessors */
 #define bucket_from_port(d,p) \
     ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET])

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-10  7:51               ` Jean Guyader
@ 2012-08-10  7:57                 ` Jan Beulich
  2012-08-23 12:03                   ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2012-08-10  7:57 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Tim (Xen.org), Ian Campbell, Jean Guyader, xen-devel

>>> On 10.08.12 at 09:51, Jean Guyader <jean.guyader@citrix.com> wrote:
> No specific reason for both of those thing. Here is a new version based
> on your comments.

Reviewed-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-09 10:38   ` [PATCH 5/5] xen: Add V4V implementation Tim Deegan
@ 2012-08-10 16:51     ` Jean Guyader
  2012-08-13  9:38       ` Tim Deegan
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-10 16:51 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

On 09/08 11:38, Tim Deegan wrote:
> Hi,
> 
> This looks pretty good; I think you've addressed almost all my comments
> except for one, which is really a design decision raether than an
> implementation one.  As I said last time: 
> 
> ] And what about protocol?  Protocol seems to have ended up as a bit of a
> ] second-class citizen in v4v; it's defined, and indeed required, but not
> ] used for routing or for acccess control, so all traffic to a given port
> ] _on every protocol_ ends up on the same ring. 
> ] 
> ] This is the inverse of the TCP/IP namespace that you're copying, where
> ] protocol demux happens before port demux.  And I think it will bite
> ] someone if you ever, for example, want to send ICMP or GRE over a v4v
> ] channel.
> 

The protocol field is used to inform about the type a message on the ring.

Right now we use two protocols in our linux driver: V4V_PROTO_DGRAM and
V4V_PROTO_STREAM. In the future that could probably be extended to new protocol
like V4V_PROTO_ICMP for instance.

The demultiplexing will happens at the other end, the driver can look at the
message and decide what to do with it based on the protocol field.

Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-10 16:51     ` Jean Guyader
@ 2012-08-13  9:38       ` Tim Deegan
  2012-08-13 12:43         ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2012-08-13  9:38 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel

At 17:51 +0100 on 10 Aug (1344621069), Jean Guyader wrote:
> On 09/08 11:38, Tim Deegan wrote:
> > Hi,
> > 
> > This looks pretty good; I think you've addressed almost all my comments
> > except for one, which is really a design decision raether than an
> > implementation one.  As I said last time: 
> > 
> > ] And what about protocol?  Protocol seems to have ended up as a bit of a
> > ] second-class citizen in v4v; it's defined, and indeed required, but not
> > ] used for routing or for acccess control, so all traffic to a given port
> > ] _on every protocol_ ends up on the same ring. 
> > ] 
> > ] This is the inverse of the TCP/IP namespace that you're copying, where
> > ] protocol demux happens before port demux.  And I think it will bite
> > ] someone if you ever, for example, want to send ICMP or GRE over a v4v
> > ] channel.
> > 
> 
> The protocol field is used to inform about the type a message on the ring.
> 
> Right now we use two protocols in our linux driver: V4V_PROTO_DGRAM and
> V4V_PROTO_STREAM. In the future that could probably be extended to new protocol
> like V4V_PROTO_ICMP for instance.
> 
> The demultiplexing will happens at the other end, the driver can look at the
> message and decide what to do with it based on the protocol field.

Yes, I understand all that - what I'm saying is that it seems like a
design flaw to me.  The namespace in V4V, as proposed, looks like this:

 Protocol
 Port
 Domain

and it would be more sensible to do (like the IP stack):

 Port
 Protocol
 Domain.

Or at the very least the protocol should be made part of the endpoint
address, and not just part of the packet header.  As it stands:

 - The handlers for port X in _all_ protocols _have_ to share a
   ring.  That seems kind of plausible because the IANA port assignments
   never give the same port number to different services on TCP and UDP,
   but will it make sense for every new protocol?  Is it sensible to
   require, say, an L2TP service to make its connection IDs not clash
   with V4V_PROTO_DGRAM and V4V_PROTO_STREAM users?

   It may not even make sense in existing protocols.  It's common enough
   for DNS servers to use different ACLs (and indeed different servers)
   for TCP and UDP.

 - Relatedly, every protocol _has_ to have port numbers.  How would you
   register an ICMP listener, for example?  You'd have to do something
   gross like declare a particular port to be the ICMP port so that you
   could demux it, or indeed send it in the first place.

You say:

> The demultiplexing will happens at the other end, the driver can look at the
> message and decide what to do with it based on the protocol field.

I'm willing to accept that argument, but only if we extend it to ports
too, get rid of all the namespace and ACL code in Xen and leave each
domain with a single RX ring that the (single) guest driver must demux. :P

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-13  9:38       ` Tim Deegan
@ 2012-08-13 12:43         ` Jean Guyader
  2012-08-16 12:32           ` Tim Deegan
  0 siblings, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-13 12:43 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Jean Guyader, xen-devel

On 13 August 2012 10:38, Tim Deegan <tim@xen.org> wrote:
> At 17:51 +0100 on 10 Aug (1344621069), Jean Guyader wrote:
>> On 09/08 11:38, Tim Deegan wrote:
>> > Hi,
>> >
>> > This looks pretty good; I think you've addressed almost all my comments
>> > except for one, which is really a design decision raether than an
>> > implementation one.  As I said last time:
>> >
>> > ] And what about protocol?  Protocol seems to have ended up as a bit of a
>> > ] second-class citizen in v4v; it's defined, and indeed required, but not
>> > ] used for routing or for acccess control, so all traffic to a given port
>> > ] _on every protocol_ ends up on the same ring.
>> > ]
>> > ] This is the inverse of the TCP/IP namespace that you're copying, where
>> > ] protocol demux happens before port demux.  And I think it will bite
>> > ] someone if you ever, for example, want to send ICMP or GRE over a v4v
>> > ] channel.
>> >
>>
>> The protocol field is used to inform about the type a message on the ring.
>>
>> Right now we use two protocols in our linux driver: V4V_PROTO_DGRAM and
>> V4V_PROTO_STREAM. In the future that could probably be extended to new protocol
>> like V4V_PROTO_ICMP for instance.
>>
>> The demultiplexing will happens at the other end, the driver can look at the
>> message and decide what to do with it based on the protocol field.
>
> Yes, I understand all that - what I'm saying is that it seems like a
> design flaw to me.  The namespace in V4V, as proposed, looks like this:
>
>  Protocol
>  Port
>  Domain
>

Protocol isn't part of the namespace - I think that's
where the confusion arises. The namespace is exclusively
Port/Domain. Protocol is there to describe the content of
_a particular message_ in the ring. It is included in the
hypercalls (rather than, say, being the first n bytes of
all messages) to force all senders to use it.

The other key point here is confusion as to what V4V is.
V4V is _not_ a tcp or udp clone. V4V exists to provide
a mechanism to transfer messages (which are arbitary
strings of bytes, labeled with a "protocol") between
endpoitns (labeled by a domain and port).

An example use of this facility uses two v4v rings to
implement something which looks quite like TCP. In this
case to distinguish TCP-a-like messages from other
such messages that may or may not be present on the ring,
the messages are labeled with a protocol field that
indicates what upper layer should handle these messages.

One can easily immagine circumstances where messages of
many different protocols are present on the ring. An
obvious example might be a message type which implemented
some sort of flow control, that quenced or started
transmission on a partner ring, regardless of what other
messages were being sent on the rings.

> and it would be more sensible to do (like the IP stack):
>
>  Port
>  Protocol
>  Domain.
>
> Or at the very least the protocol should be made part of the endpoint
> address, and not just part of the packet header.  As it stands:
>
>  - The handlers for port X in _all_ protocols _have_ to share a
>    ring.  That seems kind of plausible because the IANA port assignments
>    never give the same port number to different services on TCP and UDP,
>    but will it make sense for every new protocol?  Is it sensible to
>    require, say, an L2TP service to make its connection IDs not clash
>    with V4V_PROTO_DGRAM and V4V_PROTO_STREAM users?
>
>    It may not even make sense in existing protocols.  It's common enough
>    for DNS servers to use different ACLs (and indeed different servers)
>    for TCP and UDP.
>

V4V isn't IP, but you raise a valid point. We should
define two ranges of 16 bit V4V port numbers one which
is "well known" for the use of TCP encapsulation, and
one which is "well known" for the use of UDP
encapsulation.  That then makes the concept that you
think that protocol should be, be part of the
endpoint address, and it leaves the thing I misleadingly
called "protocol" free to be the message type label.

>  - Relatedly, every protocol _has_ to have port numbers.  How would you
>    register an ICMP listener, for example?  You'd have to do something
>    gross like declare a particular port to be the ICMP port so that you
>    could demux it, or indeed send it in the first place.
>
> You say:
>
>> The demultiplexing will happens at the other end, the driver can look at the
>> message and decide what to do with it based on the protocol field.
>
> I'm willing to accept that argument, but only if we extend it to ports
> too, get rid of all the namespace and ACL code in Xen and leave each
> domain with a single RX ring that the (single) guest driver must demux. :P
>

Cheers,
Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-13 12:43         ` Jean Guyader
@ 2012-08-16 12:32           ` Tim Deegan
  0 siblings, 0 replies; 50+ messages in thread
From: Tim Deegan @ 2012-08-16 12:32 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Jean Guyader, xen-devel

At 13:43 +0100 on 13 Aug (1344865403), Jean Guyader wrote:
> Protocol isn't part of the namespace - I think that's
> where the confusion arises. The namespace is exclusively
> Port/Domain. Protocol is there to describe the content of
> _a particular message_ in the ring.

OK.  

In that case, I think the hypervisor shouldn't handle it at all.  That
can be done in the client drivers, and the V4V_PROTO definitions and
maybe packet format stuff can go in documentation.

> It is included in the hypercalls (rather than, say, being the first n
> bytes of all messages) to force all senders to use it.

I don't think that's very useful.  It just forces any V4V user who
doesn't need multiple protocols to make up a number for form's sake, and
since Xen doesn't do any checking on the field, it doesn't protect the
receiver from anything.

I guess what I'm saying is, either 'protocol' (or whatever name you give
it) is part of the v4v addressing, in which case it should be treated
properly and demuxed before port, or it's not, in which case it
needn't be in the interface at all.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-06  8:45   ` Jan Beulich
@ 2012-08-23 11:57     ` Jean Guyader
  2012-08-24 20:06       ` Jan Beulich
  2012-09-01 20:56     ` Jean Guyader
  1 sibling, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-08-23 11:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jean Guyader, xen-devel

On 6 August 2012 09:45, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>--- /dev/null
>>+++ b/xen/include/public/v4v.h
>>...
>>+#define V4V_DOMID_ANY           0x7fffU
>
> I think I asked this before - why not use one of the pre-existing
> DOMID values? And if there is a good reason, it should be said
> here in a comment, to avoid the same question being asked
> later again.
>
>>...
>>+typedef uint64_t v4v_pfn_t;
>
> We already have xen_pfn_t, so why do you need yet another
> flavor?
>
>>...
>>+struct v4v_info
>>+{
>>+    uint64_t ring_magic;
>>+    uint64_t data_magic;
>>+    evtchn_port_t evtchn;
>
> Missing padding at the end?
>
>>+};
>>+typedef struct v4v_info v4v_info_t;
>>+
>>+#define V4V_ROUNDUP(a) (((a) +0xf ) & ~0xf)
>
> Doesn't seem to belong here. Or is the subsequent comment
> actually related to this (in which case it should be moved ahead
> of the definition and made match it).
>
>>+/*
>>+ * Messages on the ring are padded to 128 bits
>>+ * Len here refers to the exact length of the data not including the
>>+ * 128 bit header. The message uses
>>+ * ((len +0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
>>+ */
>>...
>>+/*
>>+ * HYPERCALLS
>>+ */
>>...
>
> In the block below here, please get the naming (do_v4v_op()
> vs v4v_hypercall()) and the use of newlines (either always one
> or always two between individual hypercall descriptions)
> consistent. Hmm, even the descriptions don't seem to always
> match the definitions (not really obvious because apparently
> again the descriptions follow the definitions, whereas the
> opposite is the usual way to arrange things).
>
>>--- /dev/null
>>+++ b/xen/include/xen/v4v_utils.h
>>...
>>+/* Compiler specific hacks */
>>+#if defined(__GNUC__)
>>+# define V4V_UNUSED __attribute__ ((unused))
>>+# ifndef __STRICT_ANSI__
>>+#  define V4V_INLINE inline
>>+# else
>>+#  define V4V_INLINE
>>+# endif
>>+#else /* !__GNUC__ */
>>+# define V4V_UNUSED
>>+# define V4V_INLINE
>>+#endif
>
> This suggests the header is really intended to be public?
>
>>...
>>+static V4V_INLINE uint32_t
>>+v4v_ring_bytes_to_read (volatile struct v4v_ring *r)
>
> No space between function name and opening parenthesis
> (throughout this file).
>
>>...
>>+V4V_UNUSED static V4V_INLINE ssize_t
>
> V4V_UNUSED? Doesn't make sense in conjunction with
> V4V_INLINE, at least as long as you're using GNU extensions
> anyway (see above as to the disposition of the header).
>
>>+v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol,
>>+              void *_buf, size_t t, int consume)
>
> Dead functions shouldn't be placed here.
>
>>...
>>+static ssize_t
>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>+                     size_t skip) V4V_UNUSED;
>>+
>>+V4V_INLINE static ssize_t
>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>+                     size_t skip)
>>+{
>
> What's the point of having a declaration followed immediately by
> a definition? Also, the function is dead too.
>

This file (v4v_utils.h) has utility that could be used by drivers, we don't use
them in Xen but we through it will be convenient to have such function
accessible for one to write a v4v driver a v4v driver.

What would be the right place for those?

Thanks,
Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain
  2012-08-10  7:57                 ` Jan Beulich
@ 2012-08-23 12:03                   ` Jean Guyader
  0 siblings, 0 replies; 50+ messages in thread
From: Jean Guyader @ 2012-08-23 12:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim (Xen.org), Ian Campbell, Jean Guyader, xen-devel

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

On 10 August 2012 08:57, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 10.08.12 at 09:51, Jean Guyader <jean.guyader@citrix.com> wrote:
>> No specific reason for both of those thing. Here is a new version based
>> on your comments.
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>

I have found a copy/paste bug in this patch.

This statement was executed unconditionally, not just inside the if statement.
chn->u.unbound.remote_domid = remote_domid;

Attached the fixed version.

Jean

[-- Attachment #2: evtchn_alloc_unbound_domain.patch --]
[-- Type: application/octet-stream, Size: 2757 bytes --]

commit 8a33fc898726717879f59f5b911125b87598e78a
Author: Jean Guyader <jean.guyader@citrix.com>
Date:   Thu Aug 2 16:19:23 2012 +0100

    xen: events, exposes evtchn_alloc_unbound_domain
    
    Exposes evtchn_alloc_unbound_domain to the rest of
    Xen so we can create allocated unbound evtchn within Xen.

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 53777f8..6a0e342 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -159,36 +159,51 @@ static int get_free_port(struct domain *d)
 
 static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
 {
-    struct evtchn *chn;
     struct domain *d;
-    int            port;
-    domid_t        dom = alloc->dom;
     long           rc;
 
-    rc = rcu_lock_target_domain_by_id(dom, &d);
+    rc = rcu_lock_target_domain_by_id(alloc->dom, &d);
     if ( rc )
         return rc;
 
+    rc = evtchn_alloc_unbound_domain(d, &alloc->port, alloc->remote_dom);
+    if ( rc )
+        ERROR_EXIT_DOM((int)rc, d);
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port,
+                                domid_t remote_domid)
+{
+    struct evtchn *chn;
+    int           rc;
+    int           free_port;
+
     spin_lock(&d->event_lock);
 
-    if ( (port = get_free_port(d)) < 0 )
-        ERROR_EXIT_DOM(port, d);
-    chn = evtchn_from_port(d, port);
+    rc = free_port = get_free_port(d);
+    if ( free_port < 0 )
+        goto out;
 
-    rc = xsm_evtchn_unbound(d, chn, alloc->remote_dom);
+    chn = evtchn_from_port(d, free_port);
+    rc = xsm_evtchn_unbound(d, chn, remote_domid);
     if ( rc )
         goto out;
 
     chn->state = ECS_UNBOUND;
-    if ( (chn->u.unbound.remote_domid = alloc->remote_dom) == DOMID_SELF )
+    if ( (chn->u.unbound.remote_domid = remote_domid) == DOMID_SELF )
         chn->u.unbound.remote_domid = current->domain->domain_id;
 
-    alloc->port = port;
+    *port = free_port;
+    /* Everything is fine, returns 0 */
+    rc = 0;
 
  out:
     spin_unlock(&d->event_lock);
-    rcu_unlock_domain(d);
-
     return rc;
 }
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 71c3e92..1a0c832 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -69,6 +69,9 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq);
 /* Notify remote end of a Xen-attached event channel.*/
 void notify_via_xen_event_channel(struct domain *ld, int lport);
 
+int evtchn_alloc_unbound_domain(struct domain *d, evtchn_port_t *port,
+                                domid_t remote_domid);
+
 /* Internal event channel object accessors */
 #define bucket_from_port(d,p) \
     ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET])

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-23 11:57     ` Jean Guyader
@ 2012-08-24 20:06       ` Jan Beulich
  2012-09-01 20:58         ` Jean Guyader
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2012-08-24 20:06 UTC (permalink / raw)
  To: Jean Guyader; +Cc: Jean Guyader, xen-devel

>>> On 23.08.12 at 13:57, Jean Guyader <jean.guyader@gmail.com> wrote:
> On 6 August 2012 09:45, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>>--- /dev/null
>>>+++ b/xen/include/public/v4v.h
>>>...
>>>+#define V4V_DOMID_ANY           0x7fffU
>>
>> I think I asked this before - why not use one of the pre-existing
>> DOMID values? And if there is a good reason, it should be said
>> here in a comment, to avoid the same question being asked
>> later again.
>>
>>>...
>>>+typedef uint64_t v4v_pfn_t;
>>
>> We already have xen_pfn_t, so why do you need yet another
>> flavor?
>>
>>>...
>>>+struct v4v_info
>>>+{
>>>+    uint64_t ring_magic;
>>>+    uint64_t data_magic;
>>>+    evtchn_port_t evtchn;
>>
>> Missing padding at the end?
>>
>>>+};
>>>+typedef struct v4v_info v4v_info_t;
>>>+
>>>+#define V4V_ROUNDUP(a) (((a) +0xf ) & ~0xf)
>>
>> Doesn't seem to belong here. Or is the subsequent comment
>> actually related to this (in which case it should be moved ahead
>> of the definition and made match it).
>>
>>>+/*
>>>+ * Messages on the ring are padded to 128 bits
>>>+ * Len here refers to the exact length of the data not including the
>>>+ * 128 bit header. The message uses
>>>+ * ((len +0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
>>>+ */
>>>...
>>>+/*
>>>+ * HYPERCALLS
>>>+ */
>>>...
>>
>> In the block below here, please get the naming (do_v4v_op()
>> vs v4v_hypercall()) and the use of newlines (either always one
>> or always two between individual hypercall descriptions)
>> consistent. Hmm, even the descriptions don't seem to always
>> match the definitions (not really obvious because apparently
>> again the descriptions follow the definitions, whereas the
>> opposite is the usual way to arrange things).
>>
>>>--- /dev/null
>>>+++ b/xen/include/xen/v4v_utils.h
>>>...
>>>+/* Compiler specific hacks */
>>>+#if defined(__GNUC__)
>>>+# define V4V_UNUSED __attribute__ ((unused))
>>>+# ifndef __STRICT_ANSI__
>>>+#  define V4V_INLINE inline
>>>+# else
>>>+#  define V4V_INLINE
>>>+# endif
>>>+#else /* !__GNUC__ */
>>>+# define V4V_UNUSED
>>>+# define V4V_INLINE
>>>+#endif
>>
>> This suggests the header is really intended to be public?
>>
>>>...
>>>+static V4V_INLINE uint32_t
>>>+v4v_ring_bytes_to_read (volatile struct v4v_ring *r)
>>
>> No space between function name and opening parenthesis
>> (throughout this file).
>>
>>>...
>>>+V4V_UNUSED static V4V_INLINE ssize_t
>>
>> V4V_UNUSED? Doesn't make sense in conjunction with
>> V4V_INLINE, at least as long as you're using GNU extensions
>> anyway (see above as to the disposition of the header).
>>
>>>+v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * 
> protocol,
>>>+              void *_buf, size_t t, int consume)
>>
>> Dead functions shouldn't be placed here.
>>
>>>...
>>>+static ssize_t
>>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>>+                     size_t skip) V4V_UNUSED;
>>>+
>>>+V4V_INLINE static ssize_t
>>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>>+                     size_t skip)
>>>+{
>>
>> What's the point of having a declaration followed immediately by
>> a definition? Also, the function is dead too.
>>
> 
> This file (v4v_utils.h) has utility that could be used by drivers, we don't 
> use
> them in Xen but we through it will be convenient to have such function
> accessible for one to write a v4v driver a v4v driver.
> 
> What would be the right place for those?

I think public/io/ would be the best matching place, but it's
certainly also not ideal. Maybe we ought to introduce public/lib/
or some such for it?

But then again I don't see how this comment of yours relates to
the earlier comments I made.

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-06  8:45   ` Jan Beulich
  2012-08-23 11:57     ` Jean Guyader
@ 2012-09-01 20:56     ` Jean Guyader
  2013-06-11 17:10       ` [PATCH 5/5] xen: Add V4V implementation - padding question Ross Philipson
  1 sibling, 1 reply; 50+ messages in thread
From: Jean Guyader @ 2012-09-01 20:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jean Guyader, xen-devel

On 6 August 2012 01:45, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>--- /dev/null
>>+++ b/xen/include/public/v4v.h
>>...
>>+#define V4V_DOMID_ANY           0x7fffU
>
> I think I asked this before - why not use one of the pre-existing
> DOMID values? And if there is a good reason, it should be said
> here in a comment, to avoid the same question being asked
> later again.
>

I replaced 0x7fffU with DOMID_INVALID.

>>...
>>+typedef uint64_t v4v_pfn_t;
>
> We already have xen_pfn_t, so why do you need yet another
> flavor?
>

No, replaced with xen_pfn_t.

>>...
>>+struct v4v_info
>>+{
>>+    uint64_t ring_magic;
>>+    uint64_t data_magic;
>>+    evtchn_port_t evtchn;
>
> Missing padding at the end?
>

ack.

>>+};
>>+typedef struct v4v_info v4v_info_t;
>>+
>>+#define V4V_ROUNDUP(a) (((a) +0xf ) & ~0xf)
>
> Doesn't seem to belong here. Or is the subsequent comment
> actually related to this (in which case it should be moved ahead
> of the definition and made match it).
>

V4V_ROUNDUP is now in v4v.c and I move the description above
the define.

Thanks for the review,
Jean

>>+/*
>>+ * Messages on the ring are padded to 128 bits
>>+ * Len here refers to the exact length of the data not including the
>>+ * 128 bit header. The message uses
>>+ * ((len +0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
>>+ */
>>...
>>+/*
>>+ * HYPERCALLS
>>+ */
>>...
>
> In the block below here, please get the naming (do_v4v_op()
> vs v4v_hypercall()) and the use of newlines (either always one
> or always two between individual hypercall descriptions)
> consistent. Hmm, even the descriptions don't seem to always
> match the definitions (not really obvious because apparently
> again the descriptions follow the definitions, whereas the
> opposite is the usual way to arrange things).
>
>>--- /dev/null
>>+++ b/xen/include/xen/v4v_utils.h
>>...
>>+/* Compiler specific hacks */
>>+#if defined(__GNUC__)
>>+# define V4V_UNUSED __attribute__ ((unused))
>>+# ifndef __STRICT_ANSI__
>>+#  define V4V_INLINE inline
>>+# else
>>+#  define V4V_INLINE
>>+# endif
>>+#else /* !__GNUC__ */
>>+# define V4V_UNUSED
>>+# define V4V_INLINE
>>+#endif
>
> This suggests the header is really intended to be public?
>
>>...
>>+static V4V_INLINE uint32_t
>>+v4v_ring_bytes_to_read (volatile struct v4v_ring *r)
>
> No space between function name and opening parenthesis
> (throughout this file).
>
>>...
>>+V4V_UNUSED static V4V_INLINE ssize_t
>
> V4V_UNUSED? Doesn't make sense in conjunction with
> V4V_INLINE, at least as long as you're using GNU extensions
> anyway (see above as to the disposition of the header).
>
>>+v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol,
>>+              void *_buf, size_t t, int consume)
>
> Dead functions shouldn't be placed here.
>
>>...
>>+static ssize_t
>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>+                     size_t skip) V4V_UNUSED;
>>+
>>+V4V_INLINE static ssize_t
>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>+                     size_t skip)
>>+{
>
> What's the point of having a declaration followed immediately by
> a definition? Also, the function is dead too.
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation
  2012-08-24 20:06       ` Jan Beulich
@ 2012-09-01 20:58         ` Jean Guyader
  0 siblings, 0 replies; 50+ messages in thread
From: Jean Guyader @ 2012-09-01 20:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Jean Guyader, xen-devel

On 24 August 2012 13:06, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 23.08.12 at 13:57, Jean Guyader <jean.guyader@gmail.com> wrote:
>> On 6 August 2012 09:45, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 03.08.12 at 21:50, Jean Guyader <jean.guyader@citrix.com> wrote:
>>>>--- /dev/null
>>>>+++ b/xen/include/public/v4v.h
>>>>...
>>>>+#define V4V_DOMID_ANY           0x7fffU
>>>
>>> I think I asked this before - why not use one of the pre-existing
>>> DOMID values? And if there is a good reason, it should be said
>>> here in a comment, to avoid the same question being asked
>>> later again.
>>>
>>>>...
>>>>+typedef uint64_t v4v_pfn_t;
>>>
>>> We already have xen_pfn_t, so why do you need yet another
>>> flavor?
>>>
>>>>...
>>>>+struct v4v_info
>>>>+{
>>>>+    uint64_t ring_magic;
>>>>+    uint64_t data_magic;
>>>>+    evtchn_port_t evtchn;
>>>
>>> Missing padding at the end?
>>>
>>>>+};
>>>>+typedef struct v4v_info v4v_info_t;
>>>>+
>>>>+#define V4V_ROUNDUP(a) (((a) +0xf ) & ~0xf)
>>>
>>> Doesn't seem to belong here. Or is the subsequent comment
>>> actually related to this (in which case it should be moved ahead
>>> of the definition and made match it).
>>>
>>>>+/*
>>>>+ * Messages on the ring are padded to 128 bits
>>>>+ * Len here refers to the exact length of the data not including the
>>>>+ * 128 bit header. The message uses
>>>>+ * ((len +0xf) & ~0xf) + sizeof(v4v_ring_message_header) bytes
>>>>+ */
>>>>...
>>>>+/*
>>>>+ * HYPERCALLS
>>>>+ */
>>>>...
>>>
>>> In the block below here, please get the naming (do_v4v_op()
>>> vs v4v_hypercall()) and the use of newlines (either always one
>>> or always two between individual hypercall descriptions)
>>> consistent. Hmm, even the descriptions don't seem to always
>>> match the definitions (not really obvious because apparently
>>> again the descriptions follow the definitions, whereas the
>>> opposite is the usual way to arrange things).
>>>
>>>>--- /dev/null
>>>>+++ b/xen/include/xen/v4v_utils.h
>>>>...
>>>>+/* Compiler specific hacks */
>>>>+#if defined(__GNUC__)
>>>>+# define V4V_UNUSED __attribute__ ((unused))
>>>>+# ifndef __STRICT_ANSI__
>>>>+#  define V4V_INLINE inline
>>>>+# else
>>>>+#  define V4V_INLINE
>>>>+# endif
>>>>+#else /* !__GNUC__ */
>>>>+# define V4V_UNUSED
>>>>+# define V4V_INLINE
>>>>+#endif
>>>
>>> This suggests the header is really intended to be public?
>>>
>>>>...
>>>>+static V4V_INLINE uint32_t
>>>>+v4v_ring_bytes_to_read (volatile struct v4v_ring *r)
>>>
>>> No space between function name and opening parenthesis
>>> (throughout this file).
>>>
>>>>...
>>>>+V4V_UNUSED static V4V_INLINE ssize_t
>>>
>>> V4V_UNUSED? Doesn't make sense in conjunction with
>>> V4V_INLINE, at least as long as you're using GNU extensions
>>> anyway (see above as to the disposition of the header).
>>>
>>>>+v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t *
>> protocol,
>>>>+              void *_buf, size_t t, int consume)
>>>
>>> Dead functions shouldn't be placed here.
>>>
>>>>...
>>>>+static ssize_t
>>>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>>>+                     size_t skip) V4V_UNUSED;
>>>>+
>>>>+V4V_INLINE static ssize_t
>>>>+v4v_copy_out_offset (struct v4v_ring *r, struct v4v_addr *from,
>>>>+                     uint32_t * protocol, void *_buf, size_t t, int consume,
>>>>+                     size_t skip)
>>>>+{
>>>
>>> What's the point of having a declaration followed immediately by
>>> a definition? Also, the function is dead too.
>>>
>>
>> This file (v4v_utils.h) has utility that could be used by drivers, we don't
>> use
>> them in Xen but we through it will be convenient to have such function
>> accessible for one to write a v4v driver a v4v driver.
>>
>> What would be the right place for those?
>
> I think public/io/ would be the best matching place, but it's
> certainly also not ideal. Maybe we ought to introduce public/lib/
> or some such for it?
>
> But then again I don't see how this comment of yours relates to
> the earlier comments I made.
>

Ok I have removed this file from the patch series,  we already have a
copy of this file in the Linux driver so those helper functions can be found
there.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2012-09-01 20:56     ` Jean Guyader
@ 2013-06-11 17:10       ` Ross Philipson
  2013-06-11 17:25         ` Tim Deegan
  0 siblings, 1 reply; 50+ messages in thread
From: Ross Philipson @ 2013-06-11 17:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Jan Beulich

>
>>> ...
>>> +struct v4v_info
>>> +{
>>> +    uint64_t ring_magic;
>>> +    uint64_t data_magic;
>>> +    evtchn_port_t evtchn;
>>
>> Missing padding at the end?
>>
>
> ack.
>

At one point during the review of an earlier version of the V4V patch 
set, Jan requested that this pad be added to the v4v_info struct. I 
understand the padding in all the other structs but I don't understand 
this one. This struct is not included in any other structs and is not 
followed by any data.

Thanks
Ross

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2013-06-11 17:10       ` [PATCH 5/5] xen: Add V4V implementation - padding question Ross Philipson
@ 2013-06-11 17:25         ` Tim Deegan
  2013-06-11 17:40           ` Ross Philipson
  0 siblings, 1 reply; 50+ messages in thread
From: Tim Deegan @ 2013-06-11 17:25 UTC (permalink / raw)
  To: Ross Philipson; +Cc: Jan Beulich, xen-devel

At 13:10 -0400 on 11 Jun (1370956242), Ross Philipson wrote:
> >
> >>>...
> >>>+struct v4v_info
> >>>+{
> >>>+    uint64_t ring_magic;
> >>>+    uint64_t data_magic;
> >>>+    evtchn_port_t evtchn;
> >>
> >>Missing padding at the end?
> >>
> >
> >ack.
> >
> 
> At one point during the review of an earlier version of the V4V patch 
> set, Jan requested that this pad be added to the v4v_info struct. I 
> understand the padding in all the other structs but I don't understand 
> this one. This struct is not included in any other structs and is not 
> followed by any data.

64-bit Xen would see this as a 24-byte struct, even without explicit
padding.  That would surprise a 32-bit guest that allocated what it saw
as a 20-byte struct for Xen to copy into.

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2013-06-11 17:25         ` Tim Deegan
@ 2013-06-11 17:40           ` Ross Philipson
  2013-06-11 17:54             ` Ross Philipson
  0 siblings, 1 reply; 50+ messages in thread
From: Ross Philipson @ 2013-06-11 17:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Tim Deegan

On 06/11/2013 01:25 PM, Tim Deegan wrote:
> At 13:10 -0400 on 11 Jun (1370956242), Ross Philipson wrote:
>>>
>>>>> ...
>>>>> +struct v4v_info
>>>>> +{
>>>>> +    uint64_t ring_magic;
>>>>> +    uint64_t data_magic;
>>>>> +    evtchn_port_t evtchn;
>>>>
>>>> Missing padding at the end?
>>>>
>>>
>>> ack.
>>>
>>
>> At one point during the review of an earlier version of the V4V patch
>> set, Jan requested that this pad be added to the v4v_info struct. I
>> understand the padding in all the other structs but I don't understand
>> this one. This struct is not included in any other structs and is not
>> followed by any data.
>
> 64-bit Xen would see this as a 24-byte struct, even without explicit
> padding.  That would surprise a 32-bit guest that allocated what it saw
> as a 20-byte struct for Xen to copy into.

Ah yes, of course. Thanks for the quick response.

Ross

>
> Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2013-06-11 17:40           ` Ross Philipson
@ 2013-06-11 17:54             ` Ross Philipson
  2013-06-11 18:04               ` Tim Deegan
  0 siblings, 1 reply; 50+ messages in thread
From: Ross Philipson @ 2013-06-11 17:54 UTC (permalink / raw)
  To: xen-devel; +Cc: Tim Deegan

On 06/11/2013 01:40 PM, Ross Philipson wrote:
> On 06/11/2013 01:25 PM, Tim Deegan wrote:
>> At 13:10 -0400 on 11 Jun (1370956242), Ross Philipson wrote:
>>>>
>>>>>> ...
>>>>>> +struct v4v_info
>>>>>> +{
>>>>>> + uint64_t ring_magic;
>>>>>> + uint64_t data_magic;
>>>>>> + evtchn_port_t evtchn;
>>>>>
>>>>> Missing padding at the end?
>>>>>
>>>>
>>>> ack.
>>>>
>>>
>>> At one point during the review of an earlier version of the V4V patch
>>> set, Jan requested that this pad be added to the v4v_info struct. I
>>> understand the padding in all the other structs but I don't understand
>>> this one. This struct is not included in any other structs and is not
>>> followed by any data.
>>
>> 64-bit Xen would see this as a 24-byte struct, even without explicit
>> padding. That would surprise a 32-bit guest that allocated what it saw
>> as a 20-byte struct for Xen to copy into.
>
> Ah yes, of course. Thanks for the quick response.
>
> Ross

I guess that means that this struct is unhappy then...

typedef struct v4vtables_rule
{
     v4v_addr_t src;  -- 8b
     v4v_addr_t dst;  -- 8b
     uint32_t accept; -- 4b
} v4vtables_rule_t;


>
>>
>> Tim.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2013-06-11 17:54             ` Ross Philipson
@ 2013-06-11 18:04               ` Tim Deegan
  2013-06-12  7:45                 ` Jan Beulich
  2013-06-13 17:21                 ` Stefano Stabellini
  0 siblings, 2 replies; 50+ messages in thread
From: Tim Deegan @ 2013-06-11 18:04 UTC (permalink / raw)
  To: Ross Philipson; +Cc: xen-devel

At 13:54 -0400 on 11 Jun (1370958882), Ross Philipson wrote:
> On 06/11/2013 01:40 PM, Ross Philipson wrote:
> >On 06/11/2013 01:25 PM, Tim Deegan wrote:
> >>At 13:10 -0400 on 11 Jun (1370956242), Ross Philipson wrote:
> >>>>
> >>>>>>...
> >>>>>>+struct v4v_info
> >>>>>>+{
> >>>>>>+ uint64_t ring_magic;
> >>>>>>+ uint64_t data_magic;
> >>>>>>+ evtchn_port_t evtchn;
> >>>>>
> >>>>>Missing padding at the end?
> >>>>>
> >>>>
> >>>>ack.
> >>>>
> >>>
> >>>At one point during the review of an earlier version of the V4V patch
> >>>set, Jan requested that this pad be added to the v4v_info struct. I
> >>>understand the padding in all the other structs but I don't understand
> >>>this one. This struct is not included in any other structs and is not
> >>>followed by any data.
> >>
> >>64-bit Xen would see this as a 24-byte struct, even without explicit
> >>padding. That would surprise a 32-bit guest that allocated what it saw
> >>as a 20-byte struct for Xen to copy into.
> >
> >Ah yes, of course. Thanks for the quick response.
> >
> >Ross
> 
> I guess that means that this struct is unhappy then...
> 
> typedef struct v4vtables_rule
> {
>     v4v_addr_t src;  -- 8b
>     v4v_addr_t dst;  -- 8b
>     uint32_t accept; -- 4b
> } v4vtables_rule_t;

Surprisingly, no: since it contains no 8-byte fields, both x86
architectures will see it as a 20-byte (4-byte-aligned) struct. 
Not sure about ARM, though.

Tim.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2013-06-11 18:04               ` Tim Deegan
@ 2013-06-12  7:45                 ` Jan Beulich
  2013-06-13 17:21                 ` Stefano Stabellini
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2013-06-12  7:45 UTC (permalink / raw)
  To: Ross Philipson, Tim Deegan; +Cc: xen-devel

>>> On 11.06.13 at 20:04, Tim Deegan <tim@xen.org> wrote:
> At 13:54 -0400 on 11 Jun (1370958882), Ross Philipson wrote:
>> I guess that means that this struct is unhappy then...
>> 
>> typedef struct v4vtables_rule
>> {
>>     v4v_addr_t src;  -- 8b
>>     v4v_addr_t dst;  -- 8b
>>     uint32_t accept; -- 4b
>> } v4vtables_rule_t;
> 
> Surprisingly, no: since it contains no 8-byte fields, both x86
> architectures will see it as a 20-byte (4-byte-aligned) struct. 
> Not sure about ARM, though.

That's why all such structures should go through the public
header compat mode checking, so that eventual inconsistencies
would be noticed right away rather than at the end of some
extended debugging session.

Jan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] xen: Add V4V implementation - padding question
  2013-06-11 18:04               ` Tim Deegan
  2013-06-12  7:45                 ` Jan Beulich
@ 2013-06-13 17:21                 ` Stefano Stabellini
  1 sibling, 0 replies; 50+ messages in thread
From: Stefano Stabellini @ 2013-06-13 17:21 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Ross Philipson, xen-devel

On Tue, 11 Jun 2013, Tim Deegan wrote:
> At 13:54 -0400 on 11 Jun (1370958882), Ross Philipson wrote:
> > On 06/11/2013 01:40 PM, Ross Philipson wrote:
> > >On 06/11/2013 01:25 PM, Tim Deegan wrote:
> > >>At 13:10 -0400 on 11 Jun (1370956242), Ross Philipson wrote:
> > >>>>
> > >>>>>>...
> > >>>>>>+struct v4v_info
> > >>>>>>+{
> > >>>>>>+ uint64_t ring_magic;
> > >>>>>>+ uint64_t data_magic;
> > >>>>>>+ evtchn_port_t evtchn;
> > >>>>>
> > >>>>>Missing padding at the end?
> > >>>>>
> > >>>>
> > >>>>ack.
> > >>>>
> > >>>
> > >>>At one point during the review of an earlier version of the V4V patch
> > >>>set, Jan requested that this pad be added to the v4v_info struct. I
> > >>>understand the padding in all the other structs but I don't understand
> > >>>this one. This struct is not included in any other structs and is not
> > >>>followed by any data.
> > >>
> > >>64-bit Xen would see this as a 24-byte struct, even without explicit
> > >>padding. That would surprise a 32-bit guest that allocated what it saw
> > >>as a 20-byte struct for Xen to copy into.
> > >
> > >Ah yes, of course. Thanks for the quick response.
> > >
> > >Ross
> > 
> > I guess that means that this struct is unhappy then...
> > 
> > typedef struct v4vtables_rule
> > {
> >     v4v_addr_t src;  -- 8b
> >     v4v_addr_t dst;  -- 8b
> >     uint32_t accept; -- 4b
> > } v4vtables_rule_t;
> 
> Surprisingly, no: since it contains no 8-byte fields, both x86
> architectures will see it as a 20-byte (4-byte-aligned) struct. 
> Not sure about ARM, though.

same on ARM and ARM64

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2013-06-13 17:21 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-03 19:50 [PATCH 0/5] RFC: V4V (v3) Jean Guyader
2012-08-03 19:50 ` [PATCH 1/5] xen: add ssize_t Jean Guyader
2012-08-06  8:08   ` Jan Beulich
2012-08-06 14:47     ` Jean Guyader
2012-08-09  9:51       ` Tim Deegan
2012-08-09 10:19         ` Jean Guyader
2012-08-09 10:39           ` Jan Beulich
2012-08-09 10:48             ` Jean Guyader
2012-08-09 13:02               ` Jan Beulich
2012-08-09 10:59           ` Tim Deegan
2012-08-09 11:08             ` Jean Guyader
2012-08-03 19:50 ` [PATCH 2/5] xen: Introduce guest_handle_for_field Jean Guyader
2012-08-03 19:50 ` [PATCH 3/5] xen: virq, remove VIRQ_XC_RESERVED Jean Guyader
2012-08-06  8:10   ` Jan Beulich
2012-08-06 14:46     ` Jean Guyader
2012-08-06 14:49       ` Andrew Cooper
2012-08-06 14:56       ` Ian Campbell
2012-08-06 15:01         ` Jean Guyader
2012-08-06 15:13           ` Ian Campbell
2012-08-06 15:46             ` Jan Beulich
2012-08-03 19:50 ` [PATCH 4/5] xen: events, exposes evtchn_alloc_unbound_domain Jean Guyader
2012-08-06  8:19   ` Jan Beulich
2012-08-09 10:06   ` Tim Deegan
2012-08-09 10:23     ` Ian Campbell
2012-08-09 10:35       ` Tim Deegan
2012-08-09 10:40         ` Jean Guyader
2012-08-09 23:25           ` Jean Guyader
2012-08-10  7:35             ` Jan Beulich
2012-08-10  7:51               ` Jean Guyader
2012-08-10  7:57                 ` Jan Beulich
2012-08-23 12:03                   ` Jean Guyader
2012-08-03 19:50 ` [PATCH 5/5] xen: Add V4V implementation Jean Guyader
2012-08-06  8:45   ` Jan Beulich
2012-08-23 11:57     ` Jean Guyader
2012-08-24 20:06       ` Jan Beulich
2012-09-01 20:58         ` Jean Guyader
2012-09-01 20:56     ` Jean Guyader
2013-06-11 17:10       ` [PATCH 5/5] xen: Add V4V implementation - padding question Ross Philipson
2013-06-11 17:25         ` Tim Deegan
2013-06-11 17:40           ` Ross Philipson
2013-06-11 17:54             ` Ross Philipson
2013-06-11 18:04               ` Tim Deegan
2013-06-12  7:45                 ` Jan Beulich
2013-06-13 17:21                 ` Stefano Stabellini
2012-08-09 10:38   ` [PATCH 5/5] xen: Add V4V implementation Tim Deegan
2012-08-10 16:51     ` Jean Guyader
2012-08-13  9:38       ` Tim Deegan
2012-08-13 12:43         ` Jean Guyader
2012-08-16 12:32           ` Tim Deegan
2012-08-04 13:24 ` [PATCH 0/5] RFC: V4V (v3) Jean Guyader

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.