All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/8] Xen: FIFO-based event channel ABI
@ 2013-03-19 21:00 David Vrabel
  2013-03-19 21:00 ` [PATCH 1/8] debug: remove some event channel info from the 'i' and 'q' debug keys David Vrabel
                   ` (9 more replies)
  0 siblings, 10 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

This is an RFC of the FIFO-based event channel ABI described in this
design document:

http://xenbits.xen.org/people/dvrabel/event-channels-C.pdf

Changes since draft B are:

- Queue tail is now private to Xen.

- Guest pages are specified by MFN in the hypercalls.

- Updated link/unlink algorithm to avoid races when adding an event to
  a queue that is becoming empty.

Patch 1-5 do some preparatory work for supporting alternate ABIs.

Patch 6 is a hack to allow it to build.

Patch 7-8 add the ABI and the implementation.  Patch 8 list some of
the known limitations with this RFC implementation.

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/8] debug: remove some event channel info from the 'i' and 'q' debug keys
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-19 21:00 ` [PATCH 2/8] evtchn: refactor low-level event channel port ops David Vrabel
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

With different event channel ABIs, showing the 2-level selector word
may be misleading (if a different ABI is in use).

When dumping domain information, printing the state of the VIRQ_DEBUG
port is redundant -- this information is available via the 'e' key.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/arch/x86/irq.c      |    5 +----
 xen/common/keyhandler.c |   11 ++---------
 2 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index ca829bb..345f1ae 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2088,14 +2088,11 @@ static void dump_irqs(unsigned char key)
                 d = action->guest[i];
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
-                printk("%u:%3d(%c%c%c%c)",
+                printk("%u:%3d(%c%c%c)",
                        d->domain_id, pirq,
                        (test_bit(info->evtchn,
                                  &shared_info(d, evtchn_pending)) ?
                         'P' : '-'),
-                       (test_bit(info->evtchn / BITS_PER_EVTCHN_WORD(d),
-                                 &vcpu_info(d->vcpu[0], evtchn_pending_sel)) ?
-                        'S' : '-'),
                        (test_bit(info->evtchn, &shared_info(d, evtchn_mask)) ?
                         'M' : '-'),
                        (info->masked ? 'M' : '-'));
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index e9ef45f..eddf899 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -299,16 +299,9 @@ static void dump_domains(unsigned char key)
     {
         for_each_vcpu ( d, v )
         {
-            printk("Notifying guest %d:%d (virq %d, port %d, stat %d/%d/%d)\n",
+            printk("Notifying guest %d:%d (virq %d, port %d)\n",
                    d->domain_id, v->vcpu_id,
-                   VIRQ_DEBUG, v->virq_to_evtchn[VIRQ_DEBUG],
-                   test_bit(v->virq_to_evtchn[VIRQ_DEBUG], 
-                            &shared_info(d, evtchn_pending)),
-                   test_bit(v->virq_to_evtchn[VIRQ_DEBUG], 
-                            &shared_info(d, evtchn_mask)),
-                   test_bit(v->virq_to_evtchn[VIRQ_DEBUG] /
-                            BITS_PER_EVTCHN_WORD(d),
-                            &vcpu_info(v, evtchn_pending_sel)));
+                   VIRQ_DEBUG, v->virq_to_evtchn[VIRQ_DEBUG]);
             send_guest_vcpu_virq(v, VIRQ_DEBUG);
         }
     }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/8] evtchn: refactor low-level event channel port ops
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
  2013-03-19 21:00 ` [PATCH 1/8] debug: remove some event channel info from the 'i' and 'q' debug keys David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-20 10:21   ` Jan Beulich
  2013-03-20 10:24   ` Jan Beulich
  2013-03-19 21:00 ` [PATCH 3/8] evtchn: add a hook to bind an event port to a VCPU David Vrabel
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

Use functions for the low-level event channel port operations
(set/clear pending, unmask, is_pending and is_masked).

Group these functions into a struct evtchn_port_op so they can be
replaced by alternate implementations (for different ABIs) on a
per-domain basis.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/arch/x86/irq.c         |   11 ++---
 xen/common/Makefile        |    1 +
 xen/common/event_channel.c |   65 +++++---------------------
 xen/common/event_port.c    |  108 ++++++++++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c      |    3 +-
 xen/include/xen/event.h    |   40 ++++++++++++++++
 xen/include/xen/sched.h    |    4 ++
 7 files changed, 172 insertions(+), 60 deletions(-)
 create mode 100644 xen/common/event_port.c

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 345f1ae..3d26910 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1452,7 +1452,7 @@ int pirq_guest_unmask(struct domain *d)
         {
             pirq = pirqs[i]->pirq;
             if ( pirqs[i]->masked &&
-                 !test_bit(pirqs[i]->evtchn, &shared_info(d, evtchn_mask)) )
+                 !evtchn_port_is_masked(d, evtchn_from_port(d, pirqs[i]->evtchn)) )
                 pirq_guest_eoi(pirqs[i]);
         }
     } while ( ++pirq < d->nr_pirqs && n == ARRAY_SIZE(pirqs) );
@@ -2046,6 +2046,7 @@ static void dump_irqs(unsigned char key)
     int i, irq, pirq;
     struct irq_desc *desc;
     irq_guest_action_t *action;
+    struct evtchn *evtchn;
     struct domain *d;
     const struct pirq *info;
     unsigned long flags;
@@ -2088,13 +2089,11 @@ static void dump_irqs(unsigned char key)
                 d = action->guest[i];
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
+                evtchn = evtchn_from_port(d, info->evtchn);
                 printk("%u:%3d(%c%c%c)",
                        d->domain_id, pirq,
-                       (test_bit(info->evtchn,
-                                 &shared_info(d, evtchn_pending)) ?
-                        'P' : '-'),
-                       (test_bit(info->evtchn, &shared_info(d, evtchn_mask)) ?
-                        'M' : '-'),
+                       (evtchn_port_is_pending(d, evtchn) ? 'P' : '-'),
+                       (evtchn_port_is_masked(d, evtchn) ? 'M' : '-'),
                        (info->masked ? 'M' : '-'));
                 if ( i != action->nr_guests )
                     printk(",");
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 8a0c506..f967b49 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -6,6 +6,7 @@ obj-$(HAS_DEVICE_TREE) += device_tree.o
 obj-y += domctl.o
 obj-y += domain.o
 obj-y += event_channel.o
+obj-y += event_port.o
 obj-y += grant_table.o
 obj-y += irq.o
 obj-y += kernel.o
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index a0f293f..b5e74d6 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -151,6 +151,7 @@ static int get_free_port(struct domain *d)
             xfree(chn);
             return -ENOMEM;
         }
+        chn[i].port = port + i;
     }
 
     return port;
@@ -529,7 +530,7 @@ static long __evtchn_close(struct domain *d1, int port1)
     }
 
     /* Clear pending event to avoid unexpected behavior on re-bind. */
-    clear_bit(port1, &shared_info(d1, evtchn_pending));
+    evtchn_port_clear_pending(d1, chn1);
 
     /* Reset binding to vcpu0 when the channel is freed. */
     chn1->state          = ECS_FREE;
@@ -614,43 +615,10 @@ out:
 
 static void evtchn_set_pending(struct vcpu *v, int port)
 {
-    struct domain *d = v->domain;
-    int vcpuid;
-
-    /*
-     * The following bit operations must happen in strict order.
-     * NB. On x86, the atomic bit operations also act as memory barriers.
-     * There is therefore sufficiently strict ordering for this architecture --
-     * others may require explicit memory barriers.
-     */
-
-    if ( test_and_set_bit(port, &shared_info(d, evtchn_pending)) )
-        return;
-
-    if ( !test_bit        (port, &shared_info(d, evtchn_mask)) &&
-         !test_and_set_bit(port / BITS_PER_EVTCHN_WORD(d),
-                           &vcpu_info(v, evtchn_pending_sel)) )
-    {
-        vcpu_mark_events_pending(v);
-    }
-    
-    /* Check if some VCPU might be polling for this event. */
-    if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) )
-        return;
+    struct evtchn *evtchn;
 
-    /* Wake any interested (or potentially interested) pollers. */
-    for ( vcpuid = find_first_bit(d->poll_mask, d->max_vcpus);
-          vcpuid < d->max_vcpus;
-          vcpuid = find_next_bit(d->poll_mask, d->max_vcpus, vcpuid+1) )
-    {
-        v = d->vcpu[vcpuid];
-        if ( ((v->poll_evtchn <= 0) || (v->poll_evtchn == port)) &&
-             test_and_clear_bit(vcpuid, d->poll_mask) )
-        {
-            v->poll_evtchn = 0;
-            vcpu_unblock(v);
-        }
-    }
+    evtchn = evtchn_from_port(v->domain, port);
+    evtchn_port_set_pending(v, evtchn);
 }
 
 int guest_enabled_event(struct vcpu *v, uint32_t virq)
@@ -919,26 +887,15 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
 int evtchn_unmask(unsigned int port)
 {
     struct domain *d = current->domain;
-    struct vcpu   *v;
+    struct evtchn *evtchn;
 
     ASSERT(spin_is_locked(&d->event_lock));
 
     if ( unlikely(!port_is_valid(d, port)) )
         return -EINVAL;
 
-    v = d->vcpu[evtchn_from_port(d, port)->notify_vcpu_id];
-
-    /*
-     * These operations must happen in strict order. Based on
-     * include/xen/event.h:evtchn_set_pending(). 
-     */
-    if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) &&
-         test_bit          (port, &shared_info(d, evtchn_pending)) &&
-         !test_and_set_bit (port / BITS_PER_EVTCHN_WORD(d),
-                            &vcpu_info(v, evtchn_pending_sel)) )
-    {
-        vcpu_mark_events_pending(v);
-    }
+    evtchn = evtchn_from_port(d, port);
+    evtchn_port_unmask(d, evtchn);
 
     return 0;
 }
@@ -1184,6 +1141,8 @@ int evtchn_init(struct domain *d)
     bitmap_zero(d->poll_mask, MAX_VIRT_CPUS);
 #endif
 
+    d->evtchn_port_ops = &evtchn_port_ops_2l;
+
     return 0;
 }
 
@@ -1269,8 +1228,8 @@ static void domain_dump_evtchn_info(struct domain *d)
 
         printk("    %4u [%d/%d]: s=%d n=%d x=%d",
                port,
-               !!test_bit(port, &shared_info(d, evtchn_pending)),
-               !!test_bit(port, &shared_info(d, evtchn_mask)),
+               !!evtchn_port_is_pending(d, chn),
+               !!evtchn_port_is_masked(d, chn),
                chn->state, chn->notify_vcpu_id, chn->xen_consumer);
 
         switch ( chn->state )
diff --git a/xen/common/event_port.c b/xen/common/event_port.c
new file mode 100644
index 0000000..b0ef25b
--- /dev/null
+++ b/xen/common/event_port.c
@@ -0,0 +1,108 @@
+/*
+ * Event channel port operations.
+ *
+ * Copyright (c) 2003-2006, K A Fraser.
+ * 
+ * This source code is licensed under the GNU General Public License,
+ * Version 2 or later.  See the file COPYING for more details.
+ */
+
+#include <xen/config.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/sched.h>
+#include <xen/event.h>
+
+static void evtchn_check_pollers(struct domain *d, int port)
+{
+    struct vcpu *v;
+    int vcpuid;
+
+    /* Check if some VCPU might be polling for this event. */
+    if ( likely(bitmap_empty(d->poll_mask, d->max_vcpus)) )
+        return;
+
+    /* Wake any interested (or potentially interested) pollers. */
+    for ( vcpuid = find_first_bit(d->poll_mask, d->max_vcpus);
+          vcpuid < d->max_vcpus;
+          vcpuid = find_next_bit(d->poll_mask, d->max_vcpus, vcpuid+1) )
+    {
+        v = d->vcpu[vcpuid];
+        if ( ((v->poll_evtchn <= 0) || (v->poll_evtchn == port)) &&
+             test_and_clear_bit(vcpuid, d->poll_mask) )
+        {
+            v->poll_evtchn = 0;
+            vcpu_unblock(v);
+        }
+    }
+}
+
+static void evtchn_2l_set_pending(struct vcpu *v, struct evtchn *evtchn)
+{
+    struct domain *d = v->domain;
+    int port = evtchn->port;
+
+    /*
+     * The following bit operations must happen in strict order.
+     * NB. On x86, the atomic bit operations also act as memory barriers.
+     * There is therefore sufficiently strict ordering for this architecture --
+     * others may require explicit memory barriers.
+     */
+
+    if ( test_and_set_bit(port, &shared_info(d, evtchn_pending)) )
+        return;
+
+    if ( !test_bit        (port, &shared_info(d, evtchn_mask)) &&
+         !test_and_set_bit(port / BITS_PER_EVTCHN_WORD(d),
+                           &vcpu_info(v, evtchn_pending_sel)) )
+    {
+        vcpu_mark_events_pending(v);
+    }
+
+    evtchn_check_pollers(d, v->poll_evtchn);
+}
+
+static void evtchn_2l_clear_pending(struct domain *d, struct evtchn *evtchn)
+{
+    clear_bit(evtchn->port, &shared_info(d, evtchn_pending));
+}
+
+static void evtchn_2l_unmask(struct domain *d, struct evtchn *evtchn)
+{
+    struct vcpu *v = d->vcpu[evtchn->notify_vcpu_id];
+    int port = evtchn->port;
+
+    /*
+     * These operations must happen in strict order. Based on
+     * evtchn_2l_set_pending() above.
+     */
+    if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) &&
+         test_bit          (port, &shared_info(d, evtchn_pending)) &&
+         !test_and_set_bit (port / BITS_PER_EVTCHN_WORD(d),
+                            &vcpu_info(v, evtchn_pending_sel)) )
+    {
+        vcpu_mark_events_pending(v);
+    }
+}
+
+static bool_t evtchn_2l_is_pending(struct domain *d,
+                                   const struct evtchn *evtchn)
+{
+    return test_bit(evtchn->port, &shared_info(d, evtchn_pending));
+}
+
+static bool_t evtchn_2l_is_masked(struct domain *d,
+                                  const struct evtchn *evtchn)
+{
+    return test_bit(evtchn->port, &shared_info(d, evtchn_mask));
+}
+
+struct evtchn_port_ops evtchn_port_ops_2l =
+{
+    .set_pending   = evtchn_2l_set_pending,
+    .clear_pending = evtchn_2l_clear_pending,
+    .unmask        = evtchn_2l_unmask,
+    .is_pending    = evtchn_2l_is_pending,
+    .is_masked     = evtchn_2l_is_masked,
+};
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 9c58b12..9d2aa5e 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -33,6 +33,7 @@
 #include <xen/multicall.h>
 #include <xen/cpu.h>
 #include <xen/preempt.h>
+#include <xen/event.h>
 #include <public/sched.h>
 #include <xsm/xsm.h>
 
@@ -693,7 +694,7 @@ static long do_poll(struct sched_poll *sched_poll)
             goto out;
 
         rc = 0;
-        if ( test_bit(port, &shared_info(d, evtchn_pending)) )
+        if ( evtchn_port_is_pending(d, evtchn_from_port(d, port)) )
             goto out;
     }
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 71c3e92..0d23e3f 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -102,4 +102,44 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
         mb(); /* set blocked status /then/ caller does his work */      \
     } while ( 0 )
 
+/*
+ * Low-level event channel port ops.
+ */
+struct evtchn_port_ops {
+    void (*set_pending)(struct vcpu *v, struct evtchn *evtchn);
+    void (*clear_pending)(struct domain *d, struct evtchn *evtchn);
+    void (*unmask)(struct domain *d, struct evtchn *evtchn);
+    bool_t (*is_pending)(struct domain *d, const struct evtchn *evtchn);
+    bool_t (*is_masked)(struct domain *d, const struct evtchn *evtchn);
+};
+
+extern struct evtchn_port_ops evtchn_port_ops_2l;
+
+static inline void evtchn_port_set_pending(struct vcpu *v, struct evtchn *evtchn)
+{
+    v->domain->evtchn_port_ops->set_pending(v, evtchn);
+}
+
+static inline void evtchn_port_clear_pending(struct domain *d, struct evtchn *evtchn)
+{
+    d->evtchn_port_ops->clear_pending(d, evtchn);
+}
+
+static inline void evtchn_port_unmask(struct domain *d,
+                                      const struct evtchn *evtchn)
+{
+    d->evtchn_port_ops->unmask(d, evtchn);
+}
+
+static inline bool_t evtchn_port_is_pending(struct domain *d,
+                                            const struct evtchn *evtchn)
+{
+    return d->evtchn_port_ops->is_pending(d, evtchn);
+}
+
+static inline bool_t evtchn_port_is_masked(struct domain *d, struct evtchn *evtchn)
+{
+    return d->evtchn_port_ops->is_masked(d, evtchn);
+}
+
 #endif /* __XEN_EVENT_H__ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index e108436..0db73c0 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -81,6 +81,7 @@ struct evtchn
         } pirq;        /* state == ECS_PIRQ */
         u16 virq;      /* state == ECS_VIRQ */
     } u;
+    u32 port;
 #ifdef FLASK_ENABLE
     void *ssid;
 #endif
@@ -230,6 +231,8 @@ struct mem_event_per_domain
     struct mem_event_domain access;
 };
 
+struct evtchn_port_ops;
+
 struct domain
 {
     domid_t          domain_id;
@@ -262,6 +265,7 @@ struct domain
     /* Event channel information. */
     struct evtchn   *evtchn[NR_EVTCHN_BUCKETS];
     spinlock_t       event_lock;
+    struct evtchn_port_ops *evtchn_port_ops;
 
     struct grant_table *grant_table;
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/8] evtchn: add a hook to bind an event port to a VCPU
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
  2013-03-19 21:00 ` [PATCH 1/8] debug: remove some event channel info from the 'i' and 'q' debug keys David Vrabel
  2013-03-19 21:00 ` [PATCH 2/8] evtchn: refactor low-level event channel port ops David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-19 21:00 ` [PATCH 4/8] evtchn: Dynamically allocate d->evtchn David Vrabel
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

The upcoming FIFO-based event channel ABI will require binding ports
to a specific per-VCPU queue, so add a port op (bind_to_vcpu) for
this.  Call this new op when events are bound and when they are moved
to a different VCPU.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/common/event_channel.c |    7 +++++++
 xen/include/xen/event.h    |    9 +++++++++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index b5e74d6..3a91241 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -236,10 +236,12 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
     if ( rc )
         goto out;
 
+    evtchn_port_bind_to_vcpu(ld, lchn, ld->vcpu[lchn->notify_vcpu_id]);
     lchn->u.interdomain.remote_dom  = rd;
     lchn->u.interdomain.remote_port = (u16)rport;
     lchn->state                     = ECS_INTERDOMAIN;
     
+    evtchn_port_bind_to_vcpu(rd, rchn, rd->vcpu[rchn->notify_vcpu_id]);
     rchn->u.interdomain.remote_dom  = ld;
     rchn->u.interdomain.remote_port = (u16)lport;
     rchn->state                     = ECS_INTERDOMAIN;
@@ -290,6 +292,7 @@ static long evtchn_bind_virq(evtchn_bind_virq_t *bind)
         ERROR_EXIT(port);
 
     chn = evtchn_from_port(d, port);
+    evtchn_port_bind_to_vcpu(d, chn, v);
     chn->state          = ECS_VIRQ;
     chn->notify_vcpu_id = vcpu;
     chn->u.virq         = virq;
@@ -320,6 +323,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
         ERROR_EXIT(port);
 
     chn = evtchn_from_port(d, port);
+    evtchn_port_bind_to_vcpu(d, chn, d->vcpu[vcpu]);
     chn->state          = ECS_IPI;
     chn->notify_vcpu_id = vcpu;
 
@@ -397,6 +401,7 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
         goto out;
     }
 
+    evtchn_port_bind_to_vcpu(d, chn, v);
     chn->state  = ECS_PIRQ;
     chn->u.pirq.irq = pirq;
     link_pirq_port(port, chn, v);
@@ -876,6 +881,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
         rc = -EINVAL;
         break;
     }
+    if ( rc == 0 )
+        evtchn_port_bind_to_vcpu(d, chn, d->vcpu[vcpu_id]);
 
  out:
     spin_unlock(&d->event_lock);
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 0d23e3f..9d85234 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -111,6 +111,8 @@ struct evtchn_port_ops {
     void (*unmask)(struct domain *d, struct evtchn *evtchn);
     bool_t (*is_pending)(struct domain *d, const struct evtchn *evtchn);
     bool_t (*is_masked)(struct domain *d, const struct evtchn *evtchn);
+    void (*bind_to_vcpu)(struct domain *d, struct evtchn *evtchn,
+                         struct vcpu *vcpu);
 };
 
 extern struct evtchn_port_ops evtchn_port_ops_2l;
@@ -142,4 +144,11 @@ static inline bool_t evtchn_port_is_masked(struct domain *d, struct evtchn *evtc
     return d->evtchn_port_ops->is_masked(d, evtchn);
 }
 
+static inline void evtchn_port_bind_to_vcpu(struct domain *d, struct evtchn *evtchn,
+                                            struct vcpu *vcpu)
+{
+    if ( d->evtchn_port_ops->bind_to_vcpu )
+        d->evtchn_port_ops->bind_to_vcpu(d, evtchn, vcpu);
+}
+
 #endif /* __XEN_EVENT_H__ */
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/8] evtchn: Dynamically allocate d->evtchn
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (2 preceding siblings ...)
  2013-03-19 21:00 ` [PATCH 3/8] evtchn: add a hook to bind an event port to a VCPU David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-20 11:43   ` Wei Liu
  2013-03-19 21:00 ` [PATCH 5/8] evtchn: use a per-domain variable for the max number of event channels David Vrabel
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: Wei Liu <wei.liu2@citrix.com>

As we move to extended evtchn ABI we need bigger d->evtchn, as a result
this will bloat struct domain. So move this array out of struct domain
and allocate a dedicated array for it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/common/event_channel.c |   14 ++++++++++++++
 xen/include/xen/sched.h    |    2 +-
 2 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 3a91241..44672b5 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1136,15 +1136,27 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
 
 int evtchn_init(struct domain *d)
 {
+    BUILD_BUG_ON(sizeof(struct evtchn *) * NR_EVTCHN_BUCKETS > PAGE_SIZE);
+    d->evtchn = xzalloc_array(struct evtchn *, NR_EVTCHN_BUCKETS);
+
+    if ( d->evtchn == NULL )
+        return -ENOMEM;
+
     spin_lock_init(&d->event_lock);
     if ( get_free_port(d) != 0 )
+    {
+        xfree(d->evtchn);
         return -EINVAL;
+    }
     evtchn_from_port(d, 0)->state = ECS_RESERVED;
 
 #if MAX_VIRT_CPUS > BITS_PER_LONG
     d->poll_mask = xmalloc_array(unsigned long, BITS_TO_LONGS(MAX_VIRT_CPUS));
     if ( !d->poll_mask )
+    {
+        xfree(d->evtchn);
         return -ENOMEM;
+    }
     bitmap_zero(d->poll_mask, MAX_VIRT_CPUS);
 #endif
 
@@ -1180,6 +1192,8 @@ void evtchn_destroy(struct domain *d)
     spin_unlock(&d->event_lock);
 
     clear_global_virq_handlers(d);
+
+    xfree(d->evtchn);
 }
 
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 0db73c0..ad9d7ef 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -263,7 +263,7 @@ struct domain
     spinlock_t       rangesets_lock;
 
     /* Event channel information. */
-    struct evtchn   *evtchn[NR_EVTCHN_BUCKETS];
+    struct evtchn  **evtchn;
     spinlock_t       event_lock;
     struct evtchn_port_ops *evtchn_port_ops;
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/8] evtchn: use a per-domain variable for the max number of event channels
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (3 preceding siblings ...)
  2013-03-19 21:00 ` [PATCH 4/8] evtchn: Dynamically allocate d->evtchn David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-20 10:27   ` Jan Beulich
  2013-03-19 21:00 ` [PATCH 6/8] HACK! evtchn: increase number of buckets to support the FIFO ABI David Vrabel
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

Instead of the MAX_EVTCHNS(d) macro, use d->max_evtchns instead.  This
avoids having to repeatedly check the ABI type.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/common/event_channel.c |    7 +++++--
 xen/common/schedule.c      |    2 +-
 xen/include/xen/event.h    |    2 +-
 xen/include/xen/sched.h    |    1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 44672b5..b0ee75d 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -134,7 +134,7 @@ static int get_free_port(struct domain *d)
         if ( evtchn_from_port(d, port)->state == ECS_FREE )
             return port;
 
-    if ( port == MAX_EVTCHNS(d) )
+    if ( port == d->max_evtchns )
         return -ENOSPC;
 
     chn = xzalloc_array(struct evtchn, EVTCHNS_PER_BUCKET);
@@ -1136,6 +1136,8 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
 
 int evtchn_init(struct domain *d)
 {
+    d->max_evtchns = MAX_EVTCHNS(d);
+
     BUILD_BUG_ON(sizeof(struct evtchn *) * NR_EVTCHN_BUCKETS > PAGE_SIZE);
     d->evtchn = xzalloc_array(struct evtchn *, NR_EVTCHN_BUCKETS);
 
@@ -1160,6 +1162,7 @@ int evtchn_init(struct domain *d)
     bitmap_zero(d->poll_mask, MAX_VIRT_CPUS);
 #endif
 
+    /* Default to N-level ABI. */
     d->evtchn_port_ops = &evtchn_port_ops_2l;
 
     return 0;
@@ -1236,7 +1239,7 @@ static void domain_dump_evtchn_info(struct domain *d)
 
     spin_lock(&d->event_lock);
 
-    for ( port = 1; port < MAX_EVTCHNS(d); ++port )
+    for ( port = 1; port < d->max_evtchns; ++port )
     {
         const struct evtchn *chn;
         char *ssid;
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 9d2aa5e..9a71d9a 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -690,7 +690,7 @@ static long do_poll(struct sched_poll *sched_poll)
             goto out;
 
         rc = -EINVAL;
-        if ( port >= MAX_EVTCHNS(d) )
+        if ( port >= d->max_evtchns )
             goto out;
 
         rc = 0;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 9d85234..879175d 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -73,7 +73,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
 #define bucket_from_port(d,p) \
     ((d)->evtchn[(p)/EVTCHNS_PER_BUCKET])
 #define port_is_valid(d,p)    \
-    (((p) >= 0) && ((p) < MAX_EVTCHNS(d)) && \
+    (((p) >= 0) && ((p) < d->max_evtchns) && \
      (bucket_from_port(d,p) != NULL))
 #define evtchn_from_port(d,p) \
     (&(bucket_from_port(d,p))[(p)&(EVTCHNS_PER_BUCKET-1)])
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index ad9d7ef..1e849aa 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -264,6 +264,7 @@ struct domain
 
     /* Event channel information. */
     struct evtchn  **evtchn;
+    unsigned         max_evtchns;
     spinlock_t       event_lock;
     struct evtchn_port_ops *evtchn_port_ops;
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 6/8] HACK! evtchn: increase number of buckets to support the FIFO ABI
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (4 preceding siblings ...)
  2013-03-19 21:00 ` [PATCH 5/8] evtchn: use a per-domain variable for the max number of event channels David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-19 21:00 ` [PATCH 7/8] evtchn: add FIFO-based event channel ABI David Vrabel
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

Increase NR_EVTCHN_BUCKETS so 2^17 event channels are possible.  This
required increasing EVTCHN_PER_BUCKET so the bucket list fits into a page.

HACK! because each bucket is now > PAGE_SIZE.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/include/xen/sched.h |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 1e849aa..f825113 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -51,8 +51,9 @@ extern struct domain *dom0;
 #define BITS_PER_EVTCHN_WORD(d) (has_32bit_shinfo(d) ? 32 : BITS_PER_LONG)
 #endif
 #define MAX_EVTCHNS(d) (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d))
-#define EVTCHNS_PER_BUCKET 128
-#define NR_EVTCHN_BUCKETS  (NR_EVENT_CHANNELS / EVTCHNS_PER_BUCKET)
+/* FIXME: this causes each bucket to be > PAGE_SIZE. */
+#define EVTCHNS_PER_BUCKET 256
+#define NR_EVTCHN_BUCKETS  ((1 << EVTCHN_FIFO_LINK_BITS) / EVTCHNS_PER_BUCKET)
 
 struct evtchn
 {
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 7/8] evtchn: add FIFO-based event channel ABI
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (5 preceding siblings ...)
  2013-03-19 21:00 ` [PATCH 6/8] HACK! evtchn: increase number of buckets to support the FIFO ABI David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-20 10:32   ` Jan Beulich
  2013-03-19 21:00 ` [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops David Vrabel
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

Add the event channel hypercall sub-ops and the definitions for the
shared data structures for the FIFO-based event channel ABI.

The design document for this new ABI is available here:

http://xenbits.xen.org/people/dvrabel/event-channels-C.pdf

In summary, events are reported using a per-domain shared event array
of event words.  Each event word has PENDING, LINKED and MASKED bits
and a LINK field for pointing to the next event in the event queue.

There are 16 event queues (with different priorities) per-VCPU.

Key advantages of this new ABI include:

- Support for over 100,000 events (2^17).
- 16 different event priorities.
- Improved fairness in event latency through the use of FIFOs.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/include/public/event_channel.h |   70 ++++++++++++++++++++++++++++++++++++
 1 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/xen/include/public/event_channel.h b/xen/include/public/event_channel.h
index 472efdb..4945cd0 100644
--- a/xen/include/public/event_channel.h
+++ b/xen/include/public/event_channel.h
@@ -71,6 +71,10 @@
 #define EVTCHNOP_bind_vcpu        8
 #define EVTCHNOP_unmask           9
 #define EVTCHNOP_reset           10
+#define EVTCHNOP_init_control    11
+#define EVTCHNOP_expand_array    12
+#define EVTCHNOP_set_priority    13
+#define EVTCHNOP_set_limit       14
 /* ` } */
 
 typedef uint32_t evtchn_port_t;
@@ -258,6 +262,46 @@ struct evtchn_reset {
 typedef struct evtchn_reset evtchn_reset_t;
 
 /*
+ * EVTCHNOP_init_control: initialize the control block for the FIFO ABI.
+ */
+struct evtchn_init_control {
+    /* IN parameters. */
+    uint64_t control_mfn;
+    uint32_t offset;
+    uint32_t vcpu;
+};
+typedef struct evtchn_init_control evtchn_init_control_t;
+
+/*
+ * EVTCHNOP_expand_array: add an additional page to the event array.
+ */
+struct evtchn_expand_array {
+    /* IN parameters. */
+    uint64_t array_mfn;
+};
+typedef struct evtchn_expand_array evtchn_expand_array_t;
+
+/*
+ * EVTCHNOP_set_priority: set the priority for an event channel.
+ */
+struct evtchn_set_priority {
+    /* IN parameters. */
+    uint32_t port;
+    uint32_t priority;
+};
+typedef struct evtchn_set_priority evtchn_set_priority_t;
+
+/*
+ * EVTCHNOP_set_limit: set the maximum event channel port that may be bound.
+ */
+struct evtchn_set_limit {
+    /* IN parameters. */
+    uint32_t domid;
+    uint32_t max_port;
+};
+typedef struct evtchn_set_limit evtchn_set_limit_t;
+
+/*
  * ` enum neg_errnoval
  * ` HYPERVISOR_event_channel_op_compat(struct evtchn_op *op)
  * `
@@ -281,6 +325,32 @@ struct evtchn_op {
 typedef struct evtchn_op evtchn_op_t;
 DEFINE_XEN_GUEST_HANDLE(evtchn_op_t);
 
+/*
+ * FIFO ABI
+ */
+
+/* Events may have priorities from 0 (highest) to 15 (lowest). */
+#define EVTCHN_FIFO_PRIORITY_MIN     15
+#define EVTCHN_FIFO_PRIORITY_DEFAULT 7
+
+#define EVTCHN_FIFO_MAX_QUEUES (EVTCHN_FIFO_PRIORITY_MIN + 1)
+
+typedef uint32_t event_word_t;
+
+#define EVTCHN_FIFO_PENDING 31
+#define EVTCHN_FIFO_MASKED  30
+#define EVTCHN_FIFO_LINKED  29
+
+#define EVTCHN_FIFO_LINK_BITS 17
+#define EVTCHN_FIFO_LINK_MASK ((1 << EVTCHN_FIFO_LINK_BITS) - 1)
+
+struct evtchn_fifo_control_block {
+    uint32_t ready;
+    uint32_t _rsvd1;
+    uint32_t head[EVTCHN_FIFO_MAX_QUEUES];
+};
+typedef struct evtchn_fifo_control_block evtchn_fifo_control_block_t;
+
 #endif /* __XEN_PUBLIC_EVENT_CHANNEL_H__ */
 
 /*
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (6 preceding siblings ...)
  2013-03-19 21:00 ` [PATCH 7/8] evtchn: add FIFO-based event channel ABI David Vrabel
@ 2013-03-19 21:00 ` David Vrabel
  2013-03-20 10:47   ` Jan Beulich
  2013-03-20 13:50   ` Wei Liu
  2013-03-19 21:15 ` [PATCH RFC 0/8] Xen: FIFO-based event channel ABI Keir Fraser
  2013-03-20 10:15 ` Jan Beulich
  9 siblings, 2 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-19 21:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Keir Fraser, David Vrabel, Konrad Rzeszutek Wilk

From: David Vrabel <david.vrabel@citrix.com>

Add the implementation for the FIFO-based event channel ABI.  The new
hypercall sub-ops (EVTCHNOP_init_control, EVTCHNOP_expand_array,
EVTCHNOP_set_priority) and the required evtchn_ops (set_pending,
unmask, etc.).

This current implementation has three main limitations:

- EVTCHNOP_set_limit is not yet implemented so any guest will be able
  to use up to 2^17 (requiring 128 global mapping pages for a fully
  populated event array).

- The control block frames are not required to be shared with the
  vcpu_info structure.  This requires an additional global mapping
  page per-VCPU.  This does makes the guest implementation cleaner
  though so perhaps we do not need to fix this?

- The allocation of the struct evtchns requires > PAGE_SIZE
  allocations.  I plan to take Wei's "evtchn: alter internal object
  handling scheme" patch for this.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 xen/common/Makefile          |    1 +
 xen/common/event_channel.c   |   35 +++++
 xen/common/event_fifo.c      |  313 ++++++++++++++++++++++++++++++++++++++++++
 xen/common/event_port.c      |  187 +++++++++++++++++++++++++
 xen/include/xen/event.h      |   13 +-
 xen/include/xen/event_fifo.h |   56 ++++++++
 xen/include/xen/sched.h      |    4 +
 7 files changed, 604 insertions(+), 5 deletions(-)
 create mode 100644 xen/common/event_fifo.c
 create mode 100644 xen/include/xen/event_fifo.h

diff --git a/xen/common/Makefile b/xen/common/Makefile
index f967b49..91cc460 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -6,6 +6,7 @@ obj-$(HAS_DEVICE_TREE) += device_tree.o
 obj-y += domctl.o
 obj-y += domain.o
 obj-y += event_channel.o
+obj-y += event_fifo.o
 obj-y += event_port.o
 obj-y += grant_table.o
 obj-y += irq.o
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index b0ee75d..5aaf530 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -26,6 +26,7 @@
 #include <xen/compat.h>
 #include <xen/guest_access.h>
 #include <xen/keyhandler.h>
+#include <xen/event_fifo.h>
 #include <asm/current.h>
 
 #include <public/xen.h>
@@ -1042,6 +1043,38 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case EVTCHNOP_init_control: {
+        struct evtchn_init_control init_control;
+        if ( copy_from_guest(&init_control, arg, 1) != 0 )
+            return -EFAULT;
+        rc = evtchn_fifo_init_control(&init_control);
+        break;
+    }
+
+    case EVTCHNOP_expand_array: {
+        struct evtchn_expand_array expand_array;
+        if ( copy_from_guest(&expand_array, arg, 1) != 0 )
+            return -EFAULT;
+        rc = evtchn_fifo_expand_array(&expand_array);
+        break;
+    }
+
+    case EVTCHNOP_set_priority: {
+        struct evtchn_set_priority set_priority;
+        if ( copy_from_guest(&set_priority, arg, 1) != 0 )
+            return -EFAULT;
+        rc = evtchn_fifo_set_priority(&set_priority);
+        break;
+    }
+
+    case EVTCHNOP_set_limit: {
+        struct evtchn_set_limit set_limit;
+        if ( copy_from_guest(&set_limit, arg, 1) != 0 )
+            return -EFAULT;
+        rc = evtchn_fifo_set_limit(&set_limit);
+        break;
+    }
+
     default:
         rc = -ENOSYS;
         break;
@@ -1184,6 +1217,8 @@ void evtchn_destroy(struct domain *d)
         (void)__evtchn_close(d, i);
     }
 
+    evtchn_fifo_destroy(d);
+
     /* Free all event-channel buckets. */
     spin_lock(&d->event_lock);
     for ( i = 0; i < NR_EVTCHN_BUCKETS; i++ )
diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
new file mode 100644
index 0000000..f14266d
--- /dev/null
+++ b/xen/common/event_fifo.c
@@ -0,0 +1,313 @@
+/*
+ * FIFO event channel management.
+ *
+ * Copyright (C) 2013 Citrix Systems R&D Ltd.
+ * 
+ * This source code is licensed under the GNU General Public License,
+ * Version 2 or later.  See the file COPYING for more details.
+ */
+
+#include <xen/config.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/sched.h>
+#include <xen/event.h>
+#include <xen/event_fifo.h>
+#include <xen/paging.h>
+#include <xen/mm.h>
+
+#include <public/event_channel.h>
+
+static int map_guest_page(struct domain *d, uint64_t gfn,
+                          struct page_info **page, void **virt)
+{
+    struct page_info *p;
+
+    p = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
+    if ( !p )
+        return -EINVAL;
+
+    if ( !get_page_type(p, PGT_writable_page) )
+    {
+        put_page(p);
+        return -EINVAL;
+    }
+
+    *virt = map_domain_page_global(gfn);
+    if ( *virt == NULL )
+    {
+        put_page_and_type(p);
+        return -ENOMEM;
+    }
+    *page = p;
+    return 0;
+}
+
+static void unmap_guest_page(struct page_info *page, void *virt)
+{
+    if ( page == NULL )
+        return;
+
+    unmap_domain_page_global(virt);
+    put_page_and_type(page);
+}
+
+static void cleanup_control_block(struct vcpu *v)
+{
+    if ( v->evtchn_fifo )
+    {
+        unmap_guest_page(v->evtchn_fifo->cb_page, v->evtchn_fifo->control_block);
+        xfree(v->evtchn_fifo);
+        v->evtchn_fifo = NULL;
+    }
+}
+
+static void init_queue(struct vcpu *v, struct evtchn_fifo_queue *q, int i)
+{
+    spin_lock_init(&q->lock);
+    q->priority = i;
+    q->head = &v->evtchn_fifo->control_block->head[i];
+}
+
+static int setup_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset)
+{
+    struct domain *d = v->domain;
+    struct evtchn_fifo_vcpu *efv;
+    struct page_info *page;
+    void *virt;
+    int i;
+    int rc;
+
+    if ( v->evtchn_fifo )
+        return -EINVAL;
+
+    efv = xzalloc(struct evtchn_fifo_vcpu);
+    if ( efv == NULL )
+        return -ENOMEM;
+
+    rc = map_guest_page(d, gfn, &page, &virt);
+    if ( rc < 0 )
+    {
+        xfree(efv);
+        return rc;
+    }
+
+    v->evtchn_fifo = efv;
+
+    v->evtchn_fifo->cb_page       = page;
+    v->evtchn_fifo->control_block = virt + offset;
+
+    for ( i = 0; i <= EVTCHN_FIFO_PRIORITY_MIN; i++ )
+        init_queue(v, &v->evtchn_fifo->queue[i], i);
+ 
+    return 0;
+}
+
+/*
+ * Setup an event array with no pages.
+ */
+static int setup_event_array(struct domain *d)
+{
+    if ( d->evtchn_fifo )
+        return 0;
+
+    d->evtchn_fifo = xzalloc(struct evtchn_fifo_domain);
+    if ( d->evtchn_fifo == NULL )
+        return -ENOMEM;
+
+    d->evtchn_fifo->num_evtchns = 0;
+
+    return 0;
+}
+
+/*
+ * Some ports may already be bound, bind them to the correct VCPU so
+ * they have a valid queue.
+ *
+ * Note: any events that are currently pending will not be resent and
+ * will be lost.
+ */
+static void rebind_all_ports(struct domain *d)
+{
+    int port;
+
+    for ( port = 1; port < d->max_evtchns; port++ )
+    {
+        struct evtchn *evtchn;
+
+        if ( !port_is_valid(d, port) )
+            break;
+
+        evtchn = evtchn_from_port(d, port);
+        switch ( evtchn->state )
+        {
+        case ECS_INTERDOMAIN:
+        case ECS_PIRQ:
+        case ECS_VIRQ:
+            evtchn_port_bind_to_vcpu(d, evtchn, d->vcpu[evtchn->notify_vcpu_id]);
+            break;
+        default:
+            break;
+        }
+    }
+}
+
+static void cleanup_event_array(struct domain *d)
+{
+    int i;
+
+    if ( d->evtchn_fifo == NULL )
+        return;
+
+    for ( i = 0; i < EVTCHN_FIFO_MAX_EVENT_ARRAY_PAGES; i++ )
+    {
+        unmap_guest_page(d->evtchn_fifo->event_array[i].page,
+                         d->evtchn_fifo->event_array[i].virt);
+    }
+    xfree(d->evtchn_fifo);
+}
+
+int evtchn_fifo_init_control(struct evtchn_init_control *init_control)
+{
+    struct domain *d = current->domain;
+    uint32_t vcpu_id;
+    uint64_t gfn;
+    uint32_t offset;
+    struct vcpu *v;
+    int rc;
+
+    vcpu_id = init_control->vcpu;
+    gfn     = init_control->control_mfn;
+    offset  = init_control->offset;
+
+    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
+        return -ENOENT;
+    v = d->vcpu[vcpu_id];
+
+    /* Must not cross page boundary. */
+    if ( offset > (PAGE_SIZE - sizeof(evtchn_fifo_control_block_t)) )
+        return -EINVAL;
+
+    /* Must be 8-bytes aligned. */
+    if ( offset & (8 - 1) )
+        return -EINVAL;
+
+    spin_lock(&d->event_lock);
+
+    rc = setup_control_block(v, gfn, offset);
+
+    /* If this is the first control block, setup an empty event array
+       and switch to the fifo port ops. */
+    if ( d->evtchn_fifo == NULL )
+    {
+        rc = setup_event_array(d);
+        if ( rc < 0 )
+            cleanup_control_block(v);
+        else
+        {
+            d->evtchn_port_ops = &evtchn_port_ops_fifo;
+            rebind_all_ports(d);
+        }
+    }
+
+    spin_unlock(&d->event_lock);
+
+    return rc;
+}
+
+static int add_page_to_event_array(struct domain *d, unsigned long gfn)
+{
+    struct page_info *page = NULL;
+    void *virt;
+    int slot;
+    int rc;
+
+    slot = d->evtchn_fifo->num_evtchns / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+    if ( slot >= EVTCHN_FIFO_MAX_EVENT_ARRAY_PAGES )
+        return -ENOSPC;
+
+    rc = map_guest_page(d, gfn, &page, &virt);
+    if ( rc < 0 )
+        return rc;
+
+    d->evtchn_fifo->event_array[slot].page = page;
+    d->evtchn_fifo->event_array[slot].virt = virt;
+
+    d->evtchn_fifo->num_evtchns += EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+
+    return 0;
+}
+
+int evtchn_fifo_expand_array(struct evtchn_expand_array *expand_array)
+{
+    struct domain *d = current->domain;
+    int rc;
+
+    spin_lock(&d->event_lock);
+    rc = add_page_to_event_array(d, expand_array->array_mfn);
+    spin_unlock(&d->event_lock);
+
+    return rc;
+}
+
+int evtchn_fifo_set_priority(struct evtchn_set_priority *set_priority)
+{
+    struct domain *d = current->domain;
+    struct vcpu *v;
+    uint32_t priority;
+    int port;
+    struct evtchn *evtchn;
+
+    priority = set_priority->priority;
+    port     = set_priority->port;
+    
+    if ( priority > EVTCHN_FIFO_PRIORITY_MIN )
+        return -EINVAL;
+
+    spin_lock(&d->event_lock);
+
+    if ( !port_is_valid(d, port) )
+    {
+        spin_unlock(&d->event_lock);
+        return -EINVAL;
+    }
+
+    /*
+     * Switch to the new queue for future events. If the event is
+     * already pending or in the process of being linked it will be on
+     * the old queue -- this is fine.
+     */
+    evtchn = evtchn_from_port(d, port);
+    v = d->vcpu[evtchn->notify_vcpu_id];
+    evtchn->queue = &v->evtchn_fifo->queue[priority];
+
+    spin_unlock(&d->event_lock);
+
+    return 0;
+}
+
+int evtchn_fifo_set_limit(struct evtchn_set_limit *set_limit)
+{
+    /* FIXME: not supported yet. */
+    return -ENOSYS;
+}
+
+void evtchn_fifo_destroy(struct domain *d)
+{
+    struct vcpu *v;
+
+    for_each_vcpu( d, v )
+        cleanup_control_block(v);
+    cleanup_event_array(d);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/common/event_port.c b/xen/common/event_port.c
index b0ef25b..578acd6 100644
--- a/xen/common/event_port.c
+++ b/xen/common/event_port.c
@@ -13,6 +13,9 @@
 #include <xen/errno.h>
 #include <xen/sched.h>
 #include <xen/event.h>
+#include <xen/event_fifo.h>
+
+#include <public/event_channel.h>
 
 static void evtchn_check_pollers(struct domain *d, int port)
 {
@@ -106,3 +109,187 @@ struct evtchn_port_ops evtchn_port_ops_2l =
     .is_pending    = evtchn_2l_is_pending,
     .is_masked     = evtchn_2l_is_masked,
 };
+
+static inline event_word_t *evtchn_fifo_word_from_port(struct domain *d, int port)
+{
+    int p, w;
+
+    if ( unlikely(port >= d->evtchn_fifo->num_evtchns) )
+        return NULL;
+
+    p = port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+    w = port % EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+
+    return d->evtchn_fifo->event_array[p].virt + w;
+}
+
+static bool_t evtchn_fifo_set_link(volatile event_word_t *word, uint32_t link)
+{
+    event_word_t n, o, w;
+
+    w = *word;
+
+    do {
+        if ( !(w & (1 << EVTCHN_FIFO_LINKED)) )
+            return 0;
+        o = w;
+        n = (w & ~EVTCHN_FIFO_LINK_MASK) | link;
+    } while ( (w = cmpxchg(word, o, n)) != o );
+
+    return 1;
+}
+
+static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
+{
+    struct domain *d = v->domain;
+    int port;
+    event_word_t *word;
+    struct evtchn_fifo_queue *q;
+    unsigned long flags;
+    bool_t was_pending;
+
+    port = evtchn->port;
+    word = evtchn_fifo_word_from_port(d, port);
+    if ( unlikely(!word) )
+        return;
+
+    /*
+     * No locking around getting the queue. This may race with
+     * changing the priority but we are allowed to signal the event
+     * once on the old priority.
+     */
+    q = evtchn->queue;
+
+    was_pending = test_and_set_bit(EVTCHN_FIFO_PENDING, word);
+
+    /*
+     * Link the event if it unmasked and not already linked.
+     */
+    if ( !test_bit(EVTCHN_FIFO_MASKED, word)
+         && !test_and_set_bit(EVTCHN_FIFO_LINKED, word) )
+    {
+        event_word_t *tail_word;
+        bool_t linked = 0;
+
+        spin_lock_irqsave(&q->lock, flags);
+
+        /*
+         * Atomically link the tail to port iff the tail is linked.
+         * If the tail is unlinked the queue is empty.
+         *
+         * If port is the same as tail, the queue is empty but q->tail
+         * will appear linked as we just set LINKED above.
+         *
+         * If the queue is empty (i.e., we haven't linked to the new
+         * event), head must be updated.
+         */
+        if ( port != q->tail )
+        {
+            tail_word = evtchn_fifo_word_from_port(d, q->tail);
+            linked = evtchn_fifo_set_link(tail_word, port);
+        }
+        if ( !linked )
+            *q->head = port;
+        q->tail = port;
+
+        spin_unlock_irqrestore(&q->lock, flags);
+
+        if ( !test_and_set_bit(q->priority, &v->evtchn_fifo->control_block->ready) )
+            vcpu_mark_events_pending(v);
+    }
+
+    if ( !was_pending )
+        evtchn_check_pollers(d, v->poll_evtchn);
+}
+
+static void evtchn_fifo_clear_pending(struct domain *d, struct evtchn *evtchn)
+{
+    event_word_t *word;
+
+    word = evtchn_fifo_word_from_port(d, evtchn->port);
+    if ( unlikely(!word) )
+        return;
+
+    /*
+     * Just clear the P bit.
+     *
+     * No need to unlink as the guest will unlink and ignore
+     * non-pending events.
+     */
+    clear_bit(EVTCHN_FIFO_PENDING, word);
+}
+
+static void evtchn_fifo_unmask(struct domain *d, struct evtchn *evtchn)
+{
+    struct vcpu *v = d->vcpu[evtchn->notify_vcpu_id];
+    event_word_t *word;
+
+    word = evtchn_fifo_word_from_port(d, evtchn->port);
+    if ( unlikely(!word) )
+        return;
+
+    clear_bit(EVTCHN_FIFO_MASKED, word);
+
+    /* Relink if pending. */
+    if ( test_bit(EVTCHN_FIFO_PENDING, word ) )
+        evtchn_fifo_set_pending(v, evtchn);
+}
+
+static bool_t evtchn_fifo_is_pending(struct domain *d,
+                                     const struct evtchn *evtchn)
+{
+    event_word_t *word;
+
+    word = evtchn_fifo_word_from_port(d, evtchn->port);
+    if ( unlikely(!word) )
+        return 0;
+
+    return test_bit(EVTCHN_FIFO_PENDING, word);
+}
+
+static bool_t evtchn_fifo_is_masked(struct domain *d,
+                                    const struct evtchn *evtchn)
+{
+    event_word_t *word;
+
+    word = evtchn_fifo_word_from_port(d, evtchn->port);
+    if ( unlikely(!word) )
+        return 1;
+
+    return test_bit(EVTCHN_FIFO_MASKED, word);
+}
+
+static void evtchn_fifo_bind_to_vcpu(struct domain *d, struct evtchn *evtchn,
+                                     struct vcpu *v)
+{
+    int priority;
+
+    /* Keep the same priority if possible, otherwise use the
+       default. */
+    if ( evtchn->queue )
+        priority = evtchn->queue->priority;
+    else
+        priority = EVTCHN_FIFO_PRIORITY_DEFAULT;
+
+    evtchn->queue = &v->evtchn_fifo->queue[priority];
+}
+
+struct evtchn_port_ops evtchn_port_ops_fifo =
+{
+    .set_pending   = evtchn_fifo_set_pending,
+    .clear_pending = evtchn_fifo_clear_pending,
+    .unmask        = evtchn_fifo_unmask,
+    .is_pending    = evtchn_fifo_is_pending,
+    .is_masked     = evtchn_fifo_is_masked,
+    .bind_to_vcpu  = evtchn_fifo_bind_to_vcpu,
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 879175d..5ddf32c 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -116,19 +116,21 @@ struct evtchn_port_ops {
 };
 
 extern struct evtchn_port_ops evtchn_port_ops_2l;
+extern struct evtchn_port_ops evtchn_port_ops_fifo;
 
-static inline void evtchn_port_set_pending(struct vcpu *v, struct evtchn *evtchn)
+static inline void evtchn_port_set_pending(struct vcpu *v,
+                                           struct evtchn *evtchn)
 {
     v->domain->evtchn_port_ops->set_pending(v, evtchn);
 }
 
-static inline void evtchn_port_clear_pending(struct domain *d, struct evtchn *evtchn)
+static inline void evtchn_port_clear_pending(struct domain *d,
+                                             struct evtchn *evtchn)
 {
     d->evtchn_port_ops->clear_pending(d, evtchn);
 }
 
-static inline void evtchn_port_unmask(struct domain *d,
-                                      const struct evtchn *evtchn)
+static inline void evtchn_port_unmask(struct domain *d, struct evtchn *evtchn)
 {
     d->evtchn_port_ops->unmask(d, evtchn);
 }
@@ -139,7 +141,8 @@ static inline bool_t evtchn_port_is_pending(struct domain *d,
     return d->evtchn_port_ops->is_pending(d, evtchn);
 }
 
-static inline bool_t evtchn_port_is_masked(struct domain *d, struct evtchn *evtchn)
+static inline bool_t evtchn_port_is_masked(struct domain *d,
+                                           const struct evtchn *evtchn)
 {
     return d->evtchn_port_ops->is_masked(d, evtchn);
 }
diff --git a/xen/include/xen/event_fifo.h b/xen/include/xen/event_fifo.h
new file mode 100644
index 0000000..4c552bd
--- /dev/null
+++ b/xen/include/xen/event_fifo.h
@@ -0,0 +1,56 @@
+/*
+ * FIFO-based event channel ABI.
+ *
+ * Copyright (C) 2013 Citrix Systems R&D Ltd.
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2 or later.  See the file COPYING for more details.
+ */
+#ifndef __XEN_EVENT_FIFO_H__
+#define __XEN_EVENT_FIFO_H__
+
+struct evtchn_fifo_queue {
+    volatile uint32_t *head; /* points into control block */
+    uint32_t tail;
+    spinlock_t lock;
+    uint8_t priority;
+};
+
+struct evtchn_fifo_vcpu {
+    struct page_info *cb_page;
+    struct evtchn_fifo_control_block *control_block;
+    struct evtchn_fifo_queue queue[EVTCHN_FIFO_MAX_QUEUES];
+};
+
+#define EVTCHN_FIFO_EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t))
+#define EVTCHN_FIFO_MAX_EVENT_ARRAY_PAGES \
+    ((1 << EVTCHN_FIFO_LINK_BITS) / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE)
+
+
+struct evtchn_fifo_array_page {
+    struct page_info *page;
+    event_word_t *virt;
+};
+
+struct evtchn_fifo_domain {
+    struct evtchn_fifo_array_page event_array[EVTCHN_FIFO_MAX_EVENT_ARRAY_PAGES];
+    unsigned num_evtchns;
+};
+
+int evtchn_fifo_init_control(struct evtchn_init_control *init_control);
+int evtchn_fifo_expand_array(struct evtchn_expand_array *expand_array);
+int evtchn_fifo_set_priority(struct evtchn_set_priority *set_priority);
+int evtchn_fifo_set_limit(struct evtchn_set_limit *set_limit);
+void evtchn_fifo_destroy(struct domain *domain);
+
+#endif /* __XEN_EVENT_FIFO_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index f825113..62045ea 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -83,6 +83,7 @@ struct evtchn
         u16 virq;      /* state == ECS_VIRQ */
     } u;
     u32 port;
+    struct evtchn_fifo_queue *queue;
 #ifdef FLASK_ENABLE
     void *ssid;
 #endif
@@ -189,6 +190,8 @@ struct vcpu
 
     struct waitqueue_vcpu *waitqueue_vcpu;
 
+    struct evtchn_fifo_vcpu *evtchn_fifo;
+
     struct arch_vcpu arch;
 };
 
@@ -268,6 +271,7 @@ struct domain
     unsigned         max_evtchns;
     spinlock_t       event_lock;
     struct evtchn_port_ops *evtchn_port_ops;
+    struct evtchn_fifo_domain *evtchn_fifo;
 
     struct grant_table *grant_table;
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC 0/8] Xen: FIFO-based event channel ABI
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (7 preceding siblings ...)
  2013-03-19 21:00 ` [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops David Vrabel
@ 2013-03-19 21:15 ` Keir Fraser
  2013-03-20 10:15 ` Jan Beulich
  9 siblings, 0 replies; 27+ messages in thread
From: Keir Fraser @ 2013-03-19 21:15 UTC (permalink / raw)
  To: David Vrabel, xen-devel; +Cc: Wei Liu, Konrad Rzeszutek Wilk

On 19/03/2013 21:00, "David Vrabel" <david.vrabel@citrix.com> wrote:

> This is an RFC of the FIFO-based event channel ABI described in this
> design document:
> 
> http://xenbits.xen.org/people/dvrabel/event-channels-C.pdf

Nice! :) It seems not massively more code than the 3-level implementation,
although granted it's not fully baked yet. But I like it.

 -- Keir

> Changes since draft B are:
> 
> - Queue tail is now private to Xen.
> 
> - Guest pages are specified by MFN in the hypercalls.
> 
> - Updated link/unlink algorithm to avoid races when adding an event to
>   a queue that is becoming empty.
> 
> Patch 1-5 do some preparatory work for supporting alternate ABIs.
> 
> Patch 6 is a hack to allow it to build.
> 
> Patch 7-8 add the ABI and the implementation.  Patch 8 list some of
> the known limitations with this RFC implementation.
> 
> David
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH RFC 0/8] Xen: FIFO-based event channel ABI
  2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
                   ` (8 preceding siblings ...)
  2013-03-19 21:15 ` [PATCH RFC 0/8] Xen: FIFO-based event channel ABI Keir Fraser
@ 2013-03-20 10:15 ` Jan Beulich
  9 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 10:15 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir Fraser, Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> This is an RFC of the FIFO-based event channel ABI described in this
> design document:
> 
> http://xenbits.xen.org/people/dvrabel/event-channels-C.pdf 
> 
> Changes since draft B are:
> 
> - Queue tail is now private to Xen.
> 
> - Guest pages are specified by MFN in the hypercalls.

I think this should be GMFN universally (also in the field names).
The fact that GMFN == MFN for non-translated guests is well
known anyway.

Jan

> - Updated link/unlink algorithm to avoid races when adding an event to
>   a queue that is becoming empty.
> 
> Patch 1-5 do some preparatory work for supporting alternate ABIs.
> 
> Patch 6 is a hack to allow it to build.
> 
> Patch 7-8 add the ABI and the implementation.  Patch 8 list some of
> the known limitations with this RFC implementation.
> 
> David
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/8] evtchn: refactor low-level event channel port ops
  2013-03-19 21:00 ` [PATCH 2/8] evtchn: refactor low-level event channel port ops David Vrabel
@ 2013-03-20 10:21   ` Jan Beulich
  2013-03-20 13:37     ` David Vrabel
  2013-03-20 10:24   ` Jan Beulich
  1 sibling, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 10:21 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir Fraser, Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> --- /dev/null
> +++ b/xen/common/event_port.c

Is this really a good name? Wouldn't it be better to have a separate
file per ABI (e.g. this one being evt_port_2l.c)?

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -81,6 +81,7 @@ struct evtchn
>          } pirq;        /* state == ECS_PIRQ */
>          u16 virq;      /* state == ECS_VIRQ */
>      } u;
> +    u32 port;

The need for this field doesn't become obvious after going
through the entire patch - if it really needs adding here, the
description should say why, I would think.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/8] evtchn: refactor low-level event channel port ops
  2013-03-19 21:00 ` [PATCH 2/8] evtchn: refactor low-level event channel port ops David Vrabel
  2013-03-20 10:21   ` Jan Beulich
@ 2013-03-20 10:24   ` Jan Beulich
  1 sibling, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 10:24 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir Fraser, Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> @@ -262,6 +265,7 @@ struct domain
>      /* Event channel information. */
>      struct evtchn   *evtchn[NR_EVTCHN_BUCKETS];
>      spinlock_t       event_lock;
> +    struct evtchn_port_ops *evtchn_port_ops;

const ...

Jan

>  
>      struct grant_table *grant_table;
>  

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] evtchn: use a per-domain variable for the max number of event channels
  2013-03-19 21:00 ` [PATCH 5/8] evtchn: use a per-domain variable for the max number of event channels David Vrabel
@ 2013-03-20 10:27   ` Jan Beulich
  0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 10:27 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir Fraser, Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> @@ -1136,6 +1136,8 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
>  
>  int evtchn_init(struct domain *d)
>  {
> +    d->max_evtchns = MAX_EVTCHNS(d);

With MAX_EVTCHNS() only being needed here now (and not
supposed to be used elsewhere), is there no way to restrict its
visibility to just this source file (i.e. move the definition out of
the header)?

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 7/8] evtchn: add FIFO-based event channel ABI
  2013-03-19 21:00 ` [PATCH 7/8] evtchn: add FIFO-based event channel ABI David Vrabel
@ 2013-03-20 10:32   ` Jan Beulich
  2013-03-20 13:38     ` David Vrabel
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 10:32 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir Fraser, Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> Add the event channel hypercall sub-ops and the definitions for the
> shared data structures for the FIFO-based event channel ABI.
> 
> The design document for this new ABI is available here:
> 
> http://xenbits.xen.org/people/dvrabel/event-channels-C.pdf 
> 
> In summary, events are reported using a per-domain shared event array
> of event words.  Each event word has PENDING, LINKED and MASKED bits
> and a LINK field for pointing to the next event in the event queue.
> 
> There are 16 event queues (with different priorities) per-VCPU.
> 
> Key advantages of this new ABI include:
> 
> - Support for over 100,000 events (2^17).

As said in a comment on the design doc B, I'm missing a way for
the guest to learn the width of the involved mask (17 here, but
obviously there is nothing preventing this from being 16, 18, or
yet some other implementation specific value <= 29).

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-19 21:00 ` [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops David Vrabel
@ 2013-03-20 10:47   ` Jan Beulich
  2013-03-20 13:42     ` David Vrabel
  2013-03-20 13:50   ` Wei Liu
  1 sibling, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 10:47 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir Fraser, Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> +static int setup_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset)
> +{
> +    struct domain *d = v->domain;
> +    struct evtchn_fifo_vcpu *efv;
> +    struct page_info *page;
> +    void *virt;
> +    int i;

unsigned int (also elsewhere)

> +struct evtchn_fifo_queue {
> +    volatile uint32_t *head; /* points into control block */

I still think that explicit barriers are the way to go. Unless Linux'es
view on this has changed, you'll have issues getting the Linux folks
to accept this.

> @@ -83,6 +83,7 @@ struct evtchn
>          u16 virq;      /* state == ECS_VIRQ */
>      } u;
>      u32 port;
> +    struct evtchn_fifo_queue *queue;

Wouldn't it be better (more space efficient) to store the queue
array index rather than a pointer, considering that you expect to
have thousands of event channels?

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/8] evtchn: Dynamically allocate d->evtchn
  2013-03-19 21:00 ` [PATCH 4/8] evtchn: Dynamically allocate d->evtchn David Vrabel
@ 2013-03-20 11:43   ` Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Wei Liu @ 2013-03-20 11:43 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir (Xen.org), Konrad Rzeszutek Wilk, wei.liu2, xen-devel

On Tue, 2013-03-19 at 21:00 +0000, David Vrabel wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> As we move to extended evtchn ABI we need bigger d->evtchn, as a result
> this will bloat struct domain. So move this array out of struct domain
> and allocate a dedicated array for it.
> 

Please check out the patch series I sent yesterday. I modified the
internal object store for event channel, which can save space when your
domain only has small number of event channels, which is the case for
FIFO-based ABI.


Wei.

> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  xen/common/event_channel.c |   14 ++++++++++++++
>  xen/include/xen/sched.h    |    2 +-
>  2 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index 3a91241..44672b5 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -1136,15 +1136,27 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
>  
>  int evtchn_init(struct domain *d)
>  {
> +    BUILD_BUG_ON(sizeof(struct evtchn *) * NR_EVTCHN_BUCKETS > PAGE_SIZE);
> +    d->evtchn = xzalloc_array(struct evtchn *, NR_EVTCHN_BUCKETS);
> +
> +    if ( d->evtchn == NULL )
> +        return -ENOMEM;
> +
>      spin_lock_init(&d->event_lock);
>      if ( get_free_port(d) != 0 )
> +    {
> +        xfree(d->evtchn);
>          return -EINVAL;
> +    }
>      evtchn_from_port(d, 0)->state = ECS_RESERVED;
>  
>  #if MAX_VIRT_CPUS > BITS_PER_LONG
>      d->poll_mask = xmalloc_array(unsigned long, BITS_TO_LONGS(MAX_VIRT_CPUS));
>      if ( !d->poll_mask )
> +    {
> +        xfree(d->evtchn);
>          return -ENOMEM;
> +    }
>      bitmap_zero(d->poll_mask, MAX_VIRT_CPUS);
>  #endif
>  
> @@ -1180,6 +1192,8 @@ void evtchn_destroy(struct domain *d)
>      spin_unlock(&d->event_lock);
>  
>      clear_global_virq_handlers(d);
> +
> +    xfree(d->evtchn);
>  }
>  
> 
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 0db73c0..ad9d7ef 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -263,7 +263,7 @@ struct domain
>      spinlock_t       rangesets_lock;
>  
>      /* Event channel information. */
> -    struct evtchn   *evtchn[NR_EVTCHN_BUCKETS];
> +    struct evtchn  **evtchn;
>      spinlock_t       event_lock;
>      struct evtchn_port_ops *evtchn_port_ops;
>  

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/8] evtchn: refactor low-level event channel port ops
  2013-03-20 10:21   ` Jan Beulich
@ 2013-03-20 13:37     ` David Vrabel
  0 siblings, 0 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-20 13:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), Konrad Rzeszutek Wilk, Wei Liu, xen-devel

On 20/03/13 10:21, Jan Beulich wrote:
>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
>> --- /dev/null
>> +++ b/xen/common/event_port.c
> 
> Is this really a good name? Wouldn't it be better to have a separate
> file per ABI (e.g. this one being evt_port_2l.c)?

Agreed.  I did this on the Linux side and it's better.

>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -81,6 +81,7 @@ struct evtchn
>>          } pirq;        /* state == ECS_PIRQ */
>>          u16 virq;      /* state == ECS_VIRQ */
>>      } u;
>> +    u32 port;
> 
> The need for this field doesn't become obvious after going
> through the entire patch - if it really needs adding here, the
> description should say why, I would think.

The evtchn_ops calls all take a struct evtchn * as a parameter so we
need this field to get the port.

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 7/8] evtchn: add FIFO-based event channel ABI
  2013-03-20 10:32   ` Jan Beulich
@ 2013-03-20 13:38     ` David Vrabel
  0 siblings, 0 replies; 27+ messages in thread
From: David Vrabel @ 2013-03-20 13:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), Konrad Rzeszutek Wilk, Wei Liu, xen-devel

On 20/03/13 10:32, Jan Beulich wrote:
>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> Add the event channel hypercall sub-ops and the definitions for the
>> shared data structures for the FIFO-based event channel ABI.
>>
>> The design document for this new ABI is available here:
>>
>> http://xenbits.xen.org/people/dvrabel/event-channels-C.pdf 
>>
>> In summary, events are reported using a per-domain shared event array
>> of event words.  Each event word has PENDING, LINKED and MASKED bits
>> and a LINK field for pointing to the next event in the event queue.
>>
>> There are 16 event queues (with different priorities) per-VCPU.
>>
>> Key advantages of this new ABI include:
>>
>> - Support for over 100,000 events (2^17).
> 
> As said in a comment on the design doc B, I'm missing a way for
> the guest to learn the width of the involved mask (17 here, but
> obviously there is nothing preventing this from being 16, 18, or
> yet some other implementation specific value <= 29).

I haven't forgotten this.  I was focusing on the core implementation so
not all the design review feedback has been incorporated yet.

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 10:47   ` Jan Beulich
@ 2013-03-20 13:42     ` David Vrabel
  2013-03-20 13:55       ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2013-03-20 13:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir (Xen.org), Konrad Rzeszutek Wilk, Wei Liu, xen-devel

On 20/03/13 10:47, Jan Beulich wrote:
>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
>> +struct evtchn_fifo_queue {
>> +    volatile uint32_t *head; /* points into control block */
> 
> I still think that explicit barriers are the way to go. Unless Linux'es
> view on this has changed, you'll have issues getting the Linux folks
> to accept this.

This volatile can just be removed.  head is only written by Xen in one
place and it is immediately followed by a spin_unlock() which is a barrier.

>> @@ -83,6 +83,7 @@ struct evtchn
>>          u16 virq;      /* state == ECS_VIRQ */
>>      } u;
>>      u32 port;
>> +    struct evtchn_fifo_queue *queue;
> 
> Wouldn't it be better (more space efficient) to store the queue
> array index rather than a pointer, considering that you expect to
> have thousands of event channels?

Yes.

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-19 21:00 ` [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops David Vrabel
  2013-03-20 10:47   ` Jan Beulich
@ 2013-03-20 13:50   ` Wei Liu
  1 sibling, 0 replies; 27+ messages in thread
From: Wei Liu @ 2013-03-20 13:50 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir (Xen.org), Konrad Rzeszutek Wilk, wei.liu2, xen-devel

On Tue, 2013-03-19 at 21:00 +0000, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> Add the implementation for the FIFO-based event channel ABI.  The new
> hypercall sub-ops (EVTCHNOP_init_control, EVTCHNOP_expand_array,
> EVTCHNOP_set_priority) and the required evtchn_ops (set_pending,
> unmask, etc.).
> 
> This current implementation has three main limitations:
> 
> - EVTCHNOP_set_limit is not yet implemented so any guest will be able
>   to use up to 2^17 (requiring 128 global mapping pages for a fully
>   populated event array).
> 

How can a guest know its limit? What if the limit change when a guest is
running?

> - The control block frames are not required to be shared with the
>   vcpu_info structure.  This requires an additional global mapping
>   page per-VCPU.  This does makes the guest implementation cleaner
>   though so perhaps we do not need to fix this?
> 
> - The allocation of the struct evtchns requires > PAGE_SIZE
>   allocations.  I plan to take Wei's "evtchn: alter internal object
>   handling scheme" patch for this.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
...
> +
> +#include <public/event_channel.h>
> +
> +static int map_guest_page(struct domain *d, uint64_t gfn,
                                               ^^^^^^^^
                                               xen_pfn_t?

> +                          struct page_info **page, void **virt)
> +{

You can use vmap() and vunmap() for all mapping / unmapping routines to
simplify your code. In that case you can also eliminate the "page" field
in your control structures.

> +    struct page_info *p;
> +
> +    p = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
> +    if ( !p )
> +        return -EINVAL;
> +
...
> +static int setup_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset)
> +{
> +    struct domain *d = v->domain;
> +    struct evtchn_fifo_vcpu *efv;
> +    struct page_info *page;
> +    void *virt;
> +    int i;
> +    int rc;
> +
> +    if ( v->evtchn_fifo )
> +        return -EINVAL;

How about using EBUSY?

> +
> +    efv = xzalloc(struct evtchn_fifo_vcpu);
> +    if ( efv == NULL )
...
> +
> +/*
> + * Some ports may already be bound, bind them to the correct VCPU so
> + * they have a valid queue.
> + *
> + * Note: any events that are currently pending will not be resent and
> + * will be lost.
> + */

Does this need to be fixed? Can you scan the bitmap and pick up pending
events if there are any?

This function is called when setting up first control block. So,
consider:

(a) guest can only setup control block at start of day, this function
will not be necessary.

(b) guest is allowed to switch ABI when there are some pending events,
IMO re-injection is necessary.

> +static void rebind_all_ports(struct domain *d)
> +{
> +    int port;
> +
> +    for ( port = 1; port < d->max_evtchns; port++ )
> +    {
> +        struct evtchn *evtchn;
> +
> +        if ( !port_is_valid(d, port) )
> +            break;
> +

"continue", not "break".

> +        evtchn = evtchn_from_port(d, port);
> +        switch ( evtchn->state )
> +        {
> +        case ECS_INTERDOMAIN:
> +        case ECS_PIRQ:
> +        case ECS_VIRQ:
> +            evtchn_port_bind_to_vcpu(d, evtchn, d->vcpu[evtchn->notify_vcpu_id]);
> +            break;
> +        default:
> +            break;
> +        }
> +    }
> +}
> +
...
> diff --git a/xen/common/event_port.c b/xen/common/event_port.c
> index b0ef25b..578acd6 100644
> --- a/xen/common/event_port.c
> +++ b/xen/common/event_port.c
> @@ -13,6 +13,9 @@
>  #include <xen/errno.h>
>  #include <xen/sched.h>
>  #include <xen/event.h>
> +#include <xen/event_fifo.h>
> +
> +#include <public/event_channel.h>
> 
>  static void evtchn_check_pollers(struct domain *d, int port)
>  {
> @@ -106,3 +109,187 @@ struct evtchn_port_ops evtchn_port_ops_2l =
>      .is_pending    = evtchn_2l_is_pending,
>      .is_masked     = evtchn_2l_is_masked,
>  };
> +
> +static inline event_word_t *evtchn_fifo_word_from_port(struct domain *d, int port)
> +{
> +    int p, w;
> +
> +    if ( unlikely(port >= d->evtchn_fifo->num_evtchns) )
> +        return NULL;
> +

Should also check against d->max_evtchns? Or you need to keep
d->max_evtchns and num_evtchns in sync.


Wei.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 13:42     ` David Vrabel
@ 2013-03-20 13:55       ` Jan Beulich
  2013-03-20 14:23         ` Tim Deegan
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2013-03-20 13:55 UTC (permalink / raw)
  To: David Vrabel; +Cc: Keir (Xen.org), Konrad Rzeszutek Wilk, Wei Liu, xen-devel

>>> On 20.03.13 at 14:42, David Vrabel <david.vrabel@citrix.com> wrote:
> On 20/03/13 10:47, Jan Beulich wrote:
>>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
>>> +struct evtchn_fifo_queue {
>>> +    volatile uint32_t *head; /* points into control block */
>> 
>> I still think that explicit barriers are the way to go. Unless Linux'es
>> view on this has changed, you'll have issues getting the Linux folks
>> to accept this.
> 
> This volatile can just be removed.  head is only written by Xen in one
> place and it is immediately followed by a spin_unlock() which is a barrier.

A release type barrier only, but that presumably is sufficient for
the purposes here.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 13:55       ` Jan Beulich
@ 2013-03-20 14:23         ` Tim Deegan
  2013-03-20 14:38           ` David Vrabel
  0 siblings, 1 reply; 27+ messages in thread
From: Tim Deegan @ 2013-03-20 14:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, xen-devel, Keir (Xen.org), David Vrabel, Konrad Rzeszutek Wilk

At 13:55 +0000 on 20 Mar (1363787704), Jan Beulich wrote:
> >>> On 20.03.13 at 14:42, David Vrabel <david.vrabel@citrix.com> wrote:
> > On 20/03/13 10:47, Jan Beulich wrote:
> >>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> >>> +struct evtchn_fifo_queue {
> >>> +    volatile uint32_t *head; /* points into control block */
> >> 
> >> I still think that explicit barriers are the way to go. Unless Linux'es
> >> view on this has changed, you'll have issues getting the Linux folks
> >> to accept this.
> > 
> > This volatile can just be removed.  head is only written by Xen in one
> > place and it is immediately followed by a spin_unlock() which is a barrier.
> 
> A release type barrier only, but that presumably is sufficient for
> the purposes here.

You might have to use an atomic_t or similar if the consumer might be
confused by partial updates.

Tim.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 14:23         ` Tim Deegan
@ 2013-03-20 14:38           ` David Vrabel
  2013-03-20 15:34             ` Tim Deegan
  0 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2013-03-20 14:38 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Wei Liu, xen-devel, Keir (Xen.org), Jan Beulich, Konrad Rzeszutek Wilk

On 20/03/13 14:23, Tim Deegan wrote:
> At 13:55 +0000 on 20 Mar (1363787704), Jan Beulich wrote:
>>>>> On 20.03.13 at 14:42, David Vrabel <david.vrabel@citrix.com> wrote:
>>> On 20/03/13 10:47, Jan Beulich wrote:
>>>>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>> +struct evtchn_fifo_queue {
>>>>> +    volatile uint32_t *head; /* points into control block */
>>>>
>>>> I still think that explicit barriers are the way to go. Unless Linux'es
>>>> view on this has changed, you'll have issues getting the Linux folks
>>>> to accept this.
>>>
>>> This volatile can just be removed.  head is only written by Xen in one
>>> place and it is immediately followed by a spin_unlock() which is a barrier.
>>
>> A release type barrier only, but that presumably is sufficient for
>> the purposes here.
> 
> You might have to use an atomic_t or similar if the consumer might be
> confused by partial updates.

I have assumed that reads and writes to 32-bit words are atomic on all
interesting architectures.

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 14:38           ` David Vrabel
@ 2013-03-20 15:34             ` Tim Deegan
  2013-03-20 15:54               ` David Vrabel
  0 siblings, 1 reply; 27+ messages in thread
From: Tim Deegan @ 2013-03-20 15:34 UTC (permalink / raw)
  To: David Vrabel
  Cc: Wei Liu, xen-devel, Keir (Xen.org), Jan Beulich, Konrad Rzeszutek Wilk

At 14:38 +0000 on 20 Mar (1363790300), David Vrabel wrote:
> On 20/03/13 14:23, Tim Deegan wrote:
> > At 13:55 +0000 on 20 Mar (1363787704), Jan Beulich wrote:
> >>>>> On 20.03.13 at 14:42, David Vrabel <david.vrabel@citrix.com> wrote:
> >>> On 20/03/13 10:47, Jan Beulich wrote:
> >>>>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
> >>>>> +struct evtchn_fifo_queue {
> >>>>> +    volatile uint32_t *head; /* points into control block */
> >>>>
> >>>> I still think that explicit barriers are the way to go. Unless Linux'es
> >>>> view on this has changed, you'll have issues getting the Linux folks
> >>>> to accept this.
> >>>
> >>> This volatile can just be removed.  head is only written by Xen in one
> >>> place and it is immediately followed by a spin_unlock() which is a barrier.
> >>
> >> A release type barrier only, but that presumably is sufficient for
> >> the purposes here.
> > 
> > You might have to use an atomic_t or similar if the consumer might be
> > confused by partial updates.
> 
> I have assumed that reads and writes to 32-bit words are atomic on all
> interesting architectures.

True, but unless you explicitly tell it to, the compiler isn't required
to update a 32-bit variable using 32-bit operations, or to avoid
weird-looking intermediate values.  It seems unlikely that it would do
anything other than straightforwardly write the new value, but we've seen
compilers do some unlikely things. :)

Tim.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 15:34             ` Tim Deegan
@ 2013-03-20 15:54               ` David Vrabel
  2013-03-20 16:15                 ` Keir Fraser
  0 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2013-03-20 15:54 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Wei Liu, xen-devel, Keir (Xen.org), Jan Beulich, Konrad Rzeszutek Wilk

On 20/03/13 15:34, Tim Deegan wrote:
> At 14:38 +0000 on 20 Mar (1363790300), David Vrabel wrote:
>> On 20/03/13 14:23, Tim Deegan wrote:
>>> At 13:55 +0000 on 20 Mar (1363787704), Jan Beulich wrote:
>>>>>>> On 20.03.13 at 14:42, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>> On 20/03/13 10:47, Jan Beulich wrote:
>>>>>>>>> On 19.03.13 at 22:00, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>>>> +struct evtchn_fifo_queue {
>>>>>>> +    volatile uint32_t *head; /* points into control block */
>>>>>>
>>>>>> I still think that explicit barriers are the way to go. Unless Linux'es
>>>>>> view on this has changed, you'll have issues getting the Linux folks
>>>>>> to accept this.
>>>>>
>>>>> This volatile can just be removed.  head is only written by Xen in one
>>>>> place and it is immediately followed by a spin_unlock() which is a barrier.
>>>>
>>>> A release type barrier only, but that presumably is sufficient for
>>>> the purposes here.
>>>
>>> You might have to use an atomic_t or similar if the consumer might be
>>> confused by partial updates.
>>
>> I have assumed that reads and writes to 32-bit words are atomic on all
>> interesting architectures.
> 
> True, but unless you explicitly tell it to, the compiler isn't required
> to update a 32-bit variable using 32-bit operations, or to avoid
> weird-looking intermediate values.  It seems unlikely that it would do
> anything other than straightforwardly write the new value, but we've seen
> compilers do some unlikely things. :)

Ok.  Looks like I need to use write_atomic() to update the head in Xen.

David

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops
  2013-03-20 15:54               ` David Vrabel
@ 2013-03-20 16:15                 ` Keir Fraser
  0 siblings, 0 replies; 27+ messages in thread
From: Keir Fraser @ 2013-03-20 16:15 UTC (permalink / raw)
  To: David Vrabel, Tim Deegan
  Cc: xen-devel, Wei Liu, Jan Beulich, Konrad Rzeszutek Wilk

On 20/03/2013 15:54, "David Vrabel" <david.vrabel@citrix.com> wrote:

>>>> You might have to use an atomic_t or similar if the consumer might be
>>>> confused by partial updates.
>>> 
>>> I have assumed that reads and writes to 32-bit words are atomic on all
>>> interesting architectures.
>> 
>> True, but unless you explicitly tell it to, the compiler isn't required
>> to update a 32-bit variable using 32-bit operations, or to avoid
>> weird-looking intermediate values.  It seems unlikely that it would do
>> anything other than straightforwardly write the new value, but we've seen
>> compilers do some unlikely things. :)
> 
> Ok.  Looks like I need to use write_atomic() to update the head in Xen.

Yes, I introduced {read,write}_atomic() precisely to deal with this
possibility. No doubt we are missing some places it ought to be used, but at
least we fix them when we spot them and new code should make use of them.

It also provides some form of documentation about memory locations involved
in lock-free communication/synchronisation.

 -- Keir

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2013-03-20 16:15 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-19 21:00 [PATCH RFC 0/8] Xen: FIFO-based event channel ABI David Vrabel
2013-03-19 21:00 ` [PATCH 1/8] debug: remove some event channel info from the 'i' and 'q' debug keys David Vrabel
2013-03-19 21:00 ` [PATCH 2/8] evtchn: refactor low-level event channel port ops David Vrabel
2013-03-20 10:21   ` Jan Beulich
2013-03-20 13:37     ` David Vrabel
2013-03-20 10:24   ` Jan Beulich
2013-03-19 21:00 ` [PATCH 3/8] evtchn: add a hook to bind an event port to a VCPU David Vrabel
2013-03-19 21:00 ` [PATCH 4/8] evtchn: Dynamically allocate d->evtchn David Vrabel
2013-03-20 11:43   ` Wei Liu
2013-03-19 21:00 ` [PATCH 5/8] evtchn: use a per-domain variable for the max number of event channels David Vrabel
2013-03-20 10:27   ` Jan Beulich
2013-03-19 21:00 ` [PATCH 6/8] HACK! evtchn: increase number of buckets to support the FIFO ABI David Vrabel
2013-03-19 21:00 ` [PATCH 7/8] evtchn: add FIFO-based event channel ABI David Vrabel
2013-03-20 10:32   ` Jan Beulich
2013-03-20 13:38     ` David Vrabel
2013-03-19 21:00 ` [PATCH 8/8] evtchn: add FIFO-based event channel hypercalls and port ops David Vrabel
2013-03-20 10:47   ` Jan Beulich
2013-03-20 13:42     ` David Vrabel
2013-03-20 13:55       ` Jan Beulich
2013-03-20 14:23         ` Tim Deegan
2013-03-20 14:38           ` David Vrabel
2013-03-20 15:34             ` Tim Deegan
2013-03-20 15:54               ` David Vrabel
2013-03-20 16:15                 ` Keir Fraser
2013-03-20 13:50   ` Wei Liu
2013-03-19 21:15 ` [PATCH RFC 0/8] Xen: FIFO-based event channel ABI Keir Fraser
2013-03-20 10:15 ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.