xen-devel.lists.xenproject.org archive mirror
* [PATCH v3 0/2] XSA-343 followup patches
@ 2020-10-16 10:58 Juergen Gross
  2020-10-16 10:58 ` [PATCH v3 1/2] xen/events: access last_priority and last_vcpu_id together Juergen Gross
  2020-10-16 10:58 ` [PATCH v3 2/2] xen/evtchn: rework per event channel lock Juergen Gross
  0 siblings, 2 replies; 13+ messages in thread
From: Juergen Gross @ 2020-10-16 10:58 UTC
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Roger Pau Monné

The patches for XSA-343 produced some fallout; in particular, the
event channel locking has proven to be problematic.

Patch 1 targets fifo event channels and avoids races in the case that
the fifo queue has been changed for a specific event channel.

The second patch modifies the per event channel locking scheme in
order to avoid deadlocks and other problems caused by the XSA-343
patches changing the event channel lock to require IRQs off.

Changes in V3:
- addressed comments

Juergen Gross (2):
  xen/events: access last_priority and last_vcpu_id together
  xen/evtchn: rework per event channel lock

 xen/arch/x86/irq.c         |   6 +-
 xen/arch/x86/pv/shim.c     |   9 +--
 xen/common/event_channel.c | 109 +++++++++++++++++--------------------
 xen/common/event_fifo.c    |  25 +++++++--
 xen/include/xen/event.h    |  76 ++++++++++++++++++++++----
 xen/include/xen/sched.h    |   5 +-
 6 files changed, 145 insertions(+), 85 deletions(-)

-- 
2.26.2




* [PATCH v3 1/2] xen/events: access last_priority and last_vcpu_id together
  2020-10-16 10:58 [PATCH v3 0/2] XSA-343 followup patches Juergen Gross
@ 2020-10-16 10:58 ` Juergen Gross
  2020-11-04  9:42   ` Julien Grall
  2020-10-16 10:58 ` [PATCH v3 2/2] xen/evtchn: rework per event channel lock Juergen Gross
  1 sibling, 1 reply; 13+ messages in thread
From: Juergen Gross @ 2020-10-16 10:58 UTC
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

The queue for a fifo event depends on the vcpu_id and the priority of
the event. When sending an event, the event might need to change
queues, and the old queue needs to be remembered in order to keep the
links between queue elements intact. For this purpose the event
channel contains last_priority and last_vcpu_id values, which allow
the old queue to be identified.

In order to avoid races, always access last_priority and last_vcpu_id
with a single atomic operation, avoiding any inconsistencies.
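
As an illustration of the idea described above, here is a minimal
stand-alone sketch of packing both values into one 32-bit word so that
they are always read and written together. It uses C11 <stdatomic.h>
rather than Xen's read_atomic()/write_atomic(), and the names
union lastq, struct chan, lastq_store() and lastq_load() are
hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's union: both fields share one word. */
union lastq {
    uint32_t raw;
    struct {
        uint8_t  last_priority;
        uint16_t last_vcpu_id;
    };
};

/* The sketch assumes the two fields plus padding fit exactly into raw. */
_Static_assert(sizeof(union lastq) == sizeof(uint32_t),
               "fields must fit into one 32-bit word");

/* Hypothetical stand-in for the fifo_lastq member of struct evtchn. */
struct chan {
    _Atomic uint32_t fifo_lastq;
};

/* Publish both values with a single atomic store. */
static void lastq_store(struct chan *c, uint16_t vcpu_id, uint8_t priority)
{
    union lastq q = { .raw = 0 };   /* zero the padding byte as well */

    q.last_vcpu_id = vcpu_id;
    q.last_priority = priority;
    atomic_store(&c->fifo_lastq, q.raw);
}

/* Fetch both values with a single atomic load; they always belong together. */
static union lastq lastq_load(struct chan *c)
{
    union lastq q;

    q.raw = atomic_load(&c->fifo_lastq);
    return q;
}
```

A caller that picks a lock based on such a snapshot would typically
load the word again after acquiring the lock and compare the two
results, since the first snapshot may be stale by then.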

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/event_fifo.c | 25 +++++++++++++++++++------
 xen/include/xen/sched.h |  3 +--
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
index fc189152e1..2f4e8c54fc 100644
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -42,6 +42,14 @@ struct evtchn_fifo_domain {
     unsigned int num_evtchns;
 };
 
+union evtchn_fifo_lastq {
+    uint32_t raw;
+    struct {
+        uint8_t last_priority;
+        uint16_t last_vcpu_id;
+    };
+};
+
 static inline event_word_t *evtchn_fifo_word_from_port(const struct domain *d,
                                                        unsigned int port)
 {
@@ -86,16 +94,18 @@ static struct evtchn_fifo_queue *lock_old_queue(const struct domain *d,
     struct vcpu *v;
     struct evtchn_fifo_queue *q, *old_q;
     unsigned int try;
+    union evtchn_fifo_lastq lastq;
 
     for ( try = 0; try < 3; try++ )
     {
-        v = d->vcpu[evtchn->last_vcpu_id];
-        old_q = &v->evtchn_fifo->queue[evtchn->last_priority];
+        lastq.raw = read_atomic(&evtchn->fifo_lastq);
+        v = d->vcpu[lastq.last_vcpu_id];
+        old_q = &v->evtchn_fifo->queue[lastq.last_priority];
 
         spin_lock_irqsave(&old_q->lock, *flags);
 
-        v = d->vcpu[evtchn->last_vcpu_id];
-        q = &v->evtchn_fifo->queue[evtchn->last_priority];
+        v = d->vcpu[lastq.last_vcpu_id];
+        q = &v->evtchn_fifo->queue[lastq.last_priority];
 
         if ( old_q == q )
             return old_q;
@@ -246,8 +256,11 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
         /* Moved to a different queue? */
         if ( old_q != q )
         {
-            evtchn->last_vcpu_id = v->vcpu_id;
-            evtchn->last_priority = q->priority;
+            union evtchn_fifo_lastq lastq = { };
+
+            lastq.last_vcpu_id = v->vcpu_id;
+            lastq.last_priority = q->priority;
+            write_atomic(&evtchn->fifo_lastq, lastq.raw);
 
             spin_unlock_irqrestore(&old_q->lock, flags);
             spin_lock_irqsave(&q->lock, flags);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d8ed83f869..a298ff4df8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -114,8 +114,7 @@ struct evtchn
         u16 virq;      /* state == ECS_VIRQ */
     } u;
     u8 priority;
-    u8 last_priority;
-    u16 last_vcpu_id;
+    u32 fifo_lastq;    /* Data for fifo events identifying last queue. */
 #ifdef CONFIG_XSM
     union {
 #ifdef XSM_NEED_GENERIC_EVTCHN_SSID
-- 
2.26.2




* [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-10-16 10:58 [PATCH v3 0/2] XSA-343 followup patches Juergen Gross
  2020-10-16 10:58 ` [PATCH v3 1/2] xen/events: access last_priority and last_vcpu_id together Juergen Gross
@ 2020-10-16 10:58 ` Juergen Gross
  2020-10-20  9:28   ` Jan Beulich
  1 sibling, 1 reply; 13+ messages in thread
From: Juergen Gross @ 2020-10-16 10:58 UTC
  To: xen-devel
  Cc: Juergen Gross, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

Currently the lock for a single event channel needs to be taken with
interrupts off, which causes deadlocks in some cases.

Rework the per event channel lock to be non-blocking for the case of
sending an event, and remove the need to disable interrupts when
taking the lock.

The lock is needed to avoid races between sending an event or
querying the channel's state and the removal of the event channel.

Use a locking scheme similar to a rwlock, but with some modifications:

- sending an event or querying the event channel's state uses an
  operation similar to read_trylock(); if the lock cannot be
  obtained, the send is omitted or a default state is returned

- closing an event channel is similar to write_lock(), but without
  real fairness regarding multiple writers (this saves some space in
  the event channel structure, and multiple writers are impossible as
  closing an event channel requires the domain's event_lock to be
  held).
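
The scheme described in the two points above can be sketched outside
of Xen roughly as follows. This is a simplified illustration using
C11 <stdatomic.h>, not the code from this patch: struct chan_lock and
the function names are hypothetical, the default sequentially
consistent ordering replaces the explicit barriers, and the writer is
assumed to be serialized by some outer lock.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Writer marker: adding it makes the counter negative. */
#define WRITE_LOCK_INC INT_MIN

struct chan_lock {
    atomic_int val;   /* > 0: number of readers, < 0: writer present */
};

/* Reader side: never blocks, simply fails while a writer is around. */
static bool read_trylock(struct chan_lock *l)
{
    if (atomic_load(&l->val) < 0)
        return false;

    /* atomic_fetch_add() returns the OLD value; >= 0 means no writer. */
    if (atomic_fetch_add(&l->val, 1) >= 0)
        return true;

    /* A writer slipped in between the check and the increment: back off. */
    atomic_fetch_sub(&l->val, 1);
    return false;
}

static void read_unlock(struct chan_lock *l)
{
    atomic_fetch_sub(&l->val, 1);
}

/* Writer side: announce the writer, then wait for readers to drain. */
static void write_lock(struct chan_lock *l)
{
    /* Assumption: only one writer at a time (outer lock held). */
    assert(atomic_load(&l->val) >= 0);

    atomic_fetch_add(&l->val, WRITE_LOCK_INC);
    while (atomic_load(&l->val) != WRITE_LOCK_INC)
        ;   /* busy-wait until the last reader has dropped the lock */
}

static void write_unlock(struct chan_lock *l)
{
    atomic_fetch_sub(&l->val, WRITE_LOCK_INC);
}
```

Because a reader that fails simply skips its work, no reader ever
spins with interrupts disabled, which is the property the rework is
after.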

Fixes: e045199c7c9c54 ("evtchn: address races with evtchn_reset()")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V3:
- corrected a copy-and-paste error (Jan Beulich)
- corrected unlocking in two cases (Jan Beulich)
- renamed evtchn_read_trylock() (Jan Beulich)
- added some comments and an ASSERT() for evtchn_write_lock()
- set EVENT_WRITE_LOCK_INC to INT_MIN

V2:
- added needed barriers

---
 xen/arch/x86/irq.c         |   6 +-
 xen/arch/x86/pv/shim.c     |   9 +--
 xen/common/event_channel.c | 109 +++++++++++++++++--------------------
 xen/include/xen/event.h    |  76 ++++++++++++++++++++++----
 xen/include/xen/sched.h    |   2 +-
 5 files changed, 125 insertions(+), 77 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 93c4fb9a79..8d1f9a9fc6 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2495,14 +2495,12 @@ static void dump_irqs(unsigned char key)
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
                 evtchn = evtchn_from_port(d, info->evtchn);
-                local_irq_disable();
-                if ( spin_trylock(&evtchn->lock) )
+                if ( evtchn_read_trylock(evtchn) )
                 {
                     pending = evtchn_is_pending(d, evtchn);
                     masked = evtchn_is_masked(d, evtchn);
-                    spin_unlock(&evtchn->lock);
+                    evtchn_read_unlock(evtchn);
                 }
-                local_irq_enable();
                 printk("d%d:%3d(%c%c%c)%c",
                        d->domain_id, pirq, "-P?"[pending],
                        "-M?"[masked], info->masked ? 'M' : '-',
diff --git a/xen/arch/x86/pv/shim.c b/xen/arch/x86/pv/shim.c
index 9aef7a860a..b4e83e0778 100644
--- a/xen/arch/x86/pv/shim.c
+++ b/xen/arch/x86/pv/shim.c
@@ -660,11 +660,12 @@ void pv_shim_inject_evtchn(unsigned int port)
     if ( port_is_valid(guest, port) )
     {
         struct evtchn *chn = evtchn_from_port(guest, port);
-        unsigned long flags;
 
-        spin_lock_irqsave(&chn->lock, flags);
-        evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
-        spin_unlock_irqrestore(&chn->lock, flags);
+        if ( evtchn_read_trylock(chn) )
+        {
+            evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
+            evtchn_read_unlock(chn);
+        }
     }
 }
 
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e365b5498f..3df73dbc71 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -131,7 +131,7 @@ static struct evtchn *alloc_evtchn_bucket(struct domain *d, unsigned int port)
             return NULL;
         }
         chn[i].port = port + i;
-        spin_lock_init(&chn[i].lock);
+        atomic_set(&chn[i].lock, 0);
     }
     return chn;
 }
@@ -253,7 +253,6 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
     int            port;
     domid_t        dom = alloc->dom;
     long           rc;
-    unsigned long  flags;
 
     d = rcu_lock_domain_by_any_id(dom);
     if ( d == NULL )
@@ -269,14 +268,14 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
     if ( rc )
         goto out;
 
-    spin_lock_irqsave(&chn->lock, flags);
+    evtchn_write_lock(chn);
 
     chn->state = ECS_UNBOUND;
     if ( (chn->u.unbound.remote_domid = alloc->remote_dom) == DOMID_SELF )
         chn->u.unbound.remote_domid = current->domain->domain_id;
     evtchn_port_init(d, chn);
 
-    spin_unlock_irqrestore(&chn->lock, flags);
+    evtchn_write_unlock(chn);
 
     alloc->port = port;
 
@@ -289,32 +288,26 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
 }
 
 
-static unsigned long double_evtchn_lock(struct evtchn *lchn,
-                                        struct evtchn *rchn)
+static void double_evtchn_lock(struct evtchn *lchn, struct evtchn *rchn)
 {
-    unsigned long flags;
-
     if ( lchn <= rchn )
     {
-        spin_lock_irqsave(&lchn->lock, flags);
+        evtchn_write_lock(lchn);
         if ( lchn != rchn )
-            spin_lock(&rchn->lock);
+            evtchn_write_lock(rchn);
     }
     else
     {
-        spin_lock_irqsave(&rchn->lock, flags);
-        spin_lock(&lchn->lock);
+        evtchn_write_lock(rchn);
+        evtchn_write_lock(lchn);
     }
-
-    return flags;
 }
 
-static void double_evtchn_unlock(struct evtchn *lchn, struct evtchn *rchn,
-                                 unsigned long flags)
+static void double_evtchn_unlock(struct evtchn *lchn, struct evtchn *rchn)
 {
     if ( lchn != rchn )
-        spin_unlock(&lchn->lock);
-    spin_unlock_irqrestore(&rchn->lock, flags);
+        evtchn_write_unlock(lchn);
+    evtchn_write_unlock(rchn);
 }
 
 static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
@@ -324,7 +317,6 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
     int            lport, rport = bind->remote_port;
     domid_t        rdom = bind->remote_dom;
     long           rc;
-    unsigned long  flags;
 
     if ( rdom == DOMID_SELF )
         rdom = current->domain->domain_id;
@@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
     if ( rc )
         goto out;
 
-    flags = double_evtchn_lock(lchn, rchn);
+    double_evtchn_lock(lchn, rchn);
 
     lchn->u.interdomain.remote_dom  = rd;
     lchn->u.interdomain.remote_port = rport;
@@ -377,7 +369,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
      */
     evtchn_port_set_pending(ld, lchn->notify_vcpu_id, lchn);
 
-    double_evtchn_unlock(lchn, rchn, flags);
+    double_evtchn_unlock(lchn, rchn);
 
     bind->local_port = lport;
 
@@ -400,7 +392,6 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
     struct domain *d = current->domain;
     int            virq = bind->virq, vcpu = bind->vcpu;
     int            rc = 0;
-    unsigned long  flags;
 
     if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
         return -EINVAL;
@@ -438,14 +429,14 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
 
     chn = evtchn_from_port(d, port);
 
-    spin_lock_irqsave(&chn->lock, flags);
+    evtchn_write_lock(chn);
 
     chn->state          = ECS_VIRQ;
     chn->notify_vcpu_id = vcpu;
     chn->u.virq         = virq;
     evtchn_port_init(d, chn);
 
-    spin_unlock_irqrestore(&chn->lock, flags);
+    evtchn_write_unlock(chn);
 
     v->virq_to_evtchn[virq] = bind->port = port;
 
@@ -462,7 +453,6 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
     struct domain *d = current->domain;
     int            port, vcpu = bind->vcpu;
     long           rc = 0;
-    unsigned long  flags;
 
     if ( domain_vcpu(d, vcpu) == NULL )
         return -ENOENT;
@@ -474,13 +464,13 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
 
     chn = evtchn_from_port(d, port);
 
-    spin_lock_irqsave(&chn->lock, flags);
+    evtchn_write_lock(chn);
 
     chn->state          = ECS_IPI;
     chn->notify_vcpu_id = vcpu;
     evtchn_port_init(d, chn);
 
-    spin_unlock_irqrestore(&chn->lock, flags);
+    evtchn_write_unlock(chn);
 
     bind->port = port;
 
@@ -524,7 +514,6 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
     struct pirq   *info;
     int            port = 0, pirq = bind->pirq;
     long           rc;
-    unsigned long  flags;
 
     if ( (pirq < 0) || (pirq >= d->nr_pirqs) )
         return -EINVAL;
@@ -557,14 +546,14 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind)
         goto out;
     }
 
-    spin_lock_irqsave(&chn->lock, flags);
+    evtchn_write_lock(chn);
 
     chn->state  = ECS_PIRQ;
     chn->u.pirq.irq = pirq;
     link_pirq_port(port, chn, v);
     evtchn_port_init(d, chn);
 
-    spin_unlock_irqrestore(&chn->lock, flags);
+    evtchn_write_unlock(chn);
 
     bind->port = port;
 
@@ -585,7 +574,6 @@ int evtchn_close(struct domain *d1, int port1, bool guest)
     struct evtchn *chn1, *chn2;
     int            port2;
     long           rc = 0;
-    unsigned long  flags;
 
  again:
     spin_lock(&d1->event_lock);
@@ -686,14 +674,14 @@ int evtchn_close(struct domain *d1, int port1, bool guest)
         BUG_ON(chn2->state != ECS_INTERDOMAIN);
         BUG_ON(chn2->u.interdomain.remote_dom != d1);
 
-        flags = double_evtchn_lock(chn1, chn2);
+        double_evtchn_lock(chn1, chn2);
 
         evtchn_free(d1, chn1);
 
         chn2->state = ECS_UNBOUND;
         chn2->u.unbound.remote_domid = d1->domain_id;
 
-        double_evtchn_unlock(chn1, chn2, flags);
+        double_evtchn_unlock(chn1, chn2);
 
         goto out;
 
@@ -701,9 +689,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest)
         BUG();
     }
 
-    spin_lock_irqsave(&chn1->lock, flags);
+    evtchn_write_lock(chn1);
     evtchn_free(d1, chn1);
-    spin_unlock_irqrestore(&chn1->lock, flags);
+    evtchn_write_unlock(chn1);
 
  out:
     if ( d2 != NULL )
@@ -723,7 +711,6 @@ int evtchn_send(struct domain *ld, unsigned int lport)
     struct evtchn *lchn, *rchn;
     struct domain *rd;
     int            rport, ret = 0;
-    unsigned long  flags;
 
     if ( !port_is_valid(ld, lport) )
         return -EINVAL;
@@ -736,7 +723,8 @@ int evtchn_send(struct domain *ld, unsigned int lport)
 
     lchn = evtchn_from_port(ld, lport);
 
-    spin_lock_irqsave(&lchn->lock, flags);
+    if ( !evtchn_read_trylock(lchn) )
+        return 0;
 
     /* Guest cannot send via a Xen-attached event channel. */
     if ( unlikely(consumer_is_xen(lchn)) )
@@ -771,7 +759,7 @@ int evtchn_send(struct domain *ld, unsigned int lport)
     }
 
 out:
-    spin_unlock_irqrestore(&lchn->lock, flags);
+    evtchn_read_unlock(lchn);
 
     return ret;
 }
@@ -798,9 +786,11 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
 
     d = v->domain;
     chn = evtchn_from_port(d, port);
-    spin_lock(&chn->lock);
-    evtchn_port_set_pending(d, v->vcpu_id, chn);
-    spin_unlock(&chn->lock);
+    if ( evtchn_read_trylock(chn) )
+    {
+        evtchn_port_set_pending(d, v->vcpu_id, chn);
+        evtchn_read_unlock(chn);
+    }
 
  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
@@ -829,9 +819,11 @@ void send_guest_global_virq(struct domain *d, uint32_t virq)
         goto out;
 
     chn = evtchn_from_port(d, port);
-    spin_lock(&chn->lock);
-    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
-    spin_unlock(&chn->lock);
+    if ( evtchn_read_trylock(chn) )
+    {
+        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
+        evtchn_read_unlock(chn);
+    }
 
  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
@@ -841,7 +833,6 @@ void send_guest_pirq(struct domain *d, const struct pirq *pirq)
 {
     int port;
     struct evtchn *chn;
-    unsigned long flags;
 
     /*
      * PV guests: It should not be possible to race with __evtchn_close(). The
@@ -856,9 +847,11 @@ void send_guest_pirq(struct domain *d, const struct pirq *pirq)
     }
 
     chn = evtchn_from_port(d, port);
-    spin_lock_irqsave(&chn->lock, flags);
-    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
-    spin_unlock_irqrestore(&chn->lock, flags);
+    if ( evtchn_read_trylock(chn) )
+    {
+        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
+        evtchn_read_unlock(chn);
+    }
 }
 
 static struct domain *global_virq_handlers[NR_VIRQS] __read_mostly;
@@ -1060,15 +1053,16 @@ int evtchn_unmask(unsigned int port)
 {
     struct domain *d = current->domain;
     struct evtchn *evtchn;
-    unsigned long flags;
 
     if ( unlikely(!port_is_valid(d, port)) )
         return -EINVAL;
 
     evtchn = evtchn_from_port(d, port);
-    spin_lock_irqsave(&evtchn->lock, flags);
-    evtchn_port_unmask(d, evtchn);
-    spin_unlock_irqrestore(&evtchn->lock, flags);
+    if ( evtchn_read_trylock(evtchn) )
+    {
+        evtchn_port_unmask(d, evtchn);
+        evtchn_read_unlock(evtchn);
+    }
 
     return 0;
 }
@@ -1327,7 +1321,6 @@ int alloc_unbound_xen_event_channel(
 {
     struct evtchn *chn;
     int            port, rc;
-    unsigned long  flags;
 
     spin_lock(&ld->event_lock);
 
@@ -1340,14 +1333,14 @@ int alloc_unbound_xen_event_channel(
     if ( rc )
         goto out;
 
-    spin_lock_irqsave(&chn->lock, flags);
+    evtchn_write_lock(chn);
 
     chn->state = ECS_UNBOUND;
     chn->xen_consumer = get_xen_consumer(notification_fn);
     chn->notify_vcpu_id = lvcpu;
     chn->u.unbound.remote_domid = remote_domid;
 
-    spin_unlock_irqrestore(&chn->lock, flags);
+    evtchn_write_unlock(chn);
 
     /*
      * Increment ->xen_evtchns /after/ ->active_evtchns. No explicit
@@ -1383,7 +1376,6 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
 {
     struct evtchn *lchn, *rchn;
     struct domain *rd;
-    unsigned long flags;
 
     if ( !port_is_valid(ld, lport) )
     {
@@ -1398,7 +1390,8 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
 
     lchn = evtchn_from_port(ld, lport);
 
-    spin_lock_irqsave(&lchn->lock, flags);
+    if ( !evtchn_read_trylock(lchn) )
+        return;
 
     if ( likely(lchn->state == ECS_INTERDOMAIN) )
     {
@@ -1408,7 +1401,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)
         evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
     }
 
-    spin_unlock_irqrestore(&lchn->lock, flags);
+    evtchn_read_unlock(lchn);
 }
 
 void evtchn_check_pollers(struct domain *d, unsigned int port)
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 509d3ae861..592e0dc22d 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -105,6 +105,60 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
 #define bucket_from_port(d, p) \
     ((group_from_port(d, p))[((p) % EVTCHNS_PER_GROUP) / EVTCHNS_PER_BUCKET])
 
+#define EVENT_WRITE_LOCK_INC    INT_MIN
+
+/*
+ * Lock an event channel exclusively. This is allowed only with holding
+ * d->event_lock AND when the channel is free or unbound either when taking
+ * or when releasing the lock, as any concurrent operation on the event
+ * channel using evtchn_read_trylock() will just assume the event channel is
+ * free or unbound at the moment.
+ */
+static inline void evtchn_write_lock(struct evtchn *evtchn)
+{
+    int val;
+
+    /*
+     * The lock can't be held by a writer already, as all writers need to
+     * hold d->event_lock.
+     */
+    ASSERT(atomic_read(&evtchn->lock) >= 0);
+
+    /* No barrier needed, atomic_add_return() is full barrier. */
+    for ( val = atomic_add_return(EVENT_WRITE_LOCK_INC, &evtchn->lock);
+          val != EVENT_WRITE_LOCK_INC;
+          val = atomic_read(&evtchn->lock) )
+        cpu_relax();
+}
+
+static inline void evtchn_write_unlock(struct evtchn *evtchn)
+{
+    arch_lock_release_barrier();
+
+    atomic_sub(EVENT_WRITE_LOCK_INC, &evtchn->lock);
+}
+
+static inline bool evtchn_read_trylock(struct evtchn *evtchn)
+{
+    if ( atomic_read(&evtchn->lock) < 0 )
+        return false;
+
+    /* No barrier needed, atomic_inc_return() is full barrier. */
+    if ( atomic_inc_return(&evtchn->lock) >= 0 )
+        return true;
+
+    atomic_dec(&evtchn->lock);
+
+    return false;
+}
+
+static inline void evtchn_read_unlock(struct evtchn *evtchn)
+{
+    arch_lock_release_barrier();
+
+    atomic_dec(&evtchn->lock);
+}
+
 static inline unsigned int max_evtchns(const struct domain *d)
 {
     return d->evtchn_fifo ? EVTCHN_FIFO_NR_CHANNELS
@@ -249,12 +303,13 @@ static inline bool evtchn_is_masked(const struct domain *d,
 static inline bool evtchn_port_is_masked(struct domain *d, evtchn_port_t port)
 {
     struct evtchn *evtchn = evtchn_from_port(d, port);
-    bool rc;
-    unsigned long flags;
+    bool rc = true;
 
-    spin_lock_irqsave(&evtchn->lock, flags);
-    rc = evtchn_is_masked(d, evtchn);
-    spin_unlock_irqrestore(&evtchn->lock, flags);
+    if ( evtchn_read_trylock(evtchn) )
+    {
+        rc = evtchn_is_masked(d, evtchn);
+        evtchn_read_unlock(evtchn);
+    }
 
     return rc;
 }
@@ -274,12 +329,13 @@ static inline int evtchn_port_poll(struct domain *d, evtchn_port_t port)
     if ( port_is_valid(d, port) )
     {
         struct evtchn *evtchn = evtchn_from_port(d, port);
-        unsigned long flags;
 
-        spin_lock_irqsave(&evtchn->lock, flags);
-        if ( evtchn_usable(evtchn) )
-            rc = evtchn_is_pending(d, evtchn);
-        spin_unlock_irqrestore(&evtchn->lock, flags);
+        if ( evtchn_read_trylock(evtchn) )
+        {
+            if ( evtchn_usable(evtchn) )
+                rc = evtchn_is_pending(d, evtchn);
+            evtchn_read_unlock(evtchn);
+        }
     }
 
     return rc;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index a298ff4df8..096e0ec6af 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -85,7 +85,7 @@ extern domid_t hardware_domid;
 
 struct evtchn
 {
-    spinlock_t lock;
+    atomic_t lock;         /* kind of rwlock, use evtchn_*_[un]lock()        */
 #define ECS_FREE         0 /* Channel is available for use.                  */
 #define ECS_RESERVED     1 /* Channel is reserved.                           */
 #define ECS_UNBOUND      2 /* Channel is waiting to bind to a remote domain. */
-- 
2.26.2




* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-10-16 10:58 ` [PATCH v3 2/2] xen/evtchn: rework per event channel lock Juergen Gross
@ 2020-10-20  9:28   ` Jan Beulich
  2020-11-02 13:41     ` Jürgen Groß
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2020-10-20  9:28 UTC
  To: Juergen Gross
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

On 16.10.2020 12:58, Juergen Gross wrote:
> --- a/xen/arch/x86/pv/shim.c
> +++ b/xen/arch/x86/pv/shim.c
> @@ -660,11 +660,12 @@ void pv_shim_inject_evtchn(unsigned int port)
>      if ( port_is_valid(guest, port) )
>      {
>          struct evtchn *chn = evtchn_from_port(guest, port);
> -        unsigned long flags;
>  
> -        spin_lock_irqsave(&chn->lock, flags);
> -        evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
> -        spin_unlock_irqrestore(&chn->lock, flags);
> +        if ( evtchn_read_trylock(chn) )
> +        {
> +            evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
> +            evtchn_read_unlock(chn);
> +        }

Does this want some form of else, e.g. at least a printk_once()?

> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>      if ( rc )
>          goto out;
>  
> -    flags = double_evtchn_lock(lchn, rchn);
> +    double_evtchn_lock(lchn, rchn);

This introduces an unfortunate conflict with my conversion of
the per-domain event lock to an rw one: It acquires rd's lock
in read mode only, while the requirements here would not allow
doing so. (Same in evtchn_close() then.)

> @@ -736,7 +723,8 @@ int evtchn_send(struct domain *ld, unsigned int lport)
>  
>      lchn = evtchn_from_port(ld, lport);
>  
> -    spin_lock_irqsave(&lchn->lock, flags);
> +    if ( !evtchn_read_trylock(lchn) )
> +        return 0;

With this, the auxiliary call to xsm_evtchn_send() up from
here should also go away again (possibly in a separate follow-
on, which would then likely be a clean revert).

> @@ -798,9 +786,11 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
>  
>      d = v->domain;
>      chn = evtchn_from_port(d, port);
> -    spin_lock(&chn->lock);
> -    evtchn_port_set_pending(d, v->vcpu_id, chn);
> -    spin_unlock(&chn->lock);
> +    if ( evtchn_read_trylock(chn) )
> +    {
> +        evtchn_port_set_pending(d, v->vcpu_id, chn);
> +        evtchn_read_unlock(chn);
> +    }
>  
>   out:
>      spin_unlock_irqrestore(&v->virq_lock, flags);
> @@ -829,9 +819,11 @@ void send_guest_global_virq(struct domain *d, uint32_t virq)
>          goto out;
>  
>      chn = evtchn_from_port(d, port);
> -    spin_lock(&chn->lock);
> -    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
> -    spin_unlock(&chn->lock);
> +    if ( evtchn_read_trylock(chn) )
> +    {
> +        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
> +        evtchn_read_unlock(chn);
> +    }
>  
>   out:
>      spin_unlock_irqrestore(&v->virq_lock, flags);

As said before, I think these lock uses can go away altogether.
I shall put together a patch.

And on the whole I'd really prefer if we first convinced ourselves
that there's no way to simply get rid of the IRQ-safe locking
forms instead, before taking a decision to go with this model with
its extra constraints.

> @@ -1060,15 +1053,16 @@ int evtchn_unmask(unsigned int port)
>  {
>      struct domain *d = current->domain;
>      struct evtchn *evtchn;
> -    unsigned long flags;
>  
>      if ( unlikely(!port_is_valid(d, port)) )
>          return -EINVAL;
>  
>      evtchn = evtchn_from_port(d, port);
> -    spin_lock_irqsave(&evtchn->lock, flags);
> -    evtchn_port_unmask(d, evtchn);
> -    spin_unlock_irqrestore(&evtchn->lock, flags);
> +    if ( evtchn_read_trylock(evtchn) )
> +    {
> +        evtchn_port_unmask(d, evtchn);
> +        evtchn_read_unlock(evtchn);
> +    }

I think this wants mentioning together with send / query in the
description.

> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -105,6 +105,60 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
>  #define bucket_from_port(d, p) \
>      ((group_from_port(d, p))[((p) % EVTCHNS_PER_GROUP) / EVTCHNS_PER_BUCKET])
>  
> +#define EVENT_WRITE_LOCK_INC    INT_MIN
> +
> +/*
> + * Lock an event channel exclusively. This is allowed only with holding
> + * d->event_lock AND when the channel is free or unbound either when taking
> + * or when releasing the lock, as any concurrent operation on the event
> + * channel using evtchn_read_trylock() will just assume the event channel is
> + * free or unbound at the moment.

... when the evtchn_read_trylock() returns false.

> + */
> +static inline void evtchn_write_lock(struct evtchn *evtchn)
> +{
> +    int val;
> +
> +    /*
> +     * The lock can't be held by a writer already, as all writers need to
> +     * hold d->event_lock.
> +     */
> +    ASSERT(atomic_read(&evtchn->lock) >= 0);
> +
> +    /* No barrier needed, atomic_add_return() is full barrier. */
> +    for ( val = atomic_add_return(EVENT_WRITE_LOCK_INC, &evtchn->lock);
> +          val != EVENT_WRITE_LOCK_INC;

The _INC suffix is slightly odd for this 2nd use, but I guess
the dual use will make it so for about any name you may pick.

> +          val = atomic_read(&evtchn->lock) )
> +        cpu_relax();
> +}
> +
> +static inline void evtchn_write_unlock(struct evtchn *evtchn)
> +{
> +    arch_lock_release_barrier();
> +
> +    atomic_sub(EVENT_WRITE_LOCK_INC, &evtchn->lock);
> +}
> +
> +static inline bool evtchn_read_trylock(struct evtchn *evtchn)
> +{
> +    if ( atomic_read(&evtchn->lock) < 0 )
> +        return false;
> +
> +    /* No barrier needed, atomic_inc_return() is full barrier. */
> +    if ( atomic_inc_return(&evtchn->lock) >= 0 )

atomic_*_return() return the new value, so I think you mean ">"
here?
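
To illustrate the point about the returned value, here is a small
stand-alone sketch using C11 <stdatomic.h>. The helper inc_return()
is hypothetical and only mimics an increment that hands back the new
value; it is not Xen's atomic_inc_return().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/*
 * Hypothetical helper: increment and return the NEW value.
 * atomic_fetch_add() yields the old value, so add 1 to it.
 * Assumes the old value is below INT_MAX.
 */
static int inc_return(atomic_int *v)
{
    return atomic_fetch_add(v, 1) + 1;
}
```

Starting from an unlocked counter of 0 the helper returns 1, and
starting from a counter holding the writer marker INT_MIN it returns
INT_MIN + 1, which is still negative. Under these assumptions a
successful reader always sees a strictly positive result, so "> 0"
would be the tightest success check.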

Jan



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-10-20  9:28   ` Jan Beulich
@ 2020-11-02 13:41     ` Jürgen Groß
  2020-11-02 13:52       ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Jürgen Groß @ 2020-11-02 13:41 UTC
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

On 20.10.20 11:28, Jan Beulich wrote:
> On 16.10.2020 12:58, Juergen Gross wrote:
>> --- a/xen/arch/x86/pv/shim.c
>> +++ b/xen/arch/x86/pv/shim.c
>> @@ -660,11 +660,12 @@ void pv_shim_inject_evtchn(unsigned int port)
>>       if ( port_is_valid(guest, port) )
>>       {
>>           struct evtchn *chn = evtchn_from_port(guest, port);
>> -        unsigned long flags;
>>   
>> -        spin_lock_irqsave(&chn->lock, flags);
>> -        evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
>> -        spin_unlock_irqrestore(&chn->lock, flags);
>> +        if ( evtchn_read_trylock(chn) )
>> +        {
>> +            evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
>> +            evtchn_read_unlock(chn);
>> +        }
> 
> Does this want some form of else, e.g. at least a printk_once()?

No, I don't think so.

This is just a race with the port_is_valid() test above where the
port is just being switched to invalid.

> 
>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>       if ( rc )
>>           goto out;
>>   
>> -    flags = double_evtchn_lock(lchn, rchn);
>> +    double_evtchn_lock(lchn, rchn);
> 
> This introduces an unfortunate conflict with my conversion of
> the per-domain event lock to an rw one: It acquires rd's lock
> in read mode only, while the requirements here would not allow
> doing so. (Same in evtchn_close() then.)

Is it a problem to use write mode for those cases?

> 
>> @@ -736,7 +723,8 @@ int evtchn_send(struct domain *ld, unsigned int lport)
>>   
>>       lchn = evtchn_from_port(ld, lport);
>>   
>> -    spin_lock_irqsave(&lchn->lock, flags);
>> +    if ( !evtchn_read_trylock(lchn) )
>> +        return 0;
> 
> With this, the auxiliary call to xsm_evtchn_send() up from
> here should also go away again (possibly in a separate follow-
> on, which would then likely be a clean revert).

Yes.

> 
>> @@ -798,9 +786,11 @@ void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
>>   
>>       d = v->domain;
>>       chn = evtchn_from_port(d, port);
>> -    spin_lock(&chn->lock);
>> -    evtchn_port_set_pending(d, v->vcpu_id, chn);
>> -    spin_unlock(&chn->lock);
>> +    if ( evtchn_read_trylock(chn) )
>> +    {
>> +        evtchn_port_set_pending(d, v->vcpu_id, chn);
>> +        evtchn_read_unlock(chn);
>> +    }
>>   
>>    out:
>>       spin_unlock_irqrestore(&v->virq_lock, flags);
>> @@ -829,9 +819,11 @@ void send_guest_global_virq(struct domain *d, uint32_t virq)
>>           goto out;
>>   
>>       chn = evtchn_from_port(d, port);
>> -    spin_lock(&chn->lock);
>> -    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
>> -    spin_unlock(&chn->lock);
>> +    if ( evtchn_read_trylock(chn) )
>> +    {
>> +        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
>> +        evtchn_read_unlock(chn);
>> +    }
>>   
>>    out:
>>       spin_unlock_irqrestore(&v->virq_lock, flags);
> 
> As said before, I think these lock uses can go away altogether.
> I shall put together a patch.
> 
> And on the whole I'd really prefer if we first convinced ourselves
> that there's no way to simply get rid of the IRQ-safe locking
> forms instead, before taking a decision to go with this model with
> its extra constraints.
> 
>> @@ -1060,15 +1053,16 @@ int evtchn_unmask(unsigned int port)
>>   {
>>       struct domain *d = current->domain;
>>       struct evtchn *evtchn;
>> -    unsigned long flags;
>>   
>>       if ( unlikely(!port_is_valid(d, port)) )
>>           return -EINVAL;
>>   
>>       evtchn = evtchn_from_port(d, port);
>> -    spin_lock_irqsave(&evtchn->lock, flags);
>> -    evtchn_port_unmask(d, evtchn);
>> -    spin_unlock_irqrestore(&evtchn->lock, flags);
>> +    if ( evtchn_read_trylock(evtchn) )
>> +    {
>> +        evtchn_port_unmask(d, evtchn);
>> +        evtchn_read_unlock(evtchn);
>> +    }
> 
> I think this wants mentioning together with send / query in the
> description.

Okay.

> 
>> --- a/xen/include/xen/event.h
>> +++ b/xen/include/xen/event.h
>> @@ -105,6 +105,60 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
>>   #define bucket_from_port(d, p) \
>>       ((group_from_port(d, p))[((p) % EVTCHNS_PER_GROUP) / EVTCHNS_PER_BUCKET])
>>   
>> +#define EVENT_WRITE_LOCK_INC    INT_MIN
>> +
>> +/*
>> + * Lock an event channel exclusively. This is allowed only with holding
>> + * d->event_lock AND when the channel is free or unbound either when taking
>> + * or when releasing the lock, as any concurrent operation on the event
>> + * channel using evtchn_read_trylock() will just assume the event channel is
>> + * free or unbound at the moment.
> 
> ... when the evtchn_read_trylock() returns false.

Okay.

> 
>> + */
>> +static inline void evtchn_write_lock(struct evtchn *evtchn)
>> +{
>> +    int val;
>> +
>> +    /*
>> +     * The lock can't be held by a writer already, as all writers need to
>> +     * hold d->event_lock.
>> +     */
>> +    ASSERT(atomic_read(&evtchn->lock) >= 0);
>> +
>> +    /* No barrier needed, atomic_add_return() is full barrier. */
>> +    for ( val = atomic_add_return(EVENT_WRITE_LOCK_INC, &evtchn->lock);
>> +          val != EVENT_WRITE_LOCK_INC;
> 
> The _INC suffix is slightly odd for this 2nd use, but I guess
> the dual use will make it so for about any name you may pick.

I'll switch to a normal rwlock in V4.


Juergen



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-02 13:41     ` Jürgen Groß
@ 2020-11-02 13:52       ` Jan Beulich
  2020-11-02 13:59         ` Jürgen Groß
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2020-11-02 13:52 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

On 02.11.2020 14:41, Jürgen Groß wrote:
> On 20.10.20 11:28, Jan Beulich wrote:
>> On 16.10.2020 12:58, Juergen Gross wrote:
>>> --- a/xen/arch/x86/pv/shim.c
>>> +++ b/xen/arch/x86/pv/shim.c
>>> @@ -660,11 +660,12 @@ void pv_shim_inject_evtchn(unsigned int port)
>>>       if ( port_is_valid(guest, port) )
>>>       {
>>>           struct evtchn *chn = evtchn_from_port(guest, port);
>>> -        unsigned long flags;
>>>   
>>> -        spin_lock_irqsave(&chn->lock, flags);
>>> -        evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
>>> -        spin_unlock_irqrestore(&chn->lock, flags);
>>> +        if ( evtchn_read_trylock(chn) )
>>> +        {
>>> +            evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
>>> +            evtchn_read_unlock(chn);
>>> +        }
>>
>> Does this want some form of else, e.g. at least a printk_once()?
> 
> No, I don't think so.
> 
> This is just a race with the port_is_valid() test above where the
> port is just being switched to invalid.

This may be such a race yes, but why do you think it _will_ be?
Any holding of the lock for writing (or in fact, any pending
acquire in write mode) will make this fail, which - if it's not
such a race - will mean an event which wasn't sent when it
should have been, with potentially fatal (to the guest)
consequences.
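
A minimal sketch of that failure mode, assuming the same counter layout
as the patch appears to use (0 = free, > 0 = readers, < 0 = writer
present). struct sketch_chn, sketch_send() and the dropped counter are
hypothetical illustrations, not Xen code; the counter merely stands in
for "some form of else":

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified channel: not Xen's struct evtchn. */
struct sketch_chn {
    atomic_int lock;        /* 0 free, > 0 readers, < 0 writer present */
    atomic_bool pending;    /* stands in for the guest-visible pending bit */
    atomic_uint dropped;    /* diagnostic only: sends skipped on failure */
};

static bool sketch_read_trylock(struct sketch_chn *chn)
{
    if ( atomic_fetch_add(&chn->lock, 1) + 1 > 0 )
        return true;

    atomic_fetch_sub(&chn->lock, 1);
    return false;
}

/* Returns true if the event was marked pending, false if it was skipped. */
static bool sketch_send(struct sketch_chn *chn)
{
    if ( !sketch_read_trylock(chn) )
    {
        /* Any writer, not just a port going invalid, lands here. */
        atomic_fetch_add(&chn->dropped, 1);
        return false;
    }

    atomic_store(&chn->pending, true);
    atomic_fetch_sub(&chn->lock, 1);    /* read unlock */
    return true;
}
```

Whether a skipped send is harmless then depends entirely on what the
writer is doing at that moment, which is the question raised above.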

>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>       if ( rc )
>>>           goto out;
>>>   
>>> -    flags = double_evtchn_lock(lchn, rchn);
>>> +    double_evtchn_lock(lchn, rchn);
>>
>> This introduces an unfortunate conflict with my conversion of
>> the per-domain event lock to an rw one: It acquires rd's lock
>> in read mode only, while the requirements here would not allow
>> doing so. (Same in evtchn_close() then.)
> 
> Is it a problem to use write mode for those cases?

"Problem" can have a wide range of meanings - it's not going to
be the end of the world, but I view any use of a write lock as
a problem when a read lock would suffice. This can still harm
parallelism.

Jan



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-02 13:52       ` Jan Beulich
@ 2020-11-02 13:59         ` Jürgen Groß
  2020-11-02 15:18           ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Jürgen Groß @ 2020-11-02 13:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

On 02.11.20 14:52, Jan Beulich wrote:
> On 02.11.2020 14:41, Jürgen Groß wrote:
>> On 20.10.20 11:28, Jan Beulich wrote:
>>> On 16.10.2020 12:58, Juergen Gross wrote:
>>>> --- a/xen/arch/x86/pv/shim.c
>>>> +++ b/xen/arch/x86/pv/shim.c
>>>> @@ -660,11 +660,12 @@ void pv_shim_inject_evtchn(unsigned int port)
>>>>        if ( port_is_valid(guest, port) )
>>>>        {
>>>>            struct evtchn *chn = evtchn_from_port(guest, port);
>>>> -        unsigned long flags;
>>>>    
>>>> -        spin_lock_irqsave(&chn->lock, flags);
>>>> -        evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
>>>> -        spin_unlock_irqrestore(&chn->lock, flags);
>>>> +        if ( evtchn_read_trylock(chn) )
>>>> +        {
>>>> +            evtchn_port_set_pending(guest, chn->notify_vcpu_id, chn);
>>>> +            evtchn_read_unlock(chn);
>>>> +        }
>>>
>>> Does this want some form of else, e.g. at least a printk_once()?
>>
>> No, I don't think so.
>>
>> This is just a race with the port_is_valid() test above where the
>> port is just being switched to invalid.
> 
> This may be such a race yes, but why do you think it _will_ be?

According to the outlined lock discipline there is no other
possibility (assuming that the lock discipline is honored).

I'll have a look whether I can add some ASSERT()s to catch any
lock discipline violation.

> 
>>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>>        if ( rc )
>>>>            goto out;
>>>>    
>>>> -    flags = double_evtchn_lock(lchn, rchn);
>>>> +    double_evtchn_lock(lchn, rchn);
>>>
>>> This introduces an unfortunate conflict with my conversion of
>>> the per-domain event lock to an rw one: It acquires rd's lock
>>> in read mode only, while the requirements here would not allow
>>> doing so. (Same in evtchn_close() then.)
>>
>> Is it a problem to use write mode for those cases?
> 
> "Problem" can have a wide range of meanings - it's not going to
> be the end of the world, but I view any use of a write lock as
> a problem when a read lock would suffice. This can still harm
> parallelism.

Both cases are very rare in the lifetime of an event channel. I
don't think you'll ever be able to measure any performance impact from
switching these cases to a write lock for any well-behaved guest.


Juergen



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-02 13:59         ` Jürgen Groß
@ 2020-11-02 15:18           ` Jan Beulich
  2020-11-02 15:26             ` Jürgen Groß
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2020-11-02 15:18 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

On 02.11.2020 14:59, Jürgen Groß wrote:
> On 02.11.20 14:52, Jan Beulich wrote:
>> On 02.11.2020 14:41, Jürgen Groß wrote:
>>> On 20.10.20 11:28, Jan Beulich wrote:
>>>> On 16.10.2020 12:58, Juergen Gross wrote:
>>>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>>>        if ( rc )
>>>>>            goto out;
>>>>>    
>>>>> -    flags = double_evtchn_lock(lchn, rchn);
>>>>> +    double_evtchn_lock(lchn, rchn);
>>>>
>>>> This introduces an unfortunate conflict with my conversion of
>>>> the per-domain event lock to an rw one: It acquires rd's lock
>>>> in read mode only, while the requirements here would not allow
>>>> doing so. (Same in evtchn_close() then.)
>>>
>>> Is it a problem to use write mode for those cases?
>>
>> "Problem" can have a wide range of meanings - it's not going to
>> be the end of the world, but I view any use of a write lock as
>> a problem when a read lock would suffice. This can still harm
>> parallelism.
> 
> Both cases are very rare in the lifetime of an event channel. I
> don't think you'll ever be able to measure any performance impact from
> switching these cases to a write lock for any well-behaved guest.

I agree as far as the lifetime of an individual port goes, but
we're talking about the per-domain lock here. (Perhaps my
choice of context in your patch wasn't the best one, as there
it is the per-channel lock of which two instances get acquired.
I'm sorry if this has led to any confusion.)

Jan



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-02 15:18           ` Jan Beulich
@ 2020-11-02 15:26             ` Jürgen Groß
  2020-11-04  9:50               ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Jürgen Groß @ 2020-11-02 15:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini

On 02.11.20 16:18, Jan Beulich wrote:
> On 02.11.2020 14:59, Jürgen Groß wrote:
>> On 02.11.20 14:52, Jan Beulich wrote:
>>> On 02.11.2020 14:41, Jürgen Groß wrote:
>>>> On 20.10.20 11:28, Jan Beulich wrote:
>>>>> On 16.10.2020 12:58, Juergen Gross wrote:
>>>>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>>>>         if ( rc )
>>>>>>             goto out;
>>>>>>     
>>>>>> -    flags = double_evtchn_lock(lchn, rchn);
>>>>>> +    double_evtchn_lock(lchn, rchn);
>>>>>
>>>>> This introduces an unfortunate conflict with my conversion of
>>>>> the per-domain event lock to an rw one: It acquires rd's lock
>>>>> in read mode only, while the requirements here would not allow
>>>>> doing so. (Same in evtchn_close() then.)
>>>>
>>>> Is it a problem to use write mode for those cases?
>>>
>>> "Problem" can have a wide range of meanings - it's not going to
>>> be the end of the world, but I view any use of a write lock as
>>> a problem when a read lock would suffice. This can still harm
>>> parallelism.
>>
>> Both cases are very rare in the lifetime of an event channel. I
>> don't think you'll ever be able to measure any performance impact from
>> switching these cases to a write lock for any well-behaved guest.
> 
> I agree as far as the lifetime of an individual port goes, but
> we're talking about the per-domain lock here. (Perhaps my
> choice of context in your patch wasn't the best one, as there
> it is the per-channel lock of which two instances get acquired.
> I'm sorry if this has led to any confusion.)

Hmm, with the switch to an ordinary rwlock it should be fine to drop
the requirement to hold the domain's event channel lock exclusively
for taking the per-channel lock as a writer.


Juergen



* Re: [PATCH v3 1/2] xen/events: access last_priority and last_vcpu_id together
  2020-10-16 10:58 ` [PATCH v3 1/2] xen/events: access last_priority and last_vcpu_id together Juergen Gross
@ 2020-11-04  9:42   ` Julien Grall
  0 siblings, 0 replies; 13+ messages in thread
From: Julien Grall @ 2020-11-04  9:42 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu

Hi Juergen,

On 16/10/2020 11:58, Juergen Gross wrote:
> The queue for a fifo event depends on the vcpu_id and the priority
> of the event. When sending an event, the event might need to change
> queues, and the old queue must be remembered to keep the links
> between queue elements intact. For this purpose the event channel
> contains last_priority and last_vcpu_id values, which identify the
> old queue.
> 
> To avoid races, always access last_priority and last_vcpu_id with a
> single atomic operation, so the two values cannot be observed in an
> inconsistent state.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-02 15:26             ` Jürgen Groß
@ 2020-11-04  9:50               ` Julien Grall
  2020-11-04  9:56                 ` Jürgen Groß
  0 siblings, 1 reply; 13+ messages in thread
From: Julien Grall @ 2020-11-04  9:50 UTC (permalink / raw)
  To: Jürgen Groß, Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Stefano Stabellini

Hi Juergen,

On 02/11/2020 15:26, Jürgen Groß wrote:
> On 02.11.20 16:18, Jan Beulich wrote:
>> On 02.11.2020 14:59, Jürgen Groß wrote:
>>> On 02.11.20 14:52, Jan Beulich wrote:
>>>> On 02.11.2020 14:41, Jürgen Groß wrote:
>>>>> On 20.10.20 11:28, Jan Beulich wrote:
>>>>>> On 16.10.2020 12:58, Juergen Gross wrote:
>>>>>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>>>>>         if ( rc )
>>>>>>>             goto out;
>>>>>>> -    flags = double_evtchn_lock(lchn, rchn);
>>>>>>> +    double_evtchn_lock(lchn, rchn);
>>>>>>
>>>>>> This introduces an unfortunate conflict with my conversion of
>>>>>> the per-domain event lock to an rw one: It acquires rd's lock
>>>>>> in read mode only, while the requirements here would not allow
>>>>>> doing so. (Same in evtchn_close() then.)
>>>>>
>>>>> Is it a problem to use write mode for those cases?
>>>>
>>>> "Problem" can have a wide range of meanings - it's not going to
>>>> be the end of the world, but I view any use of a write lock as
>>>> a problem when a read lock would suffice. This can still harm
>>>> parallelism.
>>>
>>> Both cases are very rare in the lifetime of an event channel. I
>>> don't think you'll ever be able to measure any performance impact from
>>> switching these cases to a write lock for any well-behaved guest.
>>
>> I agree as far as the lifetime of an individual port goes, but
>> we're talking about the per-domain lock here. (Perhaps my
>> choice of context in your patch wasn't the best one, as there
>> it is the per-channel lock of which two instances get acquired.
>> I'm sorry if this has led to any confusion.)
> 
> Hmm, with the switch to an ordinary rwlock it should be fine to drop
> the requirement to hold the domain's event channel lock exclusively
> for taking the per-channel lock as a writer.

I don't think you can drop d->event_lock. It protects us against 
allocating new ports while evtchn_reset() is called.

Without it, you are going to re-open XSA-343.

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-04  9:50               ` Julien Grall
@ 2020-11-04  9:56                 ` Jürgen Groß
  2020-11-04 10:02                   ` Julien Grall
  0 siblings, 1 reply; 13+ messages in thread
From: Jürgen Groß @ 2020-11-04  9:56 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Stefano Stabellini

On 04.11.20 10:50, Julien Grall wrote:
> Hi Juergen,
> 
> On 02/11/2020 15:26, Jürgen Groß wrote:
>> On 02.11.20 16:18, Jan Beulich wrote:
>>> On 02.11.2020 14:59, Jürgen Groß wrote:
>>>> On 02.11.20 14:52, Jan Beulich wrote:
>>>>> On 02.11.2020 14:41, Jürgen Groß wrote:
>>>>>> On 20.10.20 11:28, Jan Beulich wrote:
>>>>>>> On 16.10.2020 12:58, Juergen Gross wrote:
>>>>>>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>>>>>>         if ( rc )
>>>>>>>>             goto out;
>>>>>>>> -    flags = double_evtchn_lock(lchn, rchn);
>>>>>>>> +    double_evtchn_lock(lchn, rchn);
>>>>>>>
>>>>>>> This introduces an unfortunate conflict with my conversion of
>>>>>>> the per-domain event lock to an rw one: It acquires rd's lock
>>>>>>> in read mode only, while the requirements here would not allow
>>>>>>> doing so. (Same in evtchn_close() then.)
>>>>>>
>>>>>> Is it a problem to use write mode for those cases?
>>>>>
>>>>> "Problem" can have a wide range of meanings - it's not going to
>>>>> be the end of the world, but I view any use of a write lock as
>>>>> a problem when a read lock would suffice. This can still harm
>>>>> parallelism.
>>>>
>>>> Both cases are very rare in the lifetime of an event channel. I
>>>> don't think you'll ever be able to measure any performance impact from
>>>> switching these cases to a write lock for any well-behaved guest.
>>>
>>> I agree as far as the lifetime of an individual port goes, but
>>> we're talking about the per-domain lock here. (Perhaps my
>>> choice of context in your patch wasn't the best one, as there
>>> it is the per-channel lock of which two instances get acquired.
>>> I'm sorry if this has led to any confusion.)
>>
>> Hmm, with the switch to an ordinary rwlock it should be fine to drop
>> the requirement to hold the domain's event channel lock exclusively
>> for taking the per-channel lock as a writer.
> 
> I don't think you can drop d->event_lock. It protects us against 
> allocating new ports while evtchn_reset() is called.

I wrote "exclusively": after a switch to an rwlock it should be fine to
hold it as a reader, as long as the reset code takes it as a writer.

> Without it, you are going to re-open XSA-343.

Yes, of course.


Juergen



* Re: [PATCH v3 2/2] xen/evtchn: rework per event channel lock
  2020-11-04  9:56                 ` Jürgen Groß
@ 2020-11-04 10:02                   ` Julien Grall
  0 siblings, 0 replies; 13+ messages in thread
From: Julien Grall @ 2020-11-04 10:02 UTC (permalink / raw)
  To: Jürgen Groß, Jan Beulich
  Cc: xen-devel, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Ian Jackson, Stefano Stabellini



On 04/11/2020 09:56, Jürgen Groß wrote:
> On 04.11.20 10:50, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 02/11/2020 15:26, Jürgen Groß wrote:
>>> On 02.11.20 16:18, Jan Beulich wrote:
>>>> On 02.11.2020 14:59, Jürgen Groß wrote:
>>>>> On 02.11.20 14:52, Jan Beulich wrote:
>>>>>> On 02.11.2020 14:41, Jürgen Groß wrote:
>>>>>>> On 20.10.20 11:28, Jan Beulich wrote:
>>>>>>>> On 16.10.2020 12:58, Juergen Gross wrote:
>>>>>>>>> @@ -360,7 +352,7 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind)
>>>>>>>>>         if ( rc )
>>>>>>>>>             goto out;
>>>>>>>>> -    flags = double_evtchn_lock(lchn, rchn);
>>>>>>>>> +    double_evtchn_lock(lchn, rchn);
>>>>>>>>
>>>>>>>> This introduces an unfortunate conflict with my conversion of
>>>>>>>> the per-domain event lock to an rw one: It acquires rd's lock
>>>>>>>> in read mode only, while the requirements here would not allow
>>>>>>>> doing so. (Same in evtchn_close() then.)
>>>>>>>
>>>>>>> Is it a problem to use write mode for those cases?
>>>>>>
>>>>>> "Problem" can have a wide range of meanings - it's not going to
>>>>>> be the end of the world, but I view any use of a write lock as
>>>>>> a problem when a read lock would suffice. This can still harm
>>>>>> parallelism.
>>>>>
>>>>> Both cases are very rare in the lifetime of an event channel. I
>>>>> don't think you'll ever be able to measure any performance impact from
>>>>> switching these cases to a write lock for any well-behaved guest.
>>>>
>>>> I agree as far as the lifetime of an individual port goes, but
>>>> we're talking about the per-domain lock here. (Perhaps my
>>>> choice of context in your patch wasn't the best one, as there
>>>> it is the per-channel lock of which two instances get acquired.
>>>> I'm sorry if this has led to any confusion.)
>>>
>>> Hmm, with the switch to an ordinary rwlock it should be fine to drop
>>> the requirement to hold the domain's event channel lock exclusively
>>> for taking the per-channel lock as a writer.
>>
>> I don't think you can drop d->event_lock. It protects us against 
>> allocating new ports while evtchn_reset() is called.
> 
> I wrote "exclusively": after a switch to an rwlock it should be fine to
> hold it as a reader, as long as the reset code takes it as a writer.

Oh I misread your comment. Sorry for the noise.

Cheers,

-- 
Julien Grall




