* [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register
@ 2015-10-12 14:22 Julien Grall
  2015-10-12 14:22 ` [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR Julien Grall
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Julien Grall @ 2015-10-12 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, ian.campbell, stefano.stabellini

Hi all,

This series aims to fix 32-bit accesses to 64-bit registers. Some guest
OSes, such as FreeBSD and Linux (the ITS and, recently, 32-bit guests
using GICv3), use 32-bit accesses and will crash at boot time.

Major changes in v4:
    - Patches #1-#6 of the previous version have been applied
    - Split "Optimize the way to store the target vCPU in the rank" into
    3 patches, to avoid fixing a bug (byte access), changing behaviour
    (handling zero writes), and doing the actual optimization all in a
    single patch.

Sincerely yours,

Julien Grall (5):
  xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  xen/arm: vgic: Introduce helpers to extract/update/clear/set vGIC
    register ...
  xen/arm: vgic-v3: Support 32-bit access for 64-bit registers

 xen/arch/arm/vgic-v2.c     | 276 +++++++++++++++++++++++++++------------------
 xen/arch/arm/vgic-v3.c     | 264 +++++++++++++++++++++++++++----------------
 xen/arch/arm/vgic.c        |  45 ++++++--
 xen/include/asm-arm/vgic.h | 129 +++++++++++++++++----
 4 files changed, 474 insertions(+), 240 deletions(-)

-- 
2.1.4


* [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  2015-10-12 14:22 [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register Julien Grall
@ 2015-10-12 14:22 ` Julien Grall
  2015-10-22 15:53   ` Ian Campbell
  2015-10-12 14:22 ` [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0 Julien Grall
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-12 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, ian.campbell, stefano.stabellini

During a store, the byte is always in the low part of the register (i.e.
bits [7:0]).

However, we are masking the register using a shift by the byte offset
into ITARGETSR. This results in a target list equal to 0, which is
ignored by the emulation.

Because of that, a guest won't be able to modify any ITARGETSR using
byte accesses. Note that the first byte of each register will still be
writeable.

Furthermore, the body of the loop retrieves the old target list using
the index of the byte.

To avoid modifying the loop too much, shift the stored byte to the
correct offset.

Signed-off-by: Julien Grall <julien.grall@citrix.com>

---
    This change used to be embedded in "xen/arm: vgic: Optimize the way
    to store the target vCPU in the rank". It has been moved out to
    avoid having too much functional changes in a single patch.

    This patch is a good candidate to backport to Xen 4.6 and Xen 4.5.
    Without it, a guest won't be able to migrate an IRQ from one vCPU to
    another if it uses byte accesses to write to ITARGETSR.

    Changes in v4:
        - Patch added
---
 xen/arch/arm/vgic-v2.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 2d63e12..665afeb 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -346,11 +346,11 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         /* 8-bit vcpu mask for this domain */
         BUG_ON(v->domain->max_vcpus > 8);
         target = (1 << v->domain->max_vcpus) - 1;
-        if ( dabt.size == 2 )
-            target = target | (target << 8) | (target << 16) | (target << 24);
+        target = target | (target << 8) | (target << 16) | (target << 24);
+        if ( dabt.size == DABT_WORD )
+            target &= r;
         else
-            target = (target << (8 * (gicd_reg & 0x3)));
-        target &= r;
+            target &= (r << (8 * (gicd_reg & 0x3)));
         /* ignore zero writes */
         if ( !target )
             goto write_ignore;
@@ -374,7 +374,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
 
             if ( new_target != old_target )
             {
-                irq = gicd_reg - GICD_ITARGETSR + (i / 8);
+                irq = (gicd_reg & ~0x3) - GICD_ITARGETSR + (i / 8);
                 v_target = v->domain->vcpu[new_target];
                 v_old = v->domain->vcpu[old_target];
                 vgic_migrate_irq(v_old, v_target, irq);
@@ -386,7 +386,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
                                              DABT_WORD)] = target;
         else
             vgic_byte_write(&rank->v2.itargets[REG_RANK_INDEX(8,
-                      gicd_reg - GICD_ITARGETSR, DABT_WORD)], target, gicd_reg);
+                      gicd_reg - GICD_ITARGETSR, DABT_WORD)], r, gicd_reg);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     }
-- 
2.1.4


* [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  2015-10-12 14:22 [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register Julien Grall
  2015-10-12 14:22 ` [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR Julien Grall
@ 2015-10-12 14:22 ` Julien Grall
  2015-10-22 16:07   ` Ian Campbell
  2015-10-12 14:22 ` [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank Julien Grall
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-12 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, ian.campbell, stefano.stabellini

The current implementation ignores the whole write if one of the fields
is 0. However, based on the spec (4.3.12 in ARM IHI 0048B.b), 0 is a
valid value when:
    - The interrupt is not wired in the distributor. From the Xen
    point of view, it means that the corresponding bit is not set in
    d->arch.vgic.allocated_irqs.
    - The user wants to disable IRQ forwarding in the distributor,
    i.e. the IRQ stays pending in the distributor and is never
    received by the guest.

Implementing the latter would require more work in Xen because we
always assume the interrupt is forwarded to a vCPU. So, for now,
ignore any field whose value is 0.

The emulation of write accesses to ITARGETSR has been reworked and
moved to a new function, because it would have been difficult to
implement the behaviour properly with the current code.

The new implementation breaks the register into 4 distinct bytes. For
each byte, it checks the validity of the target list, finds the new
target, migrates the interrupt and stores the value if necessary.

In the new implementation there is nearly no distinction based on the
access size, to avoid having too many different paths, which would be
harder to test.

Signed-off-by: Julien Grall <julien.grall@citrix.com>

---
    This change used to be embedded in "xen/arm: vgic: Optimize the way
    to store the target vCPU in the rank". It has been moved out to
    avoid having too much functional changes in a single patch.

    I'm not sure if this patch should be backported to Xen 4.6. Without
    it, any guest writing 0 to one of the fields won't be able to
    migrate the other interrupts. However, in all the use cases I've
    seen, the guest reads ITARGETSR first and writes back the value
    with the corresponding byte changed.

    Changes in v4:
        - Patch added
---
 xen/arch/arm/vgic-v2.c | 141 ++++++++++++++++++++++++++++++++++---------------
 1 file changed, 98 insertions(+), 43 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 665afeb..6b7eab3 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -50,6 +50,94 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t vbase)
     vgic_v2_hw.vbase = vbase;
 }
 
+#define NR_TARGET_PER_REG 4U
+#define NR_BIT_PER_TARGET 8U
+
+/*
+ * Store a ITARGETSR register. This function only deals with ITARGETSR8
+ * and onwards.
+ *
+ * Note the offset will be aligned to the appropriate boundary.
+ */
+static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
+                                 unsigned int offset, uint32_t itargetsr)
+{
+    unsigned int i;
+    unsigned int regidx = REG_RANK_INDEX(8, offset, DABT_WORD);
+    unsigned int virq;
+
+    ASSERT(spin_is_locked(&rank->lock));
+
+    /*
+     * The ITARGETSR0-7, used for SGIs/PPIs, are implemented RO in the
+ * emulation, so this function should never be called for them.
+     *
+     * They all live in the first rank.
+     */
+    BUILD_BUG_ON(NR_INTERRUPT_PER_RANK != 32);
+    ASSERT(rank->index >= 1);
+
+    offset &= INTERRUPT_RANK_MASK;
+    offset &= ~(NR_TARGET_PER_REG - 1);
+
+    virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
+
+    for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++, virq++ )
+    {
+        unsigned int new_target, old_target;
+        uint8_t new_mask, old_mask;
+
+        new_mask = itargetsr & ((1 << NR_BIT_PER_TARGET) - 1);
+        old_mask = vgic_byte_read(rank->v2.itargets[regidx], i);
+
+        itargetsr >>= NR_BIT_PER_TARGET;
+
+        /*
+         * SPIs use the 1-N model (see 1.4.3 in ARM IHI 0048B).
+         * While the interrupt could be set pending on all the vCPUs in
+         * the target list, this is not guaranteed by the spec.
+         * For simplicity, always route the vIRQ to the first vCPU
+         * in the target list
+         */
+        new_target = ffs(new_mask);
+        old_target = ffs(old_mask);
+
+        /* The current target should always be valid */
+        ASSERT(old_target && (old_target <= d->max_vcpus));
+
+        /*
+         * Ignore the write request for this interrupt if the new target
+         * is invalid.
+         * XXX: From the spec, if the target list is not valid, the
+         * interrupt should be ignored (i.e not forwarding to the
+         * guest).
+         */
+        if ( !new_target || (new_target > d->max_vcpus) )
+        {
+            printk(XENLOG_G_DEBUG
+                   "d%d: No valid vCPU found for vIRQ%u in the target list (%#x). Skip it\n",
+                   d->domain_id, virq, new_mask);
+            continue;
+        }
+
+        /* The vCPU ID always starts from 0 */
+        new_target--;
+        old_target--;
+
+        /* Only migrate the vIRQ if the target vCPU has changed */
+        if ( new_target != old_target )
+        {
+            vgic_migrate_irq(d->vcpu[old_target],
+                             d->vcpu[new_target],
+                             virq);
+        }
+
+        /* Bit corresponding to unimplemented CPU is write-ignored. */
+        new_mask &= (1 << d->max_vcpus) - 1;
+        vgic_byte_write(&rank->v2.itargets[regidx], new_mask, i);
+    }
+}
+
 static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
                                    register_t *r, void *priv)
 {
@@ -337,56 +425,23 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
 
     case GICD_ITARGETSR + 8 ... GICD_ITARGETSRN:
     {
-        /* unsigned long needed for find_next_bit */
-        unsigned long target;
-        int i;
+        uint32_t itargetsr;
+
         if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
         if ( rank == NULL) goto write_ignore;
-        /* 8-bit vcpu mask for this domain */
-        BUG_ON(v->domain->max_vcpus > 8);
-        target = (1 << v->domain->max_vcpus) - 1;
-        target = target | (target << 8) | (target << 16) | (target << 24);
+        vgic_lock_rank(v, rank, flags);
         if ( dabt.size == DABT_WORD )
-            target &= r;
+            itargetsr = r;
         else
-            target &= (r << (8 * (gicd_reg & 0x3)));
-        /* ignore zero writes */
-        if ( !target )
-            goto write_ignore;
-        /* For word reads ignore writes where any single byte is zero */
-        if ( dabt.size == 2 &&
-            !((target & 0xff) && (target & (0xff << 8)) &&
-             (target & (0xff << 16)) && (target & (0xff << 24))))
-            goto write_ignore;
-        vgic_lock_rank(v, rank, flags);
-        i = 0;
-        while ( (i = find_next_bit(&target, 32, i)) < 32 )
         {
-            unsigned int irq, new_target, old_target;
-            unsigned long old_target_mask;
-            struct vcpu *v_target, *v_old;
-
-            new_target = i % 8;
-            old_target_mask = vgic_byte_read(rank->v2.itargets[REG_RANK_INDEX(8,
-                                             gicd_reg - GICD_ITARGETSR, DABT_WORD)], i/8);
-            old_target = find_first_bit(&old_target_mask, 8);
-
-            if ( new_target != old_target )
-            {
-                irq = (gicd_reg & ~0x3) - GICD_ITARGETSR + (i / 8);
-                v_target = v->domain->vcpu[new_target];
-                v_old = v->domain->vcpu[old_target];
-                vgic_migrate_irq(v_old, v_target, irq);
-            }
-            i += 8 - new_target;
+            itargetsr = rank->v2.itargets[REG_RANK_INDEX(8,
+                                    gicd_reg - GICD_ITARGETSR,
+                                    DABT_WORD)];
+            vgic_byte_write(&itargetsr, r, gicd_reg);
         }
-        if ( dabt.size == DABT_WORD )
-            rank->v2.itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR,
-                                             DABT_WORD)] = target;
-        else
-            vgic_byte_write(&rank->v2.itargets[REG_RANK_INDEX(8,
-                      gicd_reg - GICD_ITARGETSR, DABT_WORD)], r, gicd_reg);
+        vgic_store_itargetsr(v->domain, rank, gicd_reg - GICD_ITARGETSR,
+                             itargetsr);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     }
-- 
2.1.4


* [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-12 14:22 [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register Julien Grall
  2015-10-12 14:22 ` [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR Julien Grall
  2015-10-12 14:22 ` [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0 Julien Grall
@ 2015-10-12 14:22 ` Julien Grall
  2015-10-22 16:17   ` Ian Campbell
  2015-10-12 14:22 ` [PATCH v4 4/5] xen/arm: vgic: Introduce helpers to extract/update/clear/set vGIC register Julien Grall
  2015-10-12 14:22 ` [PATCH v4 5/5] xen/arm: vgic-v3: Support 32-bit access for 64-bit registers Julien Grall
  4 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-12 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, ian.campbell, stefano.stabellini

Xen currently stores the value of the GICD_ITARGETSR registers (for
GICv2) and GICD_IROUTER (for GICv3) directly in the rank. This makes
the emulation of register accesses very simple, but makes the code to
get the target vCPU for a given vIRQ more complex.

While the target vCPU of a vIRQ is retrieved every time a vIRQ is
injected into the guest, accesses to the register occur far less often.

So the data structure should be optimized for the most common case
rather than the inverse.

This patch introduces an array to store the target vCPU for every
interrupt in the rank. This will make the code to get the target very
quick. The emulation code will now have to generate the GICD_ITARGETSR
and GICD_IROUTER registers for read accesses, and split them on write
accesses to store the target in a convenient way.

With the new way to store the target vCPU, the structure vgic_irq_rank
is shrunk down from 320 bytes to 92 bytes. This saves about 228 bytes
of memory allocated separately per vCPU.

Note that with these changes, any read of those registers will list
only the target vCPU used by Xen. As the spec is not clear on whether
this is a valid choice, OSes which have a different interpretation of
the spec (i.e. OSes which perform read-modify-write operations on
these registers) may not boot anymore on Xen. However, I think this is
a fair trade between memory usage in Xen (1KB less on a domain using 4
vCPUs with no SPIs) and a strict interpretation of the spec (though
not all the cases are clearly defined).

Furthermore, the implementation of the get_target_vcpu callback is now
exactly the same for both drivers. Consolidate the implementation in
the common vGIC code and drop the callback.

Finally, take the opportunity to fix the coding style and replace "irq"
by "virq", to make clear that we are dealing with virtual IRQs in the
sections we are modifying.

Signed-off-by: Julien Grall <julien.grall@citrix.com>

---
    The real reason is that I'd like to drop the vgic_byte_* helpers in
    favor of more generic access helpers. However, that would make the
    code to retrieve the priority more complex. So reworking the code
    to get the target vCPU was the best solution.

    Changes in v4:
        - Move the behavior changes in 2 separate patches:
            * patch #1: Handle correctly byte write in ITARGETSR
            * patch #2: Don't ignore a write in ITARGETSR if one field is 0
        - Update commit message

    Changes in v3:
        - Make clear in the commit message that we rely on a corner case
        of the spec.
        - Typoes

    Changes in v2:
        - Remove get_target_vcpu callback
        - s/irq/virq/ to make clear we are using virtual IRQ
        - Update the commit message
        - Rename vgic_generate_* to vgic_fetch
        - Improve code comment
        - Move the introduction of vgic_rank_init in a separate patch
        - Make use of rank->idx

    Changes in v1:
        - Patch added
---
 xen/arch/arm/vgic-v2.c     |  88 +++++++++++++--------------------
 xen/arch/arm/vgic-v3.c     | 119 ++++++++++++++++++++++++---------------------
 xen/arch/arm/vgic.c        |  45 ++++++++++++-----
 xen/include/asm-arm/vgic.h |  18 +++----
 4 files changed, 139 insertions(+), 131 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 6b7eab3..a0eba30 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -54,8 +54,31 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t vbase)
 #define NR_BIT_PER_TARGET 8U
 
 /*
- * Store a ITARGETSR register. This function only deals with ITARGETSR8
- * and onwards.
+ * Fetch a ITARGETSR register based on the offset from ITARGETSR0. Only
+ * one vCPU will be listed for a given vIRQ.
+ *
+ * Note the offset will be aligned to the appropriate boundary.
+ */
+static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank *rank,
+                                     unsigned int offset)
+{
+    uint32_t reg = 0;
+    unsigned int i;
+
+    ASSERT(spin_is_locked(&rank->lock));
+
+    offset &= INTERRUPT_RANK_MASK;
+    offset &= ~(NR_TARGET_PER_REG - 1);
+
+    for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++ )
+        reg |= (1 << rank->vcpu[offset]) << (i * NR_BIT_PER_TARGET);
+
+    return reg;
+}
+
+/*
+ * Store a ITARGETSR register in a convenient way and migrate the vIRQ
+ * if necessary. This function only deals with ITARGETSR8 and onwards.
  *
  * Note the offset will be aligned to the appropriate boundary.
  */
@@ -63,7 +86,6 @@ static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
                                  unsigned int offset, uint32_t itargetsr)
 {
     unsigned int i;
-    unsigned int regidx = REG_RANK_INDEX(8, offset, DABT_WORD);
     unsigned int virq;
 
     ASSERT(spin_is_locked(&rank->lock));
@@ -85,10 +107,9 @@ static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
     for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++, virq++ )
     {
         unsigned int new_target, old_target;
-        uint8_t new_mask, old_mask;
+        uint8_t new_mask;
 
         new_mask = itargetsr & ((1 << NR_BIT_PER_TARGET) - 1);
-        old_mask = vgic_byte_read(rank->v2.itargets[regidx], i);
 
         itargetsr >>= NR_BIT_PER_TARGET;
 
@@ -100,10 +121,6 @@ static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
          * in the target list
          */
         new_target = ffs(new_mask);
-        old_target = ffs(old_mask);
-
-        /* The current target should always be valid */
-        ASSERT(old_target && (old_target <= d->max_vcpus));
 
         /*
          * Ignore the write request for this interrupt if the new target
@@ -122,19 +139,20 @@ static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
 
         /* The vCPU ID always starts from 0 */
         new_target--;
-        old_target--;
+
+        old_target = rank->vcpu[offset];
 
         /* Only migrate the vIRQ if the target vCPU has changed */
         if ( new_target != old_target )
         {
+            unsigned int virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
+
             vgic_migrate_irq(d->vcpu[old_target],
                              d->vcpu[new_target],
                              virq);
         }
 
-        /* Bit corresponding to unimplemented CPU is write-ignored. */
-        new_mask &= (1 << d->max_vcpus) - 1;
-        vgic_byte_write(&rank->v2.itargets[regidx], new_mask, i);
+        rank->vcpu[offset] = new_target;
     }
 }
 
@@ -214,8 +232,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->v2.itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR,
-                                              DABT_WORD)];
+        *r = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
         if ( dabt.size == DABT_BYTE )
             *r = vgic_byte_read(*r, gicd_reg);
         vgic_unlock_rank(v, rank, flags);
@@ -435,9 +452,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
             itargetsr = r;
         else
         {
-            itargetsr = rank->v2.itargets[REG_RANK_INDEX(8,
-                                    gicd_reg - GICD_ITARGETSR,
-                                    DABT_WORD)];
+            itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
             vgic_byte_write(&itargetsr, r, gicd_reg);
         }
         vgic_store_itargetsr(v->domain, rank, gicd_reg - GICD_ITARGETSR,
@@ -542,43 +557,16 @@ static const struct mmio_handler_ops vgic_v2_distr_mmio_handler = {
     .write = vgic_v2_distr_mmio_write,
 };
 
-static struct vcpu *vgic_v2_get_target_vcpu(struct vcpu *v, unsigned int irq)
-{
-    unsigned long target;
-    struct vcpu *v_target;
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
-    ASSERT(spin_is_locked(&rank->lock));
-
-    target = vgic_byte_read(rank->v2.itargets[REG_RANK_INDEX(8,
-                                              irq, DABT_WORD)], irq & 0x3);
-
-    /* 1-N SPI should be delivered as pending to all the vcpus in the
-     * mask, but here we just return the first vcpu for simplicity and
-     * because it would be too slow to do otherwise. */
-    target = find_first_bit(&target, 8);
-    ASSERT(target >= 0 && target < v->domain->max_vcpus);
-    v_target = v->domain->vcpu[target];
-    return v_target;
-}
-
 static int vgic_v2_vcpu_init(struct vcpu *v)
 {
-    int i;
-
-    /* For SGI and PPI the target is always this CPU */
-    for ( i = 0 ; i < 8 ; i++ )
-        v->arch.vgic.private_irqs->v2.itargets[i] =
-              (1<<(v->vcpu_id+0))
-            | (1<<(v->vcpu_id+8))
-            | (1<<(v->vcpu_id+16))
-            | (1<<(v->vcpu_id+24));
+    /* Nothing specific to initialize for this driver */
 
     return 0;
 }
 
 static int vgic_v2_domain_init(struct domain *d)
 {
-    int i, ret;
+    int ret;
     paddr_t cbase;
 
     /*
@@ -618,11 +606,6 @@ static int vgic_v2_domain_init(struct domain *d)
     if ( ret )
         return ret;
 
-    /* By default deliver to CPU0 */
-    for ( i = 0; i < DOMAIN_NR_RANKS(d); i++ )
-        memset(d->arch.vgic.shared_irqs[i].v2.itargets, 0x1,
-               sizeof(d->arch.vgic.shared_irqs[i].v2.itargets));
-
     register_mmio_handler(d, &vgic_v2_distr_mmio_handler, d->arch.vgic.dbase,
                           PAGE_SIZE, NULL);
 
@@ -632,7 +615,6 @@ static int vgic_v2_domain_init(struct domain *d)
 static const struct vgic_ops vgic_v2_ops = {
     .vcpu_init   = vgic_v2_vcpu_init,
     .domain_init = vgic_v2_domain_init,
-    .get_target_vcpu = vgic_v2_get_target_vcpu,
     .max_vcpus = 8,
 };
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index b5249ff..2aab77b 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -89,18 +89,72 @@ static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t irouter)
     return d->vcpu[vcpu_id];
 }
 
-static struct vcpu *vgic_v3_get_target_vcpu(struct vcpu *v, unsigned int irq)
-{
-    struct vcpu *v_target;
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
+#define NR_BYTE_PER_IROUTER 8U
 
+/*
+ * Fetch a IROUTER register based on the offset from IROUTER0. Only one
+ * vCPU will be listed for a given vIRQ.
+ *
+ * Note the offset will be aligned to the appropriate boundary.
+ */
+static uint64_t vgic_fetch_irouter(struct vgic_irq_rank *rank,
+                                   unsigned int offset)
+{
     ASSERT(spin_is_locked(&rank->lock));
 
-    v_target = vgic_v3_irouter_to_vcpu(v->domain, rank->v3.irouter[irq % 32]);
+    /* There is exactly 1 vIRQ per IROUTER */
+    offset /= NR_BYTE_PER_IROUTER;
 
-    ASSERT(v_target != NULL);
+    /* Get the index in the rank */
+    offset &= INTERRUPT_RANK_MASK;
 
-    return v_target;
+    return vcpuid_to_vaffinity(rank->vcpu[offset]);
+}
+
+/*
+ * Store an IROUTER register in a convenient way and migrate the vIRQ
+ * if necessary. This function only deals with IROUTER32 and onwards.
+ *
+ * Note the offset will be aligned to the appropriate boundary.
+ */
+static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
+                               unsigned int offset, uint64_t irouter)
+{
+    struct vcpu *new_vcpu, *old_vcpu;
+    unsigned int virq;
+
+    /* There is 1 vIRQ per IROUTER */
+    virq = offset / NR_BYTE_PER_IROUTER;
+
+    /*
+     * The IROUTER0-31, used for SGIs/PPIs, are reserved and should
+     * never call this function.
+     */
+    ASSERT(virq >= 32);
+
+    /* Get the index in the rank */
+    offset &= virq & INTERRUPT_RANK_MASK;
+
+    new_vcpu = vgic_v3_irouter_to_vcpu(d, irouter);
+    old_vcpu = d->vcpu[rank->vcpu[offset]];
+
+    /*
+     * From the spec (see 8.9.13 in IHI 0069A), any write with an
+     * invalid vCPU will lead to the interrupt being ignored.
+     *
+     * But the current code to inject an IRQ is not able to cope with
+     * invalid vCPU. So for now, just ignore the write.
+     *
+     * TODO: Respect the spec
+     */
+    if ( !new_vcpu )
+        return;
+
+    /* Only migrate the IRQ if the target vCPU has changed */
+    if ( new_vcpu != old_vcpu )
+        vgic_migrate_irq(old_vcpu, new_vcpu, virq);
+
+    rank->vcpu[offset] = new_vcpu->vcpu_id;
 }
 
 static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
@@ -726,8 +780,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
                                 DABT_DOUBLE_WORD);
         if ( rank == NULL ) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->v3.irouter[REG_RANK_INDEX(64,
-                              (gicd_reg - GICD_IROUTER), DABT_DOUBLE_WORD)];
+        *r = vgic_fetch_irouter(rank, gicd_reg - GICD_IROUTER);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     case GICD_NSACR ... GICD_NSACRN:
@@ -812,8 +865,6 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
     struct hsr_dabt dabt = info->dabt;
     struct vgic_irq_rank *rank;
     unsigned long flags;
-    uint64_t new_irouter, old_irouter;
-    struct vcpu *old_vcpu, *new_vcpu;
     int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
 
     perfc_incr(vgicd_writes);
@@ -881,32 +932,8 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
                                 DABT_DOUBLE_WORD);
         if ( rank == NULL ) goto write_ignore;
-        new_irouter = r;
         vgic_lock_rank(v, rank, flags);
-
-        old_irouter = rank->v3.irouter[REG_RANK_INDEX(64,
-                                       (gicd_reg - GICD_IROUTER),
-                                       DABT_DOUBLE_WORD)];
-        old_vcpu = vgic_v3_irouter_to_vcpu(v->domain, old_irouter);
-        new_vcpu = vgic_v3_irouter_to_vcpu(v->domain, new_irouter);
-
-        if ( !new_vcpu )
-        {
-            printk(XENLOG_G_DEBUG
-                   "%pv: vGICD: wrong irouter at offset %#08x val %#"PRIregister,
-                   v, gicd_reg, r);
-            vgic_unlock_rank(v, rank, flags);
-            /*
-             * TODO: Don't inject a fault to the guest when the MPIDR is
-             * not valid. From the spec, the interrupt should be
-             * ignored.
-             */
-            return 0;
-        }
-        rank->v3.irouter[REG_RANK_INDEX(64, (gicd_reg - GICD_IROUTER),
-                         DABT_DOUBLE_WORD)] = new_irouter;
-        if ( old_vcpu != new_vcpu )
-            vgic_migrate_irq(old_vcpu, new_vcpu, (gicd_reg - GICD_IROUTER)/8);
+        vgic_store_irouter(v->domain, rank, gicd_reg - GICD_IROUTER, r);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     case GICD_NSACR ... GICD_NSACRN:
@@ -1036,7 +1063,6 @@ static const struct mmio_handler_ops vgic_distr_mmio_handler = {
 static int vgic_v3_vcpu_init(struct vcpu *v)
 {
     int i;
-    uint64_t affinity;
     paddr_t rdist_base;
     struct vgic_rdist_region *region;
     unsigned int last_cpu;
@@ -1045,15 +1071,6 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
     struct domain *d = v->domain;
     uint32_t rdist_stride = d->arch.vgic.rdist_stride;
 
-    /* For SGI and PPI the target is always this CPU */
-    affinity = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 32 |
-                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 16 |
-                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 8  |
-                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0));
-
-    for ( i = 0 ; i < 32 ; i++ )
-        v->arch.vgic.private_irqs->v3.irouter[i] = affinity;
-
     /*
      * Find the region where the re-distributor lives. For this purpose,
      * we look one region ahead as we have only the first CPU in hand.
@@ -1098,7 +1115,7 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
 
 static int vgic_v3_domain_init(struct domain *d)
 {
-    int i, idx;
+    int i;
 
     /*
      * Domain 0 gets the hardware address.
@@ -1150,13 +1167,6 @@ static int vgic_v3_domain_init(struct domain *d)
         d->arch.vgic.rdist_regions[0].first_cpu = 0;
     }
 
-    /* By default deliver to CPU0 */
-    for ( i = 0; i < DOMAIN_NR_RANKS(d); i++ )
-    {
-        for ( idx = 0; idx < 32; idx++ )
-            d->arch.vgic.shared_irqs[i].v3.irouter[idx] = 0;
-    }
-
     /* Register mmio handle for the Distributor */
     register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
                           SZ_64K, NULL);
@@ -1182,7 +1192,6 @@ static int vgic_v3_domain_init(struct domain *d)
 static const struct vgic_ops v3_ops = {
     .vcpu_init   = vgic_v3_vcpu_init,
     .domain_init = vgic_v3_domain_init,
-    .get_target_vcpu  = vgic_v3_get_target_vcpu,
     .emulate_sysreg  = vgic_v3_emulate_sysreg,
     /*
      * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 7bb4570..531ce5d 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -68,11 +68,24 @@ static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
     p->irq = virq;
 }
 
-static void vgic_rank_init(struct vgic_irq_rank *rank, uint8_t index)
+static void vgic_rank_init(struct vgic_irq_rank *rank, uint8_t index,
+                           unsigned int vcpu)
 {
+    unsigned int i;
+
+    /*
+     * Make sure that the type chosen to store the target is able to
+     * store a VCPU ID between 0 and the maximum number of virtual
+     * CPUs supported.
+     */
+    BUILD_BUG_ON((1 << (sizeof(rank->vcpu[0]) * 8)) < MAX_VIRT_CPUS);
+
     spin_lock_init(&rank->lock);
 
     rank->index = index;
+
+    for ( i = 0; i < NR_INTERRUPT_PER_RANK; i++ )
+        rank->vcpu[i] = vcpu;
 }
 
 int domain_vgic_init(struct domain *d, unsigned int nr_spis)
@@ -121,8 +134,9 @@ int domain_vgic_init(struct domain *d, unsigned int nr_spis)
     for (i=0; i<d->arch.vgic.nr_spis; i++)
         vgic_init_pending_irq(&d->arch.vgic.pending_irqs[i], i + 32);
 
+    /* SPIs are routed to VCPU0 by default */
     for ( i = 0; i < DOMAIN_NR_RANKS(d); i++ )
-        vgic_rank_init(&d->arch.vgic.shared_irqs[i], i + 1);
+        vgic_rank_init(&d->arch.vgic.shared_irqs[i], i + 1, 0);
 
     ret = d->arch.vgic.handler->domain_init(d);
     if ( ret )
@@ -176,7 +190,8 @@ int vcpu_vgic_init(struct vcpu *v)
     if ( v->arch.vgic.private_irqs == NULL )
       return -ENOMEM;
 
-    vgic_rank_init(v->arch.vgic.private_irqs, 0);
+    /* SGIs/PPIs are always routed to this VCPU */
+    vgic_rank_init(v->arch.vgic.private_irqs, 0, v->vcpu_id);
 
     v->domain->arch.vgic.handler->vcpu_init(v);
 
@@ -197,17 +212,27 @@ int vcpu_vgic_free(struct vcpu *v)
     return 0;
 }
 
+/* The function should be called with the rank lock taken. */
+static struct vcpu *__vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
+{
+    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
+
+    ASSERT(spin_is_locked(&rank->lock));
+
+    return v->domain->vcpu[rank->vcpu[virq & INTERRUPT_RANK_MASK]];
+}
+
 /* takes the rank lock */
-struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
+struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
 {
-    struct domain *d = v->domain;
     struct vcpu *v_target;
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
+    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
     unsigned long flags;
 
     vgic_lock_rank(v, rank, flags);
-    v_target = d->arch.vgic.handler->get_target_vcpu(v, irq);
+    v_target = __vgic_get_target_vcpu(v, virq);
     vgic_unlock_rank(v, rank, flags);
+
     return v_target;
 }
 
@@ -286,7 +311,6 @@ void arch_move_irqs(struct vcpu *v)
 
 void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 {
-    struct domain *d = v->domain;
     const unsigned long mask = r;
     struct pending_irq *p;
     unsigned int irq;
@@ -296,7 +320,7 @@ void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 
     while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
         irq = i + (32 * n);
-        v_target = d->arch.vgic.handler->get_target_vcpu(v, irq);
+        v_target = __vgic_get_target_vcpu(v, irq);
         p = irq_to_pending(v_target, irq);
         clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
         gic_remove_from_queues(v_target, irq);
@@ -312,7 +336,6 @@ void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 
 void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 {
-    struct domain *d = v->domain;
     const unsigned long mask = r;
     struct pending_irq *p;
     unsigned int irq;
@@ -322,7 +345,7 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
 
     while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
         irq = i + (32 * n);
-        v_target = d->arch.vgic.handler->get_target_vcpu(v, irq);
+        v_target = __vgic_get_target_vcpu(v, irq);
         p = irq_to_pending(v_target, irq);
         set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
         spin_lock_irqsave(&v_target->arch.vgic.lock, flags);
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index ba74d0f..4df7755 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -104,14 +104,11 @@ struct vgic_irq_rank {
         uint32_t ipriorityr[8];
     };
 
-    union {
-        struct {
-            uint32_t itargets[8];
-        }v2;
-        struct {
-            uint64_t irouter[32];
-        }v3;
-    };
+    /*
+     * It's more convenient to store a target VCPU per vIRQ
+     * than the register ITARGETSR/IROUTER itself.
+     */
+    uint8_t vcpu[32];
 };
 
 struct sgi_target {
@@ -130,9 +127,6 @@ struct vgic_ops {
     int (*vcpu_init)(struct vcpu *v);
     /* Domain specific initialization of vGIC */
     int (*domain_init)(struct domain *d);
-    /* Get the target vcpu for a given virq. The rank lock is already taken
-     * when calling this. */
-    struct vcpu *(*get_target_vcpu)(struct vcpu *v, unsigned int irq);
     /* vGIC sysreg emulation */
     int (*emulate_sysreg)(struct cpu_user_regs *regs, union hsr hsr);
     /* Maximum number of vCPU supported */
@@ -205,7 +199,7 @@ enum gic_sgi_mode;
 extern int domain_vgic_init(struct domain *d, unsigned int nr_spis);
 extern void domain_vgic_free(struct domain *d);
 extern int vcpu_vgic_init(struct vcpu *v);
-extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq);
+extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
-- 
2.1.4


* [PATCH v4 4/5] xen/arm: vgic: Introduce helpers to extract/update/clear/set vGIC register ...
  2015-10-12 14:22 [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register Julien Grall
                   ` (2 preceding siblings ...)
  2015-10-12 14:22 ` [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank Julien Grall
@ 2015-10-12 14:22 ` Julien Grall
  2015-10-12 14:22 ` [PATCH v4 5/5] xen/arm: vgic-v3: Support 32-bit access for 64-bit registers Julien Grall
  4 siblings, 0 replies; 22+ messages in thread
From: Julien Grall @ 2015-10-12 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, ian.campbell, stefano.stabellini

and use them in the vGIC emulation.

The GIC registers may support different access sizes. Rather than open
coding the access for every register, provide a set of helpers to access
them.

The caller will have to call vgic_regN_*, where N is the size of the
emulated register.

The new helpers support any access size and expect the caller to
validate the access sizes supported by the emulated register.

Finally, take the opportunity to fix the coding style in the sections we
are modifying.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

---
    Ian, I updated this patch to convert the emulation of GICD_CTLR in
    vGICv3 to the new API. I've kept your ack because I don't think it's
    a major change; let me know if that's not fine.

    Changes in v4:
        - Convert GICD_CTLR in vGICv3 to the new API.

    Changes in v3:
        - Add Ian's acked-by

    Changes in v2:
        - Rename read/write helpers to extract/update
        - Add a 's' to clearbit/setbit because we update multiple bits

    Changes in v1:
        - Make use of the new helpers in the vGICv2 code
        - Drop vgic_byte_*
        - Use unsigned long rather than uint64_t to optimize the code
        on ARM (~40% fewer instructions per helper).
---
 xen/arch/arm/vgic-v2.c     |  89 ++++++++++++++++++++------------
 xen/arch/arm/vgic-v3.c     | 125 +++++++++++++++++++++++++++++++--------------
 xen/include/asm-arm/vgic.h | 111 ++++++++++++++++++++++++++++++++++++----
 3 files changed, 243 insertions(+), 82 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index a0eba30..f9d28d1 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -171,24 +171,31 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
     case GICD_CTLR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
         vgic_lock(v);
-        *r = v->domain->arch.vgic.ctlr;
+        *r = vgic_reg32_extract(v->domain->arch.vgic.ctlr, info);
         vgic_unlock(v);
         return 1;
     case GICD_TYPER:
+    {
+        uint32_t typer;
+
         if ( dabt.size != DABT_WORD ) goto bad_width;
         /* No secure world support for guests. */
         vgic_lock(v);
-        *r = ( ((v->domain->max_vcpus - 1) << GICD_TYPE_CPUS_SHIFT) )
+        typer = ((v->domain->max_vcpus - 1) << GICD_TYPE_CPUS_SHIFT)
             | DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32);
         vgic_unlock(v);
+
+        *r = vgic_reg32_extract(typer, info);
+
         return 1;
+    }
     case GICD_IIDR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
         /*
          * XXX Do we need a JEP106 manufacturer ID?
          * Just use the physical h/w value for now
          */
-        *r = 0x0000043b;
+        *r = vgic_reg32_extract(0x0000043b, info);
         return 1;
 
     /* Implementation defined -- read as zero */
@@ -204,7 +211,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ISENABLER, DABT_WORD);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->ienable;
+        *r = vgic_reg32_extract(rank->ienable, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
 
@@ -213,7 +220,7 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ICENABLER, DABT_WORD);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->ienable;
+        *r = vgic_reg32_extract(rank->ienable, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
 
@@ -228,37 +235,53 @@ static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_as_zero;
 
     case GICD_ITARGETSR ... GICD_ITARGETSRN:
+    {
+        uint32_t itargetsr;
+
         if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
-        if ( dabt.size == DABT_BYTE )
-            *r = vgic_byte_read(*r, gicd_reg);
+        itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
         vgic_unlock_rank(v, rank, flags);
+        *r = vgic_reg32_extract(itargetsr, info);
+
         return 1;
+    }
 
     case GICD_IPRIORITYR ... GICD_IPRIORITYRN:
+    {
+        uint32_t ipriorityr;
+
         if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_IPRIORITYR, DABT_WORD);
-        if ( rank == NULL) goto read_as_zero;
+        if ( rank == NULL ) goto read_as_zero;
 
         vgic_lock_rank(v, rank, flags);
-        *r = rank->ipriorityr[REG_RANK_INDEX(8, gicd_reg - GICD_IPRIORITYR,
-                                             DABT_WORD)];
-        if ( dabt.size == DABT_BYTE )
-            *r = vgic_byte_read(*r, gicd_reg);
+        ipriorityr = rank->ipriorityr[REG_RANK_INDEX(8,
+                                                     gicd_reg - GICD_IPRIORITYR,
+                                                     DABT_WORD)];
         vgic_unlock_rank(v, rank, flags);
+        *r = vgic_reg32_extract(ipriorityr, info);
+
         return 1;
+    }
 
     case GICD_ICFGR ... GICD_ICFGRN:
+    {
+        uint32_t icfgr;
+
         if ( dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 2, gicd_reg - GICD_ICFGR, DABT_WORD);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR, DABT_WORD)];
+        icfgr = rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR, DABT_WORD)];
         vgic_unlock_rank(v, rank, flags);
+
+        *r = vgic_reg32_extract(icfgr, info);
+
         return 1;
+    }
 
     case GICD_NSACR ... GICD_NSACRN:
         /* We do not implement security extensions for guests, read zero */
@@ -368,7 +391,8 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         if ( dabt.size != DABT_WORD ) goto bad_width;
         /* Ignore all but the enable bit */
         vgic_lock(v);
-        v->domain->arch.vgic.ctlr = r & GICD_CTL_ENABLE;
+        vgic_reg32_update(&v->domain->arch.vgic.ctlr, r, info);
+        v->domain->arch.vgic.ctlr &= GICD_CTL_ENABLE;
         vgic_unlock(v);
 
         return 1;
@@ -392,8 +416,8 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
         tr = rank->ienable;
-        rank->ienable |= r;
-        vgic_enable_irqs(v, r & (~tr), rank->index);
+        vgic_reg32_setbits(&rank->ienable, r, info);
+        vgic_enable_irqs(v, (rank->ienable) & (~tr), rank->index);
         vgic_unlock_rank(v, rank, flags);
         return 1;
 
@@ -403,8 +427,8 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
         tr = rank->ienable;
-        rank->ienable &= ~r;
-        vgic_disable_irqs(v, r & tr, rank->index);
+        vgic_reg32_clearbits(&rank->ienable, r, info);
+        vgic_disable_irqs(v, (~rank->ienable) & tr, rank->index);
         vgic_unlock_rank(v, rank, flags);
         return 1;
 
@@ -448,13 +472,8 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR, DABT_WORD);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
-        if ( dabt.size == DABT_WORD )
-            itargetsr = r;
-        else
-        {
-            itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
-            vgic_byte_write(&itargetsr, r, gicd_reg);
-        }
+        itargetsr = vgic_fetch_itargetsr(rank, gicd_reg - GICD_ITARGETSR);
+        vgic_reg32_update(&itargetsr, r, info);
         vgic_store_itargetsr(v->domain, rank, gicd_reg - GICD_ITARGETSR,
                              itargetsr);
         vgic_unlock_rank(v, rank, flags);
@@ -462,18 +481,20 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
     }
 
     case GICD_IPRIORITYR ... GICD_IPRIORITYRN:
+    {
+        uint32_t *ipriorityr;
+
         if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_IPRIORITYR, DABT_WORD);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
-        if ( dabt.size == DABT_WORD )
-            rank->ipriorityr[REG_RANK_INDEX(8, gicd_reg - GICD_IPRIORITYR,
-                                            DABT_WORD)] = r;
-        else
-            vgic_byte_write(&rank->ipriorityr[REG_RANK_INDEX(8,
-                        gicd_reg - GICD_IPRIORITYR, DABT_WORD)], r, gicd_reg);
+        ipriorityr = &rank->ipriorityr[REG_RANK_INDEX(8,
+                                                      gicd_reg - GICD_IPRIORITYR,
+                                                      DABT_WORD)];
+        vgic_reg32_update(ipriorityr, r, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
+    }
 
     case GICD_ICFGR: /* SGIs */
         goto write_ignore_32;
@@ -485,7 +506,9 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         rank = vgic_rank_offset(v, 2, gicd_reg - GICD_ICFGR, DABT_WORD);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
-        rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR, DABT_WORD)] = r;
+        vgic_reg32_update(&rank->icfg[REG_RANK_INDEX(2, gicd_reg - GICD_ICFGR,
+                                                     DABT_WORD)],
+                          r, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 2aab77b..75db23d 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -162,7 +162,6 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
                                          register_t *r)
 {
     struct hsr_dabt dabt = info->dabt;
-    uint64_t aff;
 
     switch ( gicr_reg )
     {
@@ -171,21 +170,27 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_as_zero_32;
     case GICR_IIDR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICR_IIDR_VAL;
+        *r = vgic_reg32_extract(GICV3_GICR_IIDR_VAL, info);
         return 1;
     case GICR_TYPER:
+    {
+        uint64_t typer, aff;
+
         if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
         /* TBD: Update processor id in [23:8] when ITS support is added */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
-        *r = aff;
+        typer = aff;
 
         if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
-            *r |= GICR_TYPER_LAST;
+            typer |= GICR_TYPER_LAST;
+
+        *r = vgic_reg64_extract(typer, info);
 
         return 1;
+    }
     case GICR_STATUSR:
         /* Not implemented */
         goto read_as_zero_32;
@@ -214,7 +219,7 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
     case GICR_SYNCR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
         /* RO. But when read it always returns the busy bit, bit[0]. */
-        *r = GICR_SYNCR_NOT_BUSY;
+        *r = vgic_reg32_extract(GICR_SYNCR_NOT_BUSY, info);
         return 1;
     case GICR_MOVLPIR:
         /* WO Read as zero */
@@ -224,22 +229,22 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_as_zero_64;
     case GICR_PIDR0:
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICR_PIDR0;
+        *r = vgic_reg32_extract(GICV3_GICR_PIDR0, info);
          return 1;
     case GICR_PIDR1:
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICR_PIDR1;
+        *r = vgic_reg32_extract(GICV3_GICR_PIDR1, info);
          return 1;
     case GICR_PIDR2:
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICR_PIDR2;
+        *r = vgic_reg32_extract(GICV3_GICR_PIDR2, info);
          return 1;
     case GICR_PIDR3:
         /* Manufacture/customer defined */
         goto read_as_zero_32;
     case GICR_PIDR4:
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICR_PIDR4;
+        *r = vgic_reg32_extract(GICV3_GICR_PIDR4, info);
          return 1;
     case GICR_PIDR5 ... GICR_PIDR7:
         /* Reserved0 */
@@ -360,7 +365,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
         rank = vgic_rank_offset(v, 1, reg - GICD_ISENABLER, DABT_WORD);
         if ( rank == NULL ) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->ienable;
+        *r = vgic_reg32_extract(rank->ienable, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     case GICD_ICENABLER ... GICD_ICENABLERN:
@@ -368,7 +373,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
         rank = vgic_rank_offset(v, 1, reg - GICD_ICENABLER, DABT_WORD);
         if ( rank == NULL ) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->ienable;
+        *r = vgic_reg32_extract(rank->ienable, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     /* Read the pending status of an IRQ via GICD/GICR is not supported */
@@ -382,25 +387,38 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
         goto read_as_zero;
 
     case GICD_IPRIORITYR ... GICD_IPRIORITYRN:
+    {
+        uint32_t ipriorityr;
+
         if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 8, reg - GICD_IPRIORITYR, DABT_WORD);
         if ( rank == NULL ) goto read_as_zero;
 
         vgic_lock_rank(v, rank, flags);
-        *r = rank->ipriorityr[REG_RANK_INDEX(8, reg - GICD_IPRIORITYR,
-                                             DABT_WORD)];
-        if ( dabt.size == DABT_BYTE )
-            *r = vgic_byte_read(*r, reg);
+        ipriorityr = rank->ipriorityr[REG_RANK_INDEX(8, reg - GICD_IPRIORITYR,
+                                                     DABT_WORD)];
         vgic_unlock_rank(v, rank, flags);
+
+        *r = vgic_reg32_extract(ipriorityr, info);
+
         return 1;
+    }
+
     case GICD_ICFGR ... GICD_ICFGRN:
+    {
+        uint32_t icfgr;
+
         if ( dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 2, reg - GICD_ICFGR, DABT_WORD);
         if ( rank == NULL ) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR, DABT_WORD)];
+        icfgr = rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR, DABT_WORD)];
         vgic_unlock_rank(v, rank, flags);
+
+        *r = vgic_reg32_extract(icfgr, info);
+
         return 1;
+    }
     default:
         printk(XENLOG_G_ERR
                "%pv: %s: unhandled read r%d offset %#08x\n",
@@ -439,8 +457,8 @@ static int __vgic_v3_distr_common_mmio_write(const char *name, struct vcpu *v,
         if ( rank == NULL ) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
         tr = rank->ienable;
-        rank->ienable |= r;
-        vgic_enable_irqs(v, r & (~tr), rank->index);
+        vgic_reg32_setbits(&rank->ienable, r, info);
+        vgic_enable_irqs(v, (rank->ienable) & (~tr), rank->index);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     case GICD_ICENABLER ... GICD_ICENABLERN:
@@ -449,8 +467,8 @@ static int __vgic_v3_distr_common_mmio_write(const char *name, struct vcpu *v,
         if ( rank == NULL ) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
         tr = rank->ienable;
-        rank->ienable &= ~r;
-        vgic_disable_irqs(v, r & tr, rank->index);
+        vgic_reg32_clearbits(&rank->ienable, r, info);
+        vgic_disable_irqs(v, (~rank->ienable) & tr, rank->index);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     case GICD_ISPENDR ... GICD_ISPENDRN:
@@ -482,16 +500,16 @@ static int __vgic_v3_distr_common_mmio_write(const char *name, struct vcpu *v,
         return 0;
 
     case GICD_IPRIORITYR ... GICD_IPRIORITYRN:
+    {
+        uint32_t *ipriorityr;
+
         if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 8, reg - GICD_IPRIORITYR, DABT_WORD);
         if ( rank == NULL ) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
-        if ( dabt.size == DABT_WORD )
-            rank->ipriorityr[REG_RANK_INDEX(8, reg - GICD_IPRIORITYR,
-                                            DABT_WORD)] = r;
-        else
-            vgic_byte_write(&rank->ipriorityr[REG_RANK_INDEX(8,
-                       reg - GICD_IPRIORITYR, DABT_WORD)], r, reg);
+        ipriorityr = &rank->ipriorityr[REG_RANK_INDEX(8, reg - GICD_IPRIORITYR,
+                                                      DABT_WORD)];
+        vgic_reg32_update(ipriorityr, r, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
     case GICD_ICFGR: /* Restricted to configure SGIs */
@@ -503,9 +521,13 @@ static int __vgic_v3_distr_common_mmio_write(const char *name, struct vcpu *v,
         rank = vgic_rank_offset(v, 2, reg - GICD_ICFGR, DABT_WORD);
         if ( rank == NULL ) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
-        rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR, DABT_WORD)] = r;
+        vgic_reg32_update(&rank->icfg[REG_RANK_INDEX(2, reg - GICD_ICFGR,
+                                                     DABT_WORD)],
+                          r, info);
         vgic_unlock_rank(v, rank, flags);
         return 1;
+    }
+
     default:
         printk(XENLOG_G_ERR
                "%pv: %s: unhandled write r%d=%"PRIregister" offset %#08x\n",
@@ -719,7 +741,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
     case GICD_CTLR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
         vgic_lock(v);
-        *r = v->domain->arch.vgic.ctlr;
+        *r = vgic_reg32_extract(v->domain->arch.vgic.ctlr, info);
         vgic_unlock(v);
         return 1;
     case GICD_TYPER:
@@ -734,13 +756,16 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
          * bit is zero. The maximum is 8.
          */
         unsigned int ncpus = min_t(unsigned int, v->domain->max_vcpus, 8);
+        uint32_t typer;
 
         if ( dabt.size != DABT_WORD ) goto bad_width;
         /* No secure world support for guests. */
-        *r = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
-              DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
+        typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
+                 DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
+
+        typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
 
-        *r |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
+        *r = vgic_reg32_extract(typer, info);
 
         return 1;
     }
@@ -752,7 +777,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_as_zero_32;
     case GICD_IIDR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICD_IIDR_VAL;
+        *r = vgic_reg32_extract(GICV3_GICD_IIDR_VAL, info);
         return 1;
     case 0x020 ... 0x03c:
     case 0xc000 ... 0xffcc:
@@ -775,14 +800,22 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         /* SGI/PPI is RES0 */
         goto read_as_zero_64;
     case GICD_IROUTER32 ... GICD_IROUTERN:
+    {
+        uint64_t irouter;
+
         if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
                                 DABT_DOUBLE_WORD);
         if ( rank == NULL ) goto read_as_zero;
         vgic_lock_rank(v, rank, flags);
-        *r = vgic_fetch_irouter(rank, gicd_reg - GICD_IROUTER);
+        irouter = vgic_fetch_irouter(rank, gicd_reg - GICD_IROUTER);
         vgic_unlock_rank(v, rank, flags);
+
+        *r = vgic_reg64_extract(irouter, info);
+
         return 1;
+    }
+
     case GICD_NSACR ... GICD_NSACRN:
         /* We do not implement security extensions for guests, read zero */
         goto read_as_zero_32;
@@ -798,17 +831,17 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
     case GICD_PIDR0:
         /* GICv3 identification value */
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICD_PIDR0;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR0, info);
         return 1;
     case GICD_PIDR1:
         /* GICv3 identification value */
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICD_PIDR1;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR1, info);
         return 1;
     case GICD_PIDR2:
         /* GICv3 identification value */
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICD_PIDR2;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
         return 1;
     case GICD_PIDR3:
         /* GICv3 identification value. Manufacturer/Customer defined */
@@ -816,7 +849,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
     case GICD_PIDR4:
         /* GICv3 identification value */
         if ( dabt.size != DABT_WORD ) goto bad_width;
-        *r = GICV3_GICD_PIDR4;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR4, info);
         return 1;
     case GICD_PIDR5 ... GICD_PIDR7:
         /* Reserved0 */
@@ -872,17 +905,24 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicd_reg )
     {
     case GICD_CTLR:
+    {
+        uint32_t ctlr = 0;
+
         if ( dabt.size != DABT_WORD ) goto bad_width;
 
         vgic_lock(v);
+
+        vgic_reg32_update(&ctlr, r, info);
+
         /* Only EnableGrp1A can be changed */
-        if ( r & GICD_CTLR_ENABLE_G1A )
+        if ( ctlr & GICD_CTLR_ENABLE_G1A )
             v->domain->arch.vgic.ctlr |= GICD_CTLR_ENABLE_G1A;
         else
             v->domain->arch.vgic.ctlr &= ~GICD_CTLR_ENABLE_G1A;
         vgic_unlock(v);
 
         return 1;
+    }
     case GICD_TYPER:
         /* RO -- write ignored */
         goto write_ignore_32;
@@ -928,14 +968,21 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         /* SGI/PPI is RES0 */
         goto write_ignore_64;
     case GICD_IROUTER32 ... GICD_IROUTERN:
+    {
+        uint64_t irouter;
+
         if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
         rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
                                 DABT_DOUBLE_WORD);
         if ( rank == NULL ) goto write_ignore;
         vgic_lock_rank(v, rank, flags);
-        vgic_store_irouter(v->domain, rank, gicd_reg - GICD_IROUTER, r);
+        irouter = vgic_fetch_irouter(rank, gicd_reg - GICD_IROUTER);
+        vgic_reg64_update(&irouter, r, info);
+        vgic_store_irouter(v->domain, rank, gicd_reg - GICD_IROUTER, irouter);
         vgic_unlock_rank(v, rank, flags);
         return 1;
+    }
+
     case GICD_NSACR ... GICD_NSACRN:
         /* We do not implement security extensions for guests, write ignore */
         goto write_ignore_32;
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 4df7755..c5fad43 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -19,6 +19,7 @@
 #define __ASM_ARM_VGIC_H__
 
 #include <xen/bitops.h>
+#include <asm/mmio.h>
 
 struct pending_irq
 {
@@ -166,26 +167,116 @@ static inline int REG_RANK_NR(int b, uint32_t n)
     }
 }
 
-static inline uint32_t vgic_byte_read(uint32_t val, int offset)
+#define VGIC_REG_MASK(size) ((~0UL) >> (BITS_PER_LONG - ((1 << (size)) * 8)))
+
+/*
+ * The check on the size supported by the register has to be done by
+ * the caller of vgic_regN_*.
+ *
+ * vgic_reg_* should never be called directly. Instead use the vgic_regN_*
+ * according to size of the emulated register
+ *
+ * Note that the alignment fault will always be taken in the guest
+ * (see B3.12.7 DDI0406.b).
+ */
+static inline register_t vgic_reg_extract(unsigned long reg,
+                                          unsigned int offset,
+                                          enum dabt_size size)
+{
+    reg >>= 8 * offset;
+    reg &= VGIC_REG_MASK(size);
+
+    return reg;
+}
+
+static inline void vgic_reg_update(unsigned long *reg, register_t val,
+                                   unsigned int offset,
+                                   enum dabt_size size)
 {
-    int byte = offset & 0x3;
+    unsigned long mask = VGIC_REG_MASK(size);
+    int shift = offset * 8;
 
-    val = val >> (8*byte);
-    val &= 0x000000ff;
+    *reg &= ~(mask << shift);
+    *reg |= ((unsigned long)val & mask) << shift;
+}
+
+static inline void vgic_reg_setbits(unsigned long *reg, register_t bits,
+                                    unsigned int offset,
+                                    enum dabt_size size)
+{
+    unsigned long mask = VGIC_REG_MASK(size);
+    int shift = offset * 8;
 
-    return val;
+    *reg |= ((unsigned long)bits & mask) << shift;
 }
 
-static inline void vgic_byte_write(uint32_t *reg, uint32_t var, int offset)
+static inline void vgic_reg_clearbits(unsigned long *reg, register_t bits,
+                                      unsigned int offset,
+                                      enum dabt_size size)
 {
-    int byte = offset & 0x3;
+    unsigned long mask = VGIC_REG_MASK(size);
+    int shift = offset * 8;
 
-    var &= 0xff;
+    *reg &= ~(((unsigned long)bits & mask) << shift);
+}
 
-    *reg &= ~(0xff << (8*byte));
-    *reg |= (var << (8*byte));
+/* N-bit register helpers */
+#define VGIC_REG_HELPERS(sz, offmask)                                   \
+static inline register_t vgic_reg##sz##_extract(uint##sz##_t reg,       \
+                                                const mmio_info_t *info)\
+{                                                                       \
+    return vgic_reg_extract(reg, info->gpa & offmask,                   \
+                            info->dabt.size);                           \
+}                                                                       \
+                                                                        \
+static inline void vgic_reg##sz##_update(uint##sz##_t *reg,             \
+                                         register_t val,                \
+                                         const mmio_info_t *info)       \
+{                                                                       \
+    unsigned long tmp = *reg;                                           \
+                                                                        \
+    vgic_reg_update(&tmp, val, info->gpa & offmask,                     \
+                    info->dabt.size);                                   \
+                                                                        \
+    *reg = tmp;                                                         \
+}                                                                       \
+                                                                        \
+static inline void vgic_reg##sz##_setbits(uint##sz##_t *reg,            \
+                                          register_t bits,              \
+                                          const mmio_info_t *info)      \
+{                                                                       \
+    unsigned long tmp = *reg;                                           \
+                                                                        \
+    vgic_reg_setbits(&tmp, bits, info->gpa & offmask,                   \
+                     info->dabt.size);                                  \
+                                                                        \
+    *reg = tmp;                                                         \
+}                                                                       \
+                                                                        \
+static inline void vgic_reg##sz##_clearbits(uint##sz##_t *reg,          \
+                                            register_t bits,            \
+                                            const mmio_info_t *info)    \
+{                                                                       \
+    unsigned long tmp = *reg;                                           \
+                                                                        \
+    vgic_reg_clearbits(&tmp, bits, info->gpa & offmask,                 \
+                       info->dabt.size);                                \
+                                                                        \
+    *reg = tmp;                                                         \
 }
 
+/*
+ * 64-bit registers are only supported on platforms with a 64-bit long.
+ * This also allows us to optimize the 32-bit case by using
+ * unsigned long rather than uint64_t.
+ */
+#if BITS_PER_LONG == 64
+VGIC_REG_HELPERS(64, 0x7);
+#endif
+VGIC_REG_HELPERS(32, 0x3);
+
+#undef VGIC_REG_HELPERS
+
 enum gic_sgi_mode;
 
 /*
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 5/5] xen/arm: vgic-v3: Support 32-bit access for 64-bit registers
  2015-10-12 14:22 [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register Julien Grall
                   ` (3 preceding siblings ...)
  2015-10-12 14:22 ` [PATCH v4 4/5] xen/arm: vgic: Introduce helpers to extract/update/clear/set vGIC register Julien Grall
@ 2015-10-12 14:22 ` Julien Grall
  4 siblings, 0 replies; 22+ messages in thread
From: Julien Grall @ 2015-10-12 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, ian.campbell, stefano.stabellini

Based on 8.1.3 (IHI 0069A), unless stated otherwise, the 64-bit registers
support both 32-bit and 64-bit accesses.

All the registers we properly emulate (i.e. not RAZ/WI) support 32-bit access.

For RAZ/WI registers it also seems to be the case, but I'm not 100% sure.
Anyway, emulating 32-bit access for them doesn't hurt. Note that we will need
some extra care when they are implemented (for instance GICR_PROPBASER).

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

---
    This technically fixes booting FreeBSD ARM64 guests with GICv3.

    The next version of Linux (4.4) will contain GICv3 support for 32-bit
    OSes. While we only support GICv3 on 64-bit hosts, the vGICv3 is not
    tied to 64-bit. This patch will allow 32-bit guests to boot on
    platforms which don't have a GICv3 compatible with GICv2.

    So this patch is a good candidate for Xen 4.6.1, although it depends
    heavily on the previous patches. It may be possible to shuffle
    and move the "optimization" patches towards the end. I haven't yet
    done that because I feel this series makes more sense in the current
    order.

    Also, I haven't moved vgic_reg64_check_access into vgic.h because
    there is no usage in this series outside of vgic-v3.c and the helper
    is GICv3-oriented.

    Changes in v2:
        - Add Ian's acked-by

    Changes in v1:
        - Support 32bit access on the most significant word of
        GICR_TYPER
---
 xen/arch/arm/vgic-v3.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 75db23d..659d784 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -157,6 +157,15 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
     rank->vcpu[offset] = new_vcpu->vcpu_id;
 }
 
+static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
+{
+    /*
+     * 64 bits registers can be accessible using 32-bit and 64-bit unless
+     * stated otherwise (See 8.1.3 ARM IHI 0069A).
+     */
+    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
+}
+
 static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
                                          uint32_t gicr_reg,
                                          register_t *r)
@@ -173,10 +182,11 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         *r = vgic_reg32_extract(GICV3_GICR_IIDR_VAL, info);
         return 1;
     case GICR_TYPER:
+    case GICR_TYPER + 4:
     {
         uint64_t typer, aff;
 
-        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
         /* TBD: Update processor id in [23:8] when ITS support is added */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
@@ -262,7 +272,7 @@ bad_width:
     return 0;
 
 read_as_zero_64:
-    if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
     *r = 0;
     return 1;
 
@@ -338,7 +348,7 @@ bad_width:
     return 0;
 
 write_ignore_64:
-    if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
     return 1;
 
 write_ignore_32:
@@ -803,7 +813,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
     {
         uint64_t irouter;
 
-        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
         rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
                                 DABT_DOUBLE_WORD);
         if ( rank == NULL ) goto read_as_zero;
@@ -878,7 +888,7 @@ bad_width:
     return 0;
 
 read_as_zero_64:
-    if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
     *r = 0;
     return 1;
 
@@ -971,7 +981,7 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
     {
         uint64_t irouter;
 
-        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
         rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
                                 DABT_DOUBLE_WORD);
         if ( rank == NULL ) goto write_ignore;
@@ -1030,7 +1040,7 @@ write_ignore_32:
     return 1;
 
 write_ignore_64:
-    if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
     return 1;
 
 write_ignore:
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  2015-10-12 14:22 ` [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR Julien Grall
@ 2015-10-22 15:53   ` Ian Campbell
  2015-10-22 16:36     ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-22 15:53 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:

Subject: "correctly handle" and "writes"

> During a store, the byte is always in the low part of the register (i.e
> [0:7]).
> 
> Although, we are masking the register by using a shift of the
> byte offset in the ITARGETSR. This will result to get a target list
> equal to 0 which is ignored by the emulation.

I'm afraid I can't parse this.

I think instead of "Although" you might mean "incorrectly" as in "we are
incorrectly...", but that would really then want the sentence to end
"instead of <the right thing>". So perhaps:

    We are incorrectly masking the register by using a shift of the byte
    offset in the ITARGETSR instead of <...something...>. This will result
    in a target list equal to 0 which is ignored by the emulation.

(note also s/to get/in a/ in the second sentence)

> Because of that a guest won't be able to modify the any ITARGETSR using
> byte access. Note that the first byte of each register will still be
> writeable.

"Because of that the guest will only be able to modify the first byte in
each ITARGETSR"

In your version the "any ITARGETSR" in the first sentence is immediately
contradicted by the second sentence which gives an example of an ITARGETSR
which it can modify.

> 
> Furthermore, the body of the loop is retrieving the old target list
> using the index of the byte.
> 
> To avoid modifying too much the loop, shift the byte stored to the correct
> offset.

That might have meant a smaller patch, but it's a lot harder to understand
either the result or the diff.

> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> 
> ----
>     This change used to be embedded in "xen/arm: vgic: Optimize the way
>     to store the target vCPU in the rank". It has been moved out to
>     avoid having too much functional changes in a single patch.
> 
>     This patch is a good candidate to backport to Xen 4.6 and Xen 4.5.
>     Without it a guest won't be able to migrate an IRQ from one vCPU to
>     another if it's using byte access to write in ITARGETSR.
> 
>     Changes in v4:
>         - Patch added
> ---
>  xen/arch/arm/vgic-v2.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> index 2d63e12..665afeb 100644
> --- a/xen/arch/arm/vgic-v2.c
> +++ b/xen/arch/arm/vgic-v2.c
> @@ -346,11 +346,11 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
>          /* 8-bit vcpu mask for this domain */
>          BUG_ON(v->domain->max_vcpus > 8);
>          target = (1 << v->domain->max_vcpus) - 1;
> -        if ( dabt.size == 2 )
> -            target = target | (target << 8) | (target << 16) | (target << 24);
> +        target = target | (target << 8) | (target << 16) | (target << 24);
> +        if ( dabt.size == DABT_WORD )
> +            target &= r;
>          else
> -            target = (target << (8 * (gicd_reg & 0x3)));
> -        target &= r;
> +            target &= (r << (8 * (gicd_reg & 0x3)));

At this point do you not now have 3 bytes of
    (1 << v->domain->max_vcpus) - 1;
and 1 byte of that masked with the write?

IOW isn't this modifying the 3 bytes which aren't written?

>          /* ignore zero writes */
>          if ( !target )
>              goto write_ignore;
> @@ -374,7 +374,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
>  
>              if ( new_target != old_target )
>              {
> -                irq = gicd_reg - GICD_ITARGETSR + (i / 8);
> +                irq = (gicd_reg & ~0x3) - GICD_ITARGETSR + (i / 8);
>                  v_target = v->domain->vcpu[new_target];
>                  v_old = v->domain->vcpu[old_target];
>                  vgic_migrate_irq(v_old, v_target, irq);
> @@ -386,7 +386,7 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                               DABT_WORD)] = target;
>          else
>              vgic_byte_write(&rank->v2.itargets[REG_RANK_INDEX(8,
> -                      gicd_reg - GICD_ITARGETSR, DABT_WORD)], target, gicd_reg);
> +                      gicd_reg - GICD_ITARGETSR, DABT_WORD)], r, gicd_reg);
>          vgic_unlock_rank(v, rank, flags);
>          return 1;
>      }

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  2015-10-12 14:22 ` [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0 Julien Grall
@ 2015-10-22 16:07   ` Ian Campbell
  2015-10-22 16:51     ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-22 16:07 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> The current implementation ignores the whole write if one of the fields is
> 0. Although, based on the spec (4.3.12 IHI 0048B.b), 0 is a valid value
> when:
>     - The interrupt is not wired in the distributor. From the Xen
>     point of view, it means that the corresponding bit is not set in
>     d->arch.vgic.allocated_irqs.
>     - The user wants to disable the IRQ forwarding in the distributor.
>     I.e. the IRQ stays pending in the distributor and is never received by
>     the guest.
> 
> Implementing the latter will require more work in Xen because we always
> assume the interrupt is forwarded to a vCPU. So for now, ignore any field
> where the value is 0.
> 
> The emulation of the write access of ITARGETSR has been reworked and
> moved to a new function because it would have been difficult to
> implement properly the behavior with the current code.
> 
> The new implementation breaks the register into 4 distinct bytes. For
> each byte, it will check the validity of the target list, find the new
> target, migrate the interrupt and store the value if necessary.
> 
> In the new implementation there is nearly no distinction of the access
> size, to avoid having too many different paths which are harder to test.
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> 
> ---
>     This change used to be embedded in "xen/arm: vgic: Optimize the way
>     to store the target vCPU in the rank". It has been moved out to
>     avoid having too much functional changes in a single patch.
> 
>     I'm not sure if this patch should be backported to Xen 4.6. Without
>     it, any guest writing 0 in one of the fields won't be able to migrate
>     other interrupts. Although, in all the use cases I've seen, the guest
>     reads ITARGETSR first and writes back the value with the
>     corresponding byte changed.
> 
>     Changes in v4:
>         - Patch added
> ---
>  xen/arch/arm/vgic-v2.c | 141 ++++++++++++++++++++++++++++++++++---------
> ------
>  1 file changed, 98 insertions(+), 43 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> index 665afeb..6b7eab3 100644
> --- a/xen/arch/arm/vgic-v2.c
> +++ b/xen/arch/arm/vgic-v2.c
> @@ -50,6 +50,94 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase,
> paddr_t vbase)
>      vgic_v2_hw.vbase = vbase;
>  }
>  
> +#define NR_TARGET_PER_REG 4U
> +#define NR_BIT_PER_TARGET 8U

NR_TARGETS_ and NR_BITS_...

"REG" there is a bit generic, given this only works for registers with 8
-bit fields, _ITARGETSR instead?

> +/*
> + * Store a ITARGETSR register. This function only deals with ITARGETSR8
> + * and onwards.
> + *
> + * Note the offset will be aligned to the appropriate boundary.
> + */
> +static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank
> *rank,
> +                                 unsigned int offset, uint32_t
> itargetsr)
> +{
> +    unsigned int i;
> +    unsigned int regidx = REG_RANK_INDEX(8, offset, DABT_WORD);
> +    unsigned int virq;
> +
> +    ASSERT(spin_is_locked(&rank->lock));
> +
> +    /*
> +     * The ITARGETSR0-7, used for SGIs/PPIs, are implemented RO in the
> +     * emulation and should never call this function.
> +     *
> +     * They all live in the first rank.
> +     */
> +    BUILD_BUG_ON(NR_INTERRUPT_PER_RANK != 32);
> +    ASSERT(rank->index >= 1);
> +
> +    offset &= INTERRUPT_RANK_MASK;
> +    offset &= ~(NR_TARGET_PER_REG - 1);
> +
> +    virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
> +
> +    for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++, virq++ )
> +    {
> +        unsigned int new_target, old_target;
> +        uint8_t new_mask, old_mask;
> +
> +        new_mask = itargetsr & ((1 << NR_BIT_PER_TARGET) - 1);
> +        old_mask = vgic_byte_read(rank->v2.itargets[regidx], i);
> +
> +        itargetsr >>= NR_BIT_PER_TARGET;

This is really a part of the for() iteration expression, but oddly placed
here.

If you turn the "((1 << NR_BIT_PER_TARGET) - 1)" thing into a #define or
constant, then you may find that extracting the relevant byte from the
unshifted itargetsr using (itargetsr >> (i*NR_BITS_PER_TARGET) and then
applying the mask is clean enough to use here instead.

I don't think you rely on the shifting of itargetsr apart from when
calculating new_mask.

> +        /*
> +         * SPIs are using the 1-N model (see 1.4.3 in ARM IHI 0048B).
> +         * While the interrupt could be set pending to all the vCPUs in
> +         * target list, it's not guarantee by the spec.

"guaranteed"

> +         * For simplicity, always route the vIRQ to the first interrupt
> +         * in the target list
> +         */
> +        new_target = ffs(new_mask);
> +        old_target = ffs(old_mask);
> +
> +        /* The current target should always be valid */
> +        ASSERT(old_target && (old_target <= d->max_vcpus));
> +
> +        /*
> +         * Ignore the write request for this interrupt if the new target
> +         * is invalid.
> +         * XXX: From the spec, if the target list is not valid, the
> +         * interrupt should be ignored (i.e not forwarding to the

"forwarded".

> +         * guest).
> +         */
> +        if ( !new_target || (new_target > d->max_vcpus) )
> +        {
> +            printk(XENLOG_G_DEBUG

gdprintk?

> +                   "d%d: No valid vCPU found for vIRQ%u in the target list (%#x). Skip it\n",
> +                   d->domain_id, virq, new_mask);
> +            continue;
> +        }
> +
> +        /* The vCPU ID always starts from 0 */
> +        new_target--;
> +        old_target--;
> +
> +        /* Only migrate the vIRQ if the target vCPU has changed */
> +        if ( new_target != old_target )
> +        {
> +            vgic_migrate_irq(d->vcpu[old_target],
> +                             d->vcpu[new_target],
> +                             virq);
> +        }
> +
> +        /* Bit corresponding to unimplemented CPU is write-ignored. */

"write-ignore"

> +        new_mask &= (1 << d->max_vcpus) - 1;
> +        vgic_byte_write(&rank->v2.itargets[regidx], new_mask, i);
> +    }
> +}
> +
>  static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
>                                     register_t *r, void *priv)
>  {
> @@ -337,56 +425,23 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v,
> mmio_info_t *info,
>  
>      case GICD_ITARGETSR + 8 ... GICD_ITARGETSRN:
>      {
> -        /* unsigned long needed for find_next_bit */
> -        unsigned long target;
> -        int i;
> +        uint32_t itargetsr;
> +
>          if ( dabt.size != DABT_BYTE && dabt.size != DABT_WORD ) goto
> bad_width;
>          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR,
> DABT_WORD);
>          if ( rank == NULL) goto write_ignore;
> -        /* 8-bit vcpu mask for this domain */
> -        BUG_ON(v->domain->max_vcpus > 8);
> -        target = (1 << v->domain->max_vcpus) - 1;
> -        target = target | (target << 8) | (target << 16) | (target <<
> 24);
> +        vgic_lock_rank(v, rank, flags);
>          if ( dabt.size == DABT_WORD )
> -            target &= r;
> +            itargetsr = r;
>          else
> -            target &= (r << (8 * (gicd_reg & 0x3)));
> -        /* ignore zero writes */
> -        if ( !target )
> -            goto write_ignore;
> -        /* For word reads ignore writes where any single byte is zero */
> -        if ( dabt.size == 2 &&
> -            !((target & 0xff) && (target & (0xff << 8)) &&
> -             (target & (0xff << 16)) && (target & (0xff << 24))))
> -            goto write_ignore;
> -        vgic_lock_rank(v, rank, flags);
> -        i = 0;
> -        while ( (i = find_next_bit(&target, 32, i)) < 32 )
>          {
> -            unsigned int irq, new_target, old_target;
> -            unsigned long old_target_mask;
> -            struct vcpu *v_target, *v_old;
> -
> -            new_target = i % 8;
> -            old_target_mask = vgic_byte_read(rank
> ->v2.itargets[REG_RANK_INDEX(8,
> -                                             gicd_reg - GICD_ITARGETSR,
> DABT_WORD)], i/8);
> -            old_target = find_first_bit(&old_target_mask, 8);
> -
> -            if ( new_target != old_target )
> -            {
> -                irq = (gicd_reg & ~0x3) - GICD_ITARGETSR + (i / 8);
> -                v_target = v->domain->vcpu[new_target];
> -                v_old = v->domain->vcpu[old_target];
> -                vgic_migrate_irq(v_old, v_target, irq);
> -            }
> -            i += 8 - new_target;
> +            itargetsr = rank->v2.itargets[REG_RANK_INDEX(8,
> +                                    gicd_reg - GICD_ITARGETSR,
> +                                    DABT_WORD)];
> +            vgic_byte_write(&itargetsr, r, gicd_reg);
>          }
> -        if ( dabt.size == DABT_WORD )
> -            rank->v2.itargets[REG_RANK_INDEX(8, gicd_reg -
> GICD_ITARGETSR,
> -                                             DABT_WORD)] = target;
> -        else
> -            vgic_byte_write(&rank->v2.itargets[REG_RANK_INDEX(8,
> -                      gicd_reg - GICD_ITARGETSR, DABT_WORD)], r,
> gicd_reg);
> +        vgic_store_itargetsr(v->domain, rank, gicd_reg - GICD_ITARGETSR,
> +                             itargetsr);
>          vgic_unlock_rank(v, rank, flags);
>          return 1;
>      }

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-12 14:22 ` [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank Julien Grall
@ 2015-10-22 16:17   ` Ian Campbell
  2015-10-22 17:15     ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-22 16:17 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> [...]
>          /* Only migrate the vIRQ if the target vCPU has changed */
>          if ( new_target != old_target )
>          {
> +            unsigned int virq = rank->index * NR_INTERRUPT_PER_RANK + offset;

FWIW this was the value of offset before it was shifted + masked, I think.
Could you not just save it up top and remember it?

--- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -89,18 +89,72 @@ static struct vcpu *vgic_v3_irouter_to_vcpu(struct
> domain *d, uint64_t irouter)
>      return d->vcpu[vcpu_id];
>  }
>  
> -static struct vcpu *vgic_v3_get_target_vcpu(struct vcpu *v, unsigned int
> irq)
> -{
> -    struct vcpu *v_target;
> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
> +#define NR_BYTE_PER_IROUTER 8U

BYTES

>  
> +/*
> + * Fetch a IROUTER register based on the offset from IROUTER0. Only one
> + * vCPU will be listed for a given vIRQ.
> + *
> + * Note the offset will be aligned to the appropritate  boundary.

appropriate.

> + */
> +static uint64_t vgic_fetch_irouter(struct vgic_irq_rank *rank,
> +                                   unsigned int offset)
> +{
>      ASSERT(spin_is_locked(&rank->lock));
>  
> -    v_target = vgic_v3_irouter_to_vcpu(v->domain, rank->v3.irouter[irq %
> 32]);
> +    /* There is exactly 1 vIRQ per IROUTER */
> +    offset /= NR_BYTE_PER_IROUTER;
>  
> -    ASSERT(v_target != NULL);
> +    /* Get the index in the rank */
> +    offset &= INTERRUPT_RANK_MASK;
>  
> -    return v_target;
> +    return vcpuid_to_vaffinity(rank->vcpu[offset]);
> +}
> +
> +/*
> + * Store a IROUTER register in a convenient way and migrate the vIRQ

There's been a few places now where you said "a I<foo>R", all of which
should be "an I<foo>R".

> + * if necessary. This function only deals with ITARGETSR32 and onwards.
> + *
> + * Note the offset will be aligned to the appriopriate boundary.

"appropriate" again.

The rest looks ok, mostly based on my recollection of previous versions.

Ian.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  2015-10-22 15:53   ` Ian Campbell
@ 2015-10-22 16:36     ` Julien Grall
  2015-10-23  9:33       ` Ian Campbell
  0 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-22 16:36 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

Hi Ian,

On 22/10/15 16:53, Ian Campbell wrote:
> On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> 
> Subject: "correctly handle" and "writes"
> 
>> During a store, the byte is always in the low part of the register (i.e
>> [0:7]).
>>
>> Although, we are masking the register by using a shift of the
>> byte offset in the ITARGETSR. This will result to get a target list
>> equal to 0 which is ignored by the emulation.
> 
> I'm afraid I can't parse this.
> 
> I think instead of "Although" you might mean "incorrectly" as in "we are
> incorrectly...", but that would really then want the sentence to end
> "instead of <the right thing>". So perhaps:
> 
>     We are incorrectly masking the register by using a shift of the byte
>     offset in the ITARGETSR instead of <...something...>. This will result
>     in a target list equal to 0 which is ignored by the emulation.

Rather than "instead of..." what about "while the byte is always in r[0:7]"?

> (note also s/to get/in a/ in the second sentence)
> 
>> Because of that a guest won't be able to modify the any ITARGETSR using
>> byte access. Note that the first byte of each register will still be
>> writeable.
> 
> "Because of that the guest will only be able to modify the first byte in
> each ITARGETSR"
> 
> In your version the "any ITARGETSR" in the first sentence is immediately
> contradicted by the second sentence with gives an example of an ITARGETSR
> which it can modify.

Right, I will update the commit message.

>>
>> Furthermore, the body of the loop is retrieving the old target list
>> using the index of the byte.
>>
>> To avoid modifying too much the loop, shift the byte stored to the correct
>> offset.
> 
> That might have meant a smaller patch, but it's a lot harder to understand
> either the result or the diff.

The size of the patch would have been the same, although it would require
modifying the call to vgic_byte_read in the loop to access the correct
interrupt.

I didn't spend too much time modifying the loop because the
follow-up patch (#2) will rewrite it.

[...]

>>  xen/arch/arm/vgic-v2.c | 12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>> index 2d63e12..665afeb 100644
>> --- a/xen/arch/arm/vgic-v2.c
>> +++ b/xen/arch/arm/vgic-v2.c
>> @@ -346,11 +346,11 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
>>          /* 8-bit vcpu mask for this domain */
>>          BUG_ON(v->domain->max_vcpus > 8);
>>          target = (1 << v->domain->max_vcpus) - 1;
>> -        if ( dabt.size == 2 )
>> -            target = target | (target << 8) | (target << 16) | (target << 24);
>> +        target = target | (target << 8) | (target << 16) | (target << 24);
>> +        if ( dabt.size == DABT_WORD )
>> +            target &= r;
>>          else
>> -            target = (target << (8 * (gicd_reg & 0x3)));
>> -        target &= r;
>> +            target &= (r << (8 * (gicd_reg & 0x3)));
> 
> At this point do you not now have 3 bytes of
>     (1 << v->domain->max_vcpus) - 1;
> and 1 byte of that masked with the write?
> 
> IOW isn't this modifying the 3 bytes which aren't written?

No, the current loop searches for bits set to 1. As the target variable
will only contain one byte with some bits set to 1, only the IRQ
associated with this byte will be updated.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  2015-10-22 16:07   ` Ian Campbell
@ 2015-10-22 16:51     ` Julien Grall
  2015-10-23  9:30       ` Ian Campbell
  0 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-22 16:51 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

On 22/10/15 17:07, Ian Campbell wrote:
>> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>> index 665afeb..6b7eab3 100644
>> --- a/xen/arch/arm/vgic-v2.c
>> +++ b/xen/arch/arm/vgic-v2.c
>> @@ -50,6 +50,94 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase,
>> paddr_t vbase)
>>      vgic_v2_hw.vbase = vbase;
>>  }
>>  
>> +#define NR_TARGET_PER_REG 4U
>> +#define NR_BIT_PER_TARGET 8U
> 
> NR_TARGETS_ and NR_BITS_...
> 
> "REG" there is a bit generic, given this only works for registers with 8
> -bit fields, _ITARGETSR instead?

Well, this is within vgic-v2.c and only one register contains targets,
so I found it pointless to add ITARGETSR to the name.

> 
>> +/*
>> + * Store a ITARGETSR register. This function only deals with ITARGETSR8
>> + * and onwards.
>> + *
>> + * Note the offset will be aligned to the appropriate boundary.
>> + */
>> +static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank
>> *rank,
>> +                                 unsigned int offset, uint32_t
>> itargetsr)
>> +{
>> +    unsigned int i;
>> +    unsigned int regidx = REG_RANK_INDEX(8, offset, DABT_WORD);
>> +    unsigned int virq;
>> +
>> +    ASSERT(spin_is_locked(&rank->lock));
>> +
>> +    /*
>> +     * The ITARGETSR0-7, used for SGIs/PPIs, are implemented RO in the
>> +     * emulation and should never call this function.
>> +     *
>> +     * They all live in the first rank.
>> +     */
>> +    BUILD_BUG_ON(NR_INTERRUPT_PER_RANK != 32);
>> +    ASSERT(rank->index >= 1);
>> +
>> +    offset &= INTERRUPT_RANK_MASK;
>> +    offset &= ~(NR_TARGET_PER_REG - 1);
>> +
>> +    virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
>> +
>> +    for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++, virq++ )
>> +    {
>> +        unsigned int new_target, old_target;
>> +        uint8_t new_mask, old_mask;
>> +
>> +        new_mask = itargetsr & ((1 << NR_BIT_PER_TARGET) - 1);
>> +        old_mask = vgic_byte_read(rank->v2.itargets[regidx], i);
>> +
>> +        itargetsr >>= NR_BIT_PER_TARGET;
> 
> This is really a part of the for() iteration expression, but oddly place
> here.
> If you turn the "((1 << NR_BIT_PER_TARGET) - 1)" thing into a #define or
> constant, then you may find that extracting the relevant byte from the
> unshifted itargetsr using (itargetsr >> (i*NR_BITS_PER_TARGET) and then
> applying the mask is clean enough to use here instead.

I placed this shift here because I didn't want to use ... >> (i *
NR_BIT_..), which requires a multiplication rather than a shift in the
resulting code.

Furthermore, I found the for loop more complicated to read with the
itargetsr >>= expression inside.

Anyway, I prefer the latter solution, so I will use it in the next version.

> 
> I don't think you rely on the shifting of itargetsr apart from when
> calculating new_mask.

Correct.

>> +        /*
>> +         * SPIs are using the 1-N model (see 1.4.3 in ARM IHI 0048B).
>> +         * While the interrupt could be set pending to all the vCPUs in
>> +         * target list, it's not guarantee by the spec.
> 
> "guaranteed"
> 
>> +         * For simplicity, always route the vIRQ to the first interrupt
>> +         * in the target list
>> +         */
>> +        new_target = ffs(new_mask);
>> +        old_target = ffs(old_mask);
>> +
>> +        /* The current target should always be valid */
>> +        ASSERT(old_target && (old_target <= d->max_vcpus));
>> +
>> +        /*
>> +         * Ignore the write request for this interrupt if the new target
>> +         * is invalid.
>> +         * XXX: From the spec, if the target list is not valid, the
>> +         * interrupt should be ignored (i.e not forwarding to the
> 
> "forwarded".
> 
>> +         * guest).
>> +         */
>> +        if ( !new_target || (new_target > d->max_vcpus) )
>> +        {
>> +            printk(XENLOG_G_DEBUG
> 
> gdprintk?

I would prefer to keep this printk in non-debug builds to help us catch
any OS potentially using this trick.

Based on that, I would even use XENLOG_G_WARNING because this is not
really compliant with the spec and we are meant to fix it.

Regards,

-- 
Julien Grall


* Re: [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-22 16:17   ` Ian Campbell
@ 2015-10-22 17:15     ` Julien Grall
  2015-10-23  9:34       ` Ian Campbell
  0 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-22 17:15 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

Hi Ian,

On 22/10/15 17:17, Ian Campbell wrote:
> On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
>> [...]
>>          /* Only migrate the vIRQ if the target vCPU has changed */
>>          if ( new_target != old_target )
>>          {
>> +            unsigned int virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
> 
> FWIW this was the value of offset before it was shifted + masked, I think.
> Could you not just save it up top and remember it?

In fact, the virq is already correctly set before the loop (see patch #2):

virq = rank->index * NR_INTERRUPT_PER_RANK + offset;

The variable is incremented in the for loop. So I just forgot to drop
this line when I did the split.

Note that it's not possible to use offset directly because, for a byte
access, it will point to the modified byte and not to the base address
of the register.

I could use a mask instead, but I find this solution clearer.

Regards,

-- 
Julien Grall


* Re: [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  2015-10-22 16:51     ` Julien Grall
@ 2015-10-23  9:30       ` Ian Campbell
  2015-10-23  9:37         ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-23  9:30 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Thu, 2015-10-22 at 17:51 +0100, Julien Grall wrote:
> On 22/10/15 17:07, Ian Campbell wrote:
> > > diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> > > index 665afeb..6b7eab3 100644
> > > --- a/xen/arch/arm/vgic-v2.c
> > > +++ b/xen/arch/arm/vgic-v2.c
> > > @@ -50,6 +50,94 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t
> > > cbase,
> > > paddr_t vbase)
> > >      vgic_v2_hw.vbase = vbase;
> > >  }
> > >  
> > > +#define NR_TARGET_PER_REG 4U
> > > +#define NR_BIT_PER_TARGET 8U
> > 
> > NR_TARGETS_ and NR_BITS_...
> > 
> > "REG" there is a bit generic, given this only works for registers with
> > 8
> > -bit fields, _ITARGETSR instead?
> 
> Well this is within the vgic-v2.c and only one register contains target.
> So I found pointless to add ITARGETSR to the name.

It's the use of the generic "REG" when there is only one relevant register
(which could be named) which I found confusing, since the current name
implies it has wider relevance than it actually does.

> > This is really a part of the for() iteration expression, but oddly
> > place
> > here.
> > If you turn the "((1 << NR_BIT_PER_TARGET) - 1)" thing into a #define
> > or
> > constant, then you may find that extracting the relevant byte from the
> > unshifted itargetsr using (itargetsr >> (i*NR_BITS_PER_TARGET) and then
> > applying the mask is clean enough to use here instead.
> 
> I placed this shift here because I didn't want to use ... >> (i *
> NR_BIT_..) which require a multiplication rather than a shift in the
> resulting code.

FWIW, given a constant NR_BITS which is a power of two, I think i*NR_BITS
would end up as a shift with any reasonable compiler.

> > > +         * guest).
> > > +         */
> > > +        if ( !new_target || (new_target > d->max_vcpus) )
> > > +        {
> > > +            printk(XENLOG_G_DEBUG
> > 
> > gdprintk?
> 
> I would prefer to keep this printk in non-debug to help us catching any
> OS potentially using this trick.
> 
> Based on that I would even use XENLOG_G_WARNING because this is not
> really compliant to the spec and we are meant to fix it.

Assuming I remember correctly that XENLOG_G_WARNING is ratelimited in
default configurations, then OK.

> 
> Regards,
> 


* Re: [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  2015-10-22 16:36     ` Julien Grall
@ 2015-10-23  9:33       ` Ian Campbell
  2015-10-23  9:58         ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-23  9:33 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Thu, 2015-10-22 at 17:36 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 22/10/15 16:53, Ian Campbell wrote:
> > On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> > 
> > Subject: "correctly handle" and "writes"
> > 
> > > During a store, the byte is always in the low part of the register
> > > (i.e
> > > [0:7]).
> > > 
> > > Although, we are masking the register by using a shift of the
> > > byte offset in the ITARGETSR. This will result to get a target list
> > > equal to 0 which is ignored by the emulation.
> > 
> > I'm afraid I can't parse this.
> > 
> > I think instead of "Although" you might mean "incorrectly" as in "we
> > are
> > incorrectly...", but that would really then want the sentence to end
> > "instead of <the right thing>". So perhaps:
> > 
> >     We are incorrectly masking the register by using a shift of the
> > byte
> >     offset in the ITARGETSR instead of <...something...>. This will
> > result
> >     in a target list equal to 0 which is ignored by the emulation.
> 
> Rather than "instead of..." what about "while the byte is always in
> r[0:7]"?

Yes, I think that's ok.


> > > 
> > > Furthermore, the body of the loop is retrieving the old target list
> > > using the index of the byte.
> > > 
> > > To avoid modifying too much the loop, shift the byte stored to the
> > > correct
> > > offset.
> > 
> > That might have meant a smaller patch, but it's a lot harder to
> > understand
> > either the result or the diff.
> 
The size of the patch would have been the same, although it would require
modifying the call to vgic_byte_read in the loop to access the correct
interrupt.
> 
I didn't spend too much time modifying the loop because the follow-up
patch (#2) will rewrite it.

Since this patch is marked for backport, if we decide to take #2 then
that's probably OK; otherwise the state of the tree after just this patch
is more relevant.

That's in itself probably a stronger argument for taking #2 than the actual
functionality of #2 is.


> 
> [...]
> 
> > >  xen/arch/arm/vgic-v2.c | 12 ++++++------
> > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> > > index 2d63e12..665afeb 100644
> > > --- a/xen/arch/arm/vgic-v2.c
> > > +++ b/xen/arch/arm/vgic-v2.c
> > > @@ -346,11 +346,11 @@ static int vgic_v2_distr_mmio_write(struct vcpu
> > > *v, mmio_info_t *info,
> > >          /* 8-bit vcpu mask for this domain */
> > >          BUG_ON(v->domain->max_vcpus > 8);
> > >          target = (1 << v->domain->max_vcpus) - 1;
> > > -        if ( dabt.size == 2 )
> > > -            target = target | (target << 8) | (target << 16) |
> > > (target << 24);
> > > +        target = target | (target << 8) | (target << 16) | (target
> > > << 24);
> > > +        if ( dabt.size == DABT_WORD )
> > > +            target &= r;
> > >          else
> > > -            target = (target << (8 * (gicd_reg & 0x3)));
> > > -        target &= r;
> > > +            target &= (r << (8 * (gicd_reg & 0x3)));
> > 
> > At this point do you not now have 3 bytes of
> >     (1 << v->domain->max_vcpus) - 1;
> > and 1 byte of that masked with the write?
> > 
> > IOW isn't this modifying the 3 bytes which aren't written?
> 
> No, the current loop search for bit set to 1. As the target variable
> will only contain one byte with some bit set to 1, only the IRQ
> associated to this byte will be updated.

Um, OK, I'll take your word for that.

Ian.


* Re: [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-22 17:15     ` Julien Grall
@ 2015-10-23  9:34       ` Ian Campbell
  2015-10-23 10:01         ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-23  9:34 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Thu, 2015-10-22 at 18:15 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 22/10/15 17:17, Ian Campbell wrote:
> > On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> > > [...]
> > >          /* Only migrate the vIRQ if the target vCPU has changed */
> > >          if ( new_target != old_target )
> > >          {
> > > +            unsigned int virq = rank->index * NR_INTERRUPT_PER_RANK
> > > + offset;
> > 
> > FWIW this was the value of offset before it was shifted + masked, I
> > think.
> > Could you not just save it up top and remember it?
> 
> In fact, the virq is already correctly set before the loop (see patch
> #2):
> 
> virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
> 
> The variable is incremented in the for loop. So I just forgot to drop
> this line when I did the split.
> 
> Not that it's not possible to use directly offset because for byte
> access it will point to the byte modified and not the base address of
> the register.
> 
> Though, I could use a mask, but I find this solution clearer.

But per the above what is actually going to happen is you drop this change?

Ian.


* Re: [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  2015-10-23  9:30       ` Ian Campbell
@ 2015-10-23  9:37         ` Julien Grall
  2015-10-23  9:53           ` Ian Campbell
  0 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-23  9:37 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

Hi Ian,

On 23/10/15 10:30, Ian Campbell wrote:
> On Thu, 2015-10-22 at 17:51 +0100, Julien Grall wrote:
>> On 22/10/15 17:07, Ian Campbell wrote:
>>>> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>>>> index 665afeb..6b7eab3 100644
>>>> --- a/xen/arch/arm/vgic-v2.c
>>>> +++ b/xen/arch/arm/vgic-v2.c
>>>> @@ -50,6 +50,94 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t
>>>> cbase,
>>>> paddr_t vbase)
>>>>      vgic_v2_hw.vbase = vbase;
>>>>  }
>>>>  
>>>> +#define NR_TARGET_PER_REG 4U
>>>> +#define NR_BIT_PER_TARGET 8U
>>>
>>> NR_TARGETS_ and NR_BITS_...
>>>
>>> "REG" there is a bit generic, given this only works for registers with
>>> 8
>>> -bit fields, _ITARGETSR instead?
>>
>> Well this is within the vgic-v2.c and only one register contains target.
>> So I found pointless to add ITARGETSR to the name.
> 
> It's the use of the generic "REG" when there is only one relevant register
> (which could be named) which I found confusing, since the current name
> implies it has wider relevance than it actually does.

OK, I guess you'd like me to rename NR_BITS_PER_TARGET? If so, do you have
any suggestions?

>>> This is really a part of the for() iteration expression, but oddly
>>> place
>>> here.
>>> If you turn the "((1 << NR_BIT_PER_TARGET) - 1)" thing into a #define
>>> or
>>> constant, then you may find that extracting the relevant byte from the
>>> unshifted itargetsr using (itargetsr >> (i*NR_BITS_PER_TARGET) and then
>>> applying the mask is clean enough to use here instead.
>>
>> I placed this shift here because I didn't want to use ... >> (i *
>> NR_BIT_..) which require a multiplication rather than a shift in the
>> resulting code.
> 
> FWIW given a constant NR_BITS which is a power of two I think i*NR_BITS
> would end up a shift with any reasonable compiler.

I will take a look.

> 
>>>> +         * guest).
>>>> +         */
>>>> +        if ( !new_target || (new_target > d->max_vcpus) )
>>>> +        {
>>>> +            printk(XENLOG_G_DEBUG
>>>
>>> gdprintk?
>>
>> I would prefer to keep this printk in non-debug to help us catching any
>> OS potentially using this trick.
>>
>> Based on that I would even use XENLOG_G_WARNING because this is not
>> really compliant to the spec and we are meant to fix it.
> 
> Assuming I remember correctly that XENLOG_G_WARNING is ratelimited in
> default configurations, then OK.

Correct, any XENLOG_G_* is rate-limited by default.

Regards,

-- 
Julien Grall


* Re: [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
  2015-10-23  9:37         ` Julien Grall
@ 2015-10-23  9:53           ` Ian Campbell
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Campbell @ 2015-10-23  9:53 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Fri, 2015-10-23 at 10:37 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 23/10/15 10:30, Ian Campbell wrote:
> > On Thu, 2015-10-22 at 17:51 +0100, Julien Grall wrote:
> > > On 22/10/15 17:07, Ian Campbell wrote:
> > > > > diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> > > > > index 665afeb..6b7eab3 100644
> > > > > --- a/xen/arch/arm/vgic-v2.c
> > > > > +++ b/xen/arch/arm/vgic-v2.c
> > > > > @@ -50,6 +50,94 @@ void vgic_v2_setup_hw(paddr_t dbase, paddr_t
> > > > > cbase,
> > > > > paddr_t vbase)
> > > > >      vgic_v2_hw.vbase = vbase;
> > > > >  }
> > > > >  
> > > > > +#define NR_TARGET_PER_REG 4U
> > > > > +#define NR_BIT_PER_TARGET 8U
> > > > 
> > > > NR_TARGETS_ and NR_BITS_...
> > > > 
> > > > "REG" there is a bit generic, given this only works for registers
> > > > with
> > > > 8
> > > > -bit fields, _ITARGETSR instead?
> > > 
> > > Well this is within the vgic-v2.c and only one register contains
> > > target.
> > > So I found pointless to add ITARGETSR to the name.
> > 
> > It's the use of the generic "REG" when there is only one relevant
> > register
> > (which could be named) which I found confusing, since the current name
> > implies it has wider relevance than it actually does.
> 
> Ok. I guess you'd like to rename NR_BITS_PER_TARGET? If so, do you have
> any suggestion?

NR_BITS_PER_TARGET is OK here, since 8 is always correct for that in
GICv2, irrespective of which register it is used with.

NR_TARGET_PER_REG was the thing I thought we were discussing above. I think
that one should be NR_TARGETS_PER_ITARGETSR.

Alternatively
    const unsigned long nr_targets_per_reg = 32/NR_BITS_PER_TARGET;
in the context of the vgic_store_itargetsr function (and others which use
it) would be ok too. Or maybe 32=>(sizeof(...)*8). And maybe in that
context an even terser name would be ok too.

Ian.


* Re: [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  2015-10-23  9:33       ` Ian Campbell
@ 2015-10-23  9:58         ` Julien Grall
  2015-10-23 10:12           ` Ian Campbell
  0 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-23  9:58 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

On 23/10/15 10:33, Ian Campbell wrote:
> On Thu, 2015-10-22 at 17:36 +0100, Julien Grall wrote:
>> On 22/10/15 16:53, Ian Campbell wrote:
>>> On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:

[...]

>>>>
>>>> Furthermore, the body of the loop is retrieving the old target list
>>>> using the index of the byte.
>>>>
>>>> To avoid modifying too much the loop, shift the byte stored to the
>>>> correct
>>>> offset.
>>>
>>> That might have meant a smaller patch, but it's a lot harder to
>>> understand
>>> either the result or the diff.
>>
>> The size of the patch would have been the same. Although, it requires to
>> modify the call to vgic_byte_read in the loop to access the correct
>> interrupt.
>>
>> I didn't try to spend to much time to modify the loop because the
>> follow-up patch (#2) will rewrite the loop.
> 
> Since this patch is marked for backport then if we decided to take #2 then
> that's probably ok, otherwise the state of the tree after just this patch
> is more relevant.
> That's in itself probably a stronger argument for taking #2 than the actual
> functionality of #2 is.

This code is already complex and I don't think the loop modification
would have made the code easier to read.

However, my plan was to ask for a backport of the whole series once we
have exercised the code a bit in unstable, in order to fix 32-bit
accesses on 64-bit registers.

>>
>> [...]
>>
>>>>  xen/arch/arm/vgic-v2.c | 12 ++++++------
>>>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>>>> index 2d63e12..665afeb 100644
>>>> --- a/xen/arch/arm/vgic-v2.c
>>>> +++ b/xen/arch/arm/vgic-v2.c
>>>> @@ -346,11 +346,11 @@ static int vgic_v2_distr_mmio_write(struct vcpu
>>>> *v, mmio_info_t *info,
>>>>          /* 8-bit vcpu mask for this domain */
>>>>          BUG_ON(v->domain->max_vcpus > 8);
>>>>          target = (1 << v->domain->max_vcpus) - 1;
>>>> -        if ( dabt.size == 2 )
>>>> -            target = target | (target << 8) | (target << 16) |
>>>> (target << 24);
>>>> +        target = target | (target << 8) | (target << 16) | (target
>>>> << 24);
>>>> +        if ( dabt.size == DABT_WORD )
>>>> +            target &= r;
>>>>          else
>>>> -            target = (target << (8 * (gicd_reg & 0x3)));
>>>> -        target &= r;
>>>> +            target &= (r << (8 * (gicd_reg & 0x3)));
>>>
>>> At this point do you not now have 3 bytes of
>>>     (1 << v->domain->max_vcpus) - 1;
>>> and 1 byte of that masked with the write?
>>>
>>> IOW isn't this modifying the 3 bytes which aren't written?
>>
>> No, the current loop search for bit set to 1. As the target variable
>> will only contain one byte with some bit set to 1, only the IRQ
>> associated to this byte will be updated.
> 
> Um, OK, I'll take your word for that.

FWIW, I wrote a Linux patch to exercise the changed code and I didn't
see any issue:

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 982c09c..b6de74f 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -446,6 +446,7 @@ static void gic_cpu_if_up(struct gic_chip_data *gic)
        writel_relaxed(bypass | mode | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
+#include <xen/hvc-console.h>
 
 static void __init gic_dist_init(struct gic_chip_data *gic)
 {
@@ -453,6 +454,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
        u32 cpumask;
        unsigned int gic_irqs = gic->gic_irqs;
        void __iomem *base = gic_data_dist_base(gic);
+       void __iomem *target = base + GIC_DIST_TARGET;
 
        writel_relaxed(GICD_DISABLE, base + GIC_DIST_CTRL);
 
@@ -465,6 +467,45 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
        for (i = 32; i < gic_irqs; i += 4)
                writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
 
+       for (i = 32; i < 34; i++) {
+               unsigned int n = i / 4;
+               unsigned int nb = i % 4;
+               int j;
+               void __iomem *regaddr = target + (i & ~0x3);
+               void __iomem *byte = target + i;
+
+               xen_raw_printk("Exerce ITARGETSR for irq %u\n", i);
+               xen_raw_printk("\t 32-bit ITARGETSR%u = 0x%x\n",
+                              n, readl_relaxed(regaddr));
+               xen_raw_printk("\t 8-bit ITARGETSR%u[%u] = 0x%x\n",
+                              n, nb, readb_relaxed(byte));
+               for (j = 0; j < 5; j++) {
+                       xen_raw_printk("switch to CPU%u\n", j);
+                       writeb(1 << j, byte);
+                       xen_raw_printk("\t 32-bit ITARGETSR%u = 0x%x\n",
+                                      n, readl_relaxed(regaddr));
+                       xen_raw_printk("\t 8-bit ITARGETSR%u[%u] = 0x%x\n",
+                                      n, nb, readb_relaxed(byte));
+               }
+       }
+
+       cpumask = 0x2;
+       cpumask |= cpumask << 8;
+       cpumask |= cpumask << 16;
+       xen_raw_printk("Excerce 32-bit access\n");
+       for (i = 32; i < 38; i += 4)
+       {
+               unsigned int n = i / 4;
+               void __iomem *regaddr = target + i * 4 / 4;
+
+               xen_raw_printk("Exerce 32-bit access on ITARGETSR%u\n", n);
+               xen_raw_printk("\t before = 0x%x\n", readl_relaxed(regaddr));
+               writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
+               xen_raw_printk("\t after = 0x%x\n", readl_relaxed(regaddr));
+       }
+
+       while (1);
+
        gic_dist_config(base, gic_irqs, NULL);
 
        writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);

Regards,


-- 
Julien Grall


* Re: [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-23  9:34       ` Ian Campbell
@ 2015-10-23 10:01         ` Julien Grall
  2015-10-23 10:14           ` Ian Campbell
  0 siblings, 1 reply; 22+ messages in thread
From: Julien Grall @ 2015-10-23 10:01 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

On 23/10/15 10:34, Ian Campbell wrote:
> On Thu, 2015-10-22 at 18:15 +0100, Julien Grall wrote:
>> Hi Ian,
>>
>> On 22/10/15 17:17, Ian Campbell wrote:
>>> On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
>>>> [...]
>>>>          /* Only migrate the vIRQ if the target vCPU has changed */
>>>>          if ( new_target != old_target )
>>>>          {
>>>> +            unsigned int virq = rank->index * NR_INTERRUPT_PER_RANK
>>>> + offset;
>>>
>>> FWIW this was the value of offset before it was shifted + masked, I
>>> think.
>>> Could you not just save it up top and remember it?
>>
>> In fact, the virq is already correctly set before the loop (see patch
>> #2):
>>
>> virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
>>
>> The variable is incremented in the for loop. So I just forgot to drop
>> this line when I did the split.
>>
>> Not that it's not possible to use directly offset because for byte
>> access it will point to the byte modified and not the base address of
>> the register.
>>
>> Though, I could use a mask, but I find this solution clearer.
> 
> But per the above what is actually going to happen is you drop this change?

As said, the introduction of virq within this patch is a mistake.
Patch #2 already set virq before the loop:

offset &= INTERRUPT_RANK_MASK;
offset &= ~(NR_TARGET_PER_REG - 1);

virq = rank->index * NR_INTERRUPT_PER_RANK + offset;

for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++, virq++ )


-- 
Julien Grall


* Re: [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
  2015-10-23  9:58         ` Julien Grall
@ 2015-10-23 10:12           ` Ian Campbell
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Campbell @ 2015-10-23 10:12 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Fri, 2015-10-23 at 10:58 +0100, Julien Grall wrote:
> On 23/10/15 10:33, Ian Campbell wrote:
> > On Thu, 2015-10-22 at 17:36 +0100, Julien Grall wrote:
> > > On 22/10/15 16:53, Ian Campbell wrote:
> > > > On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> 
> [...]
> 
> > > > > 
> > > > > Furthermore, the body of the loop is retrieving the old target
> > > > > list
> > > > > using the index of the byte.
> > > > > 
> > > > > To avoid modifying too much the loop, shift the byte stored to
> > > > > the
> > > > > correct
> > > > > offset.
> > > > 
> > > > That might have meant a smaller patch, but it's a lot harder to
> > > > understand
> > > > either the result or the diff.
> > > 
> > > The size of the patch would have been the same. Although, it requires
> > > to
> > > modify the call to vgic_byte_read in the loop to access the correct
> > > interrupt.
> > > 
> > > I didn't try to spend to much time to modify the loop because the
> > > follow-up patch (#2) will rewrite the loop.
> > 
> > Since this patch is marked for backport then if we decided to take #2
> > then
> > that's probably ok, otherwise the state of the tree after just this
> > patch
> > is more relevant.
> > That's in itself probably a stronger argument for taking #2 than the
> > actual
> > functionality of #2 is.
> 
> This code is already complex and I don't think the loop modification would have
> make the code easier to read.
> 
> Although, my plan was to ask to backport the whole series once we exercise
> the code a bit in unstable. This is in order to fix 32-bit access on 64-bit
> register.

OK, if the whole series is going to be backported at once then the
maintainability of the intermediate states is not so critical.

Ian.


* Re: [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-23 10:01         ` Julien Grall
@ 2015-10-23 10:14           ` Ian Campbell
  2015-10-23 10:15             ` Julien Grall
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2015-10-23 10:14 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: stefano.stabellini

On Fri, 2015-10-23 at 11:01 +0100, Julien Grall wrote:
> On 23/10/15 10:34, Ian Campbell wrote:
> > On Thu, 2015-10-22 at 18:15 +0100, Julien Grall wrote:
> > > Hi Ian,
> > > 
> > > On 22/10/15 17:17, Ian Campbell wrote:
> > > > On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
> > > > > [...]
> > > > >          /* Only migrate the vIRQ if the target vCPU has changed
> > > > > */
> > > > >          if ( new_target != old_target )
> > > > >          {
> > > > > +            unsigned int virq = rank->index *
> > > > > NR_INTERRUPT_PER_RANK
> > > > > + offset;
> > > > 
> > > > FWIW this was the value of offset before it was shifted + masked, I
> > > > think.
> > > > Could you not just save it up top and remember it?
> > > 
> > > In fact, the virq is already correctly set before the loop (see patch
> > > #2):
> > > 
> > > virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
> > > 
> > > The variable is incremented in the for loop. So I just forgot to drop
> > > this line when I did the split.
> > > 
> > > Not that it's not possible to use directly offset because for byte
> > > access it will point to the byte modified and not the base address of
> > > the register.
> > > 
> > > Though, I could use a mask, but I find this solution clearer.
> > 
> > But per the above what is actually going to happen is you drop this
> > change?
> 
> As said, the introduction of virq within this patch is a mistake.
> Patch #2 already set virq before the loop:

I thought that was what you said, but then your final line seemed to
contradict that by implying that you wanted to keep virq here (the
implication of saying it is clearer to you).

> offset &= INTERRUPT_RANK_MASK;
> offset &= ~(NR_TARGET_PER_REG - 1);
> 
> virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
> 
> for ( i = 0; i < NR_TARGET_PER_REG; i++, offset++, virq++ )
> 
> 


* Re: [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
  2015-10-23 10:14           ` Ian Campbell
@ 2015-10-23 10:15             ` Julien Grall
  0 siblings, 0 replies; 22+ messages in thread
From: Julien Grall @ 2015-10-23 10:15 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: stefano.stabellini

On 23/10/15 11:14, Ian Campbell wrote:
> On Fri, 2015-10-23 at 11:01 +0100, Julien Grall wrote:
>> On 23/10/15 10:34, Ian Campbell wrote:
>>> On Thu, 2015-10-22 at 18:15 +0100, Julien Grall wrote:
>>>> Hi Ian,
>>>>
>>>> On 22/10/15 17:17, Ian Campbell wrote:
>>>>> On Mon, 2015-10-12 at 15:22 +0100, Julien Grall wrote:
>>>>>> [...]
>>>>>>          /* Only migrate the vIRQ if the target vCPU has changed
>>>>>> */
>>>>>>          if ( new_target != old_target )
>>>>>>          {
>>>>>> +            unsigned int virq = rank->index *
>>>>>> NR_INTERRUPT_PER_RANK
>>>>>> + offset;
>>>>>
>>>>> FWIW this was the value of offset before it was shifted + masked, I
>>>>> think.
>>>>> Could you not just save it up top and remember it?
>>>>
>>>> In fact, the virq is already correctly set before the loop (see patch
>>>> #2):
>>>>
>>>> virq = rank->index * NR_INTERRUPT_PER_RANK + offset;
>>>>
>>>> The variable is incremented in the for loop. So I just forgot to drop
>>>> this line when I did the split.
>>>>
>>>> Not that it's not possible to use directly offset because for byte
>>>> access it will point to the byte modified and not the base address of
>>>> the register.
>>>>
>>>> Though, I could use a mask, but I find this solution clearer.
>>>
>>> But per the above what is actually going to happen is you drop this
>>> change?
>>
>> As said, the introduction of virq within this patch is a mistake.
>> Patch #2 already set virq before the loop:
> 
> I thought that was what you said, but then your final line seemed to
> contradict that by implying that you wanted to keep virq here (the
> implication of saying it is clearer to you).

Sorry, I was speaking about using the unmodified offset, i.e. something like

virq = offset & 0x3;

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-10-23 10:16 UTC | newest]

Thread overview: 22+ messages
-- links below jump to the message on this page --
2015-10-12 14:22 [PATCH v4 0/5] xen/arm: vgic: Support 32-bit access for 64-bit register Julien Grall
2015-10-12 14:22 ` [PATCH v4 1/5] xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR Julien Grall
2015-10-22 15:53   ` Ian Campbell
2015-10-22 16:36     ` Julien Grall
2015-10-23  9:33       ` Ian Campbell
2015-10-23  9:58         ` Julien Grall
2015-10-23 10:12           ` Ian Campbell
2015-10-12 14:22 ` [PATCH v4 2/5] xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0 Julien Grall
2015-10-22 16:07   ` Ian Campbell
2015-10-22 16:51     ` Julien Grall
2015-10-23  9:30       ` Ian Campbell
2015-10-23  9:37         ` Julien Grall
2015-10-23  9:53           ` Ian Campbell
2015-10-12 14:22 ` [PATCH v4 3/5] xen/arm: vgic: Optimize the way to store the target vCPU in the rank Julien Grall
2015-10-22 16:17   ` Ian Campbell
2015-10-22 17:15     ` Julien Grall
2015-10-23  9:34       ` Ian Campbell
2015-10-23 10:01         ` Julien Grall
2015-10-23 10:14           ` Ian Campbell
2015-10-23 10:15             ` Julien Grall
2015-10-12 14:22 ` [PATCH v4 4/5] xen/arm: vgic: Introduce helpers to extract/update/clear/set vGIC register Julien Grall
2015-10-12 14:22 ` [PATCH v4 5/5] xen/arm: vgic-v3: Support 32-bit access for 64-bit registers Julien Grall
