All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] xen/arm: run on real hardware
@ 2012-11-13 15:40 Stefano Stabellini
  2012-11-13 15:42 ` [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank Stefano Stabellini
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Tim (Xen.org), Ian Campbell, Stefano Stabellini

Hi all,
this is a collection of fixes that I wrote trying to run Xen on a real
Versatile Express Cortex A15 machine.


Changes in v2:
- rework the first patch to fix vgic emulation;
- fix code style in head.S;
- more comments in head.S;
- always hook xen_fixmap in the pagetable but fill in the console fixmap
only if EARLY_UART_ADDRESS;
- detect the processor ID and call a processor specific initialization
function;
- move the ACTLR initialization to the CortexA15 initialization
function;
- move the ACTLR_* defines to processor-ca15.h;
- use preprocessor definitions in kick_cpus;
- do not manually increment the base address register, use an offset
instead;
- move kick_cpus to proc-ca15.S;
- add a comment to described why we need a DSB at the beginning of
write_pte;
- do not issue ISB within write_pte, call isb() afterwards whenever
appropriate;
- issue DSB after DCCMVAC in write_pte to make sure that the data flush
is completed before proceeding;
- make flush_xen_dcache_va take a void* as argument;
- introduce flush_xen_dcache_va_range;
- return always at least 1 cpu from device_tree_cpus;
- skip nodes with names that don't start with "cpu".



Stefano Stabellini (7):
      xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank
      xen/arm: setup the fixmap in head.S
      pl011: set baud and clock_hz to the right defaults for Versatile Express
      xen/arm: set the SMP bit in the ACTLR register
      xen/arm: wake up secondary cpus
      xen/arm: flush D-cache and I-cache when appropriate
      xen/arm: get the number of cpus from device tree

 xen/arch/arm/Makefile                   |    1 +
 xen/arch/arm/early_printk.c             |    5 +-
 xen/arch/arm/gic.c                      |    4 +-
 xen/arch/arm/gic.h                      |    4 +-
 xen/arch/arm/head.S                     |   67 ++++++++++++++++++++++---------
 xen/arch/arm/mm.c                       |   20 ++++++++-
 xen/arch/arm/mode_switch.S              |   28 +++++++++++++
 xen/arch/arm/proc-ca15.S                |   28 +++++++++++++
 xen/arch/arm/setup.c                    |    9 +++-
 xen/arch/arm/smpboot.c                  |    2 +
 xen/arch/arm/vgic.c                     |   25 ++++++------
 xen/common/device_tree.c                |   19 +++++++++
 xen/drivers/char/pl011.c                |    4 +-
 xen/include/asm-arm/cpregs.h            |    1 +
 xen/include/asm-arm/page.h              |   51 ++++++++++++++++++++++-
 xen/include/asm-arm/platform_vexpress.h |   17 ++++++++
 xen/include/asm-arm/processor-ca15.h    |   45 +++++++++++++++++++++
 xen/include/asm-arm/processor.h         |    3 +
 xen/include/xen/device_tree.h           |    1 +
 19 files changed, 285 insertions(+), 49 deletions(-)


Cheers,

Stefano

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:26   ` Ian Campbell
  2012-11-13 15:42 ` [PATCH 2/7] xen/arm: setup the fixmap in head.S Stefano Stabellini
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

Use 1 for registers that have 1 bit per irq.

Changes in v2:
- add case 1 to REG_RANK_NR rather than changing the implementation.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/vgic.c |   25 +++++++++++++------------
 1 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 3c3983f..3f7e757 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -47,6 +47,7 @@ static inline int REG_RANK_NR(int b, uint32_t n)
     case 8: return n >> 3;
     case 4: return n >> 2;
     case 2: return n >> 1;
+    case 1: return n;
     default: BUG();
     }
 }
@@ -199,7 +200,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ISENABLER ... GICD_ISENABLERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISENABLER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISENABLER);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = rank->ienable;
@@ -208,7 +209,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ICENABLER ... GICD_ICENABLERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICENABLER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICENABLER);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = rank->ienable;
@@ -217,7 +218,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ISPENDR ... GICD_ISPENDRN:
         if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISPENDR);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISPENDR);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = byte_read(rank->ipend, dabt.sign, offset);
@@ -226,7 +227,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ICPENDR ... GICD_ICPENDRN:
         if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICPENDR);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICPENDR);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = byte_read(rank->ipend, dabt.sign, offset);
@@ -235,7 +236,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ISACTIVER ... GICD_ISACTIVERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISACTIVER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISACTIVER);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = rank->iactive;
@@ -244,7 +245,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ICACTIVER ... GICD_ICACTIVERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICACTIVER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICACTIVER);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = rank->iactive;
@@ -296,7 +297,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_CPENDSGIR ... GICD_CPENDSGIRN:
         if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_CPENDSGIR);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_CPENDSGIR);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = byte_read(rank->pendsgi, dabt.sign, offset);
@@ -305,7 +306,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
 
     case GICD_SPENDSGIR ... GICD_SPENDSGIRN:
         if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_SPENDSGIR);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_SPENDSGIR);
         if ( rank == NULL) goto read_as_zero;
         vgic_lock_rank(v, rank);
         *r = byte_read(rank->pendsgi, dabt.sign, offset);
@@ -400,7 +401,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ISENABLER ... GICD_ISENABLERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISENABLER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISENABLER);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank);
         tr = rank->ienable;
@@ -411,7 +412,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ICENABLER ... GICD_ICENABLERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICENABLER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICENABLER);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank);
         rank->ienable &= ~*r;
@@ -432,7 +433,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ISACTIVER ... GICD_ISACTIVERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISACTIVER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISACTIVER);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank);
         rank->iactive &= ~*r;
@@ -441,7 +442,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
 
     case GICD_ICACTIVER ... GICD_ICACTIVERN:
         if ( dabt.size != 2 ) goto bad_width;
-        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICACTIVER);
+        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICACTIVER);
         if ( rank == NULL) goto write_ignore;
         vgic_lock_rank(v, rank);
         rank->iactive &= ~*r;
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/7] xen/arm: setup the fixmap in head.S
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
  2012-11-13 15:42 ` [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:27   ` Ian Campbell
  2012-11-13 15:42 ` [PATCH 3/7] pl011: set baud and clock_hz to the right defaults for Versatile Express Stefano Stabellini
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

Setup the fixmap mapping directly in head.S rather than having a
temporary mapping only to re-do it later in C.


Changes in v2:

- fix code style;
- more comments;
- always hook xen_fixmap in the pagetable but fill in the console fixmap
only if EARLY_UART_ADDRESS.


Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/early_printk.c |    5 ++---
 xen/arch/arm/head.S         |   40 +++++++++++++++++++++++-----------------
 xen/arch/arm/mm.c           |    2 +-
 xen/arch/arm/setup.c        |    2 --
 4 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/early_printk.c b/xen/arch/arm/early_printk.c
index 3e51252..bdf4c0e 100644
--- a/xen/arch/arm/early_printk.c
+++ b/xen/arch/arm/early_printk.c
@@ -17,12 +17,11 @@
 
 #ifdef EARLY_UART_ADDRESS
 
-static void __init early_putch(char c)
+void __init early_putch(char c)
 {
     volatile uint32_t *r;
 
-    r = (uint32_t *)((EARLY_UART_ADDRESS & 0x001fffff)
-                     + XEN_VIRT_START + (1 << 21));
+    r = (uint32_t *)(XEN_VIRT_START + (1 << 21));
 
     /* XXX: assuming a PL011 UART. */
     while(*(r + 0x6) & 0x8)
diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
index 9fdeda7..4506244 100644
--- a/xen/arch/arm/head.S
+++ b/xen/arch/arm/head.S
@@ -24,6 +24,7 @@
 #define PT_PT  0xe7f /* nG=1, AF=1, SH=10, AP=01, NS=1, ATTR=111, T=1, P=1 */
 #define PT_MEM 0xe7d /* nG=1, AF=1, SH=10, AP=01, NS=1, ATTR=111, T=0, P=1 */
 #define PT_DEV 0xe71 /* nG=1, AF=1, SH=10, AP=01, NS=1, ATTR=100, T=0, P=1 */
+#define PT_DEV_L3 0xe73 /* lev3: nG=1, AF=1, SH=10, AP=01, NS=1, ATTR=100, T=1, P=1 */
 
 #define PT_UPPER(x) (PT_##x & 0xf00)
 #define PT_LOWER(x) (PT_##x & 0x0ff)
@@ -183,6 +184,18 @@ skip_bss:
 	teq   r12, #0
 	bne   pt_ready
 	
+	/* console fixmap */
+#ifdef EARLY_UART_ADDRESS
+	ldr   r1, =xen_fixmap
+	add   r1, r1, r10            /* r1 := paddr (xen_fixmap) */
+	mov   r3, #0
+	lsr   r2, r11, #12
+	lsl   r2, r2, #12            /* 4K aligned paddr of UART */
+	orr   r2, r2, #PT_UPPER(DEV_L3)
+	orr   r2, r2, #PT_LOWER(DEV_L3) /* r2:r3 := 4K dev map including UART */
+	strd  r2, r3, [r1, #(FIXMAP_CONSOLE*8)] /* Map it in the first fixmap's slot */
+#endif
+	
 	/* Build the baseline idle pagetable's first-level entries */
 	ldr   r1, =xen_second
 	add   r1, r1, r10            /* r1 := paddr (xen_second) */
@@ -205,17 +218,15 @@ skip_bss:
 	ldr   r4, =start
 	lsr   r4, #18                /* Slot for vaddr(start) */
 	strd  r2, r3, [r1, r4]       /* Map Xen there too */
-#ifdef EARLY_UART_ADDRESS
-	ldr   r3, =(1<<(54-32))      /* NS for device mapping */
-	lsr   r2, r11, #21
-	lsl   r2, r2, #21            /* 2MB-aligned paddr of UART */
-	orr   r2, r2, #PT_UPPER(DEV)
-	orr   r2, r2, #PT_LOWER(DEV) /* r2:r3 := 2MB dev map including UART */
+
+	/* xen_fixmap pagetable */
+	ldr   r2, =xen_fixmap
+	add   r2, r2, r10            /* r2 := paddr (xen_fixmap) */
+	orr   r2, r2, #PT_UPPER(PT)
+	orr   r2, r2, #PT_LOWER(PT)  /* r2:r3 := table map of xen_fixmap */
 	add   r4, r4, #8
 	strd  r2, r3, [r1, r4]       /* Map it in the fixmap's slot */
-#else
-	add   r4, r4, #8             /* Skip over unused fixmap slot */
-#endif
+
 	mov   r3, #0x0
 	lsr   r2, r8, #21
 	lsl   r2, r2, #21            /* 2MB-aligned paddr of DTB */
@@ -236,13 +247,10 @@ pt_ready:
 	mov   pc, r1                 /* Get a proper vaddr into PC */
 paging:
 
+	
 #ifdef EARLY_UART_ADDRESS
-	/* Recover the UART address in the new address space. */
-	lsl   r11, #11
-	lsr   r11, #11               /* UART base's offset from 2MB base */
-	adr   r0, start
-	add   r0, r0, #0x200000      /* vaddr of the fixmap's 2MB slot */
-	add   r11, r11, r0           /* r11 := vaddr (UART base address) */
+    /* Use a virtual address to access the UART. */
+	ldr   r11, =FIXMAP_ADDR(FIXMAP_CONSOLE)
 #endif
 
 	PRINT("- Ready -\r\n")
@@ -261,8 +269,6 @@ paging:
 	mcr   CP32(r0, BPIALL)       /* Flush branch predictor */
 	dsb                          /* Ensure completion of TLB+BP flush */
 	isb
- 	/* Now, the UART is in its proper fixmap address */
-	ldrne r11, =FIXMAP_ADDR(FIXMAP_CONSOLE)
 
 	/* Non-boot CPUs report that they've got this far */
 	ldr   r0, =ready_cpus
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index b0068d2..d0cd2c9 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -37,7 +37,7 @@ struct domain *dom_xen, *dom_io;
 /* Static start-of-day pagetables that we use before the allocators are up */
 lpae_t xen_pgtable[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 lpae_t xen_second[LPAE_ENTRIES*4] __attribute__((__aligned__(4096*4)));
-static lpae_t xen_fixmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
+lpae_t xen_fixmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 static lpae_t xen_xenmap[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 
 /* Non-boot CPUs use this to find the correct pagetables. */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 7568968..6fbcb81 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -194,9 +194,7 @@ void __init start_xen(unsigned long boot_phys_offset,
     setup_pagetables(boot_phys_offset, get_xen_paddr());
 
 #ifdef EARLY_UART_ADDRESS
-    /* Map the UART */
     /* TODO Need to get device tree or command line for UART address */
-    set_fixmap(FIXMAP_CONSOLE, EARLY_UART_ADDRESS >> PAGE_SHIFT, DEV_SHARED);
     pl011_init(0, FIXMAP_ADDR(FIXMAP_CONSOLE));
     console_init_preirq();
 #endif
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/7] pl011: set baud and clock_hz to the right defaults for Versatile Express
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
  2012-11-13 15:42 ` [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank Stefano Stabellini
  2012-11-13 15:42 ` [PATCH 2/7] xen/arm: setup the fixmap in head.S Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:27   ` Ian Campbell
  2012-11-13 15:42 ` [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register Stefano Stabellini
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
 xen/drivers/char/pl011.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/char/pl011.c b/xen/drivers/char/pl011.c
index 6af45aa..6ccb73a 100644
--- a/xen/drivers/char/pl011.c
+++ b/xen/drivers/char/pl011.c
@@ -241,8 +241,8 @@ void __init pl011_init(int index, unsigned long register_base_address)
 
     uart = &pl011_com[index];
 
-    uart->clock_hz  = 7372800;
-    uart->baud      = 115200;
+    uart->clock_hz  = 0x16e3600;
+    uart->baud      = 38400;
     uart->data_bits = 8;
     uart->parity    = PARITY_NONE;
     uart->stop_bits = 1;
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
                   ` (2 preceding siblings ...)
  2012-11-13 15:42 ` [PATCH 3/7] pl011: set baud and clock_hz to the right defaults for Versatile Express Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:27   ` Ian Campbell
  2012-11-13 15:42 ` [PATCH 5/7] xen/arm: wake up secondary cpus Stefano Stabellini
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

>From the Cortex A15 manual:

"Enables the processor to receive instruction cache, BTB, and TLB maintenance
operations from other processors"

...

"You must set this bit before enabling the caches and MMU, or
performing any cache and TLB maintenance operations. The only time
you must clear this bit is during a processor power-down sequence"


Changes in v2:
- detect the processor ID and call a processor specific initialization
function;
- move the ACTLR initialization to the CortexA15 initialization
function;
- move the ACTLR_* defines to processor-ca15.h.


Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
 xen/arch/arm/Makefile                |    1 +
 xen/arch/arm/head.S                  |   10 +++++++
 xen/arch/arm/proc-ca15.S             |   28 +++++++++++++++++++++
 xen/include/asm-arm/cpregs.h         |    1 +
 xen/include/asm-arm/processor-ca15.h |   45 ++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/processor.h      |    3 ++
 6 files changed, 88 insertions(+), 0 deletions(-)
 create mode 100644 xen/arch/arm/proc-ca15.S
 create mode 100644 xen/include/asm-arm/processor-ca15.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 634b620..d3e34bc 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -13,6 +13,7 @@ obj-y += irq.o
 obj-y += kernel.o
 obj-y += mm.o
 obj-y += mode_switch.o
+obj-y += proc-ca15.o
 obj-y += p2m.o
 obj-y += percpu.o
 obj-y += guestcopy.o
diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
index 4506244..25993d6 100644
--- a/xen/arch/arm/head.S
+++ b/xen/arch/arm/head.S
@@ -19,6 +19,7 @@
 
 #include <asm/config.h>
 #include <asm/page.h>
+#include <asm/processor-ca15.h>
 #include <asm/asm_defns.h>
 
 #define PT_PT  0xe7f /* nG=1, AF=1, SH=10, AP=01, NS=1, ATTR=111, T=1, P=1 */
@@ -148,6 +149,15 @@ skip_bss:
 
 	PRINT("- Setting up control registers -\r\n")
 	
+	/* Read CPU ID */
+	mrc   CP32(r0, MIDR)
+	ldr   r1, =(MIDR_MASK)
+	and   r0, r0, r1
+	/* Is this a Cortex A15? */
+	ldr   r1, =(CORTEX_A15_ID)
+	teq   r0, r1
+	bleq  cortex_a15_init
+
 	/* Set up memory attribute type tables */
 	ldr   r0, =MAIR0VAL
 	ldr   r1, =MAIR1VAL
diff --git a/xen/arch/arm/proc-ca15.S b/xen/arch/arm/proc-ca15.S
new file mode 100644
index 0000000..5a5bf64
--- /dev/null
+++ b/xen/arch/arm/proc-ca15.S
@@ -0,0 +1,28 @@
+/*
+ * xen/arch/arm/proc-ca15.S
+ *
+ * Cortex A15 specific initializations
+ *
+ * Copyright (c) 2011 Citrix Systems.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <asm/asm_defns.h>
+#include <asm/processor-ca15.h>
+
+.globl cortex_a15_init
+cortex_a15_init:
+	/* Set up the SMP bit in ACTLR */
+	mrc   CP32(r0, ACTLR)
+	orr   r0, r0, #(ACTLR_CA15_SMP) /* enable SMP bit*/
+	mcr   CP32(r0, ACTLR)
+	mov   pc, lr
diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
index 34a9e93..3b51845 100644
--- a/xen/include/asm-arm/cpregs.h
+++ b/xen/include/asm-arm/cpregs.h
@@ -104,6 +104,7 @@
 /* Coprocessor 15 */
 
 /* CP15 CR0: CPUID and Cache Type Registers */
+#define MIDR            p15,0,c0,c0,0   /* Main ID Register */
 #define MPIDR           p15,0,c0,c0,5   /* Multiprocessor Affinity Register */
 #define ID_PFR0         p15,0,c0,c1,0   /* Processor Feature Register 0 */
 #define ID_PFR1         p15,0,c0,c1,1   /* Processor Feature Register 1 */
diff --git a/xen/include/asm-arm/processor-ca15.h b/xen/include/asm-arm/processor-ca15.h
new file mode 100644
index 0000000..86231b3
--- /dev/null
+++ b/xen/include/asm-arm/processor-ca15.h
@@ -0,0 +1,45 @@
+#ifndef __ASM_ARM_PROCESSOR_CA15_H
+#define __ASM_ARM_PROCESSOR_CA15_H
+
+
+#define CORTEX_A15_ID     (0x410FC0F0)
+
+/* ACTLR Auxiliary Control Register, Cortex A15 */
+#define ACTLR_CA15_SNOOP_DELAYED      (1<<31)
+#define ACTLR_CA15_MAIN_CLOCK         (1<<30)
+#define ACTLR_CA15_NEON_CLOCK         (1<<29)
+#define ACTLR_CA15_NONCACHE           (1<<24)
+#define ACTLR_CA15_INORDER_REQ        (1<<23)
+#define ACTLR_CA15_INORDER_LOAD       (1<<22)
+#define ACTLR_CA15_L2_TLB_PREFETCH    (1<<21)
+#define ACTLR_CA15_L2_IPA_PA_CACHE    (1<<20)
+#define ACTLR_CA15_L2_CACHE           (1<<19)
+#define ACTLR_CA15_L2_PA_CACHE        (1<<18)
+#define ACTLR_CA15_TLB                (1<<17)
+#define ACTLR_CA15_STRONGY_ORDERED    (1<<16)
+#define ACTLR_CA15_INORDER            (1<<15)
+#define ACTLR_CA15_FORCE_LIM          (1<<14)
+#define ACTLR_CA15_CP_FLUSH           (1<<13)
+#define ACTLR_CA15_CP_PUSH            (1<<12)
+#define ACTLR_CA15_LIM                (1<<11)
+#define ACTLR_CA15_SER                (1<<10)
+#define ACTLR_CA15_OPT                (1<<9)
+#define ACTLR_CA15_WFI                (1<<8)
+#define ACTLR_CA15_WFE                (1<<7)
+#define ACTLR_CA15_SMP                (1<<6)
+#define ACTLR_CA15_PLD                (1<<5)
+#define ACTLR_CA15_IP                 (1<<4)
+#define ACTLR_CA15_MICRO_BTB          (1<<3)
+#define ACTLR_CA15_LOOP_ONE           (1<<2)
+#define ACTLR_CA15_LOOP_DISABLE       (1<<1)
+#define ACTLR_CA15_BTB                (1<<0)
+
+#endif /* __ASM_ARM_PROCESSOR_CA15_H */
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 3849b23..e0c0beb 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -3,6 +3,9 @@
 
 #include <asm/cpregs.h>
 
+/* MIDR Main ID Register */
+#define MIDR_MASK    0xff0ffff0
+
 /* TTBCR Translation Table Base Control Register */
 #define TTBCR_EAE    0x80000000
 #define TTBCR_N_MASK 0x07
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/7] xen/arm: wake up secondary cpus
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
                   ` (3 preceding siblings ...)
  2012-11-13 15:42 ` [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:28   ` Ian Campbell
  2012-11-13 15:42 ` [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate Stefano Stabellini
  2012-11-13 15:42 ` [PATCH 7/7] xen/arm: get the number of cpus from device tree Stefano Stabellini
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

Secondary cpus are held by the firmware until we send an IPI to them.

Changes in v2:
- use preprocessor definitions in kick_cpus;
- do not manually increment the base address register, use an offset
instead;
- move kick_cpus to proc-ca15.S.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/gic.h                      |    2 ++
 xen/arch/arm/head.S                     |    8 ++++++--
 xen/arch/arm/mode_switch.S              |   28 ++++++++++++++++++++++++++++
 xen/include/asm-arm/platform_vexpress.h |   17 +++++++++++++++++
 4 files changed, 53 insertions(+), 2 deletions(-)
 create mode 100644 xen/include/asm-arm/platform_vexpress.h

diff --git a/xen/arch/arm/gic.h b/xen/arch/arm/gic.h
index b8f9f201..b2e1d7f 100644
--- a/xen/arch/arm/gic.h
+++ b/xen/arch/arm/gic.h
@@ -124,6 +124,7 @@
 /* XXX: write this into the DT */
 #define VGIC_IRQ_EVTCHN_CALLBACK 31
 
+#ifndef __ASSEMBLY__
 extern int domain_vgic_init(struct domain *d);
 extern void domain_vgic_free(struct domain *d);
 
@@ -158,6 +159,7 @@ extern int gicv_setup(struct domain *d);
 extern void gic_save_state(struct vcpu *v);
 extern void gic_restore_state(struct vcpu *v);
 
+#endif /* __ASSEMBLY__ */
 #endif
 
 /*
diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
index 25993d6..3fe6412 100644
--- a/xen/arch/arm/head.S
+++ b/xen/arch/arm/head.S
@@ -80,12 +80,12 @@ start:
 	beq   boot_cpu               /* If we're CPU 0, boot now */
 
 	/* Non-boot CPUs wait here to be woken up one at a time. */
-1:	wfe
-	dsb
+1:	dsb
 	ldr   r0, =smp_up_cpu        /* VA of gate */
 	add   r0, r0, r10            /* PA of gate */
 	ldr   r1, [r0]               /* Which CPU is being booted? */
 	teq   r1, r12                /* Is it us? */
+	wfene
 	bne   1b
 
 boot_cpu:
@@ -99,6 +99,10 @@ boot_cpu:
 	PRINT(" booting -\r\n")
 #endif
 
+	/* Wake up secondary cpus */
+	teq   r12, #0
+	bleq  kick_cpus
+
 	/* Check that this CPU has Hyp mode */
 	mrc   CP32(r0, ID_PFR1)
 	and   r0, r0, #0xf000        /* Bits 12-15 define virt extensions */
diff --git a/xen/arch/arm/mode_switch.S b/xen/arch/arm/mode_switch.S
index 83a682b..7ff2f9e 100644
--- a/xen/arch/arm/mode_switch.S
+++ b/xen/arch/arm/mode_switch.S
@@ -19,7 +19,35 @@
 
 #include <asm/config.h>
 #include <asm/page.h>
+#include <asm/platform_vexpress.h>
 #include <asm/asm_defns.h>
+#include "gic.h"
+
+
+/* XXX: Versatile Express specific code */
+/* wake up secondary cpus */
+.globl kick_cpus
+kick_cpus:
+    /* write start paddr to v2m sysreg FLAGSSET register */
+	ldr   r0, =(V2M_SYS_MMIO_BASE)        /* base V2M sysreg MMIO address */
+	dsb
+	mov   r2, #0xffffffff
+	str   r2, [r0, #(V2M_SYS_FLAGSCLR)]
+	dsb
+	ldr   r2, =start
+	add   r2, r2, r10
+	str   r2, [r0, #(V2M_SYS_FLAGSSET)]
+	dsb
+	/* send an interrupt */
+	ldr   r0, =(GIC_BASE_ADDRESS + GIC_DR_OFFSET) /* base GICD MMIO address */
+	mov   r2, #0x1
+	str   r2, [r0, #(GICD_CTLR * 4)]      /* enable distributor */
+	mov   r2, #0xfe0000
+	str   r2, [r0, #(GICD_SGIR * 4)]      /* send IPI to everybody */
+	dsb
+	str   r2, [r0, #(GICD_CTLR * 4)]      /* disable distributor */
+	mov   pc, lr
+
 
 /* Get up a CPU into Hyp mode.  Clobbers r0-r3.
  *
diff --git a/xen/include/asm-arm/platform_vexpress.h b/xen/include/asm-arm/platform_vexpress.h
new file mode 100644
index 0000000..3556af3
--- /dev/null
+++ b/xen/include/asm-arm/platform_vexpress.h
@@ -0,0 +1,17 @@
+#ifndef __ASM_ARM_PLATFORM_H
+#define __ASM_ARM_PLATFORM_H
+
+/* V2M */
+#define V2M_SYS_MMIO_BASE     (0x1c010000)
+#define V2M_SYS_FLAGSSET      (0x30)
+#define V2M_SYS_FLAGSCLR      (0x34)
+
+#endif /* __ASM_ARM_PLATFORM_H */
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
                   ` (4 preceding siblings ...)
  2012-11-13 15:42 ` [PATCH 5/7] xen/arm: wake up secondary cpus Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:02   ` Ian Campbell
  2012-11-13 15:42 ` [PATCH 7/7] xen/arm: get the number of cpus from device tree Stefano Stabellini
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

- invalidate tlb after setting WXN;
- flush D-cache and I-cache after relocation;
- flush D-cache after writing to smp_up_cpu;
- flush TLB before changing HTTBR;
- flush I-cache after changing HTTBR;
- flush I-cache and branch predictor after writing Xen text ptes.


Changes in v2:
- fix a wrong comment;
- add a comment to described why we need a DSB at the beginning of
write_pte;
- do not issue ISB within write_pte, call isb() afterwards whenever
appropriate;
- issue DSB after DCCMVAC in write_pte to make sure that the data flush
is completed before proceeding;
- make flush_xen_dcache_va take a void* as argument;
- introduce flush_xen_dcache_va_range.


Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/head.S        |    9 +++++++
 xen/arch/arm/mm.c          |   18 +++++++++++++-
 xen/arch/arm/setup.c       |    4 +++
 xen/arch/arm/smpboot.c     |    2 +
 xen/include/asm-arm/page.h |   51 +++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
index 3fe6412..4de4d95 100644
--- a/xen/arch/arm/head.S
+++ b/xen/arch/arm/head.S
@@ -278,8 +278,15 @@ paging:
 	ldr   r4, =boot_httbr        /* VA of HTTBR value stashed by CPU 0 */
 	add   r4, r4, r10            /* PA of it */
 	ldrd  r4, r5, [r4]           /* Actual value */
+	dsb
+	mcr   CP32(r0, TLBIALLH)     /* Flush hypervisor TLB */
+	dsb
+	isb
 	mcrr  CP64(r4, r5, HTTBR)
+	dsb
+	isb
 	mcr   CP32(r0, TLBIALLH)     /* Flush hypervisor TLB */
+	mcr   CP32(r0, ICIALLU)      /* Flush I-cache */
 	mcr   CP32(r0, BPIALL)       /* Flush branch predictor */
 	dsb                          /* Ensure completion of TLB+BP flush */
 	isb
@@ -292,6 +299,8 @@ paging:
 	teq   r2, #0
 	bne   1b
 	dsb
+	mcr   CP32(r0, DCCMVAC)      /* flush D-Cache */
+	dsb
 
 	/* Here, the non-boot CPUs must wait again -- they're now running on
 	 * the boot CPU's pagetables so it's safe for the boot CPU to
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d0cd2c9..3d25153 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -32,6 +32,7 @@
 #include <public/memory.h>
 #include <xen/sched.h>
 
+int __read_mostly cacheline_bytes = MIN_CACHELINE_BYTES;
 struct domain *dom_xen, *dom_io;
 
 /* Static start-of-day pagetables that we use before the allocators are up */
@@ -244,10 +245,17 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
 
     /* Change pagetables to the copy in the relocated Xen */
     boot_httbr = (unsigned long) xen_pgtable + phys_offset;
+    flush_xen_dcache_va(&boot_httbr);
+    flush_xen_dcache_va_range((void*)dest_va, _end - _start);
+    isb();
+    flush_xen_text_tlb();
+
     asm volatile (
         STORE_CP64(0, HTTBR)          /* Change translation base */
         "dsb;"                        /* Ensure visibility of HTTBR update */
+        "isb;"
         STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
+        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
         STORE_CP32(0, BPIALL)         /* Flush branch predictor */
         "dsb;"                        /* Ensure completion of TLB+BP flush */
         "isb;"
@@ -256,6 +264,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
     /* Undo the temporary map */
     pte.bits = 0;
     write_pte(xen_second + second_table_offset(dest_va), pte);
+    isb();
     flush_xen_text_tlb();
 
     /* Link in the fixmap pagetable */
@@ -291,11 +300,14 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
                            >> PAGE_SHIFT);
     pte.pt.table = 1;
     write_pte(xen_second + second_linear_offset(XEN_VIRT_START), pte);
-    /* Have changed a mapping used for .text. Flush everything for safety. */
-    flush_xen_text_tlb();
+    /* ISB is needed because we changed the text mappings */
+    isb();
 
     /* From now on, no mapping may be both writable and executable. */
     WRITE_CP32(READ_CP32(HSCTLR) | SCTLR_WXN, HSCTLR);
+    isb();
+    /* Flush everything after setting WXN bit. */
+    flush_xen_text_tlb();
 }
 
 /* MMU setup for secondary CPUS (which already have paging enabled) */
@@ -303,6 +315,8 @@ void __cpuinit mmu_init_secondary_cpu(void)
 {
     /* From now on, no mapping may be both writable and executable. */
     WRITE_CP32(READ_CP32(HSCTLR) | SCTLR_WXN, HSCTLR);
+    isb();
+    flush_xen_text_tlb();
 }
 
 /* Create Xen's mappings of memory.
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 6fbcb81..a579a56 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -185,6 +185,10 @@ void __init start_xen(unsigned long boot_phys_offset,
     size_t fdt_size;
     int cpus, i;
 
+    cacheline_bytes = READ_CP32(CCSIDR);
+    if ( cacheline_bytes < MIN_CACHELINE_BYTES )
+        panic("CPU has preposterously small cache lines");
+
     fdt = (void *)BOOT_MISC_VIRT_START
         + (atag_paddr & ((1 << SECOND_SHIFT) - 1));
     fdt_size = device_tree_early_init(fdt);
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index c0750c0..f4fd512 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -105,6 +105,7 @@ make_cpus_ready(unsigned int max_cpus, unsigned long boot_phys_offset)
         /* Tell the next CPU to get ready */
         /* TODO: handle boards where CPUIDs are not contiguous */
         *gate = i;
+        flush_xen_dcache_va(gate);
         asm volatile("dsb; isb; sev");
         /* And wait for it to respond */
         while ( ready_cpus < i )
@@ -201,6 +202,7 @@ int __cpu_up(unsigned int cpu)
     /* Unblock the CPU.  It should be waiting in the loop in head.S
      * for an event to arrive when smp_up_cpu matches its cpuid. */
     smp_up_cpu = cpu;
+    flush_xen_dcache_va(&smp_up_cpu);
     asm volatile("dsb; isb; sev");
 
     while ( !cpu_online(cpu) )
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 9511c45..1b1d556 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -228,27 +228,72 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
     return e;
 }
 
-/* Write a pagetable entry */
+/* Write a pagetable entry.
+ *
+ * If the table entry is changing a text mapping, it is responsibility
+ * of the caller to issue an ISB after write_pte.
+ */
 static inline void write_pte(lpae_t *p, lpae_t pte)
 {
     asm volatile (
+        /* Ensure any writes have completed with the old mappings. */
+        "dsb;"
         /* Safely write the entry (STRD is atomic on CPUs that support LPAE) */
         "strd %0, %H0, [%1];"
+        "dsb;"
         /* Push this cacheline to the PoC so the rest of the system sees it. */
         STORE_CP32(1, DCCMVAC)
+        /* Ensure that the data flush is completed before proceeding */
+        "dsb;"
         : : "r" (pte.bits), "r" (p) : "memory");
 }
 
+
+#define MIN_CACHELINE_BYTES 32
+extern int cacheline_bytes;
+
+/* Function for flushing medium-sized areas.
+ * if 'range' is large enough we might want to use model-specific
+ * full-cache flushes. */
+static inline void flush_xen_dcache_va_range(void *p, unsigned long size)
+{
+    void *end;
+    dsb();           /* So the CPU issues all writes to the range */ 
+    for ( end = p + size; p < end; p += cacheline_bytes )
+        WRITE_CP32((uint32_t) p, DCCMVAC);
+    dsb();           /* So we know the flushes happen before continuing */
+}
+
+
+/* Macro for flushing a single small item.  The predicate is always 
+ * compile-time constant so this will compile down to 3 instructions in
+ * the common case.  Make sure to call it with the correct type of
+ * pointer! */
+#define flush_xen_dcache_va(p) do {                       \
+    typeof(p) _p = (p);                                   \
+    if ( (sizeof *_p) > MIN_CACHELINE_BYTES )             \
+        flush_xen_dcache_va_range(_p, sizeof *_p);        \
+    else                                                  \
+        asm volatile (                                    \
+            "dsb;"   /* Finish all earlier writes */      \
+            STORE_CP32(0, DCCMVAC)                        \
+            "dsb;"   /* Finish flush before continuing */ \
+            : : "r" (_p), "m" (*_p));                     \
+} while (0)
+
+
 /*
  * Flush all hypervisor mappings from the TLB and branch predictor.
- * This is needed after changing Xen code mappings. 
+ * This is needed after changing Xen code mappings.
+ *
+ * The caller needs to issue the necessary barriers before this functions.
  */
 static inline void flush_xen_text_tlb(void)
 {
     register unsigned long r0 asm ("r0");
     asm volatile (
-        "dsb;"                        /* Ensure visibility of PTE writes */
         STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
+        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
         STORE_CP32(0, BPIALL)         /* Flush branch predictor */
         "dsb;"                        /* Ensure completion of TLB+BP flush */
         "isb;"
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 7/7] xen/arm: get the number of cpus from device tree
  2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
                   ` (5 preceding siblings ...)
  2012-11-13 15:42 ` [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate Stefano Stabellini
@ 2012-11-13 15:42 ` Stefano Stabellini
  2012-11-15 10:09   ` Ian Campbell
  6 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-13 15:42 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

The system might have fewer cpus than the GIC supports.

Changes in v2:
- return always at least 1 cpu;
- skip nodes with names that don't start with "cpu".

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/gic.c            |    4 +---
 xen/arch/arm/gic.h            |    2 +-
 xen/arch/arm/setup.c          |    3 ++-
 xen/common/device_tree.c      |   19 +++++++++++++++++++
 xen/include/xen/device_tree.h |    1 +
 5 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 5f06e08..0c6fab9 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -304,7 +304,7 @@ static void __cpuinit gic_hyp_disable(void)
 }
 
 /* Set up the GIC */
-int __init gic_init(void)
+void __init gic_init(void)
 {
     /* XXX FIXME get this from devicetree */
     gic.dbase = GIC_BASE_ADDRESS + GIC_DR_OFFSET;
@@ -328,8 +328,6 @@ int __init gic_init(void)
     gic.lr_mask = 0ULL;
 
     spin_unlock(&gic.lock);
-
-    return gic.cpus;
 }
 
 /* Set up the per-CPU parts of the GIC for a secondary CPU */
diff --git a/xen/arch/arm/gic.h b/xen/arch/arm/gic.h
index b2e1d7f..1bf1b02 100644
--- a/xen/arch/arm/gic.h
+++ b/xen/arch/arm/gic.h
@@ -147,7 +147,7 @@ extern int gic_route_irq_to_guest(struct domain *d, unsigned int irq,
 /* Accept an interrupt from the GIC and dispatch its handler */
 extern void gic_interrupt(struct cpu_user_regs *regs, int is_fiq);
 /* Bring up the interrupt controller, and report # cpus attached */
-extern int gic_init(void);
+extern void gic_init(void);
 /* Bring up a secondary CPU's per-CPU GIC interface */
 extern void gic_init_secondary_cpu(void);
 /* Take down a CPU's per-CPU GIC interface */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index a579a56..632e7e3 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -193,6 +193,7 @@ void __init start_xen(unsigned long boot_phys_offset,
         + (atag_paddr & ((1 << SECOND_SHIFT) - 1));
     fdt_size = device_tree_early_init(fdt);
 
+    cpus = device_tree_cpus(fdt);
     cmdline_parse(device_tree_bootargs(fdt));
 
     setup_pagetables(boot_phys_offset, get_xen_paddr());
@@ -203,7 +204,7 @@ void __init start_xen(unsigned long boot_phys_offset,
     console_init_preirq();
 #endif
 
-    cpus = gic_init();
+    gic_init();
     make_cpus_ready(cpus, boot_phys_offset);
 
     percpu_init_areas();
diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 3d1f0f4..5b6dab9 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -153,6 +153,25 @@ const char *device_tree_bootargs(const void *fdt)
     return prop->data;
 }
 
+int device_tree_cpus(const void *fdt)
+{
+    int node = 0, depth = 1;
+    int cpus = 0;
+
+    node = fdt_path_offset(fdt, "/cpus/cpu");
+    if ( node < 0 )
+        return 1; /* we have at least one cpu */
+
+    while ( node >= 0 && depth >= 0 ) {
+        if ( strncmp(fdt_get_name(fdt, node, NULL), "cpu", 3) )
+            continue;
+        node = fdt_next_node(fdt, node, &depth);
+        cpus++;
+    }
+
+    return cpus > 0 ? cpus : 1;
+}
+
 static int dump_node(const void *fdt, int node, const char *name, int depth,
                      u32 address_cells, u32 size_cells, void *data)
 {
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 4d010c0..46bc0f8 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -49,6 +49,7 @@ bool_t device_tree_node_matches(const void *fdt, int node, const char *match);
 int device_tree_for_each_node(const void *fdt,
                               device_tree_node_func func, void *data);
 const char *device_tree_bootargs(const void *fdt);
+int device_tree_cpus(const void *fdt);
 void device_tree_dump(const void *fdt);
 
 #endif
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate
  2012-11-13 15:42 ` [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate Stefano Stabellini
@ 2012-11-15 10:02   ` Ian Campbell
  2012-11-16 15:36     ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:02 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> - invalidate tlb after setting WXN;
> - flush D-cache and I-cache after relocation;
> - flush D-cache after writing to smp_up_cpu;
> - flush TLB before changing HTTBR;
> - flush I-cache after changing HTTBR;
> - flush I-cache and branch predictor after writing Xen text ptes.

Since the reasoning is pretty subtle I wonder if you could say a few
words in each case about why these flushes are necessary? Either here on
in the comments in the code.

> @@ -244,10 +245,17 @@ void __init setup_pagetables(unsigned long
> boot_phys_offset, paddr_t xen_paddr)
>  
>      /* Change pagetables to the copy in the relocated Xen */
>      boot_httbr = (unsigned long) xen_pgtable + phys_offset;
> +    flush_xen_dcache_va(&boot_httbr);
> +    flush_xen_dcache_va_range((void*)dest_va, _end - _start);
> +    isb();
> +    flush_xen_text_tlb();
> +
>      asm volatile (
>          STORE_CP64(0, HTTBR)          /* Change translation base */
>          "dsb;"                        /* Ensure visibility of HTTBR
> update */
> +        "isb;"
>          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> +        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
>          STORE_CP32(0, BPIALL)         /* Flush branch predictor */

This now looks exactly like flush_xen_text_tlb() -- shall we call it
instead of open coding?

>          "dsb;"                        /* Ensure completion of TLB+BP
> flush */
>          "isb;"
> @@ -256,6 +264,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>      /* Undo the temporary map */
>      pte.bits = 0;
>      write_pte(xen_second + second_table_offset(dest_va), pte);
> +    isb();
>      flush_xen_text_tlb();

Do any calls to flush_xen_text_tlb() not require a preceding isb? I'd
have thought that by its nature an isb would be required every time --
in which case it may as well go in the macro.
 
>      /* Link in the fixmap pagetable */
> @@ -291,11 +300,14 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>                             >> PAGE_SHIFT);
>      pte.pt.table = 1;
>      write_pte(xen_second + second_linear_offset(XEN_VIRT_START), pte);
> -    /* Have changed a mapping used for .text. Flush everything for safety. */
> -    flush_xen_text_tlb();
> +    /* ISB is needed because we changed the text mappings */
> +    isb();

Why is the text TLB flush is not also required in this case?

>      /* From now on, no mapping may be both writable and executable. */
>      WRITE_CP32(READ_CP32(HSCTLR) | SCTLR_WXN, HSCTLR);
> +    isb();
> +    /* Flush everything after setting WXN bit. */
> +    flush_xen_text_tlb();
>  }
>  
>  /* MMU setup for secondary CPUS (which already have paging enabled) */
> @@ -303,6 +315,8 @@ void __cpuinit mmu_init_secondary_cpu(void)
>  {
>      /* From now on, no mapping may be both writable and executable. */
>      WRITE_CP32(READ_CP32(HSCTLR) | SCTLR_WXN, HSCTLR);
> +    isb();
> +    flush_xen_text_tlb();
>  }
>  
>  /* Create Xen's mappings of memory.

> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 9511c45..1b1d556 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -228,27 +228,72 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
>      return e;
>  }
>  
> -/* Write a pagetable entry */
> +/* Write a pagetable entry.
> + *
> + * If the table entry is changing a text mapping, it is responsibility
> + * of the caller to issue an ISB after write_pte.
> + */
>  static inline void write_pte(lpae_t *p, lpae_t pte)
>  {
>      asm volatile (
> +        /* Ensure any writes have completed with the old mappings. */
> +        "dsb;"
>          /* Safely write the entry (STRD is atomic on CPUs that support LPAE) */
>          "strd %0, %H0, [%1];"
> +        "dsb;"
>          /* Push this cacheline to the PoC so the rest of the system sees it. */
>          STORE_CP32(1, DCCMVAC)
> +        /* Ensure that the data flush is completed before proceeding */
> +        "dsb;"
>          : : "r" (pte.bits), "r" (p) : "memory");
>  }
>  
> +
> +#define MIN_CACHELINE_BYTES 32
> +extern int cacheline_bytes;
> +
> +/* Function for flushing medium-sized areas.
> + * if 'range' is large enough we might want to use model-specific
> + * full-cache flushes. */
> +static inline void flush_xen_dcache_va_range(void *p, unsigned long size)
> +{
> +    void *end;
> +    dsb();           /* So the CPU issues all writes to the range */ 
> +    for ( end = p + size; p < end; p += cacheline_bytes )

Tim asked if this memory read of the cached cacheline_bytes might not be
more expensive than hitting the (on-core) cp register every time, which
I also suspect will be the case. Looking at Linux it seems to reread the
register each time in v7_flush_dcache_all which suggests that reading
the register is faster or at least preferable.

> +        WRITE_CP32((uint32_t) p, DCCMVAC);
> +    dsb();           /* So we know the flushes happen before continuing */
> +}
> +
> +
> +/* Macro for flushing a single small item.  The predicate is always 
> + * compile-time constant so this will compile down to 3 instructions in
> + * the common case.  Make sure to call it with the correct type of
> + * pointer! */

Do we need to worry about pointers to things which cross a cacheline
boundary? i.e. a 4 byte thing at offset 30 in a 32-byte cache line.

Or do ARMs C alignment rules ensure this can never happen? (I suspect
the answer is yes).

> +#define flush_xen_dcache_va(p) do {                       \
> +    typeof(p) _p = (p);                                   \
> +    if ( (sizeof *_p) > MIN_CACHELINE_BYTES )             \
> +        flush_xen_dcache_va_range(_p, sizeof *_p);        \
> +    else                                                  \
> +        asm volatile (                                    \
> +            "dsb;"   /* Finish all earlier writes */      \
> +            STORE_CP32(0, DCCMVAC)                        \
> +            "dsb;"   /* Finish flush before continuing */ \
> +            : : "r" (_p), "m" (*_p));                     \
> +} while (0)
> +
> +
>  /*
>   * Flush all hypervisor mappings from the TLB and branch predictor.
> - * This is needed after changing Xen code mappings. 
> + * This is needed after changing Xen code mappings.
> + *
> + * The caller needs to issue the necessary barriers before this functions.

Worth commenting on what those are and when they are needed?

>   */
>  static inline void flush_xen_text_tlb(void)
>  {
>      register unsigned long r0 asm ("r0");
>      asm volatile (
> -        "dsb;"                        /* Ensure visibility of PTE writes */
>          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> +        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
>          STORE_CP32(0, BPIALL)         /* Flush branch predictor */
>          "dsb;"                        /* Ensure completion of TLB+BP flush */
>          "isb;"

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] xen/arm: get the number of cpus from device tree
  2012-11-13 15:42 ` [PATCH 7/7] xen/arm: get the number of cpus from device tree Stefano Stabellini
@ 2012-11-15 10:09   ` Ian Campbell
  2012-11-15 15:26     ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:09 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 3d1f0f4..5b6dab9 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -153,6 +153,25 @@ const char *device_tree_bootargs(const void *fdt)
>      return prop->data;
>  }
>  
> +int device_tree_cpus(const void *fdt)
> +{
> +    int node = 0, depth = 1;
> +    int cpus = 0;
> +
> +    node = fdt_path_offset(fdt, "/cpus/cpu");
> +    if ( node < 0 )
> +        return 1; /* we have at least one cpu */
> +
> +    while ( node >= 0 && depth >= 0 ) {
> +        if ( strncmp(fdt_get_name(fdt, node, NULL), "cpu", 3) )
> +            continue;
> +        node = fdt_next_node(fdt, node, &depth);
> +        cpus++;

Do we not need to track the largest <n> for each cpu@<n> which we see,
in order to handle systems with e.g. CPUs 0, 1, 4 & 5?

There are some helpers in device_tree.c to walk over trees like this,
are none of them suitable?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank
  2012-11-13 15:42 ` [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank Stefano Stabellini
@ 2012-11-15 10:26   ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:26 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> Use 1 for registers that have 1 bit per irq.
> 
> Changes in v2:
> - add case 1 to REG_RANK_NR rather than changing the implementation.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Acked + applied.

> ---
>  xen/arch/arm/vgic.c |   25 +++++++++++++------------
>  1 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 3c3983f..3f7e757 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -47,6 +47,7 @@ static inline int REG_RANK_NR(int b, uint32_t n)
>      case 8: return n >> 3;
>      case 4: return n >> 2;
>      case 2: return n >> 1;
> +    case 1: return n;
>      default: BUG();
>      }
>  }
> @@ -199,7 +200,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ISENABLER ... GICD_ISENABLERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISENABLER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISENABLER);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = rank->ienable;
> @@ -208,7 +209,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ICENABLER ... GICD_ICENABLERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICENABLER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICENABLER);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = rank->ienable;
> @@ -217,7 +218,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ISPENDR ... GICD_ISPENDRN:
>          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISPENDR);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISPENDR);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = byte_read(rank->ipend, dabt.sign, offset);
> @@ -226,7 +227,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ICPENDR ... GICD_ICPENDRN:
>          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICPENDR);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICPENDR);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = byte_read(rank->ipend, dabt.sign, offset);
> @@ -235,7 +236,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ISACTIVER ... GICD_ISACTIVERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISACTIVER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISACTIVER);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = rank->iactive;
> @@ -244,7 +245,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ICACTIVER ... GICD_ICACTIVERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICACTIVER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICACTIVER);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = rank->iactive;
> @@ -296,7 +297,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_CPENDSGIR ... GICD_CPENDSGIRN:
>          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_CPENDSGIR);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_CPENDSGIR);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = byte_read(rank->pendsgi, dabt.sign, offset);
> @@ -305,7 +306,7 @@ static int vgic_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_SPENDSGIR ... GICD_SPENDSGIRN:
>          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_SPENDSGIR);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_SPENDSGIR);
>          if ( rank == NULL) goto read_as_zero;
>          vgic_lock_rank(v, rank);
>          *r = byte_read(rank->pendsgi, dabt.sign, offset);
> @@ -400,7 +401,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ISENABLER ... GICD_ISENABLERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISENABLER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISENABLER);
>          if ( rank == NULL) goto write_ignore;
>          vgic_lock_rank(v, rank);
>          tr = rank->ienable;
> @@ -411,7 +412,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ICENABLER ... GICD_ICENABLERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICENABLER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICENABLER);
>          if ( rank == NULL) goto write_ignore;
>          vgic_lock_rank(v, rank);
>          rank->ienable &= ~*r;
> @@ -432,7 +433,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ISACTIVER ... GICD_ISACTIVERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ISACTIVER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ISACTIVER);
>          if ( rank == NULL) goto write_ignore;
>          vgic_lock_rank(v, rank);
>          rank->iactive &= ~*r;
> @@ -441,7 +442,7 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>  
>      case GICD_ICACTIVER ... GICD_ICACTIVERN:
>          if ( dabt.size != 2 ) goto bad_width;
> -        rank = vgic_irq_rank(v, 8, gicd_reg - GICD_ICACTIVER);
> +        rank = vgic_irq_rank(v, 1, gicd_reg - GICD_ICACTIVER);
>          if ( rank == NULL) goto write_ignore;
>          vgic_lock_rank(v, rank);
>          rank->iactive &= ~*r;

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/7] xen/arm: setup the fixmap in head.S
  2012-11-13 15:42 ` [PATCH 2/7] xen/arm: setup the fixmap in head.S Stefano Stabellini
@ 2012-11-15 10:27   ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:27 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

> diff --git a/xen/arch/arm/early_printk.c b/xen/arch/arm/early_printk.c
> index 3e51252..bdf4c0e 100644
> --- a/xen/arch/arm/early_printk.c
> +++ b/xen/arch/arm/early_printk.c
> @@ -17,12 +17,11 @@
>  
>  #ifdef EARLY_UART_ADDRESS
>  
> -static void __init early_putch(char c)
> +void __init early_putch(char c)
>  {
>      volatile uint32_t *r;
>  
> -    r = (uint32_t *)((EARLY_UART_ADDRESS & 0x001fffff)
> -                     + XEN_VIRT_START + (1 << 21));
> +    r = (uint32_t *)(XEN_VIRT_START + (1 << 21));

I'm not sure I understand the significance of this address (I know this
was already here, but I figured you might know having touched it).

>  #ifdef EARLY_UART_ADDRESS
> -	/* Recover the UART address in the new address space. */
> -	lsl   r11, #11
> -	lsr   r11, #11               /* UART base's offset from 2MB base */
> -	adr   r0, start
> -	add   r0, r0, #0x200000      /* vaddr of the fixmap's 2MB slot */
> -	add   r11, r11, r0           /* r11 := vaddr (UART base address) */
> +    /* Use a virtual address to access the UART. */

Should be a tab.

(a separate patch to use soft tabs in all .S files would also be
acceptable ;-))

Actually that turned out to be my only "significant" comment so I'll
just fix it on commit.

Acked + applied.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/7] pl011: set baud and clock_hz to the right defaults for Versatile Express
  2012-11-13 15:42 ` [PATCH 3/7] pl011: set baud and clock_hz to the right defaults for Versatile Express Stefano Stabellini
@ 2012-11-15 10:27   ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:27 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Acked-by: Tim Deegan <tim@xen.org>

Applied, thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-11-13 15:42 ` [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register Stefano Stabellini
@ 2012-11-15 10:27   ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:27 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:

> Changes in v2:
> - detect the processor ID and call a processor specific initialization
> function;

I presume this will become a table driven lookup at some point...

> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Acked-by: Tim Deegan <tim@xen.org>

Applied, thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/7] xen/arm: wake up secondary cpus
  2012-11-13 15:42 ` [PATCH 5/7] xen/arm: wake up secondary cpus Stefano Stabellini
@ 2012-11-15 10:28   ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 10:28 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> Secondary cpus are held by the firmware until we send an IPI to them.

I inserted:

Reordered non-boot cpu wait loop to perform the check before waiting for
an event, to handled the case where the event has already happened when
we reach the loop.

At some point we'll need to do something to handle different platforms,
but this will do for now.

> 
> Changes in v2:
> - use preprocessor definitions in kick_cpus;
> - do not manually increment the base address register, use an offset
> instead;
> - move kick_cpus to proc-ca15.S.

You meant mode_switch.S I think? I'm going to assume this comment is
wrong rather than the patch. If not then please send a followup.

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

fixed up tab vs space + Acked + applied 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] xen/arm: get the number of cpus from device tree
  2012-11-15 10:09   ` Ian Campbell
@ 2012-11-15 15:26     ` Stefano Stabellini
  2012-11-15 15:34       ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-15 15:26 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Tim (Xen.org), Stefano Stabellini

On Thu, 15 Nov 2012, Ian Campbell wrote:
> On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> > diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> > index 3d1f0f4..5b6dab9 100644
> > --- a/xen/common/device_tree.c
> > +++ b/xen/common/device_tree.c
> > @@ -153,6 +153,25 @@ const char *device_tree_bootargs(const void *fdt)
> >      return prop->data;
> >  }
> >  
> > +int device_tree_cpus(const void *fdt)
> > +{
> > +    int node = 0, depth = 1;
> > +    int cpus = 0;
> > +
> > +    node = fdt_path_offset(fdt, "/cpus/cpu");
> > +    if ( node < 0 )
> > +        return 1; /* we have at least one cpu */
> > +
> > +    while ( node >= 0 && depth >= 0 ) {
> > +        if ( strncmp(fdt_get_name(fdt, node, NULL), "cpu", 3) )
> > +            continue;
> > +        node = fdt_next_node(fdt, node, &depth);
> > +        cpus++;
> 
> Do we not need to track the largest <n> for each cpu@<n> which we see,
> in order to handle systems with e.g. CPUs 0, 1, 4 & 5?

Actually the hardware ID is expressed by the <reg> propery.
Maybe we should set the corresponding ID in cpu_present_map from
device_tree_cpus?


> There are some helpers in device_tree.c to walk over trees like this,
> are none of them suitable?
 
Do you mean device_tree_node_matches?
Yes, I can use that instead of strncmp. I'll do that.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] xen/arm: get the number of cpus from device tree
  2012-11-15 15:26     ` Stefano Stabellini
@ 2012-11-15 15:34       ` Ian Campbell
  2012-11-15 16:18         ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 15:34 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Thu, 2012-11-15 at 15:26 +0000, Stefano Stabellini wrote:
> On Thu, 15 Nov 2012, Ian Campbell wrote:
> > On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> > > diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> > > index 3d1f0f4..5b6dab9 100644
> > > --- a/xen/common/device_tree.c
> > > +++ b/xen/common/device_tree.c
> > > @@ -153,6 +153,25 @@ const char *device_tree_bootargs(const void *fdt)
> > >      return prop->data;
> > >  }
> > >  
> > > +int device_tree_cpus(const void *fdt)
> > > +{
> > > +    int node = 0, depth = 1;
> > > +    int cpus = 0;
> > > +
> > > +    node = fdt_path_offset(fdt, "/cpus/cpu");
> > > +    if ( node < 0 )
> > > +        return 1; /* we have at least one cpu */
> > > +
> > > +    while ( node >= 0 && depth >= 0 ) {
> > > +        if ( strncmp(fdt_get_name(fdt, node, NULL), "cpu", 3) )
> > > +            continue;
> > > +        node = fdt_next_node(fdt, node, &depth);
> > > +        cpus++;
> > 
> > Do we not need to track the largest <n> for each cpu@<n> which we see,
> > in order to handle systems with e.g. CPUs 0, 1, 4 & 5?
> 
> Actually the hardware ID is expressed by the <reg> propery.
> Maybe we should set the corresponding ID in cpu_present_map from
> device_tree_cpus?

I'm not sure what you mean.

> > There are some helpers in device_tree.c to walk over trees like this,
> > are none of them suitable?
>  
> Do you mean device_tree_node_matches?
> Yes, I can use that instead of strncmp. I'll do that.

You should , but that's not what I was talkig about ;-)

I was thinking of device_tree_for_each_node but I suppose that doesn't
quite fit? But perhaps you can integrate with early_scan_node?

Ian.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] xen/arm: get the number of cpus from device tree
  2012-11-15 15:34       ` Ian Campbell
@ 2012-11-15 16:18         ` Stefano Stabellini
  2012-11-15 16:27           ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-15 16:18 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Tim (Xen.org), Stefano Stabellini

On Thu, 15 Nov 2012, Ian Campbell wrote:
> On Thu, 2012-11-15 at 15:26 +0000, Stefano Stabellini wrote:
> > On Thu, 15 Nov 2012, Ian Campbell wrote:
> > > On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> > > > diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> > > > index 3d1f0f4..5b6dab9 100644
> > > > --- a/xen/common/device_tree.c
> > > > +++ b/xen/common/device_tree.c
> > > > @@ -153,6 +153,25 @@ const char *device_tree_bootargs(const void *fdt)
> > > >      return prop->data;
> > > >  }
> > > >  
> > > > +int device_tree_cpus(const void *fdt)
> > > > +{
> > > > +    int node = 0, depth = 1;
> > > > +    int cpus = 0;
> > > > +
> > > > +    node = fdt_path_offset(fdt, "/cpus/cpu");
> > > > +    if ( node < 0 )
> > > > +        return 1; /* we have at least one cpu */
> > > > +
> > > > +    while ( node >= 0 && depth >= 0 ) {
> > > > +        if ( strncmp(fdt_get_name(fdt, node, NULL), "cpu", 3) )
> > > > +            continue;
> > > > +        node = fdt_next_node(fdt, node, &depth);
> > > > +        cpus++;
> > > 
> > > Do we not need to track the largest <n> for each cpu@<n> which we see,
> > > in order to handle systems with e.g. CPUs 0, 1, 4 & 5?
> > 
> > Actually the hardware ID is expressed by the <reg> propery.
> > Maybe we should set the corresponding ID in cpu_present_map from
> > device_tree_cpus?
> 
> I'm not sure what you mean.

I mean that it is not the @<num> that expresses the cpu number from the
hardware point of view.
The cpu number is held by the <reg> property of the cpu node.
Considering that the second cpu could theoretically have ID number 5, we
should go and mark cpu 1-4 as non present in the cpu masks. We should
only set cpu number 5 as present.

> > > There are some helpers in device_tree.c to walk over trees like this,
> > > are none of them suitable?
> >  
> > Do you mean device_tree_node_matches?
> > Yes, I can use that instead of strncmp. I'll do that.
> 
> You should , but that's not what I was talkig about ;-)
> 
> I was thinking of device_tree_for_each_node but I suppose that doesn't
> quite fit? But perhaps you can integrate with early_scan_node?

Maybe, but now that I think about it we should be matching on
device_type rather than node name. We should do the same for memory too.
We need to either change device_tree_node_matches or write a new
matching function.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/7] xen/arm: get the number of cpus from device tree
  2012-11-15 16:18         ` Stefano Stabellini
@ 2012-11-15 16:27           ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-11-15 16:27 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Thu, 2012-11-15 at 16:18 +0000, Stefano Stabellini wrote:
> On Thu, 15 Nov 2012, Ian Campbell wrote:
> > On Thu, 2012-11-15 at 15:26 +0000, Stefano Stabellini wrote:
> > > On Thu, 15 Nov 2012, Ian Campbell wrote:
> > > > On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> > > > > diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> > > > > index 3d1f0f4..5b6dab9 100644
> > > > > --- a/xen/common/device_tree.c
> > > > > +++ b/xen/common/device_tree.c
> > > > > @@ -153,6 +153,25 @@ const char *device_tree_bootargs(const void *fdt)
> > > > >      return prop->data;
> > > > >  }
> > > > >  
> > > > > +int device_tree_cpus(const void *fdt)
> > > > > +{
> > > > > +    int node = 0, depth = 1;
> > > > > +    int cpus = 0;
> > > > > +
> > > > > +    node = fdt_path_offset(fdt, "/cpus/cpu");
> > > > > +    if ( node < 0 )
> > > > > +        return 1; /* we have at least one cpu */
> > > > > +
> > > > > +    while ( node >= 0 && depth >= 0 ) {
> > > > > +        if ( strncmp(fdt_get_name(fdt, node, NULL), "cpu", 3) )
> > > > > +            continue;
> > > > > +        node = fdt_next_node(fdt, node, &depth);
> > > > > +        cpus++;
> > > > 
> > > > Do we not need to track the largest <n> for each cpu@<n> which we see,
> > > > in order to handle systems with e.g. CPUs 0, 1, 4 & 5?
> > > 
> > > Actually the hardware ID is expressed by the <reg> propery.
> > > Maybe we should set the corresponding ID in cpu_present_map from
> > > device_tree_cpus?
> > 
> > I'm not sure what you mean.
> 
> I mean that it is not the @<num> that expresses the cpu number from the
> hardware point of view.
> The cpu number is held by the <reg> property of the cpu node.
> Considering that the second cpu could theoretically have ID number 5, we
> should go and mark cpu 1-4 as non present in the cpu masks. We should
> only set cpu number 5 as present.

I think that makes sense, unless we want to construct a logic->physical
CPU numbering scheme for some other reason.

> > > > There are some helpers in device_tree.c to walk over trees like this,
> > > > are none of them suitable?
> > >  
> > > Do you mean device_tree_node_matches?
> > > Yes, I can use that instead of strncmp. I'll do that.
> > 
> > You should , but that's not what I was talkig about ;-)
> > 
> > I was thinking of device_tree_for_each_node but I suppose that doesn't
> > quite fit? But perhaps you can integrate with early_scan_node?
> 
> Maybe, but now that I think about it we should be matching on
> device_type rather than node name.

I've no idea what this means, but I trust you ;-)

>  We should do the same for memory too.
> We need to either change device_tree_node_matches or write a new
> matching function.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate
  2012-11-15 10:02   ` Ian Campbell
@ 2012-11-16 15:36     ` Stefano Stabellini
  0 siblings, 0 replies; 26+ messages in thread
From: Stefano Stabellini @ 2012-11-16 15:36 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Tim (Xen.org), Stefano Stabellini

On Thu, 15 Nov 2012, Ian Campbell wrote:
> On Tue, 2012-11-13 at 15:42 +0000, Stefano Stabellini wrote:
> > - invalidate tlb after setting WXN;
> > - flush D-cache and I-cache after relocation;
> > - flush D-cache after writing to smp_up_cpu;
> > - flush TLB before changing HTTBR;
> > - flush I-cache after changing HTTBR;
> > - flush I-cache and branch predictor after writing Xen text ptes.
> 
> Since the reasoning is pretty subtle I wonder if you could say a few
> words in each case about why these flushes are necessary? Either here on
> in the comments in the code.

I'll remove the TLB flush before changing HTTBR because it isn't
actually needed.
I'll write comments in the code regarding the D-cache flush after
changing smp_up_cpu. I think that the others are self explanatory (they
are a consequence of the fact that we are modifying the code we are
running on or the mapping of it).


> > @@ -244,10 +245,17 @@ void __init setup_pagetables(unsigned long
> > boot_phys_offset, paddr_t xen_paddr)
> >  
> >      /* Change pagetables to the copy in the relocated Xen */
> >      boot_httbr = (unsigned long) xen_pgtable + phys_offset;
> > +    flush_xen_dcache_va(&boot_httbr);
> > +    flush_xen_dcache_va_range((void*)dest_va, _end - _start);
> > +    isb();
> > +    flush_xen_text_tlb();
> > +
> >      asm volatile (
> >          STORE_CP64(0, HTTBR)          /* Change translation base */
> >          "dsb;"                        /* Ensure visibility of HTTBR
> > update */
> > +        "isb;"
> >          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> > +        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
> >          STORE_CP32(0, BPIALL)         /* Flush branch predictor */
> 
> This now looks exactly like flush_xen_text_tlb() -- shall we call it
> instead of open coding?

Yeah


> >          "dsb;"                        /* Ensure completion of TLB+BP
> > flush */
> >          "isb;"
> > @@ -256,6 +264,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> >      /* Undo the temporary map */
> >      pte.bits = 0;
> >      write_pte(xen_second + second_table_offset(dest_va), pte);
> > +    isb();
> >      flush_xen_text_tlb();
> 
> Do any calls to flush_xen_text_tlb() not require a preceding isb? I'd
> have thought that by its nature an isb would be required every time --
> in which case it may as well go in the macro.

OK


> >      /* Link in the fixmap pagetable */
> > @@ -291,11 +300,14 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> >                             >> PAGE_SHIFT);
> >      pte.pt.table = 1;
> >      write_pte(xen_second + second_linear_offset(XEN_VIRT_START), pte);
> > -    /* Have changed a mapping used for .text. Flush everything for safety. */
> > -    flush_xen_text_tlb();
> > +    /* ISB is needed because we changed the text mappings */
> > +    isb();
> 
> Why is the text TLB flush is not also required in this case?

You are right, it is required, but we wait a bit longer and delay it
until setting the WXN bit.
In fact we can also delay the isb. I'll make that change and add a
comment.


> >      /* From now on, no mapping may be both writable and executable. */
> >      WRITE_CP32(READ_CP32(HSCTLR) | SCTLR_WXN, HSCTLR);
> > +    isb();
> > +    /* Flush everything after setting WXN bit. */
> > +    flush_xen_text_tlb();
> >  }
> >  
> >  /* MMU setup for secondary CPUS (which already have paging enabled) */
> > @@ -303,6 +315,8 @@ void __cpuinit mmu_init_secondary_cpu(void)
> >  {
> >      /* From now on, no mapping may be both writable and executable. */
> >      WRITE_CP32(READ_CP32(HSCTLR) | SCTLR_WXN, HSCTLR);
> > +    isb();
> > +    flush_xen_text_tlb();
> >  }
> >  
> >  /* Create Xen's mappings of memory.
> 
> > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > index 9511c45..1b1d556 100644
> > --- a/xen/include/asm-arm/page.h
> > +++ b/xen/include/asm-arm/page.h
> > @@ -228,27 +228,72 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
> >      return e;
> >  }
> >  
> > -/* Write a pagetable entry */
> > +/* Write a pagetable entry.
> > + *
> > + * If the table entry is changing a text mapping, it is responsibility
> > + * of the caller to issue an ISB after write_pte.
> > + */
> >  static inline void write_pte(lpae_t *p, lpae_t pte)
> >  {
> >      asm volatile (
> > +        /* Ensure any writes have completed with the old mappings. */
> > +        "dsb;"
> >          /* Safely write the entry (STRD is atomic on CPUs that support LPAE) */
> >          "strd %0, %H0, [%1];"
> > +        "dsb;"
> >          /* Push this cacheline to the PoC so the rest of the system sees it. */
> >          STORE_CP32(1, DCCMVAC)
> > +        /* Ensure that the data flush is completed before proceeding */
> > +        "dsb;"
> >          : : "r" (pte.bits), "r" (p) : "memory");
> >  }
> >  
> > +
> > +#define MIN_CACHELINE_BYTES 32
> > +extern int cacheline_bytes;
> > +
> > +/* Function for flushing medium-sized areas.
> > + * if 'range' is large enough we might want to use model-specific
> > + * full-cache flushes. */
> > +static inline void flush_xen_dcache_va_range(void *p, unsigned long size)
> > +{
> > +    void *end;
> > +    dsb();           /* So the CPU issues all writes to the range */ 
> > +    for ( end = p + size; p < end; p += cacheline_bytes )
> 
> Tim asked if this memory read of the cached cacheline_bytes might not be
> more expensive than hitting the (on-core) cp register every time, which
> I also suspect will be the case. Looking at Linux it seems to reread the
> register each time in v7_flush_dcache_all which suggests that reading
> the register is faster or at least preferable.

OK


> > +        WRITE_CP32((uint32_t) p, DCCMVAC);
> > +    dsb();           /* So we know the flushes happen before continuing */
> > +}
> > +
> > +
> > +/* Macro for flushing a single small item.  The predicate is always 
> > + * compile-time constant so this will compile down to 3 instructions in
> > + * the common case.  Make sure to call it with the correct type of
> > + * pointer! */
> 
> Do we need to worry about pointers to things which cross a cacheline
> boundary? i.e. a 4 byte thing at offset 30 in a 32-byte cache line.
> 
> Or do ARMs C alignment rules ensure this can never happen? (I suspect
> the answer is yes).

I think you are right, we need to take this into consideration.
I couldn't find in the ARM C specs anything about cacheline alignment,
but reading the gcc docs it seems pretty clear that cachelines are
expected to be aligned to the cacheline size.
I think we should change the macro into:

#define flush_xen_dcache_va(p) do {                       \
    int cacheline_bytes  = READ_CP32(CCSIDR);             \
    typeof(p) _p = (p);                                   \
    if ( ((unsigned long)_p & ~(cacheline_bytes - 1)) !=             \
        (((unsigned long)_p + (sizeof *_p)) & ~(cacheline_bytes - 1)) ) \
        flush_xen_dcache_va_range(_p, sizeof *_p);        \
    else                                                  \
        asm volatile (                                    \
            "dsb;"   /* Finish all earlier writes */      \
            STORE_CP32(0, DCCMVAC)                        \
            "dsb;"   /* Finish flush before continuing */ \
            : : "r" (_p), "m" (*_p));                     \
} while (0)


> > +#define flush_xen_dcache_va(p) do {                       \
> > +    typeof(p) _p = (p);                                   \
> > +    if ( (sizeof *_p) > MIN_CACHELINE_BYTES )             \
> > +        flush_xen_dcache_va_range(_p, sizeof *_p);        \
> > +    else                                                  \
> > +        asm volatile (                                    \
> > +            "dsb;"   /* Finish all earlier writes */      \
> > +            STORE_CP32(0, DCCMVAC)                        \
> > +            "dsb;"   /* Finish flush before continuing */ \
> > +            : : "r" (_p), "m" (*_p));                     \
> > +} while (0)
> > +
> > +
> >  /*
> >   * Flush all hypervisor mappings from the TLB and branch predictor.
> > - * This is needed after changing Xen code mappings. 
> > + * This is needed after changing Xen code mappings.
> > + *
> > + * The caller needs to issue the necessary barriers before this functions.
> 
> Worth commenting on what those are and when they are needed?

Given that we are moving isb inside flush_xen_text_tlb, it is only DSB
and D-cache flushes. I'll write that down.


> >   */
> >  static inline void flush_xen_text_tlb(void)
> >  {
> >      register unsigned long r0 asm ("r0");
> >      asm volatile (
> > -        "dsb;"                        /* Ensure visibility of PTE writes */
> >          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> > +        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
> >          STORE_CP32(0, BPIALL)         /* Flush branch predictor */
> >          "dsb;"                        /* Ensure completion of TLB+BP flush */
> >          "isb;"
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-10-26  8:56         ` Tim Deegan
@ 2012-10-26  8:59           ` Ian Campbell
  0 siblings, 0 replies; 26+ messages in thread
From: Ian Campbell @ 2012-10-26  8:59 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Stefano Stabellini

On Fri, 2012-10-26 at 09:56 +0100, Tim Deegan wrote:
> At 13:04 +0100 on 25 Oct (1351170270), Ian Campbell wrote:
> > On Thu, 2012-10-25 at 12:57 +0100, Stefano Stabellini wrote:
> > > On Thu, 25 Oct 2012, Ian Campbell wrote:
> > > > On Wed, 2012-10-24 at 16:03 +0100, Stefano Stabellini wrote:
> > > > > From the Cortex A15 manual:
> > > > > 
> > > > > "Enables the processor to receive instruction cache, BTB, and TLB maintenance
> > > > > operations from other processors"
> > > > > 
> > > > > ...
> > > > > 
> > > > > "You must set this bit before enabling the caches and MMU, or
> > > > > performing any cache and TLB maintenance operations. The only time
> > > > > you must clear this bit is during a processor power-down sequence"
> > > > 
> > > > Is it considered a bug that the firmware doesn't do this?
> > > 
> > > Why would it be? You can run fairly complicated pieces of software
> > > without caches or MMU.
> > 
> > True, I guess I just considered that not setting the SMP bit on the
> > second processor when the firmware starts it seemed a bit odd. If you
> > aren't caches/MMU/etc them then setting the bit is pretty much a NOP (or
> > maybe it isn't?).
> 
> Well, given that you're not allowed to clear it except at power-down, I
> think it would be impolite of the firmware to set it if there's _any_
> reason the OS might not want it set.

True.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-10-25 12:04       ` Ian Campbell
@ 2012-10-26  8:56         ` Tim Deegan
  2012-10-26  8:59           ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Tim Deegan @ 2012-10-26  8:56 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Stefano Stabellini

At 13:04 +0100 on 25 Oct (1351170270), Ian Campbell wrote:
> On Thu, 2012-10-25 at 12:57 +0100, Stefano Stabellini wrote:
> > On Thu, 25 Oct 2012, Ian Campbell wrote:
> > > On Wed, 2012-10-24 at 16:03 +0100, Stefano Stabellini wrote:
> > > > From the Cortex A15 manual:
> > > > 
> > > > "Enables the processor to receive instruction cache, BTB, and TLB maintenance
> > > > operations from other processors"
> > > > 
> > > > ...
> > > > 
> > > > "You must set this bit before enabling the caches and MMU, or
> > > > performing any cache and TLB maintenance operations. The only time
> > > > you must clear this bit is during a processor power-down sequence"
> > > 
> > > Is it considered a bug that the firmware doesn't do this?
> > 
> > Why would it be? You can run fairly complicated pieces of software
> > without caches or MMU.
> 
> True, I guess I just considered that not setting the SMP bit on the
> second processor when the firmware starts it seemed a bit odd. If you
> aren't caches/MMU/etc them then setting the bit is pretty much a NOP (or
> maybe it isn't?).

Well, given that you're not allowed to clear it except at power-down, I
think it would be impolite of the firmware to set it if there's _any_
reason the OS might not want it set.

Tim.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-10-25 11:57     ` Stefano Stabellini
@ 2012-10-25 12:04       ` Ian Campbell
  2012-10-26  8:56         ` Tim Deegan
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2012-10-25 12:04 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Thu, 2012-10-25 at 12:57 +0100, Stefano Stabellini wrote:
> On Thu, 25 Oct 2012, Ian Campbell wrote:
> > On Wed, 2012-10-24 at 16:03 +0100, Stefano Stabellini wrote:
> > > From the Cortex A15 manual:
> > > 
> > > "Enables the processor to receive instruction cache, BTB, and TLB maintenance
> > > operations from other processors"
> > > 
> > > ...
> > > 
> > > "You must set this bit before enabling the caches and MMU, or
> > > performing any cache and TLB maintenance operations. The only time
> > > you must clear this bit is during a processor power-down sequence"
> > 
> > Is it considered a bug that the firmware doesn't do this?
> 
> Why would it be? You can run fairly complicated pieces of software
> without caches or MMU.

True, I guess I just considered that not setting the SMP bit on the
second processor when the firmware starts it seemed a bit odd. If you
aren't caches/MMU/etc them then setting the bit is pretty much a NOP (or
maybe it isn't?).

> 
> 
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > ---
> > >  xen/arch/arm/head.S             |    6 ++++++
> > >  xen/include/asm-arm/processor.h |   30 ++++++++++++++++++++++++++++++
> > >  2 files changed, 36 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
> > > index 3d01be0..c784f4d 100644
> > > --- a/xen/arch/arm/head.S
> > > +++ b/xen/arch/arm/head.S
> > > @@ -148,6 +148,12 @@ skip_bss:
> > >  
> > >  	PRINT("- Setting up control registers -\r\n")
> > >  	
> > > +	/* XXX: Cortex A15 specific */
> > 
> > We probably need some sort of early proc info keyed of the processor id
> > (like Linux) has so we can push this processor specific stuff into the
> > right places.
> 
> Indeed.
> I wrote something simple that should be OK until we get more than
> ~10 different processors.
> 
> 
> > > +	/* Set up the SMP bit in ACTLR */
> > > +	mrc   CP32(r0, ACTLR)
> > > +	orr   r0, r0, #(ACTLR_SMP) /* enable SMP bit*/
> > > +	mcr   CP32(r0, ACTLR)
> > 
> > Linux does this IFF it isn't already set for some reason:
> > #ifdef CONFIG_SMP
> >         ALT_SMP(mrc     p15, 0, r0, c1, c0, 1)
> >         ALT_UP(mov      r0, #(1 << 6))          @ fake it for UP
> >         tst     r0, #(1 << 6)                   @ SMP/nAMP mode enabled?
> >         orreq   r0, r0, #(1 << 6)               @ Enable SMP/nAMP mode
> >         orreq   r0, r0, r10                     @ Enable CPU-specific SMP bits
> >         mcreq   p15, 0, r0, c1, c0, 1
> > #endif
> > 
> > Might just relate to the "fake it up for UP" thing I suppose.
> 
> Considering that we don't have a UP config option for Xen (I am aware
> of), there is no need for us to do that, I think.
> 
> 
> > > +
> > >  	/* Set up memory attribute type tables */
> > >  	ldr   r0, =MAIR0VAL
> > >  	ldr   r1, =MAIR1VAL
> > > diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> > > index 3849b23..91a5836 100644
> > > --- a/xen/include/asm-arm/processor.h
> > > +++ b/xen/include/asm-arm/processor.h
> > > @@ -34,6 +34,36 @@
> > >  #define SCTLR_A         (1<<1)
> > >  #define SCTLR_M         (1<<0)
> > >  
> > > +/* ACTLR Auxiliary Control Register */
> > > +#define ACTLR_SNOOP_DELAYED      (1<<31)
> > 
> > If these are CortexA15 specific then I think they ought to be
> > ACTLR_CA15_FOO or something. And possibly in a new processor-ca15.h
> > header instead.
> 
> OK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-10-25  9:52   ` Ian Campbell
@ 2012-10-25 11:57     ` Stefano Stabellini
  2012-10-25 12:04       ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-10-25 11:57 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Tim (Xen.org), Stefano Stabellini

On Thu, 25 Oct 2012, Ian Campbell wrote:
> On Wed, 2012-10-24 at 16:03 +0100, Stefano Stabellini wrote:
> > From the Cortex A15 manual:
> > 
> > "Enables the processor to receive instruction cache, BTB, and TLB maintenance
> > operations from other processors"
> > 
> > ...
> > 
> > "You must set this bit before enabling the caches and MMU, or
> > performing any cache and TLB maintenance operations. The only time
> > you must clear this bit is during a processor power-down sequence"
> 
> Is it considered a bug that the firmware doesn't do this?

Why would it be? You can run fairly complicated pieces of software
without caches or MMU.


> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  xen/arch/arm/head.S             |    6 ++++++
> >  xen/include/asm-arm/processor.h |   30 ++++++++++++++++++++++++++++++
> >  2 files changed, 36 insertions(+), 0 deletions(-)
> > 
> > diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
> > index 3d01be0..c784f4d 100644
> > --- a/xen/arch/arm/head.S
> > +++ b/xen/arch/arm/head.S
> > @@ -148,6 +148,12 @@ skip_bss:
> >  
> >  	PRINT("- Setting up control registers -\r\n")
> >  	
> > +	/* XXX: Cortex A15 specific */
> 
> We probably need some sort of early proc info keyed of the processor id
> (like Linux) has so we can push this processor specific stuff into the
> right places.

Indeed.
I wrote something simple that should be OK until we get more than
~10 different processors.


> > +	/* Set up the SMP bit in ACTLR */
> > +	mrc   CP32(r0, ACTLR)
> > +	orr   r0, r0, #(ACTLR_SMP) /* enable SMP bit*/
> > +	mcr   CP32(r0, ACTLR)
> 
> Linux does this IFF it isn't already set for some reason:
> #ifdef CONFIG_SMP
>         ALT_SMP(mrc     p15, 0, r0, c1, c0, 1)
>         ALT_UP(mov      r0, #(1 << 6))          @ fake it for UP
>         tst     r0, #(1 << 6)                   @ SMP/nAMP mode enabled?
>         orreq   r0, r0, #(1 << 6)               @ Enable SMP/nAMP mode
>         orreq   r0, r0, r10                     @ Enable CPU-specific SMP bits
>         mcreq   p15, 0, r0, c1, c0, 1
> #endif
> 
> Might just relate to the "fake it up for UP" thing I suppose.

Considering that we don't have a UP config option for Xen (I am aware
of), there is no need for us to do that, I think.


> > +
> >  	/* Set up memory attribute type tables */
> >  	ldr   r0, =MAIR0VAL
> >  	ldr   r1, =MAIR1VAL
> > diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> > index 3849b23..91a5836 100644
> > --- a/xen/include/asm-arm/processor.h
> > +++ b/xen/include/asm-arm/processor.h
> > @@ -34,6 +34,36 @@
> >  #define SCTLR_A         (1<<1)
> >  #define SCTLR_M         (1<<0)
> >  
> > +/* ACTLR Auxiliary Control Register */
> > +#define ACTLR_SNOOP_DELAYED      (1<<31)
> 
> If these are CortexA15 specific then I think they ought to be
> ACTLR_CA15_FOO or something. And possibly in a new processor-ca15.h
> header instead.

OK

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-10-24 15:03 ` [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register Stefano Stabellini
@ 2012-10-25  9:52   ` Ian Campbell
  2012-10-25 11:57     ` Stefano Stabellini
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2012-10-25  9:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Tim (Xen.org)

On Wed, 2012-10-24 at 16:03 +0100, Stefano Stabellini wrote:
> From the Cortex A15 manual:
> 
> "Enables the processor to receive instruction cache, BTB, and TLB maintenance
> operations from other processors"
> 
> ...
> 
> "You must set this bit before enabling the caches and MMU, or
> performing any cache and TLB maintenance operations. The only time
> you must clear this bit is during a processor power-down sequence"

Is it considered a bug that the firmware doesn't do this?

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  xen/arch/arm/head.S             |    6 ++++++
>  xen/include/asm-arm/processor.h |   30 ++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+), 0 deletions(-)
> 
> diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
> index 3d01be0..c784f4d 100644
> --- a/xen/arch/arm/head.S
> +++ b/xen/arch/arm/head.S
> @@ -148,6 +148,12 @@ skip_bss:
>  
>  	PRINT("- Setting up control registers -\r\n")
>  	
> +	/* XXX: Cortex A15 specific */

We probably need some sort of early proc info keyed of the processor id
(like Linux) has so we can push this processor specific stuff into the
right places.

> +	/* Set up the SMP bit in ACTLR */
> +	mrc   CP32(r0, ACTLR)
> +	orr   r0, r0, #(ACTLR_SMP) /* enable SMP bit*/
> +	mcr   CP32(r0, ACTLR)

Linux does this IFF it isn't already set for some reason:
#ifdef CONFIG_SMP
        ALT_SMP(mrc     p15, 0, r0, c1, c0, 1)
        ALT_UP(mov      r0, #(1 << 6))          @ fake it for UP
        tst     r0, #(1 << 6)                   @ SMP/nAMP mode enabled?
        orreq   r0, r0, #(1 << 6)               @ Enable SMP/nAMP mode
        orreq   r0, r0, r10                     @ Enable CPU-specific SMP bits
        mcreq   p15, 0, r0, c1, c0, 1
#endif

Might just relate to the "fake it up for UP" thing I suppose.

> +
>  	/* Set up memory attribute type tables */
>  	ldr   r0, =MAIR0VAL
>  	ldr   r1, =MAIR1VAL
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 3849b23..91a5836 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -34,6 +34,36 @@
>  #define SCTLR_A         (1<<1)
>  #define SCTLR_M         (1<<0)
>  
> +/* ACTLR Auxiliary Control Register */
> +#define ACTLR_SNOOP_DELAYED      (1<<31)

If these are CortexA15 specific then I think they ought to be
ACTLR_CA15_FOO or something. And possibly in a new processor-ca15.h
header instead.

> +#define ACTLR_MAIN_CLOCK         (1<<30)
> +#define ACTLR_NEON_CLOCK         (1<<29)
> +#define ACTLR_NONCACHE           (1<<24)
> +#define ACTLR_INORDER_REQ        (1<<23)
> +#define ACTLR_INORDER_LOAD       (1<<22)
> +#define ACTLR_L2_TLB_PREFETCH    (1<<21)
> +#define ACTLR_L2_IPA_PA_CACHE    (1<<20)
> +#define ACTLR_L2_CACHE           (1<<19)
> +#define ACTLR_L2_PA_CACHE        (1<<18)
> +#define ACTLR_TLB                (1<<17)
> +#define ACTLR_STRONGY_ORDERED    (1<<16)
> +#define ACTLR_INORDER            (1<<15)
> +#define ACTLR_FORCE_LIM          (1<<14)
> +#define ACTLR_CP_FLUSH           (1<<13)
> +#define ACTLR_CP_PUSH            (1<<12)
> +#define ACTLR_LIM                (1<<11)
> +#define ACTLR_SER                (1<<10)
> +#define ACTLR_OPT                (1<<9)
> +#define ACTLR_WFI                (1<<8)
> +#define ACTLR_WFE                (1<<7)
> +#define ACTLR_SMP                (1<<6)
> +#define ACTLR_PLD                (1<<5)
> +#define ACTLR_IP                 (1<<4)
> +#define ACTLR_MICRO_BTB          (1<<3)
> +#define ACTLR_LOOP_ONE           (1<<2)
> +#define ACTLR_LOOP_DISABLE       (1<<1)
> +#define ACTLR_BTB                (1<<0)
> +
>  #define SCTLR_BASE        0x00c50078
>  #define HSCTLR_BASE       0x30c51878
>  

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register
  2012-10-24 15:03 [PATCH 0/7] xen/arm: run on real hardware Stefano Stabellini
@ 2012-10-24 15:03 ` Stefano Stabellini
  2012-10-25  9:52   ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Stefano Stabellini @ 2012-10-24 15:03 UTC (permalink / raw)
  To: xen-devel; +Cc: tim, Ian.Campbell, Stefano Stabellini

>From the Cortex A15 manual:

"Enables the processor to receive instruction cache, BTB, and TLB maintenance
operations from other processors"

...

"You must set this bit before enabling the caches and MMU, or
performing any cache and TLB maintenance operations. The only time
you must clear this bit is during a processor power-down sequence"

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/head.S             |    6 ++++++
 xen/include/asm-arm/processor.h |   30 ++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/xen/arch/arm/head.S b/xen/arch/arm/head.S
index 3d01be0..c784f4d 100644
--- a/xen/arch/arm/head.S
+++ b/xen/arch/arm/head.S
@@ -148,6 +148,12 @@ skip_bss:
 
 	PRINT("- Setting up control registers -\r\n")
 	
+	/* XXX: Cortex A15 specific */
+	/* Set up the SMP bit in ACTLR */
+	mrc   CP32(r0, ACTLR)
+	orr   r0, r0, #(ACTLR_SMP) /* enable SMP bit*/
+	mcr   CP32(r0, ACTLR)
+
 	/* Set up memory attribute type tables */
 	ldr   r0, =MAIR0VAL
 	ldr   r1, =MAIR1VAL
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 3849b23..91a5836 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -34,6 +34,36 @@
 #define SCTLR_A         (1<<1)
 #define SCTLR_M         (1<<0)
 
+/* ACTLR Auxiliary Control Register */
+#define ACTLR_SNOOP_DELAYED      (1<<31)
+#define ACTLR_MAIN_CLOCK         (1<<30)
+#define ACTLR_NEON_CLOCK         (1<<29)
+#define ACTLR_NONCACHE           (1<<24)
+#define ACTLR_INORDER_REQ        (1<<23)
+#define ACTLR_INORDER_LOAD       (1<<22)
+#define ACTLR_L2_TLB_PREFETCH    (1<<21)
+#define ACTLR_L2_IPA_PA_CACHE    (1<<20)
+#define ACTLR_L2_CACHE           (1<<19)
+#define ACTLR_L2_PA_CACHE        (1<<18)
+#define ACTLR_TLB                (1<<17)
+#define ACTLR_STRONGY_ORDERED    (1<<16)
+#define ACTLR_INORDER            (1<<15)
+#define ACTLR_FORCE_LIM          (1<<14)
+#define ACTLR_CP_FLUSH           (1<<13)
+#define ACTLR_CP_PUSH            (1<<12)
+#define ACTLR_LIM                (1<<11)
+#define ACTLR_SER                (1<<10)
+#define ACTLR_OPT                (1<<9)
+#define ACTLR_WFI                (1<<8)
+#define ACTLR_WFE                (1<<7)
+#define ACTLR_SMP                (1<<6)
+#define ACTLR_PLD                (1<<5)
+#define ACTLR_IP                 (1<<4)
+#define ACTLR_MICRO_BTB          (1<<3)
+#define ACTLR_LOOP_ONE           (1<<2)
+#define ACTLR_LOOP_DISABLE       (1<<1)
+#define ACTLR_BTB                (1<<0)
+
 #define SCTLR_BASE        0x00c50078
 #define HSCTLR_BASE       0x30c51878
 
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-11-16 15:36 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-13 15:40 [PATCH v2 0/7] xen/arm: run on real hardware Stefano Stabellini
2012-11-13 15:42 ` [PATCH 1/7] xen/arm: pass the correct bit-per-interrupt argument to vgic_irq_rank Stefano Stabellini
2012-11-15 10:26   ` Ian Campbell
2012-11-13 15:42 ` [PATCH 2/7] xen/arm: setup the fixmap in head.S Stefano Stabellini
2012-11-15 10:27   ` Ian Campbell
2012-11-13 15:42 ` [PATCH 3/7] pl011: set baud and clock_hz to the right defaults for Versatile Express Stefano Stabellini
2012-11-15 10:27   ` Ian Campbell
2012-11-13 15:42 ` [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register Stefano Stabellini
2012-11-15 10:27   ` Ian Campbell
2012-11-13 15:42 ` [PATCH 5/7] xen/arm: wake up secondary cpus Stefano Stabellini
2012-11-15 10:28   ` Ian Campbell
2012-11-13 15:42 ` [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate Stefano Stabellini
2012-11-15 10:02   ` Ian Campbell
2012-11-16 15:36     ` Stefano Stabellini
2012-11-13 15:42 ` [PATCH 7/7] xen/arm: get the number of cpus from device tree Stefano Stabellini
2012-11-15 10:09   ` Ian Campbell
2012-11-15 15:26     ` Stefano Stabellini
2012-11-15 15:34       ` Ian Campbell
2012-11-15 16:18         ` Stefano Stabellini
2012-11-15 16:27           ` Ian Campbell
  -- strict thread matches above, loose matches on Subject: below --
2012-10-24 15:03 [PATCH 0/7] xen/arm: run on real hardware Stefano Stabellini
2012-10-24 15:03 ` [PATCH 4/7] xen/arm: set the SMP bit in the ACTLR register Stefano Stabellini
2012-10-25  9:52   ` Ian Campbell
2012-10-25 11:57     ` Stefano Stabellini
2012-10-25 12:04       ` Ian Campbell
2012-10-26  8:56         ` Tim Deegan
2012-10-26  8:59           ` Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.