xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64
@ 2016-07-20 15:25 Julien Grall
  2016-07-20 15:25 ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Julien Grall
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

Hello,

Some of the processor errata will require to modify code sequence. As those
modifications may impact the performance, they should only be enabled on
affected cores. Furthermore, Xen may also want to take advantage of
new hardware features coming up with v8.1 and v8.2.

The first part of the series adds the alternative infrastructure, most of
the code is based on Linux v4.6-rc3. The rest of the series implements
errata for Cortex-A57 and Cortex-A53.

A branch with all the patches can be found here:

git://xenbits.xen.org/people/julieng/xen-unstable.git branch alternative-v5

For all the changes, see in each patch.

Yours sincerely,

Julien Grall (7):
  xen/arm: Introduce alternative runtime patching
  xen/arm: cpufeature: Provide an helper to check if a capability is
    supported
  xen/arm: Detect silicon revision and set cap bits accordingly
  xen/arm: Document the errata implemented in Xen
  xen/arm: arm64: Add Cortex-A53 cache errata workaround
  xen/arm: arm64: Add cortex-A57 erratum 832075 workaround
  xen/arm: traps: Don't inject a fault if the translation VA -> IPA
    fails

 docs/misc/arm/silicon-errata.txt  |  49 +++++++++
 xen/arch/arm/Kconfig              |  96 +++++++++++++++++
 xen/arch/arm/Makefile             |   2 +
 xen/arch/arm/alternative.c        | 221 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/arm64/cache.S        |   2 +
 xen/arch/arm/cpuerrata.c          |  60 +++++++++++
 xen/arch/arm/cpufeature.c         |  16 +++
 xen/arch/arm/setup.c              |  10 ++
 xen/arch/arm/smpboot.c            |   3 +
 xen/arch/arm/traps.c              |   5 +-
 xen/arch/arm/xen.lds.S            |   9 ++
 xen/include/asm-arm/alternative.h | 165 ++++++++++++++++++++++++++++
 xen/include/asm-arm/arm64/insn.h  |  16 +++
 xen/include/asm-arm/arm64/io.h    |  21 +++-
 xen/include/asm-arm/arm64/page.h  |   7 +-
 xen/include/asm-arm/cpuerrata.h   |  14 +++
 xen/include/asm-arm/cpufeature.h  |  20 +++-
 xen/include/asm-arm/processor.h   |   8 ++
 18 files changed, 715 insertions(+), 9 deletions(-)
 create mode 100644 docs/misc/arm/silicon-errata.txt
 create mode 100644 xen/arch/arm/alternative.c
 create mode 100644 xen/arch/arm/cpuerrata.c
 create mode 100644 xen/include/asm-arm/alternative.h
 create mode 100644 xen/include/asm-arm/cpuerrata.h

-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
@ 2016-07-20 15:25 ` Julien Grall
  2016-07-21  0:32   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patchingo Stefano Stabellini
  2016-07-22 14:15   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Konrad Rzeszutek Wilk
  2016-07-20 15:25 ` [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported Julien Grall
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

Some of the processor erratum will require to modify code sequence.
As those modifications may impact the performance, they should only
be enabled on affected cores. Furthermore, Xen may also want to take
advantage of new hardware features coming up with v8.1 and v8.2.

This patch adds an infrastructure to patch Xen during boot time
depending on the "features" available on the platform.

This code is based on the file arch/arm64/kernel/alternative.c in
Linux 4.6-rc3. Any references to arm64 have been dropped to make the
code as generic as possible.

When Xen is creating the page tables, all the executable sections
(.text and .init.text) will be marked read-only and then enforced by
setting SCTLR.WNX.

Whilst it might be possible to mark those entries read-only after
Xen has been patched, we would need extra care to avoid possible
TLBs conflicts (see D4-1732 in ARM DDI 0487A.i) as all
physical CPUs will be running.

All the physical CPUs have to be brought up before patching Xen because
each cores may have different errata/features which require code
patching. The only way to find them is to probe system registers on
each CPU.

To avoid extra complexity, it is possible to create a temporary
writeable mapping with vmap. This mapping will be used to write the
new instructions.

Lastly, runtime patching is currently not necessary for ARM32. So the
code is only enabled for ARM64.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v5:
        - Rebase on the latest staging

    Changes in v4:
        - Fix build when ALTERNATIVE is not enabled
        - Fix typo
        - Move .altinstructions in init.data

    Changes in v3:
        - .altinstr_replacement should live in init.text
        - Add a comment to explain when apply_alternatives_all should be
        called.

    Changes in v2:
        - Use hard tabs in Kconfig
        - Update the copyright year
        - Update the commit message to explain the extra mapping
---
 xen/arch/arm/Kconfig              |   5 +
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/alternative.c        | 221 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/setup.c              |   7 ++
 xen/arch/arm/xen.lds.S            |   9 ++
 xen/include/asm-arm/alternative.h | 165 ++++++++++++++++++++++++++++
 xen/include/asm-arm/arm64/insn.h  |  16 +++
 7 files changed, 424 insertions(+)
 create mode 100644 xen/arch/arm/alternative.c
 create mode 100644 xen/include/asm-arm/alternative.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 6231cd5..0141bd9 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -14,6 +14,7 @@ config ARM_64
 	def_bool y
 	depends on 64BIT
 	select HAS_GICV3
+	select ALTERNATIVE
 
 config ARM
 	def_bool y
@@ -46,6 +47,10 @@ config ACPI
 config HAS_GICV3
 	bool
 
+# Select ALTERNATIVE if the architecture supports runtime patching
+config ALTERNATIVE
+	bool
+
 endmenu
 
 source "common/Kconfig"
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index b264ed4..74bd7b8 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -4,6 +4,7 @@ subdir-y += platforms
 subdir-$(CONFIG_ARM_64) += efi
 subdir-$(CONFIG_ACPI) += acpi
 
+obj-$(CONFIG_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.o
 obj-y += cpu.o
 obj-y += cpufeature.o
diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
new file mode 100644
index 0000000..d00f98e
--- /dev/null
+++ b/xen/arch/arm/alternative.c
@@ -0,0 +1,221 @@
+/*
+ * alternative runtime patching
+ * inspired by the x86 version
+ *
+ * Copyright (C) 2014-2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+#include <xen/init.h>
+#include <xen/types.h>
+#include <xen/kernel.h>
+#include <xen/mm.h>
+#include <xen/vmap.h>
+#include <xen/smp.h>
+#include <xen/stop_machine.h>
+#include <asm/alternative.h>
+#include <asm/atomic.h>
+#include <asm/byteorder.h>
+#include <asm/cpufeature.h>
+#include <asm/insn.h>
+#include <asm/page.h>
+
+#define __ALT_PTR(a,f)      (u32 *)((void *)&(a)->f + (a)->f)
+#define ALT_ORIG_PTR(a)     __ALT_PTR(a, orig_offset)
+#define ALT_REPL_PTR(a)     __ALT_PTR(a, alt_offset)
+
+extern const struct alt_instr __alt_instructions[], __alt_instructions_end[];
+
+struct alt_region {
+    const struct alt_instr *begin;
+    const struct alt_instr *end;
+};
+
+/*
+ * Check if the target PC is within an alternative block.
+ */
+static bool_t branch_insn_requires_update(const struct alt_instr *alt,
+                                          unsigned long pc)
+{
+    unsigned long replptr;
+
+    if ( is_active_kernel_text(pc) )
+        return 1;
+
+    replptr = (unsigned long)ALT_REPL_PTR(alt);
+    if ( pc >= replptr && pc <= (replptr + alt->alt_len) )
+        return 0;
+
+    /*
+     * Branching into *another* alternate sequence is doomed, and
+     * we're not even trying to fix it up.
+     */
+    BUG();
+}
+
+static u32 get_alt_insn(const struct alt_instr *alt,
+                        const u32 *insnptr, const u32 *altinsnptr)
+{
+    u32 insn;
+
+    insn = le32_to_cpu(*altinsnptr);
+
+    if ( insn_is_branch_imm(insn) )
+    {
+        s32 offset = insn_get_branch_offset(insn);
+        unsigned long target;
+
+        target = (unsigned long)altinsnptr + offset;
+
+        /*
+         * If we're branching inside the alternate sequence,
+         * do not rewrite the instruction, as it is already
+         * correct. Otherwise, generate the new instruction.
+         */
+        if ( branch_insn_requires_update(alt, target) )
+        {
+            offset = target - (unsigned long)insnptr;
+            insn = insn_set_branch_offset(insn, offset);
+        }
+    }
+
+    return insn;
+}
+
+static int __apply_alternatives(const struct alt_region *region)
+{
+    const struct alt_instr *alt;
+    const u32 *origptr, *replptr;
+    u32 *writeptr, *writemap;
+    mfn_t text_mfn = _mfn(virt_to_mfn(_stext));
+    unsigned int text_order = get_order_from_bytes(_end - _start);
+
+    printk("Patching kernel code\n");
+
+    /*
+     * The text section is read-only. So re-map Xen to be able to patch
+     * the code.
+     */
+    writemap = __vmap(&text_mfn, 1 << text_order, 1, 1, PAGE_HYPERVISOR,
+                      VMAP_DEFAULT);
+    if ( !writemap )
+    {
+        printk("alternatives: Unable to map the text section\n");
+        return -ENOMEM;
+    }
+
+    for ( alt = region->begin; alt < region->end; alt++ )
+    {
+        u32 insn;
+        int i, nr_inst;
+
+        if ( !cpus_have_cap(alt->cpufeature) )
+            continue;
+
+        BUG_ON(alt->alt_len != alt->orig_len);
+
+        origptr = ALT_ORIG_PTR(alt);
+        writeptr = origptr - (u32 *)_start + writemap;
+        replptr = ALT_REPL_PTR(alt);
+
+        nr_inst = alt->alt_len / sizeof(insn);
+
+        for ( i = 0; i < nr_inst; i++ )
+        {
+            insn = get_alt_insn(alt, origptr + i, replptr + i);
+            *(writeptr + i) = cpu_to_le32(insn);
+        }
+
+        /* Ensure the new instructions reached the memory and nuke */
+        clean_and_invalidate_dcache_va_range(writeptr,
+                                             (sizeof (*writeptr) * nr_inst));
+    }
+
+    /* Nuke the instruction cache */
+    invalidate_icache();
+
+    vunmap(writemap);
+
+    return 0;
+}
+
+/*
+ * We might be patching the stop_machine state machine, so implement a
+ * really simple polling protocol here.
+ */
+static int __apply_alternatives_multi_stop(void *unused)
+{
+    static int patched = 0;
+    struct alt_region region = {
+        .begin = __alt_instructions,
+        .end = __alt_instructions_end,
+    };
+
+    /* We always have a CPU 0 at this point (__init) */
+    if ( smp_processor_id() )
+    {
+        while ( !read_atomic(&patched) )
+            cpu_relax();
+        isb();
+    }
+    else
+    {
+        int ret;
+
+        BUG_ON(patched);
+        ret = __apply_alternatives(&region);
+        /* The patching is not expected to fail during boot. */
+        BUG_ON(ret != 0);
+
+        /* Barriers provided by the cache flushing */
+        write_atomic(&patched, 1);
+    }
+
+    return 0;
+}
+
+/*
+ * This function should only be called during boot and before CPU0 jump
+ * into the idle_loop.
+ */
+void __init apply_alternatives_all(void)
+{
+    int ret;
+
+	/* better not try code patching on a live SMP system */
+    ret = stop_machine_run(__apply_alternatives_multi_stop, NULL, NR_CPUS);
+
+    /* stop_machine_run should never fail at this stage of the boot */
+    BUG_ON(ret);
+}
+
+int apply_alternatives(void *start, size_t length)
+{
+    struct alt_region region = {
+        .begin = start,
+        .end = start + length,
+    };
+
+    return __apply_alternatives(&region);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 46cf6dd..97b3214 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -38,6 +38,7 @@
 #include <xen/vmap.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/acpi.h>
+#include <asm/alternative.h>
 #include <asm/page.h>
 #include <asm/current.h>
 #include <asm/setup.h>
@@ -835,6 +836,12 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     do_initcalls();
 
+    /*
+     * It needs to be called after do_initcalls to be able to use
+     * stop_machine (tasklets initialized via an initcall).
+     */
+    apply_alternatives_all();
+
     /* Create initial domain 0. */
     /* The vGIC for DOM0 is exactly emulating the hardware GIC */
     config.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index b18c9c2..b24e93b 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -151,6 +151,15 @@ SECTIONS
        *(.initcall1.init)
        __initcall_end = .;
 
+#ifdef CONFIG_ALTERNATIVE
+       . = ALIGN(4);
+       __alt_instructions = .;
+       *(.altinstructions)
+       __alt_instructions_end = .;
+       . = ALIGN(4);
+       *(.altinstr_replacement)
+#endif
+
        *(.init.data)
        *(.init.data.rel)
        *(.init.data.rel.*)
diff --git a/xen/include/asm-arm/alternative.h b/xen/include/asm-arm/alternative.h
new file mode 100644
index 0000000..4287bac
--- /dev/null
+++ b/xen/include/asm-arm/alternative.h
@@ -0,0 +1,165 @@
+#ifndef __ASM_ALTERNATIVE_H
+#define __ASM_ALTERNATIVE_H
+
+#include <asm/cpufeature.h>
+#include <xen/config.h>
+#include <xen/kconfig.h>
+
+#ifdef CONFIG_ALTERNATIVE
+
+#ifndef __ASSEMBLY__
+
+#include <xen/init.h>
+#include <xen/types.h>
+#include <xen/stringify.h>
+
+struct alt_instr {
+	s32 orig_offset;	/* offset to original instruction */
+	s32 alt_offset;		/* offset to replacement instruction */
+	u16 cpufeature;		/* cpufeature bit set for replacement */
+	u8  orig_len;		/* size of original instruction(s) */
+	u8  alt_len;		/* size of new instruction(s), <= orig_len */
+};
+
+void __init apply_alternatives_all(void);
+int apply_alternatives(void *start, size_t length);
+
+#define ALTINSTR_ENTRY(feature)						      \
+	" .word 661b - .\n"				/* label           */ \
+	" .word 663f - .\n"				/* new instruction */ \
+	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
+	" .byte 662b-661b\n"				/* source len      */ \
+	" .byte 664f-663f\n"				/* replacement len */
+
+/*
+ * alternative assembly primitive:
+ *
+ * If any of these .org directive fail, it means that insn1 and insn2
+ * don't have the same length. This used to be written as
+ *
+ * .if ((664b-663b) != (662b-661b))
+ * 	.error "Alternatives instruction length mismatch"
+ * .endif
+ *
+ * but most assemblers die if insn1 or insn2 have a .inst. This should
+ * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
+ * containing commit 4e4d08cf7399b606 or c1baaddf8861).
+ */
+#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
+	".if "__stringify(cfg_enabled)" == 1\n"				\
+	"661:\n\t"							\
+	oldinstr "\n"							\
+	"662:\n"							\
+	".pushsection .altinstructions,\"a\"\n"				\
+	ALTINSTR_ENTRY(feature)						\
+	".popsection\n"							\
+	".pushsection .altinstr_replacement, \"a\"\n"			\
+	"663:\n\t"							\
+	newinstr "\n"							\
+	"664:\n\t"							\
+	".popsection\n\t"						\
+	".org	. - (664b-663b) + (662b-661b)\n\t"			\
+	".org	. - (662b-661b) + (664b-663b)\n"			\
+	".endif\n"
+
+#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
+	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
+
+#else
+
+#include <asm/asm_defns.h>
+
+.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
+	.word \orig_offset - .
+	.word \alt_offset - .
+	.hword \feature
+	.byte \orig_len
+	.byte \alt_len
+.endm
+
+.macro alternative_insn insn1, insn2, cap, enable = 1
+	.if \enable
+661:	\insn1
+662:	.pushsection .altinstructions, "a"
+	altinstruction_entry 661b, 663f, \cap, 662b-661b, 664f-663f
+	.popsection
+	.pushsection .altinstr_replacement, "ax"
+663:	\insn2
+664:	.popsection
+	.org	. - (664b-663b) + (662b-661b)
+	.org	. - (662b-661b) + (664b-663b)
+	.endif
+.endm
+
+/*
+ * Begin an alternative code sequence.
+ *
+ * The code that follows this macro will be assembled and linked as
+ * normal. There are no restrictions on this code.
+ */
+.macro alternative_if_not cap, enable = 1
+	.if \enable
+	.pushsection .altinstructions, "a"
+	altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
+	.popsection
+661:
+	.endif
+.endm
+
+/*
+ * Provide the alternative code sequence.
+ *
+ * The code that follows this macro is assembled into a special
+ * section to be used for dynamic patching. Code that follows this
+ * macro must:
+ *
+ * 1. Be exactly the same length (in bytes) as the default code
+ *    sequence.
+ *
+ * 2. Not contain a branch target that is used outside of the
+ *    alternative sequence it is defined in (branches into an
+ *    alternative sequence are not fixed up).
+ */
+.macro alternative_else
+662:	.pushsection .altinstr_replacement, "ax"
+663:
+.endm
+
+/*
+ * Complete an alternative code sequence.
+ */
+.macro alternative_endif
+664:	.popsection
+	.org	. - (664b-663b) + (662b-661b)
+	.org	. - (662b-661b) + (664b-663b)
+.endm
+
+#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...)	\
+	alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
+
+#endif  /*  __ASSEMBLY__  */
+
+/*
+ * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature));
+ *
+ * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature, CONFIG_FOO));
+ * N.B. If CONFIG_FOO is specified, but not selected, the whole block
+ *      will be omitted, including oldinstr.
+ */
+#define ALTERNATIVE(oldinstr, newinstr, ...)   \
+	_ALTERNATIVE_CFG(oldinstr, newinstr, __VA_ARGS__, 1)
+
+#else /* !CONFIG_ALTERNATIVE */
+
+static inline void apply_alternatives_all(void)
+{
+}
+
+static inline int apply_alternatives(void *start, size_t lenght)
+{
+    return 0;
+}
+
+#endif
+
+#endif /* __ASM_ALTERNATIVE_H */
diff --git a/xen/include/asm-arm/arm64/insn.h b/xen/include/asm-arm/arm64/insn.h
index cfcdbe9..6ce37be 100644
--- a/xen/include/asm-arm/arm64/insn.h
+++ b/xen/include/asm-arm/arm64/insn.h
@@ -61,6 +61,22 @@ u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
 s32 aarch64_get_branch_offset(u32 insn);
 u32 aarch64_set_branch_offset(u32 insn, s32 offset);
 
+/* Wrapper for common code */
+static inline bool insn_is_branch_imm(u32 insn)
+{
+    return aarch64_insn_is_branch_imm(insn);
+}
+
+static inline s32 insn_get_branch_offset(u32 insn)
+{
+    return aarch64_get_branch_offset(insn);
+}
+
+static inline u32 insn_set_branch_offset(u32 insn, s32 offset)
+{
+    return aarch64_set_branch_offset(insn, offset);
+}
+
 #endif /* !__ARCH_ARM_ARM64_INSN */
 /*
  * Local variables:
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
  2016-07-20 15:25 ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Julien Grall
@ 2016-07-20 15:25 ` Julien Grall
  2016-07-22 14:18   ` Konrad Rzeszutek Wilk
  2016-07-20 15:25 ` [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly Julien Grall
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The CPU capabilities will be set depending on the value found in the CPU
registers. This patch provides a generic to go through a set of capabilities
and find which one should be enabled.

The parameter "info" is used to display the kind of capability updated (e.g
workaround, feature...).

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v4:
        - Add Stefano's acked-by

    Changes in v3:
         - Patch added. The code was previously part of "Detect
         silicon...".
---
 xen/arch/arm/cpufeature.c        | 16 ++++++++++++++++
 xen/include/asm-arm/cpufeature.h |  9 +++++++++
 2 files changed, 25 insertions(+)

diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index 7a1b56b..088625b 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -24,6 +24,22 @@
 
 DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
 
+void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
+                             const char *info)
+{
+    int i;
+
+    for ( i = 0; caps[i].matches; i++ )
+    {
+        if ( !caps[i].matches(&caps[i]) )
+            continue;
+
+        if ( !cpus_have_cap(caps[i].capability) && caps[i].desc )
+            printk(XENLOG_INFO "%s: %s\n", info, caps[i].desc);
+        cpus_set_cap(caps[i].capability);
+    }
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index 2bebad1..be2414c 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -62,6 +62,15 @@ static inline void cpus_set_cap(unsigned int num)
         __set_bit(num, cpu_hwcaps);
 }
 
+struct arm_cpu_capabilities {
+    const char *desc;
+    u16 capability;
+    bool_t (*matches)(const struct arm_cpu_capabilities *);
+};
+
+void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
+                             const char *info);
+
 #endif /* __ASSEMBLY__ */
 
 #endif
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
  2016-07-20 15:25 ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Julien Grall
  2016-07-20 15:25 ` [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported Julien Grall
@ 2016-07-20 15:25 ` Julien Grall
  2016-07-22 14:24   ` Konrad Rzeszutek Wilk
  2016-07-20 15:25 ` [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen Julien Grall
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

After each CPU has been started, we iterate through a list of CPU
errata to detect CPUs which need from hypervisor code patches.

For each bug there is a function which check if that a particular CPU is
affected. This needs to be done on every CPUs to cover heterogenous
system properly.

If a certain erratum has been detected, the capability bit will be set.
In the case the erratum requires code patching, this will be triggered
by the call to apply_alternatives.

The code is based on the file arch/arm64/kernel/cpu_errata.c in Linux
v4.6-rc3.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v4:
        - Add missing emacs magic blocks
        - Add Stefano's acked-by

    Changes in v3:
        - Move update_cpu_capabilities in a separate patch
        - Update the commit message to mention that workaround may
        not require code patching.

    Changes in v2:
        - Use XENLOG_INFO for the loglevel of the message
---
 xen/arch/arm/Makefile            |  1 +
 xen/arch/arm/cpuerrata.c         | 34 ++++++++++++++++++++++++++++++++++
 xen/arch/arm/setup.c             |  3 +++
 xen/arch/arm/smpboot.c           |  3 +++
 xen/include/asm-arm/cpuerrata.h  | 14 ++++++++++++++
 xen/include/asm-arm/cpufeature.h |  6 ++++++
 6 files changed, 61 insertions(+)
 create mode 100644 xen/arch/arm/cpuerrata.c
 create mode 100644 xen/include/asm-arm/cpuerrata.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 74bd7b8..23aaf52 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -7,6 +7,7 @@ subdir-$(CONFIG_ACPI) += acpi
 obj-$(CONFIG_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.o
 obj-y += cpu.o
+obj-y += cpuerrata.o
 obj-y += cpufeature.o
 obj-y += decode.o
 obj-y += device.o
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
new file mode 100644
index 0000000..03ae7b4
--- /dev/null
+++ b/xen/arch/arm/cpuerrata.c
@@ -0,0 +1,34 @@
+#include <xen/config.h>
+#include <asm/cpufeature.h>
+#include <asm/cpuerrata.h>
+
+#define MIDR_RANGE(model, min, max)     \
+    .matches = is_affected_midr_range,  \
+    .midr_model = model,                \
+    .midr_range_min = min,              \
+    .midr_range_max = max
+
+static bool_t __maybe_unused
+is_affected_midr_range(const struct arm_cpu_capabilities *entry)
+{
+    return MIDR_IS_CPU_MODEL_RANGE(boot_cpu_data.midr.bits, entry->midr_model,
+                                   entry->midr_range_min,
+                                   entry->midr_range_max);
+}
+
+static const struct arm_cpu_capabilities arm_errata[] = {
+    {},
+};
+
+void check_local_cpu_errata(void)
+{
+    update_cpu_capabilities(arm_errata, "enabled workaround for");
+}
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 97b3214..38eb888 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -43,6 +43,7 @@
 #include <asm/current.h>
 #include <asm/setup.h>
 #include <asm/gic.h>
+#include <asm/cpuerrata.h>
 #include <asm/cpufeature.h>
 #include <asm/platform.h>
 #include <asm/procinfo.h>
@@ -171,6 +172,8 @@ static void __init processor_id(void)
     }
 
     processor_setup();
+
+    check_local_cpu_errata();
 }
 
 void dt_unreserved_regions(paddr_t s, paddr_t e,
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 3a962f7..d56b91d 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -29,6 +29,7 @@
 #include <xen/timer.h>
 #include <xen/irq.h>
 #include <xen/console.h>
+#include <asm/cpuerrata.h>
 #include <asm/gic.h>
 #include <asm/psci.h>
 #include <asm/acpi.h>
@@ -316,6 +317,8 @@ void start_secondary(unsigned long boot_phys_offset,
     local_irq_enable();
     local_abort_enable();
 
+    check_local_cpu_errata();
+
     printk(XENLOG_DEBUG "CPU %u booted.\n", smp_processor_id());
 
     startup_cpu_idle_loop();
diff --git a/xen/include/asm-arm/cpuerrata.h b/xen/include/asm-arm/cpuerrata.h
new file mode 100644
index 0000000..fe93beb
--- /dev/null
+++ b/xen/include/asm-arm/cpuerrata.h
@@ -0,0 +1,14 @@
+#ifndef __ARM_CPUERRATA_H
+#define __ARM_CPUERRATA_H
+
+void check_local_cpu_errata(void);
+
+#endif /* __ARM_CPUERRATA_H */
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index be2414c..fb57295 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -66,6 +66,12 @@ struct arm_cpu_capabilities {
     const char *desc;
     u16 capability;
     bool_t (*matches)(const struct arm_cpu_capabilities *);
+    union {
+        struct {    /* To be used for eratum handling only */
+            u32 midr_model;
+            u32 midr_range_min, midr_range_max;
+        };
+    };
 };
 
 void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
                   ` (2 preceding siblings ...)
  2016-07-20 15:25 ` [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly Julien Grall
@ 2016-07-20 15:25 ` Julien Grall
  2016-07-22 14:25   ` Konrad Rzeszutek Wilk
  2016-07-20 15:25 ` [PATCH v5 5/7] xen/arm: arm64: Add Cortex-A53 cache errata workaround Julien Grall
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The new document will help to keep track of each erratum Xen is able to
handle.

The text is based on the Linux doc in Documents/arm64/silicon-errata.txt.

Also list the current errata that Xen is aware of.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v4:
        - Fix grammar in the document : s/by ARM64/on ARM64/
        - Add Stefano's acked-by

    Changes in v3:
        - Fix grammar in the commit message
        - s/Linux/Xen/
        - Mention that runtime patching is only supported by ARM64
---
 docs/misc/arm/silicon-errata.txt | 45 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100644 docs/misc/arm/silicon-errata.txt

diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
new file mode 100644
index 0000000..5d54694
--- /dev/null
+++ b/docs/misc/arm/silicon-errata.txt
@@ -0,0 +1,45 @@
+                Silicon Errata and Software Workarounds
+                =======================================
+
+It is an unfortunate fact of life that hardware is often produced with
+so-called "errata", which can cause it to deviate from the architecture
+under specific circumstances.  For hardware produced by ARM, these
+errata are broadly classified into the following categories:
+
+  Category A: A critical error without a viable workaround.
+  Category B: A significant or critical error with an acceptable
+              workaround.
+  Category C: A minor error that is not expected to occur under normal
+              operation.
+
+For more information, consult one of the "Software Developers Errata
+Notice" documents available on infocenter.arm.com (registration
+required).
+
+As far as Xen is concerned, Category B errata may require some special
+treatment in the hypervisor. For example, avoiding a particular sequence
+of code, or configuring the processor in a particular way. A less common
+situation may require similar actions in order to declassify a Category A
+erratum into a Category C erratum. These are collectively known as
+"software workarounds" and are only required in the minority of cases
+(e.g. those cases that both require a non-secure workaround *and* can
+be triggered by Xen).
+
+For software workarounds that may adversely impact systems unaffected by
+the erratum in question, a Kconfig entry is added under "ARM errata
+workarounds via the alternatives framework". These are enabled by default
+and patched in at runtime when an affected CPU is detected. Note that
+runtime patching is only supported on ARM64. For less-intrusive workarounds,
+a Kconfig option is not available and the code is structured (preferably
+with a comment) in such a way that the erratum will not be hit.
+
+This approach can make it slightly onerous to determine exactly which
+errata are worked around in an arbitrary hypervisor source tree, so this
+file acts as a registry of software workarounds in the Xen hypervisor and
+will be updated when new workarounds are committed and backported to
+stable hypervisors.
+
+| Implementor    | Component       | Erratum ID      | Kconfig                 |
++----------------+-----------------+-----------------+-------------------------+
+| ARM            | Cortex-A15      | #766422         | N/A                     |
+| ARM            | Cortex-A57      | #852523         | N/A                     |
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 5/7] xen/arm: arm64: Add Cortex-A53 cache errata workaround
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
                   ` (3 preceding siblings ...)
  2016-07-20 15:25 ` [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen Julien Grall
@ 2016-07-20 15:25 ` Julien Grall
  2016-07-20 15:25 ` [PATCH v5 6/7] xen/arm: arm64: Add cortex-A57 erratum 832075 workaround Julien Grall
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The ARM errata 819472, 827319 and 824069 define the same workaround for
these hardware issues in certain Cortex-A53 parts.

The cache instructions "dc cvac" and "dc cvau" need to be upgraded to
"dc civac".

Use the alternative framework to replace those instructions only on
affected cores.

Whilst the errata affect cache instructions issued at any exception
level, it is not necessary to trap EL1/EL0 data cache instructions
access in order to upgrade them. Indeed the data cache corruption would
always be at the address used by the data cache instructions. Note that
this address could point to a shared memory between guests and the
hypervisors, however all the information present in it are be validated
before any use.

Therefore a malicious guest could only hurt itself. Note that all the
guests should implement/enable the workaround for the affected cores.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v4:
        - Add Stefano's acked-by

    Changes in v3:
        - Remove conflict introduced whilst rebasing this series
---
 docs/misc/arm/silicon-errata.txt |  3 ++
 xen/arch/arm/Kconfig             | 71 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/arm64/cache.S       |  2 ++
 xen/arch/arm/cpuerrata.c         | 17 ++++++++++
 xen/include/asm-arm/arm64/page.h |  7 +++-
 xen/include/asm-arm/cpufeature.h |  4 ++-
 xen/include/asm-arm/processor.h  |  6 ++++
 7 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
index 5d54694..546ccd7 100644
--- a/docs/misc/arm/silicon-errata.txt
+++ b/docs/misc/arm/silicon-errata.txt
@@ -42,4 +42,7 @@ stable hypervisors.
 | Implementor    | Component       | Erratum ID      | Kconfig                 |
 +----------------+-----------------+-----------------+-------------------------+
 | ARM            | Cortex-A15      | #766422         | N/A                     |
+| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319    |
+| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
+| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
 | ARM            | Cortex-A57      | #852523         | N/A                     |
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 0141bd9..a473d9c 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -53,6 +53,77 @@ config ALTERNATIVE
 
 endmenu
 
+menu "ARM errata workaround via the alternative framework"
+	depends on ALTERNATIVE
+
+config ARM64_ERRATUM_827319
+	bool "Cortex-A53: 827319: Data cache clean instructions might cause overlapping transactions to the interconnect"
+	default y
+	depends on ARM_64
+	help
+	  This option adds an alternative code sequence to work around ARM
+	  erratum 827319 on Cortex-A53 parts up to r0p2 with an AMBA 5 CHI
+	  master interface and an L2 cache.
+
+	  Under certain conditions this erratum can cause a clean line eviction
+	  to occur at the same time as another transaction to the same address
+	  on the AMBA 5 CHI interface, which can cause data corruption if the
+	  interconnect reorders the two transactions.
+
+	  The workaround promotes data cache clean instructions to
+	  data cache clean-and-invalidate.
+	  Please note that this does not necessarily enable the workaround,
+	  as it depends on the alternative framework, which will only patch
+	  the kernel if an affected CPU is detected.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_824069
+	bool "Cortex-A53: 824069: Cache line might not be marked as clean after a CleanShared snoop"
+	default y
+	depends on ARM_64
+	help
+	  This option adds an alternative code sequence to work around ARM
+	  erratum 824069 on Cortex-A53 parts up to r0p2 when it is connected
+	  to a coherent interconnect.
+
+	  If a Cortex-A53 processor is executing a store or prefetch for
+	  write instruction at the same time as a processor in another
+	  cluster is executing a cache maintenance operation to the same
+	  address, then this erratum might cause a clean cache line to be
+	  incorrectly marked as dirty.
+
+	  The workaround promotes data cache clean instructions to
+	  data cache clean-and-invalidate.
+	  Please note that this option does not necessarily enable the
+	  workaround, as it depends on the alternative framework, which will
+	  only patch the kernel if an affected CPU is detected.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_819472
+	bool "Cortex-A53: 819472: Store exclusive instructions might cause data corruption"
+	default y
+	depends on ARM_64
+	help
+	  This option adds an alternative code sequence to work around ARM
+	  erratum 819472 on Cortex-A53 parts up to r0p1 with an L2 cache
+	  present when it is connected to a coherent interconnect.
+
+	  If the processor is executing a load and store exclusive sequence at
+	  the same time as a processor in another cluster is executing a cache
+	  maintenance operation to the same address, then this erratum might
+	  cause data corruption.
+
+	  The workaround promotes data cache clean instructions to
+	  data cache clean-and-invalidate.
+	  Please note that this does not necessarily enable the workaround,
+	  as it depends on the alternative framework, which will only patch
+	  the kernel if an affected CPU is detected.
+
+	  If unsure, say Y.
+endmenu
+
 source "common/Kconfig"
 
 source "drivers/Kconfig"
diff --git a/xen/arch/arm/arm64/cache.S b/xen/arch/arm/arm64/cache.S
index eff4e16..9a88a2b 100644
--- a/xen/arch/arm/arm64/cache.S
+++ b/xen/arch/arm/arm64/cache.S
@@ -19,6 +19,8 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <asm/alternative.h>
+
 /*
  * dcache_line_size - get the minimum D-cache line size from the CTR register.
  */
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 03ae7b4..121788d 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -17,6 +17,23 @@ is_affected_midr_range(const struct arm_cpu_capabilities *entry)
 }
 
 static const struct arm_cpu_capabilities arm_errata[] = {
+#if defined(CONFIG_ARM64_ERRATUM_827319) || \
+    defined(CONFIG_ARM64_ERRATUM_824069)
+    {
+        /* Cortex-A53 r0p[012] */
+        .desc = "ARM errata 827319, 824069",
+        .capability = ARM64_WORKAROUND_CLEAN_CACHE,
+        MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x02),
+    },
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_819472
+    {
+        /* Cortex-A53 r0[01] */
+        .desc = "ARM erratum 819472",
+        .capability = ARM64_WORKAROUND_CLEAN_CACHE,
+        MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x01),
+    },
+#endif
     {},
 };
 
diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
index fbdc8fb..79ef7bd 100644
--- a/xen/include/asm-arm/arm64/page.h
+++ b/xen/include/asm-arm/arm64/page.h
@@ -3,6 +3,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include <asm/alternative.h>
+
 /* Write a pagetable entry */
 static inline void write_pte(lpae_t *p, lpae_t pte)
 {
@@ -18,7 +20,10 @@ static inline void write_pte(lpae_t *p, lpae_t pte)
 #define __invalidate_dcache_one(R) "dc ivac, %" #R ";"
 
 /* Inline ASM to flush dcache on register R (may be an inline asm operand) */
-#define __clean_dcache_one(R) "dc cvac, %" #R ";"
+#define __clean_dcache_one(R)                   \
+    ALTERNATIVE("dc cvac, %" #R ";",            \
+                "dc civac, %" #R ";",           \
+                ARM64_WORKAROUND_CLEAN_CACHE)   \
 
 /* Inline ASM to clean and invalidate dcache on register R (may be an
  * inline asm operand) */
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index fb57295..474a778 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -35,7 +35,9 @@
 #endif
 #define cpu_has_security  (boot_cpu_feature32(security) > 0)
 
-#define ARM_NCAPS           0
+#define ARM64_WORKAROUND_CLEAN_CACHE    0
+
+#define ARM_NCAPS           1
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index dba9b9a..5089bfd 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -44,6 +44,12 @@
         _model == (model) && _rv >= (rv_min) && _rv <= (rv_max);        \
 })
 
+#define ARM_CPU_IMP_ARM             0x41
+
+#define ARM_CPU_PART_CORTEX_A53     0xD03
+
+#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
+
 /* MPIDR Multiprocessor Affinity Register */
 #define _MPIDR_UP           (30)
 #define MPIDR_UP            (_AC(1,U) << _MPIDR_UP)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 6/7] xen/arm: arm64: Add cortex-A57 erratum 832075 workaround
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
                   ` (4 preceding siblings ...)
  2016-07-20 15:25 ` [PATCH v5 5/7] xen/arm: arm64: Add Cortex-A53 cache errata workaround Julien Grall
@ 2016-07-20 15:25 ` Julien Grall
  2016-07-20 15:26 ` [PATCH v5 7/7] xen/arm: traps: Don't inject a fault if the translation VA -> IPA fails Julien Grall
  2016-07-22 14:29 ` [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Konrad Rzeszutek Wilk
  7 siblings, 0 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

The ARM erratum 832075 applies to certain revisions of Cortex-A57, one
of the workarounds is to change device loads into using load-acquire
semantics.

Use the alternative framework to enable the workaround only on affected
cores.

Whilst a guest could trigger the deadlock, it can be broken when the
processor is receiving an interrupt. As the Xen scheduler will always setup
a timer (firing to every 1ms to 300ms depending on the running time
slice) on each processor, the deadlock would last only few milliseconds
and only affects the guest time slice.

Therefore a malicious guest could only hurt itself. Note that all the
guests should implement/enable the workaround for the affected cores.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v4:
        - Add Stefano's acked-by

    Changes in v2:
        - Update the commit message to explain why it is not necessary
        to take care of possible deadlock from the guest.
---
 docs/misc/arm/silicon-errata.txt |  1 +
 xen/arch/arm/Kconfig             | 20 ++++++++++++++++++++
 xen/arch/arm/cpuerrata.c         |  9 +++++++++
 xen/include/asm-arm/arm64/io.h   | 21 +++++++++++++++++----
 xen/include/asm-arm/cpufeature.h |  3 ++-
 xen/include/asm-arm/processor.h  |  2 ++
 6 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
index 546ccd7..220f724 100644
--- a/docs/misc/arm/silicon-errata.txt
+++ b/docs/misc/arm/silicon-errata.txt
@@ -46,3 +46,4 @@ stable hypervisors.
 | ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
 | ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
 | ARM            | Cortex-A57      | #852523         | N/A                     |
+| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index a473d9c..e26c4c8 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -122,6 +122,26 @@ config ARM64_ERRATUM_819472
 	  the kernel if an affected CPU is detected.
 
 	  If unsure, say Y.
+
+config ARM64_ERRATUM_832075
+	bool "Cortex-A57: 832075: possible deadlock on mixing exclusive memory accesses with device loads"
+	default y
+	depends on ARM_64
+	help
+	  This option adds an alternative code sequence to work around ARM
+	  erratum 832075 on Cortex-A57 parts up to r1p2.
+
+	  Affected Cortex-A57 parts might deadlock when exclusive load/store
+	  instructions to Write-Back memory are mixed with Device loads.
+
+	  The workaround is to promote device loads to use Load-Acquire
+	  semantics.
+	  Please note that this does not necessarily enable the workaround,
+	  as it depends on the alternative framework, which will only patch
+	  the kernel if an affected CPU is detected.
+
+	  If unsure, say Y.
+
 endmenu
 
 source "common/Kconfig"
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 121788d..3ac97b3 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -34,6 +34,15 @@ static const struct arm_cpu_capabilities arm_errata[] = {
         MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x01),
     },
 #endif
+#ifdef CONFIG_ARM64_ERRATUM_832075
+    {
+        /* Cortex-A57 r0p0 - r1p2 */
+        .desc = "ARM erratum 832075",
+        .capability = ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE,
+        MIDR_RANGE(MIDR_CORTEX_A57, 0x00,
+                   (1 << MIDR_VARIANT_SHIFT) | 2),
+    },
+#endif
     {},
 };
 
diff --git a/xen/include/asm-arm/arm64/io.h b/xen/include/asm-arm/arm64/io.h
index f80156f..30bfc78 100644
--- a/xen/include/asm-arm/arm64/io.h
+++ b/xen/include/asm-arm/arm64/io.h
@@ -22,6 +22,7 @@
 
 #include <asm/system.h>
 #include <asm/byteorder.h>
+#include <asm/alternative.h>
 
 /*
  * Generic IO read/write.  These perform native-endian accesses.
@@ -49,28 +50,40 @@ static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
 static inline u8 __raw_readb(const volatile void __iomem *addr)
 {
         u8 val;
-        asm volatile("ldrb %w0, [%1]" : "=r" (val) : "r" (addr));
+        asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
+                                 "ldarb %w0, [%1]",
+                                 ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+                     : "=r" (val) : "r" (addr));
         return val;
 }
 
 static inline u16 __raw_readw(const volatile void __iomem *addr)
 {
         u16 val;
-        asm volatile("ldrh %w0, [%1]" : "=r" (val) : "r" (addr));
+        asm volatile(ALTERNATIVE("ldrh %w0, [%1]",
+                                 "ldarh %w0, [%1]",
+                                 ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+                     : "=r" (val) : "r" (addr));
         return val;
 }
 
 static inline u32 __raw_readl(const volatile void __iomem *addr)
 {
         u32 val;
-        asm volatile("ldr %w0, [%1]" : "=r" (val) : "r" (addr));
+        asm volatile(ALTERNATIVE("ldr %w0, [%1]",
+                                 "ldar %w0, [%1]",
+                                 ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+                     : "=r" (val) : "r" (addr));
         return val;
 }
 
 static inline u64 __raw_readq(const volatile void __iomem *addr)
 {
         u64 val;
-        asm volatile("ldr %0, [%1]" : "=r" (val) : "r" (addr));
+        asm volatile(ALTERNATIVE("ldr %0, [%1]",
+                                 "ldar %0, [%1]",
+                                 ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+                     : "=r" (val) : "r" (addr));
         return val;
 }
 
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index 474a778..78e2263 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -36,8 +36,9 @@
 #define cpu_has_security  (boot_cpu_feature32(security) > 0)
 
 #define ARM64_WORKAROUND_CLEAN_CACHE    0
+#define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE    1
 
-#define ARM_NCAPS           1
+#define ARM_NCAPS           2
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 5089bfd..1708253 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -47,8 +47,10 @@
 #define ARM_CPU_IMP_ARM             0x41
 
 #define ARM_CPU_PART_CORTEX_A53     0xD03
+#define ARM_CPU_PART_CORTEX_A57     0xD07
 
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
+#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
 
 /* MPIDR Multiprocessor Affinity Register */
 #define _MPIDR_UP           (30)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v5 7/7] xen/arm: traps: Don't inject a fault if the translation VA -> IPA fails
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
                   ` (5 preceding siblings ...)
  2016-07-20 15:25 ` [PATCH v5 6/7] xen/arm: arm64: Add cortex-A57 erratum 832075 workaround Julien Grall
@ 2016-07-20 15:26 ` Julien Grall
  2016-07-22 14:29 ` [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Konrad Rzeszutek Wilk
  7 siblings, 0 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-20 15:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, sstabellini

Based on ARM ARM (D4.5.3 in ARM DDI 0486A and B3.12.7 in ARM DDI 0406C.c),
a Stage 1 translation error has priority over a Stage 2 translation error.

Therefore gva_to_ipa can only fail if another vCPU is playing with the
page table.

Rather than injecting a custom fault, replay the instruction and let the
processor injecting the correct fault.

This is fine as Xen is handling all the pending softirqs
(see leave_hypervisor_tail) before returning to the guest. One of them
is the scheduler which could rescheduled the vCPU.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

---
    Changes in v4:
        - s/reschuled/rescheduled/ in the commit message

    Changes in v3:
        - Add Stefano's acked-by

    Changes in v2:
        - Update commit message to explain why a guest cannot DoS the
        hypervisor with it.
---
 xen/arch/arm/traps.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a2eb1da..d7a3e62 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2412,7 +2412,7 @@ static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
 
             rc = gva_to_ipa(gva, &gpa, GV2M_READ);
             if ( rc == -EFAULT )
-                goto bad_insn_abort;
+                return; /* Try again */
         }
 
         rc = p2m_mem_access_check(gpa, gva, npfec);
@@ -2424,7 +2424,6 @@ static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
     break;
     }
 
-bad_insn_abort:
     inject_iabt_exception(regs, gva, hsr.len);
 }
 
@@ -2448,7 +2447,7 @@ static void do_trap_data_abort_guest(struct cpu_user_regs *regs,
     {
         rc = gva_to_ipa(info.gva, &info.gpa, GV2M_READ);
         if ( rc == -EFAULT )
-            goto bad_data_abort;
+            return; /* Try again */
     }
 
     switch ( dabt.dfsc & 0x3f )
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/7] xen/arm: Introduce alternative runtime patchingo
  2016-07-20 15:25 ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Julien Grall
@ 2016-07-21  0:32   ` Stefano Stabellini
  2016-07-22 14:15   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 20+ messages in thread
From: Stefano Stabellini @ 2016-07-21  0:32 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

CC'ing Konrad.

Konrad,

I'll defer to you on this patch.

On Wed, 20 Jul 2016, Julien Grall wrote:
> Some of the processor erratum will require to modify code sequence.
> As those modifications may impact the performance, they should only
> be enabled on affected cores. Furthermore, Xen may also want to take
> advantage of new hardware features coming up with v8.1 and v8.2.
> 
> This patch adds an infrastructure to patch Xen during boot time
> depending on the "features" available on the platform.
> 
> This code is based on the file arch/arm64/kernel/alternative.c in
> Linux 4.6-rc3. Any references to arm64 have been dropped to make the
> code as generic as possible.
> 
> When Xen is creating the page tables, all the executable sections
> (.text and .init.text) will be marked read-only and then enforced by
> setting SCTLR.WNX.
> 
> Whilst it might be possible to mark those entries read-only after
> Xen has been patched, we would need extra care to avoid possible
> TLBs conflicts (see D4-1732 in ARM DDI 0487A.i) as all
> physical CPUs will be running.
> 
> All the physical CPUs have to be brought up before patching Xen because
> each cores may have different errata/features which require code
> patching. The only way to find them is to probe system registers on
> each CPU.
> 
> To avoid extra complexity, it is possible to create a temporary
> writeable mapping with vmap. This mapping will be used to write the
> new instructions.
> 
> Lastly, runtime patching is currently not necessary for ARM32. So the
> code is only enabled for ARM64.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v5:
>         - Rebase on the latest staging
> 
>     Changes in v4:
>         - Fix build when ALTERNATIVE is not enabled
>         - Fix typo
>         - Move .altinstructions in init.data
> 
>     Changes in v3:
>         - .altinstr_replacement should live in init.text
>         - Add a comment to explain when apply_alternatives_all should be
>         called.
> 
>     Changes in v2:
>         - Use hard tabs in Kconfig
>         - Update the copyright year
>         - Update the commit message to explain the extra mapping
> ---
>  xen/arch/arm/Kconfig              |   5 +
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/alternative.c        | 221 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/setup.c              |   7 ++
>  xen/arch/arm/xen.lds.S            |   9 ++
>  xen/include/asm-arm/alternative.h | 165 ++++++++++++++++++++++++++++
>  xen/include/asm-arm/arm64/insn.h  |  16 +++
>  7 files changed, 424 insertions(+)
>  create mode 100644 xen/arch/arm/alternative.c
>  create mode 100644 xen/include/asm-arm/alternative.h
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 6231cd5..0141bd9 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -14,6 +14,7 @@ config ARM_64
>  	def_bool y
>  	depends on 64BIT
>  	select HAS_GICV3
> +	select ALTERNATIVE
>  
>  config ARM
>  	def_bool y
> @@ -46,6 +47,10 @@ config ACPI
>  config HAS_GICV3
>  	bool
>  
> +# Select ALTERNATIVE if the architecture supports runtime patching
> +config ALTERNATIVE
> +	bool
> +
>  endmenu
>  
>  source "common/Kconfig"
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index b264ed4..74bd7b8 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -4,6 +4,7 @@ subdir-y += platforms
>  subdir-$(CONFIG_ARM_64) += efi
>  subdir-$(CONFIG_ACPI) += acpi
>  
> +obj-$(CONFIG_ALTERNATIVE) += alternative.o
>  obj-y += bootfdt.o
>  obj-y += cpu.o
>  obj-y += cpufeature.o
> diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
> new file mode 100644
> index 0000000..d00f98e
> --- /dev/null
> +++ b/xen/arch/arm/alternative.c
> @@ -0,0 +1,221 @@
> +/*
> + * alternative runtime patching
> + * inspired by the x86 version
> + *
> + * Copyright (C) 2014-2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/config.h>
> +#include <xen/init.h>
> +#include <xen/types.h>
> +#include <xen/kernel.h>
> +#include <xen/mm.h>
> +#include <xen/vmap.h>
> +#include <xen/smp.h>
> +#include <xen/stop_machine.h>
> +#include <asm/alternative.h>
> +#include <asm/atomic.h>
> +#include <asm/byteorder.h>
> +#include <asm/cpufeature.h>
> +#include <asm/insn.h>
> +#include <asm/page.h>
> +
> +#define __ALT_PTR(a,f)      (u32 *)((void *)&(a)->f + (a)->f)
> +#define ALT_ORIG_PTR(a)     __ALT_PTR(a, orig_offset)
> +#define ALT_REPL_PTR(a)     __ALT_PTR(a, alt_offset)
> +
> +extern const struct alt_instr __alt_instructions[], __alt_instructions_end[];
> +
> +struct alt_region {
> +    const struct alt_instr *begin;
> +    const struct alt_instr *end;
> +};
> +
> +/*
> + * Check if the target PC is within an alternative block.
> + */
> +static bool_t branch_insn_requires_update(const struct alt_instr *alt,
> +                                          unsigned long pc)
> +{
> +    unsigned long replptr;
> +
> +    if ( is_active_kernel_text(pc) )
> +        return 1;
> +
> +    replptr = (unsigned long)ALT_REPL_PTR(alt);
> +    if ( pc >= replptr && pc <= (replptr + alt->alt_len) )
> +        return 0;
> +
> +    /*
> +     * Branching into *another* alternate sequence is doomed, and
> +     * we're not even trying to fix it up.
> +     */
> +    BUG();
> +}
> +
> +static u32 get_alt_insn(const struct alt_instr *alt,
> +                        const u32 *insnptr, const u32 *altinsnptr)
> +{
> +    u32 insn;
> +
> +    insn = le32_to_cpu(*altinsnptr);
> +
> +    if ( insn_is_branch_imm(insn) )
> +    {
> +        s32 offset = insn_get_branch_offset(insn);
> +        unsigned long target;
> +
> +        target = (unsigned long)altinsnptr + offset;
> +
> +        /*
> +         * If we're branching inside the alternate sequence,
> +         * do not rewrite the instruction, as it is already
> +         * correct. Otherwise, generate the new instruction.
> +         */
> +        if ( branch_insn_requires_update(alt, target) )
> +        {
> +            offset = target - (unsigned long)insnptr;
> +            insn = insn_set_branch_offset(insn, offset);
> +        }
> +    }
> +
> +    return insn;
> +}
> +
> +static int __apply_alternatives(const struct alt_region *region)
> +{
> +    const struct alt_instr *alt;
> +    const u32 *origptr, *replptr;
> +    u32 *writeptr, *writemap;
> +    mfn_t text_mfn = _mfn(virt_to_mfn(_stext));
> +    unsigned int text_order = get_order_from_bytes(_end - _start);
> +
> +    printk("Patching kernel code\n");
> +
> +    /*
> +     * The text section is read-only. So re-map Xen to be able to patch
> +     * the code.
> +     */
> +    writemap = __vmap(&text_mfn, 1 << text_order, 1, 1, PAGE_HYPERVISOR,
> +                      VMAP_DEFAULT);
> +    if ( !writemap )
> +    {
> +        printk("alternatives: Unable to map the text section\n");
> +        return -ENOMEM;
> +    }
> +
> +    for ( alt = region->begin; alt < region->end; alt++ )
> +    {
> +        u32 insn;
> +        int i, nr_inst;
> +
> +        if ( !cpus_have_cap(alt->cpufeature) )
> +            continue;
> +
> +        BUG_ON(alt->alt_len != alt->orig_len);
> +
> +        origptr = ALT_ORIG_PTR(alt);
> +        writeptr = origptr - (u32 *)_start + writemap;
> +        replptr = ALT_REPL_PTR(alt);
> +
> +        nr_inst = alt->alt_len / sizeof(insn);
> +
> +        for ( i = 0; i < nr_inst; i++ )
> +        {
> +            insn = get_alt_insn(alt, origptr + i, replptr + i);
> +            *(writeptr + i) = cpu_to_le32(insn);
> +        }
> +
> +        /* Ensure the new instructions reached the memory and nuke */
> +        clean_and_invalidate_dcache_va_range(writeptr,
> +                                             (sizeof (*writeptr) * nr_inst));
> +    }
> +
> +    /* Nuke the instruction cache */
> +    invalidate_icache();
> +
> +    vunmap(writemap);
> +
> +    return 0;
> +}
> +
> +/*
> + * We might be patching the stop_machine state machine, so implement a
> + * really simple polling protocol here.
> + */
> +static int __apply_alternatives_multi_stop(void *unused)
> +{
> +    static int patched = 0;
> +    struct alt_region region = {
> +        .begin = __alt_instructions,
> +        .end = __alt_instructions_end,
> +    };
> +
> +    /* We always have a CPU 0 at this point (__init) */
> +    if ( smp_processor_id() )
> +    {
> +        while ( !read_atomic(&patched) )
> +            cpu_relax();
> +        isb();
> +    }
> +    else
> +    {
> +        int ret;
> +
> +        BUG_ON(patched);
> +        ret = __apply_alternatives(&region);
> +        /* The patching is not expected to fail during boot. */
> +        BUG_ON(ret != 0);
> +
> +        /* Barriers provided by the cache flushing */
> +        write_atomic(&patched, 1);
> +    }
> +
> +    return 0;
> +}
> +
> +/*
> + * This function should only be called during boot and before CPU0 jump
> + * into the idle_loop.
> + */
> +void __init apply_alternatives_all(void)
> +{
> +    int ret;
> +
> +	/* better not try code patching on a live SMP system */
> +    ret = stop_machine_run(__apply_alternatives_multi_stop, NULL, NR_CPUS);
> +
> +    /* stop_machine_run should never fail at this stage of the boot */
> +    BUG_ON(ret);
> +}
> +
> +int apply_alternatives(void *start, size_t length)
> +{
> +    struct alt_region region = {
> +        .begin = start,
> +        .end = start + length,
> +    };
> +
> +    return __apply_alternatives(&region);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 46cf6dd..97b3214 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -38,6 +38,7 @@
>  #include <xen/vmap.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/acpi.h>
> +#include <asm/alternative.h>
>  #include <asm/page.h>
>  #include <asm/current.h>
>  #include <asm/setup.h>
> @@ -835,6 +836,12 @@ void __init start_xen(unsigned long boot_phys_offset,
>  
>      do_initcalls();
>  
> +    /*
> +     * It needs to be called after do_initcalls to be able to use
> +     * stop_machine (tasklets initialized via an initcall).
> +     */
> +    apply_alternatives_all();
> +
>      /* Create initial domain 0. */
>      /* The vGIC for DOM0 is exactly emulating the hardware GIC */
>      config.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index b18c9c2..b24e93b 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -151,6 +151,15 @@ SECTIONS
>         *(.initcall1.init)
>         __initcall_end = .;
>  
> +#ifdef CONFIG_ALTERNATIVE
> +       . = ALIGN(4);
> +       __alt_instructions = .;
> +       *(.altinstructions)
> +       __alt_instructions_end = .;
> +       . = ALIGN(4);
> +       *(.altinstr_replacement)
> +#endif
> +
>         *(.init.data)
>         *(.init.data.rel)
>         *(.init.data.rel.*)
> diff --git a/xen/include/asm-arm/alternative.h b/xen/include/asm-arm/alternative.h
> new file mode 100644
> index 0000000..4287bac
> --- /dev/null
> +++ b/xen/include/asm-arm/alternative.h
> @@ -0,0 +1,165 @@
> +#ifndef __ASM_ALTERNATIVE_H
> +#define __ASM_ALTERNATIVE_H
> +
> +#include <asm/cpufeature.h>
> +#include <xen/config.h>
> +#include <xen/kconfig.h>
> +
> +#ifdef CONFIG_ALTERNATIVE
> +
> +#ifndef __ASSEMBLY__
> +
> +#include <xen/init.h>
> +#include <xen/types.h>
> +#include <xen/stringify.h>
> +
> +struct alt_instr {
> +	s32 orig_offset;	/* offset to original instruction */
> +	s32 alt_offset;		/* offset to replacement instruction */
> +	u16 cpufeature;		/* cpufeature bit set for replacement */
> +	u8  orig_len;		/* size of original instruction(s) */
> +	u8  alt_len;		/* size of new instruction(s), <= orig_len */
> +};
> +
> +void __init apply_alternatives_all(void);
> +int apply_alternatives(void *start, size_t length);
> +
> +#define ALTINSTR_ENTRY(feature)						      \
> +	" .word 661b - .\n"				/* label           */ \
> +	" .word 663f - .\n"				/* new instruction */ \
> +	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
> +	" .byte 662b-661b\n"				/* source len      */ \
> +	" .byte 664f-663f\n"				/* replacement len */
> +
> +/*
> + * alternative assembly primitive:
> + *
> + * If any of these .org directive fail, it means that insn1 and insn2
> + * don't have the same length. This used to be written as
> + *
> + * .if ((664b-663b) != (662b-661b))
> + * 	.error "Alternatives instruction length mismatch"
> + * .endif
> + *
> + * but most assemblers die if insn1 or insn2 have a .inst. This should
> + * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
> + * containing commit 4e4d08cf7399b606 or c1baaddf8861).
> + */
> +#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
> +	".if "__stringify(cfg_enabled)" == 1\n"				\
> +	"661:\n\t"							\
> +	oldinstr "\n"							\
> +	"662:\n"							\
> +	".pushsection .altinstructions,\"a\"\n"				\
> +	ALTINSTR_ENTRY(feature)						\
> +	".popsection\n"							\
> +	".pushsection .altinstr_replacement, \"a\"\n"			\
> +	"663:\n\t"							\
> +	newinstr "\n"							\
> +	"664:\n\t"							\
> +	".popsection\n\t"						\
> +	".org	. - (664b-663b) + (662b-661b)\n\t"			\
> +	".org	. - (662b-661b) + (664b-663b)\n"			\
> +	".endif\n"
> +
> +#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
> +	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
> +
> +#else
> +
> +#include <asm/asm_defns.h>
> +
> +.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
> +	.word \orig_offset - .
> +	.word \alt_offset - .
> +	.hword \feature
> +	.byte \orig_len
> +	.byte \alt_len
> +.endm
> +
> +.macro alternative_insn insn1, insn2, cap, enable = 1
> +	.if \enable
> +661:	\insn1
> +662:	.pushsection .altinstructions, "a"
> +	altinstruction_entry 661b, 663f, \cap, 662b-661b, 664f-663f
> +	.popsection
> +	.pushsection .altinstr_replacement, "ax"
> +663:	\insn2
> +664:	.popsection
> +	.org	. - (664b-663b) + (662b-661b)
> +	.org	. - (662b-661b) + (664b-663b)
> +	.endif
> +.endm
> +
> +/*
> + * Begin an alternative code sequence.
> + *
> + * The code that follows this macro will be assembled and linked as
> + * normal. There are no restrictions on this code.
> + */
> +.macro alternative_if_not cap, enable = 1
> +	.if \enable
> +	.pushsection .altinstructions, "a"
> +	altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
> +	.popsection
> +661:
> +	.endif
> +.endm
> +
> +/*
> + * Provide the alternative code sequence.
> + *
> + * The code that follows this macro is assembled into a special
> + * section to be used for dynamic patching. Code that follows this
> + * macro must:
> + *
> + * 1. Be exactly the same length (in bytes) as the default code
> + *    sequence.
> + *
> + * 2. Not contain a branch target that is used outside of the
> + *    alternative sequence it is defined in (branches into an
> + *    alternative sequence are not fixed up).
> + */
> +.macro alternative_else
> +662:	.pushsection .altinstr_replacement, "ax"
> +663:
> +.endm
> +
> +/*
> + * Complete an alternative code sequence.
> + */
> +.macro alternative_endif
> +664:	.popsection
> +	.org	. - (664b-663b) + (662b-661b)
> +	.org	. - (662b-661b) + (664b-663b)
> +.endm
> +
> +#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...)	\
> +	alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
> +
> +#endif  /*  __ASSEMBLY__  */
> +
> +/*
> + * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature));
> + *
> + * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature, CONFIG_FOO));
> + * N.B. If CONFIG_FOO is specified, but not selected, the whole block
> + *      will be omitted, including oldinstr.
> + */
> +#define ALTERNATIVE(oldinstr, newinstr, ...)   \
> +	_ALTERNATIVE_CFG(oldinstr, newinstr, __VA_ARGS__, 1)
> +
> +#else /* !CONFIG_ALTERNATIVE */
> +
> +static inline void apply_alternatives_all(void)
> +{
> +}
> +
> +static inline int apply_alternatives(void *start, size_t lenght)
> +{
> +    return 0;
> +}
> +
> +#endif
> +
> +#endif /* __ASM_ALTERNATIVE_H */
> diff --git a/xen/include/asm-arm/arm64/insn.h b/xen/include/asm-arm/arm64/insn.h
> index cfcdbe9..6ce37be 100644
> --- a/xen/include/asm-arm/arm64/insn.h
> +++ b/xen/include/asm-arm/arm64/insn.h
> @@ -61,6 +61,22 @@ u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>  s32 aarch64_get_branch_offset(u32 insn);
>  u32 aarch64_set_branch_offset(u32 insn, s32 offset);
>  
> +/* Wrapper for common code */
> +static inline bool insn_is_branch_imm(u32 insn)
> +{
> +    return aarch64_insn_is_branch_imm(insn);
> +}
> +
> +static inline s32 insn_get_branch_offset(u32 insn)
> +{
> +    return aarch64_get_branch_offset(insn);
> +}
> +
> +static inline u32 insn_set_branch_offset(u32 insn, s32 offset)
> +{
> +    return aarch64_set_branch_offset(insn, offset);
> +}
> +
>  #endif /* !__ARCH_ARM_ARM64_INSN */
>  /*
>   * Local variables:
> -- 
> 1.9.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching
  2016-07-20 15:25 ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Julien Grall
  2016-07-21  0:32   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patchingo Stefano Stabellini
@ 2016-07-22 14:15   ` Konrad Rzeszutek Wilk
  2016-07-22 15:29     ` Julien Grall
  1 sibling, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 14:15 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

> diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
> new file mode 100644
> index 0000000..d00f98e
> --- /dev/null
> +++ b/xen/arch/arm/alternative.c

Hey!

I've some comments, most of them are light. What I am concerned most
is the difference between comments in the header vs the code. 
Looking at Linux code it seems to have the same issue - and I know you don't
want to differ to much from Linux.

Perhaps go with the changes as is (With the comments being off) and the
then have some patches for Linux for the header mismatches and those
can be backported?

..snip..
> +static int __apply_alternatives(const struct alt_region *region)
> +{
> +    const struct alt_instr *alt;
> +    const u32 *origptr, *replptr;
> +    u32 *writeptr, *writemap;
> +    mfn_t text_mfn = _mfn(virt_to_mfn(_stext));
> +    unsigned int text_order = get_order_from_bytes(_end - _start);
> +
> +    printk("Patching kernel code\n");

I see you use prefixed later with 'alternatives:' and also this is without
any prefix (DEBUG, or ERR). Would it make sense to have this one above
be dprintk while the one below with printk (KERN_ERR) ?
> +
> +    /*
> +     * The text section is read-only. So re-map Xen to be able to patch
> +     * the code.
> +     */
> +    writemap = __vmap(&text_mfn, 1 << text_order, 1, 1, PAGE_HYPERVISOR,
> +                      VMAP_DEFAULT);
> +    if ( !writemap )
> +    {
> +        printk("alternatives: Unable to map the text section\n");

Perhaps include the size of the text section? That may help in figuring
out what went wrong or if one could increase the VMAP area?

> +        return -ENOMEM;
> +    }
> +
> +    for ( alt = region->begin; alt < region->end; alt++ )
> +    {
> +        u32 insn;
> +        int i, nr_inst;
> +
> +        if ( !cpus_have_cap(alt->cpufeature) )
> +            continue;
> +
> +        BUG_ON(alt->alt_len != alt->orig_len);

The header file says:
+	u8  alt_len;		/* size of new instruction(s), <= orig_len */

So this BUG_ON seems incorrect?
But hten later in the header it says:
"1. Be exactly the same length (in bytes) as the default code
+ *    sequence."

So the BUG_ON is correct. Perhaps the 'alt_len' comment should be s/<=/==/ ?

> +
> +        origptr = ALT_ORIG_PTR(alt);
> +        writeptr = origptr - (u32 *)_start + writemap;

How about just using writeptr += ?

Ah wait. You are trying to preserve the Linux code!. Nevermind then.

> +        replptr = ALT_REPL_PTR(alt);
> +
> +        nr_inst = alt->alt_len / sizeof(insn);
> +
> +        for ( i = 0; i < nr_inst; i++ )
> +        {
> +            insn = get_alt_insn(alt, origptr + i, replptr + i);
> +            *(writeptr + i) = cpu_to_le32(insn);
> +        }
> +
> +        /* Ensure the new instructions reached the memory and nuke */
> +        clean_and_invalidate_dcache_va_range(writeptr,
> +                                             (sizeof (*writeptr) * nr_inst));
> +    }
> +
> +    /* Nuke the instruction cache */
> +    invalidate_icache();
> +
> +    vunmap(writemap);
> +
> +    return 0;
> +}
> +
> +/*
> + * We might be patching the stop_machine state machine, so implement a
> + * really simple polling protocol here.
> + */
> +static int __apply_alternatives_multi_stop(void *unused)
> +{
> +    static int patched = 0;

Shouldn't this be 'atomic_t' ?

> +    struct alt_region region = {

Can this 'const' ?
> +        .begin = __alt_instructions,
> +        .end = __alt_instructions_end,
> +    };
> +
> +    /* We always have a CPU 0 at this point (__init) */
> +    if ( smp_processor_id() )
> +    {
> +        while ( !read_atomic(&patched) )
> +            cpu_relax();
> +        isb();
> +    }
> +    else
> +    {
> +        int ret;
> +
> +        BUG_ON(patched);
> +        ret = __apply_alternatives(&region);
> +        /* The patching is not expected to fail during boot. */
> +        BUG_ON(ret != 0);
> +
> +        /* Barriers provided by the cache flushing */
> +        write_atomic(&patched, 1);
> +    }
> +
> +    return 0;
> +}
> +
> +/*
> + * This function should only be called during boot and before CPU0 jump
> + * into the idle_loop.

So could you add an:
> + */
> +void __init apply_alternatives_all(void)
> +{
> +    int ret;
> +
ASSERT(system_state != SYS_STATE_active);
?

> +	/* better not try code patching on a live SMP system */
> +    ret = stop_machine_run(__apply_alternatives_multi_stop, NULL, NR_CPUS);
> +
> +    /* stop_machine_run should never fail at this stage of the boot */
> +    BUG_ON(ret);
> +}
> +
> +int apply_alternatives(void *start, size_t length)
> +{
> +    struct alt_region region = {

Could this be 'const'?

> +        .begin = start,
> +        .end = start + length,
> +    };
> +
> +    return __apply_alternatives(&region);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 46cf6dd..97b3214 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -38,6 +38,7 @@
>  #include <xen/vmap.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/acpi.h>
> +#include <asm/alternative.h>
>  #include <asm/page.h>
>  #include <asm/current.h>
>  #include <asm/setup.h>
> @@ -835,6 +836,12 @@ void __init start_xen(unsigned long boot_phys_offset,
>  
>      do_initcalls();
>  
> +    /*
> +     * It needs to be called after do_initcalls to be able to use
> +     * stop_machine (tasklets initialized via an initcall).
> +     */
> +    apply_alternatives_all();
> +
>      /* Create initial domain 0. */
>      /* The vGIC for DOM0 is exactly emulating the hardware GIC */
>      config.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index b18c9c2..b24e93b 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -151,6 +151,15 @@ SECTIONS
>         *(.initcall1.init)
>         __initcall_end = .;
>  
> +#ifdef CONFIG_ALTERNATIVE
> +       . = ALIGN(4);
> +       __alt_instructions = .;
> +       *(.altinstructions)
> +       __alt_instructions_end = .;
> +       . = ALIGN(4);
> +       *(.altinstr_replacement)
> +#endif
> +
>         *(.init.data)
>         *(.init.data.rel)
>         *(.init.data.rel.*)
> diff --git a/xen/include/asm-arm/alternative.h b/xen/include/asm-arm/alternative.h
> new file mode 100644
> index 0000000..4287bac
> --- /dev/null
> +++ b/xen/include/asm-arm/alternative.h
> @@ -0,0 +1,165 @@
> +#ifndef __ASM_ALTERNATIVE_H
> +#define __ASM_ALTERNATIVE_H
> +
> +#include <asm/cpufeature.h>
> +#include <xen/config.h>
> +#include <xen/kconfig.h>
> +
> +#ifdef CONFIG_ALTERNATIVE
> +
> +#ifndef __ASSEMBLY__
> +
> +#include <xen/init.h>
> +#include <xen/types.h>
> +#include <xen/stringify.h>
> +
> +struct alt_instr {
> +	s32 orig_offset;	/* offset to original instruction */
> +	s32 alt_offset;		/* offset to replacement instruction */
> +	u16 cpufeature;		/* cpufeature bit set for replacement */
> +	u8  orig_len;		/* size of original instruction(s) */
> +	u8  alt_len;		/* size of new instruction(s), <= orig_len */

Perhaps s/<=/==/?

> +};
> +
> +void __init apply_alternatives_all(void);
> +int apply_alternatives(void *start, size_t length);
> +
> +#define ALTINSTR_ENTRY(feature)						      \
> +	" .word 661b - .\n"				/* label           */ \
> +	" .word 663f - .\n"				/* new instruction */ \
> +	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
> +	" .byte 662b-661b\n"				/* source len      */ \
> +	" .byte 664f-663f\n"				/* replacement len */
> +
> +/*
> + * alternative assembly primitive:
> + *
> + * If any of these .org directive fail, it means that insn1 and insn2
> + * don't have the same length. This used to be written as
> + *
> + * .if ((664b-663b) != (662b-661b))
> + * 	.error "Alternatives instruction length mismatch"
> + * .endif
> + *
> + * but most assemblers die if insn1 or insn2 have a .inst. This should
> + * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
> + * containing commit 4e4d08cf7399b606 or c1baaddf8861).
> + */
> +#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
> +	".if "__stringify(cfg_enabled)" == 1\n"				\
> +	"661:\n\t"							\
> +	oldinstr "\n"							\
> +	"662:\n"							\
> +	".pushsection .altinstructions,\"a\"\n"				\
> +	ALTINSTR_ENTRY(feature)						\
> +	".popsection\n"							\
> +	".pushsection .altinstr_replacement, \"a\"\n"			\
> +	"663:\n\t"							\
> +	newinstr "\n"							\
> +	"664:\n\t"							\
> +	".popsection\n\t"						\
> +	".org	. - (664b-663b) + (662b-661b)\n\t"			\
> +	".org	. - (662b-661b) + (664b-663b)\n"			\
> +	".endif\n"
> +
> +#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
> +	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
> +
> +#else
> +
> +#include <asm/asm_defns.h>
> +
> +.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
> +	.word \orig_offset - .
> +	.word \alt_offset - .
> +	.hword \feature
> +	.byte \orig_len
> +	.byte \alt_len
> +.endm
> +
> +.macro alternative_insn insn1, insn2, cap, enable = 1
> +	.if \enable
> +661:	\insn1
> +662:	.pushsection .altinstructions, "a"
> +	altinstruction_entry 661b, 663f, \cap, 662b-661b, 664f-663f
> +	.popsection
> +	.pushsection .altinstr_replacement, "ax"
> +663:	\insn2
> +664:	.popsection
> +	.org	. - (664b-663b) + (662b-661b)
> +	.org	. - (662b-661b) + (664b-663b)
> +	.endif
> +.endm
> +
> +/*
> + * Begin an alternative code sequence.
> + *
> + * The code that follows this macro will be assembled and linked as
> + * normal. There are no restrictions on this code.
> + */
> +.macro alternative_if_not cap, enable = 1
> +	.if \enable
> +	.pushsection .altinstructions, "a"
> +	altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
> +	.popsection
> +661:
> +	.endif
> +.endm
> +
> +/*
> + * Provide the alternative code sequence.
> + *
> + * The code that follows this macro is assembled into a special
> + * section to be used for dynamic patching. Code that follows this
> + * macro must:
> + *
> + * 1. Be exactly the same length (in bytes) as the default code
> + *    sequence.
> + *
> + * 2. Not contain a branch target that is used outside of the
> + *    alternative sequence it is defined in (branches into an
> + *    alternative sequence are not fixed up).

Hm,? But get_alt_insn seems to take care of fixing this up:

 /*
+         * If we're branching inside the alternate sequence,
+         * do not rewrite the instruction, as it is already
+         * correct. Otherwise, generate the new instruction.
+         */


?

> + */
> +.macro alternative_else
> +662:	.pushsection .altinstr_replacement, "ax"
> +663:
> +.endm
> +
> +/*
> + * Complete an alternative code sequence.
> + */
> +.macro alternative_endif
> +664:	.popsection
> +	.org	. - (664b-663b) + (662b-661b)
> +	.org	. - (662b-661b) + (664b-663b)
> +.endm
> +
> +#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...)	\
> +	alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
> +
> +#endif  /*  __ASSEMBLY__  */
> +
> +/*
> + * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature));
> + *
> + * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature, CONFIG_FOO));
> + * N.B. If CONFIG_FOO is specified, but not selected, the whole block
> + *      will be omitted, including oldinstr.
> + */
> +#define ALTERNATIVE(oldinstr, newinstr, ...)   \
> +	_ALTERNATIVE_CFG(oldinstr, newinstr, __VA_ARGS__, 1)
> +
> +#else /* !CONFIG_ALTERNATIVE */
> +
> +static inline void apply_alternatives_all(void)
> +{
> +}
> +
> +static inline int apply_alternatives(void *start, size_t lenght)
> +{
> +    return 0;
> +}
> +
> +#endif
> +
> +#endif /* __ASM_ALTERNATIVE_H */

This being a new file perhaps add:
*                                                                              
 * Local variables:                                                             
 * mode: C                                                                      
 * c-file-style: "BSD"                                                          
 * c-basic-offset: 4                                                            
 * indent-tabs-mode: nil                                                        
 * End:                                                                         
 */           
?

> diff --git a/xen/include/asm-arm/arm64/insn.h b/xen/include/asm-arm/arm64/insn.h
> index cfcdbe9..6ce37be 100644
> --- a/xen/include/asm-arm/arm64/insn.h
> +++ b/xen/include/asm-arm/arm64/insn.h
> @@ -61,6 +61,22 @@ u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
>  s32 aarch64_get_branch_offset(u32 insn);
>  u32 aarch64_set_branch_offset(u32 insn, s32 offset);
>  
> +/* Wrapper for common code */
> +static inline bool insn_is_branch_imm(u32 insn)
> +{
> +    return aarch64_insn_is_branch_imm(insn);
> +}
> +
> +static inline s32 insn_get_branch_offset(u32 insn)
> +{
> +    return aarch64_get_branch_offset(insn);
> +}
> +
> +static inline u32 insn_set_branch_offset(u32 insn, s32 offset)
> +{
> +    return aarch64_set_branch_offset(insn, offset);
> +}
> +
>  #endif /* !__ARCH_ARM_ARM64_INSN */
>  /*
>   * Local variables:
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported
  2016-07-20 15:25 ` [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported Julien Grall
@ 2016-07-22 14:18   ` Konrad Rzeszutek Wilk
  2016-07-22 15:31     ` Julien Grall
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 14:18 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

On Wed, Jul 20, 2016 at 04:25:55PM +0100, Julien Grall wrote:
> The CPU capabilities will be set depending on the value found in the CPU
> registers. This patch provides a generic to go through a set of capabilities
> and find which one should be enabled.
> 
> The parameter "info" is used to display the kind of capability updated (e.g
> workaround, feature...).
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> with just one minor
suggestion:

> 
> ---
>     Changes in v4:
>         - Add Stefano's acked-by
> 
>     Changes in v3:
>          - Patch added. The code was previously part of "Detect
>          silicon...".
> ---
>  xen/arch/arm/cpufeature.c        | 16 ++++++++++++++++
>  xen/include/asm-arm/cpufeature.h |  9 +++++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> index 7a1b56b..088625b 100644
> --- a/xen/arch/arm/cpufeature.c
> +++ b/xen/arch/arm/cpufeature.c
> @@ -24,6 +24,22 @@
>  
>  DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
>  
> +void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> +                             const char *info)
> +{
> +    int i;
> +
> +    for ( i = 0; caps[i].matches; i++ )
> +    {
> +        if ( !caps[i].matches(&caps[i]) )

So what if the 'struct arm_cpu_capabilitues' has '->matches' set to
NULL?

Perhaps:

	if ( !caps[i].matches || !caps[i]... ?
> +            continue;
> +
> +        if ( !cpus_have_cap(caps[i].capability) && caps[i].desc )
> +            printk(XENLOG_INFO "%s: %s\n", info, caps[i].desc);
> +        cpus_set_cap(caps[i].capability);
> +    }
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
> index 2bebad1..be2414c 100644
> --- a/xen/include/asm-arm/cpufeature.h
> +++ b/xen/include/asm-arm/cpufeature.h
> @@ -62,6 +62,15 @@ static inline void cpus_set_cap(unsigned int num)
>          __set_bit(num, cpu_hwcaps);
>  }
>  
> +struct arm_cpu_capabilities {
> +    const char *desc;
> +    u16 capability;
> +    bool_t (*matches)(const struct arm_cpu_capabilities *);
> +};
> +
> +void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> +                             const char *info);
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly
  2016-07-20 15:25 ` [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly Julien Grall
@ 2016-07-22 14:24   ` Konrad Rzeszutek Wilk
  2016-07-27 16:09     ` Julien Grall
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 14:24 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

On Wed, Jul 20, 2016 at 04:25:56PM +0100, Julien Grall wrote:
> After each CPU has been started, we iterate through a list of CPU
> errata to detect CPUs which need from hypervisor code patches.
> 
> For each bug there is a function which check if that a particular CPU is
s/check/checks/
> affected. This needs to be done on every CPUs to cover heterogenous
s/CPUs/CPU/
> system properly.
s/system/systems/ ? (not sure)
> 
> If a certain erratum has been detected, the capability bit will be set.
> In the case the erratum requires code patching, this will be triggered
> by the call to apply_alternatives.
s/the///

> 
> The code is based on the file arch/arm64/kernel/cpu_errata.c in Linux
> v4.6-rc3.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
> 
> ---
>     Changes in v4:
>         - Add missing emacs magic blocks
>         - Add Stefano's acked-by
> 
>     Changes in v3:
>         - Move update_cpu_capabilities in a separate patch
>         - Update the commit message to mention that workaround may
>         not require code patching.
> 
>     Changes in v2:
>         - Use XENLOG_INFO for the loglevel of the message
> ---
>  xen/arch/arm/Makefile            |  1 +
>  xen/arch/arm/cpuerrata.c         | 34 ++++++++++++++++++++++++++++++++++
>  xen/arch/arm/setup.c             |  3 +++
>  xen/arch/arm/smpboot.c           |  3 +++
>  xen/include/asm-arm/cpuerrata.h  | 14 ++++++++++++++
>  xen/include/asm-arm/cpufeature.h |  6 ++++++
>  6 files changed, 61 insertions(+)
>  create mode 100644 xen/arch/arm/cpuerrata.c
>  create mode 100644 xen/include/asm-arm/cpuerrata.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 74bd7b8..23aaf52 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -7,6 +7,7 @@ subdir-$(CONFIG_ACPI) += acpi
>  obj-$(CONFIG_ALTERNATIVE) += alternative.o
>  obj-y += bootfdt.o
>  obj-y += cpu.o
> +obj-y += cpuerrata.o
>  obj-y += cpufeature.o
>  obj-y += decode.o
>  obj-y += device.o
> diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
> new file mode 100644
> index 0000000..03ae7b4
> --- /dev/null
> +++ b/xen/arch/arm/cpuerrata.c
> @@ -0,0 +1,34 @@
> +#include <xen/config.h>
> +#include <asm/cpufeature.h>
> +#include <asm/cpuerrata.h>
> +
> +#define MIDR_RANGE(model, min, max)     \
> +    .matches = is_affected_midr_range,  \
> +    .midr_model = model,                \
> +    .midr_range_min = min,              \
> +    .midr_range_max = max
> +
> +static bool_t __maybe_unused
> +is_affected_midr_range(const struct arm_cpu_capabilities *entry)
> +{
> +    return MIDR_IS_CPU_MODEL_RANGE(boot_cpu_data.midr.bits, entry->midr_model,
> +                                   entry->midr_range_min,
> +                                   entry->midr_range_max);
> +}
> +
> +static const struct arm_cpu_capabilities arm_errata[] = {
> +    {},
> +};
> +
> +void check_local_cpu_errata(void)
> +{
> +    update_cpu_capabilities(arm_errata, "enabled workaround for");
> +}
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 97b3214..38eb888 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -43,6 +43,7 @@
>  #include <asm/current.h>
>  #include <asm/setup.h>
>  #include <asm/gic.h>
> +#include <asm/cpuerrata.h>
>  #include <asm/cpufeature.h>
>  #include <asm/platform.h>
>  #include <asm/procinfo.h>
> @@ -171,6 +172,8 @@ static void __init processor_id(void)
>      }
>  
>      processor_setup();
> +
> +    check_local_cpu_errata();
>  }
>  
>  void dt_unreserved_regions(paddr_t s, paddr_t e,
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 3a962f7..d56b91d 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -29,6 +29,7 @@
>  #include <xen/timer.h>
>  #include <xen/irq.h>
>  #include <xen/console.h>
> +#include <asm/cpuerrata.h>
>  #include <asm/gic.h>
>  #include <asm/psci.h>
>  #include <asm/acpi.h>
> @@ -316,6 +317,8 @@ void start_secondary(unsigned long boot_phys_offset,
>      local_irq_enable();
>      local_abort_enable();
>  
> +    check_local_cpu_errata();
> +
>      printk(XENLOG_DEBUG "CPU %u booted.\n", smp_processor_id());
>  
>      startup_cpu_idle_loop();
> diff --git a/xen/include/asm-arm/cpuerrata.h b/xen/include/asm-arm/cpuerrata.h
> new file mode 100644
> index 0000000..fe93beb
> --- /dev/null
> +++ b/xen/include/asm-arm/cpuerrata.h
> @@ -0,0 +1,14 @@
> +#ifndef __ARM_CPUERRATA_H

Sorry about having engaged the pedantic review mode, but this caught my
eye.

I thought the style was to prefix it with __ and also postfix it with __:

$ grep "__" *.h
decode.h:#ifndef __ARCH_ARM_DECODE_H_
decode.h:#define __ARCH_ARM_DECODE_H_
decode.h:#endif /* __ARCH_ARM_DECODE_H_ */
kernel.h:#ifndef __ARCH_ARM_KERNEL_H__
kernel.h:#define __ARCH_ARM_KERNEL_H__
kernel.h:#endif /* #ifdef __ARCH_ARM_KERNEL_H__ */
vtimer.h:#ifndef __ARCH_ARM_VTIMER_H__
vtimer.h:#define __ARCH_ARM_VTIMER_H__
vuart.h:#ifndef __ARCH_ARM_VUART_H__
vuart.h:#define __ARCH_ARM_VUART_H__
vuart.h:#endif /* __ARCH_ARM_VUART_H__ */

And the include/asm also has have this in in surplus.

It really is minor and I am sorry for even picking up on this, but
could you add the __ at the end?

> +#define __ARM_CPUERRATA_H
> +
> +void check_local_cpu_errata(void);
> +
> +#endif /* __ARM_CPUERRATA_H */
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
> index be2414c..fb57295 100644
> --- a/xen/include/asm-arm/cpufeature.h
> +++ b/xen/include/asm-arm/cpufeature.h
> @@ -66,6 +66,12 @@ struct arm_cpu_capabilities {
>      const char *desc;
>      u16 capability;
>      bool_t (*matches)(const struct arm_cpu_capabilities *);
> +    union {
> +        struct {    /* To be used for eratum handling only */
> +            u32 midr_model;
> +            u32 midr_range_min, midr_range_max;
> +        };
> +    };
>  };
>  
>  void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen
  2016-07-20 15:25 ` [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen Julien Grall
@ 2016-07-22 14:25   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 14:25 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

On Wed, Jul 20, 2016 at 04:25:57PM +0100, Julien Grall wrote:
> The new document will help to keep track of each erratum Xen is able to
> handle.
> 
> The text is based on the Linux doc in Documents/arm64/silicon-errata.txt.
> 
> Also list the current errata that Xen is aware of.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> 
> ---
>     Changes in v4:
>         - Fix grammar in the document : s/by ARM64/on ARM64/
>         - Add Stefano's acked-by
> 
>     Changes in v3:
>         - Fix grammar in the commit message
>         - s/Linux/Xen/
>         - Mention that runtime patching is only supported by ARM64
> ---
>  docs/misc/arm/silicon-errata.txt | 45 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
>  create mode 100644 docs/misc/arm/silicon-errata.txt
> 
> diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
> new file mode 100644
> index 0000000..5d54694
> --- /dev/null
> +++ b/docs/misc/arm/silicon-errata.txt
> @@ -0,0 +1,45 @@
> +                Silicon Errata and Software Workarounds
> +                =======================================
> +
> +It is an unfortunate fact of life that hardware is often produced with
> +so-called "errata", which can cause it to deviate from the architecture
> +under specific circumstances.  For hardware produced by ARM, these
> +errata are broadly classified into the following categories:
> +
> +  Category A: A critical error without a viable workaround.
> +  Category B: A significant or critical error with an acceptable
> +              workaround.
> +  Category C: A minor error that is not expected to occur under normal
> +              operation.
> +
> +For more information, consult one of the "Software Developers Errata
> +Notice" documents available on infocenter.arm.com (registration
> +required).
> +
> +As far as Xen is concerned, Category B errata may require some special
> +treatment in the hypervisor. For example, avoiding a particular sequence
> +of code, or configuring the processor in a particular way. A less common
> +situation may require similar actions in order to declassify a Category A
> +erratum into a Category C erratum. These are collectively known as
> +"software workarounds" and are only required in the minority of cases
> +(e.g. those cases that both require a non-secure workaround *and* can
> +be triggered by Xen).
> +
> +For software workarounds that may adversely impact systems unaffected by
> +the erratum in question, a Kconfig entry is added under "ARM errata
> +workarounds via the alternatives framework". These are enabled by default
> +and patched in at runtime when an affected CPU is detected. Note that
> +runtime patching is only supported on ARM64. For less-intrusive workarounds,
> +a Kconfig option is not available and the code is structured (preferably
> +with a comment) in such a way that the erratum will not be hit.
> +
> +This approach can make it slightly onerous to determine exactly which
> +errata are worked around in an arbitrary hypervisor source tree, so this
> +file acts as a registry of software workarounds in the Xen hypervisor and
> +will be updated when new workarounds are committed and backported to
> +stable hypervisors.
> +
> +| Implementor    | Component       | Erratum ID      | Kconfig                 |
> ++----------------+-----------------+-----------------+-------------------------+
> +| ARM            | Cortex-A15      | #766422         | N/A                     |
> +| ARM            | Cortex-A57      | #852523         | N/A                     |
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64
  2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
                   ` (6 preceding siblings ...)
  2016-07-20 15:26 ` [PATCH v5 7/7] xen/arm: traps: Don't inject a fault if the translation VA -> IPA fails Julien Grall
@ 2016-07-22 14:29 ` Konrad Rzeszutek Wilk
  7 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 14:29 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

On Wed, Jul 20, 2016 at 04:25:53PM +0100, Julien Grall wrote:
> Hello,
> 
> Some of the processor errata will require to modify code sequence. As those
> modifications may impact the performance, they should only be enabled on
> affected cores. Furthermore, Xen may also want to take advantage of
> new hardware features coming up with v8.1 and v8.2.
> 
> The first part of the series adds the alternative infrastructure, most of
> the code is based on Linux v4.6-rc3. The rest of the series implements
> errata for Cortex-A57 and Cortex-A53.
> 
> A branch with all the patches can be found here:
> 
> git://xenbits.xen.org/people/julieng/xen-unstable.git branch alternative-v5
> 
> For all the changes, see in each patch.

Hey!

I took a look at the patches. Only had some brief comments. I didn't review
the last three patches as I am not that familiar with the instructions.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching
  2016-07-22 14:15   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Konrad Rzeszutek Wilk
@ 2016-07-22 15:29     ` Julien Grall
  2016-07-22 15:38       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2016-07-22 15:29 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, sstabellini, Steve Capper, Wei Chen, xen-devel



On 22/07/16 15:15, Konrad Rzeszutek Wilk wrote:
>> diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
>> new file mode 100644
>> index 0000000..d00f98e
>> --- /dev/null
>> +++ b/xen/arch/arm/alternative.c
>
> Hey!

Hi Konrad,

> I've some comments, most of them are light. What I am concerned most
> is the difference between comments in the header vs the code.
> Looking at Linux code it seems to have the same issue - and I know you don't
> want to differ to much from Linux.

Well, I had to differ from Linux because the code was ARM64 specific 
(i.e renaming function and moving code around to potentially allow ARM32 
support).

So I don't mind to fix them here and...


>
> Perhaps go with the changes as is (With the comments being off) and the
> then have some patches for Linux for the header mismatches and those
> can be backported?

we can think about Linux changes and resync the code later.

>
> ..snip..
>> +static int __apply_alternatives(const struct alt_region *region)
>> +{
>> +    const struct alt_instr *alt;
>> +    const u32 *origptr, *replptr;
>> +    u32 *writeptr, *writemap;
>> +    mfn_t text_mfn = _mfn(virt_to_mfn(_stext));
>> +    unsigned int text_order = get_order_from_bytes(_end - _start);
>> +
>> +    printk("Patching kernel code\n");
>
> I see you use prefixed later with 'alternatives:' and also this is without
> any prefix (DEBUG, or ERR). Would it make sense to have this one above
> be dprintk while the one below with printk (KERN_ERR) ?

I think this one is useful in non-debug build to know that we are 
patching the code. So if it crash, we know where we are.

>> +
>> +    /*
>> +     * The text section is read-only. So re-map Xen to be able to patch
>> +     * the code.
>> +     */
>> +    writemap = __vmap(&text_mfn, 1 << text_order, 1, 1, PAGE_HYPERVISOR,
>> +                      VMAP_DEFAULT);
>> +    if ( !writemap )
>> +    {
>> +        printk("alternatives: Unable to map the text section\n");
>
> Perhaps include the size of the text section? That may help in figuring
> out what went wrong or if one could increase the VMAP area?

Good point. I will do it in the next version.

>
>> +        return -ENOMEM;
>> +    }
>> +
>> +    for ( alt = region->begin; alt < region->end; alt++ )
>> +    {
>> +        u32 insn;
>> +        int i, nr_inst;
>> +
>> +        if ( !cpus_have_cap(alt->cpufeature) )
>> +            continue;
>> +
>> +        BUG_ON(alt->alt_len != alt->orig_len);
>
> The header file says:
> +	u8  alt_len;		/* size of new instruction(s), <= orig_len */
>
> So this BUG_ON seems incorrect?
> But hten later in the header it says:
> "1. Be exactly the same length (in bytes) as the default code
> + *    sequence."
>
> So the BUG_ON is correct. Perhaps the 'alt_len' comment should be s/<=/==/ ?

Good point. I suspect the plan was to support sequence with different 
length but it has been abandoned later on (CC Andre to confirm).

>
>> +
>> +        origptr = ALT_ORIG_PTR(alt);
>> +        writeptr = origptr - (u32 *)_start + writemap;
>
> How about just using writeptr += ?

I am not sure about your suggestion here. Regardless the Linux code, the 
origptr will not follow a pattern at each iteration. So we have to 
recompute it everytime.

>
> Ah wait. You are trying to preserve the Linux code!. Nevermind then.
>
>> +        replptr = ALT_REPL_PTR(alt);
>> +
>> +        nr_inst = alt->alt_len / sizeof(insn);
>> +
>> +        for ( i = 0; i < nr_inst; i++ )
>> +        {
>> +            insn = get_alt_insn(alt, origptr + i, replptr + i);
>> +            *(writeptr + i) = cpu_to_le32(insn);
>> +        }
>> +
>> +        /* Ensure the new instructions reached the memory and nuke */
>> +        clean_and_invalidate_dcache_va_range(writeptr,
>> +                                             (sizeof (*writeptr) * nr_inst));
>> +    }
>> +
>> +    /* Nuke the instruction cache */
>> +    invalidate_icache();
>> +
>> +    vunmap(writemap);
>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * We might be patching the stop_machine state machine, so implement a
>> + * really simple polling protocol here.
>> + */
>> +static int __apply_alternatives_multi_stop(void *unused)
>> +{
>> +    static int patched = 0;
>
> Shouldn't this be 'atomic_t' ?

Does it matter? From my understanding the code will behave the same.

>> +    struct alt_region region = {
>
> Can this 'const' ?

It could, I just followed Linux but as we differ I will move to const.

>> +        .begin = __alt_instructions,
>> +        .end = __alt_instructions_end,
>> +    };
>> +
>> +    /* We always have a CPU 0 at this point (__init) */
>> +    if ( smp_processor_id() )
>> +    {
>> +        while ( !read_atomic(&patched) )
>> +            cpu_relax();
>> +        isb();
>> +    }
>> +    else
>> +    {
>> +        int ret;
>> +
>> +        BUG_ON(patched);
>> +        ret = __apply_alternatives(&region);
>> +        /* The patching is not expected to fail during boot. */
>> +        BUG_ON(ret != 0);
>> +
>> +        /* Barriers provided by the cache flushing */
>> +        write_atomic(&patched, 1);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * This function should only be called during boot and before CPU0 jump
>> + * into the idle_loop.
>
> So could you add an:
>> + */
>> +void __init apply_alternatives_all(void)
>> +{
>> +    int ret;
>> +
> ASSERT(system_state != SYS_STATE_active);
> ?

Will do.

>> +	/* better not try code patching on a live SMP system */
>> +    ret = stop_machine_run(__apply_alternatives_multi_stop, NULL, NR_CPUS);
>> +
>> +    /* stop_machine_run should never fail at this stage of the boot */
>> +    BUG_ON(ret);
>> +}
>> +
>> +int apply_alternatives(void *start, size_t length)
>> +{
>> +    struct alt_region region = {
>
> Could this be 'const'?

Will do.

>
>> +        .begin = start,
>> +        .end = start + length,
>> +    };
>> +
>> +    return __apply_alternatives(&region);
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */

[...]

>> diff --git a/xen/include/asm-arm/alternative.h b/xen/include/asm-arm/alternative.h
>> new file mode 100644
>> index 0000000..4287bac
>> --- /dev/null
>> +++ b/xen/include/asm-arm/alternative.h
>> @@ -0,0 +1,165 @@
>> +#ifndef __ASM_ALTERNATIVE_H
>> +#define __ASM_ALTERNATIVE_H
>> +
>> +#include <asm/cpufeature.h>
>> +#include <xen/config.h>
>> +#include <xen/kconfig.h>
>> +
>> +#ifdef CONFIG_ALTERNATIVE
>> +
>> +#ifndef __ASSEMBLY__
>> +
>> +#include <xen/init.h>
>> +#include <xen/types.h>
>> +#include <xen/stringify.h>
>> +
>> +struct alt_instr {
>> +	s32 orig_offset;	/* offset to original instruction */
>> +	s32 alt_offset;		/* offset to replacement instruction */
>> +	u16 cpufeature;		/* cpufeature bit set for replacement */
>> +	u8  orig_len;		/* size of original instruction(s) */
>> +	u8  alt_len;		/* size of new instruction(s), <= orig_len */
>
> Perhaps s/<=/==/?

See my comment above.

>
>> +};
>> +
>> +void __init apply_alternatives_all(void);
>> +int apply_alternatives(void *start, size_t length);
>> +
>> +#define ALTINSTR_ENTRY(feature)						      \
>> +	" .word 661b - .\n"				/* label           */ \
>> +	" .word 663f - .\n"				/* new instruction */ \
>> +	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
>> +	" .byte 662b-661b\n"				/* source len      */ \
>> +	" .byte 664f-663f\n"				/* replacement len */
>> +
>> +/*
>> + * alternative assembly primitive:
>> + *
>> + * If any of these .org directive fail, it means that insn1 and insn2
>> + * don't have the same length. This used to be written as
>> + *
>> + * .if ((664b-663b) != (662b-661b))
>> + * 	.error "Alternatives instruction length mismatch"
>> + * .endif
>> + *
>> + * but most assemblers die if insn1 or insn2 have a .inst. This should
>> + * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
>> + * containing commit 4e4d08cf7399b606 or c1baaddf8861).
>> + */
>> +#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
>> +	".if "__stringify(cfg_enabled)" == 1\n"				\
>> +	"661:\n\t"							\
>> +	oldinstr "\n"							\
>> +	"662:\n"							\
>> +	".pushsection .altinstructions,\"a\"\n"				\
>> +	ALTINSTR_ENTRY(feature)						\
>> +	".popsection\n"							\
>> +	".pushsection .altinstr_replacement, \"a\"\n"			\
>> +	"663:\n\t"							\
>> +	newinstr "\n"							\
>> +	"664:\n\t"							\
>> +	".popsection\n\t"						\
>> +	".org	. - (664b-663b) + (662b-661b)\n\t"			\
>> +	".org	. - (662b-661b) + (664b-663b)\n"			\
>> +	".endif\n"
>> +
>> +#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
>> +	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
>> +
>> +#else
>> +
>> +#include <asm/asm_defns.h>
>> +
>> +.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
>> +	.word \orig_offset - .
>> +	.word \alt_offset - .
>> +	.hword \feature
>> +	.byte \orig_len
>> +	.byte \alt_len
>> +.endm
>> +
>> +.macro alternative_insn insn1, insn2, cap, enable = 1
>> +	.if \enable
>> +661:	\insn1
>> +662:	.pushsection .altinstructions, "a"
>> +	altinstruction_entry 661b, 663f, \cap, 662b-661b, 664f-663f
>> +	.popsection
>> +	.pushsection .altinstr_replacement, "ax"
>> +663:	\insn2
>> +664:	.popsection
>> +	.org	. - (664b-663b) + (662b-661b)
>> +	.org	. - (662b-661b) + (664b-663b)
>> +	.endif
>> +.endm
>> +
>> +/*
>> + * Begin an alternative code sequence.
>> + *
>> + * The code that follows this macro will be assembled and linked as
>> + * normal. There are no restrictions on this code.
>> + */
>> +.macro alternative_if_not cap, enable = 1
>> +	.if \enable
>> +	.pushsection .altinstructions, "a"
>> +	altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
>> +	.popsection
>> +661:
>> +	.endif
>> +.endm
>> +
>> +/*
>> + * Provide the alternative code sequence.
>> + *
>> + * The code that follows this macro is assembled into a special
>> + * section to be used for dynamic patching. Code that follows this
>> + * macro must:
>> + *
>> + * 1. Be exactly the same length (in bytes) as the default code
>> + *    sequence.
>> + *
>> + * 2. Not contain a branch target that is used outside of the
>> + *    alternative sequence it is defined in (branches into an
>> + *    alternative sequence are not fixed up).
>
> Hm,? But get_alt_insn seems to take care of fixing this up:

I am not sure about this one. I cannot find why there is this 
restriction (CC some ARM folks to have their input).

>
>  /*
> +         * If we're branching inside the alternate sequence,
> +         * do not rewrite the instruction, as it is already
> +         * correct. Otherwise, generate the new instruction.
> +         */
>
>
> ?
>
>> + */
>> +.macro alternative_else
>> +662:	.pushsection .altinstr_replacement, "ax"
>> +663:
>> +.endm
>> +
>> +/*
>> + * Complete an alternative code sequence.
>> + */
>> +.macro alternative_endif
>> +664:	.popsection
>> +	.org	. - (664b-663b) + (662b-661b)
>> +	.org	. - (662b-661b) + (664b-663b)
>> +.endm
>> +
>> +#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...)	\
>> +	alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
>> +
>> +#endif  /*  __ASSEMBLY__  */
>> +
>> +/*
>> + * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature));
>> + *
>> + * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature, CONFIG_FOO));
>> + * N.B. If CONFIG_FOO is specified, but not selected, the whole block
>> + *      will be omitted, including oldinstr.
>> + */
>> +#define ALTERNATIVE(oldinstr, newinstr, ...)   \
>> +	_ALTERNATIVE_CFG(oldinstr, newinstr, __VA_ARGS__, 1)
>> +
>> +#else /* !CONFIG_ALTERNATIVE */
>> +
>> +static inline void apply_alternatives_all(void)
>> +{
>> +}
>> +
>> +static inline int apply_alternatives(void *start, size_t lenght)
>> +{
>> +    return 0;
>> +}
>> +
>> +#endif
>> +
>> +#endif /* __ASM_ALTERNATIVE_H */
>
> This being a new file perhaps add:
> *
>  * Local variables:
>  * mode: C
>  * c-file-style: "BSD"
>  * c-basic-offset: 4
>  * indent-tabs-mode: nil
>  * End:
>  */
> ?

It is a Linux file with Linux coding style. I would need to look what 
should be the emacs magic block here.

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported
  2016-07-22 14:18   ` Konrad Rzeszutek Wilk
@ 2016-07-22 15:31     ` Julien Grall
  2016-07-22 15:39       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Julien Grall @ 2016-07-22 15:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: sstabellini, xen-devel

Hi Konrad,

On 22/07/16 15:18, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 20, 2016 at 04:25:55PM +0100, Julien Grall wrote:
>> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
>> index 7a1b56b..088625b 100644
>> --- a/xen/arch/arm/cpufeature.c
>> +++ b/xen/arch/arm/cpufeature.c
>> @@ -24,6 +24,22 @@
>>
>>  DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
>>
>> +void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
>> +                             const char *info)
>> +{
>> +    int i;
>> +
>> +    for ( i = 0; caps[i].matches; i++ )
>> +    {
>> +        if ( !caps[i].matches(&caps[i]) )
>
> So what if the 'struct arm_cpu_capabilitues' has '->matches' set to
> NULL?

It is the exit condition of the loop:

  for ( i = 0; caps[i].matches; i++ )

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching
  2016-07-22 15:29     ` Julien Grall
@ 2016-07-22 15:38       ` Konrad Rzeszutek Wilk
  2016-07-22 15:47         ` Julien Grall
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 15:38 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andre Przywara, sstabellini, Steve Capper, Wei Chen, xen-devel

> >>+
> >>+        origptr = ALT_ORIG_PTR(alt);
> >>+        writeptr = origptr - (u32 *)_start + writemap;
> >
> >How about just using writeptr += ?
> 
> I am not sure about your suggestion here. Regardless the Linux code, the
> origptr will not follow a pattern at each iteration. So we have to recompute
> it everytime.

Just ignore that. I somehow equated writeptr to writemp.

..snip..
> 
> >
> >Ah wait. You are trying to preserve the Linux code!. Nevermind then.
> >
> >>+        replptr = ALT_REPL_PTR(alt);
> >>+
> >>+        nr_inst = alt->alt_len / sizeof(insn);
> >>+
> >>+        for ( i = 0; i < nr_inst; i++ )
> >>+        {
> >>+            insn = get_alt_insn(alt, origptr + i, replptr + i);
> >>+            *(writeptr + i) = cpu_to_le32(insn);
> >>+        }
> >>+
> >>+        /* Ensure the new instructions reached the memory and nuke */
> >>+        clean_and_invalidate_dcache_va_range(writeptr,
> >>+                                             (sizeof (*writeptr) * nr_inst));
> >>+    }
> >>+
> >>+    /* Nuke the instruction cache */
> >>+    invalidate_icache();
> >>+
> >>+    vunmap(writemap);
> >>+
> >>+    return 0;
> >>+}
> >>+
> >>+/*
> >>+ * We might be patching the stop_machine state machine, so implement a
> >>+ * really simple polling protocol here.
> >>+ */
> >>+static int __apply_alternatives_multi_stop(void *unused)
> >>+{
> >>+    static int patched = 0;
> >
> >Shouldn't this be 'atomic_t' ?
> 
> Does it matter? From my understanding the code will behave the same.

Not at all under the hood.
 But I see 'atomic_write' and they all operate on the 'atomic_t' .. hence
the query. 

..snip.
> >This being a new file perhaps add:
> >*
> > * Local variables:
> > * mode: C
> > * c-file-style: "BSD"
> > * c-basic-offset: 4
> > * indent-tabs-mode: nil
> > * End:
> > */
> >?
> 
> It is a Linux file with Linux coding style. I would need to look what should
> be the emacs magic block here.

Right. I just meant that you needed the magic block.
> 
> Regards,
> 
> -- 
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported
  2016-07-22 15:31     ` Julien Grall
@ 2016-07-22 15:39       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-07-22 15:39 UTC (permalink / raw)
  To: Julien Grall; +Cc: sstabellini, xen-devel

On Fri, Jul 22, 2016 at 04:31:13PM +0100, Julien Grall wrote:
> Hi Konrad,
> 
> On 22/07/16 15:18, Konrad Rzeszutek Wilk wrote:
> >On Wed, Jul 20, 2016 at 04:25:55PM +0100, Julien Grall wrote:
> >>diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> >>index 7a1b56b..088625b 100644
> >>--- a/xen/arch/arm/cpufeature.c
> >>+++ b/xen/arch/arm/cpufeature.c
> >>@@ -24,6 +24,22 @@
> >>
> >> DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
> >>
> >>+void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> >>+                             const char *info)
> >>+{
> >>+    int i;
> >>+
> >>+    for ( i = 0; caps[i].matches; i++ )
> >>+    {
> >>+        if ( !caps[i].matches(&caps[i]) )
> >
> >So what if the 'struct arm_cpu_capabilitues' has '->matches' set to
> >NULL?
> 
> It is the exit condition of the loop:
> 
>  for ( i = 0; caps[i].matches; i++ )

Duh!

Ignore my query then please.
> 
> Regards,
> 
> -- 
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching
  2016-07-22 15:38       ` Konrad Rzeszutek Wilk
@ 2016-07-22 15:47         ` Julien Grall
  0 siblings, 0 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-22 15:47 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, sstabellini, Steve Capper, Wei Chen, xen-devel



On 22/07/16 16:38, Konrad Rzeszutek Wilk wrote:
>>>> +
>>>> +        origptr = ALT_ORIG_PTR(alt);
>>>> +        writeptr = origptr - (u32 *)_start + writemap;
>>>
>>> How about just using writeptr += ?
>>
>> I am not sure about your suggestion here. Regardless the Linux code, the
>> origptr will not follow a pattern at each iteration. So we have to recompute
>> it everytime.
>
> Just ignore that. I somehow equated writeptr to writemp.
>
> ..snip..
>>
>>>
>>> Ah wait. You are trying to preserve the Linux code!. Nevermind then.
>>>
>>>> +        replptr = ALT_REPL_PTR(alt);
>>>> +
>>>> +        nr_inst = alt->alt_len / sizeof(insn);
>>>> +
>>>> +        for ( i = 0; i < nr_inst; i++ )
>>>> +        {
>>>> +            insn = get_alt_insn(alt, origptr + i, replptr + i);
>>>> +            *(writeptr + i) = cpu_to_le32(insn);
>>>> +        }
>>>> +
>>>> +        /* Ensure the new instructions reached the memory and nuke */
>>>> +        clean_and_invalidate_dcache_va_range(writeptr,
>>>> +                                             (sizeof (*writeptr) * nr_inst));
>>>> +    }
>>>> +
>>>> +    /* Nuke the instruction cache */
>>>> +    invalidate_icache();
>>>> +
>>>> +    vunmap(writemap);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +/*
>>>> + * We might be patching the stop_machine state machine, so implement a
>>>> + * really simple polling protocol here.
>>>> + */
>>>> +static int __apply_alternatives_multi_stop(void *unused)
>>>> +{
>>>> +    static int patched = 0;
>>>
>>> Shouldn't this be 'atomic_t' ?
>>
>> Does it matter? From my understanding the code will behave the same.
>
> Not at all under the hood.
>  But I see 'atomic_write' and they all operate on the 'atomic_t' .. hence
> the query.

The code is using *_atomic (and not atomic_*) which operate on any type. 
I know the naming is really confusing.

I would like to rename write_atomic and read_atomic to WRITE_ONCE and 
READ_ONCE.

>
> ..snip.
>>> This being a new file perhaps add:
>>> *
>>> * Local variables:
>>> * mode: C
>>> * c-file-style: "BSD"
>>> * c-basic-offset: 4
>>> * indent-tabs-mode: nil
>>> * End:
>>> */
>>> ?
>>
>> It is a Linux file with Linux coding style. I would need to look what should
>> be the emacs magic block here.
>
> Right. I just meant that you needed the magic block.

I don't think we ever put magic block on Linux file. I will try to find 
out and update the patch.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly
  2016-07-22 14:24   ` Konrad Rzeszutek Wilk
@ 2016-07-27 16:09     ` Julien Grall
  0 siblings, 0 replies; 20+ messages in thread
From: Julien Grall @ 2016-07-27 16:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: sstabellini, xen-devel

Hi Konrad,

On 22/07/16 15:24, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 20, 2016 at 04:25:56PM +0100, Julien Grall wrote:
>> diff --git a/xen/include/asm-arm/cpuerrata.h b/xen/include/asm-arm/cpuerrata.h
>> new file mode 100644
>> index 0000000..fe93beb
>> --- /dev/null
>> +++ b/xen/include/asm-arm/cpuerrata.h
>> @@ -0,0 +1,14 @@
>> +#ifndef __ARM_CPUERRATA_H
>
> Sorry about having engaged the pedantic review mode, but this caught my
> eye.
>
> I thought the style was to prefix it with __ and also postfix it with __:
>
> $ grep "__" *.h
> decode.h:#ifndef __ARCH_ARM_DECODE_H_
> decode.h:#define __ARCH_ARM_DECODE_H_
> decode.h:#endif /* __ARCH_ARM_DECODE_H_ */
> kernel.h:#ifndef __ARCH_ARM_KERNEL_H__
> kernel.h:#define __ARCH_ARM_KERNEL_H__
> kernel.h:#endif /* #ifdef __ARCH_ARM_KERNEL_H__ */
> vtimer.h:#ifndef __ARCH_ARM_VTIMER_H__
> vtimer.h:#define __ARCH_ARM_VTIMER_H__
> vuart.h:#ifndef __ARCH_ARM_VUART_H__
> vuart.h:#define __ARCH_ARM_VUART_H__
> vuart.h:#endif /* __ARCH_ARM_VUART_H__ */
>
> And the include/asm also has have this in in surplus.
>
> It really is minor and I am sorry for even picking up on this, but
> could you add the __ at the end?

I am not sure if there is a strict coding style here. A lot of ARM 
headers (and x86 too) does not have the trailing "__" or only one "_" as 
the first example you gave.

Anyway, I don't mind to add them here.

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-07-27 16:09 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-20 15:25 [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Julien Grall
2016-07-20 15:25 ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Julien Grall
2016-07-21  0:32   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patchingo Stefano Stabellini
2016-07-22 14:15   ` [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching Konrad Rzeszutek Wilk
2016-07-22 15:29     ` Julien Grall
2016-07-22 15:38       ` Konrad Rzeszutek Wilk
2016-07-22 15:47         ` Julien Grall
2016-07-20 15:25 ` [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported Julien Grall
2016-07-22 14:18   ` Konrad Rzeszutek Wilk
2016-07-22 15:31     ` Julien Grall
2016-07-22 15:39       ` Konrad Rzeszutek Wilk
2016-07-20 15:25 ` [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly Julien Grall
2016-07-22 14:24   ` Konrad Rzeszutek Wilk
2016-07-27 16:09     ` Julien Grall
2016-07-20 15:25 ` [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen Julien Grall
2016-07-22 14:25   ` Konrad Rzeszutek Wilk
2016-07-20 15:25 ` [PATCH v5 5/7] xen/arm: arm64: Add Cortex-A53 cache errata workaround Julien Grall
2016-07-20 15:25 ` [PATCH v5 6/7] xen/arm: arm64: Add cortex-A57 erratum 832075 workaround Julien Grall
2016-07-20 15:26 ` [PATCH v5 7/7] xen/arm: traps: Don't inject a fault if the translation VA -> IPA fails Julien Grall
2016-07-22 14:29 ` [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64 Konrad Rzeszutek Wilk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).