From: Christoffer Dall <christoffer.dall@linaro.org>
Subject: Re: [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation
Date: Thu, 18 Jan 2018 21:28:24 +0100
Message-ID: <20180118202824.GF21802@cbox>
In-Reply-To: <20180104184334.16571-19-marc.zyngier@arm.com>
References: <20180104184334.16571-1-marc.zyngier@arm.com> <20180104184334.16571-19-marc.zyngier@arm.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
 kvmarm@lists.cs.columbia.edu, Mark Rutland, Catalin Marinas,
 Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:33PM +0000, Marc Zyngier wrote:
> The main idea behind randomising the EL2 VA is that we usually have
> a few spare bits between the most significant bit of the VA mask
> and the most significant bit of the linear mapping.
>
> Those bits could be a bunch of zeroes, and could be useful
> to move things around a bit. Of course, the more memory you have,
> the less randomisation you get...
>
> Alternatively, these bits could be the result of KASLR, in which
> case they are already random. But it would be nice to have a
> *different* randomization, just to make the job of a potential
> attacker a bit more difficult.
>
> Inserting these random bits is a bit involved. We don't have a spare
> register (short of rewriting all the kern_hyp_va call sites), and
> the immediate we want to insert is too random to be used with the
> ORR instruction. The best option I could come up with is the following
> sequence:
>
> 	and x0, x0, #va_mask

So if I get this right, you want to insert an arbitrary random value,
without an extra register, in bits [(VA_BITS-1):first_random_bit], and
BIT(VA_BITS-1) is always set in the input because it's a kernel
address.

> 	ror x0, x0, #first_random_bit

Then you rotate so that the random bits become the LSBs, and the random
value should be inserted into bits [NR_RAND_BITS-1:0] of x0?

> 	add x0, x0, #(random & 0xfff)

So you do this via two rounds: first the lower 12 bits,

> 	add x0, x0, #(random >> 12), lsl #12

then the upper 12 bits (permitting a maximum of 24 randomized bits),

> 	ror x0, x0, #(63 - first_random_bit)

and then you rotate things back into their place.

Only, I don't understand why this isn't (64 - first_random_bit)?

>
> making it a fairly long sequence, but one that a decent CPU should
> be able to execute without breaking a sweat. It is of course NOPed
> out on VHE. The last 4 instructions can also be turned into NOPs
> if it appears that there is no free bits to use.
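
Just to make sure I follow the sequence, here is how I would write it
as C (purely a sketch to check my own understanding: model_kern_hyp_va()
is a made-up name, ror64() is from <linux/bitops.h>, va_mask/tag_lsb/
tag_val are the variables this patch introduces, and I'm assuming the
randomised case, tag_lsb != 0, with the last rotate really being by
64 - first_random_bit):

	static unsigned long model_kern_hyp_va(unsigned long v)
	{
		v &= va_mask;				/* and x0, x0, #va_mask */
		v = ror64(v, tag_lsb);			/* ror x0, x0, #first_random_bit */
		v += tag_val & 0xfff;			/* add x0, x0, #(random & 0xfff) */
		v += tag_val & GENMASK_ULL(23, 12);	/* add x0, x0, #(random >> 12), lsl #12 */
		return ror64(v, 64 - tag_lsb);		/* ror x0, x0, #(64 - first_random_bit) */
	}

If that model is right, the two adds behave like ORs here, since the
bits they touch are all zero after the and+ror.
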
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_mmu.h | 10 +++++-
>  arch/arm64/kvm/va_layout.c       | 68 +++++++++++++++++++++++++++++++++++++---
>  virt/kvm/arm/mmu.c               |  2 +-
>  3 files changed, 73 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index cc882e890bb1..4fca6ddadccc 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -85,6 +85,10 @@
>  .macro kern_hyp_va	reg
>  alternative_cb kvm_update_va_mask
>  	and     \reg, \reg, #1
> +	ror	\reg, \reg, #1
> +	add	\reg, \reg, #0
> +	add	\reg, \reg, #0
> +	ror	\reg, \reg, #63
>  alternative_cb_end
>  .endm
>
> @@ -101,7 +105,11 @@ void kvm_update_va_mask(struct alt_instr *alt,
>
>  static inline unsigned long __kern_hyp_va(unsigned long v)
>  {
> -	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
> +				    "ror %0, %0, #1\n"
> +				    "add %0, %0, #0\n"
> +				    "add %0, %0, #0\n"
> +				    "ror %0, %0, #63\n",

This now sort of serves as the documentation if you don't have the
commit message, so I think you should annotate each line like the
commit message does.

Alternatively, since you're duplicating a bunch of code which will be
replaced at runtime anyway, you could make all of these
"and %0, %0, #1", then copy the documentation assembly code as a
comment to compute_instruction() and put a comment reference here.

>  				    kvm_update_va_mask)
>  		     : "+r" (v));
>  	return v;
> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> index 75bb1c6772b0..bf0d6bdf5f14 100644
> --- a/arch/arm64/kvm/va_layout.c
> +++ b/arch/arm64/kvm/va_layout.c
> @@ -16,11 +16,15 @@
>   */
>
>  #include
> +#include
> +#include
>  #include
>  #include
>  #include
>  #include
>

It would be nice to have a comment on these, something like:

/* The LSB of the random hyp VA tag or 0 if no randomization is used. */
> +static u8 tag_lsb;

/* The random hyp VA tag value with the region bit, if hyp randomization is used */
> +static u64 tag_val;

>  static u64 va_mask;
>
>  static void compute_layout(void)
> @@ -32,8 +36,31 @@ static void compute_layout(void)
>  	region  = idmap_addr & BIT(VA_BITS - 1);
>  	region ^= BIT(VA_BITS - 1);
>
> -	va_mask  = BIT(VA_BITS - 1) - 1;
> -	va_mask |= region;
> +	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
> +			(u64)(high_memory - 1));
> +
> +	if (tag_lsb == (VA_BITS - 1)) {
> +		/*
> +		 * No space in the address, let's compute the mask so
> +		 * that it covers (VA_BITS - 1) bits, and the region
> +		 * bit. The tag is set to zero.
> +		 */
> +		tag_lsb = tag_val = 0;

tag_val should already be 0, right?  And wouldn't it be slightly nicer
to have a temporary variable called something like linear_bits, and
only set tag_lsb when needed?

> +		va_mask  = BIT(VA_BITS - 1) - 1;
> +		va_mask |= region;
> +	} else {
> +		/*
> +		 * We do have some free bits. Let's have the mask to
> +		 * cover the low bits of the VA, and the tag to
> +		 * contain the random stuff plus the region bit.
> +		 */

Since you have two masks below, this comment is a bit hard to parse.
How about explaining what makes up a Hyp address from a kernel linear
address instead, something like:

	/*
	 * We do have some free bits to insert a random tag.
	 * Hyp VAs are now created from kernel linear map VAs
	 * using the following formula (with V == VA_BITS):
	 *
	 *  63 ... V |   V-1  | V-2 .. tag_lsb | tag_lsb - 1 .. 0
	 *  ------------------------------------------------------
	 * | 0000000 | region |   random tag   |  kern linear VA  |
	 */

(assuming I got this vaguely correct).
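
Or, to condense the same thing into a single expression (again just my
reading of the generated code, for the tag_lsb != 0 case, and ignoring
that the two adds can only insert 24 bits of tag):

	hyp_va = (kern_va & GENMASK_ULL(tag_lsb - 1, 0)) | (tag_val << tag_lsb);

i.e. the bits that are common to every kernel linear map address get
replaced by the region bit plus the random tag, while the low bits pass
through unchanged.
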
> +		u64 mask = GENMASK_ULL(VA_BITS - 2, tag_lsb);

For consistency it would then be nicer to use GENMASK_ULL(VA_BITS - 2, 0)
above, as suggested in the other patch. And we could also call this
tag_mask to be super explicit.

> +
> +		va_mask = BIT(tag_lsb) - 1;

And here, GENMASK_ULL(tag_lsb - 1, 0).

> +		tag_val  = get_random_long() & mask;
> +		tag_val |= region;

It's actually unclear to me why you need the region bit included in
tag_val?

> +		tag_val >>= tag_lsb;
> +	}
>  }
>
>  static u32 compute_instruction(int n, u32 rd, u32 rn)
> @@ -46,6 +73,33 @@ static u32 compute_instruction(int n, u32 rd, u32 rn)
>  					  AARCH64_INSN_VARIANT_64BIT,
>  					  rn, rd, va_mask);
>  		break;
> +
> +	case 1:
> +		/* ROR is a variant of EXTR with Rm = Rn */
> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
> +					     rn, rn, rd,
> +					     tag_lsb);
> +		break;
> +
> +	case 2:
> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
> +						    tag_val & (SZ_4K - 1),
> +						    AARCH64_INSN_VARIANT_64BIT,
> +						    AARCH64_INSN_ADSB_ADD);
> +		break;
> +
> +	case 3:
> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
> +						    tag_val & GENMASK(23, 12),
> +						    AARCH64_INSN_VARIANT_64BIT,
> +						    AARCH64_INSN_ADSB_ADD);
> +		break;
> +
> +	case 4:
> +		/* ROR is a variant of EXTR with Rm = Rn */
> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
> +					     rn, rn, rd, 64 - tag_lsb);

Ah, you do use 64 - first_rand in the code. Well, I approve of this
line of code then.

> +		break;
>  	}
>
>  	return insn;
> @@ -56,8 +110,8 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
>  {
>  	int i;
>
> -	/* We only expect a 1 instruction sequence */
> -	BUG_ON(nr_inst != 1);
> +	/* We only expect a 5 instruction sequence */

Still sounds strange to me; if we keep the BUG_ON, I think you can just
drop the comment.

> +	BUG_ON(nr_inst != 5);
>
>  	if (!has_vhe() && !va_mask)
>  		compute_layout();
> @@ -68,8 +122,12 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
>  		/*
>  		 * VHE doesn't need any address translation, let's NOP
>  		 * everything.
> +		 *
> +		 * Alternatively, if we don't have any spare bits in
> +		 * the address, NOP everything after masking tha

s/tha/the/

> +		 * kernel VA.
>  		 */
> -		if (has_vhe()) {
> +		if (has_vhe() || (!tag_lsb && i > 1)) {
>  			updptr[i] = aarch64_insn_gen_nop();
>  			continue;
>  		}
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 14c5e5534f2f..d01c7111b1f7 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1811,7 +1811,7 @@ int kvm_mmu_init(void)
>  		  kern_hyp_va((unsigned long)high_memory - 1));
>
>  	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
> -	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
> +	    hyp_idmap_start <  kern_hyp_va((unsigned long)high_memory - 1) &&

Is this actually required for this patch, or are we just trying to be
nice?  I'm actually not sure I remember what this is about, beyond the
VA==idmap-for-everything case on 32-bit; I thought we chose the hyp
address space exactly so that it wouldn't overlap with the idmap?
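
One more aside, going back to the NOP-ing logic further up: in the
!tag_lsb fallback, if I read it correctly, the surviving instructions
reduce to today's behaviour, i.e. effectively just (sketch, reusing the
names from the patch):

	v &= va_mask;	/* == GENMASK_ULL(VA_BITS - 2, 0) | region here */

with the remaining ror ending up as a rotate by zero, which is
reassuring.
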
>  	    hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
>  		/*
>  		 * The idmap page is intersecting with the VA space,
> --
> 2.14.2
>

Thanks,
-Christoffer