From: James Morse <james.morse@arm.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
 kvmarm@lists.cs.columbia.edu, Christoffer Dall <christoffer.dall@linaro.org>,
 Mark Rutland <mark.rutland@arm.com>, Catalin Marinas <catalin.marinas@arm.com>,
 Will Deacon <will.deacon@arm.com>, Steve Capper <steve.capper@arm.com>
Subject: Re: [PATCH v2 07/19] arm64: insn: Add encoder for bitwise operations using litterals
Date: Wed, 13 Dec 2017 15:45:03 +0000
Message-ID: <5A314AFF.60300@arm.com>
In-Reply-To: <07971a66-1b36-d2cd-e590-2b7b65d14b97@arm.com>

Hi Marc,

On 13/12/17 14:32, Marc Zyngier wrote:
> On 12/12/17 18:32, James Morse wrote:
>> On 11/12/17 14:49, Marc Zyngier wrote:
>>> We lack a way to encode operations such as AND, ORR, EOR that take
>>> an immediate value. Doing so is quite involved, and is all about
>>> reverse engineering the decoding algorithm described in the
>>> pseudocode function DecodeBitMasks().

>>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>>> index 7e432662d454..326b17016485 100644
>>> --- a/arch/arm64/kernel/insn.c
>>> +++ b/arch/arm64/kernel/insn.c

>>> +static u32 aarch64_encode_immediate(u64 imm,
>>> +				    enum aarch64_insn_variant variant,
>>> +				    u32 insn)
>>> +{
>>> +	unsigned int immr, imms, n, ones, ror, esz, tmp;
>>> +	u64 mask;
>>
>> [...]
>>
>>> +	/* Compute the rotation */
>>> +	if (range_of_ones(imm)) {
>>> +		/*
>>> +		 * Pattern: 0..01..10..0
>>> +		 *
>>> +		 * Compute how many rotate we need to align it right
>>> +		 */
>>> +		ror = ffs(imm) - 1;
>>
>> (how come range_of_ones() uses __ffs64() on the same value?)
>
> News flash: range_of_ones is completely buggy. It will fail on the
> trivial value 1 (__ffs64(1) = 0; 0 - 1 = -1; val >> -1 is... ermmmm).
> I definitely got mixed up between the two.

They do different things!?  Aaaaaahhhh....

[...]
>> Unless I've gone wrong, I think the 'Trim imm to the element size' code needs to
>> move up into the esz-reducing loop so it doesn't happen for a 64bit immediate.

> Yup. I've stashed the following patch:
>
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index b8fb2d89b3a6..e58be1c57f18 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -1503,8 +1503,7 @@ pstate_check_t * const aarch32_opcode_cond_checks[16] = {
>  static bool range_of_ones(u64 val)
>  {
>  	/* Doesn't handle full ones or full zeroes */
> -	int x = __ffs64(val) - 1;
> -	u64 sval = val >> x;
> +	u64 sval = val >> __ffs64(val);
>
>  	/* One of Sean Eron Anderson's bithack tricks */
>  	return ((sval + 1) & (sval)) == 0;
> @@ -1515,7 +1514,7 @@ static u32 aarch64_encode_immediate(u64 imm,
>  				    u32 insn)
>  {
>  	unsigned int immr, imms, n, ones, ror, esz, tmp;
> -	u64 mask;
> +	u64 mask = ~0UL;
>
>  	/* Can't encode full zeroes or full ones */
>  	if (!imm || !~imm)
> @@ -1543,8 +1542,12 @@ static u32 aarch64_encode_immediate(u64 imm,
>  	for (tmp = esz; tmp > 2; tmp /= 2) {
>  		u64 emask = BIT(tmp / 2) - 1;
>
> -		if ((imm & emask) != ((imm >> (tmp / 2)) & emask))
> +		if ((imm & emask) != ((imm >> (tmp / 2)) & emask)) {
> +			/* Trim imm to the element size */
> +			mask = BIT(esz - 1) - 1;
> +			imm &= mask;

Won't this still lose the top bit? It generates 0x7fffffff for esz=32, and
for esz=32 we run through here when the two 16bit values are different.

This still runs for a 64bit immediate. The 0xf80000000fffffff example
compares 0xf8000000 with 0x0fffffff then breaks here on the first iteration
of this loop. With this change it still attempts to generate a 64bit mask.

I was thinking of something like [0]. That only runs when we know the two
tmp:halves match, it just keeps the bottom tmp:half for the next run and
never runs for a 64bit immediate.
>  			break;
> +		}
>
>  		esz = tmp;
>  	}
> @@ -1552,10 +1555,6 @@ static u32 aarch64_encode_immediate(u64 imm,
>  	/* N is only set if we're encoding a 64bit value */
>  	n = esz == 64;
>
> -	/* Trim imm to the element size */
> -	mask = BIT(esz - 1) - 1;
> -	imm &= mask;
> -
>  	/* That's how many ones we need to encode */
>  	ones = hweight64(imm);
>
> I really need to run this against gas in order to make sure
> I get the same parameters for all the possible values.

Sounds good,


Thanks,

James

[0] Not even built:

diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 12d3ec2154c2..d9fbdea7b18d 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1529,15 +1529,15 @@ static u32 aarch64_encode_immediate(u64 imm,
 			break;

 		esz = tmp;
+
+		/* Trim imm to the element size */
+		mask = BIT(esz) - 1;
+		imm &= mask;
 	}

 	/* N is only set if we're encoding a 64bit value */
 	n = esz == 64;

-	/* Trim imm to the element size */
-	mask = BIT(esz - 1) - 1;
-	imm &= mask;
-
 	/* That's how many ones we need to encode */
 	ones = hweight64(imm);