From: Ard Biesheuvel <ardb@kernel.org>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Linux ARM <linux-arm-kernel@lists.infradead.org>,
linux-hardening@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
Will Deacon <will@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Kees Cook <keescook@chromium.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Mark Brown <broonie@kernel.org>
Subject: Re: [PATCH v4 03/26] arm64: head: move assignment of idmap_t0sz to C code
Date: Tue, 14 Jun 2022 11:34:22 +0200
Message-ID: <CAMj1kXFh2ee++UT59z5H9mNF_WO-0Ve=4aG7=CQVoKFk1MDJhQ@mail.gmail.com>
In-Reply-To: <1223d394-cd70-49be-2d27-f245c15709ef@arm.com>
On Tue, 14 Jun 2022 at 11:22, Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
>
> On 6/13/22 20:15, Ard Biesheuvel wrote:
> > Setting idmap_t0sz involves fiddling with the caches if done with the
> > MMU off. Since we will be creating an initial ID map with the MMU and
> > caches off, and the permanent ID map with the MMU and caches on, let's
> > move this assignment of idmap_t0sz out of the startup code, and replace
> > it with a macro that simply issues the three instructions needed to
> > calculate the value wherever it is needed before the MMU is turned on.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > arch/arm64/include/asm/assembler.h | 14 ++++++++++++++
> > arch/arm64/include/asm/mmu_context.h | 2 +-
> > arch/arm64/kernel/head.S | 13 +------------
> > arch/arm64/mm/mmu.c | 5 ++++-
> > arch/arm64/mm/proc.S | 2 +-
> > 5 files changed, 21 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> > index 8c5a61aeaf8e..9468f45c07a6 100644
> > --- a/arch/arm64/include/asm/assembler.h
> > +++ b/arch/arm64/include/asm/assembler.h
> > @@ -359,6 +359,20 @@ alternative_cb_end
> > bfi \valreg, \t1sz, #TCR_T1SZ_OFFSET, #TCR_TxSZ_WIDTH
> > .endm
> >
> > +/*
> > + * idmap_get_t0sz - get the T0SZ value needed to cover the ID map
> > + *
> > + * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
> > + * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
> > + * this number conveniently equals the number of leading zeroes in
> > + * the physical address of _end.
> > + */
> > + .macro idmap_get_t0sz, reg
> > + adrp \reg, _end
> > + orr \reg, \reg, #(1 << VA_BITS_MIN) - 1
> > + clz \reg, \reg
> > + .endm
>
> Is there any particular reason to evaluate idmap t0sz from '_end' and
> VA_BITS_MIN, instead of '__idmap_text_end', as was the case previously?
>
Ah yes, I failed to mention that. In a later patch, the ID map will
cover the entire image, which is why the calculation is based on _end
rather than __idmap_text_end.
> > +
> > /*
> > * tcr_compute_pa_size - set TCR.(I)PS to the highest supported
> > * ID_AA64MMFR0_EL1.PARange value
> > diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> > index 6770667b34a3..6ac0086ebb1a 100644
> > --- a/arch/arm64/include/asm/mmu_context.h
> > +++ b/arch/arm64/include/asm/mmu_context.h
> > @@ -60,7 +60,7 @@ static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
> > * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
> > * physical memory, in which case it will be smaller.
> > */
> > -extern u64 idmap_t0sz;
> > +extern int idmap_t0sz;
> > extern u64 idmap_ptrs_per_pgd;
> >
> > /*
> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > index dc07858eb673..7f361bc72d12 100644
> > --- a/arch/arm64/kernel/head.S
> > +++ b/arch/arm64/kernel/head.S
> > @@ -299,22 +299,11 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
> > * physical address space. So for the ID map, use an extended virtual
> > * range in that case, and configure an additional translation level
> > * if needed.
> > - *
> > - * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
> > - * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
> > - * this number conveniently equals the number of leading zeroes in
> > - * the physical address of __idmap_text_end.
> > */
> > - adrp x5, __idmap_text_end
> > - clz x5, x5
> > + idmap_get_t0sz x5
> > cmp x5, TCR_T0SZ(VA_BITS_MIN) // default T0SZ small enough?
> > b.ge 1f // .. then skip VA range extension
> >
> > - adr_l x6, idmap_t0sz
> > - str x5, [x6]
> > - dmb sy
> > - dc ivac, x6 // Invalidate potentially stale cache line
>
> Right, as there is no 'idmap_t0sz' variable to update, cache maintenance
> can be dropped.
>
> > -
> > #if (VA_BITS < 48)
> > #define EXTRA_SHIFT (PGDIR_SHIFT + PAGE_SHIFT - 3)
> > #define EXTRA_PTRS (1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT))
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 17b339c1a326..103bf4ae408d 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -43,7 +43,7 @@
> > #define NO_CONT_MAPPINGS BIT(1)
> > #define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
> >
> > -u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN);
> > +int idmap_t0sz __ro_after_init;
>
> I guess this is just to reduce the 'idmap_t0sz' memory footprint.
>
It's essentially the base-2 logarithm of a u64, so it doesn't have to
be a u64 itself. The footprint doesn't really matter, of course.
> > u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
> >
> > #if VA_BITS > 48
> > @@ -785,6 +785,9 @@ void __init paging_init(void)
> > (u64)&vabits_actual + sizeof(vabits_actual));
> > #endif
> >
> > + idmap_t0sz = min(63UL - __fls(__pa_symbol(_end)),
> > + TCR_T0SZ(VA_BITS_MIN));
> > +
>
> Just curious - but does not this also need some sync for the update
> to be visible across the system ?
>
No, it does not, now that the asm macro no longer refers to the variable.
> > map_kernel(pgdp);
> > map_mem(pgdp);
> >
> > diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> > index 972ce8d7f2c5..97cd67697212 100644
> > --- a/arch/arm64/mm/proc.S
> > +++ b/arch/arm64/mm/proc.S
> > @@ -470,7 +470,7 @@ SYM_FUNC_START(__cpu_setup)
> > add x9, x9, #64
> > tcr_set_t1sz tcr, x9
> > #else
> > - ldr_l x9, idmap_t0sz
> > + idmap_get_t0sz x9
> > #endif
> > tcr_set_t0sz tcr, x9
> >
>
> Avoiding one cache maintenance in __create_page_tables() now makes us
> evaluate idmap_t0sz again in __cpu_setup(), and also capture & update
> idmap_t0sz in paging_init(). This change moves idmap_t0sz outside the
> asm functions, but from a performance perspective, is there an improvement?
No, the performance is not expected to be affected, and a ~10
instruction delta at boot is not going to be measurable anyway. The
point of this patch is to remove the need to reason about how/when
variables are accessed, and whether that requires cache cleaning,
invalidation, system-wide DMBs etc.
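
For reference, here is a rough user-space sketch of why the asm macro and
the C expression in paging_init() are expected to agree. It is not kernel
code: VA_BITS_MIN is assumed to be 48, TCR_T0SZ(x) is taken to be 64 - (x),
the GCC/Clang builtin __builtin_clzll() stands in for both the clz
instruction and __fls(), and the two helper names are made up for
illustration.

```c
#include <stdint.h>

#define VA_BITS_MIN 48             /* assumption: 48-bit minimum VA size */
#define TCR_T0SZ(x) (64 - (x))     /* assumption: T0SZ == 64 - #bits used */

/*
 * Mirrors idmap_get_t0sz: OR in the low VA_BITS_MIN bits so the result
 * can never exceed TCR_T0SZ(VA_BITS_MIN), then count leading zeroes.
 */
static int t0sz_asm_style(uint64_t pa_end)
{
	uint64_t v = pa_end | ((UINT64_C(1) << VA_BITS_MIN) - 1);

	return __builtin_clzll(v);     /* v is never zero here */
}

/*
 * Mirrors the paging_init() expression:
 *   min(63 - __fls(pa), TCR_T0SZ(VA_BITS_MIN))
 * pa_end must be nonzero, as __builtin_clzll(0) is undefined.
 */
static int t0sz_c_style(uint64_t pa_end)
{
	int top_bit = 63 - __builtin_clzll(pa_end);  /* stand-in for __fls() */
	int t0sz = 63 - top_bit;
	int dflt = TCR_T0SZ(VA_BITS_MIN);

	return t0sz < dflt ? t0sz : dflt;
}
```

With these assumptions, an image ending at physical address 0x80000000
gives 16 from both helpers (the default T0SZ for 48 bits), while an image
ending at 1 << 50 gives 13 from both, i.e. the extended ID map range.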