linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 0/8] arm64: stable UEFI virtual mappings for kexec
@ 2015-01-08 18:48 Ard Biesheuvel
  2015-01-08 18:48 ` [PATCH v5 1/8] arm64/mm: add explicit struct_mm argument to __create_mapping() Ard Biesheuvel
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2015-01-08 18:48 UTC
  To: leif.lindholm, roy.franz, matt.fleming, will.deacon,
	catalin.marinas, linux-arm-kernel, linux-efi, bp, msalter,
	geoff.levand, dyoung, mark.rutland, linux-kernel
  Cc: Ard Biesheuvel

This is v5 of the series to update the UEFI memory map handling for the arm64
architecture so that virtual mappings of UEFI Runtime Services are stable across
kexec.

Changes between v5 and v4:
- rebased onto v3.19-rc3 + the early_ioremap() fix we sent out today
- patches #1 and #2: unchanged
- patch #3: added Matt's ack, addressed Boris's review comments
- patch #4: added Boris's ack
- patch #5: added Leif's ack
- patch #6: addressed review comments from Mark Rutland, Leif and Matt (see
  below)
- patch #7 and #8: added Leif's ack

I have tried to address Matt's concern about drivers/firmware/efi/fdt.c becoming
a dumping ground for ARM specific stuff by moving the virtmap creation to
arm-stub.c. However, EFI is not the only place where sharing code between ARM
and arm64 should be dealt with in a better way, so this remains a concern IMO.

Regarding the duplicated version of efi_get_memory_map(): I dropped it and now
call the original version twice, where the first call is actually only used to
get an allocation whose size is an upper bound for all runtime region
descriptors combined. I have posted a separate patch to improve the original
as there are some concerns there as well.

Regarding EFI_VIRTMAP/EFI_ARCH_1: I added an elaborate comment about it, and
I think there is a case for retaining it next to EFI_RUNTIME_SERVICES.

To de-risk the adoption of the subset of patches that are essential to get kexec
working on UEFI systems, in v4 I dropped all the patches related to the iomem
resource table, /dev/mem permissions and memory attributes etc. These topics
have been addressed in a separate series.

The primary changes between v4 and v3 in the patches that were kept are:
- instead of reordering the memory map so that part of it can double as input
  argument to SetVirtualAddressMap(), increase the allocation size for the
  memory map so that we can use some of it as scratch space and use that to
  prepare the input to SVAM() instead. UPDATE: now a separate allocation (v5)
- added some acks
- rebased onto v3.19-rc1

NOTE: these changes trigger an issue on AMD Seattle that we (Mark Rutland and I)
consider a firmware bug. It appears that, during the call to SVAM() (which is
called with a 1:1 mapping as per the UEFI spec) the virtual mapping being
installed is dereferenced prematurely. This went unnoticed in the original
situation, as the virtual mappings were just kernel mappings that are always
accessible. However, in the new situation, those mappings are only active
during Runtime Service invocations, and performing any kind of access through
them at any other time triggers a fault.
UPDATE: this issue has been fixed, but obviously requires those affected to
install a new version of the firmware.

============== v1 blurb ==================

The main premise of these patches is that, in order to support kexec, we need
to add code to the kernel that is able to deal with the state of the firmware
after SetVirtualAddressMap() [SVAM] has been called. However, if we are going to
deal with that anyway, why not make that the default state, and have only a
single code path for both cases?

This means SVAM() needs to move to the stub, and hence the code that invents
the virtual layout needs to move with it. The result is that the kernel proper
is entered with the virt_addr members of all EFI_MEMORY_RUNTIME regions
assigned, and the mapping installed into the firmware. The kernel proper needs
to set up the page tables, and switch to them while performing the runtime
services calls. Note that there is also an efi_to_phys() to translate the values
of the fw_vendor and tables fields of the EFI system table. Again, this is
something we need to do anyway under kexec, or we end up handing over state
between one kernel and the next, which implies different code paths between
non-kexec and kexec.
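
To illustrate the translation mentioned above, here is a minimal userspace
sketch, not the code from the series. The descriptor struct is a simplified
stand-in for efi_memory_desc_t, the function name is made up, and treating
virt_addr == 0 as "no mapping installed" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_PAGE_SHIFT      12
#define EFI_MEMORY_RUNTIME  0x8000000000000000ULL  /* attribute bit from the UEFI spec */

/* Simplified, hypothetical stand-in for efi_memory_desc_t. */
struct mem_desc {
    uint64_t phys_addr;
    uint64_t virt_addr;
    uint64_t num_pages;
    uint64_t attribute;
};

/*
 * Translate an address handed out by the firmware after SVAM() back to a
 * physical address by searching the runtime regions of the memory map.
 * Addresses not covered by any runtime region are returned unchanged.
 */
static uint64_t virt_to_phys_sketch(const struct mem_desc *map, size_t n,
                                    uint64_t addr)
{
    assert(map != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct mem_desc *md = &map[i];
        uint64_t len = md->num_pages << EFI_PAGE_SHIFT;

        if (!(md->attribute & EFI_MEMORY_RUNTIME))
            continue;
        if (md->virt_addr == 0)
            continue;   /* assumption: zero means no mapping was installed */
        if (addr >= md->virt_addr && addr - md->virt_addr < len)
            return md->phys_addr + (addr - md->virt_addr);
    }
    return addr;
}
```

With a runtime region at physical 0xa0000000 mapped at virtual 0x40010000,
the address 0x40010008 would translate to 0xa0000008.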

The layout is chosen such that it uses the userland half of the virtual address
space (TTBR0), which means no additional alignment or reservation is required
to ensure that it will always be available.
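
As a rough illustration of how such a layout can be assigned, the sketch below
walks a memory map and gives each EFI_MEMORY_RUNTIME region a virtual address
in the low half, rounding each region out to 64 KB as described for patch #6.
It is a userspace approximation under stated assumptions: the struct is a
simplified stand-in for efi_memory_desc_t, and VIRT_BASE and the function name
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_PAGE_SIZE       4096ULL
#define EFI_MEMORY_RUNTIME  0x8000000000000000ULL  /* attribute bit from the UEFI spec */
#define SZ_64K              0x10000ULL
#define VIRT_BASE           0x40000000ULL  /* hypothetical base in the TTBR0 half */

/* Simplified, hypothetical stand-in for efi_memory_desc_t. */
struct mem_desc {
    uint64_t phys_addr;
    uint64_t virt_addr;
    uint64_t num_pages;
    uint64_t attribute;
};

static uint64_t round_down_64k(uint64_t x) { return x & ~(SZ_64K - 1); }
static uint64_t round_up_64k(uint64_t x)   { return (x + SZ_64K - 1) & ~(SZ_64K - 1); }

/*
 * Assign virt_addr for every runtime region, packing the regions one after
 * another starting at VIRT_BASE. Returns the number of regions assigned.
 */
static size_t assign_virtmap(struct mem_desc *map, size_t n)
{
    uint64_t next = VIRT_BASE;
    size_t count = 0;

    assert(map != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct mem_desc *md = &map[i];

        if (!(md->attribute & EFI_MEMORY_RUNTIME))
            continue;

        /* Widen the region to 64 KB boundaries so 64k-page kernels can map it. */
        uint64_t base = round_down_64k(md->phys_addr);
        uint64_t size = round_up_64k(md->phys_addr - base +
                                     md->num_pages * EFI_PAGE_SIZE);

        /* Keep the offset within the 64 KB granule identical to the physical one. */
        md->virt_addr = next + (md->phys_addr - base);
        next += size;
        count++;
    }
    return count;
}
```

The array filled in this way is what would be handed to SVAM(); regions
without the runtime attribute keep virt_addr untouched.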

One thing that may stand out is the reordering of the memory map. The reason
for doing this is that we can use the same memory map as input to SVAM(). The
alternative is allocating memory for it using boot services, but that clutters
up the existing logic a bit between getting the memory map, populating the fdt,
and looping again if it didn't fit.

Ard Biesheuvel (8):
  arm64/mm: add explicit struct_mm argument to __create_mapping()
  arm64/mm: add create_pgd_mapping() to create private page tables
  efi: split off remapping code from efi_config_init()
  efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE
  arm64/efi: set EFI_ALLOC_ALIGN to 64 KB
  arm64/efi: move SetVirtualAddressMap() to UEFI stub
  arm64/efi: remove free_boot_services() and friends
  arm64/efi: remove idmap manipulations from UEFI code

 arch/arm64/include/asm/efi.h                   |  38 ++-
 arch/arm64/include/asm/mmu.h                   |   5 +-
 arch/arm64/include/asm/pgtable.h               |   5 +
 arch/arm64/kernel/efi.c                        | 369 ++++++++-----------------
 arch/arm64/kernel/setup.c                      |   2 +-
 arch/arm64/mm/mmu.c                            |  60 ++--
 drivers/firmware/efi/efi.c                     |  56 ++--
 drivers/firmware/efi/libstub/arm-stub.c        |  59 ++++
 drivers/firmware/efi/libstub/efi-stub-helper.c |  25 +-
 drivers/firmware/efi/libstub/efistub.h         |   4 +
 drivers/firmware/efi/libstub/fdt.c             |  62 ++++-
 include/linux/efi.h                            |   2 +
 12 files changed, 362 insertions(+), 325 deletions(-)

-- 
1.8.3.2


* Re: [PATCH v5 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub
@ 2015-01-29  9:50 Steve Capper
  2015-01-29  9:55 ` Ard Biesheuvel
  0 siblings, 1 reply; 18+ messages in thread
From: Steve Capper @ 2015-01-29  9:50 UTC
  To: Ard Biesheuvel
  Cc: Leif Lindholm, Roy Franz, Matt Fleming, Will Deacon,
	Catalin Marinas, linux-arm-kernel, linux-efi, bp, Mark Salter,
	Geoff Levand, Dave Young, Mark Rutland, linux-kernel, linux-next

On 8 January 2015 at 18:48, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> In order to support kexec, the kernel needs to be able to deal with the
> state of the UEFI firmware after SetVirtualAddressMap() has been called.
> To avoid having separate code paths for non-kexec and kexec, let's move
> the call to SetVirtualAddressMap() to the stub: this will guarantee us
> that it will only be called once (since the stub is not executed during
> kexec), and ensures that the UEFI state is identical between kexec and
> normal boot.
>
> This implies that the layout of the virtual mapping needs to be created
> by the stub as well. All regions are rounded up to a naturally aligned
> multiple of 64 KB (for compatibility with 64k pages kernels) and recorded
> in the UEFI memory map. The kernel proper reads those values and installs
> the mappings in a dedicated set of page tables that are swapped in during
> UEFI Runtime Services calls.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Hi,
I've been testing out linux-next next-20150128 and have run into an
early bootup failure on Seattle.
Having done a bisect, this patch comes up as the first "bad" patch:
f3cdfd2 arm64/efi: move SetVirtualAddressMap() to UEFI stub

I've tried the defconfig with 4-levels 4KB and 2-levels 64KB pages and
the failure mode doesn't change.

The point of failure for me is in setup_arch(), just after the call to
local_async_enable().

I'm not very knowledgeable about EFI; my guess is that a System Error
occurs early (during the EFI stub activity?) and then manifests once
asynchronous aborts are enabled?

Cheers,
--
Steve

The full boot log:
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
Booting Linux on physical CPU 0x0
Initializing cgroup subsys cpu
Linux version 3.19.0-rc4+ (steven@capper-seattle) (gcc version 4.9.2
20141101 (Red Hat 4.9.2-1) (GCC) ) #42 SMP Thu Jan 29 09:28:34 GMT
2015
CPU: AArch64 Processor [410fd070] revision 0
Detected PIPT I-cache on CPU0
alternatives: enabling workaround for ARM erratum 832075
Early serial console at MMIO 0xe1010000 (options '')
bootconsole [uart0] enabled
Bad mode in Error handler detected, code 0xbf000000
CPU: 0 PID: 0 Comm: swapper Not tainted 3.19.0-rc4+ #42
Hardware name: amd,seattle (DT)
task: fffffe0000aaddf0 ti: fffffe0000a70000 task.ti: fffffe0000a70000
PC is at setup_arch+0x1f8/0x510
LR is at setup_arch+0x1f4/0x510
pc : [<fffffe00009b2818>] lr : [<fffffe00009b2814>] pstate: 000002c5
sp : fffffe0000a73f10
x29: fffffe0000a73f10 x28: 0000028001000000
x27: fffffe0000081230 x26: 0000008001c00000
x25: 0000008001be0000 x24: fffffe0000aa6000
x23: 0000000000000000 x22: fffffe0000aa6000
x21: fffffe0000a73fe8 x20: fffffe0000b60000
x19: fffffe0000080000 x18: 0000000000000000
x17: 0000000000000800 x16: 0000000000001000
x15: 0000000000001c00 x14: 0ffffffffffffffe
x13: 0000000000000001 x12: 0000000000000010
x11: 0000000000000007 x10: 0101010101010101
x9 : fffffffffffffffe x8 : 0000000000000008
x7 : 0000000000000006 x6 : 0000800000000000
x5 : 000000000000005f x4 : 0000000000000000
x3 : 0000000000000063 x2 : 0000000000000065
x1 : 0000000000000000 x0 : 0000000000000001

Internal error: Oops - bad mode: 0 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.19.0-rc4+ #42
Hardware name: amd,seattle (DT)
task: fffffe0000aaddf0 ti: fffffe0000a70000 task.ti: fffffe0000a70000
PC is at setup_arch+0x1f8/0x510
LR is at setup_arch+0x1f4/0x510
pc : [<fffffe00009b2818>] lr : [<fffffe00009b2814>] pstate: 000002c5
sp : fffffe0000a73f10
x29: fffffe0000a73f10 x28: 0000028001000000
x27: fffffe0000081230 x26: 0000008001c00000
x25: 0000008001be0000 x24: fffffe0000aa6000
x23: 0000000000000000 x22: fffffe0000aa6000
x21: fffffe0000a73fe8 x20: fffffe0000b60000
x19: fffffe0000080000 x18: 0000000000000000
x17: 0000000000000800 x16: 0000000000001000
x15: 0000000000001c00 x14: 0ffffffffffffffe
x13: 0000000000000001 x12: 0000000000000010
x11: 0000000000000007 x10: 0101010101010101
x9 : fffffffffffffffe x8 : 0000000000000008
x7 : 0000000000000006 x6 : 0000800000000000
x5 : 000000000000005f x4 : 0000000000000000
x3 : 0000000000000063 x2 : 0000000000000065
x1 : 0000000000000000 x0 : 0000000000000001

Process swapper (pid: 0, stack limit = 0xfffffe0000a70058)
Stack: (0xfffffe0000a73f10 to 0xfffffe0000a74000)
3f00:                                     00a73fa0 fffffe00 009b0688 fffffe00
3f20: 009ef3b8 fffffe00 00b60000 fffffe00 00b60000 fffffe00 00aa6000 fffffe00
3f40: 00000000 00000000 01000000 00000080 01be0000 00000080 01c00000 00000080
3f60: 00081230 fffffe00 00630088 fffffe00 00000001 00000000 1fe00000 00000080
3f80: 00b63870 fffffe00 00000002 00000000 00b6451a fffffe00 00000000 00000000
3fa0: 00000000 00000000 010906e0 00000080 f0f1e938 00000083 00000e12 00000000
3fc0: 1fe00000 00000080 410fd070 00000000 01ab0000 00000080 01000000 00000080
3fe0: 00000000 00000000 009ef3b8 fffffe00 00000000 00000000 00000000 00000000
Call trace:
[<fffffe00009b2818>] setup_arch+0x1f8/0x510
[<fffffe00009b0684>] start_kernel+0xa4/0x3a8
Code: 94000b2c 940009f7 97fff760 d50344ff (d00007f5)
---[ end trace cb88537fdc8fa200 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
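
For readers following the oops above: the "code 0xbf000000" value is the
exception syndrome, and its top six bits (the exception class) are 0x2f, which
the ARMv8 architecture assigns to SError interrupts. The pstate value 000002c5
has the A bit (bit 8) clear, meaning asynchronous aborts were unmasked at that
point. Both are consistent with the guess in the report. A minimal sketch of
that decoding follows; the helper names are made up for illustration and are
not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define ESR_EC_SHIFT   26
#define ESR_EC_MASK    0x3fu
#define ESR_EC_SERROR  0x2fu        /* exception class for an SError interrupt */
#define PSTATE_A_BIT   (1u << 8)    /* set when SError/asynchronous aborts are masked */

static_assert(ESR_EC_SERROR <= ESR_EC_MASK, "EC is a 6-bit field");

/* Extract the exception class (bits [31:26]) from a syndrome value. */
static unsigned esr_ec(uint32_t esr)
{
    return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* True if the syndrome reports an SError interrupt. */
static int esr_is_serror(uint32_t esr)
{
    return esr_ec(esr) == ESR_EC_SERROR;
}

/* True if the saved PSTATE has asynchronous aborts masked. */
static int serror_masked(uint32_t pstate)
{
    return (pstate & PSTATE_A_BIT) != 0;
}
```

Applied to the log, esr_is_serror(0xbf000000) is true and
serror_masked(0x000002c5) is false.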



Thread overview: 18+ messages
2015-01-08 18:48 [PATCH v5 0/8] arm64: stable UEFI virtual mappings for kexec Ard Biesheuvel
2015-01-08 18:48 ` [PATCH v5 1/8] arm64/mm: add explicit struct_mm argument to __create_mapping() Ard Biesheuvel
2015-01-08 18:48 ` [PATCH v5 2/8] arm64/mm: add create_pgd_mapping() to create private page tables Ard Biesheuvel
2015-01-08 18:48 ` [PATCH v5 3/8] efi: split off remapping code from efi_config_init() Ard Biesheuvel
2015-01-08 18:48 ` [PATCH v5 4/8] efi: efistub: allow allocation alignment larger than EFI_PAGE_SIZE Ard Biesheuvel
2015-01-08 18:48 ` [PATCH v5 5/8] arm64/efi: set EFI_ALLOC_ALIGN to 64 KB Ard Biesheuvel
2015-01-08 18:48 ` [PATCH v5 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub Ard Biesheuvel
2015-01-09 16:41   ` Leif Lindholm
2015-01-12 11:46   ` Matt Fleming
2015-01-12 16:09     ` Ard Biesheuvel
2015-01-12 16:26       ` Matt Fleming
2015-01-08 18:48 ` [PATCH v5 7/8] arm64/efi: remove free_boot_services() and friends Ard Biesheuvel
2015-01-09 15:49   ` Will Deacon
2015-01-08 18:48 ` [PATCH v5 8/8] arm64/efi: remove idmap manipulations from UEFI code Ard Biesheuvel
2015-01-09 16:03   ` Will Deacon
2015-01-09 16:16 ` [PATCH v5 0/8] arm64: stable UEFI virtual mappings for kexec Leif Lindholm
2015-01-29  9:50 [PATCH v5 6/8] arm64/efi: move SetVirtualAddressMap() to UEFI stub Steve Capper
2015-01-29  9:55 ` Ard Biesheuvel
