From: "Kirill A. Shutemov"
To: Borislav Petkov, Andy Lutomirski, Sean Christopherson, Andrew Morton, Joerg Roedel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Varad Gautam, Dario Faggioli, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCH 2/5] efi/x86: Implement support for unaccepted memory
Date: Tue, 10 Aug 2021 09:26:23 +0300
Message-Id: <20210810062626.1012-3-kirill.shutemov@linux.intel.com>
In-Reply-To: <20210810062626.1012-1-kirill.shutemov@linux.intel.com>
References: <20210810062626.1012-1-kirill.shutemov@linux.intel.com>

UEFI Specification version 2.9 introduces the concept of memory
acceptance: some Virtual Machine platforms, such as Intel TDX or AMD
SEV-SNP, require memory to be accepted before it can be used by the
guest. Accepting happens via a protocol specific to the Virtual Machine
platform.

Accepting memory is costly, and it makes the VMM allocate memory for the
accepted guest physical address range. It is better to postpone memory
acceptance until the memory is needed.
Doing so lowers boot time and reduces memory overhead.

The kernel needs to know what memory has been accepted. Firmware
communicates this information via the memory map: a new memory type --
EFI_UNACCEPTED_MEMORY -- indicates such memory.

Range-based tracking works fine for firmware, but it gets bulky for the
kernel: e820 has to be modified on every page acceptance. This leads to
table fragmentation, and there is only a limited number of entries in
the e820 table.

The other option is to mark such memory as usable in e820 and track
whether the range has been accepted in a bitmap. One bit in the bitmap
represents 2MiB of the address space: one 4k page is enough to track
64GiB of physical address space.

In the worst case scenario -- a huge hole in the middle of the address
space -- we would need 256MiB to handle 4PiB of the address space.

Any unaccepted memory that is not aligned to 2M gets accepted upfront.

The bitmap is allocated and constructed in the EFI stub and passed down
to the kernel via boot_params. allocate_e820() allocates the bitmap if
unaccepted memory is present, sized according to the maximum address in
the memory map.

The same boot_params.unaccepted_memory can be used to pass the bitmap
between two kernels on kexec, but the use-case is not yet implemented.

Signed-off-by: Kirill A. Shutemov
---
 Documentation/x86/zero-page.rst              |  1 +
 arch/x86/boot/compressed/Makefile            |  1 +
 arch/x86/boot/compressed/bitmap.c            | 24 +++++++
 arch/x86/boot/compressed/unaccepted_memory.c | 36 ++++++++++
 arch/x86/include/asm/unaccepted_memory.h     | 12 ++++
 arch/x86/include/uapi/asm/bootparam.h        |  3 +-
 drivers/firmware/efi/Kconfig                 | 12 ++++
 drivers/firmware/efi/efi.c                   |  1 +
 drivers/firmware/efi/libstub/x86-stub.c      | 75 ++++++++++++++++----
 include/linux/efi.h                          |  3 +-
 10 files changed, 153 insertions(+), 15 deletions(-)
 create mode 100644 arch/x86/boot/compressed/bitmap.c
 create mode 100644 arch/x86/boot/compressed/unaccepted_memory.c
 create mode 100644 arch/x86/include/asm/unaccepted_memory.h

diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst
index f088f5881666..8e3447a4b373 100644
--- a/Documentation/x86/zero-page.rst
+++ b/Documentation/x86/zero-page.rst
@@ -42,4 +42,5 @@ Offset/Size	Proto	Name			Meaning
 2D0/A00		ALL	e820_table		E820 memory map table
 						(array of struct e820_entry)
 D00/1EC		ALL	eddbuf			EDD data (array of struct edd_info)
+EEC/008		ALL	unaccepted_memory	Bitmap of unaccepted memory (1bit == 2M)
 ===========	=====	=======================	=================================================
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 1bfe30ebadbe..f5b49e74d728 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -100,6 +100,7 @@ endif
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
 vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o
 vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdcall.o
+vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/bitmap.o $(obj)/unaccepted_memory.o
 
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
 efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
diff --git a/arch/x86/boot/compressed/bitmap.c b/arch/x86/boot/compressed/bitmap.c
new file mode 100644
index 000000000000..bf58b259380a
--- /dev/null
+++ b/arch/x86/boot/compressed/bitmap.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Taken from lib/bitmap.c */
+
+#include
+
+void __bitmap_set(unsigned long *map, unsigned int start, int len)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const unsigned int size = start + len;
+	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+	while (len - bits_to_set >= 0) {
+		*p |= mask_to_set;
+		len -= bits_to_set;
+		bits_to_set = BITS_PER_LONG;
+		mask_to_set = ~0UL;
+		p++;
+	}
+	if (len) {
+		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		*p |= mask_to_set;
+	}
+}
diff --git a/arch/x86/boot/compressed/unaccepted_memory.c b/arch/x86/boot/compressed/unaccepted_memory.c
new file mode 100644
index 000000000000..c2eca85b5073
--- /dev/null
+++ b/arch/x86/boot/compressed/unaccepted_memory.c
@@ -0,0 +1,36 @@
+#include "error.h"
+#include "misc.h"
+
+static inline void __accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	/* Platform-specific memory-acceptance call goes here */
+	error("Cannot accept memory");
+}
+
+void mark_unaccepted(struct boot_params *params, u64 start, u64 num)
+{
+	u64 end = start + num * PAGE_SIZE;
+	unsigned int npages;
+
+	if ((start & PMD_MASK) == (end & PMD_MASK)) {
+		npages = (end - start) / PAGE_SIZE;
+		__accept_memory(start, start + npages * PAGE_SIZE);
+		return;
+	}
+
+	if (start & ~PMD_MASK) {
+		npages = (round_up(start, PMD_SIZE) - start) / PAGE_SIZE;
+		__accept_memory(start, start + npages * PAGE_SIZE);
+		start = round_up(start, PMD_SIZE);
+	}
+
+	if (end & ~PMD_MASK) {
+		npages = (end - round_down(end, PMD_SIZE)) / PAGE_SIZE;
+		end = round_down(end, PMD_SIZE);
+		__accept_memory(end, end + npages * PAGE_SIZE);
+	}
+
+	npages = (end - start) / PMD_SIZE;
+	bitmap_set((unsigned long *)params->unaccepted_memory,
+		   start / PMD_SIZE, npages);
+}
diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h
new file mode 100644
index 000000000000..cbc24040b853
--- /dev/null
+++ b/arch/x86/include/asm/unaccepted_memory.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2020 Intel Corporation */
+#ifndef _ASM_X86_UNACCEPTED_MEMORY_H
+#define _ASM_X86_UNACCEPTED_MEMORY_H
+
+#include
+
+struct boot_params;
+
+void mark_unaccepted(struct boot_params *params, u64 start, u64 num);
+
+#endif
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f82c2f3..16bc686a198d 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -217,7 +217,8 @@ struct boot_params {
 	struct boot_e820_entry e820_table[E820_MAX_ENTRIES_ZEROPAGE]; /* 0x2d0 */
 	__u8  _pad8[48];				/* 0xcd0 */
 	struct edd_info eddbuf[EDDMAXNR];		/* 0xd00 */
-	__u8  _pad9[276];				/* 0xeec */
+	__u64 unaccepted_memory;			/* 0xeec */
+	__u8  _pad9[268];				/* 0xef4 */
 } __attribute__((packed));
 
 /**
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 2c3dac5ecb36..e13b584cdd80 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -243,6 +243,18 @@ config EFI_DISABLE_PCI_DMA
 	  options "efi=disable_early_pci_dma" or "efi=no_disable_early_pci_dma"
 	  may be used to override this option.
 
+config UNACCEPTED_MEMORY
+	bool
+	depends on EFI_STUB
+	help
+	  Some Virtual Machine platforms, such as Intel TDX, introduce
+	  the concept of memory acceptance, requiring memory to be accepted
+	  before it can be used by the guest. This protects against a class of
+	  attacks by the virtual machine platform.
+
+	  This option adds support for unaccepted memory and makes such memory
+	  usable by the kernel.
+
 endmenu
 
 config EFI_EMBEDDED_FIRMWARE
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 847f33ffc4ae..c6b8a1c5a87f 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -737,6 +737,7 @@ static __initdata char memory_type_name[][13] = {
 	"MMIO Port",
 	"PAL Code",
 	"Persistent",
+	"Unaccepted",
 };
 
 char * __init efi_md_typeattr_format(char *buf, size_t size,
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index f14c4ff5839f..e67ec1245f10 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -9,12 +9,14 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
 #include
 #include
 #include
+#include
 
 #include "efistub.h"
 
@@ -504,6 +506,12 @@ setup_e820(struct boot_params *params, struct setup_data *e820ext, u32 e820ext_s
 			e820_type = E820_TYPE_PMEM;
 			break;
 
+		case EFI_UNACCEPTED_MEMORY:
+			if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY))
+				continue;
+			e820_type = E820_TYPE_RAM;
+			mark_unaccepted(params, d->phys_addr, d->num_pages);
+			break;
 		default:
 			continue;
 		}
@@ -569,30 +577,71 @@ static efi_status_t alloc_e820ext(u32 nr_desc, struct setup_data **e820ext,
 }
 
 static efi_status_t allocate_e820(struct boot_params *params,
+				  struct efi_boot_memmap *map,
 				  struct setup_data **e820ext,
 				  u32 *e820ext_size)
 {
-	unsigned long map_size, desc_size, map_key;
 	efi_status_t status;
-	__u32 nr_desc, desc_version;
-
-	/* Only need the size of the mem map and size of each mem descriptor */
-	map_size = 0;
-	status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key,
-			     &desc_size, &desc_version);
-	if (status != EFI_BUFFER_TOO_SMALL)
-		return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED;
+	__u32 nr_desc;
+	bool unaccepted_memory_present = false;
+	u64 max_addr = 0;
+	int i;
 
-	nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS;
+	status = efi_get_memory_map(map);
+	if (status != EFI_SUCCESS)
+		return status;
 
-	if (nr_desc > ARRAY_SIZE(params->e820_table)) {
-		u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table);
+	nr_desc = *map->map_size / *map->desc_size;
+	if (nr_desc > ARRAY_SIZE(params->e820_table) - EFI_MMAP_NR_SLACK_SLOTS) {
+		u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table) +
+			EFI_MMAP_NR_SLACK_SLOTS;
 
 		status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size);
 		if (status != EFI_SUCCESS)
 			return status;
 	}
 
+	if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY))
+		return EFI_SUCCESS;
+
+	/* Check if there's any unaccepted memory and find the max address */
+	for (i = 0; i < nr_desc; i++) {
+		efi_memory_desc_t *d;
+
+		d = efi_early_memdesc_ptr(*map->map, *map->desc_size, i);
+		if (d->type == EFI_UNACCEPTED_MEMORY)
+			unaccepted_memory_present = true;
+		if (d->phys_addr + d->num_pages * PAGE_SIZE > max_addr)
+			max_addr = d->phys_addr + d->num_pages * PAGE_SIZE;
+	}
+
+	/*
+	 * If unaccepted memory is present, allocate a bitmap to track what
+	 * memory has to be accepted before access.
+	 *
+	 * One bit in the bitmap represents 2MiB in the address space: one 4k
+	 * page is enough to track 64GiB of physical address space.
+	 *
+	 * In the worst case scenario -- a huge hole in the middle of the
+	 * address space -- we would need 256MiB to handle 4PiB of the address
+	 * space.
+	 *
+	 * TODO: handle the case where params->unaccepted_memory is already
+	 * set. It's required to deal with kexec.
+	 */
+	if (unaccepted_memory_present) {
+		unsigned long *unaccepted_memory = NULL;
+		u64 size = DIV_ROUND_UP(max_addr, PMD_SIZE * BITS_PER_BYTE);
+
+		status = efi_allocate_pages(size,
+					    (unsigned long *)&unaccepted_memory,
+					    ULONG_MAX);
+		if (status != EFI_SUCCESS)
+			return status;
+		memset(unaccepted_memory, 0, size);
+		params->unaccepted_memory = (u64)unaccepted_memory;
+	}
+
 	return EFI_SUCCESS;
 }
 
@@ -642,7 +691,7 @@ static efi_status_t exit_boot(struct boot_params *boot_params, void *handle)
 	priv.boot_params	= boot_params;
 	priv.efi		= &boot_params->efi_info;
 
-	status = allocate_e820(boot_params, &e820ext, &e820ext_size);
+	status = allocate_e820(boot_params, &map, &e820ext, &e820ext_size);
 	if (status != EFI_SUCCESS)
 		return status;
 
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 6b5d36babfcc..d43cc872b582 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -108,7 +108,8 @@ typedef struct {
 #define EFI_MEMORY_MAPPED_IO_PORT_SPACE	12
 #define EFI_PAL_CODE			13
 #define EFI_PERSISTENT_MEMORY		14
-#define EFI_MAX_MEMORY_TYPE		15
+#define EFI_UNACCEPTED_MEMORY		15
+#define EFI_MAX_MEMORY_TYPE		16
 
 /* Attribute values: */
 #define EFI_MEMORY_UC		((u64)0x0000000000000001ULL)	/* uncached */
-- 
2.31.1
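
Illustrative note, not part of the patch: the bitmap arithmetic in the
commit message (one bit per 2MiB, 4k of bitmap per 64GiB, 256MiB for
4PiB) and the head/tail splitting that mark_unaccepted() performs can be
checked in userspace with a rough sketch like the one below. The names
bitmap_bytes(), split_range(), struct split, PAGE_SIZE_4K and
PMD_SIZE_2M are hypothetical stand-ins chosen for this example; only the
arithmetic is meant to mirror the patch, and 4k pages with 2MiB PMDs are
assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: 4k base pages, one bitmap bit per 2MiB chunk. */
#define PAGE_SIZE_4K 4096ULL
#define PMD_SIZE_2M  (2ULL << 20)

/* Bytes of bitmap needed to cover physical addresses [0, max_addr). */
static uint64_t bitmap_bytes(uint64_t max_addr)
{
	uint64_t covered_per_byte = PMD_SIZE_2M * 8; /* 16MiB per bitmap byte */

	return (max_addr + covered_per_byte - 1) / covered_per_byte;
}

/* Hypothetical result type: how a range is divided up. */
struct split {
	uint64_t head_pages; /* 4k pages accepted upfront at the start */
	uint64_t tail_pages; /* 4k pages accepted upfront at the end */
	uint64_t first_bit;  /* first bitmap bit to set */
	uint64_t nbits;      /* number of 2MiB chunks left unaccepted */
};

/* Same decision structure as mark_unaccepted(), without side effects. */
static struct split split_range(uint64_t start, uint64_t num_pages)
{
	uint64_t end = start + num_pages * PAGE_SIZE_4K;
	uint64_t low_bits = PMD_SIZE_2M - 1; /* offset within a 2MiB chunk */
	struct split s = { 0, 0, 0, 0 };

	assert(start % PAGE_SIZE_4K == 0);

	/* Start and end fall in the same 2MiB chunk: accept it all upfront. */
	if ((start & ~low_bits) == (end & ~low_bits)) {
		s.head_pages = num_pages;
		return s;
	}

	/* Unaligned head: accept up to the next 2MiB boundary. */
	if (start & low_bits) {
		uint64_t aligned = (start + low_bits) & ~low_bits;

		s.head_pages = (aligned - start) / PAGE_SIZE_4K;
		start = aligned;
	}

	/* Unaligned tail: accept from the previous 2MiB boundary. */
	if (end & low_bits) {
		uint64_t aligned = end & ~low_bits;

		s.tail_pages = (end - aligned) / PAGE_SIZE_4K;
		end = aligned;
	}

	s.first_bit = start / PMD_SIZE_2M;
	s.nbits = (end - start) / PMD_SIZE_2M;
	return s;
}
```

For example, under these assumptions a 4MiB range starting at 1MiB
splits into 256 head pages, 256 tail pages, and a single bitmap bit
(bit 1), while a range that never crosses a 2MiB boundary sets no bits
at all.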