* [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD)
@ 2017-10-16 15:34 Brijesh Singh
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, Tom Lendacky, Paolo Bonzini,
	Radim Krčmář,
	kvm, linux-kernel

This part of the Secure Encrypted Virtualization (SEV) series focuses on the
changes required in a guest OS for SEV support.

When SEV is active, the memory contents of the guest OS are transparently
encrypted with a key unique to the guest VM.

SEV guests have a concept of private and shared memory. Private memory is
encrypted with the guest-specific key, while shared memory may be encrypted
with the hypervisor key. Certain types of memory (namely instruction pages and
guest page tables) are always treated as private. For security reasons, all
DMA operations inside the guest must be performed on shared memory.
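
As an illustration of the shared-memory requirement, here is a minimal,
hypothetical sketch of how a guest component could make a single page shared
with the hypervisor. This is not code from this series; it only assumes the
x86 set_memory_decrypted() helper (which later patches in this series build
on), kernel context, and the made-up helper name alloc_shared_page():

#include <linux/gfp.h>
#include <linux/mm.h>
#include <asm/set_memory.h>

/* Allocate one page and clear its C-bit so the hypervisor can access it. */
static void *alloc_shared_page(void)
{
	struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);

	if (!page)
		return NULL;

	/* Mark the page decrypted (shared), e.g. for use as a DMA buffer. */
	if (set_memory_decrypted((unsigned long)page_address(page), 1)) {
		__free_page(page);
		return NULL;
	}

	return page_address(page);
}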

The SEV feature is enabled by the hypervisor, and the guest can identify it
through a CPUID function and the 0xc0010131 (F17H_SEV) MSR. When SEV is
enabled, the page table entries determine how memory is accessed. If a page
table entry has the memory encryption mask set, then that memory is accessed
using the guest-specific key. Certain memory (instruction pages, page tables)
is always accessed using the guest-specific key.
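
For illustration only, the detection flow above can be sketched in C roughly
as below (ring-0 context assumed, since RDMSR is privileged; the extended
max-leaf check is omitted for brevity; the helper name is made up). The
actual code in this series lives in the compressed-boot assembly and in
arch/x86/mm/mem_encrypt.c rather than in a helper like this:

#include <linux/types.h>

/* Sketch of SEV detection: hypervisor bit, SEV feature bit, SEV MSR. */
static bool guest_is_sev_active(void)
{
	unsigned int eax, ebx, ecx, edx, lo, hi;

	/* CPUID 0x00000001: ECX bit 31 is set when running under a hypervisor. */
	eax = 0x00000001;
	ecx = 0;
	asm volatile("cpuid"
		     : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
		     : "0" (eax), "2" (ecx));
	if (!(ecx & (1U << 31)))
		return false;

	/* CPUID 0x8000001f: EAX bit 1 advertises SEV support. */
	eax = 0x8000001f;
	ecx = 0;
	asm volatile("cpuid"
		     : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
		     : "0" (eax), "2" (ecx));
	if (!(eax & (1U << 1)))
		return false;

	/* MSR 0xc0010131: bit 0 indicates SEV is enabled for this guest. */
	asm volatile("rdmsr" : "=a" (lo), "=d" (hi) : "c" (0xc0010131));
	(void)hi;	/* high half of the MSR is not needed here */

	return lo & 1U;
}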

This patch series builds upon the Secure Memory Encryption (SME) feature.
Unlike SME, when SEV is enabled, all the data (e.g. EFI, kernel, initrd, etc.)
will already have been placed into memory as encrypted by the guest BIOS.

The approach that this patch series takes is to encrypt everything possible
starting early in boot. Since DMA operations inside the guest must be
performed on shared memory, SWIOTLB is used to complete the DMA operations.

The following links provide additional details:

AMD Memory Encryption whitepaper:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
    http://support.amd.com/TechDocs/24593.pdf
    SME is section 7.10
    SEV is section 15.34

Secure Encrypted Virtualization Key Management:
http://support.amd.com/TechDocs/55766_SEV-KM API_Specification.pdf

KVM Forum Presentation:
http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf

SEV Guest BIOS support:
  SEV support has been accepted into EDKII/OVMF BIOS
  https://github.com/tianocore/edk2/commits/master

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org

---
This series is based on tip/master commit 3594329f88c5 (Merge branch 'linus')

Complete git tree is available: https://github.com/codomania/tip/tree/v6-p1

Changes since v5:
 * enhance early_set_memory_decrypted() to encrypt/decrypt the memory contents
   in addition to changing the C-bit

Changes since v4:
 * rename per-CPU define to DEFINE_PER_CPU_DECRYPTED
 * add more comments in per-CPU section definition
 * rename __sev_active() to sev_key_active() to use more obvious naming
 * changes to address v4 feedbacks

Changes since v3:
 * use static key to branch the unrolling of rep ins/outs when SEV is active
 * simplify the memory encryption detection logic
 * rename per-cpu define to DEFINE_PER_CPU_UNENCRYPTED
 * simplify the logic to map per-cpu as unencrypted
 * changes to address v3 feedbacks

Changes since v2:
 * add documentation
 * update early_set_memory_* to use kernel_physical_mapping_init()
   to split larger pages into smaller ones (recommended by Boris)
 * changes to address v2 feedback
 * drop hypervisor specific patches, those patches will be included in part 2

Brijesh Singh (5):
  Documentation/x86: Add AMD Secure Encrypted Virtualization (SEV)
    description
  x86: Add support for changing memory encryption attribute in early
    boot
  percpu: Introduce DEFINE_PER_CPU_DECRYPTED
  X86/KVM: Decrypt shared per-cpu variables when SEV is active
  X86/KVM: Clear encryption attribute when SEV is active

Tom Lendacky (12):
  x86/mm: Add Secure Encrypted Virtualization (SEV) support
  x86/mm: Don't attempt to encrypt initrd under SEV
  x86/realmode: Don't decrypt trampoline area under SEV
  x86/mm: Use encrypted access of boot related data with SEV
  x86/mm: Include SEV for encryption memory attribute changes
  x86/efi: Access EFI data as encrypted when SEV is active
  resource: Consolidate resource walking code
  resource: Provide resource struct in resource walk callback
  x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory
    pages
  x86/mm: Add DMA support for SEV memory encryption
  x86/boot: Add early boot support when running with SEV active
  x86/io: Unroll string I/O when SEV is active

 Documentation/x86/amd-memory-encryption.txt |  30 ++-
 arch/powerpc/kernel/machine_kexec_file_64.c |  12 +-
 arch/x86/boot/compressed/Makefile           |   1 +
 arch/x86/boot/compressed/head_64.S          |  16 ++
 arch/x86/boot/compressed/mem_encrypt.S      | 120 +++++++++++
 arch/x86/boot/compressed/misc.h             |   2 +
 arch/x86/boot/compressed/pagetable.c        |   8 +-
 arch/x86/entry/vdso/vma.c                   |   5 +-
 arch/x86/include/asm/io.h                   |  42 +++-
 arch/x86/include/asm/mem_encrypt.h          |  14 ++
 arch/x86/include/asm/msr-index.h            |   3 +
 arch/x86/include/uapi/asm/kvm_para.h        |   1 -
 arch/x86/kernel/crash.c                     |  18 +-
 arch/x86/kernel/kvm.c                       |  35 +++-
 arch/x86/kernel/kvmclock.c                  |  65 +++++-
 arch/x86/kernel/pmem.c                      |   2 +-
 arch/x86/kernel/setup.c                     |   6 +-
 arch/x86/mm/ioremap.c                       | 123 +++++++++---
 arch/x86/mm/mem_encrypt.c                   | 301 +++++++++++++++++++++++++++-
 arch/x86/mm/pageattr.c                      |   4 +-
 arch/x86/platform/efi/efi_64.c              |  16 +-
 arch/x86/realmode/init.c                    |   5 +-
 include/asm-generic/vmlinux.lds.h           |  19 ++
 include/linux/ioport.h                      |   7 +-
 include/linux/kexec.h                       |   2 +-
 include/linux/mem_encrypt.h                 |   7 +-
 include/linux/percpu-defs.h                 |  15 ++
 kernel/kexec_file.c                         |   5 +-
 kernel/resource.c                           |  76 ++++---
 lib/swiotlb.c                               |   5 +-
 30 files changed, 842 insertions(+), 123 deletions(-)
 create mode 100644 arch/x86/boot/compressed/mem_encrypt.S

-- 
2.9.5


* [Part1 PATCH v6 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Provide support for Secure Encrypted Virtualization (SEV). This initial
support defines a flag that is used by the kernel to determine if it is
running with SEV active.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/mem_encrypt.h |  6 ++++++
 arch/x86/mm/mem_encrypt.c          | 26 ++++++++++++++++++++++++++
 include/linux/mem_encrypt.h        |  7 +++++--
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 6a77c63540f7..2b024741bce9 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -47,6 +47,9 @@ void __init mem_encrypt_init(void);
 
 void swiotlb_set_mem_attributes(void *vaddr, unsigned long size);
 
+bool sme_active(void);
+bool sev_active(void);
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define sme_me_mask	0ULL
@@ -64,6 +67,9 @@ static inline void __init sme_early_init(void) { }
 static inline void __init sme_encrypt_kernel(void) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
+static inline bool sme_active(void) { return false; }
+static inline bool sev_active(void) { return false; }
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 /*
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 16c5f37933a2..af7f733f6f31 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -42,6 +42,8 @@ static char sme_cmdline_off[] __initdata = "off";
 u64 sme_me_mask __section(.data) = 0;
 EXPORT_SYMBOL_GPL(sme_me_mask);
 
+static bool sev_enabled __section(.data) = false;
+
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
 
@@ -192,6 +194,30 @@ void __init sme_early_init(void)
 		protection_map[i] = pgprot_encrypted(protection_map[i]);
 }
 
+/*
+ * SME and SEV are very similar but they are not the same, so there are
+ * times that the kernel will need to distinguish between SME and SEV. The
+ * sme_active() and sev_active() functions are used for this.  When a
+ * distinction isn't needed, the mem_encrypt_active() function can be used.
+ *
+ * The trampoline code is a good example for this requirement.  Before
+ * paging is activated, SME will access all memory as decrypted, but SEV
+ * will access all memory as encrypted.  So, when APs are being brought
+ * up under SME the trampoline area cannot be encrypted, whereas under SEV
+ * the trampoline area must be encrypted.
+ */
+bool sme_active(void)
+{
+	return sme_me_mask && !sev_enabled;
+}
+EXPORT_SYMBOL_GPL(sme_active);
+
+bool sev_active(void)
+{
+	return sme_me_mask && sev_enabled;
+}
+EXPORT_SYMBOL_GPL(sev_active);
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void)
 {
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 265a9cd21cb4..b310a9c18113 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -23,11 +23,14 @@
 
 #define sme_me_mask	0ULL
 
+static inline bool sme_active(void) { return false; }
+static inline bool sev_active(void) { return false; }
+
 #endif	/* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
-static inline bool sme_active(void)
+static inline bool mem_encrypt_active(void)
 {
-	return !!sme_me_mask;
+	return sme_me_mask;
 }
 
 static inline u64 sme_get_me_mask(void)
-- 
2.9.5


* [Part1 PATCH v6 03/17] x86/mm: Don't attempt to encrypt initrd under SEV
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

When SEV is active, the initrd/initramfs will already have been placed in
memory encrypted, so do not try to encrypt it.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/setup.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 17dea09f06a3..bb5c3b4ea00f 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -368,9 +368,11 @@ static void __init reserve_initrd(void)
 	 * If SME is active, this memory will be marked encrypted by the
 	 * kernel when it is accessed (including relocation). However, the
 	 * ramdisk image was loaded decrypted by the bootloader, so make
-	 * sure that it is encrypted before accessing it.
+	 * sure that it is encrypted before accessing it. For SEV the
+	 * ramdisk will already be encrypted, so only do this for SME.
 	 */
-	sme_early_encrypt(ramdisk_image, ramdisk_end - ramdisk_image);
+	if (sme_active())
+		sme_early_encrypt(ramdisk_image, ramdisk_end - ramdisk_image);
 
 	initrd_start = 0;
 
-- 
2.9.5


* [Part1 PATCH v6 04/17] x86/realmode: Don't decrypt trampoline area under SEV
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, Laura Abbott,
	Kirill A. Shutemov, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

When SEV is active, the trampoline area needs to be in encrypted memory, so
only mark the area decrypted if SME is active.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/realmode/init.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 1f71980fc5e0..d03125c2b73b 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -63,9 +63,10 @@ static void __init setup_real_mode(void)
 	/*
 	 * If SME is active, the trampoline area will need to be in
 	 * decrypted memory in order to bring up other processors
-	 * successfully.
+	 * successfully. This is not needed for SEV.
 	 */
-	set_memory_decrypted((unsigned long)base, size >> PAGE_SHIFT);
+	if (sme_active())
+		set_memory_decrypted((unsigned long)base, size >> PAGE_SHIFT);
 
 	memcpy(base, real_mode_blob, size);
 
-- 
2.9.5


* [Part1 PATCH v6 05/17] x86/mm: Use encrypted access of boot related data with SEV
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, Laura Abbott,
	Kirill A. Shutemov, Matt Fleming, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

When Secure Encrypted Virtualization (SEV) is active, boot data (such as
EFI-related data and setup data) is encrypted and needs to be accessed as
such when mapped. Update the architecture override in early_memremap() to
keep the encryption attribute when mapping this data.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/ioremap.c | 44 ++++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 34f0e1847dd6..52cc0f4ed494 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -422,6 +422,9 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
  * areas should be mapped decrypted. And since the encryption key can
  * change across reboots, persistent memory should also be mapped
  * decrypted.
+ *
+ * If SEV is active, that implies that BIOS/UEFI also ran encrypted so
+ * only persistent memory should be mapped decrypted.
  */
 static bool memremap_should_map_decrypted(resource_size_t phys_addr,
 					  unsigned long size)
@@ -458,6 +461,11 @@ static bool memremap_should_map_decrypted(resource_size_t phys_addr,
 	case E820_TYPE_ACPI:
 	case E820_TYPE_NVS:
 	case E820_TYPE_UNUSABLE:
+		/* For SEV, these areas are encrypted */
+		if (sev_active())
+			break;
+		/* Fallthrough */
+
 	case E820_TYPE_PRAM:
 		return true;
 	default:
@@ -581,7 +589,7 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr,
 bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long size,
 				 unsigned long flags)
 {
-	if (!sme_active())
+	if (!mem_encrypt_active())
 		return true;
 
 	if (flags & MEMREMAP_ENC)
@@ -590,12 +598,13 @@ bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long size,
 	if (flags & MEMREMAP_DEC)
 		return false;
 
-	if (memremap_is_setup_data(phys_addr, size) ||
-	    memremap_is_efi_data(phys_addr, size) ||
-	    memremap_should_map_decrypted(phys_addr, size))
-		return false;
+	if (sme_active()) {
+		if (memremap_is_setup_data(phys_addr, size) ||
+		    memremap_is_efi_data(phys_addr, size))
+			return false;
+	}
 
-	return true;
+	return !memremap_should_map_decrypted(phys_addr, size);
 }
 
 /*
@@ -608,17 +617,24 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
 					     unsigned long size,
 					     pgprot_t prot)
 {
-	if (!sme_active())
+	bool encrypted_prot;
+
+	if (!mem_encrypt_active())
 		return prot;
 
-	if (early_memremap_is_setup_data(phys_addr, size) ||
-	    memremap_is_efi_data(phys_addr, size) ||
-	    memremap_should_map_decrypted(phys_addr, size))
-		prot = pgprot_decrypted(prot);
-	else
-		prot = pgprot_encrypted(prot);
+	encrypted_prot = true;
+
+	if (sme_active()) {
+		if (early_memremap_is_setup_data(phys_addr, size) ||
+		    memremap_is_efi_data(phys_addr, size))
+			encrypted_prot = false;
+	}
+
+	if (encrypted_prot && memremap_should_map_decrypted(phys_addr, size))
+		encrypted_prot = false;
 
-	return prot;
+	return encrypted_prot ? pgprot_encrypted(prot)
+			      : pgprot_decrypted(prot);
 }
 
 bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size)
-- 
2.9.5


* [Part1 PATCH v6 06/17] x86/mm: Include SEV for encryption memory attribute changes
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, John Ogness, Matt Fleming,
	Laura Abbott, Dan Williams, Kirill A. Shutemov, linux-kernel,
	Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

The current code checks only for sme_active() when determining whether
to perform the encryption attribute change.  Include sev_active() in this
check so that memory attribute changes can occur under SME and SEV.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/pageattr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index dfb7d657cf43..3fe68483463c 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1781,8 +1781,8 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 	unsigned long start;
 	int ret;
 
-	/* Nothing to do if the SME is not active */
-	if (!sme_active())
+	/* Nothing to do if memory encryption is not active */
+	if (!mem_encrypt_active())
 		return 0;
 
 	/* Should not be working on unaligned addresses */
-- 
2.9.5


* [Part1 PATCH v6 07/17] x86/efi: Access EFI data as encrypted when SEV is active
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, Matt Fleming, Ard Biesheuvel,
	linux-efi, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

EFI data is encrypted when the kernel is run under SEV. Update the
page table references to be sure the EFI memory areas are accessed
encrypted.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-efi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/platform/efi/efi_64.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 12e83888e5b9..5469c9319f43 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -32,6 +32,7 @@
 #include <linux/reboot.h>
 #include <linux/slab.h>
 #include <linux/ucs2_string.h>
+#include <linux/mem_encrypt.h>
 
 #include <asm/setup.h>
 #include <asm/page.h>
@@ -369,7 +370,11 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	 * as trim_bios_range() will reserve the first page and isolate it away
 	 * from memory allocators anyway.
 	 */
-	if (kernel_map_pages_in_pgd(pgd, 0x0, 0x0, 1, _PAGE_RW)) {
+	pf = _PAGE_RW;
+	if (sev_active())
+		pf |= _PAGE_ENC;
+
+	if (kernel_map_pages_in_pgd(pgd, 0x0, 0x0, 1, pf)) {
 		pr_err("Failed to create 1:1 mapping for the first page!\n");
 		return 1;
 	}
@@ -412,6 +417,9 @@ static void __init __map_region(efi_memory_desc_t *md, u64 va)
 	if (!(md->attribute & EFI_MEMORY_WB))
 		flags |= _PAGE_PCD;
 
+	if (sev_active())
+		flags |= _PAGE_ENC;
+
 	pfn = md->phys_addr >> PAGE_SHIFT;
 	if (kernel_map_pages_in_pgd(pgd, pfn, va, md->num_pages, flags))
 		pr_warn("Error mapping PA 0x%llx -> VA 0x%llx!\n",
@@ -538,6 +546,9 @@ static int __init efi_update_mem_attr(struct mm_struct *mm, efi_memory_desc_t *m
 	if (!(md->attribute & EFI_MEMORY_RO))
 		pf |= _PAGE_RW;
 
+	if (sev_active())
+		pf |= _PAGE_ENC;
+
 	return efi_update_mappings(md, pf);
 }
 
@@ -589,6 +600,9 @@ void __init efi_runtime_update_mappings(void)
 			(md->type != EFI_RUNTIME_SERVICES_CODE))
 			pf |= _PAGE_RW;
 
+		if (sev_active())
+			pf |= _PAGE_ENC;
+
 		efi_update_mappings(md, pf);
 	}
 }
-- 
2.9.5


* [Part1 PATCH v6 08/17] resource: Consolidate resource walking code
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86; +Cc: bp, Tom Lendacky, Borislav Petkov, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

The walk_iomem_res_desc(), walk_system_ram_res() and walk_system_ram_range()
functions each have much of the same code.  Create a new function that
consolidates the common code from these functions in one place to reduce
the amount of duplicated code.

Cc: Borislav Petkov <bp@suse.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 kernel/resource.c | 52 +++++++++++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 27 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f04404152..7323c1b636cd 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -400,6 +400,26 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 	return 0;
 }
 
+static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
+				 bool first_level_children_only,
+				 void *arg, int (*func)(u64, u64, void *))
+{
+	u64 orig_end = res->end;
+	int ret = -1;
+
+	while ((res->start < res->end) &&
+	       !find_next_iomem_res(res, desc, first_level_children_only)) {
+		ret = (*func)(res->start, res->end, arg);
+		if (ret)
+			break;
+
+		res->start = res->end + 1;
+		res->end = orig_end;
+	}
+
+	return ret;
+}
+
 /*
  * Walks through iomem resources and calls func() with matching resource
  * ranges. This walks through whole tree and not just first level children.
@@ -418,26 +438,12 @@ int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(u64, u64, void *))
 {
 	struct resource res;
-	u64 orig_end;
-	int ret = -1;
 
 	res.start = start;
 	res.end = end;
 	res.flags = flags;
-	orig_end = res.end;
-
-	while ((res.start < res.end) &&
-		(!find_next_iomem_res(&res, desc, false))) {
-
-		ret = (*func)(res.start, res.end, arg);
-		if (ret)
-			break;
-
-		res.start = res.end + 1;
-		res.end = orig_end;
-	}
 
-	return ret;
+	return __walk_iomem_res_desc(&res, desc, false, arg, func);
 }
 
 /*
@@ -451,22 +457,13 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 				int (*func)(u64, u64, void *))
 {
 	struct resource res;
-	u64 orig_end;
-	int ret = -1;
 
 	res.start = start;
 	res.end = end;
 	res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	orig_end = res.end;
-	while ((res.start < res.end) &&
-		(!find_next_iomem_res(&res, IORES_DESC_NONE, true))) {
-		ret = (*func)(res.start, res.end, arg);
-		if (ret)
-			break;
-		res.start = res.end + 1;
-		res.end = orig_end;
-	}
-	return ret;
+
+	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+				     arg, func);
 }
 
 #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
@@ -508,6 +505,7 @@ static int __is_ram(unsigned long pfn, unsigned long nr_pages, void *arg)
 {
 	return 1;
 }
+
 /*
  * This generic page_is_ram() returns true if specified address is
  * registered as System RAM in iomem_resource list.
-- 
2.9.5


* [Part1 PATCH v6 09/17] resource: Provide resource struct in resource walk callback
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, linux-kernel, linuxppc-dev,
	Benjamin Herrenschmidt, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

In preparation for a new function that will need additional resource
information during the resource walk, update the resource walk callback to
pass the resource structure. Since the current callback start and end
arguments are pulled from the resource structure, the callback functions can
obtain them from the resource structure directly.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/powerpc/kernel/machine_kexec_file_64.c | 12 +++++++++---
 arch/x86/kernel/crash.c                     | 18 +++++++++---------
 arch/x86/kernel/pmem.c                      |  2 +-
 include/linux/ioport.h                      |  4 ++--
 include/linux/kexec.h                       |  2 +-
 kernel/kexec_file.c                         |  5 +++--
 kernel/resource.c                           |  9 +++++----
 7 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index 992c0d258e5d..e4395f937d63 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -91,11 +91,13 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
  * and that value will be returned. If all free regions are visited without
  * func returning non-zero, then zero will be returned.
  */
-int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
+int arch_kexec_walk_mem(struct kexec_buf *kbuf,
+			int (*func)(struct resource *, void *))
 {
 	int ret = 0;
 	u64 i;
 	phys_addr_t mstart, mend;
+	struct resource res = { };
 
 	if (kbuf->top_down) {
 		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
@@ -105,7 +107,9 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
 			 * range while in kexec, end points to the last byte
 			 * in the range.
 			 */
-			ret = func(mstart, mend - 1, kbuf);
+			res.start = mstart;
+			res.end = mend - 1;
+			ret = func(&res, kbuf);
 			if (ret)
 				break;
 		}
@@ -117,7 +121,9 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
 			 * range while in kexec, end points to the last byte
 			 * in the range.
 			 */
-			ret = func(mstart, mend - 1, kbuf);
+			res.start = mstart;
+			res.end = mend - 1;
+			ret = func(&res, kbuf);
 			if (ret)
 				break;
 		}
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44404e2307bb..815008c9ca18 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -209,7 +209,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_KEXEC_FILE
-static int get_nr_ram_ranges_callback(u64 start, u64 end, void *arg)
+static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 {
 	unsigned int *nr_ranges = arg;
 
@@ -342,7 +342,7 @@ static int elf_header_exclude_ranges(struct crash_elf_data *ced,
 	return ret;
 }
 
-static int prepare_elf64_ram_headers_callback(u64 start, u64 end, void *arg)
+static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
 {
 	struct crash_elf_data *ced = arg;
 	Elf64_Ehdr *ehdr;
@@ -355,7 +355,7 @@ static int prepare_elf64_ram_headers_callback(u64 start, u64 end, void *arg)
 	ehdr = ced->ehdr;
 
 	/* Exclude unwanted mem ranges */
-	ret = elf_header_exclude_ranges(ced, start, end);
+	ret = elf_header_exclude_ranges(ced, res->start, res->end);
 	if (ret)
 		return ret;
 
@@ -518,14 +518,14 @@ static int add_e820_entry(struct boot_params *params, struct e820_entry *entry)
 	return 0;
 }
 
-static int memmap_entry_callback(u64 start, u64 end, void *arg)
+static int memmap_entry_callback(struct resource *res, void *arg)
 {
 	struct crash_memmap_data *cmd = arg;
 	struct boot_params *params = cmd->params;
 	struct e820_entry ei;
 
-	ei.addr = start;
-	ei.size = end - start + 1;
+	ei.addr = res->start;
+	ei.size = res->end - res->start + 1;
 	ei.type = cmd->type;
 	add_e820_entry(params, &ei);
 
@@ -619,12 +619,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	return ret;
 }
 
-static int determine_backup_region(u64 start, u64 end, void *arg)
+static int determine_backup_region(struct resource *res, void *arg)
 {
 	struct kimage *image = arg;
 
-	image->arch.backup_src_start = start;
-	image->arch.backup_src_sz = end - start + 1;
+	image->arch.backup_src_start = res->start;
+	image->arch.backup_src_sz = res->end - res->start + 1;
 
 	/* Expecting only one range for backup region */
 	return 1;
diff --git a/arch/x86/kernel/pmem.c b/arch/x86/kernel/pmem.c
index 0c5315d322c8..297bb71f10c5 100644
--- a/arch/x86/kernel/pmem.c
+++ b/arch/x86/kernel/pmem.c
@@ -6,7 +6,7 @@
 #include <linux/init.h>
 #include <linux/ioport.h>
 
-static int found(u64 start, u64 end, void *data)
+static int found(struct resource *res, void *data)
 {
 	return 1;
 }
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index f5cf32e80041..617d8a2aac67 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -271,10 +271,10 @@ walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *));
 extern int
 walk_system_ram_res(u64 start, u64 end, void *arg,
-		    int (*func)(u64, u64, void *));
+		    int (*func)(struct resource *, void *));
 extern int
 walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
-		    void *arg, int (*func)(u64, u64, void *));
+		    void *arg, int (*func)(struct resource *, void *));
 
 /* True if any part of r1 overlaps r2 */
 static inline bool resource_overlaps(struct resource *r1, struct resource *r2)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 2b7590f5483a..d672f955100a 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -159,7 +159,7 @@ struct kexec_buf {
 };
 
 int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
-			       int (*func)(u64, u64, void *));
+			       int (*func)(struct resource *, void *));
 extern int kexec_add_buffer(struct kexec_buf *kbuf);
 int kexec_locate_mem_hole(struct kexec_buf *kbuf);
 #endif /* CONFIG_KEXEC_FILE */
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 9f48f4412297..e5bcd94c1efb 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -406,9 +406,10 @@ static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
 	return 1;
 }
 
-static int locate_mem_hole_callback(u64 start, u64 end, void *arg)
+static int locate_mem_hole_callback(struct resource *res, void *arg)
 {
 	struct kexec_buf *kbuf = (struct kexec_buf *)arg;
+	u64 start = res->start, end = res->end;
 	unsigned long sz = end - start + 1;
 
 	/* Returning 0 will take to next memory range */
@@ -437,7 +438,7 @@ static int locate_mem_hole_callback(u64 start, u64 end, void *arg)
  * func returning non-zero, then zero will be returned.
  */
 int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
-			       int (*func)(u64, u64, void *))
+			       int (*func)(struct resource *, void *))
 {
 	if (kbuf->image->type == KEXEC_TYPE_CRASH)
 		return walk_iomem_res_desc(crashk_res.desc,
diff --git a/kernel/resource.c b/kernel/resource.c
index 7323c1b636cd..8430042fa77b 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -402,14 +402,15 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 
 static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
 				 bool first_level_children_only,
-				 void *arg, int (*func)(u64, u64, void *))
+				 void *arg,
+				 int (*func)(struct resource *, void *))
 {
 	u64 orig_end = res->end;
 	int ret = -1;
 
 	while ((res->start < res->end) &&
 	       !find_next_iomem_res(res, desc, first_level_children_only)) {
-		ret = (*func)(res->start, res->end, arg);
+		ret = (*func)(res, arg);
 		if (ret)
 			break;
 
@@ -435,7 +436,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
  * <linux/ioport.h> and set it in 'desc' of a target resource entry.
  */
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
-		u64 end, void *arg, int (*func)(u64, u64, void *))
+		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
 	struct resource res;
 
@@ -454,7 +455,7 @@ int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
  * ranges.
  */
 int walk_system_ram_res(u64 start, u64 end, void *arg,
-				int (*func)(u64, u64, void *))
+				int (*func)(struct resource *, void *))
 {
 	struct resource res;
 
-- 
2.9.5


* [Part1 PATCH v6 10/17] x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Kirill A. Shutemov, Laura Abbott,
	Andy Lutomirski, Jérôme Glisse, Andrew Morton,
	Dan Williams, Kees Cook, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

In order for memory pages to be properly mapped when SEV is active, we need
to use the PAGE_KERNEL protection attribute as the base protection. This will
ensure that memory mapping of, e.g., ACPI tables, receives the proper mapping
attributes.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/ioremap.c  | 79 ++++++++++++++++++++++++++++++++++++++++++--------
 include/linux/ioport.h |  3 ++
 kernel/resource.c      | 19 ++++++++++++
 3 files changed, 89 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 52cc0f4ed494..6e4573b1da34 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -27,6 +27,11 @@
 
 #include "physaddr.h"
 
+struct ioremap_mem_flags {
+	bool system_ram;
+	bool desc_other;
+};
+
 /*
  * Fix up the linear direct mapping of the kernel to avoid cache attribute
  * conflicts.
@@ -56,17 +61,59 @@ int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 	return err;
 }
 
-static int __ioremap_check_ram(unsigned long start_pfn, unsigned long nr_pages,
-			       void *arg)
+static bool __ioremap_check_ram(struct resource *res)
 {
+	unsigned long start_pfn, stop_pfn;
 	unsigned long i;
 
-	for (i = 0; i < nr_pages; ++i)
-		if (pfn_valid(start_pfn + i) &&
-		    !PageReserved(pfn_to_page(start_pfn + i)))
-			return 1;
+	if ((res->flags & IORESOURCE_SYSTEM_RAM) != IORESOURCE_SYSTEM_RAM)
+		return false;
 
-	return 0;
+	start_pfn = (res->start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	stop_pfn = (res->end + 1) >> PAGE_SHIFT;
+	if (stop_pfn > start_pfn) {
+		for (i = 0; i < (stop_pfn - start_pfn); ++i)
+			if (pfn_valid(start_pfn + i) &&
+			    !PageReserved(pfn_to_page(start_pfn + i)))
+				return true;
+	}
+
+	return false;
+}
+
+static int __ioremap_check_desc_other(struct resource *res)
+{
+	return (res->desc != IORES_DESC_NONE);
+}
+
+static int __ioremap_res_check(struct resource *res, void *arg)
+{
+	struct ioremap_mem_flags *flags = arg;
+
+	if (!flags->system_ram)
+		flags->system_ram = __ioremap_check_ram(res);
+
+	if (!flags->desc_other)
+		flags->desc_other = __ioremap_check_desc_other(res);
+
+	return flags->system_ram && flags->desc_other;
+}
+
+/*
+ * To avoid multiple resource walks, this function walks resources marked as
+ * IORESOURCE_MEM and IORESOURCE_BUSY and looking for system RAM and/or a
+ * resource described not as IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
+				struct ioremap_mem_flags *flags)
+{
+	u64 start, end;
+
+	start = (u64)addr;
+	end = start + size - 1;
+	memset(flags, 0, sizeof(*flags));
+
+	walk_mem_res(start, end, flags, __ioremap_res_check);
 }
 
 /*
@@ -87,9 +134,10 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		unsigned long size, enum page_cache_mode pcm, void *caller)
 {
 	unsigned long offset, vaddr;
-	resource_size_t pfn, last_pfn, last_addr;
+	resource_size_t last_addr;
 	const resource_size_t unaligned_phys_addr = phys_addr;
 	const unsigned long unaligned_size = size;
+	struct ioremap_mem_flags mem_flags;
 	struct vm_struct *area;
 	enum page_cache_mode new_pcm;
 	pgprot_t prot;
@@ -108,13 +156,12 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		return NULL;
 	}
 
+	__ioremap_check_mem(phys_addr, size, &mem_flags);
+
 	/*
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
-	pfn      = phys_addr >> PAGE_SHIFT;
-	last_pfn = last_addr >> PAGE_SHIFT;
-	if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL,
-					  __ioremap_check_ram) == 1) {
+	if (mem_flags.system_ram) {
 		WARN_ONCE(1, "ioremap on RAM at %pa - %pa\n",
 			  &phys_addr, &last_addr);
 		return NULL;
@@ -146,7 +193,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		pcm = new_pcm;
 	}
 
+	/*
+	 * If the page being mapped is in memory and SEV is active then
+	 * make sure the memory encryption attribute is enabled in the
+	 * resulting mapping.
+	 */
 	prot = PAGE_KERNEL_IO;
+	if (sev_active() && mem_flags.desc_other)
+		prot = pgprot_encrypted(prot);
+
 	switch (pcm) {
 	case _PAGE_CACHE_MODE_UC:
 	default:
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 617d8a2aac67..c04d584ab5a1 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -270,6 +270,9 @@ extern int
 walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *));
 extern int
+walk_mem_res(u64 start, u64 end, void *arg,
+	     int (*func)(struct resource *, void *));
+extern int
 walk_system_ram_res(u64 start, u64 end, void *arg,
 		    int (*func)(struct resource *, void *));
 extern int
diff --git a/kernel/resource.c b/kernel/resource.c
index 8430042fa77b..54ba6de3757c 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -397,6 +397,8 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 		res->start = p->start;
 	if (res->end > p->end)
 		res->end = p->end;
+	res->flags = p->flags;
+	res->desc = p->desc;
 	return 0;
 }
 
@@ -467,6 +469,23 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 				     arg, func);
 }
 
+/*
+ * This function calls the @func callback against all memory ranges, which
+ * are ranges marked as IORESOURCE_MEM and IORESOUCE_BUSY.
+ */
+int walk_mem_res(u64 start, u64 end, void *arg,
+		 int (*func)(struct resource *, void *))
+{
+	struct resource res;
+
+	res.start = start;
+	res.end = end;
+	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+
+	return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+				     arg, func);
+}
+
 #if !defined(CONFIG_ARCH_HAS_WALK_MEMORY)
 
 /*
-- 
2.9.5


* [Part1 PATCH v6 11/17] x86/mm: Add DMA support for SEV memory encryption
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Konrad Rzeszutek Wilk, linux-kernel,
	Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

DMA access to encrypted memory cannot be performed when SEV is active.
In order for DMA to properly work when SEV is active, the SWIOTLB bounce
buffers must be used.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/mem_encrypt.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/swiotlb.c             |  5 +--
 2 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index af7f733f6f31..7dadc88c0dc6 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -192,6 +192,70 @@ void __init sme_early_init(void)
 	/* Update the protection map with memory encryption mask */
 	for (i = 0; i < ARRAY_SIZE(protection_map); i++)
 		protection_map[i] = pgprot_encrypted(protection_map[i]);
+
+	if (sev_active())
+		swiotlb_force = SWIOTLB_FORCE;
+}
+
+static void *sev_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		       gfp_t gfp, unsigned long attrs)
+{
+	unsigned long dma_mask;
+	unsigned int order;
+	struct page *page;
+	void *vaddr = NULL;
+
+	dma_mask = dma_alloc_coherent_mask(dev, gfp);
+	order = get_order(size);
+
+	/*
+	 * Memory will be memset to zero after marking decrypted, so don't
+	 * bother clearing it before.
+	 */
+	gfp &= ~__GFP_ZERO;
+
+	page = alloc_pages_node(dev_to_node(dev), gfp, order);
+	if (page) {
+		dma_addr_t addr;
+
+		/*
+		 * Since we will be clearing the encryption bit, check the
+		 * mask with it already cleared.
+		 */
+		addr = __sme_clr(phys_to_dma(dev, page_to_phys(page)));
+		if ((addr + size) > dma_mask) {
+			__free_pages(page, get_order(size));
+		} else {
+			vaddr = page_address(page);
+			*dma_handle = addr;
+		}
+	}
+
+	if (!vaddr)
+		vaddr = swiotlb_alloc_coherent(dev, size, dma_handle, gfp);
+
+	if (!vaddr)
+		return NULL;
+
+	/* Clear the SME encryption bit for DMA use if not swiotlb area */
+	if (!is_swiotlb_buffer(dma_to_phys(dev, *dma_handle))) {
+		set_memory_decrypted((unsigned long)vaddr, 1 << order);
+		memset(vaddr, 0, PAGE_SIZE << order);
+		*dma_handle = __sme_clr(*dma_handle);
+	}
+
+	return vaddr;
+}
+
+static void sev_free(struct device *dev, size_t size, void *vaddr,
+		     dma_addr_t dma_handle, unsigned long attrs)
+{
+	/* Set the SME encryption bit for re-use if not swiotlb area */
+	if (!is_swiotlb_buffer(dma_to_phys(dev, dma_handle)))
+		set_memory_encrypted((unsigned long)vaddr,
+				     1 << get_order(size));
+
+	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
 }
 
 /*
@@ -218,6 +282,20 @@ bool sev_active(void)
 }
 EXPORT_SYMBOL_GPL(sev_active);
 
+static const struct dma_map_ops sev_dma_ops = {
+	.alloc                  = sev_alloc,
+	.free                   = sev_free,
+	.map_page               = swiotlb_map_page,
+	.unmap_page             = swiotlb_unmap_page,
+	.map_sg                 = swiotlb_map_sg_attrs,
+	.unmap_sg               = swiotlb_unmap_sg_attrs,
+	.sync_single_for_cpu    = swiotlb_sync_single_for_cpu,
+	.sync_single_for_device = swiotlb_sync_single_for_device,
+	.sync_sg_for_cpu        = swiotlb_sync_sg_for_cpu,
+	.sync_sg_for_device     = swiotlb_sync_sg_for_device,
+	.mapping_error          = swiotlb_dma_mapping_error,
+};
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void)
 {
@@ -227,6 +305,14 @@ void __init mem_encrypt_init(void)
 	/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
 	swiotlb_update_mem_attributes();
 
+	/*
+	 * With SEV, DMA operations cannot use encryption. New DMA ops
+	 * are required in order to mark the DMA areas as decrypted or
+	 * to use bounce buffers.
+	 */
+	if (sev_active())
+		dma_ops = &sev_dma_ops;
+
 	pr_info("AMD Secure Memory Encryption (SME) active\n");
 }
 
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 8c6c83ef57a4..cea19aaf303c 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -507,8 +507,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 	if (no_iotlb_memory)
 		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
 
-	if (sme_active())
-		pr_warn_once("SME is active and system is using DMA bounce buffers\n");
+	if (mem_encrypt_active())
+		pr_warn_once("%s is active and system is using DMA bounce buffers\n",
+			     sme_active() ? "SME" : "SEV");
 
 	mask = dma_get_seg_boundary(hwdev);
 
-- 
2.9.5


* [Part1 PATCH v6 12/17] x86/boot: Add early boot support when running with SEV active
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Konrad Rzeszutek Wilk, Kirill A. Shutemov,
	Laura Abbott, Andy Lutomirski, Kees Cook, Paolo Bonzini,
	Radim Krčmář,
	linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Early in the boot process, add checks to determine if the kernel is
running with Secure Encrypted Virtualization (SEV) active.

Checking for SEV requires checking that the kernel is running under a
hypervisor (CPUID 0x00000001, bit 31), that the SEV feature is available
(CPUID 0x8000001f, bit 1) and then checking a non-interceptable SEV MSR
(0xc0010131, bit 0).

This check is required so that, during early compressed kernel booting, the
pagetables (both the boot pagetables and the KASLR pagetables, if enabled)
are updated to include the encryption mask, so that when the kernel is
decompressed into encrypted memory it can boot properly.

After the kernel is decompressed and continues booting, the same logic is
used to check if SEV is active and a flag is set to indicate so. This allows
us to distinguish between SME and SEV, each of which has unique differences
in how certain things are handled: e.g. DMA (always bounce buffered with SEV)
or EFI tables (always accessed decrypted with SME).

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/boot/compressed/Makefile      |   1 +
 arch/x86/boot/compressed/head_64.S     |  16 +++++
 arch/x86/boot/compressed/mem_encrypt.S | 120 +++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h        |   2 +
 arch/x86/boot/compressed/pagetable.c   |   8 ++-
 arch/x86/include/asm/msr-index.h       |   3 +
 arch/x86/include/uapi/asm/kvm_para.h   |   1 -
 arch/x86/mm/mem_encrypt.c              |  50 +++++++++++---
 8 files changed, 186 insertions(+), 15 deletions(-)
 create mode 100644 arch/x86/boot/compressed/mem_encrypt.S

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 8a958274b54c..7fc5b7168e4f 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -77,6 +77,7 @@ vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
 	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
+	vmlinux-objs-y += $(obj)/mem_encrypt.o
 endif
 
 $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index b4a5d284391c..3dfad60720d0 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -130,6 +130,19 @@ ENTRY(startup_32)
  /*
   * Build early 4G boot pagetable
   */
+	/*
+	 * If SEV is active then set the encryption mask in the page tables.
+	 * This will insure that when the kernel is copied and decompressed
+	 * it will be done so encrypted.
+	 */
+	call	get_sev_encryption_bit
+	xorl	%edx, %edx
+	testl	%eax, %eax
+	jz	1f
+	subl	$32, %eax	/* Encryption bit is always above bit 31 */
+	bts	%eax, %edx	/* Set encryption mask for page tables */
+1:
+
 	/* Initialize Page tables to 0 */
 	leal	pgtable(%ebx), %edi
 	xorl	%eax, %eax
@@ -140,12 +153,14 @@ ENTRY(startup_32)
 	leal	pgtable + 0(%ebx), %edi
 	leal	0x1007 (%edi), %eax
 	movl	%eax, 0(%edi)
+	addl	%edx, 4(%edi)
 
 	/* Build Level 3 */
 	leal	pgtable + 0x1000(%ebx), %edi
 	leal	0x1007(%edi), %eax
 	movl	$4, %ecx
 1:	movl	%eax, 0x00(%edi)
+	addl	%edx, 0x04(%edi)
 	addl	$0x00001000, %eax
 	addl	$8, %edi
 	decl	%ecx
@@ -156,6 +171,7 @@ ENTRY(startup_32)
 	movl	$0x00000183, %eax
 	movl	$2048, %ecx
 1:	movl	%eax, 0(%edi)
+	addl	%edx, 4(%edi)
 	addl	$0x00200000, %eax
 	addl	$8, %edi
 	decl	%ecx
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
new file mode 100644
index 000000000000..54f5f6625a73
--- /dev/null
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -0,0 +1,120 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+
+#include <asm/processor-flags.h>
+#include <asm/msr.h>
+#include <asm/asm-offsets.h>
+
+	.text
+	.code32
+ENTRY(get_sev_encryption_bit)
+	xor	%eax, %eax
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	push	%ebx
+	push	%ecx
+	push	%edx
+	push	%edi
+
+	/*
+	 * RIP-relative addressing is needed to access the encryption bit
+	 * variable. Since we are running in 32-bit mode we need this call/pop
+	 * sequence to get the proper relative addressing.
+	 */
+	call	1f
+1:	popl	%edi
+	subl	$1b, %edi
+
+	movl	enc_bit(%edi), %eax
+	cmpl	$0, %eax
+	jge	.Lsev_exit
+
+	/* Check if running under a hypervisor */
+	movl	$1, %eax
+	cpuid
+	bt	$31, %ecx		/* Check the hypervisor bit */
+	jnc	.Lno_sev
+
+	movl	$0x80000000, %eax	/* CPUID to check the highest leaf */
+	cpuid
+	cmpl	$0x8000001f, %eax	/* See if 0x8000001f is available */
+	jb	.Lno_sev
+
+	/*
+	 * Check for the SEV feature:
+	 *   CPUID Fn8000_001F[EAX] - Bit 1
+	 *   CPUID Fn8000_001F[EBX] - Bits 5:0
+	 *     Pagetable bit position used to indicate encryption
+	 */
+	movl	$0x8000001f, %eax
+	cpuid
+	bt	$1, %eax		/* Check if SEV is available */
+	jnc	.Lno_sev
+
+	movl	$MSR_AMD64_SEV, %ecx	/* Read the SEV MSR */
+	rdmsr
+	bt	$MSR_AMD64_SEV_ENABLED_BIT, %eax	/* Check if SEV is active */
+	jnc	.Lno_sev
+
+	movl	%ebx, %eax
+	andl	$0x3f, %eax		/* Return the encryption bit location */
+	movl	%eax, enc_bit(%edi)
+	jmp	.Lsev_exit
+
+.Lno_sev:
+	xor	%eax, %eax
+	movl	%eax, enc_bit(%edi)
+
+.Lsev_exit:
+	pop	%edi
+	pop	%edx
+	pop	%ecx
+	pop	%ebx
+
+#endif	/* CONFIG_AMD_MEM_ENCRYPT */
+
+	ret
+ENDPROC(get_sev_encryption_bit)
+
+	.code64
+ENTRY(get_sev_encryption_mask)
+	xor	%rax, %rax
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	push	%rbp
+	push	%rdx
+
+	movq	%rsp, %rbp		/* Save current stack pointer */
+
+	call	get_sev_encryption_bit	/* Get the encryption bit position */
+	testl	%eax, %eax
+	jz	.Lno_sev_mask
+
+	xor	%rdx, %rdx
+	bts	%rax, %rdx		/* Create the encryption mask */
+	mov	%rdx, %rax		/* ... and return it */
+
+.Lno_sev_mask:
+	movq	%rbp, %rsp		/* Restore original stack pointer */
+
+	pop	%rdx
+	pop	%rbp
+#endif
+
+	ret
+ENDPROC(get_sev_encryption_mask)
+
+	.data
+enc_bit:
+	.int	0xffffffff
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 766a5211f827..38c5f0e13f7d 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -108,4 +108,6 @@ static inline void console_init(void)
 { }
 #endif
 
+unsigned long get_sev_encryption_mask(void);
+
 #endif
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index f1aa43854bed..a577329d84c5 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -76,16 +76,18 @@ static unsigned long top_level_pgt;
  * Mapping information structure passed to kernel_ident_mapping_init().
  * Due to relocation, pointers must be assigned at run time not build time.
  */
-static struct x86_mapping_info mapping_info = {
-	.page_flag       = __PAGE_KERNEL_LARGE_EXEC,
-};
+static struct x86_mapping_info mapping_info;
 
 /* Locates and clears a region for a new top level page table. */
 void initialize_identity_maps(void)
 {
+	unsigned long sev_me_mask = get_sev_encryption_mask();
+
 	/* Init mapping_info with run-time function/buffer pointers. */
 	mapping_info.alloc_pgt_page = alloc_pgt_page;
 	mapping_info.context = &pgt_data;
+	mapping_info.page_flag = __PAGE_KERNEL_LARGE_EXEC | sev_me_mask;
+	mapping_info.kernpg_flag = _KERNPG_TABLE | sev_me_mask;
 
 	/*
 	 * It should be impossible for this not to already be true,
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 17f5c12e1afd..4a4e7d72b565 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -323,6 +323,9 @@
 #define MSR_AMD64_IBSBRTARGET		0xc001103b
 #define MSR_AMD64_IBSOPDATA4		0xc001103d
 #define MSR_AMD64_IBS_REG_COUNT_MAX	8 /* includes MSR_AMD64_IBSBRTARGET */
+#define MSR_AMD64_SEV			0xc0010131
+#define MSR_AMD64_SEV_ENABLED_BIT	0
+#define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 
 /* Fam 17h MSRs */
 #define MSR_F17H_IRPERF			0xc00000e9
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index a965e5b0d328..c202ba3fba0f 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -109,5 +109,4 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
-
 #endif /* _UAPI_ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 7dadc88c0dc6..0c3186755dec 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -313,7 +313,9 @@ void __init mem_encrypt_init(void)
 	if (sev_active())
 		dma_ops = &sev_dma_ops;
 
-	pr_info("AMD Secure Memory Encryption (SME) active\n");
+	pr_info("AMD %s active\n",
+		sev_active() ? "Secure Encrypted Virtualization (SEV)"
+			     : "Secure Memory Encryption (SME)");
 }
 
 void swiotlb_set_mem_attributes(void *vaddr, unsigned long size)
@@ -641,37 +643,63 @@ void __init __nostackprotector sme_enable(struct boot_params *bp)
 {
 	const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
 	unsigned int eax, ebx, ecx, edx;
+	unsigned long feature_mask;
 	bool active_by_default;
 	unsigned long me_mask;
 	char buffer[16];
 	u64 msr;
 
-	/* Check for the SME support leaf */
+	/* Check for the SME/SEV support leaf */
 	eax = 0x80000000;
 	ecx = 0;
 	native_cpuid(&eax, &ebx, &ecx, &edx);
 	if (eax < 0x8000001f)
 		return;
 
+#define AMD_SME_BIT	BIT(0)
+#define AMD_SEV_BIT	BIT(1)
 	/*
-	 * Check for the SME feature:
-	 *   CPUID Fn8000_001F[EAX] - Bit 0
-	 *     Secure Memory Encryption support
-	 *   CPUID Fn8000_001F[EBX] - Bits 5:0
-	 *     Pagetable bit position used to indicate encryption
+	 * Set the feature mask (SME or SEV) based on whether we are
+	 * running under a hypervisor.
+	 */
+	eax = 1;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	feature_mask = (ecx & BIT(31)) ? AMD_SEV_BIT : AMD_SME_BIT;
+
+	/*
+	 * Check for the SME/SEV feature:
+	 *   CPUID Fn8000_001F[EAX]
+	 *   - Bit 0 - Secure Memory Encryption support
+	 *   - Bit 1 - Secure Encrypted Virtualization support
+	 *   CPUID Fn8000_001F[EBX]
+	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
 	 */
 	eax = 0x8000001f;
 	ecx = 0;
 	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (!(eax & 1))
+	if (!(eax & feature_mask))
 		return;
 
 	me_mask = 1UL << (ebx & 0x3f);
 
-	/* Check if SME is enabled */
-	msr = __rdmsr(MSR_K8_SYSCFG);
-	if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
+	/* Check if memory encryption is enabled */
+	if (feature_mask == AMD_SME_BIT) {
+		/* For SME, check the SYSCFG MSR */
+		msr = __rdmsr(MSR_K8_SYSCFG);
+		if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
+			return;
+	} else {
+		/* For SEV, check the SEV MSR */
+		msr = __rdmsr(MSR_AMD64_SEV);
+		if (!(msr & MSR_AMD64_SEV_ENABLED))
+			return;
+
+		/* SEV state cannot be controlled by a command line option */
+		sme_me_mask = me_mask;
+		sev_enabled = true;
 		return;
+	}
 
 	/*
 	 * Fixups have not been applied to phys_base yet and we're running
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6 13/17] x86/io: Unroll string I/O when SEV is active
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (10 preceding siblings ...)
  2017-10-16 15:34 ` [Part1 PATCH v6 12/17] x86/boot: Add early boot support when running with SEV active Brijesh Singh
@ 2017-10-16 15:34 ` Brijesh Singh
  2017-10-16 15:34 ` [Part1 PATCH v6 14/17] x86: Add support for changing memory encryption attribute in early boot Brijesh Singh
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Shevchenko, David Laight, Arnd Bergmann,
	linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Secure Encrypted Virtualization (SEV) does not support string I/O, so
unroll the string I/O operation into a loop operating on one element at
a time.
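
For illustration, the unrolled form of the byte variant amounts to roughly
the following (outsb_unrolled() is a made-up name; the actual change below
open-codes this loop inside the BUILDIO() macro for all three access sizes):

        static inline void outsb_unrolled(int port, const void *addr,
                                          unsigned long count)
        {
                const unsigned char *value = addr;

                /* One OUT instruction per element instead of a single REP OUTSB */
                while (count--)
                        outb(*value++, port);
        }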

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/io.h | 42 ++++++++++++++++++++++++++++++++++++++----
 arch/x86/mm/mem_encrypt.c |  8 ++++++++
 2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index c40a95c33bb8..a785270b53e1 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -265,6 +265,20 @@ static inline void slow_down_io(void)
 
 #endif
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+
+extern struct static_key_false __sev;
+static inline bool sev_key_active(void)
+{
+	return static_branch_unlikely(&__sev);
+}
+
+#else /* !CONFIG_AMD_MEM_ENCRYPT */
+
+static inline bool sev_key_active(void) { return false; }
+
+#endif /* CONFIG_AMD_MEM_ENCRYPT */
+
 #define BUILDIO(bwl, bw, type)						\
 static inline void out##bwl(unsigned type value, int port)		\
 {									\
@@ -295,14 +309,34 @@ static inline unsigned type in##bwl##_p(int port)			\
 									\
 static inline void outs##bwl(int port, const void *addr, unsigned long count) \
 {									\
-	asm volatile("rep; outs" #bwl					\
-		     : "+S"(addr), "+c"(count) : "d"(port) : "memory");	\
+	if (sev_key_active()) {						\
+		unsigned type *value = (unsigned type *)addr;		\
+		while (count) {						\
+			out##bwl(*value, port);				\
+			value++;					\
+			count--;					\
+		}							\
+	} else {							\
+		asm volatile("rep; outs" #bwl				\
+			     : "+S"(addr), "+c"(count)			\
+			     : "d"(port) : "memory");			\
+	}								\
 }									\
 									\
 static inline void ins##bwl(int port, void *addr, unsigned long count)	\
 {									\
-	asm volatile("rep; ins" #bwl					\
-		     : "+D"(addr), "+c"(count) : "d"(port) : "memory");	\
+	if (sev_key_active()) {						\
+		unsigned type *value = (unsigned type *)addr;		\
+		while (count) {						\
+			*value = in##bwl(port);				\
+			value++;					\
+			count--;					\
+		}							\
+	} else {							\
+		asm volatile("rep; ins" #bwl				\
+			     : "+D"(addr), "+c"(count)			\
+			     : "d"(port) : "memory");			\
+	}								\
 }
 
 BUILDIO(b, b, char)
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 0c3186755dec..2336c3922227 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -41,6 +41,8 @@ static char sme_cmdline_off[] __initdata = "off";
  */
 u64 sme_me_mask __section(.data) = 0;
 EXPORT_SYMBOL_GPL(sme_me_mask);
+DEFINE_STATIC_KEY_FALSE(__sev);
+EXPORT_SYMBOL_GPL(__sev);
 
 static bool sev_enabled __section(.data) = false;
 
@@ -313,6 +315,12 @@ void __init mem_encrypt_init(void)
 	if (sev_active())
 		dma_ops = &sev_dma_ops;
 
+	/*
+	 * With SEV, we need to unroll the rep string I/O instructions.
+	 */
+	if (sev_active())
+		static_branch_enable(&__sev);
+
 	pr_info("AMD %s active\n",
 		sev_active() ? "Secure Encrypted Virtualization (SEV)"
 			     : "Secure Memory Encryption (SME)");
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6 14/17] x86: Add support for changing memory encryption attribute in early boot
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (11 preceding siblings ...)
  2017-10-16 15:34 ` [Part1 PATCH v6 13/17] x86/io: Unroll string I/O when SEV is active Brijesh Singh
@ 2017-10-16 15:34 ` Brijesh Singh
  2017-10-16 18:20   ` Borislav Petkov
  2017-10-16 19:56   ` [Part1 PATCH v6.1 " Brijesh Singh
  2017-10-16 15:34 ` [Part1 PATCH v6 15/17] percpu: Introduce DEFINE_PER_CPU_DECRYPTED Brijesh Singh
                   ` (4 subsequent siblings)
  17 siblings, 2 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, linux-kernel, Tom Lendacky

Some KVM-specific custom MSRs share the guest physical address with the
hypervisor in early boot. When SEV is active, the shared physical address
must be mapped with memory encryption attribute cleared so that both
hypervisor and guest can access the data.

Add APIs to change the memory encryption attribute in early boot code.
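
A minimal usage sketch, assuming 'pa' is the page-aligned guest physical
address of a buffer that must be shared with the hypervisor:

        /* Clear the C-bit so that the hypervisor can access the page ... */
        if (early_set_memory_decrypted(pa, PAGE_SIZE))
                pr_warn("could not map shared page decrypted\n");

        /* ... and restore it once the page is no longer shared. */
        if (early_set_memory_encrypted(pa, PAGE_SIZE))
                pr_warn("could not map page encrypted again\n");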

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---

Changes since v5:

early_set_memory_enc_dec() is enhanced to perform the in-place encrypt/decrypt
and the C bit change in one go. The change shields the caller from having to
check the C bit status before changing it and also shields the OS from blindly
converting a page.

Boris,

I removed your R-b since I was not sure if you are okay with the above changes.
Please let me know if you are okay with the changes. Thanks.

 arch/x86/include/asm/mem_encrypt.h |   8 +++
 arch/x86/mm/mem_encrypt.c          | 131 +++++++++++++++++++++++++++++++++++++
 2 files changed, 139 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 2b024741bce9..3ba68c92be1b 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -42,6 +42,9 @@ void __init sme_early_init(void);
 void __init sme_encrypt_kernel(void);
 void __init sme_enable(struct boot_params *bp);
 
+int __init early_set_memory_decrypted(resource_size_t paddr, unsigned long size);
+int __init early_set_memory_encrypted(resource_size_t paddr, unsigned long size);
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void);
 
@@ -70,6 +73,11 @@ static inline void __init sme_enable(struct boot_params *bp) { }
 static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 
+static inline int __init
+early_set_memory_decrypted(resource_size_t paddr, unsigned long size) { return 0; }
+static inline int __init
+early_set_memory_encrypted(resource_size_t paddr, unsigned long size) { return 0; }
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 /*
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 2336c3922227..b671e91e6a1f 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -30,6 +30,8 @@
 #include <asm/msr.h>
 #include <asm/cmdline.h>
 
+#include "mm_internal.h"
+
 static char sme_cmdline_arg[] __initdata = "mem_encrypt";
 static char sme_cmdline_on[]  __initdata = "on";
 static char sme_cmdline_off[] __initdata = "off";
@@ -260,6 +262,135 @@ static void sev_free(struct device *dev, size_t size, void *vaddr,
 	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
 }
 
+static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
+{
+	pgprot_t old_prot, new_prot;
+	unsigned long pfn, pa, size;
+	pte_t new_pte;
+
+	switch (level) {
+	case PG_LEVEL_4K:
+		pfn = pte_pfn(*kpte);
+		old_prot = pte_pgprot(*kpte);
+		break;
+	case PG_LEVEL_2M:
+		pfn = pmd_pfn(*(pmd_t *)kpte);
+		old_prot = pmd_pgprot(*(pmd_t *)kpte);
+		break;
+	case PG_LEVEL_1G:
+		pfn = pud_pfn(*(pud_t *)kpte);
+		old_prot = pud_pgprot(*(pud_t *)kpte);
+		break;
+	default:
+		return;
+	}
+
+	new_prot = old_prot;
+	if (enc)
+		pgprot_val(new_prot) |= _PAGE_ENC;
+	else
+		pgprot_val(new_prot) &= ~_PAGE_ENC;
+
+	/* if prot is same then do nothing */
+	if (pgprot_val(old_prot) == pgprot_val(new_prot))
+		return;
+
+	pa = pfn << page_level_shift(level);
+	size = page_level_size(level);
+
+	/*
+	 * We are going to perform in-place encrypt/decrypt and change the
+	 * physical page attribute from C=1 to C=0 or vice versa. Flush the
+	 * caches to ensure that data gets accessed with correct C-bit.
+	 */
+	clflush_cache_range(__va(pa), size);
+
+	/* encrypt/decrypt the contents in-place */
+	if (enc)
+		sme_early_encrypt(pa, size);
+	else
+		sme_early_decrypt(pa, size);
+
+	/* change the page encryption mask */
+	new_pte = pfn_pte(pfn, new_prot);
+	set_pte_atomic(kpte, new_pte);
+}
+
+static int __init early_set_memory_enc_dec(resource_size_t paddr,
+					   unsigned long size, bool enc)
+{
+	unsigned long vaddr, vaddr_end, vaddr_next;
+	unsigned long psize, pmask;
+	int split_page_size_mask;
+	pte_t *kpte;
+	int level, ret;
+
+	vaddr = (unsigned long)__va(paddr);
+	vaddr_next = vaddr;
+	vaddr_end = vaddr + size;
+
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		kpte = lookup_address(vaddr, &level);
+		if (!kpte || pte_none(*kpte)) {
+			ret = 1;
+			goto out;
+		}
+
+		if (level == PG_LEVEL_4K) {
+			__set_clr_pte_enc(kpte, level, enc);
+			vaddr_next = (vaddr & PAGE_MASK) + PAGE_SIZE;
+			continue;
+		}
+
+		psize = page_level_size(level);
+		pmask = page_level_mask(level);
+
+		/*
+		 * Check, whether we can change the large page in one go.
+		 * We request a split, when the address is not aligned and
+		 * the number of pages to set/clear encryption bit is smaller
+		 * than the number of pages in the large page.
+		 */
+		if (vaddr == (vaddr & pmask) &&
+			((vaddr_end - vaddr) >= psize)) {
+			__set_clr_pte_enc(kpte, level, enc);
+			vaddr_next = (vaddr & pmask) + psize;
+			continue;
+		}
+
+		/*
+		 * virtual address is part of large page, create the page table
+		 * mapping to use smaller pages (4K or 2M). If virtual address
+		 * is part of 2M page the we request spliting the large page
+		 * into 4K, similarly 1GB large page is requested to split into
+		 * 2M pages.
+		 */
+		if (level == PG_LEVEL_2M)
+			split_page_size_mask = 0;
+		else
+			split_page_size_mask = 1 << PG_LEVEL_2M;
+
+		kernel_physical_mapping_init(__pa(vaddr & pmask),
+					     __pa((vaddr_end & pmask) + psize),
+					     split_page_size_mask);
+	}
+
+	ret = 0;
+out:
+	__flush_tlb_all();
+	return ret;
+}
+
+int __init early_set_memory_decrypted(resource_size_t paddr, unsigned long size)
+{
+	return early_set_memory_enc_dec(paddr, size, false);
+}
+
+int __init early_set_memory_encrypted(resource_size_t paddr, unsigned long size)
+{
+	return early_set_memory_enc_dec(paddr, size, true);
+}
+
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6 15/17] percpu: Introduce DEFINE_PER_CPU_DECRYPTED
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (12 preceding siblings ...)
  2017-10-16 15:34 ` [Part1 PATCH v6 14/17] x86: Add support for changing memory encryption attribute in early boot Brijesh Singh
@ 2017-10-16 15:34 ` Brijesh Singh
  2017-10-16 15:34 ` [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active Brijesh Singh
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Arnd Bergmann, Tejun Heo, Christoph Lameter,
	linux-arch, linux-kernel, Tom Lendacky

A KVM guest defines three per-CPU variables (steal_time, apf_reason, and
kvm_apic_eoi) which are shared between the guest and the hypervisor. When SEV
is active, memory is encrypted with a guest-specific key, and if the
guest OS wants to share a memory region with the hypervisor then it
must clear the C-bit (i.e. map it decrypted) before sharing it.

DEFINE_PER_CPU_DECRYPTED can be used to define the per-CPU variables
which will be shared between a guest and a hypervisor.
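
Usage then looks like the sketch below (struct my_shared_data is a made-up
example; the next patch converts the real KVM per-CPU variables, and the
backing pages must still be mapped decrypted at runtime, e.g. via
early_set_memory_decrypted()):

        /* Per-CPU data that the hypervisor is allowed to read and write. */
        static DEFINE_PER_CPU_DECRYPTED(struct my_shared_data, shared_data) __aligned(64);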

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 include/asm-generic/vmlinux.lds.h | 19 +++++++++++++++++++
 include/linux/percpu-defs.h       | 15 +++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 353f52fdc35e..bdcd1caae092 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -779,6 +779,24 @@
 #endif
 
 /*
+ * Memory encryption operates on a page basis. Since we need to clear
+ * the memory encryption mask for this section, it needs to be aligned
+ * on a page boundary and be a page-size multiple in length.
+ *
+ * Note: We use a separate section so that only this section gets
+ * decrypted to avoid exposing more than we wish.
+ */
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+#define PERCPU_DECRYPTED_SECTION					\
+	. = ALIGN(PAGE_SIZE);						\
+	*(.data..percpu..decrypted)					\
+	. = ALIGN(PAGE_SIZE);
+#else
+#define PERCPU_DECRYPTED_SECTION
+#endif
+
+
+/*
  * Default discarded sections.
  *
  * Some archs want to discard exit text/data at runtime rather than
@@ -816,6 +834,7 @@
 	. = ALIGN(cacheline);						\
 	*(.data..percpu)						\
 	*(.data..percpu..shared_aligned)				\
+	PERCPU_DECRYPTED_SECTION					\
 	VMLINUX_SYMBOL(__per_cpu_end) = .;
 
 /**
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index 8f16299ca068..2d2096ba1cfe 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -173,6 +173,21 @@
 	DEFINE_PER_CPU_SECTION(type, name, "..read_mostly")
 
 /*
+ * Declaration/definition used for per-CPU variables that should be accessed
+ * as decrypted when memory encryption is enabled in the guest.
+ */
+#if defined(CONFIG_VIRTUALIZATION) && defined(CONFIG_AMD_MEM_ENCRYPT)
+
+#define DECLARE_PER_CPU_DECRYPTED(type, name)				\
+	DECLARE_PER_CPU_SECTION(type, name, "..decrypted")
+
+#define DEFINE_PER_CPU_DECRYPTED(type, name)				\
+	DEFINE_PER_CPU_SECTION(type, name, "..decrypted")
+#else
+#define DEFINE_PER_CPU_DECRYPTED(type, name)	DEFINE_PER_CPU(type, name)
+#endif
+
+/*
  * Intermodule exports for per-CPU variables.  sparse forgets about
  * address space across EXPORT_SYMBOL(), change EXPORT_SYMBOL() to
  * noop if __CHECKER__.
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (13 preceding siblings ...)
  2017-10-16 15:34 ` [Part1 PATCH v6 15/17] percpu: Introduce DEFINE_PER_CPU_DECRYPTED Brijesh Singh
@ 2017-10-16 15:34 ` Brijesh Singh
  2017-10-16 22:24   ` Borislav Petkov
  2017-10-19 13:44   ` [Part1 PATCH v6.1 " Brijesh Singh
  2017-10-16 15:34 ` [Part1 PATCH v6 17/17] X86/KVM: Clear encryption attribute " Brijesh Singh
                   ` (2 subsequent siblings)
  17 siblings, 2 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm

When SEV is active, guest memory is encrypted with a guest-specific key, so a
guest memory region shared with the hypervisor must be mapped as decrypted
before it can be shared.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---

Changes since v5:
early_set_memory_decrypted() takes care of decrypting the memory contents
and changing the C bit, hence there is no need for an explicit call
(sme_early_decrypt()) to decrypt the memory contents.

Boris,

I removed your R-b since I was not sure if you are okay with the above changes.
Please let me know if you are okay with the changes. Thanks.

 arch/x86/kernel/kvm.c | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594d0761..ff0f04077925 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
 
 early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
-static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
-static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
+static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
+static DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
 
 /*
@@ -312,7 +312,7 @@ static void kvm_register_steal_time(void)
 		cpu, (unsigned long long) slow_virt_to_phys(st));
 }
 
-static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
+static DEFINE_PER_CPU_DECRYPTED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
 
 static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 {
@@ -426,9 +426,37 @@ void kvm_disable_steal_time(void)
 	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
+{
+	early_set_memory_decrypted(slow_virt_to_phys(ptr), size);
+}
+
+/*
+ * Iterate through all possible CPUs and map the memory region pointed
+ * by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
+ *
+ * Note: we iterate through all possible CPUs to ensure that CPUs
+ * hotplugged will have their per-cpu variable already mapped as
+ * decrypted.
+ */
+static void __init sev_map_percpu_data(void)
+{
+	int cpu;
+
+	if (!sev_active())
+		return;
+
+	for_each_possible_cpu(cpu) {
+		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
+		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
+		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
+	}
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
+	sev_map_percpu_data();
 	kvm_guest_cpu_init();
 	native_smp_prepare_boot_cpu();
 	kvm_spinlock_init();
@@ -496,6 +524,7 @@ void __init kvm_guest_init(void)
 				      kvm_cpu_online, kvm_cpu_down_prepare) < 0)
 		pr_err("kvm_guest: Failed to install cpu hotplug callbacks\n");
 #else
+	sev_map_percpu_data();
 	kvm_guest_cpu_init();
 #endif
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6 17/17] X86/KVM: Clear encryption attribute when SEV is active
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (14 preceding siblings ...)
  2017-10-16 15:34 ` [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active Brijesh Singh
@ 2017-10-16 15:34 ` Brijesh Singh
  2017-10-19 13:47   ` [Part1 PATCH v6.1 " Brijesh Singh
  2017-10-16 15:42 ` [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
  2017-10-20  9:24 ` Borislav Petkov
  17 siblings, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 15:34 UTC (permalink / raw)
  To: x86
  Cc: bp, Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm

The guest physical memory areas holding struct pvclock_wall_clock and
struct pvclock_vcpu_time_info are shared with the hypervisor, which
periodically updates their contents. When SEV is active, we must clear the
encryption attribute from the shared memory pages so that both the
hypervisor and the guest can access the data.
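
In outline, the approach below is (a simplified sketch with error handling
omitted; the variable names match the kvmclock_init() hunk):

        mem = memblock_alloc(size, PAGE_SIZE);          /* carve out guest physical memory */
        if (sev_active())
                early_set_memory_decrypted(mem, size);  /* make the pages host-accessible  */
        hv_clock = __va(mem);                           /* then use the virtual address    */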

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/entry/vdso/vma.c  |  5 ++--
 arch/x86/kernel/kvmclock.c | 65 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 57 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 1911310959f8..d63053142b16 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -114,10 +114,11 @@ static int vvar_fault(const struct vm_special_mapping *sm,
 		struct pvclock_vsyscall_time_info *pvti =
 			pvclock_pvti_cpu0_va();
 		if (pvti && vclock_was_used(VCLOCK_PVCLOCK)) {
-			ret = vm_insert_pfn(
+			ret = vm_insert_pfn_prot(
 				vma,
 				vmf->address,
-				__pa(pvti) >> PAGE_SHIFT);
+				__pa(pvti) >> PAGE_SHIFT,
+				pgprot_decrypted(vma->vm_page_prot));
 		}
 	} else if (sym_offset == image->sym_hvclock_page) {
 		struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index d88967659098..3de184be0887 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
 
+#include <asm/mem_encrypt.h>
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
 #include <asm/kvmclock.h>
@@ -45,7 +46,7 @@ early_param("no-kvmclock", parse_no_kvmclock);
 
 /* The hypervisor will put information about time periodically here */
 static struct pvclock_vsyscall_time_info *hv_clock;
-static struct pvclock_wall_clock wall_clock;
+static struct pvclock_wall_clock *wall_clock;
 
 struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
 {
@@ -64,15 +65,15 @@ static void kvm_get_wallclock(struct timespec *now)
 	int low, high;
 	int cpu;
 
-	low = (int)__pa_symbol(&wall_clock);
-	high = ((u64)__pa_symbol(&wall_clock) >> 32);
+	low = (int)slow_virt_to_phys(wall_clock);
+	high = ((u64)slow_virt_to_phys(wall_clock) >> 32);
 
 	native_write_msr(msr_kvm_wall_clock, low, high);
 
 	cpu = get_cpu();
 
 	vcpu_time = &hv_clock[cpu].pvti;
-	pvclock_read_wallclock(&wall_clock, vcpu_time, now);
+	pvclock_read_wallclock(wall_clock, vcpu_time, now);
 
 	put_cpu();
 }
@@ -249,11 +250,39 @@ static void kvm_shutdown(void)
 	native_machine_shutdown();
 }
 
+static phys_addr_t __init kvm_memblock_alloc(phys_addr_t size,
+					     phys_addr_t align)
+{
+	phys_addr_t mem;
+
+	mem = memblock_alloc(size, align);
+	if (!mem)
+		return 0;
+
+	if (sev_active()) {
+		if (early_set_memory_decrypted(mem, size))
+			goto e_free;
+	}
+
+	return mem;
+e_free:
+	memblock_free(mem, size);
+	return 0;
+}
+
+static void __init kvm_memblock_free(phys_addr_t addr, phys_addr_t size)
+{
+	if (sev_active())
+		early_set_memory_encrypted(addr, size);
+
+	memblock_free(addr, size);
+}
+
 void __init kvmclock_init(void)
 {
 	struct pvclock_vcpu_time_info *vcpu_time;
-	unsigned long mem;
-	int size, cpu;
+	unsigned long mem, mem_wall_clock;
+	int size, cpu, wall_clock_size;
 	u8 flags;
 
 	size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
@@ -267,21 +296,35 @@ void __init kvmclock_init(void)
 	} else if (!(kvmclock && kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE)))
 		return;
 
-	printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
-		msr_kvm_system_time, msr_kvm_wall_clock);
+	wall_clock_size = PAGE_ALIGN(sizeof(struct pvclock_wall_clock));
+	mem_wall_clock = kvm_memblock_alloc(wall_clock_size, PAGE_SIZE);
+	if (!mem_wall_clock)
+		return;
 
-	mem = memblock_alloc(size, PAGE_SIZE);
-	if (!mem)
+	wall_clock = __va(mem_wall_clock);
+	memset(wall_clock, 0, wall_clock_size);
+
+	mem = kvm_memblock_alloc(size, PAGE_SIZE);
+	if (!mem) {
+		kvm_memblock_free(mem_wall_clock, wall_clock_size);
+		wall_clock = NULL;
 		return;
+	}
+
 	hv_clock = __va(mem);
 	memset(hv_clock, 0, size);
 
 	if (kvm_register_clock("primary cpu clock")) {
 		hv_clock = NULL;
-		memblock_free(mem, size);
+		kvm_memblock_free(mem, size);
+		kvm_memblock_free(mem_wall_clock, wall_clock_size);
+		wall_clock = NULL;
 		return;
 	}
 
+	printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
+		msr_kvm_system_time, msr_kvm_wall_clock);
+
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
 		pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD)
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (15 preceding siblings ...)
  2017-10-16 15:34 ` [Part1 PATCH v6 17/17] X86/KVM: Clear encryption attribute " Brijesh Singh
@ 2017-10-16 15:42 ` Brijesh Singh
  2017-10-20  9:24 ` Borislav Petkov
  17 siblings, 0 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 15:42 UTC (permalink / raw)
  To: x86
  Cc: brijesh.singh, bp, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, Tom Lendacky, Paolo Bonzini,
	Radim Krčmář,
	kvm, linux-kernel



On 10/16/17 10:34 AM, Brijesh Singh wrote:

> This series is based on tip/master commit : 3594329f88c5 (Merge branch 'linus')
Small correction,

On tip/master, the series applies on commit 3c794350da95 (Merge branch
'x86/urgent').

Complete git tree based on tip/master is available at
https://github.com/codomania/tip/tree/sev-v6-p1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support
  2017-10-16 15:34 ` [Part1 PATCH v6 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support Brijesh Singh
@ 2017-10-16 16:21   ` Borislav Petkov
  2017-10-16 17:46     ` Brijesh Singh
  2017-10-16 19:51   ` [Part1 PATCH v6.1 " Brijesh Singh
  1 sibling, 1 reply; 36+ messages in thread
From: Borislav Petkov @ 2017-10-16 16:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andy Lutomirski, linux-kernel

On Mon, Oct 16, 2017 at 10:34:08AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Provide support for Secure Encrypted Virtualization (SEV). This initial
> support defines a flag that is used by the kernel to determine if it is
> running with SEV active.

...

> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 16c5f37933a2..af7f733f6f31 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -42,6 +42,8 @@ static char sme_cmdline_off[] __initdata = "off";
>  u64 sme_me_mask __section(.data) = 0;
>  EXPORT_SYMBOL_GPL(sme_me_mask);
>  
> +static bool sev_enabled __section(.data) = false;

You need to run a patch through checkpatch every time you change it -
sometimes the warning makes sense, like in this case:

ERROR: do not initialise statics to false
#73: FILE: arch/x86/mm/mem_encrypt.c:45:
+static bool sev_enabled __section(.data) = false;

Please add checkpatch to a script which you run before committing. Or
something to that effect. In any case, you should automate it so that
you don't forget.

Thx.

P.S., just send a fixed version as a reply to this message.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support
  2017-10-16 16:21   ` Borislav Petkov
@ 2017-10-16 17:46     ` Brijesh Singh
  2017-10-16 18:18       ` Borislav Petkov
  0 siblings, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 17:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, Tom Lendacky, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Andy Lutomirski, linux-kernel


On 10/16/17 11:21 AM, Borislav Petkov wrote:

...

>> +static bool sev_enabled __section(.data) = false;
> You need to run a patch through checkpatch everytime you change it -
> sometimes the warning makes sense, like in this case:
>
> ERROR: do not initialise statics to false
> #73: FILE: arch/x86/mm/mem_encrypt.c:45:
> +static bool sev_enabled __section(.data) = false;


sev_enabled lives in the .data section and, looking at the objdump, it seems
to be initialized to zero. So I think it's safe to remove the initialization.
Tom,

Please let me know if you are okay with the changes.

I will do a quick test run and update the patch.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support
  2017-10-16 17:46     ` Brijesh Singh
@ 2017-10-16 18:18       ` Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-10-16 18:18 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andy Lutomirski, linux-kernel

On Mon, Oct 16, 2017 at 12:46:20PM -0500, Brijesh Singh wrote:
> sev_enabled lives in .data section and looking at the objdump it seems
> to initialized to zero. So, I think its safe to remove the initialization.

So I'd assume that static means it gets cleared to 0 automatically, even
if it is not in the .bss section. And Tom put it in the .data section to
protect it from the .bss clearing later.

To quote the C99 standard:

"If an object that has static storage duration is not initialized
explicitly, then:

...

— if it has arithmetic type, it is initialized to (positive or unsigned) zero;"
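
IOW, for the case at hand the explicit initializer can simply be dropped:

        static bool sev_enabled __section(.data);       /* implicitly zero, i.e. false */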

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 14/17] x86: Add support for changing memory encryption attribute in early boot
  2017-10-16 15:34 ` [Part1 PATCH v6 14/17] x86: Add support for changing memory encryption attribute in early boot Brijesh Singh
@ 2017-10-16 18:20   ` Borislav Petkov
  2017-10-16 19:56   ` [Part1 PATCH v6.1 " Brijesh Singh
  1 sibling, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-10-16 18:20 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, linux-kernel, Tom Lendacky

On Mon, Oct 16, 2017 at 10:34:20AM -0500, Brijesh Singh wrote:
> Some KVM-specific custom MSRs share the guest physical address with the
> hypervisor in early boot. When SEV is active, the shared physical address
> must be mapped with memory encryption attribute cleared so that both
> hypervisor and guest can access the data.
> 
> Add APIs to change the memory encryption attribute in early boot code.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
> 
> Changes since v5:
> 
> early_set_memory_enc_dec() is enhanced to perform encrypt/decrypt and change
> the C bit in one go. The changes shields the caller from having to check
> the C bit status before changing it and also shield the OS from converting a
> page blindly.
> 
> Boris,
> 
> I removed your R-b since I was not sure if you are okay with the above changes.
> please let me know if you are okay with the changes. thanks

Looks ok, you can re-add it; here are only minor comment corrections.
Just send v6.1 as a reply to this message.

Thx.

---
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index b671e91e6a1f..53d11b4d74b7 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -291,7 +291,7 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 	else
 		pgprot_val(new_prot) &= ~_PAGE_ENC;
 
-	/* if prot is same then do nothing */
+	/* If prot is same then do nothing. */
 	if (pgprot_val(old_prot) == pgprot_val(new_prot))
 		return;
 
@@ -299,19 +299,19 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 	size = page_level_size(level);
 
 	/*
-	 * We are going to perform in-place encrypt/decrypt and change the
+	 * We are going to perform in-place en-/decryption and change the
 	 * physical page attribute from C=1 to C=0 or vice versa. Flush the
-	 * caches to ensure that data gets accessed with correct C-bit.
+	 * caches to ensure that data gets accessed with the correct C-bit.
 	 */
 	clflush_cache_range(__va(pa), size);
 
-	/* encrypt/decrypt the contents in-place */
+	/* Encrypt/decrypt the contents in-place */
 	if (enc)
 		sme_early_encrypt(pa, size);
 	else
 		sme_early_decrypt(pa, size);
 
-	/* change the page encryption mask */
+	/* Change the page encryption mask. */
 	new_pte = pfn_pte(pfn, new_prot);
 	set_pte_atomic(kpte, new_pte);
 }
@@ -322,8 +322,8 @@ static int __init early_set_memory_enc_dec(resource_size_t paddr,
 	unsigned long vaddr, vaddr_end, vaddr_next;
 	unsigned long psize, pmask;
 	int split_page_size_mask;
-	pte_t *kpte;
 	int level, ret;
+	pte_t *kpte;
 
 	vaddr = (unsigned long)__va(paddr);
 	vaddr_next = vaddr;
@@ -346,24 +346,23 @@ static int __init early_set_memory_enc_dec(resource_size_t paddr,
 		pmask = page_level_mask(level);
 
 		/*
-		 * Check, whether we can change the large page in one go.
-		 * We request a split, when the address is not aligned and
+		 * Check whether we can change the large page in one go.
+		 * We request a split when the address is not aligned and
 		 * the number of pages to set/clear encryption bit is smaller
 		 * than the number of pages in the large page.
 		 */
 		if (vaddr == (vaddr & pmask) &&
-			((vaddr_end - vaddr) >= psize)) {
+		    ((vaddr_end - vaddr) >= psize)) {
 			__set_clr_pte_enc(kpte, level, enc);
 			vaddr_next = (vaddr & pmask) + psize;
 			continue;
 		}
 
 		/*
-		 * virtual address is part of large page, create the page table
-		 * mapping to use smaller pages (4K or 2M). If virtual address
-		 * is part of 2M page the we request spliting the large page
-		 * into 4K, similarly 1GB large page is requested to split into
-		 * 2M pages.
+		 * The virtual address is part of a larger page, create the next
+		 * level page table mapping (4K or 2M). If it is part of a 2M
+		 * page then we request a split of the large page into 4K
+		 * chunks. A 1GB large page is split into 2M pages, resp.
 		 */
 		if (level == PG_LEVEL_2M)
 			split_page_size_mask = 0;
@@ -376,6 +375,7 @@ static int __init early_set_memory_enc_dec(resource_size_t paddr,
 	}
 
 	ret = 0;
+
 out:
 	__flush_tlb_all();
 	return ret;

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6.1 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support
  2017-10-16 15:34 ` [Part1 PATCH v6 02/17] x86/mm: Add Secure Encrypted Virtualization (SEV) support Brijesh Singh
  2017-10-16 16:21   ` Borislav Petkov
@ 2017-10-16 19:51   ` Brijesh Singh
  1 sibling, 0 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 19:51 UTC (permalink / raw)
  To: x86
  Cc: bp, Tom Lendacky, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andy Lutomirski, linux-kernel, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Provide support for Secure Encrypted Virtualization (SEV). This initial
support defines a flag that is used by the kernel to determine if it is
running with SEV active.
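
For reference, a rough sketch of how later patches in this series consume the
flag (both lines appear in the mem_encrypt_init() hunks earlier in the thread):

        if (sev_active())
                dma_ops = &sev_dma_ops;         /* SEV: bounce-buffer DMA through SWIOTLB */

        if (sev_active())
                static_branch_enable(&__sev);   /* SEV: unroll rep-string port I/O */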

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---

Changes since v6:
 * do not initialise static sev_enabled to false

 arch/x86/include/asm/mem_encrypt.h |  6 ++++++
 arch/x86/mm/mem_encrypt.c          | 26 ++++++++++++++++++++++++++
 include/linux/mem_encrypt.h        |  7 +++++--
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 6a77c63540f7..2b024741bce9 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -47,6 +47,9 @@ void __init mem_encrypt_init(void);
 
 void swiotlb_set_mem_attributes(void *vaddr, unsigned long size);
 
+bool sme_active(void);
+bool sev_active(void);
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define sme_me_mask	0ULL
@@ -64,6 +67,9 @@ static inline void __init sme_early_init(void) { }
 static inline void __init sme_encrypt_kernel(void) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
+static inline bool sme_active(void) { return false; }
+static inline bool sev_active(void) { return false; }
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 /*
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 16c5f37933a2..add836d3d174 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -42,6 +42,8 @@ static char sme_cmdline_off[] __initdata = "off";
 u64 sme_me_mask __section(.data) = 0;
 EXPORT_SYMBOL_GPL(sme_me_mask);
 
+static bool sev_enabled __section(.data);
+
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
 
@@ -192,6 +194,30 @@ void __init sme_early_init(void)
 		protection_map[i] = pgprot_encrypted(protection_map[i]);
 }
 
+/*
+ * SME and SEV are very similar but they are not the same, so there are
+ * times that the kernel will need to distinguish between SME and SEV. The
+ * sme_active() and sev_active() functions are used for this.  When a
+ * distinction isn't needed, the mem_encrypt_active() function can be used.
+ *
+ * The trampoline code is a good example for this requirement.  Before
+ * paging is activated, SME will access all memory as decrypted, but SEV
+ * will access all memory as encrypted.  So, when APs are being brought
+ * up under SME the trampoline area cannot be encrypted, whereas under SEV
+ * the trampoline area must be encrypted.
+ */
+bool sme_active(void)
+{
+	return sme_me_mask && !sev_enabled;
+}
+EXPORT_SYMBOL_GPL(sme_active);
+
+bool sev_active(void)
+{
+	return sme_me_mask && sev_enabled;
+}
+EXPORT_SYMBOL_GPL(sev_active);
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void)
 {
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 265a9cd21cb4..b310a9c18113 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -23,11 +23,14 @@
 
 #define sme_me_mask	0ULL
 
+static inline bool sme_active(void) { return false; }
+static inline bool sev_active(void) { return false; }
+
 #endif	/* CONFIG_ARCH_HAS_MEM_ENCRYPT */
 
-static inline bool sme_active(void)
+static inline bool mem_encrypt_active(void)
 {
-	return !!sme_me_mask;
+	return sme_me_mask;
 }
 
 static inline u64 sme_get_me_mask(void)
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6.1 14/17] x86: Add support for changing memory encryption attribute in early boot
  2017-10-16 15:34 ` [Part1 PATCH v6 14/17] x86: Add support for changing memory encryption attribute in early boot Brijesh Singh
  2017-10-16 18:20   ` Borislav Petkov
@ 2017-10-16 19:56   ` Brijesh Singh
  2017-10-16 21:25     ` Borislav Petkov
  1 sibling, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-16 19:56 UTC (permalink / raw)
  To: x86
  Cc: bp, Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, linux-kernel, Tom Lendacky

Some KVM-specific custom MSRs share the guest physical address with the
hypervisor in early boot. When SEV is active, the shared physical address
must be mapped with memory encryption attribute cleared so that both
hypervisor and guest can access the data.

Add APIs to change the memory encryption attribute in early boot code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---

Changes since v6:
 * applied the improvements from Boris

 arch/x86/include/asm/mem_encrypt.h |   8 +++
 arch/x86/mm/mem_encrypt.c          | 131 +++++++++++++++++++++++++++++++++++++
 2 files changed, 139 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 2b024741bce9..3ba68c92be1b 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -42,6 +42,9 @@ void __init sme_early_init(void);
 void __init sme_encrypt_kernel(void);
 void __init sme_enable(struct boot_params *bp);
 
+int __init early_set_memory_decrypted(resource_size_t paddr, unsigned long size);
+int __init early_set_memory_encrypted(resource_size_t paddr, unsigned long size);
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void);
 
@@ -70,6 +73,11 @@ static inline void __init sme_enable(struct boot_params *bp) { }
 static inline bool sme_active(void) { return false; }
 static inline bool sev_active(void) { return false; }
 
+static inline int __init
+early_set_memory_decrypted(resource_size_t paddr, unsigned long size) { return 0; }
+static inline int __init
+early_set_memory_encrypted(resource_size_t paddr, unsigned long size) { return 0; }
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 /*
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 45110b60be21..476c80c67030 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -30,6 +30,8 @@
 #include <asm/msr.h>
 #include <asm/cmdline.h>
 
+#include "mm_internal.h"
+
 static char sme_cmdline_arg[] __initdata = "mem_encrypt";
 static char sme_cmdline_on[]  __initdata = "on";
 static char sme_cmdline_off[] __initdata = "off";
@@ -260,6 +262,135 @@ static void sev_free(struct device *dev, size_t size, void *vaddr,
 	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
 }
 
+static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
+{
+	pgprot_t old_prot, new_prot;
+	unsigned long pfn, pa, size;
+	pte_t new_pte;
+
+	switch (level) {
+	case PG_LEVEL_4K:
+		pfn = pte_pfn(*kpte);
+		old_prot = pte_pgprot(*kpte);
+		break;
+	case PG_LEVEL_2M:
+		pfn = pmd_pfn(*(pmd_t *)kpte);
+		old_prot = pmd_pgprot(*(pmd_t *)kpte);
+		break;
+	case PG_LEVEL_1G:
+		pfn = pud_pfn(*(pud_t *)kpte);
+		old_prot = pud_pgprot(*(pud_t *)kpte);
+		break;
+	default:
+		return;
+	}
+
+	new_prot = old_prot;
+	if (enc)
+		pgprot_val(new_prot) |= _PAGE_ENC;
+	else
+		pgprot_val(new_prot) &= ~_PAGE_ENC;
+
+	/* If prot is same then do nothing. */
+	if (pgprot_val(old_prot) == pgprot_val(new_prot))
+		return;
+
+	pa = pfn << page_level_shift(level);
+	size = page_level_size(level);
+
+	/*
+	 * We are going to perform in-place en-/decryption and change the
+	 * physical page attribute from C=1 to C=0 or vice versa. Flush the
+	 * caches to ensure that data gets accessed with the correct C-bit.
+	 */
+	clflush_cache_range(__va(pa), size);
+
+	/* Encrypt/decrypt the contents in-place */
+	if (enc)
+		sme_early_encrypt(pa, size);
+	else
+		sme_early_decrypt(pa, size);
+
+	/* Change the page encryption mask. */
+	new_pte = pfn_pte(pfn, new_prot);
+	set_pte_atomic(kpte, new_pte);
+}
+
+static int __init early_set_memory_enc_dec(resource_size_t paddr,
+					   unsigned long size, bool enc)
+{
+	unsigned long vaddr, vaddr_end, vaddr_next;
+	unsigned long psize, pmask;
+	int split_page_size_mask;
+	int level, ret;
+	pte_t *kpte;
+
+	vaddr = (unsigned long)__va(paddr);
+	vaddr_next = vaddr;
+	vaddr_end = vaddr + size;
+
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		kpte = lookup_address(vaddr, &level);
+		if (!kpte || pte_none(*kpte)) {
+			ret = 1;
+			goto out;
+		}
+
+		if (level == PG_LEVEL_4K) {
+			__set_clr_pte_enc(kpte, level, enc);
+			vaddr_next = (vaddr & PAGE_MASK) + PAGE_SIZE;
+			continue;
+		}
+
+		psize = page_level_size(level);
+		pmask = page_level_mask(level);
+
+		/*
+		 * Check whether we can change the large page in one go.
+		 * We request a split when the address is not aligned and
+		 * the number of pages to set/clear encryption bit is smaller
+		 * than the number of pages in the large page.
+		 */
+		if (vaddr == (vaddr & pmask) &&
+		    ((vaddr_end - vaddr) >= psize)) {
+			__set_clr_pte_enc(kpte, level, enc);
+			vaddr_next = (vaddr & pmask) + psize;
+			continue;
+		}
+
+		/*
+		 * The virtual address is part of a larger page, create the next
+		 * level page table mapping (4K or 2M). If it is part of a 2M
+		 * page then we request a split of the large page into 4K
+		 * chunks. A 1GB large page is split into 2M pages, resp.
+		 */
+		if (level == PG_LEVEL_2M)
+			split_page_size_mask = 0;
+		else
+			split_page_size_mask = 1 << PG_LEVEL_2M;
+
+		kernel_physical_mapping_init(__pa(vaddr & pmask),
+					     __pa((vaddr_end & pmask) + psize),
+					     split_page_size_mask);
+	}
+
+	ret = 0;
+
+out:
+	__flush_tlb_all();
+	return ret;
+}
+
+int __init early_set_memory_decrypted(resource_size_t paddr, unsigned long size)
+{
+	return early_set_memory_enc_dec(paddr, size, false);
+}
+
+int __init early_set_memory_encrypted(resource_size_t paddr, unsigned long size)
+{
+	return early_set_memory_enc_dec(paddr, size, true);
+}
+
 /*
  * SME and SEV are very similar but they are not the same, so there are
  * times that the kernel will need to distinguish between SME and SEV. The
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6.1 14/17] x86: Add support for changing memory encryption attribute in early boot
  2017-10-16 19:56   ` [Part1 PATCH v6.1 " Brijesh Singh
@ 2017-10-16 21:25     ` Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-10-16 21:25 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, linux-kernel,
	Tom Lendacky

On Mon, Oct 16, 2017 at 02:56:08PM -0500, Brijesh Singh wrote:
> Some KVM-specific custom MSRs share the guest physical address with the
> hypervisor in early boot. When SEV is active, the shared physical address
> must be mapped with memory encryption attribute cleared so that both
> hypervisor and guest can access the data.
> 
> Add APIs to change the memory encryption attribute in early boot code.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Improvements-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
> 
> Changes since v6:
>  * applied the improvements from Boris
> 
>  arch/x86/include/asm/mem_encrypt.h |   8 +++
>  arch/x86/mm/mem_encrypt.c          | 131 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 139 insertions(+)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-16 15:34 ` [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active Brijesh Singh
@ 2017-10-16 22:24   ` Borislav Petkov
  2017-10-17  1:43     ` Brijesh Singh
  2017-10-19 13:44   ` [Part1 PATCH v6.1 " Brijesh Singh
  1 sibling, 1 reply; 36+ messages in thread
From: Borislav Petkov @ 2017-10-16 22:24 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Paolo Bonzini,
	Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm

On Mon, Oct 16, 2017 at 10:34:22AM -0500, Brijesh Singh wrote:
> When SEV is active, guest memory is encrypted with a guest-specific key, so
> a guest memory region shared with the hypervisor must be mapped as decrypted
> before we can share it.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kvm@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
> 
> Changes since v5:
> early_set_memory_decrypted() takes care of decrypting the memory contents
> and changing the C bit, hence there is no need to use an explicit call
> (sme_early_decrypt()) to decrypt the memory contents.
> 
> Boris,
> 
> I removed your R-b since I was not sure if you are okay with the above changes.
> Please let me know if you are okay with the changes. Thanks.
> 
>  arch/x86/kernel/kvm.c | 35 ++++++++++++++++++++++++++++++++---
>  1 file changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 8bb9594d0761..ff0f04077925 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
>  
>  early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
>  
> -static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> -static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
> +static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> +static DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64);
>  static int has_steal_clock = 0;
>  
>  /*
> @@ -312,7 +312,7 @@ static void kvm_register_steal_time(void)
>  		cpu, (unsigned long long) slow_virt_to_phys(st));
>  }
>  
> -static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
> +static DEFINE_PER_CPU_DECRYPTED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
>  
>  static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
>  {
> @@ -426,9 +426,37 @@ void kvm_disable_steal_time(void)
>  	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
>  }
>  
> +static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
> +{
> +	early_set_memory_decrypted(slow_virt_to_phys(ptr), size);
> +}

Ok, so this looks like useless conversion:

you pass in a virtual address, it gets converted to a physical address
with slow_virt_to_phys() and then in early_set_memory_enc_dec() gets
converted to a virtual address again.

Why? Why not pass the virtual address directly?
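
For illustration only, a minimal sketch of the shape being suggested here
(not the posted patch; the helper simply hands the virtual address through):

	static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
	{
		/* no phys/virt round-trip: pass the virtual address directly */
		early_set_memory_decrypted((unsigned long)ptr, size);
	}

early_set_memory_decrypted() would then take the virtual address instead of
a resource_size_t physical address.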

> +/*
> + * Iterate through all possible CPUs and map the memory region pointed
> + * by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
> + *
> + * Note: we iterate through all possible CPUs to ensure that CPUs
> + * hotplugged will have their per-cpu variable already mapped as
> + * decrypted.
> + */
> +static void __init sev_map_percpu_data(void)
> +{
> +	int cpu;
> +
> +	if (!sev_active())
> +		return;
> +
> +	for_each_possible_cpu(cpu) {
> +		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
> +		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
> +		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
> +	}
> +}
> +
>  #ifdef CONFIG_SMP
>  static void __init kvm_smp_prepare_boot_cpu(void)
>  {
> +	sev_map_percpu_data();
>  	kvm_guest_cpu_init();
>  	native_smp_prepare_boot_cpu();
>  	kvm_spinlock_init();
> @@ -496,6 +524,7 @@ void __init kvm_guest_init(void)
>  				      kvm_cpu_online, kvm_cpu_down_prepare) < 0)
>  		pr_err("kvm_guest: Failed to install cpu hotplug callbacks\n");
>  #else
> +	sev_map_percpu_data();
>  	kvm_guest_cpu_init();
>  #endif

Why isn't it enough to call

	sev_map_percpu_data()

at the end of kvm_guest_init() only but you have to call it in
kvm_smp_prepare_boot_cpu() too? I mean, once you map those things
decrypted, there's no need to do them again...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-16 22:24   ` Borislav Petkov
@ 2017-10-17  1:43     ` Brijesh Singh
  2017-10-17  8:20       ` Borislav Petkov
  0 siblings, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-17  1:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm



On 10/16/17 5:24 PM, Borislav Petkov wrote:
...

>>  
>> +static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>> +{
>> +	early_set_memory_decrypted(slow_virt_to_phys(ptr), size);
>> +}
> Ok, so this looks like useless conversion:
>
> you pass in a virtual address, it gets converted to a physical address
> with slow_virt_to_phys() and then in early_set_memory_enc_dec() gets
> converted to a virtual address again.
>
> Why? Why not pass the virtual address directly?


Actually, I worked to enable the kvmclock support before the
kvm-stealtime, eoi and apf_reason. The kvmclock uses memblock_alloc() to
allocate the shared memory, and since memblock_alloc() returns the
physical address I used the same input type as an argument to
early_set_memory_decrypted(). If you want me to change the input to
accept the virtual address then I have no issue doing so. But the
changes need to be propagated to kvmclock (i.e. PATCH 17/17) to use __va().
Please let me know if you want me to pass the virtual address.


>> +/*
>> + * Iterate through all possible CPUs and map the memory region pointed
>> + * by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
>> + *
>> + * Note: we iterate through all possible CPUs to ensure that CPUs
>> + * hotplugged will have their per-cpu variable already mapped as
>> + * decrypted.
>> + */
>> +static void __init sev_map_percpu_data(void)
>> +{
>> +	int cpu;
>> +
>> +	if (!sev_active())
>> +		return;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
>> +		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
>> +		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
>> +	}
>> +}
>> +
>>  #ifdef CONFIG_SMP
>>  static void __init kvm_smp_prepare_boot_cpu(void)
>>  {
>> +	sev_map_percpu_data();
>>  	kvm_guest_cpu_init();
>>  	native_smp_prepare_boot_cpu();
>>  	kvm_spinlock_init();
>> @@ -496,6 +524,7 @@ void __init kvm_guest_init(void)
>>  				      kvm_cpu_online, kvm_cpu_down_prepare) < 0)
>>  		pr_err("kvm_guest: Failed to install cpu hotplug callbacks\n");
>>  #else
>> +	sev_map_percpu_data();
>>  	kvm_guest_cpu_init();
>>  #endif
> Why isn't it enough to call
>
> 	sev_map_percpu_data()
>
> at the end of kvm_guest_init() only but you have to call it in
> kvm_smp_prepare_boot_cpu() too? I mean, once you map those things
> decrypted, there's no need to do them again...
>

IIRC, we tried clearing the C bit in kvm_guest_init(), but since
kvm_guest_init() is called before setup_per_cpu_areas(),
per_cpu_ptr(var, cpu_id) was not able to get another processor's copy
of the variable.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-17  1:43     ` Brijesh Singh
@ 2017-10-17  8:20       ` Borislav Petkov
  2017-10-17 11:54         ` Brijesh Singh
  0 siblings, 1 reply; 36+ messages in thread
From: Borislav Petkov @ 2017-10-17  8:20 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Paolo Bonzini,
	Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm

On Mon, Oct 16, 2017 at 08:43:15PM -0500, Brijesh Singh wrote:
> Actually, I worked to enable the kvmclock support before the
> kvm-stealtime, eoi and apf_reason. The kvmclock uses memblock_alloc() to
> allocate the shared memory, and since memblock_alloc() returns the
> physical address I used the same input type as an argument to
> early_set_memory_decrypted(). If you want me to change the input to
> accept the virtual address then I have no issue doing so. But the
> changes need to be propagated to kvmclock (i.e. PATCH 17/17) to use __va().

And? You already convert addresses you've gotten from memblock with
__va there.

> Please let me know if you want me to pass the virtual address.

Yes please. The kernel generally handles virtual addresses; the
physical addresses you get from memblock you simply convert once and
hand in for decryption.
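
Roughly, for the memblock case that is a single conversion (sketch only,
assuming an sev_active() guard as elsewhere in this series):

	phys_addr_t mem;

	mem = memblock_alloc(size, PAGE_SIZE);	/* returns a physical address */
	if (sev_active())
		early_set_memory_decrypted((unsigned long)__va(mem), size);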

> IIRC, we tried clearing the C bit in kvm_guest_init(), but since
> kvm_guest_init() is called before setup_per_cpu_areas(),
> per_cpu_ptr(var, cpu_id) was not able to get another processor's copy
> of the variable.

But you are still calling it in kvm_guest_init(). So that second call
can be removed and you can call it only in kvm_smp_prepare_boot_cpu()?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-17  8:20       ` Borislav Petkov
@ 2017-10-17 11:54         ` Brijesh Singh
  2017-10-17 13:35           ` Borislav Petkov
  0 siblings, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-17 11:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm



On 10/17/17 3:20 AM, Borislav Petkov wrote:
> On Mon, Oct 16, 2017 at 08:43:15PM -0500, Brijesh Singh wrote:
>> Actually, I worked to enable the kvmclock support before the
>> kvm-stealtime, eoi and apf_reason. The kvmclock uses memblock_alloc() to
>> allocate the shared memory, and since memblock_alloc() returns the
>> physical address I used the same input type as an argument to
>> early_set_memory_decrypted(). If you want me to change the input to
>> accept the virtual address then I have no issue doing so. But the
>> changes need to be propagated to kvmclock (i.e. PATCH 17/17) to use __va().
> And? You already convert addresses you've gotten from memblock with
> __va there.
>
>> Please let me know if you want me to pass the virtual address.
> Yes please. The kernel generally handles virtual addresses and the
> physical addresses you get from memblock, you simply convert once and
> hand in for decryption.


Will do. Do you want me to send v7 with that addressed? Because this
requires changes in 3 patches (PATCH 14, 16, 17).

>
>> IIRC, we tried clearing the C bit in kvm_guest_init(), but since
>> kvm_guest_init() is called before setup_per_cpu_areas(),
>> per_cpu_ptr(var, cpu_id) was not able to get another processor's copy
>> of the variable.
> But you are still calling it in kvm_guest_init(). So that second call
> can be removed and you can call it only in kvm_smp_prepare_boot_cpu()?
>

The second call is for the UP case. kvm_smp_prepare_boot_cpu() is
called only when CONFIG_SMP is enabled. Am I missing something?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-17 11:54         ` Brijesh Singh
@ 2017-10-17 13:35           ` Borislav Petkov
  2017-10-17 15:42             ` Brijesh Singh
  0 siblings, 1 reply; 36+ messages in thread
From: Borislav Petkov @ 2017-10-17 13:35 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Paolo Bonzini,
	Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm

On Tue, Oct 17, 2017 at 06:54:52AM -0500, Brijesh Singh wrote:
> Will do. Do you want me to send v7 with that addressed? Because this
> requires changes in 3 patches (PATCH 14, 16, 17).

Just send me those three first as a reply to the thread so that I can
test the whole queue everywhere. Or even better, point me to the v7 git
branch once you have it ready.

> The second call is for the UP case. kvm_smp_prepare_boot_cpu() is
> called only when CONFIG_SMP is enabled. Am I missing something?

Yes, you are.

kvm_guest_init() gets called unconditionally from setup_arch(). But then
you said kvm_guest_init() is called before setup_per_cpu_areas(), so why
do you need that call there at all? The percpu areas are not ready yet;
what makes them ready in the UP case?

IOW, this sev_map_percpu_data() needs to happen only once, during boot.
So call it only once by finding the right spot and not by adding a
second call for the UP case.

AFAICT, it looks the easiest if you put it in kvm_guest_cpu_init() and
do something like:

	if (smp_processor_id() == boot_cpu_data.cpu_index)
		sev_map_percpu_data();

...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-17 13:35           ` Borislav Petkov
@ 2017-10-17 15:42             ` Brijesh Singh
  2017-10-17 16:23               ` Borislav Petkov
  0 siblings, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-17 15:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm



On 10/17/17 8:35 AM, Borislav Petkov wrote:
...

>> The second call is for the UP case. kvm_smp_prepare_boot_cpu() is
>> called only when CONFIG_SMP is enabled. Am I missing something?
> Yes, you are.
>
> kvm_guest_init() gets called unconditionally from setup_arch(). But then
> you said kvm_guest_init() is called before setup_per_cpu_areas(), so why
> do you need that call there at all? The percpu areas are not ready yet;
> what makes them ready in the UP case?


I believe in the UP case setup_per_cpu_areas() does not do anything
special for the static per-cpu variables, so it's fine to access them.
If we look at kvm_guest_init(), it directly jumps to
kvm_guest_cpu_init() in the UP case. But with CONFIG_SMP,
kvm_guest_cpu_init() is called from kvm_smp_prepare_boot_cpu().
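
For reference, the two paths in kvm_guest_init() with this patch applied
look roughly like this (abridged):

#ifdef CONFIG_SMP
	smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
	if (cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "x86/kvm:online",
				      kvm_cpu_online, kvm_cpu_down_prepare) < 0)
		pr_err("kvm_guest: Failed to install cpu hotplug callbacks\n");
#else
	sev_map_percpu_data();
	kvm_guest_cpu_init();
#endif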


> IOW, this sev_map_percpu_data() needs to happen only once, during boot.
> So call it only once by finding the right spot and not by adding a
> second call for the UP case.
>
> AFAICT, it looks the easiest if you put it in kvm_guest_cpu_init() and
> do something like:
>
> 	if (smp_processor_id() == boot_cpu_data.cpu_index)
> 		sev_map_percpu_data();
>
> ...
>

OK, this goes back to your initial feedback during RFC v3, where I tried
to do a similar thing. But since sev_map_percpu_data() uses __init
functions, we would need to mark kvm_guest_cpu_init() as __ref; you
didn't like that idea and asked me to call sev_map_percpu_data()
from kvm_smp_prepare_boot_cpu(), which is already __init.


 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-17 15:42             ` Brijesh Singh
@ 2017-10-17 16:23               ` Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-10-17 16:23 UTC (permalink / raw)
  To: Brijesh Singh, Paolo Bonzini
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Radim Krčmář,
	Tom Lendacky, linux-kernel, kvm

On Tue, Oct 17, 2017 at 10:42:45AM -0500, Brijesh Singh wrote:
> OK, this goes back to your initial feedback during RFC v3, where I tried
> to do a similar thing. But since sev_map_percpu_data() uses __init
> functions, we would need to mark kvm_guest_cpu_init() as __ref; you
> didn't like that idea and asked me to call sev_map_percpu_data()
> from kvm_smp_prepare_boot_cpu(), which is already __init.

So looking at this more, I'm wondering how I missed this: this whole
SEV code doesn't belong in kvm.c but in svm.c! This is AMD-only and
should be there.

Now, for that you need those three percpu variables to not be static, but
I think we can do this by adding their declarations to a kvm-internal.h
header at arch/x86/kernel/kvm-internal.h, i.e., not in the include path,
so that only kvm-related units can include it.
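
Something like this hypothetical header (sketch only; it assumes the
DECLARE_PER_CPU_DECRYPTED() counterpart to DEFINE_PER_CPU_DECRYPTED from
patch 15/17):

/* arch/x86/kernel/kvm-internal.h - hypothetical sketch */
#ifndef _X86_KERNEL_KVM_INTERNAL_H
#define _X86_KERNEL_KVM_INTERNAL_H

#include <linux/percpu-defs.h>
#include <asm/kvm_para.h>	/* struct kvm_vcpu_pv_apf_data, kvm_steal_time */

DECLARE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason);
DECLARE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time);
DECLARE_PER_CPU_DECRYPTED(unsigned long, kvm_apic_eoi);

#endif /* _X86_KERNEL_KVM_INTERNAL_H */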

@Paolo: thoughts?

Then, you can call sev_map_percpu_data() during module init, i.e.,
svm_init(). Or in one of the init functions you're adding later,
sev_hardware_setup() for example, or so. I.e., one of those init paths.
Because, AFAICT, you want to clear the physical C-bit only once in the
percpu pages.

I think this is much better than the "grafted" situation now.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6.1 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-16 15:34 ` [Part1 PATCH v6 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active Brijesh Singh
  2017-10-16 22:24   ` Borislav Petkov
@ 2017-10-19 13:44   ` Brijesh Singh
  2017-10-19 15:40     ` Borislav Petkov
  1 sibling, 1 reply; 36+ messages in thread
From: Brijesh Singh @ 2017-10-19 13:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, x86, linux-kernel, kvm

When SEV is active, guest memory is encrypted with a guest-specific key, so
a guest memory region shared with the hypervisor must be mapped as decrypted
before we can share it.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---

Changes since v6:
 * use the virtual address when calling early_set_memory_decrypted()

 arch/x86/kernel/kvm.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594d0761..3df743b60c80 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
 
 early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
-static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
-static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
+static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
+static DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
 
 /*
@@ -312,7 +312,7 @@ static void kvm_register_steal_time(void)
 		cpu, (unsigned long long) slow_virt_to_phys(st));
 }
 
-static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
+static DEFINE_PER_CPU_DECRYPTED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
 
 static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 {
@@ -426,9 +426,42 @@ void kvm_disable_steal_time(void)
 	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
+{
+	early_set_memory_decrypted((unsigned long) ptr, size);
+}
+
+/*
+ * Iterate through all possible CPUs and map the memory regions pointed
+ * to by apf_reason, steal_time and kvm_apic_eoi as decrypted at once.
+ *
+ * Note: we iterate through all possible CPUs to ensure that CPUs
+ * hotplugged later will have their per-cpu variables already mapped
+ * as decrypted.
+ */
+static void __init sev_map_percpu_data(void)
+{
+	int cpu;
+
+	if (!sev_active())
+		return;
+
+	for_each_possible_cpu(cpu) {
+		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
+		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
+		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
+	}
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
+	/*
+	 * Map the per-cpu variables as decrypted before kvm_guest_cpu_init()
+	 * shares the guest physical address with the hypervisor.
+	 */
+	sev_map_percpu_data();
+
 	kvm_guest_cpu_init();
 	native_smp_prepare_boot_cpu();
 	kvm_spinlock_init();
@@ -496,6 +529,7 @@ void __init kvm_guest_init(void)
 				      kvm_cpu_online, kvm_cpu_down_prepare) < 0)
 		pr_err("kvm_guest: Failed to install cpu hotplug callbacks\n");
 #else
+	sev_map_percpu_data();
 	kvm_guest_cpu_init();
 #endif
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Part1 PATCH v6.1 17/17] X86/KVM: Clear encryption attribute when SEV is active
  2017-10-16 15:34 ` [Part1 PATCH v6 17/17] X86/KVM: Clear encryption attribute " Brijesh Singh
@ 2017-10-19 13:47   ` Brijesh Singh
  0 siblings, 0 replies; 36+ messages in thread
From: Brijesh Singh @ 2017-10-19 13:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, x86, linux-kernel, kvm

The guest physical memory areas holding struct pvclock_wall_clock and
struct pvclock_vcpu_time_info are shared with the hypervisor, which
periodically updates their contents. When SEV is active, we must clear
the encryption attribute from the shared memory pages so that both the
hypervisor and the guest can access the data.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---

Changes since v6:
 * use the virtual address when calling early_set_memory_decrypted()

 arch/x86/entry/vdso/vma.c  |  5 ++--
 arch/x86/kernel/kvmclock.c | 65 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 57 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 1911310959f8..d63053142b16 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -114,10 +114,11 @@ static int vvar_fault(const struct vm_special_mapping *sm,
 		struct pvclock_vsyscall_time_info *pvti =
 			pvclock_pvti_cpu0_va();
 		if (pvti && vclock_was_used(VCLOCK_PVCLOCK)) {
-			ret = vm_insert_pfn(
+			ret = vm_insert_pfn_prot(
 				vma,
 				vmf->address,
-				__pa(pvti) >> PAGE_SHIFT);
+				__pa(pvti) >> PAGE_SHIFT,
+				pgprot_decrypted(vma->vm_page_prot));
 		}
 	} else if (sym_offset == image->sym_hvclock_page) {
 		struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index d88967659098..0d6d92e78fac 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
 
+#include <asm/mem_encrypt.h>
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
 #include <asm/kvmclock.h>
@@ -45,7 +46,7 @@ early_param("no-kvmclock", parse_no_kvmclock);
 
 /* The hypervisor will put information about time periodically here */
 static struct pvclock_vsyscall_time_info *hv_clock;
-static struct pvclock_wall_clock wall_clock;
+static struct pvclock_wall_clock *wall_clock;
 
 struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
 {
@@ -64,15 +65,15 @@ static void kvm_get_wallclock(struct timespec *now)
 	int low, high;
 	int cpu;
 
-	low = (int)__pa_symbol(&wall_clock);
-	high = ((u64)__pa_symbol(&wall_clock) >> 32);
+	low = (int)slow_virt_to_phys(wall_clock);
+	high = ((u64)slow_virt_to_phys(wall_clock) >> 32);
 
 	native_write_msr(msr_kvm_wall_clock, low, high);
 
 	cpu = get_cpu();
 
 	vcpu_time = &hv_clock[cpu].pvti;
-	pvclock_read_wallclock(&wall_clock, vcpu_time, now);
+	pvclock_read_wallclock(wall_clock, vcpu_time, now);
 
 	put_cpu();
 }
@@ -249,11 +250,39 @@ static void kvm_shutdown(void)
 	native_machine_shutdown();
 }
 
+static phys_addr_t __init kvm_memblock_alloc(phys_addr_t size,
+					     phys_addr_t align)
+{
+	phys_addr_t mem;
+
+	mem = memblock_alloc(size, align);
+	if (!mem)
+		return 0;
+
+	if (sev_active()) {
+		if (early_set_memory_decrypted((unsigned long)__va(mem), size))
+			goto e_free;
+	}
+
+	return mem;
+e_free:
+	memblock_free(mem, size);
+	return 0;
+}
+
+static void __init kvm_memblock_free(phys_addr_t addr, phys_addr_t size)
+{
+	if (sev_active())
+		early_set_memory_encrypted((unsigned long)__va(addr), size);
+
+	memblock_free(addr, size);
+}
+
 void __init kvmclock_init(void)
 {
 	struct pvclock_vcpu_time_info *vcpu_time;
-	unsigned long mem;
-	int size, cpu;
+	unsigned long mem, mem_wall_clock;
+	int size, cpu, wall_clock_size;
 	u8 flags;
 
 	size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
@@ -267,21 +296,35 @@ void __init kvmclock_init(void)
 	} else if (!(kvmclock && kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE)))
 		return;
 
-	printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
-		msr_kvm_system_time, msr_kvm_wall_clock);
+	wall_clock_size = PAGE_ALIGN(sizeof(struct pvclock_wall_clock));
+	mem_wall_clock = kvm_memblock_alloc(wall_clock_size, PAGE_SIZE);
+	if (!mem_wall_clock)
+		return;
 
-	mem = memblock_alloc(size, PAGE_SIZE);
-	if (!mem)
+	wall_clock = __va(mem_wall_clock);
+	memset(wall_clock, 0, wall_clock_size);
+
+	mem = kvm_memblock_alloc(size, PAGE_SIZE);
+	if (!mem) {
+		kvm_memblock_free(mem_wall_clock, wall_clock_size);
+		wall_clock = NULL;
 		return;
+	}
+
 	hv_clock = __va(mem);
 	memset(hv_clock, 0, size);
 
 	if (kvm_register_clock("primary cpu clock")) {
 		hv_clock = NULL;
-		memblock_free(mem, size);
+		kvm_memblock_free(mem, size);
+		kvm_memblock_free(mem_wall_clock, wall_clock_size);
+		wall_clock = NULL;
 		return;
 	}
 
+	printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
+		msr_kvm_system_time, msr_kvm_wall_clock);
+
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
 		pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6.1 16/17] X86/KVM: Decrypt shared per-cpu variables when SEV is active
  2017-10-19 13:44   ` [Part1 PATCH v6.1 " Brijesh Singh
@ 2017-10-19 15:40     ` Borislav Petkov
  0 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-10-19 15:40 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Borislav Petkov,
	Paolo Bonzini, Radim Krčmář,
	Tom Lendacky, x86, linux-kernel, kvm

On Thu, Oct 19, 2017 at 08:44:44AM -0500, Brijesh Singh wrote:
> When SEV is active, guest memory is encrypted with a guest-specific key, so
> a guest memory region shared with the hypervisor must be mapped as decrypted
> before we can share it.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kvm@vger.kernel.org
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
> 
> Changes since v6:
>  * use the virtual address when calling early_set_memory_decrypted()
> 
>  arch/x86/kernel/kvm.c | 40 +++++++++++++++++++++++++++++++++++++---
>  1 file changed, 37 insertions(+), 3 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD)
  2017-10-16 15:34 [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
                   ` (16 preceding siblings ...)
  2017-10-16 15:42 ` [Part1 PATCH v6 00/17] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
@ 2017-10-20  9:24 ` Borislav Petkov
  17 siblings, 0 replies; 36+ messages in thread
From: Borislav Petkov @ 2017-10-20  9:24 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Andy Lutomirski, Tom Lendacky, Paolo Bonzini,
	Radim Krčmář,
	kvm, linux-kernel

On Mon, Oct 16, 2017 at 10:34:06AM -0500, Brijesh Singh wrote:
> This part of Secure Encrypted Virtualization (SEV) series focuses on the
> changes required in a guest OS for SEV support.

...

> ---
> This series is based on tip/master commit : 3594329f88c5 (Merge branch 'linus')
> 
> Complete git tree is available: https://github.com/codomania/tip/tree/v6-p1
> 
> Changes since v5:
>  * enhance early_set_memory_decrypted() to do memory contents encrypt/decrypt in
>    addition to C bit changes.

Ok, this passes testing here. Please add the randconfig hunk from
yesterday and send what would look like a final v7! :-)

Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>

Thanks!

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 36+ messages in thread
