* [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD)
From: Brijesh Singh @ 2017-03-02 15:12 UTC
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

This RFC series provides support for AMD's new Secure Encrypted Virtualization
(SEV) feature. This RFC is built upon the Secure Memory Encryption (SME) RFCv4 [1].

SEV is an extension to the AMD-V architecture which supports running multiple
VMs under the control of a hypervisor. When enabled, SEV hardware tags all
code and data with its VM ASID, which indicates which VM the data originated
from or is intended for. This tag is kept with the data at all times while it
is inside the SoC, and prevents that data from being used by anyone other than
the owner. While the tag protects VM data inside the SoC, 128-bit AES
encryption protects data outside the SoC. When data leaves or enters the SoC,
it is encrypted or decrypted, respectively, by hardware with a key based on
the associated tag.

SEV guest VMs have the concept of private and shared memory. Private memory is
encrypted with a guest-specific key, while shared memory may be encrypted with
the hypervisor key. Certain types of memory (namely instruction pages and
guest page tables) are always treated as private memory by the hardware.
For data memory, SEV guest VMs can choose which pages they would like to be
private. The choice is made through the standard CPU page tables using the
C-bit, and is fully controlled by the guest. For security reasons, all DMA
operations inside the guest must be performed on shared pages (C-bit clear).
Note that since the C-bit is only controllable by the guest OS when it is
operating in 64-bit or 32-bit PAE mode, in all other modes the SEV hardware
forces the C-bit to a 1.
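
As a concrete illustration, the sketch below shows a guest marking one 4K data
page as shared by clearing the C-bit in its PTE. This is not code from the
series: sme_me_mask and the header location are borrowed from the SME patches,
and the helper name is hypothetical.

#include <asm/mem_encrypt.h>	/* sme_me_mask, per the SME patches */
#include <asm/pgtable.h>
#include <asm/tlbflush.h>

/*
 * Hypothetical guest helper: make one 4K data page shared (C-bit
 * clear) so it can be used for DMA.
 */
static void set_page_shared(pte_t *ptep)
{
	/* The C-bit lives in the physical-address bits of the PTE. */
	pte_t pte = __pte(pte_val(*ptep) & ~sme_me_mask);

	set_pte(ptep, pte);

	/* Stale translations may still carry the C-bit; flush them. */
	__flush_tlb_all();
}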

SEV is designed to protect guest VMs from a benign but vulnerable (i.e. not
fully malicious) hypervisor. In particular, it reduces the attack surface of
guest VMs and can prevent certain types of VM-escape bugs (e.g. hypervisor
read-anywhere) from being used to steal guest data.

The RFC series also expands the crypto driver (ccp.ko) to include support for
the Platform Security Processor (PSP), which is used for communicating with
the SEV firmware that runs within the AMD secure processor and provides a
secure key management interface. The hypervisor uses this interface to encrypt
the bootstrap code and perform common activities such as launching, running,
snapshotting, migrating and debugging encrypted guests.

A new ioctl (KVM_MEMORY_ENCRYPT_OP) is introduced which can be used by QEMU to
issue SEV guest life-cycle commands, as sketched below.
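
For illustration only, a QEMU-like VMM might drive the ioctl as follows. The
kvm_sev_cmd layout mirrors the uapi added by this series as best understood;
the exact field names are assumptions and may differ in later revisions.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Issue one SEV life-cycle command (e.g. LAUNCH_START) on a VM fd. */
static int sev_issue_cmd(int vm_fd, unsigned int cmd_id, void *data)
{
	struct kvm_sev_cmd cmd = { 0 };	/* layout assumed, see above */
	int ret, sev_fd = open("/dev/sev", O_RDWR);

	if (sev_fd < 0)
		return -1;		/* no /dev/sev access, no SEV commands */

	cmd.id = cmd_id;
	cmd.data = (__u64)(unsigned long)data;
	cmd.sev_fd = sev_fd;		/* proves access to /dev/sev */

	ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
	if (ret < 0)
		fprintf(stderr, "SEV cmd %u failed: fw error %u\n",
			cmd_id, cmd.error);

	close(sev_fd);
	return ret;
}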

The RFC series also includes the patches required in the guest OS to enable
the SEV feature. A guest OS can check for SEV support through the KVM_FEATURE
CPUID leaf, as in the sketch that follows.
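
A hedged guest-side probe: KVM exposes paravirt features via CPUID leaf
0x40000001, but the KVM_FEATURE_SEV bit position used below is an assumption,
not taken from the series.

#include <cpuid.h>
#include <stdbool.h>

#define KVM_CPUID_FEATURES	0x40000001
#define KVM_FEATURE_SEV_BIT	8	/* assumed bit position */

static bool guest_has_sev(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* A full probe would first verify the KVM signature at
	 * leaf 0x40000000; elided here for brevity. */
	__cpuid(KVM_CPUID_FEATURES, eax, ebx, ecx, edx);
	return eax & (1u << KVM_FEATURE_SEV_BIT);
}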

The patch breakdown:
* [1 - 17]: guest OS specific changes when SEV is active
* [18]: already queued in the KVM upstream tree but not yet in the tip tree;
  it is included here so that the build does not fail
* [19 - 21]: since the CCP and PSP share the same PCIe ID, these patches expand
  the CCP driver by creating a high-level AMD Secure Processor (SP) framework
  to allow integration of the PSP device into ccp.ko
* [22 - 32]: hypervisor changes to support memory encryption

The following links provide additional details:

AMD Memory Encryption whitepaper:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
    http://support.amd.com/TechDocs/24593.pdf
    SME is section 7.10
    SEV is section 15.34

Secure Encrypted Virtualization Key Management:
http://support.amd.com/TechDocs/55766_SEV-KM API_Specification.pdf

KVM Forum Presentation:
http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf

[1] http://marc.info/?l=linux-kernel&m=148725974113693&w=2

---

Based on the feedback, we have started adding SEV guest support to the OVMF
BIOS. This series has been tested using an EDK2/OVMF BIOS; the initial EDK2
patches have been submitted to the edk2 mailing list for discussion.

TODO:
 - add support for migration commands
 - update the QEMU RFCs to SEV spec 0.14
 - investigate virtio and vfio support for SEV guest
 - investigate SMM support for SEV guest
 - add support for nested virtualization

Changes since v1:
 - update to newer SEV key management API spec (0.12 -> 0.14)
 - expand the CCP driver and integrate the PSP interface support
 - remove the usage of SEV ref_count and release the SEV FW resources in
   kvm_x86_ops->vm_destroy
 - acquire the kvm->lock before executing SEV commands and release it on exit
 - rename ioctl from KVM_SEV_ISSUE_CMD to KVM_MEMORY_ENCRYPT_OP
 - extend the KVM_MEMORY_ENCRYPT_OP ioctl to require a file descriptor for the
   SEV device. A program without access to /dev/sev will not be able to issue
   SEV commands
 - update the VMCB on successful LAUNCH_FINISH to indicate that SEV is active
 - several fixes based on Paolo's review feedback
 - add APIs to support sharing the guest physical address with the hypervisor
 - update kvm pvclock driver to use the shared buffer when SEV is active
 - pin the SEV guest memory

Brijesh Singh (18):
      x86: mm: Provide support to use memblock when spliting large pages
      x86: Add support for changing memory encryption attribute in early boot
      x86: kvm: Provide support to create Guest and HV shared per-CPU variables
      x86: kvmclock: Clear encryption attribute when SEV is active
      crypto: ccp: Introduce the AMD Secure Processor device
      crypto: ccp: Add Platform Security Processor (PSP) interface support
      crypto: ccp: Add Secure Encrypted Virtualization (SEV) interface support
      kvm: svm: prepare to reserve asid for SEV guest
      kvm: introduce KVM_MEMORY_ENCRYPT_OP ioctl
      kvm: x86: prepare for SEV guest management API support
      kvm: svm: Add support for SEV LAUNCH_START command
      kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
      kvm: svm: Add support for SEV LAUNCH_FINISH command
      kvm: svm: Add support for SEV GUEST_STATUS command
      kvm: svm: Add support for SEV DEBUG_DECRYPT command
      kvm: svm: Add support for SEV DEBUG_ENCRYPT command
      kvm: svm: Add support for SEV LAUNCH_MEASURE command
      x86: kvm: Pin the guest memory when SEV is active

Tom Lendacky (14):
      x86: Add the Secure Encrypted Virtualization CPU feature
      x86: Secure Encrypted Virtualization (SEV) support
      KVM: SVM: prepare for new bit definition in nested_ctl
      KVM: SVM: Add SEV feature definitions to KVM
      x86: Use encrypted access of BOOT related data with SEV
      x86/pci: Use memremap when walking setup data
      x86/efi: Access EFI data as encrypted when SEV is active
      x86: Use PAGE_KERNEL protection for ioremap of memory page
      x86: Change early_ioremap to early_memremap for BOOT data
      x86: DMA support for SEV memory encryption
      x86: Unroll string I/O when SEV is active
      x86: Add early boot support when running with SEV active
      KVM: SVM: Enable SEV by setting the SEV_ENABLE CPU feature
      kvm: svm: Use the hardware provided GPA instead of page walk



 arch/x86/boot/compressed/Makefile      |    2 
 arch/x86/boot/compressed/head_64.S     |   16 
 arch/x86/boot/compressed/mem_encrypt.S |   75 ++
 arch/x86/include/asm/cpufeatures.h     |    1 
 arch/x86/include/asm/io.h              |   26 +
 arch/x86/include/asm/kvm_emulate.h     |    1 
 arch/x86/include/asm/kvm_host.h        |   19 +
 arch/x86/include/asm/mem_encrypt.h     |   29 +
 arch/x86/include/asm/msr-index.h       |    2 
 arch/x86/include/asm/svm.h             |    3 
 arch/x86/include/uapi/asm/hyperv.h     |    4 
 arch/x86/include/uapi/asm/kvm_para.h   |    4 
 arch/x86/kernel/acpi/boot.c            |    4 
 arch/x86/kernel/cpu/amd.c              |   22 +
 arch/x86/kernel/cpu/scattered.c        |    1 
 arch/x86/kernel/kvm.c                  |   43 +
 arch/x86/kernel/kvmclock.c             |   65 ++
 arch/x86/kernel/mem_encrypt_init.c     |   24 +
 arch/x86/kernel/mpparse.c              |   10 
 arch/x86/kvm/cpuid.c                   |    4 
 arch/x86/kvm/emulate.c                 |   20 -
 arch/x86/kvm/svm.c                     | 1051 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                     |   60 ++
 arch/x86/mm/ioremap.c                  |   44 +
 arch/x86/mm/mem_encrypt.c              |  143 ++++
 arch/x86/mm/pageattr.c                 |   51 +-
 arch/x86/pci/common.c                  |    4 
 arch/x86/platform/efi/efi_64.c         |   15 
 drivers/crypto/Kconfig                 |   10 
 drivers/crypto/ccp/Kconfig             |   55 +-
 drivers/crypto/ccp/Makefile            |   10 
 drivers/crypto/ccp/ccp-dev-v3.c        |   86 +--
 drivers/crypto/ccp/ccp-dev-v5.c        |   73 +-
 drivers/crypto/ccp/ccp-dev.c           |  137 ++--
 drivers/crypto/ccp/ccp-dev.h           |   35 -
 drivers/crypto/ccp/psp-dev.c           |  211 ++++++
 drivers/crypto/ccp/psp-dev.h           |  102 +++
 drivers/crypto/ccp/sev-dev.c           |  348 +++++++++++
 drivers/crypto/ccp/sev-dev.h           |   67 ++
 drivers/crypto/ccp/sev-ops.c           |  324 ++++++++++
 drivers/crypto/ccp/sp-dev.c            |  324 ++++++++++
 drivers/crypto/ccp/sp-dev.h            |  172 +++++
 drivers/crypto/ccp/sp-pci.c            |  328 ++++++++++
 drivers/crypto/ccp/sp-platform.c       |  268 ++++++++
 drivers/sfi/sfi_core.c                 |    6 
 include/asm-generic/vmlinux.lds.h      |    3 
 include/linux/ccp.h                    |    3 
 include/linux/mem_encrypt.h            |    6 
 include/linux/mm.h                     |    1 
 include/linux/percpu-defs.h            |    9 
 include/linux/psp-sev.h                |  672 ++++++++++++++++++++
 include/uapi/linux/Kbuild              |    1 
 include/uapi/linux/kvm.h               |  100 +++
 include/uapi/linux/psp-sev.h           |  123 ++++
 kernel/resource.c                      |   40 +
 55 files changed, 4991 insertions(+), 266 deletions(-)
 create mode 100644 arch/x86/boot/compressed/mem_encrypt.S
 create mode 100644 drivers/crypto/ccp/psp-dev.c
 create mode 100644 drivers/crypto/ccp/psp-dev.h
 create mode 100644 drivers/crypto/ccp/sev-dev.c
 create mode 100644 drivers/crypto/ccp/sev-dev.h
 create mode 100644 drivers/crypto/ccp/sev-ops.c
 create mode 100644 drivers/crypto/ccp/sp-dev.c
 create mode 100644 drivers/crypto/ccp/sp-dev.h
 create mode 100644 drivers/crypto/ccp/sp-pci.c
 create mode 100644 drivers/crypto/ccp/sp-platform.c
 create mode 100644 include/linux/psp-sev.h
 create mode 100644 include/uapi/linux/psp-sev.h


--
Brijesh Singh

* [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
From: Brijesh Singh @ 2017-03-02 15:12 UTC
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Update the CPU features to include identifying and reporting on the
Secure Encrypted Virtualization (SEV) feature.  SEV is identified by
CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
as available if reported by CPUID and enabled by BIOS.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/include/asm/msr-index.h   |    2 ++
 arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
 arch/x86/kernel/cpu/scattered.c    |    1 +
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b1a4468..9907579 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -188,6 +188,7 @@
  */
 
 #define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
+#define X86_FEATURE_SEV		( 7*32+ 1) /* AMD Secure Encrypted Virtualization */
 #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
 #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e2d0503..e8b3b28 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -361,6 +361,8 @@
 #define MSR_K7_PERFCTR3			0xc0010007
 #define MSR_K7_CLK_CTL			0xc001001b
 #define MSR_K7_HWCR			0xc0010015
+#define MSR_K7_HWCR_SMMLOCK_BIT		0
+#define MSR_K7_HWCR_SMMLOCK		BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
 #define MSR_K7_FID_VID_CTL		0xc0010041
 #define MSR_K7_FID_VID_STATUS		0xc0010042
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 6bddda3..675958e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -617,10 +617,13 @@ static void early_init_amd(struct cpuinfo_x86 *c)
 		set_cpu_bug(c, X86_BUG_AMD_E400);
 
 	/*
-	 * BIOS support is required for SME. If BIOS has enabld SME then
-	 * adjust x86_phys_bits by the SME physical address space reduction
-	 * value. If BIOS has not enabled SME then don't advertise the
-	 * feature (set in scattered.c).
+	 * BIOS support is required for SME and SEV.
+	 *   For SME: If BIOS has enabled SME then adjust x86_phys_bits by
+	 *	      the SME physical address space reduction value.
+	 *	      If BIOS has not enabled SME then don't advertise the
+	 *	      SME feature (set in scattered.c).
+	 *   For SEV: If BIOS has not enabled SEV then don't advertise the
+	 *            SEV feature (set in scattered.c).
 	 */
 	if (c->extended_cpuid_level >= 0x8000001f) {
 		if (cpu_has(c, X86_FEATURE_SME)) {
@@ -637,6 +640,17 @@ static void early_init_amd(struct cpuinfo_x86 *c)
 				clear_cpu_cap(c, X86_FEATURE_SME);
 			}
 		}
+
+		if (cpu_has(c, X86_FEATURE_SEV)) {
+			u64 syscfg, hwcr;
+
+			/* Check if SEV is enabled */
+			rdmsrl(MSR_K8_SYSCFG, syscfg);
+			rdmsrl(MSR_K7_HWCR, hwcr);
+			if (!(syscfg & MSR_K8_SYSCFG_MEM_ENCRYPT) ||
+			    !(hwcr & MSR_K7_HWCR_SMMLOCK))
+				clear_cpu_cap(c, X86_FEATURE_SEV);
+		}
 	}
 }
 
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index cabda87..c3f58d9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
 	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
+	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
 	{ 0, 0, 0, 0, 0 }
 };
 

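For reference, the logic this patch adds to early_init_amd() can be exercised
from userspace with the stand-alone sketch below. It is not code from the
series; it assumes root privileges and that the msr driver is loaded, and the
MSR addresses and bit positions follow the definitions in the diff above.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <cpuid.h>

#define MSR_K8_SYSCFG	0xc0010010
#define MSR_K7_HWCR	0xc0010015

/* Read an MSR on CPU 0 through the msr driver. */
static int rdmsr_cpu0(uint32_t reg, uint64_t *val)
{
	int fd = open("/dev/cpu/0/msr", O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = pread(fd, val, sizeof(*val), reg);
	close(fd);
	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}

int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	uint64_t syscfg, hwcr;

	if (__get_cpuid_max(0x80000000, NULL) < 0x8000001f) {
		puts("CPUID leaf 0x8000001f not present");
		return 1;
	}

	/* CPUID 0x8000001f: EAX bit 0 = SME, EAX bit 1 = SEV. */
	__cpuid(0x8000001f, eax, ebx, ecx, edx);
	if (!(eax & (1u << 1))) {
		puts("SEV not reported by CPUID");
		return 1;
	}

	/* BIOS must set SYSCFG[23] (MemEncryptionModEn) and HWCR[0] (SMMLock). */
	if (rdmsr_cpu0(MSR_K8_SYSCFG, &syscfg) || rdmsr_cpu0(MSR_K7_HWCR, &hwcr)) {
		puts("cannot read MSRs (need root and the msr module)");
		return 1;
	}

	puts((syscfg & (1ULL << 23)) && (hwcr & 1ULL) ?
	     "SEV reported by CPUID and enabled by BIOS" :
	     "SEV reported by CPUID but not enabled by BIOS");
	return 0;
}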
* [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:12   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:12 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Update the CPU features to include identifying and reporting on the
Secure Encrypted Virtualization (SEV) feature.  SME is identified by
CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
as available if reported by CPUID and enabled by BIOS.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/include/asm/msr-index.h   |    2 ++
 arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
 arch/x86/kernel/cpu/scattered.c    |    1 +
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b1a4468..9907579 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -188,6 +188,7 @@
  */
 
 #define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
+#define X86_FEATURE_SEV		( 7*32+ 1) /* AMD Secure Encrypted Virtualization */
 #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
 #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index e2d0503..e8b3b28 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -361,6 +361,8 @@
 #define MSR_K7_PERFCTR3			0xc0010007
 #define MSR_K7_CLK_CTL			0xc001001b
 #define MSR_K7_HWCR			0xc0010015
+#define MSR_K7_HWCR_SMMLOCK_BIT		0
+#define MSR_K7_HWCR_SMMLOCK		BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
 #define MSR_K7_FID_VID_CTL		0xc0010041
 #define MSR_K7_FID_VID_STATUS		0xc0010042
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 6bddda3..675958e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -617,10 +617,13 @@ static void early_init_amd(struct cpuinfo_x86 *c)
 		set_cpu_bug(c, X86_BUG_AMD_E400);
 
 	/*
-	 * BIOS support is required for SME. If BIOS has enabld SME then
-	 * adjust x86_phys_bits by the SME physical address space reduction
-	 * value. If BIOS has not enabled SME then don't advertise the
-	 * feature (set in scattered.c).
+	 * BIOS support is required for SME and SEV.
+	 *   For SME: If BIOS has enabled SME then adjust x86_phys_bits by
+	 *	      the SME physical address space reduction value.
+	 *	      If BIOS has not enabled SME then don't advertise the
+	 *	      SME feature (set in scattered.c).
+	 *   For SEV: If BIOS has not enabled SEV then don't advertise the
+	 *            SEV feature (set in scattered.c).
 	 */
 	if (c->extended_cpuid_level >= 0x8000001f) {
 		if (cpu_has(c, X86_FEATURE_SME)) {
@@ -637,6 +640,17 @@ static void early_init_amd(struct cpuinfo_x86 *c)
 				clear_cpu_cap(c, X86_FEATURE_SME);
 			}
 		}
+
+		if (cpu_has(c, X86_FEATURE_SEV)) {
+			u64 syscfg, hwcr;
+
+			/* Check if SEV is enabled */
+			rdmsrl(MSR_K8_SYSCFG, syscfg);
+			rdmsrl(MSR_K7_HWCR, hwcr);
+			if (!(syscfg & MSR_K8_SYSCFG_MEM_ENCRYPT) ||
+			    !(hwcr & MSR_K7_HWCR_SMMLOCK))
+				clear_cpu_cap(c, X86_FEATURE_SEV);
+		}
 	}
 }
 
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index cabda87..c3f58d9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
 	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
+	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
 	{ 0, 0, 0, 0, 0 }
 };
 

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 02/32] x86: Secure Encrypted Virtualization (SEV) support
@ 2017-03-02 15:12 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:12 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Provide support for Secure Encrypted Virtualization (SEV). This initial
support defines a flag that is used by the kernel to determine if it is
running with SEV active.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/mem_encrypt.h |   14 +++++++++++++-
 arch/x86/mm/mem_encrypt.c          |    3 +++
 include/linux/mem_encrypt.h        |    6 ++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 1fd5426..9799835 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -20,10 +20,16 @@
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 
 extern unsigned long sme_me_mask;
+extern unsigned int sev_enabled;
 
 static inline bool sme_active(void)
 {
-	return (sme_me_mask) ? true : false;
+	return (sme_me_mask && !sev_enabled) ? true : false;
+}
+
+static inline bool sev_active(void)
+{
+	return (sme_me_mask && sev_enabled) ? true : false;
 }
 
 static inline u64 sme_dma_mask(void)
@@ -53,6 +59,7 @@ void swiotlb_set_mem_attributes(void *vaddr, unsigned long size);
 
 #ifndef sme_me_mask
 #define sme_me_mask	0UL
+#define sev_enabled	0
 
 static inline bool sme_active(void)
 {
@@ -64,6 +71,11 @@ static inline u64 sme_dma_mask(void)
 	return 0ULL;
 }
 
+static inline bool sev_active(void)
+{
+	return false;
+}
+
 static inline int set_memory_encrypted(unsigned long vaddr, int numpages)
 {
 	return 0;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index c5062e1..090419b 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -34,6 +34,9 @@ void __init __early_pgtable_flush(void);
 unsigned long sme_me_mask __section(.data) = 0;
 EXPORT_SYMBOL_GPL(sme_me_mask);
 
+unsigned int sev_enabled __section(.data) = 0;
+EXPORT_SYMBOL_GPL(sev_enabled);
+
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
 
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 913cf80..4b47c73 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -23,6 +23,7 @@
 
 #ifndef sme_me_mask
 #define sme_me_mask	0UL
+#define sev_enabled	0
 
 static inline bool sme_active(void)
 {
@@ -34,6 +35,11 @@ static inline u64 sme_dma_mask(void)
 	return 0ULL;
 }
 
+static inline bool sev_active(void)
+{
+	return false;
+}
+
 static inline int set_memory_encrypted(unsigned long vaddr, int numpages)
 {
 	return 0;

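The intended three-state logic is easier to see as a standalone model (a
simplified sketch of the accessors added above, not kernel code): a
single mask plus a single flag yields mutually exclusive sme_active()
and sev_active() states.

#include <stdbool.h>
#include <stdio.h>

static unsigned long sme_me_mask;	/* non-zero: memory encryption in use */
static unsigned int sev_enabled;	/* non-zero: running as an SEV guest */

static bool sme_active(void)
{
	return sme_me_mask && !sev_enabled;
}

static bool sev_active(void)
{
	return sme_me_mask && sev_enabled;
}

int main(void)
{
	sme_me_mask = 1UL << 47;	/* hypothetical C-bit position */
	sev_enabled = 1;

	/* prints "sme_active=0 sev_active=1" */
	printf("sme_active=%d sev_active=%d\n", sme_active(), sev_active());
	return 0;
}

Callers can thus distinguish host-side SME from guest-side SEV while
sharing the same underlying encryption mask.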
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 03/32] KVM: SVM: prepare for new bit definition in nested_ctl
@ 2017-03-02 15:12 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:12 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Currently the nested_ctl variable in the vmcb_control_area structure is
used to indicate nested paging support. The nested paging support field
is actually defined as bit 0 of the field. In order to support a new
feature flag, the use of nested_ctl for nested paging support must be
converted to operate on a single bit.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/svm.h |    2 ++
 arch/x86/kvm/svm.c         |    7 ++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 14824fc..2aca535 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -136,6 +136,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 #define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL
 #define SVM_VM_CR_SVM_DIS_MASK  0x0010ULL
 
+#define SVM_NESTED_CTL_NP_ENABLE	BIT(0)
+
 struct __attribute__ ((__packed__)) vmcb_seg {
 	u16 selector;
 	u16 attrib;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 08a4d3a..75b0645 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1246,7 +1246,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 
 	if (npt_enabled) {
 		/* Setup VMCB for Nested Paging */
-		control->nested_ctl = 1;
+		control->nested_ctl |= SVM_NESTED_CTL_NP_ENABLE;
 		clr_intercept(svm, INTERCEPT_INVLPG);
 		clr_exception_intercept(svm, PF_VECTOR);
 		clr_cr_intercept(svm, INTERCEPT_CR3_READ);
@@ -2840,7 +2840,8 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
 	if (vmcb->control.asid == 0)
 		return false;
 
-	if (vmcb->control.nested_ctl && !npt_enabled)
+	if ((vmcb->control.nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&
+	    !npt_enabled)
 		return false;
 
 	return true;
@@ -2915,7 +2916,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	else
 		svm->vcpu.arch.hflags &= ~HF_HIF_MASK;
 
-	if (nested_vmcb->control.nested_ctl) {
+	if (nested_vmcb->control.nested_ctl & SVM_NESTED_CTL_NP_ENABLE) {
 		kvm_mmu_unload(&svm->vcpu);
 		svm->nested.nested_cr3 = nested_vmcb->control.nested_cr3;
 		nested_svm_init_mmu_context(&svm->vcpu);

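The motivation is easier to see side by side (a sketch only, borrowing
the SEV bit name that the next patch in this series introduces):

#define SVM_NESTED_CTL_NP_ENABLE	(1ULL << 0)
#define SVM_NESTED_CTL_SEV_ENABLE	(1ULL << 1)	/* added by a later patch */

static void enable_nested_paging(unsigned long long *nested_ctl)
{
	/* OR-ing preserves unrelated bits, e.g. an already-set SEV bit... */
	*nested_ctl |= SVM_NESTED_CTL_NP_ENABLE;

	/* ...whereas the old '*nested_ctl = 1;' would have cleared it. */
}

Correspondingly, tests of the field must mask the specific bit instead
of treating the whole field as a boolean, as the nested_vmcb_checks()
hunk above now does.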
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 04/32] KVM: SVM: Add SEV feature definitions to KVM
@ 2017-03-02 15:12 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:12 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Define a new KVM CPU feature for Secure Encrypted Virtualization (SEV).
The kernel will check for the presence of this feature to determine if
it is running with SEV active.

Define the SEV enable bit for the VMCB control structure. The hypervisor
will use this bit to enable SEV in the guest.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/svm.h           |    1 +
 arch/x86/include/uapi/asm/kvm_para.h |    1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 2aca535..fba2a7b 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -137,6 +137,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 #define SVM_VM_CR_SVM_DIS_MASK  0x0010ULL
 
 #define SVM_NESTED_CTL_NP_ENABLE	BIT(0)
+#define SVM_NESTED_CTL_SEV_ENABLE	BIT(1)
 
 struct __attribute__ ((__packed__)) vmcb_seg {
 	u16 selector;
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 1421a65..bc2802f 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
 #define KVM_FEATURE_STEAL_TIME		5
 #define KVM_FEATURE_PV_EOI		6
 #define KVM_FEATURE_PV_UNHALT		7
+#define KVM_FEATURE_SEV			8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.

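For illustration, a guest-side probe could look like the sketch below
(assumptions: the standard KVM paravirt CPUID convention of a signature
leaf at 0x40000000 and a feature leaf at 0x40000001 reporting flags in
EAX; this code is not part of the series):

#include <cpuid.h>
#include <stdbool.h>
#include <string.h>

#define KVM_CPUID_SIGNATURE	0x40000000
#define KVM_CPUID_FEATURES	0x40000001
#define KVM_FEATURE_SEV		8

static bool kvm_reports_sev(void)
{
	unsigned int eax, ebx, ecx, edx;
	char sig[13];

	/*
	 * Hypervisor leaves sit outside the basic/extended ranges that
	 * __get_cpuid() validates, so use the raw __cpuid() macro.
	 */
	__cpuid(KVM_CPUID_SIGNATURE, eax, ebx, ecx, edx);
	memcpy(sig + 0, &ebx, 4);
	memcpy(sig + 4, &ecx, 4);
	memcpy(sig + 8, &edx, 4);
	sig[12] = '\0';
	if (strcmp(sig, "KVMKVMKVM") != 0)
		return false;	/* not running under KVM */

	__cpuid(KVM_CPUID_FEATURES, eax, ebx, ecx, edx);
	return eax & (1u << KVM_FEATURE_SEV);
}

Whether the bit is actually exposed is a hypervisor policy decision made
when the guest's CPUID is constructed.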
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 05/32] x86: Use encrypted access of BOOT related data with SEV
  2017-03-02 15:12 ` Brijesh Singh
                   ` (11 preceding siblings ...)
  (?)
@ 2017-03-02 15:12 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:12 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

From: Tom Lendacky <thomas.lendacky@amd.com>

When Secure Encrypted Virtualization (SEV) is active, BOOT data (such as
EFI related data, setup data) is encrypted and needs to be accessed as
such when mapped. Update the architecture override in early_memremap to
keep the encryption attribute when mapping this data.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ioremap.c |   36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c6cb921..c400ab5 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -462,12 +462,31 @@ static bool memremap_is_setup_data(resource_size_t phys_addr,
 }
 
 /*
- * This function determines if an address should be mapped encrypted.
- * Boot setup data, EFI data and E820 areas are checked in making this
- * determination.
+ * This function determines if an address should be mapped encrypted when
+ * SEV is active.  E820 areas are checked in making this determination.
  */
-static bool memremap_should_map_encrypted(resource_size_t phys_addr,
-					  unsigned long size)
+static bool memremap_sev_should_map_encrypted(resource_size_t phys_addr,
+					      unsigned long size)
+{
+	/* Check if the address is in persistent memory */
+	switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
+	case E820_TYPE_PMEM:
+	case E820_TYPE_PRAM:
+		return false;
+	default:
+		break;
+	}
+
+	return true;
+}
+
+/*
+ * This function determines if an address should be mapped encrypted when
+ * SME is active.  Boot setup data, EFI data and E820 areas are checked in
+ * making this determination.
+ */
+static bool memremap_sme_should_map_encrypted(resource_size_t phys_addr,
+					      unsigned long size)
 {
 	/*
 	 * SME is not active, return true:
@@ -508,6 +527,13 @@ static bool memremap_should_map_encrypted(resource_size_t phys_addr,
 	return true;
 }
 
+static bool memremap_should_map_encrypted(resource_size_t phys_addr,
+					  unsigned long size)
+{
+	return sev_active() ? memremap_sev_should_map_encrypted(phys_addr, size)
+			    : memremap_sme_should_map_encrypted(phys_addr, size);
+}
+
 /*
  * Architecture function to determine if RAM remap is allowed.
  */

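For context, the policy above is consumed when early_memremap() builds
the final page protections; a sketch of that plumbing (hedged: assumes
the early_memremap() override hook from the SME series and the
pgprot_encrypted()/pgprot_decrypted() helpers):

	/* Sketch: translate the policy into mapping attributes. */
	pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
						     unsigned long size,
						     pgprot_t prot)
	{
		if (memremap_should_map_encrypted(phys_addr, size))
			prot = pgprot_encrypted(prot);	/* set the C-bit */
		else
			prot = pgprot_decrypted(prot);	/* clear the C-bit */
		return prot;
	}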
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
  2017-03-02 15:12 ` Brijesh Singh
                   ` (12 preceding siblings ...)
@ 2017-03-02 15:13 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Using ioremap() forces the setup data to be mapped decrypted even
though the setup data is encrypted.  Switch to memremap(), which
applies the proper mapping attributes.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/pci/common.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index a4fdfa7..0b06670 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
 
 	pa_data = boot_params.hdr.setup_data;
 	while (pa_data) {
-		data = ioremap(pa_data, sizeof(*rom));
+		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
 		if (!data)
 			return -ENOMEM;
 
@@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
 			}
 		}
 		pa_data = data->next;
-		iounmap(data);
+		memunmap(data);
 	}
 	set_dma_domain_ops(dev);
 	set_dev_domain_options(dev);

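The underlying rule of thumb, sketched below (illustrative only;
bar_phys, bar_len and pa_data are placeholders): ioremap() is meant for
device MMIO and always produces a decrypted mapping under SME/SEV,
while memremap(..., MEMREMAP_WB) is meant for RAM-backed data and picks
up the correct encryption attribute.

	void __iomem *regs;
	struct setup_data *data;

	regs = ioremap(bar_phys, bar_len);	/* device MMIO: decrypted */
	data = memremap(pa_data, sizeof(*data), MEMREMAP_WB);
						/* RAM data: proper C-bit */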
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 07/32] x86/efi: Access EFI data as encrypted when SEV is active
  2017-03-02 15:12 ` Brijesh Singh
                   ` (15 preceding siblings ...)
@ 2017-03-02 15:13 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

EFI data is encrypted when the kernel runs under SEV.  Update the
page table references to ensure the EFI memory areas are accessed
encrypted.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/platform/efi/efi_64.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 2d8674d..9a76ed8 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -45,6 +45,7 @@
 #include <asm/realmode.h>
 #include <asm/time.h>
 #include <asm/pgalloc.h>
+#include <asm/mem_encrypt.h>
 
 /*
  * We allocate runtime services regions bottom-up, starting from -4G, i.e.
@@ -286,7 +287,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	 * as trim_bios_range() will reserve the first page and isolate it away
 	 * from memory allocators anyway.
 	 */
-	if (kernel_map_pages_in_pgd(pgd, 0x0, 0x0, 1, _PAGE_RW)) {
+	pf = _PAGE_RW;
+	if (sev_active())
+		pf |= _PAGE_ENC;
+	if (kernel_map_pages_in_pgd(pgd, 0x0, 0x0, 1, pf)) {
 		pr_err("Failed to create 1:1 mapping for the first page!\n");
 		return 1;
 	}
@@ -329,6 +333,9 @@ static void __init __map_region(efi_memory_desc_t *md, u64 va)
 	if (!(md->attribute & EFI_MEMORY_WB))
 		flags |= _PAGE_PCD;
 
+	if (sev_active())
+		flags |= _PAGE_ENC;
+
 	pfn = md->phys_addr >> PAGE_SHIFT;
 	if (kernel_map_pages_in_pgd(pgd, pfn, va, md->num_pages, flags))
 		pr_warn("Error mapping PA 0x%llx -> VA 0x%llx!\n",
@@ -455,6 +462,9 @@ static int __init efi_update_mem_attr(struct mm_struct *mm, efi_memory_desc_t *m
 	if (!(md->attribute & EFI_MEMORY_RO))
 		pf |= _PAGE_RW;
 
+	if (sev_active())
+		pf |= _PAGE_ENC;
+
 	return efi_update_mappings(md, pf);
 }
 
@@ -506,6 +516,9 @@ void __init efi_runtime_update_mappings(void)
 			(md->type != EFI_RUNTIME_SERVICES_CODE))
 			pf |= _PAGE_RW;
 
+		if (sev_active())
+			pf |= _PAGE_ENC;
+
 		efi_update_mappings(md, pf);
 	}
 }

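Since the same two-line pattern now appears at four call sites, it
could be folded into a small helper; a refactor sketch (hypothetical,
not part of the patch):

	/* Sketch: add the SEV encryption attribute to EFI page flags. */
	static unsigned long efi_sev_pgflags(unsigned long pf)
	{
		if (sev_active())
			pf |= _PAGE_ENC;
		return pf;
	}

Each call site would then reduce to pf = efi_sev_pgflags(pf).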
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
  2017-03-02 15:12 ` Brijesh Singh
                   ` (16 preceding siblings ...)
@ 2017-03-02 15:13 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This ensures that memory mappings of, e.g., ACPI tables receive the
proper mapping attributes.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ioremap.c |    8 ++++++++
 include/linux/mm.h    |    1 +
 kernel/resource.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		pcm = new_pcm;
 	}
 
+	/*
+	 * If the page being mapped is in memory and SEV is active then
+	 * make sure the memory encryption attribute is enabled in the
+	 * resulting mapping.
+	 */
 	prot = PAGE_KERNEL_IO;
+	if (sev_active() && page_is_mem(pfn))
+		prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
+
 	switch (pcm) {
 	case _PAGE_CACHE_MODE_UC:
 	default:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b84615b..825df27 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -445,6 +445,7 @@ static inline int get_page_unless_zero(struct page *page)
 }
 
 extern int page_is_ram(unsigned long pfn);
+extern int page_is_mem(unsigned long pfn);
 
 enum {
 	REGION_INTERSECTS,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);
 
+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct resource res;
+	unsigned long pfn, end_pfn;
+	u64 orig_end;
+	int ret = -1;
+
+	res.start = (u64) start_pfn << PAGE_SHIFT;
+	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	orig_end = res.end;
+	while ((res.start < res.end) &&
+		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		if (end_pfn > pfn)
+			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+		if (ret)
+			break;
+		res.start = res.end + 1;
+		res.end = orig_end;
+	}
+	return ret;
+}
+
+/*
+ * This generic page_is_mem() returns true if the specified address is
+ * registered as memory in the iomem_resource list.
+ */
+int __weak page_is_mem(unsigned long pfn)
+{
+	return walk_mem_range(pfn, 1) == 1;
+}
+EXPORT_SYMBOL_GPL(page_is_mem);
+
 /**
  * region_intersects() - determine intersection of region with known resources
  * @start: region start address

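Reading the code above, walk_mem_range() folds three outcomes into one
return value; a summary of the implied contract (our reading of the
code, not documented by the patch):

	/*
	 * walk_mem_range() return values:
	 *  -1: no busy IORESOURCE_MEM resource covers the range
	 *   0: covered, but described as IORES_DESC_NONE (plain RAM)
	 *   1: covered with a specific descriptor, e.g.
	 *      IORES_DESC_ACPI_TABLES
	 *
	 * page_is_mem() reports true only for the last case, so an
	 * ioremap() of an ACPI table gains _PAGE_ENC while real MMIO
	 * keeps a decrypted mapping.
	 */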
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:13   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ioremap.c |    8 ++++++++
 include/linux/mm.h    |    1 +
 kernel/resource.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		pcm = new_pcm;
 	}
 
+	/*
+	 * If the page being mapped is in memory and SEV is active then
+	 * make sure the memory encryption attribute is enabled in the
+	 * resulting mapping.
+	 */
 	prot = PAGE_KERNEL_IO;
+	if (sev_active() && page_is_mem(pfn))
+		prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
+
 	switch (pcm) {
 	case _PAGE_CACHE_MODE_UC:
 	default:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b84615b..825df27 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -445,6 +445,7 @@ static inline int get_page_unless_zero(struct page *page)
 }
 
 extern int page_is_ram(unsigned long pfn);
+extern int page_is_mem(unsigned long pfn);
 
 enum {
 	REGION_INTERSECTS,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);
 
+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct resource res;
+	unsigned long pfn, end_pfn;
+	u64 orig_end;
+	int ret = -1;
+
+	res.start = (u64) start_pfn << PAGE_SHIFT;
+	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	orig_end = res.end;
+	while ((res.start < res.end) &&
+		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		if (end_pfn > pfn)
+			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+		if (ret)
+			break;
+		res.start = res.end + 1;
+		res.end = orig_end;
+	}
+	return ret;
+}
+
+/*
+ * This generic page_is_mem() returns true if specified address is
+ * registered as memory in iomem_resource list.
+ */
+int __weak page_is_mem(unsigned long pfn)
+{
+	return walk_mem_range(pfn, 1) == 1;
+}
+EXPORT_SYMBOL_GPL(page_is_mem);
+
 /**
  * region_intersects() - determine intersection of region with known resources
  * @start: region start address

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
@ 2017-03-02 15:13   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

From: Tom Lendacky <thomas.lendacky@amd.com>

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ioremap.c |    8 ++++++++
 include/linux/mm.h    |    1 +
 kernel/resource.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		pcm = new_pcm;
 	}
 
+	/*
+	 * If the page being mapped is in memory and SEV is active then
+	 * make sure the memory encryption attribute is enabled in the
+	 * resulting mapping.
+	 */
 	prot = PAGE_KERNEL_IO;
+	if (sev_active() && page_is_mem(pfn))
+		prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
+
 	switch (pcm) {
 	case _PAGE_CACHE_MODE_UC:
 	default:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b84615b..825df27 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -445,6 +445,7 @@ static inline int get_page_unless_zero(struct page *page)
 }
 
 extern int page_is_ram(unsigned long pfn);
+extern int page_is_mem(unsigned long pfn);
 
 enum {
 	REGION_INTERSECTS,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);
 
+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct resource res;
+	unsigned long pfn, end_pfn;
+	u64 orig_end;
+	int ret = -1;
+
+	res.start = (u64) start_pfn << PAGE_SHIFT;
+	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	orig_end = res.end;
+	while ((res.start < res.end) &&
+		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		if (end_pfn > pfn)
+			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+		if (ret)
+			break;
+		res.start = res.end + 1;
+		res.end = orig_end;
+	}
+	return ret;
+}
+
+/*
+ * This generic page_is_mem() returns true if specified address is
+ * registered as memory in iomem_resource list.
+ */
+int __weak page_is_mem(unsigned long pfn)
+{
+	return walk_mem_range(pfn, 1) == 1;
+}
+EXPORT_SYMBOL_GPL(page_is_mem);
+
 /**
  * region_intersects() - determine intersection of region with known resources
  * @start: region start address

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
@ 2017-03-02 15:13   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

From: Tom Lendacky <thomas.lendacky@amd.com>

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ioremap.c |    8 ++++++++
 include/linux/mm.h    |    1 +
 kernel/resource.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		pcm = new_pcm;
 	}
 
+	/*
+	 * If the page being mapped is in memory and SEV is active then
+	 * make sure the memory encryption attribute is enabled in the
+	 * resulting mapping.
+	 */
 	prot = PAGE_KERNEL_IO;
+	if (sev_active() && page_is_mem(pfn))
+		prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
+
 	switch (pcm) {
 	case _PAGE_CACHE_MODE_UC:
 	default:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b84615b..825df27 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -445,6 +445,7 @@ static inline int get_page_unless_zero(struct page *page)
 }
 
 extern int page_is_ram(unsigned long pfn);
+extern int page_is_mem(unsigned long pfn);
 
 enum {
 	REGION_INTERSECTS,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);
 
+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct resource res;
+	unsigned long pfn, end_pfn;
+	u64 orig_end;
+	int ret = -1;
+
+	res.start = (u64) start_pfn << PAGE_SHIFT;
+	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	orig_end = res.end;
+	while ((res.start < res.end) &&
+		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		if (end_pfn > pfn)
+			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+		if (ret)
+			break;
+		res.start = res.end + 1;
+		res.end = orig_end;
+	}
+	return ret;
+}
+
+/*
+ * This generic page_is_mem() returns true if specified address is
+ * registered as memory in iomem_resource list.
+ */
+int __weak page_is_mem(unsigned long pfn)
+{
+	return walk_mem_range(pfn, 1) == 1;
+}
+EXPORT_SYMBOL_GPL(page_is_mem);
+
 /**
  * region_intersects() - determine intersection of region with known resources
  * @start: region start address

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
@ 2017-03-02 15:13   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will insure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/ioremap.c |    8 ++++++++
 include/linux/mm.h    |    1 +
 kernel/resource.c     |   40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		pcm = new_pcm;
 	}
 
+	/*
+	 * If the page being mapped is in memory and SEV is active then
+	 * make sure the memory encryption attribute is enabled in the
+	 * resulting mapping.
+	 */
 	prot = PAGE_KERNEL_IO;
+	if (sev_active() && page_is_mem(pfn))
+		prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
+
 	switch (pcm) {
 	case _PAGE_CACHE_MODE_UC:
 	default:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b84615b..825df27 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -445,6 +445,7 @@ static inline int get_page_unless_zero(struct page *page)
 }
 
 extern int page_is_ram(unsigned long pfn);
+extern int page_is_mem(unsigned long pfn);
 
 enum {
 	REGION_INTERSECTS,
diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);
 
+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOUCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct resource res;
+	unsigned long pfn, end_pfn;
+	u64 orig_end;
+	int ret = -1;
+
+	res.start = (u64) start_pfn << PAGE_SHIFT;
+	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+	orig_end = res.end;
+	while ((res.start < res.end) &&
+		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		if (end_pfn > pfn)
+			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+		if (ret)
+			break;
+		res.start = res.end + 1;
+		res.end = orig_end;
+	}
+	return ret;
+}
+
+/*
+ * This generic page_is_mem() returns true if the specified address is
+ * registered as memory in the iomem_resource list.
+ */
+int __weak page_is_mem(unsigned long pfn)
+{
+	return walk_mem_range(pfn, 1) == 1;
+}
+EXPORT_SYMBOL_GPL(page_is_mem);
+
 /**
  * region_intersects() - determine intersection of region with known resources
  * @start: region start address
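
For illustration, here is a minimal sketch (hypothetical helper, not part
of the patch) of how the pieces above compose: the base protection stays
PAGE_KERNEL_IO, and the encryption attribute is added only when SEV is
active and page_is_mem() reports that the target pfn is real memory.

/* Illustrative only: mirrors the __ioremap_caller() hunk above */
static pgprot_t example_ioremap_prot(unsigned long pfn)
{
	pgprot_t prot = PAGE_KERNEL_IO;

	/* Encrypt the mapping only for memory pages, never for MMIO */
	if (sev_active() && page_is_mem(pfn))
		prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);

	return prot;
}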

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 09/32] x86: Change early_ioremap to early_memremap for BOOT data
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:13   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:13 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

In order to map BOOT data with the proper encryption bit, the
early_ioremap() function calls are changed to early_memremap() calls.
This allows the data to be accessed properly under both SME and SEV.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kernel/acpi/boot.c |    4 ++--
 arch/x86/kernel/mpparse.c   |   10 +++++-----
 drivers/sfi/sfi_core.c      |    6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 35174c6..468c25a 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -124,7 +124,7 @@ char *__init __acpi_map_table(unsigned long phys, unsigned long size)
 	if (!phys || !size)
 		return NULL;
 
-	return early_ioremap(phys, size);
+	return early_memremap(phys, size);
 }
 
 void __init __acpi_unmap_table(char *map, unsigned long size)
@@ -132,7 +132,7 @@ void __init __acpi_unmap_table(char *map, unsigned long size)
 	if (!map || !size)
 		return;
 
-	early_iounmap(map, size);
+	early_memunmap(map, size);
 }
 
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index 0d904d7..fd37f39 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -436,9 +436,9 @@ static unsigned long __init get_mpc_size(unsigned long physptr)
 	struct mpc_table *mpc;
 	unsigned long size;
 
-	mpc = early_ioremap(physptr, PAGE_SIZE);
+	mpc = early_memremap(physptr, PAGE_SIZE);
 	size = mpc->length;
-	early_iounmap(mpc, PAGE_SIZE);
+	early_memunmap(mpc, PAGE_SIZE);
 	apic_printk(APIC_VERBOSE, "  mpc: %lx-%lx\n", physptr, physptr + size);
 
 	return size;
@@ -450,7 +450,7 @@ static int __init check_physptr(struct mpf_intel *mpf, unsigned int early)
 	unsigned long size;
 
 	size = get_mpc_size(mpf->physptr);
-	mpc = early_ioremap(mpf->physptr, size);
+	mpc = early_memremap(mpf->physptr, size);
 	/*
 	 * Read the physical hardware table.  Anything here will
 	 * override the defaults.
@@ -461,10 +461,10 @@ static int __init check_physptr(struct mpf_intel *mpf, unsigned int early)
 #endif
 		pr_err("BIOS bug, MP table errors detected!...\n");
 		pr_cont("... disabling SMP support. (tell your hw vendor)\n");
-		early_iounmap(mpc, size);
+		early_memunmap(mpc, size);
 		return -1;
 	}
-	early_iounmap(mpc, size);
+	early_memunmap(mpc, size);
 
 	if (early)
 		return -1;
diff --git a/drivers/sfi/sfi_core.c b/drivers/sfi/sfi_core.c
index 296db7a..d00ae3f 100644
--- a/drivers/sfi/sfi_core.c
+++ b/drivers/sfi/sfi_core.c
@@ -92,7 +92,7 @@ static struct sfi_table_simple *syst_va __read_mostly;
 static u32 sfi_use_ioremap __read_mostly;
 
 /*
- * sfi_un/map_memory calls early_ioremap/iounmap which is a __init function
+ * sfi_un/map_memory calls early_memremap/memunmap which is a __init function
  * and introduces section mismatch. So use __ref to make it calm.
  */
 static void __iomem * __ref sfi_map_memory(u64 phys, u32 size)
@@ -103,7 +103,7 @@ static void __iomem * __ref sfi_map_memory(u64 phys, u32 size)
 	if (sfi_use_ioremap)
 		return ioremap_cache(phys, size);
 	else
-		return early_ioremap(phys, size);
+		return early_memremap(phys, size);
 }
 
 static void __ref sfi_unmap_memory(void __iomem *virt, u32 size)
@@ -114,7 +114,7 @@ static void __ref sfi_unmap_memory(void __iomem *virt, u32 size)
 	if (sfi_use_ioremap)
 		iounmap(virt);
 	else
-		early_iounmap(virt, size);
+		early_memunmap(virt, size);
 }
 
 static void sfi_print_table_header(unsigned long long pa,
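
As a usage illustration (hypothetical caller, not part of the patch),
the pattern for accessing BOOT data after this change looks like:

/*
 * early_memremap() maps the range with attributes that match the
 * underlying memory (including the encryption bit under SME/SEV),
 * while early_ioremap() would always map it decrypted.
 */
static void __init example_parse_boot_table(phys_addr_t phys,
					    unsigned long size)
{
	void *virt = early_memremap(phys, size);

	if (!virt)
		return;

	/* ... read the table through 'virt' ... */

	early_memunmap(virt, size);
}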

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 10/32] x86: DMA support for SEV memory encryption
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:14   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:14 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

DMA access to memory mapped as encrypted while SEV is active cannot be
encrypted during a device write or decrypted during a device read, since
the device does not have access to the guest's encryption key. In order
for DMA to work properly when SEV is active, the swiotlb bounce buffers
must be used.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/mem_encrypt.c |   77 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 090419b..7df5f4c 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -197,8 +197,81 @@ void __init sme_early_init(void)
 	/* Update the protection map with memory encryption mask */
 	for (i = 0; i < ARRAY_SIZE(protection_map); i++)
 		protection_map[i] = pgprot_encrypted(protection_map[i]);
+
+	if (sev_active())
+		swiotlb_force = SWIOTLB_FORCE;
+}
+
+static void *sme_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		       gfp_t gfp, unsigned long attrs)
+{
+	unsigned long dma_mask;
+	unsigned int order;
+	struct page *page;
+	void *vaddr = NULL;
+
+	dma_mask = dma_alloc_coherent_mask(dev, gfp);
+	order = get_order(size);
+
+	gfp &= ~__GFP_ZERO;
+
+	page = alloc_pages_node(dev_to_node(dev), gfp, order);
+	if (page) {
+		dma_addr_t addr;
+
+		/*
+		 * Since we will be clearing the encryption bit, check the
+		 * mask with it already cleared.
+		 */
+		addr = phys_to_dma(dev, page_to_phys(page)) & ~sme_me_mask;
+		if ((addr + size) > dma_mask) {
+			__free_pages(page, get_order(size));
+		} else {
+			vaddr = page_address(page);
+			*dma_handle = addr;
+		}
+	}
+
+	if (!vaddr)
+		vaddr = swiotlb_alloc_coherent(dev, size, dma_handle, gfp);
+
+	if (!vaddr)
+		return NULL;
+
+	/* Clear the SME encryption bit for DMA use if not swiotlb area */
+	if (!is_swiotlb_buffer(dma_to_phys(dev, *dma_handle))) {
+		set_memory_decrypted((unsigned long)vaddr, 1 << order);
+		*dma_handle &= ~sme_me_mask;
+	}
+
+	return vaddr;
 }
 
+static void sme_free(struct device *dev, size_t size, void *vaddr,
+		     dma_addr_t dma_handle, unsigned long attrs)
+{
+	/* Set the SME encryption bit for re-use if not swiotlb area */
+	if (!is_swiotlb_buffer(dma_to_phys(dev, dma_handle)))
+		set_memory_encrypted((unsigned long)vaddr,
+				     1 << get_order(size));
+
+	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
+}
+
+static struct dma_map_ops sme_dma_ops = {
+	.alloc                  = sme_alloc,
+	.free                   = sme_free,
+	.map_page               = swiotlb_map_page,
+	.unmap_page             = swiotlb_unmap_page,
+	.map_sg                 = swiotlb_map_sg_attrs,
+	.unmap_sg               = swiotlb_unmap_sg_attrs,
+	.sync_single_for_cpu    = swiotlb_sync_single_for_cpu,
+	.sync_single_for_device = swiotlb_sync_single_for_device,
+	.sync_sg_for_cpu        = swiotlb_sync_sg_for_cpu,
+	.sync_sg_for_device     = swiotlb_sync_sg_for_device,
+	.mapping_error          = swiotlb_dma_mapping_error,
+};
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void)
 {
@@ -208,6 +281,10 @@ void __init mem_encrypt_init(void)
 	/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
 	swiotlb_update_mem_attributes();
 
+	/* Use SEV DMA operations if SEV is active */
+	if (sev_active())
+		dma_ops = &sme_dma_ops;
+
 	pr_info("AMD Secure Memory Encryption (SME) active\n");
 }
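
For context, a sketch of what this means on the driver side
(hypothetical driver code, not part of the patch): the coherent DMA API
is unchanged, and with sme_dma_ops installed the allocation below is
serviced by sme_alloc().

static void *example_alloc_ring(struct device *dev, dma_addr_t *dma)
{
	/*
	 * With SEV active this returns memory whose mapping has been
	 * made decrypted (C-bit clear), so the device can access it.
	 */
	return dma_alloc_coherent(dev, PAGE_SIZE, dma, GFP_KERNEL);
}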
 

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 11/32] x86: Unroll string I/O when SEV is active
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:14   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:14 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

Secure Encrypted Virtualization (SEV) does not support string I/O, so
unroll the string I/O operation into a loop operating on one element at
a time.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/io.h |   26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 833f7cc..b596114 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -327,14 +327,32 @@ static inline unsigned type in##bwl##_p(int port)			\
 									\
 static inline void outs##bwl(int port, const void *addr, unsigned long count) \
 {									\
-	asm volatile("rep; outs" #bwl					\
-		     : "+S"(addr), "+c"(count) : "d"(port));		\
+	if (sev_active()) {						\
+		unsigned type *value = (unsigned type *)addr;		\
+		while (count) {						\
+			out##bwl(*value, port);				\
+			value++;					\
+			count--;					\
+		}							\
+	} else {							\
+		asm volatile("rep; outs" #bwl				\
+			     : "+S"(addr), "+c"(count) : "d"(port));	\
+	}								\
 }									\
 									\
 static inline void ins##bwl(int port, void *addr, unsigned long count)	\
 {									\
-	asm volatile("rep; ins" #bwl					\
-		     : "+D"(addr), "+c"(count) : "d"(port));		\
+	if (sev_active()) {						\
+		unsigned type *value = (unsigned type *)addr;		\
+		while (count) {						\
+			*value = in##bwl(port);				\
+			value++;					\
+			count--;					\
+		}							\
+	} else {							\
+		asm volatile("rep; ins" #bwl				\
+			     : "+D"(addr), "+c"(count) : "d"(port));	\
+	}								\
 }
 
 BUILDIO(b, b, char)
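
To make the macro change concrete, here is roughly what the outs side
of BUILDIO(b, b, char) expands to after this patch (expanded by hand,
for illustration only):

static inline void outsb(int port, const void *addr, unsigned long count)
{
	if (sev_active()) {
		unsigned char *value = (unsigned char *)addr;

		/* One OUT per byte instead of a single REP OUTSB */
		while (count) {
			outb(*value, port);
			value++;
			count--;
		}
	} else {
		asm volatile("rep; outsb"
			     : "+S"(addr), "+c"(count) : "d"(port));
	}
}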

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:14 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:14 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

From: Tom Lendacky <thomas.lendacky@amd.com>

Early in the boot process, add checks to determine if the kernel is
running with Secure Encrypted Virtualization (SEV) active by issuing
a CPUID instruction.

During early compressed kernel booting, if SEV is active the pagetables are
updated so that data is accessed and decompressed with encryption.

During uncompressed kernel booting, if SEV is active the memory encryption
mask is set and a flag is set to indicate that SEV is enabled.
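
One subtlety in the compressed-boot part: startup_32 runs in 32-bit mode,
while the encryption bit always sits above bit 31 of the 64-bit page table
entries, so only the high dword of each entry needs the mask. In C, the
head_64.S sequence amounts to roughly the following (a sketch; enc_bit
stands for the CPUID-reported bit position and sev for the sev_enabled()
result, both assumed names here):

	/* Sketch of the subl $32 / bts sequence in head_64.S below. */
	u32 pte_mask_high = 0;			/* lives in %edx */

	if (sev && enc_bit > 31)
		pte_mask_high = 1u << (enc_bit - 32);

	/* For each 8-byte entry: low dword = address | flags, and the
	 * high dword gets pte_mask_high added (the addl %edx, 4(%edi)
	 * lines). */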

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/boot/compressed/Makefile      |    2 +
 arch/x86/boot/compressed/head_64.S     |   16 +++++++
 arch/x86/boot/compressed/mem_encrypt.S |   75 ++++++++++++++++++++++++++++++++
 arch/x86/include/uapi/asm/hyperv.h     |    4 ++
 arch/x86/include/uapi/asm/kvm_para.h   |    3 +
 arch/x86/kernel/mem_encrypt_init.c     |   24 ++++++++++
 6 files changed, 124 insertions(+)
 create mode 100644 arch/x86/boot/compressed/mem_encrypt.S

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 44163e8..51f9cd0 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -72,6 +72,8 @@ vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
 	$(obj)/string.o $(obj)/cmdline.o $(obj)/error.o \
 	$(obj)/piggy.o $(obj)/cpuflags.o
 
+vmlinux-objs-$(CONFIG_X86_64) += $(obj)/mem_encrypt.o
+
 vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index d2ae1f8..625b5380 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -130,6 +130,19 @@ ENTRY(startup_32)
  /*
   * Build early 4G boot pagetable
   */
+	/*
+	 * If SEV is active, set the encryption mask in the page tables. This
+	 * will ensure that when the kernel is copied and decompressed it
+	 * will be done encrypted.
+	 */
+	call	sev_enabled
+	xorl	%edx, %edx
+	testl	%eax, %eax
+	jz	1f
+	subl	$32, %eax	/* Encryption bit is always above bit 31 */
+	bts	%eax, %edx	/* Set encryption mask for page tables */
+1:
+
 	/* Initialize Page tables to 0 */
 	leal	pgtable(%ebx), %edi
 	xorl	%eax, %eax
@@ -140,12 +153,14 @@ ENTRY(startup_32)
 	leal	pgtable + 0(%ebx), %edi
 	leal	0x1007 (%edi), %eax
 	movl	%eax, 0(%edi)
+	addl	%edx, 4(%edi)
 
 	/* Build Level 3 */
 	leal	pgtable + 0x1000(%ebx), %edi
 	leal	0x1007(%edi), %eax
 	movl	$4, %ecx
 1:	movl	%eax, 0x00(%edi)
+	addl	%edx, 0x04(%edi)
 	addl	$0x00001000, %eax
 	addl	$8, %edi
 	decl	%ecx
@@ -156,6 +171,7 @@ ENTRY(startup_32)
 	movl	$0x00000183, %eax
 	movl	$2048, %ecx
 1:	movl	%eax, 0(%edi)
+	addl	%edx, 4(%edi)
 	addl	$0x00200000, %eax
 	addl	$8, %edi
 	decl	%ecx
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
new file mode 100644
index 0000000..8313c31
--- /dev/null
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -0,0 +1,75 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+
+#include <asm/processor-flags.h>
+#include <asm/msr.h>
+#include <asm/asm-offsets.h>
+#include <uapi/asm/kvm_para.h>
+
+	.text
+	.code32
+ENTRY(sev_enabled)
+	xor	%eax, %eax
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	push	%ebx
+	push	%ecx
+	push	%edx
+
+	/* Check if running under a hypervisor */
+	movl	$0x40000000, %eax
+	cpuid
+	cmpl	$0x40000001, %eax
+	jb	.Lno_sev
+
+	movl	$0x40000001, %eax
+	cpuid
+	bt	$KVM_FEATURE_SEV, %eax
+	jnc	.Lno_sev
+
+	/*
+	 * Check for memory encryption feature:
+	 *   CPUID Fn8000_001F[EAX] - Bit 0
+	 */
+	movl	$0x8000001f, %eax
+	cpuid
+	bt	$0, %eax
+	jnc	.Lno_sev
+
+	/*
+	 * Get memory encryption information:
+	 *   CPUID Fn8000_001F[EBX] - Bits 5:0
+	 *     Pagetable bit position used to indicate encryption
+	 */
+	movl	%ebx, %eax
+	andl	$0x3f, %eax
+	movl	%eax, sev_enc_bit(%ebp)
+	jmp	.Lsev_exit
+
+.Lno_sev:
+	xor	%eax, %eax
+
+.Lsev_exit:
+	pop	%edx
+	pop	%ecx
+	pop	%ebx
+
+#endif	/* CONFIG_AMD_MEM_ENCRYPT */
+
+	ret
+ENDPROC(sev_enabled)
+
+	.bss
+sev_enc_bit:
+	.long	0
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index 9b1a918..8278161 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -3,6 +3,8 @@
 
 #include <linux/types.h>
 
+#ifndef __ASSEMBLY__
+
 /*
  * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
  * is set by CPUID(HvCpuIdFunctionVersionAndFeatures).
@@ -363,4 +365,6 @@ struct hv_timer_message_payload {
 #define HV_STIMER_AUTOENABLE		(1ULL << 3)
 #define HV_STIMER_SINT(config)		(__u8)(((config) >> 16) & 0x0F)
 
+#endif	/* __ASSEMBLY__ */
+
 #endif
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index bc2802f..e81b74a 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -26,6 +26,8 @@
 #define KVM_FEATURE_PV_UNHALT		7
 #define KVM_FEATURE_SEV			8
 
+#ifndef __ASSEMBLY__
+
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
  */
@@ -100,5 +102,6 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
+#endif	/* __ASSEMBLY__ */
 
 #endif /* _UAPI_ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/kernel/mem_encrypt_init.c b/arch/x86/kernel/mem_encrypt_init.c
index 35c5e3d..5d514e6 100644
--- a/arch/x86/kernel/mem_encrypt_init.c
+++ b/arch/x86/kernel/mem_encrypt_init.c
@@ -22,6 +22,7 @@
 #include <asm/processor-flags.h>
 #include <asm/msr.h>
 #include <asm/cmdline.h>
+#include <asm/kvm_para.h>
 
 static char sme_cmdline_arg_on[] __initdata = "mem_encrypt=on";
 static char sme_cmdline_arg_off[] __initdata = "mem_encrypt=off";
@@ -232,6 +233,29 @@ unsigned long __init sme_enable(void *boot_data)
 	void *cmdline_arg;
 	u64 msr;
 
+	/* Check if running under a hypervisor */
+	eax = 0x40000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	if (eax > 0x40000000) {
+		eax = 0x40000001;
+		ecx = 0;
+		native_cpuid(&eax, &ebx, &ecx, &edx);
+		if (!(eax & BIT(KVM_FEATURE_SEV)))
+			goto out;
+
+		eax = 0x8000001f;
+		ecx = 0;
+		native_cpuid(&eax, &ebx, &ecx, &edx);
+		if (!(eax & 1))
+			goto out;
+
+		sme_me_mask = 1UL << (ebx & 0x3f);
+		sev_enabled = 1;
+
+		goto out;
+	}
+
 	/* Check for an AMD processor */
 	eax = 0;
 	ecx = 0;
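
To make the mask computation concrete: if CPUID Fn8000_001F[EBX] reports a
C-bit position of 47 (a typical value on early SEV hardware, used here only
as an example), sme_me_mask becomes 1UL << 47 = 0x0000800000000000, the bit
later ORed into page table entries to mark pages encrypted.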


* [RFC PATCH v2 13/32] KVM: SVM: Enable SEV by setting the SEV_ENABLE CPU feature
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:15 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:15 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

From: Tom Lendacky <thomas.lendacky@amd.com>

Modify the SVM CPUID update function to indicate whether Secure Encrypted
Virtualization (SEV) is active in the guest by setting the KVM_FEATURE_SEV
CPU features bit. SEV is active if Secure Memory Encryption is enabled in
the host and the SEV_ENABLE bit of the VMCB is set.
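
From inside the guest, the effect would be observed roughly like this (a
sketch; it assumes the KVM paravirt features leaf 0x40000001 used by this
series):

	unsigned int eax, ebx, ecx, edx;

	/* KVM features leaf: KVM_FEATURE_SEV is now set when the VMCB
	 * has SEV_ENABLE. */
	cpuid(0x40000001, &eax, &ebx, &ecx, &edx);
	if (eax & (1 << KVM_FEATURE_SEV)) {
		/* 0x8000001f is now populated from host CPUID, so the
		 * guest can read the C-bit position from EBX[5:0]. */
		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
	}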

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kvm/cpuid.c |    4 +++-
 arch/x86/kvm/svm.c   |   18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 1639de8..e0c40a8 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -601,7 +601,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->edx = 0;
 		break;
 	case 0x80000000:
-		entry->eax = min(entry->eax, 0x8000001a);
+		entry->eax = min(entry->eax, 0x8000001f);
 		break;
 	case 0x80000001:
 		entry->edx &= kvm_cpuid_8000_0001_edx_x86_features;
@@ -634,6 +634,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		break;
 	case 0x8000001d:
 		break;
+	case 0x8000001f:
+		break;
 	/*Add support for Centaur's CPUID instruction*/
 	case 0xC0000000:
 		/*Just support up to 0xC0000004 now*/
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 75b0645..36d61ff 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -46,6 +46,7 @@
 #include <asm/irq_remapping.h>
 
 #include <asm/virtext.h>
+#include <asm/mem_encrypt.h>
 #include "trace.h"
 
 #define __ex(x) __kvm_handle_fault_on_reboot(x)
@@ -5005,10 +5006,27 @@ static void svm_cpuid_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct kvm_cpuid_entry2 *entry;
+	struct vmcb_control_area *ca = &svm->vmcb->control;
+	struct kvm_cpuid_entry2 *features, *sev_info;
 
 	/* Update nrips enabled cache */
 	svm->nrips_enabled = !!guest_cpuid_has_nrips(&svm->vcpu);
 
+	/* Check for Secure Encrypted Virtualization support */
+	features = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
+	if (!features)
+		return;
+
+	sev_info = kvm_find_cpuid_entry(vcpu, 0x8000001f, 0);
+	if (!sev_info)
+		return;
+
+	if (ca->nested_ctl & SVM_NESTED_CTL_SEV_ENABLE) {
+		features->eax |= (1 << KVM_FEATURE_SEV);
+		cpuid(0x8000001f, &sev_info->eax, &sev_info->ebx,
+		      &sev_info->ecx, &sev_info->edx);
+	}
+
 	if (!kvm_vcpu_apicv_active(vcpu))
 		return;
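
Note the companion change in __do_cpuid_ent(): raising the maximum extended
leaf from 0x8000001a to 0x8000001f is what makes the new leaf enumerable in
the first place; without it, a guest walking the extended CPUID range would
stop before reaching the encryption information.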
 


* [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when splitting large pages
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:15 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:15 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

If kernel_map_pages_in_pgd() is called early in the boot process to change
memory attributes, it fails to allocate memory when splitting large pages.
This patch extends cpa_data to support using memblock_alloc() when the slab
allocator is not yet available.

The feature will be used in Secure Encrypted Virtualization (SEV) mode,
where memory region attributes may need to be changed early in the boot
process.
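
Condensed, the allocation policy this patch introduces looks like the
following (names taken from the patch, not verbatim kernel code):

	/* Split-page allocation: memblock before slab is up, pages after. */
	if (cpa->force_memblock) {		/* early boot path */
		phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
		new_pte = phys ? (pte_t *)__va(phys) : NULL;
	} else {				/* normal runtime path */
		base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
		new_pte = base ? (pte_t *)page_address(base) : NULL;
	}

The free path mirrors the split: memblock_free() takes a physical address,
hence the __pa(pte) conversion in try_free_pte(), while the page allocator
path converts the virtual address back to its struct page before freeing.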

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/mm/pageattr.c |   51 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 42 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 46cc89d..9e4ab3b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -14,6 +14,7 @@
 #include <linux/gfp.h>
 #include <linux/pci.h>
 #include <linux/vmalloc.h>
+#include <linux/memblock.h>
 
 #include <asm/e820/api.h>
 #include <asm/processor.h>
@@ -37,6 +38,7 @@ struct cpa_data {
 	int		flags;
 	unsigned long	pfn;
 	unsigned	force_split : 1;
+	unsigned	force_memblock :1;
 	int		curpage;
 	struct page	**pages;
 };
@@ -627,9 +629,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 
 static int
 __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
-		   struct page *base)
+		  pte_t *pbase, unsigned long new_pfn)
 {
-	pte_t *pbase = (pte_t *)page_address(base);
 	unsigned long ref_pfn, pfn, pfninc = 1;
 	unsigned int i, level;
 	pte_t *tmp;
@@ -646,7 +647,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 		return 1;
 	}
 
-	paravirt_alloc_pte(&init_mm, page_to_pfn(base));
+	paravirt_alloc_pte(&init_mm, new_pfn);
 
 	switch (level) {
 	case PG_LEVEL_2M:
@@ -707,7 +708,8 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 	 * pagetable protections, the actual ptes set above control the
 	 * primary protection behavior:
 	 */
-	__set_pmd_pte(kpte, address, mk_pte(base, __pgprot(_KERNPG_TABLE)));
+	__set_pmd_pte(kpte, address,
+		native_make_pte((new_pfn << PAGE_SHIFT) + _KERNPG_TABLE));
 
 	/*
 	 * Intel Atom errata AAH41 workaround.
@@ -723,21 +725,50 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 	return 0;
 }
 
+static pte_t *try_alloc_pte(struct cpa_data *cpa, unsigned long *pfn)
+{
+	unsigned long phys;
+	struct page *base;
+
+	if (cpa->force_memblock) {
+		phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!phys)
+			return NULL;
+		*pfn = phys >> PAGE_SHIFT;
+		return (pte_t *)__va(phys);
+	}
+
+	base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
+	if (!base)
+		return NULL;
+	*pfn = page_to_pfn(base);
+	return (pte_t *)page_address(base);
+}
+
+static void try_free_pte(struct cpa_data *cpa, pte_t *pte)
+{
+	if (cpa->force_memblock)
+		memblock_free(__pa(pte), PAGE_SIZE);
+	else
+		__free_page((struct page *)pte);
+}
+
 static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
 			    unsigned long address)
 {
-	struct page *base;
+	pte_t *new_pte;
+	unsigned long new_pfn;
 
 	if (!debug_pagealloc_enabled())
 		spin_unlock(&cpa_lock);
-	base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
+	new_pte = try_alloc_pte(cpa, &new_pfn);
 	if (!debug_pagealloc_enabled())
 		spin_lock(&cpa_lock);
-	if (!base)
+	if (!new_pte)
 		return -ENOMEM;
 
-	if (__split_large_page(cpa, kpte, address, base))
-		__free_page(base);
+	if (__split_large_page(cpa, kpte, address, new_pte, new_pfn))
+		try_free_pte(cpa, new_pte);
 
 	return 0;
 }
@@ -2035,6 +2066,7 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 			    unsigned numpages, unsigned long page_flags)
 {
 	int retval = -EINVAL;
+	int use_memblock = !slab_is_available();
 
 	struct cpa_data cpa = {
 		.vaddr = &address,
@@ -2044,6 +2076,7 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 		.mask_set = __pgprot(0),
 		.mask_clr = __pgprot(0),
 		.flags = 0,
+		.force_memblock = use_memblock,
 	};
 
 	if (!(__supported_pte_mask & _PAGE_NX))
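
For reference, patch 15 of this series is the intended early caller of
this path. A reduced sketch of that call pattern (the wrapper name is
illustrative, not part of the patch):

  /* Reduced sketch: remap 'npages' pages at 'addr' with new pte flags.
   * Before slab_is_available() returns true, force_memblock is set and
   * any large-page split is backed by memblock_alloc().
   */
  static int __init early_remap_example(unsigned long addr,
  					unsigned long npages,
  					unsigned long pte_flags)
  {
  	unsigned long pfn = slow_virt_to_phys((void *)addr) >> PAGE_SHIFT;

  	return kernel_map_pages_in_pgd(init_mm.pgd, pfn, addr, npages,
  				       pte_flags);
  }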

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 15/32] x86: Add support for changing memory encryption attribute in early boot
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:15 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:15 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

Some KVM-specific custom MSRs share a guest physical address with the
hypervisor. When SEV is active, the shared physical address must be mapped
with the encryption attribute cleared so that both the hypervisor and the
guest can access the data.

Add APIs to change the memory encryption attribute in early boot code.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/mem_encrypt.h |   15 +++++++++
 arch/x86/mm/mem_encrypt.c          |   63 ++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 9799835..95bbe4c 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -47,6 +47,9 @@ void __init sme_unmap_bootdata(char *real_mode_data);
 
 void __init sme_early_init(void);
 
+int __init early_set_memory_decrypted(void *addr, unsigned long size);
+int __init early_set_memory_encrypted(void *addr, unsigned long size);
+
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void);
 
@@ -110,6 +113,18 @@ static inline void __init sme_early_init(void)
 {
 }
 
+static inline int __init early_set_memory_decrypted(void *addr,
+						    unsigned long size)
+{
+	return 1;
+}
+
+static inline int __init early_set_memory_encrypted(void *addr,
+						    unsigned long size)
+{
+	return 1;
+}
+
 #define __sme_pa		__pa
 #define __sme_pa_nodebug	__pa_nodebug
 
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 7df5f4c..567e0d8 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -15,6 +15,7 @@
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/swiotlb.h>
+#include <linux/mem_encrypt.h>
 
 #include <asm/tlbflush.h>
 #include <asm/fixmap.h>
@@ -258,6 +259,68 @@ static void sme_free(struct device *dev, size_t size, void *vaddr,
 	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
 }
 
+static unsigned long __init get_pte_flags(unsigned long address)
+{
+	int level;
+	pte_t *pte;
+	unsigned long flags = _KERNPG_TABLE_NOENC | _PAGE_ENC;
+
+	pte = lookup_address(address, &level);
+	if (!pte)
+		return flags;
+
+	switch (level) {
+	case PG_LEVEL_4K:
+		flags = pte_flags(*pte);
+		break;
+	case PG_LEVEL_2M:
+		flags = pmd_flags(*(pmd_t *)pte);
+		break;
+	case PG_LEVEL_1G:
+		flags = pud_flags(*(pud_t *)pte);
+		break;
+	default:
+		break;
+	}
+
+	return flags;
+}
+
+int __init early_set_memory_enc_dec(void *vaddr, unsigned long size,
+				    unsigned long flags)
+{
+	unsigned long pfn, npages;
+	unsigned long addr = (unsigned long)vaddr & PAGE_MASK;
+
+	/* We are going to change the physical page attribute from C=1 to C=0.
+	 * Flush the caches to ensure that all the data with C=1 is flushed to
+	 * memory. Any caching of the vaddr after function returns will
+	 * use C=0.
+	 */
+	clflush_cache_range(vaddr, size);
+
+	npages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	pfn = slow_virt_to_phys((void *)addr) >> PAGE_SHIFT;
+
+	return kernel_map_pages_in_pgd(init_mm.pgd, pfn, addr, npages,
+					flags & ~sme_me_mask);
+
+}
+
+int __init early_set_memory_decrypted(void *vaddr, unsigned long size)
+{
+	unsigned long flags = get_pte_flags((unsigned long)vaddr);
+
+	return early_set_memory_enc_dec(vaddr, size, flags & ~sme_me_mask);
+}
+
+int __init early_set_memory_encrypted(void *vaddr, unsigned long size)
+{
+	unsigned long flags = get_pte_flags((unsigned long)vaddr);
+
+	return early_set_memory_enc_dec(vaddr, size, flags | _PAGE_ENC);
+}
+
 static struct dma_map_ops sme_dma_ops = {
 	.alloc                  = sme_alloc,
 	.free                   = sme_free,
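
A hypothetical consumer of the new APIs could look like the sketch below
(the buffer and function names are made up; only
early_set_memory_decrypted/encrypted come from this patch):

  /* Illustrative only: clear the C-bit on a page-aligned buffer so both
   * guest and hypervisor can access it, then make it private again.
   */
  static u8 hv_shared_buf[PAGE_SIZE] __aligned(PAGE_SIZE);

  static int __init hv_share_example(void)
  {
  	int ret;

  	ret = early_set_memory_decrypted(hv_shared_buf, sizeof(hv_shared_buf));
  	if (ret)
  		return ret;

  	/* ... exchange data with the hypervisor while decrypted ... */

  	return early_set_memory_encrypted(hv_shared_buf, sizeof(hv_shared_buf));
  }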

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:15 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:15 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

Some KVM-specific MSRs (steal-time, asyncpf, avic_eio) allocate a per-CPU
variable at compile time and share its physical address with the
hypervisor. This presents a challenge when SEV is active in the guest OS:
guest memory is encrypted with the guest key, and the hypervisor is no
longer able to modify it. When SEV is active, we need to clear the
encryption attribute of these shared physical addresses so that both guest
and hypervisor can access the data.

To solve this problem, I have tried three options:

1) Convert the static per-CPU variables to dynamic per-CPU allocation and
clear the encryption attribute when SEV is detected. While doing so I
found that the dynamic per-CPU allocator was not yet ready when
kvm_guest_cpu_init was called.

2) Since the encryption attribute works at PAGE_SIZE granularity, add
padding to 'struct kvm_steal_time' to make it PAGE_SIZE and clear the
encryption attribute of the full page at runtime. The downside is that
the structure layout must be changed, which may break compatibility.

3) Define a new per-CPU section (.data..percpu..hv_shared) to hold the
compile-time shared per-CPU variables, and map this section with the
encryption attribute cleared when SEV is detected.

This patch implements #3. It introduces a new DEFINE_PER_CPU_HV_SHARED
macro to create a compile-time per-CPU variable. When SEV is detected, we
map the per-CPU variable as decrypted (i.e. with the encryption attribute
cleared).

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/kvm.c             |   43 +++++++++++++++++++++++++++++++------
 include/asm-generic/vmlinux.lds.h |    3 +++
 include/linux/percpu-defs.h       |    9 ++++++++
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 099fcba..706a08e 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
 
 early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
-static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
-static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
+static DEFINE_PER_CPU_HV_SHARED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
+static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
 
 /*
@@ -290,6 +290,22 @@ static void __init paravirt_ops_setup(void)
 #endif
 }
 
+static int kvm_map_percpu_hv_shared(void *addr, unsigned long size)
+{
+	/* When SEV is active, the percpu static variables initialized
+	 * in data section will contain the encrypted data so we first
+	 * need to decrypt it and then map it as decrypted.
+	 */
+	if (sev_active()) {
+		unsigned long pa = slow_virt_to_phys(addr);
+
+		sme_early_decrypt(pa, size);
+		return early_set_memory_decrypted(addr, size);
+	}
+
+	return 0;
+}
+
 static void kvm_register_steal_time(void)
 {
 	int cpu = smp_processor_id();
@@ -298,12 +314,17 @@ static void kvm_register_steal_time(void)
 	if (!has_steal_clock)
 		return;
 
+	if (kvm_map_percpu_hv_shared(st, sizeof(*st))) {
+		pr_err("kvm-stealtime: failed to map hv_shared percpu\n");
+		return;
+	}
+
 	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
 	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
 		cpu, (unsigned long long) slow_virt_to_phys(st));
 }
 
-static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
+static DEFINE_PER_CPU_HV_SHARED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
 
 static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 {
@@ -327,25 +348,33 @@ static void kvm_guest_cpu_init(void)
 	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
 		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
 
+		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&apf_reason),
+					sizeof(struct kvm_vcpu_pv_apf_data)))
+			goto skip_asyncpf;
 #ifdef CONFIG_PREEMPT
 		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
 #endif
 		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
 		__this_cpu_write(apf_reason.enabled, 1);
-		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
-		       smp_processor_id());
+		printk(KERN_INFO"KVM setup async PF for cpu %d msr %llx\n",
+		       smp_processor_id(), pa);
 	}
-
+skip_asyncpf:
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
 		unsigned long pa;
 		/* Size alignment is implied but just to make it explicit. */
 		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
+		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&kvm_apic_eoi),
+					sizeof(unsigned long)))
+			goto skip_pv_eoi;
 		__this_cpu_write(kvm_apic_eoi, 0);
 		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
 			| KVM_MSR_ENABLED;
 		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
+		printk(KERN_INFO"KVM setup PV EOI for cpu %d msr %lx\n",
+		       smp_processor_id(), pa);
 	}
-
+skip_pv_eoi:
 	if (has_steal_clock)
 		kvm_register_steal_time();
 }
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0968d13..8d29910 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -773,6 +773,9 @@
 	. = ALIGN(cacheline);						\
 	*(.data..percpu)						\
 	*(.data..percpu..shared_aligned)				\
+	. = ALIGN(PAGE_SIZE);						\
+	*(.data..percpu..hv_shared)					\
+	. = ALIGN(PAGE_SIZE);						\
 	VMLINUX_SYMBOL(__per_cpu_end) = .;
 
 /**
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index 8f16299..5af366e 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -172,6 +172,15 @@
 #define DEFINE_PER_CPU_READ_MOSTLY(type, name)				\
 	DEFINE_PER_CPU_SECTION(type, name, "..read_mostly")
 
+/* Declaration/definition used for per-CPU variables that must be shared
+ * between hypervisor and guest OS.
+ */
+#define DECLARE_PER_CPU_HV_SHARED(type, name)				\
+	DECLARE_PER_CPU_SECTION(type, name, "..hv_shared")
+
+#define DEFINE_PER_CPU_HV_SHARED(type, name)				\
+	DEFINE_PER_CPU_SECTION(type, name, "..hv_shared")
+
 /*
  * Intermodule exports for per-CPU variables.  sparse forgets about
  * address space across EXPORT_SYMBOL(), change EXPORT_SYMBOL() to
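
To illustrate the intended use of the new macro, a hypothetical variant
of the steal-time registration done above (the variable and function
names are made up; everything else comes from this patch):

  /* Illustrative only: place a compile-time per-CPU variable in the
   * ..hv_shared section, decrypt and map it as shared when SEV is
   * active, then hand its physical address to the hypervisor.
   */
  static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, st_example) __aligned(64);

  static void st_example_register(void)
  {
  	struct kvm_steal_time *st = this_cpu_ptr(&st_example);

  	if (kvm_map_percpu_hv_shared(st, sizeof(*st)))
  		return;

  	wrmsrl(MSR_KVM_STEAL_TIME, slow_virt_to_phys(st) | KVM_MSR_ENABLED);
  }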

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:15   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:15 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

Some KVM specific MSR's (steal-time, asyncpf, avic_eio) allocates per-CPU
variable at compile time and share its physical address with hypervisor.
It presents a challege when SEV is active in guest OS. When SEV is active,
guest memory is encrypted with guest key and hypervisor will no longer able
to modify the guest memory. When SEV is active, we need to clear the
encryption attribute of shared physical addresses so that both guest and
hypervisor can access the data.

To solve this problem, I have tried these three options:

1) Convert the static per-CPU to dynamic per-CPU allocation. When SEV is
detected then clear the encryption attribute. But while doing so I found
that per-CPU dynamic allocator was not ready when kvm_guest_cpu_init was
called.

2) Since the encryption attributes works on PAGE_SIZE hence add some extra
padding to 'struct kvm-steal-time' to make it PAGE_SIZE and then at runtime
clear the encryption attribute of the full PAGE. The downside of this was
now we need to modify structure which may break the compatibility.

3) Define a new per-CPU section (.data..percpu.hv_shared) which will be
used to hold the compile time shared per-CPU variables. When SEV is
detected we map this section with encryption attribute cleared.

This patch implements #3. It introduces a new DEFINE_PER_CPU_HV_SHAHRED
macro to create a compile time per-CPU variable. When SEV is detected we
map the per-CPU variable as decrypted (i.e with encryption attribute cleared).

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/kvm.c             |   43 +++++++++++++++++++++++++++++++------
 include/asm-generic/vmlinux.lds.h |    3 +++
 include/linux/percpu-defs.h       |    9 ++++++++
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 099fcba..706a08e 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
 
 early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
-static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
-static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
+static DEFINE_PER_CPU_HV_SHARED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
+static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
 
 /*
@@ -290,6 +290,22 @@ static void __init paravirt_ops_setup(void)
 #endif
 }
 
+static int kvm_map_percpu_hv_shared(void *addr, unsigned long size)
+{
+	/* When SEV is active, the percpu static variables initialized
+	 * in data section will contain the encrypted data so we first
+	 * need to decrypt it and then map it as decrypted.
+	 */
+	if (sev_active()) {
+		unsigned long pa = slow_virt_to_phys(addr);
+
+		sme_early_decrypt(pa, size);
+		return early_set_memory_decrypted(addr, size);
+	}
+
+	return 0;
+}
+
 static void kvm_register_steal_time(void)
 {
 	int cpu = smp_processor_id();
@@ -298,12 +314,17 @@ static void kvm_register_steal_time(void)
 	if (!has_steal_clock)
 		return;
 
+	if (kvm_map_percpu_hv_shared(st, sizeof(*st))) {
+		pr_err("kvm-stealtime: failed to map hv_shared percpu\n");
+		return;
+	}
+
 	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
 	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
 		cpu, (unsigned long long) slow_virt_to_phys(st));
 }
 
-static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
+static DEFINE_PER_CPU_HV_SHARED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
 
 static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 {
@@ -327,25 +348,33 @@ static void kvm_guest_cpu_init(void)
 	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
 		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
 
+		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&apf_reason),
+					sizeof(struct kvm_vcpu_pv_apf_data)))
+			goto skip_asyncpf;
 #ifdef CONFIG_PREEMPT
 		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
 #endif
 		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
 		__this_cpu_write(apf_reason.enabled, 1);
-		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
-		       smp_processor_id());
+		printk(KERN_INFO"KVM setup async PF for cpu %d msr %llx\n",
+		       smp_processor_id(), pa);
 	}
-
+skip_asyncpf:
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
 		unsigned long pa;
 		/* Size alignment is implied but just to make it explicit. */
 		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
+		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&kvm_apic_eoi),
+					sizeof(unsigned long)))
+			goto skip_pv_eoi;
 		__this_cpu_write(kvm_apic_eoi, 0);
 		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
 			| KVM_MSR_ENABLED;
 		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
+		printk(KERN_INFO"KVM setup PV EOI for cpu %d msr %lx\n",
+		       smp_processor_id(), pa);
 	}
-
+skip_pv_eoi:
 	if (has_steal_clock)
 		kvm_register_steal_time();
 }
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0968d13..8d29910 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -773,6 +773,9 @@
 	. = ALIGN(cacheline);						\
 	*(.data..percpu)						\
 	*(.data..percpu..shared_aligned)				\
+	. = ALIGN(PAGE_SIZE);						\
+	*(.data..percpu..hv_shared)					\
+	. = ALIGN(PAGE_SIZE);						\
 	VMLINUX_SYMBOL(__per_cpu_end) = .;
 
 /**
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index 8f16299..5af366e 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -172,6 +172,15 @@
 #define DEFINE_PER_CPU_READ_MOSTLY(type, name)				\
 	DEFINE_PER_CPU_SECTION(type, name, "..read_mostly")
 
+/* Declaration/definition used for per-CPU variables that must be shared
+ * between hypervisor and guest OS.
+ */
+#define DECLARE_PER_CPU_HV_SHARED(type, name)				\
+	DECLARE_PER_CPU_SECTION(type, name, "..hv_shared")
+
+#define DEFINE_PER_CPU_HV_SHARED(type, name)				\
+	DEFINE_PER_CPU_SECTION(type, name, "..hv_shared")
+
 /*
  * Intermodule exports for per-CPU variables.  sparse forgets about
  * address space across EXPORT_SYMBOL(), change EXPORT_SYMBOL() to

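A note on the linker-script change above: encryption attributes can only be
changed with page granularity, so the ALIGN(PAGE_SIZE) on both sides of the
new section ensures that clearing the C-bit for .data..percpu..hv_shared
never exposes unrelated per-CPU data sharing a boundary page. A rough
picture of the resulting per-CPU image (illustrative only):

	__per_cpu_start
	    .data..percpu
	    .data..percpu..shared_aligned
	    <pad to PAGE_SIZE>
	    .data..percpu..hv_shared	<- whole pages, mapped decrypted
	    <pad to PAGE_SIZE>
	__per_cpu_end
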
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 17/32] x86: kvmclock: Clear encryption attribute when SEV is active
  2017-03-02 15:12 ` Brijesh Singh
                   ` (35 preceding siblings ...)
  (?)
@ 2017-03-02 15:15 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:15 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The guest physical memory areas holding struct pvclock_wall_clock and
struct pvclock_vcpu_time_info are shared with the hypervisor, which
periodically updates their contents. When SEV is active, we must clear the
encryption attribute of the shared memory pages so that both the hypervisor
and the guest can access the data.
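
For illustration (not part of the diff below), the lifecycle the new helpers
implement for early hypervisor-shared memory under SEV; the helper names
match the patch, error handling is elided:

	/* Allocate whole pages and clear their encryption attribute. */
	phys_addr_t pa = kvm_memblock_alloc(PAGE_SIZE, PAGE_SIZE);
	struct pvclock_wall_clock *wc = __va(pa);	/* decrypted mapping */

	/* ... hand slow_virt_to_phys(wc) to the hypervisor via an MSR ... */

	/* On failure or teardown: re-encrypt the pages and return them. */
	kvm_memblock_free(pa, PAGE_SIZE);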

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/kvmclock.c |   65 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 278de4f..3b38b3d 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
 
+#include <asm/mem_encrypt.h>
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
 
@@ -44,7 +45,7 @@ early_param("no-kvmclock", parse_no_kvmclock);
 
 /* The hypervisor will put information about time periodically here */
 static struct pvclock_vsyscall_time_info *hv_clock;
-static struct pvclock_wall_clock wall_clock;
+static struct pvclock_wall_clock *wall_clock;
 
 struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
 {
@@ -62,15 +63,18 @@ static void kvm_get_wallclock(struct timespec *now)
 	int low, high;
 	int cpu;
 
-	low = (int)__pa_symbol(&wall_clock);
-	high = ((u64)__pa_symbol(&wall_clock) >> 32);
+	if (!wall_clock)
+		return;
+
+	low = (int)slow_virt_to_phys(wall_clock);
+	high = ((u64)slow_virt_to_phys(wall_clock) >> 32);
 
 	native_write_msr(msr_kvm_wall_clock, low, high);
 
 	cpu = get_cpu();
 
 	vcpu_time = &hv_clock[cpu].pvti;
-	pvclock_read_wallclock(&wall_clock, vcpu_time, now);
+	pvclock_read_wallclock(wall_clock, vcpu_time, now);
 
 	put_cpu();
 }
@@ -246,11 +250,40 @@ static void kvm_shutdown(void)
 	native_machine_shutdown();
 }
 
+static phys_addr_t kvm_memblock_alloc(phys_addr_t size, phys_addr_t align)
+{
+	phys_addr_t mem;
+
+	mem = memblock_alloc(size, align);
+	if (!mem)
+		return 0;
+
+	/* When SEV is active, clear the encryption attributes of the pages */
+	if (sev_active()) {
+		if (early_set_memory_decrypted(__va(mem), size))
+			goto e_free;
+	}
+
+	return mem;
+e_free:
+	memblock_free(mem, size);
+	return 0;
+}
+
+static void kvm_memblock_free(phys_addr_t addr, phys_addr_t size)
+{
+	/* When SEV is active, restore the encryption attributes of the pages */
+	if (sev_active())
+		early_set_memory_encrypted(__va(addr), size);
+
+	memblock_free(addr, size);
+}
+
 void __init kvmclock_init(void)
 {
 	struct pvclock_vcpu_time_info *vcpu_time;
-	unsigned long mem;
-	int size, cpu;
+	unsigned long mem, mem_wall_clock;
+	int size, cpu, wall_clock_size;
 	u8 flags;
 
 	size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
@@ -267,15 +300,29 @@ void __init kvmclock_init(void)
 	printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
 		msr_kvm_system_time, msr_kvm_wall_clock);
 
-	mem = memblock_alloc(size, PAGE_SIZE);
-	if (!mem)
+	wall_clock_size = PAGE_ALIGN(sizeof(struct pvclock_wall_clock));
+	mem_wall_clock = kvm_memblock_alloc(wall_clock_size, PAGE_SIZE);
+	if (!mem_wall_clock)
 		return;
+
+	wall_clock = __va(mem_wall_clock);
+	memset(wall_clock, 0, wall_clock_size);
+
+	mem = kvm_memblock_alloc(size, PAGE_SIZE);
+	if (!mem) {
+		kvm_memblock_free(mem_wall_clock, wall_clock_size);
+		wall_clock = NULL;
+		return;
+	}
+
 	hv_clock = __va(mem);
 	memset(hv_clock, 0, size);
 
 	if (kvm_register_clock("primary cpu clock")) {
 		hv_clock = NULL;
-		memblock_free(mem, size);
+		kvm_memblock_free(mem, size);
+		kvm_memblock_free(mem_wall_clock, wall_clock_size);
+		wall_clock = NULL;
 		return;
 	}
 

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 18/32] kvm: svm: Use the hardware provided GPA instead of page walk
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:16   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

From: Tom Lendacky <thomas.lendacky@amd.com>

When a guest causes an NPF which requires emulation, KVM sometimes walks
the guest page tables to translate the GVA to a GPA. This is unnecessary
most of the time on AMD hardware, since the hardware provides the GPA in
EXITINFO2.

The only exceptions are string operations that use a rep prefix and
instructions that use two memory locations. With rep, the GPA will only be
that of the initial NPF, and with two memory locations we won't know
which of the two addresses was translated into EXITINFO2.
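
A concrete illustration (not from the patch) of why these cases are
excluded; consider a rep string move and a push with a memory source:

	/*
	 * rep movsb with RCX = 3 and RDI pointing at MMIO:
	 *   iteration 0 faults and EXITINFO2 holds the GPA of byte 0;
	 *   iterations 1 and 2 are emulated as well, but their GPAs were
	 *   never reported, so reusing EXITINFO2 for them would be wrong.
	 *
	 * pushq (%rax): one memory read plus a stack write, i.e. two
	 * memory locations, and EXITINFO2 does not say which one it holds.
	 */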

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_emulate.h |    1 +
 arch/x86/include/asm/kvm_host.h    |    3 ++
 arch/x86/kvm/emulate.c             |   20 +++++++++++++---
 arch/x86/kvm/svm.c                 |    2 ++
 arch/x86/kvm/x86.c                 |   45 ++++++++++++++++++++++++++++--------
 5 files changed, 57 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index e9cd7be..3e8c287 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -441,5 +441,6 @@ int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
 int emulate_int_real(struct x86_emulate_ctxt *ctxt, int irq);
 void emulator_invalidate_register_cache(struct x86_emulate_ctxt *ctxt);
 void emulator_writeback_register_cache(struct x86_emulate_ctxt *ctxt);
+bool emulator_can_use_gpa(struct x86_emulate_ctxt *ctxt);
 
 #endif /* _ASM_X86_KVM_X86_EMULATE_H */
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 37326b5..bff1f15 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -668,6 +668,9 @@ struct kvm_vcpu_arch {
 
 	int pending_ioapic_eoi;
 	int pending_external_vector;
+
+	/* GPA available (AMD only) */
+	bool gpa_available;
 };
 
 struct kvm_lpage_info {
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index cedbba0..45c7306 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -173,6 +173,7 @@
 #define NearBranch  ((u64)1 << 52)  /* Near branches */
 #define No16	    ((u64)1 << 53)  /* No 16 bit operand */
 #define IncSP       ((u64)1 << 54)  /* SP is incremented before ModRM calc */
+#define TwoMemOp    ((u64)1 << 55)  /* Instruction has two memory operands */
 
 #define DstXacc     (DstAccLo | SrcAccHi | SrcWrite)
 
@@ -4298,7 +4299,7 @@ static const struct opcode group1[] = {
 };
 
 static const struct opcode group1A[] = {
-	I(DstMem | SrcNone | Mov | Stack | IncSP, em_pop), N, N, N, N, N, N, N,
+	I(DstMem | SrcNone | Mov | Stack | IncSP | TwoMemOp, em_pop), N, N, N, N, N, N, N,
 };
 
 static const struct opcode group2[] = {
@@ -4336,7 +4337,7 @@ static const struct opcode group5[] = {
 	I(SrcMemFAddr | ImplicitOps,		em_call_far),
 	I(SrcMem | NearBranch,			em_jmp_abs),
 	I(SrcMemFAddr | ImplicitOps,		em_jmp_far),
-	I(SrcMem | Stack,			em_push), D(Undefined),
+	I(SrcMem | Stack | TwoMemOp,		em_push), D(Undefined),
 };
 
 static const struct opcode group6[] = {
@@ -4556,8 +4557,8 @@ static const struct opcode opcode_table[256] = {
 	/* 0xA0 - 0xA7 */
 	I2bv(DstAcc | SrcMem | Mov | MemAbs, em_mov),
 	I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable, em_mov),
-	I2bv(SrcSI | DstDI | Mov | String, em_mov),
-	F2bv(SrcSI | DstDI | String | NoWrite, em_cmp_r),
+	I2bv(SrcSI | DstDI | Mov | String | TwoMemOp, em_mov),
+	F2bv(SrcSI | DstDI | String | NoWrite | TwoMemOp, em_cmp_r),
 	/* 0xA8 - 0xAF */
 	F2bv(DstAcc | SrcImm | NoWrite, em_test),
 	I2bv(SrcAcc | DstDI | Mov | String, em_mov),
@@ -5671,3 +5672,14 @@ void emulator_writeback_register_cache(struct x86_emulate_ctxt *ctxt)
 {
 	writeback_registers(ctxt);
 }
+
+bool emulator_can_use_gpa(struct x86_emulate_ctxt *ctxt)
+{
+	if (ctxt->rep_prefix && (ctxt->d & String))
+		return false;
+
+	if (ctxt->d & TwoMemOp)
+		return false;
+
+	return true;
+}
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 36d61ff..b581499 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4184,6 +4184,8 @@ static int handle_exit(struct kvm_vcpu *vcpu)
 
 	trace_kvm_exit(exit_code, vcpu, KVM_ISA_SVM);
 
+	vcpu->arch.gpa_available = (exit_code == SVM_EXIT_NPF);
+
 	if (!is_cr_intercept(svm, INTERCEPT_CR0_WRITE))
 		vcpu->arch.cr0 = svm->vmcb->save.cr0;
 	if (npt_enabled)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9e6a593..2099df8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4465,6 +4465,21 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
 }
 EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
 
+static int vcpu_is_mmio_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
+			    gpa_t gpa, bool write)
+{
+	/* For APIC access vmexit */
+	if ((gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE)
+		return 1;
+
+	if (vcpu_match_mmio_gpa(vcpu, gpa)) {
+		trace_vcpu_match_mmio(gva, gpa, write, true);
+		return 1;
+	}
+
+	return 0;
+}
+
 static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
 				gpa_t *gpa, struct x86_exception *exception,
 				bool write)
@@ -4491,16 +4506,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
 	if (*gpa == UNMAPPED_GVA)
 		return -1;
 
-	/* For APIC access vmexit */
-	if ((*gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE)
-		return 1;
-
-	if (vcpu_match_mmio_gpa(vcpu, *gpa)) {
-		trace_vcpu_match_mmio(gva, *gpa, write, true);
-		return 1;
-	}
-
-	return 0;
+	return vcpu_is_mmio_gpa(vcpu, gva, *gpa, write);
 }
 
 int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
@@ -4597,6 +4603,22 @@ static int emulator_read_write_onepage(unsigned long addr, void *val,
 	int handled, ret;
 	bool write = ops->write;
 	struct kvm_mmio_fragment *frag;
+	struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
+
+	/*
+	 * If the exit was due to an NPF we may already have a GPA.
+	 * If the GPA is present, use it to avoid the GVA-to-GPA table walk.
+	 * Note, this cannot be used on string operations since a string
+	 * operation using rep will only have the initial GPA from the NPF
+	 * that occurred.
+	 */
+	if (vcpu->arch.gpa_available &&
+	    emulator_can_use_gpa(ctxt) &&
+	    vcpu_is_mmio_gpa(vcpu, addr, exception->address, write) &&
+	    (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK)) {
+		gpa = exception->address;
+		goto mmio;
+	}
 
 	ret = vcpu_mmio_gva_to_gpa(vcpu, addr, &gpa, exception, write);
 
@@ -5613,6 +5635,9 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 	}
 
 restart:
+	/* Save the faulting GPA (cr2) in the address field */
+	ctxt->exception.address = cr2;
+
 	r = x86_emulate_insn(ctxt);
 
 	if (r == EMULATION_INTERCEPTED)

* [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device
@ 2017-03-02 15:16 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

The CCP device is part of the AMD Secure Processor. To expand the usage of
the AMD Secure Processor, create a framework that allows its functional
components to be initialized and handled appropriately.
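
As a rough sketch of the dispatch idea, sp_init() below probes only the
sub-devices whose version data the bus driver declared; a hypothetical PSP
hookup (psp_dev_init() does not exist in this patch) would follow the same
pattern as the CCP:

  if (sp->dev_data->ccp_vdata)   /* CCP sub-device present */
          ccp_dev_init(sp);

  if (sp->dev_data->psp_vdata)   /* hypothetical future PSP hookup */
          psp_dev_init(sp);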

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/Kconfig           |   10 +
 drivers/crypto/ccp/Kconfig       |   43 +++--
 drivers/crypto/ccp/Makefile      |    8 -
 drivers/crypto/ccp/ccp-dev-v3.c  |   86 +++++-----
 drivers/crypto/ccp/ccp-dev-v5.c  |   73 ++++-----
 drivers/crypto/ccp/ccp-dev.c     |  137 +++++++++-------
 drivers/crypto/ccp/ccp-dev.h     |   35 ----
 drivers/crypto/ccp/sp-dev.c      |  308 ++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/sp-dev.h      |  140 ++++++++++++++++
 drivers/crypto/ccp/sp-pci.c      |  324 ++++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/sp-platform.c |  268 +++++++++++++++++++++++++++++++
 include/linux/ccp.h              |    3 
 12 files changed, 1240 insertions(+), 195 deletions(-)
 create mode 100644 drivers/crypto/ccp/sp-dev.c
 create mode 100644 drivers/crypto/ccp/sp-dev.h
 create mode 100644 drivers/crypto/ccp/sp-pci.c
 create mode 100644 drivers/crypto/ccp/sp-platform.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 7956478..d31b469 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -456,14 +456,14 @@ config CRYPTO_DEV_ATMEL_SHA
 	  To compile this driver as a module, choose M here: the module
 	  will be called atmel-sha.
 
-config CRYPTO_DEV_CCP
-	bool "Support for AMD Cryptographic Coprocessor"
+config CRYPTO_DEV_SP
+	bool "Support for AMD Secure Processor"
 	depends on ((X86 && PCI) || (ARM64 && (OF_ADDRESS || ACPI))) && HAS_IOMEM
 	help
-	  The AMD Cryptographic Coprocessor provides hardware offload support
-	  for encryption, hashing and related operations.
+	  The AMD Secure Processor provides support for memory encryption in
+	  virtualization and for hardware offload of encryption, hashing and
+	  related operations.
 
-if CRYPTO_DEV_CCP
+if CRYPTO_DEV_SP
 	source "drivers/crypto/ccp/Kconfig"
 endif
 
diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 2238f77..bc08f03 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -1,26 +1,37 @@
-config CRYPTO_DEV_CCP_DD
-	tristate "Cryptographic Coprocessor device driver"
-	depends on CRYPTO_DEV_CCP
-	default m
-	select HW_RANDOM
-	select DMA_ENGINE
-	select DMADEVICES
-	select CRYPTO_SHA1
-	select CRYPTO_SHA256
-	help
-	  Provides the interface to use the AMD Cryptographic Coprocessor
-	  which can be used to offload encryption operations such as SHA,
-	  AES and more. If you choose 'M' here, this module will be called
-	  ccp.
-
 config CRYPTO_DEV_CCP_CRYPTO
 	tristate "Encryption and hashing offload support"
-	depends on CRYPTO_DEV_CCP_DD
+	depends on CRYPTO_DEV_SP_DD
 	default m
 	select CRYPTO_HASH
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AUTHENC
+	select CRYPTO_DEV_CCP
 	help
 	  Support for using the cryptographic API with the AMD Cryptographic
 	  Coprocessor. This module supports offload of SHA and AES algorithms.
 	  If you choose 'M' here, this module will be called ccp_crypto.
+
+config CRYPTO_DEV_SP_DD
+	tristate "Secure Processor device driver"
+	depends on CRYPTO_DEV_SP
+	default m
+	help
+	  Provides the interface to use the AMD Secure Processor. The
+	  AMD Secure Processor supports the Platform Security Processor (PSP)
+	  and Cryptographic Coprocessor (CCP). If you choose 'M' here, this
+	  module will be called ccp.
+
+if CRYPTO_DEV_SP_DD
+config CRYPTO_DEV_CCP
+	bool "Cryptographic Coprocessor interface"
+	default y
+	select HW_RANDOM
+	select DMA_ENGINE
+	select DMADEVICES
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	help
+	  Provides the interface to use the AMD Cryptographic Coprocessor
+	  which can be used to offload encryption operations such as SHA,
+	  AES and more.
+endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 346ceb8..8127e18 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -1,11 +1,11 @@
-obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
-ccp-objs := ccp-dev.o \
+obj-$(CONFIG_CRYPTO_DEV_SP_DD) += ccp.o
+ccp-objs := sp-dev.o sp-platform.o
+ccp-$(CONFIG_PCI) += sp-pci.o
+ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-ops.o \
 	    ccp-dev-v3.o \
 	    ccp-dev-v5.o \
-	    ccp-platform.o \
 	    ccp-dmaengine.o
-ccp-$(CONFIG_PCI) += ccp-pci.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/ccp-dev-v3.c b/drivers/crypto/ccp/ccp-dev-v3.c
index 7bc0998..5c50d14 100644
--- a/drivers/crypto/ccp/ccp-dev-v3.c
+++ b/drivers/crypto/ccp/ccp-dev-v3.c
@@ -315,6 +315,39 @@ static int ccp_perform_ecc(struct ccp_op *op)
 	return ccp_do_cmd(op, cr, ARRAY_SIZE(cr));
 }
 
+static irqreturn_t ccp_irq_handler(int irq, void *data)
+{
+	struct ccp_device *ccp = data;
+	struct ccp_cmd_queue *cmd_q;
+	u32 q_int, status;
+	unsigned int i;
+
+	status = ioread32(ccp->io_regs + IRQ_STATUS_REG);
+
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		cmd_q = &ccp->cmd_q[i];
+
+		q_int = status & (cmd_q->int_ok | cmd_q->int_err);
+		if (q_int) {
+			cmd_q->int_status = status;
+			cmd_q->q_status = ioread32(cmd_q->reg_status);
+			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+			/* On error, only save the first error value */
+			if ((q_int & cmd_q->int_err) && !cmd_q->cmd_error)
+				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+			cmd_q->int_rcvd = 1;
+
+			/* Acknowledge the interrupt and wake the kthread */
+			iowrite32(q_int, ccp->io_regs + IRQ_STATUS_REG);
+			wake_up_interruptible(&cmd_q->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int ccp_init(struct ccp_device *ccp)
 {
 	struct device *dev = ccp->dev;
@@ -374,7 +407,7 @@ static int ccp_init(struct ccp_device *ccp)
 
 #ifdef CONFIG_ARM64
 		/* For arm64 set the recommended queue cache settings */
-		iowrite32(ccp->axcache, ccp->io_regs + CMD_Q_CACHE_BASE +
+		iowrite32(ccp->sp->axcache, ccp->io_regs + CMD_Q_CACHE_BASE +
 			  (CMD_Q_CACHE_INC * i));
 #endif
 
@@ -398,7 +431,7 @@ static int ccp_init(struct ccp_device *ccp)
 	iowrite32(qim, ccp->io_regs + IRQ_STATUS_REG);
 
 	/* Request an irq */
-	ret = ccp->get_irq(ccp);
+	ret = sp_request_ccp_irq(ccp->sp, ccp_irq_handler, ccp->name, ccp);
 	if (ret) {
 		dev_err(dev, "unable to allocate an IRQ\n");
 		goto e_pool;
@@ -450,7 +483,7 @@ static int ccp_init(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 e_pool:
 	for (i = 0; i < ccp->cmd_q_count; i++)
@@ -496,7 +529,7 @@ static void ccp_destroy(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 	for (i = 0; i < ccp->cmd_q_count; i++)
 		dma_pool_destroy(ccp->cmd_q[i].dma_pool);
@@ -516,40 +549,6 @@ static void ccp_destroy(struct ccp_device *ccp)
 	}
 }
 
-static irqreturn_t ccp_irq_handler(int irq, void *data)
-{
-	struct device *dev = data;
-	struct ccp_device *ccp = dev_get_drvdata(dev);
-	struct ccp_cmd_queue *cmd_q;
-	u32 q_int, status;
-	unsigned int i;
-
-	status = ioread32(ccp->io_regs + IRQ_STATUS_REG);
-
-	for (i = 0; i < ccp->cmd_q_count; i++) {
-		cmd_q = &ccp->cmd_q[i];
-
-		q_int = status & (cmd_q->int_ok | cmd_q->int_err);
-		if (q_int) {
-			cmd_q->int_status = status;
-			cmd_q->q_status = ioread32(cmd_q->reg_status);
-			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
-
-			/* On error, only save the first error value */
-			if ((q_int & cmd_q->int_err) && !cmd_q->cmd_error)
-				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
-
-			cmd_q->int_rcvd = 1;
-
-			/* Acknowledge the interrupt and wake the kthread */
-			iowrite32(q_int, ccp->io_regs + IRQ_STATUS_REG);
-			wake_up_interruptible(&cmd_q->int_queue);
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static const struct ccp_actions ccp3_actions = {
 	.aes = ccp_perform_aes,
 	.xts_aes = ccp_perform_xts_aes,
@@ -562,13 +561,18 @@ static const struct ccp_actions ccp3_actions = {
 	.init = ccp_init,
 	.destroy = ccp_destroy,
 	.get_free_slots = ccp_get_free_slots,
-	.irqhandler = ccp_irq_handler,
 };
 
-const struct ccp_vdata ccpv3 = {
+const struct ccp_vdata ccpv3_platform = {
+	.version = CCP_VERSION(3, 0),
+	.setup = NULL,
+	.perform = &ccp3_actions,
+	.offset = 0,
+};
+
+const struct ccp_vdata ccpv3_pci = {
 	.version = CCP_VERSION(3, 0),
 	.setup = NULL,
 	.perform = &ccp3_actions,
-	.bar = 2,
 	.offset = 0x20000,
 };
diff --git a/drivers/crypto/ccp/ccp-dev-v5.c b/drivers/crypto/ccp/ccp-dev-v5.c
index 612898b..dd6335b 100644
--- a/drivers/crypto/ccp/ccp-dev-v5.c
+++ b/drivers/crypto/ccp/ccp-dev-v5.c
@@ -651,6 +651,38 @@ static int ccp_assign_lsbs(struct ccp_device *ccp)
 	return rc;
 }
 
+static irqreturn_t ccp5_irq_handler(int irq, void *data)
+{
+	struct device *dev = data;
+	struct ccp_device *ccp = dev_get_drvdata(dev);
+	u32 status;
+	unsigned int i;
+
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		struct ccp_cmd_queue *cmd_q = &ccp->cmd_q[i];
+
+		status = ioread32(cmd_q->reg_interrupt_status);
+
+		if (status) {
+			cmd_q->int_status = status;
+			cmd_q->q_status = ioread32(cmd_q->reg_status);
+			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+			/* On error, only save the first error value */
+			if ((status & INT_ERROR) && !cmd_q->cmd_error)
+				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+			cmd_q->int_rcvd = 1;
+
+			/* Acknowledge the interrupt and wake the kthread */
+			iowrite32(ALL_INTERRUPTS, cmd_q->reg_interrupt_status);
+			wake_up_interruptible(&cmd_q->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int ccp5_init(struct ccp_device *ccp)
 {
 	struct device *dev = ccp->dev;
@@ -752,7 +784,7 @@ static int ccp5_init(struct ccp_device *ccp)
 
 	dev_dbg(dev, "Requesting an IRQ...\n");
 	/* Request an irq */
-	ret = ccp->get_irq(ccp);
+	ret = sp_request_ccp_irq(ccp->sp, ccp5_irq_handler, ccp->name, ccp);
 	if (ret) {
 		dev_err(dev, "unable to allocate an IRQ\n");
 		goto e_pool;
@@ -855,7 +887,7 @@ static int ccp5_init(struct ccp_device *ccp)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
 e_irq:
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 e_pool:
 	for (i = 0; i < ccp->cmd_q_count; i++)
@@ -901,7 +933,7 @@ static void ccp5_destroy(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 	for (i = 0; i < ccp->cmd_q_count; i++) {
 		cmd_q = &ccp->cmd_q[i];
@@ -924,38 +956,6 @@ static void ccp5_destroy(struct ccp_device *ccp)
 	}
 }
 
-static irqreturn_t ccp5_irq_handler(int irq, void *data)
-{
-	struct device *dev = data;
-	struct ccp_device *ccp = dev_get_drvdata(dev);
-	u32 status;
-	unsigned int i;
-
-	for (i = 0; i < ccp->cmd_q_count; i++) {
-		struct ccp_cmd_queue *cmd_q = &ccp->cmd_q[i];
-
-		status = ioread32(cmd_q->reg_interrupt_status);
-
-		if (status) {
-			cmd_q->int_status = status;
-			cmd_q->q_status = ioread32(cmd_q->reg_status);
-			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
-
-			/* On error, only save the first error value */
-			if ((status & INT_ERROR) && !cmd_q->cmd_error)
-				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
-
-			cmd_q->int_rcvd = 1;
-
-			/* Acknowledge the interrupt and wake the kthread */
-			iowrite32(ALL_INTERRUPTS, cmd_q->reg_interrupt_status);
-			wake_up_interruptible(&cmd_q->int_queue);
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static void ccp5_config(struct ccp_device *ccp)
 {
 	/* Public side */
@@ -1001,14 +1001,12 @@ static const struct ccp_actions ccp5_actions = {
 	.init = ccp5_init,
 	.destroy = ccp5_destroy,
 	.get_free_slots = ccp5_get_free_slots,
-	.irqhandler = ccp5_irq_handler,
 };
 
 const struct ccp_vdata ccpv5a = {
 	.version = CCP_VERSION(5, 0),
 	.setup = ccp5_config,
 	.perform = &ccp5_actions,
-	.bar = 2,
 	.offset = 0x0,
 };
 
@@ -1016,6 +1014,5 @@ const struct ccp_vdata ccpv5b = {
 	.version = CCP_VERSION(5, 0),
 	.setup = ccp5other_config,
 	.perform = &ccp5_actions,
-	.bar = 2,
 	.offset = 0x0,
 };
diff --git a/drivers/crypto/ccp/ccp-dev.c b/drivers/crypto/ccp/ccp-dev.c
index 511ab04..0fa8c4a 100644
--- a/drivers/crypto/ccp/ccp-dev.c
+++ b/drivers/crypto/ccp/ccp-dev.c
@@ -22,19 +22,11 @@
 #include <linux/mutex.h>
 #include <linux/delay.h>
 #include <linux/hw_random.h>
-#include <linux/cpu.h>
-#ifdef CONFIG_X86
-#include <asm/cpu_device_id.h>
-#endif
 #include <linux/ccp.h>
 
+#include "sp-dev.h"
 #include "ccp-dev.h"
 
-MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
-MODULE_LICENSE("GPL");
-MODULE_VERSION("1.0.0");
-MODULE_DESCRIPTION("AMD Cryptographic Coprocessor driver");
-
 struct ccp_tasklet_data {
 	struct completion completion;
 	struct ccp_cmd *cmd;
@@ -110,13 +102,6 @@ static LIST_HEAD(ccp_units);
 static DEFINE_SPINLOCK(ccp_rr_lock);
 static struct ccp_device *ccp_rr;
 
-/* Ever-increasing value to produce unique unit numbers */
-static atomic_t ccp_unit_ordinal;
-static unsigned int ccp_increment_unit_ordinal(void)
-{
-	return atomic_inc_return(&ccp_unit_ordinal);
-}
-
 /**
  * ccp_add_device - add a CCP device to the list
  *
@@ -455,19 +440,17 @@ int ccp_cmd_queue_thread(void *data)
 	return 0;
 }
 
-/**
- * ccp_alloc_struct - allocate and initialize the ccp_device struct
- *
- * @dev: device struct of the CCP
- */
-struct ccp_device *ccp_alloc_struct(struct device *dev)
+static struct ccp_device *ccp_alloc_struct(struct sp_device *sp)
 {
+	struct device *dev = sp->dev;
 	struct ccp_device *ccp;
 
 	ccp = devm_kzalloc(dev, sizeof(*ccp), GFP_KERNEL);
 	if (!ccp)
 		return NULL;
+
 	ccp->dev = dev;
+	ccp->sp = sp;
 
 	INIT_LIST_HEAD(&ccp->cmd);
 	INIT_LIST_HEAD(&ccp->backlog);
@@ -482,9 +465,8 @@ struct ccp_device *ccp_alloc_struct(struct device *dev)
 	init_waitqueue_head(&ccp->sb_queue);
 	init_waitqueue_head(&ccp->suspend_queue);
 
-	ccp->ord = ccp_increment_unit_ordinal();
-	snprintf(ccp->name, MAX_CCP_NAME_LEN, "ccp-%u", ccp->ord);
-	snprintf(ccp->rngname, MAX_CCP_NAME_LEN, "ccp-%u-rng", ccp->ord);
+	snprintf(ccp->name, MAX_CCP_NAME_LEN, "ccp-%u", sp->ord);
+	snprintf(ccp->rngname, MAX_CCP_NAME_LEN, "ccp-%u-rng", sp->ord);
 
 	return ccp;
 }
@@ -536,53 +518,94 @@ bool ccp_queues_suspended(struct ccp_device *ccp)
 }
 #endif
 
-static int __init ccp_mod_init(void)
+int ccp_dev_init(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
+	struct device *dev = sp->dev;
+	struct ccp_device *ccp;
 	int ret;
 
-	ret = ccp_pci_init();
-	if (ret)
-		return ret;
-
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_pci_exit();
-		return -ENODEV;
+	ret = -ENOMEM;
+	ccp = ccp_alloc_struct(sp);
+	if (!ccp)
+		goto e_err;
+	sp->ccp_data = ccp;
+
+	ccp->vdata = (struct ccp_vdata *)sp->dev_data->ccp_vdata;
+	if (!ccp->vdata || !ccp->vdata->version) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
 	}
 
-	return 0;
-#endif
+	ccp->io_regs = sp->io_map + ccp->vdata->offset;
 
-#ifdef CONFIG_ARM64
-	int ret;
+	if (ccp->vdata->setup)
+		ccp->vdata->setup(ccp);
 
-	ret = ccp_platform_init();
+	ret = ccp->vdata->perform->init(ccp);
 	if (ret)
-		return ret;
+		goto e_err;
 
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_platform_exit();
-		return -ENODEV;
-	}
+	dev_notice(dev, "ccp enabled\n");
 
 	return 0;
-#endif
 
-	return -ENODEV;
+e_err:
+	sp->ccp_data = NULL;
+
+	dev_notice(dev, "ccp initialization failed\n");
+
+	return ret;
 }
 
-static void __exit ccp_mod_exit(void)
+void ccp_dev_destroy(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
-	ccp_pci_exit();
-#endif
+	struct ccp_device *ccp = sp->ccp_data;
 
-#ifdef CONFIG_ARM64
-	ccp_platform_exit();
-#endif
+	ccp->vdata->perform->destroy(ccp);
+}
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 1;
+
+	/* Wake all the queue kthreads to prepare for suspend */
+	for (i = 0; i < ccp->cmd_q_count; i++)
+		wake_up_process(ccp->cmd_q[i].kthread);
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	/* Wait for all queue kthreads to say they're done */
+	while (!ccp_queues_suspended(ccp))
+		wait_event_interruptible(ccp->suspend_queue,
+					 ccp_queues_suspended(ccp));
+
+	return 0;
 }
 
-module_init(ccp_mod_init);
-module_exit(ccp_mod_exit);
+int ccp_dev_resume(struct sp_device *sp)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 0;
+
+	/* Wake up all the kthreads */
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		ccp->cmd_q[i].suspended = 0;
+		wake_up_process(ccp->cmd_q[i].kthread);
+	}
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	return 0;
+}
diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 649e561..25a4bfd 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -27,6 +27,8 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 
+#include "sp-dev.h"
+
 #define MAX_CCP_NAME_LEN		16
 #define MAX_DMAPOOL_NAME_LEN		32
 
@@ -35,9 +37,6 @@
 
 #define TRNG_RETRIES			10
 
-#define CACHE_NONE			0x00
-#define CACHE_WB_NO_ALLOC		0xb7
-
 /****** Register Mappings ******/
 #define Q_MASK_REG			0x000
 #define TRNG_OUT_REG			0x00c
@@ -322,18 +321,15 @@ struct ccp_device {
 	struct list_head entry;
 
 	struct ccp_vdata *vdata;
-	unsigned int ord;
 	char name[MAX_CCP_NAME_LEN];
 	char rngname[MAX_CCP_NAME_LEN];
 
 	struct device *dev;
+	struct sp_device *sp;
 
 	/* Bus specific device information
 	 */
 	void *dev_specific;
-	int (*get_irq)(struct ccp_device *ccp);
-	void (*free_irq)(struct ccp_device *ccp);
-	unsigned int irq;
 
 	/* I/O area used for device communication. The register mapping
 	 * starts at an offset into the mapped bar.
@@ -342,7 +338,6 @@ struct ccp_device {
 	 *   them.
 	 */
 	struct mutex req_mutex ____cacheline_aligned;
-	void __iomem *io_map;
 	void __iomem *io_regs;
 
 	/* Master lists that all cmds are queued on. Because there can be
@@ -407,9 +402,6 @@ struct ccp_device {
 	/* Suspend support */
 	unsigned int suspending;
 	wait_queue_head_t suspend_queue;
-
-	/* DMA caching attribute support */
-	unsigned int axcache;
 };
 
 enum ccp_memtype {
@@ -592,18 +584,11 @@ struct ccp5_desc {
 	struct dword7 dw7;
 };
 
-int ccp_pci_init(void);
-void ccp_pci_exit(void);
-
-int ccp_platform_init(void);
-void ccp_platform_exit(void);
-
 void ccp_add_device(struct ccp_device *ccp);
 void ccp_del_device(struct ccp_device *ccp);
 
 extern void ccp_log_error(struct ccp_device *, int);
 
-struct ccp_device *ccp_alloc_struct(struct device *dev);
 bool ccp_queues_suspended(struct ccp_device *ccp);
 int ccp_cmd_queue_thread(void *data);
 int ccp_trng_read(struct hwrng *rng, void *data, size_t max, bool wait);
@@ -629,20 +614,6 @@ struct ccp_actions {
 	unsigned int (*get_free_slots)(struct ccp_cmd_queue *);
 	int (*init)(struct ccp_device *);
 	void (*destroy)(struct ccp_device *);
-	irqreturn_t (*irqhandler)(int, void *);
-};
-
-/* Structure to hold CCP version-specific values */
-struct ccp_vdata {
-	const unsigned int version;
-	void (*setup)(struct ccp_device *);
-	const struct ccp_actions *perform;
-	const unsigned int bar;
-	const unsigned int offset;
 };
 
-extern const struct ccp_vdata ccpv3;
-extern const struct ccp_vdata ccpv5a;
-extern const struct ccp_vdata ccpv5b;
-
 #endif
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
new file mode 100644
index 0000000..e47fb8e
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -0,0 +1,308 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+
+#include "sp-dev.h"
+
+MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.1.0");
+MODULE_DESCRIPTION("AMD Secure Processor driver");
+
+/* List of SPs, read-write access lock, and access functions
+ *
+ * Lock structure: get sp_unit_lock for reading whenever we need to
+ * examine the SP list.
+ */
+static DEFINE_RWLOCK(sp_unit_lock);
+static LIST_HEAD(sp_units);
+
+/* Ever-increasing value to produce unique unit numbers */
+static atomic_t sp_ordinal;
+
+static void sp_add_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_add_tail(&sp->entry, &sp_units);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+static void sp_del_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_del(&sp->entry);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+struct sp_device *sp_get_device(void)
+{
+	struct sp_device *sp = NULL;
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	if (list_empty(&sp_units))
+		goto unlock;
+
+	sp = list_first_entry(&sp_units, struct sp_device, entry);
+
+	/* Rotate to the tail so SPs are handed out round-robin */
+	list_move_tail(&sp->entry, &sp_units);
+unlock:
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+	return sp;
+}
+
+static irqreturn_t sp_irq_handler(int irq, void *data)
+{
+	struct sp_device *sp = data;
+
+	if (sp->psp_irq_handler)
+		sp->psp_irq_handler(irq, sp->psp_irq_data);
+
+	if (sp->ccp_irq_handler)
+		sp->ccp_irq_handler(irq, sp->ccp_irq_data);
+
+	return IRQ_HANDLED;
+}
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Need a common routine to manage all interrupts */
+		sp->psp_irq_data = data;
+		sp->psp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->psp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		ret = request_irq(sp->psp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Need a common routine to manage all interrupts */
+		sp->ccp_irq_data = data;
+		sp->ccp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->ccp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		ret = request_irq(sp->ccp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+void sp_free_psp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Using a common routine to manage all interrupts */
+		if (!sp->ccp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->psp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->psp_irq_handler = NULL;
+		sp->psp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		free_irq(sp->psp_irq, data);
+	}
+}
+
+void sp_free_ccp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Using a common routine to manage all interrupts */
+		if (!sp->psp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->ccp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->ccp_irq_handler = NULL;
+		sp->ccp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		free_irq(sp->ccp_irq, data);
+	}
+}
+
+/**
+ * sp_alloc_struct - allocate and initialize the sp_device struct
+ *
+ * @dev: device struct of the SP
+ */
+struct sp_device *sp_alloc_struct(struct device *dev)
+{
+	struct sp_device *sp;
+
+	sp = devm_kzalloc(dev, sizeof(*sp), GFP_KERNEL);
+	if (!sp)
+		return NULL;
+
+	sp->dev = dev;
+	sp->ord = atomic_inc_return(&sp_ordinal) - 1;
+	snprintf(sp->name, SP_MAX_NAME_LEN, "sp-%u", sp->ord);
+
+	return sp;
+}
+
+int sp_init(struct sp_device *sp)
+{
+	sp_add_device(sp);
+
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_init(sp);
+
+	return 0;
+}
+
+void sp_destroy(struct sp_device *sp)
+{
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_destroy(sp);
+
+	sp_del_device(sp);
+}
+
+int sp_suspend(struct sp_device *sp, pm_message_t state)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_resume(struct sp_device *sp)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+struct sp_device *sp_get_psp_master_device(void)
+{
+	struct sp_device *sp = sp_get_device();
+
+	if (!sp)
+		return NULL;
+
+	if (!sp->psp_data)
+		return NULL;
+
+	return sp->get_master_device();
+}
+
+void sp_set_psp_master(struct sp_device *sp)
+{
+	if (sp->psp_data)
+		sp->set_master_device(sp);
+}
+
+static int __init sp_mod_init(void)
+{
+#ifdef CONFIG_X86
+	int ret;
+
+	ret = sp_pci_init();
+	if (ret)
+		return ret;
+
+	return 0;
+#endif
+
+#ifdef CONFIG_ARM64
+	int ret;
+
+	ret = sp_platform_init();
+	if (ret)
+		return ret;
+
+	return 0;
+#endif
+
+	return -ENODEV;
+}
+
+static void __exit sp_mod_exit(void)
+{
+#ifdef CONFIG_X86
+	sp_pci_exit();
+#endif
+
+#ifdef CONFIG_ARM64
+	sp_platform_exit();
+#endif
+}
+
+module_init(sp_mod_init);
+module_exit(sp_mod_exit);
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
new file mode 100644
index 0000000..9a8a8f8
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -0,0 +1,140 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SP_DEV_H__
+#define __SP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+
+#define SP_MAX_NAME_LEN		32
+
+#define CACHE_NONE			0x00
+#define CACHE_WB_NO_ALLOC		0xb7
+
+/* Structure to hold CCP device data */
+struct ccp_device;
+struct ccp_vdata {
+	const unsigned int version;
+	void (*setup)(struct ccp_device *);
+	const struct ccp_actions *perform;
+	const unsigned int offset;
+};
+
+/* Structure to hold SP device data */
+struct sp_dev_data {
+	const unsigned int bar;
+
+	const struct ccp_vdata *ccp_vdata;
+	const void *psp_vdata;
+};
+
+struct sp_device {
+	struct list_head entry;
+
+	struct device *dev;
+
+	struct sp_dev_data *dev_data;
+	unsigned int ord;
+	char name[SP_MAX_NAME_LEN];
+
+	/* Bus specific device information */
+	void *dev_specific;
+
+	/* I/O area used for device communication. */
+	void __iomem *io_map;
+
+	/* DMA caching attribute support */
+	unsigned int axcache;
+
+	bool irq_registered;
+
+	/* get and set master device */
+	struct sp_device *(*get_master_device)(void);
+	void (*set_master_device)(struct sp_device *);
+
+	unsigned int psp_irq;
+	irq_handler_t psp_irq_handler;
+	void *psp_irq_data;
+
+	unsigned int ccp_irq;
+	irq_handler_t ccp_irq_handler;
+	void *ccp_irq_data;
+
+	void *psp_data;
+	void *ccp_data;
+};
+
+int sp_pci_init(void);
+void sp_pci_exit(void);
+
+int sp_platform_init(void);
+void sp_platform_exit(void);
+
+struct sp_device *sp_alloc_struct(struct device *dev);
+
+int sp_init(struct sp_device *sp);
+void sp_destroy(struct sp_device *sp);
+struct sp_device *sp_get_device(void);
+
+int sp_suspend(struct sp_device *sp, pm_message_t state);
+int sp_resume(struct sp_device *sp);
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_psp_irq(struct sp_device *sp, void *data);
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_ccp_irq(struct sp_device *sp, void *data);
+
+void sp_set_psp_master(struct sp_device *sp);
+struct sp_device *sp_get_psp_master_device(void);
+
+#ifdef CONFIG_CRYPTO_DEV_CCP
+
+int ccp_dev_init(struct sp_device *sp);
+void ccp_dev_destroy(struct sp_device *sp);
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int ccp_dev_resume(struct sp_device *sp);
+
+#else	/* !CONFIG_CRYPTO_DEV_CCP */
+
+static inline int ccp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void ccp_dev_destroy(struct sp_device *sp) { }
+
+static inline int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int ccp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_CCP */
+
+#endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
new file mode 100644
index 0000000..0960e2d
--- /dev/null
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/dma-mapping.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+
+#include "sp-dev.h"
+
+#define MSIX_VECTORS			2
+
+struct sp_pci {
+	int msix_count;
+	struct msix_entry msix_entry[MSIX_VECTORS];
+};
+
+static struct sp_device *sp_dev_master;
+
+static int sp_get_msix_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int v, ret;
+
+	for (v = 0; v < ARRAY_SIZE(sp_pci->msix_entry); v++)
+		sp_pci->msix_entry[v].entry = v;
+
+	ret = pci_enable_msix_range(pdev, sp_pci->msix_entry, 1, v);
+	if (ret < 0)
+		return ret;
+
+	sp_pci->msix_count = ret;
+
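+	/* Vector 0 serves the PSP; the CCP uses vector 1 when available */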
+	sp->psp_irq = sp_pci->msix_entry[0].vector;
+	sp->ccp_irq = (sp_pci->msix_count > 1) ? sp_pci->msix_entry[1].vector
+					       : sp_pci->msix_entry[0].vector;
+
+	return 0;
+}
+
+static int sp_get_msi_irq(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int ret;
+
+	ret = pci_enable_msi(pdev);
+	if (ret)
+		return ret;
+
+	sp->psp_irq = pdev->irq;
+	sp->ccp_irq = pdev->irq;
+
+	return 0;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	int ret;
+
+	ret = sp_get_msix_irqs(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI-X vectors, try MSI */
+	dev_notice(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
+	ret = sp_get_msi_irq(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI interrupt */
+	dev_notice(dev, "could not enable MSI (%d)\n", ret);
+
+	return ret;
+}
+
+static void sp_free_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (sp_pci->msix_count)
+		pci_disable_msix(pdev);
+	else if (sp->psp_irq)
+		pci_disable_msi(pdev);
+
+	sp->psp_irq = 0;
+	sp->ccp_irq = 0;
+}
+
+static bool sp_pci_is_master(struct sp_device *sp)
+{
+	struct device *dev_cur, *dev_new;
+	struct pci_dev *pdev_cur, *pdev_new;
+
+	dev_new = sp->dev;
+	dev_cur = sp_dev_master->dev;
+
+	pdev_new = to_pci_dev(dev_new);
+	pdev_cur = to_pci_dev(dev_cur);
+
+	/* Compare hierarchically: bus number, then slot, then function */
+	if (pdev_new->bus->number != pdev_cur->bus->number)
+		return pdev_new->bus->number < pdev_cur->bus->number;
+
+	if (PCI_SLOT(pdev_new->devfn) != PCI_SLOT(pdev_cur->devfn))
+		return PCI_SLOT(pdev_new->devfn) < PCI_SLOT(pdev_cur->devfn);
+
+	return PCI_FUNC(pdev_new->devfn) < PCI_FUNC(pdev_cur->devfn);
+}
+
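+/* Adopt @sp as master if none exists yet or it precedes the current master */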
+static void sp_pci_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master) {
+		sp_dev_master = sp;
+		return;
+	}
+
+	if (sp_pci_is_master(sp))
+		sp_dev_master = sp;
+}
+
+static struct sp_device *sp_pci_get_master(void)
+{
+	return sp_dev_master;
+}
+
+static int sp_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct sp_device *sp;
+	struct sp_pci *sp_pci;
+	struct device *dev = &pdev->dev;
+	void __iomem * const *iomap_table;
+	int bar_mask;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_pci = devm_kzalloc(dev, sizeof(*sp_pci), GFP_KERNEL);
+	if (!sp_pci)
+		goto e_err;
+	sp->dev_specific = sp_pci;
+
+	sp->dev_data = (struct sp_dev_data *)id->driver_data;
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		dev_err(dev, "pcim_enable_device failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
+	ret = pcim_iomap_regions(pdev, bar_mask, "sp");
+	if (ret) {
+		dev_err(dev, "pcim_iomap_regions failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	iomap_table = pcim_iomap_table(pdev);
+	if (!iomap_table) {
+		dev_err(dev, "pcim_iomap_table failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	sp->io_map = iomap_table[sp->dev_data->bar];
+	if (!sp->io_map) {
+		dev_err(dev, "ioremap failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	pci_set_master(pdev);
+
+	sp->set_master_device = sp_pci_set_master;
+	sp->get_master_device = sp_pci_get_master;
+
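+	/* Prefer 48-bit DMA addressing, falling back to 32-bit if needed */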
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n",
+				ret);
+			goto e_err;
+		}
+	}
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static void sp_pci_remove(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return;
+
+	sp_destroy(sp);
+
+	sp_free_irqs(sp);
+
+	dev_notice(dev, "disabled\n");
+}
+
+#ifdef CONFIG_PM
+static int sp_pci_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_pci_resume(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern const struct ccp_vdata ccpv3_pci;
+extern const struct ccp_vdata ccpv5a;
+extern const struct ccp_vdata ccpv5b;
+
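+/* Device data referenced by the driver_data entries in sp_pci_table */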
+static const struct sp_dev_data dev_data[] = {
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv3_pci,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5a,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5b,
+#endif
+	},
+};
+
+static const struct pci_device_id sp_pci_table[] = {
+	{ PCI_VDEVICE(AMD, 0x1537), (kernel_ulong_t)&dev_data[0] },
+	{ PCI_VDEVICE(AMD, 0x1456), (kernel_ulong_t)&dev_data[1] },
+	{ PCI_VDEVICE(AMD, 0x1468), (kernel_ulong_t)&dev_data[2] },
+	/* Last entry must be zero */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, sp_pci_table);
+
+static struct pci_driver sp_pci_driver = {
+	.name = "sp",
+	.id_table = sp_pci_table,
+	.probe = sp_pci_probe,
+	.remove = sp_pci_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_pci_suspend,
+	.resume = sp_pci_resume,
+#endif
+};
+
+int sp_pci_init(void)
+{
+	return pci_register_driver(&sp_pci_driver);
+}
+
+void sp_pci_exit(void)
+{
+	pci_unregister_driver(&sp_pci_driver);
+}
diff --git a/drivers/crypto/ccp/sp-platform.c b/drivers/crypto/ccp/sp-platform.c
new file mode 100644
index 0000000..a918238
--- /dev/null
+++ b/drivers/crypto/ccp/sp-platform.c
@@ -0,0 +1,268 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/platform_device.h>
+#include <linux/ioport.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/acpi.h>
+
+#include "sp-dev.h"
+
+struct sp_platform {
+	int coherent;
+	unsigned int irq_count;
+};
+
+static struct sp_device *sp_dev_master;
+static struct sp_device *sp_platform_get_master(void);
+static const struct acpi_device_id sp_acpi_match[];
+static const struct of_device_id sp_of_match[];
+
+static struct sp_dev_data *sp_get_of_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_OF
+	const struct of_device_id *match;
+
+	match = of_match_node(sp_of_match, pdev->dev.of_node);
+	if (match && match->data)
+		return (struct sp_dev_data *)match->data;
+#endif
+
+	return NULL;
+}
+
+static struct sp_dev_data *sp_get_acpi_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_ACPI
+	const struct acpi_device_id *match;
+
+	match = acpi_match_device(sp_acpi_match, &pdev->dev);
+	if (match && match->driver_data)
+		return (struct sp_dev_data *)match->driver_data;
+#endif
+
+	return NULL;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct sp_platform *sp_platform = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	unsigned int i, count;
+	int ret;
+
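+	/* Count IRQ resources; with a single IRQ the PSP and CCP share it */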
+	for (i = 0, count = 0; i < pdev->num_resources; i++) {
+		struct resource *res = &pdev->resource[i];
+
+		if (resource_type(res) == IORESOURCE_IRQ)
+			count++;
+	}
+
+	sp_platform->irq_count = count;
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		return ret;
+
+	sp->psp_irq = ret;
+	if (count == 1) {
+		sp->ccp_irq = ret;
+	} else {
+		ret = platform_get_irq(pdev, 1);
+		if (ret < 0)
+			return ret;
+
+		sp->ccp_irq = ret;
+	}
+
+	return 0;
+}
+
+static void sp_platform_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master)
+		sp_dev_master = sp;
+}
+
+static int sp_platform_probe(struct platform_device *pdev)
+{
+	struct sp_device *sp;
+	struct sp_platform *sp_platform;
+	struct device *dev = &pdev->dev;
+	enum dev_dma_attr attr;
+	struct resource *ior;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_platform = devm_kzalloc(dev, sizeof(*sp_platform), GFP_KERNEL);
+	if (!sp_platform)
+		goto e_err;
+
+	sp->dev_specific = sp_platform;
+	sp->dev_data = pdev->dev.of_node ? sp_get_of_dev_data(pdev)
+					 : sp_get_acpi_dev_data(pdev);
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ior = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	sp->io_map = devm_ioremap_resource(dev, ior);
+	if (IS_ERR(sp->io_map)) {
+		ret = PTR_ERR(sp->io_map);
+		goto e_err;
+	}
+
+	attr = device_get_dma_attr(dev);
+	if (attr == DEV_DMA_NOT_SUPPORTED) {
+		ret = -ENODEV;
+		dev_err(dev, "DMA is not supported\n");
+		goto e_err;
+	}
+
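+	/* Pick AXI cache attributes to match the device's DMA coherency */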
+	sp_platform->coherent = (attr == DEV_DMA_COHERENT);
+	if (sp_platform->coherent)
+		sp->axcache = CACHE_WB_NO_ALLOC;
+	else
+		sp->axcache = CACHE_NONE;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	/* Provide the master-device accessors, as sp-pci.c does */
+	sp->set_master_device = sp_platform_set_master;
+	sp->get_master_device = sp_platform_get_master;
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static int sp_platform_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return 0;
+
+	sp_destroy(sp);
+
+	dev_notice(dev, "disabled\n");
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int sp_platform_suspend(struct platform_device *pdev,
+			       pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_platform_resume(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern const struct ccp_vdata ccpv3_platform;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv3_platform,
+#endif
+	},
+};
+
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id sp_acpi_match[] = {
+	{ "AMDI0C00", (kernel_ulong_t)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(acpi, sp_acpi_match);
+#endif
+
+#ifdef CONFIG_OF
+static const struct of_device_id sp_of_match[] = {
+	{ .compatible = "amd,ccp-seattle-v1a",
+	  .data = (const void *)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, sp_of_match);
+#endif
+
+static struct platform_driver sp_platform_driver = {
+	.driver = {
+		.name = "sp",
+#ifdef CONFIG_ACPI
+		.acpi_match_table = sp_acpi_match,
+#endif
+#ifdef CONFIG_OF
+		.of_match_table = sp_of_match,
+#endif
+	},
+	.probe = sp_platform_probe,
+	.remove = sp_platform_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_platform_suspend,
+	.resume = sp_platform_resume,
+#endif
+};
+
+static struct sp_device *sp_platform_get_master(void)
+{
+	return sp_dev_master;
+}
+
+int sp_platform_init(void)
+{
+	return platform_driver_register(&sp_platform_driver);
+}
+
+void sp_platform_exit(void)
+{
+	platform_driver_unregister(&sp_platform_driver);
+}
diff --git a/include/linux/ccp.h b/include/linux/ccp.h
index c71dd8f..1ea14e6 100644
--- a/include/linux/ccp.h
+++ b/include/linux/ccp.h
@@ -24,8 +24,7 @@
 struct ccp_device;
 struct ccp_cmd;
 
-#if defined(CONFIG_CRYPTO_DEV_CCP_DD) || \
-	defined(CONFIG_CRYPTO_DEV_CCP_DD_MODULE)
+#if defined(CONFIG_CRYPTO_DEV_CCP)
 
 /**
  * ccp_present - check if a CCP device is present

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:16   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The CCP device is part of the AMD Secure Processor. In order to expand the
usage of the AMD Secure Processor, create a framework that allows functional
components of the AMD Secure Processor to be initialized and handled
appropriately.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/Kconfig           |   10 +
 drivers/crypto/ccp/Kconfig       |   43 +++--
 drivers/crypto/ccp/Makefile      |    8 -
 drivers/crypto/ccp/ccp-dev-v3.c  |   86 +++++-----
 drivers/crypto/ccp/ccp-dev-v5.c  |   73 ++++-----
 drivers/crypto/ccp/ccp-dev.c     |  137 +++++++++-------
 drivers/crypto/ccp/ccp-dev.h     |   35 ----
 drivers/crypto/ccp/sp-dev.c      |  308 ++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/sp-dev.h      |  140 ++++++++++++++++
 drivers/crypto/ccp/sp-pci.c      |  324 ++++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/sp-platform.c |  268 +++++++++++++++++++++++++++++++
 include/linux/ccp.h              |    3 
 12 files changed, 1240 insertions(+), 195 deletions(-)
 create mode 100644 drivers/crypto/ccp/sp-dev.c
 create mode 100644 drivers/crypto/ccp/sp-dev.h
 create mode 100644 drivers/crypto/ccp/sp-pci.c
 create mode 100644 drivers/crypto/ccp/sp-platform.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 7956478..d31b469 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -456,14 +456,14 @@ config CRYPTO_DEV_ATMEL_SHA
 	  To compile this driver as a module, choose M here: the module
 	  will be called atmel-sha.
 
-config CRYPTO_DEV_CCP
-	bool "Support for AMD Cryptographic Coprocessor"
+config CRYPTO_DEV_SP
+	bool "Support for AMD Secure Processor"
 	depends on ((X86 && PCI) || (ARM64 && (OF_ADDRESS || ACPI))) && HAS_IOMEM
 	help
-	  The AMD Cryptographic Coprocessor provides hardware offload support
-	  for encryption, hashing and related operations.
+	  The AMD Secure Processor provides hardware offload support for memory
+	  encryption in virtualization and cryptographic hashing and related operations.
 
-if CRYPTO_DEV_CCP
+if CRYPTO_DEV_SP
 	source "drivers/crypto/ccp/Kconfig"
 endif
 
diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 2238f77..bc08f03 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -1,26 +1,37 @@
-config CRYPTO_DEV_CCP_DD
-	tristate "Cryptographic Coprocessor device driver"
-	depends on CRYPTO_DEV_CCP
-	default m
-	select HW_RANDOM
-	select DMA_ENGINE
-	select DMADEVICES
-	select CRYPTO_SHA1
-	select CRYPTO_SHA256
-	help
-	  Provides the interface to use the AMD Cryptographic Coprocessor
-	  which can be used to offload encryption operations such as SHA,
-	  AES and more. If you choose 'M' here, this module will be called
-	  ccp.
-
 config CRYPTO_DEV_CCP_CRYPTO
 	tristate "Encryption and hashing offload support"
-	depends on CRYPTO_DEV_CCP_DD
+	depends on CRYPTO_DEV_SP_DD
 	default m
 	select CRYPTO_HASH
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AUTHENC
+	select CRYPTO_DEV_CCP
 	help
 	  Support for using the cryptographic API with the AMD Cryptographic
 	  Coprocessor. This module supports offload of SHA and AES algorithms.
 	  If you choose 'M' here, this module will be called ccp_crypto.
+
+config CRYPTO_DEV_SP_DD
+	tristate "Secure Processor device driver"
+	depends on CRYPTO_DEV_SP
+	default m
+	help
+	  Provides the interface to use the AMD Secure Processor. The
+	  AMD Secure Processor support the Platform Security Processor (PSP)
+	  and Cryptographic Coprocessor (CCP). If you choose 'M' here, this
+	  module will be called ccp.
+
+if CRYPTO_DEV_SP_DD
+config CRYPTO_DEV_CCP
+	bool "Cryptographic Coprocessor interface"
+	default y
+	select HW_RANDOM
+	select DMA_ENGINE
+	select DMADEVICES
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	help
+	  Provides the interface to use the AMD Cryptographic Coprocessor
+	  which can be used to offload encryption operations such as SHA,
+	  AES and more.
+endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 346ceb8..8127e18 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -1,11 +1,11 @@
-obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
-ccp-objs := ccp-dev.o \
+obj-$(CONFIG_CRYPTO_DEV_SP_DD) += ccp.o
+ccp-objs := sp-dev.o sp-platform.o
+ccp-$(CONFIG_PCI) += sp-pci.o
+ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-ops.o \
 	    ccp-dev-v3.o \
 	    ccp-dev-v5.o \
-	    ccp-platform.o \
 	    ccp-dmaengine.o
-ccp-$(CONFIG_PCI) += ccp-pci.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/ccp-dev-v3.c b/drivers/crypto/ccp/ccp-dev-v3.c
index 7bc0998..5c50d14 100644
--- a/drivers/crypto/ccp/ccp-dev-v3.c
+++ b/drivers/crypto/ccp/ccp-dev-v3.c
@@ -315,6 +315,39 @@ static int ccp_perform_ecc(struct ccp_op *op)
 	return ccp_do_cmd(op, cr, ARRAY_SIZE(cr));
 }
 
+static irqreturn_t ccp_irq_handler(int irq, void *data)
+{
+	struct ccp_device *ccp = data;
+	struct ccp_cmd_queue *cmd_q;
+	u32 q_int, status;
+	unsigned int i;
+
+	status = ioread32(ccp->io_regs + IRQ_STATUS_REG);
+
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		cmd_q = &ccp->cmd_q[i];
+
+		q_int = status & (cmd_q->int_ok | cmd_q->int_err);
+		if (q_int) {
+			cmd_q->int_status = status;
+			cmd_q->q_status = ioread32(cmd_q->reg_status);
+			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+			/* On error, only save the first error value */
+			if ((q_int & cmd_q->int_err) && !cmd_q->cmd_error)
+				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+			cmd_q->int_rcvd = 1;
+
+			/* Acknowledge the interrupt and wake the kthread */
+			iowrite32(q_int, ccp->io_regs + IRQ_STATUS_REG);
+			wake_up_interruptible(&cmd_q->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int ccp_init(struct ccp_device *ccp)
 {
 	struct device *dev = ccp->dev;
@@ -374,7 +407,7 @@ static int ccp_init(struct ccp_device *ccp)
 
 #ifdef CONFIG_ARM64
 		/* For arm64 set the recommended queue cache settings */
-		iowrite32(ccp->axcache, ccp->io_regs + CMD_Q_CACHE_BASE +
+		iowrite32(ccp->sp->axcache, ccp->io_regs + CMD_Q_CACHE_BASE +
 			  (CMD_Q_CACHE_INC * i));
 #endif
 
@@ -398,7 +431,7 @@ static int ccp_init(struct ccp_device *ccp)
 	iowrite32(qim, ccp->io_regs + IRQ_STATUS_REG);
 
 	/* Request an irq */
-	ret = ccp->get_irq(ccp);
+	ret = sp_request_ccp_irq(ccp->sp, ccp_irq_handler, ccp->name, ccp);
 	if (ret) {
 		dev_err(dev, "unable to allocate an IRQ\n");
 		goto e_pool;
@@ -450,7 +483,7 @@ static int ccp_init(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 e_pool:
 	for (i = 0; i < ccp->cmd_q_count; i++)
@@ -496,7 +529,7 @@ static void ccp_destroy(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 	for (i = 0; i < ccp->cmd_q_count; i++)
 		dma_pool_destroy(ccp->cmd_q[i].dma_pool);
@@ -516,40 +549,6 @@ static void ccp_destroy(struct ccp_device *ccp)
 	}
 }
 
-static irqreturn_t ccp_irq_handler(int irq, void *data)
-{
-	struct device *dev = data;
-	struct ccp_device *ccp = dev_get_drvdata(dev);
-	struct ccp_cmd_queue *cmd_q;
-	u32 q_int, status;
-	unsigned int i;
-
-	status = ioread32(ccp->io_regs + IRQ_STATUS_REG);
-
-	for (i = 0; i < ccp->cmd_q_count; i++) {
-		cmd_q = &ccp->cmd_q[i];
-
-		q_int = status & (cmd_q->int_ok | cmd_q->int_err);
-		if (q_int) {
-			cmd_q->int_status = status;
-			cmd_q->q_status = ioread32(cmd_q->reg_status);
-			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
-
-			/* On error, only save the first error value */
-			if ((q_int & cmd_q->int_err) && !cmd_q->cmd_error)
-				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
-
-			cmd_q->int_rcvd = 1;
-
-			/* Acknowledge the interrupt and wake the kthread */
-			iowrite32(q_int, ccp->io_regs + IRQ_STATUS_REG);
-			wake_up_interruptible(&cmd_q->int_queue);
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static const struct ccp_actions ccp3_actions = {
 	.aes = ccp_perform_aes,
 	.xts_aes = ccp_perform_xts_aes,
@@ -562,13 +561,18 @@ static const struct ccp_actions ccp3_actions = {
 	.init = ccp_init,
 	.destroy = ccp_destroy,
 	.get_free_slots = ccp_get_free_slots,
-	.irqhandler = ccp_irq_handler,
 };
 
-const struct ccp_vdata ccpv3 = {
+const struct ccp_vdata ccpv3_platform = {
+	.version = CCP_VERSION(3, 0),
+	.setup = NULL,
+	.perform = &ccp3_actions,
+	.offset = 0,
+};
+
+const struct ccp_vdata ccpv3_pci = {
 	.version = CCP_VERSION(3, 0),
 	.setup = NULL,
 	.perform = &ccp3_actions,
-	.bar = 2,
 	.offset = 0x20000,
 };
diff --git a/drivers/crypto/ccp/ccp-dev-v5.c b/drivers/crypto/ccp/ccp-dev-v5.c
index 612898b..dd6335b 100644
--- a/drivers/crypto/ccp/ccp-dev-v5.c
+++ b/drivers/crypto/ccp/ccp-dev-v5.c
@@ -651,6 +651,38 @@ static int ccp_assign_lsbs(struct ccp_device *ccp)
 	return rc;
 }
 
+static irqreturn_t ccp5_irq_handler(int irq, void *data)
+{
+	struct device *dev = data;
+	struct ccp_device *ccp = dev_get_drvdata(dev);
+	u32 status;
+	unsigned int i;
+
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		struct ccp_cmd_queue *cmd_q = &ccp->cmd_q[i];
+
+		status = ioread32(cmd_q->reg_interrupt_status);
+
+		if (status) {
+			cmd_q->int_status = status;
+			cmd_q->q_status = ioread32(cmd_q->reg_status);
+			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+			/* On error, only save the first error value */
+			if ((status & INT_ERROR) && !cmd_q->cmd_error)
+				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+			cmd_q->int_rcvd = 1;
+
+			/* Acknowledge the interrupt and wake the kthread */
+			iowrite32(ALL_INTERRUPTS, cmd_q->reg_interrupt_status);
+			wake_up_interruptible(&cmd_q->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int ccp5_init(struct ccp_device *ccp)
 {
 	struct device *dev = ccp->dev;
@@ -752,7 +784,7 @@ static int ccp5_init(struct ccp_device *ccp)
 
 	dev_dbg(dev, "Requesting an IRQ...\n");
 	/* Request an irq */
-	ret = ccp->get_irq(ccp);
+	ret = sp_request_ccp_irq(ccp->sp, ccp5_irq_handler, ccp->name, ccp);
 	if (ret) {
 		dev_err(dev, "unable to allocate an IRQ\n");
 		goto e_pool;
@@ -855,7 +887,7 @@ static int ccp5_init(struct ccp_device *ccp)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
 e_irq:
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 e_pool:
 	for (i = 0; i < ccp->cmd_q_count; i++)
@@ -901,7 +933,7 @@ static void ccp5_destroy(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 	for (i = 0; i < ccp->cmd_q_count; i++) {
 		cmd_q = &ccp->cmd_q[i];
@@ -924,38 +956,6 @@ static void ccp5_destroy(struct ccp_device *ccp)
 	}
 }
 
-static irqreturn_t ccp5_irq_handler(int irq, void *data)
-{
-	struct device *dev = data;
-	struct ccp_device *ccp = dev_get_drvdata(dev);
-	u32 status;
-	unsigned int i;
-
-	for (i = 0; i < ccp->cmd_q_count; i++) {
-		struct ccp_cmd_queue *cmd_q = &ccp->cmd_q[i];
-
-		status = ioread32(cmd_q->reg_interrupt_status);
-
-		if (status) {
-			cmd_q->int_status = status;
-			cmd_q->q_status = ioread32(cmd_q->reg_status);
-			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
-
-			/* On error, only save the first error value */
-			if ((status & INT_ERROR) && !cmd_q->cmd_error)
-				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
-
-			cmd_q->int_rcvd = 1;
-
-			/* Acknowledge the interrupt and wake the kthread */
-			iowrite32(ALL_INTERRUPTS, cmd_q->reg_interrupt_status);
-			wake_up_interruptible(&cmd_q->int_queue);
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static void ccp5_config(struct ccp_device *ccp)
 {
 	/* Public side */
@@ -1001,14 +1001,12 @@ static const struct ccp_actions ccp5_actions = {
 	.init = ccp5_init,
 	.destroy = ccp5_destroy,
 	.get_free_slots = ccp5_get_free_slots,
-	.irqhandler = ccp5_irq_handler,
 };
 
 const struct ccp_vdata ccpv5a = {
 	.version = CCP_VERSION(5, 0),
 	.setup = ccp5_config,
 	.perform = &ccp5_actions,
-	.bar = 2,
 	.offset = 0x0,
 };
 
@@ -1016,6 +1014,5 @@ const struct ccp_vdata ccpv5b = {
 	.version = CCP_VERSION(5, 0),
 	.setup = ccp5other_config,
 	.perform = &ccp5_actions,
-	.bar = 2,
 	.offset = 0x0,
 };
diff --git a/drivers/crypto/ccp/ccp-dev.c b/drivers/crypto/ccp/ccp-dev.c
index 511ab04..0fa8c4a 100644
--- a/drivers/crypto/ccp/ccp-dev.c
+++ b/drivers/crypto/ccp/ccp-dev.c
@@ -22,19 +22,11 @@
 #include <linux/mutex.h>
 #include <linux/delay.h>
 #include <linux/hw_random.h>
-#include <linux/cpu.h>
-#ifdef CONFIG_X86
-#include <asm/cpu_device_id.h>
-#endif
 #include <linux/ccp.h>
 
+#include "sp-dev.h"
 #include "ccp-dev.h"
 
-MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
-MODULE_LICENSE("GPL");
-MODULE_VERSION("1.0.0");
-MODULE_DESCRIPTION("AMD Cryptographic Coprocessor driver");
-
 struct ccp_tasklet_data {
 	struct completion completion;
 	struct ccp_cmd *cmd;
@@ -110,13 +102,6 @@ static LIST_HEAD(ccp_units);
 static DEFINE_SPINLOCK(ccp_rr_lock);
 static struct ccp_device *ccp_rr;
 
-/* Ever-increasing value to produce unique unit numbers */
-static atomic_t ccp_unit_ordinal;
-static unsigned int ccp_increment_unit_ordinal(void)
-{
-	return atomic_inc_return(&ccp_unit_ordinal);
-}
-
 /**
  * ccp_add_device - add a CCP device to the list
  *
@@ -455,19 +440,17 @@ int ccp_cmd_queue_thread(void *data)
 	return 0;
 }
 
-/**
- * ccp_alloc_struct - allocate and initialize the ccp_device struct
- *
- * @dev: device struct of the CCP
- */
-struct ccp_device *ccp_alloc_struct(struct device *dev)
+static struct ccp_device *ccp_alloc_struct(struct sp_device *sp)
 {
+	struct device *dev = sp->dev;
 	struct ccp_device *ccp;
 
 	ccp = devm_kzalloc(dev, sizeof(*ccp), GFP_KERNEL);
 	if (!ccp)
 		return NULL;
+
 	ccp->dev = dev;
+	ccp->sp = sp;
 
 	INIT_LIST_HEAD(&ccp->cmd);
 	INIT_LIST_HEAD(&ccp->backlog);
@@ -482,9 +465,8 @@ struct ccp_device *ccp_alloc_struct(struct device *dev)
 	init_waitqueue_head(&ccp->sb_queue);
 	init_waitqueue_head(&ccp->suspend_queue);
 
-	ccp->ord = ccp_increment_unit_ordinal();
-	snprintf(ccp->name, MAX_CCP_NAME_LEN, "ccp-%u", ccp->ord);
-	snprintf(ccp->rngname, MAX_CCP_NAME_LEN, "ccp-%u-rng", ccp->ord);
+	snprintf(ccp->name, MAX_CCP_NAME_LEN, "ccp-%u", sp->ord);
+	snprintf(ccp->rngname, MAX_CCP_NAME_LEN, "ccp-%u-rng", sp->ord);
 
 	return ccp;
 }
@@ -536,53 +518,94 @@ bool ccp_queues_suspended(struct ccp_device *ccp)
 }
 #endif
 
-static int __init ccp_mod_init(void)
+int ccp_dev_init(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
+	struct device *dev = sp->dev;
+	struct ccp_device *ccp;
 	int ret;
 
-	ret = ccp_pci_init();
-	if (ret)
-		return ret;
-
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_pci_exit();
-		return -ENODEV;
+	ret = -ENOMEM;
+	ccp = ccp_alloc_struct(sp);
+	if (!ccp)
+		goto e_err;
+	sp->ccp_data = ccp;
+
+	ccp->vdata = (struct ccp_vdata *)sp->dev_data->ccp_vdata;
+	if (!ccp->vdata || !ccp->vdata->version) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
 	}
 
-	return 0;
-#endif
+	ccp->io_regs = sp->io_map + ccp->vdata->offset;
 
-#ifdef CONFIG_ARM64
-	int ret;
+	if (ccp->vdata->setup)
+		ccp->vdata->setup(ccp);
 
-	ret = ccp_platform_init();
+	ret = ccp->vdata->perform->init(ccp);
 	if (ret)
-		return ret;
+		goto e_err;
 
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_platform_exit();
-		return -ENODEV;
-	}
+	dev_notice(dev, "ccp enabled\n");
 
 	return 0;
-#endif
 
-	return -ENODEV;
+e_err:
+	sp->ccp_data = NULL;
+
+	dev_notice(dev, "ccp initialization failed\n");
+
+	return ret;
 }
 
-static void __exit ccp_mod_exit(void)
+void ccp_dev_destroy(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
-	ccp_pci_exit();
-#endif
+	struct ccp_device *ccp = sp->ccp_data;
 
-#ifdef CONFIG_ARM64
-	ccp_platform_exit();
-#endif
+	ccp->vdata->perform->destroy(ccp);
+}
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 1;
+
+	/* Wake all the queue kthreads to prepare for suspend */
+	for (i = 0; i < ccp->cmd_q_count; i++)
+		wake_up_process(ccp->cmd_q[i].kthread);
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	/* Wait for all queue kthreads to say they're done */
+	while (!ccp_queues_suspended(ccp))
+		wait_event_interruptible(ccp->suspend_queue,
+					 ccp_queues_suspended(ccp));
+
+	return 0;
 }
 
-module_init(ccp_mod_init);
-module_exit(ccp_mod_exit);
+int ccp_dev_resume(struct sp_device *sp)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 0;
+
+	/* Wake up all the kthreads */
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		ccp->cmd_q[i].suspended = 0;
+		wake_up_process(ccp->cmd_q[i].kthread);
+	}
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	return 0;
+}
diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 649e561..25a4bfd 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -27,6 +27,8 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 
+#include "sp-dev.h"
+
 #define MAX_CCP_NAME_LEN		16
 #define MAX_DMAPOOL_NAME_LEN		32
 
@@ -35,9 +37,6 @@
 
 #define TRNG_RETRIES			10
 
-#define CACHE_NONE			0x00
-#define CACHE_WB_NO_ALLOC		0xb7
-
 /****** Register Mappings ******/
 #define Q_MASK_REG			0x000
 #define TRNG_OUT_REG			0x00c
@@ -322,18 +321,15 @@ struct ccp_device {
 	struct list_head entry;
 
 	struct ccp_vdata *vdata;
-	unsigned int ord;
 	char name[MAX_CCP_NAME_LEN];
 	char rngname[MAX_CCP_NAME_LEN];
 
 	struct device *dev;
+	struct sp_device *sp;
 
 	/* Bus specific device information
 	 */
 	void *dev_specific;
-	int (*get_irq)(struct ccp_device *ccp);
-	void (*free_irq)(struct ccp_device *ccp);
-	unsigned int irq;
 
 	/* I/O area used for device communication. The register mapping
 	 * starts at an offset into the mapped bar.
@@ -342,7 +338,6 @@ struct ccp_device {
 	 *   them.
 	 */
 	struct mutex req_mutex ____cacheline_aligned;
-	void __iomem *io_map;
 	void __iomem *io_regs;
 
 	/* Master lists that all cmds are queued on. Because there can be
@@ -407,9 +402,6 @@ struct ccp_device {
 	/* Suspend support */
 	unsigned int suspending;
 	wait_queue_head_t suspend_queue;
-
-	/* DMA caching attribute support */
-	unsigned int axcache;
 };
 
 enum ccp_memtype {
@@ -592,18 +584,11 @@ struct ccp5_desc {
 	struct dword7 dw7;
 };
 
-int ccp_pci_init(void);
-void ccp_pci_exit(void);
-
-int ccp_platform_init(void);
-void ccp_platform_exit(void);
-
 void ccp_add_device(struct ccp_device *ccp);
 void ccp_del_device(struct ccp_device *ccp);
 
 extern void ccp_log_error(struct ccp_device *, int);
 
-struct ccp_device *ccp_alloc_struct(struct device *dev);
 bool ccp_queues_suspended(struct ccp_device *ccp);
 int ccp_cmd_queue_thread(void *data);
 int ccp_trng_read(struct hwrng *rng, void *data, size_t max, bool wait);
@@ -629,20 +614,6 @@ struct ccp_actions {
 	unsigned int (*get_free_slots)(struct ccp_cmd_queue *);
 	int (*init)(struct ccp_device *);
 	void (*destroy)(struct ccp_device *);
-	irqreturn_t (*irqhandler)(int, void *);
-};
-
-/* Structure to hold CCP version-specific values */
-struct ccp_vdata {
-	const unsigned int version;
-	void (*setup)(struct ccp_device *);
-	const struct ccp_actions *perform;
-	const unsigned int bar;
-	const unsigned int offset;
 };
 
-extern const struct ccp_vdata ccpv3;
-extern const struct ccp_vdata ccpv5a;
-extern const struct ccp_vdata ccpv5b;
-
 #endif
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
new file mode 100644
index 0000000..e47fb8e
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -0,0 +1,308 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+
+#include "sp-dev.h"
+
+MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.1.0");
+MODULE_DESCRIPTION("AMD Secure Processor driver");
+
+/* List of SPs, SP count, read-write access lock, and access functions
+ *
+ * Lock structure: get sp_unit_lock for reading whenever we need to
+ * examine the SP list.
+ */
+static DEFINE_RWLOCK(sp_unit_lock);
+static LIST_HEAD(sp_units);
+
+/* Ever-increasing value to produce unique unit numbers */
+static atomic_t sp_ordinal;
+
+static void sp_add_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_add_tail(&sp->entry, &sp_units);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+static void sp_del_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_del(&sp->entry);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+struct sp_device *sp_get_device(void)
+{
+	struct sp_device *sp = NULL;
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	if (list_empty(&sp_units))
+		goto unlock;
+
+	sp = list_first_entry(&sp_units, struct sp_device, entry);
+
+	list_add_tail(&sp->entry, &sp_units);
+unlock:
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+	return sp;
+}
+
+static irqreturn_t sp_irq_handler(int irq, void *data)
+{
+	struct sp_device *sp = data;
+
+	if (sp->psp_irq_handler)
+		sp->psp_irq_handler(irq, sp->psp_irq_data);
+
+	if (sp->ccp_irq_handler)
+		sp->ccp_irq_handler(irq, sp->ccp_irq_data);
+
+	return IRQ_HANDLED;
+}
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Need a common routine to manager all interrupts */
+		sp->psp_irq_data = data;
+		sp->psp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->psp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage it's own interrupt */
+		ret = request_irq(sp->psp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Need a common routine to manager all interrupts */
+		sp->ccp_irq_data = data;
+		sp->ccp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->ccp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage it's own interrupt */
+		ret = request_irq(sp->ccp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+void sp_free_psp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Using a common routine to manager all interrupts */
+		if (!sp->ccp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->psp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->psp_irq_handler = NULL;
+		sp->psp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage it's own interrupt */
+		free_irq(sp->psp_irq, data);
+	}
+}
+
+void sp_free_ccp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Using a common routine to manager all interrupts */
+		if (!sp->psp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->ccp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->ccp_irq_handler = NULL;
+		sp->ccp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage it's own interrupt */
+		free_irq(sp->ccp_irq, data);
+	}
+}
+
+/**
+ * sp_alloc_struct - allocate and initialize the sp_device struct
+ *
+ * @dev: device struct of the SP
+ */
+struct sp_device *sp_alloc_struct(struct device *dev)
+{
+	struct sp_device *sp;
+
+	sp = devm_kzalloc(dev, sizeof(*sp), GFP_KERNEL);
+	if (!sp)
+		return NULL;
+
+	sp->dev = dev;
+	sp->ord = atomic_inc_return(&sp_ordinal) - 1;
+	snprintf(sp->name, SP_MAX_NAME_LEN, "sp-%u", sp->ord);
+
+	return sp;
+}
+
+int sp_init(struct sp_device *sp)
+{
+	sp_add_device(sp);
+
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_init(sp);
+
+	return 0;
+}
+
+void sp_destroy(struct sp_device *sp)
+{
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_destroy(sp);
+
+	sp_del_device(sp);
+}
+
+int sp_suspend(struct sp_device *sp, pm_message_t state)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_resume(struct sp_device *sp)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+struct sp_device *sp_get_psp_master_device(void)
+{
+	struct sp_device *sp = sp_get_device();
+
+	if (!sp)
+		return NULL;
+
+	if (!sp->psp_data)
+		return NULL;
+
+	return sp->get_master_device();
+}
+
+void sp_set_psp_master(struct sp_device *sp)
+{
+	if (sp->psp_data)
+		sp->set_master_device(sp);
+}
+
+static int __init sp_mod_init(void)
+{
+#ifdef CONFIG_X86
+	int ret;
+
+	ret = sp_pci_init();
+	if (ret)
+		return ret;
+
+	return 0;
+#endif
+
+#ifdef CONFIG_ARM64
+	int ret;
+
+	ret = sp_platform_init();
+	if (ret)
+		return ret;
+
+	return 0;
+#endif
+
+	return -ENODEV;
+}
+
+static void __exit sp_mod_exit(void)
+{
+#ifdef CONFIG_X86
+	sp_pci_exit();
+#endif
+
+#ifdef CONFIG_ARM64
+	sp_platform_exit();
+#endif
+}
+
+module_init(sp_mod_init);
+module_exit(sp_mod_exit);
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
new file mode 100644
index 0000000..9a8a8f8
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -0,0 +1,140 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SP_DEV_H__
+#define __SP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+
+#define SP_MAX_NAME_LEN		32
+
+#define CACHE_NONE			0x00
+#define CACHE_WB_NO_ALLOC		0xb7
+
+/* Structure to hold CCP device data */
+struct ccp_device;
+struct ccp_vdata {
+	const unsigned int version;
+	void (*setup)(struct ccp_device *);
+	const struct ccp_actions *perform;
+	const unsigned int offset;
+};
+
+/* Structure to hold SP device data */
+struct sp_dev_data {
+	const unsigned int bar;
+
+	const struct ccp_vdata *ccp_vdata;
+	const void *psp_vdata;
+};
+
+struct sp_device {
+	struct list_head entry;
+
+	struct device *dev;
+
+	struct sp_dev_data *dev_data;
+	unsigned int ord;
+	char name[SP_MAX_NAME_LEN];
+
+	/* Bus specific device information */
+	void *dev_specific;
+
+	/* I/O area used for device communication. */
+	void __iomem *io_map;
+
+	/* DMA caching attribute support */
+	unsigned int axcache;
+
+	bool irq_registered;
+
+	/* get and set master device */
+	struct sp_device*(*get_master_device) (void);
+	void(*set_master_device) (struct sp_device *);
+
+	unsigned int psp_irq;
+	irq_handler_t psp_irq_handler;
+	void *psp_irq_data;
+
+	unsigned int ccp_irq;
+	irq_handler_t ccp_irq_handler;
+	void *ccp_irq_data;
+
+	void *psp_data;
+	void *ccp_data;
+};
+
+int sp_pci_init(void);
+void sp_pci_exit(void);
+
+int sp_platform_init(void);
+void sp_platform_exit(void);
+
+struct sp_device *sp_alloc_struct(struct device *dev);
+
+int sp_init(struct sp_device *sp);
+void sp_destroy(struct sp_device *sp);
+struct sp_device *sp_get_master(void);
+
+int sp_suspend(struct sp_device *sp, pm_message_t state);
+int sp_resume(struct sp_device *sp);
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_psp_irq(struct sp_device *sp, void *data);
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_ccp_irq(struct sp_device *sp, void *data);
+
+void sp_set_psp_master(struct sp_device *sp);
+struct sp_device *sp_get_psp_master_device(void);
+
+#ifdef CONFIG_CRYPTO_DEV_CCP
+
+int ccp_dev_init(struct sp_device *sp);
+void ccp_dev_destroy(struct sp_device *sp);
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int ccp_dev_resume(struct sp_device *sp);
+
+#else	/* !CONFIG_CRYPTO_DEV_CCP */
+
+static inline int ccp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void ccp_dev_destroy(struct sp_device *sp) { }
+
+static inline int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int ccp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_CCP */
+
+#endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
new file mode 100644
index 0000000..0960e2d
--- /dev/null
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/dma-mapping.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+
+#include "sp-dev.h"
+
+#define MSIX_VECTORS			2
+
+struct sp_pci {
+	int msix_count;
+	struct msix_entry msix_entry[MSIX_VECTORS];
+};
+
+static struct sp_device *sp_dev_master;
+
+static int sp_get_msix_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int v, ret;
+
+	for (v = 0; v < ARRAY_SIZE(sp_pci->msix_entry); v++)
+		sp_pci->msix_entry[v].entry = v;
+
+	ret = pci_enable_msix_range(pdev, sp_pci->msix_entry, 1, v);
+	if (ret < 0)
+		return ret;
+
+	sp_pci->msix_count = ret;
+
+	sp->psp_irq = sp_pci->msix_entry[0].vector;
+	sp->ccp_irq = (sp_pci->msix_count > 1) ? sp_pci->msix_entry[1].vector
+					       : sp_pci->msix_entry[0].vector;
+
+	return 0;
+}
+
+static int sp_get_msi_irq(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int ret;
+
+	ret = pci_enable_msi(pdev);
+	if (ret)
+		return ret;
+
+	sp->psp_irq = pdev->irq;
+	sp->ccp_irq = pdev->irq;
+
+	return 0;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	int ret;
+
+	ret = sp_get_msix_irqs(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI-X vectors, try MSI */
+	dev_notice(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
+	ret = sp_get_msi_irq(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI interrupt */
+	dev_notice(dev, "could not enable MSI (%d)\n", ret);
+
+	return ret;
+}
+
+static void sp_free_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (sp_pci->msix_count)
+		pci_disable_msix(pdev);
+	else if (sp->psp_irq)
+		pci_disable_msi(pdev);
+
+	sp->psp_irq = 0;
+	sp->ccp_irq = 0;
+}
+
+static bool sp_pci_is_master(struct sp_device *sp)
+{
+	struct device *dev_cur, *dev_new;
+	struct pci_dev *pdev_cur, *pdev_new;
+
+	dev_new = sp->dev;
+	dev_cur = sp_dev_master->dev;
+
+	pdev_new = to_pci_dev(dev_new);
+	pdev_cur = to_pci_dev(dev_cur);
+
+	if (pdev_new->bus->number < pdev_cur->bus->number)
+		return true;
+
+	if (PCI_SLOT(pdev_new->devfn) < PCI_SLOT(pdev_cur->devfn))
+		return true;
+
+	if (PCI_FUNC(pdev_new->devfn) < PCI_FUNC(pdev_cur->devfn))
+		return true;
+
+	return false;
+}
+
+static void sp_pci_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master) {
+		sp_dev_master = sp;
+		return;
+	}
+
+	if (sp_pci_is_master(sp))
+		sp_dev_master = sp;
+}
+
+static struct sp_device *sp_pci_get_master(void)
+{
+	return sp_dev_master;
+}
+
+static int sp_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct sp_device *sp;
+	struct sp_pci *sp_pci;
+	struct device *dev = &pdev->dev;
+	void __iomem * const *iomap_table;
+	int bar_mask;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_pci = devm_kzalloc(dev, sizeof(*sp_pci), GFP_KERNEL);
+	if (!sp_pci)
+		goto e_err;
+	sp->dev_specific = sp_pci;
+
+	sp->dev_data = (struct sp_dev_data *)id->driver_data;
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		dev_err(dev, "pcim_enable_device failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
+	ret = pcim_iomap_regions(pdev, bar_mask, "sp");
+	if (ret) {
+		dev_err(dev, "pcim_iomap_regions failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	iomap_table = pcim_iomap_table(pdev);
+	if (!iomap_table) {
+		dev_err(dev, "pcim_iomap_table failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	sp->io_map = iomap_table[sp->dev_data->bar];
+	if (!sp->io_map) {
+		dev_err(dev, "ioremap failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	pci_set_master(pdev);
+
+	sp->set_master_device = sp_pci_set_master;
+	sp->get_master_device = sp_pci_get_master;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n",
+				ret);
+			goto e_err;
+		}
+	}
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static void sp_pci_remove(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return;
+
+	sp_destroy(sp);
+
+	sp_free_irqs(sp);
+
+	dev_notice(dev, "disabled\n");
+}
+
+#ifdef CONFIG_PM
+static int sp_pci_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_pci_resume(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern struct ccp_vdata ccpv3_pci;
+extern struct ccp_vdata ccpv5a;
+extern struct ccp_vdata ccpv5b;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv3_pci,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5a,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5b,
+#endif
+	},
+};
+
+static const struct pci_device_id sp_pci_table[] = {
+	{ PCI_VDEVICE(AMD, 0x1537), (kernel_ulong_t)&dev_data[0] },
+	{ PCI_VDEVICE(AMD, 0x1456), (kernel_ulong_t)&dev_data[1] },
+	{ PCI_VDEVICE(AMD, 0x1468), (kernel_ulong_t)&dev_data[2] },
+	/* Last entry must be zero */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, sp_pci_table);
+
+static struct pci_driver sp_pci_driver = {
+	.name = "sp",
+	.id_table = sp_pci_table,
+	.probe = sp_pci_probe,
+	.remove = sp_pci_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_pci_suspend,
+	.resume = sp_pci_resume,
+#endif
+};
+
+int sp_pci_init(void)
+{
+	return pci_register_driver(&sp_pci_driver);
+}
+
+void sp_pci_exit(void)
+{
+	pci_unregister_driver(&sp_pci_driver);
+}
diff --git a/drivers/crypto/ccp/sp-platform.c b/drivers/crypto/ccp/sp-platform.c
new file mode 100644
index 0000000..a918238
--- /dev/null
+++ b/drivers/crypto/ccp/sp-platform.c
@@ -0,0 +1,268 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/platform_device.h>
+#include <linux/ioport.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/acpi.h>
+
+#include "sp-dev.h"
+
+struct sp_platform {
+	int coherent;
+	unsigned int irq_count;
+};
+
+static struct sp_device *sp_dev_master;
+static const struct acpi_device_id sp_acpi_match[];
+static const struct of_device_id sp_of_match[];
+
+static struct sp_dev_data *sp_get_of_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_OF
+	const struct of_device_id *match;
+
+	match = of_match_node(sp_of_match, pdev->dev.of_node);
+	if (match && match->data)
+		return (struct sp_dev_data *)match->data;
+#endif
+
+	return NULL;
+}
+
+static struct sp_dev_data *sp_get_acpi_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_ACPI
+	const struct acpi_device_id *match;
+
+	match = acpi_match_device(sp_acpi_match, &pdev->dev);
+	if (match && match->driver_data)
+		return (struct sp_dev_data *)match->driver_data;
+#endif
+
+	return NULL;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct sp_platform *sp_platform = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	unsigned int i, count;
+	int ret;
+
+	for (i = 0, count = 0; i < pdev->num_resources; i++) {
+		struct resource *res = &pdev->resource[i];
+
+		if (resource_type(res) == IORESOURCE_IRQ)
+			count++;
+	}
+
+	sp_platform->irq_count = count;
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		return ret;
+
+	sp->psp_irq = ret;
+	if (count == 1) {
+		sp->ccp_irq = ret;
+	} else {
+		ret = platform_get_irq(pdev, 1);
+		if (ret < 0)
+			return ret;
+
+		sp->ccp_irq = ret;
+	}
+
+	return 0;
+}
+
+void sp_platform_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master)
+		sp_dev_master = sp;
+}
+
+static int sp_platform_probe(struct platform_device *pdev)
+{
+	struct sp_device *sp;
+	struct sp_platform *sp_platform;
+	struct device *dev = &pdev->dev;
+	enum dev_dma_attr attr;
+	struct resource *ior;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_platform = devm_kzalloc(dev, sizeof(*sp_platform), GFP_KERNEL);
+	if (!sp_platform)
+		goto e_err;
+
+	sp->dev_specific = sp_platform;
+	sp->dev_data = pdev->dev.of_node ? sp_get_of_dev_data(pdev)
+					 : sp_get_acpi_dev_data(pdev);
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ior = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	sp->io_map = devm_ioremap_resource(dev, ior);
+	if (IS_ERR(sp->io_map)) {
+		ret = PTR_ERR(sp->io_map);
+		goto e_err;
+	}
+
+	attr = device_get_dma_attr(dev);
+	if (attr == DEV_DMA_NOT_SUPPORTED) {
+		dev_err(dev, "DMA is not supported");
+		goto e_err;
+	}
+
+	sp_platform->coherent = (attr == DEV_DMA_COHERENT);
+	if (sp_platform->coherent)
+		sp->axcache = CACHE_WB_NO_ALLOC;
+	else
+		sp->axcache = CACHE_NONE;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static int sp_platform_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return 0;
+
+	sp_destroy(sp);
+
+	dev_notice(dev, "disabled\n");
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int sp_platform_suspend(struct platform_device *pdev,
+			       pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_platform_resume(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern struct ccp_vdata ccpv3_platform;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+#ifdef CONFIG_AMD_CCP
+		.ccp_vdata = &ccpv3_platform,
+#endif
+	},
+};
+
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id sp_acpi_match[] = {
+	{ "AMDI0C00", (kernel_ulong_t)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(acpi, sp_acpi_match);
+#endif
+
+#ifdef CONFIG_OF
+static const struct of_device_id sp_of_match[] = {
+	{ .compatible = "amd,ccp-seattle-v1a",
+	  .data = (const void *)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, sp_of_match);
+#endif
+
+static struct platform_driver sp_platform_driver = {
+	.driver = {
+		.name = "sp",
+#ifdef CONFIG_ACPI
+		.acpi_match_table = sp_acpi_match,
+#endif
+#ifdef CONFIG_OF
+		.of_match_table = sp_of_match,
+#endif
+	},
+	.probe = sp_platform_probe,
+	.remove = sp_platform_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_platform_suspend,
+	.resume = sp_platform_resume,
+#endif
+};
+
+struct sp_device *sp_platform_get_master(void)
+{
+	return sp_dev_master;
+}
+
+int sp_platform_init(void)
+{
+	return platform_driver_register(&sp_platform_driver);
+}
+
+void sp_platform_exit(void)
+{
+	platform_driver_unregister(&sp_platform_driver);
+}
diff --git a/include/linux/ccp.h b/include/linux/ccp.h
index c71dd8f..1ea14e6 100644
--- a/include/linux/ccp.h
+++ b/include/linux/ccp.h
@@ -24,8 +24,7 @@
 struct ccp_device;
 struct ccp_cmd;
 
-#if defined(CONFIG_CRYPTO_DEV_CCP_DD) || \
-	defined(CONFIG_CRYPTO_DEV_CCP_DD_MODULE)
+#if defined(CONFIG_CRYPTO_DEV_CCP)
 
 /**
  * ccp_present - check if a CCP device is present

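Taken together, the framework reduces a functional component to three touch
points: a vdata pointer hung off struct sp_dev_data, an init/destroy pair
invoked from sp_init() and sp_destroy(), and the sp_request_*_irq() /
sp_free_*_irq() helpers that multiplex a shared interrupt line. Below is a
minimal sketch of how a second sub-device could plug in, modeled on
ccp_dev_init() above; the psp_device structure and the psp_dev_init() name
are illustrative only, since the actual PSP support arrives later in this
series:

#include "sp-dev.h"

/* Sketch only: a hypothetical sub-device following the ccp_dev_init()
 * pattern from this patch.  psp_device and psp_dev_init() are
 * illustrative names, not part of the patch itself.
 */
struct psp_device {
	struct sp_device *sp;
	void __iomem *io_regs;
};

static irqreturn_t psp_irq_handler(int irq, void *data)
{
	/* A real handler would read and acknowledge the device's
	 * interrupt status registers here.
	 */
	return IRQ_HANDLED;
}

int psp_dev_init(struct sp_device *sp)
{
	struct psp_device *psp;
	int ret;

	psp = devm_kzalloc(sp->dev, sizeof(*psp), GFP_KERNEL);
	if (!psp)
		return -ENOMEM;

	psp->sp = sp;
	psp->io_regs = sp->io_map;	/* plus a sub-device offset in practice */
	sp->psp_data = psp;

	/* When psp_irq == ccp_irq, sp-dev.c multiplexes the line for us */
	ret = sp_request_psp_irq(sp, psp_irq_handler, sp->name, psp);
	if (ret) {
		sp->psp_data = NULL;
		return ret;
	}

	return 0;
}

sp_init() would then call this when sp->dev_data->psp_vdata is non-NULL,
mirroring the ccp_vdata check it already performs for the CCP.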
^ permalink raw reply related	[flat|nested] 424+ messages in thread

-		return -ENODEV;
+	ret = -ENOMEM;
+	ccp = ccp_alloc_struct(sp);
+	if (!ccp)
+		goto e_err;
+	sp->ccp_data = ccp;
+
+	ccp->vdata = (struct ccp_vdata *)sp->dev_data->ccp_vdata;
+	if (!ccp->vdata || !ccp->vdata->version) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
 	}
 
-	return 0;
-#endif
+	ccp->io_regs = sp->io_map + ccp->vdata->offset;
 
-#ifdef CONFIG_ARM64
-	int ret;
+	if (ccp->vdata->setup)
+		ccp->vdata->setup(ccp);
 
-	ret = ccp_platform_init();
+	ret = ccp->vdata->perform->init(ccp);
 	if (ret)
-		return ret;
+		goto e_err;
 
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_platform_exit();
-		return -ENODEV;
-	}
+	dev_notice(dev, "ccp enabled\n");
 
 	return 0;
-#endif
 
-	return -ENODEV;
+e_err:
+	sp->ccp_data = NULL;
+
+	dev_notice(dev, "ccp initialization failed\n");
+
+	return ret;
 }
 
-static void __exit ccp_mod_exit(void)
+void ccp_dev_destroy(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
-	ccp_pci_exit();
-#endif
+	struct ccp_device *ccp = sp->ccp_data;

-#ifdef CONFIG_ARM64
-	ccp_platform_exit();
-#endif
+	if (!ccp)
+		return;
+
+	ccp->vdata->perform->destroy(ccp);
+}
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 1;
+
+	/* Wake all the queue kthreads to prepare for suspend */
+	for (i = 0; i < ccp->cmd_q_count; i++)
+		wake_up_process(ccp->cmd_q[i].kthread);
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	/* Wait for all queue kthreads to say they're done */
+	while (!ccp_queues_suspended(ccp))
+		wait_event_interruptible(ccp->suspend_queue,
+					 ccp_queues_suspended(ccp));
+
+	return 0;
 }
 
-module_init(ccp_mod_init);
-module_exit(ccp_mod_exit);
+int ccp_dev_resume(struct sp_device *sp)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 0;
+
+	/* Wake up all the kthreads */
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		ccp->cmd_q[i].suspended = 0;
+		wake_up_process(ccp->cmd_q[i].kthread);
+	}
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	return 0;
+}
diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 649e561..25a4bfd 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -27,6 +27,8 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 
+#include "sp-dev.h"
+
 #define MAX_CCP_NAME_LEN		16
 #define MAX_DMAPOOL_NAME_LEN		32
 
@@ -35,9 +37,6 @@
 
 #define TRNG_RETRIES			10
 
-#define CACHE_NONE			0x00
-#define CACHE_WB_NO_ALLOC		0xb7
-
 /****** Register Mappings ******/
 #define Q_MASK_REG			0x000
 #define TRNG_OUT_REG			0x00c
@@ -322,18 +321,15 @@ struct ccp_device {
 	struct list_head entry;
 
 	struct ccp_vdata *vdata;
-	unsigned int ord;
 	char name[MAX_CCP_NAME_LEN];
 	char rngname[MAX_CCP_NAME_LEN];
 
 	struct device *dev;
+	struct sp_device *sp;
 
 	/* Bus specific device information
 	 */
 	void *dev_specific;
-	int (*get_irq)(struct ccp_device *ccp);
-	void (*free_irq)(struct ccp_device *ccp);
-	unsigned int irq;
 
 	/* I/O area used for device communication. The register mapping
 	 * starts at an offset into the mapped bar.
@@ -342,7 +338,6 @@ struct ccp_device {
 	 *   them.
 	 */
 	struct mutex req_mutex ____cacheline_aligned;
-	void __iomem *io_map;
 	void __iomem *io_regs;
 
 	/* Master lists that all cmds are queued on. Because there can be
@@ -407,9 +402,6 @@ struct ccp_device {
 	/* Suspend support */
 	unsigned int suspending;
 	wait_queue_head_t suspend_queue;
-
-	/* DMA caching attribute support */
-	unsigned int axcache;
 };
 
 enum ccp_memtype {
@@ -592,18 +584,11 @@ struct ccp5_desc {
 	struct dword7 dw7;
 };
 
-int ccp_pci_init(void);
-void ccp_pci_exit(void);
-
-int ccp_platform_init(void);
-void ccp_platform_exit(void);
-
 void ccp_add_device(struct ccp_device *ccp);
 void ccp_del_device(struct ccp_device *ccp);
 
 extern void ccp_log_error(struct ccp_device *, int);
 
-struct ccp_device *ccp_alloc_struct(struct device *dev);
 bool ccp_queues_suspended(struct ccp_device *ccp);
 int ccp_cmd_queue_thread(void *data);
 int ccp_trng_read(struct hwrng *rng, void *data, size_t max, bool wait);
@@ -629,20 +614,6 @@ struct ccp_actions {
 	unsigned int (*get_free_slots)(struct ccp_cmd_queue *);
 	int (*init)(struct ccp_device *);
 	void (*destroy)(struct ccp_device *);
-	irqreturn_t (*irqhandler)(int, void *);
-};
-
-/* Structure to hold CCP version-specific values */
-struct ccp_vdata {
-	const unsigned int version;
-	void (*setup)(struct ccp_device *);
-	const struct ccp_actions *perform;
-	const unsigned int bar;
-	const unsigned int offset;
 };
 
-extern const struct ccp_vdata ccpv3;
-extern const struct ccp_vdata ccpv5a;
-extern const struct ccp_vdata ccpv5b;
-
 #endif
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
new file mode 100644
index 0000000..e47fb8e
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -0,0 +1,308 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+
+#include "sp-dev.h"
+
+MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.1.0");
+MODULE_DESCRIPTION("AMD Secure Processor driver");
+
+/* List of SPs, SP count, read-write access lock, and access functions
+ *
+ * Lock structure: sp_unit_lock must be held whenever the SP list is
+ * examined or modified; sp_get_device() takes it for writing because
+ * it also rotates the list.
+ */
+static DEFINE_RWLOCK(sp_unit_lock);
+static LIST_HEAD(sp_units);
+
+/* Ever-increasing value to produce unique unit numbers */
+static atomic_t sp_ordinal;
+
+static void sp_add_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_add_tail(&sp->entry, &sp_units);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+static void sp_del_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_del(&sp->entry);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+struct sp_device *sp_get_device(void)
+{
+	struct sp_device *sp = NULL;
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	if (list_empty(&sp_units))
+		goto unlock;
+
+	sp = list_first_entry(&sp_units, struct sp_device, entry);
+
+	/* Rotate the entry to the tail for round-robin selection;
+	 * list_move_tail() unlinks it first, so the list is not
+	 * corrupted by adding an already-linked entry.
+	 */
+	list_move_tail(&sp->entry, &sp_units);
+unlock:
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+	return sp;
+}
+
+static irqreturn_t sp_irq_handler(int irq, void *data)
+{
+	struct sp_device *sp = data;
+
+	if (sp->psp_irq_handler)
+		sp->psp_irq_handler(irq, sp->psp_irq_data);
+
+	if (sp->ccp_irq_handler)
+		sp->ccp_irq_handler(irq, sp->ccp_irq_data);
+
+	return IRQ_HANDLED;
+}
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Need a common routine to manage all interrupts */
+		sp->psp_irq_data = data;
+		sp->psp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->psp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		ret = request_irq(sp->psp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Need a common routine to manage all interrupts */
+		sp->ccp_irq_data = data;
+		sp->ccp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->ccp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		ret = request_irq(sp->ccp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+void sp_free_psp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Using a common routine to manage all interrupts */
+		if (!sp->ccp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->psp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->psp_irq_handler = NULL;
+		sp->psp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		free_irq(sp->psp_irq, data);
+	}
+}
+
+void sp_free_ccp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Using a common routine to manage all interrupts */
+		if (!sp->psp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->ccp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->ccp_irq_handler = NULL;
+		sp->ccp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		free_irq(sp->ccp_irq, data);
+	}
+}
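+
+/*
+ * Descriptive note: when the PSP and CCP share a single interrupt
+ * vector, sp_request_psp_irq()/sp_request_ccp_irq() record the
+ * sub-device handler and register the common sp_irq_handler() exactly
+ * once; each interrupt is then fanned out to whichever sub-device
+ * handlers are currently registered.  With distinct vectors, each
+ * sub-device handler is registered directly with request_irq().
+ */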
+
+/**
+ * sp_alloc_struct - allocate and initialize the sp_device struct
+ *
+ * @dev: device struct of the SP
+ */
+struct sp_device *sp_alloc_struct(struct device *dev)
+{
+	struct sp_device *sp;
+
+	sp = devm_kzalloc(dev, sizeof(*sp), GFP_KERNEL);
+	if (!sp)
+		return NULL;
+
+	sp->dev = dev;
+	sp->ord = atomic_inc_return(&sp_ordinal) - 1;
+	snprintf(sp->name, SP_MAX_NAME_LEN, "sp-%u", sp->ord);
+
+	return sp;
+}
+
+int sp_init(struct sp_device *sp)
+{
+	sp_add_device(sp);
+
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_init(sp);
+
+	return 0;
+}
+
+void sp_destroy(struct sp_device *sp)
+{
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_destroy(sp);
+
+	sp_del_device(sp);
+}
+
+int sp_suspend(struct sp_device *sp, pm_message_t state)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_resume(struct sp_device *sp)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+struct sp_device *sp_get_psp_master_device(void)
+{
+	struct sp_device *sp = sp_get_device();
+
+	if (!sp)
+		return NULL;
+
+	if (!sp->psp_data)
+		return NULL;
+
+	return sp->get_master_device();
+}
+
+void sp_set_psp_master(struct sp_device *sp)
+{
+	if (sp->psp_data)
+		sp->set_master_device(sp);
+}
+
+static int __init sp_mod_init(void)
+{
+#ifdef CONFIG_X86
+	int ret;
+
+	ret = sp_pci_init();
+	if (ret)
+		return ret;
+
+	return 0;
+#endif
+
+#ifdef CONFIG_ARM64
+	int ret;
+
+	ret = sp_platform_init();
+	if (ret)
+		return ret;
+
+	return 0;
+#endif
+
+	return -ENODEV;
+}
+
+static void __exit sp_mod_exit(void)
+{
+#ifdef CONFIG_X86
+	sp_pci_exit();
+#endif
+
+#ifdef CONFIG_ARM64
+	sp_platform_exit();
+#endif
+}
+
+module_init(sp_mod_init);
+module_exit(sp_mod_exit);
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
new file mode 100644
index 0000000..9a8a8f8
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -0,0 +1,140 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SP_DEV_H__
+#define __SP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+
+#define SP_MAX_NAME_LEN		32
+
+#define CACHE_NONE			0x00
+#define CACHE_WB_NO_ALLOC		0xb7
+
+/* Structure to hold CCP device data */
+struct ccp_device;
+struct ccp_vdata {
+	const unsigned int version;
+	void (*setup)(struct ccp_device *);
+	const struct ccp_actions *perform;
+	const unsigned int offset;
+};
+
+/* Structure to hold SP device data */
+struct sp_dev_data {
+	const unsigned int bar;
+
+	const struct ccp_vdata *ccp_vdata;
+	const void *psp_vdata;
+};
+
+struct sp_device {
+	struct list_head entry;
+
+	struct device *dev;
+
+	struct sp_dev_data *dev_data;
+	unsigned int ord;
+	char name[SP_MAX_NAME_LEN];
+
+	/* Bus specific device information */
+	void *dev_specific;
+
+	/* I/O area used for device communication. */
+	void __iomem *io_map;
+
+	/* DMA caching attribute support */
+	unsigned int axcache;
+
+	bool irq_registered;
+
+	/* get and set master device */
+	struct sp_device *(*get_master_device)(void);
+	void (*set_master_device)(struct sp_device *);
+
+	unsigned int psp_irq;
+	irq_handler_t psp_irq_handler;
+	void *psp_irq_data;
+
+	unsigned int ccp_irq;
+	irq_handler_t ccp_irq_handler;
+	void *ccp_irq_data;
+
+	void *psp_data;
+	void *ccp_data;
+};
+
+int sp_pci_init(void);
+void sp_pci_exit(void);
+
+int sp_platform_init(void);
+void sp_platform_exit(void);
+
+struct sp_device *sp_alloc_struct(struct device *dev);
+
+int sp_init(struct sp_device *sp);
+void sp_destroy(struct sp_device *sp);
+struct sp_device *sp_get_master(void);
+
+int sp_suspend(struct sp_device *sp, pm_message_t state);
+int sp_resume(struct sp_device *sp);
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_psp_irq(struct sp_device *sp, void *data);
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_ccp_irq(struct sp_device *sp, void *data);
+
+void sp_set_psp_master(struct sp_device *sp);
+struct sp_device *sp_get_psp_master_device(void);
+
+#ifdef CONFIG_CRYPTO_DEV_CCP
+
+int ccp_dev_init(struct sp_device *sp);
+void ccp_dev_destroy(struct sp_device *sp);
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int ccp_dev_resume(struct sp_device *sp);
+
+#else	/* !CONFIG_CRYPTO_DEV_CCP */
+
+static inline int ccp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void ccp_dev_destroy(struct sp_device *sp) { }
+
+static inline int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int ccp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_CCP */
+
+#endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
new file mode 100644
index 0000000..0960e2d
--- /dev/null
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/dma-mapping.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+
+#include "sp-dev.h"
+
+#define MSIX_VECTORS			2
+
+struct sp_pci {
+	int msix_count;
+	struct msix_entry msix_entry[MSIX_VECTORS];
+};
+
+static struct sp_device *sp_dev_master;
+
+static int sp_get_msix_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int v, ret;
+
+	for (v = 0; v < ARRAY_SIZE(sp_pci->msix_entry); v++)
+		sp_pci->msix_entry[v].entry = v;
+
+	ret = pci_enable_msix_range(pdev, sp_pci->msix_entry, 1, v);
+	if (ret < 0)
+		return ret;
+
+	sp_pci->msix_count = ret;
+
+	sp->psp_irq = sp_pci->msix_entry[0].vector;
+	sp->ccp_irq = (sp_pci->msix_count > 1) ? sp_pci->msix_entry[1].vector
+					       : sp_pci->msix_entry[0].vector;
+
+	return 0;
+}
+
+static int sp_get_msi_irq(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int ret;
+
+	ret = pci_enable_msi(pdev);
+	if (ret)
+		return ret;
+
+	sp->psp_irq = pdev->irq;
+	sp->ccp_irq = pdev->irq;
+
+	return 0;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	int ret;
+
+	ret = sp_get_msix_irqs(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI-X vectors, try MSI */
+	dev_notice(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
+	ret = sp_get_msi_irq(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI interrupt */
+	dev_notice(dev, "could not enable MSI (%d)\n", ret);
+
+	return ret;
+}
+
+static void sp_free_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (sp_pci->msix_count)
+		pci_disable_msix(pdev);
+	else if (sp->psp_irq)
+		pci_disable_msi(pdev);
+
+	sp->psp_irq = 0;
+	sp->ccp_irq = 0;
+}
+
+static bool sp_pci_is_master(struct sp_device *sp)
+{
+	struct device *dev_cur, *dev_new;
+	struct pci_dev *pdev_cur, *pdev_new;
+
+	dev_new = sp->dev;
+	dev_cur = sp_dev_master->dev;
+
+	pdev_new = to_pci_dev(dev_new);
+	pdev_cur = to_pci_dev(dev_cur);
+
+	if (pdev_new->bus->number < pdev_cur->bus->number)
+		return true;
+
+	if (PCI_SLOT(pdev_new->devfn) < PCI_SLOT(pdev_cur->devfn))
+		return true;
+
+	if (PCI_FUNC(pdev_new->devfn) < PCI_FUNC(pdev_cur->devfn))
+		return true;
+
+	return false;
+}
+
+static void sp_pci_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master) {
+		sp_dev_master = sp;
+		return;
+	}
+
+	if (sp_pci_is_master(sp))
+		sp_dev_master = sp;
+}
+
+static struct sp_device *sp_pci_get_master(void)
+{
+	return sp_dev_master;
+}
+
+static int sp_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct sp_device *sp;
+	struct sp_pci *sp_pci;
+	struct device *dev = &pdev->dev;
+	void __iomem * const *iomap_table;
+	int bar_mask;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_pci = devm_kzalloc(dev, sizeof(*sp_pci), GFP_KERNEL);
+	if (!sp_pci)
+		goto e_err;
+	sp->dev_specific = sp_pci;
+
+	sp->dev_data = (struct sp_dev_data *)id->driver_data;
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		dev_err(dev, "pcim_enable_device failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
+	ret = pcim_iomap_regions(pdev, bar_mask, "sp");
+	if (ret) {
+		dev_err(dev, "pcim_iomap_regions failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	iomap_table = pcim_iomap_table(pdev);
+	if (!iomap_table) {
+		dev_err(dev, "pcim_iomap_table failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	sp->io_map = iomap_table[sp->dev_data->bar];
+	if (!sp->io_map) {
+		dev_err(dev, "ioremap failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	pci_set_master(pdev);
+
+	sp->set_master_device = sp_pci_set_master;
+	sp->get_master_device = sp_pci_get_master;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n",
+				ret);
+			goto e_err;
+		}
+	}
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static void sp_pci_remove(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return;
+
+	sp_destroy(sp);
+
+	sp_free_irqs(sp);
+
+	dev_notice(dev, "disabled\n");
+}
+
+#ifdef CONFIG_PM
+static int sp_pci_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_pci_resume(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern const struct ccp_vdata ccpv3_pci;
+extern const struct ccp_vdata ccpv5a;
+extern const struct ccp_vdata ccpv5b;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv3_pci,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5a,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5b,
+#endif
+	},
+};
+
+static const struct pci_device_id sp_pci_table[] = {
+	{ PCI_VDEVICE(AMD, 0x1537), (kernel_ulong_t)&dev_data[0] },
+	{ PCI_VDEVICE(AMD, 0x1456), (kernel_ulong_t)&dev_data[1] },
+	{ PCI_VDEVICE(AMD, 0x1468), (kernel_ulong_t)&dev_data[2] },
+	/* Last entry must be zero */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, sp_pci_table);
+
+static struct pci_driver sp_pci_driver = {
+	.name = "sp",
+	.id_table = sp_pci_table,
+	.probe = sp_pci_probe,
+	.remove = sp_pci_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_pci_suspend,
+	.resume = sp_pci_resume,
+#endif
+};
+
+int sp_pci_init(void)
+{
+	return pci_register_driver(&sp_pci_driver);
+}
+
+void sp_pci_exit(void)
+{
+	pci_unregister_driver(&sp_pci_driver);
+}
diff --git a/drivers/crypto/ccp/sp-platform.c b/drivers/crypto/ccp/sp-platform.c
new file mode 100644
index 0000000..a918238
--- /dev/null
+++ b/drivers/crypto/ccp/sp-platform.c
@@ -0,0 +1,268 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/platform_device.h>
+#include <linux/ioport.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/acpi.h>
+
+#include "sp-dev.h"
+
+struct sp_platform {
+	int coherent;
+	unsigned int irq_count;
+};
+
+static struct sp_device *sp_dev_master;
+static const struct acpi_device_id sp_acpi_match[];
+static const struct of_device_id sp_of_match[];
+
+static struct sp_dev_data *sp_get_of_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_OF
+	const struct of_device_id *match;
+
+	match = of_match_node(sp_of_match, pdev->dev.of_node);
+	if (match && match->data)
+		return (struct sp_dev_data *)match->data;
+#endif
+
+	return NULL;
+}
+
+static struct sp_dev_data *sp_get_acpi_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_ACPI
+	const struct acpi_device_id *match;
+
+	match = acpi_match_device(sp_acpi_match, &pdev->dev);
+	if (match && match->driver_data)
+		return (struct sp_dev_data *)match->driver_data;
+#endif
+
+	return NULL;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct sp_platform *sp_platform = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	unsigned int i, count;
+	int ret;
+
+	for (i = 0, count = 0; i < pdev->num_resources; i++) {
+		struct resource *res = &pdev->resource[i];
+
+		if (resource_type(res) == IORESOURCE_IRQ)
+			count++;
+	}
+
+	sp_platform->irq_count = count;
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		return ret;
+
+	sp->psp_irq = ret;
+	if (count == 1) {
+		sp->ccp_irq = ret;
+	} else {
+		ret = platform_get_irq(pdev, 1);
+		if (ret < 0)
+			return ret;
+
+		sp->ccp_irq = ret;
+	}
+
+	return 0;
+}
+
+void sp_platform_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master)
+		sp_dev_master = sp;
+}
+
+static int sp_platform_probe(struct platform_device *pdev)
+{
+	struct sp_device *sp;
+	struct sp_platform *sp_platform;
+	struct device *dev = &pdev->dev;
+	enum dev_dma_attr attr;
+	struct resource *ior;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_platform = devm_kzalloc(dev, sizeof(*sp_platform), GFP_KERNEL);
+	if (!sp_platform)
+		goto e_err;
+
+	sp->dev_specific = sp_platform;
+	sp->dev_data = pdev->dev.of_node ? sp_get_of_dev_data(pdev)
+					 : sp_get_acpi_dev_data(pdev);
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ior = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	sp->io_map = devm_ioremap_resource(dev, ior);
+	if (IS_ERR(sp->io_map)) {
+		ret = PTR_ERR(sp->io_map);
+		goto e_err;
+	}
+
+	attr = device_get_dma_attr(dev);
+	if (attr == DEV_DMA_NOT_SUPPORTED) {
+		dev_err(dev, "DMA is not supported");
+		goto e_err;
+	}
+
+	sp_platform->coherent = (attr == DEV_DMA_COHERENT);
+	if (sp_platform->coherent)
+		sp->axcache = CACHE_WB_NO_ALLOC;
+	else
+		sp->axcache = CACHE_NONE;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static int sp_platform_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return 0;
+
+	sp_destroy(sp);
+
+	dev_notice(dev, "disabled\n");
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int sp_platform_suspend(struct platform_device *pdev,
+			       pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_platform_resume(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern const struct ccp_vdata ccpv3_platform;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv3_platform,
+#endif
+	},
+};
+
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id sp_acpi_match[] = {
+	{ "AMDI0C00", (kernel_ulong_t)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(acpi, sp_acpi_match);
+#endif
+
+#ifdef CONFIG_OF
+static const struct of_device_id sp_of_match[] = {
+	{ .compatible = "amd,ccp-seattle-v1a",
+	  .data = (const void *)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, sp_of_match);
+#endif
+
+static struct platform_driver sp_platform_driver = {
+	.driver = {
+		.name = "sp",
+#ifdef CONFIG_ACPI
+		.acpi_match_table = sp_acpi_match,
+#endif
+#ifdef CONFIG_OF
+		.of_match_table = sp_of_match,
+#endif
+	},
+	.probe = sp_platform_probe,
+	.remove = sp_platform_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_platform_suspend,
+	.resume = sp_platform_resume,
+#endif
+};
+
+struct sp_device *sp_platform_get_master(void)
+{
+	return sp_dev_master;
+}
+
+int sp_platform_init(void)
+{
+	return platform_driver_register(&sp_platform_driver);
+}
+
+void sp_platform_exit(void)
+{
+	platform_driver_unregister(&sp_platform_driver);
+}
diff --git a/include/linux/ccp.h b/include/linux/ccp.h
index c71dd8f..1ea14e6 100644
--- a/include/linux/ccp.h
+++ b/include/linux/ccp.h
@@ -24,8 +24,7 @@
 struct ccp_device;
 struct ccp_cmd;
 
-#if defined(CONFIG_CRYPTO_DEV_CCP_DD) || \
-	defined(CONFIG_CRYPTO_DEV_CCP_DD_MODULE)
+#if defined(CONFIG_CRYPTO_DEV_CCP)
 
 /**
  * ccp_present - check if a CCP device is present


^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device
@ 2017-03-02 15:16   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The CCP device is part of the AMD Secure Processor. In order to expand
its usage, create a framework that allows the functional components of
the AMD Secure Processor to be initialized and handled appropriately.
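
A minimal usage sketch (illustrative only, not part of the patch; the
example_probe() arguments are hypothetical and error handling is
omitted) of how a bus driver hands a device to the framework, mirroring
sp_pci_probe() and sp_platform_probe() below:

	static int example_probe(struct device *dev,
				 struct sp_dev_data *data,
				 void __iomem *io_map, unsigned int irq)
	{
		struct sp_device *sp;

		sp = sp_alloc_struct(dev);	/* allocate and name the unit */
		if (!sp)
			return -ENOMEM;

		sp->dev_data = data;	/* which sub-devices are present */
		sp->io_map = io_map;	/* shared register mapping */
		sp->psp_irq = sp->ccp_irq = irq;

		dev_set_drvdata(dev, sp);

		return sp_init(sp);	/* dispatches to ccp_dev_init() etc. */
	}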

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/crypto/Kconfig           |   10 +
 drivers/crypto/ccp/Kconfig       |   43 +++--
 drivers/crypto/ccp/Makefile      |    8 -
 drivers/crypto/ccp/ccp-dev-v3.c  |   86 +++++-----
 drivers/crypto/ccp/ccp-dev-v5.c  |   73 ++++-----
 drivers/crypto/ccp/ccp-dev.c     |  137 +++++++++-------
 drivers/crypto/ccp/ccp-dev.h     |   35 ----
 drivers/crypto/ccp/sp-dev.c      |  308 ++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/sp-dev.h      |  140 ++++++++++++++++
 drivers/crypto/ccp/sp-pci.c      |  324 ++++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/sp-platform.c |  268 +++++++++++++++++++++++++++++++
 include/linux/ccp.h              |    3 
 12 files changed, 1240 insertions(+), 195 deletions(-)
 create mode 100644 drivers/crypto/ccp/sp-dev.c
 create mode 100644 drivers/crypto/ccp/sp-dev.h
 create mode 100644 drivers/crypto/ccp/sp-pci.c
 create mode 100644 drivers/crypto/ccp/sp-platform.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 7956478..d31b469 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -456,14 +456,14 @@ config CRYPTO_DEV_ATMEL_SHA
 	  To compile this driver as a module, choose M here: the module
 	  will be called atmel-sha.
 
-config CRYPTO_DEV_CCP
-	bool "Support for AMD Cryptographic Coprocessor"
+config CRYPTO_DEV_SP
+	bool "Support for AMD Secure Processor"
 	depends on ((X86 && PCI) || (ARM64 && (OF_ADDRESS || ACPI))) && HAS_IOMEM
 	help
-	  The AMD Cryptographic Coprocessor provides hardware offload support
-	  for encryption, hashing and related operations.
+	  The AMD Secure Processor provides hardware offload support for
+	  memory encryption in virtualization and for cryptographic hashing
+	  and related operations.
 
-if CRYPTO_DEV_CCP
+if CRYPTO_DEV_SP
 	source "drivers/crypto/ccp/Kconfig"
 endif
 
diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 2238f77..bc08f03 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -1,26 +1,37 @@
-config CRYPTO_DEV_CCP_DD
-	tristate "Cryptographic Coprocessor device driver"
-	depends on CRYPTO_DEV_CCP
-	default m
-	select HW_RANDOM
-	select DMA_ENGINE
-	select DMADEVICES
-	select CRYPTO_SHA1
-	select CRYPTO_SHA256
-	help
-	  Provides the interface to use the AMD Cryptographic Coprocessor
-	  which can be used to offload encryption operations such as SHA,
-	  AES and more. If you choose 'M' here, this module will be called
-	  ccp.
-
 config CRYPTO_DEV_CCP_CRYPTO
 	tristate "Encryption and hashing offload support"
-	depends on CRYPTO_DEV_CCP_DD
+	depends on CRYPTO_DEV_SP_DD
 	default m
 	select CRYPTO_HASH
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AUTHENC
+	select CRYPTO_DEV_CCP
 	help
 	  Support for using the cryptographic API with the AMD Cryptographic
 	  Coprocessor. This module supports offload of SHA and AES algorithms.
 	  If you choose 'M' here, this module will be called ccp_crypto.
+
+config CRYPTO_DEV_SP_DD
+	tristate "Secure Processor device driver"
+	depends on CRYPTO_DEV_SP
+	default m
+	help
+	  Provides the interface to use the AMD Secure Processor. The
+	  AMD Secure Processor supports the Platform Security Processor (PSP)
+	  and Cryptographic Coprocessor (CCP). If you choose 'M' here, this
+	  module will be called ccp.
+
+if CRYPTO_DEV_SP_DD
+config CRYPTO_DEV_CCP
+	bool "Cryptographic Coprocessor interface"
+	default y
+	select HW_RANDOM
+	select DMA_ENGINE
+	select DMADEVICES
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	help
+	  Provides the interface to use the AMD Cryptographic Coprocessor
+	  which can be used to offload encryption operations such as SHA,
+	  AES and more.
+endif
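
For reference, an illustrative .config fragment with all of the options
introduced above enabled (the sp/ccp core built as the "ccp" module)
would be:

	CONFIG_CRYPTO_DEV_SP=y
	CONFIG_CRYPTO_DEV_SP_DD=m
	CONFIG_CRYPTO_DEV_CCP=y
	CONFIG_CRYPTO_DEV_CCP_CRYPTO=m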
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 346ceb8..8127e18 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -1,11 +1,11 @@
-obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
-ccp-objs := ccp-dev.o \
+obj-$(CONFIG_CRYPTO_DEV_SP_DD) += ccp.o
+ccp-objs := sp-dev.o sp-platform.o
+ccp-$(CONFIG_PCI) += sp-pci.o
+ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-ops.o \
 	    ccp-dev-v3.o \
 	    ccp-dev-v5.o \
-	    ccp-platform.o \
 	    ccp-dmaengine.o
-ccp-$(CONFIG_PCI) += ccp-pci.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/ccp-dev-v3.c b/drivers/crypto/ccp/ccp-dev-v3.c
index 7bc0998..5c50d14 100644
--- a/drivers/crypto/ccp/ccp-dev-v3.c
+++ b/drivers/crypto/ccp/ccp-dev-v3.c
@@ -315,6 +315,39 @@ static int ccp_perform_ecc(struct ccp_op *op)
 	return ccp_do_cmd(op, cr, ARRAY_SIZE(cr));
 }
 
+static irqreturn_t ccp_irq_handler(int irq, void *data)
+{
+	struct ccp_device *ccp = data;
+	struct ccp_cmd_queue *cmd_q;
+	u32 q_int, status;
+	unsigned int i;
+
+	status = ioread32(ccp->io_regs + IRQ_STATUS_REG);
+
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		cmd_q = &ccp->cmd_q[i];
+
+		q_int = status & (cmd_q->int_ok | cmd_q->int_err);
+		if (q_int) {
+			cmd_q->int_status = status;
+			cmd_q->q_status = ioread32(cmd_q->reg_status);
+			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+			/* On error, only save the first error value */
+			if ((q_int & cmd_q->int_err) && !cmd_q->cmd_error)
+				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+			cmd_q->int_rcvd = 1;
+
+			/* Acknowledge the interrupt and wake the kthread */
+			iowrite32(q_int, ccp->io_regs + IRQ_STATUS_REG);
+			wake_up_interruptible(&cmd_q->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int ccp_init(struct ccp_device *ccp)
 {
 	struct device *dev = ccp->dev;
@@ -374,7 +407,7 @@ static int ccp_init(struct ccp_device *ccp)
 
 #ifdef CONFIG_ARM64
 		/* For arm64 set the recommended queue cache settings */
-		iowrite32(ccp->axcache, ccp->io_regs + CMD_Q_CACHE_BASE +
+		iowrite32(ccp->sp->axcache, ccp->io_regs + CMD_Q_CACHE_BASE +
 			  (CMD_Q_CACHE_INC * i));
 #endif
 
@@ -398,7 +431,7 @@ static int ccp_init(struct ccp_device *ccp)
 	iowrite32(qim, ccp->io_regs + IRQ_STATUS_REG);
 
 	/* Request an irq */
-	ret = ccp->get_irq(ccp);
+	ret = sp_request_ccp_irq(ccp->sp, ccp_irq_handler, ccp->name, ccp);
 	if (ret) {
 		dev_err(dev, "unable to allocate an IRQ\n");
 		goto e_pool;
@@ -450,7 +483,7 @@ static int ccp_init(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 e_pool:
 	for (i = 0; i < ccp->cmd_q_count; i++)
@@ -496,7 +529,7 @@ static void ccp_destroy(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 	for (i = 0; i < ccp->cmd_q_count; i++)
 		dma_pool_destroy(ccp->cmd_q[i].dma_pool);
@@ -516,40 +549,6 @@ static void ccp_destroy(struct ccp_device *ccp)
 	}
 }
 
-static irqreturn_t ccp_irq_handler(int irq, void *data)
-{
-	struct device *dev = data;
-	struct ccp_device *ccp = dev_get_drvdata(dev);
-	struct ccp_cmd_queue *cmd_q;
-	u32 q_int, status;
-	unsigned int i;
-
-	status = ioread32(ccp->io_regs + IRQ_STATUS_REG);
-
-	for (i = 0; i < ccp->cmd_q_count; i++) {
-		cmd_q = &ccp->cmd_q[i];
-
-		q_int = status & (cmd_q->int_ok | cmd_q->int_err);
-		if (q_int) {
-			cmd_q->int_status = status;
-			cmd_q->q_status = ioread32(cmd_q->reg_status);
-			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
-
-			/* On error, only save the first error value */
-			if ((q_int & cmd_q->int_err) && !cmd_q->cmd_error)
-				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
-
-			cmd_q->int_rcvd = 1;
-
-			/* Acknowledge the interrupt and wake the kthread */
-			iowrite32(q_int, ccp->io_regs + IRQ_STATUS_REG);
-			wake_up_interruptible(&cmd_q->int_queue);
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static const struct ccp_actions ccp3_actions = {
 	.aes = ccp_perform_aes,
 	.xts_aes = ccp_perform_xts_aes,
@@ -562,13 +561,18 @@ static const struct ccp_actions ccp3_actions = {
 	.init = ccp_init,
 	.destroy = ccp_destroy,
 	.get_free_slots = ccp_get_free_slots,
-	.irqhandler = ccp_irq_handler,
 };
 
-const struct ccp_vdata ccpv3 = {
+const struct ccp_vdata ccpv3_platform = {
+	.version = CCP_VERSION(3, 0),
+	.setup = NULL,
+	.perform = &ccp3_actions,
+	.offset = 0,
+};
+
+const struct ccp_vdata ccpv3_pci = {
 	.version = CCP_VERSION(3, 0),
 	.setup = NULL,
 	.perform = &ccp3_actions,
-	.bar = 2,
 	.offset = 0x20000,
 };
diff --git a/drivers/crypto/ccp/ccp-dev-v5.c b/drivers/crypto/ccp/ccp-dev-v5.c
index 612898b..dd6335b 100644
--- a/drivers/crypto/ccp/ccp-dev-v5.c
+++ b/drivers/crypto/ccp/ccp-dev-v5.c
@@ -651,6 +651,38 @@ static int ccp_assign_lsbs(struct ccp_device *ccp)
 	return rc;
 }
 
+static irqreturn_t ccp5_irq_handler(int irq, void *data)
+{
+	struct ccp_device *ccp = data;
+	u32 status;
+	unsigned int i;
+
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		struct ccp_cmd_queue *cmd_q = &ccp->cmd_q[i];
+
+		status = ioread32(cmd_q->reg_interrupt_status);
+
+		if (status) {
+			cmd_q->int_status = status;
+			cmd_q->q_status = ioread32(cmd_q->reg_status);
+			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
+
+			/* On error, only save the first error value */
+			if ((status & INT_ERROR) && !cmd_q->cmd_error)
+				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
+
+			cmd_q->int_rcvd = 1;
+
+			/* Acknowledge the interrupt and wake the kthread */
+			iowrite32(ALL_INTERRUPTS, cmd_q->reg_interrupt_status);
+			wake_up_interruptible(&cmd_q->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int ccp5_init(struct ccp_device *ccp)
 {
 	struct device *dev = ccp->dev;
@@ -752,7 +784,7 @@ static int ccp5_init(struct ccp_device *ccp)
 
 	dev_dbg(dev, "Requesting an IRQ...\n");
 	/* Request an irq */
-	ret = ccp->get_irq(ccp);
+	ret = sp_request_ccp_irq(ccp->sp, ccp5_irq_handler, ccp->name, ccp);
 	if (ret) {
 		dev_err(dev, "unable to allocate an IRQ\n");
 		goto e_pool;
@@ -855,7 +887,7 @@ static int ccp5_init(struct ccp_device *ccp)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
 e_irq:
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 e_pool:
 	for (i = 0; i < ccp->cmd_q_count; i++)
@@ -901,7 +933,7 @@ static void ccp5_destroy(struct ccp_device *ccp)
 		if (ccp->cmd_q[i].kthread)
 			kthread_stop(ccp->cmd_q[i].kthread);
 
-	ccp->free_irq(ccp);
+	sp_free_ccp_irq(ccp->sp, ccp);
 
 	for (i = 0; i < ccp->cmd_q_count; i++) {
 		cmd_q = &ccp->cmd_q[i];
@@ -924,38 +956,6 @@ static void ccp5_destroy(struct ccp_device *ccp)
 	}
 }
 
-static irqreturn_t ccp5_irq_handler(int irq, void *data)
-{
-	struct device *dev = data;
-	struct ccp_device *ccp = dev_get_drvdata(dev);
-	u32 status;
-	unsigned int i;
-
-	for (i = 0; i < ccp->cmd_q_count; i++) {
-		struct ccp_cmd_queue *cmd_q = &ccp->cmd_q[i];
-
-		status = ioread32(cmd_q->reg_interrupt_status);
-
-		if (status) {
-			cmd_q->int_status = status;
-			cmd_q->q_status = ioread32(cmd_q->reg_status);
-			cmd_q->q_int_status = ioread32(cmd_q->reg_int_status);
-
-			/* On error, only save the first error value */
-			if ((status & INT_ERROR) && !cmd_q->cmd_error)
-				cmd_q->cmd_error = CMD_Q_ERROR(cmd_q->q_status);
-
-			cmd_q->int_rcvd = 1;
-
-			/* Acknowledge the interrupt and wake the kthread */
-			iowrite32(ALL_INTERRUPTS, cmd_q->reg_interrupt_status);
-			wake_up_interruptible(&cmd_q->int_queue);
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static void ccp5_config(struct ccp_device *ccp)
 {
 	/* Public side */
@@ -1001,14 +1001,12 @@ static const struct ccp_actions ccp5_actions = {
 	.init = ccp5_init,
 	.destroy = ccp5_destroy,
 	.get_free_slots = ccp5_get_free_slots,
-	.irqhandler = ccp5_irq_handler,
 };
 
 const struct ccp_vdata ccpv5a = {
 	.version = CCP_VERSION(5, 0),
 	.setup = ccp5_config,
 	.perform = &ccp5_actions,
-	.bar = 2,
 	.offset = 0x0,
 };
 
@@ -1016,6 +1014,5 @@ const struct ccp_vdata ccpv5b = {
 	.version = CCP_VERSION(5, 0),
 	.setup = ccp5other_config,
 	.perform = &ccp5_actions,
-	.bar = 2,
 	.offset = 0x0,
 };
diff --git a/drivers/crypto/ccp/ccp-dev.c b/drivers/crypto/ccp/ccp-dev.c
index 511ab04..0fa8c4a 100644
--- a/drivers/crypto/ccp/ccp-dev.c
+++ b/drivers/crypto/ccp/ccp-dev.c
@@ -22,19 +22,11 @@
 #include <linux/mutex.h>
 #include <linux/delay.h>
 #include <linux/hw_random.h>
-#include <linux/cpu.h>
-#ifdef CONFIG_X86
-#include <asm/cpu_device_id.h>
-#endif
 #include <linux/ccp.h>
 
+#include "sp-dev.h"
 #include "ccp-dev.h"
 
-MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
-MODULE_LICENSE("GPL");
-MODULE_VERSION("1.0.0");
-MODULE_DESCRIPTION("AMD Cryptographic Coprocessor driver");
-
 struct ccp_tasklet_data {
 	struct completion completion;
 	struct ccp_cmd *cmd;
@@ -110,13 +102,6 @@ static LIST_HEAD(ccp_units);
 static DEFINE_SPINLOCK(ccp_rr_lock);
 static struct ccp_device *ccp_rr;
 
-/* Ever-increasing value to produce unique unit numbers */
-static atomic_t ccp_unit_ordinal;
-static unsigned int ccp_increment_unit_ordinal(void)
-{
-	return atomic_inc_return(&ccp_unit_ordinal);
-}
-
 /**
  * ccp_add_device - add a CCP device to the list
  *
@@ -455,19 +440,17 @@ int ccp_cmd_queue_thread(void *data)
 	return 0;
 }
 
-/**
- * ccp_alloc_struct - allocate and initialize the ccp_device struct
- *
- * @dev: device struct of the CCP
- */
-struct ccp_device *ccp_alloc_struct(struct device *dev)
+static struct ccp_device *ccp_alloc_struct(struct sp_device *sp)
 {
+	struct device *dev = sp->dev;
 	struct ccp_device *ccp;
 
 	ccp = devm_kzalloc(dev, sizeof(*ccp), GFP_KERNEL);
 	if (!ccp)
 		return NULL;
+
 	ccp->dev = dev;
+	ccp->sp = sp;
 
 	INIT_LIST_HEAD(&ccp->cmd);
 	INIT_LIST_HEAD(&ccp->backlog);
@@ -482,9 +465,8 @@ struct ccp_device *ccp_alloc_struct(struct device *dev)
 	init_waitqueue_head(&ccp->sb_queue);
 	init_waitqueue_head(&ccp->suspend_queue);
 
-	ccp->ord = ccp_increment_unit_ordinal();
-	snprintf(ccp->name, MAX_CCP_NAME_LEN, "ccp-%u", ccp->ord);
-	snprintf(ccp->rngname, MAX_CCP_NAME_LEN, "ccp-%u-rng", ccp->ord);
+	snprintf(ccp->name, MAX_CCP_NAME_LEN, "ccp-%u", sp->ord);
+	snprintf(ccp->rngname, MAX_CCP_NAME_LEN, "ccp-%u-rng", sp->ord);
 
 	return ccp;
 }
@@ -536,53 +518,94 @@ bool ccp_queues_suspended(struct ccp_device *ccp)
 }
 #endif
 
-static int __init ccp_mod_init(void)
+int ccp_dev_init(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
+	struct device *dev = sp->dev;
+	struct ccp_device *ccp;
 	int ret;
 
-	ret = ccp_pci_init();
-	if (ret)
-		return ret;
-
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_pci_exit();
-		return -ENODEV;
+	ret = -ENOMEM;
+	ccp = ccp_alloc_struct(sp);
+	if (!ccp)
+		goto e_err;
+	sp->ccp_data = ccp;
+
+	ccp->vdata = (struct ccp_vdata *)sp->dev_data->ccp_vdata;
+	if (!ccp->vdata || !ccp->vdata->version) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
 	}
 
-	return 0;
-#endif
+	ccp->io_regs = sp->io_map + ccp->vdata->offset;
 
-#ifdef CONFIG_ARM64
-	int ret;
+	if (ccp->vdata->setup)
+		ccp->vdata->setup(ccp);
 
-	ret = ccp_platform_init();
+	ret = ccp->vdata->perform->init(ccp);
 	if (ret)
-		return ret;
+		goto e_err;
 
-	/* Don't leave the driver loaded if init failed */
-	if (ccp_present() != 0) {
-		ccp_platform_exit();
-		return -ENODEV;
-	}
+	dev_notice(dev, "ccp enabled\n");
 
 	return 0;
-#endif
 
-	return -ENODEV;
+e_err:
+	sp->ccp_data = NULL;
+
+	dev_notice(dev, "ccp initialization failed\n");
+
+	return ret;
 }
 
-static void __exit ccp_mod_exit(void)
+void ccp_dev_destroy(struct sp_device *sp)
 {
-#ifdef CONFIG_X86
-	ccp_pci_exit();
-#endif
+	struct ccp_device *ccp = sp->ccp_data;

-#ifdef CONFIG_ARM64
-	ccp_platform_exit();
-#endif
+	if (!ccp)
+		return;
+
+	ccp->vdata->perform->destroy(ccp);
+}
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 1;
+
+	/* Wake all the queue kthreads to prepare for suspend */
+	for (i = 0; i < ccp->cmd_q_count; i++)
+		wake_up_process(ccp->cmd_q[i].kthread);
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	/* Wait for all queue kthreads to say they're done */
+	while (!ccp_queues_suspended(ccp))
+		wait_event_interruptible(ccp->suspend_queue,
+					 ccp_queues_suspended(ccp));
+
+	return 0;
 }
 
-module_init(ccp_mod_init);
-module_exit(ccp_mod_exit);
+int ccp_dev_resume(struct sp_device *sp)
+{
+	struct ccp_device *ccp = sp->ccp_data;
+	unsigned long flags;
+	unsigned int i;
+
+	spin_lock_irqsave(&ccp->cmd_lock, flags);
+
+	ccp->suspending = 0;
+
+	/* Wake up all the kthreads */
+	for (i = 0; i < ccp->cmd_q_count; i++) {
+		ccp->cmd_q[i].suspended = 0;
+		wake_up_process(ccp->cmd_q[i].kthread);
+	}
+
+	spin_unlock_irqrestore(&ccp->cmd_lock, flags);
+
+	return 0;
+}
diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 649e561..25a4bfd 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -27,6 +27,8 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 
+#include "sp-dev.h"
+
 #define MAX_CCP_NAME_LEN		16
 #define MAX_DMAPOOL_NAME_LEN		32
 
@@ -35,9 +37,6 @@
 
 #define TRNG_RETRIES			10
 
-#define CACHE_NONE			0x00
-#define CACHE_WB_NO_ALLOC		0xb7
-
 /****** Register Mappings ******/
 #define Q_MASK_REG			0x000
 #define TRNG_OUT_REG			0x00c
@@ -322,18 +321,15 @@ struct ccp_device {
 	struct list_head entry;
 
 	struct ccp_vdata *vdata;
-	unsigned int ord;
 	char name[MAX_CCP_NAME_LEN];
 	char rngname[MAX_CCP_NAME_LEN];
 
 	struct device *dev;
+	struct sp_device *sp;
 
 	/* Bus specific device information
 	 */
 	void *dev_specific;
-	int (*get_irq)(struct ccp_device *ccp);
-	void (*free_irq)(struct ccp_device *ccp);
-	unsigned int irq;
 
 	/* I/O area used for device communication. The register mapping
 	 * starts at an offset into the mapped bar.
@@ -342,7 +338,6 @@ struct ccp_device {
 	 *   them.
 	 */
 	struct mutex req_mutex ____cacheline_aligned;
-	void __iomem *io_map;
 	void __iomem *io_regs;
 
 	/* Master lists that all cmds are queued on. Because there can be
@@ -407,9 +402,6 @@ struct ccp_device {
 	/* Suspend support */
 	unsigned int suspending;
 	wait_queue_head_t suspend_queue;
-
-	/* DMA caching attribute support */
-	unsigned int axcache;
 };
 
 enum ccp_memtype {
@@ -592,18 +584,11 @@ struct ccp5_desc {
 	struct dword7 dw7;
 };
 
-int ccp_pci_init(void);
-void ccp_pci_exit(void);
-
-int ccp_platform_init(void);
-void ccp_platform_exit(void);
-
 void ccp_add_device(struct ccp_device *ccp);
 void ccp_del_device(struct ccp_device *ccp);
 
 extern void ccp_log_error(struct ccp_device *, int);
 
-struct ccp_device *ccp_alloc_struct(struct device *dev);
 bool ccp_queues_suspended(struct ccp_device *ccp);
 int ccp_cmd_queue_thread(void *data);
 int ccp_trng_read(struct hwrng *rng, void *data, size_t max, bool wait);
@@ -629,20 +614,6 @@ struct ccp_actions {
 	unsigned int (*get_free_slots)(struct ccp_cmd_queue *);
 	int (*init)(struct ccp_device *);
 	void (*destroy)(struct ccp_device *);
-	irqreturn_t (*irqhandler)(int, void *);
-};
-
-/* Structure to hold CCP version-specific values */
-struct ccp_vdata {
-	const unsigned int version;
-	void (*setup)(struct ccp_device *);
-	const struct ccp_actions *perform;
-	const unsigned int bar;
-	const unsigned int offset;
 };
 
-extern const struct ccp_vdata ccpv3;
-extern const struct ccp_vdata ccpv5a;
-extern const struct ccp_vdata ccpv5b;
-
 #endif
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
new file mode 100644
index 0000000..e47fb8e
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -0,0 +1,308 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+
+#include "sp-dev.h"
+
+MODULE_AUTHOR("Tom Lendacky <thomas.lendacky@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.1.0");
+MODULE_DESCRIPTION("AMD Secure Processor driver");
+
+/* List of SPs, SP count, read-write access lock, and access functions
+ *
+ * Lock structure: sp_unit_lock must be held whenever the SP list is
+ * examined or modified; sp_get_device() takes it for writing because
+ * it also rotates the list.
+ */
+static DEFINE_RWLOCK(sp_unit_lock);
+static LIST_HEAD(sp_units);
+
+/* Ever-increasing value to produce unique unit numbers */
+static atomic_t sp_ordinal;
+
+static void sp_add_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_add_tail(&sp->entry, &sp_units);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+static void sp_del_device(struct sp_device *sp)
+{
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	list_del(&sp->entry);
+
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+}
+
+struct sp_device *sp_get_device(void)
+{
+	struct sp_device *sp = NULL;
+	unsigned long flags;
+
+	write_lock_irqsave(&sp_unit_lock, flags);
+
+	if (list_empty(&sp_units))
+		goto unlock;
+
+	sp = list_first_entry(&sp_units, struct sp_device, entry);
+
+	/* Rotate the entry to the tail for round-robin selection;
+	 * list_move_tail() unlinks it first, so the list is not
+	 * corrupted by adding an already-linked entry.
+	 */
+	list_move_tail(&sp->entry, &sp_units);
+unlock:
+	write_unlock_irqrestore(&sp_unit_lock, flags);
+	return sp;
+}
+
+static irqreturn_t sp_irq_handler(int irq, void *data)
+{
+	struct sp_device *sp = data;
+
+	if (sp->psp_irq_handler)
+		sp->psp_irq_handler(irq, sp->psp_irq_data);
+
+	if (sp->ccp_irq_handler)
+		sp->ccp_irq_handler(irq, sp->ccp_irq_data);
+
+	return IRQ_HANDLED;
+}
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Need a common routine to manage all interrupts */
+		sp->psp_irq_data = data;
+		sp->psp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->psp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		ret = request_irq(sp->psp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data)
+{
+	int ret;
+
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Need a common routine to manage all interrupts */
+		sp->ccp_irq_data = data;
+		sp->ccp_irq_handler = handler;
+
+		if (!sp->irq_registered) {
+			ret = request_irq(sp->ccp_irq, sp_irq_handler, 0,
+					  sp->name, sp);
+			if (ret)
+				return ret;
+
+			sp->irq_registered = true;
+		}
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		ret = request_irq(sp->ccp_irq, handler, 0, name, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+void sp_free_psp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->ccp_vdata) {
+		/* Using a common routine to manage all interrupts */
+		if (!sp->ccp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->psp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->psp_irq_handler = NULL;
+		sp->psp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		free_irq(sp->psp_irq, data);
+	}
+}
+
+void sp_free_ccp_irq(struct sp_device *sp, void *data)
+{
+	if ((sp->psp_irq == sp->ccp_irq) && sp->dev_data->psp_vdata) {
+		/* Using a common routine to manage all interrupts */
+		if (!sp->psp_irq_handler) {
+			/* Nothing else using it, so free it */
+			free_irq(sp->ccp_irq, sp);
+
+			sp->irq_registered = false;
+		}
+
+		sp->ccp_irq_handler = NULL;
+		sp->ccp_irq_data = NULL;
+	} else {
+		/* Each sub-device can manage its own interrupt */
+		free_irq(sp->ccp_irq, data);
+	}
+}
+
+/**
+ * sp_alloc_struct - allocate and initialize the sp_device struct
+ *
+ * @dev: device struct of the SP
+ */
+struct sp_device *sp_alloc_struct(struct device *dev)
+{
+	struct sp_device *sp;
+
+	sp = devm_kzalloc(dev, sizeof(*sp), GFP_KERNEL);
+	if (!sp)
+		return NULL;
+
+	sp->dev = dev;
+	sp->ord = atomic_inc_return(&sp_ordinal) - 1;
+	snprintf(sp->name, SP_MAX_NAME_LEN, "sp-%u", sp->ord);
+
+	return sp;
+}
+
+int sp_init(struct sp_device *sp)
+{
+	sp_add_device(sp);
+
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_init(sp);
+
+	return 0;
+}
+
+void sp_destroy(struct sp_device *sp)
+{
+	if (sp->dev_data->ccp_vdata)
+		ccp_dev_destroy(sp);
+
+	sp_del_device(sp);
+}
+
+int sp_suspend(struct sp_device *sp, pm_message_t state)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int sp_resume(struct sp_device *sp)
+{
+	int ret;
+
+	if (sp->dev_data->ccp_vdata) {
+		ret = ccp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+struct sp_device *sp_get_psp_master_device(void)
+{
+	struct sp_device *sp = sp_get_device();
+
+	if (!sp)
+		return NULL;
+
+	if (!sp->psp_data || !sp->get_master_device)
+		return NULL;
+
+	return sp->get_master_device();
+}
+
+void sp_set_psp_master(struct sp_device *sp)
+{
+	if (sp->psp_data && sp->set_master_device)
+		sp->set_master_device(sp);
+}
+
+static int __init sp_mod_init(void)
+{
+#ifdef CONFIG_X86
+	return sp_pci_init();
+#endif
+
+#ifdef CONFIG_ARM64
+	return sp_platform_init();
+#endif
+
+	return -ENODEV;
+}
+
+static void __exit sp_mod_exit(void)
+{
+#ifdef CONFIG_X86
+	sp_pci_exit();
+#endif
+
+#ifdef CONFIG_ARM64
+	sp_platform_exit();
+#endif
+}
+
+module_init(sp_mod_init);
+module_exit(sp_mod_exit);
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
new file mode 100644
index 0000000..9a8a8f8
--- /dev/null
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -0,0 +1,140 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *	Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SP_DEV_H__
+#define __SP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+
+#define SP_MAX_NAME_LEN		32
+
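+/* AXI cache attribute values used for sp_device->axcache */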
+#define CACHE_NONE			0x00
+#define CACHE_WB_NO_ALLOC		0xb7
+
+/* Structure to hold CCP device data */
+struct ccp_device;
+struct ccp_vdata {
+	const unsigned int version;
+	void (*setup)(struct ccp_device *);
+	const struct ccp_actions *perform;
+	const unsigned int offset;
+};
+
+/* Structure to hold SP device data */
+struct sp_dev_data {
+	const unsigned int bar;
+
+	const struct ccp_vdata *ccp_vdata;
+	const void *psp_vdata;
+};
+
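+/* Per-device state for an AMD Secure Processor and its sub-devices */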
+struct sp_device {
+	struct list_head entry;
+
+	struct device *dev;
+
+	struct sp_dev_data *dev_data;
+	unsigned int ord;
+	char name[SP_MAX_NAME_LEN];
+
+	/* Bus specific device information */
+	void *dev_specific;
+
+	/* I/O area used for device communication. */
+	void __iomem *io_map;
+
+	/* DMA caching attribute support */
+	unsigned int axcache;
+
+	bool irq_registered;
+
+	/* get and set master device */
+	struct sp_device *(*get_master_device)(void);
+	void (*set_master_device)(struct sp_device *);
+
+	unsigned int psp_irq;
+	irq_handler_t psp_irq_handler;
+	void *psp_irq_data;
+
+	unsigned int ccp_irq;
+	irq_handler_t ccp_irq_handler;
+	void *ccp_irq_data;
+
+	void *psp_data;
+	void *ccp_data;
+};
+
+int sp_pci_init(void);
+void sp_pci_exit(void);
+
+int sp_platform_init(void);
+void sp_platform_exit(void);
+
+struct sp_device *sp_alloc_struct(struct device *dev);
+
+int sp_init(struct sp_device *sp);
+void sp_destroy(struct sp_device *sp);
+struct sp_device *sp_get_device(void);
+
+int sp_suspend(struct sp_device *sp, pm_message_t state);
+int sp_resume(struct sp_device *sp);
+
+int sp_request_psp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_psp_irq(struct sp_device *sp, void *data);
+
+int sp_request_ccp_irq(struct sp_device *sp, irq_handler_t handler,
+		       const char *name, void *data);
+void sp_free_ccp_irq(struct sp_device *sp, void *data);
+
+void sp_set_psp_master(struct sp_device *sp);
+struct sp_device *sp_get_psp_master_device(void);
+
+#ifdef CONFIG_CRYPTO_DEV_CCP
+
+int ccp_dev_init(struct sp_device *sp);
+void ccp_dev_destroy(struct sp_device *sp);
+
+int ccp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int ccp_dev_resume(struct sp_device *sp);
+
+#else	/* !CONFIG_CRYPTO_DEV_CCP */
+
+static inline int ccp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void ccp_dev_destroy(struct sp_device *sp) { }
+
+static inline int ccp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int ccp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_CCP */
+
+#endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
new file mode 100644
index 0000000..0960e2d
--- /dev/null
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/dma-mapping.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+
+#include "sp-dev.h"
+
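+/* One MSI-X vector each for the PSP and the CCP sub-device */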
+#define MSIX_VECTORS			2
+
+struct sp_pci {
+	int msix_count;
+	struct msix_entry msix_entry[MSIX_VECTORS];
+};
+
+static struct sp_device *sp_dev_master;
+
+static int sp_get_msix_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int v, ret;
+
+	for (v = 0; v < ARRAY_SIZE(sp_pci->msix_entry); v++)
+		sp_pci->msix_entry[v].entry = v;
+
+	ret = pci_enable_msix_range(pdev, sp_pci->msix_entry, 1, v);
+	if (ret < 0)
+		return ret;
+
+	sp_pci->msix_count = ret;
+
+	sp->psp_irq = sp_pci->msix_entry[0].vector;
+	sp->ccp_irq = (sp_pci->msix_count > 1) ? sp_pci->msix_entry[1].vector
+					       : sp_pci->msix_entry[0].vector;
+
+	return 0;
+}
+
+static int sp_get_msi_irq(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int ret;
+
+	ret = pci_enable_msi(pdev);
+	if (ret)
+		return ret;
+
+	sp->psp_irq = pdev->irq;
+	sp->ccp_irq = pdev->irq;
+
+	return 0;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	int ret;
+
+	ret = sp_get_msix_irqs(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI-X vectors, try MSI */
+	dev_notice(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
+	ret = sp_get_msi_irq(sp);
+	if (!ret)
+		return 0;
+
+	/* Couldn't get MSI interrupt */
+	dev_notice(dev, "could not enable MSI (%d)\n", ret);
+
+	return ret;
+}
+
+static void sp_free_irqs(struct sp_device *sp)
+{
+	struct sp_pci *sp_pci = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (sp_pci->msix_count)
+		pci_disable_msix(pdev);
+	else if (sp->psp_irq)
+		pci_disable_msi(pdev);
+
+	sp->psp_irq = 0;
+	sp->ccp_irq = 0;
+}
+
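+/*
+ * The PSP master is the SP device with the lowest PCI bus/slot/function
+ * ordering; compare a newly probed device against the current master.
+ */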
+static bool sp_pci_is_master(struct sp_device *sp)
+{
+	struct device *dev_cur, *dev_new;
+	struct pci_dev *pdev_cur, *pdev_new;
+
+	dev_new = sp->dev;
+	dev_cur = sp_dev_master->dev;
+
+	pdev_new = to_pci_dev(dev_new);
+	pdev_cur = to_pci_dev(dev_cur);
+
+	if (pdev_new->bus->number != pdev_cur->bus->number)
+		return pdev_new->bus->number < pdev_cur->bus->number;
+
+	if (PCI_SLOT(pdev_new->devfn) != PCI_SLOT(pdev_cur->devfn))
+		return PCI_SLOT(pdev_new->devfn) < PCI_SLOT(pdev_cur->devfn);
+
+	return PCI_FUNC(pdev_new->devfn) < PCI_FUNC(pdev_cur->devfn);
+}
+
+static void sp_pci_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master) {
+		sp_dev_master = sp;
+		return;
+	}
+
+	if (sp_pci_is_master(sp))
+		sp_dev_master = sp;
+}
+
+static struct sp_device *sp_pci_get_master(void)
+{
+	return sp_dev_master;
+}
+
+static int sp_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct sp_device *sp;
+	struct sp_pci *sp_pci;
+	struct device *dev = &pdev->dev;
+	void __iomem * const *iomap_table;
+	int bar_mask;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_pci = devm_kzalloc(dev, sizeof(*sp_pci), GFP_KERNEL);
+	if (!sp_pci)
+		goto e_err;
+	sp->dev_specific = sp_pci;
+
+	sp->dev_data = (struct sp_dev_data *)id->driver_data;
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		dev_err(dev, "pcim_enable_device failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
+	ret = pcim_iomap_regions(pdev, bar_mask, "sp");
+	if (ret) {
+		dev_err(dev, "pcim_iomap_regions failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	iomap_table = pcim_iomap_table(pdev);
+	if (!iomap_table) {
+		dev_err(dev, "pcim_iomap_table failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	sp->io_map = iomap_table[sp->dev_data->bar];
+	if (!sp->io_map) {
+		dev_err(dev, "ioremap failed\n");
+		ret = -ENOMEM;
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	pci_set_master(pdev);
+
+	sp->set_master_device = sp_pci_set_master;
+	sp->get_master_device = sp_pci_get_master;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n",
+				ret);
+			goto e_err;
+		}
+	}
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static void sp_pci_remove(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return;
+
+	sp_destroy(sp);
+
+	sp_free_irqs(sp);
+
+	dev_notice(dev, "disabled\n");
+}
+
+#ifdef CONFIG_PM
+static int sp_pci_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_pci_resume(struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern struct ccp_vdata ccpv3_pci;
+extern struct ccp_vdata ccpv5a;
+extern struct ccp_vdata ccpv5b;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv3_pci,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5a,
+#endif
+	},
+	{
+		.bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_CCP
+		.ccp_vdata = &ccpv5b,
+#endif
+	},
+};
+
+static const struct pci_device_id sp_pci_table[] = {
+	{ PCI_VDEVICE(AMD, 0x1537), (kernel_ulong_t)&dev_data[0] },
+	{ PCI_VDEVICE(AMD, 0x1456), (kernel_ulong_t)&dev_data[1] },
+	{ PCI_VDEVICE(AMD, 0x1468), (kernel_ulong_t)&dev_data[2] },
+	/* Last entry must be zero */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, sp_pci_table);
+
+static struct pci_driver sp_pci_driver = {
+	.name = "sp",
+	.id_table = sp_pci_table,
+	.probe = sp_pci_probe,
+	.remove = sp_pci_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_pci_suspend,
+	.resume = sp_pci_resume,
+#endif
+};
+
+int sp_pci_init(void)
+{
+	return pci_register_driver(&sp_pci_driver);
+}
+
+void sp_pci_exit(void)
+{
+	pci_unregister_driver(&sp_pci_driver);
+}
diff --git a/drivers/crypto/ccp/sp-platform.c b/drivers/crypto/ccp/sp-platform.c
new file mode 100644
index 0000000..a918238
--- /dev/null
+++ b/drivers/crypto/ccp/sp-platform.c
@@ -0,0 +1,268 @@
+/*
+ * AMD Secure Processor driver
+ *
+ * Copyright (C) 2017 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ * 	   Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/platform_device.h>
+#include <linux/ioport.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/acpi.h>
+
+#include "sp-dev.h"
+
+struct sp_platform {
+	int coherent;
+	unsigned int irq_count;
+};
+
+static struct sp_device *sp_dev_master;
+static const struct acpi_device_id sp_acpi_match[];
+static const struct of_device_id sp_of_match[];
+
+static struct sp_dev_data *sp_get_of_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_OF
+	const struct of_device_id *match;
+
+	match = of_match_node(sp_of_match, pdev->dev.of_node);
+	if (match && match->data)
+		return (struct sp_dev_data *)match->data;
+#endif
+
+	return NULL;
+}
+
+static struct sp_dev_data *sp_get_acpi_dev_data(struct platform_device *pdev)
+{
+#ifdef CONFIG_ACPI
+	const struct acpi_device_id *match;
+
+	match = acpi_match_device(sp_acpi_match, &pdev->dev);
+	if (match && match->driver_data)
+		return (struct sp_dev_data *)match->driver_data;
+#endif
+
+	return NULL;
+}
+
+static int sp_get_irqs(struct sp_device *sp)
+{
+	struct sp_platform *sp_platform = sp->dev_specific;
+	struct device *dev = sp->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	unsigned int i, count;
+	int ret;
+
+	for (i = 0, count = 0; i < pdev->num_resources; i++) {
+		struct resource *res = &pdev->resource[i];
+
+		if (resource_type(res) == IORESOURCE_IRQ)
+			count++;
+	}
+
+	sp_platform->irq_count = count;
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		return ret;
+
+	sp->psp_irq = ret;
+	if (count == 1) {
+		sp->ccp_irq = ret;
+	} else {
+		ret = platform_get_irq(pdev, 1);
+		if (ret < 0)
+			return ret;
+
+		sp->ccp_irq = ret;
+	}
+
+	return 0;
+}
+
+void sp_platform_set_master(struct sp_device *sp)
+{
+	if (!sp_dev_master)
+		sp_dev_master = sp;
+}
+
+static int sp_platform_probe(struct platform_device *pdev)
+{
+	struct sp_device *sp;
+	struct sp_platform *sp_platform;
+	struct device *dev = &pdev->dev;
+	enum dev_dma_attr attr;
+	struct resource *ior;
+	int ret;
+
+	ret = -ENOMEM;
+	sp = sp_alloc_struct(dev);
+	if (!sp)
+		goto e_err;
+
+	sp_platform = devm_kzalloc(dev, sizeof(*sp_platform), GFP_KERNEL);
+	if (!sp_platform)
+		goto e_err;
+
+	sp->dev_specific = sp_platform;
+	sp->dev_data = pdev->dev.of_node ? sp_get_of_dev_data(pdev)
+					 : sp_get_acpi_dev_data(pdev);
+	if (!sp->dev_data) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	ior = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	sp->io_map = devm_ioremap_resource(dev, ior);
+	if (IS_ERR(sp->io_map)) {
+		ret = PTR_ERR(sp->io_map);
+		goto e_err;
+	}
+
+	attr = device_get_dma_attr(dev);
+	if (attr == DEV_DMA_NOT_SUPPORTED) {
+		dev_err(dev, "DMA is not supported\n");
+		ret = -ENODEV;
+		goto e_err;
+	}
+
+	sp_platform->coherent = (attr == DEV_DMA_COHERENT);
+	if (sp_platform->coherent)
+		sp->axcache = CACHE_WB_NO_ALLOC;
+	else
+		sp->axcache = CACHE_NONE;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+	if (ret) {
+		dev_err(dev, "dma_set_mask_and_coherent failed (%d)\n", ret);
+		goto e_err;
+	}
+
+	ret = sp_get_irqs(sp);
+	if (ret)
+		goto e_err;
+
+	dev_set_drvdata(dev, sp);
+
+	ret = sp_init(sp);
+	if (ret)
+		goto e_err;
+
+	dev_notice(dev, "enabled\n");
+
+	return 0;
+
+e_err:
+	dev_notice(dev, "initialization failed\n");
+
+	return ret;
+}
+
+static int sp_platform_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	if (!sp)
+		return 0;
+
+	sp_destroy(sp);
+
+	dev_notice(dev, "disabled\n");
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int sp_platform_suspend(struct platform_device *pdev,
+			       pm_message_t state)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_suspend(sp, state);
+}
+
+static int sp_platform_resume(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct sp_device *sp = dev_get_drvdata(dev);
+
+	return sp_resume(sp);
+}
+#endif
+
+extern struct ccp_vdata ccpv3_platform;
+
+static const struct sp_dev_data dev_data[] = {
+	{
+#ifdef CONFIG_AMD_CCP
+		.ccp_vdata = &ccpv3_platform,
+#endif
+	},
+};
+
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id sp_acpi_match[] = {
+	{ "AMDI0C00", (kernel_ulong_t)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(acpi, sp_acpi_match);
+#endif
+
+#ifdef CONFIG_OF
+static const struct of_device_id sp_of_match[] = {
+	{ .compatible = "amd,ccp-seattle-v1a",
+	  .data = (const void *)&dev_data[0] },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, sp_of_match);
+#endif
+
+static struct platform_driver sp_platform_driver = {
+	.driver = {
+		.name = "sp",
+#ifdef CONFIG_ACPI
+		.acpi_match_table = sp_acpi_match,
+#endif
+#ifdef CONFIG_OF
+		.of_match_table = sp_of_match,
+#endif
+	},
+	.probe = sp_platform_probe,
+	.remove = sp_platform_remove,
+#ifdef CONFIG_PM
+	.suspend = sp_platform_suspend,
+	.resume = sp_platform_resume,
+#endif
+};
+
+struct sp_device *sp_platform_get_master(void)
+{
+	return sp_dev_master;
+}
+
+int sp_platform_init(void)
+{
+	return platform_driver_register(&sp_platform_driver);
+}
+
+void sp_platform_exit(void)
+{
+	platform_driver_unregister(&sp_platform_driver);
+}
diff --git a/include/linux/ccp.h b/include/linux/ccp.h
index c71dd8f..1ea14e6 100644
--- a/include/linux/ccp.h
+++ b/include/linux/ccp.h
@@ -24,8 +24,7 @@
 struct ccp_device;
 struct ccp_cmd;
 
-#if defined(CONFIG_CRYPTO_DEV_CCP_DD) || \
-	defined(CONFIG_CRYPTO_DEV_CCP_DD_MODULE)
+#if defined(CONFIG_CRYPTO_DEV_CCP)
 
 /**
  * ccp_present - check if a CCP device is present


^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 20/32] crypto: ccp: Add Platform Security Processor (PSP) interface support
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-02 15:16 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The AMD Platform Security Processor (PSP) is a dedicated processor that
provides support for encrypting guest memory in Secure Encrypted
Virtualization (SEV) mode, along with a software-based Trusted Execution
Environment (TEE) that enables third-party trusted applications.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
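
A minimal sketch (not part of this patch) of how a SEV sub-driver could
sit on top of this interface; sev_attach(), sev_detach(), struct sev_dev
and sev_irq_handler() are hypothetical names, and only the psp_* calls
below are introduced by this patch:

	/* hypothetical SEV sub-driver hook-up, for illustration only */
	static int sev_attach(struct sev_dev *sev)
	{
		struct psp_device *psp = psp_get_master_device();
		int ret;

		if (!psp)
			return -ENODEV;	/* no PSP-capable SP has probed */

		/* have PSP interrupts dispatched to the SEV handler */
		ret = psp_request_sev_irq(psp, sev_irq_handler, sev);
		if (ret)
			return ret;

		psp->sev_data = sev;
		return 0;
	}

	static void sev_detach(struct psp_device *psp)
	{
		psp_free_sev_irq(psp, psp->sev_irq_data);
		psp->sev_data = NULL;
	}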
 drivers/crypto/ccp/Kconfig   |    7 +
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.c |  211 ++++++++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/psp-dev.h |  102 ++++++++++++++++++++
 drivers/crypto/ccp/sp-dev.c  |   16 +++
 drivers/crypto/ccp/sp-dev.h  |   34 +++++++
 drivers/crypto/ccp/sp-pci.c  |    4 +
 7 files changed, 374 insertions(+), 1 deletion(-)
 create mode 100644 drivers/crypto/ccp/psp-dev.c
 create mode 100644 drivers/crypto/ccp/psp-dev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index bc08f03..59c207e 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -34,4 +34,11 @@ config CRYPTO_DEV_CCP
 	  Provides the interface to use the AMD Cryptographic Coprocessor
 	  which can be used to offload encryption operations such as SHA,
 	  AES and more.
+
+config CRYPTO_DEV_PSP
+	bool "Platform Security Processor interface"
+	default y
+	help
+	  Provides the interface to the AMD Platform Security Processor
+	  (PSP) device.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 8127e18..12e569d 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -6,6 +6,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v3.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
+ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c
new file mode 100644
index 0000000..6f64aa7
--- /dev/null
+++ b/drivers/crypto/ccp/psp-dev.c
@@ -0,0 +1,211 @@
+/*
+ * AMD Platform Security Processor (PSP) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/hw_random.h>
+#include <linux/ccp.h>
+
+#include "sp-dev.h"
+#include "psp-dev.h"
+
+static LIST_HEAD(psp_devs);
+static DEFINE_SPINLOCK(psp_devs_lock);
+
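+/* Offset of the PSP register block within the SP's mapped BAR */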
+const struct psp_vdata psp_entry = {
+	.offset = 0x10500,
+};
+
+void psp_add_device(struct psp_device *psp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&psp_devs_lock, flags);
+
+	list_add_tail(&psp->entry, &psp_devs);
+
+	spin_unlock_irqrestore(&psp_devs_lock, flags);
+}
+
+void psp_del_device(struct psp_device *psp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&psp_devs_lock, flags);
+
+	list_del(&psp->entry);
+
+	spin_unlock_irqrestore(&psp_devs_lock, flags);
+}
+
+static struct psp_device *psp_alloc_struct(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct psp_device *psp;
+
+	psp = devm_kzalloc(dev, sizeof(*psp), GFP_KERNEL);
+	if (!psp)
+		return NULL;
+
+	psp->dev = dev;
+	psp->sp = sp;
+
+	snprintf(psp->name, sizeof(psp->name), "psp-%u", sp->ord);
+
+	return psp;
+}
+
+irqreturn_t psp_irq_handler(int irq, void *data)
+{
+	unsigned int status;
+	irqreturn_t ret = IRQ_HANDLED;
+	struct psp_device *psp = data;
+
+	/* read the interrupt status */
+	status = ioread32(psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	/* invoke subdevice interrupt handlers */
+	if (status) {
+		if (psp->sev_irq_handler)
+			ret = psp->sev_irq_handler(irq, psp->sev_irq_data);
+	}
+
+	/* clear the interrupt status */
+	iowrite32(status, psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	return ret;
+}
+
+static int psp_init(struct psp_device *psp)
+{
+	psp_add_device(psp);
+
+	sev_dev_init(psp);
+
+	return 0;
+}
+
+int psp_dev_init(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct psp_device *psp;
+	int ret;
+
+	ret = -ENOMEM;
+	psp = psp_alloc_struct(sp);
+	if (!psp)
+		goto e_err;
+	sp->psp_data = psp;
+
+	psp->vdata = (struct psp_vdata *)sp->dev_data->psp_vdata;
+	if (!psp->vdata) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	psp->io_regs = sp->io_map + psp->vdata->offset;
+
+	/* Disable and clear interrupts until ready */
+	iowrite32(0, psp->io_regs + PSP_P2CMSG_INTEN);
+	iowrite32(0xffffffff, psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = sp_request_psp_irq(psp->sp, psp_irq_handler, psp->name, psp);
+	if (ret) {
+		dev_err(dev, "psp: unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	sp_set_psp_master(sp);
+
+	dev_dbg(dev, "initializing psp\n");
+	ret = psp_init(psp);
+	if (ret) {
+		dev_err(dev, "failed to init psp\n");
+		goto e_irq;
+	}
+
+	/* Enable interrupt */
+	dev_dbg(dev, "Enabling interrupts ...\n");
+	iowrite32(7, psp->io_regs + PSP_P2CMSG_INTEN);
+
+	dev_notice(dev, "psp enabled\n");
+
+	return 0;
+
+e_irq:
+	sp_free_psp_irq(psp->sp, psp);
+e_err:
+	sp->psp_data = NULL;
+
+	dev_notice(dev, "psp initialization failed\n");
+
+	return ret;
+}
+
+void psp_dev_destroy(struct sp_device *sp)
+{
+	struct psp_device *psp = sp->psp_data;
+
+	sev_dev_destroy(psp);
+
+	sp_free_psp_irq(sp, psp);
+
+	psp_del_device(psp);
+}
+
+int psp_dev_resume(struct sp_device *sp)
+{
+	sev_dev_resume(sp->psp_data);
+	return 0;
+}
+
+int psp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	sev_dev_suspend(sp->psp_data, state);
+	return 0;
+}
+
+int psp_request_sev_irq(struct psp_device *psp, irq_handler_t handler,
+			void *data)
+{
+	psp->sev_irq_data = data;
+	psp->sev_irq_handler = handler;
+
+	return 0;
+}
+
+int psp_free_sev_irq(struct psp_device *psp, void *data)
+{
+	if (psp->sev_irq_handler) {
+		psp->sev_irq_data = NULL;
+		psp->sev_irq_handler = NULL;
+	}
+
+	return 0;
+}
+
+struct psp_device *psp_get_master_device(void)
+{
+	struct sp_device *sp = sp_get_psp_master_device();
+
+	return sp ? sp->psp_data : NULL;
+}
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
new file mode 100644
index 0000000..bbd3d96
--- /dev/null
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -0,0 +1,102 @@
+/*
+ * AMD Platform Security Processor (PSP) interface driver
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_DEV_H__
+#define __PSP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/dmaengine.h>
+
+#include "sp-dev.h"
+
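+/*
+ * PSP mailbox register offsets, relative to the psp_vdata->offset block
+ * mapped in psp_dev_init(). C2PMSG registers carry CPU-to-PSP requests,
+ * P2CMSG registers PSP-to-CPU notifications.
+ */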
+#define PSP_P2CMSG_INTEN		0x0110
+#define PSP_P2CMSG_INTSTS		0x0114
+
+#define PSP_C2PMSG_ATTR_0		0x0118
+#define PSP_C2PMSG_ATTR_1		0x011c
+#define PSP_C2PMSG_ATTR_2		0x0120
+#define PSP_C2PMSG_ATTR_3		0x0124
+#define PSP_P2CMSG_ATTR_0		0x0128
+
+#define PSP_CMDRESP_CMD_SHIFT		16
+#define PSP_CMDRESP_IOC			BIT(0)
+#define PSP_CMDRESP_RESP		BIT(31)
+#define PSP_CMDRESP_ERR_MASK		0xffff
+
+#define MAX_PSP_NAME_LEN		16
+
+struct psp_device {
+	struct list_head entry;
+
+	struct psp_vdata *vdata;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+
+	void __iomem *io_regs;
+
+	irq_handler_t sev_irq_handler;
+	void *sev_irq_data;
+
+	void *sev_data;
+};
+
+void psp_add_device(struct psp_device *psp);
+void psp_del_device(struct psp_device *psp);
+
+int psp_request_sev_irq(struct psp_device *psp, irq_handler_t handler,
+			void *data);
+int psp_free_sev_irq(struct psp_device *psp, void *data);
+
+struct psp_device *psp_get_master_device(void);
+
+#ifdef CONFIG_AMD_SEV
+
+int sev_dev_init(struct psp_device *psp);
+void sev_dev_destroy(struct psp_device *psp);
+int sev_dev_resume(struct psp_device *psp);
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
+
+#else	/* !CONFIG_AMD_SEV */
+
+static inline int sev_dev_init(struct psp_device *psp)
+{
+	return -ENODEV;
+}
+
+static inline void sev_dev_destroy(struct psp_device *psp) { }
+
+static inline int sev_dev_resume(struct psp_device *psp)
+{
+	return -ENODEV;
+}
+
+static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return -ENODEV;
+}
+
+#endif /* CONFIG_AMD_SEV */
+
+#endif /* __PSP_DEV_H__ */
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
index e47fb8e..975a435 100644
--- a/drivers/crypto/ccp/sp-dev.c
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -212,6 +212,8 @@ int sp_init(struct sp_device *sp)
 	if (sp->dev_data->ccp_vdata)
 		ccp_dev_init(sp);
 
+	if (sp->dev_data->psp_vdata)
+		psp_dev_init(sp);
+
 	return 0;
 }
 
@@ -220,6 +222,9 @@ void sp_destroy(struct sp_device *sp)
 	if (sp->dev_data->ccp_vdata)
 		ccp_dev_destroy(sp);
 
+	if (sp->dev_data->psp_vdata)
+		psp_dev_destroy(sp);
+
 	sp_del_device(sp);
 }
 
@@ -233,6 +238,12 @@ int sp_suspend(struct sp_device *sp, pm_message_t state)
 			return ret;
 	}
 
+	if (sp->dev_data->psp_vdata) {
+		ret = psp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
@@ -246,6 +257,11 @@ int sp_resume(struct sp_device *sp)
 			return ret;
 	}
 
+	if (sp->dev_data->psp_vdata) {
+		ret = psp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
index 9a8a8f8..aeff7a0 100644
--- a/drivers/crypto/ccp/sp-dev.h
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -40,12 +40,18 @@ struct ccp_vdata {
 	const unsigned int offset;
 };
 
+struct psp_vdata {
+	const unsigned int version;
+	const struct psp_actions *perform;
+	const unsigned int offset;
+};
+
 /* Structure to hold SP device data */
 struct sp_dev_data {
 	const unsigned int bar;
 
 	const struct ccp_vdata *ccp_vdata;
-	const void *psp_vdata;
+	const struct psp_vdata *psp_vdata;
 };
 
 struct sp_device {
@@ -137,4 +143,30 @@ static inline int ccp_dev_resume(struct sp_device *sp)
 
 #endif	/* CONFIG_CRYPTO_DEV_CCP */
 
+#ifdef CONFIG_CRYPTO_DEV_PSP
+
+int psp_dev_init(struct sp_device *sp);
+void psp_dev_destroy(struct sp_device *sp);
+
+int psp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int psp_dev_resume(struct sp_device *sp);
+#else	/* !CONFIG_CRYPTO_DEV_PSP */
+
+static inline int psp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void psp_dev_destroy(struct sp_device *sp) { }
+
+static inline int psp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int psp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_PSP */
+
 #endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
index 0960e2d..4999662 100644
--- a/drivers/crypto/ccp/sp-pci.c
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -271,6 +271,7 @@ static int sp_pci_resume(struct pci_dev *pdev)
 extern struct ccp_vdata ccpv3_pci;
 extern struct ccp_vdata ccpv5a;
 extern struct ccp_vdata ccpv5b;
+extern const struct psp_vdata psp_entry;
 
 static const struct sp_dev_data dev_data[] = {
 	{
@@ -284,6 +285,9 @@ static const struct sp_dev_data dev_data[] = {
 #ifdef CONFIG_CRYPTO_DEV_CCP
 		.ccp_vdata = &ccpv5a,
 #endif
+#ifdef CONFIG_CRYPTO_DEV_PSP
+		.psp_vdata = &psp_entry,
+#endif
 	},
 	{
 		.bar = 2,

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 20/32] crypto: ccp: Add Platform Security Processor (PSP) interface support
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:16   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

AMD Platform Security Processor (PSP) is a dedicated processor that
provides the support for encrypting the guest memory in a Secure Encrypted
Virtualiztion (SEV) mode, along with software-based Tursted Executation
Environment (TEE) to enable the third-party tursted applications.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 drivers/crypto/ccp/Kconfig   |    7 +
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.c |  211 ++++++++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/psp-dev.h |  102 ++++++++++++++++++++
 drivers/crypto/ccp/sp-dev.c  |   16 +++
 drivers/crypto/ccp/sp-dev.h  |   34 +++++++
 drivers/crypto/ccp/sp-pci.c  |    4 +
 7 files changed, 374 insertions(+), 1 deletion(-)
 create mode 100644 drivers/crypto/ccp/psp-dev.c
 create mode 100644 drivers/crypto/ccp/psp-dev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index bc08f03..59c207e 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -34,4 +34,11 @@ config CRYPTO_DEV_CCP
 	  Provides the interface to use the AMD Cryptographic Coprocessor
 	  which can be used to offload encryption operations such as SHA,
 	  AES and more.
+
+config CRYPTO_DEV_PSP
+	bool "Platform Security Processor interface"
+	default y
+	help
+	 Provide the interface for AMD Platform Security Processor (PSP) device.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 8127e18..12e569d 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -6,6 +6,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v3.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
+ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c
new file mode 100644
index 0000000..6f64aa7
--- /dev/null
+++ b/drivers/crypto/ccp/psp-dev.c
@@ -0,0 +1,211 @@
+/*
+ * AMD Platform Security Processor (PSP) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/hw_random.h>
+#include <linux/ccp.h>
+
+#include "sp-dev.h"
+#include "psp-dev.h"
+
+static LIST_HEAD(psp_devs);
+static DEFINE_SPINLOCK(psp_devs_lock);
+
+const struct psp_vdata psp_entry = {
+	.offset = 0x10500,
+};
+
+void psp_add_device(struct psp_device *psp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&psp_devs_lock, flags);
+
+	list_add_tail(&psp->entry, &psp_devs);
+
+	spin_unlock_irqrestore(&psp_devs_lock, flags);
+}
+
+void psp_del_device(struct psp_device *psp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&psp_devs_lock, flags);
+
+	list_del(&psp->entry);
+	spin_unlock_irqrestore(&psp_devs_lock, flags);
+}
+
+static struct psp_device *psp_alloc_struct(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct psp_device *psp;
+
+	psp = devm_kzalloc(dev, sizeof(*psp), GFP_KERNEL);
+	if (!psp)
+		return NULL;
+
+	psp->dev = dev;
+	psp->sp = sp;
+
+	snprintf(psp->name, sizeof(psp->name), "psp-%u", sp->ord);
+
+	return psp;
+}
+
+irqreturn_t psp_irq_handler(int irq, void *data)
+{
+	unsigned int status;
+	irqreturn_t ret = IRQ_HANDLED;
+	struct psp_device *psp = data;
+
+	/* read the interrupt status */
+	status = ioread32(psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	/* invoke subdevice interrupt handlers */
+	if (status) {
+		if (psp->sev_irq_handler)
+			ret = psp->sev_irq_handler(irq, psp->sev_irq_data);
+	}
+
+	/* clear the interrupt status */
+	iowrite32(status, psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	return ret;
+}
+
+static int psp_init(struct psp_device *psp)
+{
+	psp_add_device(psp);
+
+	sev_dev_init(psp);
+
+	return 0;
+}
+
+int psp_dev_init(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct psp_device *psp;
+	int ret;
+
+	ret = -ENOMEM;
+	psp = psp_alloc_struct(sp);
+	if (!psp)
+		goto e_err;
+	sp->psp_data = psp;
+
+	psp->vdata = (struct psp_vdata *)sp->dev_data->psp_vdata;
+	if (!psp->vdata) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	psp->io_regs = sp->io_map + psp->vdata->offset;
+
+	/* Disable and clear interrupts until ready */
+	iowrite32(0, psp->io_regs + PSP_P2CMSG_INTEN);
+	iowrite32(0xffffffff, psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = sp_request_psp_irq(psp->sp, psp_irq_handler, psp->name, psp);
+	if (ret) {
+		dev_err(dev, "psp: unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	sp_set_psp_master(sp);
+
+	dev_dbg(dev, "initializing psp\n");
+	ret = psp_init(psp);
+	if (ret) {
+		dev_err(dev, "failed to init psp\n");
+		goto e_irq;
+	}
+
+	/* Enable interrupt */
+	dev_dbg(dev, "Enabling interrupts ...\n");
+	iowrite32(7, psp->io_regs + PSP_P2CMSG_INTEN);
+
+	dev_notice(dev, "psp enabled\n");
+
+	return 0;
+
+e_irq:
+	sp_free_psp_irq(psp->sp, psp);
+e_err:
+	sp->psp_data = NULL;
+
+	dev_notice(dev, "psp initialization failed\n");
+
+	return ret;
+}
+
+void psp_dev_destroy(struct sp_device *sp)
+{
+	struct psp_device *psp = sp->psp_data;
+
+	sev_dev_destroy(psp);
+
+	sp_free_psp_irq(sp, psp);
+
+	psp_del_device(psp);
+}
+
+int psp_dev_resume(struct sp_device *sp)
+{
+	sev_dev_resume(sp->psp_data);
+	return 0;
+}
+
+int psp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	sev_dev_suspend(sp->psp_data, state);
+	return 0;
+}
+
+int psp_request_sev_irq(struct psp_device *psp, irq_handler_t handler,
+			void *data)
+{
+	psp->sev_irq_data = data;
+	psp->sev_irq_handler = handler;
+
+	return 0;
+}
+
+int psp_free_sev_irq(struct psp_device *psp, void *data)
+{
+	if (psp->sev_irq_handler) {
+		psp->sev_irq_data = NULL;
+		psp->sev_irq_handler = NULL;
+	}
+
+	return 0;
+}
+
+struct psp_device *psp_get_master_device(void)
+{
+	struct sp_device *sp = sp_get_psp_master_device();
+
+	return sp ? sp->psp_data : NULL;
+}
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
new file mode 100644
index 0000000..bbd3d96
--- /dev/null
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -0,0 +1,102 @@
+/*
+ * AMD Platform Security Processor (PSP) interface driver
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_DEV_H__
+#define __PSP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/dmaengine.h>
+
+#include "sp-dev.h"
+
+#define PSP_P2CMSG_INTEN		0x0110
+#define PSP_P2CMSG_INTSTS		0x0114
+
+#define PSP_C2PMSG_ATTR_0		0x0118
+#define PSP_C2PMSG_ATTR_1		0x011c
+#define PSP_C2PMSG_ATTR_2		0x0120
+#define PSP_C2PMSG_ATTR_3		0x0124
+#define PSP_P2CMSG_ATTR_0		0x0128
+
+#define PSP_CMDRESP_CMD_SHIFT		16
+#define PSP_CMDRESP_IOC			BIT(0)
+#define PSP_CMDRESP_RESP		BIT(31)
+#define PSP_CMDRESP_ERR_MASK		0xffff
+
+#define MAX_PSP_NAME_LEN		16
+
+struct psp_device {
+	struct list_head entry;
+
+	struct psp_vdata *vdata;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+
+	void __iomem *io_regs;
+
+	irq_handler_t sev_irq_handler;
+	void *sev_irq_data;
+
+	void *sev_data;
+};
+
+void psp_add_device(struct psp_device *psp);
+void psp_del_device(struct psp_device *psp);
+
+int psp_request_sev_irq(struct psp_device *psp, irq_handler_t handler,
+			void *data);
+int psp_free_sev_irq(struct psp_device *psp, void *data);
+
+struct psp_device *psp_get_master_device(void);
+
+#ifdef CONFIG_AMD_SEV
+
+int sev_dev_init(struct psp_device *psp);
+void sev_dev_destroy(struct psp_device *psp);
+int sev_dev_resume(struct psp_device *psp);
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
+
+#else
+
+static inline int sev_dev_init(struct psp_device *psp)
+{
+	return -ENODEV;
+}
+
+static inline void sev_dev_destroy(struct psp_device *psp) { }
+
+static inline int sev_dev_resume(struct psp_device *psp)
+{
+	return -ENODEV;
+}
+
+static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return -ENODEV;
+}
+
+#endif /* __AMD_SEV_H */
+
+#endif /* __PSP_DEV_H */
+
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
index e47fb8e..975a435 100644
--- a/drivers/crypto/ccp/sp-dev.c
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -212,6 +212,8 @@ int sp_init(struct sp_device *sp)
 	if (sp->dev_data->ccp_vdata)
 		ccp_dev_init(sp);
 
+	if (sp->dev_data->psp_vdata)
+		psp_dev_init(sp);
 	return 0;
 }
 
@@ -220,6 +222,9 @@ void sp_destroy(struct sp_device *sp)
 	if (sp->dev_data->ccp_vdata)
 		ccp_dev_destroy(sp);
 
+	if (sp->dev_data->psp_vdata)
+		psp_dev_destroy(sp);
+
 	sp_del_device(sp);
 }
 
@@ -233,6 +238,12 @@ int sp_suspend(struct sp_device *sp, pm_message_t state)
 			return ret;
 	}
 
+	if (sp->dev_data->psp_vdata) {
+		ret = psp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
@@ -246,6 +257,11 @@ int sp_resume(struct sp_device *sp)
 			return ret;
 	}
 
+	if (sp->dev_data->psp_vdata) {
+		ret = psp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
index 9a8a8f8..aeff7a0 100644
--- a/drivers/crypto/ccp/sp-dev.h
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -40,12 +40,18 @@ struct ccp_vdata {
 	const unsigned int offset;
 };
 
+struct psp_vdata {
+	const unsigned int version;
+	const struct psp_actions *perform;
+	const unsigned int offset;
+};
+
 /* Structure to hold SP device data */
 struct sp_dev_data {
 	const unsigned int bar;
 
 	const struct ccp_vdata *ccp_vdata;
-	const void *psp_vdata;
+	const struct psp_vdata *psp_vdata;
 };
 
 struct sp_device {
@@ -137,4 +143,30 @@ static inline int ccp_dev_resume(struct sp_device *sp)
 
 #endif	/* CONFIG_CRYPTO_DEV_CCP */
 
+#ifdef CONFIG_CRYPTO_DEV_PSP
+
+int psp_dev_init(struct sp_device *sp);
+void psp_dev_destroy(struct sp_device *sp);
+
+int psp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int psp_dev_resume(struct sp_device *sp);
+#else /* !CONFIG_CRYPTO_DEV_CCP */
+
+static inline int psp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void psp_dev_destroy(struct sp_device *sp) { }
+
+static inline int psp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int psp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif /* CONFIG_CRYPTO_DEV_CCP */
+
 #endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
index 0960e2d..4999662 100644
--- a/drivers/crypto/ccp/sp-pci.c
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -271,6 +271,7 @@ static int sp_pci_resume(struct pci_dev *pdev)
 extern struct ccp_vdata ccpv3_pci;
 extern struct ccp_vdata ccpv5a;
 extern struct ccp_vdata ccpv5b;
+extern struct psp_vdata psp_entry;
 
 static const struct sp_dev_data dev_data[] = {
 	{
@@ -284,6 +285,9 @@ static const struct sp_dev_data dev_data[] = {
 #ifdef CONFIG_CRYPTO_DEV_CCP
 		.ccp_vdata = &ccpv5a,
 #endif
+#ifdef CONFIG_CRYPTO_DEV_PSP
+		.psp_vdata = &psp_entry
+#endif
 	},
 	{
 		.bar = 2,

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 20/32] crypto: ccp: Add Platform Security Processor (PSP) interface support
@ 2017-03-02 15:16   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

AMD Platform Security Processor (PSP) is a dedicated processor that
provides the support for encrypting the guest memory in a Secure Encrypted
Virtualiztion (SEV) mode, along with software-based Tursted Executation
Environment (TEE) to enable the third-party tursted applications.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 drivers/crypto/ccp/Kconfig   |    7 +
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.c |  211 ++++++++++++++++++++++++++++++++++++++++++
 drivers/crypto/ccp/psp-dev.h |  102 ++++++++++++++++++++
 drivers/crypto/ccp/sp-dev.c  |   16 +++
 drivers/crypto/ccp/sp-dev.h  |   34 +++++++
 drivers/crypto/ccp/sp-pci.c  |    4 +
 7 files changed, 374 insertions(+), 1 deletion(-)
 create mode 100644 drivers/crypto/ccp/psp-dev.c
 create mode 100644 drivers/crypto/ccp/psp-dev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index bc08f03..59c207e 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -34,4 +34,11 @@ config CRYPTO_DEV_CCP
 	  Provides the interface to use the AMD Cryptographic Coprocessor
 	  which can be used to offload encryption operations such as SHA,
 	  AES and more.
+
+config CRYPTO_DEV_PSP
+	bool "Platform Security Processor interface"
+	default y
+	help
+	 Provide the interface for AMD Platform Security Processor (PSP) device.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 8127e18..12e569d 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -6,6 +6,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v3.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
+ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c
new file mode 100644
index 0000000..6f64aa7
--- /dev/null
+++ b/drivers/crypto/ccp/psp-dev.c
@@ -0,0 +1,211 @@
+/*
+ * AMD Platform Security Processor (PSP) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/hw_random.h>
+#include <linux/ccp.h>
+
+#include "sp-dev.h"
+#include "psp-dev.h"
+
+static LIST_HEAD(psp_devs);
+static DEFINE_SPINLOCK(psp_devs_lock);
+
+const struct psp_vdata psp_entry = {
+	.offset = 0x10500,
+};
+
+void psp_add_device(struct psp_device *psp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&psp_devs_lock, flags);
+
+	list_add_tail(&psp->entry, &psp_devs);
+
+	spin_unlock_irqrestore(&psp_devs_lock, flags);
+}
+
+void psp_del_device(struct psp_device *psp)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&psp_devs_lock, flags);
+
+	list_del(&psp->entry);
+	spin_unlock_irqrestore(&psp_devs_lock, flags);
+}
+
+static struct psp_device *psp_alloc_struct(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct psp_device *psp;
+
+	psp = devm_kzalloc(dev, sizeof(*psp), GFP_KERNEL);
+	if (!psp)
+		return NULL;
+
+	psp->dev = dev;
+	psp->sp = sp;
+
+	snprintf(psp->name, sizeof(psp->name), "psp-%u", sp->ord);
+
+	return psp;
+}
+
+irqreturn_t psp_irq_handler(int irq, void *data)
+{
+	unsigned int status;
+	irqreturn_t ret = IRQ_HANDLED;
+	struct psp_device *psp = data;
+
+	/* read the interrupt status */
+	status = ioread32(psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	/* invoke subdevice interrupt handlers */
+	if (status) {
+		if (psp->sev_irq_handler)
+			ret = psp->sev_irq_handler(irq, psp->sev_irq_data);
+	}
+
+	/* clear the interrupt status */
+	iowrite32(status, psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	return ret;
+}
+
+static int psp_init(struct psp_device *psp)
+{
+	psp_add_device(psp);
+
+	sev_dev_init(psp);
+
+	return 0;
+}
+
+int psp_dev_init(struct sp_device *sp)
+{
+	struct device *dev = sp->dev;
+	struct psp_device *psp;
+	int ret;
+
+	ret = -ENOMEM;
+	psp = psp_alloc_struct(sp);
+	if (!psp)
+		goto e_err;
+	sp->psp_data = psp;
+
+	psp->vdata = (struct psp_vdata *)sp->dev_data->psp_vdata;
+	if (!psp->vdata) {
+		ret = -ENODEV;
+		dev_err(dev, "missing driver data\n");
+		goto e_err;
+	}
+
+	psp->io_regs = sp->io_map + psp->vdata->offset;
+
+	/* Disable and clear interrupts until ready */
+	iowrite32(0, psp->io_regs + PSP_P2CMSG_INTEN);
+	iowrite32(0xffffffff, psp->io_regs + PSP_P2CMSG_INTSTS);
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = sp_request_psp_irq(psp->sp, psp_irq_handler, psp->name, psp);
+	if (ret) {
+		dev_err(dev, "psp: unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	sp_set_psp_master(sp);
+
+	dev_dbg(dev, "initializing psp\n");
+	ret = psp_init(psp);
+	if (ret) {
+		dev_err(dev, "failed to init psp\n");
+		goto e_irq;
+	}
+
+	/* Enable interrupt */
+	dev_dbg(dev, "Enabling interrupts ...\n");
+	iowrite32(7, psp->io_regs + PSP_P2CMSG_INTEN);
+
+	dev_notice(dev, "psp enabled\n");
+
+	return 0;
+
+e_irq:
+	sp_free_psp_irq(psp->sp, psp);
+e_err:
+	sp->psp_data = NULL;
+
+	dev_notice(dev, "psp initialization failed\n");
+
+	return ret;
+}
+
+void psp_dev_destroy(struct sp_device *sp)
+{
+	struct psp_device *psp = sp->psp_data;
+
+	sev_dev_destroy(psp);
+
+	sp_free_psp_irq(sp, psp);
+
+	psp_del_device(psp);
+}
+
+int psp_dev_resume(struct sp_device *sp)
+{
+	sev_dev_resume(sp->psp_data);
+	return 0;
+}
+
+int psp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	sev_dev_suspend(sp->psp_data, state);
+	return 0;
+}
+
+int psp_request_sev_irq(struct psp_device *psp, irq_handler_t handler,
+			void *data)
+{
+	psp->sev_irq_data = data;
+	psp->sev_irq_handler = handler;
+
+	return 0;
+}
+
+int psp_free_sev_irq(struct psp_device *psp, void *data)
+{
+	if (psp->sev_irq_handler) {
+		psp->sev_irq_data = NULL;
+		psp->sev_irq_handler = NULL;
+	}
+
+	return 0;
+}
+
+struct psp_device *psp_get_master_device(void)
+{
+	struct sp_device *sp = sp_get_psp_master_device();
+
+	return sp ? sp->psp_data : NULL;
+}
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
new file mode 100644
index 0000000..bbd3d96
--- /dev/null
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -0,0 +1,102 @@
+/*
+ * AMD Platform Security Processor (PSP) interface driver
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_DEV_H__
+#define __PSP_DEV_H__
+
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/dmapool.h>
+#include <linux/hw_random.h>
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/dmaengine.h>
+
+#include "sp-dev.h"
+
+#define PSP_P2CMSG_INTEN		0x0110
+#define PSP_P2CMSG_INTSTS		0x0114
+
+#define PSP_C2PMSG_ATTR_0		0x0118
+#define PSP_C2PMSG_ATTR_1		0x011c
+#define PSP_C2PMSG_ATTR_2		0x0120
+#define PSP_C2PMSG_ATTR_3		0x0124
+#define PSP_P2CMSG_ATTR_0		0x0128
+
+#define PSP_CMDRESP_CMD_SHIFT		16
+#define PSP_CMDRESP_IOC			BIT(0)
+#define PSP_CMDRESP_RESP		BIT(31)
+#define PSP_CMDRESP_ERR_MASK		0xffff
+
+#define MAX_PSP_NAME_LEN		16
+
+struct psp_device {
+	struct list_head entry;
+
+	struct psp_vdata *vdata;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+
+	void __iomem *io_regs;
+
+	irq_handler_t sev_irq_handler;
+	void *sev_irq_data;
+
+	void *sev_data;
+};
+
+void psp_add_device(struct psp_device *psp);
+void psp_del_device(struct psp_device *psp);
+
+int psp_request_sev_irq(struct psp_device *psp, irq_handler_t handler,
+			void *data);
+int psp_free_sev_irq(struct psp_device *psp, void *data);
+
+struct psp_device *psp_get_master_device(void);
+
+#ifdef CONFIG_AMD_SEV
+
+int sev_dev_init(struct psp_device *psp);
+void sev_dev_destroy(struct psp_device *psp);
+int sev_dev_resume(struct psp_device *psp);
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
+
+#else
+
+static inline int sev_dev_init(struct psp_device *psp)
+{
+	return -ENODEV;
+}
+
+static inline void sev_dev_destroy(struct psp_device *psp) { }
+
+static inline int sev_dev_resume(struct psp_device *psp)
+{
+	return -ENODEV;
+}
+
+static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return -ENODEV;
+}
+
+#endif /* __AMD_SEV_H */
+
+#endif /* __PSP_DEV_H */
+
diff --git a/drivers/crypto/ccp/sp-dev.c b/drivers/crypto/ccp/sp-dev.c
index e47fb8e..975a435 100644
--- a/drivers/crypto/ccp/sp-dev.c
+++ b/drivers/crypto/ccp/sp-dev.c
@@ -212,6 +212,8 @@ int sp_init(struct sp_device *sp)
 	if (sp->dev_data->ccp_vdata)
 		ccp_dev_init(sp);
 
+	if (sp->dev_data->psp_vdata)
+		psp_dev_init(sp);
 	return 0;
 }
 
@@ -220,6 +222,9 @@ void sp_destroy(struct sp_device *sp)
 	if (sp->dev_data->ccp_vdata)
 		ccp_dev_destroy(sp);
 
+	if (sp->dev_data->psp_vdata)
+		psp_dev_destroy(sp);
+
 	sp_del_device(sp);
 }
 
@@ -233,6 +238,12 @@ int sp_suspend(struct sp_device *sp, pm_message_t state)
 			return ret;
 	}
 
+	if (sp->dev_data->psp_vdata) {
+		ret = psp_dev_suspend(sp, state);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
@@ -246,6 +257,11 @@ int sp_resume(struct sp_device *sp)
 			return ret;
 	}
 
+	if (sp->dev_data->psp_vdata) {
+		ret = psp_dev_resume(sp);
+		if (ret)
+			return ret;
+	}
 	return 0;
 }
 
diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
index 9a8a8f8..aeff7a0 100644
--- a/drivers/crypto/ccp/sp-dev.h
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -40,12 +40,18 @@ struct ccp_vdata {
 	const unsigned int offset;
 };
 
+struct psp_vdata {
+	const unsigned int version;
+	const struct psp_actions *perform;
+	const unsigned int offset;
+};
+
 /* Structure to hold SP device data */
 struct sp_dev_data {
 	const unsigned int bar;
 
 	const struct ccp_vdata *ccp_vdata;
-	const void *psp_vdata;
+	const struct psp_vdata *psp_vdata;
 };
 
 struct sp_device {
@@ -137,4 +143,30 @@ static inline int ccp_dev_resume(struct sp_device *sp)
 
 #endif	/* CONFIG_CRYPTO_DEV_CCP */
 
+#ifdef CONFIG_CRYPTO_DEV_PSP
+
+int psp_dev_init(struct sp_device *sp);
+void psp_dev_destroy(struct sp_device *sp);
+
+int psp_dev_suspend(struct sp_device *sp, pm_message_t state);
+int psp_dev_resume(struct sp_device *sp);
+#else /* !CONFIG_CRYPTO_DEV_CCP */
+
+static inline int psp_dev_init(struct sp_device *sp)
+{
+	return 0;
+}
+static inline void psp_dev_destroy(struct sp_device *sp) { }
+
+static inline int psp_dev_suspend(struct sp_device *sp, pm_message_t state)
+{
+	return 0;
+}
+static inline int psp_dev_resume(struct sp_device *sp)
+{
+	return 0;
+}
+
+#endif /* CONFIG_CRYPTO_DEV_CCP */
+
 #endif
diff --git a/drivers/crypto/ccp/sp-pci.c b/drivers/crypto/ccp/sp-pci.c
index 0960e2d..4999662 100644
--- a/drivers/crypto/ccp/sp-pci.c
+++ b/drivers/crypto/ccp/sp-pci.c
@@ -271,6 +271,7 @@ static int sp_pci_resume(struct pci_dev *pdev)
 extern struct ccp_vdata ccpv3_pci;
 extern struct ccp_vdata ccpv5a;
 extern struct ccp_vdata ccpv5b;
+extern const struct psp_vdata psp_entry;
 
 static const struct sp_dev_data dev_data[] = {
 	{
@@ -284,6 +285,9 @@ static const struct sp_dev_data dev_data[] = {
 #ifdef CONFIG_CRYPTO_DEV_CCP
 		.ccp_vdata = &ccpv5a,
 #endif
+#ifdef CONFIG_CRYPTO_DEV_PSP
+		.psp_vdata = &psp_entry,
+#endif
 	},
 	{
 		.bar = 2,

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 21/32] crypto: ccp: Add Secure Encrypted Virtualization (SEV) interface support
@ 2017-03-02 15:16 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

The Secure Encrypted Virtualization (SEV) interface allows the memory
contents of a virtual machine (VM) to be transparently encrypted with
a key unique to the guest.

The interface provides:
  - a /dev/sev device and an ioctl (SEV_ISSUE_CMD) to execute platform
    provisioning commands from userspace.
  - in-kernel APIs to encrypt guest memory regions; KVM will use these
    APIs to bootstrap and debug SEV guests.

The SEV key management specification is available at [1]:
[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf
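
For illustration only (not part of this patch): a minimal userspace
sketch that queries the platform status through SEV_ISSUE_CMD. It
assumes the uapi header added here exports struct sev_issue_cmd,
SEV_ISSUE_CMD and SEV_USER_CMD_PLATFORM_STATUS (as used by sev-ops.c
below); the local sev_status_buf is a hypothetical stand-in mirroring
struct sev_data_status from this patch.

	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/types.h>
	#include <linux/psp-sev.h>	/* struct sev_issue_cmd, SEV_ISSUE_CMD */

	/* Hypothetical stand-in mirroring struct sev_data_status. */
	struct sev_status_buf {
		__u8  api_major, api_minor, state, owner;
		__u32 config;
		__u32 guest_count;
	};

	int main(void)
	{
		struct sev_status_buf status;
		struct sev_issue_cmd arg;
		int fd;

		fd = open("/dev/sev", O_RDWR);
		if (fd < 0) {
			perror("open /dev/sev");
			return 1;
		}

		memset(&status, 0, sizeof(status));
		memset(&arg, 0, sizeof(arg));
		arg.cmd = SEV_USER_CMD_PLATFORM_STATUS;
		arg.data = (__u64)(unsigned long)&status;

		/* On success the driver copies the firmware status back. */
		if (ioctl(fd, SEV_ISSUE_CMD, &arg))
			fprintf(stderr, "SEV_ISSUE_CMD failed, fw error %#x\n",
				arg.error);
		else
			printf("SEV API %u.%u, state %u, %u guest(s)\n",
			       status.api_major, status.api_minor,
			       status.state, status.guest_count);

		close(fd);
		return 0;
	}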

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 drivers/crypto/ccp/Kconfig   |    7 
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.h |    6 
 drivers/crypto/ccp/sev-dev.c |  348 ++++++++++++++++++++++
 drivers/crypto/ccp/sev-dev.h |   67 ++++
 drivers/crypto/ccp/sev-ops.c |  324 ++++++++++++++++++++
 include/linux/psp-sev.h      |  672 ++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/Kbuild    |    1 
 include/uapi/linux/psp-sev.h |  123 ++++++++
 9 files changed, 1546 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/ccp/sev-dev.c
 create mode 100644 drivers/crypto/ccp/sev-dev.h
 create mode 100644 drivers/crypto/ccp/sev-ops.c
 create mode 100644 include/linux/psp-sev.h
 create mode 100644 include/uapi/linux/psp-sev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 59c207e..67d1917 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -41,4 +41,11 @@ config CRYPTO_DEV_PSP
 	help
 	 Provide the interface for AMD Platform Security Processor (PSP) device.
 
+config CRYPTO_DEV_SEV
+	bool "Secure Encrypted Virtualization (SEV) interface"
+	default y
+	help
+	 Provides the kernel and userspace (/dev/sev) interfaces used to
+	 issue Secure Encrypted Virtualization (SEV) commands.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 12e569d..4c4e77e 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -7,6 +7,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
 ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
+ccp-$(CONFIG_CRYPTO_DEV_SEV) += sev-dev.o sev-ops.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
index bbd3d96..fd67b14 100644
--- a/drivers/crypto/ccp/psp-dev.h
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -70,14 +70,14 @@ int psp_free_sev_irq(struct psp_device *psp, void *data);
 
 struct psp_device *psp_get_master_device(void);
 
-#ifdef CONFIG_AMD_SEV
+#ifdef CONFIG_CRYPTO_DEV_SEV
 
 int sev_dev_init(struct psp_device *psp);
 void sev_dev_destroy(struct psp_device *psp);
 int sev_dev_resume(struct psp_device *psp);
 int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
 
-#else
+#else /* !CONFIG_CRYPTO_DEV_SEV */
 
 static inline int sev_dev_init(struct psp_device *psp)
 {
@@ -96,7 +96,7 @@ static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
 	return -ENODEV;
 }
 
-#endif /* __AMD_SEV_H */
+#endif /* CONFIG_CRYPTO_DEV_SEV */
 
 #endif /* __PSP_DEV_H */
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
new file mode 100644
index 0000000..a67e2d7
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -0,0 +1,348 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/wait.h>
+#include <linux/jiffies.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+extern const struct file_operations sev_fops;
+
+static LIST_HEAD(sev_devs);
+static DEFINE_SPINLOCK(sev_devs_lock);
+static atomic_t sev_id;
+
+static unsigned int psp_poll;
+module_param(psp_poll, uint, 0444);
+MODULE_PARM_DESC(psp_poll, "Poll for SEV command completion instead of using interrupts (any non-zero value)");
+
+static DEFINE_MUTEX(sev_cmd_mutex);
+
+void sev_add_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_add_tail(&sev->entry, &sev_devs);
+
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+void sev_del_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_del(&sev->entry);
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+static struct sev_device *get_sev_master_device(void)
+{
+	struct psp_device *psp = psp_get_master_device();
+
+	return psp ? psp->sev_data : NULL;
+}
+
+static int sev_wait_cmd_poll(struct sev_device *sev, unsigned int timeout,
+			     unsigned int *reg)
+{
+	int wait = timeout * 10;	/* 100ms sleep => timeout * 10 */
+
+	while (--wait) {
+		msleep(100);
+
+		*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (*reg & PSP_CMDRESP_RESP)
+			break;
+	}
+
+	if (!wait) {
+		dev_err(sev->dev, "sev command timed out\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int sev_wait_cmd_ioc(struct sev_device *sev, unsigned int timeout,
+			    unsigned int *reg)
+{
+	unsigned long jiffie_timeout = timeout;
+	long ret;
+
+	jiffie_timeout *= HZ;
+
+	sev->int_rcvd = 0;
+
+	ret = wait_event_interruptible_timeout(sev->int_queue, sev->int_rcvd,
+						jiffie_timeout);
+	if (ret <= 0) {
+		dev_err(sev->dev, "sev command (%#x) timed out\n",
+				*reg >> PSP_CMDRESP_CMD_SHIFT);
+		return -ETIMEDOUT;
+	}
+
+	*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+
+	return 0;
+}
+
+static int sev_wait_cmd(struct sev_device *sev, unsigned int timeout,
+			unsigned int *reg)
+{
+	return (*reg & PSP_CMDRESP_IOC) ? sev_wait_cmd_ioc(sev, timeout, reg)
+					: sev_wait_cmd_poll(sev, timeout, reg);
+}
+
+static struct sev_device *sev_alloc_struct(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+
+	sev = devm_kzalloc(dev, sizeof(*sev), GFP_KERNEL);
+	if (!sev)
+		return NULL;
+
+	sev->dev = dev;
+	sev->psp = psp;
+	sev->id = atomic_inc_return(&sev_id);
+
+	snprintf(sev->name, sizeof(sev->name), "sev%u", sev->id);
+	init_waitqueue_head(&sev->int_queue);
+
+	return sev;
+}
+
+static irqreturn_t sev_irq_handler(int irq, void *data)
+{
+	struct sev_device *sev = data;
+	unsigned int status;
+
+	status = ioread32(sev->io_regs + PSP_P2CMSG_INTSTS);
+	if (status & (1 << PSP_CMD_COMPLETE_REG)) {
+		int reg;
+
+		reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (reg & PSP_CMDRESP_RESP) {
+			sev->int_rcvd = 1;
+			wake_up_interruptible(&sev->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
+static bool check_sev_support(struct sev_device *sev)
+{
+	/* SEV is supported if bit 0 of PSP_FEATURE_REG is set */
+	if (ioread32(sev->io_regs + PSP_FEATURE_REG) & 1)
+		return true;
+
+	return false;
+}
+
+int sev_dev_init(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+	int ret;
+
+	ret = -ENOMEM;
+	sev = sev_alloc_struct(psp);
+	if (!sev)
+		goto e_err;
+	psp->sev_data = sev;
+
+	sev->io_regs = psp->io_regs;
+
+	dev_dbg(dev, "checking SEV support ...\n");
+	/* check SEV support */
+	if (!check_sev_support(sev)) {
+		dev_dbg(dev, "device does not support SEV\n");
+		goto e_err;
+	}
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = psp_request_sev_irq(sev->psp, sev_irq_handler, sev);
+	if (ret) {
+		dev_err(dev, "unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	/* initialize SEV ops */
+	dev_dbg(dev, "init sev ops\n");
+	ret = sev_ops_init(sev);
+	if (ret) {
+		dev_err(dev, "failed to init sev ops\n");
+		goto e_irq;
+	}
+
+	sev_add_device(sev);
+
+	dev_notice(dev, "sev enabled\n");
+
+	return 0;
+
+e_irq:
+	psp_free_sev_irq(psp, sev);
+e_err:
+	psp->sev_data = NULL;
+
+	dev_notice(dev, "sev initialization failed\n");
+
+	return ret;
+}
+
+void sev_dev_destroy(struct psp_device *psp)
+{
+	struct sev_device *sev = psp->sev_data;
+
+	psp_free_sev_irq(psp, sev);
+
+	sev_ops_destroy(sev);
+
+	sev_del_device(sev);
+}
+
+int sev_dev_resume(struct psp_device *psp)
+{
+	return 0;
+}
+
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return 0;
+}
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *psp_ret)
+{
+	struct sev_device *sev = get_sev_master_device();
+	unsigned int phys_lsb, phys_msb;
+	unsigned int reg, ret;
+
+	if (!sev)
+		return -ENODEV;
+
+	if (psp_ret)
+		*psp_ret = 0;
+
+	/* Set the physical address for the PSP */
+	phys_lsb = data ? lower_32_bits(__psp_pa(data)) : 0;
+	phys_msb = data ? upper_32_bits(__psp_pa(data)) : 0;
+
+	dev_dbg(sev->dev, "sev command id %#x buffer 0x%08x%08x\n",
+			cmd, phys_msb, phys_lsb);
+
+	/* Only one command at a time... */
+	mutex_lock(&sev_cmd_mutex);
+
+	iowrite32(phys_lsb, sev->io_regs + PSP_CMDBUFF_ADDR_LO);
+	iowrite32(phys_msb, sev->io_regs + PSP_CMDBUFF_ADDR_HI);
+	wmb();
+
+	reg = cmd;
+	reg <<= PSP_CMDRESP_CMD_SHIFT;
+	reg |= psp_poll ? 0 : PSP_CMDRESP_IOC;
+	iowrite32(reg, sev->io_regs + PSP_CMDRESP);
+
+	ret = sev_wait_cmd(sev, timeout, &reg);
+	if (ret)
+		goto unlock;
+
+	if (psp_ret)
+		*psp_ret = reg & PSP_CMDRESP_ERR_MASK;
+
+	if (reg & PSP_CMDRESP_ERR_MASK) {
+		dev_dbg(sev->dev, "sev command %u failed (%#010x)\n",
+			cmd, reg & PSP_CMDRESP_ERR_MASK);
+		ret = -EIO;
+	}
+
+unlock:
+	mutex_unlock(&sev_cmd_mutex);
+
+	return ret;
+}
+
+int sev_platform_init(struct sev_data_init *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_INIT, data, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_init);
+
+int sev_platform_shutdown(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_SHUTDOWN, NULL, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_shutdown);
+
+int sev_platform_status(struct sev_data_status *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_PLATFORM_STATUS, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_status);
+
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error)
+{
+	if (!filep || filep->f_op != &sev_fops)
+		return -EBADF;
+
+	return sev_issue_cmd(cmd, data,
+			timeout ? timeout : SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_issue_cmd_external_user);
+
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DEACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_deactivate);
+
+int sev_guest_activate(struct sev_data_activate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_ACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_activate);
+
+int sev_guest_decommission(struct sev_data_decommission *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DECOMMISSION, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_decommission);
+
+int sev_guest_df_flush(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DF_FLUSH, NULL,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_df_flush);
+
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
new file mode 100644
index 0000000..0df6ead
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -0,0 +1,67 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2013,2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SEV_DEV_H__
+#define __SEV_DEV_H__
+
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/miscdevice.h>
+
+#include <linux/psp-sev.h>
+
+#define PSP_C2PMSG(_num)		((_num) << 2)
+#define PSP_CMDRESP			PSP_C2PMSG(32)
+#define PSP_CMDBUFF_ADDR_LO		PSP_C2PMSG(56)
+#define PSP_CMDBUFF_ADDR_HI		PSP_C2PMSG(57)
+#define PSP_FEATURE_REG			PSP_C2PMSG(63)
+
+#define PSP_P2CMSG(_num)		((_num) << 2)
+#define PSP_CMD_COMPLETE_REG		1
+#define PSP_CMD_COMPLETE		PSP_P2CMSG(PSP_CMD_COMPLETE_REG)
+
+#define MAX_PSP_NAME_LEN		16
+#define SEV_DEFAULT_TIMEOUT		5
+
+struct sev_device {
+	struct list_head entry;
+
+	struct dentry *debugfs;
+	struct miscdevice misc;
+
+	unsigned int id;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+	struct psp_device *psp;
+
+	void __iomem *io_regs;
+
+	unsigned int int_rcvd;
+	wait_queue_head_t int_queue;
+};
+
+void sev_add_device(struct sev_device *sev);
+void sev_del_device(struct sev_device *sev);
+
+int sev_ops_init(struct sev_device *sev);
+void sev_ops_destroy(struct sev_device *sev);
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *error);
+
+#endif /* __SEV_DEV_H__ */
diff --git a/drivers/crypto/ccp/sev-ops.c b/drivers/crypto/ccp/sev-ops.c
new file mode 100644
index 0000000..727a8db
--- /dev/null
+++ b/drivers/crypto/ccp/sev-ops.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) command interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/uaccess.h>
+
+#include <uapi/linux/psp-sev.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+static int sev_ioctl_init(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_init *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_init(data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_platform_status(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_status *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_status(data, &argp->error);
+
+	if (copy_to_user((void *)argp->data, data, sizeof(*data)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_csr(struct sev_issue_cmd *argp)
+{
+	int ret;
+	void *csr_addr = NULL;
+	struct sev_data_pek_csr *data;
+	struct sev_user_data_pek_csr input;
+
+	if (copy_from_user(&input, (void *)argp->data,
+			sizeof(struct sev_user_data_pek_csr)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	if (input.address && input.length) {
+		csr_addr = kmalloc(input.length, GFP_KERNEL);
+		if (!csr_addr) {
+			ret = -ENOMEM;
+			goto e_err;
+		}
+		if (copy_from_user(csr_addr, (void *)input.address,
+				input.length)) {
+			ret = -EFAULT;
+			goto e_csr_free;
+		}
+
+		data->address = __psp_pa(csr_addr);
+		data->length = input.length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CSR,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.length = data->length;
+
+	/* copy PEK certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input,
+			sizeof(struct sev_user_data_pek_csr)))
+		ret = -EFAULT;
+e_csr_free:
+	kfree(csr_addr);
+e_err:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_cert_import(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pek_cert_import *data;
+	struct sev_user_data_pek_cert_import input;
+	void *pek_cert, *oca_cert;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	if (!input.pek_cert_address || !input.pek_cert_length ||
+		!input.oca_cert_address || !input.oca_cert_length)
+		return -EINVAL;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	pek_cert = kmalloc(input.pek_cert_length, GFP_KERNEL);
+	if (!pek_cert) {
+		ret = -ENOMEM;
+		goto e_free;
+	}
+	if (copy_from_user(pek_cert, (void *)input.pek_cert_address,
+				input.pek_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_pek_cert;
+	}
+
+	data->pek_cert_address = __psp_pa(pek_cert);
+	data->pek_cert_length = input.pek_cert_length;
+
+	/* copy OCA certificate from userspace */
+	oca_cert = kmalloc(input.oca_cert_length, GFP_KERNEL);
+	if (!oca_cert) {
+		ret = -ENOMEM;
+		goto e_free_pek_cert;
+	}
+	if (copy_from_user(oca_cert, (void *)input.oca_cert_address,
+				input.oca_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_oca_cert;
+	}
+
+	data->oca_cert_address = __psp_pa(oca_cert);
+	data->oca_cert_length = input.oca_cert_length;
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CERT_IMPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+e_free_oca_cert:
+	kfree(oca_cert);
+e_free_pek_cert:
+	kfree(pek_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pdh_cert_export(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pdh_cert_export *data;
+	struct sev_user_data_pdh_cert_export input;
+	void *pdh_cert = NULL, *cert_chain = NULL;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy pdh certificate from userspace */
+	if (input.pdh_cert_length && input.pdh_cert_address) {
+		pdh_cert = kmalloc(input.pdh_cert_length, GFP_KERNEL);
+		if (!pdh_cert) {
+			ret = -ENOMEM;
+			goto e_free;
+		}
+		if (copy_from_user(pdh_cert, (void *)input.pdh_cert_address,
+					input.pdh_cert_length)) {
+			ret = -EFAULT;
+			goto e_free_pdh_cert;
+		}
+
+		data->pdh_cert_address = __psp_pa(pdh_cert);
+		data->pdh_cert_length = input.pdh_cert_length;
+	}
+
+	/* copy cert_chain certificate from userspace */
+	if (input.cert_chain_length && input.cert_chain_address) {
+		cert_chain = kmalloc(input.cert_chain_length, GFP_KERNEL);
+		if (!cert_chain) {
+			ret = -ENOMEM;
+			goto e_free_pdh_cert;
+		}
+		if (copy_from_user(cert_chain, (void *)input.cert_chain_address,
+					input.cert_chain_length)) {
+			ret = -EFAULT;
+			goto e_free_cert_chain;
+		}
+
+		data->cert_chain_address = __psp_pa(cert_chain);
+		data->cert_chain_length = input.cert_chain_length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PDH_CERT_EXPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.cert_chain_length = data->cert_chain_length;
+	input.pdh_cert_length = data->pdh_cert_length;
+
+	/* copy certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input,
+			sizeof(struct sev_user_data_pdh_cert_export)))
+		ret = -EFAULT;
+
+e_free_cert_chain:
+	kfree(cert_chain);
+e_free_pdh_cert:
+	kfree(pdh_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+	int ret = -EFAULT;
+	void __user *argp = (void __user *)arg;
+	struct sev_issue_cmd input;
+
+	if (ioctl != SEV_ISSUE_CMD)
+		return -EINVAL;
+
+	if (copy_from_user(&input, argp, sizeof(struct sev_issue_cmd)))
+		return -EFAULT;
+
+	if (input.cmd > SEV_CMD_MAX)
+		return -EINVAL;
+
+	switch (input.cmd) {
+
+	case SEV_USER_CMD_INIT: {
+		ret = sev_ioctl_init(&input);
+		break;
+	}
+	case SEV_USER_CMD_SHUTDOWN: {
+		ret = sev_platform_shutdown(&input.error);
+		break;
+	}
+	case SEV_USER_CMD_FACTORY_RESET: {
+		ret = sev_issue_cmd(SEV_CMD_FACTORY_RESET, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PLATFORM_STATUS: {
+		ret = sev_ioctl_platform_status(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PEK_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PDH_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PDH_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CSR: {
+		ret = sev_ioctl_pek_csr(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CERT_IMPORT: {
+		ret = sev_ioctl_pek_cert_import(&input);
+		break;
+	}
+	case SEV_USER_CMD_PDH_CERT_EXPORT: {
+		ret = sev_ioctl_pdh_cert_export(&input);
+		break;
+	}
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	if (copy_to_user(argp, &input, sizeof(struct sev_issue_cmd)))
+		ret = -EFAULT;
+
+	return ret;
+}
+
+const struct file_operations sev_fops = {
+	.owner	= THIS_MODULE,
+	.unlocked_ioctl = sev_ioctl,
+};
+
+int sev_ops_init(struct sev_device *sev)
+{
+	struct miscdevice *misc = &sev->misc;
+
+	misc->minor = MISC_DYNAMIC_MINOR;
+	misc->name = sev->name;
+	misc->fops = &sev_fops;
+
+	return misc_register(misc);
+}
+
+void sev_ops_destroy(struct sev_device *sev)
+{
+	misc_deregister(&sev->misc);
+}
+
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
new file mode 100644
index 0000000..acce6ed
--- /dev/null
+++ b/include/linux/psp-sev.h
@@ -0,0 +1,672 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) driver interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_H__
+#define __PSP_SEV_H__
+
+#ifdef CONFIG_X86
+#include <linux/mem_encrypt.h>
+
+#define __psp_pa(x)	__sme_pa(x)
+#else
+#define __psp_pa(x)	__pa(x)
+#endif
+
+/**
+ * enum sev_cmd - SEV platform and guest management commands
+ */
+enum sev_cmd {
+	/* platform commands */
+	SEV_CMD_INIT			= 0x001,
+	SEV_CMD_SHUTDOWN		= 0x002,
+	SEV_CMD_FACTORY_RESET		= 0x003,
+	SEV_CMD_PLATFORM_STATUS		= 0x004,
+	SEV_CMD_PEK_GEN			= 0x005,
+	SEV_CMD_PEK_CSR			= 0x006,
+	SEV_CMD_PEK_CERT_IMPORT		= 0x007,
+	SEV_CMD_PDH_GEN			= 0x008,
+	SEV_CMD_PDH_CERT_EXPORT		= 0x009,
+	SEV_CMD_DF_FLUSH		= 0x00A,
+
+	/* Guest commands */
+	SEV_CMD_DECOMMISSION		= 0x020,
+	SEV_CMD_ACTIVATE		= 0x021,
+	SEV_CMD_DEACTIVATE		= 0x022,
+	SEV_CMD_GUEST_STATUS		= 0x023,
+
+	/* Guest launch commands */
+	SEV_CMD_LAUNCH_START		= 0x030,
+	SEV_CMD_LAUNCH_UPDATE_DATA	= 0x031,
+	SEV_CMD_LAUNCH_UPDATE_VMSA	= 0x032,
+	SEV_CMD_LAUNCH_MEASURE		= 0x033,
+	SEV_CMD_LAUNCH_UPDATE_SECRET	= 0x034,
+	SEV_CMD_LAUNCH_FINISH		= 0x035,
+
+	/* Guest migration commands (outgoing) */
+	SEV_CMD_SEND_START		= 0x040,
+	SEV_CMD_SEND_UPDATE_DATA	= 0x041,
+	SEV_CMD_SEND_UPDATE_VMSA	= 0x042,
+	SEV_CMD_SEND_FINISH		= 0x043,
+
+	/* Guest migration commands (incoming) */
+	SEV_CMD_RECEIVE_START		= 0x050,
+	SEV_CMD_RECEIVE_UPDATE_DATA	= 0x051,
+	SEV_CMD_RECEIVE_UPDATE_VMSA	= 0x052,
+	SEV_CMD_RECEIVE_FINISH		= 0x053,
+
+	/* Guest debug commands */
+	SEV_CMD_DBG_DECRYPT		= 0x060,
+	SEV_CMD_DBG_ENCRYPT		= 0x061,
+
+	SEV_CMD_MAX,
+};
+
+/**
+ * enum psp_ret_code - status codes returned by SEV commands
+ */
+enum psp_ret_code {
+	SEV_RET_SUCCESS = 0,
+	SEV_RET_INVALID_PLATFORM_STATE,
+	SEV_RET_INVALID_GUEST_STATE,
+	SEV_RET_INVALID_CONFIG,
+	SEV_RET_INVALID_LENGTH,
+	SEV_RET_ALREADY_OWNED,
+	SEV_RET_INVALID_CERTIFICATE,
+	SEV_RET_POLICY_FAILURE,
+	SEV_RET_INACTIVE,
+	SEV_RET_INVALID_ADDRESS,
+	SEV_RET_BAD_SIGNATURE,
+	SEV_RET_BAD_MEASUREMENT,
+	SEV_RET_ASID_OWNED,
+	SEV_RET_INVALID_ASID,
+	SEV_RET_WBINVD_REQUIRED,
+	SEV_RET_DFFLUSH_REQUIRED,
+	SEV_RET_INVALID_GUEST,
+	SEV_RET_INVALID_COMMAND,
+	SEV_RET_ACTIVE,
+	SEV_RET_HWSEV_RET_PLATFORM,
+	SEV_RET_HWSEV_RET_UNSAFE,
+	SEV_RET_UNSUPPORTED,
+	SEV_RET_MAX,
+};
+
+/**
+ * struct sev_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ * @tmr_address: system physical address used for SEV-ES
+ * @tmr_length: length of the trusted memory region at tmr_address
+ */
+struct sev_data_init {
+	__u32 flags;				/* In */
+	__u32 reserved;				/* In */
+	__u64 tmr_address;			/* In */
+	__u32 tmr_length;			/* In */
+};
+
+/**
+ * struct sev_data_status - PLATFORM_STATUS command parameters
+ *
+ * @api_major: major API version
+ * @api_minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @length: length of certificate
+ */
+struct sev_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u32 reserved;					/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u32 reserved;					/* In */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_data_decommission - DECOMMISSION command parameters
+ *
+ * @handle: handle of the VM to decommission
+ */
+struct sev_data_decommission {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_activate - ACTIVATE command parameters
+ *
+ * @handle: handle of the VM to activate
+ * @asid: asid assigned to the VM
+ */
+struct sev_data_activate {
+	u32 handle;				/* In */
+	u32 asid;				/* In */
+};
+
+/**
+ * struct sev_data_deactivate - DEACTIVATE command parameters
+ *
+ * @handle: handle of the VM to deactivate
+ */
+struct sev_data_deactivate {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_guest_status - SEV GUEST_STATUS command parameters
+ *
+ * @handle: handle of the VM to retrieve status
+ * @policy: policy information for the VM
+ * @asid: current ASID of the VM
+ * @state: current state of the VM
+ */
+struct sev_data_guest_status {
+	u32 handle;				/* In */
+	u32 policy;				/* Out */
+	u32 asid;				/* Out */
+	u8 state;				/* Out */
+};
+
+/**
+ * struct sev_data_launch_start - LAUNCH_START command parameters
+ *
+ * @handle: handle assigned to the VM
+ * @policy: guest launch policy
+ * @dh_cert_address: physical address of DH certificate blob
+ * @dh_cert_length: length of DH certificate blob
+ * @session_data_address: physical address of session parameters
+ * @session_data_length: length of session parameters
+ */
+struct sev_data_launch_start {
+	u32 handle;				/* In/Out */
+	u32 policy;				/* In */
+	u64 dh_cert_address;			/* In */
+	u32 dh_cert_length;			/* In */
+	u32 reserved;				/* In */
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In */
+};
+
+/**
+ * struct sev_data_launch_update_data - LAUNCH_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @length: length of memory to be encrypted
+ * @address: physical address of memory region to encrypt
+ */
+struct sev_data_launch_update_data {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_update_vmsa - LAUNCH_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM
+ * @address: physical address of memory region to encrypt
+ * @length: length of memory region to encrypt
+ */
+struct sev_data_launch_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_measure - LAUNCH_MEASURE command parameters
+ *
+ * @handle: handle of the VM to process
+ * @address: physical address containing the measurement blob
+ * @length: length of measurement blob
+ */
+struct sev_data_launch_measure {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In/Out */
+};
+
+/**
+ * struct sev_data_launch_secret - LAUNCH_UPDATE_SECRET command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing the packet header
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: physical address of transport memory buffer
+ * @trans_length: length of transport memory buffer
+ */
+struct sev_data_launch_secret {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_launch_finish - LAUNCH_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_launch_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_send_start - SEND_START command parameters
+ *
+ * @handle: handle of the VM to process
+ * @pdh_cert_address: physical address containing PDH certificate
+ * @pdh_cert_length: length of PDH certificate
+ * @plat_cert_address: physical address containing platform certificate
+ * @plat_cert_length: length of platform certificate
+ * @amd_cert_address: physical address containing AMD certificate
+ * @amd_cert_length: length of AMD certificate
+ * @session_data_address: physical address containing session data
+ * @session_data_length: length of session data
+ */
+struct sev_data_send_start {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In/Out */
+	u32 reserved2;
+	u64 plat_cert_address;			/* In */
+	u32 plat_cert_length;			/* In/Out */
+	u32 reserved3;
+	u64 amd_cert_address;			/* In */
+	u32 amd_cert_length;			/* In/Out */
+	u32 reserved4;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_send_update_data - SEND_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_update_vmsa - SEND_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_finish - SEND_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_send_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_receive_start - RECEIVE_START command parameters
+ *
+ * @handle: handle of the VM to perform receive operation
+ * @pdh_cert_address: system physical address containing PDH certificate blob
+ * @pdh_cert_length: length of PDH certificate blob
+ * @session_data_address: system physical address containing session blob
+ * @session_data_length: length of session blob
+ */
+struct sev_data_receive_start {
+	u32 handle;				/* In/Out */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In */
+	u32 reserved2;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_receive_update_data - RECEIVE_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_update_vmsa - RECEIVE_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_finish - RECEIVE_FINISH command parameters
+ *
+ * @handle: handle of the VM to finish
+ */
+struct sev_data_receive_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
+ *
+ * @handle: handle of the VM to perform debug operation
+ * @src_addr: source address of data to operate on
+ * @dst_addr: destination address of data to operate on
+ * @length: length of data to operate on
+ */
+struct sev_data_dbg {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 src_addr;				/* In */
+	u64 dst_addr;				/* In */
+	u32 length;				/* In */
+};
+
+#if defined(CONFIG_CRYPTO_DEV_SEV)
+
+/**
+ * sev_platform_init - perform SEV INIT command
+ *
+ * @init: sev_data_init structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_init(struct sev_data_init *init, int *error);
+
+/**
+ * sev_platform_shutdown - perform SEV SHUTDOWN command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_shutdown(int *error);
+
+/**
+ * sev_platform_status - perform SEV PLATFORM_STATUS command
+ *
+ * @status: sev_data_status structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_status(struct sev_data_status *status, int *error);
+
+/**
+ * sev_issue_cmd_external_user - issue a SEV command on behalf of another driver
+ *
+ * The function can be used by other drivers to issue a SEV command on
+ * behalf of userspace. The caller must pass a valid SEV file descriptor
+ * so that we know that the caller has access to the SEV device.
+ *
+ * @filep: SEV device file pointer
+ * @cmd: command to issue
+ * @data: command buffer
+ * @timeout: command timeout in seconds; if zero, the default timeout is used
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ * -%EBADF     if the SEV file descriptor is not valid
+ */
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error);
+
+/**
+ * sev_guest_deactivate - perform SEV DEACTIVATE command
+ *
+ * @data: sev_data_deactivate structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error);
+
+/**
+ * sev_guest_activate - perform SEV ACTIVATE command
+ *
+ * @data: sev_data_activate structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_activate(struct sev_data_activate *data, int *error);
+
+/**
+ * sev_guest_df_flush - perform SEV DF_FLUSH command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_df_flush(int *error);
+
+/**
+ * sev_guest_decommission - perform SEV DECOMMISSION command
+ *
+ * @data: sev_data_decommission structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_decommission(struct sev_data_decommission *data, int *error);
+
+#else	/* !CONFIG_CRYPTO_DEV_SEV */
+
+static inline int sev_platform_status(struct sev_data_status *status,
+				      int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_init(struct sev_data_init *init, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_shutdown(int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_issue_cmd_external_user(struct file *filep,
+					unsigned int cmd, void *data,
+					int timeout, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_deactivate(struct sev_data_deactivate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_decommission(struct sev_data_decommission *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_activate(struct sev_data_activate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_df_flush(int *error)
+{
+	return -ENODEV;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_SEV */
+
+#endif	/* __PSP_SEV_H__ */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index f330ba4..2e15ea7 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -481,3 +481,4 @@ header-y += xilinx-v4l2-controls.h
 header-y += zorro.h
 header-y += zorro_ids.h
 header-y += userfaultfd.h
+header-y += psp-sev.h
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
new file mode 100644
index 0000000..050976d
--- /dev/null
+++ b/include/uapi/linux/psp-sev.h
@@ -0,0 +1,123 @@
+
+/*
+ * Userspace interface for AMD Secure Encrypted Virtualization (SEV)
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_USER_H__
+#define __PSP_SEV_USER_H__
+
+#include <linux/types.h>
+
+/**
+ * SEV platform commands
+ */
+enum {
+	SEV_USER_CMD_INIT = 0,
+	SEV_USER_CMD_SHUTDOWN,
+	SEV_USER_CMD_FACTORY_RESET,
+	SEV_USER_CMD_PLATFORM_STATUS,
+	SEV_USER_CMD_PEK_GEN,
+	SEV_USER_CMD_PEK_CSR,
+	SEV_USER_CMD_PDH_GEN,
+	SEV_USER_CMD_PDH_CERT_EXPORT,
+	SEV_USER_CMD_PEK_CERT_IMPORT,
+
+	SEV_USER_CMD_MAX,
+};
+
+/**
+ * struct sev_user_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ */
+struct sev_user_data_init {
+	__u32 flags;				/* In */
+};
+
+/**
+ * struct sev_user_data_status - PLATFORM_STATUS command parameters
+ *
+ * @major: major API version
+ * @minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_user_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_user_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @length: length of certificate
+ */
+struct sev_user_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_user_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_user_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_user_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_user_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_issue_cmd - SEV ioctl parameters
+ *
+ * @cmd: SEV commands to execute
+ * @data: userspace address of the command data structure
+ * @error: SEV FW return code on failure
+ */
+struct sev_issue_cmd {
+	__u32 cmd;					/* In */
+	__u64 data;					/* In */
+	__u32 error;					/* Out */
+};
+
+#define SEV_IOC_TYPE		'S'
+#define SEV_ISSUE_CMD	_IOWR(SEV_IOC_TYPE, 0x0, struct sev_issue_cmd)
+
+#endif /* __PSP_SEV_USER_H__ */
+

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 21/32] crypto: ccp: Add Secure Encrypted Virtualization (SEV) interface support
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:16   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The Secure Encrypted Virtualization (SEV) interface allows the memory
contents of a virtual machine (VM) to be transparently encrypted with
a key unique to the guest.

The interface provides:
  - a /dev/sev device and an ioctl (SEV_ISSUE_CMD) to execute the platform
    provisioning commands from userspace.
  - in-kernel APIs to encrypt guest memory regions. The in-kernel APIs
    will be used by KVM to bootstrap and debug SEV guests.

SEV key management spec is available here [1]
[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf
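
For illustration only (not part of this patch), here is a minimal
userspace sketch of the SEV_ISSUE_CMD ioctl, assuming the uapi header
below is installed and the device node is exposed as /dev/sev (the code
as posted registers the misc device as /dev/sev<N>):

  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/psp-sev.h>

  int main(void)
  {
  	struct sev_user_data_status status = { 0 };
  	struct sev_issue_cmd input = {
  		.cmd  = SEV_USER_CMD_PLATFORM_STATUS,
  		.data = (__u64)(unsigned long)&status,
  	};
  	int fd = open("/dev/sev", O_RDWR);

  	if (fd < 0)
  		return 1;

  	if (ioctl(fd, SEV_ISSUE_CMD, &input) < 0) {
  		fprintf(stderr, "SEV_ISSUE_CMD failed, fw error %#x\n",
  			input.error);
  		close(fd);
  		return 1;
  	}

  	printf("SEV API %u.%u, state %u, guests %u\n",
  	       status.api_major, status.api_minor, status.state,
  	       status.guest_count);
  	close(fd);
  	return 0;
  }

On success the driver copies the firmware-filled sev_user_data_status
back through input.data; when the firmware rejects a command, the
ioctl fails and input.error carries the SEV firmware status code.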

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 drivers/crypto/ccp/Kconfig   |    7 
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.h |    6 
 drivers/crypto/ccp/sev-dev.c |  348 ++++++++++++++++++++++
 drivers/crypto/ccp/sev-dev.h |   67 ++++
 drivers/crypto/ccp/sev-ops.c |  324 ++++++++++++++++++++
 include/linux/psp-sev.h      |  672 ++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/Kbuild    |    1 
 include/uapi/linux/psp-sev.h |  123 ++++++++
 9 files changed, 1546 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/ccp/sev-dev.c
 create mode 100644 drivers/crypto/ccp/sev-dev.h
 create mode 100644 drivers/crypto/ccp/sev-ops.c
 create mode 100644 include/linux/psp-sev.h
 create mode 100644 include/uapi/linux/psp-sev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 59c207e..67d1917 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -41,4 +41,11 @@ config CRYPTO_DEV_PSP
 	help
 	 Provide the interface for AMD Platform Security Processor (PSP) device.
 
+config CRYPTO_DEV_SEV
+	bool "Secure Encrypted Virtualization (SEV) interface"
+	default y
+	help
+	 Provide the in-kernel and userspace (/dev/sev) interfaces used to
+	 issue Secure Encrypted Virtualization (SEV) commands.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 12e569d..4c4e77e 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -7,6 +7,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
 ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
+ccp-$(CONFIG_CRYPTO_DEV_SEV) += sev-dev.o sev-ops.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
index bbd3d96..fd67b14 100644
--- a/drivers/crypto/ccp/psp-dev.h
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -70,14 +70,14 @@ int psp_free_sev_irq(struct psp_device *psp, void *data);
 
 struct psp_device *psp_get_master_device(void);
 
-#ifdef CONFIG_AMD_SEV
+#ifdef CONFIG_CRYPTO_DEV_SEV
 
 int sev_dev_init(struct psp_device *psp);
 void sev_dev_destroy(struct psp_device *psp);
 int sev_dev_resume(struct psp_device *psp);
 int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
 
-#else
+#else /* !CONFIG_CRYPTO_DEV_SEV */
 
 static inline int sev_dev_init(struct psp_device *psp)
 {
@@ -96,7 +96,7 @@ static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
 	return -ENODEV;
 }
 
-#endif /* __AMD_SEV_H */
+#endif /* CONFIG_CRYPTO_DEV_SEV */
 
 #endif /* __PSP_DEV_H */
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
new file mode 100644
index 0000000..a67e2d7
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -0,0 +1,348 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/wait.h>
+#include <linux/jiffies.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+extern const struct file_operations sev_fops;
+
+static LIST_HEAD(sev_devs);
+static DEFINE_SPINLOCK(sev_devs_lock);
+static atomic_t sev_id;
+
+static unsigned int psp_poll;
+module_param(psp_poll, uint, 0444);
+MODULE_PARM_DESC(psp_poll, "Poll for sev command completion - any non-zero value");
+
+static DEFINE_MUTEX(sev_cmd_mutex);
+
+void sev_add_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_add_tail(&sev->entry, &sev_devs);
+
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+void sev_del_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_del(&sev->entry);
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+static struct sev_device *get_sev_master_device(void)
+{
+	struct psp_device *psp = psp_get_master_device();
+
+	return psp ? psp->sev_data : NULL;
+}
+
+static int sev_wait_cmd_poll(struct sev_device *sev, unsigned int timeout,
+			     unsigned int *reg)
+{
+	int wait = timeout * 10;	/* 100ms sleep => timeout * 10 */
+
+	while (--wait) {
+		msleep(100);
+
+		*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (*reg & PSP_CMDRESP_RESP)
+			break;
+	}
+
+	if (!wait) {
+		dev_err(sev->dev, "sev command timed out\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int sev_wait_cmd_ioc(struct sev_device *sev, unsigned int timeout,
+			    unsigned int *reg)
+{
+	unsigned long jiffie_timeout = timeout;
+	long ret;
+
+	jiffie_timeout *= HZ;
+
+	sev->int_rcvd = 0;
+
+	ret = wait_event_interruptible_timeout(sev->int_queue, sev->int_rcvd,
+						jiffie_timeout);
+	if (ret <= 0) {
+		dev_err(sev->dev, "sev command (%#x) timed out\n",
+				*reg >> PSP_CMDRESP_CMD_SHIFT);
+		return -ETIMEDOUT;
+	}
+
+	*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+
+	return 0;
+}
+
+static int sev_wait_cmd(struct sev_device *sev, unsigned int timeout,
+			unsigned int *reg)
+{
+	return (*reg & PSP_CMDRESP_IOC) ? sev_wait_cmd_ioc(sev, timeout, reg)
+					: sev_wait_cmd_poll(sev, timeout, reg);
+}
+
+static struct sev_device *sev_alloc_struct(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+
+	sev = devm_kzalloc(dev, sizeof(*sev), GFP_KERNEL);
+	if (!sev)
+		return NULL;
+
+	sev->dev = dev;
+	sev->psp = psp;
+	sev->id = atomic_inc_return(&sev_id);
+
+	snprintf(sev->name, sizeof(sev->name), "sev%u", sev->id);
+	init_waitqueue_head(&sev->int_queue);
+
+	return sev;
+}
+
+irqreturn_t sev_irq_handler(int irq, void *data)
+{
+	struct sev_device *sev = data;
+	unsigned int status;
+
+	status = ioread32(sev->io_regs + PSP_P2CMSG_INTSTS);
+	if (status & (1 << PSP_CMD_COMPLETE_REG)) {
+		int reg;
+
+		reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (reg & PSP_CMDRESP_RESP) {
+			sev->int_rcvd = 1;
+			wake_up_interruptible(&sev->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
+static bool check_sev_support(struct sev_device *sev)
+{
+	/* If bit 0 of PSP_FEATURE_REG is set, SEV is supported by the PSP */
+	if (ioread32(sev->io_regs + PSP_FEATURE_REG) & 1)
+		return true;
+
+	return false;
+}
+
+int sev_dev_init(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+	int ret;
+
+	ret = -ENOMEM;
+	sev = sev_alloc_struct(psp);
+	if (!sev)
+		goto e_err;
+	psp->sev_data = sev;
+
+	sev->io_regs = psp->io_regs;
+
+	dev_dbg(dev, "checking SEV support ...\n");
+	/* check SEV support */
+	if (!check_sev_support(sev)) {
+		dev_dbg(dev, "device does not support SEV\n");
+		goto e_err;
+	}
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = psp_request_sev_irq(sev->psp, sev_irq_handler, sev);
+	if (ret) {
+		dev_err(dev, "unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	/* initialize SEV ops */
+	dev_dbg(dev, "init sev ops\n");
+	ret = sev_ops_init(sev);
+	if (ret) {
+		dev_err(dev, "failed to init sev ops\n");
+		goto e_irq;
+	}
+
+	sev_add_device(sev);
+
+	dev_notice(dev, "sev enabled\n");
+
+	return 0;
+
+e_irq:
+	psp_free_sev_irq(psp, sev);
+e_err:
+	psp->sev_data = NULL;
+
+	dev_notice(dev, "sev initialization failed\n");
+
+	return ret;
+}
+
+void sev_dev_destroy(struct psp_device *psp)
+{
+	struct sev_device *sev = psp->sev_data;
+
+	psp_free_sev_irq(psp, sev);
+
+	sev_ops_destroy(sev);
+
+	sev_del_device(sev);
+}
+
+int sev_dev_resume(struct psp_device *psp)
+{
+	return 0;
+}
+
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return 0;
+}
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *psp_ret)
+{
+	struct sev_device *sev = get_sev_master_device();
+	unsigned int phys_lsb, phys_msb;
+	unsigned int reg;
+	int ret;
+
+	if (!sev)
+		return -ENODEV;
+
+	if (psp_ret)
+		*psp_ret = 0;
+
+	/* Set the physical address for the PSP */
+	phys_lsb = data ? lower_32_bits(__psp_pa(data)) : 0;
+	phys_msb = data ? upper_32_bits(__psp_pa(data)) : 0;
+
+	dev_dbg(sev->dev, "sev command id %#x buffer 0x%08x%08x\n",
+			cmd, phys_msb, phys_lsb);
+
+	/* Only one command at a time... */
+	mutex_lock(&sev_cmd_mutex);
+
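+	/*
+	 * Hand the PSP the system physical address of the command buffer
+	 * through the C2P mailbox registers before issuing the command.
+	 */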
+	iowrite32(phys_lsb, sev->io_regs + PSP_CMDBUFF_ADDR_LO);
+	iowrite32(phys_msb, sev->io_regs + PSP_CMDBUFF_ADDR_HI);
+	wmb();
+
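+	/*
+	 * Build the command register value: the command id goes in the
+	 * upper bits, plus IOC to request a completion interrupt unless
+	 * polling was selected via the psp_poll module parameter.
+	 */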
+	reg = cmd;
+	reg <<= PSP_CMDRESP_CMD_SHIFT;
+	reg |= psp_poll ? 0 : PSP_CMDRESP_IOC;
+	iowrite32(reg, sev->io_regs + PSP_CMDRESP);
+
+	ret = sev_wait_cmd(sev, timeout, &reg);
+	if (ret)
+		goto unlock;
+
+	if (psp_ret)
+		*psp_ret = reg & PSP_CMDRESP_ERR_MASK;
+
+	if (reg & PSP_CMDRESP_ERR_MASK) {
+		dev_dbg(sev->dev, "sev command %u failed (%#010x)\n",
+			cmd, reg & PSP_CMDRESP_ERR_MASK);
+		ret = -EIO;
+	}
+
+unlock:
+	mutex_unlock(&sev_cmd_mutex);
+
+	return ret;
+}
+
+int sev_platform_init(struct sev_data_init *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_INIT, data, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_init);
+
+int sev_platform_shutdown(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_SHUTDOWN, NULL, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_shutdown);
+
+int sev_platform_status(struct sev_data_status *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_PLATFORM_STATUS, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_status);
+
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error)
+{
+	if (!filep || filep->f_op != &sev_fops)
+		return -EBADF;
+
+	return sev_issue_cmd(cmd, data,
+			timeout ? timeout : SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_issue_cmd_external_user);
+
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DEACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_deactivate);
+
+int sev_guest_activate(struct sev_data_activate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_ACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_activate);
+
+int sev_guest_decommission(struct sev_data_decommission *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DECOMMISSION, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_decommission);
+
+int sev_guest_df_flush(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DF_FLUSH, NULL,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_df_flush);
+
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
new file mode 100644
index 0000000..0df6ead
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -0,0 +1,67 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2013,2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SEV_DEV_H__
+#define __SEV_DEV_H__
+
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/miscdevice.h>
+
+#include <linux/psp-sev.h>
+
+#define PSP_C2PMSG(_num)		((_num) << 2)
+#define PSP_CMDRESP			PSP_C2PMSG(32)
+#define PSP_CMDBUFF_ADDR_LO		PSP_C2PMSG(56)
+#define PSP_CMDBUFF_ADDR_HI		PSP_C2PMSG(57)
+#define PSP_FEATURE_REG			PSP_C2PMSG(63)
+
+#define PSP_P2CMSG(_num)		((_num) << 2)
+#define PSP_CMD_COMPLETE_REG		1
+#define PSP_CMD_COMPLETE		PSP_P2CMSG(PSP_CMD_COMPLETE_REG)
+
+#define MAX_PSP_NAME_LEN		16
+#define SEV_DEFAULT_TIMEOUT		5
+
+struct sev_device {
+	struct list_head entry;
+
+	struct dentry *debugfs;
+	struct miscdevice misc;
+
+	unsigned int id;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+	struct psp_device *psp;
+
+	void __iomem *io_regs;
+
+	unsigned int int_rcvd;
+	wait_queue_head_t int_queue;
+};
+
+void sev_add_device(struct sev_device *sev);
+void sev_del_device(struct sev_device *sev);
+
+int sev_ops_init(struct sev_device *sev);
+void sev_ops_destroy(struct sev_device *sev);
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *error);
+
+#endif /* __SEV_DEV_H__ */
diff --git a/drivers/crypto/ccp/sev-ops.c b/drivers/crypto/ccp/sev-ops.c
new file mode 100644
index 0000000..727a8db
--- /dev/null
+++ b/drivers/crypto/ccp/sev-ops.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) command interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/uaccess.h>
+
+#include <uapi/linux/psp-sev.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+static int sev_ioctl_init(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_init *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_init(data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_platform_status(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_status *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_status(data, &argp->error);
+
+	if (copy_to_user((void *)argp->data, data, sizeof(*data)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_csr(struct sev_issue_cmd *argp)
+{
+	int ret;
+	void *csr_addr = NULL;
+	struct sev_data_pek_csr *data;
+	struct sev_user_data_pek_csr input;
+
+	if (copy_from_user(&input, (void *)argp->data,
+			sizeof(struct sev_user_data_pek_csr)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	if (input.address && input.length) {
+		csr_addr = kmalloc(input.length, GFP_KERNEL);
+		if (!csr_addr) {
+			ret = -ENOMEM;
+			goto e_err;
+		}
+		if (copy_from_user(csr_addr, (void *)input.address,
+				input.length)) {
+			ret = -EFAULT;
+			goto e_csr_free;
+		}
+
+		data->address = __psp_pa(csr_addr);
+		data->length = input.length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CSR,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.length = data->length;
+
+	/* copy PEK certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input,
+			sizeof(struct sev_user_data_pek_csr)))
+		ret = -EFAULT;
+e_csr_free:
+	kfree(csr_addr);
+e_err:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_cert_import(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pek_cert_import *data;
+	struct sev_user_data_pek_cert_import input;
+	void *pek_cert, *oca_cert;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	if (!input.pek_cert_address || !input.pek_cert_length ||
+		!input.oca_cert_address || !input.oca_cert_length)
+		return -EINVAL;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	pek_cert = kmalloc(input.pek_cert_length, GFP_KERNEL);
+	if (!pek_cert) {
+		ret = -ENOMEM;
+		goto e_free;
+	}
+	if (copy_from_user(pek_cert, (void *)input.pek_cert_address,
+				input.pek_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_pek_cert;
+	}
+
+	data->pek_cert_address = __psp_pa(pek_cert);
+	data->pek_cert_length = input.pek_cert_length;
+
+	/* copy OCA certificate from userspace */
+	oca_cert = kmalloc(input.oca_cert_length, GFP_KERNEL);
+	if (!oca_cert) {
+		ret = -ENOMEM;
+		goto e_free_pek_cert;
+	}
+	if (copy_from_user(oca_cert, (void *)input.oca_cert_address,
+				input.oca_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_oca_cert;
+	}
+
+	data->oca_cert_address = __psp_pa(oca_cert);
+	data->oca_cert_length = input.oca_cert_length;
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CERT_IMPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+e_free_oca_cert:
+	kfree(oca_cert);
+e_free_pek_cert:
+	kfree(pek_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pdh_cert_export(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pdh_cert_export *data;
+	struct sev_user_data_pdh_cert_export input;
+	void *pdh_cert = NULL, *cert_chain = NULL;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy pdh certificate from userspace */
+	if (input.pdh_cert_length && input.pdh_cert_address) {
+		pdh_cert = kmalloc(input.pdh_cert_length, GFP_KERNEL);
+		if (!pdh_cert) {
+			ret = -ENOMEM;
+			goto e_free;
+		}
+		if (copy_from_user(pdh_cert, (void *)input.pdh_cert_address,
+					input.pdh_cert_length)) {
+			ret = -EFAULT;
+			goto e_free_pdh_cert;
+		}
+
+		data->pdh_cert_address = __psp_pa(pdh_cert);
+		data->pdh_cert_length = input.pdh_cert_length;
+	}
+
+	/* copy cert_chain certificate from userspace */
+	if (input.cert_chain_length && input.cert_chain_address) {
+		cert_chain = kmalloc(input.cert_chain_length, GFP_KERNEL);
+		if (!cert_chain) {
+			ret = -ENOMEM;
+			goto e_free_pdh_cert;
+		}
+		if (copy_from_user(cert_chain, (void *)input.cert_chain_address,
+					input.cert_chain_length)) {
+			ret = -EFAULT;
+			goto e_free_cert_chain;
+		}
+
+		data->cert_chain_address = __psp_pa(cert_chain);
+		data->cert_chain_length = input.cert_chain_length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PDH_CERT_EXPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.cert_chain_length = data->cert_chain_length;
+	input.pdh_cert_length = data->pdh_cert_length;
+
+	/* copy certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input,
+			sizeof(struct sev_user_data_pdh_cert_export)))
+		ret = -EFAULT;
+
+e_free_cert_chain:
+	kfree(cert_chain);
+e_free_pdh_cert:
+	kfree(pdh_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+	int ret = -EFAULT;
+	void __user *argp = (void __user *)arg;
+	struct sev_issue_cmd input;
+
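+	/*
+	 * All requests funnel through SEV_ISSUE_CMD: copy the request in,
+	 * dispatch to the per-command handler, then copy the structure
+	 * back so that userspace can read the firmware error code.
+	 */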
+	if (ioctl != SEV_ISSUE_CMD)
+		return -EINVAL;
+
+	if (copy_from_user(&input, argp, sizeof(struct sev_issue_cmd)))
+		return -EFAULT;
+
+	if (input.cmd >= SEV_USER_CMD_MAX)
+		return -EINVAL;
+
+	switch (input.cmd) {
+
+	case SEV_USER_CMD_INIT: {
+		ret = sev_ioctl_init(&input);
+		break;
+	}
+	case SEV_USER_CMD_SHUTDOWN: {
+		ret = sev_platform_shutdown(&input.error);
+		break;
+	}
+	case SEV_USER_CMD_FACTORY_RESET: {
+		ret = sev_issue_cmd(SEV_CMD_FACTORY_RESET, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PLATFORM_STATUS: {
+		ret = sev_ioctl_platform_status(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PEK_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PDH_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PDH_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CSR: {
+		ret = sev_ioctl_pek_csr(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CERT_IMPORT: {
+		ret = sev_ioctl_pek_cert_import(&input);
+		break;
+	}
+	case SEV_USER_CMD_PDH_CERT_EXPORT: {
+		ret = sev_ioctl_pdh_cert_export(&input);
+		break;
+	}
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	if (copy_to_user(argp, &input, sizeof(struct sev_issue_cmd)))
+		ret = -EFAULT;
+
+	return ret;
+}
+
+const struct file_operations sev_fops = {
+	.owner	= THIS_MODULE,
+	.unlocked_ioctl = sev_ioctl,
+};
+
+int sev_ops_init(struct sev_device *sev)
+{
+	struct miscdevice *misc = &sev->misc;
+
+	misc->minor = MISC_DYNAMIC_MINOR;
+	misc->name = sev->name;
+	misc->fops = &sev_fops;
+
+	return misc_register(misc);
+}
+
+void sev_ops_destroy(struct sev_device *sev)
+{
+	misc_deregister(&sev->misc);
+}
+
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
new file mode 100644
index 0000000..acce6ed
--- /dev/null
+++ b/include/linux/psp-sev.h
@@ -0,0 +1,672 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) driver interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_H__
+#define __PSP_SEV_H__
+
+#ifdef CONFIG_X86
+#include <linux/mem_encrypt.h>
+
+#define __psp_pa(x)	__sme_pa(x)
+#else
+#define __psp_pa(x)	__pa(x)
+#endif
+
+/**
+ * SEV platform and guest management commands
+ */
+enum sev_cmd {
+	/* platform commands */
+	SEV_CMD_INIT			= 0x001,
+	SEV_CMD_SHUTDOWN		= 0x002,
+	SEV_CMD_FACTORY_RESET		= 0x003,
+	SEV_CMD_PLATFORM_STATUS		= 0x004,
+	SEV_CMD_PEK_GEN			= 0x005,
+	SEV_CMD_PEK_CSR			= 0x006,
+	SEV_CMD_PEK_CERT_IMPORT		= 0x007,
+	SEV_CMD_PDH_GEN			= 0x008,
+	SEV_CMD_PDH_CERT_EXPORT		= 0x009,
+	SEV_CMD_DF_FLUSH		= 0x00A,
+
+	/* Guest commands */
+	SEV_CMD_DECOMMISSION		= 0x020,
+	SEV_CMD_ACTIVATE		= 0x021,
+	SEV_CMD_DEACTIVATE		= 0x022,
+	SEV_CMD_GUEST_STATUS		= 0x023,
+
+	/* Guest launch commands */
+	SEV_CMD_LAUNCH_START		= 0x030,
+	SEV_CMD_LAUNCH_UPDATE_DATA	= 0x031,
+	SEV_CMD_LAUNCH_UPDATE_VMSA	= 0x032,
+	SEV_CMD_LAUNCH_MEASURE		= 0x033,
+	SEV_CMD_LAUNCH_UPDATE_SECRET	= 0x034,
+	SEV_CMD_LAUNCH_FINISH		= 0x035,
+
+	/* Guest migration commands (outgoing) */
+	SEV_CMD_SEND_START		= 0x040,
+	SEV_CMD_SEND_UPDATE_DATA	= 0x041,
+	SEV_CMD_SEND_UPDATE_VMSA	= 0x042,
+	SEV_CMD_SEND_FINISH		= 0x043,
+
+	/* Guest migration commands (incoming) */
+	SEV_CMD_RECEIVE_START		= 0x050,
+	SEV_CMD_RECEIVE_UPDATE_DATA	= 0x051,
+	SEV_CMD_RECEIVE_UPDATE_VMSA	= 0x052,
+	SEV_CMD_RECEIVE_FINISH		= 0x053,
+
+	/* Guest debug commands */
+	SEV_CMD_DBG_DECRYPT		= 0x060,
+	SEV_CMD_DBG_ENCRYPT		= 0x061,
+
+	SEV_CMD_MAX,
+};
+
+/**
+ * status codes returned by the SEV commands
+ */
+enum psp_ret_code {
+	SEV_RET_SUCCESS = 0,
+	SEV_RET_INVALID_PLATFORM_STATE,
+	SEV_RET_INVALID_GUEST_STATE,
+	SEV_RET_INVALID_CONFIG,
+	SEV_RET_INVALID_LENGTH,
+	SEV_RET_ALREADY_OWNED,
+	SEV_RET_INVALID_CERTIFICATE,
+	SEV_RET_POLICY_FAILURE,
+	SEV_RET_INACTIVE,
+	SEV_RET_INVALID_ADDRESS,
+	SEV_RET_BAD_SIGNATURE,
+	SEV_RET_BAD_MEASUREMENT,
+	SEV_RET_ASID_OWNED,
+	SEV_RET_INVALID_ASID,
+	SEV_RET_WBINVD_REQUIRED,
+	SEV_RET_DFFLUSH_REQUIRED,
+	SEV_RET_INVALID_GUEST,
+	SEV_RET_INVALID_COMMAND,
+	SEV_RET_ACTIVE,
+	SEV_RET_HWSEV_RET_PLATFORM,
+	SEV_RET_HWSEV_RET_UNSAFE,
+	SEV_RET_UNSUPPORTED,
+	SEV_RET_MAX,
+};
+
+/**
+ * struct sev_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ * @tmr_address: system physical address used for SEV-ES
+ * @tmr_length: length of tmr_address
+ */
+struct sev_data_init {
+	__u32 flags;				/* In */
+	__u32 reserved;				/* In */
+	__u64 tmr_address;			/* In */
+	__u32 tmr_length;			/* In */
+};
+
+/**
+ * struct sev_data_status - PLATFORM_STATUS command parameters
+ *
+ * @major: major API version
+ * @minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @length: length of certificate
+ */
+struct sev_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u32 reserved;					/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u32 reserved;					/* In */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_data_decommission - DECOMMISSION command parameters
+ *
+ * @handle: handle of the VM to decommission
+ */
+struct sev_data_decommission {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_activate - ACTIVATE command parameters
+ *
+ * @handle: handle of the VM to activate
+ * @asid: asid assigned to the VM
+ */
+struct sev_data_activate {
+	u32 handle;				/* In */
+	u32 asid;				/* In */
+};
+
+/**
+ * struct sev_data_deactivate - DEACTIVATE command parameters
+ *
+ * @handle: handle of the VM to deactivate
+ */
+struct sev_data_deactivate {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_guest_status - SEV GUEST_STATUS command parameters
+ *
+ * @handle: handle of the VM to retrieve status
+ * @policy: policy information for the VM
+ * @asid: current ASID of the VM
+ * @state: current state of the VM
+ */
+struct sev_data_guest_status {
+	u32 handle;				/* In */
+	u32 policy;				/* Out */
+	u32 asid;				/* Out */
+	u8 state;				/* Out */
+};
+
+/**
+ * struct sev_data_launch_start - LAUNCH_START command parameters
+ *
+ * @handle: handle assigned to the VM
+ * @policy: guest launch policy
+ * @dh_cert_address: physical address of DH certificate blob
+ * @dh_cert_length: length of DH certificate blob
+ * @session_data_address: physical address of session parameters
+ * @session_data_length: length of session parameters
+ */
+struct sev_data_launch_start {
+	u32 handle;				/* In/Out */
+	u32 policy;				/* In */
+	u64 dh_cert_address;			/* In */
+	u32 dh_cert_length;			/* In */
+	u32 reserved;				/* In */
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In */
+};
+
+/**
+ * struct sev_data_launch_update_data - LAUNCH_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @address: physical address of memory region to encrypt
+ * @length: length of memory to be encrypted
+ */
+struct sev_data_launch_update_data {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_update_vmsa - LAUNCH_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM
+ * @address: physical address of memory region to encrypt
+ * @length: length of memory region to encrypt
+ */
+struct sev_data_launch_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_measure - LAUNCH_MEASURE command parameters
+ *
+ * @handle: handle of the VM to process
+ * @address: physical address containing the measurement blob
+ * @length: length of measurement blob
+ */
+struct sev_data_launch_measure {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In/Out */
+};
+
+/**
+ * struct sev_data_launch_secret - LAUNCH_SECRET command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing the packet header
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: physical address of transport memory buffer
+ * @trans_length: length of transport memory buffer
+ */
+struct sev_data_launch_secret {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_launch_finish - LAUNCH_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_launch_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_send_start - SEND_START command parameters
+ *
+ * @handle: handle of the VM to process
+ * @pdh_cert_address: physical address containing PDH certificate
+ * @pdh_cert_length: length of PDH certificate
+ * @plat_cert_address: physical address containing platform certificate
+ * @plat_cert_length: length of platform certificate
+ * @amd_cert_address: physical address containing AMD certificate
+ * @amd_cert_length: length of AMD certificate
+ * @session_data_address: physical address containing session data
+ * @session_data_length: length of session data
+ */
+struct sev_data_send_start {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In/Out */
+	u32 reserved2;
+	u64 plat_cert_address;			/* In */
+	u32 plat_cert_length;			/* In/Out */
+	u32 reserved3;
+	u64 amd_cert_address;			/* In */
+	u32 amd_cert_length;			/* In/Out */
+	u32 reserved4;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_send_update_data - SEND_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_update_vmsa - SEND_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_finish - SEND_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_send_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_receive_start - RECEIVE_START command parameters
+ *
+ * @handle: handle of the VM to perform receive operation
+ * @pdh_cert_address: system physical address containing PDH certificate blob
+ * @pdh_cert_length: length of PDH certificate blob
+ * @session_data_address: system physical address containing session blob
+ * @session_data_length: length of session blob
+ */
+struct sev_data_receive_start {
+	u32 handle;				/* In/Out */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In */
+	u32 reserved2;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_receive_update_data - RECEIVE_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_update_vmsa - RECEIVE_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_finish - RECEIVE_FINISH command parameters
+ *
+ * @handle: handle of the VM to finish
+ */
+struct sev_data_receive_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
+ *
+ * @handle: handle of the VM to perform debug operation
+ * @src_addr: source address of data to operate on
+ * @dst_addr: destination address of data to operate on
+ * @length: length of data to operate on
+ */
+struct sev_data_dbg {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 src_addr;				/* In */
+	u64 dst_addr;				/* In */
+	u32 length;				/* In */
+};
+
+#if defined(CONFIG_CRYPTO_DEV_SEV)
+
+/**
+ * sev_platform_init - perform SEV INIT command
+ *
+ * @init: sev_data_init structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_init(struct sev_data_init *init, int *error);
+
+/**
+ * sev_platform_shutdown - perform SEV SHUTDOWN command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_shutdown(int *error);
+
+/**
+ * sev_platform_status - perform SEV PLATFORM_STATUS command
+ *
+ * @status: sev_data_status structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_status(struct sev_data_status *status, int *error);
+
+/**
+ * sev_issue_cmd_external_user - issue a SEV command on behalf of another driver
+ *
+ * The function can be used by other drivers to issue a SEV command on
+ * behalf of userspace. The caller must pass a valid SEV file descriptor
+ * so that we know that the caller has access to the SEV device.
+ *
+ * @filep: SEV device file pointer
+ * @cmd: command to issue
+ * @data: command buffer
+ * @timeout: command timeout in seconds; if zero, the default timeout is used
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ * -%EBADF     if the SEV file descriptor is not valid
+ */
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error);
+
+/**
+ * sev_guest_deactivate - perform SEV DEACTIVATE command
+ *
+ * @data: sev_data_deactivate structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error);
+
+/**
+ * sev_guest_activate - perform SEV ACTIVATE command
+ *
+ * @data: sev_data_activate structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_activate(struct sev_data_activate *data, int *error);
+
+/**
+ * sev_guest_df_flush - perform SEV DF_FLUSH command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_df_flush(int *error);
+
+/**
+ * sev_guest_decommission - perform SEV DECOMMISSION command
+ *
+ * @data: sev_data_decommission structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_decommission(struct sev_data_decommission *data, int *error);
+
+#else	/* !CONFIG_CRYPTO_DEV_SEV */
+
+static inline int sev_platform_status(struct sev_data_status *status,
+				      int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_init(struct sev_data_init *init, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_shutdown(int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_issue_cmd_external_user(struct file *filep,
+					unsigned int cmd, void *data,
+					int timeout, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_deactivate(struct sev_data_deactivate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_decommission(struct sev_data_decommission *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_activate(struct sev_data_activate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_df_flush(int *error)
+{
+	return -ENODEV;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_SEV */
+
+#endif	/* __PSP_SEV_H__ */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index f330ba4..2e15ea7 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -481,3 +481,4 @@ header-y += xilinx-v4l2-controls.h
 header-y += zorro.h
 header-y += zorro_ids.h
 header-y += userfaultfd.h
+header-y += psp-sev.h
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
new file mode 100644
index 0000000..050976d
--- /dev/null
+++ b/include/uapi/linux/psp-sev.h
@@ -0,0 +1,123 @@
+
+/*
+ * Userspace interface for AMD Secure Encrypted Virtualization (SEV)
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_USER_H__
+#define __PSP_SEV_USER_H__
+
+#include <linux/types.h>
+
+/**
+ * SEV platform commands
+ */
+enum {
+	SEV_USER_CMD_INIT = 0,
+	SEV_USER_CMD_SHUTDOWN,
+	SEV_USER_CMD_FACTORY_RESET,
+	SEV_USER_CMD_PLATFORM_STATUS,
+	SEV_USER_CMD_PEK_GEN,
+	SEV_USER_CMD_PEK_CSR,
+	SEV_USER_CMD_PDH_GEN,
+	SEV_USER_CMD_PDH_CERT_EXPORT,
+	SEV_USER_CMD_PEK_CERT_IMPORT,
+
+	SEV_USER_CMD_MAX,
+};
+
+/**
+ * struct sev_user_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ */
+struct sev_user_data_init {
+	__u32 flags;				/* In */
+};
+
+/**
+ * struct sev_user_data_status - PLATFORM_STATUS command parameters
+ *
+ * @api_major: major API version
+ * @api_minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_user_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_user_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: address of the buffer that receives the PEK CSR
+ * @length: length of the CSR buffer
+ */
+struct sev_user_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_user_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_user_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_user_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain address
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_user_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_issue_cmd - SEV ioctl parameters
+ *
+ * @cmd: SEV command to execute
+ * @data: userspace address of the command data structure
+ * @error: SEV firmware return code on failure
+ */
+struct sev_issue_cmd {
+	__u32 cmd;					/* In */
+	__u64 data;					/* In */
+	__u32 error;					/* Out */
+};
+
+#define SEV_IOC_TYPE		'S'
+#define SEV_ISSUE_CMD	_IOWR(SEV_IOC_TYPE, 0x0, struct sev_issue_cmd)
+
+#endif /* __PSP_SEV_USER_H__ */
+
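
As a usage illustration (not part of the patch), a minimal userspace
program could query the platform status through the new ioctl roughly as
follows. Note that sev_ops_init() registers the misc device under the
sev%u name from sev_alloc_struct(), so the first node is likely
/dev/sev1; the path below is an assumption:

  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/psp-sev.h>

  int main(void)
  {
  	struct sev_user_data_status status = {};
  	struct sev_issue_cmd input = {
  		.cmd  = SEV_USER_CMD_PLATFORM_STATUS,
  		.data = (unsigned long)&status,
  	};
  	int fd = open("/dev/sev1", O_RDWR);	/* sev%u naming, see sev_alloc_struct() */

  	if (fd < 0 || ioctl(fd, SEV_ISSUE_CMD, &input) < 0) {
  		perror("SEV_ISSUE_CMD");
  		return 1;
  	}

  	printf("SEV API %u.%u, state %u, %u active guests\n",
  	       (unsigned)status.api_major, (unsigned)status.api_minor,
  	       (unsigned)status.state, status.guest_count);
  	close(fd);
  	return 0;
  }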

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 21/32] crypto: ccp: Add Secure Encrypted Virtualization (SEV) interface support
@ 2017-03-02 15:16   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

The Secure Encrypted Virtualization (SEV) interface allows the memory
contents of a virtual machine (VM) to be transparently encrypted with
a key unique to the guest.

The interface provides:
  - a /dev/sev device and an ioctl (SEV_ISSUE_CMD) to execute platform
    provisioning commands from userspace.
  - in-kernel APIs to encrypt guest memory regions. The in-kernel APIs
    will be used by KVM to bootstrap and debug SEV guests.

The SEV key management spec is available at [1].
[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 drivers/crypto/ccp/Kconfig   |    7 
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.h |    6 
 drivers/crypto/ccp/sev-dev.c |  348 ++++++++++++++++++++++
 drivers/crypto/ccp/sev-dev.h |   67 ++++
 drivers/crypto/ccp/sev-ops.c |  324 ++++++++++++++++++++
 include/linux/psp-sev.h      |  672 ++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/Kbuild    |    1 
 include/uapi/linux/psp-sev.h |  123 ++++++++
 9 files changed, 1546 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/ccp/sev-dev.c
 create mode 100644 drivers/crypto/ccp/sev-dev.h
 create mode 100644 drivers/crypto/ccp/sev-ops.c
 create mode 100644 include/linux/psp-sev.h
 create mode 100644 include/uapi/linux/psp-sev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 59c207e..67d1917 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -41,4 +41,11 @@ config CRYPTO_DEV_PSP
 	help
 	 Provide the interface for AMD Platform Security Processor (PSP) device.
 
+config CRYPTO_DEV_SEV
+	bool "Secure Encrypted Virtualization (SEV) interface"
+	default y
+	help
+	 Provide the kernel and userspace (/dev/sev) interfaces to issue
+	 Secure Encrypted Virtualization (SEV) commands.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 12e569d..4c4e77e 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -7,6 +7,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
 ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
+ccp-$(CONFIG_CRYPTO_DEV_SEV) += sev-dev.o sev-ops.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
index bbd3d96..fd67b14 100644
--- a/drivers/crypto/ccp/psp-dev.h
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -70,14 +70,14 @@ int psp_free_sev_irq(struct psp_device *psp, void *data);
 
 struct psp_device *psp_get_master_device(void);
 
-#ifdef CONFIG_AMD_SEV
+#ifdef CONFIG_CRYPTO_DEV_SEV
 
 int sev_dev_init(struct psp_device *psp);
 void sev_dev_destroy(struct psp_device *psp);
 int sev_dev_resume(struct psp_device *psp);
 int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
 
-#else
+#else /* !CONFIG_CRYPTO_DEV_SEV */
 
 static inline int sev_dev_init(struct psp_device *psp)
 {
@@ -96,7 +96,7 @@ static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
 	return -ENODEV;
 }
 
-#endif /* __AMD_SEV_H */
+#endif /* CONFIG_CRYPTO_DEV_SEV */
 
 #endif /* __PSP_DEV_H */
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
new file mode 100644
index 0000000..a67e2d7
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -0,0 +1,348 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/wait.h>
+#include <linux/jiffies.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+extern const struct file_operations sev_fops;
+
+static LIST_HEAD(sev_devs);
+static DEFINE_SPINLOCK(sev_devs_lock);
+static atomic_t sev_id;
+
+static unsigned int psp_poll;
+module_param(psp_poll, uint, 0444);
+MODULE_PARM_DESC(psp_poll, "Poll for SEV command completion instead of using an interrupt (any non-zero value enables polling)");
+
+DEFINE_MUTEX(sev_cmd_mutex);
+
+void sev_add_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_add_tail(&sev->entry, &sev_devs);
+
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+void sev_del_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_del(&sev->entry);
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+static struct sev_device *get_sev_master_device(void)
+{
+	struct psp_device *psp = psp_get_master_device();
+
+	return psp ? psp->sev_data : NULL;
+}
+
+static int sev_wait_cmd_poll(struct sev_device *sev, unsigned int timeout,
+			     unsigned int *reg)
+{
+	int wait = timeout * 10;	/* timeout is in seconds; poll every 100 ms */
+
+	while (--wait) {
+		msleep(100);
+
+		*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (*reg & PSP_CMDRESP_RESP)
+			break;
+	}
+
+	if (!wait) {
+		dev_err(sev->dev, "sev command timed out\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int sev_wait_cmd_ioc(struct sev_device *sev, unsigned int timeout,
+			    unsigned int *reg)
+{
+	unsigned long jiffie_timeout = timeout;
+	long ret;
+
+	jiffie_timeout *= HZ;
+
+	sev->int_rcvd = 0;
+
+	ret = wait_event_interruptible_timeout(sev->int_queue, sev->int_rcvd,
+						jiffie_timeout);
+	if (ret <= 0) {
+		dev_err(sev->dev, "sev command (%#x) timed out\n",
+				*reg >> PSP_CMDRESP_CMD_SHIFT);
+		return -ETIMEDOUT;
+	}
+
+	*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+
+	return 0;
+}
+
+static int sev_wait_cmd(struct sev_device *sev, unsigned int timeout,
+			unsigned int *reg)
+{
+	return (*reg & PSP_CMDRESP_IOC) ? sev_wait_cmd_ioc(sev, timeout, reg)
+					: sev_wait_cmd_poll(sev, timeout, reg);
+}
+
+static struct sev_device *sev_alloc_struct(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+
+	sev = devm_kzalloc(dev, sizeof(*sev), GFP_KERNEL);
+	if (!sev)
+		return NULL;
+
+	sev->dev = dev;
+	sev->psp = psp;
+	sev->id = atomic_inc_return(&sev_id);
+
+	snprintf(sev->name, sizeof(sev->name), "sev%u", sev->id);
+	init_waitqueue_head(&sev->int_queue);
+
+	return sev;
+}
+
+irqreturn_t sev_irq_handler(int irq, void *data)
+{
+	struct sev_device *sev = data;
+	unsigned int status;
+
+	status = ioread32(sev->io_regs + PSP_P2CMSG_INTSTS);
+	if (status & (1 << PSP_CMD_COMPLETE_REG)) {
+		int reg;
+
+		reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (reg & PSP_CMDRESP_RESP) {
+			sev->int_rcvd = 1;
+			wake_up_interruptible(&sev->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
+static bool check_sev_support(struct sev_device *sev)
+{
+	/* If bit 0 of PSP_FEATURE_REG is set, the PSP supports SEV */
+	if (ioread32(sev->io_regs + PSP_FEATURE_REG) & 1)
+		return true;
+
+	return false;
+}
+
+int sev_dev_init(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+	int ret;
+
+	ret = -ENOMEM;
+	sev = sev_alloc_struct(psp);
+	if (!sev)
+		goto e_err;
+	psp->sev_data = sev;
+
+	sev->io_regs = psp->io_regs;
+
+	dev_dbg(dev, "checking SEV support ...\n");
+	if (!check_sev_support(sev)) {
+		dev_dbg(dev, "device does not support SEV\n");
+		ret = -ENODEV;
+		goto e_err;
+	}
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = psp_request_sev_irq(sev->psp, sev_irq_handler, sev);
+	if (ret) {
+		dev_err(dev, "unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	/* initialize SEV ops */
+	dev_dbg(dev, "init sev ops\n");
+	ret = sev_ops_init(sev);
+	if (ret) {
+		dev_err(dev, "failed to init sev ops\n");
+		goto e_irq;
+	}
+
+	sev_add_device(sev);
+
+	dev_notice(dev, "sev enabled\n");
+
+	return 0;
+
+e_irq:
+	psp_free_sev_irq(psp, sev);
+e_err:
+	psp->sev_data = NULL;
+
+	dev_notice(dev, "sev initialization failed\n");
+
+	return ret;
+}
+
+void sev_dev_destroy(struct psp_device *psp)
+{
+	struct sev_device *sev = psp->sev_data;
+
+	psp_free_sev_irq(psp, sev);
+
+	sev_ops_destroy(sev);
+
+	sev_del_device(sev);
+}
+
+int sev_dev_resume(struct psp_device *psp)
+{
+	return 0;
+}
+
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return 0;
+}
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *psp_ret)
+{
+	struct sev_device *sev = get_sev_master_device();
+	unsigned int phys_lsb, phys_msb;
+	unsigned int reg;
+	int ret;
+
+	if (!sev)
+		return -ENODEV;
+
+	if (psp_ret)
+		*psp_ret = 0;
+
+	/* Set the physical address for the PSP */
+	phys_lsb = data ? lower_32_bits(__psp_pa(data)) : 0;
+	phys_msb = data ? upper_32_bits(__psp_pa(data)) : 0;
+
+	dev_dbg(sev->dev, "sev command id %#x buffer 0x%08x%08x\n",
+			cmd, phys_msb, phys_lsb);
+
+	/* Only one command at a time... */
+	mutex_lock(&sev_cmd_mutex);
+
+	iowrite32(phys_lsb, sev->io_regs + PSP_CMDBUFF_ADDR_LO);
+	iowrite32(phys_msb, sev->io_regs + PSP_CMDBUFF_ADDR_HI);
+	wmb();
+
+	reg = cmd;
+	reg <<= PSP_CMDRESP_CMD_SHIFT;
+	reg |= psp_poll ? 0 : PSP_CMDRESP_IOC;
+	iowrite32(reg, sev->io_regs + PSP_CMDRESP);
+
+	ret = sev_wait_cmd(sev, timeout, &reg);
+	if (ret)
+		goto unlock;
+
+	if (psp_ret)
+		*psp_ret = reg & PSP_CMDRESP_ERR_MASK;
+
+	if (reg & PSP_CMDRESP_ERR_MASK) {
+		dev_dbg(sev->dev, "sev command %u failed (%#010x)\n",
+			cmd, reg & PSP_CMDRESP_ERR_MASK);
+		ret = -EIO;
+	}
+
+unlock:
+	mutex_unlock(&sev_cmd_mutex);
+
+	return ret;
+}
+
+int sev_platform_init(struct sev_data_init *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_INIT, data, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_init);
+
+int sev_platform_shutdown(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_SHUTDOWN, NULL, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_shutdown);
+
+int sev_platform_status(struct sev_data_status *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_PLATFORM_STATUS, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_status);
+
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error)
+{
+	if (!filep || filep->f_op != &sev_fops)
+		return -EBADF;
+
+	return sev_issue_cmd(cmd, data,
+			timeout ? timeout : SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_issue_cmd_external_user);
+
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DEACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_deactivate);
+
+int sev_guest_activate(struct sev_data_activate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_ACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_activate);
+
+int sev_guest_decommission(struct sev_data_decommission *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DECOMMISSION, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_decommission);
+
+int sev_guest_df_flush(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DF_FLUSH, NULL,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_df_flush);
+
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
new file mode 100644
index 0000000..0df6ead
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -0,0 +1,67 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2013,2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SEV_DEV_H__
+#define __SEV_DEV_H__
+
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/miscdevice.h>
+
+#include <linux/psp-sev.h>
+
+#define PSP_C2PMSG(_num)		((_num) << 2)
+#define PSP_CMDRESP			PSP_C2PMSG(32)
+#define PSP_CMDBUFF_ADDR_LO		PSP_C2PMSG(56)
+#define PSP_CMDBUFF_ADDR_HI		PSP_C2PMSG(57)
+#define PSP_FEATURE_REG			PSP_C2PMSG(63)
+
+#define PSP_P2CMSG(_num)		((_num) << 2)
+#define PSP_CMD_COMPLETE_REG		1
+#define PSP_CMD_COMPLETE		PSP_P2CMSG(PSP_CMD_COMPLETE_REG)
+
+#define MAX_PSP_NAME_LEN		16
+#define SEV_DEFAULT_TIMEOUT		5
+
+struct sev_device {
+	struct list_head entry;
+
+	struct dentry *debugfs;
+	struct miscdevice misc;
+
+	unsigned int id;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+	struct psp_device *psp;
+
+	void __iomem *io_regs;
+
+	unsigned int int_rcvd;
+	wait_queue_head_t int_queue;
+};
+
+void sev_add_device(struct sev_device *sev);
+void sev_del_device(struct sev_device *sev);
+
+int sev_ops_init(struct sev_device *sev);
+void sev_ops_destroy(struct sev_device *sev);
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *error);
+
+#endif /* __SEV_DEV_H__ */
diff --git a/drivers/crypto/ccp/sev-ops.c b/drivers/crypto/ccp/sev-ops.c
new file mode 100644
index 0000000..727a8db
--- /dev/null
+++ b/drivers/crypto/ccp/sev-ops.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) command interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/uaccess.h>
+
+#include <uapi/linux/psp-sev.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+static int sev_ioctl_init(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_init *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_init(data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_platform_status(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_status *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_status(data, &argp->error);
+
+	if (copy_to_user((void *)argp->data, data, sizeof(*data)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_csr(struct sev_issue_cmd *argp)
+{
+	int ret;
+	void *csr_addr = NULL;
+	struct sev_data_pek_csr *data;
+	struct sev_user_data_pek_csr input;
+
+	if (copy_from_user(&input, (void *)argp->data,
+			sizeof(struct sev_user_data_pek_csr)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	if (input.address && input.length) {
+		csr_addr = kmalloc(input.length, GFP_KERNEL);
+		if (!csr_addr) {
+			ret = -ENOMEM;
+			goto e_err;
+		}
+		if (copy_from_user(csr_addr, (void *)input.address,
+				input.length)) {
+			ret = -EFAULT;
+			goto e_csr_free;
+		}
+
+		data->address = __psp_pa(csr_addr);
+		data->length = input.length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CSR,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.length = data->length;
+
+	/* copy PEK certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input,
+			sizeof(struct sev_user_data_pek_csr)))
+		ret = -EFAULT;
+e_csr_free:
+	kfree(csr_addr);
+e_err:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_cert_import(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pek_cert_import *data;
+	struct sev_user_data_pek_cert_import input;
+	void *pek_cert, *oca_cert;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	if (!input.pek_cert_address || !input.pek_cert_length ||
+		!input.oca_cert_address || !input.oca_cert_length)
+		return -EINVAL;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	pek_cert = kmalloc(input.pek_cert_length, GFP_KERNEL);
+	if (!pek_cert) {
+		ret = -ENOMEM;
+		goto e_free;
+	}
+	if (copy_from_user(pek_cert, (void *)input.pek_cert_address,
+				input.pek_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_pek_cert;
+	}
+
+	data->pek_cert_address = __psp_pa(pek_cert);
+	data->pek_cert_length = input.pek_cert_length;
+
+	/* copy OCA certificate from userspace */
+	oca_cert = kmalloc(input.oca_cert_length, GFP_KERNEL);
+	if (!oca_cert) {
+		ret = -ENOMEM;
+		goto e_free_pek_cert;
+	}
+	if (copy_from_user(oca_cert, (void *)input.oca_cert_address,
+				input.oca_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_oca_cert;
+	}
+
+	data->oca_cert_address = __psp_pa(oca_cert);
+	data->oca_cert_length = input.oca_cert_length;
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CERT_IMPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+e_free_oca_cert:
+	kfree(oca_cert);
+e_free_pek_cert:
+	kfree(pek_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pdh_cert_export(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pdh_cert_export *data;
+	struct sev_user_data_pdh_cert_export input;
+	void *pdh_cert = NULL, *cert_chain = NULL;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy pdh certificate from userspace */
+	if (input.pdh_cert_length && input.pdh_cert_address) {
+		pdh_cert = kmalloc(input.pdh_cert_length, GFP_KERNEL);
+		if (!pdh_cert) {
+			ret = -ENOMEM;
+			goto e_free;
+		}
+		if (copy_from_user(pdh_cert, (void *)input.pdh_cert_address,
+					input.pdh_cert_length)) {
+			ret = -EFAULT;
+			goto e_free_pdh_cert;
+		}
+
+		data->pdh_cert_address = __psp_pa(pdh_cert);
+		data->pdh_cert_length = input.pdh_cert_length;
+	}
+
+	/* copy cert_chain certificate from userspace */
+	if (input.cert_chain_length && input.cert_chain_address) {
+		cert_chain = kmalloc(input.cert_chain_length, GFP_KERNEL);
+		if (!cert_chain) {
+			ret = -ENOMEM;
+			goto e_free_pdh_cert;
+		}
+		if (copy_from_user(cert_chain, (void *)input.cert_chain_address,
+					input.cert_chain_length)) {
+			ret = -EFAULT;
+			goto e_free_cert_chain;
+		}
+
+		data->cert_chain_address = __psp_pa(cert_chain);
+		data->cert_chain_length = input.cert_chain_length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PDH_CERT_EXPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.cert_chain_length = data->cert_chain_length;
+	input.pdh_cert_length = data->pdh_cert_length;
+
+	/* copy certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input, sizeof(input)))
+		ret = -EFAULT;
+
+e_free_cert_chain:
+	kfree(cert_chain);
+e_free_pdh_cert:
+	kfree(pdh_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+	int ret = -EFAULT;
+	void __user *argp = (void __user *)arg;
+	struct sev_issue_cmd input;
+
+	if (ioctl != SEV_ISSUE_CMD)
+		return -EINVAL;
+
+	if (copy_from_user(&input, argp, sizeof(struct sev_issue_cmd)))
+		return -EFAULT;
+
+	if (input.cmd >= SEV_USER_CMD_MAX)
+		return -EINVAL;
+
+	switch (input.cmd) {
+
+	case SEV_USER_CMD_INIT: {
+		ret = sev_ioctl_init(&input);
+		break;
+	}
+	case SEV_USER_CMD_SHUTDOWN: {
+		ret = sev_platform_shutdown(&input.error);
+		break;
+	}
+	case SEV_USER_CMD_FACTORY_RESET: {
+		ret = sev_issue_cmd(SEV_CMD_FACTORY_RESET, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PLATFORM_STATUS: {
+		ret = sev_ioctl_platform_status(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PEK_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PDH_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PDH_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CSR: {
+		ret = sev_ioctl_pek_csr(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CERT_IMPORT: {
+		ret = sev_ioctl_pek_cert_import(&input);
+		break;
+	}
+	case SEV_USER_CMD_PDH_CERT_EXPORT: {
+		ret = sev_ioctl_pdh_cert_export(&input);
+		break;
+	}
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	if (copy_to_user(argp, &input, sizeof(struct sev_issue_cmd)))
+		ret = -EFAULT;
+
+	return ret;
+}
+
+const struct file_operations sev_fops = {
+	.owner	= THIS_MODULE,
+	.unlocked_ioctl = sev_ioctl,
+};
+
+int sev_ops_init(struct sev_device *sev)
+{
+	struct miscdevice *misc = &sev->misc;
+
+	misc->minor = MISC_DYNAMIC_MINOR;
+	misc->name = sev->name;
+	misc->fops = &sev_fops;
+
+	return misc_register(misc);
+}
+
+void sev_ops_destroy(struct sev_device *sev)
+{
+	misc_deregister(&sev->misc);
+}
+
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
new file mode 100644
index 0000000..acce6ed
--- /dev/null
+++ b/include/linux/psp-sev.h
@@ -0,0 +1,672 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) driver interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_H__
+#define __PSP_SEV_H__
+
+#ifdef CONFIG_X86
+#include <linux/mem_encrypt.h>
+
+#define __psp_pa(x)	__sme_pa(x)
+#else
+#define __psp_pa(x)	__pa(x)
+#endif
+
+/**
+ * SEV platform and guest management commands
+ */
+enum sev_cmd {
+	/* platform commands */
+	SEV_CMD_INIT			= 0x001,
+	SEV_CMD_SHUTDOWN		= 0x002,
+	SEV_CMD_FACTORY_RESET		= 0x003,
+	SEV_CMD_PLATFORM_STATUS		= 0x004,
+	SEV_CMD_PEK_GEN			= 0x005,
+	SEV_CMD_PEK_CSR			= 0x006,
+	SEV_CMD_PEK_CERT_IMPORT		= 0x007,
+	SEV_CMD_PDH_GEN			= 0x008,
+	SEV_CMD_PDH_CERT_EXPORT		= 0x009,
+	SEV_CMD_DF_FLUSH		= 0x00A,
+
+	/* Guest commands */
+	SEV_CMD_DECOMMISSION		= 0x020,
+	SEV_CMD_ACTIVATE		= 0x021,
+	SEV_CMD_DEACTIVATE		= 0x022,
+	SEV_CMD_GUEST_STATUS		= 0x023,
+
+	/* Guest launch commands */
+	SEV_CMD_LAUNCH_START		= 0x030,
+	SEV_CMD_LAUNCH_UPDATE_DATA	= 0x031,
+	SEV_CMD_LAUNCH_UPDATE_VMSA	= 0x032,
+	SEV_CMD_LAUNCH_MEASURE		= 0x033,
+	SEV_CMD_LAUNCH_UPDATE_SECRET	= 0x034,
+	SEV_CMD_LAUNCH_FINISH		= 0x035,
+
+	/* Guest migration commands (outgoing) */
+	SEV_CMD_SEND_START		= 0x040,
+	SEV_CMD_SEND_UPDATE_DATA	= 0x041,
+	SEV_CMD_SEND_UPDATE_VMSA	= 0x042,
+	SEV_CMD_SEND_FINISH		= 0x043,
+
+	/* Guest migration commands (incoming) */
+	SEV_CMD_RECEIVE_START		= 0x050,
+	SEV_CMD_RECEIVE_UPDATE_DATA	= 0x051,
+	SEV_CMD_RECEIVE_UPDATE_VMSA	= 0x052,
+	SEV_CMD_RECEIVE_FINISH		= 0x053,
+
+	/* Guest debug commands */
+	SEV_CMD_DBG_DECRYPT		= 0x060,
+	SEV_CMD_DBG_ENCRYPT		= 0x061,
+
+	SEV_CMD_MAX,
+};
+
+/**
+ * status code returned by the commands
+ */
+enum psp_ret_code {
+	SEV_RET_SUCCESS = 0,
+	SEV_RET_INVALID_PLATFORM_STATE,
+	SEV_RET_INVALID_GUEST_STATE,
+	SEV_RET_INVALID_CONFIG,
+	SEV_RET_INVALID_LENGTH,
+	SEV_RET_ALREADY_OWNED,
+	SEV_RET_INVALID_CERTIFICATE,
+	SEV_RET_POLICY_FAILURE,
+	SEV_RET_INACTIVE,
+	SEV_RET_INVALID_ADDRESS,
+	SEV_RET_BAD_SIGNATURE,
+	SEV_RET_BAD_MEASUREMENT,
+	SEV_RET_ASID_OWNED,
+	SEV_RET_INVALID_ASID,
+	SEV_RET_WBINVD_REQUIRED,
+	SEV_RET_DFFLUSH_REQUIRED,
+	SEV_RET_INVALID_GUEST,
+	SEV_RET_INVALID_COMMAND,
+	SEV_RET_ACTIVE,
+	SEV_RET_HWSEV_RET_PLATFORM,
+	SEV_RET_HWSEV_RET_UNSAFE,
+	SEV_RET_UNSUPPORTED,
+	SEV_RET_MAX,
+};
+
+/**
+ * struct sev_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ * @tmr_address: system physical address used for SEV-ES
+ * @tmr_length: length of tmr_address
+ */
+struct sev_data_init {
+	__u32 flags;				/* In */
+	__u32 reserved;				/* In */
+	__u64 tmr_address;			/* In */
+	__u32 tmr_length;			/* In */
+};
+
+/**
+ * struct sev_data_status - PLATFORM_STATUS command parameters
+ *
+ * @api_major: major API version
+ * @api_minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @length: length of certificate
+ */
+struct sev_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u32 reserved;					/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u32 reserved;					/* In */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_data_decommission - DECOMMISSION command parameters
+ *
+ * @handle: handle of the VM to decommission
+ */
+struct sev_data_decommission {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_activate - ACTIVATE command parameters
+ *
+ * @handle: handle of the VM to activate
+ * @asid: asid assigned to the VM
+ */
+struct sev_data_activate {
+	u32 handle;				/* In */
+	u32 asid;				/* In */
+};
+
+/**
+ * struct sev_data_deactivate - DEACTIVATE command parameters
+ *
+ * @handle: handle of the VM to deactivate
+ */
+struct sev_data_deactivate {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_guest_status - SEV GUEST_STATUS command parameters
+ *
+ * @handle: handle of the VM to retrieve status
+ * @policy: policy information for the VM
+ * @asid: current ASID of the VM
+ * @state: current state of the VM
+ */
+struct sev_data_guest_status {
+	u32 handle;				/* In */
+	u32 policy;				/* Out */
+	u32 asid;				/* Out */
+	u8 state;				/* Out */
+};
+
+/**
+ * struct sev_data_launch_start - LAUNCH_START command parameters
+ *
+ * @handle: handle assigned to the VM
+ * @policy: guest launch policy
+ * @dh_cert_address: physical address of DH certificate blob
+ * @dh_cert_length: length of DH certificate blob
+ * @session_data_address: physical address of session parameters
+ * @session_data_length: length of session parameters
+ */
+struct sev_data_launch_start {
+	u32 handle;				/* In/Out */
+	u32 policy;				/* In */
+	u64 dh_cert_address;			/* In */
+	u32 dh_cert_length;			/* In */
+	u32 reserved;				/* In */
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In */
+};
+
+/**
+ * struct sev_data_launch_update_data - LAUNCH_UPDATE_DATA command parameter
+ *
+ * @handle: handle of the VM to update
+ * @length: length of memory to be encrypted
+ * @address: physical address of memory region to encrypt
+ */
+struct sev_data_launch_update_data {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_update_vmsa - LAUNCH_UPDATE_VMSA command
+ *
+ * @handle: handle of the VM
+ * @address: physical address of memory region to encrypt
+ * @length: length of memory region to encrypt
+ */
+struct sev_data_launch_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_measure - LAUNCH_MEASURE command parameters
+ *
+ * @handle: handle of the VM to process
+ * @address: physical address containing the measurement blob
+ * @length: length of measurement blob
+ */
+struct sev_data_launch_measure {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In/Out */
+};
+
+/**
+ * struct sev_data_launch_secret - LAUNCH_UPDATE_SECRET command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing the packet header
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: physical address of transport memory buffer
+ * @trans_length: length of transport memory buffer
+ */
+struct sev_data_launch_secret {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_launch_finish - LAUNCH_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_launch_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_send_start - SEND_START command parameters
+ *
+ * @handle: handle of the VM to process
+ * @pdh_cert_address: physical address containing PDH certificate
+ * @pdh_cert_length: length of PDH certificate
+ * @plat_cert_address: physical address containing platform certificate
+ * @plat_cert_length: length of platform certificate
+ * @amd_cert_address: physical address containing AMD certificate
+ * @amd_cert_length: length of AMD certificate
+ * @session_data_address: physical address containing session data
+ * @session_data_length: length of session data
+ */
+struct sev_data_send_start {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In/Out */
+	u32 reserved2;
+	u64 plat_cert_address;			/* In */
+	u32 plat_cert_length;			/* In/Out */
+	u32 reserved3;
+	u64 amd_cert_address;			/* In */
+	u32 amd_cert_length;			/* In/Out */
+	u32 reserved4;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_send_update_data - SEND_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_update_vmsa - SEND_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_finish - SEND_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_send_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_receive_start - RECEIVE_START command parameters
+ *
+ * @handle: handle of the VM to perform receive operation
+ * @pdh_cert_address: system physical address containing PDH certificate blob
+ * @pdh_cert_length: length of PDH certificate blob
+ * @session_address: system physical address containing session blob
+ * @session_length: length of session blob
+ */
+struct sev_data_receive_start {
+	u32 handle;				/* In/Out */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In */
+	u32 reserved2;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_receive_update_data - RECEIVE_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_update_vmsa - RECEIVE_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_finish - RECEIVE_FINISH command parameters
+ *
+ * @handle: handle of the VM to finish
+ */
+struct sev_data_receive_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
+ *
+ * @handle: handle of the VM to perform debug operation
+ * @src_addr: source address of data to operate on
+ * @dst_addr: destination address of data to operate on
+ * @length: length of data to operate on
+ */
+struct sev_data_dbg {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 src_addr;				/* In */
+	u64 dst_addr;				/* In */
+	u32 length;				/* In */
+};
+
+#if defined(CONFIG_CRYPTO_DEV_SEV)
+
+/**
+ * sev_platform_init - perform SEV INIT command
+ *
+ * @init: sev_data_init structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV returned a non-zero return code
+ */
+int sev_platform_init(struct sev_data_init *init, int *error);
+
+/**
+ * sev_platform_shutdown - perform SEV SHUTDOWN command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV returned a non-zero return code
+ */
+int sev_platform_shutdown(int *error);
+
+/**
+ * sev_platform_status - perform SEV PLATFORM_STATUS command
+ *
+ * @status: sev_data_status structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV returned a non-zero return code
+ */
+int sev_platform_status(struct sev_data_status *status, int *error);
+
+/**
+ * sev_issue_cmd_external_user - issue a SEV command from another driver
+ *
+ * The function can be used by other drivers to issue a SEV command on
+ * behalf of userspace. The caller must pass a valid SEV file descriptor
+ * so that we know that the caller has access to the SEV device.
+ *
+ * @filep: SEV device file pointer
+ * @cmd: command to issue
+ * @data: command buffer
+ * @timeout: command timeout; if zero, the default timeout is used
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV returned a non-zero return code
+ * -%EBADF     if the SEV file descriptor is not valid
+ */
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error);
+
+/**
+ * sev_guest_deactivate - perform SEV DEACTIVATE command
+ *
+ * @deactivate: sev_data_deactivate structure to be processed
+ * @sev_ret: sev command return code
+ *
+ * Returns:
+ * 0 if the sev successfully processed the command
+ * -%ENODEV    if the sev device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the sev command timed out
+ * -%EIO       if the sev returned a non-zero return code
+ */
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error);
+
+/**
+ * sev_guest_activate - perform SEV ACTIVATE command
+ *
+ * @activate: sev_data_activate structure to be processed
+ * @sev_ret: sev command return code
+ *
+ * Returns:
+ * 0 if the sev successfully processed the command
+ * -%ENODEV    if the sev device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the sev command timed out
+ * -%EIO       if the sev returned a non-zero return code
+ */
+int sev_guest_activate(struct sev_data_activate *data, int *error);
+
+/**
+ * sev_guest_df_flush - perform SEV DF_FLUSH command
+ *
+ * @sev_ret: sev command return code
+ *
+ * Returns:
+ * 0 if the sev successfully processed the command
+ * -%ENODEV    if the sev device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the sev command timed out
+ * -%EIO       if the sev returned a non-zero return code
+ */
+int sev_guest_df_flush(int *error);
+
+/**
+ * sev_guest_decommission - perform SEV DECOMMISSION command
+ *
+ * @decommission: sev_data_decommission structure to be processed
+ * @sev_ret: sev command return code
+ *
+ * Returns:
+ * 0 if the sev successfully processed the command
+ * -%ENODEV    if the sev device is not available
+ * -%ENOTSUPP  if the PSP does not support SEV
+ * -%ETIMEDOUT if the sev command timed out
+ * -%EIO       if the sev returned a non-zero return code
+ */
+int sev_guest_decommission(struct sev_data_decommission *data, int *error);
+
+#else	/* !CONFIG_CRYPTO_DEV_SEV */
+
+static inline int sev_platform_status(struct sev_data_status *status,
+				      int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_init(struct sev_data_init *init, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_shutdown(int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_issue_cmd_external_user(struct file *filep,
+					unsigned int cmd, void *data,
+					int timeout, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_deactivate(struct sev_data_deactivate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_decommission(struct sev_data_decommission *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_activate(struct sev_data_activate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_df_flush(int *error)
+{
+	return -ENODEV;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_SEV */
+
+#endif	/* __PSP_SEV_H__ */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index f330ba4..2e15ea7 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -481,3 +481,4 @@ header-y += xilinx-v4l2-controls.h
 header-y += zorro.h
 header-y += zorro_ids.h
 header-y += userfaultfd.h
+header-y += psp-sev.h
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
new file mode 100644
index 0000000..050976d
--- /dev/null
+++ b/include/uapi/linux/psp-sev.h
@@ -0,0 +1,123 @@
+
+/*
+ * Userspace interface for AMD Secure Encrypted Virtualization (SEV)
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_USER_H__
+#define __PSP_SEV_USER_H__
+
+#include <linux/types.h>
+
+/**
+ * SEV platform commands
+ */
+enum {
+	SEV_USER_CMD_INIT = 0,
+	SEV_USER_CMD_SHUTDOWN,
+	SEV_USER_CMD_FACTORY_RESET,
+	SEV_USER_CMD_PLATFORM_STATUS,
+	SEV_USER_CMD_PEK_GEN,
+	SEV_USER_CMD_PEK_CSR,
+	SEV_USER_CMD_PDH_GEN,
+	SEV_USER_CMD_PDH_CERT_EXPORT,
+	SEV_USER_CMD_PEK_CERT_IMPORT,
+
+	SEV_USER_CMD_MAX,
+};
+
+/**
+ * struct sev_user_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ */
+struct sev_user_data_init {
+	__u32 flags;				/* In */
+};
+
+/**
+ * struct sev_user_data_status - PLATFORM_STATUS command parameters
+ *
+ * @api_major: major API version
+ * @api_minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_user_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_user_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: PEK certificate chain
+ * @length: length of certificate
+ */
+struct sev_user_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_user_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_user_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_user_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_user_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_issue_cmd - SEV ioctl parameters
+ *
+ * @cmd: SEV command to execute
+ * @data: pointer to the command structure
+ * @error: SEV FW return code on failure
+ */
+struct sev_issue_cmd {
+	__u32 cmd;					/* In */
+	__u64 data;					/* In */
+	__u32 error;					/* Out */
+};
+
+#define SEV_IOC_TYPE		'S'
+#define SEV_ISSUE_CMD	_IOWR(SEV_IOC_TYPE, 0x0, struct sev_issue_cmd)
+
+#endif /* __PSP_SEV_USER_H__ */
+
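
For context, here is a minimal sketch of how an in-kernel consumer such
as KVM might drive the exported helpers above when binding an ASID to a
guest. This is an illustration only, not part of the patch;
example_bind_asid() and its arguments are hypothetical. The firmware can
return SEV_RET_DFFLUSH_REQUIRED when an ASID is re-bound without an
intervening DF_FLUSH, so the sketch flushes first:

  #include <linux/kernel.h>
  #include <linux/psp-sev.h>

  /* Hypothetical caller; handle and asid come from guest setup. */
  static int example_bind_asid(u32 handle, u32 asid)
  {
  	struct sev_data_activate activate = {
  		.handle	= handle,
  		.asid	= asid,
  	};
  	int error, ret;

  	/* Flush the data fabric so a previously used ASID can be re-bound. */
  	ret = sev_guest_df_flush(&error);
  	if (ret) {
  		pr_err("SEV: DF_FLUSH failed, fw error %#x\n", error);
  		return ret;
  	}

  	ret = sev_guest_activate(&activate, &error);
  	if (ret)
  		pr_err("SEV: ACTIVATE failed, fw error %#x\n", error);

  	return ret;
  }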

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 21/32] crypto: ccp: Add Secure Encrypted Virtualization (SEV) interface support
@ 2017-03-02 15:16   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The Secure Encrypted Virtualization (SEV) interface allows the memory
contents of a virtual machine (VM) to be transparently encrypted with
a key unique to the guest.

The interface provides:
  - /dev/sev device and ioctl (SEV_ISSUE_CMD) to execute the platform
    provisioning commands from userspace.
  - in-kernel APIs to encrypt the guest memory region. The in-kernel APIs
    will be used by KVM to bootstrap and debug the SEV guest.

The SEV key management spec is available at [1]:
[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf
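
To illustrate the intended flow, here is a minimal userspace sketch
(example only, not part of the patch) that queries the platform status
through the new ioctl. It assumes the uapi header below is installed as
<linux/psp-sev.h>; note that the driver registers misc devices named
"sev%u", so the first instance appears as /dev/sev1:

  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/psp-sev.h>

  int main(void)
  {
  	struct sev_user_data_status status = {};
  	struct sev_issue_cmd arg = {};
  	int fd, ret;

  	fd = open("/dev/sev1", O_RDWR);
  	if (fd < 0) {
  		perror("open /dev/sev1");
  		return 1;
  	}

  	arg.cmd = SEV_USER_CMD_PLATFORM_STATUS;
  	arg.data = (__u64)(unsigned long)&status;	/* out buffer */

  	ret = ioctl(fd, SEV_ISSUE_CMD, &arg);
  	if (ret)
  		fprintf(stderr, "SEV_ISSUE_CMD failed: %d (fw error %u)\n",
  			ret, arg.error);
  	else
  		printf("SEV API %u.%u, state %u, %u active guest(s)\n",
  		       status.api_major, status.api_minor,
  		       status.state, status.guest_count);

  	close(fd);
  	return ret ? 1 : 0;
  }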

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 drivers/crypto/ccp/Kconfig   |    7 
 drivers/crypto/ccp/Makefile  |    1 
 drivers/crypto/ccp/psp-dev.h |    6 
 drivers/crypto/ccp/sev-dev.c |  348 ++++++++++++++++++++++
 drivers/crypto/ccp/sev-dev.h |   67 ++++
 drivers/crypto/ccp/sev-ops.c |  324 ++++++++++++++++++++
 include/linux/psp-sev.h      |  672 ++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/Kbuild    |    1 
 include/uapi/linux/psp-sev.h |  123 ++++++++
 9 files changed, 1546 insertions(+), 3 deletions(-)
 create mode 100644 drivers/crypto/ccp/sev-dev.c
 create mode 100644 drivers/crypto/ccp/sev-dev.h
 create mode 100644 drivers/crypto/ccp/sev-ops.c
 create mode 100644 include/linux/psp-sev.h
 create mode 100644 include/uapi/linux/psp-sev.h

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 59c207e..67d1917 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -41,4 +41,11 @@ config CRYPTO_DEV_PSP
 	help
 	 Provide the interface for AMD Platform Security Processor (PSP) device.
 
+config CRYPTO_DEV_SEV
+	bool "Secure Encrypted Virtualization (SEV) interface"
+	default y
+	help
+	 Provide the kernel and userspace (/dev/sev) interface to issue the
+	 Secure Encrypted Virtualization (SEV) commands.
+
 endif
diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
index 12e569d..4c4e77e 100644
--- a/drivers/crypto/ccp/Makefile
+++ b/drivers/crypto/ccp/Makefile
@@ -7,6 +7,7 @@ ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
 	    ccp-dev-v5.o \
 	    ccp-dmaengine.o
 ccp-$(CONFIG_CRYPTO_DEV_PSP) += psp-dev.o
+ccp-$(CONFIG_CRYPTO_DEV_SEV) += sev-dev.o sev-ops.o
 
 obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
 ccp-crypto-objs := ccp-crypto-main.o \
diff --git a/drivers/crypto/ccp/psp-dev.h b/drivers/crypto/ccp/psp-dev.h
index bbd3d96..fd67b14 100644
--- a/drivers/crypto/ccp/psp-dev.h
+++ b/drivers/crypto/ccp/psp-dev.h
@@ -70,14 +70,14 @@ int psp_free_sev_irq(struct psp_device *psp, void *data);
 
 struct psp_device *psp_get_master_device(void);
 
-#ifdef CONFIG_AMD_SEV
+#ifdef CONFIG_CRYPTO_DEV_SEV
 
 int sev_dev_init(struct psp_device *psp);
 void sev_dev_destroy(struct psp_device *psp);
 int sev_dev_resume(struct psp_device *psp);
 int sev_dev_suspend(struct psp_device *psp, pm_message_t state);
 
-#else
+#else /* !CONFIG_CRYPTO_DEV_SEV */
 
 static inline int sev_dev_init(struct psp_device *psp)
 {
@@ -96,7 +96,7 @@ static inline int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
 	return -ENODEV;
 }
 
-#endif /* __AMD_SEV_H */
+#endif /* CONFIG_CRYPTO_DEV_SEV */
 
 #endif /* __PSP_DEV_H */
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
new file mode 100644
index 0000000..a67e2d7
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -0,0 +1,348 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/delay.h>
+#include <linux/wait.h>
+#include <linux/jiffies.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+extern const struct file_operations sev_fops;
+
+static LIST_HEAD(sev_devs);
+static DEFINE_SPINLOCK(sev_devs_lock);
+static atomic_t sev_id;
+
+static unsigned int psp_poll;
+module_param(psp_poll, uint, 0444);
+MODULE_PARM_DESC(psp_poll, "Poll for sev command completion - any non-zero value");
+
+DEFINE_MUTEX(sev_cmd_mutex);
+
+void sev_add_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_add_tail(&sev->entry, &sev_devs);
+
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+void sev_del_device(struct sev_device *sev)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sev_devs_lock, flags);
+
+	list_del(&sev->entry);
+	spin_unlock_irqrestore(&sev_devs_lock, flags);
+}
+
+static struct sev_device *get_sev_master_device(void)
+{
+	struct psp_device *psp = psp_get_master_device();
+
+	return psp ? psp->sev_data : NULL;
+}
+
+static int sev_wait_cmd_poll(struct sev_device *sev, unsigned int timeout,
+			     unsigned int *reg)
+{
+	int wait = timeout * 10;	/* 100ms sleep => timeout * 10 */
+
+	while (--wait) {
+		msleep(100);
+
+		*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (*reg & PSP_CMDRESP_RESP)
+			break;
+	}
+
+	if (!wait) {
+		dev_err(sev->dev, "sev command timed out\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int sev_wait_cmd_ioc(struct sev_device *sev, unsigned int timeout,
+			    unsigned int *reg)
+{
+	unsigned long jiffie_timeout = timeout;
+	long ret;
+
+	jiffie_timeout *= HZ;
+
+	sev->int_rcvd = 0;
+
+	ret = wait_event_interruptible_timeout(sev->int_queue, sev->int_rcvd,
+						jiffie_timeout);
+	if (ret <= 0) {
+		dev_err(sev->dev, "sev command (%#x) timed out\n",
+				*reg >> PSP_CMDRESP_CMD_SHIFT);
+		return -ETIMEDOUT;
+	}
+
+	*reg = ioread32(sev->io_regs + PSP_CMDRESP);
+
+	return 0;
+}
+
+static int sev_wait_cmd(struct sev_device *sev, unsigned int timeout,
+			unsigned int *reg)
+{
+	return (*reg & PSP_CMDRESP_IOC) ? sev_wait_cmd_ioc(sev, timeout, reg)
+					: sev_wait_cmd_poll(sev, timeout, reg);
+}
+
+static struct sev_device *sev_alloc_struct(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+
+	sev = devm_kzalloc(dev, sizeof(*sev), GFP_KERNEL);
+	if (!sev)
+		return NULL;
+
+	sev->dev = dev;
+	sev->psp = psp;
+	sev->id = atomic_inc_return(&sev_id);
+
+	snprintf(sev->name, sizeof(sev->name), "sev%u", sev->id);
+	init_waitqueue_head(&sev->int_queue);
+
+	return sev;
+}
+
+irqreturn_t sev_irq_handler(int irq, void *data)
+{
+	struct sev_device *sev = data;
+	unsigned int status;
+
+	status = ioread32(sev->io_regs + PSP_P2CMSG_INTSTS);
+	if (status & (1 << PSP_CMD_COMPLETE_REG)) {
+		unsigned int reg;
+
+		reg = ioread32(sev->io_regs + PSP_CMDRESP);
+		if (reg & PSP_CMDRESP_RESP) {
+			sev->int_rcvd = 1;
+			wake_up_interruptible(&sev->int_queue);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
+
+static bool check_sev_support(struct sev_device *sev)
+{
+	/* If bit 0 in PSP_FEATURE_REG is set then SEV is supported by the PSP */
+	if (ioread32(sev->io_regs + PSP_FEATURE_REG) & 1)
+		return true;
+
+	return false;
+}
+
+int sev_dev_init(struct psp_device *psp)
+{
+	struct device *dev = psp->dev;
+	struct sev_device *sev;
+	int ret;
+
+	ret = -ENOMEM;
+	sev = sev_alloc_struct(psp);
+	if (!sev)
+		goto e_err;
+	psp->sev_data = sev;
+
+	sev->io_regs = psp->io_regs;
+
+	dev_dbg(dev, "checking SEV support ...\n");
+	/* check SEV support */
+	if (!check_sev_support(sev)) {
+		dev_dbg(dev, "device does not support SEV\n");
+		ret = -ENODEV;
+		goto e_err;
+	}
+
+	dev_dbg(dev, "requesting an IRQ ...\n");
+	/* Request an irq */
+	ret = psp_request_sev_irq(sev->psp, sev_irq_handler, sev);
+	if (ret) {
+		dev_err(dev, "unable to allocate an IRQ\n");
+		goto e_err;
+	}
+
+	/* initialize SEV ops */
+	dev_dbg(dev, "init sev ops\n");
+	ret = sev_ops_init(sev);
+	if (ret) {
+		dev_err(dev, "failed to init sev ops\n");
+		goto e_irq;
+	}
+
+	sev_add_device(sev);
+
+	dev_notice(dev, "sev enabled\n");
+
+	return 0;
+
+e_irq:
+	psp_free_sev_irq(psp, sev);
+e_err:
+	psp->sev_data = NULL;
+
+	dev_notice(dev, "sev initialization failed\n");
+
+	return ret;
+}
+
+void sev_dev_destroy(struct psp_device *psp)
+{
+	struct sev_device *sev = psp->sev_data;
+
+	if (!sev)
+		return;
+
+	psp_free_sev_irq(psp, sev);
+
+	sev_ops_destroy(sev);
+
+	sev_del_device(sev);
+}
+
+int sev_dev_resume(struct psp_device *psp)
+{
+	return 0;
+}
+
+int sev_dev_suspend(struct psp_device *psp, pm_message_t state)
+{
+	return 0;
+}
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *psp_ret)
+{
+	struct sev_device *sev = get_sev_master_device();
+	unsigned int phys_lsb, phys_msb;
+	unsigned int reg;
+	int ret;
+
+	if (!sev)
+		return -ENODEV;
+
+	if (psp_ret)
+		*psp_ret = 0;
+
+	/* Set the physical address for the PSP */
+	phys_lsb = data ? lower_32_bits(__psp_pa(data)) : 0;
+	phys_msb = data ? upper_32_bits(__psp_pa(data)) : 0;
+
+	dev_dbg(sev->dev, "sev command id %#x buffer 0x%08x%08x\n",
+			cmd, phys_msb, phys_lsb);
+
+	/* Only one command at a time... */
+	mutex_lock(&sev_cmd_mutex);
+
+	iowrite32(phys_lsb, sev->io_regs + PSP_CMDBUFF_ADDR_LO);
+	iowrite32(phys_msb, sev->io_regs + PSP_CMDBUFF_ADDR_HI);
+	wmb();
+
+	reg = cmd;
+	reg <<= PSP_CMDRESP_CMD_SHIFT;
+	reg |= psp_poll ? 0 : PSP_CMDRESP_IOC;
+	iowrite32(reg, sev->io_regs + PSP_CMDRESP);
+
+	ret = sev_wait_cmd(sev, timeout, &reg);
+	if (ret)
+		goto unlock;
+
+	if (psp_ret)
+		*psp_ret = reg & PSP_CMDRESP_ERR_MASK;
+
+	if (reg & PSP_CMDRESP_ERR_MASK) {
+		dev_dbg(sev->dev, "sev command %u failed (%#010x)\n",
+			cmd, reg & PSP_CMDRESP_ERR_MASK);
+		ret = -EIO;
+	}
+
+unlock:
+	mutex_unlock(&sev_cmd_mutex);
+
+	return ret;
+}
+
+int sev_platform_init(struct sev_data_init *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_INIT, data, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_init);
+
+int sev_platform_shutdown(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_SHUTDOWN, NULL, SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_shutdown);
+
+int sev_platform_status(struct sev_data_status *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_PLATFORM_STATUS, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_platform_status);
+
+int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
+				void *data, int timeout, int *error)
+{
+	if (!filep || filep->f_op != &sev_fops)
+		return -EBADF;
+
+	return sev_issue_cmd(cmd, data,
+			timeout ? timeout : SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_issue_cmd_external_user);
+
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DEACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_deactivate);
+
+int sev_guest_activate(struct sev_data_activate *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_ACTIVATE, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_activate);
+
+int sev_guest_decommission(struct sev_data_decommission *data, int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DECOMMISSION, data,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_decommission);
+
+int sev_guest_df_flush(int *error)
+{
+	return sev_issue_cmd(SEV_CMD_DF_FLUSH, NULL,
+			SEV_DEFAULT_TIMEOUT, error);
+}
+EXPORT_SYMBOL_GPL(sev_guest_df_flush);
+
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
new file mode 100644
index 0000000..0df6ead
--- /dev/null
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -0,0 +1,67 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) interface
+ *
+ * Copyright (C) 2013,2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __SEV_DEV_H__
+#define __SEV_DEV_H__
+
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/wait.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/miscdevice.h>
+
+#include <linux/psp-sev.h>
+
+#define PSP_C2PMSG(_num)		((_num) << 2)
+#define PSP_CMDRESP			PSP_C2PMSG(32)
+#define PSP_CMDBUFF_ADDR_LO		PSP_C2PMSG(56)
+#define PSP_CMDBUFF_ADDR_HI		PSP_C2PMSG(57)
+#define PSP_FEATURE_REG			PSP_C2PMSG(63)
+
+#define PSP_P2CMSG(_num)		((_num) << 2)
+#define PSP_CMD_COMPLETE_REG		1
+#define PSP_CMD_COMPLETE		PSP_P2CMSG(PSP_CMD_COMPLETE_REG)
+
+#define MAX_PSP_NAME_LEN		16
+#define SEV_DEFAULT_TIMEOUT		5
+
+struct sev_device {
+	struct list_head entry;
+
+	struct dentry *debugfs;
+	struct miscdevice misc;
+
+	unsigned int id;
+	char name[MAX_PSP_NAME_LEN];
+
+	struct device *dev;
+	struct sp_device *sp;
+	struct psp_device *psp;
+
+	void __iomem *io_regs;
+
+	unsigned int int_rcvd;
+	wait_queue_head_t int_queue;
+};
+
+void sev_add_device(struct sev_device *sev);
+void sev_del_device(struct sev_device *sev);
+
+int sev_ops_init(struct sev_device *sev);
+void sev_ops_destroy(struct sev_device *sev);
+
+int sev_issue_cmd(int cmd, void *data, unsigned int timeout, int *error);
+
+#endif /* __SEV_DEV_H__ */
diff --git a/drivers/crypto/ccp/sev-ops.c b/drivers/crypto/ccp/sev-ops.c
new file mode 100644
index 0000000..727a8db
--- /dev/null
+++ b/drivers/crypto/ccp/sev-ops.c
@@ -0,0 +1,324 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) command interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/uaccess.h>
+
+#include <uapi/linux/psp-sev.h>
+
+#include "psp-dev.h"
+#include "sev-dev.h"
+
+static int sev_ioctl_init(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_init *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_init(data, &argp->error);
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_platform_status(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_status *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	ret = sev_platform_status(data, &argp->error);
+
+	if (copy_to_user((void *)argp->data, data, sizeof(*data)))
+		ret = -EFAULT;
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_csr(struct sev_issue_cmd *argp)
+{
+	int ret;
+	void *csr_addr = NULL;
+	struct sev_data_pek_csr *data;
+	struct sev_user_data_pek_csr input;
+
+	if (copy_from_user(&input, (void *)argp->data,
+			sizeof(struct sev_user_data_pek_csr)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	if (input.address && input.length) {
+		csr_addr = kmalloc(input.length, GFP_KERNEL);
+		if (!csr_addr) {
+			ret = -ENOMEM;
+			goto e_err;
+		}
+		if (copy_from_user(csr_addr, (void *)input.address,
+				input.length)) {
+			ret = -EFAULT;
+			goto e_csr_free;
+		}
+
+		data->address = __psp_pa(csr_addr);
+		data->length = input.length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CSR,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.length = data->length;
+
+	/* copy PEK certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input,
+			sizeof(struct sev_user_data_pek_csr)))
+		ret = -EFAULT;
+e_csr_free:
+	kfree(csr_addr);
+e_err:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pek_cert_import(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pek_cert_import *data;
+	struct sev_user_data_pek_cert_import input;
+	void *pek_cert, *oca_cert;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	if (!input.pek_cert_address || !input.pek_cert_length ||
+		!input.oca_cert_address || !input.oca_cert_length)
+		return -EINVAL;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy PEK certificate from userspace */
+	pek_cert = kmalloc(input.pek_cert_length, GFP_KERNEL);
+	if (!pek_cert) {
+		ret = -ENOMEM;
+		goto e_free;
+	}
+	if (copy_from_user(pek_cert, (void *)input.pek_cert_address,
+				input.pek_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_pek_cert;
+	}
+
+	data->pek_cert_address = __psp_pa(pek_cert);
+	data->pek_cert_length = input.pek_cert_length;
+
+	/* copy OCA certificate from userspace */
+	oca_cert = kmalloc(input.oca_cert_length, GFP_KERNEL);
+	if (!oca_cert) {
+		ret = -ENOMEM;
+		goto e_free_pek_cert;
+	}
+	if (copy_from_user(oca_cert, (void *)input.oca_cert_address,
+				input.oca_cert_length)) {
+		ret = -EFAULT;
+		goto e_free_oca_cert;
+	}
+
+	data->oca_cert_address = __psp_pa(oca_cert);
+	data->oca_cert_length = input.oca_cert_length;
+
+	ret = sev_issue_cmd(SEV_CMD_PEK_CERT_IMPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+e_free_oca_cert:
+	kfree(oca_cert);
+e_free_pek_cert:
+	kfree(pek_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static int sev_ioctl_pdh_cert_export(struct sev_issue_cmd *argp)
+{
+	int ret;
+	struct sev_data_pdh_cert_export *data;
+	struct sev_user_data_pdh_cert_export input;
+	void *pdh_cert = NULL, *cert_chain = NULL;
+
+	if (copy_from_user(&input, (void *)argp->data, sizeof(input)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* copy pdh certificate from userspace */
+	if (input.pdh_cert_length && input.pdh_cert_address) {
+		pdh_cert = kmalloc(input.pdh_cert_length, GFP_KERNEL);
+		if (!pdh_cert) {
+			ret = -ENOMEM;
+			goto e_free;
+		}
+		if (copy_from_user(pdh_cert, (void *)input.pdh_cert_address,
+					input.pdh_cert_length)) {
+			ret = -EFAULT;
+			goto e_free_pdh_cert;
+		}
+
+		data->pdh_cert_address = __psp_pa(pdh_cert);
+		data->pdh_cert_length = input.pdh_cert_length;
+	}
+
+	/* copy cert_chain certificate from userspace */
+	if (input.cert_chain_length && input.cert_chain_address) {
+		cert_chain = kmalloc(input.cert_chain_length, GFP_KERNEL);
+		if (!cert_chain) {
+			ret = -ENOMEM;
+			goto e_free_pdh_cert;
+		}
+		if (copy_from_user(cert_chain, (void *)input.cert_chain_address,
+					input.cert_chain_length)) {
+			ret = -EFAULT;
+			goto e_free_cert_chain;
+		}
+
+		data->cert_chain_address = __psp_pa(cert_chain);
+		data->cert_chain_length = input.cert_chain_length;
+	}
+
+	ret = sev_issue_cmd(SEV_CMD_PDH_CERT_EXPORT,
+			data, SEV_DEFAULT_TIMEOUT, &argp->error);
+
+	input.cert_chain_length = data->cert_chain_length;
+	input.pdh_cert_length = data->pdh_cert_length;
+
+	/* copy certificate length to userspace */
+	if (copy_to_user((void *)argp->data, &input, sizeof(input)))
+		ret = -EFAULT;
+
+e_free_cert_chain:
+	kfree(cert_chain);
+e_free_pdh_cert:
+	kfree(pdh_cert);
+e_free:
+	kfree(data);
+	return ret;
+}
+
+static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+	int ret = -EFAULT;
+	void __user *argp = (void __user *)arg;
+	struct sev_issue_cmd input;
+
+	if (ioctl != SEV_ISSUE_CMD)
+		return -EINVAL;
+
+	if (copy_from_user(&input, argp, sizeof(struct sev_issue_cmd)))
+		return -EFAULT;
+
+	if (input.cmd >= SEV_USER_CMD_MAX)
+		return -EINVAL;
+
+	switch (input.cmd) {
+
+	case SEV_USER_CMD_INIT: {
+		ret = sev_ioctl_init(&input);
+		break;
+	}
+	case SEV_USER_CMD_SHUTDOWN: {
+		ret = sev_platform_shutdown(&input.error);
+		break;
+	}
+	case SEV_USER_CMD_FACTORY_RESET: {
+		ret = sev_issue_cmd(SEV_CMD_FACTORY_RESET, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PLATFORM_STATUS: {
+		ret = sev_ioctl_platform_status(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PEK_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PDH_GEN: {
+		ret = sev_issue_cmd(SEV_CMD_PDH_GEN, NULL,
+				SEV_DEFAULT_TIMEOUT, &input.error);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CSR: {
+		ret = sev_ioctl_pek_csr(&input);
+		break;
+	}
+	case SEV_USER_CMD_PEK_CERT_IMPORT: {
+		ret = sev_ioctl_pek_cert_import(&input);
+		break;
+	}
+	case SEV_USER_CMD_PDH_CERT_EXPORT: {
+		ret = sev_ioctl_pdh_cert_export(&input);
+		break;
+	}
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	if (copy_to_user(argp, &input, sizeof(struct sev_issue_cmd)))
+		ret = -EFAULT;
+
+	return ret;
+}
+
+const struct file_operations sev_fops = {
+	.owner	= THIS_MODULE,
+	.unlocked_ioctl = sev_ioctl,
+};
+
+int sev_ops_init(struct sev_device *sev)
+{
+	struct miscdevice *misc = &sev->misc;
+
+	misc->minor = MISC_DYNAMIC_MINOR;
+	misc->name = sev->name;
+	misc->fops = &sev_fops;
+
+	return misc_register(misc);
+}
+
+void sev_ops_destroy(struct sev_device *sev)
+{
+	misc_deregister(&sev->misc);
+}
+
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
new file mode 100644
index 0000000..acce6ed
--- /dev/null
+++ b/include/linux/psp-sev.h
@@ -0,0 +1,672 @@
+/*
+ * AMD Secure Encrypted Virtualization (SEV) driver interface
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_H__
+#define __PSP_SEV_H__
+
+#ifdef CONFIG_X86
+#include <linux/mem_encrypt.h>
+
+#define __psp_pa(x)	__sme_pa(x)
+#else
+#define __psp_pa(x)	__pa(x)
+#endif
+
+struct file;
+
+/**
+ * SEV platform and guest management commands
+ */
+enum sev_cmd {
+	/* platform commands */
+	SEV_CMD_INIT			= 0x001,
+	SEV_CMD_SHUTDOWN		= 0x002,
+	SEV_CMD_FACTORY_RESET		= 0x003,
+	SEV_CMD_PLATFORM_STATUS		= 0x004,
+	SEV_CMD_PEK_GEN			= 0x005,
+	SEV_CMD_PEK_CSR			= 0x006,
+	SEV_CMD_PEK_CERT_IMPORT		= 0x007,
+	SEV_CMD_PDH_GEN			= 0x008,
+	SEV_CMD_PDH_CERT_EXPORT		= 0x009,
+	SEV_CMD_DF_FLUSH		= 0x00A,
+
+	/* Guest commands */
+	SEV_CMD_DECOMMISSION		= 0x020,
+	SEV_CMD_ACTIVATE		= 0x021,
+	SEV_CMD_DEACTIVATE		= 0x022,
+	SEV_CMD_GUEST_STATUS		= 0x023,
+
+	/* Guest launch commands */
+	SEV_CMD_LAUNCH_START		= 0x030,
+	SEV_CMD_LAUNCH_UPDATE_DATA	= 0x031,
+	SEV_CMD_LAUNCH_UPDATE_VMSA	= 0x032,
+	SEV_CMD_LAUNCH_MEASURE		= 0x033,
+	SEV_CMD_LAUNCH_UPDATE_SECRET	= 0x034,
+	SEV_CMD_LAUNCH_FINISH		= 0x035,
+
+	/* Guest migration commands (outgoing) */
+	SEV_CMD_SEND_START		= 0x040,
+	SEV_CMD_SEND_UPDATE_DATA	= 0x041,
+	SEV_CMD_SEND_UPDATE_VMSA	= 0x042,
+	SEV_CMD_SEND_FINISH		= 0x043,
+
+	/* Guest migration commands (incoming) */
+	SEV_CMD_RECEIVE_START		= 0x050,
+	SEV_CMD_RECEIVE_UPDATE_DATA	= 0x051,
+	SEV_CMD_RECEIVE_UPDATE_VMSA	= 0x052,
+	SEV_CMD_RECEIVE_FINISH		= 0x053,
+
+	/* Guest debug commands */
+	SEV_CMD_DBG_DECRYPT		= 0x060,
+	SEV_CMD_DBG_ENCRYPT		= 0x061,
+
+	SEV_CMD_MAX,
+};
+
+/**
+ * status codes returned by the SEV commands
+ */
+enum psp_ret_code {
+	SEV_RET_SUCCESS = 0,
+	SEV_RET_INVALID_PLATFORM_STATE,
+	SEV_RET_INVALID_GUEST_STATE,
+	SEV_RET_INVALID_CONFIG,
+	SEV_RET_INVALID_LENGTH,
+	SEV_RET_ALREADY_OWNED,
+	SEV_RET_INVALID_CERTIFICATE,
+	SEV_RET_POLICY_FAILURE,
+	SEV_RET_INACTIVE,
+	SEV_RET_INVALID_ADDRESS,
+	SEV_RET_BAD_SIGNATURE,
+	SEV_RET_BAD_MEASUREMENT,
+	SEV_RET_ASID_OWNED,
+	SEV_RET_INVALID_ASID,
+	SEV_RET_WBINVD_REQUIRED,
+	SEV_RET_DFFLUSH_REQUIRED,
+	SEV_RET_INVALID_GUEST,
+	SEV_RET_INVALID_COMMAND,
+	SEV_RET_ACTIVE,
+	SEV_RET_HWSEV_RET_PLATFORM,
+	SEV_RET_HWSEV_RET_UNSAFE,
+	SEV_RET_UNSUPPORTED,
+	SEV_RET_MAX,
+};
+
+/**
+ * struct sev_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ * @tmr_address: system physical address used for SEV-ES
+ * @tmr_length: length of the memory region at tmr_address
+ */
+struct sev_data_init {
+	u32 flags;				/* In */
+	u32 reserved;				/* In */
+	u64 tmr_address;			/* In */
+	u32 tmr_length;				/* In */
+};
+
+/**
+ * struct sev_data_status - PLATFORM_STATUS command parameters
+ *
+ * @api_major: major API version
+ * @api_minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_data_status {
+	u8 api_major;				/* Out */
+	u8 api_minor;				/* Out */
+	u8 state;				/* Out */
+	u8 owner;				/* Out */
+	u32 config;				/* Out */
+	u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: buffer that receives the PEK certificate signing request
+ * @length: length of the CSR buffer
+ */
+struct sev_data_pek_csr {
+	u64 address;					/* In */
+	u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_data_pek_cert_import {
+	u64 pek_cert_address;				/* In */
+	u32 pek_cert_length;				/* In */
+	u32 reserved;					/* In */
+	u64 oca_cert_address;				/* In */
+	u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_data_pdh_cert_export {
+	u64 pdh_cert_address;				/* In */
+	u32 pdh_cert_length;				/* In/Out */
+	u32 reserved;					/* In */
+	u64 cert_chain_address;			/* In */
+	u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_data_decommission - DECOMMISSION command parameters
+ *
+ * @handle: handle of the VM to decommission
+ */
+struct sev_data_decommission {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_activate - ACTIVATE command parameters
+ *
+ * @handle: handle of the VM to activate
+ * @asid: asid assigned to the VM
+ */
+struct sev_data_activate {
+	u32 handle;				/* In */
+	u32 asid;				/* In */
+};
+
+/**
+ * struct sev_data_deactivate - DEACTIVATE command parameters
+ *
+ * @handle: handle of the VM to deactivate
+ */
+struct sev_data_deactivate {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_guest_status - SEV GUEST_STATUS command parameters
+ *
+ * @handle: handle of the VM to retrieve status
+ * @policy: policy information for the VM
+ * @asid: current ASID of the VM
+ * @state: current state of the VM
+ */
+struct sev_data_guest_status {
+	u32 handle;				/* In */
+	u32 policy;				/* Out */
+	u32 asid;				/* Out */
+	u8 state;				/* Out */
+};
+
+/**
+ * struct sev_data_launch_start - LAUNCH_START command parameters
+ *
+ * @handle: handle assigned to the VM
+ * @policy: guest launch policy
+ * @dh_cert_address: physical address of DH certificate blob
+ * @dh_cert_length: length of DH certificate blob
+ * @session_data_address: physical address of session parameters
+ * @session_data_length: length of session parameters
+ */
+struct sev_data_launch_start {
+	u32 handle;				/* In/Out */
+	u32 policy;				/* In */
+	u64 dh_cert_address;			/* In */
+	u32 dh_cert_length;			/* In */
+	u32 reserved;				/* In */
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In */
+};
+
+/**
+ * struct sev_data_launch_update_data - LAUNCH_UPDATE_DATA command parameter
+ *
+ * @handle: handle of the VM to update
+ * @address: physical address of memory region to encrypt
+ * @length: length of memory region to encrypt
+ */
+struct sev_data_launch_update_data {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_update_vmsa - LAUNCH_UPDATE_VMSA command
+ *
+ * @handle: handle of the VM
+ * @address: physical address of memory region to encrypt
+ * @length: length of memory region to encrypt
+ */
+struct sev_data_launch_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In */
+};
+
+/**
+ * struct sev_data_launch_measure - LAUNCH_MEASURE command parameters
+ *
+ * @handle: handle of the VM to process
+ * @address: physical address containing the measurement blob
+ * @length: length of measurement blob
+ */
+struct sev_data_launch_measure {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 address;				/* In */
+	u32 length;				/* In/Out */
+};
+
+/**
+ * struct sev_data_launch_secret - LAUNCH_UPDATE_SECRET command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing the packet header
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: physical address of transport memory buffer
+ * @trans_length: length of transport memory buffer
+ */
+struct sev_data_launch_secret {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_launch_finish - LAUNCH_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_launch_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_send_start - SEND_START command parameters
+ *
+ * @handle: handle of the VM to process
+ * @pdh_cert_address: physical address containing PDH certificate
+ * @pdh_cert_length: length of PDH certificate
+ * @plat_cert_address: physical address containing platform certificate
+ * @plat_cert_length: length of platform certificate
+ * @amd_cert_address: physical address containing AMD certificate
+ * @amd_cert_length: length of AMD certificate
+ * @session_data_address: physical address containing session data
+ * @session_data_length: length of session data
+ */
+struct sev_data_send_start {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In/Out */
+	u32 reserved2;
+	u64 plat_cert_address;			/* In */
+	u32 plat_cert_length;			/* In/Out */
+	u32 reserved3;
+	u64 amd_cert_address;			/* In */
+	u32 amd_cert_length;			/* In/Out */
+	u32 reserved4;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_send_update_data - SEND_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_update_vmsa - SEND_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to process
+ * @hdr_address: physical address containing packet header
+ * @hdr_length: length of packet header
+ * @guest_address: physical address of guest memory region to send
+ * @guest_length: length of guest memory region to send
+ * @trans_address: physical address of host memory region
+ * @trans_length: length of host memory region
+ */
+struct sev_data_send_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In/Out */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_send_finish - SEND_FINISH command parameters
+ *
+ * @handle: handle of the VM to process
+ */
+struct sev_data_send_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_receive_start - RECEIVE_START command parameters
+ *
+ * @handle: handle of the VM to perform receive operation
+ * @pdh_cert_address: system physical address containing PDH certificate blob
+ * @pdh_cert_length: length of PDH certificate blob
+ * @session_data_address: system physical address containing session blob
+ * @session_data_length: length of session blob
+ */
+struct sev_data_receive_start {
+	u32 handle;				/* In/Out */
+	u32 reserved1;
+	u64 pdh_cert_address;			/* In */
+	u32 pdh_cert_length;			/* In */
+	u32 reserved2;
+	u64 session_data_address;		/* In */
+	u32 session_data_length;		/* In/Out */
+};
+
+/**
+ * struct sev_data_receive_update_data - RECEIVE_UPDATE_DATA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_data {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_update_vmsa - RECEIVE_UPDATE_VMSA command parameters
+ *
+ * @handle: handle of the VM to update
+ * @hdr_address: physical address containing packet header blob
+ * @hdr_length: length of packet header
+ * @guest_address: system physical address of guest memory region
+ * @guest_length: length of guest memory region
+ * @trans_address: system physical address of transport buffer
+ * @trans_length: length of transport buffer
+ */
+struct sev_data_receive_update_vmsa {
+	u32 handle;				/* In */
+	u32 reserved1;
+	u64 hdr_address;			/* In */
+	u32 hdr_length;				/* In */
+	u32 reserved2;
+	u64 guest_address;			/* In */
+	u32 guest_length;			/* In */
+	u32 reserved3;
+	u64 trans_address;			/* In */
+	u32 trans_length;			/* In */
+};
+
+/**
+ * struct sev_data_receive_finish - RECEIVE_FINISH command parameters
+ *
+ * @handle: handle of the VM to finish
+ */
+struct sev_data_receive_finish {
+	u32 handle;				/* In */
+};
+
+/**
+ * struct sev_data_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
+ *
+ * @handle: handle of the VM to perform debug operation
+ * @src_addr: source address of data to operate on
+ * @dst_addr: destination address of data to operate on
+ * @length: length of data to operate on
+ */
+struct sev_data_dbg {
+	u32 handle;				/* In */
+	u32 reserved;
+	u64 src_addr;				/* In */
+	u64 dst_addr;				/* In */
+	u32 length;				/* In */
+};
+
+#if defined(CONFIG_CRYPTO_DEV_SEV)
+
+/**
+ * sev_platform_init - perform SEV INIT command
+ *
+ * @init: sev_data_init structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_init(struct sev_data_init *init, int *error);
+
+/**
+ * sev_platform_shutdown - perform SEV SHUTDOWN command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_shutdown(int *error);
+
+/**
+ * sev_platform_status - perform SEV PLATFORM_STATUS command
+ *
+ * @status: sev_data_status structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_platform_status(struct sev_data_status *status, int *error);
+
+/**
+ * sev_issue_cmd_external_user - issue SEV command on behalf of userspace
+ *
+ * This function can be used by other drivers to issue a SEV command on
+ * behalf of userspace. The caller must pass a valid SEV file descriptor
+ * so that we know the caller has access to the SEV device.
+ *
+ * @filep: SEV device file pointer
+ * @id: command to issue
+ * @data: command buffer
+ * @timeout: command timeout; if zero, the default timeout is used
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ * -%EINVAL    if the SEV file descriptor is not valid
+ */
+int sev_issue_cmd_external_user(struct file *filep, unsigned int id,
+				void *data, int timeout, int *error);
+
+/**
+ * sev_guest_deactivate - perform SEV DEACTIVATE command
+ *
+ * @data: sev_data_deactivate structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_deactivate(struct sev_data_deactivate *data, int *error);
+
+/**
+ * sev_guest_activate - perform SEV ACTIVATE command
+ *
+ * @data: sev_data_activate structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_activate(struct sev_data_activate *data, int *error);
+
+/**
+ * sev_guest_df_flush - perform SEV DF_FLUSH command
+ *
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_df_flush(int *error);
+
+/**
+ * sev_guest_decommission - perform SEV DECOMMISSION command
+ *
+ * @data: sev_data_decommission structure to be processed
+ * @error: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the SEV device is not available
+ * -%ENOTSUPP  if the device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if the SEV firmware returned a non-zero return code
+ */
+int sev_guest_decommission(struct sev_data_decommission *data, int *error);
+
+#else	/* !CONFIG_CRYPTO_DEV_SEV */
+
+static inline int sev_platform_status(struct sev_data_status *status,
+				      int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_init(struct sev_data_init *init, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_platform_shutdown(int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_issue_cmd_external_user(struct file *filep,
+					unsigned int id, void *data,
+					int timeout, int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_deactivate(struct sev_data_deactivate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_decommission(struct sev_data_decommission *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_activate(struct sev_data_activate *data,
+					int *error)
+{
+	return -ENODEV;
+}
+
+static inline int sev_guest_df_flush(int *error)
+{
+	return -ENODEV;
+}
+
+#endif	/* CONFIG_CRYPTO_DEV_SEV */
+
+#endif	/* __PSP_SEV_H__ */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index f330ba4..2e15ea7 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -481,3 +481,4 @@ header-y += xilinx-v4l2-controls.h
 header-y += zorro.h
 header-y += zorro_ids.h
 header-y += userfaultfd.h
+header-y += psp-sev.h
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
new file mode 100644
index 0000000..050976d
--- /dev/null
+++ b/include/uapi/linux/psp-sev.h
@@ -0,0 +1,123 @@
+/*
+ * Userspace interface for AMD Secure Encrypted Virtualization (SEV)
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __PSP_SEV_USER_H__
+#define __PSP_SEV_USER_H__
+
+#include <linux/types.h>
+
+/**
+ * SEV platform commands
+ */
+enum {
+	SEV_USER_CMD_INIT = 0,
+	SEV_USER_CMD_SHUTDOWN,
+	SEV_USER_CMD_FACTORY_RESET,
+	SEV_USER_CMD_PLATFORM_STATUS,
+	SEV_USER_CMD_PEK_GEN,
+	SEV_USER_CMD_PEK_CSR,
+	SEV_USER_CMD_PDH_GEN,
+	SEV_USER_CMD_PDH_CERT_EXPORT,
+	SEV_USER_CMD_PEK_CERT_IMPORT,
+
+	SEV_USER_CMD_MAX,
+};
+
+/**
+ * struct sev_user_data_init - INIT command parameters
+ *
+ * @flags: processing flags
+ */
+struct sev_user_data_init {
+	__u32 flags;				/* In */
+};
+
+/**
+ * struct sev_user_data_status - PLATFORM_STATUS command parameters
+ *
+ * @major: major API version
+ * @minor: minor API version
+ * @state: platform state
+ * @owner: self-owned or externally owned
+ * @config: platform config flags
+ * @guest_count: number of active guests
+ */
+struct sev_user_data_status {
+	__u8 api_major;				/* Out */
+	__u8 api_minor;				/* Out */
+	__u8 state;				/* Out */
+	__u8 owner;				/* Out */
+	__u32 config;				/* Out */
+	__u32 guest_count;			/* Out */
+};
+
+/**
+ * struct sev_user_data_pek_csr - PEK_CSR command parameters
+ *
+ * @address: buffer that receives the PEK certificate signing request
+ * @length: length of the CSR buffer
+ */
+struct sev_user_data_pek_csr {
+	__u64 address;					/* In */
+	__u32 length;					/* In/Out */
+};
+
+/**
+ * struct sev_user_data_pek_cert_import - PEK_CERT_IMPORT command parameters
+ *
+ * @pek_cert_address: PEK certificate chain
+ * @pek_cert_length: length of PEK certificate
+ * @oca_cert_address: OCA certificate chain
+ * @oca_cert_length: length of OCA certificate
+ */
+struct sev_user_data_pek_cert_import {
+	__u64 pek_cert_address;				/* In */
+	__u32 pek_cert_length;				/* In */
+	__u64 oca_cert_address;				/* In */
+	__u32 oca_cert_length;				/* In */
+};
+
+/**
+ * struct sev_user_data_pdh_cert_export - PDH_CERT_EXPORT command parameters
+ *
+ * @pdh_cert_address: PDH certificate address
+ * @pdh_cert_length: length of PDH certificate
+ * @cert_chain_address: PDH certificate chain
+ * @cert_chain_length: length of PDH certificate chain
+ */
+struct sev_user_data_pdh_cert_export {
+	__u64 pdh_cert_address;				/* In */
+	__u32 pdh_cert_length;				/* In/Out */
+	__u64 cert_chain_address;			/* In */
+	__u32 cert_chain_length;			/* In/Out */
+};
+
+/**
+ * struct sev_issue_cmd - SEV ioctl parameters
+ *
+ * @cmd: SEV command to execute
+ * @data: opaque pointer to the command-specific structure
+ * @error: SEV firmware return code on failure
+ */
+struct sev_issue_cmd {
+	__u32 cmd;					/* In */
+	__u64 data;					/* In */
+	__u32 error;					/* Out */
+};
+
+#define SEV_IOC_TYPE		'S'
+#define SEV_ISSUE_CMD	_IOWR(SEV_IOC_TYPE, 0x0, struct sev_issue_cmd)
+
+#endif /* __PSP_SEV_USER_H__ */
+
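
For reference, here is a minimal userspace sketch of the SEV_ISSUE_CMD
interface added above. It assumes the misc device registered by
sev_ops_init() appears as /dev/sev and that SEV_USER_CMD_PLATFORM_STATUS
fills a struct sev_user_data_status through the 'data' pointer; it is an
illustration, not part of the patch:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/psp-sev.h>

	int main(void)
	{
		struct sev_user_data_status status = {};
		struct sev_issue_cmd arg = {};
		int fd, ret;

		fd = open("/dev/sev", O_RDWR);	/* assumed device node */
		if (fd < 0) {
			perror("open");
			return 1;
		}

		arg.cmd = SEV_USER_CMD_PLATFORM_STATUS;
		arg.data = (unsigned long)&status;	/* out buffer */

		ret = ioctl(fd, SEV_ISSUE_CMD, &arg);
		if (ret)
			fprintf(stderr, "SEV_ISSUE_CMD: ret=%d fw_error=%#x\n",
				ret, arg.error);
		else
			printf("SEV API %d.%d, state %d, %u active guests\n",
			       status.api_major, status.api_minor,
			       status.state, status.guest_count);
		return ret;
	}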

* [RFC PATCH v2 22/32] kvm: svm: prepare to reserve asid for SEV guest
@ 2017-03-02 15:16   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:16 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

In the current implementation, ASID allocation starts from 1. This patch
adds a min_asid variable to the svm_cpu_data structure so that ASID
allocation can start from a value other than 1.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b581499..8d8fe62 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -507,6 +507,7 @@ struct svm_cpu_data {
 	u64 asid_generation;
 	u32 max_asid;
 	u32 next_asid;
+	u32 min_asid;
 	struct kvm_ldttss_desc *tss_desc;
 
 	struct page *save_area;
@@ -763,6 +764,7 @@ static int svm_hardware_enable(void)
 	sd->asid_generation = 1;
 	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	sd->next_asid = sd->max_asid + 1;
+	sd->min_asid = 1;
 
 	native_store_gdt(&gdt_descr);
 	gdt = (struct desc_struct *)gdt_descr.address;
@@ -2026,7 +2028,7 @@ static void new_asid(struct vcpu_svm *svm, struct svm_cpu_data *sd)
 {
 	if (sd->next_asid > sd->max_asid) {
 		++sd->asid_generation;
-		sd->next_asid = 1;
+		sd->next_asid = sd->min_asid;
 		svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;
 	}
 

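To see the effect, here is a worked illustration (not part of the patch;
the numbers are assumed for the example): a later patch in this series
sets sd->min_asid = max_sev_asid + 1, so with max_sev_asid = 15 the ASID
space splits as

	ASID 1  .. 15       -> reserved for SEV guests
	ASID 16 .. max_asid -> non-SEV guests, recycled by new_asid()

and an ASID rollover in new_asid() now restarts at sd->min_asid (16 in
this example) instead of the previously hard-coded 1.
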
* [RFC PATCH v2 23/32] kvm: introduce KVM_MEMORY_ENCRYPT_OP ioctl
@ 2017-03-02 15:17   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

If the hardware supports memory encryption then the KVM_MEMORY_ENCRYPT_OP
ioctl can be used by qemu to issue platform-specific memory encryption
commands.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 arch/x86/kvm/x86.c              |   12 ++++++++++++
 include/uapi/linux/kvm.h        |    2 ++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index bff1f15..62651ad 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1033,6 +1033,8 @@ struct kvm_x86_ops {
 	void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
 
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
+
+	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2099df8..6a737e9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3926,6 +3926,14 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 	return r;
 }
 
+static int kvm_vm_ioctl_memory_encryption_op(struct kvm *kvm, void __user *argp)
+{
+	if (kvm_x86_ops->memory_encryption_op)
+		return kvm_x86_ops->memory_encryption_op(kvm, argp);
+
+	return -ENOTTY;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -4189,6 +4197,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = kvm_vm_ioctl_enable_cap(kvm, &cap);
 		break;
 	}
+	case KVM_MEMORY_ENCRYPT_OP: {
+		r = kvm_vm_ioctl_memory_encryption_op(kvm, argp);
+		break;
+	}
 	default:
 		r = kvm_vm_ioctl_assigned_device(kvm, ioctl, arg);
 	}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index cac48ed..fef7d83 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1281,6 +1281,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_S390_GET_IRQ_STATE	  _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state)
 /* Available with KVM_CAP_X86_SMM */
 #define KVM_SMI                   _IO(KVMIO,   0xb7)
+/* Memory Encryption Commands */
+#define KVM_MEMORY_ENCRYPT_OP	  _IOWR(KVMIO, 0xb8, unsigned long)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)

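A minimal usage sketch of the new ioctl (illustrative only; it assumes the
struct kvm_sev_cmd and KVM_SEV_* definitions added by the next patch in
this series):

	#include <string.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/* Issue an SEV command against a VM fd from KVM_CREATE_VM. */
	static int sev_vm_ioctl(int vm_fd, int sev_fd, __u32 id, void *data)
	{
		struct kvm_sev_cmd cmd;

		memset(&cmd, 0, sizeof(cmd));
		cmd.id = id;			/* e.g. KVM_SEV_LAUNCH_START */
		cmd.data = (unsigned long)data;	/* command-specific buffer */
		cmd.sev_fd = sev_fd;		/* open handle on the SEV device */

		return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
	}
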
* [RFC PATCH v2 24/32] kvm: x86: prepare for SEV guest management API support
@ 2017-03-02 15:17 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

This patch adds the initial support required to integrate the Secure
Encrypted Virtualization (SEV) feature.

ASID management:
 - Reserve an ASID range for SEV guests; the SEV ASID range is obtained
   through CPUID Fn8000_001F[ECX]. A non-SEV guest can use any ASID
   outside the SEV ASID range.
 - A SEV guest must have an ASID value within the ASID range obtained
   through CPUID.
 - A SEV guest must have the same ASID for all of its vcpus. A TLB flush
   is required if a different VMCB for the same ASID is to be run on the
   same host CPU (see the illustration below).

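For illustration, the TLB flush decision implemented in pre_sev_run()
below plays out like this (assumed example, not part of the patch; guest
G holds ASID 5):

	G's vcpu first runs on CPU0     -> no VMCB recorded for ASID 5 -> flush
	same vcpu runs on CPU0 again    -> same VMCB, same CPU         -> no flush
	same vcpu migrates to CPU1      -> last_cpu != CPU1            -> flush
	another vcpu of G runs on CPU0  -> sev_vmcbs[5] differs        -> flush
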
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    8 ++
 arch/x86/kvm/svm.c              |  189 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h        |   98 ++++++++++++++++++++
 3 files changed, 294 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 62651ad..fcc4710 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -719,6 +719,12 @@ struct kvm_hv {
 	HV_REFERENCE_TSC_PAGE tsc_ref;
 };
 
+struct kvm_sev_info {
+	unsigned int handle;	/* firmware handle */
+	unsigned int asid;	/* asid for this guest */
+	int sev_fd;		/* SEV device fd */
+};
+
 struct kvm_arch {
 	unsigned int n_used_mmu_pages;
 	unsigned int n_requested_mmu_pages;
@@ -805,6 +811,8 @@ struct kvm_arch {
 
 	bool x2apic_format;
 	bool x2apic_broadcast_quirk_disabled;
+
+	struct kvm_sev_info sev_info;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8d8fe62..fb63398 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -36,6 +36,7 @@
 #include <linux/slab.h>
 #include <linux/amd-iommu.h>
 #include <linux/hashtable.h>
+#include <linux/psp-sev.h>
 
 #include <asm/apic.h>
 #include <asm/perf_event.h>
@@ -211,6 +212,9 @@ struct vcpu_svm {
 	 */
 	struct list_head ir_list;
 	spinlock_t ir_list_lock;
+
+	/* which host CPU was last used to run this vcpu */
+	unsigned int last_cpu;
 };
 
 /*
@@ -490,6 +494,64 @@ static inline bool gif_set(struct vcpu_svm *svm)
 	return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
 }
 
+/* Secure Encrypted Virtualization */
+static unsigned int max_sev_asid;
+static unsigned long *sev_asid_bitmap;
+
+static bool kvm_sev_enabled(void)
+{
+	return !!max_sev_asid;
+}
+
+static inline struct kvm_sev_info *sev_get_info(struct kvm *kvm)
+{
+	struct kvm_arch *vm_data = &kvm->arch;
+
+	return &vm_data->sev_info;
+}
+
+static unsigned int sev_get_handle(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev_info = sev_get_info(kvm);
+
+	return sev_info->handle;
+}
+
+static inline int sev_guest(struct kvm *kvm)
+{
+	return sev_get_handle(kvm);
+}
+
+static inline int sev_get_asid(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev_info = sev_get_info(kvm);
+
+	if (!sev_info)
+		return -EINVAL;
+
+	return sev_info->asid;
+}
+
+static inline int sev_get_fd(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev_info = sev_get_info(kvm);
+
+	if (!sev_info)
+		return -EINVAL;
+
+	return sev_info->sev_fd;
+}
+
+static inline void sev_set_asid(struct kvm *kvm, int asid)
+{
+	struct kvm_sev_info *sev_info = sev_get_info(kvm);
+
+	if (!sev_info)
+		return;
+
+	sev_info->asid = asid;
+}
+
 static unsigned long iopm_base;
 
 struct kvm_ldttss_desc {
@@ -511,6 +573,8 @@ struct svm_cpu_data {
 	struct kvm_ldttss_desc *tss_desc;
 
 	struct page *save_area;
+
+	struct vmcb **sev_vmcbs;  /* index = sev_asid, value = vmcb pointer */
 };
 
 static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -764,7 +828,7 @@ static int svm_hardware_enable(void)
 	sd->asid_generation = 1;
 	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	sd->next_asid = sd->max_asid + 1;
-	sd->min_asid = 1;
+	sd->min_asid = max_sev_asid + 1;
 
 	native_store_gdt(&gdt_descr);
 	gdt = (struct desc_struct *)gdt_descr.address;
@@ -825,6 +889,7 @@ static void svm_cpu_uninit(int cpu)
 
 	per_cpu(svm_data, raw_smp_processor_id()) = NULL;
 	__free_page(sd->save_area);
+	kfree(sd->sev_vmcbs);
 	kfree(sd);
 }
 
@@ -842,6 +907,14 @@ static int svm_cpu_init(int cpu)
 	if (!sd->save_area)
 		goto err_1;
 
+	if (kvm_sev_enabled()) {
+		/* zeroed so pre_sev_run() sees NULL for never-used ASIDs */
+		sd->sev_vmcbs = kcalloc(max_sev_asid + 1, sizeof(void *),
+					GFP_KERNEL);
+		r = -ENOMEM;
+		if (!sd->sev_vmcbs)
+			goto err_1;
+	}
+
 	per_cpu(svm_data, cpu) = sd;
 
 	return 0;
@@ -1017,6 +1090,61 @@ static int avic_ga_log_notifier(u32 ga_tag)
 	return 0;
 }
 
+static __init void sev_hardware_setup(void)
+{
+	int ret, error, nguests;
+	struct sev_data_init *init;
+	struct sev_data_status *status;
+
+	/*
+	 * Get the maximum number of encrypted guests supported:
+	 *   Fn8000_001F[ECX] Bits 31:0 - number of supported guests
+	 */
+	nguests = cpuid_ecx(0x8000001F);
+	if (!nguests)
+		return;
+
+	init = kzalloc(sizeof(*init), GFP_KERNEL);
+	if (!init)
+		return;
+
+	status = kzalloc(sizeof(*status), GFP_KERNEL);
+	if (!status)
+		goto err_1;
+
+	/* Initialize SEV firmware */
+	ret = sev_platform_init(init, &error);
+	if (ret) {
+		pr_err("SEV: PLATFORM_INIT ret=%d (%#x)\n", ret, error);
+		goto err_2;
+	}
+
+	/* Initialize SEV ASID bitmap */
+	sev_asid_bitmap = kcalloc(BITS_TO_LONGS(nguests),
+				  sizeof(unsigned long), GFP_KERNEL);
+	if (!sev_asid_bitmap) {
+		sev_platform_shutdown(&error);
+		goto err_2;
+	}
+
+	/* Query the platform status and print the API version */
+	ret = sev_platform_status(status, &error);
+	if (ret) {
+		pr_err("SEV: PLATFORM_STATUS ret=%#x\n", error);
+		sev_platform_shutdown(&error);
+		kfree(sev_asid_bitmap);
+		sev_asid_bitmap = NULL;
+		goto err_2;
+	}
+
+	max_sev_asid = nguests;
+
+	pr_info("kvm: SEV enabled\n");
+	pr_info("SEV API: %d.%d\n", status->api_major, status->api_minor);
+err_2:
+	kfree(status);
+err_1:
+	kfree(init);
+}
+
 static __init int svm_hardware_setup(void)
 {
 	int cpu;
@@ -1052,6 +1180,9 @@ static __init int svm_hardware_setup(void)
 		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
 	}
 
+	if (boot_cpu_has(X86_FEATURE_SEV))
+		sev_hardware_setup();
+
 	for_each_possible_cpu(cpu) {
 		r = svm_cpu_init(cpu);
 		if (r)
@@ -1094,10 +1225,25 @@ static __init int svm_hardware_setup(void)
 	return r;
 }
 
+static __exit void sev_hardware_unsetup(void)
+{
+	int ret, err;
+
+	ret = sev_platform_shutdown(&err);
+	if (ret)
+		pr_err("SEV: failed to shutdown PSP rc=%d (%#010x)\n",
+		       ret, err);
+
+	kfree(sev_asid_bitmap);
+}
+
 static __exit void svm_hardware_unsetup(void)
 {
 	int cpu;
 
+	if (kvm_sev_enabled())
+		sev_hardware_unsetup();
+
 	for_each_possible_cpu(cpu)
 		svm_cpu_uninit(cpu);
 
@@ -1157,6 +1303,11 @@ static void avic_init_vmcb(struct vcpu_svm *svm)
 	svm->vcpu.arch.apicv_active = true;
 }
 
+static void sev_init_vmcb(struct vcpu_svm *svm)
+{
+	svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ENABLE;
+}
+
 static void init_vmcb(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -1271,6 +1422,9 @@ static void init_vmcb(struct vcpu_svm *svm)
 	if (avic)
 		avic_init_vmcb(svm);
 
+	if (sev_guest(svm->vcpu.kvm))
+		sev_init_vmcb(svm);
+
 	mark_all_dirty(svm->vmcb);
 
 	enable_gif(svm);
@@ -2084,6 +2238,11 @@ static int pf_interception(struct vcpu_svm *svm)
 	default:
 		error_code = svm->vmcb->control.exit_info_1;
 
+		/*
+		 * In SEV mode the guest physical address will have the C-bit
+		 * set; it must be cleared before handling the fault.
+		 */
+		if (sev_guest(svm->vcpu.kvm))
+			fault_address &= ~sme_me_mask;
 		trace_kvm_page_fault(fault_address, error_code);
 		if (!npt_enabled && kvm_event_needs_reinjection(&svm->vcpu))
 			kvm_mmu_unprotect_page_virt(&svm->vcpu, fault_address);
@@ -4258,12 +4417,40 @@ static void reload_tss(struct kvm_vcpu *vcpu)
 	load_TR_desc();
 }
 
+static void pre_sev_run(struct vcpu_svm *svm)
+{
+	int asid = sev_get_asid(svm->vcpu.kvm);
+	int cpu = raw_smp_processor_id();
+	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
+
+	/* Assign the asid allocated for this SEV guest */
+	svm->vmcb->control.asid = asid;
+
+	/*
+	 * Flush the guest TLB:
+	 * - when a different VMCB for the same ASID is to be run on the
+	 *   same host CPU, or
+	 * - when this VMCB was last executed on a different host CPU.
+	 */
+	if (sd->sev_vmcbs[asid] != (void *)svm->vmcb ||
+	    svm->last_cpuid != cpu)
+		svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;
+
+	svm->last_cpuid = cpu;
+	sd->sev_vmcbs[asid] = (void *)svm->vmcb;
+
+	mark_dirty(svm->vmcb, VMCB_ASID);
+}
+
 static void pre_svm_run(struct vcpu_svm *svm)
 {
 	int cpu = raw_smp_processor_id();
 
 	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
 
+	if (sev_guest(svm->vcpu.kvm))
+		return pre_sev_run(svm);
+
 	/* FIXME: handle wraparound of asid_generation */
 	if (svm->asid_generation != sd->asid_generation)
 		new_asid(svm, sd);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index fef7d83..9df37a2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1284,6 +1284,104 @@ struct kvm_s390_ucas_mapping {
 /* Memory Encryption Commands */
 #define KVM_MEMORY_ENCRYPT_OP	  _IOWR(KVMIO, 0xb8, unsigned long)
 
+/* Secure Encrypted Virtualization mode */
+enum sev_cmd_id {
+	/* Guest launch commands */
+	KVM_SEV_LAUNCH_START = 0,
+	KVM_SEV_LAUNCH_UPDATE_DATA,
+	KVM_SEV_LAUNCH_MEASURE,
+	KVM_SEV_LAUNCH_FINISH,
+	/* Guest migration commands (outgoing) */
+	KVM_SEV_SEND_START,
+	KVM_SEV_SEND_UPDATE_DATA,
+	KVM_SEV_SEND_FINISH,
+	/* Guest migration commands (incoming) */
+	KVM_SEV_RECEIVE_START,
+	KVM_SEV_RECEIVE_UPDATE_DATA,
+	KVM_SEV_RECEIVE_FINISH,
+	/* Guest status and debug commands */
+	KVM_SEV_GUEST_STATUS,
+	KVM_SEV_DBG_DECRYPT,
+	KVM_SEV_DBG_ENCRYPT,
+
+	KVM_SEV_NR_MAX,
+};
+
+struct kvm_sev_cmd {
+	__u32 id;
+	__u64 data;
+	__u32 error;
+	__u32 sev_fd;
+};
+
+struct kvm_sev_launch_start {
+	__u32 handle;
+	__u32 policy;
+	__u64 dh_cert_data;
+	__u32 dh_cert_length;
+	__u64 session_data;
+	__u32 session_length;
+};
+
+struct kvm_sev_launch_update_data {
+	__u64 address;
+	__u32 length;
+};
+
+struct kvm_sev_launch_measure {
+	__u64 address;
+	__u32 length;
+};
+
+struct kvm_sev_send_start {
+	__u64 pdh_cert_data;
+	__u32 pdh_cert_length;
+	__u64 plat_cert_data;
+	__u32 plat_cert_length;
+	__u64 amd_cert_data;
+	__u32 amd_cert_length;
+	__u64 session_data;
+	__u32 session_length;
+};
+
+struct kvm_sev_send_update_data {
+	__u64 hdr_data;
+	__u32 hdr_length;
+	__u64 guest_address;
+	__u32 guest_length;
+	__u64 host_address;
+	__u32 host_length;
+};
+
+struct kvm_sev_receive_start {
+	__u32 handle;
+	__u64 pdh_cert_data;
+	__u32 pdh_cert_length;
+	__u64 session_data;
+	__u32 session_length;
+};
+
+struct kvm_sev_receive_update_data {
+	__u64 hdr_data;
+	__u32 hdr_length;
+	__u64 guest_address;
+	__u32 guest_length;
+	__u64 host_address;
+	__u32 host_length;
+};
+
+struct kvm_sev_guest_status {
+	__u32 handle;
+	__u32 policy;
+	__u32 state;
+};
+
+struct kvm_sev_dbg {
+	__u64 src_addr;
+	__u64 dst_addr;
+	__u32 length;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)
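
For reference, a minimal userspace sketch (not part of the patch) of the
CPUID query that sev_hardware_setup() above performs via
cpuid_ecx(0x8000001F): Fn8000_001F reports SME/SEV capability in EAX, the
C-bit position in EBX[5:0], and the number of simultaneously supported
encrypted guests in ECX.

	#include <cpuid.h>
	#include <stdio.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx;

		/* Fn8000_001F is an extended leaf; __get_cpuid() range-checks it */
		if (!__get_cpuid(0x8000001F, &eax, &ebx, &ecx, &edx))
			return 1;

		printf("SEV supported:            %s\n", (eax & 0x2) ? "yes" : "no");
		printf("C-bit position:           %u\n", ebx & 0x3F);
		printf("encrypted guests (ASIDs): %u\n", ecx);
		return 0;
	}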

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 25/32] kvm: svm: Add support for SEV LAUNCH_START command
  2017-03-02 15:12 ` Brijesh Singh
                   ` (50 preceding siblings ...)
@ 2017-03-02 15:17 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The command is used to bootstrap an SEV guest from unencrypted boot images.
It creates a new VM encryption key (VEK) using the guest owner's public
Diffie-Hellman (DH) certificate and session data. The VEK is then used to
encrypt the guest memory.
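
As a usage illustration (not part of the patch, and assuming the uapi
additions from the previous patch in this series), a VMM such as Qemu could
drive this command through KVM_MEMORY_ENCRYPT_OP on the VM file descriptor;
the zero policy value and the fd handling below are illustrative
assumptions:

	#include <string.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	static int do_sev_launch_start(int vm_fd, int sev_fd)
	{
		struct kvm_sev_launch_start params;
		struct kvm_sev_cmd cmd;

		memset(&params, 0, sizeof(params));
		params.policy = 0;	/* no policy restrictions */

		memset(&cmd, 0, sizeof(cmd));
		cmd.id = KVM_SEV_LAUNCH_START;
		cmd.data = (__u64)(unsigned long)&params;
		cmd.sev_fd = sev_fd;	/* open fd of the SEV device */

		if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0)
			return -1;

		/* on success the firmware-assigned guest handle is written back */
		return params.handle;
	}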

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |  302 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 301 insertions(+), 1 deletion(-)
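
The diff below also introduces sev_asid_new() and sev_asid_free(). For
reference, the allocation discipline can be modeled in a few lines of
standalone C; MAX_SEV_ASID and the array are illustrative stand-ins for
the CPUID-derived maximum and the kernel's bitmap:

	#define MAX_SEV_ASID 16				/* illustrative only */
	static unsigned char used[MAX_SEV_ASID];	/* index = asid - 1 */

	static int asid_new(void)
	{
		int pos;

		for (pos = 0; pos < MAX_SEV_ASID; pos++)
			if (!used[pos])
				break;
		if (pos >= MAX_SEV_ASID)
			return -1;	/* all SEV ASIDs busy (-EBUSY) */
		used[pos] = 1;
		return pos + 1;		/* ASID 0 belongs to the host */
	}

	static void asid_free(int asid)
	{
		used[asid - 1] = 0;
	}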

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index fb63398..b5fa8c0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -37,6 +37,7 @@
 #include <linux/amd-iommu.h>
 #include <linux/hashtable.h>
 #include <linux/psp-sev.h>
+#include <linux/file.h>
 
 #include <asm/apic.h>
 #include <asm/perf_event.h>
@@ -497,6 +498,10 @@ static inline bool gif_set(struct vcpu_svm *svm)
 /* Secure Encrypted Virtualization */
 static unsigned int max_sev_asid;
 static unsigned long *sev_asid_bitmap;
+static void sev_deactivate_handle(struct kvm *kvm);
+static void sev_decommission_handle(struct kvm *kvm);
+static int sev_asid_new(void);
+static void sev_asid_free(int asid);
 
 static bool kvm_sev_enabled(void)
 {
@@ -1534,6 +1539,17 @@ static inline int avic_free_vm_id(int id)
 	return 0;
 }
 
+static void sev_vm_destroy(struct kvm *kvm)
+{
+	if (!sev_guest(kvm))
+		return;
+
+	/* release the firmware resources */
+	sev_deactivate_handle(kvm);
+	sev_decommission_handle(kvm);
+	sev_asid_free(sev_get_asid(kvm));
+}
+
 static void avic_vm_destroy(struct kvm *kvm)
 {
 	unsigned long flags;
@@ -1551,6 +1567,12 @@ static void avic_vm_destroy(struct kvm *kvm)
 	spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
 }
 
+static void svm_vm_destroy(struct kvm *kvm)
+{
+	avic_vm_destroy(kvm);
+	sev_vm_destroy(kvm);
+}
+
 static int avic_vm_init(struct kvm *kvm)
 {
 	unsigned long flags;
@@ -5502,6 +5524,282 @@ static inline void avic_post_state_restore(struct kvm_vcpu *vcpu)
 	avic_handle_ldr_update(vcpu);
 }
 
+static int sev_asid_new(void)
+{
+	int pos;
+
+	if (!max_sev_asid)
+		return -EINVAL;
+
+	pos = find_first_zero_bit(sev_asid_bitmap, max_sev_asid);
+	if (pos >= max_sev_asid)
+		return -EBUSY;
+
+	set_bit(pos, sev_asid_bitmap);
+	return pos + 1;
+}
+
+static void sev_asid_free(int asid)
+{
+	int cpu, pos;
+	struct svm_cpu_data *sd;
+
+	pos = asid - 1;
+	clear_bit(pos, sev_asid_bitmap);
+
+	for_each_possible_cpu(cpu) {
+		sd = per_cpu(svm_data, cpu);
+		sd->sev_vmcbs[pos] = NULL;
+	}
+}
+
+static int sev_issue_cmd(struct kvm *kvm, int id, void *data, int *error)
+{
+	int ret;
+	struct fd f;
+	int fd = sev_get_fd(kvm);
+
+	f = fdget(fd);
+	if (!f.file)
+		return -EBADF;
+
+	ret = sev_issue_cmd_external_user(f.file, id, data, 0, error);
+	fdput(f);
+
+	return ret;
+}
+
+static void sev_decommission_handle(struct kvm *kvm)
+{
+	int ret, error;
+	struct sev_data_decommission *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return;
+
+	data->handle = sev_get_handle(kvm);
+	ret = sev_guest_decommission(data, &error);
+	if (ret)
+		pr_err("SEV: DECOMMISSION %d (%#x)\n", ret, error);
+
+	kfree(data);
+}
+
+static void sev_deactivate_handle(struct kvm *kvm)
+{
+	int ret, error;
+	struct sev_data_deactivate *data;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return;
+
+	data->handle = sev_get_handle(kvm);
+	ret = sev_guest_deactivate(data, &error);
+	if (ret) {
+		pr_err("SEV: DEACTIVATE %d (%#x)\n", ret, error);
+		goto buffer_free;
+	}
+
+	wbinvd_on_all_cpus();
+
+	ret = sev_guest_df_flush(&error);
+	if (ret)
+		pr_err("SEV: DF_FLUSH %d (%#x)\n", ret, error);
+
+buffer_free:
+	kfree(data);
+}
+
+static int sev_activate_asid(unsigned int handle, int asid, int *error)
+{
+	int ret;
+	struct sev_data_activate *data;
+
+	wbinvd_on_all_cpus();
+
+	ret = sev_guest_df_flush(error);
+	if (ret) {
+		pr_err("SEV: DF_FLUSH %d (%#x)\n", ret, *error);
+		return ret;
+	}
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = handle;
+	data->asid   = asid;
+	ret = sev_guest_activate(data, error);
+	if (ret)
+		pr_err("SEV: ACTIVATE %d (%#x)\n", ret, *error);
+
+	kfree(data);
+	return ret;
+}
+
+static int sev_pre_start(struct kvm *kvm, int *asid)
+{
+	int ret;
+
+	/*
+	 * If the guest already has an active SEV handle, deactivate and
+	 * decommission it before creating the new encryption context.
+	 */
+	if (sev_guest(kvm)) {
+		sev_deactivate_handle(kvm);
+		sev_decommission_handle(kvm);
+		*asid = sev_get_asid(kvm);  /* reuse the asid */
+		ret = 0;
+	} else {
+		/* Allocate new asid for this launch */
+		ret = sev_asid_new();
+		if (ret < 0) {
+			pr_err("SEV: failed to get free asid\n");
+			return ret;
+		}
+		*asid = ret;
+		ret = 0;
+	}
+
+	return ret;
+}
+
+static int sev_post_start(struct kvm *kvm, int asid, int handle,
+			int sev_fd, int *error)
+{
+	int ret;
+
+	/* activate asid */
+	ret = sev_activate_asid(handle, asid, error);
+	if (ret)
+		return ret;
+
+	kvm->arch.sev_info.handle = handle;
+	kvm->arch.sev_info.asid = asid;
+	kvm->arch.sev_info.sev_fd = sev_fd;
+
+	return 0;
+}
+
+static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	int ret, asid = 0;
+	void *dh_cert_addr = NULL;
+	void *session_addr = NULL;
+	struct kvm_sev_launch_start params;
+	struct sev_data_launch_start *start;
+	int *error = &argp->error;
+	struct fd f;
+
+	f = fdget(argp->sev_fd);
+	if (!f.file)
+		return -EBADF;
+
+	/* Get the parameters from the user */
+	ret = -EFAULT;
+	if (copy_from_user(&params, (void __user *)argp->data,
+				sizeof(struct kvm_sev_launch_start)))
+		goto err_1;
+
+	ret = -ENOMEM;
+	start = kzalloc(sizeof(*start), GFP_KERNEL);
+	if (!start)
+		goto err_1;
+
+	ret = sev_pre_start(kvm, &asid);
+	if (ret)
+		goto err_2;
+
+	start->handle = params.handle;
+	start->policy = params.policy;
+
+	/* Copy DH certificate from userspace */
+	if (params.dh_cert_length && params.dh_cert_data) {
+		dh_cert_addr = kmalloc(params.dh_cert_length, GFP_KERNEL);
+		if (!dh_cert_addr) {
+			ret = -ENOMEM;
+			goto err_3;
+		}
+		if (copy_from_user(dh_cert_addr, (void __user *)params.dh_cert_data,
+				params.dh_cert_length)) {
+			ret = -EFAULT;
+			goto err_3;
+		}
+
+		start->dh_cert_address = __psp_pa(dh_cert_addr);
+		start->dh_cert_length = params.dh_cert_length;
+	}
+
+	/* Copy session data from userspace */
+	if (params.session_length && params.session_data) {
+		session_addr = kmalloc(params.session_length, GFP_KERNEL);
+		if (!session_addr) {
+			ret = -ENOMEM;
+			goto err_3;
+		}
+		if (copy_from_user(session_addr, (void __user *)params.session_data,
+				params.session_length)) {
+			ret = -EFAULT;
+			goto err_3;
+		}
+		start->session_data_address = __psp_pa(session_addr);
+		start->session_data_length = params.session_length;
+	}
+
+	/* launch start */
+	ret = sev_issue_cmd_external_user(f.file, SEV_CMD_LAUNCH_START,
+					  start, 0, error);
+	if (ret) {
+		pr_err("SEV: LAUNCH_START ret=%d (%#010x)\n", ret, *error);
+		goto err_3;
+	}
+
+	ret = sev_post_start(kvm, asid, start->handle, argp->sev_fd, error);
+	if (ret)
+		goto err_3;
+
+	params.handle = start->handle;
+	if (copy_to_user((void __user *)argp->data, &params,
+				sizeof(struct kvm_sev_launch_start)))
+		ret = -EFAULT;
+err_3:
+	if (ret && asid) /* free asid if we have encountered error */
+		sev_asid_free(asid);
+	kfree(dh_cert_addr);
+	kfree(session_addr);
+err_2:
+	kfree(start);
+err_1:
+	fdput(f);
+	return ret;
+}
+
+static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
+{
+	int r = -ENOTTY;
+	struct kvm_sev_cmd sev_cmd;
+
+	if (copy_from_user(&sev_cmd, argp, sizeof(struct kvm_sev_cmd)))
+		return -EFAULT;
+
+	mutex_lock(&kvm->lock);
+
+	switch (sev_cmd.id) {
+	case KVM_SEV_LAUNCH_START: {
+		r = sev_launch_start(kvm, &sev_cmd);
+		break;
+	}
+	default:
+		break;
+	}
+
+	mutex_unlock(&kvm->lock);
+	if (copy_to_user(argp, &sev_cmd, sizeof(struct kvm_sev_cmd)))
+		r = -EFAULT;
+	return r;
+}
+
 static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -5518,7 +5816,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.vcpu_reset = svm_vcpu_reset,
 
 	.vm_init = avic_vm_init,
-	.vm_destroy = avic_vm_destroy,
+	.vm_destroy = svm_vm_destroy,
 
 	.prepare_guest_switch = svm_prepare_guest_switch,
 	.vcpu_load = svm_vcpu_load,
@@ -5617,6 +5915,8 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.pmu_ops = &amd_pmu_ops,
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
 	.update_pi_irte = svm_update_pi_irte,
+
+	.memory_encryption_op = amd_memory_encryption_cmd,
 };
 
 static int __init svm_init(void)

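For reference, here is a minimal, hypothetical sketch of how a VMM such
as Qemu might drive the command above through the new
KVM_MEMORY_ENCRYPT_OP ioctl. It is not code from this series: it assumes
the uapi definitions introduced earlier in the series (struct
kvm_sev_cmd, struct kvm_sev_launch_start and the KVM_SEV_* command ids)
are visible through <linux/kvm.h>, that vm_fd is an open KVM VM fd, and
that sev_fd is an open fd of the SEV device exposed by the PSP driver.

#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int do_sev_launch_start(int vm_fd, int sev_fd, __u32 policy)
{
	struct kvm_sev_launch_start start;
	struct kvm_sev_cmd cmd;

	memset(&start, 0, sizeof(start));
	start.handle = 0;	/* 0 asks the firmware for a new handle */
	start.policy = policy;	/* guest owner's policy bits */
	/* dh_cert_data and session_data left zero: no owner session here */

	memset(&cmd, 0, sizeof(cmd));
	cmd.id     = KVM_SEV_LAUNCH_START;
	cmd.sev_fd = sev_fd;	/* consumed by sev_launch_start() above */
	cmd.data   = (__u64)(unsigned long)&start;

	if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0) {
		fprintf(stderr, "LAUNCH_START failed, fw error %#x\n",
			cmd.error);
		return -1;
	}

	/* on success the kernel copied the firmware handle back */
	printf("SEV guest handle: %u\n", start.handle);
	return 0;
}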

* [RFC PATCH v2 26/32] kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
  2017-03-02 15:12 ` Brijesh Singh
                   ` (52 preceding siblings ...)
@ 2017-03-02 15:17 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

The command is used to encrypt a guest memory region with the VM
encryption key (VEK) created by LAUNCH_START.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |  150 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b5fa8c0..62c2b22 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -38,6 +38,8 @@
 #include <linux/hashtable.h>
 #include <linux/psp-sev.h>
 #include <linux/file.h>
+#include <linux/pagemap.h>
+#include <linux/swap.h>
 
 #include <asm/apic.h>
 #include <asm/perf_event.h>
@@ -502,6 +504,7 @@ static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+#define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
 {
@@ -5775,6 +5778,149 @@ static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct page **sev_pin_memory(unsigned long uaddr, unsigned long ulen,
+				    unsigned long *n)
+{
+	struct page **pages;
+	unsigned long first, last;
+	unsigned long npages, pinned;
+
+	/* Get number of pages */
+	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
+	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
+	npages = (last - first + 1);
+
+	pages = kcalloc(npages, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return NULL;
+
+	/* pin the user virtual addresses; get_user_pages_fast() takes
+	 * mmap_sem internally, so it must not be called with it held.
+	 */
+	pinned = get_user_pages_fast(uaddr, npages, 1, pages);
+	if (pinned != npages) {
+		printk(KERN_ERR "SEV: failed to pin  %ld pages (got %ld)\n",
+				npages, pinned);
+		goto err;
+	}
+
+	*n = npages;
+	return pages;
+err:
+	if (pinned > 0)
+		release_pages(pages, pinned, 0);
+	kfree(pages);
+
+	return NULL;
+}
+
+static void sev_unpin_memory(struct page **pages, unsigned long npages)
+{
+	release_pages(pages, npages, 0);
+	kfree(pages);
+}
+
+static void sev_clflush_pages(struct page *pages[], int num_pages)
+{
+	unsigned long i;
+	uint8_t *page_virtual;
+
+	if (num_pages == 0 || pages == NULL)
+		return;
+
+	for (i = 0; i < num_pages; i++) {
+		page_virtual = kmap_atomic(pages[i]);
+		clflush_cache_range(page_virtual, PAGE_SIZE);
+		kunmap_atomic(page_virtual);
+	}
+}
+
+static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct page **inpages;
+	unsigned long uaddr, ulen;
+	int i, len, ret, offset;
+	unsigned long nr_pages;
+	struct kvm_sev_launch_update_data params;
+	struct sev_data_launch_update_data *data;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	/* Get the parameters from the user */
+	ret = -EFAULT;
+	if (copy_from_user(&params, (void __user *)argp->data,
+			sizeof(struct kvm_sev_launch_update_data)))
+		goto err_1;
+
+	uaddr = params.address;
+	ulen = params.length;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data) {
+		ret = -ENOMEM;
+		goto err_1;
+	}
+
+	/* pin user pages */
+	inpages = sev_pin_memory(params.address, params.length, &nr_pages);
+	if (!inpages) {
+		ret = -ENOMEM;
+		goto err_2;
+	}
+
+	/* flush the caches for these pages so that DRAM holds their
+	 * latest content before the SEV command encrypts them in
+	 * place.
+	 */
+	sev_clflush_pages(inpages, nr_pages);
+
+	/* the pinned pages are page-aligned, but the user buffer itself
+	 * may not be, so calculate the offset within the first page for
+	 * the first update entry.
+	 */
+	offset = uaddr & (PAGE_SIZE - 1);
+	len = min_t(size_t, (PAGE_SIZE - offset), ulen);
+	ulen -= len;
+
+	/* update the first page -
+	 * it needs special care because the update may start at an
+	 * offset within the page
+	 */
+	data->handle = sev_get_handle(kvm);
+	data->length = len;
+	data->address = __sev_page_pa(inpages[0]) + offset;
+	ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
+			data, &argp->error);
+	if (ret)
+		goto err_3;
+
+	/* update remaining pages */
+	for (i = 1; i < nr_pages; i++) {
+
+		len = min_t(size_t, PAGE_SIZE, ulen);
+		ulen -= len;
+		data->length = len;
+		data->address = __sev_page_pa(inpages[i]);
+		ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
+					data, &argp->error);
+		if (ret)
+			goto err_3;
+	}
+
+	/* mark pages dirty */
+	for (i = 0; i < nr_pages; i++) {
+		set_page_dirty_lock(inpages[i]);
+		mark_page_accessed(inpages[i]);
+	}
+err_3:
+	sev_unpin_memory(inpages, nr_pages);
+err_2:
+	kfree(data);
+err_1:
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -5790,6 +5936,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_launch_start(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_LAUNCH_UPDATE_DATA: {
+		r = sev_launch_update_data(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}

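As with LAUNCH_START, a hypothetical userspace sketch (not code from
this series) of issuing the command above; it assumes the same uapi
definitions and the vm_fd/sev_fd descriptors from the LAUNCH_START
sketch. Since the kernel side computes the in-page offset itself, the
caller only describes the byte range of guest RAM to encrypt, and the
address need not be page-aligned.

#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int do_sev_launch_update(int vm_fd, int sev_fd,
				void *hva, __u32 len)
{
	struct kvm_sev_launch_update_data update;
	struct kvm_sev_cmd cmd;

	memset(&update, 0, sizeof(update));
	update.address = (__u64)(unsigned long)hva;
	update.length  = len;

	memset(&cmd, 0, sizeof(cmd));
	cmd.id     = KVM_SEV_LAUNCH_UPDATE_DATA;
	cmd.sev_fd = sev_fd;
	cmd.data   = (__u64)(unsigned long)&update;

	if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0) {
		fprintf(stderr, "LAUNCH_UPDATE_DATA failed, fw error %#x\n",
			cmd.error);
		return -1;
	}
	return 0;
}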

* [RFC PATCH v2 26/32] kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
@ 2017-03-02 15:17   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

The command is used for encrypting the guest memory region using the VM
encryption key (VEK) created from LAUNCH_START.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |  150 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b5fa8c0..62c2b22 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -38,6 +38,8 @@
 #include <linux/hashtable.h>
 #include <linux/psp-sev.h>
 #include <linux/file.h>
+#include <linux/pagemap.h>
+#include <linux/swap.h>
 
 #include <asm/apic.h>
 #include <asm/perf_event.h>
@@ -502,6 +504,7 @@ static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+#define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
 {
@@ -5775,6 +5778,149 @@ static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct page **sev_pin_memory(unsigned long uaddr, unsigned long ulen,
+				    unsigned long *n)
+{
+	struct page **pages;
+	int first, last;
+	unsigned long npages, pinned;
+
+	/* Get number of pages */
+	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
+	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
+	npages = (last - first + 1);
+
+	pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return NULL;
+
+	/* pin the user virtual address */
+	down_read(&current->mm->mmap_sem);
+	pinned = get_user_pages_fast(uaddr, npages, 1, pages);
+	up_read(&current->mm->mmap_sem);
+	if (pinned != npages) {
+		printk(KERN_ERR "SEV: failed to pin  %ld pages (got %ld)\n",
+				npages, pinned);
+		goto err;
+	}
+
+	*n = npages;
+	return pages;
+err:
+	if (pinned > 0)
+		release_pages(pages, pinned, 0);
+	kfree(pages);
+
+	return NULL;
+}
+
+static void sev_unpin_memory(struct page **pages, unsigned long npages)
+{
+	release_pages(pages, npages, 0);
+	kfree(pages);
+}
+
+static void sev_clflush_pages(struct page *pages[], int num_pages)
+{
+	unsigned long i;
+	uint8_t *page_virtual;
+
+	if (num_pages == 0 || pages == NULL)
+		return;
+
+	for (i = 0; i < num_pages; i++) {
+		page_virtual = kmap_atomic(pages[i]);
+		clflush_cache_range(page_virtual, PAGE_SIZE);
+		kunmap_atomic(page_virtual);
+	}
+}
+
+static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct page **inpages;
+	unsigned long uaddr, ulen;
+	int i, len, ret, offset;
+	unsigned long nr_pages;
+	struct kvm_sev_launch_update_data params;
+	struct sev_data_launch_update_data *data;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	/* Get the parameters from the user */
+	ret = -EFAULT;
+	if (copy_from_user(&params, (void *)argp->data,
+			sizeof(struct kvm_sev_launch_update_data)))
+		goto err_1;
+
+	uaddr = params.address;
+	ulen = params.length;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data) {
+		ret = -ENOMEM;
+		goto err_1;
+	}
+
+	/* pin user pages */
+	inpages = sev_pin_memory(params.address, params.length, &nr_pages);
+	if (!inpages) {
+		ret = -ENOMEM;
+		goto err_2;
+	}
+
+	/* invalidate the cache line for these pages to ensure that DRAM
+	 * has recent content before calling the SEV commands to perform
+	 * the encryption.
+	 */
+	sev_clflush_pages(inpages, nr_pages);
+
+	/* the array of pages returned by get_user_pages() is a page-aligned
+	 * memory. Since the user buffer is probably not page-aligned, we need
+	 * to calculate the offset within a page for first update entry.
+	 */
+	offset = uaddr & (PAGE_SIZE - 1);
+	len = min_t(size_t, (PAGE_SIZE - offset), ulen);
+	ulen -= len;
+
+	/* update first page -
+	 * special care need to be taken for the first page because we might
+	 * be dealing with offset within the page
+	 */
+	data->handle = sev_get_handle(kvm);
+	data->length = len;
+	data->address = __sev_page_pa(inpages[0]) + offset;
+	ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
+			data, &argp->error);
+	if (ret)
+		goto err_3;
+
+	/* update remaining pages */
+	for (i = 1; i < nr_pages; i++) {
+
+		len = min_t(size_t, PAGE_SIZE, ulen);
+		ulen -= len;
+		data->length = len;
+		data->address = __sev_page_pa(inpages[i]);
+		ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
+					data, &argp->error);
+		if (ret)
+			goto err_3;
+	}
+
+	/* mark pages dirty */
+	for (i = 0; i < nr_pages; i++) {
+		set_page_dirty_lock(inpages[i]);
+		mark_page_accessed(inpages[i]);
+	}
+err_3:
+	sev_unpin_memory(inpages, nr_pages);
+err_2:
+	kfree(data);
+err_1:
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -5790,6 +5936,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_launch_start(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_LAUNCH_UPDATE_DATA: {
+		r = sev_launch_update_data(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 26/32] kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
@ 2017-03-02 15:17   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The command is used to encrypt a guest memory region with the VM
encryption key (VEK) created during LAUNCH_START.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |  150 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b5fa8c0..62c2b22 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -38,6 +38,8 @@
 #include <linux/hashtable.h>
 #include <linux/psp-sev.h>
 #include <linux/file.h>
+#include <linux/pagemap.h>
+#include <linux/swap.h>
 
 #include <asm/apic.h>
 #include <asm/perf_event.h>
@@ -502,6 +504,7 @@ static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+#define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
 {
@@ -5775,6 +5778,149 @@ static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct page **sev_pin_memory(unsigned long uaddr, unsigned long ulen,
+				    unsigned long *n)
+{
+	struct page **pages;
+	int first, last;
+	unsigned long npages, pinned;
+
+	/* Get number of pages */
+	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
+	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
+	npages = (last - first + 1);
+
+	pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return NULL;
+
+	/* pin the user virtual address */
+	down_read(&current->mm->mmap_sem);
+	pinned = get_user_pages_fast(uaddr, npages, 1, pages);
+	up_read(&current->mm->mmap_sem);
+	if (pinned != npages) {
+		printk(KERN_ERR "SEV: failed to pin  %ld pages (got %ld)\n",
+				npages, pinned);
+		goto err;
+	}
+
+	*n = npages;
+	return pages;
+err:
+	if (pinned > 0)
+		release_pages(pages, pinned, 0);
+	kfree(pages);
+
+	return NULL;
+}
+
+static void sev_unpin_memory(struct page **pages, unsigned long npages)
+{
+	release_pages(pages, npages, 0);
+	kfree(pages);
+}
+
+static void sev_clflush_pages(struct page *pages[], int num_pages)
+{
+	unsigned long i;
+	uint8_t *page_virtual;
+
+	if (num_pages == 0 || pages == NULL)
+		return;
+
+	for (i = 0; i < num_pages; i++) {
+		page_virtual = kmap_atomic(pages[i]);
+		clflush_cache_range(page_virtual, PAGE_SIZE);
+		kunmap_atomic(page_virtual);
+	}
+}
+
+static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct page **inpages;
+	unsigned long uaddr, ulen;
+	int i, len, ret, offset;
+	unsigned long nr_pages;
+	struct kvm_sev_launch_update_data params;
+	struct sev_data_launch_update_data *data;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	/* Get the parameters from the user */
+	ret = -EFAULT;
+	if (copy_from_user(&params, (void *)argp->data,
+			sizeof(struct kvm_sev_launch_update_data)))
+		goto err_1;
+
+	uaddr = params.address;
+	ulen = params.length;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data) {
+		ret = -ENOMEM;
+		goto err_1;
+	}
+
+	/* pin user pages */
+	inpages = sev_pin_memory(params.address, params.length, &nr_pages);
+	if (!inpages) {
+		ret = -ENOMEM;
+		goto err_2;
+	}
+
+	/* Flush (write back and invalidate) the cache lines for these
+	 * pages so that DRAM holds the latest content before the SEV
+	 * firmware reads the memory to perform the encryption.
+	 */
+	sev_clflush_pages(inpages, nr_pages);
+
+	/* The pages returned by get_user_pages() are page-aligned, but the
+	 * user buffer may not be. Calculate the offset within the first
+	 * page for the first update entry.
+	 */
+	offset = uaddr & (PAGE_SIZE - 1);
+	len = min_t(size_t, (PAGE_SIZE - offset), ulen);
+	ulen -= len;
+
+	/* Update the first page -
+	 * special care is needed here because the data may start at a
+	 * non-zero offset within the page.
+	 */
+	data->handle = sev_get_handle(kvm);
+	data->length = len;
+	data->address = __sev_page_pa(inpages[0]) + offset;
+	ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
+			data, &argp->error);
+	if (ret)
+		goto err_3;
+
+	/* update remaining pages */
+	for (i = 1; i < nr_pages; i++) {
+
+		len = min_t(size_t, PAGE_SIZE, ulen);
+		ulen -= len;
+		data->length = len;
+		data->address = __sev_page_pa(inpages[i]);
+		ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
+					data, &argp->error);
+		if (ret)
+			goto err_3;
+	}
+
+	/* mark pages dirty */
+	for (i = 0; i < nr_pages; i++) {
+		set_page_dirty_lock(inpages[i]);
+		mark_page_accessed(inpages[i]);
+	}
+err_3:
+	sev_unpin_memory(inpages, nr_pages);
+err_2:
+	kfree(data);
+err_1:
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -5790,6 +5936,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_launch_start(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_LAUNCH_UPDATE_DATA: {
+		r = sev_launch_update_data(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}
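
As an illustration of how user space is expected to drive this command, here
is a minimal, hypothetical sketch of issuing LAUNCH_UPDATE_DATA through the
KVM_MEMORY_ENCRYPT_OP ioctl. The kvm_sev_launch_update_data fields mirror the
handler above; the name of the command-id field in struct kvm_sev_cmd ("id"),
the header locations and the error reporting are assumptions for
illustration, not taken from this patch:

	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>	/* assumed to provide the SEV uapi types */

	/* Ask the SEV firmware to encrypt a guest memory region in place. */
	static int sev_launch_update(int vm_fd, void *hva, __u64 len)
	{
		struct kvm_sev_launch_update_data update = {
			.address = (__u64)(unsigned long)hva,
			.length  = len,
		};
		struct kvm_sev_cmd cmd = {
			.id   = KVM_SEV_LAUNCH_UPDATE_DATA,	/* field name assumed */
			.data = (__u64)(unsigned long)&update,
		};
		int ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);

		if (ret)
			fprintf(stderr, "LAUNCH_UPDATE_DATA: %d (fw error %#x)\n",
				ret, cmd.error);
		return ret;
	}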


^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 27/32] kvm: svm: Add support for SEV LAUNCH_FINISH command
@ 2017-03-02 15:17   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:17 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The command is used to finalize the SEV guest launch process.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 62c2b22..c108064 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5921,6 +5921,38 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	int i, ret;
+	struct sev_data_launch_finish *data;
+	struct kvm_vcpu *vcpu;
+
+	if (!sev_guest(kvm))
+		return -EINVAL;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* launch finish */
+	data->handle = sev_get_handle(kvm);
+	ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_FINISH, data, &argp->error);
+	if (ret)
+		goto err_1;
+
+	/* Iterate through each vcpu and set the SEV KVM_SEV_FEATURE bit
+	 * in the KVM_CPUID_FEATURES leaf to indicate that SEV is enabled
+	 * on this vcpu.
+	 */
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		sev_init_vmcb(to_svm(vcpu));
+		svm_cpuid_update(vcpu);
+	}
+
+err_1:
+	kfree(data);
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -5940,6 +5972,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_launch_update_data(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_LAUNCH_FINISH: {
+		r = sev_launch_finish(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}
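
For completeness, a hypothetical user-space sketch of the same ioctl path for
LAUNCH_FINISH, following the pattern (and placeholder names such as vm_fd)
from the sketch under patch 26/32. The handler above takes no input beyond
the firmware handle, which the kernel fills in itself, so data can be left
zero; the command-id field name is again an assumption. On success the kernel
also refreshes each vcpu's VMCB and CPUID state, so the guest then observes
SEV as enabled:

	/* finalize the launch flow once all guest memory is encrypted */
	struct kvm_sev_cmd cmd = {
		.id = KVM_SEV_LAUNCH_FINISH,	/* field name assumed */
	};

	if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd))
		fprintf(stderr, "LAUNCH_FINISH: fw error %#x\n", cmd.error);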


^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 28/32] kvm: svm: Add support for SEV GUEST_STATUS command
@ 2017-03-02 15:18   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The command is used to query the SEV guest status.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |   37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c108064..977aa22 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5953,6 +5953,39 @@ static int sev_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_guest_status(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	int ret;
+	struct kvm_sev_guest_status params;
+	struct sev_data_guest_status *data;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, (void *) argp->data,
+				sizeof(struct kvm_sev_guest_status)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->handle = sev_get_handle(kvm);
+	ret = sev_issue_cmd(kvm, SEV_CMD_GUEST_STATUS, data, &argp->error);
+	if (ret)
+		goto err_1;
+
+	params.policy = data->policy;
+	params.state = data->state;
+
+	if (copy_to_user((void *) argp->data, &params,
+				sizeof(struct kvm_sev_guest_status)))
+		ret = -EFAULT;
+err_1:
+	kfree(data);
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -5976,6 +6009,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_launch_finish(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_GUEST_STATUS: {
+		r = sev_guest_status(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}
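
A hypothetical user-space sketch of the round trip this handler implements:
the kernel copies kvm_sev_guest_status in, fills policy and state from the
firmware response, and copies the structure back out. Names outside this
patch (the command-id field, vm_fd) are assumptions carried over from the
sketch under patch 26/32:

	struct kvm_sev_guest_status status = {};
	struct kvm_sev_cmd cmd = {
		.id   = KVM_SEV_GUEST_STATUS,	/* field name assumed */
		.data = (__u64)(unsigned long)&status,
	};

	if (!ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd))
		printf("SEV guest policy=%#x state=%u\n",
		       status.policy, status.state);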


^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 29/32] kvm: svm: Add support for SEV DEBUG_DECRYPT command
@ 2017-03-02 15:18   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The command is used to decrypt a guest memory region for debug purposes.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |   76 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 977aa22..ce8819a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5986,6 +5986,78 @@ static int sev_guest_status(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int __sev_dbg_decrypt_page(struct kvm *kvm, unsigned long src,
+		void *dst, int *error)
+{
+	int ret;
+	struct page **inpages;
+	struct sev_data_dbg *data;
+	unsigned long npages;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	inpages = sev_pin_memory(src, PAGE_SIZE, &npages);
+	if (!inpages) {
+		ret = -ENOMEM;
+		goto err_1;
+	}
+
+	data->handle = sev_get_handle(kvm);
+	data->dst_addr = __psp_pa(dst);
+	data->src_addr = __sev_page_pa(inpages[0]);
+	data->length = PAGE_SIZE;
+
+	ret = sev_issue_cmd(kvm, SEV_CMD_DBG_DECRYPT, data, error);
+	if (ret)
+		printk(KERN_ERR "SEV: DEBUG_DECRYPT %d (%#010x)\n",
+				ret, *error);
+	sev_unpin_memory(inpages, npages);
+err_1:
+	kfree(data);
+	return ret;
+}
+
+static int sev_dbg_decrypt(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	void *data;
+	int ret, offset, len;
+	struct kvm_sev_dbg debug;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&debug, (void *)argp->data,
+				sizeof(struct kvm_sev_dbg)))
+		return -EFAULT;
+	/*
+	 * TODO: add support for a decryption length which crosses a
+	 * page boundary.
+	 */
+	offset = debug.src_addr & (PAGE_SIZE - 1);
+	if (offset + debug.length > PAGE_SIZE)
+		return -EINVAL;
+
+	data = (void *) get_zeroed_page(GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	/* decrypt full page */
+	ret = __sev_dbg_decrypt_page(kvm, debug.src_addr & PAGE_MASK,
+			data, &argp->error);
+	if (ret)
+		goto err_1;
+
+	/* we decrypted the full page, but copy back only the requested length */
+	len = min_t(size_t, (PAGE_SIZE - offset), debug.length);
+	if (copy_to_user((uint8_t *)debug.dst_addr, data + offset, len))
+		ret = -EFAULT;
+err_1:
+	free_page((unsigned long)data);
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6013,6 +6085,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_guest_status(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_DBG_DECRYPT: {
+		r = sev_dbg_decrypt(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}
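
A hypothetical user-space sketch for DBG_DECRYPT, again reusing the
placeholder names (vm_fd, hva, buf, len) from the sketch under patch 26/32.
Note that src_addr is the user-space (HVA) address of the guest memory, since
the handler pins it with sev_pin_memory(), and that the handler rejects a
length crossing a page boundary, so larger reads must be split by the caller:

	/* read 'len' decrypted bytes from the guest page at 'hva' into 'buf' */
	struct kvm_sev_dbg dbg = {
		.src_addr = (__u64)(unsigned long)hva,
		.dst_addr = (__u64)(unsigned long)buf,
		.length   = len,	/* must stay within one page */
	};
	struct kvm_sev_cmd cmd = {
		.id   = KVM_SEV_DBG_DECRYPT,	/* field name assumed */
		.data = (__u64)(unsigned long)&dbg,
	};

	ret = ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);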


^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 30/32] kvm: svm: Add support for SEV DEBUG_ENCRYPT command
  2017-03-02 15:12 ` Brijesh Singh
                   ` (61 preceding siblings ...)
  (?)
@ 2017-03-02 15:18 ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

The command copies a plain text into guest memory and encrypts it using
the VM encryption key. The command will be used for debug purposes
(e.g setting breakpoint through gdbserver)

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ce8819a..64899ed 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6058,6 +6058,89 @@ static int sev_dbg_decrypt(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_dbg_encrypt(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	void *data;
+	int len, ret, d_off;
+	struct page **inpages;
+	struct kvm_sev_dbg debug;
+	struct sev_data_dbg *encrypt;
+	unsigned long src_addr, dst_addr, npages;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&debug, (void __user *)argp->data,
+				sizeof(debug)))
+		return -EFAULT;
+
+	/* the destination range must not cross a page boundary */
+	if ((debug.dst_addr & (PAGE_SIZE - 1)) + debug.length > PAGE_SIZE)
+		return -EINVAL;
+
+	len = debug.length;
+	src_addr = debug.src_addr;
+	dst_addr = debug.dst_addr;
+
+	inpages = sev_pin_memory(dst_addr, PAGE_SIZE, &npages);
+	if (!inpages)
+		return -EFAULT;
+
+	encrypt = kzalloc(sizeof(*encrypt), GFP_KERNEL);
+	if (!encrypt) {
+		ret = -ENOMEM;
+		goto err_1;
+	}
+
+	data = (void *) get_zeroed_page(GFP_KERNEL);
+	if (!data) {
+		ret = -ENOMEM;
+		goto err_2;
+	}
+
+	if ((len & 15) || (dst_addr & 15)) {
+		/*
+		 * If the destination address and length are not 16-byte
+		 * aligned then:
+		 * a) decrypt destination page into temporary buffer
+		 * b) copy source data into temporary buffer at correct offset
+		 * c) encrypt temporary buffer
+		 */
+		ret = __sev_dbg_decrypt_page(kvm, dst_addr, data, &argp->error);
+		if (ret)
+			goto err_3;
+		d_off = dst_addr & (PAGE_SIZE - 1);
+
+		if (copy_from_user(data + d_off,
+					(void __user *)src_addr, len)) {
+			ret = -EFAULT;
+			goto err_3;
+		}
+
+		encrypt->length = PAGE_SIZE;
+		encrypt->src_addr = __psp_pa(data);
+		encrypt->dst_addr = __sev_page_pa(inpages[0]);
+	} else {
+		if (copy_from_user(data, (void __user *)src_addr, len)) {
+			ret = -EFAULT;
+			goto err_3;
+		}
+
+		d_off = dst_addr & (PAGE_SIZE - 1);
+		encrypt->length = len;
+		encrypt->src_addr = __psp_pa(data);
+		encrypt->dst_addr = __sev_page_pa(inpages[0]);
+		encrypt->dst_addr += d_off;
+	}
+
+	encrypt->handle = sev_get_handle(kvm);
+	ret = sev_issue_cmd(kvm, SEV_CMD_DBG_ENCRYPT, encrypt, &argp->error);
+err_3:
+	free_page((unsigned long)data);
+err_2:
+	kfree(encrypt);
+err_1:
+	sev_unpin_memory(inpages, npages);
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6089,6 +6172,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_dbg_decrypt(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_DBG_ENCRYPT: {
+		r = sev_dbg_encrypt(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 31/32] kvm: svm: Add support for SEV LAUNCH_MEASURE command
@ 2017-03-02 15:18 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The command is used to retrieve the measurement of memory encrypted through
the LAUNCH_UPDATE_DATA command. This measurement can be used for attestation
purposes.
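
In practice userspace would call this twice: once with a zero length to
discover the measurement size (the kernel copies the firmware-updated
length back into the params structure), then again with a suitably
sized buffer. A hedged sketch, reusing the hypothetical uapi layouts
from the earlier DBG_* examples (kvm_sev_launch_measure with
address/length is likewise an assumed layout; headers as before, plus
<stdlib.h> for malloc):

	/* Hypothetical sketch: retrieve the SEV launch measurement. */
	static int sev_get_measurement(int vm_fd, void **blob, uint32_t *blob_len)
	{
		struct kvm_sev_launch_measure m;	/* assumed layout */
		struct kvm_sev_cmd cmd;			/* assumed layout */

		memset(&m, 0, sizeof(m));
		memset(&cmd, 0, sizeof(cmd));
		cmd.id   = KVM_SEV_LAUNCH_MEASURE;
		cmd.data = (uint64_t)(uintptr_t)&m;

		/* pass 1: length query; the ioctl may fail but fills m.length */
		ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
		if (!m.length)
			return -1;

		*blob = malloc(m.length);
		if (!*blob)
			return -1;

		/* pass 2: fetch the measurement into the buffer */
		m.address = (uint64_t)(uintptr_t)*blob;
		if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd)) {
			free(*blob);
			return -1;
		}

		*blob_len = m.length;
		return 0;
	}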

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/svm.c |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 64899ed..13996d6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6141,6 +6141,54 @@ static int sev_dbg_encrypt(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	int ret;
+	void *addr = NULL;
+	struct kvm_sev_launch_measure params;
+	struct sev_data_launch_measure *data;
+
+	if (!sev_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, (void __user *)argp->data,
+				sizeof(params)))
+		return -EFAULT;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	if (params.address && params.length) {
+		ret = -ENOMEM;
+		addr = kzalloc(params.length, GFP_KERNEL);
+		if (!addr)
+			goto err_1;
+		data->address = __psp_pa(addr);
+		data->length = params.length;
+	}
+
+	data->handle = sev_get_handle(kvm);
+	ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_MEASURE, data, &argp->error);
+
+	/* if the command succeeded, copy the measurement to userspace */
+	if (!ret && addr &&
+	    copy_to_user((void __user *)params.address, addr, params.length))
+		ret = -EFAULT;
+
+	/* always copy back the (possibly firmware-updated) length */
+	params.length = data->length;
+	if (copy_to_user((void __user *)argp->data, &params,
+				sizeof(params)))
+		ret = -EFAULT;
+
+	kfree(addr);
+err_1:
+	kfree(data);
+	return ret;
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6176,6 +6224,10 @@ static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 		r = sev_dbg_encrypt(kvm, &sev_cmd);
 		break;
 	}
+	case KVM_SEV_LAUNCH_MEASURE: {
+		r = sev_launch_measure(kvm, &sev_cmd);
+		break;
+	}
 	default:
 		break;
 	}

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active
@ 2017-03-02 15:18 ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The SEV memory encryption engine uses a tweak such that two identical
plaintexts at different locations have different ciphertexts, so
swapping or moving the ciphertexts of two pages does not result in the
plaintexts being swapped. Relocating (or migrating) the physical
backing pages of an SEV guest therefore requires additional steps. The
current SEV key management spec [1] does not provide commands to swap
or migrate (move) ciphertexts, so for now we pin the memory allocated
for the SEV guest. In the future, when the SEV key management spec
provides commands to support page migration, we can update the KVM
code to remove the pinning logic without any changes to userspace
(qemu).

The patch pins userspace memory when a new slot is created and unpins
the memory when the slot is removed.

[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf
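
To see why moving ciphertext breaks, consider a toy model of an
address-tweaked (XEX-style) block cipher. This stand-in is purely
illustrative: the real SEV engine uses AES-128 with an undisclosed
physical-address-based tweak, and none of the constants below are real.

	#include <stdint.h>
	#include <stdio.h>

	/* Toy keyed byte permutation (NOT real crypto): 167 * 23 == 1 mod 256 */
	static uint8_t toy_perm(uint8_t x)     { return (uint8_t)(x * 167 + 13); }
	static uint8_t toy_perm_inv(uint8_t y) { return (uint8_t)((uint8_t)(y - 13) * 23); }

	/* Per-page, address-dependent tweak: C = P(x ^ t(a)) ^ t(a) */
	static uint8_t tweak(uint64_t addr)       { return (uint8_t)((addr >> 12) * 0x9e + 7); }
	static uint8_t enc(uint8_t p, uint64_t a) { return toy_perm(p ^ tweak(a)) ^ tweak(a); }
	static uint8_t dec(uint8_t c, uint64_t a) { return toy_perm_inv(c ^ tweak(a)) ^ tweak(a); }

	int main(void)
	{
		uint8_t c = enc(0x42, 0x1000);	/* ciphertext created at address A */

		printf("dec at A: %#x\n", dec(c, 0x1000));	/* 0x42: original address */
		printf("dec at B: %#x\n", dec(c, 0x2000));	/* junk: ciphertext was moved */
		return 0;
	}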

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    6 +++
 arch/x86/kvm/svm.c              |   93 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |    3 +
 3 files changed, 102 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fcc4710..9dc59f0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -723,6 +723,7 @@ struct kvm_sev_info {
 	unsigned int handle;	/* firmware handle */
 	unsigned int asid;	/* asid for this guest */
 	int sev_fd;		/* SEV device fd */
+	struct list_head pinned_memory_slot;
 };
 
 struct kvm_arch {
@@ -1043,6 +1044,11 @@ struct kvm_x86_ops {
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
 
 	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
+
+	void (*prepare_memory_region)(struct kvm *kvm,
+			struct kvm_memory_slot *memslot,
+			const struct kvm_userspace_memory_region *mem,
+			enum kvm_mr_change change);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 13996d6..ab973f9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -498,12 +498,21 @@ static inline bool gif_set(struct vcpu_svm *svm)
 }
 
 /* Secure Encrypted Virtualization */
+struct kvm_sev_pinned_memory_slot {
+	struct list_head list;
+	unsigned long npages;
+	struct page **pages;
+	unsigned long userspace_addr;
+	short id;
+};
+
 static unsigned int max_sev_asid;
 static unsigned long *sev_asid_bitmap;
 static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+static void sev_unpin_memory(struct page **pages, unsigned long npages);
 #define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
@@ -1544,9 +1553,25 @@ static inline int avic_free_vm_id(int id)
 
 static void sev_vm_destroy(struct kvm *kvm)
 {
+	struct list_head *pos, *q;
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
 	if (!sev_guest(kvm))
 		return;
 
+	/* if guest memory is pinned then unpin it now */
+	if (!list_empty(head)) {
+		list_for_each_safe(pos, q, head) {
+			pinned_slot = list_entry(pos,
+				struct kvm_sev_pinned_memory_slot, list);
+			sev_unpin_memory(pinned_slot->pages,
+					pinned_slot->npages);
+			list_del(pos);
+			kfree(pinned_slot);
+		}
+	}
+
 	/* release the firmware resources */
 	sev_deactivate_handle(kvm);
 	sev_decommission_handle(kvm);
@@ -5663,6 +5688,8 @@ static int sev_pre_start(struct kvm *kvm, int *asid)
 		}
 		*asid = ret;
 		ret = 0;
+
+		INIT_LIST_HEAD(&kvm->arch.sev_info.pinned_memory_slot);
 	}
 
 	return ret;
@@ -6189,6 +6216,71 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct kvm_sev_pinned_memory_slot *sev_find_pinned_memory_slot(
+		struct kvm *kvm, struct kvm_memory_slot *slot)
+{
+	struct kvm_sev_pinned_memory_slot *i;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	list_for_each_entry(i, head, list) {
+		if (i->userspace_addr == slot->userspace_addr &&
+			i->id == slot->id)
+			return i;
+	}
+
+	return NULL;
+}
+
+static void amd_prepare_memory_region(struct kvm *kvm,
+				struct kvm_memory_slot *memslot,
+				const struct kvm_userspace_memory_region *mem,
+				enum kvm_mr_change change)
+{
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	mutex_lock(&kvm->lock);
+
+	if (!sev_guest(kvm))
+		goto unlock;
+
+	if (change == KVM_MR_CREATE) {
+
+		if (!mem->memory_size)
+			goto unlock;
+
+		pinned_slot = kmalloc(sizeof(*pinned_slot), GFP_KERNEL);
+		if (pinned_slot == NULL)
+			goto unlock;
+
+		pinned_slot->pages = sev_pin_memory(mem->userspace_addr,
+				mem->memory_size, &pinned_slot->npages);
+		if (pinned_slot->pages == NULL) {
+			kfree(pinned_slot);
+			goto unlock;
+		}
+
+		sev_clflush_pages(pinned_slot->pages, pinned_slot->npages);
+
+		pinned_slot->id = memslot->id;
+		pinned_slot->userspace_addr = mem->userspace_addr;
+		list_add_tail(&pinned_slot->list, head);
+
+	} else if (change == KVM_MR_DELETE) {
+
+		pinned_slot = sev_find_pinned_memory_slot(kvm, memslot);
+		if (!pinned_slot)
+			goto unlock;
+
+		sev_unpin_memory(pinned_slot->pages, pinned_slot->npages);
+		list_del(&pinned_slot->list);
+		kfree(pinned_slot);
+	}
+
+unlock:
+	mutex_unlock(&kvm->lock);
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6355,6 +6447,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.update_pi_irte = svm_update_pi_irte,
 
 	.memory_encryption_op = amd_memory_encryption_cmd,
+	.prepare_memory_region = amd_prepare_memory_region,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6a737e9..e05069d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8195,6 +8195,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				const struct kvm_userspace_memory_region *mem,
 				enum kvm_mr_change change)
 {
+	if (kvm_x86_ops->prepare_memory_region)
+		kvm_x86_ops->prepare_memory_region(kvm, memslot, mem, change);
+
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active
  2017-03-02 15:12 ` Brijesh Singh
  (?)
  (?)
@ 2017-03-02 15:18   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

The SEV memory encryption engine uses a tweak such that two identical
plaintexts at different location will have a different ciphertexts.
So swapping or moving ciphertexts of two pages will not result in
plaintexts being swapped. Relocating (or migrating) a physical backing pages
for SEV guest will require some additional steps. The current SEV key
management spec [1] does not provide commands to swap or migrate (move)
ciphertexts. For now we pin the memory allocated for the SEV guest. In
future when SEV key management spec provides the commands to support the
page migration we can update the KVM code to remove the pinning logical
without making any changes into userspace (qemu).

The patch pins userspace memory when a new slot is created and unpin the
memory when slot is removed.

[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    6 +++
 arch/x86/kvm/svm.c              |   93 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |    3 +
 3 files changed, 102 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fcc4710..9dc59f0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -723,6 +723,7 @@ struct kvm_sev_info {
 	unsigned int handle;	/* firmware handle */
 	unsigned int asid;	/* asid for this guest */
 	int sev_fd;		/* SEV device fd */
+	struct list_head pinned_memory_slot;
 };
 
 struct kvm_arch {
@@ -1043,6 +1044,11 @@ struct kvm_x86_ops {
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
 
 	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
+
+	void (*prepare_memory_region)(struct kvm *kvm,
+			struct kvm_memory_slot *memslot,
+			const struct kvm_userspace_memory_region *mem,
+			enum kvm_mr_change change);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 13996d6..ab973f9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -498,12 +498,21 @@ static inline bool gif_set(struct vcpu_svm *svm)
 }
 
 /* Secure Encrypted Virtualization */
+struct kvm_sev_pinned_memory_slot {
+	struct list_head list;
+	unsigned long npages;
+	struct page **pages;
+	unsigned long userspace_addr;
+	short id;
+};
+
 static unsigned int max_sev_asid;
 static unsigned long *sev_asid_bitmap;
 static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+static void sev_unpin_memory(struct page **pages, unsigned long npages);
 #define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
@@ -1544,9 +1553,25 @@ static inline int avic_free_vm_id(int id)
 
 static void sev_vm_destroy(struct kvm *kvm)
 {
+	struct list_head *pos, *q;
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
 	if (!sev_guest(kvm))
 		return;
 
+	/* if guest memory is pinned then unpin it now */
+	if (!list_empty(head)) {
+		list_for_each_safe(pos, q, head) {
+			pinned_slot = list_entry(pos,
+				struct kvm_sev_pinned_memory_slot, list);
+			sev_unpin_memory(pinned_slot->pages,
+					pinned_slot->npages);
+			list_del(pos);
+			kfree(pinned_slot);
+		}
+	}
+
 	/* release the firmware resources */
 	sev_deactivate_handle(kvm);
 	sev_decommission_handle(kvm);
@@ -5663,6 +5688,8 @@ static int sev_pre_start(struct kvm *kvm, int *asid)
 		}
 		*asid = ret;
 		ret = 0;
+
+		INIT_LIST_HEAD(&kvm->arch.sev_info.pinned_memory_slot);
 	}
 
 	return ret;
@@ -6189,6 +6216,71 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct kvm_sev_pinned_memory_slot *sev_find_pinned_memory_slot(
+		struct kvm *kvm, struct kvm_memory_slot *slot)
+{
+	struct kvm_sev_pinned_memory_slot *i;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	list_for_each_entry(i, head, list) {
+		if (i->userspace_addr == slot->userspace_addr &&
+			i->id == slot->id)
+			return i;
+	}
+
+	return NULL;
+}
+
+static void amd_prepare_memory_region(struct kvm *kvm,
+				struct kvm_memory_slot *memslot,
+				const struct kvm_userspace_memory_region *mem,
+				enum kvm_mr_change change)
+{
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	mutex_lock(&kvm->lock);
+
+	if (!sev_guest(kvm))
+		goto unlock;
+
+	if (change == KVM_MR_CREATE) {
+
+		if (!mem->memory_size)
+			goto unlock;
+
+		pinned_slot = kmalloc(sizeof(*pinned_slot), GFP_KERNEL);
+		if (pinned_slot == NULL)
+			goto unlock;
+
+		pinned_slot->pages = sev_pin_memory(mem->userspace_addr,
+				mem->memory_size, &pinned_slot->npages);
+		if (pinned_slot->pages == NULL) {
+			kfree(pinned_slot);
+			goto unlock;
+		}
+
+		sev_clflush_pages(pinned_slot->pages, pinned_slot->npages);
+
+		pinned_slot->id = memslot->id;
+		pinned_slot->userspace_addr = mem->userspace_addr;
+		list_add_tail(&pinned_slot->list, head);
+
+	} else if  (change == KVM_MR_DELETE) {
+
+		pinned_slot = sev_find_pinned_memory_slot(kvm, memslot);
+		if (!pinned_slot)
+			goto unlock;
+
+		sev_unpin_memory(pinned_slot->pages, pinned_slot->npages);
+		list_del(&pinned_slot->list);
+		kfree(pinned_slot);
+	}
+
+unlock:
+	mutex_unlock(&kvm->lock);
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6355,6 +6447,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.update_pi_irte = svm_update_pi_irte,
 
 	.memory_encryption_op = amd_memory_encryption_cmd,
+	.prepare_memory_region = amd_prepare_memory_region,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6a737e9..e05069d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8195,6 +8195,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				const struct kvm_userspace_memory_region *mem,
 				enum kvm_mr_change change)
 {
+	if (kvm_x86_ops->prepare_memory_region)
+		kvm_x86_ops->prepare_memory_region(kvm, memslot, mem, change);
+
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active
@ 2017-03-02 15:18   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

The SEV memory encryption engine uses a tweak such that two identical
plaintexts at different location will have a different ciphertexts.
So swapping or moving ciphertexts of two pages will not result in
plaintexts being swapped. Relocating (or migrating) a physical backing pages
for SEV guest will require some additional steps. The current SEV key
management spec [1] does not provide commands to swap or migrate (move)
ciphertexts. For now we pin the memory allocated for the SEV guest. In
future when SEV key management spec provides the commands to support the
page migration we can update the KVM code to remove the pinning logical
without making any changes into userspace (qemu).

The patch pins userspace memory when a new slot is created and unpin the
memory when slot is removed.

[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    6 +++
 arch/x86/kvm/svm.c              |   93 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |    3 +
 3 files changed, 102 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fcc4710..9dc59f0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -723,6 +723,7 @@ struct kvm_sev_info {
 	unsigned int handle;	/* firmware handle */
 	unsigned int asid;	/* asid for this guest */
 	int sev_fd;		/* SEV device fd */
+	struct list_head pinned_memory_slot;
 };
 
 struct kvm_arch {
@@ -1043,6 +1044,11 @@ struct kvm_x86_ops {
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
 
 	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
+
+	void (*prepare_memory_region)(struct kvm *kvm,
+			struct kvm_memory_slot *memslot,
+			const struct kvm_userspace_memory_region *mem,
+			enum kvm_mr_change change);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 13996d6..ab973f9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -498,12 +498,21 @@ static inline bool gif_set(struct vcpu_svm *svm)
 }
 
 /* Secure Encrypted Virtualization */
+struct kvm_sev_pinned_memory_slot {
+	struct list_head list;
+	unsigned long npages;
+	struct page **pages;
+	unsigned long userspace_addr;
+	short id;
+};
+
 static unsigned int max_sev_asid;
 static unsigned long *sev_asid_bitmap;
 static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+static void sev_unpin_memory(struct page **pages, unsigned long npages);
 #define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
@@ -1544,9 +1553,25 @@ static inline int avic_free_vm_id(int id)
 
 static void sev_vm_destroy(struct kvm *kvm)
 {
+	struct list_head *pos, *q;
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
 	if (!sev_guest(kvm))
 		return;
 
+	/* if guest memory is pinned then unpin it now */
+	if (!list_empty(head)) {
+		list_for_each_safe(pos, q, head) {
+			pinned_slot = list_entry(pos,
+				struct kvm_sev_pinned_memory_slot, list);
+			sev_unpin_memory(pinned_slot->pages,
+					pinned_slot->npages);
+			list_del(pos);
+			kfree(pinned_slot);
+		}
+	}
+
 	/* release the firmware resources */
 	sev_deactivate_handle(kvm);
 	sev_decommission_handle(kvm);
@@ -5663,6 +5688,8 @@ static int sev_pre_start(struct kvm *kvm, int *asid)
 		}
 		*asid = ret;
 		ret = 0;
+
+		INIT_LIST_HEAD(&kvm->arch.sev_info.pinned_memory_slot);
 	}
 
 	return ret;
@@ -6189,6 +6216,71 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct kvm_sev_pinned_memory_slot *sev_find_pinned_memory_slot(
+		struct kvm *kvm, struct kvm_memory_slot *slot)
+{
+	struct kvm_sev_pinned_memory_slot *i;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	list_for_each_entry(i, head, list) {
+		if (i->userspace_addr == slot->userspace_addr &&
+			i->id == slot->id)
+			return i;
+	}
+
+	return NULL;
+}
+
+static void amd_prepare_memory_region(struct kvm *kvm,
+				struct kvm_memory_slot *memslot,
+				const struct kvm_userspace_memory_region *mem,
+				enum kvm_mr_change change)
+{
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	mutex_lock(&kvm->lock);
+
+	if (!sev_guest(kvm))
+		goto unlock;
+
+	if (change == KVM_MR_CREATE) {
+
+		if (!mem->memory_size)
+			goto unlock;
+
+		pinned_slot = kmalloc(sizeof(*pinned_slot), GFP_KERNEL);
+		if (pinned_slot == NULL)
+			goto unlock;
+
+		pinned_slot->pages = sev_pin_memory(mem->userspace_addr,
+				mem->memory_size, &pinned_slot->npages);
+		if (pinned_slot->pages == NULL) {
+			kfree(pinned_slot);
+			goto unlock;
+		}
+
+		sev_clflush_pages(pinned_slot->pages, pinned_slot->npages);
+
+		pinned_slot->id = memslot->id;
+		pinned_slot->userspace_addr = mem->userspace_addr;
+		list_add_tail(&pinned_slot->list, head);
+
+	} else if  (change == KVM_MR_DELETE) {
+
+		pinned_slot = sev_find_pinned_memory_slot(kvm, memslot);
+		if (!pinned_slot)
+			goto unlock;
+
+		sev_unpin_memory(pinned_slot->pages, pinned_slot->npages);
+		list_del(&pinned_slot->list);
+		kfree(pinned_slot);
+	}
+
+unlock:
+	mutex_unlock(&kvm->lock);
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6355,6 +6447,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.update_pi_irte = svm_update_pi_irte,
 
 	.memory_encryption_op = amd_memory_encryption_cmd,
+	.prepare_memory_region = amd_prepare_memory_region,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6a737e9..e05069d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8195,6 +8195,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				const struct kvm_userspace_memory_region *mem,
 				enum kvm_mr_change change)
 {
+	if (kvm_x86_ops->prepare_memory_region)
+		kvm_x86_ops->prepare_memory_region(kvm, memslot, mem, change);
+
 	return 0;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active
@ 2017-03-02 15:18   ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:18 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

The SEV memory encryption engine uses a tweak such that two identical
plaintexts at different location will have a different ciphertexts.
So swapping or moving ciphertexts of two pages will not result in
plaintexts being swapped. Relocating (or migrating) a physical backing pages
for SEV guest will require some additional steps. The current SEV key
management spec [1] does not provide commands to swap or migrate (move)
ciphertexts. For now we pin the memory allocated for the SEV guest. In
future when SEV key management spec provides the commands to support the
page migration we can update the KVM code to remove the pinning logical
without making any changes into userspace (qemu).

The patch pins userspace memory when a new slot is created and unpin the
memory when slot is removed.

[1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    6 +++
 arch/x86/kvm/svm.c              |   93 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |    3 +
 3 files changed, 102 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fcc4710..9dc59f0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -723,6 +723,7 @@ struct kvm_sev_info {
 	unsigned int handle;	/* firmware handle */
 	unsigned int asid;	/* asid for this guest */
 	int sev_fd;		/* SEV device fd */
+	struct list_head pinned_memory_slot;
 };
 
 struct kvm_arch {
@@ -1043,6 +1044,11 @@ struct kvm_x86_ops {
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
 
 	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
+
+	void (*prepare_memory_region)(struct kvm *kvm,
+			struct kvm_memory_slot *memslot,
+			const struct kvm_userspace_memory_region *mem,
+			enum kvm_mr_change change);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 13996d6..ab973f9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -498,12 +498,21 @@ static inline bool gif_set(struct vcpu_svm *svm)
 }
 
 /* Secure Encrypted Virtualization */
+struct kvm_sev_pinned_memory_slot {
+	struct list_head list;
+	unsigned long npages;
+	struct page **pages;
+	unsigned long userspace_addr;
+	short id;
+};
+
 static unsigned int max_sev_asid;
 static unsigned long *sev_asid_bitmap;
 static void sev_deactivate_handle(struct kvm *kvm);
 static void sev_decommission_handle(struct kvm *kvm);
 static int sev_asid_new(void);
 static void sev_asid_free(int asid);
+static void sev_unpin_memory(struct page **pages, unsigned long npages);
 #define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
 
 static bool kvm_sev_enabled(void)
@@ -1544,9 +1553,25 @@ static inline int avic_free_vm_id(int id)
 
 static void sev_vm_destroy(struct kvm *kvm)
 {
+	struct list_head *pos, *q;
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
 	if (!sev_guest(kvm))
 		return;
 
+	/* if guest memory is pinned then unpin it now */
+	if (!list_empty(head)) {
+		list_for_each_safe(pos, q, head) {
+			pinned_slot = list_entry(pos,
+				struct kvm_sev_pinned_memory_slot, list);
+			sev_unpin_memory(pinned_slot->pages,
+					pinned_slot->npages);
+			list_del(pos);
+			kfree(pinned_slot);
+		}
+	}
+
 	/* release the firmware resources */
 	sev_deactivate_handle(kvm);
 	sev_decommission_handle(kvm);
@@ -5663,6 +5688,8 @@ static int sev_pre_start(struct kvm *kvm, int *asid)
 		}
 		*asid = ret;
 		ret = 0;
+
+		INIT_LIST_HEAD(&kvm->arch.sev_info.pinned_memory_slot);
 	}
 
 	return ret;
@@ -6189,6 +6216,71 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static struct kvm_sev_pinned_memory_slot *sev_find_pinned_memory_slot(
+		struct kvm *kvm, struct kvm_memory_slot *slot)
+{
+	struct kvm_sev_pinned_memory_slot *i;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	list_for_each_entry(i, head, list) {
+		if (i->userspace_addr == slot->userspace_addr &&
+			i->id == slot->id)
+			return i;
+	}
+
+	return NULL;
+}
+
+static void amd_prepare_memory_region(struct kvm *kvm,
+				struct kvm_memory_slot *memslot,
+				const struct kvm_userspace_memory_region *mem,
+				enum kvm_mr_change change)
+{
+	struct kvm_sev_pinned_memory_slot *pinned_slot;
+	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
+
+	mutex_lock(&kvm->lock);
+
+	if (!sev_guest(kvm))
+		goto unlock;
+
+	if (change == KVM_MR_CREATE) {
+
+		if (!mem->memory_size)
+			goto unlock;
+
+		pinned_slot = kmalloc(sizeof(*pinned_slot), GFP_KERNEL);
+		if (pinned_slot == NULL)
+			goto unlock;
+
+		pinned_slot->pages = sev_pin_memory(mem->userspace_addr,
+				mem->memory_size, &pinned_slot->npages);
+		if (pinned_slot->pages == NULL) {
+			kfree(pinned_slot);
+			goto unlock;
+		}
+
+		sev_clflush_pages(pinned_slot->pages, pinned_slot->npages);
+
+		pinned_slot->id = memslot->id;
+		pinned_slot->userspace_addr = mem->userspace_addr;
+		list_add_tail(&pinned_slot->list, head);
+
+	} else if  (change == KVM_MR_DELETE) {
+
+		pinned_slot = sev_find_pinned_memory_slot(kvm, memslot);
+		if (!pinned_slot)
+			goto unlock;
+
+		sev_unpin_memory(pinned_slot->pages, pinned_slot->npages);
+		list_del(&pinned_slot->list);
+		kfree(pinned_slot);
+	}
+
+unlock:
+	mutex_unlock(&kvm->lock);
+}
+
 static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
 {
 	int r = -ENOTTY;
@@ -6355,6 +6447,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.update_pi_irte = svm_update_pi_irte,
 
 	.memory_encryption_op = amd_memory_encryption_cmd,
+	.prepare_memory_region = amd_prepare_memory_region,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6a737e9..e05069d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8195,6 +8195,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				const struct kvm_userspace_memory_region *mem,
 				enum kvm_mr_change change)
 {
+	if (kvm_x86_ops->prepare_memory_region)
+		kvm_x86_ops->prepare_memory_region(kvm, memslot, mem, change);
+
 	return 0;
 }
 

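For readers following the series: sev_pin_memory() and
sev_unpin_memory() are referenced above but defined in an earlier patch
of the series. Below is a minimal sketch of what such helpers could
look like -- the names match the series, but the body is illustrative
(get_user_pages_fast()-era semantics, simplified error handling), not
the series' actual implementation:

	static struct page **sev_pin_memory(unsigned long uaddr,
					    unsigned long size,
					    unsigned long *npages)
	{
		unsigned long n = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
		struct page **pages;
		int i, pinned;

		pages = kmalloc_array(n, sizeof(struct page *), GFP_KERNEL);
		if (!pages)
			return NULL;

		/*
		 * Take a reference on each page so it can be neither
		 * migrated nor swapped out from under the guest.
		 */
		pinned = get_user_pages_fast(uaddr, n, 1 /* write */, pages);
		if (pinned != n) {
			/* pinned may be negative; the loop then does nothing */
			for (i = 0; i < pinned; i++)
				put_page(pages[i]);
			kfree(pages);
			return NULL;
		}

		*npages = n;
		return pages;
	}

	static void sev_unpin_memory(struct page **pages, unsigned long npages)
	{
		unsigned long i;

		for (i = 0; i < npages; i++)
			put_page(pages[i]);
		kfree(pages);
	}

With this shape, amd_prepare_memory_region() above simply pairs a pin
at KVM_MR_CREATE with an unpin at KVM_MR_DELETE.
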
^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device
  2017-03-02 15:16   ` Brijesh Singh
@ 2017-03-02 17:39     ` Mark Rutland
  -1 siblings, 0 replies; 424+ messages in thread
From: Mark Rutland @ 2017-03-02 17:39 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, bp, dyoung, thomas.lendacky, jroedel, keescook,
	arnd, toshi.kani, mathieu.desnoyers, luto, devel

On Thu, Mar 02, 2017 at 10:16:15AM -0500, Brijesh Singh wrote:
> The CCP device is part of the AMD Secure Processor. In order to expand the
> usage of the AMD Secure Processor, create a framework that allows functional
> components of the AMD Secure Processor to be initialized and handled
> appropriately.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  drivers/crypto/Kconfig           |   10 +
>  drivers/crypto/ccp/Kconfig       |   43 +++--
>  drivers/crypto/ccp/Makefile      |    8 -
>  drivers/crypto/ccp/ccp-dev-v3.c  |   86 +++++-----
>  drivers/crypto/ccp/ccp-dev-v5.c  |   73 ++++-----
>  drivers/crypto/ccp/ccp-dev.c     |  137 +++++++++-------
>  drivers/crypto/ccp/ccp-dev.h     |   35 ----
>  drivers/crypto/ccp/sp-dev.c      |  308 ++++++++++++++++++++++++++++++++++++
>  drivers/crypto/ccp/sp-dev.h      |  140 ++++++++++++++++
>  drivers/crypto/ccp/sp-pci.c      |  324 ++++++++++++++++++++++++++++++++++++++
>  drivers/crypto/ccp/sp-platform.c |  268 +++++++++++++++++++++++++++++++
>  include/linux/ccp.h              |    3 
>  12 files changed, 1240 insertions(+), 195 deletions(-)
>  create mode 100644 drivers/crypto/ccp/sp-dev.c
>  create mode 100644 drivers/crypto/ccp/sp-dev.h
>  create mode 100644 drivers/crypto/ccp/sp-pci.c
>  create mode 100644 drivers/crypto/ccp/sp-platform.c

> diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
> index 346ceb8..8127e18 100644
> --- a/drivers/crypto/ccp/Makefile
> +++ b/drivers/crypto/ccp/Makefile
> @@ -1,11 +1,11 @@
> -obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
> -ccp-objs := ccp-dev.o \
> +obj-$(CONFIG_CRYPTO_DEV_SP_DD) += ccp.o
> +ccp-objs := sp-dev.o sp-platform.o
> +ccp-$(CONFIG_PCI) += sp-pci.o
> +ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
>  	    ccp-ops.o \
>  	    ccp-dev-v3.o \
>  	    ccp-dev-v5.o \
> -	    ccp-platform.o \
>  	    ccp-dmaengine.o

It looks like ccp-platform.c has morphed into sp-platform.c (judging by
the compatible string and general shape of the code), and the original
ccp-platform.c is no longer built.

Shouldn't ccp-platform.c be deleted by this patch?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device
  2017-03-02 17:39     ` Mark Rutland
@ 2017-03-02 19:11       ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 19:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-efi, brijesh.singh, labbott, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr,
	mchehab, simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc,
	mingo, msalter, ross.zwisler, bp, dyoung, thomas.lendacky,
	jroedel, keescook, arnd, toshi.kani, mathieu.desnoyers, luto,
	pbonzini, bhelgaas, dan.j.williams, andriy.shevchenko, akpm,
	herbert

Hi Mark,

On 03/02/2017 11:39 AM, Mark Rutland wrote:
> On Thu, Mar 02, 2017 at 10:16:15AM -0500, Brijesh Singh wrote:
>> The CCP device is part of the AMD Secure Processor. In order to expand the
>> usage of the AMD Secure Processor, create a framework that allows functional
>> components of the AMD Secure Processor to be initialized and handled
>> appropriately.
>>
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  drivers/crypto/Kconfig           |   10 +
>>  drivers/crypto/ccp/Kconfig       |   43 +++--
>>  drivers/crypto/ccp/Makefile      |    8 -
>>  drivers/crypto/ccp/ccp-dev-v3.c  |   86 +++++-----
>>  drivers/crypto/ccp/ccp-dev-v5.c  |   73 ++++-----
>>  drivers/crypto/ccp/ccp-dev.c     |  137 +++++++++-------
>>  drivers/crypto/ccp/ccp-dev.h     |   35 ----
>>  drivers/crypto/ccp/sp-dev.c      |  308 ++++++++++++++++++++++++++++++++++++
>>  drivers/crypto/ccp/sp-dev.h      |  140 ++++++++++++++++
>>  drivers/crypto/ccp/sp-pci.c      |  324 ++++++++++++++++++++++++++++++++++++++
>>  drivers/crypto/ccp/sp-platform.c |  268 +++++++++++++++++++++++++++++++
>>  include/linux/ccp.h              |    3
>>  12 files changed, 1240 insertions(+), 195 deletions(-)
>>  create mode 100644 drivers/crypto/ccp/sp-dev.c
>>  create mode 100644 drivers/crypto/ccp/sp-dev.h
>>  create mode 100644 drivers/crypto/ccp/sp-pci.c
>>  create mode 100644 drivers/crypto/ccp/sp-platform.c
>
>> diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
>> index 346ceb8..8127e18 100644
>> --- a/drivers/crypto/ccp/Makefile
>> +++ b/drivers/crypto/ccp/Makefile
>> @@ -1,11 +1,11 @@
>> -obj-$(CONFIG_CRYPTO_DEV_CCP_DD) += ccp.o
>> -ccp-objs := ccp-dev.o \
>> +obj-$(CONFIG_CRYPTO_DEV_SP_DD) += ccp.o
>> +ccp-objs := sp-dev.o sp-platform.o
>> +ccp-$(CONFIG_PCI) += sp-pci.o
>> +ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
>>  	    ccp-ops.o \
>>  	    ccp-dev-v3.o \
>>  	    ccp-dev-v5.o \
>> -	    ccp-platform.o \
>>  	    ccp-dmaengine.o
>
> It looks like ccp-platform.c has morphed into sp-platform.c (judging by
> the compatible string and general shape of the code), and the original
> ccp-platform.c is no longer built.
>
> Shouldn't ccp-platform.c be deleted by this patch?
>

Good catch. Both ccp-platform.c and ccp-pci.c should have been deleted 
by this patch. I missed deleting them; will fix in the next rev.

~ Brijesh

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device
  2017-03-02 19:11       ` Brijesh Singh
@ 2017-03-03 13:55         ` Andy Shevchenko
  -1 siblings, 0 replies; 424+ messages in thread
From: Andy Shevchenko @ 2017-03-03 13:55 UTC (permalink / raw)
  To: Brijesh Singh, Mark Rutland
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, herbert, bhe, xemul, joro, x86,
	peterz, piotr.luc, mingo, msalter, ross.zwisler, bp, dyoung,
	thomas.lendacky, jroedel, keescook, arnd, toshi.kani,
	mathieu.desnoyers, luto, devel, bhelgaas, tglx, mchehab,
	iamjoonsoo.kim, labbott

On Thu, 2017-03-02 at 13:11 -0600, Brijesh Singh wrote:
> Hi Mark,
> 
> On 03/02/2017 11:39 AM, Mark Rutland wrote:
> > On Thu, Mar 02, 2017 at 10:16:15AM -0500, Brijesh Singh wrote:

> > > 
> > > +ccp-$(CONFIG_CRYPTO_DEV_CCP) += ccp-dev.o \
> > >  	    ccp-ops.o \
> > >  	    ccp-dev-v3.o \
> > >  	    ccp-dev-v5.o \
> > > -	    ccp-platform.o \
> > >  	    ccp-dmaengine.o
> > 
> > It looks like ccp-platform.c has morphed into sp-platform.c (judging
> > by
> > the compatible string and general shape of the code), and the
> > original
> > ccp-platform.c is no longer built.
> > 
> > Shouldn't ccp-platform.c be deleted by this patch?
> > 
> 
> Good catch. Both ccp-platform.c and ccp-pci.c should have been
> deleted 
> by this patch. I missed deleting it, will fix in next rev.

Don't forget to use -M -C when preparing / sending patches.
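
(As a reference for the idiom: -M/-C turn on rename and copy detection,
so a file that merely moved shows up as a compact rename rather than a
full delete-plus-add in the generated diffs. Something like

	git format-patch -M -C -o outgoing/ HEAD~32

where the output directory and patch count are illustrative.)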

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
  2017-03-02 15:12   ` Brijesh Singh
@ 2017-03-03 16:59     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-03 16:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas

On Thu, Mar 02, 2017 at 10:12:09AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Update the CPU features to include identifying and reporting on the
> Secure Encrypted Virtualization (SEV) feature.  SME is identified by
> CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
> MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
> as available if reported by CPUID and enabled by BIOS.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/include/asm/cpufeatures.h |    1 +
>  arch/x86/include/asm/msr-index.h   |    2 ++
>  arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
>  arch/x86/kernel/cpu/scattered.c    |    1 +
>  4 files changed, 22 insertions(+), 4 deletions(-)

So this patchset is not really on top of Tom's patchset because this
patch doesn't apply. The reason is, Tom did the SME bit this way:

https://lkml.kernel.org/r/20170216154236.19244.7580.stgit@tlendack-t1.amdoffice.net

but it should've been in scattered.c.

> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
> index cabda87..c3f58d9 100644
> --- a/arch/x86/kernel/cpu/scattered.c
> +++ b/arch/x86/kernel/cpu/scattered.c
> @@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
>  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>  	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
> +	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
>  	{ 0, 0, 0, 0, 0 }

... and here it is in scattered.c, as it should be. So you've used an
older version of the patch, it seems.
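
(Background for readers: each cpuid_bits[] entry is { feature, CPUID
output register, bit position, leaf, sub-leaf }, and
init_scattered_cpuid_features() walks the table roughly like this --
a simplified sketch, with the level/bounds checks omitted:

	const struct cpuid_bit *cb;
	u32 regs[4];

	for (cb = cpuid_bits; cb->feature; cb++) {
		cpuid_count(cb->level, cb->sub_leaf,
			    &regs[CPUID_EAX], &regs[CPUID_EBX],
			    &regs[CPUID_ECX], &regs[CPUID_EDX]);
		if (regs[cb->reg] & BIT(cb->bit))
			set_cpu_cap(c, cb->feature);
	}

so adding SEV as bit 1 of leaf 0x8000001f EAX is a one-line entry.)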

Please sync with Tom to see whether he's reworked the v4 version of that
patch already. If yes, then you could send only the SME and SEV adding
patches as a reply to this message so that I can continue reviewing in
the meantime.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD)
  2017-03-02 15:12 ` Brijesh Singh
@ 2017-03-03 20:33   ` Bjorn Helgaas
  -1 siblings, 0 replies; 424+ messages in thread
From: Bjorn Helgaas @ 2017-03-03 20:33 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, bp, dyoung, thomas.lendacky, jroedel, keescook,
	arnd, toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim

On Thu, Mar 02, 2017 at 10:12:01AM -0500, Brijesh Singh wrote:
> This RFC series provides support for AMD's new Secure Encrypted Virtualization
> (SEV) feature. This RFC is build upon Secure Memory Encryption (SME) RFCv4 [1].

What kernel version is this series based on?

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
  2017-03-02 15:13   ` Brijesh Singh
@ 2017-03-03 20:42     ` Bjorn Helgaas
  -1 siblings, 0 replies; 424+ messages in thread
From: Bjorn Helgaas @ 2017-03-03 20:42 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: linux-efi, labbott, kvm, rkrcmar, matt, linux-pci, linus.walleij,
	gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr, mchehab,
	simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, pbonzini,
	bhelgaas, dan.j.williams, andriy.shevchenko, akpm, herbert,
	tony.luck

On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> The use of ioremap will force the setup data to be mapped decrypted even
> though setup data is encrypted.  Switch to using memremap which will be
> able to perform the proper mapping.

How should callers decide whether to use ioremap() or memremap()?

memremap() existed before SME and SEV, and this code is used even if
SME and SEV aren't supported, so the rationale for this change should
not need the decryption argument.

> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/pci/common.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index a4fdfa7..0b06670 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
>  
>  	pa_data = boot_params.hdr.setup_data;
>  	while (pa_data) {
> -		data = ioremap(pa_data, sizeof(*rom));
> +		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);

I can't quite connect the dots here.  ioremap() on x86 would do
ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
which is ioremap_cache().  Is making a cacheable mapping the important
difference?

>  		if (!data)
>  			return -ENOMEM;
>  
> @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
>  			}
>  		}
>  		pa_data = data->next;
> -		iounmap(data);
> +		memunmap(data);
>  	}
>  	set_dma_domain_ops(dev);
>  	set_dev_domain_options(dev);
> 
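
(For readers weighing the two APIs: on x86, plain ioremap() produces an
uncached mapping intended for device MMIO, while memremap(..., MEMREMAP_WB)
requests an ordinary write-back cacheable mapping of RAM -- and, per the
patch description, it is the variant able to map the encrypted setup
data with the proper attributes. Schematically, with illustrative
phys/len variables:

	void __iomem *mmio = ioremap(phys, len);	/* UC mapping: MMIO */
	void *ram = memremap(phys, len, MEMREMAP_WB);	/* WB mapping: RAM  */
	...
	iounmap(mmio);
	memunmap(ram);
)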

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD)
  2017-03-03 20:33   ` Bjorn Helgaas
@ 2017-03-03 20:51     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-03 20:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto

On Fri, Mar 03, 2017 at 02:33:23PM -0600, Bjorn Helgaas wrote:
> On Thu, Mar 02, 2017 at 10:12:01AM -0500, Brijesh Singh wrote:
> > This RFC series provides support for AMD's new Secure Encrypted Virtualization
> > (SEV) feature. This RFC is build upon Secure Memory Encryption (SME) RFCv4 [1].
> 
> What kernel version is this series based on?

Yeah, see that mail in [1]:

https://lkml.kernel.org/r/20170216154158.19244.66630.stgit@tlendack-t1.amdoffice.net

"This patch series is based off of the master branch of tip.
  Commit a27cb9e1b2b4 ("Merge branch 'WIP.sched/core'")"

$ git describe a27cb9e1b2b4
v4.10-rc7-681-ga27cb9e1b2b4

So you need the SME pile first and then this SEV pile. But the first
patch needs refreshing as it is using a different base than the SME
pile. :-)
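
(Concretely, to reproduce that base -- the remote URL is the well-known
tip tree; the branch name is illustrative:

	git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
	git fetch tip
	git checkout -b sme-sev-base a27cb9e1b2b4
)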

Tom, Brijesh, perhaps you guys could push a full tree somewhere - github
or so - for people to pull, in addition to the patchset on lkml.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
  2017-03-03 16:59     ` Borislav Petkov
@ 2017-03-03 21:01       ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-03 21:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel, bhel

Hi Boris,

On 03/03/2017 10:59 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:12:09AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> Update the CPU features to include identifying and reporting on the
>> Secure Encrypted Virtualization (SEV) feature.  SME is identified by
>> CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
>> MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
>> as available if reported by CPUID and enabled by BIOS.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/x86/include/asm/cpufeatures.h |    1 +
>>  arch/x86/include/asm/msr-index.h   |    2 ++
>>  arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
>>  arch/x86/kernel/cpu/scattered.c    |    1 +
>>  4 files changed, 22 insertions(+), 4 deletions(-)
>
> So this patchset is not really on top of Tom's patchset because this
> patch doesn't apply. The reason is, Tom did the SME bit this way:
>
> https://lkml.kernel.org/r/20170216154236.19244.7580.stgit@tlendack-t1.amdoffice.net
>
> but it should've been in scattered.c.
>
>> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
>> index cabda87..c3f58d9 100644
>> --- a/arch/x86/kernel/cpu/scattered.c
>> +++ b/arch/x86/kernel/cpu/scattered.c
>> @@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>>  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
>>  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>>  	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
>> +	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
>>  	{ 0, 0, 0, 0, 0 }
>
> ... and here it is in scattered.c, as it should be. So you've used an
> older version of the patch, it seems.
>
> Please sync with Tom to see whether he's reworked the v4 version of that
> patch already. If yes, then you could send only the SME and SEV adding
> patches as a reply to this message so that I can continue reviewing in
> the meantime.
>

Just realized my error: I actually ended up using Tom's recent updates
to v4 instead of the original v4. Here is the diff. If you have Tom's
v4 applied, then apply this diff before applying the SEV v2 series.
Sorry about that.

Optionally, you can also pull the complete tree from github [1].

[1] https://github.com/codomania/tip/tree/sev-rfc-v2
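
For example (the remote name here is arbitrary):

  $ git remote add brijesh https://github.com/codomania/tip
  $ git fetch brijesh
  $ git checkout brijesh/sev-rfc-v2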


diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91c40fa..b91e2495 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2153,8 +2153,8 @@
  			mem_encrypt=on:		Activate SME
  			mem_encrypt=off:	Do not activate SME

-			Refer to the SME documentation for details on when
-			memory encryption can be activated.
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.

  	mem_sleep_default=	[SUSPEND] Default system suspend mode:
  			s2idle  - Suspend-To-Idle
diff --git a/Documentation/x86/amd-memory-encryption.txt b/Documentation/x86/amd-memory-encryption.txt
index 0938e89..0b72ff2 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.txt
@@ -7,9 +7,9 @@ DRAM.  SME can therefore be used to protect the contents of DRAM from physical
  attacks on the system.

  A page is encrypted when a page table entry has the encryption bit set (see
-below how to determine the position of the bit).  The encryption bit can be
-specified in the cr3 register, allowing the PGD table to be encrypted. Each
-successive level of page tables can also be encrypted.
+below on how to determine its position).  The encryption bit can be specified
+in the cr3 register, allowing the PGD table to be encrypted. Each successive
+level of page tables can also be encrypted.

  Support for SME can be determined through the CPUID instruction. The CPUID
  function 0x8000001f reports information related to SME:
@@ -17,13 +17,14 @@ function 0x8000001f reports information related to SME:
  	0x8000001f[eax]:
  		Bit[0] indicates support for SME
  	0x8000001f[ebx]:
-		Bit[5:0]  pagetable bit number used to activate memory
-			  encryption
-		Bit[11:6] reduction in physical address space, in bits, when
-			  memory encryption is enabled (this only affects system
-			  physical addresses, not guest physical addresses)
-
-If support for SME is present, MSR 0xc00100010 (SYS_CFG) can be used to
+		Bits[5:0]  pagetable bit number used to activate memory
+			   encryption
+		Bits[11:6] reduction in physical address space, in bits, when
+			   memory encryption is enabled (this only affects
+			   system physical addresses, not guest physical
+			   addresses)
+
+If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be used to
  determine if SME is enabled and/or to enable memory encryption:

  	0xc0010010:
@@ -41,7 +42,7 @@ The state of SME in the Linux kernel can be documented as follows:
  	  The CPU supports SME (determined through CPUID instruction).

  	- Enabled:
-	  Supported and bit 23 of the SYS_CFG MSR is set.
+	  Supported and bit 23 of MSR_K8_SYSCFG is set.

  	- Active:
  	  Supported, Enabled and the Linux kernel is actively applying
@@ -51,7 +52,9 @@ The state of SME in the Linux kernel can be documented as follows:
  SME can also be enabled and activated in the BIOS. If SME is enabled and
  activated in the BIOS, then all memory accesses will be encrypted and it will
  not be necessary to activate the Linux memory encryption support.  If the BIOS
-merely enables SME (sets bit 23 of the SYS_CFG MSR), then Linux can activate
-memory encryption.  However, if BIOS does not enable SME, then Linux will not
-attempt to activate memory encryption, even if configured to do so by default
-or the mem_encrypt=on command line parameter is specified.
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
+memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
+not enable SME, then Linux will not be able to activate memory encryption, even
+if configured to do so by default or the mem_encrypt=on command line parameter
+is specified.
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index ea2de6a..d59c15c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,7 +28,6 @@ enum cpuid_leafs
  	CPUID_8000_000A_EDX,
  	CPUID_7_ECX,
  	CPUID_8000_0007_EBX,
-	CPUID_8000_001F_EAX,
  };

  #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,9 +78,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
  	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define DISABLED_MASK_BIT_SET(feature_bit)				\
  	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -102,9 +100,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
  	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define cpu_has(c, bit)							\
  	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 331fb81..b1a4468 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
  /*
   * Defines x86 CPU feature bits
   */
-#define NCAPINTS	19	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
  #define NBUGINTS	1	/* N 32-bit bug flags */

  /*
@@ -187,6 +187,7 @@
   * Reuse free bits when adding new feature flags!
   */

+#define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
  #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
  #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
  #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
@@ -296,9 +297,6 @@
  #define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error containment and recovery */
  #define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */

-/* AMD-defined CPU features, CPUID level 0x8000001f (eax), word 18 */
-#define X86_FEATURE_SME		(18*32+0) /* Secure Memory Encryption */
-
  /*
   * BUG word(s)
   */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 8b45e08..85599ad 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -57,7 +57,6 @@
  #define DISABLED_MASK15	0
  #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
  #define DISABLED_MASK17	0
-#define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index 6847d85..fac9a5c 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,7 +100,6 @@
  #define REQUIRED_MASK15	0
  #define REQUIRED_MASK16	0
  #define REQUIRED_MASK17	0
-#define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 35a5d5d..6bddda3 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -615,6 +615,29 @@ static void early_init_amd(struct cpuinfo_x86 *c)
  	 */
  	if (cpu_has_amd_erratum(c, amd_erratum_400))
  		set_cpu_bug(c, X86_BUG_AMD_E400);
+
+	/*
+	 * BIOS support is required for SME. If BIOS has enabled SME then
+	 * adjust x86_phys_bits by the SME physical address space reduction
+	 * value. If BIOS has not enabled SME then don't advertise the
+	 * feature (set in scattered.c).
+	 */
+	if (c->extended_cpuid_level >= 0x8000001f) {
+		if (cpu_has(c, X86_FEATURE_SME)) {
+			u64 msr;
+
+			/* Check if SME is enabled */
+			rdmsrl(MSR_K8_SYSCFG, msr);
+			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT) {
+				unsigned int ebx;
+
+				ebx = cpuid_ebx(0x8000001f);
+				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
+			} else {
+				clear_cpu_cap(c, X86_FEATURE_SME);
+			}
+		}
+	}
  }

  static void init_amd_k8(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 358208d7..c188ae5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -763,29 +763,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
  	if (c->extended_cpuid_level >= 0x8000000a)
  		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);

-	if (c->extended_cpuid_level >= 0x8000001f) {
-		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
-
-		/* SME feature support */
-		if ((c->x86_vendor == X86_VENDOR_AMD) && (eax & 0x01)) {
-			u64 msr;
-
-			/*
-			 * For SME, BIOS support is required. If BIOS has
-			 * enabled SME adjust x86_phys_bits by the SME
-			 * physical address space reduction value. If BIOS
-			 * has not enabled SME don't advertise the feature.
-			 */
-			rdmsrl(MSR_K8_SYSCFG, msr);
-			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT)
-				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
-			else
-				eax &= ~0x01;
-		}
-
-		c->x86_capability[CPUID_8000_001F_EAX] = eax;
-	}
-
  	init_scattered_cpuid_features(c);

  	/*
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index d979406..cabda87 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
  	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
+	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
  	{ 0, 0, 0, 0, 0 }
  };



^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
@ 2017-03-03 21:01       ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-03 21:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

Hi Boris,

On 03/03/2017 10:59 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:12:09AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> Update the CPU features to include identifying and reporting on the
>> Secure Encrypted Virtualization (SEV) feature.  SME is identified by
>> CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
>> MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
>> as available if reported by CPUID and enabled by BIOS.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/x86/include/asm/cpufeatures.h |    1 +
>>  arch/x86/include/asm/msr-index.h   |    2 ++
>>  arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
>>  arch/x86/kernel/cpu/scattered.c    |    1 +
>>  4 files changed, 22 insertions(+), 4 deletions(-)
>
> So this patchset is not really ontop of Tom's patchset because this
> patch doesn't apply. The reason is, Tom did the SME bit this way:
>
> https://lkml.kernel.org/r/20170216154236.19244.7580.stgit@tlendack-t1.amdoffice.net
>
> but it should've been in scattered.c.
>
>> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
>> index cabda87..c3f58d9 100644
>> --- a/arch/x86/kernel/cpu/scattered.c
>> +++ b/arch/x86/kernel/cpu/scattered.c
>> @@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>>  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
>>  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>>  	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
>> +	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
>>  	{ 0, 0, 0, 0, 0 }
>
> ... and here it is in scattered.c, as it should be. So you've used an
> older version of the patch, it seems.
>
> Please sync with Tom to see whether he's reworked the v4 version of that
> patch already. If yes, then you could send only the SME and SEV adding
> patches as a reply to this message so that I can continue reviewing in
> the meantime.
>

Just realized my error, I actually end up using Tom's recent updates to 
v4 instead of original v4. Here is the diff. If you have Tom's v4 
applied then apply this diff before applying SEV v2 version. Sorry about 
that.

Optionally, you also pull the complete tree from github [1].

[1] https://github.com/codomania/tip/tree/sev-rfc-v2


diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 91c40fa..b91e2495 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2153,8 +2153,8 @@
  			mem_encrypt=on:		Activate SME
  			mem_encrypt=off:	Do not activate SME

-			Refer to the SME documentation for details on when
-			memory encryption can be activated.
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.

  	mem_sleep_default=	[SUSPEND] Default system suspend mode:
  			s2idle  - Suspend-To-Idle
diff --git a/Documentation/x86/amd-memory-encryption.txt 
b/Documentation/x86/amd-memory-encryption.txt
index 0938e89..0b72ff2 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.txt
@@ -7,9 +7,9 @@ DRAM.  SME can therefore be used to protect the contents 
of DRAM from physical
  attacks on the system.

  A page is encrypted when a page table entry has the encryption bit set 
(see
-below how to determine the position of the bit).  The encryption bit can be
-specified in the cr3 register, allowing the PGD table to be encrypted. Each
-successive level of page tables can also be encrypted.
+below on how to determine its position).  The encryption bit can be 
specified
+in the cr3 register, allowing the PGD table to be encrypted. Each 
successive
+level of page tables can also be encrypted.

  Support for SME can be determined through the CPUID instruction. The CPUID
  function 0x8000001f reports information related to SME:
@@ -17,13 +17,14 @@ function 0x8000001f reports information related to SME:
  	0x8000001f[eax]:
  		Bit[0] indicates support for SME
  	0x8000001f[ebx]:
-		Bit[5:0]  pagetable bit number used to activate memory
-			  encryption
-		Bit[11:6] reduction in physical address space, in bits, when
-			  memory encryption is enabled (this only affects system
-			  physical addresses, not guest physical addresses)
-
-If support for SME is present, MSR 0xc00100010 (SYS_CFG) can be used to
+		Bits[5:0]  pagetable bit number used to activate memory
+			   encryption
+		Bits[11:6] reduction in physical address space, in bits, when
+			   memory encryption is enabled (this only affects
+			   system physical addresses, not guest physical
+			   addresses)
+
+If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be 
used to
  determine if SME is enabled and/or to enable memory encryption:

  	0xc0010010:
@@ -41,7 +42,7 @@ The state of SME in the Linux kernel can be documented 
as follows:
  	  The CPU supports SME (determined through CPUID instruction).

  	- Enabled:
-	  Supported and bit 23 of the SYS_CFG MSR is set.
+	  Supported and bit 23 of MSR_K8_SYSCFG is set.

  	- Active:
  	  Supported, Enabled and the Linux kernel is actively applying
@@ -51,7 +52,9 @@ The state of SME in the Linux kernel can be documented 
as follows:
  SME can also be enabled and activated in the BIOS. If SME is enabled and
  activated in the BIOS, then all memory accesses will be encrypted and 
it will
  not be necessary to activate the Linux memory encryption support.  If 
the BIOS
-merely enables SME (sets bit 23 of the SYS_CFG MSR), then Linux can 
activate
-memory encryption.  However, if BIOS does not enable SME, then Linux 
will not
-attempt to activate memory encryption, even if configured to do so by 
default
-or the mem_encrypt=on command line parameter is specified.
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can 
activate
+memory encryption by default 
(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if 
BIOS does
+not enable SME, then Linux will not be able to activate memory 
encryption, even
+if configured to do so by default or the mem_encrypt=on command line 
parameter
+is specified.
diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index ea2de6a..d59c15c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,7 +28,6 @@ enum cpuid_leafs
  	CPUID_8000_000A_EDX,
  	CPUID_7_ECX,
  	CPUID_8000_0007_EBX,
-	CPUID_8000_001F_EAX,
  };

  #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,9 +78,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
  	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define DISABLED_MASK_BIT_SET(feature_bit)				\
  	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -102,9 +100,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
  	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define cpu_has(c, bit)							\
  	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 331fb81..b1a4468 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
  /*
   * Defines x86 CPU feature bits
   */
-#define NCAPINTS	19	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
  #define NBUGINTS	1	/* N 32-bit bug flags */

  /*
@@ -187,6 +187,7 @@
   * Reuse free bits when adding new feature flags!
   */

+#define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
  #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
  #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
  #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
@@ -296,9 +297,6 @@
  #define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error 
containment and recovery */
  #define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */

-/* AMD-defined CPU features, CPUID level 0x8000001f (eax), word 18 */
-#define X86_FEATURE_SME		(18*32+0) /* Secure Memory Encryption */
-
  /*
   * BUG word(s)
   */
diff --git a/arch/x86/include/asm/disabled-features.h 
b/arch/x86/include/asm/disabled-features.h
index 8b45e08..85599ad 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -57,7 +57,6 @@
  #define DISABLED_MASK15	0
  #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
  #define DISABLED_MASK17	0
-#define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/required-features.h 
b/arch/x86/include/asm/required-features.h
index 6847d85..fac9a5c 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,7 +100,6 @@
  #define REQUIRED_MASK15	0
  #define REQUIRED_MASK16	0
  #define REQUIRED_MASK17	0
-#define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 35a5d5d..6bddda3 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -615,6 +615,29 @@ static void early_init_amd(struct cpuinfo_x86 *c)
  	 */
  	if (cpu_has_amd_erratum(c, amd_erratum_400))
  		set_cpu_bug(c, X86_BUG_AMD_E400);
+
+	/*
+	 * BIOS support is required for SME. If BIOS has enabld SME then
+	 * adjust x86_phys_bits by the SME physical address space reduction
+	 * value. If BIOS has not enabled SME then don't advertise the
+	 * feature (set in scattered.c).
+	 */
+	if (c->extended_cpuid_level >= 0x8000001f) {
+		if (cpu_has(c, X86_FEATURE_SME)) {
+			u64 msr;
+
+			/* Check if SME is enabled */
+			rdmsrl(MSR_K8_SYSCFG, msr);
+			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT) {
+				unsigned int ebx;
+
+				ebx = cpuid_ebx(0x8000001f);
+				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
+			} else {
+				clear_cpu_cap(c, X86_FEATURE_SME);
+			}
+		}
+	}
  }

  static void init_amd_k8(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 358208d7..c188ae5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -763,29 +763,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
  	if (c->extended_cpuid_level >= 0x8000000a)
  		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);

-	if (c->extended_cpuid_level >= 0x8000001f) {
-		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
-
-		/* SME feature support */
-		if ((c->x86_vendor == X86_VENDOR_AMD) && (eax & 0x01)) {
-			u64 msr;
-
-			/*
-			 * For SME, BIOS support is required. If BIOS has
-			 * enabled SME adjust x86_phys_bits by the SME
-			 * physical address space reduction value. If BIOS
-			 * has not enabled SME don't advertise the feature.
-			 */
-			rdmsrl(MSR_K8_SYSCFG, msr);
-			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT)
-				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
-			else
-				eax &= ~0x01;
-		}
-
-		c->x86_capability[CPUID_8000_001F_EAX] = eax;
-	}
-
  	init_scattered_cpuid_features(c);

  	/*
diff --git a/arch/x86/kernel/cpu/scattered.c 
b/arch/x86/kernel/cpu/scattered.c
index d979406..cabda87 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
  	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
+	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
  	{ 0, 0, 0, 0, 0 }
  };

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
@ 2017-03-03 21:01       ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-03 21:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel, bhel

Hi Boris,

On 03/03/2017 10:59 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:12:09AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> Update the CPU features to include identifying and reporting on the
>> Secure Encrypted Virtualization (SEV) feature.  SME is identified by
>> CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
>> MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
>> as available if reported by CPUID and enabled by BIOS.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/x86/include/asm/cpufeatures.h |    1 +
>>  arch/x86/include/asm/msr-index.h   |    2 ++
>>  arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
>>  arch/x86/kernel/cpu/scattered.c    |    1 +
>>  4 files changed, 22 insertions(+), 4 deletions(-)
>
> So this patchset is not really ontop of Tom's patchset because this
> patch doesn't apply. The reason is, Tom did the SME bit this way:
>
> https://lkml.kernel.org/r/20170216154236.19244.7580.stgit@tlendack-t1.amdoffice.net
>
> but it should've been in scattered.c.
>
>> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
>> index cabda87..c3f58d9 100644
>> --- a/arch/x86/kernel/cpu/scattered.c
>> +++ b/arch/x86/kernel/cpu/scattered.c
>> @@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>>  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
>>  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>>  	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
>> +	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
>>  	{ 0, 0, 0, 0, 0 }
>
> ... and here it is in scattered.c, as it should be. So you've used an
> older version of the patch, it seems.
>
> Please sync with Tom to see whether he's reworked the v4 version of that
> patch already. If yes, then you could send only the SME and SEV adding
> patches as a reply to this message so that I can continue reviewing in
> the meantime.
>

Just realized my error, I actually end up using Tom's recent updates to 
v4 instead of original v4. Here is the diff. If you have Tom's v4 
applied then apply this diff before applying SEV v2 version. Sorry about 
that.

Optionally, you also pull the complete tree from github [1].

[1] https://github.com/codomania/tip/tree/sev-rfc-v2


diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 91c40fa..b91e2495 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2153,8 +2153,8 @@
  			mem_encrypt=on:		Activate SME
  			mem_encrypt=off:	Do not activate SME

-			Refer to the SME documentation for details on when
-			memory encryption can be activated.
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.

  	mem_sleep_default=	[SUSPEND] Default system suspend mode:
  			s2idle  - Suspend-To-Idle
diff --git a/Documentation/x86/amd-memory-encryption.txt 
b/Documentation/x86/amd-memory-encryption.txt
index 0938e89..0b72ff2 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.txt
@@ -7,9 +7,9 @@ DRAM.  SME can therefore be used to protect the contents 
of DRAM from physical
  attacks on the system.

  A page is encrypted when a page table entry has the encryption bit set 
(see
-below how to determine the position of the bit).  The encryption bit can be
-specified in the cr3 register, allowing the PGD table to be encrypted. Each
-successive level of page tables can also be encrypted.
+below on how to determine its position).  The encryption bit can be 
specified
+in the cr3 register, allowing the PGD table to be encrypted. Each 
successive
+level of page tables can also be encrypted.

  Support for SME can be determined through the CPUID instruction. The CPUID
  function 0x8000001f reports information related to SME:
@@ -17,13 +17,14 @@ function 0x8000001f reports information related to SME:
  	0x8000001f[eax]:
  		Bit[0] indicates support for SME
  	0x8000001f[ebx]:
-		Bit[5:0]  pagetable bit number used to activate memory
-			  encryption
-		Bit[11:6] reduction in physical address space, in bits, when
-			  memory encryption is enabled (this only affects system
-			  physical addresses, not guest physical addresses)
-
-If support for SME is present, MSR 0xc00100010 (SYS_CFG) can be used to
+		Bits[5:0]  pagetable bit number used to activate memory
+			   encryption
+		Bits[11:6] reduction in physical address space, in bits, when
+			   memory encryption is enabled (this only affects
+			   system physical addresses, not guest physical
+			   addresses)
+
+If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be 
used to
  determine if SME is enabled and/or to enable memory encryption:

  	0xc0010010:
@@ -41,7 +42,7 @@ The state of SME in the Linux kernel can be documented 
as follows:
  	  The CPU supports SME (determined through CPUID instruction).

  	- Enabled:
-	  Supported and bit 23 of the SYS_CFG MSR is set.
+	  Supported and bit 23 of MSR_K8_SYSCFG is set.

  	- Active:
  	  Supported, Enabled and the Linux kernel is actively applying
@@ -51,7 +52,9 @@ The state of SME in the Linux kernel can be documented 
as follows:
  SME can also be enabled and activated in the BIOS. If SME is enabled and
  activated in the BIOS, then all memory accesses will be encrypted and 
it will
  not be necessary to activate the Linux memory encryption support.  If 
the BIOS
-merely enables SME (sets bit 23 of the SYS_CFG MSR), then Linux can 
activate
-memory encryption.  However, if BIOS does not enable SME, then Linux 
will not
-attempt to activate memory encryption, even if configured to do so by 
default
-or the mem_encrypt=on command line parameter is specified.
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can 
activate
+memory encryption by default 
(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if 
BIOS does
+not enable SME, then Linux will not be able to activate memory 
encryption, even
+if configured to do so by default or the mem_encrypt=on command line 
parameter
+is specified.
diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index ea2de6a..d59c15c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,7 +28,6 @@ enum cpuid_leafs
  	CPUID_8000_000A_EDX,
  	CPUID_7_ECX,
  	CPUID_8000_0007_EBX,
-	CPUID_8000_001F_EAX,
  };

  #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,9 +78,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
  	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define DISABLED_MASK_BIT_SET(feature_bit)				\
  	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -102,9 +100,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
  	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define cpu_has(c, bit)							\
  	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 331fb81..b1a4468 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
  /*
   * Defines x86 CPU feature bits
   */
-#define NCAPINTS	19	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
  #define NBUGINTS	1	/* N 32-bit bug flags */

  /*
@@ -187,6 +187,7 @@
   * Reuse free bits when adding new feature flags!
   */

+#define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
  #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
  #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
  #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
@@ -296,9 +297,6 @@
  #define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error 
containment and recovery */
  #define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */

-/* AMD-defined CPU features, CPUID level 0x8000001f (eax), word 18 */
-#define X86_FEATURE_SME		(18*32+0) /* Secure Memory Encryption */
-
  /*
   * BUG word(s)
   */
diff --git a/arch/x86/include/asm/disabled-features.h 
b/arch/x86/include/asm/disabled-features.h
index 8b45e08..85599ad 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -57,7 +57,6 @@
  #define DISABLED_MASK15	0
  #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
  #define DISABLED_MASK17	0
-#define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/required-features.h 
b/arch/x86/include/asm/required-features.h
index 6847d85..fac9a5c 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,7 +100,6 @@
  #define REQUIRED_MASK15	0
  #define REQUIRED_MASK16	0
  #define REQUIRED_MASK17	0
-#define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 35a5d5d..6bddda3 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -615,6 +615,29 @@ static void early_init_amd(struct cpuinfo_x86 *c)
  	 */
  	if (cpu_has_amd_erratum(c, amd_erratum_400))
  		set_cpu_bug(c, X86_BUG_AMD_E400);
+
+	/*
+	 * BIOS support is required for SME. If BIOS has enabld SME then
+	 * adjust x86_phys_bits by the SME physical address space reduction
+	 * value. If BIOS has not enabled SME then don't advertise the
+	 * feature (set in scattered.c).
+	 */
+	if (c->extended_cpuid_level >= 0x8000001f) {
+		if (cpu_has(c, X86_FEATURE_SME)) {
+			u64 msr;
+
+			/* Check if SME is enabled */
+			rdmsrl(MSR_K8_SYSCFG, msr);
+			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT) {
+				unsigned int ebx;
+
+				ebx = cpuid_ebx(0x8000001f);
+				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
+			} else {
+				clear_cpu_cap(c, X86_FEATURE_SME);
+			}
+		}
+	}
  }

  static void init_amd_k8(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 358208d7..c188ae5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -763,29 +763,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
  	if (c->extended_cpuid_level >= 0x8000000a)
  		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);

-	if (c->extended_cpuid_level >= 0x8000001f) {
-		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
-
-		/* SME feature support */
-		if ((c->x86_vendor == X86_VENDOR_AMD) && (eax & 0x01)) {
-			u64 msr;
-
-			/*
-			 * For SME, BIOS support is required. If BIOS has
-			 * enabled SME adjust x86_phys_bits by the SME
-			 * physical address space reduction value. If BIOS
-			 * has not enabled SME don't advertise the feature.
-			 */
-			rdmsrl(MSR_K8_SYSCFG, msr);
-			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT)
-				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
-			else
-				eax &= ~0x01;
-		}
-
-		c->x86_capability[CPUID_8000_001F_EAX] = eax;
-	}
-
  	init_scattered_cpuid_features(c);

  	/*
diff --git a/arch/x86/kernel/cpu/scattered.c 
b/arch/x86/kernel/cpu/scattered.c
index d979406..cabda87 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
  	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
+	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
  	{ 0, 0, 0, 0, 0 }
  };


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
@ 2017-03-03 21:01       ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-03 21:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

Hi Boris,

On 03/03/2017 10:59 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:12:09AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> Update the CPU features to include identifying and reporting on the
>> Secure Encrypted Virtualization (SEV) feature.  SME is identified by
>> CPUID 0x8000001f, but requires BIOS support to enable it (set bit 23 of
>> MSR_K8_SYSCFG and set bit 0 of MSR_K7_HWCR).  Only show the SEV feature
>> as available if reported by CPUID and enabled by BIOS.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/x86/include/asm/cpufeatures.h |    1 +
>>  arch/x86/include/asm/msr-index.h   |    2 ++
>>  arch/x86/kernel/cpu/amd.c          |   22 ++++++++++++++++++----
>>  arch/x86/kernel/cpu/scattered.c    |    1 +
>>  4 files changed, 22 insertions(+), 4 deletions(-)
>
> So this patchset is not really ontop of Tom's patchset because this
> patch doesn't apply. The reason is, Tom did the SME bit this way:
>
> https://lkml.kernel.org/r/20170216154236.19244.7580.stgit@tlendack-t1.amdoffice.net
>
> but it should've been in scattered.c.
>
>> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
>> index cabda87..c3f58d9 100644
>> --- a/arch/x86/kernel/cpu/scattered.c
>> +++ b/arch/x86/kernel/cpu/scattered.c
>> @@ -31,6 +31,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>>  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
>>  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>>  	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
>> +	{ X86_FEATURE_SEV,		CPUID_EAX,  1, 0x8000001f, 0 },
>>  	{ 0, 0, 0, 0, 0 }
>
> ... and here it is in scattered.c, as it should be. So you've used an
> older version of the patch, it seems.
>
> Please sync with Tom to see whether he's reworked the v4 version of that
> patch already. If yes, then you could send only the SME and SEV adding
> patches as a reply to this message so that I can continue reviewing in
> the meantime.
>

Just realized my error, I actually end up using Tom's recent updates to 
v4 instead of original v4. Here is the diff. If you have Tom's v4 
applied then apply this diff before applying SEV v2 version. Sorry about 
that.

Optionally, you also pull the complete tree from github [1].

[1] https://github.com/codomania/tip/tree/sev-rfc-v2


diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 91c40fa..b91e2495 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2153,8 +2153,8 @@
  			mem_encrypt=on:		Activate SME
  			mem_encrypt=off:	Do not activate SME

-			Refer to the SME documentation for details on when
-			memory encryption can be activated.
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.

  	mem_sleep_default=	[SUSPEND] Default system suspend mode:
  			s2idle  - Suspend-To-Idle
diff --git a/Documentation/x86/amd-memory-encryption.txt 
b/Documentation/x86/amd-memory-encryption.txt
index 0938e89..0b72ff2 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.txt
@@ -7,9 +7,9 @@ DRAM.  SME can therefore be used to protect the contents 
of DRAM from physical
  attacks on the system.

  A page is encrypted when a page table entry has the encryption bit set 
(see
-below how to determine the position of the bit).  The encryption bit can be
-specified in the cr3 register, allowing the PGD table to be encrypted. Each
-successive level of page tables can also be encrypted.
+below on how to determine its position).  The encryption bit can be 
specified
+in the cr3 register, allowing the PGD table to be encrypted. Each 
successive
+level of page tables can also be encrypted.

  Support for SME can be determined through the CPUID instruction. The CPUID
  function 0x8000001f reports information related to SME:
@@ -17,13 +17,14 @@ function 0x8000001f reports information related to SME:
  	0x8000001f[eax]:
  		Bit[0] indicates support for SME
  	0x8000001f[ebx]:
-		Bit[5:0]  pagetable bit number used to activate memory
-			  encryption
-		Bit[11:6] reduction in physical address space, in bits, when
-			  memory encryption is enabled (this only affects system
-			  physical addresses, not guest physical addresses)
-
-If support for SME is present, MSR 0xc00100010 (SYS_CFG) can be used to
+		Bits[5:0]  pagetable bit number used to activate memory
+			   encryption
+		Bits[11:6] reduction in physical address space, in bits, when
+			   memory encryption is enabled (this only affects
+			   system physical addresses, not guest physical
+			   addresses)
+
+If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be 
used to
  determine if SME is enabled and/or to enable memory encryption:

  	0xc0010010:
@@ -41,7 +42,7 @@ The state of SME in the Linux kernel can be documented 
as follows:
  	  The CPU supports SME (determined through CPUID instruction).

  	- Enabled:
-	  Supported and bit 23 of the SYS_CFG MSR is set.
+	  Supported and bit 23 of MSR_K8_SYSCFG is set.

  	- Active:
  	  Supported, Enabled and the Linux kernel is actively applying
@@ -51,7 +52,9 @@ The state of SME in the Linux kernel can be documented 
as follows:
  SME can also be enabled and activated in the BIOS. If SME is enabled and
  activated in the BIOS, then all memory accesses will be encrypted and 
it will
  not be necessary to activate the Linux memory encryption support.  If 
the BIOS
-merely enables SME (sets bit 23 of the SYS_CFG MSR), then Linux can 
activate
-memory encryption.  However, if BIOS does not enable SME, then Linux 
will not
-attempt to activate memory encryption, even if configured to do so by 
default
-or the mem_encrypt=on command line parameter is specified.
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can 
activate
+memory encryption by default 
(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if 
BIOS does
+not enable SME, then Linux will not be able to activate memory 
encryption, even
+if configured to do so by default or the mem_encrypt=on command line 
parameter
+is specified.
diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index ea2de6a..d59c15c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,7 +28,6 @@ enum cpuid_leafs
  	CPUID_8000_000A_EDX,
  	CPUID_7_ECX,
  	CPUID_8000_0007_EBX,
-	CPUID_8000_001F_EAX,
  };

  #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,9 +78,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
  	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define DISABLED_MASK_BIT_SET(feature_bit)				\
  	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -102,9 +100,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
  	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
  	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))

  #define cpu_has(c, bit)							\
  	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 331fb81..b1a4468 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
  /*
   * Defines x86 CPU feature bits
   */
-#define NCAPINTS	19	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
  #define NBUGINTS	1	/* N 32-bit bug flags */

  /*
@@ -187,6 +187,7 @@
   * Reuse free bits when adding new feature flags!
   */

+#define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
  #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
  #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
  #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
@@ -296,9 +297,6 @@
  #define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error 
containment and recovery */
  #define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */

-/* AMD-defined CPU features, CPUID level 0x8000001f (eax), word 18 */
-#define X86_FEATURE_SME		(18*32+0) /* Secure Memory Encryption */
-
  /*
   * BUG word(s)
   */
diff --git a/arch/x86/include/asm/disabled-features.h 
b/arch/x86/include/asm/disabled-features.h
index 8b45e08..85599ad 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -57,7 +57,6 @@
  #define DISABLED_MASK15	0
  #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
  #define DISABLED_MASK17	0
-#define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/required-features.h 
b/arch/x86/include/asm/required-features.h
index 6847d85..fac9a5c 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,7 +100,6 @@
  #define REQUIRED_MASK15	0
  #define REQUIRED_MASK16	0
  #define REQUIRED_MASK17	0
-#define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)

  #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 35a5d5d..6bddda3 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -615,6 +615,29 @@ static void early_init_amd(struct cpuinfo_x86 *c)
  	 */
  	if (cpu_has_amd_erratum(c, amd_erratum_400))
  		set_cpu_bug(c, X86_BUG_AMD_E400);
+
+	/*
+	 * BIOS support is required for SME. If BIOS has enabled SME then
+	 * adjust x86_phys_bits by the SME physical address space reduction
+	 * value. If BIOS has not enabled SME then don't advertise the
+	 * feature (set in scattered.c).
+	 */
+	if (c->extended_cpuid_level >= 0x8000001f) {
+		if (cpu_has(c, X86_FEATURE_SME)) {
+			u64 msr;
+
+			/* Check if SME is enabled */
+			rdmsrl(MSR_K8_SYSCFG, msr);
+			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT) {
+				unsigned int ebx;
+
+				ebx = cpuid_ebx(0x8000001f);
+				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
+			} else {
+				clear_cpu_cap(c, X86_FEATURE_SME);
+			}
+		}
+	}
  }

  static void init_amd_k8(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 358208d7..c188ae5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -763,29 +763,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
  	if (c->extended_cpuid_level >= 0x8000000a)
  		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);

-	if (c->extended_cpuid_level >= 0x8000001f) {
-		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
-
-		/* SME feature support */
-		if ((c->x86_vendor == X86_VENDOR_AMD) && (eax & 0x01)) {
-			u64 msr;
-
-			/*
-			 * For SME, BIOS support is required. If BIOS has
-			 * enabled SME adjust x86_phys_bits by the SME
-			 * physical address space reduction value. If BIOS
-			 * has not enabled SME don't advertise the feature.
-			 */
-			rdmsrl(MSR_K8_SYSCFG, msr);
-			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT)
-				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
-			else
-				eax &= ~0x01;
-		}
-
-		c->x86_capability[CPUID_8000_001F_EAX] = eax;
-	}
-
  	init_scattered_cpuid_features(c);

  	/*
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index d979406..cabda87 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
  	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
  	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
+	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
  	{ 0, 0, 0, 0, 0 }
  };
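
For reference, the 0x8000001f leaf carries both the SME feature bit (EAX
bit 0, which the new scattered.c entry scans) and the layout values the
amd.c hunk consumes.  A minimal, illustrative decode -- sme_decode_cpuid()
is a hypothetical helper for this note, not code from the series:

#include <linux/kernel.h>
#include <asm/processor.h>

static void sme_decode_cpuid(void)
{
	unsigned int eax = cpuid_eax(0x8000001f);
	unsigned int ebx = cpuid_ebx(0x8000001f);

	/* EAX bit 0: SME supported (the bit scanned by scattered.c)   */
	/* EBX bits 5:0: page table bit position of the encryption bit */
	/* EBX bits 11:6: physical address space reduction, in bits    */
	pr_info("SME: supported=%u c-bit=%u phys-reduction=%u\n",
		eax & 0x1, ebx & 0x3f, (ebx >> 6) & 0x3f);
}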



^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD)
  2017-03-03 20:33   ` Bjorn Helgaas
@ 2017-03-03 21:15     ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-03 21:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

Hi Bjorn,

On 03/03/2017 02:33 PM, Bjorn Helgaas wrote:
> On Thu, Mar 02, 2017 at 10:12:01AM -0500, Brijesh Singh wrote:
>> This RFC series provides support for AMD's new Secure Encrypted Virtualization
>> (SEV) feature. This RFC is build upon Secure Memory Encryption (SME) RFCv4 [1].
>
> What kernel version is this series based on?
>

This patch series is based on the master branch of tip:
   Commit a27cb9e1b2b4 ("Merge branch 'WIP.sched/core'")
   Tom's RFC v4 patches (http://marc.info/?l=linux-mm&m=148725973013686&w=2)

I accidentally rebased the SEV RFCv2 patches on the updated SME v4
instead of the original SME v4, so you may need to apply patch [1]:

[1] http://marc.info/?l=linux-mm&m=148857523132253&w=2

Alternatively, I have posted the full git tree here [2]:

[2] https://github.com/codomania/tip/branches


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
  2017-03-03 20:42     ` Bjorn Helgaas
@ 2017-03-03 21:15       ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-03 21:15 UTC (permalink / raw)
  To: Bjorn Helgaas, Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, bp, dyoung, jroedel, keescook, arnd, toshi.kani,
	mathieu.desnoyers, luto, devel, bhelgaas, tglx

On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:
> On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> The use of ioremap will force the setup data to be mapped decrypted even
>> though setup data is encrypted.  Switch to using memremap which will be
>> able to perform the proper mapping.
>
> How should callers decide whether to use ioremap() or memremap()?
>
> memremap() existed before SME and SEV, and this code is used even if
> SME and SEV aren't supported, so the rationale for this change should
> not need the decryption argument.

When SME or SEV is active, an ioremap() will remove the encryption bit
from the page table entry when the mapping is created.  This allows MMIO,
which doesn't support SME/SEV, to be performed successfully.  So my take is
that ioremap() should be used for MMIO and memremap() for pages in RAM.
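
To make that rule of thumb concrete, here is a minimal sketch -- map_bar()
and map_ram() are hypothetical helpers for illustration, not code from
this series:

#include <linux/io.h>
#include <linux/pci.h>

/* Device MMIO: ioremap() creates a decrypted mapping under SME/SEV. */
static void __iomem *map_bar(struct pci_dev *pdev, int bar)
{
	return ioremap(pci_resource_start(pdev, bar),
		       pci_resource_len(pdev, bar));
}

/* RAM (e.g. boot_params setup data): memremap() chooses the proper,
 * possibly encrypted, mapping.
 */
static void *map_ram(phys_addr_t pa, size_t size)
{
	return memremap(pa, size, MEMREMAP_WB);
}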

>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/x86/pci/common.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index a4fdfa7..0b06670 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>
>>  	pa_data = boot_params.hdr.setup_data;
>>  	while (pa_data) {
>> -		data = ioremap(pa_data, sizeof(*rom));
>> +		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
>
> I can't quite connect the dots here.  ioremap() on x86 would do
> ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
> which is ioremap_cache().  Is making a cacheable mapping the important
> difference?

memremap(MEMREMAP_WB) will first check, in try_ram_remap(), whether it
can simply return __va(pa_data), and only falls back to
arch_memremap_wb() if it can't.  So it's really the __va() vs. the
ioremap_cache() that makes the difference.
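
Very roughly, the MEMREMAP_WB path looks like this -- a heavily
simplified sketch of the logic in kernel/memremap.c, not the exact code:

#include <linux/io.h>
#include <linux/ioport.h>

static void *memremap_wb_sketch(resource_size_t offset, size_t size)
{
	/* Ordinary RAM is already in the kernel linear map: use __va(). */
	if (region_intersects(offset, size, IORESOURCE_SYSTEM_RAM,
			      IORES_DESC_NONE) == REGION_INTERSECTS)
		return __va(offset);

	/* Everything else falls back to a cacheable ioremap(). */
	return (void __force *)ioremap_cache(offset, size);
}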

Thanks,
Tom

>
>>  		if (!data)
>>  			return -ENOMEM;
>>
>> @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>  			}
>>  		}
>>  		pa_data = data->next;
>> -		iounmap(data);
>> +		memunmap(data);
>>  	}
>>  	set_dma_domain_ops(dev);
>>  	set_dev_domain_options(dev);
>>


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
  2017-03-03 21:01       ` Brijesh Singh
@ 2017-03-04 10:11         ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-04 10:11 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: linux-efi, kvm, rkrcmar, matt, linux-pci, linus.walleij,
	gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr, mchehab,
	simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, labbott, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, pbonzini,
	bhelgaas, dan.j.williams, andriy.shevchenko, akpm, herbert,
	tony.luck, pau

On Fri, Mar 03, 2017 at 03:01:23PM -0600, Brijesh Singh wrote:
> +merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can
> activate
> +memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
> or
> +by supplying mem_encrypt=on on the kernel command line.  However, if BIOS
> does
> +not enable SME, then Linux will not be able to activate memory encryption,
> even
> +if configured to do so by default or the mem_encrypt=on command line
> parameter
> +is specified.

This looks like a wraparound...

$ test-apply.sh /tmp/brijesh.singh.delta
checking file Documentation/admin-guide/kernel-parameters.txt
Hunk #1 succeeded at 2144 (offset -9 lines).
checking file Documentation/x86/amd-memory-encryption.txt
patch: **** malformed patch at line 23: DRAM from physical

Yap.

Looks like Exchange or your mail client decided to do some patch editing
on its own.

Please send it to yourself first and try applying.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
  2017-03-04 10:11         ` Borislav Petkov
@ 2017-03-06 18:11           ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-06 18:11 UTC (permalink / raw)
  To: bp
  Cc: linux-efi, brijesh.singh, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr,
	mchehab, simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc,
	mingo, msalter, ross.zwisler, labbott, dyoung, thomas.lendacky,
	jroedel, keescook, arnd, toshi.kani, mathieu.desnoyers, luto,
	pbonzini, bhelgaas, dan.j.williams, andriy.shevchenko, akpm,
	herbert, t

On 03/04/2017 04:11 AM, Borislav Petkov wrote:
> On Fri, Mar 03, 2017 at 03:01:23PM -0600, Brijesh Singh wrote:
> 
> This looks like a wraparound...
> 
> $ test-apply.sh /tmp/brijesh.singh.delta
> checking file Documentation/admin-guide/kernel-parameters.txt
> Hunk #1 succeeded at 2144 (offset -9 lines).
> checking file Documentation/x86/amd-memory-encryption.txt
> patch: **** malformed patch at line 23: DRAM from physical
> 
> Yap.
> 
> Looks like exchange or your mail client decided to do some patch editing
> on its own.
> 
> Please send it to yourself first and try applying.
> 

Sending it through stg mail to avoid line wrapping. Please let me know if something
is still messed up. I have tried applying it and it seems to apply okay.

---
 Documentation/admin-guide/kernel-parameters.txt |    4 +--
 Documentation/x86/amd-memory-encryption.txt     |   33 +++++++++++++----------
 arch/x86/include/asm/cpufeature.h               |    7 +----
 arch/x86/include/asm/cpufeatures.h              |    6 +---
 arch/x86/include/asm/disabled-features.h        |    3 +-
 arch/x86/include/asm/required-features.h        |    3 +-
 arch/x86/kernel/cpu/amd.c                       |   23 ++++++++++++++++
 arch/x86/kernel/cpu/common.c                    |   23 ----------------
 arch/x86/kernel/cpu/scattered.c                 |    1 +
 9 files changed, 50 insertions(+), 53 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91c40fa..b91e2495 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2153,8 +2153,8 @@
 			mem_encrypt=on:		Activate SME
 			mem_encrypt=off:	Do not activate SME
 
-			Refer to the SME documentation for details on when
-			memory encryption can be activated.
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.
 
 	mem_sleep_default=	[SUSPEND] Default system suspend mode:
 			s2idle  - Suspend-To-Idle
diff --git a/Documentation/x86/amd-memory-encryption.txt b/Documentation/x86/amd-memory-encryption.txt
index 0938e89..0b72ff2 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.txt
@@ -7,9 +7,9 @@ DRAM.  SME can therefore be used to protect the contents of DRAM from physical
 attacks on the system.
 
 A page is encrypted when a page table entry has the encryption bit set (see
-below how to determine the position of the bit).  The encryption bit can be
-specified in the cr3 register, allowing the PGD table to be encrypted. Each
-successive level of page tables can also be encrypted.
+below on how to determine its position).  The encryption bit can be specified
+in the cr3 register, allowing the PGD table to be encrypted. Each successive
+level of page tables can also be encrypted.
 
 Support for SME can be determined through the CPUID instruction. The CPUID
 function 0x8000001f reports information related to SME:
@@ -17,13 +17,14 @@ function 0x8000001f reports information related to SME:
 	0x8000001f[eax]:
 		Bit[0] indicates support for SME
 	0x8000001f[ebx]:
-		Bit[5:0]  pagetable bit number used to activate memory
-			  encryption
-		Bit[11:6] reduction in physical address space, in bits, when
-			  memory encryption is enabled (this only affects system
-			  physical addresses, not guest physical addresses)
-
-If support for SME is present, MSR 0xc00100010 (SYS_CFG) can be used to
+		Bits[5:0]  pagetable bit number used to activate memory
+			   encryption
+		Bits[11:6] reduction in physical address space, in bits, when
+			   memory encryption is enabled (this only affects
+			   system physical addresses, not guest physical
+			   addresses)
+
+If support for SME is present, MSR 0xc0010010 (MSR_K8_SYSCFG) can be used to
 determine if SME is enabled and/or to enable memory encryption:
 
 	0xc0010010:
@@ -41,7 +42,7 @@ The state of SME in the Linux kernel can be documented as follows:
 	  The CPU supports SME (determined through CPUID instruction).
 
 	- Enabled:
-	  Supported and bit 23 of the SYS_CFG MSR is set.
+	  Supported and bit 23 of MSR_K8_SYSCFG is set.
 
 	- Active:
 	  Supported, Enabled and the Linux kernel is actively applying
@@ -51,7 +52,9 @@ The state of SME in the Linux kernel can be documented as follows:
 SME can also be enabled and activated in the BIOS. If SME is enabled and
 activated in the BIOS, then all memory accesses will be encrypted and it will
 not be necessary to activate the Linux memory encryption support.  If the BIOS
-merely enables SME (sets bit 23 of the SYS_CFG MSR), then Linux can activate
-memory encryption.  However, if BIOS does not enable SME, then Linux will not
-attempt to activate memory encryption, even if configured to do so by default
-or the mem_encrypt=on command line parameter is specified.
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
+memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
+not enable SME, then Linux will not be able to activate memory encryption, even
+if configured to do so by default or the mem_encrypt=on command line parameter
+is specified.
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index ea2de6a..d59c15c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,7 +28,6 @@ enum cpuid_leafs
 	CPUID_8000_000A_EDX,
 	CPUID_7_ECX,
 	CPUID_8000_0007_EBX,
-	CPUID_8000_001F_EAX,
 };
 
 #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,9 +78,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
 	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))
 
 #define DISABLED_MASK_BIT_SET(feature_bit)				\
 	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -102,9 +100,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
 	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))
 
 #define cpu_has(c, bit)							\
 	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 331fb81..b1a4468 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS	19	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
 #define NBUGINTS	1	/* N 32-bit bug flags */
 
 /*
@@ -187,6 +187,7 @@
  * Reuse free bits when adding new feature flags!
  */
 
+#define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
 #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
 #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
@@ -296,9 +297,6 @@
 #define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error containment and recovery */
 #define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */
 
-/* AMD-defined CPU features, CPUID level 0x8000001f (eax), word 18 */
-#define X86_FEATURE_SME		(18*32+0) /* Secure Memory Encryption */
-
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 8b45e08..85599ad 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -57,7 +57,6 @@
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
 #define DISABLED_MASK17	0
-#define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index 6847d85..fac9a5c 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,7 +100,6 @@
 #define REQUIRED_MASK15	0
 #define REQUIRED_MASK16	0
 #define REQUIRED_MASK17	0
-#define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
 #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 35a5d5d..6bddda3 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -615,6 +615,29 @@ static void early_init_amd(struct cpuinfo_x86 *c)
 	 */
 	if (cpu_has_amd_erratum(c, amd_erratum_400))
 		set_cpu_bug(c, X86_BUG_AMD_E400);
+
+	/*
+	 * BIOS support is required for SME. If BIOS has enabled SME then
+	 * adjust x86_phys_bits by the SME physical address space reduction
+	 * value. If BIOS has not enabled SME then don't advertise the
+	 * feature (set in scattered.c).
+	 */
+	if (c->extended_cpuid_level >= 0x8000001f) {
+		if (cpu_has(c, X86_FEATURE_SME)) {
+			u64 msr;
+
+			/* Check if SME is enabled */
+			rdmsrl(MSR_K8_SYSCFG, msr);
+			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT) {
+				unsigned int ebx;
+
+				ebx = cpuid_ebx(0x8000001f);
+				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
+			} else {
+				clear_cpu_cap(c, X86_FEATURE_SME);
+			}
+		}
+	}
 }
 
 static void init_amd_k8(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 358208d7..c188ae5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -763,29 +763,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	if (c->extended_cpuid_level >= 0x8000000a)
 		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
 
-	if (c->extended_cpuid_level >= 0x8000001f) {
-		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
-
-		/* SME feature support */
-		if ((c->x86_vendor == X86_VENDOR_AMD) && (eax & 0x01)) {
-			u64 msr;
-
-			/*
-			 * For SME, BIOS support is required. If BIOS has
-			 * enabled SME adjust x86_phys_bits by the SME
-			 * physical address space reduction value. If BIOS
-			 * has not enabled SME don't advertise the feature.
-			 */
-			rdmsrl(MSR_K8_SYSCFG, msr);
-			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT)
-				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
-			else
-				eax &= ~0x01;
-		}
-
-		c->x86_capability[CPUID_8000_001F_EAX] = eax;
-	}
-
 	init_scattered_cpuid_features(c);
 
 	/*
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index d979406..cabda87 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
+	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
 	{ 0, 0, 0, 0, 0 }
 };
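
As a footnote to the documentation hunk above: the encryption bit
position reported in 0x8000001f[ebx] bits 5:0 becomes a page table mask.
A sketch, with sme_me_mask_sketch() being a hypothetical helper rather
than code from this series:

#include <asm/processor.h>

/* Build the encryption-bit mask the documentation describes. */
static unsigned long sme_me_mask_sketch(void)
{
	unsigned int ebx = cpuid_ebx(0x8000001f);

	/* EBX bits 5:0: bit position of the encryption (C) bit */
	return 1UL << (ebx & 0x3f);
}

/* OR-ing this mask into a page table entry maps that page encrypted. */
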

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
@ 2017-03-06 18:11           ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-06 18:11 UTC (permalink / raw)
  To: bp
  Cc: linux-efi, brijesh.singh, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr,
	mchehab, simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc,
	mingo, msalter, ross.zwisler, labbott, dyoung, thomas.lendacky,
	jroedel, keescook, arnd, toshi.kani, mathieu.desnoyers, luto,
	pbonzini, bhelgaas, dan.j.williams, andriy.shevchenko, akpm,
	herbert

On 03/04/2017 04:11 AM, Borislav Petkov wrote:
> On Fri, Mar 03, 2017 at 03:01:23PM -0600, Brijesh Singh wrote:
> 
> This looks like a wraparound...
> 
> $ test-apply.sh /tmp/brijesh.singh.delta
> checking file Documentation/admin-guide/kernel-parameters.txt
> Hunk #1 succeeded at 2144 (offset -9 lines).
> checking file Documentation/x86/amd-memory-encryption.txt
> patch: **** malformed patch at line 23: DRAM from physical
> 
> Yap.
> 
> Looks like exchange or your mail client decided to do some patch editing
> on its own.
> 
> Please send it to yourself first and try applying.
> 

Sending it through stg mail to avoid line wrapping. Please let me know if something
is still messed up. I have tried applying it and it seems to apply okay.

---
 Documentation/admin-guide/kernel-parameters.txt |    4 +--
 Documentation/x86/amd-memory-encryption.txt     |   33 +++++++++++++----------
 arch/x86/include/asm/cpufeature.h               |    7 +----
 arch/x86/include/asm/cpufeatures.h              |    6 +---
 arch/x86/include/asm/disabled-features.h        |    3 +-
 arch/x86/include/asm/required-features.h        |    3 +-
 arch/x86/kernel/cpu/amd.c                       |   23 ++++++++++++++++
 arch/x86/kernel/cpu/common.c                    |   23 ----------------
 arch/x86/kernel/cpu/scattered.c                 |    1 +
 9 files changed, 50 insertions(+), 53 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91c40fa..b91e2495 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2153,8 +2153,8 @@
 			mem_encrypt=on:		Activate SME
 			mem_encrypt=off:	Do not activate SME
 
-			Refer to the SME documentation for details on when
-			memory encryption can be activated.
+			Refer to Documentation/x86/amd-memory-encryption.txt
+			for details on when memory encryption can be activated.
 
 	mem_sleep_default=	[SUSPEND] Default system suspend mode:
 			s2idle  - Suspend-To-Idle
diff --git a/Documentation/x86/amd-memory-encryption.txt b/Documentation/x86/amd-memory-encryption.txt
index 0938e89..0b72ff2 100644
--- a/Documentation/x86/amd-memory-encryption.txt
+++ b/Documentation/x86/amd-memory-encryption.txt
@@ -7,9 +7,9 @@ DRAM.  SME can therefore be used to protect the contents of DRAM from physical
 attacks on the system.
 
 A page is encrypted when a page table entry has the encryption bit set (see
-below how to determine the position of the bit).  The encryption bit can be
-specified in the cr3 register, allowing the PGD table to be encrypted. Each
-successive level of page tables can also be encrypted.
+below on how to determine its position).  The encryption bit can be specified
+in the cr3 register, allowing the PGD table to be encrypted. Each successive
+level of page tables can also be encrypted.
 
 Support for SME can be determined through the CPUID instruction. The CPUID
 function 0x8000001f reports information related to SME:
@@ -17,13 +17,14 @@ function 0x8000001f reports information related to SME:
 	0x8000001f[eax]:
 		Bit[0] indicates support for SME
 	0x8000001f[ebx]:
-		Bit[5:0]  pagetable bit number used to activate memory
-			  encryption
-		Bit[11:6] reduction in physical address space, in bits, when
-			  memory encryption is enabled (this only affects system
-			  physical addresses, not guest physical addresses)
-
-If support for SME is present, MSR 0xc00100010 (SYS_CFG) can be used to
+		Bits[5:0]  pagetable bit number used to activate memory
+			   encryption
+		Bits[11:6] reduction in physical address space, in bits, when
+			   memory encryption is enabled (this only affects
+			   system physical addresses, not guest physical
+			   addresses)
+
+If support for SME is present, MSR 0xc0010010 (MSR_K8_SYSCFG) can be used to
 determine if SME is enabled and/or to enable memory encryption:
 
 	0xc0010010:
@@ -41,7 +42,7 @@ The state of SME in the Linux kernel can be documented as follows:
 	  The CPU supports SME (determined through CPUID instruction).
 
 	- Enabled:
-	  Supported and bit 23 of the SYS_CFG MSR is set.
+	  Supported and bit 23 of MSR_K8_SYSCFG is set.
 
 	- Active:
 	  Supported, Enabled and the Linux kernel is actively applying
@@ -51,7 +52,9 @@ The state of SME in the Linux kernel can be documented as follows:
 SME can also be enabled and activated in the BIOS. If SME is enabled and
 activated in the BIOS, then all memory accesses will be encrypted and it will
 not be necessary to activate the Linux memory encryption support.  If the BIOS
-merely enables SME (sets bit 23 of the SYS_CFG MSR), then Linux can activate
-memory encryption.  However, if BIOS does not enable SME, then Linux will not
-attempt to activate memory encryption, even if configured to do so by default
-or the mem_encrypt=on command line parameter is specified.
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
+memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
+not enable SME, then Linux will not be able to activate memory encryption, even
+if configured to do so by default or the mem_encrypt=on command line parameter
+is specified.
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index ea2de6a..d59c15c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,7 +28,6 @@ enum cpuid_leafs
 	CPUID_8000_000A_EDX,
 	CPUID_7_ECX,
 	CPUID_8000_0007_EBX,
-	CPUID_8000_001F_EAX,
 };
 
 #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,9 +78,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
 	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))
 
 #define DISABLED_MASK_BIT_SET(feature_bit)				\
 	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -102,9 +100,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
-	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
 	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))
 
 #define cpu_has(c, bit)							\
 	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 331fb81..b1a4468 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS	19	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
 #define NBUGINTS	1	/* N 32-bit bug flags */
 
 /*
@@ -187,6 +187,7 @@
  * Reuse free bits when adding new feature flags!
  */
 
+#define X86_FEATURE_SME		( 7*32+ 0) /* AMD Secure Memory Encryption */
 #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
 #define X86_FEATURE_CAT_L3	( 7*32+ 4) /* Cache Allocation Technology L3 */
@@ -296,9 +297,6 @@
 #define X86_FEATURE_SUCCOR	(17*32+1) /* Uncorrectable error containment and recovery */
 #define X86_FEATURE_SMCA	(17*32+3) /* Scalable MCA */
 
-/* AMD-defined CPU features, CPUID level 0x8000001f (eax), word 18 */
-#define X86_FEATURE_SME		(18*32+0) /* Secure Memory Encryption */
-
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 8b45e08..85599ad 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -57,7 +57,6 @@
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
 #define DISABLED_MASK17	0
-#define DISABLED_MASK18	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index 6847d85..fac9a5c 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,7 +100,6 @@
 #define REQUIRED_MASK15	0
 #define REQUIRED_MASK16	0
 #define REQUIRED_MASK17	0
-#define REQUIRED_MASK18	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
 #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 35a5d5d..6bddda3 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -615,6 +615,29 @@ static void early_init_amd(struct cpuinfo_x86 *c)
 	 */
 	if (cpu_has_amd_erratum(c, amd_erratum_400))
 		set_cpu_bug(c, X86_BUG_AMD_E400);
+
+	/*
+	 * BIOS support is required for SME. If BIOS has enabled SME then
+	 * adjust x86_phys_bits by the SME physical address space reduction
+	 * value. If BIOS has not enabled SME then don't advertise the
+	 * feature (set in scattered.c).
+	 */
+	if (c->extended_cpuid_level >= 0x8000001f) {
+		if (cpu_has(c, X86_FEATURE_SME)) {
+			u64 msr;
+
+			/* Check if SME is enabled */
+			rdmsrl(MSR_K8_SYSCFG, msr);
+			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT) {
+				unsigned int ebx;
+
+				ebx = cpuid_ebx(0x8000001f);
+				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
+			} else {
+				clear_cpu_cap(c, X86_FEATURE_SME);
+			}
+		}
+	}
 }
 
 static void init_amd_k8(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 358208d7..c188ae5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -763,29 +763,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	if (c->extended_cpuid_level >= 0x8000000a)
 		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
 
-	if (c->extended_cpuid_level >= 0x8000001f) {
-		cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
-
-		/* SME feature support */
-		if ((c->x86_vendor == X86_VENDOR_AMD) && (eax & 0x01)) {
-			u64 msr;
-
-			/*
-			 * For SME, BIOS support is required. If BIOS has
-			 * enabled SME adjust x86_phys_bits by the SME
-			 * physical address space reduction value. If BIOS
-			 * has not enabled SME don't advertise the feature.
-			 */
-			rdmsrl(MSR_K8_SYSCFG, msr);
-			if (msr & MSR_K8_SYSCFG_MEM_ENCRYPT)
-				c->x86_phys_bits -= (ebx >> 6) & 0x3f;
-			else
-				eax &= ~0x01;
-		}
-
-		c->x86_capability[CPUID_8000_001F_EAX] = eax;
-	}
-
 	init_scattered_cpuid_features(c);
 
 	/*
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index d979406..cabda87 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -30,6 +30,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
+	{ X86_FEATURE_SME,		CPUID_EAX,  0, 0x8000001f, 0 },
 	{ 0, 0, 0, 0, 0 }
 };
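
As an aside, the EBX fields of CPUID leaf 0x8000001f described in the
documentation hunk above can be decoded from user space as well; a minimal
sketch (illustrative only, not part of the patch):

#include <stdio.h>
#include <cpuid.h>	/* GCC/Clang wrapper for the CPUID instruction */

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 0x8000001f reports memory encryption capabilities. */
	if (!__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx))
		return 1;

	if (eax & 0x1) {			/* Bit[0]: SME supported */
		/* Bits[5:0]: page table bit used as the encryption bit */
		printf("C-bit position:      %u\n", ebx & 0x3f);
		/* Bits[11:6]: physical address space reduction, in bits */
		printf("Phys addr reduction: %u\n", (ebx >> 6) & 0x3f);
	}
	return 0;
}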

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature
  2017-03-06 18:11           ` Brijesh Singh
@ 2017-03-06 20:54             ` Borislav Petkov
  0 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-06 20:54 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Mon, Mar 06, 2017 at 01:11:03PM -0500, Brijesh Singh wrote:
> Sending it through stg mail to avoid line wrapping. Please let me know if something
> is still messed up. I have tried applying it and it seems to apply okay.

Yep, thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
  2017-03-03 21:15       ` Tom Lendacky
@ 2017-03-07  0:03         ` Bjorn Helgaas
  0 siblings, 0 replies; 424+ messages in thread
From: Bjorn Helgaas @ 2017-03-07  0:03 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-efi, Brijesh Singh, labbott, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr,
	mchehab, simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc,
	mingo, msalter, ross.zwisler, bp, dyoung, jroedel, keescook,
	arnd, toshi.kani, mathieu.desnoyers, luto, pbonzini, bhelgaas,
	dan.j.williams, andriy.shevchenko, akpm, herbert

On Fri, Mar 03, 2017 at 03:15:34PM -0600, Tom Lendacky wrote:
> On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:
> >On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:
> >>From: Tom Lendacky <thomas.lendacky@amd.com>
> >>
> >>The use of ioremap will force the setup data to be mapped decrypted even
> >>though setup data is encrypted.  Switch to using memremap which will be
> >>able to perform the proper mapping.
> >
> >How should callers decide whether to use ioremap() or memremap()?
> >
> >memremap() existed before SME and SEV, and this code is used even if
> >SME and SEV aren't supported, so the rationale for this change should
> >not need the decryption argument.
> 
> When SME or SEV is active an ioremap() will remove the encryption bit
> from the pagetable entry when it is mapped.  This allows MMIO, which
> doesn't support SME/SEV, to be performed successfully.  So my take is
> that ioremap() should be used for MMIO and memremap() for pages in RAM.

OK, thanks.  The commit message should say something like "this is
RAM, not MMIO, so we should map it with memremap(), not ioremap()".
That's the part that determines whether the change is correct.

You can mention the encryption part, too, but it's definitely
secondary because the change has to make sense on its own, without
SME/SEV.
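
To make the distinction concrete, the two cases look roughly like this
(sketch only; the MMIO half with a hypothetical pci_dev is not from this
patch):

	/* RAM-backed boot data: memremap() preserves the encryption
	 * attribute, so encrypted setup data is read correctly. */
	void *data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
	if (data) {
		/* ... consume the setup data ... */
		memunmap(data);
	}

	/* Device MMIO: ioremap() clears the encryption bit, since the
	 * device does not participate in SME/SEV and must see the bytes
	 * unencrypted. */
	void __iomem *regs = ioremap(pci_resource_start(pdev, 0),
				     pci_resource_len(pdev, 0));
	if (regs) {
		/* ... readl()/writel() against regs ... */
		iounmap(regs);
	}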

The following commits (from https://github.com/codomania/tip/branches)
all do basically the same thing so the changelogs (and summaries)
should all be basically the same:

  cb0d0d1eb0a6 x86: Change early_ioremap to early_memremap for BOOT data
  91acb68b8333 x86/pci: Use memremap when walking setup data
  4f687503e23f x86: Access the setup data through sysfs decrypted
  e90246b8c229 x86: Access the setup data through debugfs decrypted

I would collect them all together and move them to the beginning of
your series, since they don't depend on anything else.

Also, change "x86/pci: " to "x86/PCI" so it matches the previous
convention.

> >>Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> >>---
> >> arch/x86/pci/common.c |    4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> >>index a4fdfa7..0b06670 100644
> >>--- a/arch/x86/pci/common.c
> >>+++ b/arch/x86/pci/common.c
> >>@@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
> >>
> >> 	pa_data = boot_params.hdr.setup_data;
> >> 	while (pa_data) {
> >>-		data = ioremap(pa_data, sizeof(*rom));
> >>+		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
> >
> >I can't quite connect the dots here.  ioremap() on x86 would do
> >ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
> >which is ioremap_cache().  Is making a cacheable mapping the important
> >difference?
> 
> The memremap(MEMREMAP_WB) will actually check to see if it can perform
> a __va(pa_data) in try_ram_remap() and then fall back to the
> arch_memremap_wb().  So it's actually the __va() vs the ioremap_cache()
> that is the difference.
> 
> Thanks,
> Tom
> 
> >
> >> 		if (!data)
> >> 			return -ENOMEM;
> >>
> >>@@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
> >> 			}
> >> 		}
> >> 		pa_data = data->next;
> >>-		iounmap(data);
> >>+		memunmap(data);
> >> 	}
> >> 	set_dma_domain_ops(dev);
> >> 	set_dev_domain_options(dev);
> >>

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 04/32] KVM: SVM: Add SEV feature definitions to KVM
  2017-03-02 15:12   ` Brijesh Singh
@ 2017-03-07  0:50     ` Borislav Petkov
  0 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-07  0:50 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:12:48AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Define a new KVM CPU feature for Secure Encrypted Virtualization (SEV).
> The kernel will check for the presence of this feature to determine if
> it is running with SEV active.
> 
> Define the SEV enable bit for the VMCB control structure. The hypervisor
> will use this bit to enable SEV in the guest.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/include/asm/svm.h           |    1 +
>  arch/x86/include/uapi/asm/kvm_para.h |    1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 2aca535..fba2a7b 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -137,6 +137,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>  #define SVM_VM_CR_SVM_DIS_MASK  0x0010ULL
>  
>  #define SVM_NESTED_CTL_NP_ENABLE	BIT(0)
> +#define SVM_NESTED_CTL_SEV_ENABLE	BIT(1)
>  
>  struct __attribute__ ((__packed__)) vmcb_seg {
>  	u16 selector;
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 1421a65..bc2802f 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -24,6 +24,7 @@
>  #define KVM_FEATURE_STEAL_TIME		5
>  #define KVM_FEATURE_PV_EOI		6
>  #define KVM_FEATURE_PV_UNHALT		7
> +#define KVM_FEATURE_SEV			8

This looks like it needs documenting in Documentation/virtual/kvm/cpuid.txt
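
(For context: a guest consumes paravirtual feature bits like this one via
the KVM CPUID feature leaf, so a guest-side check would look roughly like
the following sketch, which is not code from this series:)

	#include <asm/kvm_para.h>

	static bool sev_guest(void)
	{
		/* kvm_arch_para_features() returns EAX of the KVM feature
		 * CPUID leaf (0x40000001), where these bits are advertised. */
		return kvm_arch_para_features() & (1 << KVM_FEATURE_SEV);
	}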

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 05/32] x86: Use encrypted access of BOOT related data with SEV
  2017-03-02 15:12   ` Brijesh Singh
@ 2017-03-07 11:09     ` Borislav Petkov
  0 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-07 11:09 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:12:59AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> When Secure Encrypted Virtualization (SEV) is active, BOOT data (such as
> EFI related data, setup data) is encrypted and needs to be accessed as
> such when mapped. Update the architecture override in early_memremap to
> keep the encryption attribute when mapping this data.

This could also explain why persistent memory needs to be accessed
decrypted with SEV.

In general, explain what the difference in that aspect is with respect to
SME, and I'd write that in the comment over the function. And don't just
say "E820 areas are checked in making this determination." because that is
visible; say *why* we need to check those ranges and determine access
depending on their type.
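
A sketch of the kind of range-type check being asked for (hypothetical
helper; it assumes the e820_any_mapped() query available at the time):

	static pgprot_t __init early_boot_memremap_prot(resource_size_t phys,
							unsigned long size,
							pgprot_t prot)
	{
		/*
		 * Boot data (setup_data, EFI tables) lives in E820 RAM and
		 * was written encrypted by the SEV guest, so keep _PAGE_ENC
		 * for it.  Reserved and persistent-memory ranges, like MMIO,
		 * must be mapped decrypted.
		 */
		if (sev_active() &&
		    e820_any_mapped(phys, phys + size, E820_RAM))
			return __pgprot(pgprot_val(prot) | _PAGE_ENC);

		return prot;
	}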

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 02/32] x86: Secure Encrypted Virtualization (SEV) support
@ 2017-03-07 11:19     ` Borislav Petkov
  0 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-07 11:19 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott, tony.luck, alexandre.bounine,
	kuleshovmail, linux-kernel, mcgrof, mst, linux-crypto, tj,
	pbonzini, akpm, davem

On Thu, Mar 02, 2017 at 10:12:20AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Provide support for Secure Encrypted Virtualization (SEV). This initial
> support defines a flag that is used by the kernel to determine if it is
> running with SEV active.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

Btw,

you need to add your Signed-off-by here after Tom's to denote that
you're handing that patch forward.
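
That is, the tag block of the patch would then read:

	Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
	Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>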

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 07/32] x86/efi: Access EFI data as encrypted when SEV is active
  2017-03-02 15:13   ` Brijesh Singh
@ 2017-03-07 11:57     ` Borislav Petkov
  0 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-07 11:57 UTC (permalink / raw)
  To: Brijesh Singh, Matt Fleming
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:13:21AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> EFI data is encrypted when the kernel is run under SEV. Update the
> page table references to be sure the EFI memory areas are accessed
> encrypted.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

This SOB chain looks good.

> ---
>  arch/x86/platform/efi/efi_64.c |   15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index 2d8674d..9a76ed8 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -45,6 +45,7 @@
>  #include <asm/realmode.h>
>  #include <asm/time.h>
>  #include <asm/pgalloc.h>
> +#include <asm/mem_encrypt.h>
>  
>  /*
>   * We allocate runtime services regions bottom-up, starting from -4G, i.e.
> @@ -286,7 +287,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>  	 * as trim_bios_range() will reserve the first page and isolate it away
>  	 * from memory allocators anyway.
>  	 */
> -	if (kernel_map_pages_in_pgd(pgd, 0x0, 0x0, 1, _PAGE_RW)) {
> +	pf = _PAGE_RW;
> +	if (sev_active())
> +		pf |= _PAGE_ENC;
> +	if (kernel_map_pages_in_pgd(pgd, 0x0, 0x0, 1, pf)) {
>  		pr_err("Failed to create 1:1 mapping for the first page!\n");
>  		return 1;
>  	}
> @@ -329,6 +333,9 @@ static void __init __map_region(efi_memory_desc_t *md, u64 va)
>  	if (!(md->attribute & EFI_MEMORY_WB))
>  		flags |= _PAGE_PCD;
>  
> +	if (sev_active())
> +		flags |= _PAGE_ENC;
> +

So I'm wondering if we could avoid this sprinkling of _PAGE_ENC in the
EFI code by defining something like __supported_pte_mask but called
__efi_base_page_flags or so, which has _PAGE_ENC cleared in the SME
case, i.e., on baremetal, and set in the SEV case.

Then we could simply OR in __efi_base_page_flags which the SME/SEV code
will set appropriately early enough.

Hmm.

Matt, what do you think?
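
Something like this, maybe - a rough sketch only; the variable name and
the init hook are made up:

	/* Set early by the SME/SEV setup code; name is a placeholder. */
	pteval_t __efi_base_page_flags;

	void __init efi_base_page_flags_init(void)
	{
		/* EFI data is encrypted under SEV but not under SME. */
		if (sev_active())
			__efi_base_page_flags = _PAGE_ENC;
	}

and then efi_setup_page_tables() would only need

	pf = _PAGE_RW | __efi_base_page_flags;

with __map_region() correspondingly doing

	flags |= __efi_base_page_flags;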

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
  2017-03-02 15:13   ` Brijesh Singh
@ 2017-03-07 14:59     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-07 14:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> In order for memory pages to be properly mapped when SEV is active, we
> need to use the PAGE_KERNEL protection attribute as the base protection.
> This will ensure that memory mapping of, e.g., ACPI tables, receives the
> proper mapping attributes.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---

> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c400ab5..481c999 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>                 pcm = new_pcm;
>         }
> 
> +       /*
> +        * If the page being mapped is in memory and SEV is active then
> +        * make sure the memory encryption attribute is enabled in the
> +        * resulting mapping.
> +        */
>         prot = PAGE_KERNEL_IO;
> +       if (sev_active() && page_is_mem(pfn))

Hmm, a resource tree walk per ioremap call. This could get expensive for
ioremap-heavy workloads.

__ioremap_caller() gets called here during boot 55 times so not a whole
lot but I wouldn't be surprised if there were some nasty use cases which
ioremap a lot.

...

> diff --git a/kernel/resource.c b/kernel/resource.c
> index 9b5f044..db56ba3 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
>  }
>  EXPORT_SYMBOL_GPL(page_is_ram);
>  
> +/*
> + * This function returns true if the target memory is marked as
> + * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
> + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
> + */
> +static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
> +{
> +	struct resource res;
> +	unsigned long pfn, end_pfn;
> +	u64 orig_end;
> +	int ret = -1;
> +
> +	res.start = (u64) start_pfn << PAGE_SHIFT;
> +	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
> +	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +	orig_end = res.end;
> +	while ((res.start < res.end) &&
> +		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
> +		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +		end_pfn = (res.end + 1) >> PAGE_SHIFT;
> +		if (end_pfn > pfn)
> +			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
> +		if (ret)
> +			break;
> +		res.start = res.end + 1;
> +		res.end = orig_end;
> +	}
> +	return ret;
> +}

So the relevant difference between this one and walk_system_ram_range()
is this:

-			ret = (*func)(pfn, end_pfn - pfn, arg);
+			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;

so it seems to me you can have your own *func() pointer which does that
IORES_DESC_NONE comparison. And then you can define your own workhorse
__walk_memory_range() which gets called by both walk_mem_range() and
walk_system_ram_range() instead of almost duplicating them.

And looking at walk_system_ram_res(), that one looks similar too except
the pfn computation. But AFAICT the pfn/end_pfn things are computed from
res.start and res.end so it looks to me like all those three functions
are crying for unification...
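
Roughly this shape, I mean - an untested sketch with the callback
signature simplified:

	static int __walk_memory_range(u64 start, u64 end, unsigned long desc,
				       void *arg,
				       int (*func)(struct resource *, void *))
	{
		struct resource res = {
			.start = start,
			.end   = end,
			.flags = IORESOURCE_MEM | IORESOURCE_BUSY,
		};
		u64 orig_end = end;
		int ret = -1;

		while (res.start < res.end &&
		       find_next_iomem_res(&res, desc, true) >= 0) {
			ret = (*func)(&res, arg);
			if (ret)
				break;
			res.start = res.end + 1;
			res.end = orig_end;
		}
		return ret;
	}

	/* walk_mem_range() then boils down to a trivial callback: */
	static int __is_mem_range(struct resource *res, void *arg)
	{
		return (res->desc != IORES_DESC_NONE) ? 1 : 0;
	}

with walk_system_ram_range() and walk_system_ram_res() doing their pfn
computations in their respective callbacks.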

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 09/32] x86: Change early_ioremap to early_memremap for BOOT data
  2017-03-02 15:13   ` Brijesh Singh
@ 2017-03-08  8:46     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-08  8:46 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:13:53AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> In order to map BOOT data with the proper encryption bit, the

Btw, what does that all-caps spelling "BOOT" denote? Something I'm
missing?

> early_ioremap() function calls are changed to early_memremap() calls.
> This allows the proper access for both SME and SEV.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/kernel/acpi/boot.c |    4 ++--
>  arch/x86/kernel/mpparse.c   |   10 +++++-----
>  drivers/sfi/sfi_core.c      |    6 +++---
>  3 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 35174c6..468c25a 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -124,7 +124,7 @@ char *__init __acpi_map_table(unsigned long phys, unsigned long size)
>  	if (!phys || !size)
>  		return NULL;
>  
> -	return early_ioremap(phys, size);
> +	return early_memremap(phys, size);

Right, the question will keep popping up why we can simply replace
ioremap with memremap and what the general difference is wrt SME/SEV.
So it would be a good idea to have a comment in, say,
arch/x86/mm/ioremap.c, explaining the general situation.
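
Something along these lines, say - wording is mine, just to show the
shape:

	/*
	 * ioremap()/early_ioremap() are meant for IO/device memory
	 * and never map encrypted. memremap()/early_memremap() are
	 * meant for RAM and apply the proper encryption attributes
	 * when SME or SEV is active. Boot data like ACPI and MP
	 * tables is RAM, not IO, so it has to go through
	 * early_memremap().
	 */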

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 10/32] x86: DMA support for SEV memory encryption
  2017-03-02 15:14   ` Brijesh Singh
@ 2017-03-08 10:56     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-08 10:56 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:14:25AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> DMA access to memory mapped as encrypted while SEV is active cannot be
> encrypted during device write or decrypted during device read. In order
> for DMA to properly work when SEV is active, the swiotlb bounce buffers
> must be used.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/mm/mem_encrypt.c |   77 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 77 insertions(+)
> 
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 090419b..7df5f4c 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -197,8 +197,81 @@ void __init sme_early_init(void)
>  	/* Update the protection map with memory encryption mask */
>  	for (i = 0; i < ARRAY_SIZE(protection_map); i++)
>  		protection_map[i] = pgprot_encrypted(protection_map[i]);
> +
> +	if (sev_active())
> +		swiotlb_force = SWIOTLB_FORCE;
> +}
> +
> +static void *sme_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> +		       gfp_t gfp, unsigned long attrs)
> +{
> +	unsigned long dma_mask;
> +	unsigned int order;
> +	struct page *page;
> +	void *vaddr = NULL;
> +
> +	dma_mask = dma_alloc_coherent_mask(dev, gfp);
> +	order = get_order(size);
> +
> +	gfp &= ~__GFP_ZERO;

Please add a comment around here that swiotlb_alloc_coherent() will
memset(, 0, ) the memory. It took me a while to figure out what the
situation is.

Also, Joerg says the __GFP_ZERO is not absolutely necessary but it has
not been fixed in the other DMA alloc* functions because of fears that
something would break. That bit could also be part of the comment.
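
I.e., something like:

	/*
	 * Don't pass __GFP_ZERO down: if the allocation below fails
	 * or doesn't fit the DMA mask, we fall back to
	 * swiotlb_alloc_coherent() which memset()s the buffer to
	 * zero itself. (Per Joerg, __GFP_ZERO is not absolutely
	 * necessary here at all, but the other DMA alloc* functions
	 * keep it for fear of breaking something.)
	 */
	gfp &= ~__GFP_ZERO;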

> +
> +	page = alloc_pages_node(dev_to_node(dev), gfp, order);
> +	if (page) {
> +		dma_addr_t addr;
> +
> +		/*
> +		 * Since we will be clearing the encryption bit, check the
> +		 * mask with it already cleared.
> +		 */
> +		addr = phys_to_dma(dev, page_to_phys(page)) & ~sme_me_mask;
> +		if ((addr + size) > dma_mask) {
> +			__free_pages(page, get_order(size));
> +		} else {
> +			vaddr = page_address(page);
> +			*dma_handle = addr;
> +		}
> +	}
> +
> +	if (!vaddr)
> +		vaddr = swiotlb_alloc_coherent(dev, size, dma_handle, gfp);
> +
> +	if (!vaddr)
> +		return NULL;
> +
> +	/* Clear the SME encryption bit for DMA use if not swiotlb area */
> +	if (!is_swiotlb_buffer(dma_to_phys(dev, *dma_handle))) {
> +		set_memory_decrypted((unsigned long)vaddr, 1 << order);
> +		*dma_handle &= ~sme_me_mask;
> +	}
> +
> +	return vaddr;
>  }
>  
> +static void sme_free(struct device *dev, size_t size, void *vaddr,
> +		     dma_addr_t dma_handle, unsigned long attrs)
> +{
> +	/* Set the SME encryption bit for re-use if not swiotlb area */
> +	if (!is_swiotlb_buffer(dma_to_phys(dev, dma_handle)))
> +		set_memory_encrypted((unsigned long)vaddr,
> +				     1 << get_order(size));
> +
> +	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
> +}
> +
> +static struct dma_map_ops sme_dma_ops = {

WARNING: struct dma_map_ops should normally be const
#112: FILE: arch/x86/mm/mem_encrypt.c:261:
+static struct dma_map_ops sme_dma_ops = {

Please integrate scripts/checkpatch.pl in your patch creation workflow.
Some of the warnings/errors *actually* make sense.


> +	.alloc                  = sme_alloc,
> +	.free                   = sme_free,
> +	.map_page               = swiotlb_map_page,
> +	.unmap_page             = swiotlb_unmap_page,
> +	.map_sg                 = swiotlb_map_sg_attrs,
> +	.unmap_sg               = swiotlb_unmap_sg_attrs,
> +	.sync_single_for_cpu    = swiotlb_sync_single_for_cpu,
> +	.sync_single_for_device = swiotlb_sync_single_for_device,
> +	.sync_sg_for_cpu        = swiotlb_sync_sg_for_cpu,
> +	.sync_sg_for_device     = swiotlb_sync_sg_for_device,
> +	.mapping_error          = swiotlb_dma_mapping_error,
> +};
> +
>  /* Architecture __weak replacement functions */
>  void __init mem_encrypt_init(void)
>  {
> @@ -208,6 +281,10 @@ void __init mem_encrypt_init(void)
>  	/* Call into SWIOTLB to update the SWIOTLB DMA buffers */
>  	swiotlb_update_mem_attributes();
>  
> +	/* Use SEV DMA operations if SEV is active */

That's obvious. The WHY is not.
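
Something like this, maybe:

	/*
	 * Devices cannot DMA to/from memory encrypted with the
	 * guest key, so when SEV is active, route DMA through the
	 * unencrypted swiotlb bounce buffers via sme_dma_ops.
	 */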

> +	if (sev_active())
> +		dma_ops = &sme_dma_ops;
> +
>  	pr_info("AMD Secure Memory Encryption (SME) active\n");
>  }
>  
> 

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 02/32] x86: Secure Encrypted Virtualization (SEV) support
  2017-03-02 15:12   ` Brijesh Singh
@ 2017-03-08 15:06     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-08 15:06 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:12:20AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Provide support for Secure Encrypted Virtualization (SEV). This initial
> support defines a flag that is used by the kernel to determine if it is
> running with SEV active.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/include/asm/mem_encrypt.h |   14 +++++++++++++-
>  arch/x86/mm/mem_encrypt.c          |    3 +++
>  include/linux/mem_encrypt.h        |    6 ++++++
>  3 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 1fd5426..9799835 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -20,10 +20,16 @@
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
>  
>  extern unsigned long sme_me_mask;
> +extern unsigned int sev_enabled;

So there's a function named sev_enabled() and an int sev_enabled too.

It looks to me like you want to call the function "sev_enable()" -
similar to sme_enable(), convert it to C code - i.e., I don't see what
would speak against it - and rename that sev_enc_bit to sev_enabled and
use it everywhere when testing SEV status.
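
IOW, the end state would look something like this (sketch only):

	/* set early by sev_enable(); was: sev_enc_bit */
	extern unsigned int sev_enabled;

	static inline bool sme_active(void)
	{
		return sme_me_mask && !sev_enabled;
	}

	static inline bool sev_active(void)
	{
		return sme_me_mask && sev_enabled;
	}

(&& already yields a bool, so the "? true : false" can go too.)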

>  static inline bool sme_active(void)
>  {
> -	return (sme_me_mask) ? true : false;
> +	return (sme_me_mask && !sev_enabled) ? true : false;
> +}
> +
> +static inline bool sev_active(void)
> +{
> +	return (sme_me_mask && sev_enabled) ? true : false;

Then, those read strange: like SME and SEV are mutually exclusive. Why?
I might have an idea but I'd like for you to confirm it :-)

Then, you're calling sev_enabled() in startup_32() but we can enter
at arch/x86/boot/compressed/head_64.S::startup_64() too, when we're
loaded by a 64-bit bootloader, which would then theoretically bypass
sev_enabled().

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-02 15:14   ` Brijesh Singh
@ 2017-03-09 14:07     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-09 14:07 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas

On Thu, Mar 02, 2017 at 10:14:48AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Early in the boot process, add checks to determine if the kernel is
> running with Secure Encrypted Virtualization (SEV) active by issuing
> a CPUID instruction.
> 
> During early compressed kernel booting, if SEV is active, the pagetables are
> updated so that data is accessed and decompressed with encryption.
> 
> During uncompressed kernel booting, if SEV is active, the memory encryption
> mask is set and a flag is set to indicate that SEV is enabled.

I don't know how many times I have to say this but I'm going to keep
doing it until it sticks: :-)

Please, no "WHAT" in the commit messages - I can see the "WHAT - but
"WHY".

Ok?

> diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
> new file mode 100644
> index 0000000..8313c31
> --- /dev/null
> +++ b/arch/x86/boot/compressed/mem_encrypt.S
> @@ -0,0 +1,75 @@
> +/*
> + * AMD Memory Encryption Support
> + *
> + * Copyright (C) 2016 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky <thomas.lendacky@amd.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/linkage.h>
> +
> +#include <asm/processor-flags.h>
> +#include <asm/msr.h>
> +#include <asm/asm-offsets.h>
> +#include <uapi/asm/kvm_para.h>
> +
> +	.text
> +	.code32
> +ENTRY(sev_enabled)
> +	xor	%eax, %eax
> +
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	push	%ebx
> +	push	%ecx
> +	push	%edx
> +
> +	/* Check if running under a hypervisor */
> +	movl	$0x40000000, %eax
> +	cpuid
> +	cmpl	$0x40000001, %eax
> +	jb	.Lno_sev
> +
> +	movl	$0x40000001, %eax
> +	cpuid
> +	bt	$KVM_FEATURE_SEV, %eax
> +	jnc	.Lno_sev
> +
> +	/*
> +	 * Check for memory encryption feature:
> +	 *   CPUID Fn8000_001F[EAX] - Bit 0
> +	 */
> +	movl	$0x8000001f, %eax
> +	cpuid
> +	bt	$0, %eax
> +	jnc	.Lno_sev
> +
> +	/*
> +	 * Get memory encryption information:
> +	 *   CPUID Fn8000_001F[EBX] - Bits 5:0
> +	 *     Pagetable bit position used to indicate encryption
> +	 */
> +	movl	%ebx, %eax
> +	andl	$0x3f, %eax
> +	movl	%eax, sev_enc_bit(%ebp)
> +	jmp	.Lsev_exit
> +
> +.Lno_sev:
> +	xor	%eax, %eax
> +
> +.Lsev_exit:
> +	pop	%edx
> +	pop	%ecx
> +	pop	%ebx
> +
> +#endif	/* CONFIG_AMD_MEM_ENCRYPT */
> +
> +	ret
> +ENDPROC(sev_enabled)

Right, as said in another mail earlier, this could be written in C. And
then the sme_enable() piece below looks the same as this one above. So
since you want to run it before kernel decompression and after, you
could extract this code into a separate .c file which you can link in
both places, similar to what we do with verify_cpu with the difference
that verify_cpu is getting included.

Alternatively, we still have some room in setup_header.xloadflags to
pass boot info to kernel proper from before the decompression stage.

But I'd prefer linking with both stages as it is cheaper and those flags
we can use for something which really wants to use a flag like that.
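
I.e., a rough shape like this (filename and function name made up here):

	/*
	 * arch/x86/kernel/mem_encrypt_identify.c - built once, linked into
	 * both the compressed stub and kernel proper, instead of being
	 * #include'd the way verify_cpu is.
	 */
	unsigned int sev_get_enc_bit(void)
	{
		unsigned int eax, ebx, ecx, edx;

		eax = 0x8000001f;
		ecx = 0;
		native_cpuid(&eax, &ebx, &ecx, &edx);

		/* CPUID Fn8000_001F[EAX], bit 0: memory encryption support */
		if (!(eax & BIT(0)))
			return 0;

		/* CPUID Fn8000_001F[EBX], bits 5:0: C-bit position */
		return ebx & 0x3f;
	}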

> diff --git a/arch/x86/kernel/mem_encrypt_init.c b/arch/x86/kernel/mem_encrypt_init.c
> index 35c5e3d..5d514e6 100644
> --- a/arch/x86/kernel/mem_encrypt_init.c
> +++ b/arch/x86/kernel/mem_encrypt_init.c
> @@ -22,6 +22,7 @@
>  #include <asm/processor-flags.h>
>  #include <asm/msr.h>
>  #include <asm/cmdline.h>
> +#include <asm/kvm_para.h>
>  
>  static char sme_cmdline_arg_on[] __initdata = "mem_encrypt=on";
>  static char sme_cmdline_arg_off[] __initdata = "mem_encrypt=off";
> @@ -232,6 +233,29 @@ unsigned long __init sme_enable(void *boot_data)
>  	void *cmdline_arg;
>  	u64 msr;
>  
> +	/* Check if running under a hypervisor */
> +	eax = 0x40000000;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +	if (eax > 0x40000000) {
> +		eax = 0x40000001;
> +		ecx = 0;
> +		native_cpuid(&eax, &ebx, &ecx, &edx);
> +		if (!(eax & BIT(KVM_FEATURE_SEV)))
> +			goto out;
> +
> +		eax = 0x8000001f;
> +		ecx = 0;
> +		native_cpuid(&eax, &ebx, &ecx, &edx);
> +		if (!(eax & 1))
> +			goto out;
> +
> +		sme_me_mask = 1UL << (ebx & 0x3f);
> +		sev_enabled = 1;
> +
> +		goto out;
> +	}
> +
>  	/* Check for an AMD processor */
>  	eax = 0;
>  	ecx = 0;
> 

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-09 14:07     ` Borislav Petkov
@ 2017-03-09 16:13       ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-09 16:13 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas



On 09/03/2017 15:07, Borislav Petkov wrote:
> +	/* Check if running under a hypervisor */
> +	eax = 0x40000000;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);

This is not how you check if running under a hypervisor; you should
check the HYPERVISOR bit, i.e. bit 31 of cpuid(1).ecx.  This in turn
tells you if leaf 0x40000000 is valid.
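
Concretely, something like this (a minimal sketch, reusing the
native_cpuid() calls from the patch):

	eax = 1;
	ecx = 0;
	native_cpuid(&eax, &ebx, &ecx, &edx);
	if (!(ecx & BIT(31)))
		goto out;	/* bare metal: leaf 0x40000000 is undefined */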

That said, the main issue with this function is that it hardcodes the
behavior for KVM.  It is possible that another hypervisor defines its
0x40000001 leaf in such a way that KVM_FEATURE_SEV has a different meaning.

Instead, AMD should define a "well-known" bit in its own space (i.e.
0x800000xx) that is only used by hypervisors that support SEV.  This is
similar to how Intel defined one bit in leaf 1 to say "is leaf
0x40000000 valid".

Thanks,

Paolo

> +	if (eax > 0x40000000) {
> +		eax = 0x40000001;
> +		ecx = 0;
> +		native_cpuid(&eax, &ebx, &ecx, &edx);
> +		if (!(eax & BIT(KVM_FEATURE_SEV)))
> +			goto out;
> +
> +		eax = 0x8000001f;
> +		ecx = 0;
> +		native_cpuid(&eax, &ebx, &ecx, &edx);
> +		if (!(eax & 1))
> +			goto out;
> +
> +		sme_me_mask = 1UL << (ebx & 0x3f);
> +		sev_enabled = 1;
> +
> +		goto out;
> +	}
> +


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-09 16:13       ` Paolo Bonzini
@ 2017-03-09 16:29         ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-09 16:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Thu, Mar 09, 2017 at 05:13:33PM +0100, Paolo Bonzini wrote:
> This is not how you check if running under a hypervisor; you should
> check the HYPERVISOR bit, i.e. bit 31 of cpuid(1).ecx.  This in turn
> tells you if leaf 0x40000000 is valid.

Ah, good point, I already do that in the microcode loader :)

        /*
         * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
         * completely accurate as xen pv guests don't see that CPUID bit set but
         * that's good enough as they don't land on the BSP path anyway.
         */
        if (native_cpuid_ecx(1) & BIT(31))
                return *res;

> That said, the main issue with this function is that it hardcodes the
> behavior for KVM.  It is possible that another hypervisor defines its
> 0x40000001 leaf in such a way that KVM_FEATURE_SEV has a different meaning.
> 
> Instead, AMD should define a "well-known" bit in its own space (i.e.
> 0x800000xx) that is only used by hypervisors that support SEV.  This is
> similar to how Intel defined one bit in leaf 1 to say "is leaf
> 0x40000000 valid".
> 
> > +	if (eax > 0x40000000) {
> > +		eax = 0x40000001;
> > +		ecx = 0;
> > +		native_cpuid(&eax, &ebx, &ecx, &edx);
> > +		if (!(eax & BIT(KVM_FEATURE_SEV)))
> > +			goto out;
> > +
> > +		eax = 0x8000001f;
> > +		ecx = 0;
> > +		native_cpuid(&eax, &ebx, &ecx, &edx);
> > +		if (!(eax & 1))

Right, so this is testing CPUID_0x8000001f_EAX(0)[0], SME. Why not
simply set that bit for the guest too, in kvm?
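
Sketch of what I mean, in the svm cpuid update path (untested;
sev_active_in_vmcb() is a made-up helper name):

	sev_info = kvm_find_cpuid_entry(vcpu, 0x8000001f, 0);
	if (sev_info && sev_active_in_vmcb(svm))
		sev_info->eax |= BIT(0);	/* mirror the host SME bit */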

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 13/32] KVM: SVM: Enable SEV by setting the SEV_ENABLE CPU feature
  2017-03-02 15:15   ` Brijesh Singh
@ 2017-03-09 19:29     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-09 19:29 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas

On Thu, Mar 02, 2017 at 10:15:01AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Modify the SVM cpuid update function to indicate if Secure Encrypted
> Virtualization (SEV) is active in the guest by setting the SEV KVM CPU
> features bit. SEV is active if Secure Memory Encryption is enabled in
> the host and the SEV_ENABLE bit of the VMCB is set.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/kvm/cpuid.c |    4 +++-
>  arch/x86/kvm/svm.c   |   18 ++++++++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 1639de8..e0c40a8 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -601,7 +601,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		entry->edx = 0;
>  		break;
>  	case 0x80000000:
> -		entry->eax = min(entry->eax, 0x8000001a);
> +		entry->eax = min(entry->eax, 0x8000001f);
>  		break;
>  	case 0x80000001:
>  		entry->edx &= kvm_cpuid_8000_0001_edx_x86_features;
> @@ -634,6 +634,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		break;
>  	case 0x8000001d:
>  		break;
> +	case 0x8000001f:
> +		break;

I guess those three cases can be unified:

        case 0x8000001a:
        case 0x8000001d:
        case 0x8000001f:
                break;

...

> +	sev_info = kvm_find_cpuid_entry(vcpu, 0x8000001f, 0);
> +	if (!sev_info)
> +		return;
> +
> +	if (ca->nested_ctl & SVM_NESTED_CTL_SEV_ENABLE) {
> +		features->eax |= (1 << KVM_FEATURE_SEV);
> +		cpuid(0x8000001f, &sev_info->eax, &sev_info->ebx,
> +		      &sev_info->ecx, &sev_info->edx);
> +	}

Right, as already mentioned in the previous mail: can we communicate SEV
status to the guest solely through the 0x8000001f leaf? Then we won't
need KVM_FEATURE_SEV and this way we'll be hypervisor-agnostic, as Paolo
suggested.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-02 15:15   ` Brijesh Singh
@ 2017-03-10 11:06     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-10 11:06 UTC (permalink / raw)
  To: Brijesh Singh, Paolo Bonzini
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Thu, Mar 02, 2017 at 10:15:15AM -0500, Brijesh Singh wrote:
> If kernel_maps_pages_in_pgd is called early in boot process to change the

kernel_map_pages_in_pgd()

> memory attributes then it fails to allocate memory when splitting large
> pages. The patch extends the cpa_data to provide the support to use
> memblock_alloc when slab allocator is not available.
> 
> The feature will be used in Secure Encrypted Virtualization (SEV) mode,
> where we may need to change the memory region attributes in early boot
> process.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/mm/pageattr.c |   51 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 42 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index 46cc89d..9e4ab3b 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -14,6 +14,7 @@
>  #include <linux/gfp.h>
>  #include <linux/pci.h>
>  #include <linux/vmalloc.h>
> +#include <linux/memblock.h>
>  
>  #include <asm/e820/api.h>
>  #include <asm/processor.h>
> @@ -37,6 +38,7 @@ struct cpa_data {
>  	int		flags;
>  	unsigned long	pfn;
>  	unsigned	force_split : 1;
> +	unsigned	force_memblock :1;
>  	int		curpage;
>  	struct page	**pages;
>  };
> @@ -627,9 +629,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
>  
>  static int
>  __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
> -		   struct page *base)
> +		  pte_t *pbase, unsigned long new_pfn)
>  {
> -	pte_t *pbase = (pte_t *)page_address(base);
>  	unsigned long ref_pfn, pfn, pfninc = 1;
>  	unsigned int i, level;
>  	pte_t *tmp;
> @@ -646,7 +647,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>  		return 1;
>  	}
>  
> -	paravirt_alloc_pte(&init_mm, page_to_pfn(base));
> +	paravirt_alloc_pte(&init_mm, new_pfn);
>  
>  	switch (level) {
>  	case PG_LEVEL_2M:
> @@ -707,7 +708,8 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>  	 * pagetable protections, the actual ptes set above control the
>  	 * primary protection behavior:
>  	 */
> -	__set_pmd_pte(kpte, address, mk_pte(base, __pgprot(_KERNPG_TABLE)));
> +	__set_pmd_pte(kpte, address,
> +		native_make_pte((new_pfn << PAGE_SHIFT) + _KERNPG_TABLE));
>  
>  	/*
>  	 * Intel Atom errata AAH41 workaround.
> @@ -723,21 +725,50 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>  	return 0;
>  }
>  
> +static pte_t *try_alloc_pte(struct cpa_data *cpa, unsigned long *pfn)
> +{
> +	unsigned long phys;
> +	struct page *base;
> +
> +	if (cpa->force_memblock) {
> +		phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);

Maybe there's a reason this fires:

WARNING: modpost: Found 2 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'

WARNING: vmlinux.o(.text+0x48edc): Section mismatch in reference from the function __change_page_attr() to the function .init.text:memblock_alloc()
The function __change_page_attr() references
the function __init memblock_alloc().
This is often because __change_page_attr lacks a __init
annotation or the annotation of memblock_alloc is wrong.

WARNING: vmlinux.o(.text+0x491d1): Section mismatch in reference from the function __change_page_attr() to the function .meminit.text:memblock_free()
The function __change_page_attr() references
the function __meminit memblock_free().
This is often because __change_page_attr lacks a __meminit
annotation or the annotation of memblock_free is wrong.

Why do we need this whole early mapping? For the guest? I don't like
that memblock thing at all.

So I think the approach with the .data..percpu..hv_shared section is
fine and we should consider SEV-ES

http://support.amd.com/TechDocs/Protecting%20VM%20Register%20State%20with%20SEV-ES.pdf

and do this right from the get-go so that when SEV-ES comes along, we
should simply be ready and extend that mechanism to put the whole Guest
Hypervisor Communication Block in there.

But then the fact that you're mapping those decrypted in init_mm.pgd
makes me think you don't need that early mapping thing at all. Those are
the decrypted mappings of the hypervisor. And that you can do late.

Now, what would be better, IMHO (and I have no idea about virtualization
design so take with a grain of salt) is if the guest would allocate
enough memory for the GHCB and mark it decrypted from the very
beginning. It will be the communication vehicle with the hypervisor
anyway.
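
A totally untested sketch of that idea (early_set_memory_decrypted() is
a made-up name here, but we'd need something of that shape):

	static u8 ghcb_page[PAGE_SIZE] __aligned(PAGE_SIZE);

	void __init sev_setup_ghcb(void)
	{
		/* Clear the C-bit on the GHCB mapping so the HV can access it */
		early_set_memory_decrypted((unsigned long)ghcb_page, PAGE_SIZE);
		memset(ghcb_page, 0, PAGE_SIZE);
	}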

And we already do similar things in sme_map_bootdata() for the baremetal
kernel to map boot_data, initrd, EFI, ... and so on things decrypted.

And we should extend that mechanism to map the GHCB in the guest too and
then we can get rid of all that need for ->force_memblock which makes
the crazy mess in pageattr.c even crazier. And it would be lovely if we
can do it without it.

But maybe Paolo might have an even better idea...

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-09 16:29         ` Borislav Petkov
@ 2017-03-10 16:35           ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-10 16:35 UTC (permalink / raw)
  To: Borislav Petkov, Paolo Bonzini
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem

Hi Boris and Paolo,

On 03/09/2017 10:29 AM, Borislav Petkov wrote:
> On Thu, Mar 09, 2017 at 05:13:33PM +0100, Paolo Bonzini wrote:
>> This is not how you check if running under a hypervisor; you should
>> check the HYPERVISOR bit, i.e. bit 31 of cpuid(1).ecx.  This in turn
>> tells you if leaf 0x40000000 is valid.
>
> Ah, good point, I already do that in the microcode loader :)
>
>         /*
>          * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
>          * completely accurate as xen pv guests don't see that CPUID bit set but
>          * that's good enough as they don't land on the BSP path anyway.
>          */
>         if (native_cpuid_ecx(1) & BIT(31))
>                 return *res;
>
>> That said, the main issue with this function is that it hardcodes the
>> behavior for KVM.  It is possible that another hypervisor defines its
>> 0x40000001 leaf in such a way that KVM_FEATURE_SEV has a different meaning.
>>
>> Instead, AMD should define a "well-known" bit in its own space (i.e.
>> 0x800000xx) that is only used by hypervisors that support SEV.  This is
>> similar to how Intel defined one bit in leaf 1 to say "is leaf
>> 0x40000000 valid".
>>
>>> +	if (eax > 0x40000000) {
>>> +		eax = 0x40000001;
>>> +		ecx = 0;
>>> +		native_cpuid(&eax, &ebx, &ecx, &edx);
>>> +		if (!(eax & BIT(KVM_FEATURE_SEV)))
>>> +			goto out;
>>> +
>>> +		eax = 0x8000001f;
>>> +		ecx = 0;
>>> +		native_cpuid(&eax, &ebx, &ecx, &edx);
>>> +		if (!(eax & 1))
>
> Right, so this is testing CPUID_0x8000001f_ECX(0)[0], SME. Why not
> simply set that bit for the guest too, in kvm?
>

CPUID_0x8000001F[EAX] indicates which memory encryption features are
supported:
  * Bit 0 - SME supported
  * Bit 1 - SEV supported
  * Bit 3 - SEV-ES supported

We can use MSR_K8_SYSCFG[MemEncryptionModeEnc] to check whether memory
encryption is enabled. Currently, KVM returns zero when the guest OS reads
MSR_K8_SYSCFG. I can update my patch sets to set this bit for SEV-enabled
guests.

We could update this patch to use the below logic:

  * CPUID(0) - Check for AuthenticAMD
  * CPUID(1) - Check if running under a hypervisor
  * CPUID(0x80000000) - Check for the highest supported leaf
  * CPUID(0x8000001F).EAX - Check for SME and SEV support
  * rdmsr(MSR_K8_SYSCFG)[MemEncryptionModeEnc] - Check if SMEE is set
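
In code it would look roughly like this — a sketch only;
MSR_K8_SYSCFG_MEM_ENCRYPT is the SME series' name for the SYSCFG bit, and a
final version would need to be careful about where rdmsr is safe this early:

static bool __init mem_encrypt_active_early(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* CPUID(0): vendor must be AuthenticAMD. */
        eax = 0; ecx = 0;
        native_cpuid(&eax, &ebx, &ecx, &edx);
        if (ebx != 0x68747541 || edx != 0x69746e65 || ecx != 0x444d4163)
                return false;

        /* CPUID(1).ECX[31]: are we running under a hypervisor? */
        eax = 1; ecx = 0;
        native_cpuid(&eax, &ebx, &ecx, &edx);
        if (!(ecx & BIT(31)))
                return false;

        /* CPUID(0x80000000).EAX: is leaf 0x8000001F implemented? */
        eax = 0x80000000; ecx = 0;
        native_cpuid(&eax, &ebx, &ecx, &edx);
        if (eax < 0x8000001f)
                return false;

        /* CPUID(0x8000001F).EAX: bit 0 = SME, bit 1 = SEV. */
        eax = 0x8000001f; ecx = 0;
        native_cpuid(&eax, &ebx, &ecx, &edx);
        if (!(eax & (BIT(0) | BIT(1))))
                return false;

        /* SYSCFG[MemEncryptionModeEnc]: is memory encryption enabled? */
        return !!(native_read_msr(MSR_K8_SYSCFG) & MSR_K8_SYSCFG_MEM_ENCRYPT);
}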


Paolo,

One question: do we need the "AuthenticAMD" check when we are running under a
hypervisor? I was looking at the qemu code and found that qemu exposes
parameters to change the CPU vendor ID. The above check will fail if the user
changes the vendor ID while launching an SEV guest.

-Brijesh

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when splitting large pages
  2017-03-10 11:06     ` Borislav Petkov
@ 2017-03-10 22:41       ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-10 22:41 UTC (permalink / raw)
  To: Borislav Petkov, Paolo Bonzini
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem

Hi Boris,

On 03/10/2017 05:06 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:15:15AM -0500, Brijesh Singh wrote:
>> If kernel_maps_pages_in_pgd is called early in boot process to change the
>
> kernel_map_pages_in_pgd()
>
>> memory attributes then it fails to allocate memory when spliting large
>> pages. The patch extends the cpa_data to provide the support to use
>> memblock_alloc when slab allocator is not available.
>>
>> The feature will be used in Secure Encrypted Virtualization (SEV) mode,
>> where we may need to change the memory region attributes in early boot
>> process.
>>
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>> ---
>>  arch/x86/mm/pageattr.c |   51 ++++++++++++++++++++++++++++++++++++++++--------
>>  1 file changed, 42 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
>> index 46cc89d..9e4ab3b 100644
>> --- a/arch/x86/mm/pageattr.c
>> +++ b/arch/x86/mm/pageattr.c
>> @@ -14,6 +14,7 @@
>>  #include <linux/gfp.h>
>>  #include <linux/pci.h>
>>  #include <linux/vmalloc.h>
>> +#include <linux/memblock.h>
>>
>>  #include <asm/e820/api.h>
>>  #include <asm/processor.h>
>> @@ -37,6 +38,7 @@ struct cpa_data {
>>  	int		flags;
>>  	unsigned long	pfn;
>>  	unsigned	force_split : 1;
>> +	unsigned	force_memblock :1;
>>  	int		curpage;
>>  	struct page	**pages;
>>  };
>> @@ -627,9 +629,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
>>
>>  static int
>>  __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>> -		   struct page *base)
>> +		  pte_t *pbase, unsigned long new_pfn)
>>  {
>> -	pte_t *pbase = (pte_t *)page_address(base);
>>  	unsigned long ref_pfn, pfn, pfninc = 1;
>>  	unsigned int i, level;
>>  	pte_t *tmp;
>> @@ -646,7 +647,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>>  		return 1;
>>  	}
>>
>> -	paravirt_alloc_pte(&init_mm, page_to_pfn(base));
>> +	paravirt_alloc_pte(&init_mm, new_pfn);
>>
>>  	switch (level) {
>>  	case PG_LEVEL_2M:
>> @@ -707,7 +708,8 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>>  	 * pagetable protections, the actual ptes set above control the
>>  	 * primary protection behavior:
>>  	 */
>> -	__set_pmd_pte(kpte, address, mk_pte(base, __pgprot(_KERNPG_TABLE)));
>> +	__set_pmd_pte(kpte, address,
>> +		native_make_pte((new_pfn << PAGE_SHIFT) + _KERNPG_TABLE));
>>
>>  	/*
>>  	 * Intel Atom errata AAH41 workaround.
>> @@ -723,21 +725,50 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
>>  	return 0;
>>  }
>>
>> +static pte_t *try_alloc_pte(struct cpa_data *cpa, unsigned long *pfn)
>> +{
>> +	unsigned long phys;
>> +	struct page *base;
>> +
>> +	if (cpa->force_memblock) {
>> +		phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
>
> Maybe there's a reason this fires:
>
> WARNING: modpost: Found 2 section mismatch(es).
> To see full details build your kernel with:
> 'make CONFIG_DEBUG_SECTION_MISMATCH=y'
>
> WARNING: vmlinux.o(.text+0x48edc): Section mismatch in reference from the function __change_page_attr() to the function .init.text:memblock_alloc()
> The function __change_page_attr() references
> the function __init memblock_alloc().
> This is often because __change_page_attr lacks a __init
> annotation or the annotation of memblock_alloc is wrong.
>
> WARNING: vmlinux.o(.text+0x491d1): Section mismatch in reference from the function __change_page_attr() to the function .meminit.text:memblock_free()
> The function __change_page_attr() references
> the function __meminit memblock_free().
> This is often because __change_page_attr lacks a __meminit
> annotation or the annotation of memblock_free is wrong.
>

I can take a look at fixing those warnings. My initial attempt was to create
a new function to clear the encryption bit, but it ended up looking very
similar to __change_page_attr_set_clr(), hence I decided to extend the
existing function to use memblock_alloc().
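
(For reference, completed from the hunk quoted above, the helper would read
roughly like this — a reconstruction, not the exact patch:)

static pte_t *try_alloc_pte(struct cpa_data *cpa, unsigned long *pfn)
{
	unsigned long phys;
	struct page *base;

	/* Early boot: slab is not up yet, fall back to memblock. */
	if (cpa->force_memblock) {
		phys = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
		if (!phys)
			return NULL;
		*pfn = phys >> PAGE_SHIFT;
		return (pte_t *)__va(phys);
	}

	base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
	if (!base)
		return NULL;
	*pfn = page_to_pfn(base);
	return (pte_t *)page_address(base);
}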


> Why do we need this whole early mapping? For the guest? I don't like
> that memblock thing at all.

Early in the boot process, the guest kernel allocates some structures (either
statically, or dynamically via memblock_alloc) and shares their physical
addresses with the hypervisor. Since the entire guest memory area is mapped
encrypted, those structures end up in encrypted memory ranges, so we need a
method to clear the encryption bit. Sometimes these structures may be part of
2MB pages and need to be split into smaller pages.
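
As a (hypothetical) sketch of that flow — the helper is illustrative and
leans on the existing kernel_map_pages_in_pgd() plus the SME series'
_PAGE_ENC:

/* Illustrative only: remap an early-allocated structure decrypted
 * before handing its physical address to the hypervisor. */
static int __init early_share_with_hypervisor(void *va, size_t size)
{
	unsigned long npages = PAGE_ALIGN(size) >> PAGE_SHIFT;
	u64 pfn = __pa(va) >> PAGE_SHIFT;

	/* Clear the C-bit; this may split a covering 2MB mapping,
	 * which is where the memblock-backed allocator comes in. */
	return kernel_map_pages_in_pgd(init_mm.pgd, pfn, (unsigned long)va,
				       npages, __PAGE_KERNEL & ~_PAGE_ENC);
}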

>
> So I think the approach with the .data..percpu..hv_shared section is
> fine and we should consider SEV-ES
>
> http://support.amd.com/TechDocs/Protecting%20VM%20Register%20State%20with%20SEV-ES.pdf
>
> and do this right from the get-go so that when SEV-ES comes along, we
> should simply be ready and extend that mechanism to put the whole Guest
> Hypervisor Communication Block in there.
>

> But then the fact that you're mapping those decrypted in init_mm.pgd
> makes me think you don't need that early mapping thing at all. Those are
> the decrypted mappings of the hypervisor. And that you can do late.
>

In most cases, guest/hypervisor communication starts as soon as the guest
provides the physical address to the hypervisor, so we must map the pages
decrypted before sharing their physical address with the hypervisor.

> Now, what would be better, IMHO (and I have no idea about virtualization
> design so take with a grain of salt) is if the guest would allocate
> enough memory for the GHCB and mark it decrypted from the very
> beginning. It will be the communication vehicle with the hypervisor
> anyway.
>
> And we already do similar things in sme_map_bootdata() for the baremetal
> kernel to map boot_data, initrd, EFI, ... and so on things decrypted.
>

I will take a look at sme_map_bootdata(), but I believe the main difference
is that in the SME case those memory regions were populated decrypted by the
BIOS or bootloader, and sme_map_bootdata() merely clears the encryption bit.
In the guest case, memory may be dynamically allocated at boot time and may
not have the same attributes as the early mappings.

> And we should extend that mechanism to map the GHCB in the guest too and
> then we can get rid of all that need for ->force_memblock which makes
> the crazy mess in pageattr.c even crazier. And it would be lovely if we
> can do it without it.
>
> But maybe Paolo might have an even better idea...
>

I am sure he will have a better idea :)

-Brijesh

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
  2017-03-07  0:03         ` Bjorn Helgaas
@ 2017-03-13 20:08           ` Tom Lendacky
  0 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-13 20:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott, tony.luck, alexandre.bounine,
	kuleshovmail, linux-kernel, mcgrof, mst, linux-crypto, tj,
	pbonzini, akpm, davem

On 3/6/2017 6:03 PM, Bjorn Helgaas wrote:
> On Fri, Mar 03, 2017 at 03:15:34PM -0600, Tom Lendacky wrote:
>> On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:
>>> On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:
>>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>>
>>>> The use of ioremap will force the setup data to be mapped decrypted even
>>>> though setup data is encrypted.  Switch to using memremap which will be
>>>> able to perform the proper mapping.
>>>
>>> How should callers decide whether to use ioremap() or memremap()?
>>>
>>> memremap() existed before SME and SEV, and this code is used even if
>>> SME and SEV aren't supported, so the rationale for this change should
>>> not need the decryption argument.
>>
>> When SME or SEV is active an ioremap() will remove the encryption bit
>> from the pagetable entry when it is mapped.  This allows MMIO, which
>> doesn't support SME/SEV, to be performed successfully.  So my take is
>> that ioremap() should be used for MMIO and memremap() for pages in RAM.
>
> OK, thanks.  The commit message should say something like "this is
> RAM, not MMIO, so we should map it with memremap(), not ioremap()".
> That's the part that determines whether the change is correct.
>
> You can mention the encryption part, too, but it's definitely
> secondary because the change has to make sense on its own, without
> SME/SEV.
>

Ok, that makes sense, will do.
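
So, in code, the rule of thumb would be (illustrative pointers only):

	/* Device MMIO: uncached mapping; under SME/SEV the C-bit is
	 * cleared for us. */
	void __iomem *regs = ioremap(pci_resource_start(pdev, 0),
				     pci_resource_len(pdev, 0));

	/* Plain RAM (e.g. setup_data): cacheable, keeps the C-bit, and
	 * may resolve to a simple __va() via try_ram_remap(). */
	struct setup_data *data = memremap(pa_data, sizeof(*data),
					   MEMREMAP_WB);

	/* ... use the mappings ... */

	memunmap(data);
	iounmap(regs);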

> The following commits (from https://github.com/codomania/tip/branches)
> all do basically the same thing so the changelogs (and summaries)
> should all be basically the same:
>
>   cb0d0d1eb0a6 x86: Change early_ioremap to early_memremap for BOOT data
>   91acb68b8333 x86/pci: Use memremap when walking setup data
>   4f687503e23f x86: Access the setup data through sysfs decrypted
>   e90246b8c229 x86: Access the setup data through debugfs decrypted
>
> I would collect them all together and move them to the beginning of
> your series, since they don't depend on anything else.

I'll do that.

>
> Also, change "x86/pci: " to "x86/PCI" so it matches the previous
> convention.

Will do.

Thanks,
Tom

>
>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>>>> ---
>>>> arch/x86/pci/common.c |    4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>>>> index a4fdfa7..0b06670 100644
>>>> --- a/arch/x86/pci/common.c
>>>> +++ b/arch/x86/pci/common.c
>>>> @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>>>
>>>> 	pa_data = boot_params.hdr.setup_data;
>>>> 	while (pa_data) {
>>>> -		data = ioremap(pa_data, sizeof(*rom));
>>>> +		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
>>>
>>> I can't quite connect the dots here.  ioremap() on x86 would do
>>> ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
>>> which is ioremap_cache().  Is making a cacheable mapping the important
>>> difference?
>>
>> The memremap(MEMREMAP_WB) will actually check to see if it can perform
>> a __va(pa_data) in try_ram_remap() and then fallback to the
>> arch_memremap_wb().  So it's actually the __va() vs the ioremap_cache()
>> that is the difference.
>>
>> Thanks,
>> Tom
>>
>>>
>>>> 		if (!data)
>>>> 			return -ENOMEM;
>>>>
>>>> @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>>> 			}
>>>> 		}
>>>> 		pa_data = data->next;
>>>> -		iounmap(data);
>>>> +		memunmap(data);
>>>> 	}
>>>> 	set_dma_domain_ops(dev);
>>>> 	set_dev_domain_options(dev);
>>>>

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
@ 2017-03-13 20:08           ` Tom Lendacky
  0 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-13 20:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott, tony.luck, alexandre.bounine,
	kuleshovmail, linux-kernel, mcgrof, mst, linux-crypto, tj,
	pbonzini, akpm, davem

On 3/6/2017 6:03 PM, Bjorn Helgaas wrote:
> On Fri, Mar 03, 2017 at 03:15:34PM -0600, Tom Lendacky wrote:
>> On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:
>>> On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:
>>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>>
>>>> The use of ioremap will force the setup data to be mapped decrypted even
>>>> though setup data is encrypted.  Switch to using memremap which will be
>>>> able to perform the proper mapping.
>>>
>>> How should callers decide whether to use ioremap() or memremap()?
>>>
>>> memremap() existed before SME and SEV, and this code is used even if
>>> SME and SEV aren't supported, so the rationale for this change should
>>> not need the decryption argument.
>>
>> When SME or SEV is active an ioremap() will remove the encryption bit
>> from the pagetable entry when it is mapped.  This allows MMIO, which
>> doesn't support SME/SEV, to be performed successfully.  So my take is
>> that ioremap() should be used for MMIO and memremap() for pages in RAM.
>
> OK, thanks.  The commit message should say something like "this is
> RAM, not MMIO, so we should map it with memremap(), not ioremap()".
> That's the part that determines whether the change is correct.
>
> You can mention the encryption part, too, but it's definitely
> secondary because the change has to make sense on its own, without
> SME/SEV.
>

Ok, that makes sense, will do.

> The following commits (from https://github.com/codomania/tip/branches)
> all do basically the same thing so the changelogs (and summaries)
> should all be basically the same:
>
>   cb0d0d1eb0a6 x86: Change early_ioremap to early_memremap for BOOT data
>   91acb68b8333 x86/pci: Use memremap when walking setup data
>   4f687503e23f x86: Access the setup data through sysfs decrypted
>   e90246b8c229 x86: Access the setup data through debugfs decrypted
>
> I would collect them all together and move them to the beginning of
> your series, since they don't depend on anything else.

I'll do that.

>
> Also, change "x86/pci: " to "x86/PCI" so it matches the previous
> convention.

Will do.

Thanks,
Tom

>
>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>>>> ---
>>>> arch/x86/pci/common.c |    4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>>>> index a4fdfa7..0b06670 100644
>>>> --- a/arch/x86/pci/common.c
>>>> +++ b/arch/x86/pci/common.c
>>>> @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>>>
>>>> 	pa_data = boot_params.hdr.setup_data;
>>>> 	while (pa_data) {
>>>> -		data = ioremap(pa_data, sizeof(*rom));
>>>> +		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
>>>
>>> I can't quite connect the dots here.  ioremap() on x86 would do
>>> ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
>>> which is ioremap_cache().  Is making a cacheable mapping the important
>>> difference?
>>
>> The memremap(MEMREMAP_WB) will actually check to see if it can perform
>> a __va(pa_data) in try_ram_remap() and then fallback to the
>> arch_memremap_wb().  So it's actually the __va() vs the ioremap_cache()
>> that is the difference.
>>
>> Thanks,
>> Tom
>>
>>>
>>>> 		if (!data)
>>>> 			return -ENOMEM;
>>>>
>>>> @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>>> 			}
>>>> 		}
>>>> 		pa_data = data->next;
>>>> -		iounmap(data);
>>>> +		memunmap(data);
>>>> 	}
>>>> 	set_dma_domain_ops(dev);
>>>> 	set_dev_domain_options(dev);
>>>>

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data
@ 2017-03-13 20:08           ` Tom Lendacky
  0 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-13 20:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel

On 3/6/2017 6:03 PM, Bjorn Helgaas wrote:
> On Fri, Mar 03, 2017 at 03:15:34PM -0600, Tom Lendacky wrote:
>> On 3/3/2017 2:42 PM, Bjorn Helgaas wrote:
>>> On Thu, Mar 02, 2017 at 10:13:10AM -0500, Brijesh Singh wrote:
>>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>>
>>>> The use of ioremap will force the setup data to be mapped decrypted even
>>>> though setup data is encrypted.  Switch to using memremap which will be
>>>> able to perform the proper mapping.
>>>
>>> How should callers decide whether to use ioremap() or memremap()?
>>>
>>> memremap() existed before SME and SEV, and this code is used even if
>>> SME and SEV aren't supported, so the rationale for this change should
>>> not need the decryption argument.
>>
>> When SME or SEV is active an ioremap() will remove the encryption bit
>> from the pagetable entry when it is mapped.  This allows MMIO, which
>> doesn't support SME/SEV, to be performed successfully.  So my take is
>> that ioremap() should be used for MMIO and memremap() for pages in RAM.
>
> OK, thanks.  The commit message should say something like "this is
> RAM, not MMIO, so we should map it with memremap(), not ioremap()".
> That's the part that determines whether the change is correct.
>
> You can mention the encryption part, too, but it's definitely
> secondary because the change has to make sense on its own, without
> SME/SEV.
>

Ok, that makes sense, will do.

> The following commits (from https://github.com/codomania/tip/branches)
> all do basically the same thing so the changelogs (and summaries)
> should all be basically the same:
>
>   cb0d0d1eb0a6 x86: Change early_ioremap to early_memremap for BOOT data
>   91acb68b8333 x86/pci: Use memremap when walking setup data
>   4f687503e23f x86: Access the setup data through sysfs decrypted
>   e90246b8c229 x86: Access the setup data through debugfs decrypted
>
> I would collect them all together and move them to the beginning of
> your series, since they don't depend on anything else.

I'll do that.

>
> Also, change "x86/pci: " to "x86/PCI" so it matches the previous
> convention.

Will do.

Thanks,
Tom

>
>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>>>> ---
>>>> arch/x86/pci/common.c |    4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>>>> index a4fdfa7..0b06670 100644
>>>> --- a/arch/x86/pci/common.c
>>>> +++ b/arch/x86/pci/common.c
>>>> @@ -691,7 +691,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>>>
>>>> 	pa_data = boot_params.hdr.setup_data;
>>>> 	while (pa_data) {
>>>> -		data = ioremap(pa_data, sizeof(*rom));
>>>> +		data = memremap(pa_data, sizeof(*rom), MEMREMAP_WB);
>>>
>>> I can't quite connect the dots here.  ioremap() on x86 would do
>>> ioremap_nocache().  memremap(MEMREMAP_WB) would do arch_memremap_wb(),
>>> which is ioremap_cache().  Is making a cacheable mapping the important
>>> difference?
>>
>> The memremap(MEMREMAP_WB) will actually check to see if it can perform
>> a __va(pa_data) in try_ram_remap() and then fallback to the
>> arch_memremap_wb().  So it's actually the __va() vs the ioremap_cache()
>> that is the difference.
>>
>> Thanks,
>> Tom
>>
>>>
>>>> 		if (!data)
>>>> 			return -ENOMEM;
>>>>
>>>> @@ -710,7 +710,7 @@ int pcibios_add_device(struct pci_dev *dev)
>>>> 			}
>>>> 		}
>>>> 		pa_data = data->next;
>>>> -		iounmap(data);
>>>> +		memunmap(data);
>>>> 	}
>>>> 	set_dma_domain_ops(dev);
>>>> 	set_dev_domain_options(dev);
>>>>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 424+ messages in thread


* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
@ 2017-03-16 10:16             ` Borislav Petkov
  0 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-16 10:16 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem

On Fri, Mar 10, 2017 at 10:35:30AM -0600, Brijesh Singh wrote:
> We could update this patch to use the below logic:
> 
>  * CPUID(0) - Check for AuthenticAMD
>  * CPUID(1) - Check if under hypervisor
>  * CPUID(0x80000000) - Check for highest supported leaf
>  * CPUID(0x8000001F).EAX - Check for SME and SEV support
>  * rdmsr (MSR_K8_SYSCFG)[MemEncryptionModeEnc] - Check if SMEE is set
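
[Editor's sketch of the detection sequence Brijesh lists above, written
as standalone C for this note.  The CPUID leaves and bits follow the AMD
APM (0x8000001F EAX bit 0 = SME, bit 1 = SEV); the SYSCFG MSR step needs
ring 0, so it is read here through the x86 msr driver at /dev/cpu/0/msr.]

#include <cpuid.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	uint64_t syscfg;
	int fd;

	/* CPUID(0): vendor must be "AuthenticAMD" (EBX, EDX, ECX order) */
	__get_cpuid(0, &eax, &ebx, &ecx, &edx);
	if (ebx != 0x68747541 || edx != 0x69746e65 || ecx != 0x444d4163)
		return puts("not AuthenticAMD"), 1;

	/* CPUID(1): ECX bit 31 is the hypervisor-present bit */
	__get_cpuid(1, &eax, &ebx, &ecx, &edx);
	printf("hypervisor: %s\n", (ecx & (1u << 31)) ? "yes" : "no");

	/* CPUID(0x80000000): highest supported extended leaf */
	__get_cpuid(0x80000000, &eax, &ebx, &ecx, &edx);
	if (eax < 0x8000001f)
		return puts("no 0x8000001F leaf"), 1;

	/* CPUID(0x8000001F).EAX: bit 0 = SME, bit 1 = SEV */
	__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
	printf("SME: %u  SEV: %u\n", eax & 1, (eax >> 1) & 1);

	/* MSR_K8_SYSCFG is 0xc0010010; MemEncryptionModEn is bit 23 */
	fd = open("/dev/cpu/0/msr", O_RDONLY);
	if (fd >= 0 && pread(fd, &syscfg, 8, 0xc0010010) == 8)
		printf("SYSCFG[MemEncryptionModEn]: %u\n",
		       (unsigned int)((syscfg >> 23) & 1));
	if (fd >= 0)
		close(fd);

	return 0;
}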

Actually, it is still not clear to me *why* we need to do anything
special wrt SEV in the guest.

Lemme clarify: why can't the guest boot just like a normal Linux on
baremetal and use the SME(!) detection code to set sme_enable and so
on? IOW, I'd like to avoid all those checks whether we're running under
hypervisor and handle all that like we're running on baremetal.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread


* Re: [RFC PATCH v2 23/32] kvm: introduce KVM_MEMORY_ENCRYPT_OP ioctl
@ 2017-03-16 10:25     ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 10:25 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem



On 02/03/2017 16:17, Brijesh Singh wrote:
> If hardware supports memory encryption then the KVM_MEMORY_ENCRYPT_OP ioctl
> can be used by qemu to issue platform-specific memory encryption commands.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h |    2 ++
>  arch/x86/kvm/x86.c              |   12 ++++++++++++
>  include/uapi/linux/kvm.h        |    2 ++
>  3 files changed, 16 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index bff1f15..62651ad 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1033,6 +1033,8 @@ struct kvm_x86_ops {
>  	void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
>  
>  	void (*setup_mce)(struct kvm_vcpu *vcpu);
> +
> +	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
>  };
>  
>  struct kvm_arch_async_pf {
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2099df8..6a737e9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3926,6 +3926,14 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  	return r;
>  }
>  
> +static int kvm_vm_ioctl_memory_encryption_op(struct kvm *kvm, void __user *argp)
> +{
> +	if (kvm_x86_ops->memory_encryption_op)
> +		return kvm_x86_ops->memory_encryption_op(kvm, argp);
> +
> +	return -ENOTTY;
> +}
> +
>  long kvm_arch_vm_ioctl(struct file *filp,
>  		       unsigned int ioctl, unsigned long arg)
>  {
> @@ -4189,6 +4197,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  		r = kvm_vm_ioctl_enable_cap(kvm, &cap);
>  		break;
>  	}
> +	case KVM_MEMORY_ENCRYPT_OP: {
> +		r = kvm_vm_ioctl_memory_encryption_op(kvm, argp);
> +		break;
> +	}
>  	default:
>  		r = kvm_vm_ioctl_assigned_device(kvm, ioctl, arg);
>  	}
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index cac48ed..fef7d83 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1281,6 +1281,8 @@ struct kvm_s390_ucas_mapping {
>  #define KVM_S390_GET_IRQ_STATE	  _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state)
>  /* Available with KVM_CAP_X86_SMM */
>  #define KVM_SMI                   _IO(KVMIO,   0xb7)
> +/* Memory Encryption Commands */
> +#define KVM_MEMORY_ENCRYPT_OP	  _IOWR(KVMIO, 0xb8, unsigned long)
>  
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
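
[Editor's sketch: how userspace would reach
kvm_vm_ioctl_memory_encryption_op() above.  This is assumed usage written
for this note, not QEMU code; the real argument layout is platform
specific and is defined by the SEV patches later in the series, so an
opaque command word stands in for it here.]

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_MEMORY_ENCRYPT_OP
#define KVM_MEMORY_ENCRYPT_OP	_IOWR(KVMIO, 0xb8, unsigned long)
#endif

int main(void)
{
	unsigned long cmd = 0;	/* stand-in for the platform command block */
	int kvm_fd, vm_fd;

	kvm_fd = open("/dev/kvm", O_RDWR);
	if (kvm_fd < 0)
		return perror("/dev/kvm"), 1;

	vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);
	if (vm_fd < 0)
		return perror("KVM_CREATE_VM"), 1;

	/* Fails with ENOTTY when the vendor module installs no
	 * memory_encryption_op handler, exactly as in the patch. */
	if (ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd) < 0)
		perror("KVM_MEMORY_ENCRYPT_OP");

	return 0;
}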

^ permalink raw reply	[flat|nested] 424+ messages in thread


* Re: [RFC PATCH v2 24/32] kvm: x86: prepare for SEV guest management API support
@ 2017-03-16 10:33     ` Paolo Bonzini
  0 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 10:33 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem



On 02/03/2017 16:17, Brijesh Singh wrote:
> ASID management:
>  - Reserve an asid range for SEV guests; the SEV asid range is obtained
>    through CPUID Fn8000_001f[ECX]. A non-SEV guest can use any asid outside
>    the SEV asid range.

How is backwards compatibility handled?

>  - An SEV guest must have an asid value within the asid range obtained
>    through CPUID.
>  - An SEV guest must have the same asid for all vcpus. A TLB flush is
>    required if a different vcpu with the same ASID is to be run on the same
>    host CPU.

[...]

> +
> +	/* which host cpu was used for running this vcpu */
> +	bool last_cpuid;

Should be unsigned int.

> 
> +	/* Assign the asid allocated for this SEV guest */
> +	svm->vmcb->control.asid = asid;
> +
> +	/* Flush guest TLB:
> +	 * - when different VMCB for the same ASID is to be run on the
> +	 *   same host CPU
> +	 *   or
> +	 * - this VMCB was executed on different host cpu in previous VMRUNs.
> +	 */
> +	if (sd->sev_vmcbs[asid] != (void *)svm->vmcb ||

Why the cast?

> +		svm->last_cpuid != cpu)
> +		svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;

If there is a match, you don't need to do anything else (neither reset
the asid, nor mark it as dirty, nor update the fields), so:

	if (sd->sev_vmcbs[asid] == svm->vmcb &&
	    svm->last_cpuid == cpu)
		return;

	svm->last_cpuid = cpu;
	sd->sev_vmcbs[asid] = svm->vmcb;
	svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;
	svm->vmcb->control.asid = asid;
	mark_dirty(svm->vmcb, VMCB_ASID);

(plus comments ;)).

Also, why not TLB_CONTROL_FLUSH_ASID if possible?

> +	svm->last_cpuid = cpu;
> +	sd->sev_vmcbs[asid] = (void *)svm->vmcb;
> +
> +	mark_dirty(svm->vmcb, VMCB_ASID);

[...]

> 
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index fef7d83..9df37a2 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1284,6 +1284,104 @@ struct kvm_s390_ucas_mapping {
>  /* Memory Encryption Commands */
>  #define KVM_MEMORY_ENCRYPT_OP	  _IOWR(KVMIO, 0xb8, unsigned long)
>  
> +/* Secure Encrypted Virtualization mode */
> +enum sev_cmd_id {

Please add documentation in Documentation/virtual/kvm/memory_encrypt.txt.

Paolo
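
[Editor's sketch of the ASID bookkeeping discussed above.  The names
mirror the series (max_sev_asid, sev_asid_bitmap, sev_asid_new,
sev_asid_free), but these bodies are written for this note and are
assumptions, not the patch's code.  Callers are assumed to serialize
allocation; the real driver runs this under a lock.]

#include <linux/bitmap.h>
#include <linux/errno.h>
#include <linux/slab.h>

static unsigned int max_sev_asid;	/* from CPUID 0x8000001F, ECX */
static unsigned long *sev_asid_bitmap;	/* one bit per SEV-capable ASID */

static int sev_asid_init(void)
{
	sev_asid_bitmap = kcalloc(BITS_TO_LONGS(max_sev_asid),
				  sizeof(long), GFP_KERNEL);
	return sev_asid_bitmap ? 0 : -ENOMEM;
}

static int sev_asid_new(void)
{
	unsigned int pos;

	/* ASID 0 is the host's; SEV guests get 1..max_sev_asid, and
	 * non-SEV guests use ASIDs above that range (per the commit
	 * message quoted above). */
	pos = find_first_zero_bit(sev_asid_bitmap, max_sev_asid);
	if (pos >= max_sev_asid)
		return -EBUSY;

	set_bit(pos, sev_asid_bitmap);
	return pos + 1;
}

static void sev_asid_free(int asid)
{
	clear_bit(asid - 1, sev_asid_bitmap);
}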

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active
@ 2017-03-16 10:38     ` Paolo Bonzini
  0 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 10:38 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem



On 02/03/2017 16:18, Brijesh Singh wrote:
> The SEV memory encryption engine uses a tweak such that two identical
> plaintexts at different locations will have different ciphertexts. So
> swapping or moving the ciphertexts of two pages will not result in the
> plaintexts being swapped. Relocating (or migrating) the physical backing
> pages of an SEV guest therefore requires some additional steps. The
> current SEV key management spec [1] does not provide commands to swap or
> migrate (move) ciphertexts. For now we pin the memory allocated for the
> SEV guest. In the future, when the SEV key management spec provides
> commands to support page migration, we can update the KVM code to remove
> the pinning logic without making any changes to userspace (qemu).
> 
> This patch pins userspace memory when a new slot is created and unpins
> the memory when the slot is removed.
> 
> [1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf

This is not enough, because memory can be hidden temporarily from the
guest and remapped later.  Think of a PCI BAR that is backed by RAM, or
also SMRAM.  The pinning must be kept even in that case.

You need to add a pair of KVM_MEMORY_ENCRYPT_OPs (one that doesn't map
to a PSP operation), such as KVM_REGISTER/UNREGISTER_ENCRYPTED_RAM.  In
QEMU you can use a RAMBlockNotifier to invoke the ioctls.

Paolo
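
A minimal sketch of the suggested command pair on the KVM side (the
command name, uapi struct and flow below are hypothetical, not part of
this series; it reuses the pinning helpers from the posted patch):

	/* hypothetical uapi struct passed via KVM_MEMORY_ENCRYPT_OP */
	struct kvm_sev_ram_region {
		__u64 uaddr;	/* userspace address of the RAM block */
		__u64 size;	/* size in bytes */
	};

	static int sev_register_ram(struct kvm *kvm, struct kvm_sev_cmd *argp)
	{
		struct kvm_sev_pinned_memory_slot *region;
		struct kvm_sev_ram_region range;

		if (copy_from_user(&range, (void __user *)argp->data,
				   sizeof(range)))
			return -EFAULT;

		region = kzalloc(sizeof(*region), GFP_KERNEL);
		if (!region)
			return -ENOMEM;

		/* Pin at registration time; the pin outlives memslot
		 * deletion, so temporarily hidden regions (PCI BARs,
		 * SMRAM) keep their backing pages.
		 */
		region->pages = sev_pin_memory(range.uaddr, range.size,
					       &region->npages);
		if (!region->pages) {
			kfree(region);
			return -ENOMEM;
		}

		region->userspace_addr = range.uaddr;
		mutex_lock(&kvm->lock);
		list_add_tail(&region->list,
			      &kvm->arch.sev_info.pinned_memory_slot);
		mutex_unlock(&kvm->lock);
		return 0;
	}

The unregister path would look up the region by uaddr, unpin and free it,
mirroring the sev_vm_destroy() cleanup quoted below. On the QEMU side, a
RAMBlockNotifier's ram_block_added/removed hooks would issue the two
ioctls.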

> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h |    6 +++
>  arch/x86/kvm/svm.c              |   93 +++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c              |    3 +
>  3 files changed, 102 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index fcc4710..9dc59f0 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -723,6 +723,7 @@ struct kvm_sev_info {
>  	unsigned int handle;	/* firmware handle */
>  	unsigned int asid;	/* asid for this guest */
>  	int sev_fd;		/* SEV device fd */
> +	struct list_head pinned_memory_slot;
>  };
>  
>  struct kvm_arch {
> @@ -1043,6 +1044,11 @@ struct kvm_x86_ops {
>  	void (*setup_mce)(struct kvm_vcpu *vcpu);
>  
>  	int (*memory_encryption_op)(struct kvm *kvm, void __user *argp);
> +
> +	void (*prepare_memory_region)(struct kvm *kvm,
> +			struct kvm_memory_slot *memslot,
> +			const struct kvm_userspace_memory_region *mem,
> +			enum kvm_mr_change change);
>  };
>  
>  struct kvm_arch_async_pf {
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 13996d6..ab973f9 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -498,12 +498,21 @@ static inline bool gif_set(struct vcpu_svm *svm)
>  }
>  
>  /* Secure Encrypted Virtualization */
> +struct kvm_sev_pinned_memory_slot {
> +	struct list_head list;
> +	unsigned long npages;
> +	struct page **pages;
> +	unsigned long userspace_addr;
> +	short id;
> +};
> +
>  static unsigned int max_sev_asid;
>  static unsigned long *sev_asid_bitmap;
>  static void sev_deactivate_handle(struct kvm *kvm);
>  static void sev_decommission_handle(struct kvm *kvm);
>  static int sev_asid_new(void);
>  static void sev_asid_free(int asid);
> +static void sev_unpin_memory(struct page **pages, unsigned long npages);
>  #define __sev_page_pa(x) ((page_to_pfn(x) << PAGE_SHIFT) | sme_me_mask)
>  
>  static bool kvm_sev_enabled(void)
> @@ -1544,9 +1553,25 @@ static inline int avic_free_vm_id(int id)
>  
>  static void sev_vm_destroy(struct kvm *kvm)
>  {
> +	struct list_head *pos, *q;
> +	struct kvm_sev_pinned_memory_slot *pinned_slot;
> +	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
> +
>  	if (!sev_guest(kvm))
>  		return;
>  
> +	/* if guest memory is pinned then unpin it now */
> +	if (!list_empty(head)) {
> +		list_for_each_safe(pos, q, head) {
> +			pinned_slot = list_entry(pos,
> +				struct kvm_sev_pinned_memory_slot, list);
> +			sev_unpin_memory(pinned_slot->pages,
> +					pinned_slot->npages);
> +			list_del(pos);
> +			kfree(pinned_slot);
> +		}
> +	}
> +
>  	/* release the firmware resources */
>  	sev_deactivate_handle(kvm);
>  	sev_decommission_handle(kvm);
> @@ -5663,6 +5688,8 @@ static int sev_pre_start(struct kvm *kvm, int *asid)
>  		}
>  		*asid = ret;
>  		ret = 0;
> +
> +		INIT_LIST_HEAD(&kvm->arch.sev_info.pinned_memory_slot);
>  	}
>  
>  	return ret;
> @@ -6189,6 +6216,71 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	return ret;
>  }
>  
> +static struct kvm_sev_pinned_memory_slot *sev_find_pinned_memory_slot(
> +		struct kvm *kvm, struct kvm_memory_slot *slot)
> +{
> +	struct kvm_sev_pinned_memory_slot *i;
> +	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
> +
> +	list_for_each_entry(i, head, list) {
> +		if (i->userspace_addr == slot->userspace_addr &&
> +			i->id == slot->id)
> +			return i;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void amd_prepare_memory_region(struct kvm *kvm,
> +				struct kvm_memory_slot *memslot,
> +				const struct kvm_userspace_memory_region *mem,
> +				enum kvm_mr_change change)
> +{
> +	struct kvm_sev_pinned_memory_slot *pinned_slot;
> +	struct list_head *head = &kvm->arch.sev_info.pinned_memory_slot;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	if (!sev_guest(kvm))
> +		goto unlock;
> +
> +	if (change == KVM_MR_CREATE) {
> +
> +		if (!mem->memory_size)
> +			goto unlock;
> +
> +		pinned_slot = kmalloc(sizeof(*pinned_slot), GFP_KERNEL);
> +		if (pinned_slot == NULL)
> +			goto unlock;
> +
> +		pinned_slot->pages = sev_pin_memory(mem->userspace_addr,
> +				mem->memory_size, &pinned_slot->npages);
> +		if (pinned_slot->pages == NULL) {
> +			kfree(pinned_slot);
> +			goto unlock;
> +		}
> +
> +		sev_clflush_pages(pinned_slot->pages, pinned_slot->npages);
> +
> +		pinned_slot->id = memslot->id;
> +		pinned_slot->userspace_addr = mem->userspace_addr;
> +		list_add_tail(&pinned_slot->list, head);
> +
> +	} else if  (change == KVM_MR_DELETE) {
> +
> +		pinned_slot = sev_find_pinned_memory_slot(kvm, memslot);
> +		if (!pinned_slot)
> +			goto unlock;
> +
> +		sev_unpin_memory(pinned_slot->pages, pinned_slot->npages);
> +		list_del(&pinned_slot->list);
> +		kfree(pinned_slot);
> +	}
> +
> +unlock:
> +	mutex_unlock(&kvm->lock);
> +}
> +
>  static int amd_memory_encryption_cmd(struct kvm *kvm, void __user *argp)
>  {
>  	int r = -ENOTTY;
> @@ -6355,6 +6447,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
>  	.update_pi_irte = svm_update_pi_irte,
>  
>  	.memory_encryption_op = amd_memory_encryption_cmd,
> +	.prepare_memory_region = amd_prepare_memory_region,
>  };
>  
>  static int __init svm_init(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 6a737e9..e05069d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8195,6 +8195,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>  				const struct kvm_userspace_memory_region *mem,
>  				enum kvm_mr_change change)
>  {
> +	if (kvm_x86_ops->prepare_memory_region)
> +		kvm_x86_ops->prepare_memory_region(kvm, memslot, mem, change);
> +
>  	return 0;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 26/32] kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
  2017-03-02 15:17   ` Brijesh Singh
                     ` (2 preceding siblings ...)
  (?)
@ 2017-03-16 10:48   ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 10:48 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 02/03/2017 16:17, Brijesh Singh wrote:
> +static struct page **sev_pin_memory(unsigned long uaddr, unsigned long ulen,
> +				    unsigned long *n)
> +{
> +	struct page **pages;
> +	int first, last;
> +	unsigned long npages, pinned;
> +
> +	/* Get number of pages */
> +	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
> +	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
> +	npages = (last - first + 1);
> +
> +	pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> +	if (!pages)
> +		return NULL;
> +
> +	/* pin the user virtual address */
> +	down_read(&current->mm->mmap_sem);
> +	pinned = get_user_pages_fast(uaddr, npages, 1, pages);
> +	up_read(&current->mm->mmap_sem);

get_user_pages_fast, like get_user_pages_unlocked, must be called
without mmap_sem held.
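
A corrected call site, following this comment, simply drops the lock (a
minimal sketch; get_user_pages_fast() takes mmap_sem internally as
needed):

	/* pin the user virtual address; no mmap_sem held here */
	pinned = get_user_pages_fast(uaddr, npages, 1 /* write */, pages);
	if (pinned != npages)
		goto err;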

> +	if (pinned != npages) {
> +		printk(KERN_ERR "SEV: failed to pin  %ld pages (got %ld)\n",
> +				npages, pinned);
> +		goto err;
> +	}
> +
> +	*n = npages;
> +	return pages;
> +err:
> +	if (pinned > 0)
> +		release_pages(pages, pinned, 0);
> +	kfree(pages);
> +
> +	return NULL;
> +}
>
> +	/* the array of pages returned by get_user_pages() is a page-aligned
> +	 * memory. Since the user buffer is probably not page-aligned, we need
> +	 * to calculate the offset within a page for first update entry.
> +	 */
> +	offset = uaddr & (PAGE_SIZE - 1);
> +	len = min_t(size_t, (PAGE_SIZE - offset), ulen);
> +	ulen -= len;
> +
> +	/* update first page -
> +	 * special care need to be taken for the first page because we might
> +	 * be dealing with offset within the page
> +	 */

No need to special case the first page; just set "offset = 0" inside the
loop after the first iteration.

Paolo

> +	data->handle = sev_get_handle(kvm);
> +	data->length = len;
> +	data->address = __sev_page_pa(inpages[0]) + offset;
> +	ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
> +			data, &argp->error);
> +	if (ret)
> +		goto err_3;
> +
> +	/* update remaining pages */
> +	for (i = 1; i < nr_pages; i++) {
> +
> +		len = min_t(size_t, PAGE_SIZE, ulen);
> +		ulen -= len;
> +		data->length = len;
> +		data->address = __sev_page_pa(inpages[i]);
> +		ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
> +					data, &argp->error);
> +		if (ret)
> +			goto err_3;
> +	}
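
Folding the first page into the loop, as suggested, could look like this
(an untested sketch against the quoted code):

	offset = uaddr & (PAGE_SIZE - 1);
	for (i = 0; i < nr_pages; i++) {
		len = min_t(size_t, PAGE_SIZE - offset, ulen);
		ulen -= len;

		data->handle = sev_get_handle(kvm);
		data->length = len;
		data->address = __sev_page_pa(inpages[i]) + offset;
		ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_DATA,
				    data, &argp->error);
		if (ret)
			goto err_3;

		offset = 0;	/* only the first page can be unaligned */
	}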

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 29/32] kvm: svm: Add support for SEV DEBUG_DECRYPT command
  2017-03-02 15:18   ` Brijesh Singh
                     ` (3 preceding siblings ...)
  (?)
@ 2017-03-16 10:54   ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 10:54 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 02/03/2017 16:18, Brijesh Singh wrote:
> +static int __sev_dbg_decrypt_page(struct kvm *kvm, unsigned long src,
> +		void *dst, int *error)
> +{
> +	inpages = sev_pin_memory(src, PAGE_SIZE, &npages);
> +	if (!inpages) {
> +		ret = -ENOMEM;
> +		goto err_1;
> +	}
> +
> +	data->handle = sev_get_handle(kvm);
> +	data->dst_addr = __psp_pa(dst);
> +	data->src_addr = __sev_page_pa(inpages[0]);
> +	data->length = PAGE_SIZE;
> +
> +	ret = sev_issue_cmd(kvm, SEV_CMD_DBG_DECRYPT, data, error);
> +	if (ret)
> +		printk(KERN_ERR "SEV: DEBUG_DECRYPT %d (%#010x)\n",
> +				ret, *error);
> +	sev_unpin_memory(inpages, npages);
> +err_1:
> +	kfree(data);
> +	return ret;
> +}
> +
> +static int sev_dbg_decrypt(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	void *data;
> +	int ret, offset, len;
> +	struct kvm_sev_dbg debug;
> +
> +	if (!sev_guest(kvm))
> +		return -ENOTTY;
> +
> +	if (copy_from_user(&debug, (void *)argp->data,
> +				sizeof(struct kvm_sev_dbg)))
> +		return -EFAULT;
> +	/*
> +	 * TODO: add support for decrypting length which crosses the
> +	 * page boundary.
> +	 */
> +	offset = debug.src_addr & (PAGE_SIZE - 1);
> +	if (offset + debug.length > PAGE_SIZE)
> +		return -EINVAL;
> +

Please do add it, it doesn't seem very different from what you're doing
in LAUNCH_UPDATE_DATA.  There's no need for a separate
__sev_dbg_decrypt_page function, you can just pin/unpin here and do a
per-page loop as in LAUNCH_UPDATE_DATA.

Paolo
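
A rough sketch of the cross-page variant being asked for (pinning hoisted
into sev_dbg_decrypt() and a per-page loop as in LAUNCH_UPDATE_DATA;
"page_buf" is an assumed caller-allocated bounce page, and error
unwinding is elided):

	inpages = sev_pin_memory(debug.src_addr, debug.length, &npages);
	if (!inpages)
		return -ENOMEM;

	offset = debug.src_addr & (PAGE_SIZE - 1);
	dst = debug.dst_addr;
	ulen = debug.length;

	for (i = 0; i < npages; i++) {
		len = min_t(size_t, PAGE_SIZE - offset, ulen);

		data->handle = sev_get_handle(kvm);
		data->src_addr = __sev_page_pa(inpages[i]) + offset;
		data->dst_addr = __psp_pa(page_buf);
		data->length = len;

		ret = sev_issue_cmd(kvm, SEV_CMD_DBG_DECRYPT,
				    data, &argp->error);
		if (ret)
			break;

		if (copy_to_user((void __user *)dst, page_buf, len)) {
			ret = -EFAULT;
			break;
		}

		dst += len;
		ulen -= len;
		offset = 0;	/* only the first page can be unaligned */
	}

	sev_unpin_memory(inpages, npages);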

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 30/32] kvm: svm: Add support for SEV DEBUG_ENCRYPT command
  2017-03-02 15:18   ` Brijesh Singh
                     ` (3 preceding siblings ...)
  (?)
@ 2017-03-16 11:03   ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 11:03 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 02/03/2017 16:18, Brijesh Singh wrote:
> +	data = (void *) get_zeroed_page(GFP_KERNEL);

The page does not need to be zeroed, does it?

> +
> +	if ((len & 15) || (dst_addr & 15)) {
> +		/* if destination address and length are not 16-byte
> +		 * aligned then:
> +		 * a) decrypt destination page into temporary buffer
> +		 * b) copy source data into temporary buffer at correct offset
> +		 * c) encrypt temporary buffer
> +		 */
> +		ret = __sev_dbg_decrypt_page(kvm, dst_addr, data, &argp->error);

Ah, I see now you're using this function here for read-modify-write.
data is already pinned here, so even if you keep the function it makes
sense to push pinning out of __sev_dbg_decrypt_page and into
sev_dbg_decrypt.

> +		if (ret)
> +			goto err_3;
> +		d_off = dst_addr & (PAGE_SIZE - 1);
> +
> +		if (copy_from_user(data + d_off,
> +					(uint8_t *)debug.src_addr, len)) {
> +			ret = -EFAULT;
> +			goto err_3;
> +		}
> +
> +		encrypt->length = PAGE_SIZE;

Why decrypt/re-encrypt all the page instead of just the 16 byte area
around the [dst_addr, dst_addr+len) range?

> +		encrypt->src_addr = __psp_pa(data);
> +		encrypt->dst_addr =  __sev_page_pa(inpages[0]);
> +	} else {
> +		if (copy_from_user(data, (uint8_t *)debug.src_addr, len)) {
> +			ret = -EFAULT;
> +			goto err_3;
> +		}

Do you need copy_from_user, or can you just pin/unpin memory as for
DEBUG_DECRYPT?

Paolo

> +		d_off = dst_addr & (PAGE_SIZE - 1);
> +		encrypt->length = len;
> +		encrypt->src_addr = __psp_pa(data);
> +		encrypt->dst_addr = __sev_page_pa(inpages[0]);
> +		encrypt->dst_addr += d_off;
> +	}
> +
> +	encrypt->handle = sev_get_handle(kvm);
> +	ret = sev_issue_cmd(kvm, SEV_CMD_DBG_ENCRYPT, encrypt, &argp->error);
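
Limiting the read-modify-write to the surrounding 16-byte-aligned window,
as asked above, could look roughly like this (untested sketch; the
length-taking __sev_dbg_decrypt() helper is assumed here, the posted one
is page-sized only):

	start = dst_addr & ~15UL;		/* round down to 16 bytes */
	span  = ALIGN(dst_addr + len, 16) - start;

	ret = __sev_dbg_decrypt(kvm, start, data, span, &argp->error);
	if (ret)
		goto err_3;

	if (copy_from_user(data + (dst_addr - start),
			   (void __user *)debug.src_addr, len)) {
		ret = -EFAULT;
		goto err_3;
	}

	encrypt->length = span;
	encrypt->src_addr = __psp_pa(data);
	encrypt->dst_addr = __sev_page_pa(inpages[0]) +
			    (start & (PAGE_SIZE - 1));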

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
  2017-03-02 15:15   ` Brijesh Singh
  (?)
@ 2017-03-16 11:06     ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 11:06 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 02/03/2017 16:15, Brijesh Singh wrote:
> Some KVM-specific MSRs (steal-time, asyncpf, avic_eio) are backed by
> per-CPU variables allocated at compile time, whose physical addresses are
> shared with the hypervisor. This presents a challenge when SEV is active
> in the guest OS: guest memory is encrypted with the guest key, so the
> hypervisor is no longer able to modify it. We therefore need to clear the
> encryption attribute of these shared physical addresses so that both the
> guest and the hypervisor can access the data.
> 
> To solve this problem, I have tried these three options:
> 
> 1) Convert the static per-CPU variables to dynamic per-CPU allocation
> and, when SEV is detected, clear the encryption attribute. But while
> doing so I found that the per-CPU dynamic allocator was not ready when
> kvm_guest_cpu_init was called.
> 
> 2) Since the encryption attribute works at PAGE_SIZE granularity, add
> some extra padding to 'struct kvm_steal_time' to make it PAGE_SIZE and
> then at runtime clear the encryption attribute of the full page. The
> downside is that we now need to modify the structure, which may break
> compatibility.
> 
> 3) Define a new per-CPU section (.data..percpu..hv_shared) which will be
> used to hold the compile-time shared per-CPU variables. When SEV is
> detected, we map this section with the encryption attribute cleared.
> 
> This patch implements #3. It introduces a new DEFINE_PER_CPU_HV_SHARED
> macro to create a compile-time per-CPU variable. When SEV is detected, we
> map the per-CPU variable as decrypted (i.e. with the encryption attribute
> cleared).
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Looks good to me.

Paolo

> ---
>  arch/x86/kernel/kvm.c             |   43 +++++++++++++++++++++++++++++++------
>  include/asm-generic/vmlinux.lds.h |    3 +++
>  include/linux/percpu-defs.h       |    9 ++++++++
>  3 files changed, 48 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 099fcba..706a08e 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
>  
>  early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
>  
> -static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> -static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, steal_time) __aligned(64);
>  static int has_steal_clock = 0;
>  
>  /*
> @@ -290,6 +290,22 @@ static void __init paravirt_ops_setup(void)
>  #endif
>  }
>  
> +static int kvm_map_percpu_hv_shared(void *addr, unsigned long size)
> +{
> +	/* When SEV is active, the percpu static variables initialized
> +	 * in data section will contain the encrypted data so we first
> +	 * need to decrypt it and then map it as decrypted.
> +	 */
> +	if (sev_active()) {
> +		unsigned long pa = slow_virt_to_phys(addr);
> +
> +		sme_early_decrypt(pa, size);
> +		return early_set_memory_decrypted(addr, size);
> +	}
> +
> +	return 0;
> +}
> +
>  static void kvm_register_steal_time(void)
>  {
>  	int cpu = smp_processor_id();
> @@ -298,12 +314,17 @@ static void kvm_register_steal_time(void)
>  	if (!has_steal_clock)
>  		return;
>  
> +	if (kvm_map_percpu_hv_shared(st, sizeof(*st))) {
> +		pr_err("kvm-stealtime: failed to map hv_shared percpu\n");
> +		return;
> +	}
> +
>  	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
>  	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
>  		cpu, (unsigned long long) slow_virt_to_phys(st));
>  }
>  
> -static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
> +static DEFINE_PER_CPU_HV_SHARED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
>  
>  static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
>  {
> @@ -327,25 +348,33 @@ static void kvm_guest_cpu_init(void)
>  	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
>  		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
>  
> +		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&apf_reason),
> +					sizeof(struct kvm_vcpu_pv_apf_data)))
> +			goto skip_asyncpf;
>  #ifdef CONFIG_PREEMPT
>  		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
>  #endif
>  		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
>  		__this_cpu_write(apf_reason.enabled, 1);
> -		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
> -		       smp_processor_id());
> +		printk(KERN_INFO"KVM setup async PF for cpu %d msr %llx\n",
> +		       smp_processor_id(), pa);
>  	}
> -
> +skip_asyncpf:
>  	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
>  		unsigned long pa;
>  		/* Size alignment is implied but just to make it explicit. */
>  		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
> +		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&kvm_apic_eoi),
> +					sizeof(unsigned long)))
> +			goto skip_pv_eoi;
>  		__this_cpu_write(kvm_apic_eoi, 0);
>  		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
>  			| KVM_MSR_ENABLED;
>  		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
> +		printk(KERN_INFO"KVM setup PV EOI for cpu %d msr %lx\n",
> +		       smp_processor_id(), pa);
>  	}
> -
> +skip_pv_eoi:
>  	if (has_steal_clock)
>  		kvm_register_steal_time();
>  }
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 0968d13..8d29910 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -773,6 +773,9 @@
>  	. = ALIGN(cacheline);						\
>  	*(.data..percpu)						\
>  	*(.data..percpu..shared_aligned)				\
> +	. = ALIGN(PAGE_SIZE);						\
> +	*(.data..percpu..hv_shared)					\
> +	. = ALIGN(PAGE_SIZE);						\
>  	VMLINUX_SYMBOL(__per_cpu_end) = .;
>  
>  /**
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index 8f16299..5af366e 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -172,6 +172,15 @@
>  #define DEFINE_PER_CPU_READ_MOSTLY(type, name)				\
>  	DEFINE_PER_CPU_SECTION(type, name, "..read_mostly")
>  
> +/* Declaration/definition used for per-CPU variables that must be shared
> + * between hypervisor and guest OS.
> + */
> +#define DECLARE_PER_CPU_HV_SHARED(type, name)				\
> +	DECLARE_PER_CPU_SECTION(type, name, "..hv_shared")
> +
> +#define DEFINE_PER_CPU_HV_SHARED(type, name)				\
> +	DEFINE_PER_CPU_SECTION(type, name, "..hv_shared")
> +
>  /*
>   * Intermodule exports for per-CPU variables.  sparse forgets about
>   * address space across EXPORT_SYMBOL(), change EXPORT_SYMBOL() to
> 
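
For reference, a guest-side usage sketch of the new macro (the state
struct and MSR name are made up for illustration):

	/* Lands in .data..percpu..hv_shared, so it is remapped with the
	 * encryption attribute cleared when SEV is active.
	 */
	static DEFINE_PER_CPU_HV_SHARED(struct my_hv_state, my_state) __aligned(64);

	static void my_cpu_init(void)
	{
		struct my_hv_state *s = this_cpu_ptr(&my_state);

		if (kvm_map_percpu_hv_shared(s, sizeof(*s)))
			return;	/* leave the feature disabled on failure */

		/* hand the decrypted physical address to the hypervisor */
		wrmsrl(MSR_MY_FEATURE_EN,
		       slow_virt_to_phys(s) | KVM_MSR_ENABLED);
	}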

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
@ 2017-03-16 11:06     ` Paolo Bonzini
  0 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 11:06 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem



On 02/03/2017 16:15, Brijesh Singh wrote:
> Some KVM specific MSR's (steal-time, asyncpf, avic_eio) allocates per-CPU
> variable at compile time and share its physical address with hypervisor.
> It presents a challege when SEV is active in guest OS. When SEV is active,
> guest memory is encrypted with guest key and hypervisor will no longer able
> to modify the guest memory. When SEV is active, we need to clear the
> encryption attribute of shared physical addresses so that both guest and
> hypervisor can access the data.
> 
> To solve this problem, I have tried these three options:
> 
> 1) Convert the static per-CPU to dynamic per-CPU allocation. When SEV is
> detected then clear the encryption attribute. But while doing so I found
> that per-CPU dynamic allocator was not ready when kvm_guest_cpu_init was
> called.
> 
> 2) Since the encryption attribute works at PAGE_SIZE granularity, add
> padding to 'struct kvm_steal_time' to make it PAGE_SIZE and then, at
> runtime, clear the encryption attribute of the full page. The downside
> is that the structure must be modified, which may break compatibility.
> 
> 3) Define a new per-CPU section (.data..percpu..hv_shared) which holds
> the compile-time shared per-CPU variables. When SEV is detected we map
> this section with the encryption attribute cleared.
> 
> This patch implements #3. It introduces a new DEFINE_PER_CPU_HV_SHARED
> macro to create a compile-time per-CPU variable. When SEV is detected we
> map the per-CPU variable as decrypted (i.e., with the encryption
> attribute cleared).
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Looks good to me.

Paolo

> ---
>  arch/x86/kernel/kvm.c             |   43 +++++++++++++++++++++++++++++++------
>  include/asm-generic/vmlinux.lds.h |    3 +++
>  include/linux/percpu-defs.h       |    9 ++++++++
>  3 files changed, 48 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 099fcba..706a08e 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
>  
>  early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
>  
> -static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> -static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, steal_time) __aligned(64);
>  static int has_steal_clock = 0;
>  
>  /*
> @@ -290,6 +290,22 @@ static void __init paravirt_ops_setup(void)
>  #endif
>  }
>  
> +static int kvm_map_percpu_hv_shared(void *addr, unsigned long size)
> +{
> +	/* When SEV is active, the percpu static variables initialized
> +	 * in data section will contain the encrypted data so we first
> +	 * need to decrypt it and then map it as decrypted.
> +	 */
> +	if (sev_active()) {
> +		unsigned long pa = slow_virt_to_phys(addr);
> +
> +		sme_early_decrypt(pa, size);
> +		return early_set_memory_decrypted(addr, size);
> +	}
> +
> +	return 0;
> +}
> +
>  static void kvm_register_steal_time(void)
>  {
>  	int cpu = smp_processor_id();
> @@ -298,12 +314,17 @@ static void kvm_register_steal_time(void)
>  	if (!has_steal_clock)
>  		return;
>  
> +	if (kvm_map_percpu_hv_shared(st, sizeof(*st))) {
> +		pr_err("kvm-stealtime: failed to map hv_shared percpu\n");
> +		return;
> +	}
> +
>  	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
>  	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
>  		cpu, (unsigned long long) slow_virt_to_phys(st));
>  }
>  
> -static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
> +static DEFINE_PER_CPU_HV_SHARED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
>  
>  static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
>  {
> @@ -327,25 +348,33 @@ static void kvm_guest_cpu_init(void)
>  	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
>  		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
>  
> +		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&apf_reason),
> +					sizeof(struct kvm_vcpu_pv_apf_data)))
> +			goto skip_asyncpf;
>  #ifdef CONFIG_PREEMPT
>  		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
>  #endif
>  		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
>  		__this_cpu_write(apf_reason.enabled, 1);
> -		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
> -		       smp_processor_id());
> +		printk(KERN_INFO"KVM setup async PF for cpu %d msr %llx\n",
> +		       smp_processor_id(), pa);
>  	}
> -
> +skip_asyncpf:
>  	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
>  		unsigned long pa;
>  		/* Size alignment is implied but just to make it explicit. */
>  		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
> +		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&kvm_apic_eoi),
> +					sizeof(unsigned long)))
> +			goto skip_pv_eoi;
>  		__this_cpu_write(kvm_apic_eoi, 0);
>  		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
>  			| KVM_MSR_ENABLED;
>  		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
> +		printk(KERN_INFO"KVM setup PV EOI for cpu %d msr %lx\n",
> +		       smp_processor_id(), pa);
>  	}
> -
> +skip_pv_eoi:
>  	if (has_steal_clock)
>  		kvm_register_steal_time();
>  }
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 0968d13..8d29910 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -773,6 +773,9 @@
>  	. = ALIGN(cacheline);						\
>  	*(.data..percpu)						\
>  	*(.data..percpu..shared_aligned)				\
> +	. = ALIGN(PAGE_SIZE);						\
> +	*(.data..percpu..hv_shared)					\
> +	. = ALIGN(PAGE_SIZE);						\
>  	VMLINUX_SYMBOL(__per_cpu_end) = .;
>  
>  /**
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index 8f16299..5af366e 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -172,6 +172,15 @@
>  #define DEFINE_PER_CPU_READ_MOSTLY(type, name)				\
>  	DEFINE_PER_CPU_SECTION(type, name, "..read_mostly")
>  
> +/* Declaration/definition used for per-CPU variables that must be shared
> + * between hypervisor and guest OS.
> + */
> +#define DECLARE_PER_CPU_HV_SHARED(type, name)				\
> +	DECLARE_PER_CPU_SECTION(type, name, "..hv_shared")
> +
> +#define DEFINE_PER_CPU_HV_SHARED(type, name)				\
> +	DEFINE_PER_CPU_SECTION(type, name, "..hv_shared")
> +
>  /*
>   * Intermodule exports for per-CPU variables.  sparse forgets about
>   * address space across EXPORT_SYMBOL(), change EXPORT_SYMBOL() to
> 
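
For reference, any new compile-time shared per-CPU variable would follow
the same pattern as apf_reason and kvm_apic_eoi above. A minimal sketch
(my_shared_val and MY_SHARED_MSR are hypothetical names, not part of the
patch; everything else comes straight from it):

	/* Hypothetical per-CPU value whose PA is handed to the host. */
	static DEFINE_PER_CPU_HV_SHARED(unsigned long, my_shared_val);

	static void my_shared_cpu_init(void)
	{
		/* Under SEV, decrypt and remap the variable before sharing. */
		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&my_shared_val),
					     sizeof(unsigned long)))
			return;

		__this_cpu_write(my_shared_val, 0);
		wrmsrl(MY_SHARED_MSR,
		       slow_virt_to_phys(this_cpu_ptr(&my_shared_val)));
	}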


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when splitting large pages
  2017-03-02 15:15   ` Brijesh Singh
                     ` (3 preceding siblings ...)
  (?)
@ 2017-03-16 12:28   ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 12:28 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 02/03/2017 16:15, Brijesh Singh wrote:
> 
>  __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
> -		   struct page *base)
> +		  pte_t *pbase, unsigned long new_pfn)
>  {
> -	pte_t *pbase = (pte_t *)page_address(base);

Just one comment (I'll reply to Boris separately): I think you can compute
pbase with pfn_to_kaddr and avoid adding a new argument.

>  	 */
> -	__set_pmd_pte(kpte, address, mk_pte(base, __pgprot(_KERNPG_TABLE)));
> +	__set_pmd_pte(kpte, address,
> +		native_make_pte((new_pfn << PAGE_SHIFT) + _KERNPG_TABLE));

And this probably is better written as:

	__set_pmd_pte(kpte, address, pfn_pte(new_pfn, __pgprot(_KERNPG_TABLE)));

Paolo
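
A minimal sketch of that suggestion (pfn_to_kaddr() is just
__va(pfn << PAGE_SHIFT), so new_pfn already carries enough information):

	/* Derive the new page-table's virtual address from its pfn. */
	pte_t *pbase = (pte_t *)pfn_to_kaddr(new_pfn);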

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when splitting large pages
  2017-03-10 22:41       ` Brijesh Singh
@ 2017-03-16 13:15         ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 13:15 UTC (permalink / raw)
  To: Brijesh Singh, Borislav Petkov
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas

[-- Attachment #1: Type: text/plain, Size: 1706 bytes --]



On 10/03/2017 23:41, Brijesh Singh wrote:
>> Maybe there's a reason this fires:
>>
>> WARNING: modpost: Found 2 section mismatch(es).
>> To see full details build your kernel with:
>> 'make CONFIG_DEBUG_SECTION_MISMATCH=y'
>>
>> WARNING: vmlinux.o(.text+0x48edc): Section mismatch in reference from
>> the function __change_page_attr() to the function
>> .init.text:memblock_alloc()
>> The function __change_page_attr() references
>> the function __init memblock_alloc().
>> This is often because __change_page_attr lacks a __init
>> annotation or the annotation of memblock_alloc is wrong.
>>
>> WARNING: vmlinux.o(.text+0x491d1): Section mismatch in reference from
>> the function __change_page_attr() to the function
>> .meminit.text:memblock_free()
>> The function __change_page_attr() references
>> the function __meminit memblock_free().
>> This is often because __change_page_attr lacks a __meminit
>> annotation or the annotation of memblock_free is wrong.
>> 
>> But maybe Paolo might have an even better idea...
> 
> I am sure he will have better idea :)

Not sure if it's better or worse, but an alternative idea is to turn
__change_page_attr and __change_page_attr_set_clr inside out, so that:
1) the alloc_pages/__free_page happens in __change_page_attr_set_clr;
2) __change_page_attr_set_clr overall does not become more complex.

Then you can introduce __early_change_page_attr_set_clr and/or
early_kernel_map_pages_in_pgd, for use in your next patches.  They would
use the memblock allocator instead of alloc/free_page.

The attached patch is compile-tested only and almost certainly has some
thinko in it.  But it even shaves a few lines off the code, so the idea
might have some merit.

Paolo
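
A rough sketch of the early-path half of that idea, assuming the
memblock_alloc(size, align) flavor that returns a physical address (the
helper name and hook point are illustrative, not from the attached patch):

	static int __init __early_split_large_page(struct cpa_data *cpa,
						   pte_t *kpte,
						   unsigned long address)
	{
		/* Take the new page-table page from memblock instead of
		 * the page allocator, so this works early in boot. */
		phys_addr_t pa = memblock_alloc(PAGE_SIZE, PAGE_SIZE);

		if (!pa)
			return -ENOMEM;

		if (__split_large_page(cpa, kpte, address,
				       pfn_to_page(PHYS_PFN(pa))))
			memblock_free(pa, PAGE_SIZE);

		return 0;
	}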

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: alloc-in-cpa-set-clr.patch --]
[-- Type: text/x-patch; name="alloc-in-cpa-set-clr.patch", Size: 10219 bytes --]

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 28d42130243c..953c8e697562 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -490,11 +490,12 @@ static void __set_pmd_pte(pte_t *kpte, unsigned long address, pte_t pte)
 }
 
 static int
-try_preserve_large_page(pte_t *kpte, unsigned long address,
+try_preserve_large_page(pte_t **p_kpte, unsigned long address,
 			struct cpa_data *cpa)
 {
 	unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn, old_pfn;
-	pte_t new_pte, old_pte, *tmp;
+	pte_t *kpte = *p_kpte;
+	pte_t new_pte, old_pte;
 	pgprot_t old_prot, new_prot, req_prot;
 	int i, do_split = 1;
 	enum pg_level level;
@@ -507,8 +508,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	 * Check for races, another CPU might have split this page
 	 * up already:
 	 */
-	tmp = _lookup_address_cpa(cpa, address, &level);
-	if (tmp != kpte)
+	*p_kpte = _lookup_address_cpa(cpa, address, &level);
+	if (*p_kpte != kpte)
 		goto out_unlock;
 
 	switch (level) {
@@ -634,17 +635,18 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 	unsigned int i, level;
 	pte_t *tmp;
 	pgprot_t ref_prot;
+	int retry = 1;
 
+	if (!debug_pagealloc_enabled())
+		spin_lock(&cpa_lock);
 	spin_lock(&pgd_lock);
 	/*
 	 * Check for races, another CPU might have split this page
 	 * up for us already:
 	 */
 	tmp = _lookup_address_cpa(cpa, address, &level);
-	if (tmp != kpte) {
-		spin_unlock(&pgd_lock);
-		return 1;
-	}
+	if (tmp != kpte)
+		goto out;
 
 	paravirt_alloc_pte(&init_mm, page_to_pfn(base));
 
@@ -671,10 +673,11 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 		break;
 
 	default:
-		spin_unlock(&pgd_lock);
-		return 1;
+		goto out;
 	}
 
+	retry = 0;
+
 	/*
 	 * Set the GLOBAL flags only if the PRESENT flag is set
 	 * otherwise pmd/pte_present will return true even on a non
@@ -718,28 +721,34 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 	 * going on.
 	 */
 	__flush_tlb_all();
-	spin_unlock(&pgd_lock);
 
-	return 0;
-}
-
-static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
-			    unsigned long address)
-{
-	struct page *base;
+out:
+	spin_unlock(&pgd_lock);
 
+	/*
+	 * Do a global flush tlb after splitting the large page
+ 	 * and before we do the actual change page attribute in the PTE.
+ 	 *
+ 	 * Without this, we violate the TLB application note, which says
+ 	 * "The TLBs may contain both ordinary and large-page
+	 *  translations for a 4-KByte range of linear addresses. This
+	 *  may occur if software modifies the paging structures so that
+	 *  the page size used for the address range changes. If the two
+	 *  translations differ with respect to page frame or attributes
+	 *  (e.g., permissions), processor behavior is undefined and may
+	 *  be implementation-specific."
+ 	 *
+ 	 * We do this global tlb flush inside the cpa_lock, so that we
+	 * don't allow any other cpu, with stale tlb entries change the
+	 * page attribute in parallel, that also falls into the
+	 * just split large page entry.
+ 	 */
+	if (!retry)
+		flush_tlb_all();
 	if (!debug_pagealloc_enabled())
 		spin_unlock(&cpa_lock);
-	base = alloc_pages(GFP_KERNEL | __GFP_NOTRACK, 0);
-	if (!debug_pagealloc_enabled())
-		spin_lock(&cpa_lock);
-	if (!base)
-		return -ENOMEM;
-
-	if (__split_large_page(cpa, kpte, address, base))
-		__free_page(base);
 
-	return 0;
+	return retry;
 }
 
 static bool try_to_free_pte_page(pte_t *pte)
@@ -1166,30 +1175,26 @@ static int __cpa_process_fault(struct cpa_data *cpa, unsigned long vaddr,
 	}
 }
 
-static int __change_page_attr(struct cpa_data *cpa, int primary)
+static int __change_page_attr(struct cpa_data *cpa, pte_t **p_kpte, unsigned long address,
+			      int primary)
 {
-	unsigned long address;
-	int do_split, err;
 	unsigned int level;
 	pte_t *kpte, old_pte;
+	int err = 0;
 
-	if (cpa->flags & CPA_PAGES_ARRAY) {
-		struct page *page = cpa->pages[cpa->curpage];
-		if (unlikely(PageHighMem(page)))
-			return 0;
-		address = (unsigned long)page_address(page);
-	} else if (cpa->flags & CPA_ARRAY)
-		address = cpa->vaddr[cpa->curpage];
-	else
-		address = *cpa->vaddr;
-repeat:
-	kpte = _lookup_address_cpa(cpa, address, &level);
-	if (!kpte)
-		return __cpa_process_fault(cpa, address, primary);
+	if (!debug_pagealloc_enabled())
+		spin_lock(&cpa_lock);
+	*p_kpte = kpte = _lookup_address_cpa(cpa, address, &level);
+	if (!kpte) {
+		err = __cpa_process_fault(cpa, address, primary);
+		goto out;
+	}
 
 	old_pte = *kpte;
-	if (pte_none(old_pte))
-		return __cpa_process_fault(cpa, address, primary);
+	if (pte_none(old_pte)) {
+		err = __cpa_process_fault(cpa, address, primary);
+		goto out;
+	}
 
 	if (level == PG_LEVEL_4K) {
 		pte_t new_pte;
@@ -1228,59 +1233,27 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
 			cpa->flags |= CPA_FLUSHTLB;
 		}
 		cpa->numpages = 1;
-		return 0;
+		goto out;
 	}
 
 	/*
 	 * Check, whether we can keep the large page intact
 	 * and just change the pte:
 	 */
-	do_split = try_preserve_large_page(kpte, address, cpa);
-	/*
-	 * When the range fits into the existing large page,
-	 * return. cp->numpages and cpa->tlbflush have been updated in
-	 * try_large_page:
-	 */
-	if (do_split <= 0)
-		return do_split;
-
-	/*
-	 * We have to split the large page:
-	 */
-	err = split_large_page(cpa, kpte, address);
-	if (!err) {
-		/*
-	 	 * Do a global flush tlb after splitting the large page
-	 	 * and before we do the actual change page attribute in the PTE.
-	 	 *
-	 	 * With out this, we violate the TLB application note, that says
-	 	 * "The TLBs may contain both ordinary and large-page
-		 *  translations for a 4-KByte range of linear addresses. This
-		 *  may occur if software modifies the paging structures so that
-		 *  the page size used for the address range changes. If the two
-		 *  translations differ with respect to page frame or attributes
-		 *  (e.g., permissions), processor behavior is undefined and may
-		 *  be implementation-specific."
-	 	 *
-	 	 * We do this global tlb flush inside the cpa_lock, so that we
-		 * don't allow any other cpu, with stale tlb entries change the
-		 * page attribute in parallel, that also falls into the
-		 * just split large page entry.
-	 	 */
-		flush_tlb_all();
-		goto repeat;
-	}
+	err = try_preserve_large_page(p_kpte, address, cpa);
 
+out:
+	if (!debug_pagealloc_enabled())
+		spin_unlock(&cpa_lock);
 	return err;
 }
 
 static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias);
 
-static int cpa_process_alias(struct cpa_data *cpa)
+static int cpa_process_alias(struct cpa_data *cpa, unsigned long vaddr)
 {
 	struct cpa_data alias_cpa;
 	unsigned long laddr = (unsigned long)__va(cpa->pfn << PAGE_SHIFT);
-	unsigned long vaddr;
 	int ret;
 
 	if (!pfn_range_is_mapped(cpa->pfn, cpa->pfn + 1))
@@ -1290,16 +1263,6 @@ static int cpa_process_alias(struct cpa_data *cpa)
 	 * No need to redo, when the primary call touched the direct
 	 * mapping already:
 	 */
-	if (cpa->flags & CPA_PAGES_ARRAY) {
-		struct page *page = cpa->pages[cpa->curpage];
-		if (unlikely(PageHighMem(page)))
-			return 0;
-		vaddr = (unsigned long)page_address(page);
-	} else if (cpa->flags & CPA_ARRAY)
-		vaddr = cpa->vaddr[cpa->curpage];
-	else
-		vaddr = *cpa->vaddr;
-
 	if (!(within(vaddr, PAGE_OFFSET,
 		    PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT)))) {
 
@@ -1338,33 +1301,64 @@ static int cpa_process_alias(struct cpa_data *cpa)
 	return 0;
 }
 
+static unsigned long cpa_address(struct cpa_data *cpa, unsigned long numpages)
+{
+	/*
+	 * Store the remaining nr of pages for the large page
+	 * preservation check.
+	 */
+	/* for array changes, we can't use large page */
+	cpa->numpages = 1;
+	if (cpa->flags & CPA_PAGES_ARRAY) {
+		struct page *page = cpa->pages[cpa->curpage];
+		if (unlikely(PageHighMem(page)))
+			return -EINVAL;
+		return (unsigned long)page_address(page);
+	} else if (cpa->flags & CPA_ARRAY) {
+		return cpa->vaddr[cpa->curpage];
+	} else {
+		cpa->numpages = numpages;
+		return *cpa->vaddr;
+	}
+}
+
+static void cpa_advance(struct cpa_data *cpa)
+{
+	if (cpa->flags & (CPA_PAGES_ARRAY | CPA_ARRAY))
+		cpa->curpage++;
+	else
+		*cpa->vaddr += cpa->numpages * PAGE_SIZE;
+}
+
 static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias)
 {
 	unsigned long numpages = cpa->numpages;
+	unsigned long vaddr;
+	struct page *base;
+	pte_t *kpte;
 	int ret;
 
 	while (numpages) {
-		/*
-		 * Store the remaining nr of pages for the large page
-		 * preservation check.
-		 */
-		cpa->numpages = numpages;
-		/* for array changes, we can't use large page */
-		if (cpa->flags & (CPA_ARRAY | CPA_PAGES_ARRAY))
-			cpa->numpages = 1;
-
-		if (!debug_pagealloc_enabled())
-			spin_lock(&cpa_lock);
-		ret = __change_page_attr(cpa, checkalias);
-		if (!debug_pagealloc_enabled())
-			spin_unlock(&cpa_lock);
-		if (ret)
-			return ret;
-
-		if (checkalias) {
-			ret = cpa_process_alias(cpa);
-			if (ret)
+		vaddr = cpa_address(cpa, numpages);
+		if (!IS_ERR_VALUE(vaddr)) {
+repeat:
+			ret = __change_page_attr(cpa, &kpte, vaddr, checkalias);
+			if (ret < 0)
 				return ret;
+			if (ret) {
+				base = alloc_page(GFP_KERNEL|__GFP_NOTRACK);
+				if (!base)
+					return -ENOMEM;
+				if (__split_large_page(cpa, kpte, vaddr, base))
+					__free_page(base);
+				goto repeat;
+			}
+
+			if (checkalias) {
+				ret = cpa_process_alias(cpa, vaddr);
+				if (ret < 0)
+					return ret;
+			}
 		}
 
 		/*
@@ -1374,11 +1368,7 @@ static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias)
 		 */
 		BUG_ON(cpa->numpages > numpages || !cpa->numpages);
 		numpages -= cpa->numpages;
-		if (cpa->flags & (CPA_PAGES_ARRAY | CPA_ARRAY))
-			cpa->curpage++;
-		else
-			*cpa->vaddr += cpa->numpages * PAGE_SIZE;
-
+		cpa_advance(cpa);
 	}
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-16 10:16             ` Borislav Petkov
  (?)
  (?)
@ 2017-03-16 14:28               ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-16 14:28 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, jroedel, keescook, arnd,
	toshi.kani, math

On 3/16/2017 5:16 AM, Borislav Petkov wrote:
> On Fri, Mar 10, 2017 at 10:35:30AM -0600, Brijesh Singh wrote:
>> We could update this patch to use the below logic:
>>
>>  * CPUID(0) - Check for AuthenticAMD
>>  * CPID(1) - Check if under hypervisor
>>  * CPUID(0x80000000) - Check for highest supported leaf
>>  * CPUID(0x8000001F).EAX - Check for SME and SEV support
>>  * rdmsr (MSR_K8_SYSCFG)[MemEncryptionModeEnc] - Check if SMEE is set
>
> Actually, it is still not clear to me *why* we need to do anything
> special wrt SEV in the guest.
>
> Lemme clarify: why can't the guest boot just like a normal Linux on
> baremetal and use the SME(!) detection code to set sme_enable and so
> on? IOW, I'd like to avoid all those checks whether we're running under
> hypervisor and handle all that like we're running on baremetal.

Because there are differences between how SME and SEV behave
(instruction fetches are always decrypted under SEV, DMA to an
encrypted location is not supported under SEV, etc.), we need to
determine which mode we are in so that things can be set up properly
during boot. For example, if SEV is active the kernel will already be
encrypted, so we skip that step; likewise, the trampoline area for
bringing up an AP must be decrypted for SME but encrypted for SEV. The
hypervisor check provides the ability to determine how we handle
things.
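
A minimal sketch of the kind of check this implies (assuming
set_memory_{encrypted,decrypted}() helpers along the lines of what the
SME series adds; the function and its arguments are illustrative):

	/* Sketch: pick the AP trampoline protection based on the mode. */
	static void __init protect_trampoline(unsigned long tramp_vaddr,
					      int tramp_npages)
	{
		if (sev_active())
			/* SEV: guest pages, trampoline included, stay
			 * encrypted. */
			set_memory_encrypted(tramp_vaddr, tramp_npages);
		else if (sme_active())
			/* SME: the AP starts with paging off, so the
			 * trampoline must be mapped decrypted (C-bit clear). */
			set_memory_decrypted(tramp_vaddr, tramp_npages);
	}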

Thanks,
Tom

>
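
For reference, a sketch of the CPUID/MSR detection flow quoted above (bit
positions per the AMD APM; whether the MSR read is safe in a guest is
exactly the open question here, so treat this as an outline only):

	/* Sketch: true when SME/SEV memory encryption can be used. */
	static bool __init mem_encrypt_supported(bool *under_hv)
	{
		unsigned int eax, ebx, ecx, edx;
		u64 syscfg;

		/* CPUID(0): vendor must be "AuthenticAMD". */
		cpuid(0, &eax, &ebx, &ecx, &edx);
		if (ebx != 0x68747541 || edx != 0x69746e65 || ecx != 0x444d4163)
			return false;

		/* CPUID(1) ECX[31]: set when running under a hypervisor. */
		cpuid(1, &eax, &ebx, &ecx, &edx);
		*under_hv = !!(ecx & BIT(31));

		/* CPUID(0x80000000): is leaf 0x8000001F implemented? */
		if (cpuid_eax(0x80000000) < 0x8000001f)
			return false;

		/* CPUID(0x8000001F) EAX: bit 0 = SME, bit 1 = SEV. */
		if (!(cpuid_eax(0x8000001f) & (BIT(0) | BIT(1))))
			return false;

		/* MSR_K8_SYSCFG[23] (MemEncryptionModeEn): SMEE set? */
		rdmsrl(MSR_K8_SYSCFG, syscfg);
		return !!(syscfg & BIT_ULL(23));
	}

A caller would treat the encryption as SEV when *under_hv is set and as
SME otherwise.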

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-16 14:28               ` Tom Lendacky
  (?)
@ 2017-03-16 15:09                 ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-16 15:09 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Brijesh Singh, Paolo Bonzini, simon.guinot, linux-efi, kvm,
	rkrcmar, matt, linux-pci, linus.walleij, gary.hook, linux-mm,
	paul.gortmaker, hpa, cl, dan.j.williams, aarcange, sfr,
	andriy.shevchenko, herbert, bhe, xemul, joro, x86, peterz,
	piotr.luc, mingo, msalter, ross.zwisler, dyoung, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mcheha

On Thu, Mar 16, 2017 at 09:28:58AM -0500, Tom Lendacky wrote:
> Because there are differences between how SME and SEV behave
> (instruction fetches are always decrypted under SEV, DMA to an
> encrypted location is not supported under SEV, etc.) we need to
> determine which mode we are in so that things can be setup properly
> during boot. For example, if SEV is active the kernel will already
> be encrypted and so we don't perform that step or the trampoline area
> for bringing up an AP must be decrypted for SME but encrypted for SEV.

So with SEV enabled, it seems to me a guest doesn't know anything about
encryption and can run as if SME is disabled. So sme_active() will be
false. And then the kernel can bypass all that code dealing with SME.

So a guest should simply run like on baremetal with no SME, IMHO.

But then there's that part: "instruction fetches are always decrypted
under SEV". What does that mean exactly? And how much of that code can
be reused so that

* SME on baremetal
* SEV on guest

use the same logic?

Having the larger SEV preparation part on the kvm host side is perfectly
fine. But I'd like to keep kernel initialization paths clean.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-16 15:09                 ` Borislav Petkov
  (?)
  (?)
@ 2017-03-16 16:11                   ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-16 16:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, Paolo Bonzini, simon.guinot, linux-efi, kvm,
	rkrcmar, matt, linux-pci, linus.walleij, gary.hook, linux-mm,
	paul.gortmaker, hpa, cl, dan.j.williams, aarcange, sfr,
	andriy.shevchenko, herbert, bhe, xemul, joro, x86, peterz,
	piotr.luc, mingo, msalter, ross.zwisler, dyoung, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel

On 3/16/2017 10:09 AM, Borislav Petkov wrote:
> On Thu, Mar 16, 2017 at 09:28:58AM -0500, Tom Lendacky wrote:
>> Because there are differences between how SME and SEV behave
>> (instruction fetches are always decrypted under SEV, DMA to an
>> encrypted location is not supported under SEV, etc.) we need to
>> determine which mode we are in so that things can be setup properly
>> during boot. For example, if SEV is active the kernel will already
>> be encrypted and so we don't perform that step or the trampoline area
>> for bringing up an AP must be decrypted for SME but encrypted for SEV.
>
> So with SEV enabled, it seems to me a guest doesn't know anything about
> encryption and can run as if SME is disabled. So sme_active() will be
> false. And then the kernel can bypass all that code dealing with SME.
>
> So a guest should simply run like on baremetal with no SME, IMHO.
>

Not quite. The guest still needs to know about the encryption mask so
that it can protect memory by setting the encryption mask in its page
table entries.  It can also decide when to share memory with the
hypervisor by leaving the encryption mask clear in those entries.

> But then there's that part: "instruction fetches are always decrypted
> under SEV". What does that mean exactly? And how much of that code can

"Instruction fetches are always decrypted under SEV" means that,
regardless of how a virtual address is mapped, encrypted or decrypted,
if an instruction fetch is performed by the CPU from that address it
will always be decrypted. This is to prevent the hypervisor from
injecting executable code into the guest since it would have to be
valid encrypted instructions.

> be reused so that
>
> * SME on baremetal
> * SEV on guest
>
> use the same logic?

There are many areas that use the same logic, but there are certain
situations where we need to distinguish SME from SEV (e.g. DMA operation
setup or decrypting the trampoline area) and act accordingly.

Thanks,
Tom

>
> Having the larger SEV preparation part on the kvm host side is perfectly
> fine. But I'd like to keep kernel initialization paths clean.
>
> Thanks.
>
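
To illustrate the mask handling (sme_me_mask is the global mask the SME
series exposes; whether PAGE_KERNEL already carries the mask is
configuration-dependent, so this fragment is a sketch only):

	/* Private page: C-bit set, contents readable only by the guest. */
	pgprot_t prot_private = __pgprot(pgprot_val(PAGE_KERNEL) | sme_me_mask);

	/* Shared page: C-bit clear, hypervisor may access it (e.g. for DMA). */
	pgprot_t prot_shared  = __pgprot(pgprot_val(PAGE_KERNEL) & ~sme_me_mask);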


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active
  2017-03-16 16:11                   ` Tom Lendacky
  (?)
@ 2017-03-16 16:29                     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-16 16:29 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Brijesh Singh, Paolo Bonzini, simon.guinot, linux-efi, kvm,
	rkrcmar, matt, linux-pci, linus.walleij, gary.hook, linux-mm,
	paul.gortmaker, hpa, cl, dan.j.williams, aarcange, sfr,
	andriy.shevchenko, herbert, bhe, xemul, joro, x86, peterz,
	piotr.luc, mingo, msalter, ross.zwisler, dyoung, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto

On Thu, Mar 16, 2017 at 11:11:26AM -0500, Tom Lendacky wrote:
> Not quite. The guest still needs to know about the encryption mask
> so that it can protect memory by setting the encryption mask in its
> pagetable entries.  It can also decide when to share memory with the
> hypervisor by leaving the encryption mask clear in the pagetable entries.

Ok, so the kernel - by that I mean both the baremetal and guest kernel -
needs to know whether we're encrypting stuff. So it needs to know about
SME.

> "Instruction fetches are always decrypted under SEV" means that,
> regardless of how a virtual address is mapped, encrypted or decrypted,
> if an instruction fetch is performed by the CPU from that address it
> will always be decrypted. This is to prevent the hypervisor from
> injecting executable code into the guest since it would have to be
> valid encrypted instructions.

Ok, so the guest needs to map its pages encrypted.

Which reminds me, KSM might be a PITA to enable with SEV but that's a
different story. :)

> There are many areas that use the same logic, but there are certain
> situations where we need to distinguish between SME and SEV (e.g. DMA
> operation setup or decrypting the trampoline area) and act accordingly.

Right, and I'd like to keep those areas where it differs at minimum and
nicely cordoned off from the main paths.

So looking back at the current patch in this subthread:

we do check

* CPUID 0x40000000
* 8000_001F[EAX] for SME
* 8000_001F[EBX][5:0] for the encryption bits.

So how about we generate the following CPUID picture for the guest:

CPUID_Fn8000001F_EAX = ...10b

That is, the SME bit is cleared and SEV is set. For the guest kernel
this means SEV is enabled, and you can spare yourself the 0x40000000
leaf check and the additional KVM feature bit glue.

The 10b configuration will be invalid on baremetal as - I'm assuming -
you can't have SEV=1b with SME=0b. It will be a virt-only configuration,
and this way you can even avoid the hypervisor-specific detection and
do the same detection everywhere.

Hmmm?
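
For reference, the guest-side check this CPUID picture would enable
might look like the sketch below; the bit positions follow the
description above and should be verified against the APM:

	unsigned int eax, ebx, ecx, edx;
	bool sme, sev;
	int cbit;

	/* CPUID Fn8000_001F: memory encryption capabilities */
	cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);

	sme  = eax & BIT(0);		/* bit 0: SME supported */
	sev  = eax & BIT(1);		/* bit 1: SEV supported */
	cbit = ebx & 0x3f;		/* EBX[5:0]: C-bit position */

	if (sev && !sme) {
		/* the 10b picture: only a hypervisor would present this,
		 * so treat SEV as active for this guest */
	}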

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active
  2017-03-16 10:38     ` Paolo Bonzini
  (?)
  (?)
@ 2017-03-16 18:17     ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-16 18:17 UTC (permalink / raw)
  To: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd
  Cc: brijesh.singh



On 03/16/2017 05:38 AM, Paolo Bonzini wrote:
>
>
> On 02/03/2017 16:18, Brijesh Singh wrote:
>> The SEV memory encryption engine uses a tweak such that two identical
>> plaintexts at different locations will have different ciphertexts.
>> So swapping or moving the ciphertexts of two pages will not result in
>> the plaintexts being swapped. Relocating (or migrating) the physical
>> backing pages for an SEV guest will require some additional steps. The
>> current SEV key management spec [1] does not provide commands to swap
>> or migrate (move) ciphertexts. For now we pin the memory allocated for
>> the SEV guest. In the future, when the SEV key management spec provides
>> commands to support page migration, we can update the KVM code to
>> remove the pinning logic without making any changes to userspace (qemu).
>>
>> The patch pins userspace memory when a new slot is created and unpins
>> the memory when the slot is removed.
>>
>> [1] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf
>
> This is not enough, because memory can be hidden temporarily from the
> guest and remapped later.  Think of a PCI BAR that is backed by RAM, or
> also SMRAM.  The pinning must be kept even in that case.
>
> You need to add a pair of KVM_MEMORY_ENCRYPT_OPs (one that doesn't map
> to a PSP operation), such as KVM_REGISTER/UNREGISTER_ENCRYPTED_RAM.  In
> QEMU you can use a RAMBlockNotifier to invoke the ioctls.
>

I was hoping to avoid adding a new ioctl, but I see your point. Will add a pair of ioctls
and use a RAMBlockNotifier to invoke them.

-Brijesh
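
A rough sketch of what the QEMU side could look like with a
RAMBlockNotifier; the ioctl names and the payload struct below are the
placeholders proposed in this thread, not an existing KVM ABI:

	#include "qemu/osdep.h"
	#include "exec/ramlist.h"
	#include "sysemu/kvm.h"

	struct enc_region {		/* hypothetical ioctl payload */
		uint64_t addr;
		uint64_t size;
	};

	static void sev_ram_block_added(RAMBlockNotifier *n, void *host,
					size_t size)
	{
		struct enc_region range = {
			.addr = (uintptr_t)host,
			.size = size,
		};

		/* placeholder name from this thread, not a real ioctl */
		kvm_vm_ioctl(kvm_state, KVM_REGISTER_ENCRYPTED_RAM, &range);
	}

	static void sev_ram_block_removed(RAMBlockNotifier *n, void *host,
					  size_t size)
	{
		struct enc_region range = {
			.addr = (uintptr_t)host,
			.size = size,
		};

		kvm_vm_ioctl(kvm_state, KVM_UNREGISTER_ENCRYPTED_RAM, &range);
	}

	static RAMBlockNotifier sev_ram_notifier = {
		.ram_block_added   = sev_ram_block_added,
		.ram_block_removed = sev_ram_block_removed,
	};

	/* registered once during SEV guest setup */
	ram_block_notifier_add(&sev_ram_notifier);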

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 26/32] kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
  2017-03-16 10:48     ` Paolo Bonzini
                       ` (2 preceding siblings ...)
  (?)
@ 2017-03-16 18:20     ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-16 18:20 UTC (permalink / raw)
  To: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd
  Cc: brijesh.singh


On 03/16/2017 05:48 AM, Paolo Bonzini wrote:
>
>
> On 02/03/2017 16:17, Brijesh Singh wrote:
>> +static struct page **sev_pin_memory(unsigned long uaddr, unsigned long ulen,
>> +				    unsigned long *n)
>> +{
>> +	struct page **pages;
>> +	int first, last;
>> +	unsigned long npages, pinned;
>> +
>> +	/* Get number of pages */
>> +	first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
>> +	last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
>> +	npages = (last - first + 1);
>> +
>> +	pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
>> +	if (!pages)
>> +		return NULL;
>> +
>> +	/* pin the user virtual address */
>> +	down_read(&current->mm->mmap_sem);
>> +	pinned = get_user_pages_fast(uaddr, npages, 1, pages);
>> +	up_read(&current->mm->mmap_sem);
>
> get_user_pages_fast, like get_user_pages_unlocked, must be called
> without mmap_sem held.

Sure.

>
>> +	if (pinned != npages) {
>> +		printk(KERN_ERR "SEV: failed to pin  %ld pages (got %ld)\n",
>> +				npages, pinned);
>> +		goto err;
>> +	}
>> +
>> +	*n = npages;
>> +	return pages;
>> +err:
>> +	if (pinned > 0)
>> +		release_pages(pages, pinned, 0);
>> +	kfree(pages);
>> +
>> +	return NULL;
>> +}
>>
>> +	/* the array of pages returned by get_user_pages() is a page-aligned
>> +	 * memory. Since the user buffer is probably not page-aligned, we need
>> +	 * to calculate the offset within a page for first update entry.
>> +	 */
>> +	offset = uaddr & (PAGE_SIZE - 1);
>> +	len = min_t(size_t, (PAGE_SIZE - offset), ulen);
>> +	ulen -= len;
>> +
>> +	/* update first page -
>> +	 * special care need to be taken for the first page because we might
>> +	 * be dealing with offset within the page
>> +	 */
>
> No need to special case the first page; just set "offset = 0" inside the
> loop after the first iteration.
>

Will do.

-Brijesh
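
Folding both review comments in, the helper and its caller would take
roughly the following shape (a sketch of the direction only, not the
actual next revision of the patch):

	static struct page **sev_pin_memory(unsigned long uaddr,
					    unsigned long ulen,
					    unsigned long *n)
	{
		unsigned long first, last, npages;
		struct page **pages;
		long pinned;

		/* Get number of pages */
		first = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
		last = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
		npages = last - first + 1;

		pages = kcalloc(npages, sizeof(struct page *), GFP_KERNEL);
		if (!pages)
			return NULL;

		/* get_user_pages_fast() takes mmap_sem itself; it must be
		 * called without the semaphore held */
		pinned = get_user_pages_fast(uaddr, npages, 1, pages);
		if (pinned != npages)
			goto err;

		*n = npages;
		return pages;
	err:
		if (pinned > 0)
			release_pages(pages, pinned, 0);
		kfree(pages);
		return NULL;
	}

and the update loop no longer needs to special-case the first page:

	offset = uaddr & (PAGE_SIZE - 1);
	for (i = 0; i < npages && ulen; i++) {
		len = min_t(size_t, PAGE_SIZE - offset, ulen);
		/* ... issue LAUNCH_UPDATE_DATA for pages[i] at offset ... */
		ulen -= len;
		offset = 0;	/* only the first page can be unaligned */
	}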

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-10 22:41       ` Brijesh Singh
  (?)
@ 2017-03-16 18:28         ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-16 18:28 UTC (permalink / raw)
  To: Brijesh Singh, Paolo Bonzini
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas, tglx,
	mchehab, iamjoonsoo.kim, labbott

On Fri, Mar 10, 2017 at 04:41:56PM -0600, Brijesh Singh wrote:
> I can take a look at fixing those warnings. My initial attempt was to
> create a new function to clear the encryption bit, but it ended up
> looking very similar to __change_page_attr_set_clr(), hence I decided
> to extend the existing function to use memblock_alloc().

... except that having all that SEV-specific code in main code paths is
yucky and I'd like to avoid it, if possible.

> Early in the boot process, the guest kernel allocates some structures
> (either statically allocated or dynamically allocated via memblock_alloc)
> and shares the physical address of these structures with the hypervisor.
> Since the entire guest memory area is mapped as encrypted, those
> structures are mapped as an encrypted memory range. We need a method to
> clear the encryption bit. Sometimes these structures may be part of 2MB
> pages and need to be split into smaller pages.

So how hard would it be if the hypervisor allocated that memory for the
guest instead? It would allocate it decrypted and the guest would need to
access it decrypted too. All in preparation for SEV-ES, which will need a
block of unencrypted memory for the guest anyway...

> In most cases, guest and hypervisor communication starts as soon as the
> guest provides the physical address to the hypervisor. So we must map the
> pages as decrypted before sharing the physical address with the hypervisor.

See above: so purely theoretically speaking, the hypervisor could prep
that decrypted range for the guest. I'd look in Paolo's direction,
though, for the feasibility of something like that.

Thanks.
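
For context, the guest-side operation under discussion reduces to
roughly the following when the shared structure sits inside a 2MB
mapping. This is a heavily simplified sketch: it assumes
memblock_virt_alloc() is still usable at that point, and it does not
preserve the original protection bits as real code would have to:

	static pte_t * __init split_large_pmd_early(pmd_t *pmd)
	{
		unsigned long pfn = pmd_pfn(*pmd);
		pte_t *pte;
		int i;

		/* the page allocator is not up yet, so take the new
		 * page-table page from memblock */
		pte = memblock_virt_alloc(PAGE_SIZE, PAGE_SIZE);

		/* PAGE_KERNEL here drops the large page's attributes;
		 * real code must carry them over */
		for (i = 0; i < PTRS_PER_PTE; i++)
			set_pte(&pte[i], pfn_pte(pfn + i, PAGE_KERNEL));

		set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
		flush_tlb_all();
		return pte;
	}

Clearing the C-bit on the 4KB entry covering the shared structure is
then just a matter of rewriting that PTE without sme_me_mask.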

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 30/32] kvm: svm: Add support for SEV DEBUG_ENCRYPT command
  2017-03-16 11:03     ` Paolo Bonzini
  (?)
  (?)
@ 2017-03-16 18:34     ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-16 18:34 UTC (permalink / raw)
  To: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd
  Cc: brijesh.singh



On 03/16/2017 06:03 AM, Paolo Bonzini wrote:
>
>
> On 02/03/2017 16:18, Brijesh Singh wrote:
>> +	data = (void *) get_zeroed_page(GFP_KERNEL);
>
> The page does not need to be zeroed, does it?
>

No, we don't have to zero it. I will fix it.

>> +
>> +	if ((len & 15) || (dst_addr & 15)) {
>> +		/* if destination address and length are not 16-byte
>> +		 * aligned then:
>> +		 * a) decrypt destination page into temporary buffer
>> +		 * b) copy source data into temporary buffer at correct offset
>> +		 * c) encrypt temporary buffer
>> +		 */
>> +		ret = __sev_dbg_decrypt_page(kvm, dst_addr, data, &argp->error);
>
> Ah, I see now you're using this function here for read-modify-write.
> data is already pinned here, so even if you keep the function it makes
> sense to push pinning out of __sev_dbg_decrypt_page and into
> sev_dbg_decrypt.

I can push the pinning part out of __sev_dbg_decrypt_page.

>
>> +		if (ret)
>> +			goto err_3;
>> +		d_off = dst_addr & (PAGE_SIZE - 1);
>> +
>> +		if (copy_from_user(data + d_off,
>> +					(uint8_t *)debug.src_addr, len)) {
>> +			ret = -EFAULT;
>> +			goto err_3;
>> +		}
>> +
>> +		encrypt->length = PAGE_SIZE;
>
> Why decrypt/re-encrypt all the page instead of just the 16 byte area
> around the [dst_addr, dst_addr+len) range?
>

Good catch, I should be fine just decrypting a 16-byte area. Will fix in the next rev.

>> +		encrypt->src_addr = __psp_pa(data);
>> +		encrypt->dst_addr =  __sev_page_pa(inpages[0]);
>> +	} else {
>> +		if (copy_from_user(data, (uint8_t *)debug.src_addr, len)) {
>> +			ret = -EFAULT;
>> +			goto err_3;
>> +		}
>
> Do you need copy_from_user, or can you just pin/unpin memory as for
> DEBUG_DECRYPT?
>

We can work either with pin/unpin or copy_from_user. I think I chose copy_from_user
because most of the time the ENCRYPT path was used when I set breakpoints through gdb,
which requires copying only a small amount of data into guest memory. It is quite
possible that someone will try to copy a lot more data, in which case pin/unpin could
speed things up.

-Brijesh
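
The narrowed read-modify-write might look roughly like this; names
follow the quoted patch, and __sev_dbg_decrypt() is a placeholder for a
length-taking variant of __sev_dbg_decrypt_page(), not its final form:

	unsigned long start = dst_addr & ~15UL;		/* round down */
	unsigned long end = ALIGN(dst_addr + len, 16);	/* round up */
	size_t win = end - start;	/* stays within one page, since
					 * PAGE_SIZE is a multiple of 16 */
	int ret;

	/* a) decrypt only the aligned window into the scratch buffer */
	ret = __sev_dbg_decrypt(kvm, start, data, win, &argp->error);
	if (ret)
		goto err_3;

	/* b) splice the user bytes into the window at the right offset */
	if (copy_from_user(data + (dst_addr - start),
			   (uint8_t *)debug.src_addr, len)) {
		ret = -EFAULT;
		goto err_3;
	}

	/* c) re-encrypt just the window back into guest memory */
	encrypt->length   = win;
	encrypt->src_addr = __psp_pa(data);
	encrypt->dst_addr = __sev_page_pa(inpages[0]) +
			    (start & (PAGE_SIZE - 1));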

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 30/32] kvm: svm: Add support for SEV DEBUG_ENCRYPT command
@ 2017-03-16 18:34       ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-16 18:34 UTC (permalink / raw)
  To: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel
  Cc: brijesh.singh



On 03/16/2017 06:03 AM, Paolo Bonzini wrote:
>
>
> On 02/03/2017 16:18, Brijesh Singh wrote:
>> +	data = (void *) get_zeroed_page(GFP_KERNEL);
>
> The page does not need to be zeroed, does it?
>

No, we don't have to zero it. I will fix it.

>> +
>> +	if ((len & 15) || (dst_addr & 15)) {
>> +		/* if destination address and length are not 16-byte
>> +		 * aligned then:
>> +		 * a) decrypt destination page into temporary buffer
>> +		 * b) copy source data into temporary buffer at correct offset
>> +		 * c) encrypt temporary buffer
>> +		 */
>> +		ret = __sev_dbg_decrypt_page(kvm, dst_addr, data, &argp->error);
>
> Ah, I see now you're using this function here for read-modify-write.
> data is already pinned here, so even if you keep the function it makes
> sense to push pinning out of __sev_dbg_decrypt_page and into
> sev_dbg_decrypt.

I can push out pinning part outside __sev_dbg_decrypt_page

>
>> +		if (ret)
>> +			goto err_3;
>> +		d_off = dst_addr & (PAGE_SIZE - 1);
>> +
>> +		if (copy_from_user(data + d_off,
>> +					(uint8_t *)debug.src_addr, len)) {
>> +			ret = -EFAULT;
>> +			goto err_3;
>> +		}
>> +
>> +		encrypt->length = PAGE_SIZE;
>
> Why decrypt/re-encrypt all the page instead of just the 16 byte area
> around the [dst_addr, dst_addr+len) range?
>

good catch, I should be fine just decrypting a 16 byte area. Will fix in next rev

>> +		encrypt->src_addr = __psp_pa(data);
>> +		encrypt->dst_addr =  __sev_page_pa(inpages[0]);
>> +	} else {
>> +		if (copy_from_user(data, (uint8_t *)debug.src_addr, len)) {
>> +			ret = -EFAULT;
>> +			goto err_3;
>> +		}
>
> Do you need copy_from_user, or can you just pin/unpin memory as for
> DEBUG_DECRYPT?
>

We can work either with pin/unpin or copy_from_user. I think I choose copy_from_user because
in most of time ENCRYPT path was used when I set breakpoint through gdb which basically
requires copying pretty small data into guest memory. It may be very much possible that
someone can try to copy lot more data and then pin/unpin can speedup the things.

-Brijesh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 30/32] kvm: svm: Add support for SEV DEBUG_ENCRYPT command
@ 2017-03-16 18:34       ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-16 18:34 UTC (permalink / raw)
  To: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, akpm, davem
  Cc: brijesh.singh



On 03/16/2017 06:03 AM, Paolo Bonzini wrote:
>
>
> On 02/03/2017 16:18, Brijesh Singh wrote:
>> +	data = (void *) get_zeroed_page(GFP_KERNEL);
>
> The page does not need to be zeroed, does it?
>

No, we don't have to zero it. I will fix it.

>> +
>> +	if ((len & 15) || (dst_addr & 15)) {
>> +		/* if destination address and length are not 16-byte
>> +		 * aligned then:
>> +		 * a) decrypt destination page into temporary buffer
>> +		 * b) copy source data into temporary buffer at correct offset
>> +		 * c) encrypt temporary buffer
>> +		 */
>> +		ret = __sev_dbg_decrypt_page(kvm, dst_addr, data, &argp->error);
>
> Ah, I see now you're using this function here for read-modify-write.
> data is already pinned here, so even if you keep the function it makes
> sense to push pinning out of __sev_dbg_decrypt_page and into
> sev_dbg_decrypt.

I can push out pinning part outside __sev_dbg_decrypt_page

>
>> +		if (ret)
>> +			goto err_3;
>> +		d_off = dst_addr & (PAGE_SIZE - 1);
>> +
>> +		if (copy_from_user(data + d_off,
>> +					(uint8_t *)debug.src_addr, len)) {
>> +			ret = -EFAULT;
>> +			goto err_3;
>> +		}
>> +
>> +		encrypt->length = PAGE_SIZE;
>
> Why decrypt/re-encrypt all the page instead of just the 16 byte area
> around the [dst_addr, dst_addr+len) range?
>

good catch, I should be fine just decrypting a 16 byte area. Will fix in next rev

>> +		encrypt->src_addr = __psp_pa(data);
>> +		encrypt->dst_addr =  __sev_page_pa(inpages[0]);
>> +	} else {
>> +		if (copy_from_user(data, (uint8_t *)debug.src_addr, len)) {
>> +			ret = -EFAULT;
>> +			goto err_3;
>> +		}
>
> Do you need copy_from_user, or can you just pin/unpin memory as for
> DEBUG_DECRYPT?
>

We can work either with pin/unpin or copy_from_user. I think I choose copy_from_user because
in most of time ENCRYPT path was used when I set breakpoint through gdb which basically
requires copying pretty small data into guest memory. It may be very much possible that
someone can try to copy lot more data and then pin/unpin can speedup the things.

-Brijesh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 29/32] kvm: svm: Add support for SEV DEBUG_DECRYPT command
  2017-03-16 10:54     ` Paolo Bonzini
@ 2017-03-16 18:41     ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-16 18:41 UTC (permalink / raw)
  To: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd
  Cc: brijesh.singh



On 03/16/2017 05:54 AM, Paolo Bonzini wrote:
>
>
> On 02/03/2017 16:18, Brijesh Singh wrote:
>> +static int __sev_dbg_decrypt_page(struct kvm *kvm, unsigned long src,
>> +		void *dst, int *error)
>> +{
>> +	inpages = sev_pin_memory(src, PAGE_SIZE, &npages);
>> +	if (!inpages) {
>> +		ret = -ENOMEM;
>> +		goto err_1;
>> +	}
>> +
>> +	data->handle = sev_get_handle(kvm);
>> +	data->dst_addr = __psp_pa(dst);
>> +	data->src_addr = __sev_page_pa(inpages[0]);
>> +	data->length = PAGE_SIZE;
>> +
>> +	ret = sev_issue_cmd(kvm, SEV_CMD_DBG_DECRYPT, data, error);
>> +	if (ret)
>> +		printk(KERN_ERR "SEV: DEBUG_DECRYPT %d (%#010x)\n",
>> +				ret, *error);
>> +	sev_unpin_memory(inpages, npages);
>> +err_1:
>> +	kfree(data);
>> +	return ret;
>> +}
>> +
>> +static int sev_dbg_decrypt(struct kvm *kvm, struct kvm_sev_cmd *argp)
>> +{
>> +	void *data;
>> +	int ret, offset, len;
>> +	struct kvm_sev_dbg debug;
>> +
>> +	if (!sev_guest(kvm))
>> +		return -ENOTTY;
>> +
>> +	if (copy_from_user(&debug, (void *)argp->data,
>> +				sizeof(struct kvm_sev_dbg)))
>> +		return -EFAULT;
>> +	/*
>> +	 * TODO: add support for decrypting length which crosses the
>> +	 * page boundary.
>> +	 */
>> +	offset = debug.src_addr & (PAGE_SIZE - 1);
>> +	if (offset + debug.length > PAGE_SIZE)
>> +		return -EINVAL;
>> +
>
> Please do add it, it doesn't seem very different from what you're doing
> in LAUNCH_UPDATE_DATA.  There's no need for a separate
> __sev_dbg_decrypt_page function, you can just pin/unpin here and do a
> per-page loop as in LAUNCH_UPDATE_DATA.
>

I can certainly add support to handle the cases which cross a page boundary.
Should we limit the size, to prevent userspace from passing an arbitrarily long
length and making us loop inside the kernel? I was thinking of limiting it to PAGE_SIZE.
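
The capped per-page loop would look something like this (rough, untested sketch;
__sev_dbg_decrypt_range() is a placeholder for the per-chunk helper):

	/* cap the request so userspace cannot make us loop for long */
	if (debug.length > PAGE_SIZE)
		return -EINVAL;

	vaddr = debug.src_addr;
	len = debug.length;

	while (len) {
		/* largest chunk which stays within the current page */
		offset = vaddr & (PAGE_SIZE - 1);
		sz = min_t(unsigned long, len, PAGE_SIZE - offset);

		ret = __sev_dbg_decrypt_range(kvm, vaddr, dst, sz,
					      &argp->error);
		if (ret)
			break;

		vaddr += sz;
		dst += sz;
		len -= sz;
	}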

~ Brijesh

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 05/32] x86: Use encrypted access of BOOT related data with SEV
  2017-03-07 11:09     ` Borislav Petkov
@ 2017-03-16 19:03       ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-16 19:03 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, jroedel, keescook, arnd, toshi.kani,
	mathieu.desnoyers, luto, devel, bhelgaas, tglx, mchehab

On 3/7/2017 5:09 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:12:59AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> When Secure Encrypted Virtualization (SEV) is active, BOOT data (such as
>> EFI related data, setup data) is encrypted and needs to be accessed as
>> such when mapped. Update the architecture override in early_memremap to
>> keep the encryption attribute when mapping this data.
>
> This could also explain why persistent memory needs to be accessed
> decrypted with SEV.

I'll add some comments about why persistent memory needs to be accessed
decrypted (because the encryption key changes across reboots) for both
SME and SEV.
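
Roughly what I have in mind for the early_memremap() override, with the reasoning
in the comments (sketch only; memremap_is_persistent() is a placeholder for the
actual E820-based check):

	pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
						     unsigned long size,
						     pgprot_t prot)
	{
		if (!sme_active() && !sev_active())
			return prot;

		/*
		 * Persistent memory keeps its contents across reboots,
		 * but the memory encryption key does not, so it must
		 * always be mapped decrypted.
		 */
		if (memremap_is_persistent(phys_addr, size))
			return pgprot_decrypted(prot);

		/*
		 * Under SEV, boot data (EFI data, setup_data, ACPI
		 * tables) was written encrypted, so keep the encryption
		 * attribute when mapping it.
		 */
		return pgprot_encrypted(prot);
	}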

>
> In general, what the difference in that aspect is in respect to SME. And
> I'd write that in the comment over the function. And not say "E820 areas
> are checked in making this determination." because that is visible but
> say *why* we need to check those ranges and determine access depending
> on their type.

Will do.

Thanks,
Tom

>


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
  2017-03-07 14:59     ` Borislav Petkov
@ 2017-03-16 20:04       ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-16 20:04 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, jroedel, keescook, arnd, toshi.kani,
	mathieu.desnoyers, luto, devel, bhelgaas, tglx, mchehab

On 3/7/2017 8:59 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> In order for memory pages to be properly mapped when SEV is active, we
>> need to use the PAGE_KERNEL protection attribute as the base protection.
>> This will ensure that memory mapping of, e.g. ACPI tables, receives the
>> proper mapping attributes.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index c400ab5..481c999 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -151,7 +151,15 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>                 pcm = new_pcm;
>>         }
>>
>> +       /*
>> +        * If the page being mapped is in memory and SEV is active then
>> +        * make sure the memory encryption attribute is enabled in the
>> +        * resulting mapping.
>> +        */
>>         prot = PAGE_KERNEL_IO;
>> +       if (sev_active() && page_is_mem(pfn))
>
> Hmm, a resource tree walk per ioremap call. This could get expensive for
> ioremap-heavy workloads.
>
> __ioremap_caller() gets called here during boot 55 times so not a whole
> lot but I wouldn't be surprised if there were some nasty use cases which
> ioremap a lot.
>
> ...
>
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 9b5f044..db56ba3 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
>>  }
>>  EXPORT_SYMBOL_GPL(page_is_ram);
>>
>> +/*
>> + * This function returns true if the target memory is marked as
>> + * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
>> + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
>> + */
>> +static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
>> +{
>> +	struct resource res;
>> +	unsigned long pfn, end_pfn;
>> +	u64 orig_end;
>> +	int ret = -1;
>> +
>> +	res.start = (u64) start_pfn << PAGE_SHIFT;
>> +	res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
>> +	res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>> +	orig_end = res.end;
>> +	while ((res.start < res.end) &&
>> +		(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
>> +		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
>> +		end_pfn = (res.end + 1) >> PAGE_SHIFT;
>> +		if (end_pfn > pfn)
>> +			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
>> +		if (ret)
>> +			break;
>> +		res.start = res.end + 1;
>> +		res.end = orig_end;
>> +	}
>> +	return ret;
>> +}
>
> So the relevant difference between this one and walk_system_ram_range()
> is this:
>
> -			ret = (*func)(pfn, end_pfn - pfn, arg);
> +			ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
>
> so it seems to me you can have your own *func() pointer which does that
> IORES_DESC_NONE comparison. And then you can define your own workhorse
> __walk_memory_range() which gets called by both walk_mem_range() and
> walk_system_ram_range() instead of almost duplicating them.
>
> And looking at walk_system_ram_res(), that one looks similar too except
> the pfn computation. But AFAICT the pfn/end_pfn things are computed from
> res.start and res.end so it looks to me like all those three functions
> are crying for unification...

I'll take a look at what it takes to consolidate these with a pre-patch. 
Then I'll add the new support.
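
Something like this as the shared workhorse, perhaps (rough and untested; names
are placeholders):

	static int __walk_iomem_res(u64 start, u64 end, unsigned long flags,
				    void *arg,
				    int (*func)(struct resource *, void *))
	{
		struct resource res;
		u64 orig_end;
		int ret = -1;

		res.start = start;
		res.end = end;
		res.flags = flags;
		orig_end = res.end;

		while ((res.start < res.end) &&
		       (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
			ret = (*func)(&res, arg);
			if (ret)
				break;
			res.start = res.end + 1;
			res.end = orig_end;
		}

		return ret;
	}

	/* walk_mem_range() then only supplies the descriptor check */
	static int mem_desc_check(struct resource *res, void *arg)
	{
		unsigned long pfn = (res->start + PAGE_SIZE - 1) >> PAGE_SHIFT;
		unsigned long end_pfn = (res->end + 1) >> PAGE_SHIFT;

		if (end_pfn > pfn)
			return (res->desc != IORES_DESC_NONE) ? 1 : 0;

		return 0;
	}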

Thanks,
Tom

>


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-16 18:28         ` Borislav Petkov
@ 2017-03-16 22:25           ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-16 22:25 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: linux-efi, kvm, rkrcmar, matt, linux-pci, linus.walleij,
	gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr, mchehab,
	simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, bhelgaas,
	dan.j.williams, andriy.shevchenko, herbert, paul.gortmaker,
	labbott, devel, iamjoonsoo.kim



On 16/03/2017 19:28, Borislav Petkov wrote:
> So how hard would it be if the hypervisor allocated that memory for the
> guest instead? It would allocate it decrypted and guest would need to
> access it decrypted too. All in preparation for SEV-ES which will need a
> block of unencrypted memory for the guest anyway...

The kvmclock memory is initially zero so there is no need for the
hypervisor to allocate anything; the point of these patches is just to
access the data in a natural way from Linux source code.

I also don't really like the patch as is (plus it fails modpost), but
IMO reusing __change_page_attr and __split_large_page is the right thing
to do.

Paolo

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-16 22:25           ` Paolo Bonzini
@ 2017-03-17 10:17             ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-17 10:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Thu, Mar 16, 2017 at 11:25:36PM +0100, Paolo Bonzini wrote:
> The kvmclock memory is initially zero so there is no need for the
> hypervisor to allocate anything; the point of these patches is just to
> access the data in a natural way from Linux source code.

I realize that.

> I also don't really like the patch as is (plus it fails modpost), but
> IMO reusing __change_page_attr and __split_large_page is the right thing
> to do.

Right, so teaching pageattr.c about memblock could theoretically come
around and bite us later when a page allocated with memblock gets freed
with free_page().

And looking at this more, we have all this kernel pagetable preparation
code down in the init_mem_mapping() call and the pagetable setup in
arch/x86/mm/init_{32,64}.c.

And that code even does some basic page splitting. Oh and it uses
alloc_low_pages() which knows whether to do memblock reservation or the
common __get_free_pages() when slabs are up.

So what would be much cleaner, IMHO, is if one would reuse that code to
change init_mm.pgd mappings early without copying pageattr.c.

init_mem_mapping() gets called before kvm_guest_init() in setup_arch()
so the guest would simply fixup its pagetable right there.
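
I.e., the guest side could then do its fixup along these lines (illustrative only;
early_set_memory_decrypted(), hv_clock_pa and KVMCLOCK_SIZE are made-up names for
whatever the early fixup helper and the kvmclock area end up being):

	void __init kvm_guest_init(void)
	{
		/*
		 * init_mem_mapping() has already built init_mm.pgd at
		 * this point, so alloc_low_pages() can be used for any
		 * page splitting that the fixup needs.
		 */
		if (sev_active())
			early_set_memory_decrypted(hv_clock_pa, KVMCLOCK_SIZE);

		/* ... rest of the existing kvm_guest_init() ... */
	}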

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-17 10:17             ` Borislav Petkov
@ 2017-03-17 10:47               ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-17 10:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab



On 17/03/2017 11:17, Borislav Petkov wrote:
> 
>> I also don't really like the patch as is (plus it fails modpost), but
>> IMO reusing __change_page_attr and __split_large_page is the right thing
>> to do.
> 
> Right, so teaching pageattr.c about memblock could theoretically come
> around and bite us later when a page allocated with memblock gets freed
> with free_page().

Theoretically or practically?

> And looking at this more, we have all this kernel pagetable preparation
> code down the init_mem_mapping() call and the pagetable setup in
> arch/x86/mm/init_{32,64}.c

It only looks at the E820 map, doesn't it?  Why does it have to do
anything with percpu memory areas?

Paolo

> And that code even does some basic page splitting. Oh and it uses
> alloc_low_pages() which knows whether to do memblock reservation or the
> common __get_free_pages() when slabs are up.
> 
> So what would be much cleaner, IMHO, is if one would reuse that code to
> change init_mm.pgd mappings early without copying pageattr.c.
> 
> init_mem_mapping() gets called before kvm_guest_init() in setup_arch()
> so the guest would simply fixup its pagetable right there.


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-17 10:47               ` Paolo Bonzini
  (?)
@ 2017-03-17 10:56                 ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-17 10:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Fri, Mar 17, 2017 at 11:47:16AM +0100, Paolo Bonzini wrote:
> Theoretically or practically?

In the sense that it needs to be tried first to see how ugly it can get.

> It only looks at the E820 map, doesn't it?  Why does it have to do
> anything with percpu memory areas?

That's irrelevant. What we want to do is take what's in init_mm.pgd and
modify it. And use the facilities in arch/x86/mm/init_{32,64}.c because
they already know about early/late pagetable pages allocation and they
deal with the kernel pagetable anyway.

And *not* teach pageattr.c about memblock because that can be misused,
as tglx pointed out on IRC.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-17 10:56                 ` Borislav Petkov
  (?)
@ 2017-03-17 11:03                   ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-17 11:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-efi, Brijesh Singh, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr,
	mchehab, simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc,
	mingo, msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, bhelgaas,
	dan.j.williams, andriy.shevchenko, herbert, paul.gortmaker,
	devel



On 17/03/2017 11:56, Borislav Petkov wrote:
>> Theoretically or practically?
> In the sense that it needs to be tried first to see how ugly it can get.
> 
>> It only looks at the E820 map, doesn't it?  Why does it have to do
>> anything with percpu memory areas?
> That's irrelevant. What we want to do is take what's in init_mm.pgd and
> modify it. And use the facilities in arch/x86/mm/init_{32,64}.c because
> they already know about early/late pagetable pages allocation and they
> deal with the kernel pagetable anyway.

If it is possible to do it in a fairly hypervisor-independent manner,
I'm all for it.  That is, only by looking at AMD-specified CPUID leaves
and at kernel ELF sections.

Paolo

> And *not* teach pageattr.c about memblock because that can be misused,
> as tglx pointed out on IRC.

^ permalink raw reply	[flat|nested] 424+ messages in thread
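
As a reference for the CPUID-only detection Paolo asks for: the AMD
architectural leaf 0x8000001f advertises SEV in EAX bit 1 and the C-bit
position in EBX[5:0]. A sketch follows; the series itself additionally uses
a KVM_FEATURE CPUID bit on the guest side, so treat this as illustrative:

	#define AMD_SME_SEV_LEAF	0x8000001f

	static bool sev_cpuid_enabled(u64 *me_mask)
	{
		unsigned int eax, ebx, ecx, edx;

		cpuid(AMD_SME_SEV_LEAF, &eax, &ebx, &ecx, &edx);
		if (!(eax & BIT(1)))			/* EAX[1]: SEV supported */
			return false;

		*me_mask = 1ULL << (ebx & 0x3f);	/* EBX[5:0]: C-bit position */
		return true;
	}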

* Re: [RFC PATCH v2 29/32] kvm: svm: Add support for SEV DEBUG_DECRYPT command
  2017-03-16 18:41       ` Brijesh Singh
                         ` (2 preceding siblings ...)
  (?)
@ 2017-03-17 11:09       ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-17 11:09 UTC (permalink / raw)
  To: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 16/03/2017 19:41, Brijesh Singh wrote:
>>
>> Please do add it, it doesn't seem very different from what you're doing
>> in LAUNCH_UPDATE_DATA.  There's no need for a separate
>> __sev_dbg_decrypt_page function, you can just pin/unpin here and do a
>> per-page loop as in LAUNCH_UPDATE_DATA.
> 
> I can certainly add support to handle crossing the page boundary cases.
> Should we limit the size to prevent a user passing an arbitrarily long
> length that has us looping inside the kernel? I was thinking of limiting
> it to PAGE_SIZE.

I guess it depends on how it's used.  PAGE_SIZE makes sense since you
only know if a physical address is encrypted when you reach it from a
walk of the page tables.

Paolo

^ permalink raw reply	[flat|nested] 424+ messages in thread
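
A sketch of the page-bounded loop being discussed, so that one oversized
length from userspace cannot keep the kernel looping inside a single
request; sev_dbg_decrypt_one_page() is a hypothetical stand-in for the
pin + SEV firmware command + copy-out steps, not a function from the series:

	static int sev_dbg_decrypt_range(unsigned long paddr,
					 void __user *dst, unsigned long len)
	{
		int ret = 0;

		while (len && !ret) {
			/* Clamp each firmware command to one page. */
			unsigned long off = paddr & ~PAGE_MASK;
			unsigned long sz  = min(len, PAGE_SIZE - off);

			ret = sev_dbg_decrypt_one_page(paddr, dst, sz);
			paddr += sz;
			dst   += sz;
			len   -= sz;
		}

		return ret;
	}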

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-17 11:03                   ` Paolo Bonzini
  (?)
@ 2017-03-17 11:33                     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-17 11:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Fri, Mar 17, 2017 at 12:03:31PM +0100, Paolo Bonzini wrote:

> If it is possible to do it in a fairly hypervisor-independent manner,
> I'm all for it.  That is, only by looking at AMD-specified CPUID leaves
> and at kernel ELF sections.

Not even that.

What that needs to be able to do is:

	kvm_map_percpu_hv_shared(st, sizeof(*st))

where st is the percpu steal time ptr:

	struct kvm_steal_time *st = &per_cpu(steal_time, cpu);

Underneath, it basically clears the encryption mask from the pte; see
patch 16/32.

And I keep talking about SEV-ES because this is going to expand on the
need for a shared memory region which both the hypervisor and the guest
need to access, thus unencrypted. See

http://support.amd.com/TechDocs/Protecting%20VM%20Register%20State%20with%20SEV-ES.pdf

This is where you come in and say what would be the best approach there...

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
  2017-03-16 20:04       ` Tom Lendacky
  (?)
  (?)
@ 2017-03-17 14:32         ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-17 14:32 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, jroedel, keescook, arnd, toshi.kani,
	mathieu.desnoyers, luto, devel, bhelgaas, tglx, mchehab

On 3/16/2017 3:04 PM, Tom Lendacky wrote:
> On 3/7/2017 8:59 AM, Borislav Petkov wrote:
>> On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:
>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>
>>> In order for memory pages to be properly mapped when SEV is active, we
>>> need to use the PAGE_KERNEL protection attribute as the base protection.
>>> This will ensure that memory mapping of, e.g. ACPI tables, receives the
>>> proper mapping attributes.
>>>
>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>>> ---
>>
>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>> index c400ab5..481c999 100644
>>> --- a/arch/x86/mm/ioremap.c
>>> +++ b/arch/x86/mm/ioremap.c
>>> @@ -151,7 +151,15 @@ static void __iomem
>>> *__ioremap_caller(resource_size_t phys_addr,
>>>                 pcm = new_pcm;
>>>         }
>>>
>>> +       /*
>>> +        * If the page being mapped is in memory and SEV is active then
>>> +        * make sure the memory encryption attribute is enabled in the
>>> +        * resulting mapping.
>>> +        */
>>>         prot = PAGE_KERNEL_IO;
>>> +       if (sev_active() && page_is_mem(pfn))
>>
>> Hmm, a resource tree walk per ioremap call. This could get expensive for
>> ioremap-heavy workloads.
>>
>> __ioremap_caller() gets called here during boot 55 times so not a whole
>> lot but I wouldn't be surprised if there were some nasty use cases which
>> ioremap a lot.
>>
>> ...
>>
>>> diff --git a/kernel/resource.c b/kernel/resource.c
>>> index 9b5f044..db56ba3 100644
>>> --- a/kernel/resource.c
>>> +++ b/kernel/resource.c
>>> @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
>>>  }
>>>  EXPORT_SYMBOL_GPL(page_is_ram);
>>>
>>> +/*
>>> + * This function returns true if the target memory is marked as
>>> + * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
>>> + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
>>> + */
>>> +static int walk_mem_range(unsigned long start_pfn, unsigned long
>>> nr_pages)
>>> +{
>>> +    struct resource res;
>>> +    unsigned long pfn, end_pfn;
>>> +    u64 orig_end;
>>> +    int ret = -1;
>>> +
>>> +    res.start = (u64) start_pfn << PAGE_SHIFT;
>>> +    res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
>>> +    res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>>> +    orig_end = res.end;
>>> +    while ((res.start < res.end) &&
>>> +        (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
>>> +        pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
>>> +        end_pfn = (res.end + 1) >> PAGE_SHIFT;
>>> +        if (end_pfn > pfn)
>>> +            ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
>>> +        if (ret)
>>> +            break;
>>> +        res.start = res.end + 1;
>>> +        res.end = orig_end;
>>> +    }
>>> +    return ret;
>>> +}
>>
>> So the relevant difference between this one and walk_system_ram_range()
>> is this:
>>
>> -            ret = (*func)(pfn, end_pfn - pfn, arg);
>> +            ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
>>
>> so it seems to me you can have your own *func() pointer which does that
>> IORES_DESC_NONE comparison. And then you can define your own workhorse
>> __walk_memory_range() which gets called by both walk_mem_range() and
>> walk_system_ram_range() instead of almost duplicating them.
>>
>> And looking at walk_system_ram_res(), that one looks similar too except
>> the pfn computation. But AFAICT the pfn/end_pfn things are computed from
>> res.start and res.end so it looks to me like all those three functions
>> are crying for unification...
>
> I'll take a look at what it takes to consolidate these with a pre-patch.
> Then I'll add the new support.

It looks pretty straightforward to combine walk_iomem_res_desc() and
walk_system_ram_res(). The walk_system_ram_range() function would also
fit easily into this, except that its callback takes unsigned longs vs
the u64s of the other functions.  Is it worth modifying all of the
callers of walk_system_ram_range() (only about 8 locations) to change
the callback functions to accept u64s, in order to consolidate
walk_system_ram_range() too?

Thanks,
Tom

>
> Thanks,
> Tom
>
>>


^ permalink raw reply	[flat|nested] 424+ messages in thread
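
A sketch of the consolidation being discussed: one workhorse that walks
matching iomem resources and hands each hit to a callback, which
walk_iomem_res_desc(), walk_system_ram_res() and the page_is_mem() check
could all sit on top of. Signatures here are illustrative, not the ones
Tom ended up posting:

	static int __walk_iomem_res(struct resource *res, unsigned long desc,
				    int (*func)(struct resource *, void *),
				    void *arg)
	{
		u64 orig_end = res->end;
		int ret = -1;

		while (res->start < res->end &&
		       find_next_iomem_res(res, desc, true) >= 0) {
			ret = func(res, arg);
			if (ret)
				break;
			res->start = res->end + 1;
			res->end = orig_end;
		}

		return ret;
	}

	/* The page_is_mem() test then reduces to a trivial callback: */
	static int mem_desc_check(struct resource *res, void *arg)
	{
		return res->desc != IORES_DESC_NONE ? 1 : 0;
	}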

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-17 11:33                     ` Borislav Petkov
  (?)
@ 2017-03-17 14:45                       ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-17 14:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel



On 17/03/2017 12:33, Borislav Petkov wrote:
> On Fri, Mar 17, 2017 at 12:03:31PM +0100, Paolo Bonzini wrote:
> 
>> If it is possible to do it in a fairly hypervisor-independent manner,
>> I'm all for it.  That is, only by looking at AMD-specified CPUID leaves
>> and at kernel ELF sections.
> 
> Not even that.
> 
> What that needs to be able to do is:
> 
> 	kvm_map_percpu_hv_shared(st, sizeof(*st))
> 
> where st is the percpu steal time ptr:
> 
> 	struct kvm_steal_time *st = &per_cpu(steal_time, cpu);
> 
> Underneath, it basically clears the encryption mask from the pte; see
> patch 16/32.

Yes, and I'd like that to be done with a new data section rather than a
special KVM hook.

> And I keep talking about SEV-ES because this is going to expand on the
> need for a shared memory region which both the hypervisor and the guest
> need to access, thus unencrypted. See
> 
> http://support.amd.com/TechDocs/Protecting%20VM%20Register%20State%20with%20SEV-ES.pdf
> 
> This is where you come in and say what would be the best approach there...

I have no idea.  SEV-ES seems to be very hard to set up at the beginning
of the kernel bootstrap.  There are all sorts of chicken-and-egg problems,
as well as complicated handshakes between the firmware and the guest,
and the way to do it also depends on the trust and threat models.

A much simpler way is to just boot under a trusted hypervisor, do
"modprobe sev-es" and save a snapshot of the guest.  Then you sign the
snapshot and pass it to your cloud provider.

Paolo


^ permalink raw reply	[flat|nested] 424+ messages in thread
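
A sketch of the data-section approach Paolo argues for: shared
guest/hypervisor data lives in a dedicated section that early boot maps
with the C-bit clear once, so no per-object KVM hook is needed. The
section and macro names here are made up; mainline eventually adopted
essentially this shape as the __decrypted section and
DEFINE_PER_CPU_DECRYPTED:

	/* Hypothetical section; vmlinux.lds.S must place it page-aligned
	 * so the whole range can be mapped with the C-bit clear. */
	#define __hv_shared \
		__attribute__((__section__(".data..hv_shared")))

	static struct kvm_steal_time steal_time __hv_shared __aligned(64);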

* Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page
  2017-03-17 14:32         ` Tom Lendacky
  (?)
@ 2017-03-17 14:55           ` Tom Lendacky
  -1 siblings, 0 replies; 424+ messages in thread
From: Tom Lendacky @ 2017-03-17 14:55 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: linux-efi, kvm, rkrcmar, matt, linux-pci, linus.walleij,
	gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr, mchehab,
	simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, labbott, dyoung, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, pbonzini, bhelgaas,
	dan.j.williams, andriy.shevchenko, akpm, herbert, tony.luck,
	paul.gortmaker

On 3/17/2017 9:32 AM, Tom Lendacky wrote:
> On 3/16/2017 3:04 PM, Tom Lendacky wrote:
>> On 3/7/2017 8:59 AM, Borislav Petkov wrote:
>>> On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:
>>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>>
>>>> In order for memory pages to be properly mapped when SEV is active, we
>>>> need to use the PAGE_KERNEL protection attribute as the base
>>>> protection.
>>>> This will ensure that memory mapping of, e.g. ACPI tables, receives the
>>>> proper mapping attributes.
>>>>
>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>>>> ---
>>>
>>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>>> index c400ab5..481c999 100644
>>>> --- a/arch/x86/mm/ioremap.c
>>>> +++ b/arch/x86/mm/ioremap.c
>>>> @@ -151,7 +151,15 @@ static void __iomem
>>>> *__ioremap_caller(resource_size_t phys_addr,
>>>>                 pcm = new_pcm;
>>>>         }
>>>>
>>>> +       /*
>>>> +        * If the page being mapped is in memory and SEV is active then
>>>> +        * make sure the memory encryption attribute is enabled in the
>>>> +        * resulting mapping.
>>>> +        */
>>>>         prot = PAGE_KERNEL_IO;
>>>> +       if (sev_active() && page_is_mem(pfn))
>>>
>>> Hmm, a resource tree walk per ioremap call. This could get expensive for
>>> ioremap-heavy workloads.
>>>
>>> __ioremap_caller() gets called here during boot 55 times so not a whole
>>> lot but I wouldn't be surprised if there were some nasty use cases which
>>> ioremap a lot.
>>>
>>> ...
>>>
>>>> diff --git a/kernel/resource.c b/kernel/resource.c
>>>> index 9b5f044..db56ba3 100644
>>>> --- a/kernel/resource.c
>>>> +++ b/kernel/resource.c
>>>> @@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(page_is_ram);
>>>>
>>>> +/*
>>>> + * This function returns true if the target memory is marked as
>>>> + * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
>>>> + * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
>>>> + */
>>>> +static int walk_mem_range(unsigned long start_pfn, unsigned long
>>>> nr_pages)
>>>> +{
>>>> +    struct resource res;
>>>> +    unsigned long pfn, end_pfn;
>>>> +    u64 orig_end;
>>>> +    int ret = -1;
>>>> +
>>>> +    res.start = (u64) start_pfn << PAGE_SHIFT;
>>>> +    res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
>>>> +    res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>>>> +    orig_end = res.end;
>>>> +    while ((res.start < res.end) &&
>>>> +        (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
>>>> +        pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
>>>> +        end_pfn = (res.end + 1) >> PAGE_SHIFT;
>>>> +        if (end_pfn > pfn)
>>>> +            ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
>>>> +        if (ret)
>>>> +            break;
>>>> +        res.start = res.end + 1;
>>>> +        res.end = orig_end;
>>>> +    }
>>>> +    return ret;
>>>> +}
>>>
>>> So the relevant difference between this one and walk_system_ram_range()
>>> is this:
>>>
>>> -            ret = (*func)(pfn, end_pfn - pfn, arg);
>>> +            ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
>>>
>>> so it seems to me you can have your own *func() pointer which does that
>>> IORES_DESC_NONE comparison. And then you can define your own workhorse
>>> __walk_memory_range() which gets called by both walk_mem_range() and
>>> walk_system_ram_range() instead of almost duplicating them.
>>>
>>> And looking at walk_system_ram_res(), that one looks similar too except
>>> the pfn computation. But AFAICT the pfn/end_pfn things are computed from
>>> res.start and res.end so it looks to me like all those three functions
>>> are crying for unification...
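
(A sketch of the suggested consolidation, reusing find_next_iomem_res()
exactly as the patch above calls it; the names __walk_memory_range()
and check_mem_desc() are illustrative, not from any posted patch.)

/* Common workhorse shared by walk_mem_range() and walk_system_ram_range(). */
static int __walk_memory_range(u64 start, u64 end, void *arg,
			       int (*func)(struct resource *, void *))
{
	struct resource res = {
		.start = start,
		.end   = end,
		.flags = IORESOURCE_MEM | IORESOURCE_BUSY,
	};
	u64 orig_end = end;
	int ret = -1;

	while (res.start < res.end &&
	       find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0) {
		ret = (*func)(&res, arg);
		if (ret)
			break;
		res.start = res.end + 1;
		res.end = orig_end;
	}
	return ret;
}

/* Callback carrying the IORES_DESC_NONE comparison for walk_mem_range(). */
static int check_mem_desc(struct resource *res, void *arg)
{
	return (res->desc != IORES_DESC_NONE) ? 1 : 0;
}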
>>
>> I'll take a look at what it takes to consolidate these with a pre-patch.
>> Then I'll add the new support.
>
> It looks pretty straightforward to combine walk_iomem_res_desc() and
> walk_system_ram_res(). The walk_system_ram_range() function would fit
> easily into this, also, except for the fact that the callback function
> takes unsigned longs vs the u64s of the other functions.  Is it worth
> modifying all of the callers of walk_system_ram_range() (which are only
> about 8 locations) to change the callback functions to accept u64s in
> order to consolidate the walk_system_ram_range() function, too?

The more I dig, the more I find that the changes keep expanding. I'll
leave walk_system_ram_range() out of the consolidation for now.

Thanks,
Tom

>
> Thanks,
> Tom
>
>>
>> Thanks,
>> Tom
>>
>>>

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when splitting large pages
  2017-03-17 14:45                       ` Paolo Bonzini
  (?)
  (?)
@ 2017-03-18 16:37                           ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-18 16:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Fri, Mar 17, 2017 at 03:45:26PM +0100, Paolo Bonzini wrote:
> Yes, and I'd like that to be done with a new data section rather than a
> special KVM hook.

Can you give more details about how, please? Or is there already an example of
that somewhere in the KVM code?
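
(For context, a sketch of the shape such a section could take,
essentially what patch 16 of this series proposes; the section name is
taken from that patch:)

	/* vmlinux.lds.h: a page-aligned per-CPU subsection that can be
	 * remapped with the encryption bit cleared in one go. */
	. = ALIGN(PAGE_SIZE);						\
	*(.data..percpu..hv_shared)					\
	. = ALIGN(PAGE_SIZE);						\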

> I have no idea.  SEV-ES seems to be very hard to set up at the beginning
> of the kernel bootstrap.  There's all sorts of chicken and egg problems,
> as well as complicated handshakes between the firmware and the guest,
> and the way to do it also depends on the trust and threat models.
> 
> A much simpler way is to just boot under a trusted hypervisor, do
> "modprobe sev-es" and save a snapshot of the guest.  Then you sign the
> snapshot and pass it to your cloud provider.

Right, especially the early trapping could be a pain. I don't think this
is cast in stone yet, though...

We'll see.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 15/32] x86: Add support for changing memory encryption attribute in early boot
  2017-03-02 15:15   ` Brijesh Singh
  (?)
@ 2017-03-24 17:12     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-24 17:12 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas

On Thu, Mar 02, 2017 at 10:15:28AM -0500, Brijesh Singh wrote:
> Some KVM-specific custom MSRs share a guest physical address with the
> hypervisor. When SEV is active, the shared physical address must be mapped
> with the encryption attribute cleared so that both the hypervisor and the
> guest can access the data.
> 
> Add APIs to change memory encryption attribute in early boot code.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/mem_encrypt.h |   15 +++++++++
>  arch/x86/mm/mem_encrypt.c          |   63 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 78 insertions(+)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 9799835..95bbe4c 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -47,6 +47,9 @@ void __init sme_unmap_bootdata(char *real_mode_data);
>  
>  void __init sme_early_init(void);
>  
> +int __init early_set_memory_decrypted(void *addr, unsigned long size);
> +int __init early_set_memory_encrypted(void *addr, unsigned long size);
> +
>  /* Architecture __weak replacement functions */
>  void __init mem_encrypt_init(void);
>  
> @@ -110,6 +113,18 @@ static inline void __init sme_early_init(void)
>  {
>  }
>  
> +static inline int __init early_set_memory_decrypted(void *addr,
> +						    unsigned long size)
> +{
> +	return 1;
	^^^^^^^^

return 1 when !CONFIG_AMD_MEM_ENCRYPT ?

The non-early variants return 0.
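
(Presumably the !CONFIG_AMD_MEM_ENCRYPT stubs should read, matching the
non-early variants:)

static inline int __init early_set_memory_decrypted(void *addr,
						    unsigned long size)
{
	return 0;
}

static inline int __init early_set_memory_encrypted(void *addr,
						    unsigned long size)
{
	return 0;
}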

> +}
> +
> +static inline int __init early_set_memory_encrypted(void *addr,
> +						    unsigned long size)
> +{
> +	return 1;
> +}
> +
>  #define __sme_pa		__pa
>  #define __sme_pa_nodebug	__pa_nodebug
>  
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 7df5f4c..567e0d8 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -15,6 +15,7 @@
>  #include <linux/mm.h>
>  #include <linux/dma-mapping.h>
>  #include <linux/swiotlb.h>
> +#include <linux/mem_encrypt.h>
>  
>  #include <asm/tlbflush.h>
>  #include <asm/fixmap.h>
> @@ -258,6 +259,68 @@ static void sme_free(struct device *dev, size_t size, void *vaddr,
>  	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
>  }
>  
> +static unsigned long __init get_pte_flags(unsigned long address)
> +{
> +	int level;
> +	pte_t *pte;
> +	unsigned long flags = _KERNPG_TABLE_NOENC | _PAGE_ENC;
> +
> +	pte = lookup_address(address, &level);
> +	if (!pte)
> +		return flags;
> +
> +	switch (level) {
> +	case PG_LEVEL_4K:
> +		flags = pte_flags(*pte);
> +		break;
> +	case PG_LEVEL_2M:
> +		flags = pmd_flags(*(pmd_t *)pte);
> +		break;
> +	case PG_LEVEL_1G:
> +		flags = pud_flags(*(pud_t *)pte);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	return flags;
> +}
> +
> +int __init early_set_memory_enc_dec(void *vaddr, unsigned long size,
> +				    unsigned long flags)
> +{
> +	unsigned long pfn, npages;
> +	unsigned long addr = (unsigned long)vaddr & PAGE_MASK;
> +
> +	/* We are going to change the physical page attribute from C=1 to C=0.
> +	 * Flush the caches to ensure that all the data with C=1 is flushed to
> +	 * memory. Any caching of the vaddr after function returns will
> +	 * use C=0.
> +	 */

Kernel comments style is:

	/*
	 * A sentence ending with a full-stop.
	 * Another sentence. ...
	 * More sentences. ...
	 */

> +	clflush_cache_range(vaddr, size);
> +
> +	npages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	pfn = slow_virt_to_phys((void *)addr) >> PAGE_SHIFT;
> +
> +	return kernel_map_pages_in_pgd(init_mm.pgd, pfn, addr, npages,
> +					flags & ~sme_me_mask);
> +
> +}
> +
> +int __init early_set_memory_decrypted(void *vaddr, unsigned long size)
> +{
> +	unsigned long flags = get_pte_flags((unsigned long)vaddr);

So this does lookup_address()...

> +	return early_set_memory_enc_dec(vaddr, size, flags & ~sme_me_mask);

... and this does it too in slow_virt_to_phys(). So you do it twice per
vaddr.

So why don't you define a __slow_virt_to_phys() helper - notice
the "__" - which returns flags in its second parameter and which
slow_virt_to_phys() calls with a NULL second parameter in the other
cases?
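
(A sketch of that suggestion: one page-table walk yielding both the
physical address and the flags. Illustrative only; it glosses over the
per-level pfn masking and the flag extraction that a real version, like
get_pte_flags() above, would need.)

static phys_addr_t __slow_virt_to_phys(void *vaddr, unsigned long *pflags)
{
	unsigned long addr = (unsigned long)vaddr;
	int level;
	pte_t *pte = lookup_address(addr, &level);

	BUG_ON(!pte);

	/* Hand the flags back only to callers that asked for them. */
	if (pflags)
		*pflags = pte_flags(*pte);

	/* Large-page (2M/1G) offset handling elided for brevity. */
	return ((phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT) |
	       (addr & ~PAGE_MASK);
}

phys_addr_t slow_virt_to_phys(void *vaddr)
{
	return __slow_virt_to_phys(vaddr, NULL);
}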

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 15/32] x86: Add support for changing memory encryption attribute in early boot
  2017-03-24 17:12     ` Borislav Petkov
  (?)
  (?)
@ 2017-03-27 15:07       ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-27 15:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott, tony.luck,
	alexandre.bounine, kuleshovmail, linux-kernel, mcgrof, mst,
	linux-crypto, tj, pbonzini, akpm, davem

Hi Boris,

On 03/24/2017 12:12 PM, Borislav Petkov wrote:
>>  }
>>
>> +static inline int __init early_set_memory_decrypted(void *addr,
>> +						    unsigned long size)
>> +{
>> +	return 1;
> 	^^^^^^^^
>
> return 1 when !CONFIG_AMD_MEM_ENCRYPT ?
>
> The non-early variants return 0.
>

I will fix it and use the same return value.

>> +}
>> +
>> +static inline int __init early_set_memory_encrypted(void *addr,
>> +						    unsigned long size)
>> +{
>> +	return 1;
>> +}
>> +
>>  #define __sme_pa		__pa

>> +	unsigned long pfn, npages;
>> +	unsigned long addr = (unsigned long)vaddr & PAGE_MASK;
>> +
>> +	/* We are going to change the physical page attribute from C=1 to C=0.
>> +	 * Flush the caches to ensure that all the data with C=1 is flushed to
>> +	 * memory. Any caching of the vaddr after function returns will
>> +	 * use C=0.
>> +	 */
>
> Kernel comments style is:
>
> 	/*
> 	 * A sentence ending with a full-stop.
> 	 * Another sentence. ...
> 	 * More sentences. ...
> 	 */
>

I will update to use kernel comment style.


>> +	clflush_cache_range(vaddr, size);
>> +
>> +	npages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +	pfn = slow_virt_to_phys((void *)addr) >> PAGE_SHIFT;
>> +
>> +	return kernel_map_pages_in_pgd(init_mm.pgd, pfn, addr, npages,
>> +					flags & ~sme_me_mask);
>> +
>> +}
>> +
>> +int __init early_set_memory_decrypted(void *vaddr, unsigned long size)
>> +{
>> +	unsigned long flags = get_pte_flags((unsigned long)vaddr);
>
> So this does lookup_address()...
>
>> +	return early_set_memory_enc_dec(vaddr, size, flags & ~sme_me_mask);
>
> ... and this does it too in slow_virt_to_phys(). So you do it twice per
> vaddr.
>
> So why don't you define a __slow_virt_to_phys() helper - notice
> the "__" - which returns flags in its second parameter and which
> slow_virt_to_phys() calls with a NULL second parameter in the other
> cases?
>

I will look into creating a helper function. Thanks.

-Brijesh

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
  2017-03-02 15:15   ` Brijesh Singh
  (?)
@ 2017-03-28 18:39     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-28 18:39 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas

On Thu, Mar 02, 2017 at 10:15:36AM -0500, Brijesh Singh wrote:
> Some KVM-specific MSRs (steal-time, asyncpf, avic_eio) allocate per-CPU
> variables at compile time and share their physical addresses with the
> hypervisor. This presents a challenge when SEV is active in the guest OS:
> guest memory is encrypted with the guest key, so the hypervisor is no
> longer able to modify it. When SEV is active, we need to clear the
> encryption attribute of the shared physical addresses so that both the
> guest and the hypervisor can access the data.
> 
> To solve this problem, I have tried these three options:
> 
> 1) Convert the static per-CPU to dynamic per-CPU allocation. When SEV is
> detected then clear the encryption attribute. But while doing so I found
> that per-CPU dynamic allocator was not ready when kvm_guest_cpu_init was
> called.
> 
> 2) Since the encryption attributes works on PAGE_SIZE hence add some extra
> padding to 'struct kvm_steal_time' to make it PAGE_SIZE and then at runtime
> clear the encryption attribute of the full page. The downside of this is
> that we would now need to modify the structure, which may break compatibility.

From the SEV-ES whitepaper:

"To facilitate this communication, the SEV-ES architecture defines
a Guest Hypervisor Communication Block (GHCB). The GHCB resides in
page of shared memory so it is accessible to both the guest VM and the
hypervisor."

So this is kinda begging to be implemented with a shared page between
guest and host. And then put steal-time, ... etc in there too. Provided
there's enough room in the single page for the GHCB *and* our stuff.
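
(A purely hypothetical layout, not something in this series, just to
make the idea concrete; "struct ghcb" stands in for whatever the SEV-ES
GHCB definition turns out to be:)

/* One shared (C=0) page per vCPU: the GHCB plus the KVM PV data. */
struct hv_shared_page {
	struct ghcb			ghcb;	/* assumed SEV-ES GHCB type */
	struct kvm_steal_time		steal_time;
	struct kvm_vcpu_pv_apf_data	apf_reason;
	unsigned long			kvm_apic_eoi;
} __aligned(PAGE_SIZE);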

> 
> 3) Define a new per-CPU section (.data..percpu.hv_shared) which will be
> used to hold the compile time shared per-CPU variables. When SEV is
> detected we map this section with encryption attribute cleared.
> 
> This patch implements #3. It introduces a new DEFINE_PER_CPU_HV_SHARED
> macro to create a compile-time per-CPU variable. When SEV is detected we
> map the per-CPU variable as decrypted (i.e. with the encryption attribute cleared).
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/kvm.c             |   43 +++++++++++++++++++++++++++++++------
>  include/asm-generic/vmlinux.lds.h |    3 +++
>  include/linux/percpu-defs.h       |    9 ++++++++
>  3 files changed, 48 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 099fcba..706a08e 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
>  
>  early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
>  
> -static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> -static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, steal_time) __aligned(64);
>  static int has_steal_clock = 0;
>  
>  /*
> @@ -290,6 +290,22 @@ static void __init paravirt_ops_setup(void)
>  #endif
>  }
>  
> +static int kvm_map_percpu_hv_shared(void *addr, unsigned long size)
> +{
> +	/* When SEV is active, the percpu static variables initialized
> +	 * in data section will contain the encrypted data so we first
> +	 * need to decrypt it and then map it as decrypted.
> +	 */

Kernel comments style is:

	/*
	 * A sentence ending with a full-stop.
	 * Another sentence. ...
	 * More sentences. ...
	 */

But you get the idea. Please check your whole patchset for this.

> +	if (sev_active()) {
> +		unsigned long pa = slow_virt_to_phys(addr);
> +
> +		sme_early_decrypt(pa, size);
> +		return early_set_memory_decrypted(addr, size);
> +	}
> +
> +	return 0;
> +}
> +
>  static void kvm_register_steal_time(void)
>  {
>  	int cpu = smp_processor_id();
> @@ -298,12 +314,17 @@ static void kvm_register_steal_time(void)
>  	if (!has_steal_clock)
>  		return;
>  
> +	if (kvm_map_percpu_hv_shared(st, sizeof(*st))) {
> +		pr_err("kvm-stealtime: failed to map hv_shared percpu\n");
> +		return;
> +	}
> +
>  	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
>  	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
>  		cpu, (unsigned long long) slow_virt_to_phys(st));
>  }
>  
> -static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
> +static DEFINE_PER_CPU_HV_SHARED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
>  
>  static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
>  {
> @@ -327,25 +348,33 @@ static void kvm_guest_cpu_init(void)
>  	if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
>  		u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
>  
> +		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&apf_reason),
> +					sizeof(struct kvm_vcpu_pv_apf_data)))
> +			goto skip_asyncpf;
>  #ifdef CONFIG_PREEMPT
>  		pa |= KVM_ASYNC_PF_SEND_ALWAYS;
>  #endif
>  		wrmsrl(MSR_KVM_ASYNC_PF_EN, pa | KVM_ASYNC_PF_ENABLED);
>  		__this_cpu_write(apf_reason.enabled, 1);
> -		printk(KERN_INFO"KVM setup async PF for cpu %d\n",
> -		       smp_processor_id());
> +		printk(KERN_INFO"KVM setup async PF for cpu %d msr %llx\n",
> +		       smp_processor_id(), pa);
>  	}
> -
> +skip_asyncpf:
>  	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
>  		unsigned long pa;
>  		/* Size alignment is implied but just to make it explicit. */
>  		BUILD_BUG_ON(__alignof__(kvm_apic_eoi) < 4);
> +		if (kvm_map_percpu_hv_shared(this_cpu_ptr(&kvm_apic_eoi),
> +					sizeof(unsigned long)))
> +			goto skip_pv_eoi;
>  		__this_cpu_write(kvm_apic_eoi, 0);
>  		pa = slow_virt_to_phys(this_cpu_ptr(&kvm_apic_eoi))
>  			| KVM_MSR_ENABLED;
>  		wrmsrl(MSR_KVM_PV_EOI_EN, pa);
> +		printk(KERN_INFO"KVM setup PV EOI for cpu %d msr %lx\n",
> +		       smp_processor_id(), pa);
>  	}
> -
> +skip_pv_eoi:
>  	if (has_steal_clock)
>  		kvm_register_steal_time();
>  }
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 0968d13..8d29910 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -773,6 +773,9 @@
>  	. = ALIGN(cacheline);						\
>  	*(.data..percpu)						\
>  	*(.data..percpu..shared_aligned)				\
> +	. = ALIGN(PAGE_SIZE);						\
> +	*(.data..percpu..hv_shared)					\
> +	. = ALIGN(PAGE_SIZE);						\
>  	VMLINUX_SYMBOL(__per_cpu_end) = .;
>  
>  /**
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index 8f16299..5af366e 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -172,6 +172,15 @@
>  #define DEFINE_PER_CPU_READ_MOSTLY(type, name)				\
>  	DEFINE_PER_CPU_SECTION(type, name, "..read_mostly")
>  
> +/* Declaration/definition used for per-CPU variables that must be shared
> + * between hypervisor and guest OS.
> + */
> +#define DECLARE_PER_CPU_HV_SHARED(type, name)				\
> +	DECLARE_PER_CPU_SECTION(type, name, "..hv_shared")
> +
> +#define DEFINE_PER_CPU_HV_SHARED(type, name)				\
> +	DEFINE_PER_CPU_SECTION(type, name, "..hv_shared")
> +

If we end up doing something like that, this needs to be in

#ifdef CONFIG_VIRTUALIZATION

...

#endif

Above too.
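
IOW, something like this (assuming CONFIG_VIRTUALIZATION really is the
right guard -- that choice is still open):

#ifdef CONFIG_VIRTUALIZATION
/*
 * Per-CPU variables which must be shared between hypervisor and guest.
 */
#define DECLARE_PER_CPU_HV_SHARED(type, name)				\
	DECLARE_PER_CPU_SECTION(type, name, "..hv_shared")

#define DEFINE_PER_CPU_HV_SHARED(type, name)				\
	DEFINE_PER_CPU_SECTION(type, name, "..hv_shared")
#endif /* CONFIG_VIRTUALIZATION */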

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 18/32] kvm: svm: Use the hardware provided GPA instead of page walk
  2017-03-02 15:16   ` Brijesh Singh
@ 2017-03-29 15:14     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-29 15:14 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: linux-efi, kvm, rkrcmar, matt, linux-pci, linus.walleij,
	gary.hook, linux-mm, hpa, cl, tglx, aarcange, sfr, mchehab,
	simon.guinot, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, labbott, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, pbonzini,
	bhelgaas, dan.j.williams, andriy.shevchenko, akpm, herbert,
	tony.luck, pau

On Thu, Mar 02, 2017 at 10:16:05AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> When a guest causes a NPF which requires emulation, KVM sometimes walks
> the guest page tables to translate the GVA to a GPA. This is unnecessary
> most of the time on AMD hardware since the hardware provides the GPA in
> EXITINFO2.
> 
> The only exception cases involve string operations involving rep or
> operations that use two memory locations. With rep, the GPA will only be
> the value of the initial NPF and with dual memory locations we won't know
> which memory address was translated into EXITINFO2.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Reviewed-by: Borislav Petkov <bp@suse.de>

I think I already asked you to remove Reviewed-by tags when you have to
change an already reviewed patch in a non-trivial manner. Why does this
one still have my Reviewed-by tag?
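
(For reference, the gist of the quoted patch -- reading the GPA that SVM
hardware deposits in EXITINFO2 instead of walking the guest page tables --
is roughly the following. An illustration with approximate names, not the
actual patch:)

/*
 * On a #NPF intercept, EXITINFO1 holds the error code and EXITINFO2 the
 * faulting GPA, so emulation can usually skip the GVA->GPA walk.
 */
static int npf_interception(struct vcpu_svm *svm)
{
	u64 fault_address = svm->vmcb->control.exit_info_2;
	u64 error_code    = svm->vmcb->control.exit_info_1;

	return kvm_mmu_page_fault(&svm->vcpu, fault_address, error_code,
				  NULL, 0);
}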

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
  2017-03-28 18:39     ` Borislav Petkov
@ 2017-03-29 15:21       ` Paolo Bonzini
  -1 siblings, 0 replies; 424+ messages in thread
From: Paolo Bonzini @ 2017-03-29 15:21 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: simon.guinot, linux-efi, kvm, rkrcmar, matt, linux-pci,
	linus.walleij, gary.hook, linux-mm, paul.gortmaker, hpa, cl,
	dan.j.williams, aarcange, sfr, andriy.shevchenko, herbert, bhe,
	xemul, joro, x86, peterz, piotr.luc, mingo, msalter,
	ross.zwisler, dyoung, thomas.lendacky, jroedel, keescook, arnd,
	toshi.kani, mathieu.desnoyers, luto, devel, bhelgaas



On 28/03/2017 20:39, Borislav Petkov wrote:
>> 2) Since the encryption attributes work on PAGE_SIZE, add some extra
>> padding to 'struct kvm_steal_time' to make it PAGE_SIZE and then at runtime
>> clear the encryption attribute of the full PAGE. The downside of this was
>> that we now need to modify the structure, which may break compatibility.
> From the SEV-ES whitepaper:
> 
> "To facilitate this communication, the SEV-ES architecture defines
> a Guest Hypervisor Communication Block (GHCB). The GHCB resides in a
> page of shared memory so it is accessible to both the guest VM and the
> hypervisor."
> 
> So this is kinda begging to be implemented with a shared page between
> guest and host. And then put steal-time, ... etc in there too. Provided
> there's enough room in the single page for the GHCB *and* our stuff.

The GHCB would have to be allocated much earlier, possibly even by
firmware depending on how things will be designed.  I think it's
premature to consider SEV-ES requirements.

Paolo

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables
  2017-03-29 15:21       ` Paolo Bonzini
@ 2017-03-29 15:32         ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-03-29 15:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Brijesh Singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Wed, Mar 29, 2017 at 05:21:13PM +0200, Paolo Bonzini wrote:
> The GHCB would have to be allocated much earlier, possibly even by
> firmware depending on how things will be designed.

How about a statically allocated page like we do with the early
pagetable pages in head_64.S?
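
I.e., reserve it at build time, roughly like so. Illustrative only:
'struct ghcb' is a placeholder for whatever layout the spec defines, and
the page would still need to be mapped decrypted:

/* Placeholder layout -- the real GHCB format comes from the SEV-ES spec. */
struct ghcb {
	u8 data[PAGE_SIZE];
};

/*
 * One page-aligned GHCB per CPU, statically reserved at build time like
 * the early pagetable pages, so no allocator is needed during early boot.
 */
static struct ghcb early_ghcb[NR_CPUS] __aligned(PAGE_SIZE);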

> I think it's premature to consider SEV-ES requirements.

My only concern is not to have to redo a lot when SEV-ES gets enabled.
So it would be prudent to design with SEV-ES in the back of our minds.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 18/32] kvm: svm: Use the hardware provided GPA instead of page walk
  2017-03-29 15:14     ` Borislav Petkov
@ 2017-03-29 17:08       ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-29 17:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani

Hi Boris,

On 03/29/2017 10:14 AM, Borislav Petkov wrote:
> On Thu, Mar 02, 2017 at 10:16:05AM -0500, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> When a guest causes a NPF which requires emulation, KVM sometimes walks
>> the guest page tables to translate the GVA to a GPA. This is unnecessary
>> most of the time on AMD hardware since the hardware provides the GPA in
>> EXITINFO2.
>>
>> The only exception cases involve string operations involving rep or
>> operations that use two memory locations. With rep, the GPA will only be
>> the value of the initial NPF and with dual memory locations we won't know
>> which memory address was translated into EXITINFO2.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Reviewed-by: Borislav Petkov <bp@suse.de>
>
> I think I already asked you to remove Revewed-by tags when you have to
> change an already reviewed patch in non-trivial manner. Why does this
> one still have my Reviewed-by tag?
>

Actually, this patch is included in the RFCv2 series for completeness.

The patch has already been reviewed and accepted in the kvm upstream tree, but
it was not present in the tip branch, hence I cherry-picked it into the RFC so
that we do not break the build. SEV runtime behavior needs this patch, and I
have tried to highlight that in the cover letter. It was my bad that I missed
dropping the Reviewed-by tag during cherry-picking. Sorry about that; I will
be extra careful next time around. Thanks.


~ Brijesh

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-03-17 10:17             ` Borislav Petkov
@ 2017-04-06 14:05               ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-04-06 14:05 UTC (permalink / raw)
  To: Borislav Petkov, Paolo Bonzini
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel, bhel

Hi Boris,

On 03/17/2017 05:17 AM, Borislav Petkov wrote:
> On Thu, Mar 16, 2017 at 11:25:36PM +0100, Paolo Bonzini wrote:
>> The kvmclock memory is initially zero so there is no need for the
>> hypervisor to allocate anything; the point of these patches is just to
>> access the data in a natural way from Linux source code.
>
> I realize that.
>
>> I also don't really like the patch as is (plus it fails modpost), but
>> IMO reusing __change_page_attr and __split_large_page is the right thing
>> to do.
>
> Right, so teaching pageattr.c about memblock could theoretically come
> around and bite us later when a page allocated with memblock gets freed
> with free_page().
>
> And looking at this more, we have all this kernel pagetable preparation
> code down the init_mem_mapping() call and the pagetable setup in
> arch/x86/mm/init_{32,64}.c
>
> And that code even does some basic page splitting. Oh and it uses
> alloc_low_pages() which knows whether to do memblock reservation or the
> common __get_free_pages() when slabs are up.
>

I looked into arch/x86/mm/init_{32,64}.c and, as you pointed out, the file
contains routines to do basic page splitting. I think that is sufficient for
our usage.

I should be able to drop the memblock patch from the series and update
Patch 15 [1] to use kernel_physical_mapping_init().

kernel_physical_mapping_init() creates the page table mapping using the
default KERNEL_PAGE attributes. I tried to extend the function by passing a
'bool enc' flag to hint whether to clear or set _PAGE_ENC when splitting the
pages, but the code did not look clean, hence I dropped that idea. Instead,
I took the below approach. I did some runtime testing and it seems to work okay.

[1] http://marc.info/?l=linux-mm&m=148846773731212&w=2

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 7df5f4c..de16ef4 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -15,6 +15,7 @@
  #include <linux/mm.h>
  #include <linux/dma-mapping.h>
  #include <linux/swiotlb.h>
+#include <linux/mem_encrypt.h>
  
  #include <asm/tlbflush.h>
  #include <asm/fixmap.h>
@@ -22,6 +23,8 @@
  #include <asm/bootparam.h>
  #include <asm/cacheflush.h>
  
+#include "mm_internal.h"
+
  extern pmdval_t early_pmd_flags;
  int __init __early_make_pgtable(unsigned long, pmdval_t);
  void __init __early_pgtable_flush(void);
@@ -258,6 +261,72 @@ static void sme_free(struct device *dev, size_t size, void *vaddr,
         swiotlb_free_coherent(dev, size, vaddr, dma_handle);
  }
  
+static int __init early_set_memory_enc_dec(resource_size_t paddr,
+                                          unsigned long size, bool enc)
+{
+       pte_t *kpte;
+       int level;
+       unsigned long vaddr, vaddr_end, vaddr_next;
+
+       vaddr = (unsigned long)__va(paddr);
+       vaddr_next = vaddr;
+       vaddr_end = vaddr + size;
+
+       /*
+        * We are going to change the physical page attribute from C=1 to C=0.
+        * Flush the caches to ensure that all the data with C=1 is flushed to
+        * memory. Any caching of the vaddr after function returns will
+        * use C=0.
+        */
+       clflush_cache_range(__va(paddr), size);
+
+       for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+               kpte = lookup_address(vaddr, &level);
+               if (!kpte || pte_none(*kpte))
+                       return 1;
+
+               if (level == PG_LEVEL_4K) {
+                       pte_t new_pte;
+                       unsigned long pfn = pte_pfn(*kpte);
+                       pgprot_t new_prot = pte_pgprot(*kpte);
+
+                       if (enc)
+                               pgprot_val(new_prot) |= _PAGE_ENC;
+                       else
+                               pgprot_val(new_prot) &= ~_PAGE_ENC;
+
+                       new_pte = pfn_pte(pfn, canon_pgprot(new_prot));
+                       pr_info("  pte %016lx -> 0x%016lx\n", pte_val(*kpte),
+                               pte_val(new_pte));
+                       set_pte_atomic(kpte, new_pte);
+                       vaddr_next = (vaddr & PAGE_MASK) + PAGE_SIZE;
+                       continue;
+               }
+
+               /*
+                * The virtual address is part of a large page; create the
+                * page table mapping using smaller (4K) pages. The virtual
+                * and physical addresses must be aligned to PMD level.
+                */
+               kernel_physical_mapping_init(__pa(vaddr & PMD_MASK),
+                                            __pa((vaddr_end & PMD_MASK) + PMD_SIZE),
+                                            0);
+       }
+
+       __flush_tlb_all();
+       return 0;
+}
+
+int __init early_set_memory_decrypted(resource_size_t paddr, unsigned long size)
+{
+       return early_set_memory_enc_dec(paddr, size, false);
+}
+
+int __init early_set_memory_encrypted(resource_size_t paddr, unsigned long size)
+{
+       return early_set_memory_enc_dec(paddr, size, true);
+}
+

> So what would be much cleaner, IMHO, is if one would reuse that code to
> change init_mm.pgd mappings early without copying pageattr.c.
>
> init_mem_mapping() gets called before kvm_guest_init() in setup_arch()
> so the guest would simply fixup its pagetable right there.
>
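
As a usage sketch, the kvm.c hook from patch 16 would then boil down to
something like this (hypothetical, assuming the helpers above):

/* Hypothetical caller, building on early_set_memory_decrypted() above. */
static int __init kvm_map_percpu_hv_shared(void *addr, unsigned long size)
{
	if (!sev_active())
		return 0;

	/* Splits any covering large page and clears _PAGE_ENC for the range. */
	return early_set_memory_decrypted(slow_virt_to_phys(addr), size);
}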

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
@ 2017-04-06 14:05               ` Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-04-06 14:05 UTC (permalink / raw)
  To: Borislav Petkov, Paolo Bonzini
  Cc: brijesh.singh, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab, iamjoonsoo.kim, labbott

Hi Boris,

On 03/17/2017 05:17 AM, Borislav Petkov wrote:
> On Thu, Mar 16, 2017 at 11:25:36PM +0100, Paolo Bonzini wrote:
>> The kvmclock memory is initially zero so there is no need for the
>> hypervisor to allocate anything; the point of these patches is just to
>> access the data in a natural way from Linux source code.
>
> I realize that.
>
>> I also don't really like the patch as is (plus it fails modpost), but
>> IMO reusing __change_page_attr and __split_large_page is the right thing
>> to do.
>
> Right, so teaching pageattr.c about memblock could theoretically come
> around and bite us later when a page allocated with memblock gets freed
> with free_page().
>
> And looking at this more, we have all this kernel pagetable preparation
> code down the init_mem_mapping() call and the pagetable setup in
> arch/x86/mm/init_{32,64}.c
>
> And that code even does some basic page splitting. Oh and it uses
> alloc_low_pages() which knows whether to do memblock reservation or the
> common __get_free_pages() when slabs are up.
>

I looked into arch/x86/mm/init_{32,64}.c and as you pointed the file contains
routines to do basic page splitting. I think it sufficient for our usage.

I should be able to drop the memblock patch from the series and update the
Patch 15 [1] to use the kernel_physical_mapping_init().

The kernel_physical_mapping_init() creates the page table mapping using
default KERNEL_PAGE attributes, I tried to extend the function by passing
'bool enc' flags to hint whether to clr or set _PAGE_ENC when splitting the
pages. The code did not looked clean hence I dropped that idea. Instead,
I took the below approach. I did some runtime test and it seem to be working okay.

[1] http://marc.info/?l=linux-mm&m=148846773731212&w=2

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 7df5f4c..de16ef4 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -15,6 +15,7 @@
  #include <linux/mm.h>
  #include <linux/dma-mapping.h>
  #include <linux/swiotlb.h>
+#include <linux/mem_encrypt.h>
  
  #include <asm/tlbflush.h>
  #include <asm/fixmap.h>
@@ -22,6 +23,8 @@
  #include <asm/bootparam.h>
  #include <asm/cacheflush.h>
  
+#include "mm_internal.h"
+
  extern pmdval_t early_pmd_flags;
  int __init __early_make_pgtable(unsigned long, pmdval_t);
  void __init __early_pgtable_flush(void);
@@ -258,6 +261,72 @@ static void sme_free(struct device *dev, size_t size, void *vaddr,
         swiotlb_free_coherent(dev, size, vaddr, dma_handle);
  }
  
+static int __init early_set_memory_enc_dec(resource_size_t paddr,
+                                          unsigned long size, bool enc)
+{
+       pte_t *kpte;
+       int level;
+       unsigned long vaddr, vaddr_end, vaddr_next;
+
+       vaddr = (unsigned long)__va(paddr);
+       vaddr_next = vaddr;
+       vaddr_end = vaddr + size;
+
+       /*
+        * We are going to change the physical page attribute from C=1 to C=0.
+        * Flush the caches to ensure that all the data with C=1 is flushed to
+        * memory. Any caching of the vaddr after function returns will
+        * use C=0.
+        */
+       clflush_cache_range(__va(paddr), size);
+
+       for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+               kpte = lookup_address(vaddr, &level);
+               if (!kpte || pte_none(*kpte) )
+                       return 1;
+
+               if (level == PG_LEVEL_4K) {
+                       pte_t new_pte;
+                       unsigned long pfn = pte_pfn(*kpte);
+                       pgprot_t new_prot = pte_pgprot(*kpte);
+
+                       if (enc)
+                               pgprot_val(new_prot) |= _PAGE_ENC;
+                       else
+                               pgprot_val(new_prot) &= ~_PAGE_ENC;
+
+                       new_pte = pfn_pte(pfn, canon_pgprot(new_prot));
+                       pr_info("  pte %016lx -> 0x%016lx\n", pte_val(*kpte),
+                               pte_val(new_pte));
+                       set_pte_atomic(kpte, new_pte);
+                       vaddr_next = (vaddr & PAGE_MASK) + PAGE_SIZE;
+                       continue;
+               }
+
+               /*
+                * virtual address is part of large page, create the page
+                * table mapping to use smaller pages (4K). The virtual and
+                * physical address must be aligned to PMD level.
+                */
+               kernel_physical_mapping_init(__pa(vaddr & PMD_MASK),
+                                            __pa((vaddr_end & PMD_MASK) + PMD_SIZE),
+                                            0);
+       }
+
+       __flush_tlb_all();
+       return 0;
+}
+
+int __init early_set_memory_decrypted(resource_size_t paddr, unsigned long size)
+{
+       return early_set_memory_enc_dec(paddr, size, false);
+}
+
+int __init early_set_memory_encrypted(resource_size_t paddr, unsigned long size)
+{
+       return early_set_memory_enc_dec(paddr, size, true);
+}
+
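
For reference, a minimal (hypothetical, untested) caller sketch; shared_buf
is an illustrative name, not something from this patch:

	/* A buffer the hypervisor must be able to read in the clear. */
	static u8 shared_buf[PAGE_SIZE] __aligned(PAGE_SIZE);

	static void __init mark_shared_buf_decrypted(void)
	{
		/* Clear the C-bit on the backing page; 0 means success. */
		if (early_set_memory_decrypted(__pa(shared_buf),
					       sizeof(shared_buf)))
			pr_warn("could not mark shared_buf decrypted\n");
	}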

> So what would be much cleaner, IMHO, is if one would reuse that code to
> change init_mm.pgd mappings early without copying pageattr.c.
>
> init_mem_mapping() gets called before kvm_guest_init() in setup_arch()
> so the guest would simply fixup its pagetable right there.
>

^ permalink raw reply related	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-04-06 14:05               ` Brijesh Singh
@ 2017-04-06 17:25                 ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-04-06 17:25 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

Hi Brijesh,

On Thu, Apr 06, 2017 at 09:05:03AM -0500, Brijesh Singh wrote:
> I looked into arch/x86/mm/init_{32,64}.c and, as you pointed out, the file
> contains routines to do basic page splitting. I think it is sufficient for our usage.

Good :)

> I should be able to drop the memblock patch from the series and update
> Patch 15 [1] to use kernel_physical_mapping_init().
> 
> kernel_physical_mapping_init() creates the page table mapping using the
> default KERNEL_PAGE attributes. I tried to extend the function by passing
> a 'bool enc' flag to hint whether to clear or set _PAGE_ENC when splitting
> the pages, but the code did not look clean, hence I dropped that idea.

Or, you could have a

__kernel_physical_mapping_init_prot(..., prot)

helper which gets a protection argument and hands it down. The lower
levels already hand down prot which is good.

The interface kernel_physical_mapping_init() will then itself call:

	__kernel_physical_mapping_init_prot(..., PAGE_KERNEL);

for the normal cases.

That in a pre-patch, of course.

How does that sound?
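
For illustration, an (untested) sketch of what that pre-patch could look
like; __kernel_physical_mapping_init_prot() would carry the current body of
kernel_physical_mapping_init() with 'prot' threaded down to phys_pud_init()
and phys_pmd_init(), and the existing interface becomes a thin wrapper:

	unsigned long __meminit
	kernel_physical_mapping_init(unsigned long paddr_start,
				     unsigned long paddr_end,
				     unsigned long page_size_mask)
	{
		/* Keep the existing behaviour for all current callers. */
		return __kernel_physical_mapping_init_prot(paddr_start,
							   paddr_end,
							   page_size_mask,
							   PAGE_KERNEL);
	}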

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-04-06 17:25                 ` Borislav Petkov
@ 2017-04-06 18:37                   ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-04-06 18:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, Paolo Bonzini, simon.guinot, linux-efi, kvm,
	rkrcmar, matt, linux-pci, linus.walleij, gary.hook, linux-mm,
	paul.gortmaker, hpa, cl, dan.j.williams, aarcange, sfr,
	andriy.shevchenko, herbert, bhe, xemul, joro, x86, peterz,
	piotr.luc, mingo, msalter, ross.zwisler, dyoung, thomas.lendacky,
	jroedel, keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 04/06/2017 12:25 PM, Borislav Petkov wrote:
> Hi Brijesh,
>
> On Thu, Apr 06, 2017 at 09:05:03AM -0500, Brijesh Singh wrote:
>> I looked into arch/x86/mm/init_{32,64}.c and, as you pointed out, the file
>> contains routines to do basic page splitting. I think it is sufficient for our usage.
>
> Good :)
>
>> I should be able to drop the memblock patch from the series and update
>> Patch 15 [1] to use kernel_physical_mapping_init().
>>
>> kernel_physical_mapping_init() creates the page table mapping using the
>> default KERNEL_PAGE attributes. I tried to extend the function by passing
>> a 'bool enc' flag to hint whether to clear or set _PAGE_ENC when splitting
>> the pages, but the code did not look clean, hence I dropped that idea.
>
> Or, you could have a
>
> __kernel_physical_mapping_init_prot(..., prot)
>
> helper which gets a protection argument and hands it down. The lower
> levels already hand down prot which is good.
>

I did think about the prot idea but ran into another corner case which may
require changing the signatures of phys_pud_init() and phys_pmd_init(). The
paddr_start and paddr_end args into kernel_physical_mapping_init() should be
aligned down to PMD level (see comment [1]). So we may encounter a case where
our address range is part of a large page but we need to change only one entry
(i.e. we are asked to clear just one page within a 2M region). In that case, we
would need to pass additional arguments into kernel_physical_mapping_init(),
phys_pud_init() and phys_pmd_init() to hint to the splitting code that it
should use our prot for specific entries while all other entries keep the
old_prot.

[1] http://lxr.free-electrons.com/source/arch/x86/mm/init_64.c#L546
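
Concretely, the signatures would have to grow extra range arguments, roughly
like this (hypothetical sketch; prot_start/prot_end are made-up names for the
subrange that gets the new prot while the rest keeps old_prot):

	static unsigned long __meminit
	phys_pmd_init(pmd_t *pmd_page, unsigned long paddr,
		      unsigned long paddr_end, unsigned long page_size_mask,
		      pgprot_t prot, unsigned long prot_start,
		      unsigned long prot_end);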


> The interface kernel_physical_mapping_init() will then itself call:
>
> 	__kernel_physical_mapping_init_prot(..., PAGE_KERNEL);
>
> for the normal cases.
>
> That in a pre-patch of course.
>
> How does that sound?
>


^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-04-06 18:37                   ` Brijesh Singh
@ 2017-04-07 11:33                     ` Borislav Petkov
  -1 siblings, 0 replies; 424+ messages in thread
From: Borislav Petkov @ 2017-04-07 11:33 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Paolo Bonzini, simon.guinot, linux-efi, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi.kani, mathieu.desnoyers, luto, devel,
	bhelgaas, tglx, mchehab

On Thu, Apr 06, 2017 at 01:37:41PM -0500, Brijesh Singh wrote:
> I did think about the prot idea but ran into another corner case which may
> require changing the signatures of phys_pud_init() and phys_pmd_init(). The
> paddr_start and paddr_end args into kernel_physical_mapping_init() should be
> aligned down to PMD level (see comment [1]). So we may encounter a case where
> our address range is part of a large page but we need to change only one entry
> (i.e. we are asked to clear just one page within a 2M region). In that case, we
> would need to pass additional arguments into kernel_physical_mapping_init(),
> phys_pud_init() and phys_pmd_init() to hint to the splitting code that it
> should use our prot for specific entries while all other entries keep the
> old_prot.

Ok, but your !4K case:

+               /*
+                * The virtual address is part of a large page; create the
+                * page table mapping to use smaller (4K) pages. The virtual
+                * and physical addresses must be aligned at PMD level.
+                */
+               kernel_physical_mapping_init(__pa(vaddr & PMD_MASK),
+                                            __pa((vaddr_end & PMD_MASK) + PMD_SIZE),
+                                            0);


would map a 2M page as encrypted by default. What if we want to map a 2M page
frame as ~_PAGE_ENC?

IOW, if you're adding a new interface, it should be generic enough, like
with a prot argument.
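
E.g. (hypothetical; assumes the __kernel_physical_mapping_init_prot()
variant suggested earlier in the thread):

	__kernel_physical_mapping_init_prot(__pa(vaddr & PMD_MASK),
					    __pa((vaddr_end & PMD_MASK) + PMD_SIZE),
					    0,
					    __pgprot(pgprot_val(PAGE_KERNEL) &
						     ~_PAGE_ENC));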

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 424+ messages in thread

* Re: [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages
  2017-04-07 11:33                     ` Borislav Petkov
@ 2017-04-07 14:50                       ` Brijesh Singh
  -1 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-04-07 14:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, Paolo Bonzini, simon.guinot, linux-efi, kvm,
	rkrcmar, matt, linux-pci, linus.walleij, gary.hook, linux-mm,
	paul.gortmaker, hpa, cl, dan.j.williams, aarcange, sfr,
	andriy.shevchenko, herbert, bhe, xemul, joro, x86, peterz,
	piotr.luc, mingo, msalter, ross.zwisler, dyoung, thomas.lendacky,
	jroedel, keescook, arnd, toshi.kani, mathieu.desnoyers, luto



On 04/07/2017 06:33 AM, Borislav Petkov wrote:
> On Thu, Apr 06, 2017 at 01:37:41PM -0500, Brijesh Singh wrote:
>> I did think about the prot idea but ran into another corner case which may
>> require changing the signatures of phys_pud_init() and phys_pmd_init(). The
>> paddr_start and paddr_end args into kernel_physical_mapping_init() should be
>> aligned down to PMD level (see comment [1]). So we may encounter a case where
>> our address range is part of a large page but we need to change only one entry
>> (i.e. we are asked to clear just one page within a 2M region). In that case, we
>> would need to pass additional arguments into kernel_physical_mapping_init(),
>> phys_pud_init() and phys_pmd_init() to hint to the splitting code that it
>> should use our prot for specific entries while all other entries keep the
>> old_prot.
>
> Ok, but your !4K case:
>
> +               /*
> +                * The virtual address is part of a large page; create the
> +                * page table mapping to use smaller (4K) pages. The virtual
> +                * and physical addresses must be aligned at PMD level.
> +                */
> +               kernel_physical_mapping_init(__pa(vaddr & PMD_MASK),
> +                                            __pa((vaddr_end & PMD_MASK) + PMD_SIZE),
> +                                            0);
>
>
> would map a 2M page as encrypted by default. What if we want to map a 2M page
> frame as ~_PAGE_ENC?
>

Thanks for the feedback. I will make sure that we cover all the other cases in
the final patch. Untested, but something like this can be used to check whether
we can change the large page in one go or request a split.

+               psize = page_level_size(level);
+               pmask = page_level_mask(level);
+
+               /*
+                * Check whether we can change the large page in one go.
+                * We request a split when the address is not aligned or the
+                * number of pages whose encryption bit must be set or cleared
+                * is smaller than the number of pages in the large page.
+                */
+               if (vaddr == (vaddr & pmask) && ((vaddr_end - vaddr) >= psize)) {
+                       /* UPDATE PMD HERE */
+                       vaddr_next = (vaddr & pmask) + psize;
+                       continue;
+               }
+
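
One (untested) possibility for the "UPDATE PMD HERE" step, assuming the
entry is PG_LEVEL_2M and mirroring what the 4K path above does:

+                       pmd_t *pmd = (pmd_t *)kpte;
+                       pgprot_t prot = pmd_pgprot(*pmd);
+
+                       if (enc)
+                               pgprot_val(prot) |= _PAGE_ENC;
+                       else
+                               pgprot_val(prot) &= ~_PAGE_ENC;
+
+                       /* Rewrite the large-page entry with the new C-bit. */
+                       set_pte_atomic(kpte, pfn_pte(pmd_pfn(*pmd),
+                                                    canon_pgprot(prot)));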


^ permalink raw reply	[flat|nested] 424+ messages in thread


* [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD)
@ 2017-03-02 15:12 Brijesh Singh
  0 siblings, 0 replies; 424+ messages in thread
From: Brijesh Singh @ 2017-03-02 15:12 UTC (permalink / raw)
  To: simon.guinot, linux-efi, brijesh.singh, kvm, rkrcmar, matt,
	linux-pci, linus.walleij, gary.hook, linux-mm, paul.gortmaker,
	hpa, cl, dan.j.williams, aarcange, sfr, andriy.shevchenko,
	herbert, bhe, xemul, joro, x86, peterz, piotr.luc, mingo,
	msalter, ross.zwisler, bp, dyoung, thomas.lendacky, jroedel,
	keescook, arnd, toshi

This RFC series provides support for AMD's new Secure Encrypted Virtualization
(SEV) feature. This RFC is build upon Secure Memory Encryption (SME) RFCv4 [1].

SEV is an extension to the AMD-V architecture which supports running multiple
VMs under the control of a hypervisor. When enabled, the SEV hardware tags all
code and data with the VM's ASID, which indicates which VM the data originated
from or is intended for. This tag is kept with the data at all times while it
is inside the SOC, and prevents that data from being used by anyone other than
its owner. While the tag protects VM data inside the SOC, 128-bit AES
encryption protects data outside the SOC: when data leaves or enters the SOC,
it is encrypted or decrypted, respectively, by hardware with a key based on the
associated tag.

SEV guest VMs have the concept of private and shared memory.  Private memory is
encrypted with the guest-specific key, while shared memory may be encrypted
with the hypervisor key.  Certain types of memory (namely instruction pages and
guest page tables) are always treated as private memory by the hardware.
For data memory, SEV guest VMs can choose which pages they would like to be
private. The choice is made via the C-bit in the standard CPU page tables and
is fully controlled by the guest. For security reasons, all DMA operations
inside the guest must be performed on shared pages (C-bit clear). Note that
since the C-bit is only controllable by the guest OS when it is operating in
64-bit or 32-bit PAE mode, in all other modes the SEV hardware forces the
C-bit to 1.
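
As an illustration, using the page-table bits from the SME series this is
built on (where _PAGE_ENC is the C-bit), a private and a shared kernel mapping
would differ only in that one bit:

        /* Illustrative only: private (C-bit set) vs. shared (C-bit clear). */
        pgprot_t private_prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_ENC);
        pgprot_t shared_prot  = __pgprot(pgprot_val(PAGE_KERNEL) & ~_PAGE_ENC);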

SEV is designed to protect guest VMs from a benign but vulnerable (i.e. not
fully malicious) hypervisor. In particular, it reduces the attack surface of
guest VMs and can prevent certain types of VM-escape bugs (e.g. hypervisor
read-anywhere) from being used to steal guest data.

The RFC series also expands the crypto driver (ccp.ko) to include support for
the Platform Security Processor (PSP), which is used to communicate with the
SEV firmware that runs within the AMD secure processor and provides a secure
key management interface. The hypervisor uses this interface to encrypt the
bootstrap code and perform common activities such as launching, running,
snapshotting, migrating and debugging an encrypted guest.

A new ioctl (KVM_MEMORY_ENCRYPT_OP) is introduced, which can be used by QEMU to
issue SEV guest life cycle commands.
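
A hypothetical userspace invocation might look like the sketch below. The
struct kvm_sev_cmd members described in the comments are an assumption based
on this series (including the sev_fd member added in v2), and sev_issue_cmd is
an illustrative helper, not an API from the series:

        #include <stdint.h>
        #include <sys/ioctl.h>
        #include <linux/kvm.h>  /* KVM_MEMORY_ENCRYPT_OP, struct kvm_sev_cmd */

        /*
         * Illustrative helper: wrap one SEV command. struct kvm_sev_cmd is
         * assumed to carry the command id, a pointer to the command payload,
         * the firmware error code on return, and (since v2) a file
         * descriptor for /dev/sev.
         */
        static int sev_issue_cmd(int vm_fd, int sev_fd, uint32_t id, void *data)
        {
                struct kvm_sev_cmd cmd = {
                        .id     = id,
                        .data   = (uint64_t)(unsigned long)data,
                        .sev_fd = sev_fd,
                };

                /* Issued against the VM fd, like other KVM VM ioctls. */
                return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
        }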

The RFC series also includes the patches required in the guest OS to enable the
SEV feature. A guest OS can check for SEV support by executing the KVM_FEATURE
CPUID instruction.
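
Independently of the paravirtual check, the architectural detection (CPUID
Fn8000_001F, EAX bit 1, per the APM section referenced below) can be sketched
kernel-side as:

        unsigned int eax, ebx, ecx, edx;

        /* CPUID 0x8000001f: EAX bit 0 reports SME, bit 1 reports SEV. */
        cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
        if (eax & 0x2)
                pr_info("SEV is supported\n");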

The patch breakdown:
* [1 - 17]: guest OS specific changes when SEV is active
* [18]: already queued in the upstream KVM tree but not yet in the tip tree; it
  is included here so that the build does not fail
* [19 - 21]: since the CCP and PSP share the same PCIe ID, these patches expand
  the CCP driver by creating a high-level AMD Secure Processor (SP) framework
  to allow integration of the PSP device into ccp.ko.
* [22 - 32]: hypervisor changes to support memory encryption

The following links provide additional details:

AMD Memory Encryption whitepaper:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
    http://support.amd.com/TechDocs/24593.pdf
    SME is section 7.10
    SEV is section 15.34

Secure Encrypted Virtualization Key Management:
http://support.amd.com/TechDocs/55766_SEV-KM API_Specification.pdf

KVM Forum Presentation:
http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf

[1] http://marc.info/?l=linux-kernel&m=148725974113693&w=2

---

Based on the feedback, we have started adding SEV guest support to the OVMF
BIOS. This series has been tested using an EDK2/OVMF BIOS; the initial EDK2
patches have been submitted to the edk2 mailing list for discussion.

TODO:
 - add support for migration commands
 - update the QEMU RFCs to SEV spec 0.14
 - investigate virtio and vfio support for SEV guest
 - investigate SMM support for SEV guest
 - add support for nested virtualization

Changes since v1:
 - update to newer SEV key management API spec (0.12 -> 0.14)
 - expand the CCP driver and integrate the PSP interface support
 - remove the usage of SEV ref_count and release the SEV FW resources in
   kvm_x86_ops->vm_destroy
 - acquire the kvm->lock before executing the SEV commands and release on exit.
 - rename ioctl from KVM_SEV_ISSUE_CMD to KVM_MEMORY_ENCRYPT_OP
 - extend KVM_MEMORY_ENCRYPT_OP ioctl to require file descriptor for the SEV
   device. A program without access to /dev/sev will not be able to issue SEV
   commands
 - update vmcb on successful LAUNCH_FINISH to indicate that SEV is active
 - several fixes based on Paolo's review feedback
 - add APIs to support sharing the guest physical address with hypervisor
 - update kvm pvclock driver to use the shared buffer when SEV is active
 - pin the SEV guest memory

Brijesh Singh (18):
      x86: mm: Provide support to use memblock when spliting large pages
      x86: Add support for changing memory encryption attribute in early boot
      x86: kvm: Provide support to create Guest and HV shared per-CPU variables
      x86: kvmclock: Clear encryption attribute when SEV is active
      crypto: ccp: Introduce the AMD Secure Processor device
      crypto: ccp: Add Platform Security Processor (PSP) interface support
      crypto: ccp: Add Secure Encrypted Virtualization (SEV) interface support
      kvm: svm: prepare to reserve asid for SEV guest
      kvm: introduce KVM_MEMORY_ENCRYPT_OP ioctl
      kvm: x86: prepare for SEV guest management API support
      kvm: svm: Add support for SEV LAUNCH_START command
      kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command
      kvm: svm: Add support for SEV LAUNCH_FINISH command
      kvm: svm: Add support for SEV GUEST_STATUS command
      kvm: svm: Add support for SEV DEBUG_DECRYPT command
      kvm: svm: Add support for SEV DEBUG_ENCRYPT command
      kvm: svm: Add support for SEV LAUNCH_MEASURE command
      x86: kvm: Pin the guest memory when SEV is active

Tom Lendacky (14):
      x86: Add the Secure Encrypted Virtualization CPU feature
      x86: Secure Encrypted Virtualization (SEV) support
      KVM: SVM: prepare for new bit definition in nested_ctl
      KVM: SVM: Add SEV feature definitions to KVM
      x86: Use encrypted access of BOOT related data with SEV
      x86/pci: Use memremap when walking setup data
      x86/efi: Access EFI data as encrypted when SEV is active
      x86: Use PAGE_KERNEL protection for ioremap of memory page
      x86: Change early_ioremap to early_memremap for BOOT data
      x86: DMA support for SEV memory encryption
      x86: Unroll string I/O when SEV is active
      x86: Add early boot support when running with SEV active
      KVM: SVM: Enable SEV by setting the SEV_ENABLE CPU feature
      kvm: svm: Use the hardware provided GPA instead of page walk



 arch/x86/boot/compressed/Makefile      |    2 
 arch/x86/boot/compressed/head_64.S     |   16 
 arch/x86/boot/compressed/mem_encrypt.S |   75 ++
 arch/x86/include/asm/cpufeatures.h     |    1 
 arch/x86/include/asm/io.h              |   26 +
 arch/x86/include/asm/kvm_emulate.h     |    1 
 arch/x86/include/asm/kvm_host.h        |   19 +
 arch/x86/include/asm/mem_encrypt.h     |   29 +
 arch/x86/include/asm/msr-index.h       |    2 
 arch/x86/include/asm/svm.h             |    3 
 arch/x86/include/uapi/asm/hyperv.h     |    4 
 arch/x86/include/uapi/asm/kvm_para.h   |    4 
 arch/x86/kernel/acpi/boot.c            |    4 
 arch/x86/kernel/cpu/amd.c              |   22 +
 arch/x86/kernel/cpu/scattered.c        |    1 
 arch/x86/kernel/kvm.c                  |   43 +
 arch/x86/kernel/kvmclock.c             |   65 ++
 arch/x86/kernel/mem_encrypt_init.c     |   24 +
 arch/x86/kernel/mpparse.c              |   10 
 arch/x86/kvm/cpuid.c                   |    4 
 arch/x86/kvm/emulate.c                 |   20 -
 arch/x86/kvm/svm.c                     | 1051 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                     |   60 ++
 arch/x86/mm/ioremap.c                  |   44 +
 arch/x86/mm/mem_encrypt.c              |  143 ++++
 arch/x86/mm/pageattr.c                 |   51 +-
 arch/x86/pci/common.c                  |    4 
 arch/x86/platform/efi/efi_64.c         |   15 
 drivers/crypto/Kconfig                 |   10 
 drivers/crypto/ccp/Kconfig             |   55 +-
 drivers/crypto/ccp/Makefile            |   10 
 drivers/crypto/ccp/ccp-dev-v3.c        |   86 +--
 drivers/crypto/ccp/ccp-dev-v5.c        |   73 +-
 drivers/crypto/ccp/ccp-dev.c           |  137 ++--
 drivers/crypto/ccp/ccp-dev.h           |   35 -
 drivers/crypto/ccp/psp-dev.c           |  211 ++++++
 drivers/crypto/ccp/psp-dev.h           |  102 +++
 drivers/crypto/ccp/sev-dev.c           |  348 +++++++++++
 drivers/crypto/ccp/sev-dev.h           |   67 ++
 drivers/crypto/ccp/sev-ops.c           |  324 ++++++++++
 drivers/crypto/ccp/sp-dev.c            |  324 ++++++++++
 drivers/crypto/ccp/sp-dev.h            |  172 +++++
 drivers/crypto/ccp/sp-pci.c            |  328 ++++++++++
 drivers/crypto/ccp/sp-platform.c       |  268 ++++++++
 drivers/sfi/sfi_core.c                 |    6 
 include/asm-generic/vmlinux.lds.h      |    3 
 include/linux/ccp.h                    |    3 
 include/linux/mem_encrypt.h            |    6 
 include/linux/mm.h                     |    1 
 include/linux/percpu-defs.h            |    9 
 include/linux/psp-sev.h                |  672 ++++++++++++++++++++
 include/uapi/linux/Kbuild              |    1 
 include/uapi/linux/kvm.h               |  100 +++
 include/uapi/linux/psp-sev.h           |  123 ++++
 kernel/resource.c                      |   40 +
 55 files changed, 4991 insertions(+), 266 deletions(-)
 create mode 100644 arch/x86/boot/compressed/mem_encrypt.S
 create mode 100644 drivers/crypto/ccp/psp-dev.c
 create mode 100644 drivers/crypto/ccp/psp-dev.h
 create mode 100644 drivers/crypto/ccp/sev-dev.c
 create mode 100644 drivers/crypto/ccp/sev-dev.h
 create mode 100644 drivers/crypto/ccp/sev-ops.c
 create mode 100644 drivers/crypto/ccp/sp-dev.c
 create mode 100644 drivers/crypto/ccp/sp-dev.h
 create mode 100644 drivers/crypto/ccp/sp-pci.c
 create mode 100644 drivers/crypto/ccp/sp-platform.c
 create mode 100644 include/linux/psp-sev.h
 create mode 100644 include/uapi/linux/psp-sev.h


Thread overview: 424+ messages
2017-03-02 15:12 [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD) Brijesh Singh
2017-03-02 15:12 ` [RFC PATCH v2 01/32] x86: Add the Secure Encrypted Virtualization CPU feature Brijesh Singh
2017-03-03 16:59   ` Borislav Petkov
2017-03-03 21:01     ` Brijesh Singh
2017-03-04 10:11       ` Borislav Petkov
2017-03-06 18:11         ` Brijesh Singh
2017-03-06 20:54           ` Borislav Petkov
2017-03-02 15:12 ` [RFC PATCH v2 02/32] x86: Secure Encrypted Virtualization (SEV) support Brijesh Singh
2017-03-07 11:19   ` Borislav Petkov
2017-03-08 15:06   ` Borislav Petkov
2017-03-02 15:12 ` [RFC PATCH v2 03/32] KVM: SVM: prepare for new bit definition in nested_ctl Brijesh Singh
2017-03-02 15:12 ` [RFC PATCH v2 04/32] KVM: SVM: Add SEV feature definitions to KVM Brijesh Singh
2017-03-07  0:50   ` Borislav Petkov
2017-03-02 15:12 ` [RFC PATCH v2 05/32] x86: Use encrypted access of BOOT related data with SEV Brijesh Singh
2017-03-07 11:09   ` Borislav Petkov
2017-03-16 19:03     ` Tom Lendacky
2017-03-02 15:13 ` [RFC PATCH v2 06/32] x86/pci: Use memremap when walking setup data Brijesh Singh
2017-03-03 20:42   ` Bjorn Helgaas
2017-03-03 21:15     ` Tom Lendacky
2017-03-07  0:03       ` Bjorn Helgaas
2017-03-13 20:08         ` Tom Lendacky
2017-03-02 15:13 ` [RFC PATCH v2 07/32] x86/efi: Access EFI data as encrypted when SEV is active Brijesh Singh
2017-03-07 11:57   ` Borislav Petkov
2017-03-02 15:13 ` [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page Brijesh Singh
2017-03-07 14:59   ` Borislav Petkov
2017-03-16 20:04     ` Tom Lendacky
2017-03-17 14:32       ` Tom Lendacky
2017-03-17 14:55         ` Tom Lendacky
2017-03-02 15:13 ` [RFC PATCH v2 09/32] x86: Change early_ioremap to early_memremap for BOOT data Brijesh Singh
2017-03-08  8:46   ` Borislav Petkov
2017-03-02 15:14 ` [RFC PATCH v2 10/32] x86: DMA support for SEV memory encryption Brijesh Singh
2017-03-08 10:56   ` Borislav Petkov
2017-03-02 15:14 ` [RFC PATCH v2 11/32] x86: Unroll string I/O when SEV is active Brijesh Singh
2017-03-02 15:14 ` [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active Brijesh Singh
2017-03-09 14:07   ` Borislav Petkov
2017-03-09 16:13     ` Paolo Bonzini
2017-03-09 16:29       ` Borislav Petkov
2017-03-10 16:35         ` Brijesh Singh
2017-03-16 10:16           ` Borislav Petkov
2017-03-16 14:28             ` Tom Lendacky
2017-03-16 15:09               ` Borislav Petkov
2017-03-16 16:11                 ` Tom Lendacky
2017-03-16 16:29                   ` Borislav Petkov
2017-03-02 15:15 ` [RFC PATCH v2 13/32] KVM: SVM: Enable SEV by setting the SEV_ENABLE CPU feature Brijesh Singh
2017-03-09 19:29   ` Borislav Petkov
2017-03-02 15:15 ` [RFC PATCH v2 14/32] x86: mm: Provide support to use memblock when spliting large pages Brijesh Singh
2017-03-10 11:06   ` Borislav Petkov
2017-03-10 22:41     ` Brijesh Singh
2017-03-16 13:15       ` Paolo Bonzini
2017-03-16 18:28       ` Borislav Petkov
2017-03-16 22:25         ` Paolo Bonzini
2017-03-17 10:17           ` Borislav Petkov
2017-03-17 10:47             ` Paolo Bonzini
2017-03-17 10:56               ` Borislav Petkov
2017-03-17 11:03                 ` Paolo Bonzini
2017-03-17 11:33                   ` Borislav Petkov
2017-03-17 14:45                     ` Paolo Bonzini
     [not found]                       ` <b516a873-029a-b20a-3c43-d8bf4a200cb7-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-03-18 16:37                         ` Borislav Petkov
2017-04-06 14:05             ` Brijesh Singh
2017-04-06 17:25               ` Borislav Petkov
2017-04-06 18:37                 ` Brijesh Singh
2017-04-07 11:33                   ` Borislav Petkov
2017-04-07 14:50                     ` Brijesh Singh
2017-03-16 12:28   ` Paolo Bonzini
2017-03-02 15:15 ` [RFC PATCH v2 15/32] x86: Add support for changing memory encryption attribute in early boot Brijesh Singh
2017-03-24 17:12   ` Borislav Petkov
2017-03-27 15:07     ` Brijesh Singh
2017-03-02 15:15 ` [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables Brijesh Singh
2017-03-16 11:06   ` Paolo Bonzini
2017-03-28 18:39   ` Borislav Petkov
2017-03-29 15:21     ` Paolo Bonzini
2017-03-29 15:32       ` Borislav Petkov
2017-03-02 15:15 ` [RFC PATCH v2 17/32] x86: kvmclock: Clear encryption attribute when SEV is active Brijesh Singh
2017-03-02 15:16 ` [RFC PATCH v2 18/32] kvm: svm: Use the hardware provided GPA instead of page walk Brijesh Singh
2017-03-29 15:14   ` Borislav Petkov
2017-03-29 17:08     ` Brijesh Singh
2017-03-02 15:16 ` [RFC PATCH v2 19/32] crypto: ccp: Introduce the AMD Secure Processor device Brijesh Singh
2017-03-02 17:39   ` Mark Rutland
2017-03-02 19:11     ` Brijesh Singh
2017-03-03 13:55       ` Andy Shevchenko
2017-03-02 15:16 ` [RFC PATCH v2 20/32] crypto: ccp: Add Platform Security Processor (PSP) interface support Brijesh Singh
2017-03-02 15:16 ` [RFC PATCH v2 21/32] crypto: ccp: Add Secure Encrypted Virtualization (SEV) " Brijesh Singh
2017-03-02 15:16 ` [RFC PATCH v2 22/32] kvm: svm: prepare to reserve asid for SEV guest Brijesh Singh
2017-03-02 15:17 ` [RFC PATCH v2 23/32] kvm: introduce KVM_MEMORY_ENCRYPT_OP ioctl Brijesh Singh
2017-03-16 10:25   ` Paolo Bonzini
2017-03-02 15:17 ` [RFC PATCH v2 24/32] kvm: x86: prepare for SEV guest management API support Brijesh Singh
2017-03-16 10:33   ` Paolo Bonzini
2017-03-02 15:17 ` [RFC PATCH v2 25/32] kvm: svm: Add support for SEV LAUNCH_START command Brijesh Singh
2017-03-02 15:17 ` [RFC PATCH v2 26/32] kvm: svm: Add support for SEV LAUNCH_UPDATE_DATA command Brijesh Singh
2017-03-16 10:48   ` Paolo Bonzini
2017-03-16 18:20     ` Brijesh Singh
2017-03-02 15:17 ` [RFC PATCH v2 27/32] kvm: svm: Add support for SEV LAUNCH_FINISH command Brijesh Singh
2017-03-02 15:18 ` [RFC PATCH v2 28/32] kvm: svm: Add support for SEV GUEST_STATUS command Brijesh Singh
2017-03-02 15:18 ` [RFC PATCH v2 29/32] kvm: svm: Add support for SEV DEBUG_DECRYPT command Brijesh Singh
2017-03-16 10:54   ` Paolo Bonzini
2017-03-16 18:41     ` Brijesh Singh
2017-03-17 11:09       ` Paolo Bonzini
2017-03-02 15:18 ` [RFC PATCH v2 30/32] kvm: svm: Add support for SEV DEBUG_ENCRYPT command Brijesh Singh
2017-03-16 11:03   ` Paolo Bonzini
2017-03-16 18:34     ` Brijesh Singh
2017-03-02 15:18 ` [RFC PATCH v2 31/32] kvm: svm: Add support for SEV LAUNCH_MEASURE command Brijesh Singh
2017-03-02 15:18 ` [RFC PATCH v2 32/32] x86: kvm: Pin the guest memory when SEV is active Brijesh Singh
2017-03-16 10:38   ` Paolo Bonzini
2017-03-16 18:17     ` Brijesh Singh
2017-03-03 20:33 ` [RFC PATCH v2 00/32] x86: Secure Encrypted Virtualization (AMD) Bjorn Helgaas
2017-03-03 20:51   ` Borislav Petkov
2017-03-03 21:15   ` Brijesh Singh
