* [PATCH v6 00/18] Support SDEI Virtualization
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This series intends to virtualize the Software Delegated Exception
Interface (SDEI), which is defined by the DEN0054C (v1.1) specification.
It allows the hypervisor to deliver NMI-like SDEI events to the guest,
which is needed by Async PF to deliver page-not-present notifications
from the hypervisor to the guest. The specification, the code and the
required QEMU changes can be found at:

   https://developer.arm.com/documentation/den0054/c
   https://github.com/gwshan/linux    ("kvm/arm64_sdei")
   https://github.com/gwshan/qemu     ("kvm/arm64_sdei")

The design follows the specification in a straightforward way. The
SDEI events are classified into shared and private ones according
to their scope: the shared event is system or VM scoped, while the
private event is vcpu scoped. This implementation doesn't support
the shared events because all the needed events are private. Besides,
migration isn't supported by this implementation yet; it's something
to be supported in future.

There are several objects (data structures) introduced to support the
event registration, enablement, disablement, unregistration, reset,
delivery and handling. A rough sketch of them follows the list below.

  * kvm_sdei_exposed_event
    The events which are defined and exposed by KVM. An event can't
    be registered until it's exposed. Besides, the information in an
    exposed event can't be changed afterwards.

  * kvm_sdei_event
    The events created based on the exposed events. Their states are
    changed when hypercalls are received or when they are delivered
    to the guest for handling.

  * kvm_sdei_vcpu_context
    The vcpu context helps to handle events. The interrupted context
    is saved before the event handler is executed, and restored after
    the event handler finishes.

  * kvm_sdei_vcpu
    Placeholder for all objects of one particular VCPU.
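
For orientation, below is a minimal C sketch of how these objects might
relate to each other. All field names here are illustrative assumptions
only; the real definitions live in arch/arm64/include/asm/kvm_sdei.h:

    /* Illustrative sketch only -- field names are hypothetical */
    struct kvm_sdei_exposed_event {
            unsigned int   num;        /* event number, e.g. 0x0    */
            unsigned char  type;       /* private only, for now     */
            unsigned char  priority;   /* normal or critical        */
    };

    struct kvm_sdei_event {
            struct kvm_sdei_exposed_event *exposed; /* immutable template */
            unsigned long  ep_address; /* guest entry point         */
            unsigned long  ep_arg;     /* argument for the handler  */
            unsigned int   state;      /* registered/enabled/...    */
    };

    struct kvm_sdei_vcpu_context {
            unsigned long  regs[18];   /* x0-x17 at interruption    */
            unsigned long  pc;
            unsigned long  pstate;
    };

    struct kvm_sdei_vcpu {
            struct kvm_sdei_event        *events;  /* per-vcpu instances */
            struct kvm_sdei_vcpu_context  context; /* saved on delivery  */
            bool                          masked;  /* PE_MASK/PE_UNMASK  */
    };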

The patches are organized as below:

  PATCH[01-02] Preparatory work to extend smccc_get_argx() and refactor
               the hypercall routing mechanism
  PATCH[03]    Adds SDEI virtualization infrastructure
  PATCH[04-16] Supports various SDEI hypercalls and event handling
  PATCH[17]    Exposes SDEI capability
  PATCH[18]    Adds SDEI selftest case
  
The previous revisions can be found at:

  v5: https://lore.kernel.org/kvmarm/20220322080710.51727-1-gshan@redhat.com/
  v4: https://lore.kernel.org/kvmarm/20210815001352.81927-1-gshan@redhat.com/
  v3: https://lore.kernel.org/kvmarm/20210507083124.43347-1-gshan@redhat.com/
  v2: https://lore.kernel.org/kvmarm/20210209032733.99996-1-gshan@redhat.com/
  v1: https://lore.kernel.org/kvmarm/20200817100531.83045-1-gshan@redhat.com/

Testing
=======
[1] The selftest case included in this series works fine. The default
    SDEI event, whose number is zero, can be registered, enabled and
    raised, and the SDEI event handler is invoked.

    [host]# pwd
    /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
    [root@virtlab-arm01 kvm]# ./aarch64/sdei 

        NR_VCPUS: 2    SDEI Event: 0x00000000

    --- VERSION
        Version:              1.1 (vendor: 0x4b564d)
    --- FEATURES
        Shared event slots:   0
        Private event slots:  0
        Relative mode:        No
    --- PRIVATE_RESET
    --- SHARED_RESET
    --- PE_UNMASK
    --- EVENT_GET_INFO
        Type:                 Private
        Priority:             Normal
        Signaled:             Yes
    --- EVENT_REGISTER
    --- EVENT_ENABLE
    --- EVENT_SIGNAL
        Handled:              Yes
        IRQ:                  No
        Status:               Registered-Enabled-Running
        PC/PSTATE:            000000000040232c 00000000600003c5
        Regs:                 0000000000000000 0000000000000000
                              0000000000000000 0000000000000000
    --- PE_MASK
    --- EVENT_DISABLE
    --- EVENT_UNREGISTER

        Result: OK

[2] There are additional patches in the following repositories to create
    procfs entries, allowing SDEI events to be injected from the host
    side. The SDEI client in the guest registers the default SDEI event,
    whose number is zero. Also, QEMU exports the SDEI ACPI table and
    supports migration for SDEI.

    https://github.com/gwshan/linux    ("kvm/arm64_sdei")
    https://github.com/gwshan/qemu     ("kvm/arm64_sdei")

    [2.1] Start the guests. The destination VM is started with
          "-incoming" so that it waits for the migration.

    [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64       \
            -accel kvm -machine virt,gic-version=host                     \
            -cpu host -smp 6,sockets=2,cores=3,threads=1                  \
            -m 1024M,slots=16,maxmem=64G                                  \
               :                                                          \
            -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
            -initrd /home/gavin/sandbox/images/rootfs.cpio.xz             \
            -append earlycon=pl011,mmio,0x9000000                         \
               :

    [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64       \
            -accel kvm -machine virt,gic-version=host                     \
            -cpu host -smp 6,sockets=2,cores=3,threads=1                  \
            -m 1024M,slots=16,maxmem=64G                                  \
               :                                                          \
            -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
            -initrd /home/gavin/sandbox/images/rootfs.cpio.xz             \
            -append earlycon=pl011,mmio,0x9000000                         \
            -incoming tcp:0:4444                                          \
               :

    [2.2] Check the kernel log on the source VM. The SDEI service is
          detected, and the default SDEI event (0x0) is registered
          and enabled.

     [guest-src]# dmesg | grep -i sdei
     ACPI: SDEI 0x000000005BC80000 000024 \
                (v00 BOCHS  BXPC     00000001 BXPC 00000001)
     sdei: SDEIv1.1 (0x4b564d) detected in firmware.
     SDEI TEST: Version 1.1, Vendor 0x4b564d
     sdei_init: SDEI event (0x0) registered
     sdei_init: SDEI event (0x0) enabled
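
    The sdei_init messages above come from a small test client in the
    guest kernel. Using the in-kernel SDEI API from
    include/linux/arm_sdei.h, such a client would look roughly like the
    sketch below; the handler body is an assumption, only the API calls
    are real:

        #include <linux/arm_sdei.h>
        #include <linux/init.h>
        #include <linux/printk.h>

        static int sdei_test_handler(u32 event, struct pt_regs *regs,
                                     void *arg)
        {
                /* Runs in NMI-like context when the event is delivered */
                pr_info("SDEI event (0x%x) handled\n", event);
                return 0;
        }

        static int __init sdei_test_init(void)
        {
                /* Register and enable the default private event 0x0 */
                sdei_event_register(0, sdei_test_handler, NULL);
                sdei_event_enable(0);
                return 0;
        }
        late_initcall(sdei_test_init);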

 

    [2.3] Migrate the source VM to the destination VM. Inject SDEI event
          to the destination VM. The event is raised and handled.

    (qemu) migrate -d tcp:localhost:4444

    [host]# echo 0 > /proc/kvm/kvm-5360/vcpu-1

    [guest-dst]#
    =========== SDEI Event (CPU#1) ===========
    Event: 0000000000000000  Parameter: 00000000dabfdabf
    PC: ffff800008cbb554  PSTATE: 00000000604000c5  SP: ffff800009c7bde0
    Regs:    00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001 
             ffff800016c28000 0000000000000000 0000000000000000 0000000000000000 
             0000000000000000 0000000000000000 0000000000000000 0000000000000000 
             0000000000000000 0000000000000000 0000000000000000 0000000000000000 
             0000000000000000 0000000000000000 0000000000000000 ffff800009399008 
             ffff8000097d9af0 ffff8000097d99f8 ffff8000093a8db8 ffff8000097d9b18 
             0000000000000000 0000000000000000 ffff000000339d00 0000000000000000 
             0000000000000000 ffff800009c7bde0 ffff800008cbb5c4 
    Context: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001 
             ffff800016c28000 03ffffffffffffff 000000024325db59 ffff8000097de190 
             ffff00000033a790 ffff800008cbb814 0000000000000a30 0000000000000000 

Changelog
========= 
v6:
   * Rebased to v5.18.rc1                                     (Gavin)
   * Pass additional argument to smccc_get_arg()              (Oliver) 
   * Add preparatory patch to route hypercalls based on their
     owners                                                   (Oliver)
   * Remove the support for shared events                     (Oliver/Gavin)
   * Remove the support for migration; add-on patches will
     support it in future                                     (Oliver)
   * The events are exposed by KVM instead of VMM             (Oliver)
   * kvm_sdei_state.h is dropped and all the structures are
     folded into the corresponding ones in kvm_sdei.h         (Oliver)
   * Rename 'struct kvm_sdei_registered_event' to
     'struct kvm_sdei_event'                                  (Oliver)
   * Misc comments from Oliver Upton                          (Oliver)
v5:
   * Rebased to v5.17.rc7                                     (Gavin)
   * Unified names for the objects, data structures, variables
     and functions. The events have been named as exposed,
     registered and vcpu events. The states that need to be
     migrated are put into kvm_sdei_state.h                   (Eric)
   * More inline functions to visit SDEI event's properties   (Eric)
   * Support unregistration pending state                     (Eric)
   * Support v1.1 SDEI specification                          (Eric)
   * Fold the code to inject, deliver and handle SDEI event
     from PATCH[v4 13/18/19] into PATCH[v5 13]                (Eric)
   * Simplified ioctl interface to visit all events at once   (Eric/Gavin)
   * Improved the reference count and avoided migrating it.
     Also, a limit on memory allocation is added based on it  (Eric)
   * Change the return values from hypercall functions        (Eric) 
   * Validate @ksdei and @vsdei in kvm_sdei_hypercall()       (Shannon)
   * Add document to explain how SDEI virtualization and the
     migration are supported                                  (Eric)
   * Improved selftest case to inject and handle SDEI event   (Gavin)
   * Improved comments and commit logs                        (Eric)
   * Address misc comments from Eric. Hopefully, all of them
     are covered in v5 because Eric provided lots of comments
     in the last round of review                              (Eric)
v4:
   * Rebased to v5.14.rc5                                     (Gavin)
v3:
   * Rebased to v5.13.rc1                                     (Gavin)
   * Use linux data types in kvm_sdei.h                       (Gavin)
v2:
   * Rebased to v5.11.rc6                                     (Gavin)
   * Dropped changes related to SDEI client driver            (Gavin)
   * Removed support for passthrough SDEI events              (Gavin)
   * Redesigned data structures                               (Gavin)
   * The implementation is almost rewritten as the data
     structures have totally changed                          (Gavin)
   * Added ioctl commands to support migration                (Gavin)

Gavin Shan (18):
  KVM: arm64: Extend smccc_get_argx()
  KVM: arm64: Route hypercalls based on their owner
  KVM: arm64: Add SDEI virtualization infrastructure
  KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
  KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE}
  KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
  KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall
  KVM: arm64: Support SDEI_EVENT_STATUS hypercall
  KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall
  KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall
  KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET
  KVM: arm64: Support SDEI event injection, delivery
  KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME}
    hypercall
  KVM: arm64: Support SDEI_EVENT_SIGNAL hypercall
  KVM: arm64: Support SDEI_FEATURES hypercall
  KVM: arm64: Support SDEI_VERSION hypercall
  KVM: arm64: Expose SDEI capability
  KVM: selftests: Add SDEI test case

 Documentation/virt/kvm/api.rst             |  11 +
 arch/arm64/include/asm/kvm_emulate.h       |   1 +
 arch/arm64/include/asm/kvm_host.h          |   3 +
 arch/arm64/include/asm/kvm_sdei.h          | 155 ++++
 arch/arm64/kvm/Makefile                    |   2 +-
 arch/arm64/kvm/arm.c                       |   8 +
 arch/arm64/kvm/hyp/exception.c             |   7 +
 arch/arm64/kvm/hypercalls.c                | 204 +++--
 arch/arm64/kvm/inject_fault.c              |  29 +
 arch/arm64/kvm/psci.c                      |  14 +-
 arch/arm64/kvm/pvtime.c                    |   2 +-
 arch/arm64/kvm/sdei.c                      | 902 +++++++++++++++++++++
 arch/arm64/kvm/trng.c                      |   4 +-
 include/kvm/arm_hypercalls.h               |  19 +-
 include/uapi/linux/arm_sdei.h              |   4 +
 include/uapi/linux/kvm.h                   |   1 +
 tools/testing/selftests/kvm/Makefile       |   1 +
 tools/testing/selftests/kvm/aarch64/sdei.c | 498 ++++++++++++
 18 files changed, 1767 insertions(+), 98 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_sdei.h
 create mode 100644 arch/arm64/kvm/sdei.c
 create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c

-- 
2.23.0



* [PATCH v6 01/18] KVM: arm64: Extend smccc_get_argx()
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Currently, there are 3 inline functions to retrieve SMCCC arguments,
so the number of retrievable arguments is limited to 3. We need to
retrieve more SMCCC arguments when SDEI virtualization is supported.

This introduces smccc_get_arg(), which accepts @index to indicate
which SMCCC argument is to be retrieved. Besides, smccc_get_function()
also calls into this newly introduced helper. Furthermore, all calls
to smccc_get_{arg1, arg2, arg3}() are mechanically replaced with the
newly introduced helper.
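
With this in place, a caller can retrieve arguments beyond the third,
which the fixed helpers couldn't express. A hypothetical example:

    /* Hypothetical: fetch the 4th SMCCC argument, as SDEI will need */
    unsigned long arg = smccc_get_arg(vcpu, 4);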

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/hypercalls.c  |  4 ++--
 arch/arm64/kvm/psci.c        | 14 +++++++-------
 arch/arm64/kvm/pvtime.c      |  2 +-
 arch/arm64/kvm/trng.c        |  4 ++--
 include/kvm/arm_hypercalls.h | 19 +++++--------------
 5 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 202b8c455724..8438fd79e3f0 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -34,7 +34,7 @@ static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
 	 * (virtual or physical) with the first argument of the SMCCC
 	 * call. In case the identifier is not supported, error out.
 	 */
-	feature = smccc_get_arg1(vcpu);
+	feature = smccc_get_arg(vcpu, 1);
 	switch (feature) {
 	case KVM_PTP_VIRT_COUNTER:
 		cycles = systime_snapshot.cycles - vcpu_read_sys_reg(vcpu, CNTVOFF_EL2);
@@ -70,7 +70,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 		val[0] = ARM_SMCCC_VERSION_1_1;
 		break;
 	case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
-		feature = smccc_get_arg1(vcpu);
+		feature = smccc_get_arg(vcpu, 1);
 		switch (feature) {
 		case ARM_SMCCC_ARCH_WORKAROUND_1:
 			switch (arm64_get_spectre_v2_state()) {
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 372da09a2fab..3aaa4921f3b3 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -71,7 +71,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	struct kvm_vcpu *vcpu = NULL;
 	unsigned long cpu_id;
 
-	cpu_id = smccc_get_arg1(source_vcpu);
+	cpu_id = smccc_get_arg(source_vcpu, 1);
 	if (!kvm_psci_valid_affinity(source_vcpu, cpu_id))
 		return PSCI_RET_INVALID_PARAMS;
 
@@ -92,7 +92,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 
 	reset_state = &vcpu->arch.reset_state;
 
-	reset_state->pc = smccc_get_arg2(source_vcpu);
+	reset_state->pc = smccc_get_arg(source_vcpu, 2);
 
 	/* Propagate caller endianness */
 	reset_state->be = kvm_vcpu_is_be(source_vcpu);
@@ -101,7 +101,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	 * NOTE: We always update r0 (or x0) because for PSCI v0.1
 	 * the general purpose registers are undefined upon CPU_ON.
 	 */
-	reset_state->r0 = smccc_get_arg3(source_vcpu);
+	reset_state->r0 = smccc_get_arg(source_vcpu, 3);
 
 	WRITE_ONCE(reset_state->reset, true);
 	kvm_make_request(KVM_REQ_VCPU_RESET, vcpu);
@@ -128,8 +128,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_vcpu *tmp;
 
-	target_affinity = smccc_get_arg1(vcpu);
-	lowest_affinity_level = smccc_get_arg2(vcpu);
+	target_affinity = smccc_get_arg(vcpu, 1);
+	lowest_affinity_level = smccc_get_arg(vcpu, 2);
 
 	if (!kvm_psci_valid_affinity(vcpu, target_affinity))
 		return PSCI_RET_INVALID_PARAMS;
@@ -326,7 +326,7 @@ static int kvm_psci_1_x_call(struct kvm_vcpu *vcpu, u32 minor)
 		val = minor == 0 ? KVM_ARM_PSCI_1_0 : KVM_ARM_PSCI_1_1;
 		break;
 	case PSCI_1_0_FN_PSCI_FEATURES:
-		arg = smccc_get_arg1(vcpu);
+		arg = smccc_get_arg(vcpu, 1);
 		val = kvm_psci_check_allowed_function(vcpu, arg);
 		if (val)
 			break;
@@ -364,7 +364,7 @@ static int kvm_psci_1_x_call(struct kvm_vcpu *vcpu, u32 minor)
 		fallthrough;
 	case PSCI_1_1_FN64_SYSTEM_RESET2:
 		if (minor >= 1) {
-			arg = smccc_get_arg1(vcpu);
+			arg = smccc_get_arg(vcpu, 1);
 
 			if (arg <= PSCI_1_1_RESET_TYPE_SYSTEM_WARM_RESET ||
 			    arg >= PSCI_1_1_RESET_TYPE_VENDOR_START) {
diff --git a/arch/arm64/kvm/pvtime.c b/arch/arm64/kvm/pvtime.c
index 78a09f7a6637..05e775fc9e8b 100644
--- a/arch/arm64/kvm/pvtime.c
+++ b/arch/arm64/kvm/pvtime.c
@@ -34,7 +34,7 @@ void kvm_update_stolen_time(struct kvm_vcpu *vcpu)
 
 long kvm_hypercall_pv_features(struct kvm_vcpu *vcpu)
 {
-	u32 feature = smccc_get_arg1(vcpu);
+	u32 feature = smccc_get_arg(vcpu, 1);
 	long val = SMCCC_RET_NOT_SUPPORTED;
 
 	switch (feature) {
diff --git a/arch/arm64/kvm/trng.c b/arch/arm64/kvm/trng.c
index 99bdd7103c9c..89911b724a26 100644
--- a/arch/arm64/kvm/trng.c
+++ b/arch/arm64/kvm/trng.c
@@ -24,7 +24,7 @@ static const uuid_t arm_smc_trng_uuid __aligned(4) = UUID_INIT(
 static int kvm_trng_do_rnd(struct kvm_vcpu *vcpu, int size)
 {
 	DECLARE_BITMAP(bits, TRNG_MAX_BITS64);
-	u32 num_bits = smccc_get_arg1(vcpu);
+	u32 num_bits = smccc_get_arg(vcpu, 1);
 	int i;
 
 	if (num_bits > 3 * size) {
@@ -60,7 +60,7 @@ int kvm_trng_call(struct kvm_vcpu *vcpu)
 		val = ARM_SMCCC_TRNG_VERSION_1_0;
 		break;
 	case ARM_SMCCC_TRNG_FEATURES:
-		switch (smccc_get_arg1(vcpu)) {
+		switch (smccc_get_arg(vcpu, 1)) {
 		case ARM_SMCCC_TRNG_VERSION:
 		case ARM_SMCCC_TRNG_FEATURES:
 		case ARM_SMCCC_TRNG_GET_UUID:
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 0e2509d27910..723d2865c055 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -8,24 +8,15 @@
 
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
 
-static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
-{
-	return vcpu_get_reg(vcpu, 0);
-}
-
-static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
+static inline unsigned long smccc_get_arg(struct kvm_vcpu *vcpu,
+					  unsigned char index)
 {
-	return vcpu_get_reg(vcpu, 1);
+	return vcpu_get_reg(vcpu, index);
 }
 
-static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
-{
-	return vcpu_get_reg(vcpu, 2);
-}
-
-static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
+static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
 {
-	return vcpu_get_reg(vcpu, 3);
+	return smccc_get_arg(vcpu, 0);
 }
 
 static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
-- 
2.23.0



* [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

kvm_hvc_call_handler() either directly handles the incoming hypercall
or routes it based on its (function) ID. kvm_psci_call() becomes the
gatekeeper which handles any hypercall that can't be handled by
anyone else. It makes kvm_hvc_call_handler() a bit messy.

This reorganizes the code to route each hypercall to the corresponding
handler based on its owner. The hypercall may be handled directly
in that handler or routed further to the corresponding functionality.
The (function) ID is always verified before the call is routed to the
corresponding functionality. By the way, @func_id is replaced by
@func, to be consistent with smccc_get_function().

PSCI is the only exception: the hypercalls defined by v0.2 or beyond
are routed to the handler for the Standard Secure Service, while
those defined in v0.1 are routed to the handler for the Standard
Hypervisor Service.
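
For reference, the owner is a 6-bit field in the SMCCC function ID
(bits [29:24]); ARM_SMCCC_OWNER_NUM() from include/linux/arm-smccc.h
is equivalent to:

    /* Owner field: bits [29:24] of the SMCCC function ID */
    #define ARM_SMCCC_OWNER_MASK    0x3F
    #define ARM_SMCCC_OWNER_SHIFT   24
    #define ARM_SMCCC_OWNER_NUM(id) \
            (((id) >> ARM_SMCCC_OWNER_SHIFT) & ARM_SMCCC_OWNER_MASK)

    /* 0 = ARCH, 4 = STANDARD, 5 = STANDARD_HYP, 6 = VENDOR_HYP, ... */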

Suggested-by: Oliver Upton <oupton@google.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/hypercalls.c | 199 +++++++++++++++++++++++-------------
 1 file changed, 127 insertions(+), 72 deletions(-)

diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 8438fd79e3f0..b659387d8919 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -9,65 +9,14 @@
 #include <kvm/arm_hypercalls.h>
 #include <kvm/arm_psci.h>
 
-static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
-{
-	struct system_time_snapshot systime_snapshot;
-	u64 cycles = ~0UL;
-	u32 feature;
-
-	/*
-	 * system time and counter value must captured at the same
-	 * time to keep consistency and precision.
-	 */
-	ktime_get_snapshot(&systime_snapshot);
-
-	/*
-	 * This is only valid if the current clocksource is the
-	 * architected counter, as this is the only one the guest
-	 * can see.
-	 */
-	if (systime_snapshot.cs_id != CSID_ARM_ARCH_COUNTER)
-		return;
-
-	/*
-	 * The guest selects one of the two reference counters
-	 * (virtual or physical) with the first argument of the SMCCC
-	 * call. In case the identifier is not supported, error out.
-	 */
-	feature = smccc_get_arg(vcpu, 1);
-	switch (feature) {
-	case KVM_PTP_VIRT_COUNTER:
-		cycles = systime_snapshot.cycles - vcpu_read_sys_reg(vcpu, CNTVOFF_EL2);
-		break;
-	case KVM_PTP_PHYS_COUNTER:
-		cycles = systime_snapshot.cycles;
-		break;
-	default:
-		return;
-	}
-
-	/*
-	 * This relies on the top bit of val[0] never being set for
-	 * valid values of system time, because that is *really* far
-	 * in the future (about 292 years from 1970, and at that stage
-	 * nobody will give a damn about it).
-	 */
-	val[0] = upper_32_bits(systime_snapshot.real);
-	val[1] = lower_32_bits(systime_snapshot.real);
-	val[2] = upper_32_bits(cycles);
-	val[3] = lower_32_bits(cycles);
-}
-
-int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
+static int kvm_hvc_arch(struct kvm_vcpu *vcpu, u32 func)
 {
-	u32 func_id = smccc_get_function(vcpu);
-	u64 val[4] = {SMCCC_RET_NOT_SUPPORTED};
+	u64 val = SMCCC_RET_NOT_SUPPORTED;
 	u32 feature;
-	gpa_t gpa;
 
-	switch (func_id) {
+	switch (func) {
 	case ARM_SMCCC_VERSION_FUNC_ID:
-		val[0] = ARM_SMCCC_VERSION_1_1;
+		val = ARM_SMCCC_VERSION_1_1;
 		break;
 	case ARM_SMCCC_ARCH_FEATURES_FUNC_ID:
 		feature = smccc_get_arg(vcpu, 1);
@@ -77,10 +26,10 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 			case SPECTRE_VULNERABLE:
 				break;
 			case SPECTRE_MITIGATED:
-				val[0] = SMCCC_RET_SUCCESS;
+				val = SMCCC_RET_SUCCESS;
 				break;
 			case SPECTRE_UNAFFECTED:
-				val[0] = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
+				val = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
 				break;
 			}
 			break;
@@ -103,7 +52,7 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 					break;
 				fallthrough;
 			case SPECTRE_UNAFFECTED:
-				val[0] = SMCCC_RET_NOT_REQUIRED;
+				val = SMCCC_RET_NOT_REQUIRED;
 				break;
 			}
 			break;
@@ -112,26 +61,120 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 			case SPECTRE_VULNERABLE:
 				break;
 			case SPECTRE_MITIGATED:
-				val[0] = SMCCC_RET_SUCCESS;
+				val = SMCCC_RET_SUCCESS;
 				break;
 			case SPECTRE_UNAFFECTED:
-				val[0] = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
+				val = SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED;
 				break;
 			}
 			break;
 		case ARM_SMCCC_HV_PV_TIME_FEATURES:
-			val[0] = SMCCC_RET_SUCCESS;
+			val = SMCCC_RET_SUCCESS;
 			break;
 		}
-		break;
+	}
+
+	smccc_set_retval(vcpu, val, 0, 0, 0);
+	return 1;
+}
+
+static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
+{
+	u64 val = SMCCC_RET_NOT_SUPPORTED;
+
+	switch (func) {
+	case ARM_SMCCC_TRNG_VERSION ... ARM_SMCCC_TRNG_RND32:
+	case ARM_SMCCC_TRNG_RND64:
+		return kvm_trng_call(vcpu);
+	case PSCI_0_2_FN_PSCI_VERSION ... PSCI_0_2_FN_SYSTEM_RESET:
+	case PSCI_0_2_FN64_CPU_SUSPEND ... PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
+	case PSCI_1_0_FN_PSCI_FEATURES ... PSCI_1_0_FN_SET_SUSPEND_MODE:
+	case PSCI_1_0_FN64_SYSTEM_SUSPEND:
+	case PSCI_1_1_FN_SYSTEM_RESET2:
+	case PSCI_1_1_FN64_SYSTEM_RESET2:
+		return kvm_psci_call(vcpu);
+	}
+
+	smccc_set_retval(vcpu, val, 0, 0, 0);
+	return 1;
+}
+
+static int kvm_hvc_standard_hyp(struct kvm_vcpu *vcpu, u32 func)
+{
+	u64 val = SMCCC_RET_NOT_SUPPORTED;
+	gpa_t gpa;
+
+	switch (func) {
 	case ARM_SMCCC_HV_PV_TIME_FEATURES:
-		val[0] = kvm_hypercall_pv_features(vcpu);
+		val = kvm_hypercall_pv_features(vcpu);
 		break;
 	case ARM_SMCCC_HV_PV_TIME_ST:
 		gpa = kvm_init_stolen_time(vcpu);
 		if (gpa != GPA_INVALID)
-			val[0] = gpa;
+			val = gpa;
 		break;
+	case KVM_PSCI_FN_CPU_SUSPEND ... KVM_PSCI_FN_MIGRATE:
+		return kvm_psci_call(vcpu);
+	}
+
+	smccc_set_retval(vcpu, val, 0, 0, 0);
+	return 1;
+}
+
+static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
+{
+	struct system_time_snapshot systime_snapshot;
+	u64 cycles = ~0UL;
+	u32 feature;
+
+	/*
+	 * system time and counter value must captured at the same
+	 * time to keep consistency and precision.
+	 */
+	ktime_get_snapshot(&systime_snapshot);
+
+	/*
+	 * This is only valid if the current clocksource is the
+	 * architected counter, as this is the only one the guest
+	 * can see.
+	 */
+	if (systime_snapshot.cs_id != CSID_ARM_ARCH_COUNTER)
+		return;
+
+	/*
+	 * The guest selects one of the two reference counters
+	 * (virtual or physical) with the first argument of the SMCCC
+	 * call. In case the identifier is not supported, error out.
+	 */
+	feature = smccc_get_arg(vcpu, 1);
+	switch (feature) {
+	case KVM_PTP_VIRT_COUNTER:
+		cycles = systime_snapshot.cycles - vcpu_read_sys_reg(vcpu, CNTVOFF_EL2);
+		break;
+	case KVM_PTP_PHYS_COUNTER:
+		cycles = systime_snapshot.cycles;
+		break;
+	default:
+		return;
+	}
+
+	/*
+	 * This relies on the top bit of val[0] never being set for
+	 * valid values of system time, because that is *really* far
+	 * in the future (about 292 years from 1970, and at that stage
+	 * nobody will give a damn about it).
+	 */
+	val[0] = upper_32_bits(systime_snapshot.real);
+	val[1] = lower_32_bits(systime_snapshot.real);
+	val[2] = upper_32_bits(cycles);
+	val[3] = lower_32_bits(cycles);
+}
+
+static int kvm_hvc_vendor_hyp(struct kvm_vcpu *vcpu, u32 func)
+{
+	u64 val[4] = { SMCCC_RET_NOT_SUPPORTED };
+
+	switch (func) {
 	case ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID:
 		val[0] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0;
 		val[1] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1;
@@ -145,16 +188,28 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 	case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
 		kvm_ptp_get_time(vcpu, val);
 		break;
-	case ARM_SMCCC_TRNG_VERSION:
-	case ARM_SMCCC_TRNG_FEATURES:
-	case ARM_SMCCC_TRNG_GET_UUID:
-	case ARM_SMCCC_TRNG_RND32:
-	case ARM_SMCCC_TRNG_RND64:
-		return kvm_trng_call(vcpu);
-	default:
-		return kvm_psci_call(vcpu);
 	}
 
 	smccc_set_retval(vcpu, val[0], val[1], val[2], val[3]);
 	return 1;
 }
+
+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
+{
+	u32 func = smccc_get_function(vcpu);
+	u64 val = SMCCC_RET_NOT_SUPPORTED;
+
+	switch (ARM_SMCCC_OWNER_NUM(func)) {
+	case ARM_SMCCC_OWNER_ARCH:
+		return kvm_hvc_arch(vcpu, func);
+	case ARM_SMCCC_OWNER_STANDARD:
+		return kvm_hvc_standard(vcpu, func);
+	case ARM_SMCCC_OWNER_STANDARD_HYP:
+		return kvm_hvc_standard_hyp(vcpu, func);
+	case ARM_SMCCC_OWNER_VENDOR_HYP:
+		return kvm_hvc_vendor_hyp(vcpu, func);
+	}
+
+	smccc_set_retval(vcpu, val, 0, 0, 0);
+	return 1;
+}
-- 
2.23.0



* [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:38   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Software Delegated Exception Interface (SDEI) provides a mechanism
for registering and servicing system events, as defined by the ARM
DEN0054C specification. One of these events will be used by Asynchronous
Page Fault (Async PF) to deliver notifications from host to guest.

The events are classified into shared and private ones according to
their scope. Shared events are system or VM scoped, while private
events are CPU or VCPU scoped. Shared events can be registered,
enabled, unregistered and reset through hypercalls issued from any
VCPU, whereas private events are registered, enabled, unregistered
and reset only on the calling VCPU. Besides, events are classified
into critical and normal events according to their priority. During
event delivery and handling, a normal event can be preempted by a
critical event, but not the other way around: a critical event is
never preempted by a normal one.

This introduces the SDEI virtualization infrastructure, covering the
various objects used in the implementation. Shared events aren't
supported yet.

  * kvm_sdei_exposed_event
    The events defined and exposed by KVM. An event can't be
    registered until it's exposed, and none of its information
    can be changed after it's exposed.

  * kvm_sdei_event
    The events created from the exposed events. Their states are
    changed when hypercalls are received or when they are delivered
    to the guest for handling.

  * kvm_sdei_vcpu_context
    The vcpu context used to handle events. The interrupted context
    is saved before the event handler is executed, and restored
    after the event handler finishes.

  * kvm_sdei_vcpu
    Placeholder for all the objects of one particular VCPU.

SDEI_NOT_SUPPORTED is returned for all hypercalls for now. The
hypercalls will be supported one by one in the subsequent patches.
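
As an illustration (not part of this patch), the state flags defined
in kvm_sdei.h track an event's lifecycle roughly as follows; the
comments name the SDEI hypercall driving each transition:

    unsigned long state = 0;

    state |= KVM_SDEI_EVENT_STATE_REGISTERED;    /* SDEI_EVENT_REGISTER   */
    state |= KVM_SDEI_EVENT_STATE_ENABLED;       /* SDEI_EVENT_ENABLE     */
    state &= ~KVM_SDEI_EVENT_STATE_ENABLED;      /* SDEI_EVENT_DISABLE    */
    /* Deferred if the handler is running, see UNREGISTER_PENDING: */
    state &= ~KVM_SDEI_EVENT_STATE_REGISTERED;   /* SDEI_EVENT_UNREGISTER */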

Link: https://developer.arm.com/documentation/den0054/latest
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/include/asm/kvm_host.h |   1 +
 arch/arm64/include/asm/kvm_sdei.h | 148 ++++++++++++++++++++++++++++++
 arch/arm64/kvm/Makefile           |   2 +-
 arch/arm64/kvm/arm.c              |   4 +
 arch/arm64/kvm/hypercalls.c       |   3 +
 arch/arm64/kvm/sdei.c             |  98 ++++++++++++++++++++
 include/uapi/linux/arm_sdei.h     |   4 +
 7 files changed, 259 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/kvm_sdei.h
 create mode 100644 arch/arm64/kvm/sdei.c

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e3b25dc6c367..7644a400c4a8 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -343,6 +343,7 @@ struct kvm_vcpu_arch {
 	 * Anything that is not used directly from assembly code goes
 	 * here.
 	 */
+	struct kvm_sdei_vcpu *sdei;
 
 	/*
 	 * Guest registers we preserve during guest debugging.
diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
new file mode 100644
index 000000000000..2dbfb3ae0a48
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -0,0 +1,148 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Definitions of various KVM SDEI events.
+ *
+ * Copyright (C) 2022 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <gshan@redhat.com>
+ */
+
+#ifndef __ARM64_KVM_SDEI_H__
+#define __ARM64_KVM_SDEI_H__
+
+#include <uapi/linux/arm_sdei.h>
+#include <linux/arm-smccc.h>
+#include <linux/bits.h>
+#include <linux/spinlock.h>
+
+/*
+ * The events defined and exposed by KVM. An event can't be
+ * registered until it's exposed, and none of its information
+ * can be changed after it's exposed.
+ */
+struct kvm_sdei_exposed_event {
+	unsigned int	num;
+	unsigned char	type;
+	unsigned char	signaled;
+	unsigned char	priority;
+};
+
+/*
+ * Currently, only private events are supported. The events are
+ * created based on the exposed events, and their states change
+ * when hypercalls are received or when the events are delivered
+ * to the guest for handling.
+ */
+struct kvm_sdei_event {
+	struct kvm_sdei_exposed_event	*exposed_event;
+
+	unsigned char			route_mode;
+	unsigned long			route_affinity;
+	unsigned long			ep_address;
+	unsigned long			ep_arg;
+#define KVM_SDEI_EVENT_STATE_REGISTERED		BIT(0)
+#define KVM_SDEI_EVENT_STATE_ENABLED		BIT(1)
+#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING	BIT(2)
+	unsigned long			state;
+	unsigned long			event_count;
+};
+
+/*
+ * The vcpu context helps to handle events. The preempted or interrupted
+ * context is saved before the event handler is executed, and restored
+ * after the event handler finishes. An event with normal priority can
+ * be preempted by one with critical priority, so there can be two
+ * contexts on one particular vcpu: one for normal-priority events and
+ * one for critical-priority events.
+ */
+struct kvm_sdei_vcpu_context {
+	struct kvm_sdei_event	*event;
+	unsigned long		regs[18];
+	unsigned long		pc;
+	unsigned long		pstate;
+};
+
+struct kvm_sdei_vcpu {
+	spinlock_t			lock;
+	struct kvm_sdei_event		*events;
+	unsigned char			masked;
+	unsigned long			critical_event_count;
+	unsigned long			normal_event_count;
+	struct kvm_sdei_vcpu_context	context[SDEI_EVENT_PRIORITY_CRITICAL + 1];
+};
+
+/*
+ * According to the SDEI specification (v1.1), the event number spans
+ * 32 bits and the lower 24 bits are used as the (real) event number.
+ * We're unlikely to need that many event numbers in one system, so
+ * we reserve two bits of the 24-bit real event number to indicate
+ * the event type: physical or virtual. One reserved bit is enough
+ * for now, but two bits are reserved for possible future extension.
+ *
+ * The physical events are owned by firmware while the virtual events
+ * are used by the VMM and KVM.
+ */
+#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT	22
+#define KVM_SDEI_EVENT_NUM_TYPE_MASK	(3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
+#define KVM_SDEI_EVENT_NUM_TYPE_PHYS	0
+#define KVM_SDEI_EVENT_NUM_TYPE_VIRT	1
+
+static inline bool kvm_sdei_is_virtual(unsigned int num)
+{
+	unsigned int type;
+
+	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
+	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
+	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
+		return true;
+
+	return false;
+}
+
+static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
+{
+	return num == SDEI_SW_SIGNALED_EVENT;
+}
+
+static inline bool kvm_sdei_is_supported(unsigned int num)
+{
+	return kvm_sdei_is_sw_signaled(num) ||
+	       kvm_sdei_is_virtual(num);
+}
+
+static inline bool kvm_sdei_is_critical(unsigned char priority)
+{
+	return priority == SDEI_EVENT_PRIORITY_CRITICAL;
+}
+
+static inline bool kvm_sdei_is_normal(unsigned char priority)
+{
+	return priority == SDEI_EVENT_PRIORITY_NORMAL;
+}
+
+#define KVM_SDEI_REGISTERED_EVENT_FUNC(func, field)			\
+static inline bool kvm_sdei_is_##func(struct kvm_sdei_event *event)	\
+{									\
+	return !!(event->state & KVM_SDEI_EVENT_STATE_##field);		\
+}									\
+									\
+static inline void kvm_sdei_set_##func(struct kvm_sdei_event *event)	\
+{									\
+	event->state |= KVM_SDEI_EVENT_STATE_##field;			\
+}									\
+									\
+static inline void kvm_sdei_clear_##func(struct kvm_sdei_event *event)	\
+{									\
+	event->state &= ~KVM_SDEI_EVENT_STATE_##field;			\
+}
+
+KVM_SDEI_REGISTERED_EVENT_FUNC(registered, REGISTERED)
+KVM_SDEI_REGISTERED_EVENT_FUNC(enabled, ENABLED)
+KVM_SDEI_REGISTERED_EVENT_FUNC(unregister_pending, UNREGISTER_PENDING)
+
+/* APIs */
+int kvm_sdei_call(struct kvm_vcpu *vcpu);
+void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
+void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
+
+#endif /* __ARM64_KVM_SDEI_H__ */
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 261644b1a6bb..d6ced92ae3f0 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 inject_fault.o va_layout.o handle_exit.o \
 	 guest.o debug.o reset.o sys_regs.o \
 	 vgic-sys-reg-v3.o fpsimd.o pmu.o pkvm.o \
-	 arch_timer.o trng.o vmid.o \
+	 arch_timer.o trng.o vmid.o sdei.o \
 	 vgic/vgic.o vgic/vgic-init.o \
 	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
 	 vgic/vgic-v3.o vgic/vgic-v4.o \
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 523bc934fe2f..227c0e390571 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -38,6 +38,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_emulate.h>
+#include <asm/kvm_sdei.h>
 #include <asm/sections.h>
 
 #include <kvm/arm_hypercalls.h>
@@ -331,6 +332,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 
 	kvm_arm_pvtime_vcpu_init(&vcpu->arch);
 
+	kvm_sdei_create_vcpu(vcpu);
+
 	vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
 
 	err = kvm_vgic_vcpu_init(vcpu);
@@ -352,6 +355,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
 	kvm_timer_vcpu_terminate(vcpu);
 	kvm_pmu_vcpu_destroy(vcpu);
+	kvm_sdei_destroy_vcpu(vcpu);
 
 	kvm_arm_vcpu_destroy(vcpu);
 }
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index b659387d8919..6aa027a4cee8 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -5,6 +5,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/kvm_emulate.h>
+#include <asm/kvm_sdei.h>
 
 #include <kvm/arm_hypercalls.h>
 #include <kvm/arm_psci.h>
@@ -93,6 +94,8 @@ static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
 	case PSCI_1_1_FN_SYSTEM_RESET2:
 	case PSCI_1_1_FN64_SYSTEM_RESET2:
 		return kvm_psci_call(vcpu);
+	case SDEI_1_0_FN_SDEI_VERSION ... SDEI_1_1_FN_SDEI_FEATURES:
+		return kvm_sdei_call(vcpu);
 	}
 
 	smccc_set_retval(vcpu, val, 0, 0, 0);
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
new file mode 100644
index 000000000000..3507e33ec00e
--- /dev/null
+++ b/arch/arm64/kvm/sdei.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * SDEI virtualization support.
+ *
+ * Copyright (C) 2022 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <gshan@redhat.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/kvm_host.h>
+#include <linux/slab.h>
+#include <kvm/arm_hypercalls.h>
+#include <asm/kvm_sdei.h>
+
+static struct kvm_sdei_exposed_event exposed_events[] = {
+	{ .num      = SDEI_SW_SIGNALED_EVENT,
+	  .type     = SDEI_EVENT_TYPE_PRIVATE,
+	  .signaled = 1,
+	  .priority = SDEI_EVENT_PRIORITY_NORMAL,
+	},
+};
+
+#define kvm_sdei_for_each_exposed_event(event, idx)	\
+	for (idx = 0, event = &exposed_events[0];	\
+	     idx < ARRAY_SIZE(exposed_events);		\
+	     idx++, event++)
+
+int kvm_sdei_call(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	u32 func = smccc_get_function(vcpu);
+	bool has_result = true;
+	unsigned long ret;
+
+	/*
+	 * There is no return value for the COMPLETE and COMPLETE_AND_RESUME
+	 * hypercalls; returning one would corrupt the restored context.
+	 */
+	if (func == SDEI_1_0_FN_SDEI_EVENT_COMPLETE ||
+	    func == SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME)
+		has_result = false;
+
+	if (!vsdei) {
+		ret = SDEI_NOT_SUPPORTED;
+		goto out;
+	}
+
+	switch (func) {
+	default:
+		ret = SDEI_NOT_SUPPORTED;
+	}
+
+out:
+	if (has_result)
+		smccc_set_retval(vcpu, ret, 0, 0, 0);
+
+	return 1;
+}
+
+void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *events;
+	unsigned int i;
+
+	vsdei = kzalloc(sizeof(*vsdei), GFP_KERNEL_ACCOUNT);
+	if (!vsdei)
+		return;
+
+	events = kcalloc(ARRAY_SIZE(exposed_events), sizeof(*events),
+			 GFP_KERNEL_ACCOUNT);
+	if (!events) {
+		kfree(vsdei);
+		return;
+	}
+
+	kvm_sdei_for_each_exposed_event(exposed_event, i)
+		events[i].exposed_event = exposed_event;
+
+	spin_lock_init(&vsdei->lock);
+	vsdei->events = events;
+	vsdei->masked = 1;
+	vcpu->arch.sdei = vsdei;
+}
+
+void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+
+	if (!vsdei)
+		return;
+
+	vcpu->arch.sdei = NULL;
+	kfree(vsdei->events);
+	kfree(vsdei);
+}
diff --git a/include/uapi/linux/arm_sdei.h b/include/uapi/linux/arm_sdei.h
index af0630ba5437..572c77c59af6 100644
--- a/include/uapi/linux/arm_sdei.h
+++ b/include/uapi/linux/arm_sdei.h
@@ -22,8 +22,12 @@
 #define SDEI_1_0_FN_SDEI_PE_UNMASK			SDEI_1_0_FN(0x0C)
 #define SDEI_1_0_FN_SDEI_INTERRUPT_BIND			SDEI_1_0_FN(0x0D)
 #define SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE		SDEI_1_0_FN(0x0E)
+#define SDEI_1_1_FN_SDEI_EVENT_SIGNAL			SDEI_1_0_FN(0x0F)
 #define SDEI_1_0_FN_SDEI_PRIVATE_RESET			SDEI_1_0_FN(0x11)
 #define SDEI_1_0_FN_SDEI_SHARED_RESET			SDEI_1_0_FN(0x12)
+#define SDEI_1_1_FN_SDEI_FEATURES			SDEI_1_0_FN(0x30)
+
+#define SDEI_SW_SIGNALED_EVENT	0
 
 #define SDEI_VERSION_MAJOR_SHIFT			48
 #define SDEI_VERSION_MAJOR_MASK				0x7fff
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 04/18] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:38   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_REGISTER hypercall, which is used by the
guest to register events. An event won't be raised until it's both
registered and enabled. KVM-owned events can't be registered unless
they have been exposed.
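
For example, the guest could register the software-signaled event
(number 0) along the lines of the hypothetical sketch below; it uses
the generic arm_smccc_1_1_invoke() helper, and my_handler/my_arg are
placeholders rather than anything defined by this series:

    struct arm_smccc_res res;

    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_REGISTER,
                         SDEI_SW_SIGNALED_EVENT,        /* event number */
                         (unsigned long)my_handler,     /* ep_address   */
                         (unsigned long)my_arg,         /* ep_arg       */
                         SDEI_EVENT_REGISTER_RM_ANY,    /* routing mode */
                         0,                             /* affinity     */
                         &res);
    if (res.a0 != SDEI_SUCCESS)
        pr_err("SDEI event registration failed: %ld\n", (long)res.a0);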

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 78 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 3507e33ec00e..89c1b231cb60 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -25,6 +25,81 @@ static struct kvm_sdei_exposed_event exposed_events[] = {
 	for (idx = 0, event = &exposed_events[0];	\
 	     idx < ARRAY_SIZE(exposed_events);		\
 	     idx++, event++)
+#define kvm_sdei_for_each_event(vsdei, event, idx)	\
+	for (idx = 0, event = &vsdei->events[0];	\
+	     idx < ARRAY_SIZE(exposed_events);		\
+	     idx++, event++)
+
+static struct kvm_sdei_event *find_event(struct kvm_vcpu *vcpu,
+					 unsigned int num)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_event *event;
+	int i;
+
+	kvm_sdei_for_each_event(vsdei, event, i) {
+		if (event->exposed_event->num == num)
+			return event;
+	}
+
+	return NULL;
+}
+
+static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_event *event;
+	unsigned int num = smccc_get_arg(vcpu, 1);
+	unsigned long ep_address = smccc_get_arg(vcpu, 2);
+	unsigned long ep_arg = smccc_get_arg(vcpu, 3);
+	unsigned long route_mode = smccc_get_arg(vcpu, 4);
+	unsigned long route_affinity = smccc_get_arg(vcpu, 5);
+	unsigned long ret = SDEI_SUCCESS;
+
+	if (!kvm_sdei_is_supported(num)) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto out;
+	}
+
+	if (route_mode != SDEI_EVENT_REGISTER_RM_ANY &&
+	    route_mode != SDEI_EVENT_REGISTER_RM_PE) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/*
+	 * The event should already exist. Otherwise, it hasn't been
+	 * exposed yet.
+	 */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto unlock;
+	}
+
+	/*
+	 * Check if the event has been registered or is pending
+	 * unregistration.
+	 */
+	if (kvm_sdei_is_registered(event) ||
+	    kvm_sdei_is_unregister_pending(event)) {
+		ret = SDEI_DENIED;
+		goto unlock;
+	}
+
+	event->route_mode     = route_mode;
+	event->route_affinity = route_affinity;
+	event->ep_address     = ep_address;
+	event->ep_arg         = ep_arg;
+	kvm_sdei_set_registered(event);
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
 
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
@@ -47,6 +122,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	}
 
 	switch (func) {
+	case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+		ret = hypercall_register(vcpu);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 05/18] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE}
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:38   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_{ENABLE, DISABLE} hypercalls. After an
event is registered, it won't be raised and delivered to the guest
until it's enabled. Conversely, the event won't be delivered to the
guest once it's disabled.
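
Once registered, the guest could flip the delivery state with calls
along these lines (an illustrative guest-side sketch, not part of
the patch):

    struct arm_smccc_res res;

    /* Allow the software-signaled event to be delivered ... */
    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_ENABLE,
                         SDEI_SW_SIGNALED_EVENT, &res);

    /* ... and later suppress delivery without unregistering it. */
    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_DISABLE,
                         SDEI_SW_SIGNALED_EVENT, &res);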

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 89c1b231cb60..941263578b30 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -101,6 +101,45 @@ static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static unsigned long hypercall_enable(struct kvm_vcpu *vcpu, bool enable)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_event *event;
+	unsigned long num = smccc_get_arg(vcpu, 1);
+	unsigned long ret = SDEI_SUCCESS;
+
+	if (!kvm_sdei_is_supported(num)) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the event exists */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = SDEI_DENIED;
+		goto unlock;
+	}
+
+	/* Check the event state */
+	if (!kvm_sdei_is_registered(event) ||
+	    kvm_sdei_is_unregister_pending(event)) {
+		ret = SDEI_DENIED;
+		goto unlock;
+	}
+
+	if (enable)
+		kvm_sdei_set_enabled(event);
+	else
+		kvm_sdei_clear_enabled(event);
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -125,6 +164,12 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
 		ret = hypercall_register(vcpu);
 		break;
+	case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+		ret = hypercall_enable(vcpu, true);
+		break;
+	case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+		ret = hypercall_enable(vcpu, false);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 06/18] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:38   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:38 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_CONTEXT hypercall. It's used by the guest,
from within the event handler, to retrieve the registers (x0 - x17) of
the interrupted or preempted context. That context is saved before the
event handler is executed and restored afterwards.
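
For instance, from within its event handler the guest could read back
x1 of the interrupted context as in the sketch below (illustrative
only; parameter IDs 0 - 17 select x0 - x17):

    struct arm_smccc_res res;
    unsigned long interrupted_x1;

    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_CONTEXT,
                         1, &res);          /* parameter ID 1 => x1 */
    interrupted_x1 = res.a0;                /* or an SDEI error code */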

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 941263578b30..af5d11b8eb2f 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -140,6 +140,37 @@ static unsigned long hypercall_enable(struct kvm_vcpu *vcpu, bool enable)
 	return ret;
 }
 
+static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_vcpu_context *context;
+	unsigned long param_id = smccc_get_arg(vcpu, 1);
+	unsigned long ret = SDEI_SUCCESS;
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if any event is being handled */
+	context = &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL];
+	context = context->event ? context : NULL;
+	context = context ? : &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	context = context->event ? context : NULL;
+	if (!context) {
+		ret = SDEI_DENIED;
+		goto unlock;
+	}
+
+	/* Fetch the requested register */
+	if (param_id < ARRAY_SIZE(context->regs))
+		ret = context->regs[param_id];
+	else
+		ret = SDEI_INVALID_PARAMETERS;
+
+unlock:
+	spin_unlock(&vsdei->lock);
+
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -170,6 +201,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
 		ret = hypercall_enable(vcpu, false);
 		break;
+	case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
+		ret = hypercall_context(vcpu);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 07/18] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_UNREGISTER hypercall. It's used by the
guest to unregister an event. The event is disabled automatically
and won't be delivered to the guest after unregistration.

If the event is being serviced or handled, it can't be unregistered
immediately. Instead, the unregistration pending state is set for
the event, and it's unregistered once the event handler finishes
through the SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
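
A guest-side sketch of the resulting behaviour (illustrative only,
not part of the patch):

    struct arm_smccc_res res;

    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_UNREGISTER,
                         SDEI_SW_SIGNALED_EVENT, &res);
    if (res.a0 == SDEI_PENDING) {
        /*
         * The handler is still running. The event is now pending
         * unregistration and is torn down once the handler completes
         * via SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME}.
         */
    }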

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 79 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index af5d11b8eb2f..f774f2cf0ac7 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -45,6 +45,48 @@ static struct kvm_sdei_event *find_event(struct kvm_vcpu *vcpu,
 	return NULL;
 }
 
+static int reset_event(struct kvm_vcpu *vcpu,
+		       struct kvm_sdei_event *event)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_vcpu_context *context;
+	struct kvm_sdei_exposed_event *exposed_event;
+
+	/* Check if the event is already pending unregistration */
+	if (kvm_sdei_is_unregister_pending(event))
+		return -EAGAIN;
+
+	/*
+	 * If the event is being handled, set the unregistration pending
+	 * state for it. The event will be unregistered after the event
+	 * handler finishes.
+	 */
+	exposed_event = event->exposed_event;
+	context = kvm_sdei_is_critical(exposed_event->priority) ?
+		  &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL] :
+		  &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	if (context->event == event) {
+		kvm_sdei_set_unregister_pending(event);
+		return -EBUSY;
+	}
+
+	/*
+	 * The event is ready to be unregistered. It's disabled on
+	 * unregistration, and its pending occurrences are cancelled
+	 * as well.
+	 */
+	if (kvm_sdei_is_critical(exposed_event->priority))
+		vsdei->critical_event_count -= event->event_count;
+	else
+		vsdei->normal_event_count -= event->event_count;
+
+	event->event_count = 0;
+	kvm_sdei_clear_enabled(event);
+	kvm_sdei_clear_registered(event);
+
+	return 0;
+}
+
 static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -171,6 +213,40 @@ static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static unsigned long hypercall_unregister(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_event *event;
+	unsigned int num = smccc_get_arg(vcpu, 1);
+	unsigned long ret = SDEI_SUCCESS;
+
+	if (!kvm_sdei_is_supported(num))
+		return SDEI_INVALID_PARAMETERS;
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the event exists */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto unlock;
+	}
+
+	/* Check if the event has been registered */
+	if (!kvm_sdei_is_registered(event)) {
+		ret = SDEI_DENIED;
+		goto unlock;
+	}
+
+	if (reset_event(vcpu, event))
+		ret = SDEI_PENDING;
+
+unlock:
+	spin_unlock(&vsdei->lock);
+
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -204,6 +280,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
 		ret = hypercall_context(vcpu);
 		break;
+	case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+		ret = hypercall_unregister(vcpu);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 08/18] KVM: arm64: Support SDEI_EVENT_STATUS hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_STATUS hypercall. It's used by the guest
to retrieve the status of the specified event. A bitmap is returned
indicating the event's registration, enablement and servicing (running)
state.
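
The guest could decode the returned bitmap as in the following sketch
(illustrative only; the bit positions are the SDEI_EVENT_STATUS_*
values from include/uapi/linux/arm_sdei.h):

    struct arm_smccc_res res;
    bool registered, enabled, running;

    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_STATUS,
                         SDEI_SW_SIGNALED_EVENT, &res);

    registered = res.a0 & (1UL << SDEI_EVENT_STATUS_REGISTERED);
    enabled    = res.a0 & (1UL << SDEI_EVENT_STATUS_ENABLED);
    running    = res.a0 & (1UL << SDEI_EVENT_STATUS_RUNNING);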

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index f774f2cf0ac7..b847c6028b74 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -247,6 +247,48 @@ static unsigned long hypercall_unregister(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static unsigned long hypercall_status(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_vcpu_context *context;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *event;
+	unsigned int num = smccc_get_arg(vcpu, 1);
+	unsigned long ret = 0;
+
+	if (!kvm_sdei_is_supported(num)) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/*
+	 * Check if the event exists. None of the flags will be set
+	 * if it doesn't exist.
+	 */
+	event = find_event(vcpu, num);
+	if (!event)
+		goto unlock;
+
+	if (kvm_sdei_is_registered(event))
+		ret |= (1UL << SDEI_EVENT_STATUS_REGISTERED);
+	if (kvm_sdei_is_enabled(event))
+		ret |= (1UL << SDEI_EVENT_STATUS_ENABLED);
+
+	exposed_event = event->exposed_event;
+	context = kvm_sdei_is_critical(exposed_event->priority) ?
+		  &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL] :
+		  &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	if (context->event == event)
+		ret |= (1UL << SDEI_EVENT_STATUS_RUNNING);
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -283,6 +325,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
 		ret = hypercall_unregister(vcpu);
 		break;
+	case SDEI_1_0_FN_SDEI_EVENT_STATUS:
+		ret = hypercall_status(vcpu);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 09/18] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_GET_INFO hypercall. It's used by the
guest to retrieve various information about an event: its type,
signaled capability, priority, routing mode and affinity.
SDEI_INVALID_PARAMETERS is returned for requests about the routing
mode and affinity, as those are only valid for shared events, which
aren't supported.
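
For illustration, the guest side of this call could look like the
sketch below. arm_smccc_1_1_invoke() and the SDEI_* macros are the
usual kernel helpers; the sdei_event_get_info() wrapper is a made-up
name for this example, not part of the series.

    #include <linux/arm-smccc.h>
    #include <linux/arm_sdei.h>

    /* Hypothetical wrapper: query one property of @event_num */
    static long sdei_event_get_info(unsigned long event_num,
                                    unsigned long info)
    {
            struct arm_smccc_res res;

            arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_GET_INFO,
                                 event_num, info, &res);
            return res.a0;
    }

    /* Usage: event 0 is private, so SDEI_EVENT_TYPE_PRIVATE is expected */
    long type = sdei_event_get_info(0, SDEI_EVENT_INFO_EV_TYPE);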

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 54 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index b847c6028b74..9e642d01e303 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -289,6 +289,57 @@ static unsigned long hypercall_status(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static unsigned long hypercall_info(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *event;
+	unsigned int num = smccc_get_arg(vcpu, 1);
+	unsigned long info = smccc_get_arg(vcpu, 2);
+	unsigned long ret = SDEI_SUCCESS;
+
+	if (!kvm_sdei_is_supported(num)) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the event exists */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto unlock;
+	}
+
+	/*
+	 * Retrieve the requested information. The shared events aren't
+	 * supported yet. So the requests to retrieve routing mode and
+	 * affinity should fail for now.
+	 */
+	exposed_event = event->exposed_event;
+	switch (info) {
+	case SDEI_EVENT_INFO_EV_TYPE:
+		ret = exposed_event->type;
+		break;
+	case SDEI_EVENT_INFO_EV_SIGNALED:
+		ret = exposed_event->signaled;
+		break;
+	case SDEI_EVENT_INFO_EV_PRIORITY:
+		ret = exposed_event->priority;
+		break;
+	case SDEI_EVENT_INFO_EV_ROUTING_MODE:
+	case SDEI_EVENT_INFO_EV_ROUTING_AFF:
+	default:
+		ret = SDEI_INVALID_PARAMETERS;
+	}
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -328,6 +379,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_STATUS:
 		ret = hypercall_status(vcpu);
 		break;
+	case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+		ret = hypercall_info(vcpu);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 10/18] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will

This supports the SDEI_PE_{MASK, UNMASK} hypercalls. They are used
by the guest to stop or start receiving events on the calling vCPU.
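
As an illustrative sketch (using the kernel's arm_smccc_1_1_invoke()
helper; the wrapper names are invented for this example), the guest
side amounts to:

    #include <linux/arm-smccc.h>
    #include <linux/arm_sdei.h>

    /* Hypothetical wrappers: gate event delivery on the calling vCPU */
    static long sdei_pe_mask(void)
    {
            struct arm_smccc_res res;

            arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_PE_MASK, &res);
            return res.a0;          /* SDEI_SUCCESS */
    }

    static long sdei_pe_unmask(void)
    {
            struct arm_smccc_res res;

            arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_PE_UNMASK, &res);
            return res.a0;          /* SDEI_SUCCESS */
    }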

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 9e642d01e303..3aca36e7e27b 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -340,6 +340,18 @@ static unsigned long hypercall_info(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	unsigned long ret = SDEI_SUCCESS;
+
+	spin_lock(&vsdei->lock);
+	vsdei->masked = mask ? 1 : 0;
+	spin_unlock(&vsdei->lock);
+
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -382,6 +394,12 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
 		ret = hypercall_info(vcpu);
 		break;
+	case SDEI_1_0_FN_SDEI_PE_MASK:
+		ret = hypercall_mask(vcpu, true);
+		break;
+	case SDEI_1_0_FN_SDEI_PE_UNMASK:
+		ret = hypercall_mask(vcpu, false);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 11/18] KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will

This supports SDEI_{PRIVATE, SHARED}_RESET. They are used by the
guest to reset the private events on the calling vCPU or the
shared events on all vCPUs.

As the shared events aren't supported yet, we simply return
SDEI_SUCCESS for the shared reset. The guest has no way to know
this, and it doesn't stop the guest from using the SDEI service.
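
A guest-side sketch, assuming the standard SDEI function IDs and the
kernel's arm_smccc_1_1_invoke() helper; the wrapper name is invented:

    #include <linux/arm-smccc.h>
    #include <linux/arm_sdei.h>

    static void sdei_reset_all(void)        /* hypothetical helper */
    {
            struct arm_smccc_res res;

            /* Unregisters and resets all private events on this vCPU */
            arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_PRIVATE_RESET, &res);

            /* Accepted, but a no-op until shared events are supported */
            arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_SHARED_RESET, &res);
    }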

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 3aca36e7e27b..6716ed020aa2 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -352,6 +352,36 @@ static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
 	return ret;
 }
 
+static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_event *event;
+	unsigned int i;
+	unsigned long ret = SDEI_SUCCESS;
+
+	/*
+	 * All registered events should be private because the shared
+	 * events aren't supported yet, so there is nothing to do when
+	 * resetting the shared events.
+	 */
+	if (!private)
+		return ret;
+
+	spin_lock(&vsdei->lock);
+
+	kvm_sdei_for_each_event(vsdei, event, i) {
+		if (!kvm_sdei_is_registered(event))
+			continue;
+
+		if (reset_event(vcpu, event))
+			ret = SDEI_PENDING;
+	}
+
+	spin_unlock(&vsdei->lock);
+
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -400,6 +430,12 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_PE_UNMASK:
 		ret = hypercall_mask(vcpu, false);
 		break;
+	case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+		ret = hypercall_reset(vcpu, true);
+		break;
+	case SDEI_1_0_FN_SDEI_SHARED_RESET:
+		ret = hypercall_reset(vcpu, false);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 12/18] KVM: arm64: Support SDEI event injection, delivery
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will

This supports event injection, delivery and cancellation. The event
is injected and cancelled by kvm_sdei_{inject, cancel}_event(). For
event delivery, a vcpu request (KVM_REQ_SDEI) is introduced.

kvm_sdei_deliver_event() is called to handle the KVM_REQ_SDEI
request. The execution context is switched as below (a guest-side
sketch follows the list):

  * x0 - x17 are saved. All of them are cleared except the following
    registers:

    x0: event number
    x1: user argument associated with the SDEI event
    x2: PC of the interrupted or preempted context
    x3: PSTATE of the interrupted or preempted context

  * PC is set to the event handler, which is provided when the event
    is registered. PSTATE is modified according to the specification.
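
A guest-side view of this register convention might look like the
hedged sketch below; the prototype simply mirrors the x0 - x3 layout
above, and both function names are invented for illustration.

    /*
     * Entered with x0 = event number, x1 = the argument given at
     * registration time, x2 = interrupted PC, x3 = interrupted
     * PSTATE; the rest of x0 - x17 has been zeroed by KVM. The
     * handler must finish with an SDEI_EVENT_COMPLETE* call rather
     * than a plain return.
     */
    static void my_sdei_handler(unsigned long event_num,
                                unsigned long arg,
                                unsigned long interrupted_pc,
                                unsigned long interrupted_pstate)
    {
            /* NMI-like context: keep the work short and non-blocking */
            handle_async_event(event_num, arg);     /* hypothetical */
    }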

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/include/asm/kvm_host.h |   1 +
 arch/arm64/include/asm/kvm_sdei.h |   4 +
 arch/arm64/kvm/arm.c              |   3 +
 arch/arm64/kvm/sdei.c             | 260 ++++++++++++++++++++++++++++++
 4 files changed, 268 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7644a400c4a8..951264d4b64d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -46,6 +46,7 @@
 #define KVM_REQ_RECORD_STEAL	KVM_ARCH_REQ(3)
 #define KVM_REQ_RELOAD_GICv4	KVM_ARCH_REQ(4)
 #define KVM_REQ_RELOAD_PMU	KVM_ARCH_REQ(5)
+#define KVM_REQ_SDEI		KVM_ARCH_REQ(6)
 
 #define KVM_DIRTY_LOG_MANUAL_CAPS   (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
 				     KVM_DIRTY_LOG_INITIALLY_SET)
diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 2dbfb3ae0a48..f946d4ebdc14 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -142,6 +142,10 @@ KVM_SDEI_REGISTERED_EVENT_FUNC(unregister_pending, UNREGISTER_PENDING)
 
 /* APIs */
 int kvm_sdei_call(struct kvm_vcpu *vcpu);
+int kvm_sdei_inject_event(struct kvm_vcpu *vcpu, unsigned int num,
+			  bool immediate);
+int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned int num);
+void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu);
 void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
 void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 227c0e390571..7e77d62aeab1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -659,6 +659,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
 			kvm_reset_vcpu(vcpu);
 
+		if (kvm_check_request(KVM_REQ_SDEI, vcpu))
+			kvm_sdei_deliver_event(vcpu);
+
 		/*
 		 * Clear IRQ_PENDING requests that were made to guarantee
 		 * that a VCPU sees new virtual interrupts.
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 6716ed020aa2..9d18fee59751 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -87,6 +87,36 @@ static int reset_event(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static int inject_event(struct kvm_vcpu *vcpu,
+			struct kvm_sdei_event *event)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_exposed_event *exposed_event;
+
+	/* The event should have been registered and enabled */
+	if (!kvm_sdei_is_registered(event) ||
+	    !kvm_sdei_is_enabled(event))
+		return -ENOENT;
+
+	/*
+	 * If the event is pending for unregistration, we shouldn't
+	 * inject the event.
+	 */
+	if (kvm_sdei_is_unregister_pending(event))
+		return -EAGAIN;
+
+	event->event_count++;
+	exposed_event = event->exposed_event;
+	if (kvm_sdei_is_critical(exposed_event->priority))
+		vsdei->critical_event_count++;
+	else
+		vsdei->normal_event_count++;
+
+	kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+	return 0;
+}
+
 static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -447,6 +477,236 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
+			  unsigned int num,
+			  bool immediate)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *event;
+	unsigned long event_count = 0;
+	int ret = 0;
+
+	if (!vsdei) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	if (!kvm_sdei_is_supported(num)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the vcpu has been masked */
+	if (vsdei->masked) {
+		ret = -EPERM;
+		goto unlock;
+	}
+
+	/* Check if the event exists */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = -ENOENT;
+		goto unlock;
+	}
+
+	/*
+	 * In some cases, the injected event is expected to be delivered
+	 * immediately. However, there are two cases where the injected
+	 * event can't be delivered immediately: (a) the event is a
+	 * critical one, but we already have pending critical events for
+	 * delivery. (b) the injected event is a normal one, but we have
+	 * pending events for delivery, regardless of their priorities.
+	 */
+	if (immediate) {
+		exposed_event = event->exposed_event;
+		event_count = vsdei->critical_event_count;
+		if (kvm_sdei_is_normal(exposed_event->priority))
+			event_count += vsdei->normal_event_count;
+
+		if (event_count > 0) {
+			ret = -ENOSPC;
+			goto unlock;
+		}
+	}
+
+	ret = inject_event(vcpu, event);
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
+
+int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned int num)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_vcpu_context *context;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *event;
+	int ret = 0;
+
+	if (!vsdei) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	if (!kvm_sdei_is_supported(num)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the event exists */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = -ENOENT;
+		goto unlock;
+	}
+
+	/* The event should have been registered and enabled */
+	if (!kvm_sdei_is_registered(event) ||
+	    !kvm_sdei_is_enabled(event)) {
+		ret = -EBUSY;
+		goto unlock;
+	}
+
+	/*
+	 * If the event is pending for unregistration, we needn't
+	 * cancel it because all the pending events will be cancelled
+	 * once its handler finishes.
+	 */
+	if (kvm_sdei_is_unregister_pending(event)) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
+
+	/* Return success if there are no pending events */
+	if (event->event_count <= 0)
+		goto unlock;
+
+	/* The event can't be cancelled if it's being handled */
+	exposed_event = event->exposed_event;
+	context = kvm_sdei_is_critical(exposed_event->priority) ?
+		  &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL] :
+		  &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	if (event->event_count == 1 &&
+	    context->event == event) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
+
+	/* Cancel the event */
+	event->event_count--;
+	if (kvm_sdei_is_critical(exposed_event->priority))
+		vsdei->critical_event_count--;
+	else
+		vsdei->normal_event_count--;
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
+
+void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_vcpu_context *context;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *selected, *event;
+	unsigned long pstate;
+	unsigned int i;
+
+	if (!vsdei)
+		return;
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the vcpu has been masked */
+	if (vsdei->masked)
+		goto unlock;
+
+	/* The currently handled critical event can't be preempted */
+	context = &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL];
+	if (context->event)
+		goto unlock;
+
+	/*
+	 * Select the event to be handled. The critical event is
+	 * selected if we have one. Otherwise, the first normal
+	 * event will be selected. Besides, the normal event can
+	 * be preempted by the critical event. However, the normal
+	 * event can't be preempted by another normal event.
+	 */
+	selected = NULL;
+	context = &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	kvm_sdei_for_each_event(vsdei, event, i) {
+		exposed_event = event->exposed_event;
+		if (event->event_count <= 0)
+			continue;
+
+		if (context->event &&
+		    kvm_sdei_is_normal(exposed_event->priority))
+			continue;
+
+		if (kvm_sdei_is_critical(exposed_event->priority)) {
+			selected = event;
+			break;
+		}
+
+		selected = selected ? : event;
+	}
+
+	if (!selected)
+		goto unlock;
+
+	/*
+	 * Save context: x0 -> x17, PC, PState. There might be a pending
+	 * exception or PC increment request from the last run on this
+	 * vCPU, so it has to be applied by __kvm_adjust_pc() before the
+	 * context is saved. Otherwise, a later adjustment would shift
+	 * the event handler entry point by 4 bytes.
+	 */
+	event = selected;
+	exposed_event = event->exposed_event;
+	context = kvm_sdei_is_critical(exposed_event->priority) ?
+		  &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL] :
+		  &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	context->event = event;
+
+	__kvm_adjust_pc(vcpu);
+	context->pc = *vcpu_pc(vcpu);
+	context->pstate = *vcpu_cpsr(vcpu);
+	for (i = 0; i < ARRAY_SIZE(context->regs); i++)
+		context->regs[i] = vcpu_get_reg(vcpu, i);
+
+	/*
+	 * Inject event: x0 -> x3, PC, PState. The other general
+	 * purpose registers are cleared.
+	 */
+	for (i = 0; i < ARRAY_SIZE(context->regs); i++)
+		vcpu_set_reg(vcpu, i, 0);
+
+	vcpu_set_reg(vcpu, 0, exposed_event->num);
+	vcpu_set_reg(vcpu, 1, event->ep_arg);
+	vcpu_set_reg(vcpu, 2, context->pc);
+	vcpu_set_reg(vcpu, 3, context->pstate);
+
+	pstate = context->pstate;
+	pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
+	pstate &= ~PSR_MODE_MASK;
+	pstate |= PSR_MODE_EL1h;
+	pstate &= ~PSR_MODE32_BIT;
+
+	*vcpu_cpsr(vcpu) = pstate;
+	*vcpu_pc(vcpu) = event->ep_address;
+
+unlock:
+	spin_unlock(&vsdei->lock);
+}
+
 void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME}
hypercalls. They are used by the guest to notify completion of the
event from within its handler. The previously interrupted or
preempted context is restored as below; a guest-side sketch follows
the list.

   * x0 - x17, PC and PState are restored to what values we had in
     the interrupted or preempted context.

   * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
     is injected.
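
A hedged guest-side sketch of the two completion paths, via the
kernel's arm_smccc_1_1_invoke() helper; the wrapper name is invented,
and no extra arguments are consumed by this implementation:

    #include <linux/arm-smccc.h>
    #include <linux/arm_sdei.h>

    static void sdei_handler_done(bool resume)      /* hypothetical */
    {
            struct arm_smccc_res res;

            if (resume)
                    /* Restore the context, then take an IRQ exception */
                    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME,
                                         &res);
            else
                    /* Return to the interrupted context as-is */
                    arm_smccc_1_1_invoke(SDEI_1_0_FN_SDEI_EVENT_COMPLETE,
                                         &res);
    }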

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/include/asm/kvm_emulate.h |  1 +
 arch/arm64/include/asm/kvm_host.h    |  1 +
 arch/arm64/kvm/hyp/exception.c       |  7 +++
 arch/arm64/kvm/inject_fault.c        | 29 ++++++++++
 arch/arm64/kvm/sdei.c                | 79 ++++++++++++++++++++++++++++
 5 files changed, 117 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index d62405ce3e6d..ca9de9f24923 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -37,6 +37,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
 void kvm_skip_instr32(struct kvm_vcpu *vcpu);
 
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
+void kvm_inject_irq(struct kvm_vcpu *vcpu);
 void kvm_inject_vabt(struct kvm_vcpu *vcpu);
 void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 951264d4b64d..ac475d3b9151 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -431,6 +431,7 @@ struct kvm_vcpu_arch {
 #define KVM_ARM64_EXCEPT_AA32_UND	(0 << 9)
 #define KVM_ARM64_EXCEPT_AA32_IABT	(1 << 9)
 #define KVM_ARM64_EXCEPT_AA32_DABT	(2 << 9)
+#define KVM_ARM64_EXCEPT_AA32_IRQ	(3 << 9)
 /* For AArch64: */
 #define KVM_ARM64_EXCEPT_AA64_ELx_SYNC	(0 << 9)
 #define KVM_ARM64_EXCEPT_AA64_ELx_IRQ	(1 << 9)
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index c5d009715402..f425ea11e4f6 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -313,6 +313,9 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
 		case KVM_ARM64_EXCEPT_AA32_DABT:
 			enter_exception32(vcpu, PSR_AA32_MODE_ABT, 16);
 			break;
+		case KVM_ARM64_EXCEPT_AA32_IRQ:
+			enter_exception32(vcpu, PSR_AA32_MODE_IRQ, 24);
+			break;
 		default:
 			/* Err... */
 			break;
@@ -323,6 +326,10 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
 		      KVM_ARM64_EXCEPT_AA64_EL1):
 			enter_exception64(vcpu, PSR_MODE_EL1h, except_type_sync);
 			break;
+		case (KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
+		      KVM_ARM64_EXCEPT_AA64_EL1):
+			enter_exception64(vcpu, PSR_MODE_EL1h, except_type_irq);
+			break;
 		default:
 			/*
 			 * Only EL1_SYNC makes sense so far, EL2_{SYNC,IRQ}
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index b47df73e98d7..c8a8791bdf28 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -66,6 +66,13 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
 	vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
 }
 
+static void inject_irq64(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1     |
+			     KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
+			     KVM_ARM64_PENDING_EXCEPTION);
+}
+
 #define DFSR_FSC_EXTABT_LPAE	0x10
 #define DFSR_FSC_EXTABT_nLPAE	0x08
 #define DFSR_LPAE		BIT(9)
@@ -77,6 +84,12 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
 			     KVM_ARM64_PENDING_EXCEPTION);
 }
 
+static void inject_irq32(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA32_IRQ |
+			     KVM_ARM64_PENDING_EXCEPTION);
+}
+
 /*
  * Modelled after TakeDataAbortException() and TakePrefetchAbortException
  * pseudocode.
@@ -160,6 +173,22 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 		inject_undef64(vcpu);
 }
 
+/**
+ * kvm_inject_irq - inject an IRQ into the guest
+ * @vcpu: The vCPU in which to inject IRQ
+ *
+ * Inject IRQs to the target vCPU. It is assumed that this code is
+ * called from the VCPU thread and that the VCPU therefore is not
+ * currently executing guest code.
+ */
+void kvm_inject_irq(struct kvm_vcpu *vcpu)
+{
+	if (vcpu_el1_is_32bit(vcpu))
+		inject_irq32(vcpu);
+	else
+		inject_irq64(vcpu);
+}
+
 void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
 {
 	vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 9d18fee59751..ebdbe7810cf0 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -243,6 +243,79 @@ static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static unsigned long hypercall_complete(struct kvm_vcpu *vcpu, bool resume)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_exposed_event *exposed_event;
+	struct kvm_sdei_event *event;
+	struct kvm_sdei_vcpu_context *context;
+	unsigned int i;
+	unsigned long ret = SDEI_SUCCESS;
+
+	spin_lock(&vsdei->lock);
+
+	/* Find the event being handled, checking the critical context first */
+	context = &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL];
+	context = context->event ? context : NULL;
+	context = context ? : &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
+	context = context->event ? context : NULL;
+	if (!context) {
+		ret = SDEI_DENIED;
+		goto unlock;
+	}
+
+	/* Restore registers: x0 -> x17, PC, PState */
+	for (i = 0; i < ARRAY_SIZE(context->regs); i++)
+		vcpu_set_reg(vcpu, i, context->regs[i]);
+
+	*vcpu_cpsr(vcpu) = context->pstate;
+	*vcpu_pc(vcpu) = context->pc;
+
+	/* Inject interrupt if needed */
+	if (resume)
+		kvm_inject_irq(vcpu);
+
+	/*
+	 * Decrease the event count and invalidate the event in the
+	 * vcpu context.
+	 */
+	event = context->event;
+	exposed_event = event->exposed_event;
+	context->event = NULL;
+	event->event_count--;
+	if (kvm_sdei_is_critical(exposed_event->priority))
+		vsdei->critical_event_count--;
+	else
+		vsdei->normal_event_count--;
+
+	/*
+	 * We need to check if the event is pending for unregistration.
+	 * In that case, the event should be disabled and unregistered.
+	 * All the pending events are cancelled as well.
+	 */
+	if (kvm_sdei_is_unregister_pending(event)) {
+		if (kvm_sdei_is_critical(exposed_event->priority))
+			vsdei->critical_event_count -= event->event_count;
+		else
+			vsdei->normal_event_count -= event->event_count;
+
+		event->event_count = 0;
+		kvm_sdei_clear_enabled(event);
+		kvm_sdei_clear_registered(event);
+		kvm_sdei_clear_unregister_pending(event);
+	}
+
+	/* Another request if we have more events to be handled */
+	if (vsdei->critical_event_count > 0 ||
+	    vsdei->normal_event_count > 0)
+		kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+unlock:
+	spin_unlock(&vsdei->lock);
+
+	return ret;
+}
+
 static unsigned long hypercall_unregister(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -445,6 +518,12 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
 		ret = hypercall_context(vcpu);
 		break;
+	case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
+		ret = hypercall_complete(vcpu, false);
+		break;
+	case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
+		ret = hypercall_complete(vcpu, true);
+		break;
 	case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
 		ret = hypercall_unregister(vcpu);
 		break;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 14/18] KVM: arm64: Support SDEI_EVENT_SIGNAL hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_EVENT_SIGNAL hypercall. It's used by the
guest to inject the event, whose number must be zero, to the
specified vCPU. As the shared event isn't supported, the calling
vCPU is assumed to be the target.
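
A guest-side sketch (SDEI_1_1_FN_SDEI_EVENT_SIGNAL is the function ID
used by this series; the target-PE value is shown as zero because the
argument isn't consumed here, and the wrapper name is invented):

    #include <linux/arm-smccc.h>
    #include <linux/arm_sdei.h>

    static long sdei_signal_self(void)       /* hypothetical helper */
    {
            struct arm_smccc_res res;

            arm_smccc_1_1_invoke(SDEI_1_1_FN_SDEI_EVENT_SIGNAL,
                                 0,    /* event number, must be zero */
                                 0,    /* target PE: the calling vCPU */
                                 &res);
            return res.a0;
    }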

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index ebdbe7810cf0..e1f6ab9800ee 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -455,6 +455,48 @@ static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
 	return ret;
 }
 
+static unsigned long hypercall_signal(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+	struct kvm_sdei_event *event;
+	unsigned int num = smccc_get_arg(vcpu, 1);
+	unsigned long ret = SDEI_SUCCESS;
+
+	/*
+	 * The event must be the software signaled one, whose number
+	 * is zero.
+	 */
+	if (!kvm_sdei_is_sw_signaled(num)) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto out;
+	}
+
+	spin_lock(&vsdei->lock);
+
+	/* Check if the vcpu has been masked */
+	if (vsdei->masked) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto unlock;
+	}
+
+	/* Check if the event exists */
+	event = find_event(vcpu, num);
+	if (!event) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto unlock;
+	}
+
+	if (inject_event(vcpu, event)) {
+		ret = SDEI_INVALID_PARAMETERS;
+		goto unlock;
+	}
+
+unlock:
+	spin_unlock(&vsdei->lock);
+out:
+	return ret;
+}
+
 static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -539,6 +581,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_PE_UNMASK:
 		ret = hypercall_mask(vcpu, false);
 		break;
+	case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+		ret = hypercall_signal(vcpu);
+		break;
 	case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
 		ret = hypercall_reset(vcpu, true);
 		break;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 15/18] KVM: arm64: Support SDEI_FEATURES hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This supports the SDEI_FEATURES hypercall. It's used by the guest
to retrieve the supported features: the number of slots for
interrupt-bound events and relative mode for the event handler.
Currently, neither is supported.
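
Queried from the guest, both features report zero. A hedged sketch
(SDEI_1_1_FN_SDEI_FEATURES is the function ID used by this series;
the wrapper name is invented):

    #include <linux/arm-smccc.h>
    #include <linux/arm_sdei.h>

    static void sdei_probe_features(void)    /* hypothetical helper */
    {
            struct arm_smccc_res res;

            /* Feature 0 (BIND_SLOTS): returns 0 slots */
            arm_smccc_1_1_invoke(SDEI_1_1_FN_SDEI_FEATURES, 0, &res);

            /* Feature 1 (RELATIVE_MODE): returns 0, not supported */
            arm_smccc_1_1_invoke(SDEI_1_1_FN_SDEI_FEATURES, 1, &res);
    }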

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/kvm/sdei.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index e1f6ab9800ee..ab0b7b5e3191 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -527,6 +527,23 @@ static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
 	return ret;
 }
 
+static unsigned long hypercall_features(struct kvm_vcpu *vcpu)
+{
+	unsigned long feature = smccc_get_arg(vcpu, 1);
+	unsigned long ret;
+
+	switch (feature) {
+	case 0: /* BIND_SLOTS */
+	case 1: /* RELATIVE_MODE */
+		ret = 0;
+		break;
+	default:
+		ret = SDEI_INVALID_PARAMETERS;
+	}
+
+	return ret;
+}
+
 int kvm_sdei_call(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -590,6 +607,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	case SDEI_1_0_FN_SDEI_SHARED_RESET:
 		ret = hypercall_reset(vcpu, false);
 		break;
+	case SDEI_1_1_FN_SDEI_FEATURES:
+		ret = hypercall_features(vcpu);
+		break;
 	default:
 		ret = SDEI_NOT_SUPPORTED;
 	}
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v6 16/18] KVM: arm64: Support SDEI_VERSION hypercall
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

This adds support for the SDEI_VERSION hypercall, returning v1.1,
which is the specification version we're following. The vendor field
is set to one of the values returned by the ARM_SMCCC_VENDOR_HYP_CALL_UID
hypercall.
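
For reference, the returned value packs major[63:48], minor[47:32] and
vendor[31:0]. A guest can decode it with the macros from
include/uapi/linux/arm_sdei.h, as in this sketch (again assuming the
smccc() helper from the selftest):

    uint64_t ver = smccc(SDEI_1_0_FN_SDEI_VERSION, 0, 0, 0, 0, 0);
    uint64_t major  = SDEI_VERSION_MAJOR(ver);   /* 1 */
    uint64_t minor  = SDEI_VERSION_MINOR(ver);   /* 1 */
    uint64_t vendor = SDEI_VERSION_VENDOR(ver);  /* KVM_SDEI_VENDOR */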

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/arm64/include/asm/kvm_sdei.h |  3 +++
 arch/arm64/kvm/sdei.c             | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index f946d4ebdc14..32dfd5595f15 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -71,6 +71,9 @@ struct kvm_sdei_vcpu {
 	struct kvm_sdei_vcpu_context	context[SDEI_EVENT_PRIORITY_CRITICAL + 1];
 };
 
+/* Returned as vendor through SDEI_VERSION hypercall */
+#define KVM_SDEI_VENDOR	ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2
+
 /*
  * According to SDEI specification (v1.1), the event number spans 32-bits
  * and the lower 24-bits are used as the (real) event number. I don't
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index ab0b7b5e3191..5d9b49a4355c 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -117,6 +117,14 @@ static int inject_event(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static unsigned long hypercall_version(struct kvm_vcpu *vcpu)
+{
+	/* v1.1 and vendor ID */
+	return (1UL << SDEI_VERSION_MAJOR_SHIFT) |
+	       (1UL << SDEI_VERSION_MINOR_SHIFT) |
+	       KVM_SDEI_VENDOR;
+}
+
 static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
@@ -565,6 +573,9 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
 	}
 
 	switch (func) {
+	case SDEI_1_0_FN_SDEI_VERSION:
+		ret = hypercall_version(vcpu);
+		break;
 	case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
 		ret = hypercall_register(vcpu);
 		break;
-- 
2.23.0




* [PATCH v6 17/18] KVM: arm64: Expose SDEI capability
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: linux-kernel, eauger, oupton, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

The SDEI functionality is ready to be exposed. This adds a new
capability (KVM_CAP_ARM_SDEI) and exposes it. The VMM needs it to
decide whether to add the ACPI table for SDEI, so that the service
can be detected by the guest kernel.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 Documentation/virt/kvm/api.rst | 11 +++++++++++
 arch/arm64/kvm/arm.c           |  1 +
 include/uapi/linux/kvm.h       |  1 +
 3 files changed, 13 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index d13fa6600467..59b94a24a490 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7723,6 +7723,17 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability.  Setting
 this capability will disable PMU virtualization for that VM.  Usermode
 should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
 
+8.36 KVM_CAP_ARM_SDEI
+---------------------
+
+:Capability: KVM_CAP_ARM_SDEI
+:Architectures: arm64
+:Type: vm
+
+This capability indicates that the SDEI virtual service is supported
+in the host. A VMM can check it to determine whether the service is
+available before enabling it for the guest.
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 7e77d62aeab1..8117a9e974f0 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -211,6 +211,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
+	case KVM_CAP_ARM_SDEI:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 91a6fe4e02c0..a5474265c841 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1144,6 +1144,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_MEM_OP_EXTENSION 211
 #define KVM_CAP_PMU_CAPABILITY 212
 #define KVM_CAP_DISABLE_QUIRKS2 213
+#define KVM_CAP_ARM_SDEI 214
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.23.0




* [PATCH v6 18/18] KVM: selftests: Add SDEI test case
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:39   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:39 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will

This adds an SDEI selftest case where the various hypercalls are
issued to the default event (0x0). The default event is private,
signaled and of normal priority.

By default, two vCPUs are started and the following ioctl commands
or hypercalls are sent to them in sequence, simulating how they are
used by the VMM and the Linux guest (a sketch of the host/guest
handshake follows the list):

   kvm_check_cap(KVM_CAP_ARM_SDEI)     (Available functionality)

   SDEI_1_0_FN_SDEI_VERSION
   SDEI_1_1_FN_SDEI_FEATURES           (SDEI capability probing)
   SDEI_1_0_FN_SDEI_SHARED_RESET       (restart SDEI)
   SDEI_1_0_FN_SDEI_PE_UNMASK          (CPU online)

   SDEI_1_0_FN_SDEI_EVENT_GET_INFO
   SDEI_1_0_FN_SDEI_EVENT_REGISTER     (register event)
   SDEI_1_0_FN_SDEI_EVENT_ENABLE       (enable event)
   SDEI_1_1_FN_SDEI_EVENT_SIGNAL       (event injection)

   SDEI_1_0_FN_SDEI_EVENT_DISABLE      (disable event)
   SDEI_1_0_FN_SDEI_EVENT_UNREGISTER   (unregister event)
   SDEI_1_0_FN_SDEI_PE_MASK            (CPU offline)
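
At its core, the test drives each vCPU through a small shared-memory
mailbox: the host writes a command into the per-vCPU state and syncs it
into the guest, and the guest polls for a new command, issues the
hypercall and flags completion. A simplified sketch of that handshake,
using the sync helpers from the selftest below:

    /* Host side, per vCPU */
    vcpu_states[i].state.command = SDEI_1_0_FN_SDEI_VERSION;
    vcpu_states[i].state.command_completed = false;
    sync_global_to_guest(vm, vcpu_states[i].state);
    /* ... poll with sync_global_from_guest() until command_completed ... */

    /* Guest side */
    for (;;) {
        command = READ_ONCE(state->command);
        /* ... issue the hypercall, record state->status ... */
        WRITE_ONCE(state->command_completed, true);
    }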

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 tools/testing/selftests/kvm/Makefile       |   1 +
 tools/testing/selftests/kvm/aarch64/sdei.c | 498 +++++++++++++++++++++
 2 files changed, 499 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 21c2dbd21a81..53f3b651726e 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -108,6 +108,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
 TEST_GEN_PROGS_aarch64 += aarch64/psci_cpu_on_test
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_irq
+TEST_GEN_PROGS_aarch64 += aarch64/sdei
 TEST_GEN_PROGS_aarch64 += demand_paging_test
 TEST_GEN_PROGS_aarch64 += dirty_log_test
 TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/aarch64/sdei.c b/tools/testing/selftests/kvm/aarch64/sdei.c
new file mode 100644
index 000000000000..07acbc7582d0
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/sdei.c
@@ -0,0 +1,498 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM SDEI test
+ *
+ * Copyright (C) 2022 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <gshan@redhat.com>
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+#include <linux/bitmap.h>
+
+#include "kvm_util.h"
+#include "processor.h"
+#include "linux/arm_sdei.h"
+
+#define NR_VCPUS		2
+#define SDEI_TEST_EVENT_NUM	SDEI_SW_SIGNALED_EVENT
+
+#define VCPU_COMMAND_IDLE	0
+#define VCPU_COMMAND_EXIT	1
+
+struct vcpu_command {
+	const char	*name;
+	uint64_t	command;
+};
+
+struct sdei_feature {
+	uint16_t	shared_slots;
+	uint16_t	private_slots;
+	uint8_t		relative_mode;
+};
+
+struct sdei_event_info {
+	uint8_t		type;
+	uint8_t		priority;
+	uint8_t		signaled;
+};
+
+struct sdei_event_signal {
+	uint8_t		handled;
+	uint8_t		irq;
+	uint64_t	status;
+	uint64_t	pc;
+	uint64_t	pstate;
+	uint64_t	regs[18];
+};
+
+struct sdei_state {
+	uint64_t	command;
+	uint64_t	num;
+	uint64_t	status;
+	union {
+		uint64_t			version;
+		struct sdei_feature		feature;
+		struct sdei_event_info		info;
+		struct sdei_event_signal	signal;
+	};
+
+	uint8_t		command_completed;
+};
+
+struct vcpu_state {
+	struct kvm_vm		*vm;
+	uint32_t		vcpu_id;
+	pthread_t		thread;
+	struct sdei_state	state;
+};
+
+static struct vcpu_state vcpu_states[NR_VCPUS];
+static struct vcpu_command vcpu_commands[] = {
+	{ "VERSION",          SDEI_1_0_FN_SDEI_VERSION          },
+	{ "FEATURES",         SDEI_1_1_FN_SDEI_FEATURES         },
+	{ "PRIVATE_RESET",    SDEI_1_0_FN_SDEI_PRIVATE_RESET    },
+	{ "SHARED_RESET",     SDEI_1_0_FN_SDEI_SHARED_RESET     },
+	{ "PE_UNMASK",        SDEI_1_0_FN_SDEI_PE_UNMASK        },
+	{ "EVENT_GET_INFO",   SDEI_1_0_FN_SDEI_EVENT_GET_INFO   },
+	{ "EVENT_REGISTER",   SDEI_1_0_FN_SDEI_EVENT_REGISTER   },
+	{ "EVENT_ENABLE",     SDEI_1_0_FN_SDEI_EVENT_ENABLE     },
+	{ "EVENT_SIGNAL",     SDEI_1_1_FN_SDEI_EVENT_SIGNAL     },
+	{ "PE_MASK",          SDEI_1_0_FN_SDEI_PE_MASK          },
+	{ "EVENT_DISABLE",    SDEI_1_0_FN_SDEI_EVENT_DISABLE    },
+	{ "EVENT_UNREGISTER", SDEI_1_0_FN_SDEI_EVENT_UNREGISTER },
+};
+
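+/*
+ * Issue a hypercall using the HVC instruction, following the SMCCC
+ * calling convention: the function ID goes in x0, the arguments in
+ * x1-x5, and the result comes back in x0.
+ */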
+static inline int64_t smccc(uint32_t func, uint64_t arg0, uint64_t arg1,
+			    uint64_t arg2, uint64_t arg3, uint64_t arg4)
+{
+	int64_t ret;
+
+	asm volatile (
+		"mov    x0, %1\n"
+		"mov    x1, %2\n"
+		"mov    x2, %3\n"
+		"mov    x3, %4\n"
+		"mov    x4, %5\n"
+		"mov    x5, %6\n"
+		"hvc    #0\n"
+		"mov    %0, x0\n"
+	: "=r" (ret) : "r" (func), "r" (arg0), "r" (arg1),
+	"r" (arg2), "r" (arg3), "r" (arg4) :
+	"x0", "x1", "x2", "x3", "x4", "x5");
+
+	return ret;
+}
+
+static inline bool is_error(int64_t status)
+{
+	if (status == SDEI_NOT_SUPPORTED      ||
+	    status == SDEI_INVALID_PARAMETERS ||
+	    status == SDEI_DENIED             ||
+	    status == SDEI_PENDING            ||
+	    status == SDEI_OUT_OF_RESOURCE)
+		return true;
+
+	return false;
+}
+
+static void guest_irq_handler(struct ex_regs *regs)
+{
+	int vcpu_id = guest_get_vcpuid();
+	struct sdei_state *state = &vcpu_states[vcpu_id].state;
+
+	WRITE_ONCE(state->signal.irq, true);
+}
+
+static void sdei_event_handler(uint64_t num, uint64_t arg,
+			       uint64_t pc, uint64_t pstate)
+{
+	struct sdei_state *state = (struct sdei_state *)arg;
+	uint64_t status;
+
+	status = smccc(SDEI_1_0_FN_SDEI_EVENT_STATUS, num, 0, 0, 0, 0);
+	WRITE_ONCE(state->signal.status, status);
+
+	WRITE_ONCE(state->signal.pc, pc);
+	WRITE_ONCE(state->signal.pstate, pstate);
+
+	status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 0, 0, 0, 0, 0);
+	WRITE_ONCE(state->signal.regs[0], status);
+	status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 1, 0, 0, 0, 0);
+	WRITE_ONCE(state->signal.regs[1], status);
+	status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 2, 0, 0, 0, 0);
+	WRITE_ONCE(state->signal.regs[2], status);
+	status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 3, 0, 0, 0, 0);
+	WRITE_ONCE(state->signal.regs[3], status);
+
+	WRITE_ONCE(state->signal.handled, true);
+	smccc(SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME,
+	      num, 0, 0, 0, 0);
+}
+
+static bool sdei_event_wait(struct sdei_state *state,
+			    uint64_t timeout_in_seconds)
+{
+	uint64_t limit, count = 0;
+
+	limit = (timeout_in_seconds * 1000000) / 10;
+
+	while (1) {
+		if (READ_ONCE(state->signal.handled))
+			return true;
+
+		if (++count >= limit)
+			return false;
+
+		/*
+		 * We issue HVC calls here so that the injected
+		 * event can be delivered in time.
+		 */
+		smccc(SDEI_1_0_FN_SDEI_EVENT_GET_INFO,
+		      READ_ONCE(state->num), SDEI_EVENT_INFO_EV_TYPE,
+		      0, 0, 0);
+
+		usleep(10);
+	}
+
+	return false;
+}
+
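+/*
+ * Guest entry point: poll the shared per-vCPU state for a new command,
+ * issue the corresponding hypercall and record the result, then flag
+ * completion for the host to pick up.
+ */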
+static void guest_code(int vcpu_id)
+{
+	struct sdei_state *state;
+	uint64_t command, last_command = -1UL, num, status;
+
+	state = &vcpu_states[vcpu_id].state;
+
+	while (1) {
+		command = READ_ONCE(state->command);
+		if (command == last_command)
+			continue;
+
+		num = READ_ONCE(state->num);
+		switch (command) {
+		case VCPU_COMMAND_IDLE:
+			WRITE_ONCE(state->status, SDEI_SUCCESS);
+			break;
+		case SDEI_1_0_FN_SDEI_VERSION:
+			status = smccc(command, 0, 0, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
+			WRITE_ONCE(state->version, status);
+			break;
+		case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+		case SDEI_1_0_FN_SDEI_SHARED_RESET:
+		case SDEI_1_0_FN_SDEI_PE_UNMASK:
+		case SDEI_1_0_FN_SDEI_PE_MASK:
+			status = smccc(command, 0, 0, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			break;
+		case SDEI_1_1_FN_SDEI_FEATURES:
+			status = smccc(command, 0, 0, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
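+			/*
+			 * BIND_SLOTS layout: shared slots in bits
+			 * [31:16], private slots in bits [15:0].
+			 */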
+			WRITE_ONCE(state->feature.shared_slots,
+				   (status & 0xffff0000) >> 16);
+			WRITE_ONCE(state->feature.private_slots,
+				   (status & 0x0000ffff));
+			status = smccc(command, 1, 0, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
+			WRITE_ONCE(state->feature.relative_mode, status);
+			break;
+		case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+			status = smccc(command, num,
+				       SDEI_EVENT_INFO_EV_TYPE, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
+			WRITE_ONCE(state->info.type, status);
+			status = smccc(command, num,
+				       SDEI_EVENT_INFO_EV_PRIORITY, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
+			WRITE_ONCE(state->info.priority, status);
+			status = smccc(command, num,
+				       SDEI_EVENT_INFO_EV_SIGNALED, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
+			WRITE_ONCE(state->info.signaled, status);
+			break;
+		case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+			status = smccc(command, num,
+				       (uint64_t)sdei_event_handler,
+				       (uint64_t)state,
+				       SDEI_EVENT_REGISTER_RM_ANY, 0);
+			WRITE_ONCE(state->status, status);
+			break;
+		case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+		case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+		case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+			status = smccc(command, num, 0, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			break;
+		case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+			status = smccc(command, num, (uint64_t)state, 0, 0, 0);
+			WRITE_ONCE(state->status, status);
+			if (is_error(status))
+				break;
+
+			if (!sdei_event_wait(state, 5))
+				WRITE_ONCE(state->status, SDEI_DENIED);
+
+			break;
+		case VCPU_COMMAND_EXIT:
+			WRITE_ONCE(state->status, SDEI_SUCCESS);
+			GUEST_DONE();
+			break;
+		default:
+			WRITE_ONCE(state->status, SDEI_INVALID_PARAMETERS);
+		}
+
+		last_command = command;
+		WRITE_ONCE(state->command_completed, true);
+	}
+}
+
+static void *vcpu_thread(void *arg)
+{
+	struct vcpu_state *state = arg;
+
+	vcpu_run(state->vm, state->vcpu_id);
+
+	return NULL;
+}
+
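+/* Poll until all vCPUs have completed the current command, or time out. */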
+static bool vcpu_wait(struct kvm_vm *vm, int timeout_in_seconds)
+{
+	unsigned long count, limit;
+	int i;
+
+	count = 0;
+	limit = (timeout_in_seconds * 1000000) / 50;
+	while (1) {
+		for (i = 0; i < NR_VCPUS; i++) {
+			sync_global_from_guest(vm, vcpu_states[i].state);
+			if (!vcpu_states[i].state.command_completed)
+				break;
+		}
+
+		if (i >= NR_VCPUS)
+			return true;
+
+		if (++count > limit)
+			return false;
+
+		usleep(50);
+	}
+
+	return false;
+}
+
+static void vcpu_send_command(struct kvm_vm *vm, uint64_t command)
+{
+	int i;
+
+	for (i = 0; i < NR_VCPUS; i++) {
+		memset(&vcpu_states[i].state, 0,
+		       sizeof(vcpu_states[0].state));
+		vcpu_states[i].state.num = SDEI_TEST_EVENT_NUM;
+		vcpu_states[i].state.status = SDEI_SUCCESS;
+		vcpu_states[i].state.command = command;
+		vcpu_states[i].state.command_completed = false;
+
+		sync_global_to_guest(vm, vcpu_states[i].state);
+	}
+}
+
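+/*
+ * All vCPUs run the same command against the same private event, so
+ * success means no error status and identical state across all vCPUs.
+ */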
+static bool vcpu_check_state(struct kvm_vm *vm)
+{
+	int i, j, ret;
+
+	for (i = 0; i < NR_VCPUS; i++)
+		sync_global_from_guest(vm, vcpu_states[i].state);
+
+	for (i = 0; i < NR_VCPUS; i++) {
+		if (is_error(vcpu_states[i].state.status))
+			return false;
+
+		for (j = 0; j < NR_VCPUS; j++) {
+			ret = memcmp(&vcpu_states[i].state,
+				     &vcpu_states[j].state,
+				     sizeof(vcpu_states[0].state));
+			if (ret)
+				return false;
+		}
+	}
+
+	return true;
+}
+
+static void vcpu_dump_state(int index)
+{
+	struct sdei_state *state = &vcpu_states[0].state;
+
+	pr_info("--- %s\n", vcpu_commands[index].name);
+	switch (state->command) {
+	case SDEI_1_0_FN_SDEI_VERSION:
+		pr_info("    Version:              %ld.%ld (vendor: 0x%lx)\n",
+			SDEI_VERSION_MAJOR(state->version),
+			SDEI_VERSION_MINOR(state->version),
+			SDEI_VERSION_VENDOR(state->version));
+		break;
+	case SDEI_1_1_FN_SDEI_FEATURES:
+		pr_info("    Shared event slots:   %d\n",
+			state->feature.shared_slots);
+		pr_info("    Private event slots:  %d\n",
+			state->feature.private_slots);
+		pr_info("    Relative mode:        %s\n",
+			state->feature.relative_mode ? "Yes" : "No");
+		break;
+	case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+		pr_info("    Type:                 %s\n",
+			state->info.type == SDEI_EVENT_TYPE_SHARED ?
+			"Shared" : "Private");
+		pr_info("    Priority:             %s\n",
+			state->info.priority == SDEI_EVENT_PRIORITY_NORMAL ?
+			"Normal" : "Critical");
+		pr_info("    Signaled:             %s\n",
+			state->info.signaled ? "Yes" : "No");
+		break;
+	case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+		pr_info("    Handled:              %s\n",
+			state->signal.handled ? "Yes" : "No");
+		pr_info("    IRQ:                  %s\n",
+			state->signal.irq ? "Yes" : "No");
+		pr_info("    Status:               %s-%s-%s\n",
+			state->signal.status & (1 << SDEI_EVENT_STATUS_REGISTERED) ?
+			"Registered" : "x",
+			state->signal.status & (1 << SDEI_EVENT_STATUS_ENABLED) ?
+			"Enabled" : "x",
+			state->signal.status & (1 << SDEI_EVENT_STATUS_RUNNING) ?
+			"Running" : "x");
+		pr_info("    PC/PSTATE:            %016lx %016lx\n",
+			state->signal.pc, state->signal.pstate);
+		pr_info("    Regs:                 %016lx %016lx %016lx %016lx\n",
+			state->signal.regs[0], state->signal.regs[1],
+			state->signal.regs[2], state->signal.regs[3]);
+		break;
+	}
+
+	if (index == ARRAY_SIZE(vcpu_commands) - 1)
+		pr_info("\n");
+}
+
+int main(int argc, char **argv)
+{
+	struct kvm_vm *vm;
+	uint32_t vcpu_ids[NR_VCPUS];
+	int i, ret;
+
+	if (!kvm_check_cap(KVM_CAP_ARM_SDEI)) {
+		pr_info("SDEI not supported\n");
+		return 0;
+	}
+
+	/* Create VM */
+	for (i = 0; i < NR_VCPUS; i++) {
+		vcpu_states[i].vcpu_id = i;
+		vcpu_ids[i] = i;
+	}
+
+	vm = vm_create_default_with_vcpus(NR_VCPUS, 0, 0,
+					  guest_code, vcpu_ids);
+	vm_init_descriptor_tables(vm);
+	vm_install_exception_handler(vm, VECTOR_IRQ_CURRENT,
+				     guest_irq_handler);
+	ucall_init(vm, NULL);
+
+	/* Start the vCPUs */
+	vcpu_send_command(vm, VCPU_COMMAND_IDLE);
+	for (i = 0; i < NR_VCPUS; i++) {
+		vcpu_states[i].vm = vm;
+		vcpu_args_set(vm, i, 1, i);
+		vcpu_init_descriptor_tables(vm, i);
+
+		ret = pthread_create(&vcpu_states[i].thread, NULL,
+				     vcpu_thread, &vcpu_states[i]);
+		TEST_ASSERT(!ret, "Failed to create vCPU-%d pthread\n", i);
+	}
+
+	/* Wait for the IDLE command to complete */
+	ret = vcpu_wait(vm, 5);
+	TEST_ASSERT(ret, "Timed out executing the IDLE command\n");
+
+	/* Start the tests */
+	pr_info("\n");
+	pr_info("    NR_VCPUS: %d    SDEI Event: 0x%08x\n\n",
+		NR_VCPUS, SDEI_TEST_EVENT_NUM);
+	for (i = 0; i < ARRAY_SIZE(vcpu_commands); i++) {
+		/*
+		 * We depend on the SDEI_EVENT_SIGNAL hypercall to inject
+		 * the SDEI event, and the number of the injected event
+		 * must be zero. So we have to skip the corresponding test
+		 * if the SDEI event number isn't zero.
+		 */
+		if (SDEI_TEST_EVENT_NUM != SDEI_SW_SIGNALED_EVENT &&
+		    vcpu_commands[i].command == SDEI_1_1_FN_SDEI_EVENT_SIGNAL)
+			continue;
+
+		vcpu_send_command(vm, vcpu_commands[i].command);
+		ret = vcpu_wait(vm, 5);
+		if (!ret) {
+			pr_info("%s: Timeout\n", vcpu_commands[i].name);
+			return -1;
+		}
+
+		ret = vcpu_check_state(vm);
+		if (!ret) {
+			pr_info("%s: Fail\n", vcpu_commands[i].name);
+			return -1;
+		}
+
+		vcpu_dump_state(i);
+	}
+
+	/* Terminate the guests */
+	pr_info("\n    Result: OK\n\n");
+	vcpu_send_command(vm, VCPU_COMMAND_EXIT);
+	sleep(1);
+
+	return 0;
+}
-- 
2.23.0



* Re: [PATCH v6 00/18] Support SDEI Virtualization
  2022-04-03 15:38 ` Gavin Shan
@ 2022-04-03 15:47   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-03 15:47 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, Oliver Upton, James Morse,
	Mark Rutland, Shannon Zhao

On 4/3/22 11:38 PM, Gavin Shan wrote:
> This series intends to virtualize Software Delegated Exception Interface
> (SDEI), which is defined by DEN0054C (v1.1). It allows the hypervisor to
> deliver NMI-alike SDEI event to guest and it's needed by Async PF to
> deliver page-not-present notification from hypervisor to guest. The code
> and the required qemu changes can be found from:
> 
>     https://developer.arm.com/documentation/den0054/c
>     https://github.com/gwshan/linux    ("kvm/arm64_sdei")
>     https://github.com/gwshan/qemu     ("kvm/arm64_sdei")
> 
> The design is quite strightforward by following the specification. The
> (SDEI) events are classified into the shared and private ones according
> to their scope. The shared event is system or VM scoped, but the private
> event is vcpu scoped. This implementation doesn't support the shared
> event because all the needed events are private. Besides, the migration
> isn't supported by implementation and it's something to be supported
> in future.
> 
> There are several objects (data structures) introduced to help on the
> event registration, enablement, disablement, unregistration, reset,
> delivery and handling.
> 
>    * kvm_sdei_exposed_event
>      The event which are defined and exposed by KVM. The event can't
>      be registered until it's exposed. Besides, all the information
>      in this event can't be changed after it's exposed.
>      
>    * kvm_sdei_event
>      The events are created based on the exposed events. Their states
>      are changed when hypercalls are received or they are delivered
>      to guest for handling.
>      
>    * kvm_sdei_vcpu_context
>      The vcpu context helps to handle events. The interrupted context
>      is saved before the event handler is executed, and restored after
>      the event handler is to finish.
>      
>    * kvm_sdei_vcpu
>      Place holder for all objects for one particular VCPU.
> 
> The patches are organized as below:
> 
>    PATCH[01-02] Preparatory work to extend smccc_get_argx() and refactor
>                 hypercall routing mechanism
>    PATCH[03]    Adds SDEI virtualization infrastructure
>    PATCH[04-16] Supports various SDEI hypercalls and event handling
>    PATCH[17]    Exposes SDEI capability
>    PATCH[18]    Adds SDEI selftest case
>    
> The previous revisions can be found:
> 
>    v5: https://lore.kernel.org/kvmarm/20220322080710.51727-1-gshan@redhat.com/
>    v4: https://lore.kernel.org/kvmarm/20210815001352.81927-1-gshan@redhat.com/
>    v3: https://lore.kernel.org/kvmarm/20210507083124.43347-1-gshan@redhat.com/
>    v2: https://lore.kernel.org/kvmarm/20210209032733.99996-1-gshan@redhat.com/
>    v1: https://lore.kernel.org/kvmarm/20200817100531.83045-1-gshan@redhat.com/
> 

I'm explicitly copying Oliver, James, Mark and Shannon to avoid resending this series.
It seems they were skipped even though I explicitly copied them with 'git send-email --cc=<email-addr>'.

[...]

Thanks,
Gavin





* Re: [PATCH v6 00/18] Support SDEI Virtualization
  2022-04-03 15:47   ` Gavin Shan
@ 2022-04-04  6:09     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-04  6:09 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, James Morse, Mark Rutland,
	Shannon Zhao

Hi Gavin,

On Sun, Apr 03, 2022 at 11:47:07PM +0800, Gavin Shan wrote:
> I'm explicitly copying Oliver, James, Mark and Shannon to avoid resending this series.
> It seems they have been skipped even I explicitly copied them by 'git send-email --cc=<email-addr>'.

Dunno about others, but FWIW your first crack at sending this series out
arrived in my inbox just fine :)

Thanks for cc'ing me, I'll find some time this week to take a look.

--
Best,
Oliver



* [PATCH] KVM: arm64: fix returnvar.cocci warnings
  2022-04-03 15:39   ` Gavin Shan
@ 2022-04-04 10:26     ` kernel test robot
  -1 siblings, 0 replies; 111+ messages in thread
From: kernel test robot @ 2022-04-04 10:26 UTC (permalink / raw)
  To: Gavin Shan, kvmarm
  Cc: kbuild-all, linux-kernel, eauger, oupton, Jonathan.Cameron,
	vkuznets, will, shannon.zhaosl, james.morse, mark.rutland, maz,
	pbonzini, shan.gavin

From: kernel test robot <lkp@intel.com>

arch/arm64/kvm/sdei.c:346:15-18: Unneeded variable: "ret". Return "SDEI_SUCCESS" on line 352


 Remove unneeded variable used to store return value.

Generated by: scripts/coccinelle/misc/returnvar.cocci

CC: Gavin Shan <gshan@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
---

url:    https://github.com/intel-lab-lkp/linux/commits/Gavin-Shan/Support-SDEI-Virtualization/20220403-234350
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git be2d3ecedd9911fbfd7e55cc9ceac5f8b79ae4cf
:::::: branch date: 19 hours ago
:::::: commit date: 19 hours ago

 arch/arm64/kvm/sdei.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -343,13 +343,12 @@ out:
 static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
 {
 	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
-	unsigned long ret = SDEI_SUCCESS;
 
 	spin_lock(&vsdei->lock);
 	vsdei->masked = mask ? 1 : 0;
 	spin_unlock(&vsdei->lock);
 
-	return ret;
+	return SDEI_SUCCESS;
 }
 
 int kvm_sdei_call(struct kvm_vcpu *vcpu)



* Re: [PATCH v6 10/18] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall
  2022-04-03 15:39   ` Gavin Shan
@ 2022-04-04 10:29     ` kernel test robot
  -1 siblings, 0 replies; 111+ messages in thread
From: kernel test robot @ 2022-04-04 10:29 UTC (permalink / raw)
  To: Gavin Shan, kvmarm
  Cc: kbuild-all, linux-kernel, eauger, oupton, Jonathan.Cameron,
	vkuznets, will, shannon.zhaosl, james.morse, mark.rutland, maz,
	pbonzini, shan.gavin

Hi Gavin,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.18-rc1 next-20220404]
[cannot apply to kvmarm/next kvm/master linux/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Gavin-Shan/Support-SDEI-Virtualization/20220403-234350
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git be2d3ecedd9911fbfd7e55cc9ceac5f8b79ae4cf
config: arm64-randconfig-c004-20220404 (https://download.01.org/0day-ci/archive/20220404/202204041802.MNEvTtnJ-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 11.2.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


cocci warnings: (new ones prefixed by >>)
>> arch/arm64/kvm/sdei.c:346:15-18: Unneeded variable: "ret". Return "SDEI_SUCCESS" on line 352

Please review and possibly fold the followup patch.

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



* Re: [PATCH v6 00/18] Support SDEI Virtualization
  2022-04-04  6:09     ` Oliver Upton
@ 2022-04-04 10:53       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-04 10:53 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, James Morse, Mark Rutland,
	Shannon Zhao

Hi Oliver,

On 4/4/22 2:09 PM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:47:07PM +0800, Gavin Shan wrote:
>> I'm explicitly copying Oliver, James, Mark and Shannon to avoid resending this series.
>> It seems they have been skipped even though I explicitly copied them with 'git send-email --cc=<email-addr>'.
> 
> Dunno about others, but FWIW your first crack at sending this series out
> arrived in my inbox just fine :)
> 
> Thanks for cc'ing me, I'll find some time this week to take a look.
> 

Thanks for letting me know the emails and patches have been delivered
correctly. Please take your time with the review, and thanks again for
your comments :)

Thanks,
Gavin



* Re: [PATCH] KVM: arm64: fix returnvar.cocci warnings
  2022-04-04 10:26     ` kernel test robot
@ 2022-04-04 10:54       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-04 10:54 UTC (permalink / raw)
  To: kernel test robot, kvmarm
  Cc: kbuild-all, linux-kernel, eauger, oupton, Jonathan.Cameron,
	vkuznets, will, shannon.zhaosl, james.morse, mark.rutland, maz,
	pbonzini, shan.gavin

On 4/4/22 6:26 PM, kernel test robot wrote:
> From: kernel test robot <lkp@intel.com>
> 
> arch/arm64/kvm/sdei.c:346:15-18: Unneeded variable: "ret". Return "SDEI_SUCCESS" on line 352
> 
> 
>   Remove unneeded variable used to store return value.
> 
> Generated by: scripts/coccinelle/misc/returnvar.cocci
> 
> CC: Gavin Shan <gshan@redhat.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: kernel test robot <lkp@intel.com>
> ---
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Gavin-Shan/Support-SDEI-Virtualization/20220403-234350
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git be2d3ecedd9911fbfd7e55cc9ceac5f8b79ae4cf
> :::::: branch date: 19 hours ago
> :::::: commit date: 19 hours ago
> 
>   arch/arm64/kvm/sdei.c |    3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -343,13 +343,12 @@ out:
>   static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
>   {
>   	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> -	unsigned long ret = SDEI_SUCCESS;
>   
>   	spin_lock(&vsdei->lock);
>   	vsdei->masked = mask ? 1 : 0;
>   	spin_unlock(&vsdei->lock);
>   
> -	return ret;
> +	return SDEI_SUCCESS;
>   }
>   
>   int kvm_sdei_call(struct kvm_vcpu *vcpu)
> 

Thanks for reporting the warning. I will fold the changes into the next
respin if needed.

Thanks,
Gavin


* Re: [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner
  2022-04-03 15:38   ` Gavin Shan
@ 2022-04-21  8:19     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-21  8:19 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Gavin,

On Sun, Apr 03, 2022 at 11:38:55PM +0800, Gavin Shan wrote:
> kvm_hvc_call_handler() directly handles the incoming hypercall, or
> routes it based on its (function) ID. kvm_psci_call() becomes
> the gatekeeper to handle the hypercalls that can't be handled by
> anyone else. It makes kvm_hvc_call_handler() a bit messy.
> 
> This reorganizes the code to route the hypercall to the corresponding
> handler based on its owner.

nit: write changelogs in the imperative:

Reorganize the code to ...

> The hypercall may be handled directly
> in the handler or routed further to the corresponding functionality.
> The (function) ID is always verified before it's routed to the
> corresponding functionality. By the way, @func_id is replaced by
> @func, to be consistent with smccc_get_function().
> 
> PSCI is the only exception: those hypercalls defined by v0.2 or
> beyond are routed to the handler for the Standard Secure Service, but
> those defined in v0.1 are routed to the handler for the Standard
> Hypervisor Service.
> 
> Suggested-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/kvm/hypercalls.c | 199 +++++++++++++++++++++++-------------
>  1 file changed, 127 insertions(+), 72 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 8438fd79e3f0..b659387d8919 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c

[...]

> +static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
> +{
> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> +
> +	switch (func) {
> +	case ARM_SMCCC_TRNG_VERSION ... ARM_SMCCC_TRNG_RND32:
> +	case ARM_SMCCC_TRNG_RND64:
> +		return kvm_trng_call(vcpu);
> +	case PSCI_0_2_FN_PSCI_VERSION ... PSCI_0_2_FN_SYSTEM_RESET:
> +	case PSCI_0_2_FN64_CPU_SUSPEND ... PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
> +	case PSCI_1_0_FN_PSCI_FEATURES ... PSCI_1_0_FN_SET_SUSPEND_MODE:
> +	case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> +	case PSCI_1_1_FN_SYSTEM_RESET2:
> +	case PSCI_1_1_FN64_SYSTEM_RESET2:

Isn't it known from the SMCCC what range of hypercall numbers PSCI and
TRNG fall under, respectively?

https://developer.arm.com/documentation/den0028/e/

See sections 6.3 and 6.4.
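
(For reference: those sections allocate function numbers 0x0000-0x001f of
the standard secure service range to PSCI, and 0x0050-0x005f to TRNG.)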

> +		return kvm_psci_call(vcpu);
> +	}
> +
> +	smccc_set_retval(vcpu, val, 0, 0, 0);
> +	return 1;

I don't think any cases of the switch statement change val, could you
just use SMCCC_RET_NOT_SUPPORTED here?
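
i.e., something like this (just a sketch, dropping the local entirely):

	smccc_set_retval(vcpu, SMCCC_RET_NOT_SUPPORTED, 0, 0, 0);
	return 1;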

> +}
> +
> +static int kvm_hvc_standard_hyp(struct kvm_vcpu *vcpu, u32 func)
> +{
> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> +	gpa_t gpa;
> +
> +	switch (func) {
>  	case ARM_SMCCC_HV_PV_TIME_FEATURES:
> -		val[0] = kvm_hypercall_pv_features(vcpu);
> +		val = kvm_hypercall_pv_features(vcpu);
>  		break;
>  	case ARM_SMCCC_HV_PV_TIME_ST:
>  		gpa = kvm_init_stolen_time(vcpu);
>  		if (gpa != GPA_INVALID)
> -			val[0] = gpa;
> +			val = gpa;
>  		break;
> +	case KVM_PSCI_FN_CPU_SUSPEND ... KVM_PSCI_FN_MIGRATE:
> +		return kvm_psci_call(vcpu);

You might want to handle these from the main call handler with a giant
disclaimer that these values predate SMCCC and therefore collide with
the standard hypervisor service range.
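
Something like this, perhaps (a rough sketch, reusing the symbols from
this patch):

	/*
	 * KVM_PSCI_FN_* predate SMCCC and collide with the standard
	 * hypervisor service range, so filter them out before looking
	 * at the owner.
	 */
	if (func >= KVM_PSCI_FN_CPU_SUSPEND && func <= KVM_PSCI_FN_MIGRATE)
		return kvm_psci_call(vcpu);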

[...]

> +
> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> +{
> +	u32 func = smccc_get_function(vcpu);
> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> +
> +	switch (ARM_SMCCC_OWNER_NUM(func)) {
> +	case ARM_SMCCC_OWNER_ARCH:
> +		return kvm_hvc_arch(vcpu, func);
> +	case ARM_SMCCC_OWNER_STANDARD:
> +		return kvm_hvc_standard(vcpu, func);
> +	case ARM_SMCCC_OWNER_STANDARD_HYP:
> +		return kvm_hvc_standard_hyp(vcpu, func);
> +	case ARM_SMCCC_OWNER_VENDOR_HYP:
> +		return kvm_hvc_vendor_hyp(vcpu, func);
> +	}
> +
> +	smccc_set_retval(vcpu, val, 0, 0, 0);

Same here, avoid indirecting the return value through a local variable.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner
@ 2022-04-21  8:19     ` Oliver Upton
  0 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-21  8:19 UTC (permalink / raw)
  To: Gavin Shan
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

Hi Gavin,

On Sun, Apr 03, 2022 at 11:38:55PM +0800, Gavin Shan wrote:
> kvm_hvc_call_handler() directly handles the incoming hypercall, or
> and routes it based on its (function) ID. kvm_psci_call() becomes
> the gate keeper to handle the hypercall that can't be handled by
> any one else. It makes kvm_hvc_call_handler() a bit messy.
> 
> This reorgnizes the code to route the hypercall to the corresponding
> handler based on its owner.

nit: write changelogs in the imperative:

Reorganize the code to ...

> The hypercall may be handled directly
> in the handler or routed further to the corresponding functionality.
> The (function) ID is always verified before it's routed to the
> corresponding functionality. By the way, @func_id is repalced by
> @func, to be consistent with by smccc_get_function().
> 
> PSCI is the only exception, those hypercalls defined by 0.2 or
> beyond are routed to the handler for Standard Secure Service, but
> those defined in 0.1 are routed to the handler for Standard
> Hypervisor Service.
> 
> Suggested-by: Oliver Upton <oupton@google.com>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/kvm/hypercalls.c | 199 +++++++++++++++++++++++-------------
>  1 file changed, 127 insertions(+), 72 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 8438fd79e3f0..b659387d8919 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c

[...]

> +static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
> +{
> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> +
> +	switch (func) {
> +	case ARM_SMCCC_TRNG_VERSION ... ARM_SMCCC_TRNG_RND32:
> +	case ARM_SMCCC_TRNG_RND64:
> +		return kvm_trng_call(vcpu);
> +	case PSCI_0_2_FN_PSCI_VERSION ... PSCI_0_2_FN_SYSTEM_RESET:
> +	case PSCI_0_2_FN64_CPU_SUSPEND ... PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
> +	case PSCI_1_0_FN_PSCI_FEATURES ... PSCI_1_0_FN_SET_SUSPEND_MODE:
> +	case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> +	case PSCI_1_1_FN_SYSTEM_RESET2:
> +	case PSCI_1_1_FN64_SYSTEM_RESET2:

Isn't it known from the SMCCC what range of hypercall numbers PSCI and
TRNG fall under, respectively?

https://developer.arm.com/documentation/den0028/e/

See sections 6.3 and 6.4.

> +		return kvm_psci_call(vcpu);
> +	}
> +
> +	smccc_set_retval(vcpu, val, 0, 0, 0);
> +	return 1;

I don't think any cases of the switch statement change val, could you
just use SMCCC_RET_NOT_SUPPORTED here?

> +}
> +
> +static int kvm_hvc_standard_hyp(struct kvm_vcpu *vcpu, u32 func)
> +{
> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> +	gpa_t gpa;
> +
> +	switch (func) {
>  	case ARM_SMCCC_HV_PV_TIME_FEATURES:
> -		val[0] = kvm_hypercall_pv_features(vcpu);
> +		val = kvm_hypercall_pv_features(vcpu);
>  		break;
>  	case ARM_SMCCC_HV_PV_TIME_ST:
>  		gpa = kvm_init_stolen_time(vcpu);
>  		if (gpa != GPA_INVALID)
> -			val[0] = gpa;
> +			val = gpa;
>  		break;
> +	case KVM_PSCI_FN_CPU_SUSPEND ... KVM_PSCI_FN_MIGRATE:
> +		return kvm_psci_call(vcpu);

You might want to handle these from the main call handler with a giant
disclaimer that these values predate SMCCC and therefore collide with
the standard hypervisor service range.

[...]

> +
> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> +{
> +	u32 func = smccc_get_function(vcpu);
> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> +
> +	switch (ARM_SMCCC_OWNER_NUM(func)) {
> +	case ARM_SMCCC_OWNER_ARCH:
> +		return kvm_hvc_arch(vcpu, func);
> +	case ARM_SMCCC_OWNER_STANDARD:
> +		return kvm_hvc_standard(vcpu, func);
> +	case ARM_SMCCC_OWNER_STANDARD_HYP:
> +		return kvm_hvc_standard_hyp(vcpu, func);
> +	case ARM_SMCCC_OWNER_VENDOR_HYP:
> +		return kvm_hvc_vendor_hyp(vcpu, func);
> +	}
> +
> +	smccc_set_retval(vcpu, val, 0, 0, 0);

Same here, avoid indirecting the return value through a local variable.

--
Thanks,
Oliver
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner
  2022-04-21  8:19     ` Oliver Upton
@ 2022-04-22 12:20       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-22 12:20 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 4/21/22 4:19 PM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:38:55PM +0800, Gavin Shan wrote:
>> kvm_hvc_call_handler() directly handles the incoming hypercall, or
>> routes it based on its (function) ID. kvm_psci_call() becomes
>> the gatekeeper to handle the hypercalls that can't be handled by
>> anyone else. It makes kvm_hvc_call_handler() a bit messy.
>>
>> This reorganizes the code to route the hypercall to the corresponding
>> handler based on its owner.
> 
> nit: write changelogs in the imperative:
> 
> Reorganize the code to ...
> 

Thanks again for your review. It will be corrected in the next respin.
By the way, could you help to review the rest when you have free
cycles? :)

>> The hypercall may be handled directly
>> in the handler or routed further to the corresponding functionality.
>> The (function) ID is always verified before it's routed to the
>> corresponding functionality. By the way, @func_id is replaced by
>> @func, to be consistent with smccc_get_function().
>>
>> PSCI is the only exception: those hypercalls defined by v0.2 or
>> beyond are routed to the handler for the Standard Secure Service, but
>> those defined in v0.1 are routed to the handler for the Standard
>> Hypervisor Service.
>>
>> Suggested-by: Oliver Upton <oupton@google.com>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   arch/arm64/kvm/hypercalls.c | 199 +++++++++++++++++++++++-------------
>>   1 file changed, 127 insertions(+), 72 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>> index 8438fd79e3f0..b659387d8919 100644
>> --- a/arch/arm64/kvm/hypercalls.c
>> +++ b/arch/arm64/kvm/hypercalls.c
> 
> [...]
> 
>> +static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
>> +{
>> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
>> +
>> +	switch (func) {
>> +	case ARM_SMCCC_TRNG_VERSION ... ARM_SMCCC_TRNG_RND32:
>> +	case ARM_SMCCC_TRNG_RND64:
>> +		return kvm_trng_call(vcpu);
>> +	case PSCI_0_2_FN_PSCI_VERSION ... PSCI_0_2_FN_SYSTEM_RESET:
>> +	case PSCI_0_2_FN64_CPU_SUSPEND ... PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
>> +	case PSCI_1_0_FN_PSCI_FEATURES ... PSCI_1_0_FN_SET_SUSPEND_MODE:
>> +	case PSCI_1_0_FN64_SYSTEM_SUSPEND:
>> +	case PSCI_1_1_FN_SYSTEM_RESET2:
>> +	case PSCI_1_1_FN64_SYSTEM_RESET2:
> 
> Isn't it known from the SMCCC what range of hypercall numbers PSCI and
> TRNG fall under, respectively?
> 
> https://developer.arm.com/documentation/den0028/e/
> 
> See sections 6.3 and 6.4.
> 

Bit#30 of the function ID indicates the calling convention, which is
either 32-bit or 64-bit. The 32-bit and 64-bit variants of TRNG's
function IDs are discrete. Besides, the spec reserves more function
IDs than the range we're using, so we don't have symbols to match
the reserved ranges. The TRNG cases therefore look good to me.
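
For reference, this is roughly how the function ID is laid out (per the
ARM_SMCCC_* shift/mask definitions in include/linux/arm-smccc.h):

	bit  31:    fast vs. yielding call
	bit  30:    32-bit vs. 64-bit calling convention
	bits 29:24: service owner
	bits 15:0:  function number within the service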

For PSCI, it can be simplified as below, according to the definition
in include/uapi/linux/psci.h:

     case PSCI_0_2_FN_PSCI_VERSION ...
          PSCI_1_1_FN_SYSTEM_RESET2:     /* 32-bits */
     case PSCI_0_2_FN64_CPU_SUSPEND ...
          PSCI_1_1_FN64_SYSTEM_RESET2:   /* 64-bits */

>> +		return kvm_psci_call(vcpu);
>> +	}
>> +
>> +	smccc_set_retval(vcpu, val, 0, 0, 0);
>> +	return 1;
> 
> I don't think any cases of the switch statement change val, could you
> just use SMCCC_RET_NOT_SUPPORTED here?
> 

Yes, will do in the next respin.

>> +}
>> +
>> +static int kvm_hvc_standard_hyp(struct kvm_vcpu *vcpu, u32 func)
>> +{
>> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
>> +	gpa_t gpa;
>> +
>> +	switch (func) {
>>   	case ARM_SMCCC_HV_PV_TIME_FEATURES:
>> -		val[0] = kvm_hypercall_pv_features(vcpu);
>> +		val = kvm_hypercall_pv_features(vcpu);
>>   		break;
>>   	case ARM_SMCCC_HV_PV_TIME_ST:
>>   		gpa = kvm_init_stolen_time(vcpu);
>>   		if (gpa != GPA_INVALID)
>> -			val[0] = gpa;
>> +			val = gpa;
>>   		break;
>> +	case KVM_PSCI_FN_CPU_SUSPEND ... KVM_PSCI_FN_MIGRATE:
>> +		return kvm_psci_call(vcpu);
> 
> You might want to handle these from the main call handler with a giant
> disclaimer that these values predate SMCCC and therefore collide with
> the standard hypervisor service range.
> 
> [...]
> 

I would probably keep it as it is, to follow the rule of routing
strictly by owner. Besides, there are 3 levels of SMCCC handling
after this patch is applied, corresponding to the main, owner and
function handlers. It reads more naturally if the implementation
follows that structure.
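
That is, roughly:

	kvm_hvc_call_handler()        /* main: dispatch on owner */
	  -> kvm_hvc_standard()       /* owner: dispatch on function ID */
	       -> kvm_psci_call()     /* function: the service itself */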

>> +
>> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>> +{
>> +	u32 func = smccc_get_function(vcpu);
>> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
>> +
>> +	switch (ARM_SMCCC_OWNER_NUM(func)) {
>> +	case ARM_SMCCC_OWNER_ARCH:
>> +		return kvm_hvc_arch(vcpu, func);
>> +	case ARM_SMCCC_OWNER_STANDARD:
>> +		return kvm_hvc_standard(vcpu, func);
>> +	case ARM_SMCCC_OWNER_STANDARD_HYP:
>> +		return kvm_hvc_standard_hyp(vcpu, func);
>> +	case ARM_SMCCC_OWNER_VENDOR_HYP:
>> +		return kvm_hvc_vendor_hyp(vcpu, func);
>> +	}
>> +
>> +	smccc_set_retval(vcpu, val, 0, 0, 0);
> 
> Same here, avoid indirecting the return value through a local variable.
> 

Sure, will do in the next respin.

Thanks,
Gavin



* Re: [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner
  2022-04-22 12:20       ` Gavin Shan
@ 2022-04-22 17:59         ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-22 17:59 UTC (permalink / raw)
  To: Gavin Shan
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

On Fri, Apr 22, 2022 at 08:20:50PM +0800, Gavin Shan wrote:
> Hi Oliver,
> 
> On 4/21/22 4:19 PM, Oliver Upton wrote:
> > On Sun, Apr 03, 2022 at 11:38:55PM +0800, Gavin Shan wrote:
> > > kvm_hvc_call_handler() directly handles the incoming hypercall, or
> > > routes it based on its (function) ID. kvm_psci_call() becomes
> > > the gatekeeper to handle the hypercalls that can't be handled by
> > > anyone else. It makes kvm_hvc_call_handler() a bit messy.
> > > 
> > > This reorganizes the code to route the hypercall to the corresponding
> > > handler based on its owner.
> > 
> > nit: write changelogs in the imperative:
> > 
> > Reorganize the code to ...
> > 
> 
> Thanks again for your review. It will be corrected in the next respin.
> By the way, could you help to review the rest when you have free
> cycles? :)

Yup, I've been thinking on the rest of the series just to make sure the
feedback I give is sane.

> > > The hypercall may be handled directly
> > > in the handler or routed further to the corresponding functionality.
> > > The (function) ID is always verified before it's routed to the
> > > corresponding functionality. By the way, @func_id is replaced by
> > > @func, to be consistent with smccc_get_function().
> > > 
> > > PSCI is the only exception: those hypercalls defined by v0.2 or
> > > beyond are routed to the handler for the Standard Secure Service, but
> > > those defined in v0.1 are routed to the handler for the Standard
> > > Hypervisor Service.
> > > 
> > > Suggested-by: Oliver Upton <oupton@google.com>
> > > Signed-off-by: Gavin Shan <gshan@redhat.com>
> > > ---
> > >   arch/arm64/kvm/hypercalls.c | 199 +++++++++++++++++++++++-------------
> > >   1 file changed, 127 insertions(+), 72 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> > > index 8438fd79e3f0..b659387d8919 100644
> > > --- a/arch/arm64/kvm/hypercalls.c
> > > +++ b/arch/arm64/kvm/hypercalls.c
> > 
> > [...]
> > 
> > > +static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
> > > +{
> > > +	u64 val = SMCCC_RET_NOT_SUPPORTED;
> > > +
> > > +	switch (func) {
> > > +	case ARM_SMCCC_TRNG_VERSION ... ARM_SMCCC_TRNG_RND32:
> > > +	case ARM_SMCCC_TRNG_RND64:
> > > +		return kvm_trng_call(vcpu);
> > > +	case PSCI_0_2_FN_PSCI_VERSION ... PSCI_0_2_FN_SYSTEM_RESET:
> > > +	case PSCI_0_2_FN64_CPU_SUSPEND ... PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
> > > +	case PSCI_1_0_FN_PSCI_FEATURES ... PSCI_1_0_FN_SET_SUSPEND_MODE:
> > > +	case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> > > +	case PSCI_1_1_FN_SYSTEM_RESET2:
> > > +	case PSCI_1_1_FN64_SYSTEM_RESET2:
> > 
> > Isn't it known from the SMCCC what range of hypercall numbers PSCI and
> > TRNG fall under, respectively?
> > 
> > https://developer.arm.com/documentation/den0028/e/
> > 
> > See sections 6.3 and 6.4.
> > 
> 
> Bit#30 of the function ID indicates the calling convention, which is
> either 32-bit or 64-bit. The 32-bit and 64-bit variants of TRNG's
> function IDs are discrete. Besides, the spec reserves more function
> IDs than the range we're using, so we don't have symbols to match
> the reserved ranges. The TRNG cases therefore look good to me.
> 
> For PSCI, it can be simplified as below, according to the definition
> in include/uapi/linux/psci.h:
> 
>     case PSCI_0_2_FN_PSCI_VERSION ...
>          PSCI_1_1_FN_SYSTEM_RESET2:     /* 32-bits */
>     case PSCI_0_2_FN64_CPU_SUSPEND ...
>          PSCI_1_1_FN64_SYSTEM_RESET2:   /* 64-bits */

Right, but this still requires that we go back and update this switch
statement every time we add a new PSCI call, which is exactly what I was
hoping we could avoid. Doing this based exactly on the spec reduces the
burden for future changes, and keeps all relevant context in a single
spot.

  #define SMCCC_STD_PSCI_RANGE_START	0x0000
  #define SMCCC_STD_PSCI_RANGE_END	0x001f
  #define SMCCC_STD_TRNG_RANGE_START	0x0050
  #define SMCCC_STD_TRNG_RANGE_END	0x005f

  switch (ARM_SMCCC_FUNC_NUM(function_id)) {
          case SMCCC_STD_PSCI_RANGE_START ... SMCCC_STD_PSCI_RANGE_END:
                  return kvm_psci_call(vcpu);
          case SMCCC_STD_TRNG_RANGE_START ... SMCCC_STD_TRNG_RANGE_END:
                  return kvm_trng_call(vcpu);

          ...
  }

[...]

> > > +	case KVM_PSCI_FN_CPU_SUSPEND ... KVM_PSCI_FN_MIGRATE:
> > > +		return kvm_psci_call(vcpu);
> > 
> > You might want to handle these from the main call handler with a giant
> > disclaimer that these values predate SMCCC and therefore collide with
> > the standard hypervisor service range.
> > 
> > [...]
> > 
> 
> I would probably keep it as it is, to follow the rule of routing
> strictly by owner. Besides, there are 3 levels of SMCCC handling
> after this patch is applied, corresponding to the main, owner and
> function handlers. It reads more naturally if the implementation
> follows that structure.

I think this makes it much more confusing for the reader, as you'd be
hard-pressed to find these function IDs in the SMCCC spec. Since their
values are outside of the specification, it is confusing to only address
them after these switch statements have decided that they belong to a
particular service owner, when in fact they do not.

--
Thanks,
Oliver

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-03 15:38   ` Gavin Shan
@ 2022-04-22 21:48     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-22 21:48 UTC (permalink / raw)
  To: Gavin Shan
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

On Sun, Apr 03, 2022 at 11:38:56PM +0800, Gavin Shan wrote:
> Software Delegated Exception Interface (SDEI) provides a mechanism
> for registering and servicing system events, as defined by the ARM DEN0054C
> specification. One of these events will be used by Asynchronous Page
> Fault (Async PF) to deliver notifications from host to guest.
> 
> The events are classified into shared and private ones according to
> their scopes. The shared events are system or VM scoped, but the
> private events are CPU or VCPU scoped. The shared events can be
> registered, enabled, unregistered and reset through hypercalls
> issued from any VCPU. However, the private events are registered,
> enabled, unregistered and reset on the calling VCPU through
> hypercalls. Besides, the events are also classified into critical
> and normal events according to their priority. During event delivery
> and handling, the normal event can be preempted by another critical
> event, but not the other way around. The critical event is never preempted
> by another normal event.

We don't have any need for critical events though, right? We should avoid
building out the plumbing around the concept of critical events until
there is an actual use case for it.

> This introduces SDEI virtualization infrastructure for various objects
> used in the implementation. Currently, we don't support the shared
> event.
> 
>   * kvm_sdei_exposed_event
>     The events which are defined and exposed by KVM. An event can't
>     be registered until it's exposed. Besides, all the information
>     in this event can't be changed after it's exposed.
> 
>   * kvm_sdei_event
>     The events are created based on the exposed events. Their states
>     are changed when hypercalls are received or they are delivered
>     to the guest for handling.
> 
>   * kvm_sdei_vcpu_context
>     The vcpu context helps to handle events. The interrupted context
>     is saved before the event handler is executed, and restored after
>     the event handler finishes.
> 
>   * kvm_sdei_vcpu
>     Placeholder for all objects for one particular VCPU.
> 
> The SDEI_NOT_SUPPORTED error is returned for all hypercalls for now.
> They will be supported one by one in the subsequent patches.
> 
> Link: https://developer.arm.com/documentation/den0054/latest
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |   1 +
>  arch/arm64/include/asm/kvm_sdei.h | 148 ++++++++++++++++++++++++++++++
>  arch/arm64/kvm/Makefile           |   2 +-
>  arch/arm64/kvm/arm.c              |   4 +
>  arch/arm64/kvm/hypercalls.c       |   3 +
>  arch/arm64/kvm/sdei.c             |  98 ++++++++++++++++++++
>  include/uapi/linux/arm_sdei.h     |   4 +
>  7 files changed, 259 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_sdei.h
>  create mode 100644 arch/arm64/kvm/sdei.c
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e3b25dc6c367..7644a400c4a8 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -343,6 +343,7 @@ struct kvm_vcpu_arch {
>  	 * Anything that is not used directly from assembly code goes
>  	 * here.
>  	 */
> +	struct kvm_sdei_vcpu *sdei;
>  
>  	/*
>  	 * Guest registers we preserve during guest debugging.
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> new file mode 100644
> index 000000000000..2dbfb3ae0a48
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -0,0 +1,148 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Definitions of various KVM SDEI events.
> + *
> + * Copyright (C) 2022 Red Hat, Inc.
> + *
> + * Author(s): Gavin Shan <gshan@redhat.com>
> + */
> +
> +#ifndef __ARM64_KVM_SDEI_H__
> +#define __ARM64_KVM_SDEI_H__
> +
> +#include <uapi/linux/arm_sdei.h>
> +#include <linux/arm-smccc.h>
> +#include <linux/bits.h>
> +#include <linux/spinlock.h>
> +
> +/*
> + * The events which are defined and exposed by KVM. An event can't
> + * be registered until it's exposed. Besides, all the information
> + * in this event can't be changed after it's exposed.
> + */

kernel-doc style comments are highly preferable when describing a
complex struct. Figuring out what each of the fields means is not
obvious from the code alone.
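
For example, something like this (the field descriptions below are only
my guesses, which is rather the point):

/**
 * struct kvm_sdei_exposed_event - an event KVM exposes to its guests
 * @num:      the SDEI event number
 * @type:     private or shared (SDEI_EVENT_TYPE_*)
 * @signaled: whether the event is software-signaled
 * @priority: normal or critical (SDEI_EVENT_PRIORITY_*)
 */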

> +struct kvm_sdei_exposed_event {
> +	unsigned int	num;
> +	unsigned char	type;
> +	unsigned char	signaled;

what is this used for?

> +	unsigned char	priority;
> +};

I don't think we have a need for this struct. ::type will always be set
to PRIVATE and ::priority will always be NORMAL.
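
i.e., the per-event state could be folded into the event itself, along
the lines of (just a sketch):

	struct kvm_sdei_event {
		unsigned int	num;
		unsigned long	ep_address;
		unsigned long	ep_arg;
		unsigned long	state;
		unsigned long	event_count;
	};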

> +/*
> + * Currently, only the private events are supported. The events are
> + * created based on the exposed events and their states are changed
> + * when hypercalls are received or they are delivered to the guest for
> + * handling.
> + */
> +struct kvm_sdei_event {
> +	struct kvm_sdei_exposed_event	*exposed_event;

I'm not following what is meant by an exposed event. By default,
KVM will expose all of the events to its guests.

> +	unsigned char			route_mode;
> +	unsigned long			route_affinity;

If we only have private events, do we need to worry about routing?

> +	unsigned long			ep_address;
> +	unsigned long			ep_arg;
> +#define KVM_SDEI_EVENT_STATE_REGISTERED		BIT(0)
> +#define KVM_SDEI_EVENT_STATE_ENABLED		BIT(1)
> +#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING	BIT(2)
> +	unsigned long			state;

Isn't this state actually local to a PE (not VM) for private events?

> +	unsigned long			event_count;
> +};
> +
> +/*
> + * The vcpu context helps to handle events. The preempted or interrupted
> + * context is saved before the event handler is executed, and restored
> + * after the event handler finishes. The event with normal priority
> + * can be preempted by the one with critical priority. So there can be
> + * two contexts on one particular vcpu for the events with normal and
> + * critical priority separately.
> + */
> +struct kvm_sdei_vcpu_context {
> +	struct kvm_sdei_event	*event;

Do we need this if we disallow nesting events?

> +	unsigned long		regs[18];
> +	unsigned long		pc;
> +	unsigned long		pstate;
> +};
> +
> +struct kvm_sdei_vcpu {
> +	spinlock_t			lock;

Why do we need a lock? This state should only ever be messed with in the
context of a single vCPU to which we already have exclusive access.

> +	struct kvm_sdei_event		*events;
> +	unsigned char			masked;
> +	unsigned long			critical_event_count;
> +	unsigned long			normal_event_count;
> +	struct kvm_sdei_vcpu_context	context[SDEI_EVENT_PRIORITY_CRITICAL + 1];
> +};
> +
> +/*
> + * According to the SDEI specification (v1.1), the event number spans
> + * 32 bits and the lower 24 bits are used as the (real) event number.
> + * I don't think we can use that many event numbers in one system. So
> + * we reserve two bits from the 24-bit real event number, to indicate
> + * its type: physical or virtual event. One reserved bit is enough for
> + * now, but two bits are reserved for possible extension in the future.
> + *
> + * The physical events are owned by firmware while the virtual events
> + * are used by VMM and KVM.

Doesn't KVM own everything? I don't see how the guest could interact
with another SDEI implementation.

> + */
> +#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT	22
> +#define KVM_SDEI_EVENT_NUM_TYPE_MASK	(3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
> +#define KVM_SDEI_EVENT_NUM_TYPE_PHYS	0
> +#define KVM_SDEI_EVENT_NUM_TYPE_VIRT	1
> +
> +static inline bool kvm_sdei_is_virtual(unsigned int num)
> +{
> +	unsigned int type;
> +
> +	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
> +	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
> +	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
> +		return true;
> +
> +	return false;
> +}
> +
> +static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
> +{
> +	return num == SDEI_SW_SIGNALED_EVENT;
> +}

Couldn't the caller just check the event number on their own?

> +static inline bool kvm_sdei_is_supported(unsigned int num)
> +{
> +	return kvm_sdei_is_sw_signaled(num) ||
> +	       kvm_sdei_is_virtual(num);
> +}

Is there ever going to be a situation where KVM has defined a new event
but doesn't actually support it?

> +static inline bool kvm_sdei_is_critical(unsigned char priority)
> +{
> +	return priority == SDEI_EVENT_PRIORITY_CRITICAL;
> +}
> +
> +static inline bool kvm_sdei_is_normal(unsigned char priority)
> +{
> +	return priority == SDEI_EVENT_PRIORITY_NORMAL;
> +}
> +
> +#define KVM_SDEI_REGISTERED_EVENT_FUNC(func, field)			\
> +static inline bool kvm_sdei_is_##func(struct kvm_sdei_event *event)	\
> +{									\
> +	return !!(event->state & KVM_SDEI_EVENT_STATE_##field);		\
> +}									\
> +									\
> +static inline void kvm_sdei_set_##func(struct kvm_sdei_event *event)	\
> +{									\
> +	event->state |= KVM_SDEI_EVENT_STATE_##field;			\
> +}									\
> +									\
> +static inline void kvm_sdei_clear_##func(struct kvm_sdei_event *event)	\
> +{									\
> +	event->state &= ~KVM_SDEI_EVENT_STATE_##field;			\
> +}
> +
> +KVM_SDEI_REGISTERED_EVENT_FUNC(registered, REGISTERED)
> +KVM_SDEI_REGISTERED_EVENT_FUNC(enabled, ENABLED)
> +KVM_SDEI_REGISTERED_EVENT_FUNC(unregister_pending, UNREGISTER_PENDING)

Are there any particular concerns about open coding the bitwise
operations that are getting wrapped here? test_bit()/set_bit() is also a
helpful construct.
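
e.g. (a sketch; this assumes the KVM_SDEI_EVENT_STATE_* flags are turned
into bit numbers rather than BIT() masks, since test_bit()/set_bit()
take a bit number):

	#define KVM_SDEI_EVENT_STATE_REGISTERED	0
	#define KVM_SDEI_EVENT_STATE_ENABLED	1

	set_bit(KVM_SDEI_EVENT_STATE_ENABLED, &event->state);

	if (test_bit(KVM_SDEI_EVENT_STATE_REGISTERED, &event->state))
		...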

> +/* APIs */
> +int kvm_sdei_call(struct kvm_vcpu *vcpu);
> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_SDEI_H__ */
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 261644b1a6bb..d6ced92ae3f0 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>  	 inject_fault.o va_layout.o handle_exit.o \
>  	 guest.o debug.o reset.o sys_regs.o \
>  	 vgic-sys-reg-v3.o fpsimd.o pmu.o pkvm.o \
> -	 arch_timer.o trng.o vmid.o \
> +	 arch_timer.o trng.o vmid.o sdei.o \
>  	 vgic/vgic.o vgic/vgic-init.o \
>  	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
>  	 vgic/vgic-v3.o vgic/vgic-v4.o \
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 523bc934fe2f..227c0e390571 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -38,6 +38,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/kvm_emulate.h>
> +#include <asm/kvm_sdei.h>
>  #include <asm/sections.h>
>  
>  #include <kvm/arm_hypercalls.h>
> @@ -331,6 +332,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  
>  	kvm_arm_pvtime_vcpu_init(&vcpu->arch);
>  
> +	kvm_sdei_create_vcpu(vcpu);
> +
>  	vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
>  
>  	err = kvm_vgic_vcpu_init(vcpu);
> @@ -352,6 +355,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>  	kvm_timer_vcpu_terminate(vcpu);
>  	kvm_pmu_vcpu_destroy(vcpu);
> +	kvm_sdei_destroy_vcpu(vcpu);
>  
>  	kvm_arm_vcpu_destroy(vcpu);
>  }
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index b659387d8919..6aa027a4cee8 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -5,6 +5,7 @@
>  #include <linux/kvm_host.h>
>  
>  #include <asm/kvm_emulate.h>
> +#include <asm/kvm_sdei.h>
>  
>  #include <kvm/arm_hypercalls.h>
>  #include <kvm/arm_psci.h>
> @@ -93,6 +94,8 @@ static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
>  	case PSCI_1_1_FN_SYSTEM_RESET2:
>  	case PSCI_1_1_FN64_SYSTEM_RESET2:
>  		return kvm_psci_call(vcpu);
> +	case SDEI_1_0_FN_SDEI_VERSION ... SDEI_1_1_FN_SDEI_FEATURES:
> +		return kvm_sdei_call(vcpu);

I mentioned in another thread, but reraising here on the new diff.
Prefer using the defined function [start, end] range in this switch
statement.
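
e.g., mirroring the SMCCC_STD_PSCI_RANGE_* sketch from my earlier reply
(the macro names are made up; per DEN0028 the SDEI calls occupy function
numbers 0x20-0x3f of the standard secure service range):

	#define SMCCC_STD_SDEI_RANGE_START	0x0020
	#define SMCCC_STD_SDEI_RANGE_END	0x003f

	case SMCCC_STD_SDEI_RANGE_START ... SMCCC_STD_SDEI_RANGE_END:
		return kvm_sdei_call(vcpu);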

Overall, I think this still puts a lot of abstraction around the concept
of SDEI events, even though we have a very narrow use case for it in KVM
for now. Removing all of the plumbing for critical and shared events
should help collapse this quite a bit.

--
Thanks,
Oliver
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
@ 2022-04-22 21:48     ` Oliver Upton
  0 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-22 21:48 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Sun, Apr 03, 2022 at 11:38:56PM +0800, Gavin Shan wrote:
> Software Delegated Exception Interface (SDEI) provides a mechanism
> for registering and servicing system events, as defined by ARM DEN0054C
> specification. One of these events will be used by Asynchronous Page
> Fault (Async PF) to deliver notifications from host to guest.
> 
> The events are classified into shared and private ones according to
> their scopes. The shared events are system or VM scoped, but the
> private events are CPU or VCPU scoped. The shared events can be
> registered, enabled, unregistered and reset through hypercalls
> issued from any VCPU. However, the private events are registered,
> enabled, unregistered and reset on the calling VCPU through
> hypercalls. Besides, the events are also classified into critical
> and normal events according their priority. During event delivery
> and handling, the normal event can be preempted by another critical
> event, but not in reverse way. The critical event is never preempted
> by another normal event.

We don't have any need for critical events though, right? We should avoid
building out the plumbing around the concept of critical events until
there is an actual use case for it.

> This introduces SDEI virtualization infrastructure for various objects
> used in the implementation. Currently, we don't support the shared
> event.
> 
>   * kvm_sdei_exposed_event
>     The event which are defined and exposed by KVM. The event can't
>     be registered until it's exposed. Besides, all the information
>     in this event can't be changed after it's exposed.
> 
>   * kvm_sdei_event
>     The events are created based on the exposed events. Their states
>     are changed when hypercalls are received or they are delivered
>     to guest for handling.
> 
>   * kvm_sdei_vcpu_context
>     The vcpu context helps to handle events. The interrupted context
>     is saved before the event handler is executed, and restored after
>     the event handler is to finish.
> 
>   * kvm_sdei_vcpu
>     Place holder for all objects for one particular VCPU.
> 
> The error of SDEI_NOT_SUPPORTED is returned for all hypercalls for now.
> They will be supported one by one in the subsequent patches.
> 
> Link: https://developer.arm.com/documentation/den0054/latest
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |   1 +
>  arch/arm64/include/asm/kvm_sdei.h | 148 ++++++++++++++++++++++++++++++
>  arch/arm64/kvm/Makefile           |   2 +-
>  arch/arm64/kvm/arm.c              |   4 +
>  arch/arm64/kvm/hypercalls.c       |   3 +
>  arch/arm64/kvm/sdei.c             |  98 ++++++++++++++++++++
>  include/uapi/linux/arm_sdei.h     |   4 +
>  7 files changed, 259 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_sdei.h
>  create mode 100644 arch/arm64/kvm/sdei.c
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e3b25dc6c367..7644a400c4a8 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -343,6 +343,7 @@ struct kvm_vcpu_arch {
>  	 * Anything that is not used directly from assembly code goes
>  	 * here.
>  	 */
> +	struct kvm_sdei_vcpu *sdei;
>  
>  	/*
>  	 * Guest registers we preserve during guest debugging.
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> new file mode 100644
> index 000000000000..2dbfb3ae0a48
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -0,0 +1,148 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Definitions of various KVM SDEI events.
> + *
> + * Copyright (C) 2022 Red Hat, Inc.
> + *
> + * Author(s): Gavin Shan <gshan@redhat.com>
> + */
> +
> +#ifndef __ARM64_KVM_SDEI_H__
> +#define __ARM64_KVM_SDEI_H__
> +
> +#include <uapi/linux/arm_sdei.h>
> +#include <linux/arm-smccc.h>
> +#include <linux/bits.h>
> +#include <linux/spinlock.h>
> +
> +/*
> + * The events which are defined and exposed by KVM. An event can't
> + * be registered until it's exposed. Besides, none of the information
> + * in this event can be changed after it's exposed.
> + */

kernel doc style comments are highly preferable when describing a
complex struct. Figuring out what each of the fields mean is not
obvious.
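
For example, something along these lines (the field meanings here are
inferred from the patch, so treat the wording as a sketch rather than a
suggestion for the final text):

   /**
    * struct kvm_sdei_exposed_event - an event defined and exposed by KVM
    * @num:      the SDEI event number
    * @type:     the event type, private or shared
    * @signaled: whether the event can be raised by software
    * @priority: SDEI_EVENT_PRIORITY_{NORMAL,CRITICAL}
    */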

> +struct kvm_sdei_exposed_event {
> +	unsigned int	num;
> +	unsigned char	type;
> +	unsigned char	signaled;

what is this used for?

> +	unsigned char	priority;
> +};

I don't think we have a need for this struct. ::type will always be set
to PRIVATE and ::priority will always be NORMAL.

> +/*
> + * Currently, only the private events are supported. The events are
> + * created based on the exposed events and their states are changed
> + * when hypercalls are received or they are delivered to the guest for
> + * handling.
> + */
> +struct kvm_sdei_event {
> +	struct kvm_sdei_exposed_event	*exposed_event;

I'm not following what is meant by an exposed event. By default the
KVM will expose all of the events to its guests.

> +	unsigned char			route_mode;
> +	unsigned long			route_affinity;

If we only have private events, do we need to worry about routing?

> +	unsigned long			ep_address;
> +	unsigned long			ep_arg;
> +#define KVM_SDEI_EVENT_STATE_REGISTERED		BIT(0)
> +#define KVM_SDEI_EVENT_STATE_ENABLED		BIT(1)
> +#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING	BIT(2)
> +	unsigned long			state;

Isn't this state actually local to a PE (not VM) for private events?

> +	unsigned long			event_count;
> +};
> +
> +/*
> + * The vcpu context helps to handle events. The preempted or interrupted
> + * context is saved before the event handler is executed, and restored
> + * after the event handler finishes. The event with normal priority
> + * can be preempted by the one with critical priority. So there can be
> + * two contexts on one particular vcpu for the events with normal and
> + * critical priority separately.
> + */
> +struct kvm_sdei_vcpu_context {
> +	struct kvm_sdei_event	*event;

Do we need this if we disallow nesting events?

> +	unsigned long		regs[18];
> +	unsigned long		pc;
> +	unsigned long		pstate;
> +};
> +
> +struct kvm_sdei_vcpu {
> +	spinlock_t			lock;

Why do we need a lock? This state should only ever be messed with in the
context of a single vCPU to which we already have exclusive access.

> +	struct kvm_sdei_event		*events;
> +	unsigned char			masked;
> +	unsigned long			critical_event_count;
> +	unsigned long			normal_event_count;
> +	struct kvm_sdei_vcpu_context	context[SDEI_EVENT_PRIORITY_CRITICAL + 1];
> +};
> +
> +/*
> + * According to SDEI specification (v1.1), the event number spans 32-bits
> + * and the lower 24-bits are used as the (real) event number. I don't
> + * think we can use that many event numbers in one system. So we reserve
> + * two bits from the 24-bit real event number, to indicate its type:
> + * physical or virtual event. One reserved bit is enough for now, but
> + * two bits are reserved for possible extension in future.
> + *
> + * The physical events are owned by firmware while the virtual events
> + * are used by VMM and KVM.

Doesn't KVM own everything? I don't see how the guest could interact
with another SDEI implementation.

> + */
> +#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT	22
> +#define KVM_SDEI_EVENT_NUM_TYPE_MASK	(3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
> +#define KVM_SDEI_EVENT_NUM_TYPE_PHYS	0
> +#define KVM_SDEI_EVENT_NUM_TYPE_VIRT	1
> +
> +static inline bool kvm_sdei_is_virtual(unsigned int num)
> +{
> +	unsigned int type;
> +
> +	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
> +	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
> +	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
> +		return true;
> +
> +	return false;
> +}
> +
> +static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
> +{
> +	return num == SDEI_SW_SIGNALED_EVENT;
> +}

Couldn't the caller just check the event number on their own?

> +static inline bool kvm_sdei_is_supported(unsigned int num)
> +{
> +	return kvm_sdei_is_sw_signaled(num) ||
> +	       kvm_sdei_is_virtual(num);
> +}

Is there ever going to be a situation where KVM has defined a new event
but doesn't actually support it?

> +static inline bool kvm_sdei_is_critical(unsigned char priority)
> +{
> +	return priority == SDEI_EVENT_PRIORITY_CRITICAL;
> +}
> +
> +static inline bool kvm_sdei_is_normal(unsigned char priority)
> +{
> +	return priority == SDEI_EVENT_PRIORITY_NORMAL;
> +}
> +
> +#define KVM_SDEI_REGISTERED_EVENT_FUNC(func, field)			\
> +static inline bool kvm_sdei_is_##func(struct kvm_sdei_event *event)	\
> +{									\
> +	return !!(event->state & KVM_SDEI_EVENT_STATE_##field);		\
> +}									\
> +									\
> +static inline void kvm_sdei_set_##func(struct kvm_sdei_event *event)	\
> +{									\
> +	event->state |= KVM_SDEI_EVENT_STATE_##field;			\
> +}									\
> +									\
> +static inline void kvm_sdei_clear_##func(struct kvm_sdei_event *event)	\
> +{									\
> +	event->state &= ~KVM_SDEI_EVENT_STATE_##field;			\
> +}
> +
> +KVM_SDEI_REGISTERED_EVENT_FUNC(registered, REGISTERED)
> +KVM_SDEI_REGISTERED_EVENT_FUNC(enabled, ENABLED)
> +KVM_SDEI_REGISTERED_EVENT_FUNC(unregister_pending, UNREGISTER_PENDING)

Are there any particular concerns about open coding the bitwise
operations that are getting wrapped here? test_bit()/set_bit() is also a
helpful construct.

> +/* APIs */
> +int kvm_sdei_call(struct kvm_vcpu *vcpu);
> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_SDEI_H__ */
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 261644b1a6bb..d6ced92ae3f0 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>  	 inject_fault.o va_layout.o handle_exit.o \
>  	 guest.o debug.o reset.o sys_regs.o \
>  	 vgic-sys-reg-v3.o fpsimd.o pmu.o pkvm.o \
> -	 arch_timer.o trng.o vmid.o \
> +	 arch_timer.o trng.o vmid.o sdei.o \
>  	 vgic/vgic.o vgic/vgic-init.o \
>  	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
>  	 vgic/vgic-v3.o vgic/vgic-v4.o \
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 523bc934fe2f..227c0e390571 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -38,6 +38,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/kvm_emulate.h>
> +#include <asm/kvm_sdei.h>
>  #include <asm/sections.h>
>  
>  #include <kvm/arm_hypercalls.h>
> @@ -331,6 +332,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  
>  	kvm_arm_pvtime_vcpu_init(&vcpu->arch);
>  
> +	kvm_sdei_create_vcpu(vcpu);
> +
>  	vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
>  
>  	err = kvm_vgic_vcpu_init(vcpu);
> @@ -352,6 +355,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>  	kvm_timer_vcpu_terminate(vcpu);
>  	kvm_pmu_vcpu_destroy(vcpu);
> +	kvm_sdei_destroy_vcpu(vcpu);
>  
>  	kvm_arm_vcpu_destroy(vcpu);
>  }
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index b659387d8919..6aa027a4cee8 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -5,6 +5,7 @@
>  #include <linux/kvm_host.h>
>  
>  #include <asm/kvm_emulate.h>
> +#include <asm/kvm_sdei.h>
>  
>  #include <kvm/arm_hypercalls.h>
>  #include <kvm/arm_psci.h>
> @@ -93,6 +94,8 @@ static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
>  	case PSCI_1_1_FN_SYSTEM_RESET2:
>  	case PSCI_1_1_FN64_SYSTEM_RESET2:
>  		return kvm_psci_call(vcpu);
> +	case SDEI_1_0_FN_SDEI_VERSION ... SDEI_1_1_FN_SDEI_FEATURES:
> +		return kvm_sdei_call(vcpu);

I mentioned in another thread, but reraising here on the new diff.
Prefer using the defined function [start, end] range in this switch
statement.

Overall, I think this still puts a lot of abstraction around the concept
of SDEI events, even though we have a very narrow use case for it in KVM
for now. Removing all of the plumbing for critical and shared events
should help collapse this quite a bit.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner
  2022-04-22 17:59         ` Oliver Upton
@ 2022-04-23 12:48           ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-23 12:48 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 4/23/22 1:59 AM, Oliver Upton wrote:
> On Fri, Apr 22, 2022 at 08:20:50PM +0800, Gavin Shan wrote:
>> On 4/21/22 4:19 PM, Oliver Upton wrote:
>>> On Sun, Apr 03, 2022 at 11:38:55PM +0800, Gavin Shan wrote:
>>>> kvm_hvc_call_handler() directly handles the incoming hypercall, or
>>>> routes it based on its (function) ID. kvm_psci_call() becomes
>>>> the gate keeper to handle the hypercall that can't be handled by
>>>> any one else. It makes kvm_hvc_call_handler() a bit messy.
>>>>
>>>> This reorganizes the code to route the hypercall to the corresponding
>>>> handler based on its owner.
>>>
>>> nit: write changelogs in the imperative:
>>>
>>> Reorganize the code to ...
>>>
>>
>> Thanks again for your review. It will be corrected in next respin.
>> By the way, could you help to review the rest when you have free
>> cycles? :)
> 
> Yup, I've been thinking on the rest of the series just to make sure the
> feedback I give is sane.
> 

Sure.

>>>> The hypercall may be handled directly
>>>> in the handler or routed further to the corresponding functionality.
>>>> The (function) ID is always verified before it's routed to the
>>>> corresponding functionality. By the way, @func_id is replaced by
>>>> @func, to be consistent with smccc_get_function().
>>>>
>>>> PSCI is the only exception, those hypercalls defined by 0.2 or
>>>> beyond are routed to the handler for Standard Secure Service, but
>>>> those defined in 0.1 are routed to the handler for Standard
>>>> Hypervisor Service.
>>>>
>>>> Suggested-by: Oliver Upton <oupton@google.com>
>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>> ---
>>>>    arch/arm64/kvm/hypercalls.c | 199 +++++++++++++++++++++++-------------
>>>>    1 file changed, 127 insertions(+), 72 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>>>> index 8438fd79e3f0..b659387d8919 100644
>>>> --- a/arch/arm64/kvm/hypercalls.c
>>>> +++ b/arch/arm64/kvm/hypercalls.c
>>>
>>> [...]
>>>
>>>> +static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
>>>> +{
>>>> +	u64 val = SMCCC_RET_NOT_SUPPORTED;
>>>> +
>>>> +	switch (func) {
>>>> +	case ARM_SMCCC_TRNG_VERSION ... ARM_SMCCC_TRNG_RND32:
>>>> +	case ARM_SMCCC_TRNG_RND64:
>>>> +		return kvm_trng_call(vcpu);
>>>> +	case PSCI_0_2_FN_PSCI_VERSION ... PSCI_0_2_FN_SYSTEM_RESET:
>>>> +	case PSCI_0_2_FN64_CPU_SUSPEND ... PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
>>>> +	case PSCI_1_0_FN_PSCI_FEATURES ... PSCI_1_0_FN_SET_SUSPEND_MODE:
>>>> +	case PSCI_1_0_FN64_SYSTEM_SUSPEND:
>>>> +	case PSCI_1_1_FN_SYSTEM_RESET2:
>>>> +	case PSCI_1_1_FN64_SYSTEM_RESET2:
>>>
>>> Isn't it known from the SMCCC what range of hypercall numbers PSCI and
>>> TRNG fall under, respectively?
>>>
>>> https://developer.arm.com/documentation/den0028/e/
>>>
>>> See sections 6.3 and 6.4.
>>>
>>
>> Bit#30 of the function ID indicates the calling convention, which is
>> either 32-bit or 64-bit. For TRNG's function IDs, the 32-bit and 64-bit
>> variants are discrete. Besides, the spec reserves more function IDs
>> than the range we're using. It means we don't have symbols to match
>> the reserved ranges. So it looks good to me for the TRNG cases.
>>
>> For PSCI, it can be simplified as below, according to the definition
>> in include/uapi/linux/psci.h:
>>
>>      case PSCI_0_2_FN_PSCI_VERSION ...
>>           PSCI_1_1_FN_SYSTEM_RESET2:     /* 32-bits */
>>      case PSCI_0_2_FN64_CPU_SUSPEND ...
>>           PSCI_1_1_FN64_SYSTEM_RESET2:   /* 64-bits */
> 
> Right, but this still requires that we go back and update this switch
> statement every time we add a new PSCI call, which is exactly what I was
> hoping we could avoid. Doing this based exactly on the spec reduces the
> burden for future changes, and keeps all relevant context in a single
> spot.
> 
>    #define SMCCC_STD_PSCI_RANGE_START	0x0000
>    #define SMCCC_STD_PSCI_RANGE_END	0x001f
>    #define SMCCC_STD_TRNG_RANGE_START	0x0050
>    #define SMCCC_STD_TRNG_RANGE_END	0x005f
> 
>    switch (ARM_SMCCC_FUNC_NUM(function_id)) {
>            case SMCCC_STD_PSCI_RANGE_START ... SMCCC_STD_PSCI_RANGE_END:
> 	          return kvm_psci_call(vcpu);
>            case SMCCC_STD_TRNG_RANGE_START ... SMCCC_STD_TRNG_RANGE_END:
> 	  	  return kvm_trng_call(vcpu);
> 
> 	 ...
>    }
> 

Yep, we should avoid revisiting and modifying this function whenever a
new PSCI call is added. I intended not to introduce new macros, especially
in a header file (include/linux/arm-smccc.h) that is out of kvm/arm64 scope
to some degree. However, these newly added macros will make life much
easier. I will include the changes in the next respin.

>>>> +	case KVM_PSCI_FN_CPU_SUSPEND ... KVM_PSCI_FN_MIGRATE:
>>>> +		return kvm_psci_call(vcpu);
>>>
>>> You might want to handle these from the main call handler with a giant
>>> disclaimer that these values predate SMCCC and therefore collide with
>>> the standard hypervisor service range.
>>>
>>> [...]
>>>
>>
>> I would probably just keep it as it is to follow the rule: route
>> strictly based on the owner. Besides, there are 3 levels to
>> handle SMCCCs after this patch is applied, which correspond
>> to 3 handlers: main/owner/function. It sounds more natural
>> for the reader to follow the implementation in this way.
> 
> I think this makes it much more confusing for the reader, as you'd be
> hard pressed to find these function IDs in the SMCCC spec. Since their
> values are outside of the specification, it is confusing to only address
> them after these switch statements have decided that they belong to a
> particular service owner as they do not.
> 

OK. Let's filter these SMCCC PSCI numbers in kvm_hvc_call_handler():

     /* Filter these calls that aren't documented in the specification */
     if (func >= KVM_PSCI_FN_CPU_SUSPEND && func <= KVM_PSCI_FN_MIGRATE)
         return kvm_psci_call(vcpu);

     switch (ARM_SMCCC_OWNER_NUM(func)) {
         :
     }

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-22 21:48     ` Oliver Upton
@ 2022-04-23 14:18       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-23 14:18 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 4/23/22 5:48 AM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:38:56PM +0800, Gavin Shan wrote:
>> Software Delegated Exception Interface (SDEI) provides a mechanism
>> for registering and servicing system events, as defined by ARM DEN0054C
>> specification. One of these events will be used by Asynchronous Page
>> Fault (Async PF) to deliver notifications from host to guest.
>>
>> The events are classified into shared and private ones according to
>> their scopes. The shared events are system or VM scoped, but the
>> private events are CPU or VCPU scoped. The shared events can be
>> registered, enabled, unregistered and reset through hypercalls
>> issued from any VCPU. However, the private events are registered,
>> enabled, unregistered and reset on the calling VCPU through
>> hypercalls. Besides, the events are also classified into critical
>> and normal events according to their priority. During event delivery
>> and handling, a normal event can be preempted by a critical event,
>> but not the other way around: a critical event is never preempted
>> by a normal event.
> 
> We don't have any need for critical events though, right? We should avoid
> building out the plumbing around the concept of critical events until
> there is an actual use case for it.
> 

The Async PF one is a critical event, as the guest needs to handle it
immediately. Otherwise, it's possible that the guest can't continue its
execution. Besides, the software signaled event (0x0) is a normal event.
They're the only two events to be supported, and I assume the software
signaled event (0x0) is only used by selftest/kvm. So the Async PF one
becomes the only event and it can be of normal priority until another
SDEI event needs to be added and supported.

However, the logic to support critical/normal events is already here. So
I think it's probably nice to keep it. At least, it makes it easier to
add a new SDEI event in future. We dropped the support for the shared
event from v5 to v6, and I think we will probably never need a shared
event :)
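
To make that concrete, event delivery would just pick the per-priority
context slot, along the lines of this sketch (@vsdei stands in for the
vcpu's kvm_sdei_vcpu, so the names here are illustrative only):

   struct kvm_sdei_vcpu_context *ctx;

   ctx = kvm_sdei_is_critical(event->exposed_event->priority) ?
         &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL] :
         &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];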

>> This introduces SDEI virtualization infrastructure for various objects
>> used in the implementation. Currently, we don't support the shared
>> event.
>>
>>    * kvm_sdei_exposed_event
>>      The events which are defined and exposed by KVM. An event can't
>>      be registered until it's exposed. Besides, none of the information
>>      in this event can be changed after it's exposed.
>>
>>    * kvm_sdei_event
>>      The events are created based on the exposed events. Their states
>>      are changed when hypercalls are received or they are delivered
>>      to the guest for handling.
>>
>>    * kvm_sdei_vcpu_context
>>      The vcpu context helps to handle events. The interrupted context
>>      is saved before the event handler is executed, and restored after
>>      the event handler finishes.
>>
>>    * kvm_sdei_vcpu
>>      Placeholder for all objects for one particular VCPU.
>>
>> The error of SDEI_NOT_SUPPORTED is returned for all hypercalls for now.
>> They will be supported one by one in the subsequent patches.
>>
>> Link: https://developer.arm.com/documentation/den0054/latest
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   arch/arm64/include/asm/kvm_host.h |   1 +
>>   arch/arm64/include/asm/kvm_sdei.h | 148 ++++++++++++++++++++++++++++++
>>   arch/arm64/kvm/Makefile           |   2 +-
>>   arch/arm64/kvm/arm.c              |   4 +
>>   arch/arm64/kvm/hypercalls.c       |   3 +
>>   arch/arm64/kvm/sdei.c             |  98 ++++++++++++++++++++
>>   include/uapi/linux/arm_sdei.h     |   4 +
>>   7 files changed, 259 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/arm64/include/asm/kvm_sdei.h
>>   create mode 100644 arch/arm64/kvm/sdei.c
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index e3b25dc6c367..7644a400c4a8 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -343,6 +343,7 @@ struct kvm_vcpu_arch {
>>   	 * Anything that is not used directly from assembly code goes
>>   	 * here.
>>   	 */
>> +	struct kvm_sdei_vcpu *sdei;
>>   
>>   	/*
>>   	 * Guest registers we preserve during guest debugging.
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> new file mode 100644
>> index 000000000000..2dbfb3ae0a48
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -0,0 +1,148 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Definitions of various KVM SDEI events.
>> + *
>> + * Copyright (C) 2022 Red Hat, Inc.
>> + *
>> + * Author(s): Gavin Shan <gshan@redhat.com>
>> + */
>> +
>> +#ifndef __ARM64_KVM_SDEI_H__
>> +#define __ARM64_KVM_SDEI_H__
>> +
>> +#include <uapi/linux/arm_sdei.h>
>> +#include <linux/arm-smccc.h>
>> +#include <linux/bits.h>
>> +#include <linux/spinlock.h>
>> +
>> +/*
>> + * The events which are defined and exposed by KVM. An event can't
>> + * be registered until it's exposed. Besides, none of the information
>> + * in this event can be changed after it's exposed.
>> + */
> 
> kernel doc style comments are highly preferable when describing a
> complex struct. Figuring out what each of the fields mean is not
> obvious.
> 

Yeah, it's a nice point and let's do this in the next respin.

>> +struct kvm_sdei_exposed_event {
>> +	unsigned int	num;
>> +	unsigned char	type;
>> +	unsigned char	signaled;
> 
> what is this used for?
> 

It indicates whether the event can be raised by software or not. All
events exposed by KVM are raised by software, so this should always
be true.

>> +	unsigned char	priority;
>> +};
> 
> I don't think we have a need for this struct. ::type will always be set
> to PRIVATE and ::priority will always be NORMAL.
> 

If we don't support the critical event, this struct isn't needed except
for the @num field. However, I think it would be nice to support the
critical event. Besides, this struct could be merged with struct
kvm_sdei_event even if the critical event is supported.

This struct and struct kvm_sdei_event track the information and state
for one particular event. The information isn't changeable, but the
state can be modified through hypercalls. That's the reason why I
had two separate structs to track the information and the state.

>> +/*
>> + * Currently, only the private events are supported. The events are
>> + * created based on the exposed events and their states are changed
>> + * when hypercalls are received or they are delivered to the guest for
>> + * handling.
>> + */
>> +struct kvm_sdei_event {
>> +	struct kvm_sdei_exposed_event	*exposed_event;
> 
> I'm not following what is meant by an exposed event. By default the
> KVM will expose all of the events to its guests.
> 

Please refer to the above reply. struct kvm_sdei_exposed_event
and this struct track the information and state for one particular
event on one particular vcpu. The unchangeable information is
maintained in kvm_sdei_exposed_event, but the changeable state
is tracked by this struct. Besides, one struct kvm_sdei_exposed_event
instance can be referenced by multiple struct kvm_sdei_event
instances on different vcpus.

>> +	unsigned char			route_mode;
>> +	unsigned long			route_affinity;
> 
> If we only have private events, do we need to worry about routing?
> 

Yes, these two fields should be dropped. The private event is always
routed to the owning vcpu.

>> +	unsigned long			ep_address;
>> +	unsigned long			ep_arg;
>> +#define KVM_SDEI_EVENT_STATE_REGISTERED		BIT(0)
>> +#define KVM_SDEI_EVENT_STATE_ENABLED		BIT(1)
>> +#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING	BIT(2)
>> +	unsigned long			state;
> 
> Isn't this state actually local to a PE (not VM) for private events?
> 

Yes, the state is vcpu scoped. After the support for the shared event
was dropped from v5 to v6, there are no VM scoped properties any more.
Besides, for one and the same event, separate (struct kvm_sdei_event)
instances are created for the individual vcpus.


>> +	unsigned long			event_count;
>> +};
>> +
>> +/*
>> + * The vcpu context helps to handle events. The preempted or interrupted
>> + * context is saved before the event handler is executed, and restored
>> + * after the event handler finishes. The event with normal priority
>> + * can be preempted by the one with critical priority. So there can be
>> + * two contexts on one particular vcpu for the events with normal and
>> + * critical priority separately.
>> + */
>> +struct kvm_sdei_vcpu_context {
>> +	struct kvm_sdei_event	*event;
> 
> Do we need this if we disallow nesting events?
> 

Yes, we need this. "event == NULL" is used as an indication of an
invalid context. @event is the associated SDEI event when the context
is valid.
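
A minimal sketch of what I mean (the helper name is hypothetical and
not part of this patch):

   /* The context is only valid while its associated event is handled */
   static bool kvm_sdei_vcpu_context_valid(struct kvm_sdei_vcpu_context *ctx)
   {
   	return ctx->event != NULL;
   }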

>> +	unsigned long		regs[18];
>> +	unsigned long		pc;
>> +	unsigned long		pstate;
>> +};
>> +
>> +struct kvm_sdei_vcpu {
>> +	spinlock_t			lock;
> 
> Why do we need a lock? This state should only ever be messed with in the
> context of a single vCPU to which we already have exclusive access.
> 

Good point. I don't think we need it any more. The lock was introduced
to allow event injection from anywhere. For example, the event could be
injected from a context outside the vcpu. We shouldn't have this case now.

>> +	struct kvm_sdei_event		*events;
>> +	unsigned char			masked;
>> +	unsigned long			critical_event_count;
>> +	unsigned long			normal_event_count;
>> +	struct kvm_sdei_vcpu_context	context[SDEI_EVENT_PRIORITY_CRITICAL + 1];
>> +};
>> +
>> +/*
>> + * According to SDEI specification (v1.1), the event number spans 32-bits
>> + * and the lower 24-bits are used as the (real) event number. I don't
>> + * think we can use that many event numbers in one system. So we reserve
>> + * two bits from the 24-bit real event number, to indicate its type:
>> + * physical or virtual event. One reserved bit is enough for now, but
>> + * two bits are reserved for possible extension in future.
>> + *
>> + * The physical events are owned by firmware while the virtual events
>> + * are used by VMM and KVM.
> 
> Doesn't KVM own everything? I don't see how the guest could interact
> with another SDEI implementation.
> 

I might be overthinking the scheme. The host's firmware might support
SDEI and we may want to propagate events originating from the host's
firmware to the guest. In this case, we need to distinguish the events
originating from the host's firmware from those originating from kvm
(the guest's firmware). Even if this case never happens, I think it's
still nice to distinguish events originating from real firmware from
those originating from KVM's emulated firmware.
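
For completeness, the encoding side of kvm_sdei_is_virtual() could look
like the following (a hypothetical helper, not part of this patch):

   /* Compose a virtual event number from a 22-bit raw event number */
   #define KVM_SDEI_VIRT_EVENT_NUM(n)				\
   	(((n) & ~KVM_SDEI_EVENT_NUM_TYPE_MASK) |		\
   	 (KVM_SDEI_EVENT_NUM_TYPE_VIRT << KVM_SDEI_EVENT_NUM_TYPE_SHIFT))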

>> + */
>> +#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT	22
>> +#define KVM_SDEI_EVENT_NUM_TYPE_MASK	(3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
>> +#define KVM_SDEI_EVENT_NUM_TYPE_PHYS	0
>> +#define KVM_SDEI_EVENT_NUM_TYPE_VIRT	1
>> +
>> +static inline bool kvm_sdei_is_virtual(unsigned int num)
>> +{
>> +	unsigned int type;
>> +
>> +	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
>> +	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
>> +	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
>> +static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
>> +{
>> +	return num == SDEI_SW_SIGNALED_EVENT;
>> +}
> 
> Couldn't the caller just check the event number on their own?
> 

It would be hard because the caller can be the guest. Generally, the
event and its associated information/state are accessed by hypercalls,
event injection and delivery, and migration (to be supported in future).
So I think it's good to check the event number ourselves.

>> +static inline bool kvm_sdei_is_supported(unsigned int num)
>> +{
>> +	return kvm_sdei_is_sw_signaled(num) ||
>> +	       kvm_sdei_is_virtual(num);
>> +}
> 
> Is there ever going to be a situation where KVM has defined a new event
> but doesn't actually support it?
> 

Nice point. It's impossible. I will drop this helper,
kvm_sdei_is_supported(). Instead, the kvm_sdei_exposed_event
or kvm_sdei_event array should be checked. Besides, the macros
KVM_SDEI_EVENT_NUM_TYPE_* should be dropped as well.

>> +static inline bool kvm_sdei_is_critical(unsigned char priority)
>> +{
>> +	return priority == SDEI_EVENT_PRIORITY_CRITICAL;
>> +}
>> +
>> +static inline bool kvm_sdei_is_normal(unsigned char priority)
>> +{
>> +	return priority == SDEI_EVENT_PRIORITY_NORMAL;
>> +}
>> +
>> +#define KVM_SDEI_REGISTERED_EVENT_FUNC(func, field)			\
>> +static inline bool kvm_sdei_is_##func(struct kvm_sdei_event *event)	\
>> +{									\
>> +	return !!(event->state & KVM_SDEI_EVENT_STATE_##field);		\
>> +}									\
>> +									\
>> +static inline void kvm_sdei_set_##func(struct kvm_sdei_event *event)	\
>> +{									\
>> +	event->state |= KVM_SDEI_EVENT_STATE_##field;			\
>> +}									\
>> +									\
>> +static inline void kvm_sdei_clear_##func(struct kvm_sdei_event *event)	\
>> +{									\
>> +	event->state &= ~KVM_SDEI_EVENT_STATE_##field;			\
>> +}
>> +
>> +KVM_SDEI_REGISTERED_EVENT_FUNC(registered, REGISTERED)
>> +KVM_SDEI_REGISTERED_EVENT_FUNC(enabled, ENABLED)
>> +KVM_SDEI_REGISTERED_EVENT_FUNC(unregister_pending, UNREGISTER_PENDING)
> 
> Are there any particular concerns about open coding the bitwise
> operations that are getting wrapped here? test_bit()/set_bit() is also a
> helpful construct.
> 

OK. Let's drop these helpers and use {test, set, clear}_bit()
in the next respin.
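
For example, assuming the state flags are redefined as bit numbers
instead of masks (the names below are just a sketch):

   #define KVM_SDEI_EVENT_BIT_REGISTERED		0
   #define KVM_SDEI_EVENT_BIT_ENABLED		1
   #define KVM_SDEI_EVENT_BIT_UNREGISTER_PENDING	2

   set_bit(KVM_SDEI_EVENT_BIT_REGISTERED, &event->state);
   if (test_bit(KVM_SDEI_EVENT_BIT_ENABLED, &event->state))
   	/* deliver the event */;
   clear_bit(KVM_SDEI_EVENT_BIT_UNREGISTER_PENDING, &event->state);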

>> +/* APIs */
>> +int kvm_sdei_call(struct kvm_vcpu *vcpu);
>> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
>> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>> +
>> +#endif /* __ARM64_KVM_SDEI_H__ */
>> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
>> index 261644b1a6bb..d6ced92ae3f0 100644
>> --- a/arch/arm64/kvm/Makefile
>> +++ b/arch/arm64/kvm/Makefile
>> @@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>>   	 inject_fault.o va_layout.o handle_exit.o \
>>   	 guest.o debug.o reset.o sys_regs.o \
>>   	 vgic-sys-reg-v3.o fpsimd.o pmu.o pkvm.o \
>> -	 arch_timer.o trng.o vmid.o \
>> +	 arch_timer.o trng.o vmid.o sdei.o \
>>   	 vgic/vgic.o vgic/vgic-init.o \
>>   	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
>>   	 vgic/vgic-v3.o vgic/vgic-v4.o \
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 523bc934fe2f..227c0e390571 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -38,6 +38,7 @@
>>   #include <asm/kvm_asm.h>
>>   #include <asm/kvm_mmu.h>
>>   #include <asm/kvm_emulate.h>
>> +#include <asm/kvm_sdei.h>
>>   #include <asm/sections.h>
>>   
>>   #include <kvm/arm_hypercalls.h>
>> @@ -331,6 +332,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>>   
>>   	kvm_arm_pvtime_vcpu_init(&vcpu->arch);
>>   
>> +	kvm_sdei_create_vcpu(vcpu);
>> +
>>   	vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
>>   
>>   	err = kvm_vgic_vcpu_init(vcpu);
>> @@ -352,6 +355,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>>   	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>>   	kvm_timer_vcpu_terminate(vcpu);
>>   	kvm_pmu_vcpu_destroy(vcpu);
>> +	kvm_sdei_destroy_vcpu(vcpu);
>>   
>>   	kvm_arm_vcpu_destroy(vcpu);
>>   }
>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>> index b659387d8919..6aa027a4cee8 100644
>> --- a/arch/arm64/kvm/hypercalls.c
>> +++ b/arch/arm64/kvm/hypercalls.c
>> @@ -5,6 +5,7 @@
>>   #include <linux/kvm_host.h>
>>   
>>   #include <asm/kvm_emulate.h>
>> +#include <asm/kvm_sdei.h>
>>   
>>   #include <kvm/arm_hypercalls.h>
>>   #include <kvm/arm_psci.h>
>> @@ -93,6 +94,8 @@ static int kvm_hvc_standard(struct kvm_vcpu *vcpu, u32 func)
>>   	case PSCI_1_1_FN_SYSTEM_RESET2:
>>   	case PSCI_1_1_FN64_SYSTEM_RESET2:
>>   		return kvm_psci_call(vcpu);
>> +	case SDEI_1_0_FN_SDEI_VERSION ... SDEI_1_1_FN_SDEI_FEATURES:
>> +		return kvm_sdei_call(vcpu);
> 
> I mentioned in another thread, but reraising here on the new diff.
> Prefer using the defined function [start, end] range in this switch
> statement.
> 
> Overall, I think this still puts a lot of abstraction around the concept
> of SDEI events, even though we have a very narrow use case for it in KVM
> for now. Removing all of the plumbing for critical and shared events
> should help collapse this quite a bit.
> 

Yeah, I will define two macros in include/linux/arm-smccc.h, similar to
what you suggested for PATCH[02].

   #define SMCCC_STD_SDEI_RANGE_START	0x0020
   #define SMCCC_STD_SDEI_RANGE_END	0x003f
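
With these in place, the standard service switch from PATCH[02] gains
one more range-based case, something like (a sketch on top of the
ARM_SMCCC_FUNC_NUM() routing discussed there):

   switch (ARM_SMCCC_FUNC_NUM(func)) {
   case SMCCC_STD_SDEI_RANGE_START ... SMCCC_STD_SDEI_RANGE_END:
   	return kvm_sdei_call(vcpu);
   ...
   }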

The support for the shared event was removed from v5 to v6. However, the
@route_{mode, affinity} fields in struct kvm_sdei_event should have been
dropped as well.

For the critical event support, I think it would be nice to keep it as I
explained above. First of all, the async PF event is naturally a critical
event. Secondly, the support doesn't require any VM-scoped properties.
So the SDEI event and (vcpu) context can be accessed and migrated through
firmware pseudo-registers easily in future.

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-23 14:18       ` Gavin Shan
@ 2022-04-23 18:43         ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-23 18:43 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Sat, Apr 23, 2022 at 10:18:49PM +0800, Gavin Shan wrote:
> Hi Oliver,
> 
> On 4/23/22 5:48 AM, Oliver Upton wrote:
> > On Sun, Apr 03, 2022 at 11:38:56PM +0800, Gavin Shan wrote:
> > > Software Delegated Exception Interface (SDEI) provides a mechanism
> > > for registering and servicing system events, as defined by ARM DEN0054C
> > > specification. One of these events will be used by Asynchronous Page
> > > Fault (Async PF) to deliver notifications from host to guest.
> > > 
> > > The events are classified into shared and private ones according to
> > > their scopes. The shared events are system or VM scoped, but the
> > > private events are CPU or VCPU scoped. The shared events can be
> > > registered, enabled, unregistered and reset through hypercalls
> > > issued from any VCPU. However, the private events are registered,
> > > enabled, unregistered and reset on the calling VCPU through
> > > hypercalls. Besides, the events are also classified into critical
> > > and normal events according their priority. During event delivery
> > > and handling, the normal event can be preempted by another critical
> > > event, but not in reverse way. The critical event is never preempted
> > > by another normal event.
> > 
> > We don't have any need for critical events though, right? We should avoid
> > building out the plumbing around the concept of critical events until
> > there is an actual use case for it.
> > 
> 
> The Async PF one is a critical event, as the guest needs to handle it immediately.

But that's the sticking point for me. IIUC, we're going to deliver an
async PF SDEI event to the PE that is waiting on a page so it can go do
something else and wait for the page to come in. Normal events preempt
~everything, critical events preempt even normal events.

How can the guest context switch and do something better at an arbitrary
instruction boundary (such as in an SDEI handler of normal priority)? If
a guest takes a page fault in that context, it may as well wait
synchronously for the page to come in.

And in the case of the page ready event, we still need to clean up shop
before switching to the unblocked context.

> Otherwise, it's possible that the guest can't continue its execution. Besides,
> the software signaled event (0x0) is a normal event. They're the only two
> events to be supported, and I assume the software signaled event (0x0) is only
> used by selftest/kvm. So the Async PF one becomes the only event and it can be
> in normal priority until another SDEI event needs to be added and supported.

I believe there are multiple use cases for guest-initiated SDEI events
beyond just testing. Poking a hung PE is but one example.

> However, the logic to support critical/normal events has been here. So
> I think it's probably nice to keep it. At least, it makes it easier to
> add a new SDEI event in the future. We dropped the support for the shared
> event from v5 to v6; I think we will probably never need a shared event
> ever :)

But then we're sprinkling a lot of dead code throughout KVM, right? It
makes KVM's job even easier if it doesn't have to worry about nesting
SDEI events.
> > > +struct kvm_sdei_exposed_event {
> > > +	unsigned int	num;
> > > +	unsigned char	type;
> > > +	unsigned char	signaled;
> > 
> > what is this used for?
> > 
> 
> It indicates whether the event can be raised by software or not. The
> events exposed by KVM should all be raisable by software, so this should
> always be true.

Isn't there always going to be some piece of software that raises an
event?

For KVM, we have guest-initiated 'software-signaled' events and KVM-initiated
async PF (whatever else may follow as well).

> > Do we need this if we disallow nesting events?
> > 
> 
> Yes, we need this. "event == NULL" is used as an indication of an invalid
> context. @event is the associated SDEI event when the context is
> valid.

What if we use some other plumbing to indicate the state of the vCPU? MP
state comes to mind, for example.

> > > +/*
> > > + * According to the SDEI specification (v1.1), the event number spans 32 bits
> > > + * and the lower 24 bits are used as the (real) event number. I don't
> > > + * think we can use that many event numbers in one system. So we reserve
> > > + * two bits from the 24-bit real event number, to indicate its type:
> > > + * physical or virtual event. One reserved bit is enough for now, but
> > > + * two bits are reserved for possible extension in the future.
> > > + *
> > > + * The physical events are owned by firmware while the virtual events
> > > + * are used by VMM and KVM.
> > 
> > Doesn't KVM own everything? I don't see how the guest could interact
> > with another SDEI implementation.
> > 
> 
> I might be overthinking the scheme. The host's firmware might have
> SDEI support and we may want to propagate events originating from the
> host's firmware to the guest. In this case, we need to distinguish the events
> originating from the host's firmware and KVM (the guest's firmware). Even if
> this case can never happen, I think it's still nice to distinguish
> the events originating from real firmware or KVM-emulated firmware.

The guest ABI w.r.t. SDEI is under full ownership of KVM. Any other
implementation's events will never get exposed to the guest.

Couldn't the guest own the host if it was talking to our firmware
anyway?

> > > + */
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT	22
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_MASK	(3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_PHYS	0
> > > +#define KVM_SDEI_EVENT_NUM_TYPE_VIRT	1
> > > +
> > > +static inline bool kvm_sdei_is_virtual(unsigned int num)
> > > +{
> > > +	unsigned int type;
> > > +
> > > +	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
> > > +	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
> > > +	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
> > > +		return true;
> > > +
> > > +	return false;
> > > +}
> > > +
> > > +static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
> > > +{
> > > +	return num == SDEI_SW_SIGNALED_EVENT;
> > > +}
> > 
> > Couldn't the caller just check the event number on their own?
> > 
> 
> It would be hard because the caller can be the guest. Generally, the
> event and its associated information/state are accessed by hypercalls,
> event injection and delivery, and migration (to be supported in the future).
> So I think it's good to check the event number ourselves.

What I'm saying is, can't the caller of kvm_sdei_is_sw_signaled() just
do the comparison?

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-23 18:43         ` Oliver Upton
@ 2022-04-24  3:00           ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-24  3:00 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 4/24/22 2:43 AM, Oliver Upton wrote:
> On Sat, Apr 23, 2022 at 10:18:49PM +0800, Gavin Shan wrote:
>> On 4/23/22 5:48 AM, Oliver Upton wrote:
>>> On Sun, Apr 03, 2022 at 11:38:56PM +0800, Gavin Shan wrote:
>>>> Software Delegated Exception Interface (SDEI) provides a mechanism
>>>> for registering and servicing system events, as defined by ARM DEN0054C
>>>> specification. One of these events will be used by Asynchronous Page
>>>> Fault (Async PF) to deliver notifications from host to guest.
>>>>
>>>> The events are classified into shared and private ones according to
>>>> their scopes. The shared events are system or VM scoped, but the
>>>> private events are CPU or VCPU scoped. The shared events can be
>>>> registered, enabled, unregistered and reset through hypercalls
>>>> issued from any VCPU. However, the private events are registered,
>>>> enabled, unregistered and reset on the calling VCPU through
>>>> hypercalls. Besides, the events are also classified into critical
>>>> and normal events according their priority. During event delivery
>>>> and handling, the normal event can be preempted by another critical
>>>> event, but not in reverse way. The critical event is never preempted
>>>> by another normal event.
>>>
>>> We don't have any need for critical events though, right? We should avoid
>>> building out the plumbing around the concept of critical events until
>>> there is an actual use case for it.
>>>
>>
>> The Async PF one is a critical event, as the guest needs to handle it immediately.
> 
> But that's the sticking point for me. IIUC, we're going to deliver an
> async PF SDEI event to the PE that is waiting on a page so it can go do
> something else and wait for the page to come in. Normal events preempt
> ~everything, critical events preempt even normal events.
> 
> How can the guest context switch and do something better at an arbitrary
> instruction boundary (such as in an SDEI handler of normal priority)? If
> a guest takes a page fault in that context, it may as well wait
> synchronously for the page to come in.
> 
> And in the case of the page ready event, we still need to clean up shop
> before switching to the unblocked context.
> 

Async PF works exactly like what you said. The normal event handler
should be executed in EL1. When the vcpu runs in EL1, no Async PF event will
be triggered, so a page fault in the normal event context is always resolved
synchronously. The page ready notification is delivered by PPI instead of
an SDEI event, but yes, we need to clean up before switching to the previously
suspended context.

>> Otherwise, it's possible that the guest can't continue its execution. Besides,
>> the software signaled event (0x0) is a normal event. They're the only two
>> events to be supported, and I assume the software signaled event (0x0) is only
>> used by selftest/kvm. So the Async PF one becomes the only event and it can be
>> in normal priority until another SDEI event needs to be added and supported.
> 
> I believe there are multiple use cases for guest-initiated SDEI events
> beyond just testing. Poking a hung PE is but one example.
> 

Right. I think we can drop support for the critical event. Let's do this
in the next respin. Prior to working on the next revision, I still want to
confirm the data structures with you. Please refer to the reply below
about the adjusted data structures.

>> However, the logic to support critical/normal events has been here. So
>> I think it's probably nice to keep it. At least, it makes it easier to
>> add a new SDEI event in the future. We dropped the support for the shared
>> event from v5 to v6; I think we will probably never need a shared event
>> ever :)
> 
> But then we're sprinkling a lot of dead code throughout KVM, right? It
> makes KVM's job even easier if it doesn't have to worry about nesting
> SDEI events.
>>>> +struct kvm_sdei_exposed_event {
>>>> +	unsigned int	num;
>>>> +	unsigned char	type;
>>>> +	unsigned char	signaled;
>>>
>>> what is this used for?
>>>
>>
>> It indicates whether the event can be raised by software or not. The
>> events exposed by KVM should all be raisable by software, so this should
>> always be true.
> 
> Isn't there always going to be some piece of software that raises an
> event?
> 
> For KVM, we have guest-initiated 'software-signaled' events and KVM-initiated
> async PF (whatever else may follow as well).
> 

Yes, the assumption that all events are always signaled by software should
be true. So this field (@signaled) can be dropped as well. I plan to
change the data structures as below, according to the suggestions given
by you. Please double check if anything is missed.

(1) Those fields of struct kvm_sdei_exposed_event are dropped or merged
     to struct kvm_sdei_event.

     struct kvm_sdei_event {
            unsigned int          num;
            unsigned long         ep_addr;
            unsigned long         ep_arg;
#define KVM_SDEI_EVENT_STATE_REGISTERED         0
#define KVM_SDEI_EVENT_STATE_ENABLED            1
#define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING 2
            unsigned long         state;                 /* accessed by {test,set,clear}_bit() */
            unsigned long         event_count;
     };

(2) In arch/arm64/kvm/sdei.c

     static struct kvm_sdei_event exposed_events[] = {
            { .num = SDEI_SW_SIGNALED_EVENT },
     };

(3) In arch/arm64/kvm/sdei.c::kvm_sdei_create_vcpu(), the SDEI events
     are instantiated based on @exposed_events[]. It's just what we're
     doing and nothing is changed.
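
For illustration only, the instantiation could be a simple copy of the
template into each vCPU (a minimal sketch; the vcpu->arch.sdei_events
field and the exact signature are hypothetical):

     int kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
     {
            struct kvm_sdei_event *events;

            /* Each vCPU gets its own private copy of the exposed events */
            events = kmemdup(exposed_events, sizeof(exposed_events),
                             GFP_KERNEL_ACCOUNT);
            if (!events)
                   return -ENOMEM;

            vcpu->arch.sdei_events = events;
            return 0;
     }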

>>> Do we need this if we disallow nesting events?
>>>
>>
>> Yes, we need this. "event == NULL" is used as an indication of an invalid
>> context. @event is the associated SDEI event when the context is
>> valid.
> 
> What if we use some other plumbing to indicate the state of the vCPU? MP
> state comes to mind, for example.
> 

Even if the indication is done by another state, kvm_sdei_vcpu_context still
needs to be linked (associated) with the event. After the vCPU context becomes
valid once the event is delivered, we still need to know the associated
event when some hypercalls are triggered. SDEI_1_0_FN_SDEI_EVENT_COMPLETE
is one example: we need to decrease struct kvm_sdei_event::event_count
for the hypercall.

There are several options I can think of for now. Please let me know your
preference.

(1) Rename 'struct kvm_sdei_event *event' to 'unsigned int num'. We use
     the check of '@num == KVM_SDEI_INVALID_NUM' as the indication.
     KVM_SDEI_INVALID_NUM is defined as -1U. With the change, no pointer
     is left in kvm_sdei_vcpu_context.

(2) Add field of 'struct kvm_sdei_event *current_event' to kvm_sdei_vcpu,
     to associate the event with the vCPU context. We still use the
     check of '@current_event == NULL' as the indication.

(3) Apply the changes from both (1) and (2).

Besides, the data structures need some adjustment, as you suggested
previously. The major changes are to drop @lock and the critical event support.

    struct kvm_sdei_vcpu_context {
         struct kvm_sdei_event   *event;                      /* needs your confirmation */
         unsigned long           regs[18];
         unsigned long           pc;
         unsigned long           pstate;
    };

    struct kvm_sdei_vcpu {
         struct kvm_sdei_event           *events;            /* instantiated from @exposed_events[] */
         unsigned char                   masked;             /* vCPU is masked off or not           */
         unsigned long                   event_count;        /* Total count of pending events       */
         struct kvm_sdei_vcpu_context    context;            /* vCPU context for SDEI event         */
    };
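
For what it's worth, here is a sketch of how SDEI_1_0_FN_SDEI_EVENT_COMPLETE
would use that association (the helper name and the vcpu->arch.sdei field
are hypothetical, and the register restore is elided):

    static unsigned long hypercall_complete(struct kvm_vcpu *vcpu)
    {
         struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
         struct kvm_sdei_event *event = vsdei->context.event;

         if (!event)
              return SDEI_DENIED;         /* no event handler is active */

         event->event_count--;
         /* restore regs[0..17], pc and pstate from vsdei->context here */
         vsdei->context.event = NULL;     /* the context is invalid again */

         return SDEI_SUCCESS;
    }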

>>>> +/*
>>>> + * According to the SDEI specification (v1.1), the event number spans 32 bits
>>>> + * and the lower 24 bits are used as the (real) event number. I don't
>>>> + * think we can use that many event numbers in one system. So we reserve
>>>> + * two bits from the 24-bit real event number, to indicate its type:
>>>> + * physical or virtual event. One reserved bit is enough for now, but
>>>> + * two bits are reserved for possible extension in the future.
>>>> + *
>>>> + * The physical events are owned by firmware while the virtual events
>>>> + * are used by VMM and KVM.
>>>
>>> Doesn't KVM own everything? I don't see how the guest could interact
>>> with another SDEI implementation.
>>>
>>
>> I might be overthinking the scheme. The host's firmware might have
>> SDEI support and we may want to propagate events originating from the
>> host's firmware to the guest. In this case, we need to distinguish the events
>> originating from the host's firmware and KVM (the guest's firmware). Even if
>> this case can never happen, I think it's still nice to distinguish
>> the events originating from real firmware or KVM-emulated firmware.
> 
> The guest ABI w.r.t. SDEI is under full ownership of KVM. Any other
> implementation's events will never get exposed to the guest.
> 
> Couldn't the guest own the host if it was talking to our firmware
> anyway?
> 

Right. Let's drop these macros and kvm_sdei_is_virtual() in the next respin.
As you suggested, we need to iterate struct kvm_sdei_vcpu::events to
see if the event (number) is valid or not.

>>>> + */
>>>> +#define KVM_SDEI_EVENT_NUM_TYPE_SHIFT	22
>>>> +#define KVM_SDEI_EVENT_NUM_TYPE_MASK	(3 << KVM_SDEI_EVENT_NUM_TYPE_SHIFT)
>>>> +#define KVM_SDEI_EVENT_NUM_TYPE_PHYS	0
>>>> +#define KVM_SDEI_EVENT_NUM_TYPE_VIRT	1
>>>> +
>>>> +static inline bool kvm_sdei_is_virtual(unsigned int num)
>>>> +{
>>>> +	unsigned int type;
>>>> +
>>>> +	type = (num & KVM_SDEI_EVENT_NUM_TYPE_MASK) >>
>>>> +	       KVM_SDEI_EVENT_NUM_TYPE_SHIFT;
>>>> +	if (type == KVM_SDEI_EVENT_NUM_TYPE_VIRT)
>>>> +		return true;
>>>> +
>>>> +	return false;
>>>> +}
>>>> +
>>>> +static inline bool kvm_sdei_is_sw_signaled(unsigned int num)
>>>> +{
>>>> +	return num == SDEI_SW_SIGNALED_EVENT;
>>>> +}
>>>
>>> Couldn't the caller just check the event number on their own?
>>>
>>
>> It would be hard because the caller can be the guest. Generally, the
>> event and its associated information/state are accessed by hypercalls,
>> event injection and delivery, and migration (to be supported in the future).
>> So I think it's good to check the event number ourselves.
> 
> What I'm saying is, can't the caller of kvm_sdei_is_sw_signaled() just
> do the comparison?
> 

The only caller of kvm_sdei_is_sw_signaled() is hypercall_signal(), so
let's drop kvm_sdei_is_sw_signaled() and do the comparison in hypercall_signal()
in the next respin.


Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-24  3:00           ` Gavin Shan
@ 2022-04-28 20:28             ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-28 20:28 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Gavin,

On Sun, Apr 24, 2022 at 11:00:56AM +0800, Gavin Shan wrote:

[...]

> Yes, the assumption that all events are always signaled by software should
> be true. So this field (@signaled) can be dropped as well. I plan to
> change the data structures as below, according to the suggestions given
> by you. Please double check if anything is missed.
> 
> (1) Those fields of struct kvm_sdei_exposed_event are dropped or merged
>     to struct kvm_sdei_event.
> 
>     struct kvm_sdei_event {
>            unsigned int          num;
>            unsigned long         ep_addr;
>            unsigned long         ep_arg;
> #define KVM_SDEI_EVENT_STATE_REGISTERED         0
> #define KVM_SDEI_EVENT_STATE_ENABLED            1
> #define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING 2
>            unsigned long         state;                 /* accessed by {test,set,clear}_bit() */
>            unsigned long         event_count;
>     };
> 
> (2) In arch/arm64/kvm/sdei.c
> 
>     static struct kvm_sdei_event exposed_events[] = {
>            { .num = SDEI_SW_SIGNALED_EVENT },
>     };
> 
> (3) In arch/arm64/kvm/sdei.c::kvm_sdei_create_vcpu(), the SDEI events
>     are instantiated based on @exposed_events[]. It's just what we're
>     doing and nothing is changed.

The part I find troubling is the fact that we are treating SDEI events
as a list-like thing. If we want to behave more like hardware, why can't
we track the state of an event in bitmaps? There are three bits of
relevant state for any given event in the context of a vCPU: registered,
enabled, and pending.

I'm having some second thoughts about the suggestion to use MP state for
this, given that we need to represent a few bits of state for the vCPU
as well. Seems we need to track the mask state of a vCPU and a bit to
indicate whether an SDEI handler is active. You could put these bits in
kvm_vcpu_arch::flags, actually.

So maybe it could be organized like so:

  /* bits for the bitmaps below */
  enum kvm_sdei_event {
  	KVM_SDEI_EVENT_SW_SIGNALED = 0,
	KVM_SDEI_EVENT_ASYNC_PF,
	...
	NR_KVM_SDEI_EVENTS,
  };

  struct kvm_sdei_event_handler {
  	unsigned long ep_addr;
	unsigned long ep_arg;
  };

  struct kvm_sdei_event_context {
  	unsigned long pc;
	unsigned long pstate;
	unsigned long regs[18];
  };

  struct kvm_sdei_vcpu {
  	unsigned long registered;
	unsigned long enabled;
	unsigned long pending;

	struct kvm_sdei_event_handler handlers[NR_KVM_SDEI_EVENTS];
	struct kvm_sdei_event_context ctxt;
  };
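
For instance, registration could be driven by those bitmaps along these
lines (a rough sketch; hypercall_register() and the vcpu->arch.sdei field
are hypothetical, and the smccc_get_arg*() accessors come from earlier in
the series):

  static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
  {
  	struct kvm_sdei_vcpu *vsdei = &vcpu->arch.sdei;
  	unsigned long event = smccc_get_arg1(vcpu);

  	if (event >= NR_KVM_SDEI_EVENTS)
  		return SDEI_INVALID_PARAMETERS;
  	if (test_bit(event, &vsdei->registered))
  		return SDEI_DENIED;

  	vsdei->handlers[event].ep_addr = smccc_get_arg2(vcpu);
  	vsdei->handlers[event].ep_arg = smccc_get_arg3(vcpu);
  	set_bit(event, &vsdei->registered);

  	return SDEI_SUCCESS;
  }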

But it is hard to really talk about these data structures w/o a feel for
the mechanics of working the series around it.

> > > > Do we need this if we disallow nesting events?
> > > > 
> > > 
> > > Yes, we need this. "event == NULL" is used as an indication of an invalid
> > > context. @event is the associated SDEI event when the context is
> > > valid.
> > 
> > What if we use some other plumbing to indicate the state of the vCPU? MP
> > state comes to mind, for example.
> > 
> 
> Even if the indication is done by another state, kvm_sdei_vcpu_context still
> needs to be linked (associated) with the event. After the vCPU context becomes
> valid once the event is delivered, we still need to know the associated
> event when some hypercalls are triggered. SDEI_1_0_FN_SDEI_EVENT_COMPLETE
> is one example: we need to decrease struct kvm_sdei_event::event_count
> for the hypercall.

Why do we need to keep track of how many times an event has been
signaled? Nothing in SDEI seems to suggest that the number of event
signals corresponds to the number of times the handler is invoked. In
fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:

"""
The event has edgetriggered semantics and the number of event signals
may not correspond to the number of times the handler is invoked in the
target PE.
"""

DEN0054C 5.1.16.1

So perhaps we queue at most 1 pending event for the guest.
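
Something like this, as a sketch (@event and @vsdei as in the bitmap
layout above):

  /* Edge-triggered: repeated signals collapse into a single pending bit */
  if (test_and_set_bit(event, &vsdei->pending))
  	return SDEI_SUCCESS;	/* already pending, signals coalesce */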

I'd also like to see if anyone else has thoughts on the topic, as I'd
hate for you to go back to the whiteboard again in the next spin.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-28 20:28             ` Oliver Upton
@ 2022-04-30 11:38               ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-30 11:38 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 4/29/22 4:28 AM, Oliver Upton wrote:
> On Sun, Apr 24, 2022 at 11:00:56AM +0800, Gavin Shan wrote:
> 
> [...]
> 
>> Yes, the assumption that all events are always signaled by software should
>> be true. So this field (@signaled) can be dropped as well. I plan to
>> change the data structures as below, according to the suggestions given
>> by you. Please double check if anything is missed.
>>
>> (1) Those fields of struct kvm_sdei_exposed_event are dropped or merged
>>      to struct kvm_sdei_event.
>>
>>      struct kvm_sdei_event {
>>             unsigned int          num;
>>             unsigned long         ep_addr;
>>             unsigned long         ep_arg;
>> #define KVM_SDEI_EVENT_STATE_REGISTERED         0
>> #define KVM_SDEI_EVENT_STATE_ENABLED            1
>> #define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING 2
>>             unsigned long         state;                 /* accessed by {test,set,clear}_bit() */
>>             unsigned long         event_count;
>>      };
>>
>> (2) In arch/arm64/kvm/sdei.c
>>
>>      static struct kvm_sdei_event exposed_events[] = {
>>             { .num = SDEI_SW_SIGNALED_EVENT },
>>      };
>>
>> (3) In arch/arm64/kvm/sdei.c::kvm_sdei_create_vcpu(), the SDEI events
>>      are instantiated based on @exposed_events[]. It's just what we're
>>      doing and nothing is changed.
> 
> The part I find troubling is the fact that we are treating SDEI events
> as a list-like thing. If we want to behave more like hardware, why can't
> we track the state of an event in bitmaps? There are three bits of
> relevant state for any given event in the context of a vCPU: registered,
> enabled, and pending.
> 
> I'm having some second thoughts about the suggestion to use MP state for
> this, given that we need to represent a few bits of state for the vCPU
> as well. Seems we need to track the mask state of a vCPU and a bit to
> indicate whether an SDEI handler is active. You could put these bits in
> kvm_vcpu_arch::flags, actually.
> 
> So maybe it could be organized like so:
> 
>    /* bits for the bitmaps below */
>    enum kvm_sdei_event {
>    	KVM_SDEI_EVENT_SW_SIGNALED = 0,
> 	KVM_SDEI_EVENT_ASYNC_PF,
> 	...
> 	NR_KVM_SDEI_EVENTS,
>    };
> 
>    struct kvm_sdei_event_handler {
>    	unsigned long ep_addr;
> 	unsigned long ep_arg;
>    };
> 
>    struct kvm_sdei_event_context {
>    	unsigned long pc;
> 	unsigned long pstate;
> 	unsigned long regs[18];
>    };
> 
>    struct kvm_sdei_vcpu {
>    	unsigned long registered;
> 	unsigned long enabled;
> 	unsigned long pending;
> 
> 	struct kvm_sdei_event_handler handlers[NR_KVM_SDEI_EVENTS];
> 	struct kvm_sdei_event_context ctxt;
>    };
> 
> But it is hard to really talk about these data structures w/o a feel for
> the mechanics of working the series around it.
> 

Thank you for the comments and details. It should work by using bitmaps
to represent the event states. I will adopt your proposed structs in the next
respin. However, there are more states needed. So I would adjust
"struct kvm_sdei_vcpu" like below in the next respin.

     struct kvm_sdei_vcpu {
         unsigned long registered;    /* the event is registered or not                 */
         unsigned long enabled;       /* the event is enabled or not                    */
         unsigned long unregistering; /* the event is pending for unregistration        */
         unsigned long pending;       /* the event is pending for delivery and handling */
         unsigned long active;        /* the event is currently being handled           */

         :
         <this part is just like what you suggested>
     };

I renamed @pending to @unregistering. Besides, there are two new states:

    @pending: Indicates that an event has been injected. The next step
              for the event is to deliver it for handling. For one particular
              event, we allow at most one pending instance.
    @active:  Indicates the event is currently being handled. The information
              stored in the 'struct kvm_sdei_event_context' instance can be
              correlated with the event.

Furthermore, it's fair enough to put the (vcpu) mask state into the 'flags'
field of struct kvm_vcpu_arch :)
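
As a concrete example of how those bitmaps would interact, the delivery
path could look roughly like below (a sketch only; kvm_sdei_vcpu_masked()
and the vcpu->arch.sdei field are hypothetical):

     static void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
     {
          struct kvm_sdei_vcpu *vsdei = &vcpu->arch.sdei;
          unsigned long pending = vsdei->pending & vsdei->registered &
                                  vsdei->enabled & ~vsdei->active;
          unsigned long event;

          if (!pending || kvm_sdei_vcpu_masked(vcpu))
               return;

          event = find_first_bit(&pending, NR_KVM_SDEI_EVENTS);
          clear_bit(event, &vsdei->pending);
          set_bit(event, &vsdei->active);

          /* save regs[0..17], pc and pstate into vsdei->ctxt, then branch
           * the vCPU to handlers[event].ep_addr with ep_arg in x1 */
     }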

>>>>> Do we need this if we disallow nesting events?
>>>>>
>>>>
>>>> Yes, we need this. "event == NULL" is used as an indication of an invalid
>>>> context. @event is the associated SDEI event when the context is
>>>> valid.
>>>
>>> What if we use some other plumbing to indicate the state of the vCPU? MP
>>> state comes to mind, for example.
>>>
>>
>> Even if the indication is done by another state, kvm_sdei_vcpu_context still
>> needs to be linked (associated) with the event. After the vCPU context becomes
>> valid once the event is delivered, we still need to know the associated
>> event when some hypercalls are triggered. SDEI_1_0_FN_SDEI_EVENT_COMPLETE
>> is one example: we need to decrease struct kvm_sdei_event::event_count
>> for the hypercall.
> 
> Why do we need to keep track of how many times an event has been
> signaled? Nothing in SDEI seems to suggest that the number of event
> signals corresponds to the number of times the handler is invoked. In
> fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
> 
> """
> The event has edgetriggered semantics and the number of event signals
> may not correspond to the number of times the handler is invoked in the
> target PE.
> """
> 
> DEN0054C 5.1.16.1
> 
> So perhaps we queue at most 1 pending event for the guest.
> 
> I'd also like to see if anyone else has thoughts on the topic, as I'd
> hate for you to go back to the whiteboard again in the next spin.
> 

Agreed. In the next respin, we will have at most one pending instance per
event. An error can be returned if the user attempts to inject an event whose
pending state (struct kvm_sdei_vcpu::pending) has already been set.

Indeed, the hardest part is to determine the data structures and
functions we need. Oliver, your valuable comments are helping to
bring this series onto the right track. However, I do think it would be
helpful if somebody else could confirm the outcomes of the previous
discussions. I'm not sure if Marc has time for a quick scan and to
provide comments.

I would summarize the outcomes from our discussions, to help Marc
or others to confirm:

- Drop support for the shared event.
- Drop support for the critical event.
- The events in the implementations are all private and can be signaled
   (raised) by software.
- Drop migration support for now; we will consider supporting it using
   pseudo firmware registers. So add-on patches are expected to support
   migration in the future.
- Drop locking mechanism. All the functions are executed in vcpu context.
- Use the data structs as you suggested. Besides, the vcpu's mask
   state is put into struct kvm_vcpu_arch::flags.
   enum kvm_sdei_event
   struct kvm_sdei_event_handler
   struct kvm_sdei_event_context
   struct kvm_sdei_vcpu

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
@ 2022-04-30 11:38               ` Gavin Shan
  0 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-04-30 11:38 UTC (permalink / raw)
  To: Oliver Upton
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

Hi Oliver,

On 4/29/22 4:28 AM, Oliver Upton wrote:
> On Sun, Apr 24, 2022 at 11:00:56AM +0800, Gavin Shan wrote:
> 
> [...]
> 
>> Yes, The assumption that all events are always singled by software should
>> be true. So this field (@signaled) can be dropped either. So I plan to
>> change the data structures like below, according to the suggestions given
>> by you. Please double check if there are anything missed.
>>
>> (1) Those fields of struct kvm_sdei_exposed_event are dropped or merged
>>      to struct kvm_sdei_event.
>>
>>      struct kvm_sdei_event {
>>             unsigned int          num;
>>             unsigned long         ep_addr;
>>             unsigned long         ep_arg;
>> #define KVM_SDEI_EVENT_STATE_REGISTERED         0
>> #define KVM_SDEI_EVENT_STATE_ENABLED            1
>> #define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING 2
>>             unsigned long         state;                 /* accessed by {test,set,clear}_bit() */
>>             unsigned long         event_count;
>>      };
>>
>> (2) In arch/arm64/kvm/sdei.c
>>
>>      static kvm_sdei_event exposed_events[] = {
>>             { .num = SDEI_SW_SIGNALED_EVENT },
>>      };
>>
>> (3) In arch/arm64/kvm/sdei.c::kvm_sdei_create_vcpu(), the SDEI events
>>      are instantiated based on @exposed_events[]. It's just what we're
>>      doing and nothing is changed.
> 
> The part I find troubling is the fact that we are treating SDEI events
> as a list-like thing. If we want to behave more like hardware, why can't
> we track the state of an event in bitmaps? There are three bits of
> relevant state for any given event in the context of a vCPU: registered,
> enabled, and pending.
> 
> I'm having some second thoughts about the suggestion to use MP state for
> this, given that we need to represent a few bits of state for the vCPU
> as well. Seems we need to track the mask state of a vCPU and a bit to
> indicate whether an SDEI handler is active. You could put these bits in
> kvm_vcpu_arch::flags, actually.
> 
> So maybe it could be organized like so:
> 
>    /* bits for the bitmaps below */
>    enum kvm_sdei_event {
>    	KVM_SDEI_EVENT_SW_SIGNALED = 0,
> 	KVM_SDEI_EVENT_ASYNC_PF,
> 	...
> 	NR_KVM_SDEI_EVENTS,
>    };
> 
>    struct kvm_sdei_event_handler {
>    	unsigned long ep_addr;
> 	unsigned long ep_arg;
>    };
> 
>    struct kvm_sdei_event_context {
>    	unsigned long pc;
> 	unsigned long pstate;
> 	unsigned long regs[18];
>    };
> 
>    struct kvm_sdei_vcpu {
>    	unsigned long registered;
> 	unsigned long enabled;
> 	unsigned long pending;
> 
> 	struct kvm_sdei_event_handler handlers[NR_KVM_SDEI_EVENTS];
> 	struct kvm_sdei_event_context ctxt;
>    };
> 
> But it is hard to really talk about these data structures w/o a feel for
> the mechanics of working the series around it.
> 

Thank you for the comments and details. It should work by using bitmaps
to represent event's states. I will adopt your proposed structs in next
respin. However, there are more states needed. So I would adjust
"struct kvm_sdei_vcpu" like below in next respin.

     struct kvm_sdei_vcpu {
         unsigned long registered;    /* the event is registered or not                 */
         unsigned long enabled;       /* the event is enabled or not                    */
         unsigned long unregistering; /* the event is pending for unregistration        */
         unsigned long pending;       /* the event is pending for delivery and handling */
         unsigned long active;        /* the event is currently being handled           */

         :
         <this part is just like what you suggested>
     };

I rename @pending to @unregister. Besides, there are two states added:

    @pending: Indicate there has one event has been injected. The next step
              for the event is to deliver it for handling. For one particular
              event, we allow one pending event in the maximum.
    @active:  Indicate the event is currently being handled. The information
              stored in 'struct kvm_sdei_event_context' instance can be
              correlated with the event.

Furthermore, it's fair enough to put the (vcpu) mask state into 'flags'
field of struct kvm_vcpu_arch :)

>>>>> Do we need this if we disallow nesting events?
>>>>>
>>>>
>>>> Yes, we need this. "event == NULL" is used as indication of invalid
>>>> context. @event is the associated SDEI event when the context is
>>>> valid.
>>>
>>> What if we use some other plumbing to indicate the state of the vCPU? MP
>>> state comes to mind, for example.
>>>
>>
>> Even if the indication is done by another state, kvm_sdei_vcpu_context
>> still needs to be linked (associated) with the event. Once the vCPU
>> context becomes valid after the event is delivered, we still need to know
>> the associated event when some of the hypercalls are triggered.
>> SDEI_1_0_FN_SDEI_EVENT_COMPLETE is one of the examples: we need to
>> decrease struct kvm_sdei_event::event_count for the hypercall.
> 
> Why do we need to keep track of how many times an event has been
> signaled? Nothing in SDEI seems to suggest that the number of event
> signals corresponds to the number of times the handler is invoked. In
> fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
> 
> """
> The event has edgetriggered semantics and the number of event signals
> may not correspond to the number of times the handler is invoked in the
> target PE.
> """
> 
> DEN0054C 5.1.16.1
> 
> So perhaps we queue at most 1 pending event for the guest.
> 
> I'd also like to see if anyone else has thoughts on the topic, as I'd
> hate for you to go back to the whiteboard again in the next spin.
> 

Agreed. In the next respin, we will have one pending event at most. An
error can be returned if the user attempts to inject an event whose
pending state (struct kvm_sdei_vcpu::pending) has already been set.
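
Just to make the semantics concrete, the injection path would be
something like below (kvm_sdei_inject() is an invented name; it assumes
the bitmap-based struct above):

    static int kvm_sdei_inject(struct kvm_vcpu *vcpu, unsigned int num)
    {
            struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;

            if (!test_bit(num, &vsdei->registered) ||
                !test_bit(num, &vsdei->enabled))
                    return -ENOENT;

            /* Atomic; injecting an already-pending event finds the bit
             * set, so at most one instance is ever queued. */
            if (test_and_set_bit(num, &vsdei->pending))
                    return -EBUSY;

            return 0;
    }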

Indeed, the hardest part is to determine the data structures and
functions we need. Oliver, your valuable comments are helping to
bring this series onto the right track. However, I do think it would
be helpful if somebody else could confirm the outcomes of the previous
discussions. I'm not sure if Marc has time for a quick scan and to
provide comments.

I would summarize the outcomes from our discussions, to help Marc
or others to confirm:

- Drop support for the shared event.
- Drop support for the critical event.
- The events in the implementation are all private and can be signaled
   (raised) by software.
- Drop migration support for now; we will consider supporting it via
   pseudo firmware registers. So add-on patches are expected to support
   migration in the future.
- Drop the locking mechanism. All the functions are executed in vcpu context.
- Use the data structs as you suggested. Besides, the vcpu's mask
   state is put into struct kvm_vcpu_arch::flags.
   enum kvm_sdei_event
   struct kvm_sdei_event_handler
   struct kvm_sdei_event_context
   struct kvm_sdei_vcpu

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-30 11:38               ` Gavin Shan
@ 2022-04-30 14:16                 ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-30 14:16 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Gavin,

On Sat, Apr 30, 2022 at 07:38:29PM +0800, Gavin Shan wrote:
> Thank you for the comments and details. It should work by using bitmaps
> to represent the event states. I will adopt your proposed structs in the
> next respin. However, more states are needed, so I would adjust
> "struct kvm_sdei_vcpu" like below in the next respin.
> 
>     struct kvm_sdei_vcpu {
>         unsigned long registered;    /* the event is registered or not                 */
>         unsigned long enabled;       /* the event is enabled or not                    */
>         unsigned long unregistering; /* the event is pending for unregistration        */

I'm not following why we need to keep track of the 'pending unregister'
state directly. Is it not possible to infer from (active && !registered)?

>         unsigned long pending;       /* the event is pending for delivery and handling */
>         unsigned long active;        /* the event is currently being handled           */
> 
>         :
>         <this part is just like what you suggested>
>     };
> 
> I renamed @pending to @unregistering. Besides, two states are added:
> 
>    @pending: Indicates an event has been injected. The next step for the
>              event is to deliver it for handling. For one particular
>              event, we allow at most one pending instance.

Right, if an event retriggers when it is pending we still dispatch a
single event to the guest. And since we're only doing normal priority
events, it is entirely implementation defined which gets dispatched
first.

>    @active:  Indicates the event is currently being handled. The information
>              stored in the 'struct kvm_sdei_event_context' instance can be
>              correlated with the event.

Does this need to be a bitmap though? We can't ever have more than one
SDEI event active at a time since this is private to a vCPU.

> Furthermore, it's fair enough to put the (vcpu) mask state into the
> 'flags' field of struct kvm_vcpu_arch :)

I think you can get away with putting active in there too, I don't see
why we need more than a single bit for this info.

> > > > > > Do we need this if we disallow nesting events?
> > > > > > 
> > > > > 
> > > > > Yes, we need this. "event == NULL" is used as indication of invalid
> > > > > context. @event is the associated SDEI event when the context is
> > > > > valid.
> > > > 
> > > > What if we use some other plumbing to indicate the state of the vCPU? MP
> > > > state comes to mind, for example.
> > > > 
> > > 
> > > Even if the indication is done by another state, kvm_sdei_vcpu_context
> > > still needs to be linked (associated) with the event. Once the vCPU
> > > context becomes valid after the event is delivered, we still need to know
> > > the associated event when some of the hypercalls are triggered.
> > > SDEI_1_0_FN_SDEI_EVENT_COMPLETE is one of the examples: we need to
> > > decrease struct kvm_sdei_event::event_count for the hypercall.
> > 
> > Why do we need to keep track of how many times an event has been
> > signaled? Nothing in SDEI seems to suggest that the number of event
> > signals corresponds to the number of times the handler is invoked. In
> > fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
> > 
> > """
> > The event has edgetriggered semantics and the number of event signals
> > may not correspond to the number of times the handler is invoked in the
> > target PE.
> > """
> > 
> > DEN0054C 5.1.16.1
> > 
> > So perhaps we queue at most 1 pending event for the guest.
> > 
> > I'd also like to see if anyone else has thoughts on the topic, as I'd
> > hate for you to go back to the whiteboard again in the next spin.
> > 
> 
> Agreed. In the next respin, we will have one pending event at most. An
> error can be returned if the user attempts to inject an event whose
> pending state (struct kvm_sdei_vcpu::pending) has already been set.

I don't believe we can do that. The SDEI_EVENT_SIGNAL call should succeed,
even if the event was already pending.

> Indeed, the hardest part is to determine the data structures and
> functions we need. Oliver, your valuable comments are helping to
> bring this series to the right track. However, I do think it's
> helpful if somebody else can confirm the outcomes from the previous
> discussions. I'm not sure if Marc has time for a quick scan and provide
> comments.
> 
> I would summarize the outcomes from our discussions, to help Marc
> or others to confirm:

Going to take a look at some of your later patches as well, just a heads
up.

> - Drop support for the shared event.
> - Drop support for the critical event.
> - The events in the implementation are all private and can be signaled
>   (raised) by software.
> - Drop migration support for now; we will consider supporting it via
>   pseudo firmware registers. So add-on patches are expected to support
>   migration in the future.

Migration will be supported in a future spin of this series, not a
subsequent one, right? :) I had just made the suggestion because there
were a lot of renovations that we were discussing.

> - Drop the locking mechanism. All the functions are executed in vcpu context.

Well, not entirely. Just need to make sure atomics are used to post
events to another vCPU in the case of SDEI_EVENT_SIGNAL.

set_bit() fits the bill here, as we've discussed.

> - Use the data structs as you suggested. Besides, the vcpu's mask
>   state is put into struct kvm_vcpu_arch::flags.
>   enum kvm_sdei_event
>   struct kvm_sdei_event_handler
>   struct kvm_sdei_event_context
>   struct kvm_sdei_vcpu
> 
> Thanks,
> Gavin
>

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 04/18] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
  2022-04-03 15:38   ` Gavin Shan
@ 2022-04-30 14:54     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-30 14:54 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Gavin,

On Sun, Apr 03, 2022 at 11:38:57PM +0800, Gavin Shan wrote:
> This supports the SDEI_EVENT_REGISTER hypercall, which is used by the
> guest to register an event. The event won't be raised until it's
> registered and enabled. KVM-owned events can't be registered if they
> aren't exposed.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/kvm/sdei.c | 78 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 78 insertions(+)
> 
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 3507e33ec00e..89c1b231cb60 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -25,6 +25,81 @@ static struct kvm_sdei_exposed_event exposed_events[] = {
>  	for (idx = 0, event = &exposed_events[0];	\
>  	     idx < ARRAY_SIZE(exposed_events);		\
>  	     idx++, event++)
> +#define kvm_sdei_for_each_event(vsdei, event, idx)	\
> +	for (idx = 0, event = &vsdei->events[0];	\
> +	     idx < ARRAY_SIZE(exposed_events);		\
> +	     idx++, event++)
> +
> +static struct kvm_sdei_event *find_event(struct kvm_vcpu *vcpu,
> +					 unsigned int num)
> +{
> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> +	struct kvm_sdei_event *event;
> +	int i;
> +
> +	kvm_sdei_for_each_event(vsdei, event, i) {
> +		if (event->exposed_event->num == num)
> +			return event;
> +	}
> +
> +	return NULL;
> +}

I imagine you'll drop this hunk in the next spin.

> +static unsigned long hypercall_register(struct kvm_vcpu *vcpu)

Hmm, hypercall_ is not a very descriptive scope. Could you instead do
something like kvm_sdei_?

so for this one, kvm_sdei_event_register()? Provides decent context
clues to connect back to the spec as well.

> +{
> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> +	struct kvm_sdei_event *event;
> +	unsigned int num = smccc_get_arg(vcpu, 1);
> +	unsigned long ep_address = smccc_get_arg(vcpu, 2);
> +	unsigned long ep_arg = smccc_get_arg(vcpu, 3);

We discussed using some structure to track the registered context of an
event. Maybe just build it on the stack then assign it in the array?

> +	unsigned long route_mode = smccc_get_arg(vcpu, 4);

This is really 'flags'. route_mode is bit[0]. I imagine we don't want to
support relative mode, so bit[1] is useless for us in that case too.

The spec is somewhat imprecise on what happens for reserved flags. The
prototype in section 5.1.2 of [1] suggests that reserved bits must be
zero, but 5.1.2.3 'Client responsibilities' does not state that invalid
flags result in an error.

Arm TF certainly rejects unexpected flags [2].

[1]: DEN0054C https://developer.arm.com/documentation/den0054/latest
[2]: https://github.com/ARM-software/arm-trusted-firmware/blob/66c3906e4c32d675eb06bd081de8a3359f76b84c/services/std_svc/sdei/sdei_main.c#L260

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 06/18] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
  2022-04-03 15:38   ` Gavin Shan
@ 2022-04-30 15:03     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-30 15:03 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Sun, Apr 03, 2022 at 11:38:59PM +0800, Gavin Shan wrote:
> This supports SDEI_EVENT_CONTEXT hypercall. It's used by the guest
> to retrieve the registers (x0 - x17) from the interrupted or preempted
> context in the event handler. The interrupted or preempted context
> is saved prior to handling the event by executing its handler and
> restored after that.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/kvm/sdei.c | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 941263578b30..af5d11b8eb2f 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -140,6 +140,37 @@ static unsigned long hypercall_enable(struct kvm_vcpu *vcpu, bool enable)
>  	return ret;
>  }
>  
> +static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> +	struct kvm_sdei_vcpu_context *context;
> +	unsigned long param_id = smccc_get_arg(vcpu, 1);
> +	unsigned long ret = SDEI_SUCCESS;
> +
> +	spin_lock(&vsdei->lock);
> +
> +	/* Check if any event is being handled */
> +	context = &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL];
> +	context = context->event ? context : NULL;
> +	context = context ? : &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
> +	context = context->event ? context : NULL;
> +	if (!context) {
> +		ret = SDEI_DENIED;
> +		goto unlock;
> +	}

Eek! You'll probably be able to drop all of this and just check the SDEI
active flag.
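
Roughly (completely untested, names invented; kvm_sdei_handler_active()
would be the flag test against vcpu->arch.flags):

    static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
    {
            struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
            unsigned long param_id = smccc_get_arg(vcpu, 1);

            if (!kvm_sdei_handler_active(vcpu))
                    return SDEI_DENIED;

            if (param_id >= ARRAY_SIZE(vsdei->ctxt.regs))
                    return SDEI_INVALID_PARAMETERS;

            return vsdei->ctxt.regs[param_id];
    }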

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 14/18] KVM: arm64: Support SDEI_EVENT_SIGNAL hypercall
  2022-04-03 15:39   ` Gavin Shan
@ 2022-04-30 21:32     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-04-30 21:32 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Gavin,

On Sun, Apr 03, 2022 at 11:39:07PM +0800, Gavin Shan wrote:
> This supports the SDEI_EVENT_SIGNAL hypercall. It's used by the guest
> to inject the event, whose number must be zero, to the specified
> vCPU. As the shared event isn't supported, the calling vCPU is
> assumed to be the target.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
> 
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index ebdbe7810cf0..e1f6ab9800ee 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -455,6 +455,48 @@ static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
>  	return ret;
>  }
>  
> +static unsigned long hypercall_signal(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> +	struct kvm_sdei_event *event;
> +	unsigned int num = smccc_get_arg(vcpu, 1);
> +	unsigned long ret = SDEI_SUCCESS;
> +
> +	/*
> +	 * The event must be the software signaled one, whose number
> +	 * is zero.
> +	 */
> +	if (!kvm_sdei_is_sw_signaled(num)) {
> +		ret = SDEI_INVALID_PARAMETERS;
> +		goto out;
> +	}
> +
> +	spin_lock(&vsdei->lock);
> +
> +	/* Check if the vcpu has been masked */
> +	if (vsdei->masked) {
> +		ret = SDEI_INVALID_PARAMETERS;
> +		goto unlock;
> +	}

You should still be able to signal an event if the vCPU is masked. Just
means the bit will rot in the pending bitmap until the vCPU is unmasked.
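
Something like this is all SDEI_EVENT_SIGNAL really needs to do
(untested sketch, assuming the bitmap-based state we discussed):

    static unsigned long hypercall_signal(struct kvm_vcpu *vcpu)
    {
            struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;

            if (!kvm_sdei_is_sw_signaled(smccc_get_arg(vcpu, 1)))
                    return SDEI_INVALID_PARAMETERS;

            /* Post the event regardless of the mask state; delivery
             * checks the mask, so the bit just sits there until the
             * vCPU is unmasked. */
            set_bit(KVM_SDEI_EVENT_SW_SIGNALED, &vsdei->pending);
            kvm_vcpu_kick(vcpu);

            return SDEI_SUCCESS;
    }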

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall
  2022-04-03 15:39   ` [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall Gavin Shan
@ 2022-05-01  6:50     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-05-01  6:50 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Sun, Apr 03, 2022 at 11:39:06PM +0800, Gavin Shan wrote:
> This supports the SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercalls.
> They are used by the guest to notify completion of the event from its
> handler. The previously interrupted or preempted context is restored
> as below.
> 
>    * x0 - x17, PC and PState are restored to the values they had in
>      the interrupted or preempted context.
> 
>    * If it's the SDEI_EVENT_COMPLETE_AND_RESUME hypercall, an IRQ
>      exception is injected.

I don't think that's how COMPLETE_AND_RESUME works. The caller specifies an
address at which it would like to begin execution within the client
exception level.

SDEI spec suggests this behaves like a synchronous exception. DEN 0054C
5.2.2 'Event Resume Context' speaks more about how it is supposed to
work.
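
In other words, something along these lines (sketch only; the helpers
are invented and the real code would use KVM's register accessors):

    static void sdei_complete_and_resume(struct kvm_vcpu *vcpu,
                                         unsigned long resume_addr)
    {
            struct kvm_sdei_event_context *ctxt = &vcpu->arch.sdei->ctxt;

            /* x0 - x17 come back from the interrupted context */
            sdei_restore_gp_regs(vcpu, ctxt);

            /* Resumption looks like a synchronous exception taken from
             * the interrupted context: ELR_EL1 holds the interrupted PC,
             * SPSR_EL1 the interrupted PSTATE, and execution continues
             * at the resume address. */
            sdei_set_elr_el1(vcpu, ctxt->pc);
            sdei_set_spsr_el1(vcpu, ctxt->pstate);
            *vcpu_pc(vcpu) = resume_addr;
    }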

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 15/18] KVM: arm64: Support SDEI_FEATURES hypercall
  2022-04-03 15:39   ` Gavin Shan
@ 2022-05-01  6:55     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-05-01  6:55 UTC (permalink / raw)
  To: Gavin Shan
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

On Sun, Apr 03, 2022 at 11:39:08PM +0800, Gavin Shan wrote:
> This supports the SDEI_FEATURES hypercall. It's used by the guest to
> retrieve the supported features, which are the number of slots for
> interrupt-bound events and the relative mode for the event
> handler. Currently, neither of them is supported.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/arm64/kvm/sdei.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index e1f6ab9800ee..ab0b7b5e3191 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -527,6 +527,23 @@ static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
>  	return ret;
>  }
>  
> +static unsigned long hypercall_features(struct kvm_vcpu *vcpu)
> +{
> +	unsigned long feature = smccc_get_arg(vcpu, 1);
> +	unsigned long ret;
> +
> +	switch (feature) {
> +	case 0: /* BIND_SLOTS */
> +	case 1: /* RELATIVE_MODE */

Can you create macros for these?
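
e.g. something along these lines (the names are only a suggestion):

    /* feature IDs for SDEI_FEATURES */
    #define KVM_SDEI_FEATURE_BIND_SLOTS     0
    #define KVM_SDEI_FEATURE_RELATIVE_MODE  1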

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-04-30 14:16                 ` Oliver Upton
@ 2022-05-02  2:35                   ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  2:35 UTC (permalink / raw)
  To: Oliver Upton
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

Hi Oliver,

On 4/30/22 10:16 PM, Oliver Upton wrote:
> On Sat, Apr 30, 2022 at 07:38:29PM +0800, Gavin Shan wrote:
>> Thank you for the comments and details. It should work by using bitmaps
>> to represent the event states. I will adopt your proposed structs in the
>> next respin. However, more states are needed, so I would adjust
>> "struct kvm_sdei_vcpu" like below in the next respin.
>>
>>      struct kvm_sdei_vcpu {
>>          unsigned long registered;    /* the event is registered or not                 */
>>          unsigned long enabled;       /* the event is enabled or not                    */
>>          unsigned long unregistering; /* the event is pending for unregistration        */
> 
> I'm not following why we need to keep track of the 'pending unregister'
> state directly. Is it not possible to infer from (active && !registered)?
> 

The event can be unregistered and reset through hypercalls while it's
being handled. In this case, the unregistration of the event can't
be done immediately and has to be delayed until the handling is finished.
The unregistration-pending state is used in this case. Yes, it's
correct that we can also use (active && !registered) to represent the state.
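
For the record, the intended state flow looks roughly like below (helper
names invented for illustration):

    static void event_unregister(struct kvm_sdei_vcpu *vsdei, unsigned int num)
    {
            clear_bit(num, &vsdei->enabled);
            if (test_bit(num, &vsdei->active))
                    set_bit(num, &vsdei->unregistering);    /* defer */
            else
                    clear_bit(num, &vsdei->registered);
    }

    static void event_completed(struct kvm_sdei_vcpu *vsdei, unsigned int num)
    {
            clear_bit(num, &vsdei->active);
            if (test_and_clear_bit(num, &vsdei->unregistering))
                    clear_bit(num, &vsdei->registered);     /* finish it now */
    }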

>>          unsigned long pending;       /* the event is pending for delivery and handling */
>>          unsigned long active;        /* the event is currently being handled           */
>>
>>          :
>>          <this part is just like what you suggested>
>>      };
>>
>> I renamed @pending to @unregistering. Besides, two states are added:
>>
>>    @pending: Indicates an event has been injected. The next step for the
>>              event is to deliver it for handling. For one particular
>>              event, we allow at most one pending instance.
> 
> Right, if an event retriggers when it is pending we still dispatch a
> single event to the guest. And since we're only doing normal priority
> events, it is entirely implementation defined which gets dispatched
> first.
> 

Yep, we will simply rely on find_first_bit() for the priority. It means
the software signaled event, whose number is zero, will have the highest
priority.
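
i.e. the delivery path would look roughly like this (kvm_sdei_deliver()
is an invented name):

    unsigned long events = vsdei->pending & vsdei->registered &
                           vsdei->enabled;
    unsigned int num = find_first_bit(&events, NR_KVM_SDEI_EVENTS);

    /* The lowest event number wins, so event 0 naturally goes first */
    if (num < NR_KVM_SDEI_EVENTS) {
            clear_bit(num, &vsdei->pending);
            kvm_sdei_deliver(vcpu, num);
    }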

>>    @active:  Indicates the event is currently being handled. The information
>>              stored in the 'struct kvm_sdei_event_context' instance can be
>>              correlated with the event.
> 
> Does this need to be a bitmap though? We can't ever have more than one
> SDEI event active at a time since this is private to a vCPU.
> 

Yes, one event is active at most on one particular vCPU, so it doesn't
necessarily have to be a bitmap. The reason I proposed to use a bitmap
for this state is to have all (event) states represented by bitmaps.
In this way, all states are managed in a unified fashion. The alternative
way is to have "unsigned long active_event", which tracks the active
event number. It also consumes 8 bytes when live migration is concerned.
So I prefer a bitmap :)
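
And the active event number falls out of the bitmap whenever it's
needed, e.g.:

    unsigned int num = find_first_bit(&vsdei->active, NR_KVM_SDEI_EVENTS);
    bool handler_active = (num < NR_KVM_SDEI_EVENTS);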

>> Furthermore, it's fair enough to put the (vcpu) mask state into the
>> 'flags' field of struct kvm_vcpu_arch :)
> 
> I think you can get away with putting active in there too, I don't see
> why we need more than a single bit for this info.
> 

Not really. We just need one single bit for the vCPU's mask state. We need
multiple bits for the events' active state, depending on how many events
are supported. We need to know which event is currently active, at least.
For now, there are only two supported events (0/1), but one single bit
is still not enough because there are 3 states: (1) the software signaled
event is active. (2) the async PF event is active. (3) none of them is
active.

Let's use a bitmap for the event active state as I said above, if you
don't strongly object :)

>>>>>>> Do we need this if we disallow nesting events?
>>>>>>>
>>>>>>
>>>>>> Yes, we need this. "event == NULL" is used as indication of invalid
>>>>>> context. @event is the associated SDEI event when the context is
>>>>>> valid.
>>>>>
>>>>> What if we use some other plumbing to indicate the state of the vCPU? MP
>>>>> state comes to mind, for example.
>>>>>
>>>>
>>>> Even if the indication is done by another state, kvm_sdei_vcpu_context
>>>> still needs to be linked (associated) with the event. Once the vCPU
>>>> context becomes valid after the event is delivered, we still need to know
>>>> the associated event when some of the hypercalls are triggered.
>>>> SDEI_1_0_FN_SDEI_EVENT_COMPLETE is one of the examples: we need to
>>>> decrease struct kvm_sdei_event::event_count for the hypercall.
>>>
>>> Why do we need to keep track of how many times an event has been
>>> signaled? Nothing in SDEI seems to suggest that the number of event
>>> signals corresponds to the number of times the handler is invoked. In
>>> fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
>>>
>>> """
>>> The event has edgetriggered semantics and the number of event signals
>>> may not correspond to the number of times the handler is invoked in the
>>> target PE.
>>> """
>>>
>>> DEN0054C 5.1.16.1
>>>
>>> So perhaps we queue at most 1 pending event for the guest.
>>>
>>> I'd also like to see if anyone else has thoughts on the topic, as I'd
>>> hate for you to go back to the whiteboard again in the next spin.
>>>
>>
>> Agreed. In the next respin, we will have one pending event at most. An
>> error can be returned if the user attempts to inject an event whose
>> pending state (struct kvm_sdei_vcpu::pending) has already been set.
> 
> I don't believe we can do that. The SDEI_EVENT_SIGNAL call should succeed,
> even if the event was already pending.
> 

I've rethought it a bit. Yes, you're correct. In this specific case, one
run of the event handler can serve multiple signaled events.

>> Indeed, the hardest part is to determine the data structures and
>> functions we need. Oliver, your valuable comments are helping to
>> bring this series onto the right track. However, I do think it would
>> be helpful if somebody else could confirm the outcomes of the previous
>> discussions. I'm not sure if Marc has time for a quick scan and to
>> provide comments.
>>
>> I would summarize the outcomes from our discussions, to help Marc
>> or others to confirm:
> 
> Going to take a look at some of your later patches as well, just a heads
> up.
> 

Yep, thanks again for your valuable comments :)

>> - Drop support for the shared event.
>> - Drop support for the critical event.
>> - The events in the implementation are all private and can be signaled
>>    (raised) by software.
>> - Drop migration support for now; we will consider supporting it via
>>    pseudo firmware registers. So add-on patches are expected to support
>>    migration in the future.
> 
> Migration will be supported in a future spin of this series, not a
> subsequent one, right? :) I had just made the suggestion because there
> were a lot of renovations that we were discussing.
> 

I prefer a separate series to support migration after this series gets
merged. There are a couple of reasons to do so: (1) The migration support
depends on Raghavendra's series for hypercall service selection. That
series is close to being merged, but it hasn't happened yet. SDEI is one
of the hypercall services at least, and SDEI's pseudo firmware registers
for migration will be managed by that infrastructure. (2) I would focus
on the core functionality for now. In this way, we give migration some
space. For example, the data structures may need various adjustments for
migration, just in case.

>> - Drop the locking mechanism. All the functions are executed in vcpu context.
> 
> Well, not entirely. Just need to make sure atomics are used to post
> events to another vCPU in the case of SDEI_EVENT_SIGNAL.
> 
> set_bit() fits the bill here, as we've discussed.
> 

Yes, I meant to remove struct kvm_sdei_vcpu::lock by dropping the
locking mechanism :)

>> - Use the data structs as you suggested. Besides, the vcpu's mask
>>    state is put into struct kvm_vcpu_arch::flags.
>>    enum kvm_sdei_event
>>    struct kvm_sdei_event_handler
>>    struct kvm_sdei_event_context
>>    struct kvm_sdei_vcpu
>>

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 04/18] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
  2022-04-30 14:54     ` Oliver Upton
@ 2022-05-02  2:55       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  2:55 UTC (permalink / raw)
  To: Oliver Upton
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

Hi Oliver,

On 4/30/22 10:54 PM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:38:57PM +0800, Gavin Shan wrote:
>> This supports the SDEI_EVENT_REGISTER hypercall, which is used by the
>> guest to register an event. The event won't be raised until it's
>> registered and enabled. KVM-owned events can't be registered if they
>> aren't exposed.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   arch/arm64/kvm/sdei.c | 78 +++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 78 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 3507e33ec00e..89c1b231cb60 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -25,6 +25,81 @@ static struct kvm_sdei_exposed_event exposed_events[] = {
>>   	for (idx = 0, event = &exposed_events[0];	\
>>   	     idx < ARRAY_SIZE(exposed_events);		\
>>   	     idx++, event++)
>> +#define kvm_sdei_for_each_event(vsdei, event, idx)	\
>> +	for (idx = 0, event = &vsdei->events[0];	\
>> +	     idx < ARRAY_SIZE(exposed_events);		\
>> +	     idx++, event++)
>> +
>> +static struct kvm_sdei_event *find_event(struct kvm_vcpu *vcpu,
>> +					 unsigned int num)
>> +{
>> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> +	struct kvm_sdei_event *event;
>> +	int i;
>> +
>> +	kvm_sdei_for_each_event(vsdei, event, i) {
>> +		if (event->exposed_event->num == num)
>> +			return event;
>> +	}
>> +
>> +	return NULL;
>> +}
> 
> I imagine you'll drop this hunk in the next spin.
> 

Yes, I will :)

>> +static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
> 
> Hmm, hypercall_ is not a very descriptive scope. Could you instead do
> something like kvm_sdei_?
> 
> so for this one, kvm_sdei_event_register()? Provides decent context
> clues to connect back to the spec as well.
> 

Sure. I will revise the names of all functions for hypercalls and
remove the "hypercall" prefix. For this particular case, I would use
event_register() because the "kvm_sdei_" prefix has been reserved for
the globally scoped functions :)

>> +{
>> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> +	struct kvm_sdei_event *event;
>> +	unsigned int num = smccc_get_arg(vcpu, 1);
>> +	unsigned long ep_address = smccc_get_arg(vcpu, 2);
>> +	unsigned long ep_arg = smccc_get_arg(vcpu, 3);
> 
> We discussed using some structure to track the registered context of an
> event. Maybe just build it on the stack then assign it in the array?
> 

Yes, it will be something like below:

      struct kvm_sdei_event_handler handler = {
             .ep_address = smccc_get_arg(vcpu, 2),
             .ep_arg     = smccc_get_arg(vcpu, 3),
      };
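      /* ...and then assigned into the per-vCPU array (sketch) */
      vsdei->handlers[num] = handler;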

>> +	unsigned long route_mode = smccc_get_arg(vcpu, 4);
> 
> This is really 'flags'. route_mode is bit[0]. I imagine we don't want to
> support relative mode, so bit[1] is useless for us in that case too.
> 
> The spec is somewhat imprecise on what happens for reserved flags. The
> prototype in section 5.1.2 of [1] suggests that reserved bits must be
> zero, but 5.1.2.3 'Client responsibilities' does not state that invalid
> flags result in an error.
> 
> Arm TF certainly rejects unexpected flags [2].
> 
> [1]: DEN0054C https://developer.arm.com/documentation/den0054/latest
> [2]: https://github.com/ARM-software/arm-trusted-firmware/blob/66c3906e4c32d675eb06bd081de8a3359f76b84c/services/std_svc/sdei/sdei_main.c#L260
> 

Yes, this chunk of code still sticks to the old specification. Let's
improve it in the next respin (a rough sketch follows the list below):

    - Rename @route_mode to @flags
    - Reject if the reserved bits are set.
    - Reject if relative mode (bit#1) is selected.
    - Reject if routing mode (bit#0) isn't RM_ANY (0).
    - @route_affinity will be dropped.
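
Something like below is what I have in mind (sketch; the macro names
are made up):

    #define SDEI_EVENT_REGISTER_FLAG_RM_PE          BIT(0)
    #define SDEI_EVENT_REGISTER_FLAG_RELATIVE_MODE  BIT(1)

            unsigned long flags = smccc_get_arg(vcpu, 4);

            /* Reserved bits must be zero and we support neither RM_PE
             * routing nor relative mode, so any set bit is rejected,
             * matching what Arm TF does. */
            if (flags)
                    return SDEI_INVALID_PARAMETERS;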

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 06/18] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
  2022-04-30 15:03     ` Oliver Upton
@ 2022-05-02  2:57       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  2:57 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 4/30/22 11:03 PM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:38:59PM +0800, Gavin Shan wrote:
>> This supports SDEI_EVENT_CONTEXT hypercall. It's used by the guest
>> to retrieve the registers (x0 - x17) from the interrupted or preempted
>> context in the event handler. The interrupted or preempted context
>> is saved prior to handling the event by executing its handler and
>> restored after that.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   arch/arm64/kvm/sdei.c | 34 ++++++++++++++++++++++++++++++++++
>>   1 file changed, 34 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 941263578b30..af5d11b8eb2f 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -140,6 +140,37 @@ static unsigned long hypercall_enable(struct kvm_vcpu *vcpu, bool enable)
>>   	return ret;
>>   }
>>   
>> +static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> +	struct kvm_sdei_vcpu_context *context;
>> +	unsigned long param_id = smccc_get_arg(vcpu, 1);
>> +	unsigned long ret = SDEI_SUCCESS;
>> +
>> +	spin_lock(&vsdei->lock);
>> +
>> +	/* Check if we have events are being handled */
>> +	context = &vsdei->context[SDEI_EVENT_PRIORITY_CRITICAL];
>> +	context = context->event ? context : NULL;
>> +	context = context ? : &vsdei->context[SDEI_EVENT_PRIORITY_NORMAL];
>> +	context = context->event ? context : NULL;
>> +	if (!context) {
>> +		ret = SDEI_DENIED;
>> +		goto unlock;
>> +	}
> 
> Eek! You'll probably be able to drop all of this and just check the SDEI
> active flag.
>

Yep, the event's active state will be checked instead in the next respin :)
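
A minimal sketch of what that check might look like, assuming the
per-vCPU 'active' state discussed for the respin (names illustrative):

    /* SDEI_EVENT_CONTEXT is only valid from within an event handler */
    if (!vsdei->active) {
        ret = SDEI_DENIED;
        goto unlock;
    }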

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 14/18] KVM: arm64: Support SDEI_EVENT_SIGNAL hypercall
  2022-04-30 21:32     ` Oliver Upton
@ 2022-05-02  3:04       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  3:04 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/1/22 5:32 AM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:39:07PM +0800, Gavin Shan wrote:
>> This supports SDEI_EVENT_SIGNAL hypercall. It's used by guest
>> to inject event, whose number must be zero to the specified
>> vCPU. As the shared event isn't supported, calling vCPU is
>> assumed to be the target.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 45 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index ebdbe7810cf0..e1f6ab9800ee 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -455,6 +455,48 @@ static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
>>   	return ret;
>>   }
>>   
>> +static unsigned long hypercall_signal(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> +	struct kvm_sdei_event *event;
>> +	unsigned int num = smccc_get_arg(vcpu, 1);
>> +	unsigned long ret = SDEI_SUCCESS;
>> +
>> +	/*
>> +	 * The event must be the software signaled one, whose number
>> +	 * is zero.
>> +	 */
>> +	if (!kvm_sdei_is_sw_signaled(num)) {
>> +		ret = SDEI_INVALID_PARAMETERS;
>> +		goto out;
>> +	}
>> +
>> +	spin_lock(&vsdei->lock);
>> +
>> +	/* Check if the vcpu has been masked */
>> +	if (vsdei->masked) {
>> +		ret = SDEI_INVALID_PARAMETERS;
>> +		goto unlock;
>> +	}
> 
> You should still be able to signal an event if the vCPU is masked. Just
> means the bit will rot in the pending bitmap until the vCPU is unmasked.
> 

Nice point! The event's pending state will still be set when the vCPU
is masked. However, the event won't become active until the vCPU is
unmasked :)
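
As a sketch (the 'pending' bitmap and the request constant are
placeholders for whatever the respin ends up defining):

    /* Record the signal even when the vCPU is masked ... */
    set_bit(num, &vsdei->pending);

    /* ... but only kick delivery once the vCPU is unmasked */
    if (!vsdei->masked)
        kvm_make_request(KVM_REQ_SDEI, vcpu);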

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 15/18] KVM: arm64: Support SDEI_FEATURES hypercall
  2022-05-01  6:55     ` Oliver Upton
@ 2022-05-02  3:05       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  3:05 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/1/22 2:55 PM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:39:08PM +0800, Gavin Shan wrote:
>> This supports SDEI_FEATURES hypercall. It's used by guest to
>> retrieve the supported features, which are number of slots for
>> the interrupt binding events and relative mode for the event
>> handler. Currently, none of them is supported.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   arch/arm64/kvm/sdei.c | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index e1f6ab9800ee..ab0b7b5e3191 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -527,6 +527,23 @@ static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
>>   	return ret;
>>   }
>>   
>> +static unsigned long hypercall_features(struct kvm_vcpu *vcpu)
>> +{
>> +	unsigned long feature = smccc_get_arg(vcpu, 1);
>> +	unsigned long ret;
>> +
>> +	switch (feature) {
>> +	case 0: /* BIND_SLOTS */
>> +	case 1: /* RELATIVE_MODE */
> 
> Can you create macros for these?
> 

Sure, I will. Thanks for your review and comments :)
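
For instance (macro names here are illustrative, not from the spec
header):

    #define SDEI_FEATURE_BIND_SLOTS       0
    #define SDEI_FEATURE_RELATIVE_MODE    1

    switch (feature) {
    case SDEI_FEATURE_BIND_SLOTS:
    case SDEI_FEATURE_RELATIVE_MODE:
        ret = 0;        /* neither feature is supported yet */
        break;
    default:
        ret = SDEI_INVALID_PARAMETERS;
    }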

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-05-02  2:35                   ` Gavin Shan
@ 2022-05-02  3:40                     ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-05-02  3:40 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Gavin,

On Mon, May 02, 2022 at 10:35:08AM +0800, Gavin Shan wrote:
> Hi Oliver,
> 
> On 4/30/22 10:16 PM, Oliver Upton wrote:
> > On Sat, Apr 30, 2022 at 07:38:29PM +0800, Gavin Shan wrote:
> > > Thank you for the comments and details. It should work by using bitmaps
> > > to represent event's states. I will adopt your proposed structs in next
> > > respin. However, there are more states needed. So I would adjust
> > > "struct kvm_sdei_vcpu" like below in next respin.
> > > 
> > >      struct kvm_sdei_vcpu {
> > >          unsigned long registered;    /* the event is registered or not                 */
> > >          unsigned long enabled;       /* the event is enabled or not                    */
> > >          unsigned long unregistering; /* the event is pending for unregistration        */
> > 
> > I'm not following why we need to keep track of the 'pending unregister'
> > state directly. Is it not possible to infer from (active && !registered)?
> > 
> 
> The event can be unregistered and reset through hypercalls when it's
> being handled. In this case, the unregistration for the event can't
> be done immediately and has to be delayed until the handling is finished.
> The unregistration pending state is used in this case. Yes, it's
> correct that we can also use (active && !registered) to represent the state.

I don't believe there is any delay in the unregistration of an event.
The state is only meant to imply that the handler must complete before
software can re-register for the event.

The event state machine from 6.1 can be encoded using 3 bits, which is
exactly what we see in Table 13 of DEN0054C.

I'm sorry for being pedantic, but avoiding duplication of state reduces
the chance of bugs + makes things a bit easier to reason about.

> > >          unsigned long pending;       /* the event is pending for delivery and handling */
> > >          unsigned long active;        /* the event is currently being handled           */
> > > 
> > >          :
> > >          <this part is just like what you suggested>
> > >      };
> > > 
> > > I rename @pending to @unregister. Besides, there are two states added:
> > > 
> > >     @pending: Indicate there has one event has been injected. The next step
> > >               for the event is to deliver it for handling. For one particular
> > >               event, we allow one pending event in the maximum.
> > 
> > Right, if an event retriggers when it is pending we still dispatch a
> > single event to the guest. And since we're only doing normal priority
> > events, it is entirely implementation defined which gets dispatched
> > first.
> > 
> 
> Yep, we will simply rely on find_first_bit() for the priority. It means
> the software signaled event, whose number is zero, will have the highest
> priority.
> 
> > >     @active:  Indicate the event is currently being handled. The information
> > >               stored in 'struct kvm_sdei_event_context' instance can be
> > >               correlated with the event.
> > 
> > Does this need to be a bitmap though? We can't ever have more than one
> > SDEI event active at a time since this is private to a vCPU.
> > 
> 
> Yes, one event is active at most on one particular vCPU, so it doesn't
> have to be a bitmap necessarily. The reason I proposed to use a bitmap
> for this state is to have all (event) states represented by bitmaps.
> In this way, all states are managed in a unified fashion. The alternative
> way is to have "unsigned long active_event", which tracks the active
> event number. It also consumes 8 bytes when live migration is concerned.
> So I prefer a bitmap :)
>

The small benefit of using the event number is that we can address all
events in 8 bytes, whereas we'd need to extend the bitmap for >64
events. I suppose we'll run into that issue either way, since the
pending, registered, and enabled portions are also bitmaps.

When live migration is in scope we should probably bark at userspace if
it attempts to set more than a single bit in the register.
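
e.g., a sketch of such a sanity check on the register write path (the
plumbing around it is not shown):

    /* At most one event can be active at a time on a vCPU */
    if (hweight64(val) > 1)
        return -EINVAL;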

> > > Furthermore, it's fair enough to put the (vcpu) mask state into 'flags'
> > > field of struct kvm_vcpu_arch :)
> > 
> > I think you can get away with putting active in there too, I don't see
> > why we need more than a single bit for this info.
> > 
> 
> Not really. We just need one single bit for vCPU's mask state. We need
> multiple bits for event's active state, depending on how many events are
> supported. We need to know which event is currently active at least.
> For now, there are only two supported events (0/1), but one single bit
> is still not enough because there are 3 states: (1) software signaled
> event is active. (2) async pf event is active. (3) none of them is
> active.
> 
> Lets use a bitmap for the event active state as I said above, if you
> don't strongly object :)
> 
> > > > > > > > Do we need this if we disallow nesting events?
> > > > > > > > 
> > > > > > > 
> > > > > > > Yes, we need this. "event == NULL" is used as indication of invalid
> > > > > > > context. @event is the associated SDEI event when the context is
> > > > > > > valid.
> > > > > > 
> > > > > > What if we use some other plumbing to indicate the state of the vCPU? MP
> > > > > > state comes to mind, for example.
> > > > > > 
> > > > > 
> > > > > Even the indication is done by another state, kvm_sdei_vcpu_context still
> > > > > need to be linked (associated) with the event. After the vCPU context becomes
> > > > > valid after the event is delivered, we still need to know the associated
> > > > > event when some of hypercalls are triggered. SDEI_1_0_FN_SDEI_EVENT_COMPLETE
> > > > > is one of the examples, we need to decrease struct kvm_sdei_event::event_count
> > > > > for the hypercall.
> > > > 
> > > > Why do we need to keep track of how many times an event has been
> > > > signaled? Nothing in SDEI seems to suggest that the number of event
> > > > signals corresponds to the number of times the handler is invoked. In
> > > > fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
> > > > 
> > > > """
> > > > The event has edgetriggered semantics and the number of event signals
> > > > may not correspond to the number of times the handler is invoked in the
> > > > target PE.
> > > > """
> > > > 
> > > > DEN0054C 5.1.16.1
> > > > 
> > > > So perhaps we queue at most 1 pending event for the guest.
> > > > 
> > > > I'd also like to see if anyone else has thoughts on the topic, as I'd
> > > > hate for you to go back to the whiteboard again in the next spin.
> > > > 
> > > 
> > > Agreed. In next respin, we will have one pending event at most. Error
> > > can be returned if user attempts to inject event whose pending state
> > > (struct kvm_sdei_vcpu::pending) has been set.
> > 
> > I don't believe we can do that. The SDEI_EVENT_SIGNAL call should succeed,
> > even if the event was already pending.
> > 
> 
> I rethought it a bit. Yes, you're correct. In this specific case, the
> event handler runs once on behalf of multiple signaled events.
>
> > > Indeed, the hardest part is to determine the data structures and
> > > functions we need. Oliver, your valuable comments are helping to
> > > bring this series to the right track. However, I do think it's
> > > helpful if somebody else can confirm the outcomes from the previous
> > > discussions. I'm not sure if Marc has time for a quick scan and provide
> > > comments.
> > > 
> > > I would summarize the outcomes from our discussions, to help Marc
> > > or others to confirm:
> > 
> > Going to take a look at some of your later patches as well, just a heads
> > up.
> > 
> 
> Yep, thanks again for your valuable comments :)
> 
> > > - Drop support for the shared event.
> > > - Drop support for the critical event.
> > > - The events in the implementations are all private and can be signaled
> > >    (raised) by software.
> > > - Drop migration support for now, and we will consider it using
> > >    pseudo firmware registers. So add-on patches are expected to support
> > >    the migration in future.
> > 
> > Migration will be supported in a future spin of this series, not a
> > subsequent one right? :) I had just made the suggestion because there was
> > a lot of renovations that we were discussing.
> > 
> 
> I prefer a separate series to support migration after this series gets
> merged. There are a couple of reasons to do so: (1) The migration depends
> on Raghavendra's series supporting hypercall services selection.
> That series is close to being merged, but it hasn't happened yet. SDEI is
> one of the hypercall services at least, and SDEI's pseudo firmware
> registers for migration will be managed by that infrastructure. (2) I would
> focus on the core functionality for now; in this way, we leave room for
> migration. For example, the data structures may need some adjustment for
> migration, just in case.

Although merging this series w/o support for LM would mean that a guest
using SDEI could potentially explode when migrated, right? We can't
break it to implement something else.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 04/18] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
  2022-05-02  2:55       ` Gavin Shan
@ 2022-05-02  3:43         ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-05-02  3:43 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Mon, May 02, 2022 at 10:55:51AM +0800, Gavin Shan wrote:
> > > +	unsigned long route_mode = smccc_get_arg(vcpu, 4);
> > 
> > This is really 'flags'. route_mode is bit[0]. I imagine we don't want to
> > support relative mode, so bit[1] is useless for us in that case too.
> > 
> > The spec is somewhat imprecise on what happens for reserved flags. The
> > prototype in section 5.1.2 of [1] suggests that reserved bits must be
> > zero, but 5.1.2.3 'Client responsibilities' does not state that invalid
> > flags result in an error.
> > 
> > Arm TF certainly rejects unexpected flags [2].
> > 
> > [1]: DEN0054C https://developer.arm.com/documentation/den0054/latest
> > [2]: https://github.com/ARM-software/arm-trusted-firmware/blob/66c3906e4c32d675eb06bd081de8a3359f76b84c/services/std_svc/sdei/sdei_main.c#L260
> > 
> 
> Yes, This chunk of code is still stick to old specification. Lets
> improve in next respin:
> 
>    - Rename @route_mode to @flags
>    - Reject if the reserved bits are set.
>    - Reject if relative mode (bit#1) is selected.
>    - Reject if routing mode (bit#0) isn't RM_ANY (0).

Bit[0] is ignored for private events, actually. So we really just reject
if any of bits[63:1] are set.
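
A sketch of that check (assuming the argument is named @flags as
planned):

    /* bit[0] (routing mode) is ignored for private events */
    if (flags & GENMASK(63, 1))
        return SDEI_INVALID_PARAMETERS;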

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall
  2022-05-01  6:50     ` Oliver Upton
@ 2022-05-02  6:19       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  6:19 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/1/22 2:50 PM, Oliver Upton wrote:
> On Sun, Apr 03, 2022 at 11:39:06PM +0800, Gavin Shan wrote:
>> This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
>> They are used by guest to notify the completion of event in its
>> handler. The previously interrupted or preempted context is restored
>> like below.
>>
>>     * x0 - x17, PC and PState are restored to what values we had in
>>       the interrupted or preempted context.
>>
>>     * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
>>       is injected.
> 
> I don't think that's how COMPLETE_AND_RESUME works. The caller specifies an
> address at which it would like to begin execution within the client
> exception level.
> 
> SDEI spec suggests this behaves like a synchronous exception. DEN 0054C
> 5.2.2 'Event Resume Context' speaks more about how it is supposed to
> work.
> 

It's actually the Linux convention. If the event handler, which was
specified in the earlier EVENT_REGISTER hypercall, returns success,
the (Linux) client calls COMPLETE_AND_RESUME with the resume address
set to the FIQ vector offset. More details can be found in
do_sdei_event() in arch/arm64/kernel/sdei.c.

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-05-02  3:40                     ` Oliver Upton
@ 2022-05-02  7:25                       ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  7:25 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/2/22 11:40 AM, Oliver Upton wrote:
> On Mon, May 02, 2022 at 10:35:08AM +0800, Gavin Shan wrote:
>> On 4/30/22 10:16 PM, Oliver Upton wrote:
>>> On Sat, Apr 30, 2022 at 07:38:29PM +0800, Gavin Shan wrote:
>>>> Thank you for the comments and details. It should work by using bitmaps
>>>> to represent event's states. I will adopt your proposed structs in next
>>>> respin. However, there are more states needed. So I would adjust
>>>> "struct kvm_sdei_vcpu" like below in next respin.
>>>>
>>>>       struct kvm_sdei_vcpu {
>>>>           unsigned long registered;    /* the event is registered or not                 */
>>>>           unsigned long enabled;       /* the event is enabled or not                    */
>>>>           unsigned long unregistering; /* the event is pending for unregistration        */
>>>
>>> I'm not following why we need to keep track of the 'pending unregister'
>>> state directly. Is it not possible to infer from (active && !registered)?
>>>
>>
>> The event can be unregistered and reset through hypercalls when it's
>> being handled. In this case, the unregistration for the event can't
>> be done immediately and has to be delayed until the handling is finished.
>> The unregistration pending state is used in this case. Yes, it's
>> correct that we can also use (active && !registered) to represent the state.
> 
> I don't believe there is any delay in the unregistration of an event.
> The state is only meant to imply that the handler must complete before
> software can re-register for the event.
> 
> The event state machine from 6.1 can be encoded using 3 bits, which is
> exactly what we see in Table 13 of DEN0054C.
> 
> I'm sorry for being pedantic, but avoiding duplication of state reduces
> the chance of bugs + makes things a bit easier to reason about.
> 

 From section 6.1 and Table 13 in DEN0054C, it's true there are just 3 bits
for the event states: registered/unregistered, enabled/disabled,
running/not-running. The unregistration-pending state is an implementation
detail, similar to the delivery-pending state we already have. It isn't
explicitly defined by the specification, but it is implied there. For
example, sections 5.1.18 and 5.1.18.2 state:

    It is expected that no private event handlers would have the event
    handler property handler-running set to TRUE. If an event handler
    is running, unregister will be pending until the event is completed.

Oliver, how about adjusting struct kvm_sdei_vcpu as below? With the
changes, struct kvm_sdei_vcpu::unregistering is dropped, to match the
specification strictly.

    struct kvm_sdei_vcpu {
        unsigned long registered;
        unsigned long enabled;
        unsigned long running;        // renamed from 'active' to match the specification strictly
        unsigned long pending;        // event pending for delivery
           :
    };

    state                          @registered  @enabled  @running  @pending
    --------------------------------------------------------------------------------
    unregistered                   0            0         0/1       0
    registered-disabled            1            0         0         0/1
    registered-enabled             1            1         0/1       0/1
    handler-running                0/1          0/1       1         0/1

We can use a specific encoding to represent unregistration-pending:

    state                          @registered  @enabled  @running  @pending
    -------------------------------------------------------------------------
    handler-running                0            0          1        0
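
With that encoding, no extra field is needed; a sketch of the resulting
predicate (assuming the struct above):

    static bool event_unregister_pending(struct kvm_sdei_vcpu *vsdei,
                                         unsigned int num)
    {
        return test_bit(num, &vsdei->running) &&
               !test_bit(num, &vsdei->registered);
    }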

Thanks for your valuable comments, Oliver. I haven't started working on
v7 yet; I would like to make everything clear before that, so the next
revision will be easier for you to review :)

>>>>           unsigned long pending;       /* the event is pending for delivery and handling */
>>>>           unsigned long active;        /* the event is currently being handled           */
>>>>
>>>>           :
>>>>           <this part is just like what you suggested>
>>>>       };
>>>>
>>>> I rename @pending to @unregister. Besides, there are two states added:
>>>>
>>>>      @pending: Indicate there has one event has been injected. The next step
>>>>                for the event is to deliver it for handling. For one particular
>>>>                event, we allow one pending event in the maximum.
>>>
>>> Right, if an event retriggers when it is pending we still dispatch a
>>> single event to the guest. And since we're only doing normal priority
>>> events, it is entirely implementation defined which gets dispatched
>>> first.
>>>
>>
>> Yep, we will simply rely on find_first_bit() for the priority. It means
>> the software signaled event, whose number is zero, will have the highest
>> priority.
>>
>>>>      @active:  Indicate the event is currently being handled. The information
>>>>                stored in 'struct kvm_sdei_event_context' instance can be
>>>>                correlated with the event.
>>>
>>> Does this need to be a bitmap though? We can't ever have more than one
>>> SDEI event active at a time since this is private to a vCPU.
>>>
>>
>> Yes, one event is active at most on one particular vCPU, so it doesn't
>> have to be a bitmap necessarily. The reason I proposed to use a bitmap
>> for this state is to have all (event) states represented by bitmaps.
>> In this way, all states are managed in a unified fashion. The alternative
>> way is to have "unsigned long active_event", which tracks the active
>> event number. It also consumes 8 bytes when live migration is concerned.
>> So I prefer a bitmap :)
>>
> 
> The small benefit of using the event number is that we can address all
> events in 8 bytes, whereas we'd need to extend the bitmap for >64
> events. I suppose we'll run into that issue either way, since the
> pending, registered, and enabled portions are also bitmaps.
> 
> When live migration is in scope we should probably bark at userspace if
> it attempts to set more than a single bit in the register.
> 

Even though we're unlikely to support shared events, a bitmap would help
in that case. I'm not sure about other VMMs, but the pseudo firmware
registers are almost transparent to user space in QEMU: they're accessed,
but nobody inspects the values read from or written to them ;-)

Please refer to the above reply for more details :)

>>>> Furthermore, it's fair enough to put the (vcpu) mask state into 'flags'
>>>> field of struct kvm_vcpu_arch :)
>>>
>>> I think you can get away with putting active in there too, I don't see
>>> why we need more than a single bit for this info.
>>>
>>
>> Not really. We just need one single bit for vCPU's mask state. We need
>> multiple bits for event's active state, depending on how many events are
>> supported. We need to know which event is currently active at least.
>> For now, there are only two supported events (0/1), but one single bit
>> is still not enough because there are 3 states: (1) software signaled
>> event is active. (2) async pf event is active. (3) none of them is
>> active.
>>
>> Lets use a bitmap for the event active state as I said above, if you
>> don't strongly object :)
>>
>>>>>>>>> Do we need this if we disallow nesting events?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, we need this. "event == NULL" is used as indication of invalid
>>>>>>>> context. @event is the associated SDEI event when the context is
>>>>>>>> valid.
>>>>>>>
>>>>>>> What if we use some other plumbing to indicate the state of the vCPU? MP
>>>>>>> state comes to mind, for example.
>>>>>>>
>>>>>>
>>>>>> Even the indication is done by another state, kvm_sdei_vcpu_context still
>>>>>> need to be linked (associated) with the event. After the vCPU context becomes
>>>>>> valid after the event is delivered, we still need to know the associated
>>>>>> event when some of hypercalls are triggered. SDEI_1_0_FN_SDEI_EVENT_COMPLETE
>>>>>> is one of the examples, we need to decrease struct kvm_sdei_event::event_count
>>>>>> for the hypercall.
>>>>>
>>>>> Why do we need to keep track of how many times an event has been
>>>>> signaled? Nothing in SDEI seems to suggest that the number of event
>>>>> signals corresponds to the number of times the handler is invoked. In
>>>>> fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
>>>>>
>>>>> """
>>>>> The event has edgetriggered semantics and the number of event signals
>>>>> may not correspond to the number of times the handler is invoked in the
>>>>> target PE.
>>>>> """
>>>>>
>>>>> DEN0054C 5.1.16.1
>>>>>
>>>>> So perhaps we queue at most 1 pending event for the guest.
>>>>>
>>>>> I'd also like to see if anyone else has thoughts on the topic, as I'd
>>>>> hate for you to go back to the whiteboard again in the next spin.
>>>>>
>>>>
>>>> Agreed. In next respin, we will have one pending event at most. Error
>>>> can be returned if user attempts to inject event whose pending state
>>>> (struct kvm_sdei_vcpu::pending) has been set.
>>>
>>> I don't believe we can do that. The SDEI_EVENT_SIGNAL call should succeed,
>>> even if the event was already pending.
>>>
>>
>> I rethought it a bit. Yes, you're correct. In this specific case, the
>> event handler runs once on behalf of multiple signaled events.
>>
>>>> Indeed, the hardest part is to determine the data structures and
>>>> functions we need. Oliver, your valuable comments are helping to
>>>> bring this series to the right track. However, I do think it's
>>>> helpful if somebody else can confirm the outcomes from the previous
>>>> discussions. I'm not sure if Marc has time for a quick scan and provide
>>>> comments.
>>>>
>>>> I would summarize the outcomes from our discussions, to help Marc
>>>> or others to confirm:
>>>
>>> Going to take a look at some of your later patches as well, just a heads
>>> up.
>>>
>>
>> Yep, thanks again for your valuable comments :)
>>
>>>> - Drop support for the shared event.
>>>> - Drop support for the critical event.
>>>> - The events in the implementations are all private and can be signaled
>>>>     (raised) by software.
>>>> - Drop migration support for now, and we will consider it using
>>>>     pseudo firmware registers. So add-on patches are expected to support
>>>>     the migration in future.
>>>
>>> Migration will be supported in a future spin of this series, not a
>>> subsequent one right? :) I had just made the suggestion because there was
>>> a lot of renovations that we were discussing.
>>>
>>
>> I prefer a separate series to support migration after this series gets
>> merged. There are a couple of reasons to do so: (1) The migration depends
>> on Raghavendra's series supporting hypercall services selection.
>> That series is close to being merged, but it hasn't happened yet. SDEI is
>> one of the hypercall services at least, and SDEI's pseudo firmware
>> registers for migration will be managed by that infrastructure. (2) I would
>> focus on the core functionality for now; in this way, we leave room for
>> migration. For example, the data structures may need some adjustment for
>> migration, just in case.
> 
> Although merging this series w/o support for LM would mean that a guest
> using SDEI could potentially explode when migrated, right? We can't
> break it to implement something else.
> 

Yes. I even had migration code for v6, which is really coarse and needs
to be polished. If the two VMs are mismatched in terms of SDEI support,
the migration fails. With Raghavendra's series supporting hypercall
services selection, we can make SDEI migration opt-in to some extent at
least, for example migration from a source VM where SDEI isn't supported
to a destination VM where it is. That helps people do forward upgrades.

    git@github.com:gwshan/linux.git  (branch: kvm/arm64_sdei)

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
@ 2022-05-02  7:25                       ` Gavin Shan
  0 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  7:25 UTC (permalink / raw)
  To: Oliver Upton
  Cc: maz, linux-kernel, eauger, shan.gavin, Jonathan.Cameron,
	pbonzini, vkuznets, will, kvmarm

Hi Oliver,

On 5/2/22 11:40 AM, Oliver Upton wrote:
> On Mon, May 02, 2022 at 10:35:08AM +0800, Gavin Shan wrote:
>> On 4/30/22 10:16 PM, Oliver Upton wrote:
>>> On Sat, Apr 30, 2022 at 07:38:29PM +0800, Gavin Shan wrote:
>>>> Thank you for the comments and details. It should work by using bitmaps
>>>> to represent event's states. I will adopt your proposed structs in next
>>>> respin. However, there are more states needed. So I would adjust
>>>> "struct kvm_sdei_vcpu" like below in next respin.
>>>>
>>>>       struct kvm_sdei_vcpu {
>>>>           unsigned long registered;    /* the event is registered or not                 */
>>>>           unsigned long enabled;       /* the event is enabled or not                    */
>>>>           unsigned long unregistering; /* the event is pending for unregistration        */
>>>
>>> I'm not following why we need to keep track of the 'pending unregister'
>>> state directly. Is it not possible to infer from (active && !registered)?
>>>
>>
>> The event can be unregistered and reseted through hypercalls when it's
>> being handled. In this case, the unregistration for the event can't
>> be done immediately and has to be delayed until the handling is finished.
>> The unregistration pending state is used in this case. Yes, it's
>> correct we also can use (active & !registered) to represent the state.
> 
> I don't believe there is any delay in the unregistration of an event.
> The state is only meant to imply that the handler must complete before
> software can re-register for the event.
> 
> The event state machine from 6.1 can be encoded using 3 bits, which is
> exactly what we see in Table 13 of DEN0054C.
> 
> I'm sorry for being pedantic, but avoiding duplication of state reduces
> the chance of bugs + makes things a bit easier to reason about.
> 

From section 6.1 and table 13 in DEN0054C, it's true there are just 3 bits
for the event states: registered/unregistered, enabled/disabled, running/not_running.
The unregistration-pending state is somewhat implementation-specific, similar
to the pending state we have. The unregistration-pending state isn't explicitly
defined by the specification, but it is implied there. For example, there are
statements about it in sections 5.1.18 and 5.1.18.2:

    It is expected that no private event handlers would have the event
    handler property handler-running set to TRUE. If an event handler
    is running, unregister will be pending until the event is completed.

Oliver, how about adjusting struct kvm_sdei_vcpu as below? With these
changes, struct kvm_sdei_vcpu::unregistering is dropped, to match the
specification strictly.

    struct kvm_sdei_vcpu {
        unsigned long registered;
        unsigned long enabled;
        unsigned long running;        // renamed from 'active' to match the specification strictly
        unsigned long pending;        // event pending for delivery
           :
    };

    state                          @registered  @enabled  @running  @pending
    --------------------------------------------------------------------------------
    unregistered                   0            0         0/1       0
    registered-disabled            1            0         0         0/1
    registered-enabled             1            1         0/1       0/1
    handler-running                0/1          0/1       1         0/1

We can use the specific encoding to represent the unregistration-pending.

    state                          @registered  @enabled  @running  @pending
    -------------------------------------------------------------------------
    handler-running                0            0          1        0
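
With these bitmaps, the state checks reduce to simple bit tests. As a
rough sketch (the helper name and field layout below are mine, nothing
final), the unregistration-pending case would be inferred like this:

    /*
     * Sketch only: matches the encoding above, i.e. @running still
     * set after @registered has been cleared -- your
     * (active && !registered) suggestion.
     */
    static bool kvm_sdei_unregister_pending(struct kvm_sdei_vcpu *vsdei,
                                            unsigned int num)
    {
            return !test_bit(num, &vsdei->registered) &&
                   test_bit(num, &vsdei->running);
    }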

Thanks for your valuable comments, Oliver. I haven't started working on
v7 yet; I would like to make everything clear before that. That way, it
will be easier for you to review the next revision :)

>>>>           unsigned long pending;       /* the event is pending for delivery and handling */
>>>>           unsigned long active;        /* the event is currently being handled           */
>>>>
>>>>           :
>>>>           <this part is just like what you suggested>
>>>>       };
>>>>
>>>> I renamed @pending to @unregister. Besides, there are two states added:
>>>>
>>>>      @pending: Indicates that an event has been injected. The next step
>>>>                for the event is to deliver it for handling. For one
>>>>                particular event, we allow at most one pending event.
>>>
>>> Right, if an event retriggers when it is pending we still dispatch a
>>> single event to the guest. And since we're only doing normal priority
>>> events, it is entirely implementation defined which gets dispatched
>>> first.
>>>
>>
>> Yep, we will simply rely on find_first_bit() for the priority. It means
>> the software signaled event, whose number is zero, will have the highest
>> priority.
>>
>>>>      @active:  Indicates the event is currently being handled. The
>>>>                information stored in the 'struct kvm_sdei_event_context'
>>>>                instance can be correlated with the event.
>>>
>>> Does this need to be a bitmap though? We can't ever have more than one
>>> SDEI event active at a time since this is private to a vCPU.
>>>
>>
>> Yes, one event is active at most on one particular vCPU. So it doesn't
>> have to be a bitmap necessarily. The reason I proposed to use a bitmap
>> for this state is to have all (event) states represented by bitmaps.
>> In this way, all states are managed in a unified fashion. The alternative
>> way is to have "unsigned long active_event", which traces the active
>> event number. It also consumes 8 bytes when live migration is concerned.
>> So I prefer a bitmap :)
>>
> 
> The small benefit of using the event number is that we can address all
> events in 8 bytes, whereas we'd need to extend the bitmap for >64
> events. I suppose we'll run into that issue either way, since the
> pending, registered, and enabled portions are also bitmaps.
> 
> When live migration is in scope we should probably bark at userspace if
> it attempts to set more than a single bit in the register.
> 

Even though it's unlikely we'll support the shared event, a bitmap will
help in that case. I'm not sure about other VMMs, but the pseudo firmware
registers are almost transparent to user space in QEMU. They're accessed,
but no one cares about the values read from or written to these registers
in QEMU ;-)

Please refer to the above reply for more details :)

>>>> Furthermore, it's fair enough to put the (vcpu) mask state into 'flags'
>>>> field of struct kvm_vcpu_arch :)
>>>
>>> I think you can get away with putting active in there too, I don't see
>>> why we need more than a single bit for this info.
>>>
>>
>> Not really. We just need one single bit for the vCPU's mask state. We need
>> multiple bits for the events' active state, depending on how many events
>> are supported. We need to know which event is currently active at least.
>> For now, there are only two supported events (0/1), but one single bit
>> is still not enough because there are 3 states: (1) the software-signaled
>> event is active. (2) the async PF event is active. (3) none of them is
>> active.
>>
>> Let's use a bitmap for the event active state as I said above, if you
>> don't strongly object :)
>>
>>>>>>>>> Do we need this if we disallow nesting events?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, we need this. "event == NULL" is used as indication of invalid
>>>>>>>> context. @event is the associated SDEI event when the context is
>>>>>>>> valid.
>>>>>>>
>>>>>>> What if we use some other plumbing to indicate the state of the vCPU? MP
>>>>>>> state comes to mind, for example.
>>>>>>>
>>>>>>
>>>>>> Even if the indication is done by another state, kvm_sdei_vcpu_context
>>>>>> still needs to be linked (associated) with the event. Once the vCPU
>>>>>> context becomes valid after the event is delivered, we still need to
>>>>>> know the associated event when some hypercalls are triggered.
>>>>>> SDEI_1_0_FN_SDEI_EVENT_COMPLETE is one example; we need to decrease
>>>>>> struct kvm_sdei_event::event_count for that hypercall.
>>>>>
>>>>> Why do we need to keep track of how many times an event has been
>>>>> signaled? Nothing in SDEI seems to suggest that the number of event
>>>>> signals corresponds to the number of times the handler is invoked. In
>>>>> fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:
>>>>>
>>>>> """
>>>>> The event has edgetriggered semantics and the number of event signals
>>>>> may not correspond to the number of times the handler is invoked in the
>>>>> target PE.
>>>>> """
>>>>>
>>>>> DEN0054C 5.1.16.1
>>>>>
>>>>> So perhaps we queue at most 1 pending event for the guest.
>>>>>
>>>>> I'd also like to see if anyone else has thoughts on the topic, as I'd
>>>>> hate for you to go back to the whiteboard again in the next spin.
>>>>>
>>>>
>>>> Agreed. In the next respin, we will have one pending event at most. An
>>>> error can be returned if the user attempts to inject an event whose
>>>> pending state (struct kvm_sdei_vcpu::pending) has already been set.
>>>
>>> I don't believe we can do that. The SDEI_EVENT_SIGNAL call should succeed,
>>> even if the event was already pending.
>>>
>>
>> I've rethought it a bit. Yes, you're correct. In this specific case, the
>> event handler runs once for multiple signaled events.
>>
>>>> Indeed, the hardest part is to determine the data structures and
>>>> functions we need. Oliver, your valuable comments are helping to
>>>> bring this series onto the right track. However, I do think it would
>>>> be helpful if somebody else could confirm the outcomes of the previous
>>>> discussions. I'm not sure if Marc has time for a quick scan and to
>>>> provide comments.
>>>>
>>>> I would summarize the outcomes from our discussions, to help Marc
>>>> or others to confirm:
>>>
>>> Going to take a look at some of your later patches as well, just a heads
>>> up.
>>>
>>
>> Yep, thanks again for your valuable comments :)
>>
>>>> - Drop support for the shared event.
>>>> - Drop support for the critical event.
>>>> - The events in the implementation are all private and can be signaled
>>>>     (raised) by software.
>>>> - Drop migration support for now, and we will consider it using
>>>>     pseudo firmware registers. So add-on patches are expected to support
>>>>     the migration in future.
>>>
>>> Migration will be supported in a future spin of this series, not a
>>> subsequent one right? :) I had just made the suggestion because there
>>> were a lot of renovations that we were discussing.
>>>
>>
>> I prefer a separate series to support migration after this series gets
>> merged. There are a couple of reasons to do so: (1) The migration depends
>> on Raghavendra's series to support hypercall services selection.
>> That series is close to being merged, but it hasn't happened yet. SDEI
>> is one of the hypercall services at least. SDEI's pseudo firmware
>> registers for migration will be managed by that infrastructure. (2) I
>> would focus on the core functionality for now. In this way, we leave
>> some room for migration. For example, the data structures may need
>> various adjustments for migration, just in case.
> 
> Although merging this series w/o support for LM would mean that a guest
> using SDEI could potentially explode when migrated, right? We can't
> break it to implement something else.
> 

Yes. I even had migration code for v6, which is really coarse and
needs to be polished. If two VMs are mismatched in terms of SDEI support,
the migration fails. With Raghavendra's series to support hypercall
services selection, we can make SDEI migration opt-in to some extent at
least. For example, migration from a source where SDEI isn't supported
to a destination VM where SDEI is supported. That helps people complete
forward upgrades.

    git@github.com:gwshan/linux.git  (branch: kvm/arm64_sdei)

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 04/18] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
  2022-05-02  3:43         ` Oliver Upton
@ 2022-05-02  7:28           ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  7:28 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/2/22 11:43 AM, Oliver Upton wrote:
> On Mon, May 02, 2022 at 10:55:51AM +0800, Gavin Shan wrote:
>>>> +	unsigned long route_mode = smccc_get_arg(vcpu, 4);
>>>
>>> This is really 'flags'. route_mode is bit[0]. I imagine we don't want to
>>> support relative mode, so bit[1] is useless for us in that case too.
>>>
>>> The spec is somewhat imprecise on what happens for reserved flags. The
>>> prototype in section 5.1.2 of [1] suggests that reserved bits must be
>>> zero, but 5.1.2.3 'Client responsibilities' does not state that invalid
>>> flags result in an error.
>>>
>>> Arm TF certainly rejects unexpected flags [2].
>>>
>>> [1]: DEN0054C https://developer.arm.com/documentation/den0054/latest
>>> [2]: https://github.com/ARM-software/arm-trusted-firmware/blob/66c3906e4c32d675eb06bd081de8a3359f76b84c/services/std_svc/sdei/sdei_main.c#L260
>>>
>>
>> Yes, this chunk of code still sticks to the old specification. Let's
>> improve it in the next respin:
>>
>>     - Rename @route_mode to @flags
>>     - Reject if the reserved bits are set.
>>     - Reject if relative mode (bit#1) is selected.
>>     - Reject if routing mode (bit#0) isn't RM_ANY (0).
> 
> Bit[0] is ignored for private events, actually. So we really just reject
> if any of bits[63:1] are set.
> 

It makes sense to me. Thanks for your confirmation :)
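
For the record, the planned check would look roughly like below (just a
sketch; the final code in the next respin may differ):

    unsigned long flags = smccc_get_arg(vcpu, 4);

    /*
     * Bit[0] (routing mode) is ignored for private events and
     * relative mode isn't supported, so any of bits[63:1] being
     * set means invalid parameters.
     */
    if (flags & ~0x1UL)
            return SDEI_INVALID_PARAMETERS;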

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall
  2022-05-02  6:19       ` Gavin Shan
@ 2022-05-02  7:38         ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-05-02  7:38 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Mon, May 02, 2022 at 02:19:30PM +0800, Gavin Shan wrote:
> Hi Oliver,
> 
> On 5/1/22 2:50 PM, Oliver Upton wrote:
> > On Sun, Apr 03, 2022 at 11:39:06PM +0800, Gavin Shan wrote:
> > > This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
> > > They are used by guest to notify the completion of event in its
> > > handler. The previously interrupted or preempted context is restored
> > > like below.
> > > 
> > >     * x0 - x17, PC and PState are restored to what values we had in
> > >       the interrupted or preempted context.
> > > 
> > >     * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
> > >       is injected.
> > 
> > I don't think that's how COMPLETE_AND_RESUME works. The caller specifies an
> > address at which it would like to begin execution within the client
> > exception level.
> > 
> > SDEI spec suggests this behaves like a synchronous exception. DEN 0054C
> > 5.2.2 'Event Resume Context' speaks more about how it is supposed to
> > work.
> > 
> 
> It's actually the Linux convention. If the event handler, which was
> specified in the previous hypercall to EVENT_REGISTER, returns success,
> the (Linux) client calls into COMPLETE_AND_RESUME and the resume
> address is specified with the FIQ vector offset. More details can be
> found in arch/arm64/kernel/sdei.c::do_sdei_event().

Right -- but look at what it's doing. It returns the address at which it
wants to resume execution.

arch/arm64/kernel/entry.S::__sdei_asm_handler winds up passing this as
an argument to COMPLETE_AND_RESUME. Also, what would happen if we ran
something that isn't Linux inside of KVM? This is why I suggested
implementing COMPLETE_AND_RESUME in line with the specification, not
based on what the kernel is presently doing.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall
  2022-05-02  7:38         ` Oliver Upton
@ 2022-05-02  7:51           ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  7:51 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/2/22 3:38 PM, Oliver Upton wrote:
> On Mon, May 02, 2022 at 02:19:30PM +0800, Gavin Shan wrote:
>> On 5/1/22 2:50 PM, Oliver Upton wrote:
>>> On Sun, Apr 03, 2022 at 11:39:06PM +0800, Gavin Shan wrote:
>>>> This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
>>>> They are used by guest to notify the completion of event in its
>>>> handler. The previously interrupted or preempted context is restored
>>>> like below.
>>>>
>>>>      * x0 - x17, PC and PState are restored to what values we had in
>>>>        the interrupted or preempted context.
>>>>
>>>>      * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
>>>>        is injected.
>>>
>>> I don't think that's how COMPLETE_AND_RESUME works. The caller specifies an
>>> address at which it would like to begin execution within the client
>>> exception level.
>>>
>>> SDEI spec suggests this behaves like a synchronous exception. DEN 0054C
>>> 5.2.2 'Event Resume Context' speaks more about how it is supposed to
>>> work.
>>>
>>
>> It's actually the Linux convention. If the event handler, which was
>> specified in the previous hypercall to EVENT_REGISTER, returns success,
>> the (Linux) client calls into COMPLETE_AND_RESUME and the resume
>> address is specified with the FIQ vector offset. More details can be
>> found in arch/arm64/kernel/sdei.c::do_sdei_event().
> 
> Right -- but look at what it's doing. It returns the address at which it
> wants to resume execution.
> 
> arch/arm64/kernel/entry.S::__sdei_asm_handler winds up passing this as
> an argument to COMPLETE_AND_RESUME. Also, what would happen if we ran
> something that isn't Linux inside of KVM? This is why I suggested
> implementing COMPLETE_AND_RESUME in line with the specification, not
> based on what the kernel is presently doing.
> 

Indeed. The address for the resumed execution is passed in x1 when
COMPLETE_AND_RESUME is called. I will sort this out in the next revision.
I don't think we can assume that the guest is Linux.
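
To capture the idea, the completion path would look something like the
sketch below. The helper and context field names are made up, and the
PSTATE adjustments the specification requires on resume are omitted:

    static void kvm_sdei_complete(struct kvm_vcpu *vcpu, bool resume)
    {
            /* interrupted context saved at delivery (x0-x17, PC, PState) */
            struct kvm_sdei_vcpu_context *ctxt = vcpu->arch.sdei_ctxt;
            unsigned long resume_pc = smccc_get_arg(vcpu, 1);
            int i;

            /* Restore the interrupted or preempted context */
            for (i = 0; i <= 17; i++)
                    vcpu_set_reg(vcpu, i, ctxt->regs[i]);
            *vcpu_cpsr(vcpu) = ctxt->pstate;
            *vcpu_pc(vcpu) = ctxt->pc;

            /*
             * For COMPLETE_AND_RESUME, enter the client at the address
             * passed in x1, as if a synchronous exception were taken
             * (DEN0054C 5.2.2), instead of at the interrupted PC.
             */
            if (resume)
                    *vcpu_pc(vcpu) = resume_pc;
    }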

Thanks again for your review and comments :)

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-05-02  7:25                       ` Gavin Shan
@ 2022-05-02  7:57                         ` Oliver Upton
  -1 siblings, 0 replies; 111+ messages in thread
From: Oliver Upton @ 2022-05-02  7:57 UTC (permalink / raw)
  To: Gavin Shan
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

On Mon, May 02, 2022 at 03:25:40PM +0800, Gavin Shan wrote:
> Oliver, how about adjusting struct kvm_sdei_vcpu as below? With these
> changes, struct kvm_sdei_vcpu::unregistering is dropped, to match the
> specification strictly.
> 
>    struct kvm_sdei_vcpu {
>        unsigned long registered;
>        unsigned long enabled;
>        unsigned long running;        // renamed from 'active' to match the specification strictly
>        unsigned long pending;        // event pending for delivery
>           :
>    };
> 
>    state                          @registered  @enabled  @running  @pending
>    --------------------------------------------------------------------------------
>    unregistered                   0            0         0/1       0
>    registered-disabled            1            0         0         0/1
>    registered-enabled             1            1         0/1       0/1
>    handler-running                0/1          0/1       1         0/1
> 
> We can use the specific encoding to represent the unregistration-pending.
> 
>    state                          @registered  @enabled  @running  @pending
>    -------------------------------------------------------------------------
>    handler-running                0            0          1        0

Right, this is what I had in mind. This encodes the
'handler-unregister-pending' state.

> Thanks for your valuable comments, Oliver. I haven't started working on
> v7 yet; I would like to make everything clear before that. That way, it
> will be easier for you to review the next revision :)
> 
> > > > >           unsigned long pending;       /* the event is pending for delivery and handling */
> > > > >           unsigned long active;        /* the event is currently being handled           */
> > > > > 
> > > > >           :
> > > > >           <this part is just like what you suggested>
> > > > >       };
> > > > > 
> > > > > I rename @pending to @unregister. Besides, there are two states added:
> > > > > 
> > > > >      @pending: Indicate there has one event has been injected. The next step
> > > > >                for the event is to deliver it for handling. For one particular
> > > > >                event, we allow one pending event in the maximum.
> > > > 
> > > > Right, if an event retriggers when it is pending we still dispatch a
> > > > single event to the guest. And since we're only doing normal priority
> > > > events, it is entirely implementation defined which gets dispatched
> > > > first.
> > > > 
> > > 
> > > Yep, we will simply rely on find_first_bit() for the priority. It means
> > > the software signaled event, whose number is zero, will have the highest
> > > priority.
> > > 
> > > > >      @active:  Indicate the event is currently being handled. The information
> > > > >                stored in 'struct kvm_sdei_event_context' instance can be
> > > > >                correlated with the event.
> > > > 
> > > > Does this need to be a bitmap though? We can't ever have more than one
> > > > SDEI event active at a time since this is private to a vCPU.
> > > > 
> > > 
> > > Yes, one event is active at most on one particular vCPU. So it doesn't
> > > have to be a bitmap necessarily. The reason I proposed to use a bitmap
> > > for this state is to have all (event) states represented by bitmaps.
> > > In this way, all states are managed in a unified fashion. The alternative
> > > way is to have "unsigned long active_event", which traces the active
> > > event number. It also consumes 8 bytes when live migration is concerned.
> > > So I prefer a bitmap :)
> > > 
> > 
> > The small benefit of using the event number is that we can address all
> > events in 8 bytes, whereas we'd need to extend the bitmap for >64
> > events. I suppose we'll run into that issue either way, since the
> > pending, registered, and enabled portions are also bitmaps.
> > 
> > When live migration is in scope we should probably bark at userspace if
> > it attempts to set more than a single bit in the register.
> > 
> 
> Even though it's unlikely we'll support the shared event, a bitmap will
> help in that case. I'm not sure about other VMMs, but the pseudo firmware
> registers are almost transparent to user space in QEMU. They're accessed,
> but no one cares about the values read from or written to these registers
> in QEMU ;-)

Regardless of whether userspace actually manipulates the registers we
should still reject unsupported values. For example:

Let's say the VM is started on a kernel that introduced yet another SDEI
widget outside of your series. The VM was migrated back to an older
kernel w/o the SDEI widget, and as such the VMM attempts to set the
widget bit. Since the old kernel doesn't know what to do with the value,
it should return EINVAL to userspace.
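
Something like the following sketch would do (the register handler, mask,
and field names here are invented for illustration):

    /* hypothetical handler for the SDEI 'running' pseudo-register */
    static int kvm_sdei_set_running_reg(struct kvm_vcpu *vcpu, u64 val)
    {
            /* reject bits for events this kernel doesn't know about */
            if (val & ~KVM_SDEI_SUPPORTED_EVENT_MASK)
                    return -EINVAL;

            /* at most one event can be running (active) on a vCPU */
            if (hweight64(val) > 1)
                    return -EINVAL;

            vcpu->arch.sdei->running = val;
            return 0;
    }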

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure
  2022-05-02  7:57                         ` Oliver Upton
@ 2022-05-02  8:23                           ` Gavin Shan
  -1 siblings, 0 replies; 111+ messages in thread
From: Gavin Shan @ 2022-05-02  8:23 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, linux-kernel, eauger, Jonathan.Cameron, vkuznets, will,
	shannon.zhaosl, james.morse, mark.rutland, maz, pbonzini,
	shan.gavin

Hi Oliver,

On 5/2/22 3:57 PM, Oliver Upton wrote:
> On Mon, May 02, 2022 at 03:25:40PM +0800, Gavin Shan wrote:
>> Oliver, how about adjusting struct kvm_sdei_vcpu as below? With these
>> changes, struct kvm_sdei_vcpu::unregistering is dropped, to match the
>> specification strictly.
>>
>>     struct kvm_sdei_vcpu {
>>         unsigned long registered;
>>         unsigned long enabled;
>>         unsigned long running;        // renamed from 'active' to match the specification strictly
>>         unsigned long pending;        // event pending for delivery
>>            :
>>     };
>>
>>     state                          @registered  @enabled  @running  @pending
>>     --------------------------------------------------------------------------------
>>     unregistered                   0            0         0/1       0
>>     registered-disabled            1            0         0         0/1
>>     registered-enabled             1            1         0/1       0/1
>>     handler-running                0/1          0/1       1         0/1
>>
>> We can use the specific encoding to represent the unregistration-pending.
>>
>>     state                          @registered  @enabled  @running  @pending
>>     -------------------------------------------------------------------------
>>     handler-running                0            0          1        0
> 
> Right, this is what I had in mind. This encodes the
> 'handler-unregister-pending' state.
> 

Cool, thanks for your confirmation. I think we're on the same page about
the data structures now. With this, I'm able to start working on the next
revision. Oliver, I'm sorry it took so much of your time to get to this
point :)

>> Thanks for your valuable comments, Oliver. I haven't started working on
>> v7 yet; I would like to make everything clear before that. That way, it
>> will be easier for you to review the next revision :)
>>
>>>>>>            unsigned long pending;       /* the event is pending for delivery and handling */
>>>>>>            unsigned long active;        /* the event is currently being handled           */
>>>>>>
>>>>>>            :
>>>>>>            <this part is just like what you suggested>
>>>>>>        };
>>>>>>
>>>>>> I rename @pending to @unregister. Besides, there are two states added:
>>>>>>
>>>>>>       @pending: Indicate there has one event has been injected. The next step
>>>>>>                 for the event is to deliver it for handling. For one particular
>>>>>>                 event, we allow one pending event in the maximum.
>>>>>
>>>>> Right, if an event retriggers when it is pending we still dispatch a
>>>>> single event to the guest. And since we're only doing normal priority
>>>>> events, it is entirely implementation defined which gets dispatched
>>>>> first.
>>>>>
>>>>
>>>> Yep, we will simply rely on find_first_bit() for the priority. It means
>>>> the software signaled event, whose number is zero, will have the highest
>>>> priority.
>>>>
>>>>>>       @active:  Indicate the event is currently being handled. The information
>>>>>>                 stored in 'struct kvm_sdei_event_context' instance can be
>>>>>>                 correlated with the event.
>>>>>
>>>>> Does this need to be a bitmap though? We can't ever have more than one
>>>>> SDEI event active at a time since this is private to a vCPU.
>>>>>
>>>>
>>>> Yes, one event is active at most on one particular vCPU. So it doesn't
>>>> have to be a bitmap necessarily. The reason I proposed to use a bitmap
>>>> for this state is to have all (event) states represented by bitmaps.
>>>> In this way, all states are managed in a unified fashion. The alternative
>>>> way is to have "unsigned long active_event", which traces the active
>>>> event number. It also consumes 8 bytes when live migration is concerned.
>>>> So I prefer a bitmap :)
>>>>
>>>
>>> The small benefit of using the event number is that we can address all
>>> events in 8 bytes, whereas we'd need to extend the bitmap for >64
>>> events. I suppose we'll run into that issue either way, since the
>>> pending, registered, and enabled portions are also bitmaps.
>>>
>>> When live migration is in scope we should probably bark at userspace if
>>> it attempts to set more than a single bit in the register.
>>>
>>
>> Even though it's unlikely we'll support the shared event, a bitmap will
>> help in that case. I'm not sure about other VMMs, but the pseudo firmware
>> registers are almost transparent to user space in QEMU. They're accessed,
>> but no one cares about the values read from or written to these registers
>> in QEMU ;-)
> 
> Regardless of whether userspace actually manipulates the registers we
> should still reject unsupported values. For example:
> 
> Let's say the VM is started on a kernel that introduced yet another SDEI
> widget outside of your series. The VM was migrated back to an older
> kernel w/o the SDEI widget, and as such the VMM attempts to set the
> widget bit. Since the old kernel doesn't know what to do with the value
> it should return EINVAL to userspace.
> 

Yep, agreed. Thanks for the examples and details. Let's have more
discussion when the series to support migration is posted.

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 111+ messages in thread

end of thread, other threads:[~2022-05-02  8:24 UTC | newest]

Thread overview: 111+ messages
-- links below jump to the message on this page --
2022-04-03 15:38 [PATCH v6 00/18] Support SDEI Virtualization Gavin Shan
2022-04-03 15:38 ` Gavin Shan
2022-04-03 15:38 ` [PATCH v6 01/18] KVM: arm64: Extend smccc_get_argx() Gavin Shan
2022-04-03 15:38   ` Gavin Shan
2022-04-03 15:38 ` [PATCH v6 02/18] KVM: arm64: Route hypercalls based on their owner Gavin Shan
2022-04-03 15:38   ` Gavin Shan
2022-04-21  8:19   ` Oliver Upton
2022-04-21  8:19     ` Oliver Upton
2022-04-22 12:20     ` Gavin Shan
2022-04-22 12:20       ` Gavin Shan
2022-04-22 17:59       ` Oliver Upton
2022-04-22 17:59         ` Oliver Upton
2022-04-23 12:48         ` Gavin Shan
2022-04-23 12:48           ` Gavin Shan
2022-04-03 15:38 ` [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure Gavin Shan
2022-04-03 15:38   ` Gavin Shan
2022-04-22 21:48   ` Oliver Upton
2022-04-22 21:48     ` Oliver Upton
2022-04-23 14:18     ` Gavin Shan
2022-04-23 14:18       ` Gavin Shan
2022-04-23 18:43       ` Oliver Upton
2022-04-23 18:43         ` Oliver Upton
2022-04-24  3:00         ` Gavin Shan
2022-04-24  3:00           ` Gavin Shan
2022-04-28 20:28           ` Oliver Upton
2022-04-28 20:28             ` Oliver Upton
2022-04-30 11:38             ` Gavin Shan
2022-04-30 11:38               ` Gavin Shan
2022-04-30 14:16               ` Oliver Upton
2022-04-30 14:16                 ` Oliver Upton
2022-05-02  2:35                 ` Gavin Shan
2022-05-02  2:35                   ` Gavin Shan
2022-05-02  3:40                   ` Oliver Upton
2022-05-02  3:40                     ` Oliver Upton
2022-05-02  7:25                     ` Gavin Shan
2022-05-02  7:25                       ` Gavin Shan
2022-05-02  7:57                       ` Oliver Upton
2022-05-02  7:57                         ` Oliver Upton
2022-05-02  8:23                         ` Gavin Shan
2022-05-02  8:23                           ` Gavin Shan
2022-04-03 15:38 ` [PATCH v6 04/18] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall Gavin Shan
2022-04-03 15:38   ` Gavin Shan
2022-04-30 14:54   ` Oliver Upton
2022-04-30 14:54     ` Oliver Upton
2022-05-02  2:55     ` Gavin Shan
2022-05-02  2:55       ` Gavin Shan
2022-05-02  3:43       ` Oliver Upton
2022-05-02  3:43         ` Oliver Upton
2022-05-02  7:28         ` Gavin Shan
2022-05-02  7:28           ` Gavin Shan
2022-04-03 15:38 ` [PATCH v6 05/18] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} Gavin Shan
2022-04-03 15:38   ` Gavin Shan
2022-04-03 15:38 ` [PATCH v6 06/18] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall Gavin Shan
2022-04-03 15:38   ` Gavin Shan
2022-04-30 15:03   ` Oliver Upton
2022-04-30 15:03     ` Oliver Upton
2022-05-02  2:57     ` Gavin Shan
2022-05-02  2:57       ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 07/18] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 08/18] KVM: arm64: Support SDEI_EVENT_STATUS hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 09/18] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 10/18] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-04 10:26   ` [PATCH] KVM: arm64: fix returnvar.cocci warnings kernel test robot
2022-04-04 10:26     ` kernel test robot
2022-04-04 10:54     ` Gavin Shan
2022-04-04 10:54       ` Gavin Shan
2022-04-04 10:54       ` Gavin Shan
2022-04-04 10:29   ` [PATCH v6 10/18] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall kernel test robot
2022-04-04 10:29     ` kernel test robot
2022-04-03 15:39 ` [PATCH v6 11/18] KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 12/18] KVM: arm64: Support SDEI event injection, delivery Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall Gavin Shan
2022-04-03 15:39   ` [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall Gavin Shan
2022-05-01  6:50   ` [PATCH v6 13/18] KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME} hypercall Oliver Upton
2022-05-01  6:50     ` Oliver Upton
2022-05-02  6:19     ` Gavin Shan
2022-05-02  6:19       ` Gavin Shan
2022-05-02  7:38       ` Oliver Upton
2022-05-02  7:38         ` Oliver Upton
2022-05-02  7:51         ` Gavin Shan
2022-05-02  7:51           ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 14/18] KVM: arm64: Support SDEI_EVENT_SIGNAL hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-30 21:32   ` Oliver Upton
2022-04-30 21:32     ` Oliver Upton
2022-05-02  3:04     ` Gavin Shan
2022-05-02  3:04       ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 15/18] KVM: arm64: Support SDEI_FEATURES hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-05-01  6:55   ` Oliver Upton
2022-05-01  6:55     ` Oliver Upton
2022-05-02  3:05     ` Gavin Shan
2022-05-02  3:05       ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 16/18] KVM: arm64: Support SDEI_VERSION hypercall Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 17/18] KVM: arm64: Expose SDEI capability Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:39 ` [PATCH v6 18/18] KVM: selftests: Add SDEI test case Gavin Shan
2022-04-03 15:39   ` Gavin Shan
2022-04-03 15:47 ` [PATCH v6 00/18] Support SDEI Virtualization Gavin Shan
2022-04-03 15:47   ` Gavin Shan
2022-04-04  6:09   ` Oliver Upton
2022-04-04  6:09     ` Oliver Upton
2022-04-04 10:53     ` Gavin Shan
2022-04-04 10:53       ` Gavin Shan
