* [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
@ 2023-09-26 10:03 Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 01/37] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
                   ` (33 more replies)
  0 siblings, 34 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:03 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

[ *REPEAT: Sent patches got held at internal server yesterday* ]

PROLOGUE
========

To assist in the review and to set the right expectations from this RFC, please
first read the following sections *APPENDED AT THE END* of this cover letter:

1. Important *DISCLAIMER* [Section (X)]
2. Work presented at KVMForum Conference (slides available) [Section (V)F]
3. Organization of patches [Section (XI)]
4. References [Section (XII)]
5. Detailed TODO list of the leftover work or work-in-progress [Section (IX)]

NOTE: Other organizations have shown interest in adapting this series for
their architectures. I am planning to split this RFC into architecture
*agnostic* and *specific* patch-sets in subsequent releases. The ARM-specific
patch-set will continue as RFC V3, and the architecture-agnostic patch-set
will be floated without the RFC tag and can be consumed in this Qemu cycle if
the MAINTAINERs ack it.

[Please check section (XI)B for details of architecture agnostic patches]


SECTIONS [I - XIII] are as follows :

(I) Key Changes (RFC V1 -> RFC V2)
    ==================================

    RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/

1. The ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
   *online-capable* or *enabled* to the Guest OS at boot time. This means the
   associated CPUs can have their ACPI _STA as *enabled* or *disabled* even
   after boot. See UEFI ACPI 6.5 Spec, Section 5, Table 5.37 GICC CPU
   Interface Flags [20].
2. SMCCC/HVC hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
   requests. This is required to allow or disallow onlining a vCPU.
3. Unplugged vCPUs are now always presented in the CPUs' ACPI AML code as ACPI
   _STA.PRESENT to the Guest OS. ACPI _STA.Enabled is toggled to give the
   effect of hot(un)plug.
4. Live Migration works (some issues still remain).
5. TCG/HVF/qtest do not support hotplug and fall back to the default behaviour.
6. Code for TCG support does exist in this release (it is a work-in-progress).
7. The ACPI _OSC method can now be used by the OSPM to negotiate the Qemu VM
   platform hotplug capability (_OSC Query support is still pending).
8. Misc. bug fixes.

(II) Summary
     =======

This patch-set introduces virtual CPU hotplug support for the ARMv8 architecture
in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest
VM is running, without requiring a reboot. It does *not* make any assumption
about physical CPU hotplug availability within the host system, but rather tries
to solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU
hotplug hooks and event handling to interface with the guest kernel, and code to
initialize, plug and unplug CPUs. No changes are required within the host
kernel/KVM except the support for hypercall exit handling in user-space/Qemu,
which has recently been added to the kernel. The corresponding guest kernel
changes have been posted on the mailing-list [3] [4] by James Morse.

(III) Motivation
      ==========

This allows scaling the guest VM compute capacity on-demand which would be
useful for the following example scenarios,

1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
   framework which could adjust resource requests (CPU and Mem requests) for
   the containers in a pod, based on usage.
2. Pay-as-you-grow Business Model: An infrastructure provider could allocate
   and restrict the total number of compute resources available to the guest VM
   according to the SLA (Service Level Agreement). The VM owner could request
   more compute to be hot-plugged at some cost.

For example, a Kata Container VM starts with a minimum amount of resources
(i.e. a hotplug-everything approach). Why?

1. Allowing faster *boot time* and
2. Reduction in *memory footprint*

Kata Container VM can boot with just 1 vCPU and then later more vCPUs can be
hot-plugged as per requirement.

(IV) Terminology
     ===========

(*) Possible CPUs:  Total vCPUs which could ever exist in the VM. This includes
                    any cold-booted CPUs plus any CPUs which could be later
                    hot-plugged.
                    - Qemu parameter (-smp maxcpus=N)
(*) Present CPUs:   Possible CPUs which are ACPI 'present'. These might or might
                    not be ACPI 'enabled'.
                    - Present vCPUs = Possible vCPUs (Always on ARM Arch)
(*) Enabled CPUs:   Possible CPUs which are ACPI 'present' and 'enabled' and can
                    now be 'onlined' (PSCI) for use by the Guest Kernel. All
                    cold-booted vCPUs are ACPI 'enabled' at boot. Later, using
                    device_add, more vCPUs can be hot-plugged and made ACPI
                    'enabled'.
                    - Qemu parameter (-smp cpus=N). Can be used to specify some
                      cold-booted vCPUs during VM init. Some can be added using
                      the '-device' option.
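The relationship between these sets can be summarized in a small sketch
(purely illustrative; the type and function names below are hypothetical and
are not QEMU APIs):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the vCPU sets described above.
 * Invariants: enabled <= present, and on the ARM arch present == possible. */
typedef struct {
    int possible;   /* -smp maxcpus=N : all vCPUs that can ever exist */
    int present;    /* ACPI 'present' vCPUs                           */
    int enabled;    /* ACPI 'present' + 'enabled' (onlinable) vCPUs   */
} VcpuSets;

static bool vcpu_sets_valid_on_arm(const VcpuSets *s)
{
    return s->present == s->possible &&            /* fixed on ARM arch */
           s->enabled >= 0 && s->enabled <= s->present;
}
```

For instance, '-smp cpus=4,maxcpus=6' gives possible=6, present=6, enabled=4.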

(V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
    ===============================================================

A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
   1. The ARMv8 CPU architecture does not support the concept of physical CPU
      hotplug.
      a. There are many per-CPU components like PMU, SVE, MTE, Arch timers etc.
         whose behaviour needs to be clearly defined when a CPU is
         hot(un)plugged. There is no specification for this.

   2. Other ARM components like the GIC etc. have not been designed to realize
      physical CPU hotplug capability as of now. For example,
      a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
         The architecture does not specify what CPU hot(un)plug would mean in
         the context of any of these.
      b. CPUs/GICCs are physically connected to unique GICRs (GIC
         Redistributors). GIC Redistributors are always part of the always-on
         power domain and hence cannot be powered off as per the specification.

B. Impediments in Firmware/ACPI (Architectural Constraint)

   1. Firmware has to expose the GICC, GICR and other per-CPU features like
      PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the architectural
      constraint stated in section A1(a) above, all interrupt controller
      structures of the MADT describing the GIC CPU Interfaces and the GIC
      Redistributors MUST be presented by the firmware to the OSPM at boot time.
   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method
      to get this kind of information from the firmware even after boot, and
      the OSPM has the capability to process it. The ARM kernel uses the
      information in the MADT interrupt controller structures to identify the
      number of Present CPUs during boot and hence does not allow these to
      change after boot. The number of present CPUs cannot be changed. It is an
      architectural constraint!

C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)

   1. KVM VGIC:
       a. Sizing of various VGIC resources like memory regions etc. related to
          the redistributor happens only once, at VM init time, and cannot be
          changed after initialization has happened. KVM statically configures
          these resources based on the number of vCPUs and the number/size of
          the redistributor ranges.
       b. The association between a vCPU and its VGIC redistributor is fixed at
          VM init time within KVM, i.e. when the redistributor iodevs get
          registered. VGIC does not allow this association to be set up or
          changed after VM initialization has happened. Physically, every
          CPU/GICC is uniquely connected with its redistributor, and there is
          no architectural way to set this up.
   2. KVM vCPUs:
       a. The lack of a specification means destruction of KVM vCPUs does not
          exist, as there is no reference to tell what to do with other
          per-vCPU components like redistributors, arch timers etc.
       b. In fact, KVM does not implement destruction of vCPUs for any
          architecture. This is independent of whether the architecture
          actually supports the CPU hotplug feature. For example, even for x86,
          KVM does not implement destruction of vCPUs.

D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)

   1. Qemu CPU objects MUST be created to initialize all the host KVM vCPUs to
      overcome the KVM constraint. KVM vCPUs are created and initialized when
      Qemu CPU objects are realized. But keeping the QOM CPU objects realized
      for 'yet-to-be-plugged' vCPUs creates problems when these new vCPUs are
      later plugged using device_add and a new QOM CPU object is created.
   2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
      at VM init time, when the QOM GICV3 object is realized. This is because
      the KVM VGIC can only be initialized once, at init time. But every
      GICV3CPUState has an associated QOM CPU object, and the latter might
      correspond to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
   3. How should new QOM CPU objects be connected back to the GICV3CPUState
      objects and disconnected from them when a CPU is hot(un)plugged?
   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
      QOM for which a KVM vCPU already exists? For example, whether to keep,
       a. No QOM CPU objects Or
       b. Unrealized CPU Objects
   5. How should vCPU state be exposed via ACPI to the Guest? Especially for
      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
      within the QOM, while the Guest always expects all possible vCPUs to be
      identified as ACPI *present* during boot.
   6. How should Qemu expose the GIC CPU interfaces for the unplugged or
      yet-to-be-plugged vCPUs to the Guest using the ACPI MADT Table?
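The sizing constraint in point 2 can be sketched as follows. This is a
simplified, hypothetical model only: the struct and function names are
illustrative stand-ins, not the real GICV3State/GICV3CPUState code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for QEMU's GICv3 state objects. */
typedef struct CPUStateSketch { int cpu_index; } CPUStateSketch;

typedef struct {
    CPUStateSketch *cpu;      /* back-link to QOM vCPU, NULL if unplugged */
} GICv3CPUStateSketch;

typedef struct {
    int num_cpu;              /* fixed when the GIC is realized           */
    GICv3CPUStateSketch *cpu; /* one entry per *possible* vCPU            */
} GICv3StateSketch;

/* The GIC is realized once and sized with the possible (not merely
 * cold-booted) vCPU count, since the KVM VGIC cannot be resized later. */
static void gic_realize_sketch(GICv3StateSketch *s, int possible_cpus)
{
    s->num_cpu = possible_cpus;
    s->cpu = calloc(possible_cpus, sizeof(*s->cpu));
}
```

With 2 cold-booted vCPUs out of 6 possible, entries 2-5 simply keep a NULL
back-link until a QOM CPU object is realized for them at hotplug time.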

E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)

   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e. even
      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
      powered-off state.
   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
      VM init time i.e. when Qemu GIC is realized. This in turn sizes KVM VGIC
      resources like memory regions etc. related to the redistributors with the
      number of possible KVM vCPUs. This never changes after VM has initialized.
   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
      released post Host KVM CPU and GIC/VGIC initialization.
   5. Build ACPI MADT Table with below updates 
      a. Number of GIC CPU interface entries (=possible vCPUs)
      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable) 
      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1  
         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy) 
	 - Some issues with above (details in later sections)
   6. Expose below ACPI Status to Guest kernel
      a. Always _STA.Present=1 (all possible vCPUs)
      b. _STA.Enabled=1 (plugged vCPUs)
      c. _STA.Enabled=0 (unplugged vCPUs)
   7. vCPU hotplug *realizes* new QOM CPU object. Following happens,
      a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread
      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
         - Attaches to QOM CPU object.
      c. Reinitializes KVM vCPU in the Host
         - Resets the core and sys regs, sets defaults etc.
      d. Runs KVM vCPU (created with "start-powered-off")
	 - vCPU thread sleeps (waits for vCPU reset via PSCI) 
      e. Updates Qemu GIC
         - Wires back IRQs related to this vCPU.
         - GICV3CPUState association with QOM CPU Object.
      f. Updates [6] ACPI _STA.Enabled=1
      g. Notifies Guest about new vCPU (via ACPI GED interface)
	 - Guest checks _STA.Enabled=1
	 - Guest adds processor (registers CPU with LDM) [3]
      h. Plugs the QOM CPU object in the slot.
         - slot-number = cpu-index{socket,cluster,core,thread}
      i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC)
         - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-on KVM vCPU in the Host
   8. vCPU hot-unplug *unrealizes* QOM CPU Object. Following happens,
      a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event
         - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC) 
      b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-off the KVM vCPU in the Host
      c. Guest signals *Eject* vCPU to Qemu
      d. Qemu updates [6] ACPI _STA.Enabled=0
      e. Updates GIC
         - Un-wires IRQs related to this vCPU
         - GICV3CPUState association with new QOM CPU Object is updated.
      f. Unplugs the vCPU
	 - Removes from slot
         - Parks KVM vCPU ("kvm_parked_vcpus" list)
         - Unrealizes QOM CPU Object & joins back Qemu vCPU thread
	 - Destroys QOM CPU object 
      g. Guest checks ACPI _STA.Enabled=0
         - Removes processor (unregisters CPU with LDM) [3]
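The park/unpark steps above (E.2, E.7b, E.8f) can be sketched with a minimal
model of the per-VM "kvm_parked_vcpus" list. This is an illustrative sketch
only: QEMU's real implementation uses QLIST_* macros on a KVMState, and the
function names below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* A parked entry keeps the kernel vCPU fd alive while no QOM CPU
 * object exists for that vCPU. */
typedef struct ParkedVcpu {
    unsigned long vcpu_id;
    int kvm_fd;                 /* fd of the pre-created KVM vCPU */
    struct ParkedVcpu *next;
} ParkedVcpu;

static ParkedVcpu *parked_list; /* models "kvm_parked_vcpus" */

/* Unplug (or init-time pre-create): stash the KVM vCPU fd. */
static void park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    ParkedVcpu *p = malloc(sizeof(*p));
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = parked_list;
    parked_list = p;
}

/* Hotplug via device_add: remove the matching parked fd so it can be
 * re-attached to the freshly realized QOM CPU object; -1 if absent. */
static int unpark_vcpu(unsigned long vcpu_id)
{
    for (ParkedVcpu **pp = &parked_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            ParkedVcpu *p = *pp;
            int fd = p->kvm_fd;
            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -1;
}
```

The key point mirrored here is that the kernel-side vCPU is never destroyed:
it merely moves between "parked" and "attached to a QOM CPU object".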

F. Work Presented at KVM Forum Conferences:
   Details of above work has been presented at KVMForum2020 and KVMForum2023
   conferences. Slides are available at below links,
   a. KVMForum 2023
      - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64)
        https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
   b. KVMForum 2020
      - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei
        https://sched.co/eE4m

(VI) Commands Used
     =============

    A. Qemu launch commands to init the machine

    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
    -cpu host -smp cpus=4,maxcpus=6 \
    -m 300M \
    -kernel Image \
    -initrd rootfs.cpio.gz \
    -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
    -nographic \
    -bios QEMU_EFI.fd

    B. Hot-(un)plug related commands

    # Hotplug a host vCPU (accel=kvm)
    $ device_add host-arm-cpu,id=core4,core-id=4

    # Hotplug a vCPU (accel=tcg)
    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4

    # Delete the vCPU
    $ device_del core4

    Sample output on guest after boot:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-3
    $ cat /sys/devices/system/cpu/online
    0-1
    $ cat /sys/devices/system/cpu/offline
    2-5

    Sample output on guest after hotplug of vCPU=4:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-4
    $ cat /sys/devices/system/cpu/online
    0-1,4
    $ cat /sys/devices/system/cpu/offline
    2-3,5

    Note: vCPU=4 was explicitly 'onlined' after hot-plug
    $ echo 1 > /sys/devices/system/cpu/cpu4/online

(VII) Repository
      ==========

 (*) QEMU changes for vCPU hotplug could be cloned from below site,
     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
 (*) Guest Kernel changes (by James Morse, ARM) are available here:
     https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2


(VIII) KNOWN ISSUES
       ============

1. Migration has been lightly tested. Below are some of the known issues:
   - Occasional CPU stall (not always repeatable)
   - Negative test cases like an asymmetric source/destination VM config cause
     a crash dump.
   - Migration with TCG is not working properly.
2. TCG in single-threaded mode is broken.
3. HVF and qtest support is broken.
4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
   mutually exclusive i.e. as per the change [6] a vCPU cannot be both
   GICC.Enabled and GICC.online-capable. This means,
      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
   a. If we have to support hot-unplug of the cold-booted vCPUs then these MUST
      be specified as GICC.online-capable in the MADT Table during boot by the
      firmware/Qemu. But this requirement conflicts with the requirement to
      support the new Qemu changes with legacy OSes which don't understand the
      MADT.GICC.online-capable bit. A legacy OS will ignore this bit during
      boot, and hence these vCPUs will not appear on such an OS. This is
      unexpected behaviour.
   b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
      these cold-booted vCPUs from the OS (which in fact should be blocked by
      Qemu returning an error), then features like 'kexec' will break.
   c. As I understand, removal of the cold-booted vCPUs is a required feature
      and the x86 world allows it.
   d. Hence, either we need a specification change to make the MADT.GICC.Enabled
      and MADT.GICC.online-capable bits NOT mutually exclusive, or we must NOT
      support removal of cold-booted vCPUs. In the latter case, a check can be
      introduced to bar users from unplugging cold-booted vCPUs using QMP
      commands. (Needs discussion!)
      Please check the below patch, part of this patch-set:
          [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
5. Code related to the notification to GICV3 about hot(un)plug of a vCPU event
   might need further discussion.
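The mutual exclusion in issue 4 can be illustrated with a small sketch. The
helper below is hypothetical (not part of the series); the flag bit positions
follow the ACPI 6.5 GICC flags layout (bit 0 = Enabled, bit 3 = Online
Capable) [7].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ACPI 6.5 MADT GICC flags (Table 5.37). */
#define GICC_ENABLED        (1u << 0)
#define GICC_ONLINE_CAPABLE (1u << 3)

/* Hypothetical helper: choose GICC flags for a vCPU. Under the current
 * spec a vCPU is either Enabled (cold-booted, hence not hot-unpluggable)
 * or Online Capable (hot-pluggable), never both. */
static uint32_t gicc_flags(bool cold_booted)
{
    return cold_booted ? GICC_ENABLED : GICC_ONLINE_CAPABLE;
}
```

Issue 4 is precisely that this either/or choice forces cold-booted vCPUs to
give up hot-unplug, or legacy-OS compatibility, but not keep both.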


(IX) THINGS TO DO
     ============

1. Fix the Migration Issues
2. Fix issues related to TCG/Emulation support.
3. Comprehensive Testing. Current testing is very basic.
   a. Negative Test cases
4. Qemu Documentation (.rst) needs to be updated.
5. Fix qtest, HVF Support
6. Fix the design issue related to ACPI MADT.GICC flags discussed in known
   issues. This might require UEFI ACPI specification change!
7. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now.

 Above is *not* a complete list. Will update later!

Best regards
Salil.

(X) DISCLAIMER
    ==========

This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
implementation to the community. This is *not* production-level code and might
have bugs. Only basic testing has been done on a HiSilicon Kunpeng920 server
SoC. Once the design and the core idea behind the implementation have been
verified, more effort can be put into hardening the code.

This work is *mostly* along the lines of the discussions which have happened
over the previous years [see refs below] across different channels like the
mailing-list, the Linaro Open Discussions platform, and various conferences
like KVMForum. This RFC is being used as a way to verify the idea mentioned in
this cover-letter and to get community views. Once this has been agreed upon, a
formal patch shall be posted to the mailing-list for review.

[The concept being presented has been found to work!]

(XI) ORGANIZATION OF PATCHES
     =======================
 
 A. All patches [Architecture 'agnostic' + 'specific']:

   [Patch 1-9, 23, 36] Logic required during machine init
    (*) Some validation checks
    (*) Introduces the core-id property and some util functions required later.
    (*) Refactors the parking logic of vCPUs
    (*) Logic to pre-create vCPUs
    (*) GIC initialization pre-sized with possible vCPUs.
    (*) Some refactoring to have common hot and cold plug logic together.
    (*) Release of disabled QOM CPU objects in post_cpu_init()
    (*) Support of the ACPI _OSC method to negotiate platform hotplug
        capabilities
   [Patch 10-22] Logic related to ACPI at machine init time
    (*) Changes required to enable ACPI for CPU hotplug
    (*) Initialization of the ACPI GED framework to cater for CPU hotplug
        events
    (*) Build ACPI AML related to the CPU control device
    (*) ACPI MADT/MAT changes
   [Patch 24-35] Logic required during vCPU hot-(un)plug
    (*) Basic framework changes to support vCPU hot-(un)plug
    (*) ACPI GED changes for hot-(un)plug hooks.
    (*) Wire-unwire the IRQs
    (*) GIC notification logic
    (*) ARMCPU unrealize logic
    (*) Handling of SMCCC hypercall exits by KVM to Qemu
   
 B. Architecture *agnostic* patches part of patch-set:

   [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug 
    (*) Refactors Parking logic of vCPUs
    (*) Introduces ACPI GED Support for vCPU Hotplug Events
    (*) Introduces ACPI AML change for CPU Control Device     

(XII) REFERENCES
      ==========

[1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
[2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
[3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
[4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
[5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
[6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
[7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
[8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
[10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
[11] https://lkml.org/lkml/2019/7/10/235
[12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
[13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
[14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
[15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
[16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
[17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
[18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
[19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/ 
[20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags

(XIII) ACKNOWLEDGEMENTS
       ================

I would like to take this opportunity to thank the people below for various
discussions with me over different channels during the development:

Marc Zyngier (Google)               Catalin Marinas (ARM),         
James Morse(ARM),                   Will Deacon (Google), 
Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat), 
Jonathan Cameron (Huawei),          Darren Hart (Ampere),
Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
Andrew Jones (Redhat),              Karl Heubaum (Oracle),
Keqian Zhu (Huawei),                Miguel Luis (Oracle),
Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
Shameerali Kolothum (Huawei)        Russell King (Oracle)
Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
Zengtao/Prime (Huawei),             And all those whom I have missed! 

Many thanks to below people for their current or past contributions:

1. James Morse (ARM)
   (Current Kernel part of vCPU Hotplug Support on AARCH64)
2. Jean-Philippe Brucker (Linaro)
   (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
3. Keqian Zhu (Huawei)
   (Co-developed Qemu prototype)
4. Xiongfeng Wang (Huawei)
   (Co-developed earlier kernel prototype)
5. Vishnu Pajjuri (Ampere)
   (Verification on Ampere ARM64 Platforms + fixes)
6. Miguel Luis (Oracle)
   (Verification on Oracle ARM64 Platforms + fixes)


Author Salil Mehta (1):
  target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu

Jean-Philippe Brucker (2):
  hw/acpi: Make _MAT method optional
  target/arm/kvm: Write CPU state back to KVM on reset

Miguel Luis (1):
  tcg/mttcg: enable threads to unregister in tcg_ctxs[]

Salil Mehta (33):
  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  cpus-common: Add common CPU utility for possible vCPUs
  hw/arm/virt: Move setting of common CPU properties in a function
  arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  accel/kvm: Extract common KVM vCPU {creation,parking} code
  arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
  arm/virt: Init PMU at host for all possible vcpus
  hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  arm/acpi: Enable ACPI support for vcpu hotplug
  hw/acpi: Add ACPI CPU hotplug init stub
  hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
  hw/acpi: Init GED framework with cpu hotplug events
  arm/virt: Add cpu hotplug events to GED during creation
  arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
  hw/acpi: Update GED _EVT method AML with cpu scan
  hw/arm: MADT Tbl change to size the guest with possible vCPUs
  arm/virt: Release objects for *disabled* possible vCPUs after init
  hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  arm/virt: Add/update basic hot-(un)plug framework
  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
  hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
  hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
  arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  hw/arm: Changes required for reset and to support next boot
  physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
  target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  hw/arm: Support hotplug capability check using _OSC method
  hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled

 accel/kvm/kvm-all.c                    |  61 +-
 accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
 cpus-common.c                          |  37 ++
 gdbstub/gdbstub.c                      |  13 +
 hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
 hw/acpi/cpu.c                          |  91 ++-
 hw/acpi/generic_event_device.c         |  33 +
 hw/arm/Kconfig                         |   1 +
 hw/arm/boot.c                          |   2 +-
 hw/arm/virt-acpi-build.c               | 110 +++-
 hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
 hw/core/gpio.c                         |   2 +-
 hw/i386/acpi-build.c                   |   2 +-
 hw/intc/arm_gicv3.c                    |   1 +
 hw/intc/arm_gicv3_common.c             |  66 +-
 hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
 hw/intc/arm_gicv3_cpuif_common.c       |   5 +
 hw/intc/arm_gicv3_kvm.c                |  39 +-
 hw/intc/gicv3_internal.h               |   2 +
 include/exec/cpu-common.h              |   8 +
 include/exec/gdbstub.h                 |   1 +
 include/hw/acpi/cpu.h                  |   7 +-
 include/hw/acpi/cpu_hotplug.h          |   4 +
 include/hw/acpi/generic_event_device.h |   5 +
 include/hw/arm/boot.h                  |   2 +
 include/hw/arm/virt.h                  |  10 +-
 include/hw/core/cpu.h                  |  77 +++
 include/hw/intc/arm_gicv3_common.h     |  23 +
 include/hw/qdev-core.h                 |   2 +
 include/sysemu/kvm.h                   |   2 +
 include/tcg/tcg.h                      |   1 +
 softmmu/physmem.c                      |  25 +
 target/arm/arm-powerctl.c              |  51 +-
 target/arm/cpu-qom.h                   |   3 +
 target/arm/cpu.c                       | 112 ++++
 target/arm/cpu.h                       |  17 +
 target/arm/cpu64.c                     |  15 +
 target/arm/gdbstub.c                   |   6 +
 target/arm/helper.c                    |  27 +-
 target/arm/internals.h                 |  12 +-
 target/arm/kvm.c                       |  93 ++-
 target/arm/kvm64.c                     |  59 +-
 target/arm/kvm_arm.h                   |  24 +
 target/arm/meson.build                 |   1 +
 target/arm/{tcg => }/psci.c            |   8 +
 target/arm/tcg/meson.build             |   4 -
 tcg/tcg.c                              |  23 +
 47 files changed, 1873 insertions(+), 349 deletions(-)
 rename target/arm/{tcg => }/psci.c (97%)

-- 
2.34.1




* [PATCH RFC V2 01/37] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 23:57   ` [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs Salil Mehta via
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

This shall be used to store the user-specified topology
{socket,cluster,core,thread} and shall be converted to a unique 'vcpu-id' which
is used as the slot-index during hot(un)plug of a vCPU.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c    | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c |  4 +++
 target/arm/cpu.h |  4 +++
 3 files changed, 71 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 7d9dbc2663..57fe97c242 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -221,6 +221,11 @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("max"),
 };
 
+static int virt_get_socket_id(const MachineState *ms, int cpu_index);
+static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
+static int virt_get_core_id(const MachineState *ms, int cpu_index);
+static int virt_get_thread_id(const MachineState *ms, int cpu_index);
+
 static bool cpu_type_valid(const char *cpu)
 {
     int i;
@@ -2168,6 +2173,14 @@ static void machvirt_init(MachineState *machine)
                           &error_fatal);
 
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
+        object_property_set_int(cpuobj, "socket-id",
+                                virt_get_socket_id(machine, n), NULL);
+        object_property_set_int(cpuobj, "cluster-id",
+                                virt_get_cluster_id(machine, n), NULL);
+        object_property_set_int(cpuobj, "core-id",
+                                virt_get_core_id(machine, n), NULL);
+        object_property_set_int(cpuobj, "thread-id",
+                                virt_get_thread_id(machine, n), NULL);
 
         if (!vms->secure) {
             object_property_set_bool(cpuobj, "has_el3", false, NULL);
@@ -2652,10 +2665,59 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
     return socket_id % ms->numa_state->num_nodes;
 }
 
+static int virt_get_socket_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
+}
+
+static int virt_get_cluster_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
+}
+
+static int virt_get_core_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.core_id;
+}
+
+static int virt_get_thread_id(const MachineState *ms, int cpu_index)
+{
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
+}
+
+static int
+virt_get_cpu_id_from_cpu_topo(const MachineState *ms, DeviceState *dev)
+{
+    int cpu_id, sock_vcpu_num, clus_vcpu_num, core_vcpu_num;
+    ARMCPU *cpu = ARM_CPU(dev);
+
+    /* calculate total logical cpus across socket/cluster/core */
+    sock_vcpu_num = cpu->socket_id * (ms->smp.threads * ms->smp.cores *
+                    ms->smp.clusters);
+    clus_vcpu_num = cpu->cluster_id * (ms->smp.threads * ms->smp.cores);
+    core_vcpu_num = cpu->core_id * ms->smp.threads;
+
+    /* get vcpu-id(logical cpu index) for this vcpu from this topology */
+    cpu_id = (sock_vcpu_num + clus_vcpu_num + core_vcpu_num) + cpu->thread_id;
+
+    assert(cpu_id >= 0 && cpu_id < ms->possible_cpus->len);
+
+    return cpu_id;
+}
+
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
     unsigned int max_cpus = ms->smp.max_cpus;
+    unsigned int smp_threads = ms->smp.threads;
     VirtMachineState *vms = VIRT_MACHINE(ms);
     MachineClass *mc = MACHINE_GET_CLASS(vms);
 
@@ -2669,6 +2731,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     ms->possible_cpus->len = max_cpus;
     for (n = 0; n < ms->possible_cpus->len; n++) {
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
+        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
         ms->possible_cpus->cpus[n].arch_id =
             virt_cpu_mp_affinity(vms, n);
 
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 93c28d50e5..1376350416 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2277,6 +2277,10 @@ static Property arm_cpu_properties[] = {
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
+    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
+    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
+    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
+    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
     DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
     DEFINE_PROP_END_OF_LIST()
 };
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 88e5accda6..d51d39f621 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1094,6 +1094,10 @@ struct ArchCPU {
     QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
 
     int32_t node_id; /* NUMA node this CPU belongs to */
+    int32_t socket_id;
+    int32_t cluster_id;
+    int32_t core_id;
+    int32_t thread_id;
 
     /* Used to synchronize KVM and QEMU in-kernel device levels */
     uint8_t device_irq_level;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 01/37] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-27  3:54   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Adds various utility functions which might be required to fetch or check the
state of the possible vCPUs. This also introduces the concept of *disabled*
vCPUs, which are part of the *possible* vCPUs but not part of the *present*
vCPUs. This state shall be used at machine init time to check the presence of
vCPUs.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 cpus-common.c         | 31 +++++++++++++++++++++++++
 include/hw/core/cpu.h | 53 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/cpus-common.c b/cpus-common.c
index 45c745ecf6..24c04199a1 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -24,6 +24,7 @@
 #include "sysemu/cpus.h"
 #include "qemu/lockable.h"
 #include "trace/trace-root.h"
+#include "hw/boards.h"
 
 QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
@@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
     cpu_list_generation_id++;
 }
 
+CPUState *qemu_get_possible_cpu(int index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    const CPUArchIdList *possible_cpus = ms->possible_cpus;
+
+    assert((index >= 0) && (index < possible_cpus->len));
+
+    return CPU(possible_cpus->cpus[index].cpu);
+}
+
+bool qemu_present_cpu(CPUState *cpu)
+{
+    return cpu;
+}
+
+bool qemu_enabled_cpu(CPUState *cpu)
+{
+    return cpu && !cpu->disabled;
+}
+
+uint64_t qemu_get_cpu_archid(int cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    const CPUArchIdList *possible_cpus = ms->possible_cpus;
+
+    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
+
+    return possible_cpus->cpus[cpu_index].arch_id;
+}
+
 CPUState *qemu_get_cpu(int index)
 {
     CPUState *cpu;
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index fdcbe87352..e5af79950c 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -413,6 +413,17 @@ struct CPUState {
     SavedIOTLB saved_iotlb;
 #endif
 
+    /*
+     * Some architectures do not allow the *presence* of vCPUs to be
+     * changed after the guest has booted using the information specified
+     * by the VMM/firmware via ACPI MADT at boot time. Thus, to enable
+     * vCPU hotplug on these architectures, a possible vCPU can have its
+     * CPUState object in a 'disabled' state or have no CPUState object
+     * at all. This happens when vCPU hotplug is supported and vCPUs are
+     * 'yet-to-be-plugged' in the QOM or have been hot-unplugged.
+     * By default, every CPUState is currently enabled across all archs.
+     */
+    bool disabled;
     /* TODO Move common fields from CPUArchState here. */
     int cpu_index;
     int cluster_index;
@@ -770,6 +781,48 @@ static inline bool cpu_in_exclusive_context(const CPUState *cpu)
  */
 CPUState *qemu_get_cpu(int index);
 
+/**
+ * qemu_get_possible_cpu:
+ * @index: The CPUState@cpu_index value of the CPU to obtain.
+ *         Input index MUST be in range [0, Max Possible CPUs)
+ *
+ * If a CPUState object exists, then it returns the CPU matching
+ * @index in the possible CPU array.
+ *
+ * Returns: The possible CPU or %NULL if CPU does not exist.
+ */
+CPUState *qemu_get_possible_cpu(int index);
+
+/**
+ * qemu_present_cpu:
+ * @cpu: The vCPU to check
+ *
+ * Checks if the vCPU is amongst the present possible vCPUs.
+ *
+ * Returns: True if it is a present possible vCPU, else false
+ */
+bool qemu_present_cpu(CPUState *cpu);
+
+/**
+ * qemu_enabled_cpu:
+ * @cpu: The vCPU to check
+ *
+ * Checks if the vCPU is enabled.
+ *
+ * Returns: True if it is 'enabled' else false
+ */
+bool qemu_enabled_cpu(CPUState *cpu);
+
+/**
+ * qemu_get_cpu_archid:
+ * @cpu_index: possible vCPU for which the arch-id needs to be retrieved
+ *
+ * Fetches the vCPU arch-id from the present possible vCPUs.
+ *
+ * Returns: arch-id of the possible vCPU
+ */
+uint64_t qemu_get_cpu_archid(int cpu_index);
+
 /**
  * cpu_exists:
  * @id: Guest-exposed CPU ID to lookup.
-- 
2.34.1




* [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 01/37] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-27  5:16   ` Gavin Shan
  2023-10-10  6:46   ` Shaoqin Huang
  2023-09-26 10:04 ` [PATCH RFC V2 04/37] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
                   ` (30 subsequent siblings)
  33 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Factor out the CPU properties code common to {hot,cold}-plugged CPUs. This
allows code reuse.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 220 ++++++++++++++++++++++++++----------------
 include/hw/arm/virt.h |   4 +
 2 files changed, 140 insertions(+), 84 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 57fe97c242..0eb6bf5a18 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2018,16 +2018,130 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
     }
 }
 
+static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
+                                    Error **errp)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+    Error *local_err = NULL;
+    VirtMachineClass *vmc;
+
+    vmc = VIRT_MACHINE_GET_CLASS(ms);
+
+    /* now, set the cpu object property values */
+    numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
+
+    if (!vms->secure) {
+        object_property_set_bool(cpuobj, "has_el3", false, NULL);
+    }
+
+    if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
+        object_property_set_bool(cpuobj, "has_el2", false, NULL);
+    }
+
+    if (vmc->kvm_no_adjvtime &&
+        object_property_find(cpuobj, "kvm-no-adjvtime")) {
+        object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
+    }
+
+    if (vmc->no_kvm_steal_time &&
+        object_property_find(cpuobj, "kvm-steal-time")) {
+        object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
+    }
+
+    if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
+        object_property_set_bool(cpuobj, "pmu", false, NULL);
+    }
+
+    if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
+        object_property_set_bool(cpuobj, "lpa2", false, NULL);
+    }
+
+    if (object_property_find(cpuobj, "reset-cbar")) {
+        object_property_set_int(cpuobj, "reset-cbar",
+                                vms->memmap[VIRT_CPUPERIPHS].base,
+                                &local_err);
+        if (local_err) {
+            goto out;
+        }
+    }
+
+    /* link already initialized {secure,tag}-memory regions to this cpu */
+    object_property_set_link(cpuobj, "memory", OBJECT(vms->sysmem), &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    if (vms->secure) {
+        object_property_set_link(cpuobj, "secure-memory",
+                                 OBJECT(vms->secure_sysmem), &local_err);
+        if (local_err) {
+            goto out;
+        }
+    }
+
+    if (vms->mte) {
+        if (!object_property_find(cpuobj, "tag-memory")) {
+            error_setg(&local_err, "MTE requested, but not supported "
+                       "by the guest CPU");
+            if (local_err) {
+                goto out;
+            }
+        }
+
+        object_property_set_link(cpuobj, "tag-memory", OBJECT(vms->tag_sysmem),
+                                 &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        if (vms->secure) {
+            object_property_set_link(cpuobj, "secure-tag-memory",
+                                     OBJECT(vms->secure_tag_sysmem),
+                                     &local_err);
+            if (local_err) {
+                goto out;
+            }
+        }
+    }
+
+    /*
+     * RFC: Question: this must only be called for the hotplugged CPUs. For the
+     * cold-booted secondary CPUs this is taken care of in arm_load_kernel()
+     * in boot.c. Perhaps we should remove that code now?
+     */
+    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
+        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
+                                NULL);
+
+        /* Secondary CPUs start in PSCI powered-down state */
+        if (CPU(cpuobj)->cpu_index > 0) {
+            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
+        }
+    }
+
+out:
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
+    return;
+}
+
 static void machvirt_init(MachineState *machine)
 {
     VirtMachineState *vms = VIRT_MACHINE(machine);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *possible_cpus;
-    MemoryRegion *sysmem = get_system_memory();
+    MemoryRegion *secure_tag_sysmem = NULL;
     MemoryRegion *secure_sysmem = NULL;
     MemoryRegion *tag_sysmem = NULL;
-    MemoryRegion *secure_tag_sysmem = NULL;
+    MemoryRegion *sysmem;
     int n, virt_max_cpus;
     bool firmware_loaded;
     bool aarch64 = true;
@@ -2071,6 +2185,8 @@ static void machvirt_init(MachineState *machine)
      */
     finalize_gic_version(vms);
 
+    sysmem = vms->sysmem = get_system_memory();
+
     if (vms->secure) {
         /*
          * The Secure view of the world is the same as the NonSecure,
@@ -2078,7 +2194,7 @@ static void machvirt_init(MachineState *machine)
          * containing the system memory at low priority; any secure-only
          * devices go in at higher priority and take precedence.
          */
-        secure_sysmem = g_new(MemoryRegion, 1);
+        secure_sysmem = vms->secure_sysmem = g_new(MemoryRegion, 1);
         memory_region_init(secure_sysmem, OBJECT(machine), "secure-memory",
                            UINT64_MAX);
         memory_region_add_subregion_overlap(secure_sysmem, 0, sysmem, -1);
@@ -2151,6 +2267,23 @@ static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
+    if (vms->mte) {
+        /* Create the memory region only once, but link to all cpus later */
+        tag_sysmem = vms->tag_sysmem = g_new(MemoryRegion, 1);
+        memory_region_init(tag_sysmem, OBJECT(machine),
+                           "tag-memory", UINT64_MAX / 32);
+
+        if (vms->secure) {
+            secure_tag_sysmem = vms->secure_tag_sysmem = g_new(MemoryRegion, 1);
+            memory_region_init(secure_tag_sysmem, OBJECT(machine),
+                               "secure-tag-memory", UINT64_MAX / 32);
+
+            /* As with ram, secure-tag takes precedence over tag.  */
+            memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
+                                                tag_sysmem, -1);
+        }
+    }
+
     create_fdt(vms);
 
     assert(possible_cpus->len == max_cpus);
@@ -2163,15 +2296,10 @@ static void machvirt_init(MachineState *machine)
         }
 
         cpuobj = object_new(possible_cpus->cpus[n].type);
-        object_property_set_int(cpuobj, "mp-affinity",
-                                possible_cpus->cpus[n].arch_id, NULL);
 
         cs = CPU(cpuobj);
         cs->cpu_index = n;
 
-        numa_cpu_pre_plug(&possible_cpus->cpus[cs->cpu_index], DEVICE(cpuobj),
-                          &error_fatal);
-
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
         object_property_set_int(cpuobj, "socket-id",
                                 virt_get_socket_id(machine, n), NULL);
@@ -2182,82 +2310,6 @@ static void machvirt_init(MachineState *machine)
         object_property_set_int(cpuobj, "thread-id",
                                 virt_get_thread_id(machine, n), NULL);
 
-        if (!vms->secure) {
-            object_property_set_bool(cpuobj, "has_el3", false, NULL);
-        }
-
-        if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
-            object_property_set_bool(cpuobj, "has_el2", false, NULL);
-        }
-
-        if (vmc->kvm_no_adjvtime &&
-            object_property_find(cpuobj, "kvm-no-adjvtime")) {
-            object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
-        }
-
-        if (vmc->no_kvm_steal_time &&
-            object_property_find(cpuobj, "kvm-steal-time")) {
-            object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
-        }
-
-        if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
-            object_property_set_bool(cpuobj, "pmu", false, NULL);
-        }
-
-        if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
-            object_property_set_bool(cpuobj, "lpa2", false, NULL);
-        }
-
-        if (object_property_find(cpuobj, "reset-cbar")) {
-            object_property_set_int(cpuobj, "reset-cbar",
-                                    vms->memmap[VIRT_CPUPERIPHS].base,
-                                    &error_abort);
-        }
-
-        object_property_set_link(cpuobj, "memory", OBJECT(sysmem),
-                                 &error_abort);
-        if (vms->secure) {
-            object_property_set_link(cpuobj, "secure-memory",
-                                     OBJECT(secure_sysmem), &error_abort);
-        }
-
-        if (vms->mte) {
-            /* Create the memory region only once, but link to all cpus. */
-            if (!tag_sysmem) {
-                /*
-                 * The property exists only if MemTag is supported.
-                 * If it is, we must allocate the ram to back that up.
-                 */
-                if (!object_property_find(cpuobj, "tag-memory")) {
-                    error_report("MTE requested, but not supported "
-                                 "by the guest CPU");
-                    exit(1);
-                }
-
-                tag_sysmem = g_new(MemoryRegion, 1);
-                memory_region_init(tag_sysmem, OBJECT(machine),
-                                   "tag-memory", UINT64_MAX / 32);
-
-                if (vms->secure) {
-                    secure_tag_sysmem = g_new(MemoryRegion, 1);
-                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
-                                       "secure-tag-memory", UINT64_MAX / 32);
-
-                    /* As with ram, secure-tag takes precedence over tag.  */
-                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
-                                                        tag_sysmem, -1);
-                }
-            }
-
-            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
-                                     &error_abort);
-            if (vms->secure) {
-                object_property_set_link(cpuobj, "secure-tag-memory",
-                                         OBJECT(secure_tag_sysmem),
-                                         &error_abort);
-            }
-        }
-
         qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
         object_unref(cpuobj);
     }
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index e1ddbea96b..13163adb07 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -148,6 +148,10 @@ struct VirtMachineState {
     DeviceState *platform_bus_dev;
     FWCfgState *fw_cfg;
     PFlashCFI01 *flash[2];
+    MemoryRegion *sysmem;
+    MemoryRegion *secure_sysmem;
+    MemoryRegion *tag_sysmem;
+    MemoryRegion *secure_tag_sysmem;
     bool secure;
     bool highmem;
     bool highmem_compact;
-- 
2.34.1




* [PATCH RFC V2 04/37] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (2 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-27  6:28   ` [PATCH RFC V2 04/37] arm/virt,target/arm: " Gavin Shan
  2023-09-27  6:30   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation, parking} code Salil Mehta via
                   ` (29 subsequent siblings)
  33 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Refactor and introduce the common logic required during the initialization of
both cold- and hot-plugged vCPUs. Also initialize the *disabled* state of the
vCPUs, which shall be used further during the init phases of various other
components like the GIC, PMU, ACPI etc. as part of the virt machine
initialization.

KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in a
powered-off state in the KVM host and do not run the guest code. Plugged vCPUs
are also kept in a powered-off state, but their vCPU threads exist and are
kept sleeping.

TBD:
For the cold-booted vCPUs, this change also exists in arm_load_kernel() in
boot.c, but for the hotplugged CPUs this change should still remain part of
the pre-plug phase. We are duplicating the powering-off of the cold-booted
CPUs. Shall we remove the duplicate change from boot.c?

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Reported-by: Gavin Shan <gavin.shan@redhat.com>
[GS: pointed the assertion due to wrong range check]
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
 target/arm/cpu.c   |   7 +++
 target/arm/cpu64.c |  14 +++++
 3 files changed, 156 insertions(+), 14 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0eb6bf5a18..3668ad27ec 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -221,6 +221,7 @@ static const char *valid_cpus[] = {
     ARM_CPU_TYPE_NAME("max"),
 };
 
+static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid);
 static int virt_get_socket_id(const MachineState *ms, int cpu_index);
 static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
 static int virt_get_core_id(const MachineState *ms, int cpu_index);
@@ -2154,6 +2155,14 @@ static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
+    finalize_gic_version(vms);
+    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
+        (vms->gic_version < VIRT_GIC_VERSION_3)) {
+        machine->smp.max_cpus = smp_cpus;
+        mc->has_hotpluggable_cpus = false;
+        warn_report("cpu hotplug feature has been disabled");
+    }
+
     possible_cpus = mc->possible_cpu_arch_ids(machine);
 
     /*
@@ -2180,11 +2189,6 @@ static void machvirt_init(MachineState *machine)
         virt_set_memmap(vms, pa_bits);
     }
 
-    /* We can probe only here because during property set
-     * KVM is not available yet
-     */
-    finalize_gic_version(vms);
-
     sysmem = vms->sysmem = get_system_memory();
 
     if (vms->secure) {
@@ -2289,17 +2293,9 @@ static void machvirt_init(MachineState *machine)
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
         Object *cpuobj;
-        CPUState *cs;
-
-        if (n >= smp_cpus) {
-            break;
-        }
 
         cpuobj = object_new(possible_cpus->cpus[n].type);
 
-        cs = CPU(cpuobj);
-        cs->cpu_index = n;
-
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
         object_property_set_int(cpuobj, "socket-id",
                                 virt_get_socket_id(machine, n), NULL);
@@ -2804,6 +2800,50 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     return ms->possible_cpus;
 }
 
+static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
+{
+    VirtMachineState *vms = VIRT_MACHINE(ms);
+    CPUArchId *found_cpu;
+    uint64_t mp_affinity;
+
+    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
+
+    /*
+     * RFC: Question:
+     * TBD: Should mp-affinity be treated as MPIDR?
+     */
+    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
+    found_cpu = &ms->possible_cpus->cpus[vcpuid];
+
+    assert(found_cpu->arch_id == mp_affinity);
+
+    /*
+     * RFC: Question:
+     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
+     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
+     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
+     * more related to the machine? Current code assumes slot-id and vcpu-id
+     * are the same, i.e. the meaning of a slot is a bit vague.
+     *
+     * Q1: Is there any requirement to clearly represent slot and dissociate it
+     *     from vcpu-id?
+     * Q2: Should we make MPIDR within host KVM user configurable?
+     *
+     *          +----+----+----+----+----+----+----+----+
+     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
+     *          +----+----+----+----+----+----+----+----+
+     *                     \         \         \   |    |
+     *                      \   8bit  \   8bit  \  |4bit|
+     *                       \<------->\<------->\ |<-->|
+     *                        \         \         \|    |
+     *          +----+----+----+----+----+----+----+----+
+     * VCPU-ID  |  Byte4  |  Byte2  |  Byte1  |  Byte0  |
+     *          +----+----+----+----+----+----+----+----+
+     */
+
+    return found_cpu;
+}
+
 static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                  Error **errp)
 {
@@ -2847,6 +2887,81 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
                          dev, &error_abort);
 }
 
+static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                              Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    MachineState *ms = MACHINE(hotplug_dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUState *cs = CPU(dev);
+    CPUArchId *cpu_slot;
+    int32_t min_cpuid = 0;
+    int32_t max_cpuid;
+
+    /* sanity check the cpu */
+    if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
+        error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
+                   ms->cpu_type);
+        return;
+    }
+
+    if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
+        error_setg(errp, "Invalid thread-id %u specified, correct range 0:%u",
+                   cpu->thread_id, ms->smp.threads - 1);
+        return;
+    }
+
+    max_cpuid = ms->possible_cpus->len - 1;
+    if (!dev->hotplugged) {
+        min_cpuid = vms->acpi_dev ? ms->smp.cpus : 0;
+        max_cpuid = vms->acpi_dev ? max_cpuid : ms->smp.cpus - 1;
+    }
+
+    if ((cpu->core_id < min_cpuid) || (cpu->core_id > max_cpuid)) {
+        error_setg(errp, "Invalid core-id %d specified, correct range %d:%d",
+                   cpu->core_id, min_cpuid, max_cpuid);
+        return;
+    }
+
+    if ((cpu->cluster_id < 0) || (cpu->cluster_id >= ms->smp.clusters)) {
+        error_setg(errp, "Invalid cluster-id %u specified, correct range 0:%u",
+                   cpu->cluster_id, ms->smp.clusters - 1);
+        return;
+    }
+
+    if ((cpu->socket_id < 0) || (cpu->socket_id >= ms->smp.sockets)) {
+        error_setg(errp, "Invalid socket-id %u specified, correct range 0:%u",
+                   cpu->socket_id, ms->smp.sockets - 1);
+        return;
+    }
+
+    cs->cpu_index = virt_get_cpu_id_from_cpu_topo(ms, dev);
+
+    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
+    if (qemu_present_cpu(CPU(cpu_slot->cpu))) {
+        error_setg(errp, "cpu(id%d=%d:%d:%d:%d) with arch-id %" PRIu64 " exists",
+                   cs->cpu_index, cpu->socket_id, cpu->cluster_id, cpu->core_id,
+                   cpu->thread_id, cpu_slot->arch_id);
+        return;
+    }
+    virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
+}
+
+static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                          Error **errp)
+{
+    MachineState *ms = MACHINE(hotplug_dev);
+    CPUState *cs = CPU(dev);
+    CPUArchId *cpu_slot;
+
+    /* insert the cold/hot-plugged vcpu in the slot */
+    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
+    cpu_slot->cpu = OBJECT(dev);
+
+    cs->disabled = false;
+    return;
+}
+
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                             DeviceState *dev, Error **errp)
 {
@@ -2888,6 +3003,8 @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
         object_property_set_str(OBJECT(dev), "reserved-regions[0]",
                                 resv_prop_str, errp);
         g_free(resv_prop_str);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_pre_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -2909,6 +3026,8 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
         virt_memory_plug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_plug(hotplug_dev, dev, errp);
     }
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
@@ -2993,7 +3112,8 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
     if (device_is_dynamic_sysbus(mc, dev) ||
         object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
         object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI) ||
-        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
         return HOTPLUG_HANDLER(machine);
     }
     return NULL;
@@ -3070,6 +3190,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
 #endif
     mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
     mc->kvm_type = virt_kvm_type;
+    mc->has_hotpluggable_cpus = true;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
     hc->pre_plug = virt_machine_device_pre_plug_cb;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 1376350416..3a2e7e64ee 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2332,6 +2332,12 @@ static const struct TCGCPUOps arm_tcg_ops = {
 };
 #endif /* CONFIG_TCG */
 
+static int64_t arm_cpu_get_arch_id(CPUState *cs)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    return cpu->mp_affinity;
+}
+
 static void arm_cpu_class_init(ObjectClass *oc, void *data)
 {
     ARMCPUClass *acc = ARM_CPU_CLASS(oc);
@@ -2350,6 +2356,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
     cc->class_by_name = arm_cpu_class_by_name;
     cc->has_work = arm_cpu_has_work;
     cc->dump_state = arm_cpu_dump_state;
+    cc->get_arch_id = arm_cpu_get_arch_id;
     cc->set_pc = arm_cpu_set_pc;
     cc->get_pc = arm_cpu_get_pc;
     cc->gdb_read_register = arm_cpu_gdb_read_register;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 96158093cc..a660e3f483 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -739,6 +739,17 @@ static void aarch64_cpu_set_aarch64(Object *obj, bool value, Error **errp)
     }
 }
 
+static void aarch64_cpu_initfn(Object *obj)
+{
+    CPUState *cs = CPU(obj);
+
+    /*
+     * We start every AArch64 vCPU as a disabled possible vCPU. It must be
+     * enabled explicitly
+     */
+    cs->disabled = true;
+}
+
 static void aarch64_cpu_finalizefn(Object *obj)
 {
 }
@@ -751,7 +762,9 @@ static gchar *aarch64_gdb_arch_name(CPUState *cs)
 static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 {
     CPUClass *cc = CPU_CLASS(oc);
+    DeviceClass *dc = DEVICE_CLASS(oc);
 
+    dc->user_creatable = true;
     cc->gdb_read_register = aarch64_cpu_gdb_read_register;
     cc->gdb_write_register = aarch64_cpu_gdb_write_register;
     cc->gdb_num_core_regs = 34;
@@ -800,6 +813,7 @@ static const TypeInfo aarch64_cpu_type_info = {
     .name = TYPE_AARCH64_CPU,
     .parent = TYPE_ARM_CPU,
     .instance_size = sizeof(ARMCPU),
+    .instance_init = aarch64_cpu_initfn,
     .instance_finalize = aarch64_cpu_finalizefn,
     .abstract = true,
     .class_size = sizeof(AArch64CPUClass),
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation, parking} code
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (3 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 04/37] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-27  6:51   ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation,parking} code Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 06/37] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

KVM vCPU creation is done once during the initialization of the VM, when the
QEMU threads are spawned. This is common to all architectures. If an
architecture supports vCPU hot-{un}plug, this KVM vCPU creation could be
deferred to a later point as well. Some architectures might in any case create
KVM vCPUs for the yet-to-be-plugged vCPUs (i.e. the QOM object and thread do
not exist yet) during VM init time and park them.

Hot-unplug of a vCPU results in the destruction of the vCPU object in QOM, but
the KVM vCPU object in the host KVM is not destroyed; instead, its
representative KVM vCPU object in QEMU is parked.
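The park/unpark pattern described above can be sketched as a stand-alone C
program. This is illustrative only: the list, function names (`park_vcpu`,
`get_parked_fd`) and fd values are hypothetical stand-ins, not QEMU's internal
API; QEMU keeps the parked fds on a QLIST keyed by vcpu id.

```c
#include <assert.h>
#include <stdlib.h>

/* One parked vCPU: the KVM fd survives while the QOM object is gone. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int kvm_fd;
    struct parked_vcpu *next;
};

static struct parked_vcpu *parked_list;

/* On unplug: stash the KVM fd, keyed by vcpu id. */
static void park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    struct parked_vcpu *v = calloc(1, sizeof(*v));
    v->vcpu_id = vcpu_id;
    v->kvm_fd = kvm_fd;
    v->next = parked_list;
    parked_list = v;
}

/* On (re-)plug: return the parked fd and unlink the entry, or -1 if the
 * vCPU was never parked and the caller must fall back to KVM_CREATE_VCPU. */
static int get_parked_fd(unsigned long vcpu_id)
{
    struct parked_vcpu **pp, *v;

    for (pp = &parked_list; (v = *pp) != NULL; pp = &v->next) {
        if (v->vcpu_id == vcpu_id) {
            int fd = v->kvm_fd;
            *pp = v->next;
            free(v);
            return fd;
        }
    }
    return -1;
}
```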

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 accel/kvm/kvm-all.c  | 61 ++++++++++++++++++++++++++++++++++----------
 include/sysemu/kvm.h |  2 ++
 2 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 7b3da8dc3a..86e9c9ea60 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -137,6 +137,7 @@ static QemuMutex kml_slots_lock;
 #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
 
 static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
 
 static inline void kvm_resample_fd_remove(int gsi)
 {
@@ -320,11 +321,51 @@ err:
     return ret;
 }
 
+void kvm_park_vcpu(CPUState *cpu)
+{
+    unsigned long vcpu_id = cpu->cpu_index;
+    struct KVMParkedVcpu *vcpu;
+
+    vcpu = g_malloc0(sizeof(*vcpu));
+    vcpu->vcpu_id = vcpu_id;
+    vcpu->kvm_fd = cpu->kvm_fd;
+    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+}
+
+int kvm_create_vcpu(CPUState *cpu)
+{
+    unsigned long vcpu_id = cpu->cpu_index;
+    KVMState *s = kvm_state;
+    int ret;
+
+    DPRINTF("kvm_create_vcpu\n");
+
+    /* check if the KVM vCPU already exists but is parked */
+    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
+    if (ret > 0) {
+        goto found;
+    }
+
+    /* create a new KVM vcpu */
+    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+    if (ret < 0) {
+        return ret;
+    }
+
+found:
+    cpu->vcpu_dirty = true;
+    cpu->kvm_fd = ret;
+    cpu->kvm_state = s;
+    cpu->dirty_pages = 0;
+    cpu->throttle_us_per_full = 0;
+
+    return 0;
+}
+
 static int do_kvm_destroy_vcpu(CPUState *cpu)
 {
     KVMState *s = kvm_state;
     long mmap_size;
-    struct KVMParkedVcpu *vcpu = NULL;
     int ret = 0;
 
     DPRINTF("kvm_destroy_vcpu\n");
@@ -353,10 +394,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
         }
     }
 
-    vcpu = g_malloc0(sizeof(*vcpu));
-    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
-    vcpu->kvm_fd = cpu->kvm_fd;
-    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+    kvm_park_vcpu(cpu);
 err:
     return ret;
 }
@@ -384,7 +422,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
         }
     }
 
-    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+    return -1;
 }
 
 int kvm_init_vcpu(CPUState *cpu, Error **errp)
@@ -395,19 +433,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
-    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
+    ret = kvm_create_vcpu(cpu);
     if (ret < 0) {
-        error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
+        error_setg_errno(errp, -ret,
+                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
                          kvm_arch_vcpu_id(cpu));
         goto err;
     }
 
-    cpu->kvm_fd = ret;
-    cpu->kvm_state = s;
-    cpu->vcpu_dirty = true;
-    cpu->dirty_pages = 0;
-    cpu->throttle_us_per_full = 0;
-
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
         ret = mmap_size;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 115f0cca79..2c34889b01 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -473,6 +473,8 @@ void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len);
 
 int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr,
                                        hwaddr *phys_addr);
+int kvm_create_vcpu(CPUState *cpu);
+void kvm_park_vcpu(CPUState *cpu);
 
 #endif /* NEED_CPU_H */
 
-- 
2.34.1




* [PATCH RFC V2 06/37] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (4 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation, parking} code Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-27 10:04   ` [PATCH RFC V2 06/37] arm/virt,kvm: " Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

In the ARMv8 architecture, the GIC needs all the vCPUs to be created and
present when it is initialized. This is because:
1. The GICC and MPIDR association must be fixed at VM initialization time.
   This is represented by the register GIC_TYPER(mp_affinity, proc_num).
2. The GICC (CPU interfaces), GICR (redistributors) etc. must all be
   initialized at boot time as well.
3. Memory regions associated with the GICR etc. cannot be changed
   (added/deleted/modified) after the VM has been initialized.

This patch adds support to pre-create all such possible vCPUs within the host
using the KVM interface as part of the virt machine initialization. These
vCPUs can later be attached to QOM/ACPI when they are actually hot-plugged
and made present.
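The machine-init flow above can be summarized in a stand-alone sketch: the
first smp_cpus slots are realized as enabled vCPUs, while the remaining
possible slots are pre-created in a disabled state so the GIC can later be
sized for all of them. The types and names here (`cpu_slot`,
`machine_init_cpus`) are illustrative, not QEMU's.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 4

struct cpu_slot {
    bool present;   /* a vCPU object occupies the slot */
    bool enabled;   /* the vCPU is realized and runnable */
};

static struct cpu_slot slots[MAX_CPUS];

/* Pre-create all possible vCPU slots; only the boot vCPUs are enabled.
 * The rest stand in for the disabled, parked vCPUs awaiting hotplug. */
static void machine_init_cpus(int smp_cpus, int max_cpus)
{
    for (int n = 0; n < max_cpus; n++) {
        slots[n].present = true;
        slots[n].enabled = (n < smp_cpus);
    }
}
```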

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
[VP: Identified CPU stall issue & suggested probable fix]
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 53 +++++++++++++++++++++++++++++++++++++++++--
 include/hw/core/cpu.h |  1 +
 target/arm/cpu64.c    |  1 +
 target/arm/kvm.c      | 32 ++++++++++++++++++++++++++
 target/arm/kvm64.c    |  9 +++++++-
 target/arm/kvm_arm.h  | 11 +++++++++
 6 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3668ad27ec..6ba131b799 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2293,8 +2293,10 @@ static void machvirt_init(MachineState *machine)
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
         Object *cpuobj;
+        CPUState *cs;
 
         cpuobj = object_new(possible_cpus->cpus[n].type);
+        cs = CPU(cpuobj);
 
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
         object_property_set_int(cpuobj, "socket-id",
@@ -2306,8 +2308,55 @@ static void machvirt_init(MachineState *machine)
         object_property_set_int(cpuobj, "thread-id",
                                 virt_get_thread_id(machine, n), NULL);
 
-        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
-        object_unref(cpuobj);
+        if (n < smp_cpus) {
+            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+            object_unref(cpuobj);
+        } else {
+            CPUArchId *cpu_slot;
+
+            /* handling for vcpus which are yet to be hot-plugged */
+            cs->cpu_index = n;
+            cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
+
+            /*
+             * ARM host vCPU features need to be fixed at boot time. But with
+             * the current approach this CPU object will be destroyed during
+             * cpu_post_init(). During hotplug of vCPUs these properties are
+             * initialized again.
+             */
+            virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
+
+            /*
+             * For KVM, we shall be pre-creating the now disabled/un-plugged
+             * possible host vCPUs and parking them till the time they are
+             * actually hot-plugged. This is required to pre-size the host
+             * GICC and GICR with all the possible vCPUs for this VM.
+             */
+            if (kvm_enabled()) {
+                kvm_arm_create_host_vcpu(ARM_CPU(cs));
+            }
+            /*
+             * Add disabled vCPU to CPU slot during the init phase of the virt
+             * machine
+             * 1. We need this ARMCPU object during the GIC init. This object
+             *    will facilitate in pre-realizing the GIC. Any info like
+             *    mp-affinity(required to derive gicr_type) etc. could still be
+             *    fetched while preserving QOM abstraction akin to realized
+             *    vCPUs.
+             * 2. Now, after initialization of the virt machine is complete we
+             *    could use two approaches to deal with this ARMCPU object:
+             *    (i) re-use this ARMCPU object during hotplug of this vCPU.
+             *                             OR
+     *    (ii) defer releasing this ARMCPU object until the GIC has been
+             *         initialized or during pre-plug phase when a vCPU is
+             *         hotplugged.
+             *
+             *    We will use the (ii) approach and release the ARMCPU objects
+             *    after GIC and machine has been fully initialized during
+             *    machine_init_done() phase.
+             */
+             cpu_slot->cpu = OBJECT(cs);
+        }
     }
     fdt_add_timer_nodes(vms);
     fdt_add_cpu_nodes(vms);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index e5af79950c..b2201a98ee 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -401,6 +401,7 @@ struct CPUState {
     uint32_t kvm_fetch_index;
     uint64_t dirty_pages;
     int kvm_vcpu_stats_fd;
+    VMChangeStateEntry *vmcse;
 
     /* Use by accel-block: CPU is executing an ioctl() */
     QemuLockCnt in_ioctl_lock;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index a660e3f483..3a38e7ccaf 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -748,6 +748,7 @@ static void aarch64_cpu_initfn(Object *obj)
      * enabled explicitly
      */
     cs->disabled = true;
+    cs->thread_id = 0;
 }
 
 static void aarch64_cpu_finalizefn(Object *obj)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index b4c7654f49..0e1d0692b1 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -637,6 +637,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
     write_list_to_cpustate(cpu);
 }
 
+void kvm_arm_create_host_vcpu(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    unsigned long vcpu_id = cs->cpu_index;
+    int ret;
+
+    ret = kvm_create_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to create host vcpu %ld", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Initialize the vCPU in the host. This will reset the sys regs
+     * for this vCPU; related registers like MPIDR_EL1 etc. also get
+     * programmed into the host during this call. These are referred to
+     * later while setting the device attributes of the GICR during
+     * GICv3 reset.
+     */
+    ret = kvm_arch_init_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to initialize host vcpu %ld", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Park the created vCPU; it shall be picked up via kvm_get_vcpu() when
+     * threads are created during realization of the ARM vCPUs.
+     */
+    kvm_park_vcpu(cs);
+}
+
 /*
  * Update KVM's MP_STATE based on what QEMU thinks it is
  */
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 94bbd9661f..364cc21f81 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -566,7 +566,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
         return -EINVAL;
     }
 
-    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cs);
+    /*
+     * Install the VM state change handler only when the vCPU thread has
+     * been spawned, i.e. when the vCPU is being realized.
+     */
+    if (cs->thread_id) {
+        cs->vmcse = qemu_add_vm_change_state_handler(kvm_arm_vm_state_change,
+                                                     cs);
+    }
 
     /* Determine init features for this CPU */
     memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 051a0da41c..31408499b3 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -163,6 +163,17 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu);
  */
 void kvm_arm_reset_vcpu(ARMCPU *cpu);
 
+/**
+ * kvm_arm_create_host_vcpu:
+ * @cpu: ARMCPU
+ *
+ * Called to pre-create all possible KVM vCPUs within the host at virt
+ * machine init time. This will also initialize the pre-created vCPUs and
+ * hence result in a vCPU reset at the host. These pre-created and
+ * initialized vCPUs shall be parked for use when the ARM vCPUs are
+ * actually realized.
+ */
+void kvm_arm_create_host_vcpu(ARMCPU *cpu);
+
 /**
  * kvm_arm_init_serror_injection:
  * @cs: CPUState
-- 
2.34.1




* [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus @machine init
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (5 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 06/37] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  0:14   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 08/37] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

The GIC needs to be pre-sized with the possible vCPUs at initialization time.
This is necessary because the memory regions and resources associated with
GICC/GICR etc. cannot be changed (added/deleted/modified) after the VM has
been initialized. Also, GIC_TYPER needs to be initialized with the
mp_affinity and CPU interface number association, which cannot be changed
after the GIC has been initialized.

Once all the CPU interfaces of the GIC have been initialized, it must be
ensured that any updates to the GICC during reset only take place for the
present vCPUs and not the disabled ones. Therefore, proper checks are
required at various places.
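The sizing arithmetic this patch switches from smp.cpus to smp.max_cpus can
be sketched in isolation: the redistributor region capacities are fixed by
the memory map, so the possible-vCPU count decides how many redistributors
land in region 0 and whether the high second region is needed at all. The
capacity value used below is illustrative, not the virt board's actual one.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Mirror of virt_gicv3_redist_region_count() with max_cpus: a second
 * (high) redist region is needed only when the possible vCPUs overflow
 * region 0 and high redistributors are available. */
static int redist_region_count(unsigned max_cpus, unsigned redist0_capacity,
                               int highmem_redists)
{
    return (max_cpus > redist0_capacity && highmem_redists) ? 2 : 1;
}

/* How many redistributors fit in region 0 (mirrors redist0_count). */
static unsigned redist0_count(unsigned max_cpus, unsigned redist0_capacity)
{
    return MIN(max_cpus, redist0_capacity);
}
```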

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[changed the comment in arm_gicv3_icc_reset]
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c              | 15 ++++++++-------
 hw/intc/arm_gicv3_common.c |  7 +++++--
 hw/intc/arm_gicv3_cpuif.c  |  8 ++++++++
 hw/intc/arm_gicv3_kvm.c    | 34 +++++++++++++++++++++++++++++++---
 include/hw/arm/virt.h      |  2 +-
 5 files changed, 53 insertions(+), 13 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6ba131b799..a208b4e517 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -718,6 +718,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     const char *gictype;
     int i;
     unsigned int smp_cpus = ms->smp.cpus;
+    unsigned int max_cpus = ms->smp.max_cpus;
     uint32_t nb_redist_regions = 0;
     int revision;
 
@@ -742,7 +743,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     }
     vms->gic = qdev_new(gictype);
     qdev_prop_set_uint32(vms->gic, "revision", revision);
-    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
+    qdev_prop_set_uint32(vms->gic, "num-cpu", max_cpus);
     /* Note that the num-irq property counts both internal and external
      * interrupts; there are always 32 of the former (mandated by GIC spec).
      */
@@ -753,7 +754,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
 
     if (vms->gic_version != VIRT_GIC_VERSION_2) {
         uint32_t redist0_capacity = virt_redist_capacity(vms, VIRT_GIC_REDIST);
-        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
+        uint32_t redist0_count = MIN(max_cpus, redist0_capacity);
 
         nb_redist_regions = virt_gicv3_redist_region_count(vms);
 
@@ -774,7 +775,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
                 virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
 
             qdev_prop_set_uint32(vms->gic, "redist-region-count[1]",
-                MIN(smp_cpus - redist0_count, redist1_capacity));
+                MIN(max_cpus - redist0_count, redist1_capacity));
         }
     } else {
         if (!kvm_irqchip_in_kernel()) {
@@ -831,7 +832,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
         } else if (vms->virt) {
             qemu_irq irq = qdev_get_gpio_in(vms->gic,
                                             ppibase + ARCH_GIC_MAINT_IRQ);
-            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
+            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus, irq);
         }
 
         qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
@@ -839,11 +840,11 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
                                                      + VIRTUAL_PMU_IRQ));
 
         sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
-        sysbus_connect_irq(gicbusdev, i + smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
-        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 2 * max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
-        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 3 * max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
     }
 
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 2ebf880ead..ebd99af610 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -392,10 +392,13 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
     s->cpu = g_new0(GICv3CPUState, s->num_cpu);
 
     for (i = 0; i < s->num_cpu; i++) {
-        CPUState *cpu = qemu_get_cpu(i);
+        CPUState *cpu = qemu_get_possible_cpu(i);
         uint64_t cpu_affid;
 
-        s->cpu[i].cpu = cpu;
+        if (qemu_enabled_cpu(cpu)) {
+            s->cpu[i].cpu = cpu;
+        }
+
         s->cpu[i].gic = s;
         /* Store GICv3CPUState in CPUARMState gicv3state pointer */
         gicv3_set_gicv3state(cpu, &s->cpu[i]);
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index d07b13eb27..7b7a0fdb9c 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -934,6 +934,10 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
     ARMCPU *cpu = ARM_CPU(cs->cpu);
     CPUARMState *env = &cpu->env;
 
+    if (!qemu_enabled_cpu(cs->cpu)) {
+        return;
+    }
+
     g_assert(qemu_mutex_iothread_locked());
 
     trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
@@ -1826,6 +1830,10 @@ static void icc_generate_sgi(CPUARMState *env, GICv3CPUState *cs,
     for (i = 0; i < s->num_cpu; i++) {
         GICv3CPUState *ocs = &s->cpu[i];
 
+        if (!qemu_enabled_cpu(ocs->cpu)) {
+            continue;
+        }
+
         if (irm) {
             /* IRM == 1 : route to all CPUs except self */
             if (cs == ocs) {
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 72ad916d3d..b6f50caf84 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -24,6 +24,7 @@
 #include "hw/intc/arm_gicv3_common.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
+#include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
 #include "kvm_arm.h"
@@ -458,6 +459,18 @@ static void kvm_arm_gicv3_put(GICv3State *s)
         GICv3CPUState *c = &s->cpu[ncpu];
         int num_pri_bits;
 
+        /*
+         * To support hotplug of vCPUs we need to make sure all GIC CPU
+         * interfaces (GICC) are initialized at machvirt init time. Once init
+         * is done we release the ARMCPU objects of the disabled vCPUs, but
+         * this path could also be hit later, during a reset of the GICC
+         * after init has happened. In all of these cases we want to make
+         * sure we don't access the GICC for the disabled vCPUs.
+         */
+        if (!qemu_enabled_cpu(c->cpu)) {
+            continue;
+        }
+
         kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, true);
         kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
                         &c->icc_ctlr_el1[GICV3_NS], true);
@@ -616,6 +629,11 @@ static void kvm_arm_gicv3_get(GICv3State *s)
         GICv3CPUState *c = &s->cpu[ncpu];
         int num_pri_bits;
 
+        /* don't access GICC for the disabled vCPUs. */
+        if (!qemu_enabled_cpu(c->cpu)) {
+            continue;
+        }
+
         kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, false);
         kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
                         &c->icc_ctlr_el1[GICV3_NS], false);
@@ -695,10 +713,19 @@ static void arm_gicv3_icc_reset(CPUARMState *env, const ARMCPRegInfo *ri)
         return;
     }
 
+    /*
+     * This can be called even while a vCPU is being hot-plugged or onlined
+     * and other vCPUs might be running. The host kernel KVM code handling
+     * device access via the KVM_{GET|SET}_DEVICE_ATTR ioctls might fail due
+     * to its inability to grab the vCPU locks for all the vCPUs. Hence, we
+     * need to pause all the vCPUs to facilitate locking within the host.
+     */
+    pause_all_vcpus();
     /* Initialize to actual HW supported configuration */
     kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
                       KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
                       &c->icc_ctlr_el1[GICV3_NS], false, &error_abort);
+    resume_all_vcpus();
 
     c->icc_ctlr_el1[GICV3_S] = c->icc_ctlr_el1[GICV3_NS];
 }
@@ -808,9 +835,10 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
     gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL);
 
     for (i = 0; i < s->num_cpu; i++) {
-        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
-
-        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
+        CPUState *cs = qemu_get_cpu(i);
+        if (qemu_enabled_cpu(cs)) {
+            define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+        }
     }
 
     /* Try to create the device via the device control API */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 13163adb07..098c7917a4 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -217,7 +217,7 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
 
     assert(vms->gic_version != VIRT_GIC_VERSION_2);
 
-    return (MACHINE(vms)->smp.cpus > redist0_capacity &&
+    return (MACHINE(vms)->smp.max_cpus > redist0_capacity &&
             vms->highmem_redists) ? 2 : 1;
 }
 
-- 
2.34.1




* [PATCH RFC V2 08/37] arm/virt: Init PMU at host for all possible vcpus
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (6 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file Salil Mehta via
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

The PMU for all possible vCPUs must be initialized at VM initialization time.
Refactor the existing code to accommodate the possible vCPUs. This also
assumes that all the processors being used are identical.

Past discussion for reference:
Link: https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
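The essence of the refactor above is the iteration change: instead of walking
only the realized vCPUs (CPU_FOREACH), the PMU setup walks every possible
slot so that not-yet-plugged vCPUs are covered too. A minimal stand-alone
sketch follows; the `vcpu` type and `init_pmu_for_possible` helper are
illustrative stand-ins, not QEMU's types.

```c
#include <assert.h>
#include <stdbool.h>

struct vcpu {
    bool has_pmu;
};

/* Walk the full possible-vCPU array (length len), not just the realized
 * subset; the assignment stands in for the real per-vCPU PMU init call.
 * Returns the number of vCPUs whose PMU was initialized. */
static int init_pmu_for_possible(struct vcpu *possible, int len, bool pmu)
{
    int inited = 0;

    for (int n = 0; n < len; n++) {
        if (pmu) {
            possible[n].has_pmu = true;
            inited++;
        }
    }
    return inited;
}
```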

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 12 ++++++++----
 include/hw/arm/virt.h |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a208b4e517..070c36054e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1960,12 +1960,14 @@ static void finalize_gic_version(VirtMachineState *vms)
  */
 static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
 {
+    CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
     int max_cpus = MACHINE(vms)->smp.max_cpus;
-    bool aarch64, pmu, steal_time;
+    bool aarch64, steal_time;
     CPUState *cpu;
+    int n;
 
     aarch64 = object_property_get_bool(OBJECT(first_cpu), "aarch64", NULL);
-    pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
+    vms->pmu = object_property_get_bool(OBJECT(first_cpu), "pmu", NULL);
     steal_time = object_property_get_bool(OBJECT(first_cpu),
                                           "kvm-steal-time", NULL);
 
@@ -1992,8 +1994,10 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
             memory_region_add_subregion(sysmem, pvtime_reg_base, pvtime);
         }
 
-        CPU_FOREACH(cpu) {
-            if (pmu) {
+        for (n = 0; n < possible_cpus->len; n++) {
+            cpu = qemu_get_possible_cpu(n);
+
+            if (vms->pmu) {
                 assert(arm_feature(&ARM_CPU(cpu)->env, ARM_FEATURE_PMU));
                 if (kvm_irqchip_in_kernel()) {
                     kvm_arm_pmu_set_irq(cpu, PPI(VIRTUAL_PMU_IRQ));
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 098c7917a4..fc0469c33f 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -164,6 +164,7 @@ struct VirtMachineState {
     bool ras;
     bool mte;
     bool dtb_randomness;
+    bool pmu;
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (7 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 08/37] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  0:19   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
                   ` (24 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

The CPU ctrl-dev MMIO region length could be used by the ACPI GED (ACPI code
common across architectures) and in various other architecture-specific places.
To make these code places independent of compilation order, move the
ACPI_CPU_HOTPLUG_REG_LEN macro to a header file.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c                 | 2 +-
 include/hw/acpi/cpu_hotplug.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 19c154d78f..45defdc0e2 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -1,12 +1,12 @@
 #include "qemu/osdep.h"
 #include "migration/vmstate.h"
 #include "hw/acpi/cpu.h"
+#include "hw/acpi/cpu_hotplug.h"
 #include "qapi/error.h"
 #include "qapi/qapi-events-acpi.h"
 #include "trace.h"
 #include "sysemu/numa.h"
 
-#define ACPI_CPU_HOTPLUG_REG_LEN 12
 #define ACPI_CPU_SELECTOR_OFFSET_WR 0
 #define ACPI_CPU_FLAGS_OFFSET_RW 4
 #define ACPI_CPU_CMD_OFFSET_WR 5
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 3b932abbbb..48b291e45e 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -19,6 +19,8 @@
 #include "hw/hotplug.h"
 #include "hw/acpi/cpu.h"
 
+#define ACPI_CPU_HOTPLUG_REG_LEN 12
+
 typedef struct AcpiCpuHotplug {
     Object *device;
     MemoryRegion io;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (8 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  0:25   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub Salil Mehta via
                   ` (23 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI is required to interface QEMU with the guest. Its role roughly falls into
the cases below:

1. Convey the possible vCPUs configuration to the guest at machine init time
   using various ACPI tables such as the MADT.
2. Convey vCPU hotplug events to the guest (using GED).
3. Assist in the evaluation of various ACPI methods (like _EVT, _STA, _OST,
   _EJ0, _MAT etc.).
4. Provide the ACPI CPU hotplug state and a 12-byte memory-mapped CPU hotplug
   control register interface to the OSPM/guest corresponding to each possible
   vCPU. The register interface consists of various R/W fields and their
   handling operations, which are invoked whenever the register fields or
   memory regions are accessed (i.e. read or written) by the OSPM as it
   evaluates the various ACPI methods.

Note: a lot of this framework code is inherited from the changes already done
      for x86, but some minor changes are still required to make it compatible
      with ARM64.

This patch enables ACPI support for virtual CPU hotplug. The ACPI changes
required will follow in subsequent patches.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 7e68348440..dae06158cd 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -29,6 +29,7 @@ config ARM_VIRT
     select ACPI_HW_REDUCED
     select ACPI_APEI
     select ACPI_VIOT
+    select ACPI_CPU_HOTPLUG
     select VIRTIO_MEM_SUPPORTED
     select ACPI_CXL
     select ACPI_HMAT
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (9 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  0:28   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init Salil Mehta via
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI CPU hotplug related initialization should only happen if ACPI_CPU_HOTPLUG
support has been enabled for a particular architecture. Add a
cpu_hotplug_hw_init() stub to avoid a compilation break.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/acpi-cpu-hotplug-stub.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/acpi/acpi-cpu-hotplug-stub.c b/hw/acpi/acpi-cpu-hotplug-stub.c
index 3fc4b14c26..c6c61bb9cd 100644
--- a/hw/acpi/acpi-cpu-hotplug-stub.c
+++ b/hw/acpi/acpi-cpu-hotplug-stub.c
@@ -19,6 +19,12 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
     return;
 }
 
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
+                         CPUHotplugState *state, hwaddr base_addr)
+{
+    return;
+}
+
 void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list)
 {
     return;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (10 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  0:40   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events Salil Mehta via
                   ` (21 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI CPU hotplug code assumes a virtual CPU is unplugged if its CPUState
object is absent from the list of possible CPUs (CPUArchIdList *possible_cpus)
maintained on a per-machine basis. Use the earlier introduced
qemu_present_cpu() API to check this state.

This change should have no bearing on the functionality of any architecture
and is merely a representational change.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 45defdc0e2..d5ba37b209 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -225,7 +225,10 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
     state->dev_count = id_list->len;
     state->devs = g_new0(typeof(*state->devs), state->dev_count);
     for (i = 0; i < id_list->len; i++) {
-        state->devs[i].cpu =  CPU(id_list->cpus[i].cpu);
+        struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
+        if (qemu_present_cpu(cpu)) {
+            state->devs[i].cpu = cpu;
+        }
         state->devs[i].arch_id = id_list->cpus[i].arch_id;
     }
     memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (11 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  0:56   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
                   ` (20 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI GED (as described in the ACPI 6.2 spec) can be used to generate ACPI
events when the OSPM/guest receives an interrupt listed in the _CRS object of
GED. The OSPM then maps or demultiplexes the event by evaluating the _EVT
method.

This change adds support for CPU hotplug event initialization in the existing
GED framework.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/generic_event_device.c         | 8 ++++++++
 include/hw/acpi/generic_event_device.h | 5 +++++
 2 files changed, 13 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index a3d31631fe..d2fa1d0e4a 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -25,6 +25,7 @@ static const uint32_t ged_supported_events[] = {
     ACPI_GED_MEM_HOTPLUG_EVT,
     ACPI_GED_PWR_DOWN_EVT,
     ACPI_GED_NVDIMM_HOTPLUG_EVT,
+    ACPI_GED_CPU_HOTPLUG_EVT,
 };
 
 /*
@@ -400,6 +401,13 @@ static void acpi_ged_initfn(Object *obj)
     memory_region_init_io(&ged_st->regs, obj, &ged_regs_ops, ged_st,
                           TYPE_ACPI_GED "-regs", ACPI_GED_REG_COUNT);
     sysbus_init_mmio(sbd, &ged_st->regs);
+
+    s->cpuhp.device = OBJECT(s);
+    memory_region_init(&s->container_cpuhp, OBJECT(dev), "cpuhp container",
+                       ACPI_CPU_HOTPLUG_REG_LEN);
+    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container_cpuhp);
+    cpu_hotplug_hw_init(&s->container_cpuhp, OBJECT(dev),
+                        &s->cpuhp_state, 0);
 }
 
 static void acpi_ged_class_init(ObjectClass *class, void *data)
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index d831bbd889..d0a5a43abf 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -60,6 +60,7 @@
 #define HW_ACPI_GENERIC_EVENT_DEVICE_H
 
 #include "hw/sysbus.h"
+#include "hw/acpi/cpu_hotplug.h"
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/ghes.h"
 #include "qom/object.h"
@@ -97,6 +98,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
 #define ACPI_GED_MEM_HOTPLUG_EVT   0x1
 #define ACPI_GED_PWR_DOWN_EVT      0x2
 #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
+#define ACPI_GED_CPU_HOTPLUG_EVT    0x8
 
 typedef struct GEDState {
     MemoryRegion evt;
@@ -108,6 +110,9 @@ struct AcpiGedState {
     SysBusDevice parent_obj;
     MemHotplugState memhp_state;
     MemoryRegion container_memhp;
+    CPUHotplugState cpuhp_state;
+    MemoryRegion container_cpuhp;
+    AcpiCpuHotplug cpuhp;
     GEDState ged_state;
     uint32_t ged_event_bitmap;
     qemu_irq irq;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (12 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  1:03   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
                   ` (19 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Add the CPU hotplug event to the set of supported ged-events when the GED
device is created during VM init. Also initialize the memory map for the CPU
hotplug control device used in event exchanges between QEMU/VMM and the guest.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c         | 5 ++++-
 include/hw/arm/virt.h | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 070c36054e..5c8a0672dc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -76,6 +76,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/acpi/cpu_hotplug.h"
 #include "hw/virtio/virtio-md-pci.h"
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
@@ -155,6 +156,7 @@ static const MemMapEntry base_memmap[] = {
     [VIRT_NVDIMM_ACPI] =        { 0x09090000, NVDIMM_ACPI_IO_LEN},
     [VIRT_PVTIME] =             { 0x090a0000, 0x00010000 },
     [VIRT_SECURE_GPIO] =        { 0x090b0000, 0x00001000 },
+    [VIRT_CPUHP_ACPI] =         { 0x090c0000, ACPI_CPU_HOTPLUG_REG_LEN},
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -640,7 +642,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
     DeviceState *dev;
     MachineState *ms = MACHINE(vms);
     int irq = vms->irqmap[VIRT_ACPI_GED];
-    uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+    uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_CPU_HOTPLUG_EVT;
 
     if (ms->ram_slots) {
         event |= ACPI_GED_MEM_HOTPLUG_EVT;
@@ -655,6 +657,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
 
     sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_ACPI_GED].base);
     sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, vms->memmap[VIRT_PCDIMM_ACPI].base);
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 3, vms->memmap[VIRT_CPUHP_ACPI].base);
     sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(vms->gic, irq));
 
     sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index fc0469c33f..09a0b2d4f0 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -85,6 +85,7 @@ enum {
     VIRT_PCDIMM_ACPI,
     VIRT_ACPI_GED,
     VIRT_NVDIMM_ACPI,
+    VIRT_CPUHP_ACPI,
     VIRT_PVTIME,
     VIRT_LOWMEMMAP_LAST,
 };
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (13 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  1:08   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change Salil Mehta via
                   ` (18 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI CPU hotplug state (is_present=_STA.PRESENT, is_enabled=_STA.ENABLED) for
all the possible vCPUs MUST be initialized during machine init. This is done
during the creation of the GED device. The VMM/QEMU MUST always expose/fake
the ACPI state of the disabled vCPUs to the guest kernel as 'present'
(_STA.PRESENT), i.e. ACPI persistent. If the 'disabled' vCPU objects are
destroyed before the GED device has been created, then their ACPI hotplug
state might not get initialized correctly, as the acpi_persistent flag is part
of the CPUState. This would expose the wrong status of the unplugged vCPUs to
the guest kernel.

Hence, move the GED device creation before the disabled vCPU objects are
destroyed as part of the post-CPU-init routine.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5c8a0672dc..cbb6199ec6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2376,6 +2376,12 @@ static void machvirt_init(MachineState *machine)
 
     create_gic(vms, sysmem);
 
+    has_ged = has_ged && aarch64 && firmware_loaded &&
+              virt_is_acpi_enabled(vms);
+    if (has_ged) {
+        vms->acpi_dev = create_acpi_ged(vms);
+    }
+
     virt_cpu_post_init(vms, sysmem);
 
     fdt_add_pmu_nodes(vms);
@@ -2398,9 +2404,7 @@ static void machvirt_init(MachineState *machine)
 
     create_pcie(vms);
 
-    if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
-        vms->acpi_dev = create_acpi_ged(vms);
-    } else {
+    if (!has_ged) {
         create_gpio_devices(vms, VIRT_GPIO, sysmem);
     }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (14 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  1:26   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
                   ` (17 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

The CPUs control device (\\_SB.PCI0) register interface for the x86 arch is
based on PCI and is IO-port based, hence the existing cpus AML code assumes
the _CRS object would evaluate to a system resource which describes an IO port
address. But on the ARM arch the CPUs control device (\\_SB.PRES) register
interface is memory-mapped, hence the _CRS object should evaluate to a system
resource which describes a memory-mapped base address.

This cpus AML code change updates the existing interface of the build cpus AML
function to accept both IO and MEMORY type regions and updates the _CRS object
correspondingly.

NOTE: Besides the above, a CPU scan shall be triggered when the OSPM evaluates
      the _EVT method, part of the GED framework, which is covered in a
      subsequent patch.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c         | 23 ++++++++++++++++-------
 hw/i386/acpi-build.c  |  2 +-
 include/hw/acpi/cpu.h |  5 +++--
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index d5ba37b209..232720992d 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -341,9 +341,10 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_FW_EJECT_EVENT "CEJF"
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
-                    hwaddr io_base,
+                    hwaddr base_addr,
                     const char *res_root,
-                    const char *event_handler_method)
+                    const char *event_handler_method,
+                    AmlRegionSpace rs)
 {
     Aml *ifctx;
     Aml *field;
@@ -370,13 +371,19 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         aml_append(cpu_ctrl_dev, aml_mutex(CPU_LOCK, 0));
 
         crs = aml_resource_template();
-        aml_append(crs, aml_io(AML_DECODE16, io_base, io_base, 1,
+        if (rs == AML_SYSTEM_IO) {
+            aml_append(crs, aml_io(AML_DECODE16, base_addr, base_addr, 1,
                                ACPI_CPU_HOTPLUG_REG_LEN));
+        } else {
+            aml_append(crs, aml_memory32_fixed(base_addr,
+                               ACPI_CPU_HOTPLUG_REG_LEN, AML_READ_WRITE));
+        }
+
         aml_append(cpu_ctrl_dev, aml_name_decl("_CRS", crs));
 
         /* declare CPU hotplug MMIO region with related access fields */
         aml_append(cpu_ctrl_dev,
-            aml_operation_region("PRST", AML_SYSTEM_IO, aml_int(io_base),
+            aml_operation_region("PRST", rs, aml_int(base_addr),
                                  ACPI_CPU_HOTPLUG_REG_LEN));
 
         field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK,
@@ -702,9 +709,11 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
     aml_append(sb_scope, cpus_dev);
     aml_append(table, sb_scope);
 
-    method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
-    aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD));
-    aml_append(table, method);
+    if (event_handler_method) {
+        method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
+        aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD));
+        aml_append(table, method);
+    }
 
     g_free(cphp_res_path);
 }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index bb12b0ad43..560f108d38 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1550,7 +1550,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             .fw_unplugs_cpu = pm->smi_on_cpu_unplug,
         };
         build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
-                       "\\_SB.PCI0", "\\_GPE._E02");
+                       "\\_SB.PCI0", "\\_GPE._E02", AML_SYSTEM_IO);
     }
 
     if (pcms->memhp_io_base && nr_mem) {
diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index 999caaf510..b87ebfdf4b 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -56,9 +56,10 @@ typedef struct CPUHotplugFeatures {
 } CPUHotplugFeatures;
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
-                    hwaddr io_base,
+                    hwaddr base_addr,
                     const char *res_root,
-                    const char *event_handler_method);
+                    const char *event_handler_method,
+                    AmlRegionSpace rs);
 
 void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (15 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28  1:36   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
                   ` (16 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Support of vCPU hotplug requires a sequence of ACPI handshakes between QEMU
and the guest kernel when a vCPU is plugged or unplugged. Most of the AML code
to support these handshakes already exists. This AML needs to be built during
VM init for the ARM architecture as well, if GED support exists.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 6b674231c2..d27df5030e 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -858,7 +858,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
      * the RTC ACPI device at all when using UEFI.
      */
     scope = aml_scope("\\_SB");
-    acpi_dsdt_add_cpus(scope, vms);
+    /* if GED is enabled then cpus AML shall be added as part build_cpus_aml */
+    if (vms->acpi_dev) {
+        CPUHotplugFeatures opts = {
+             .acpi_1_compatible = false,
+             .has_legacy_cphp = false
+        };
+
+        build_cpus_aml(scope, ms, opts, memmap[VIRT_CPUHP_ACPI].base,
+                       "\\_SB", NULL, AML_SYSTEM_MEMORY);
+    } else {
+        acpi_dsdt_add_cpus(scope, vms);
+    }
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
     if (vmc->acpi_expose_flash) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (16 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28 23:18   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
                   ` (15 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

The ARM architecture does not allow the presence of CPUs to be changed [1]
after the kernel has booted. Hence, firmware/ACPI/QEMU must ensure a
persistent view of the vCPUs to the guest kernel even when they are not
present in the QOM, i.e. are unplugged or are yet-to-be-plugged.

References:
[1] Check comment 5 in the bugzilla entry
   Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 cpus-common.c         |  6 ++++++
 hw/arm/virt.c         |  7 +++++++
 include/hw/core/cpu.h | 20 ++++++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/cpus-common.c b/cpus-common.c
index 24c04199a1..d64aa63b19 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -128,6 +128,12 @@ bool qemu_enabled_cpu(CPUState *cpu)
     return cpu && !cpu->disabled;
 }
 
+bool qemu_persistent_cpu(CPUState *cpu)
+{
+    /* cpu state can be faked to the guest via acpi */
+    return cpu->acpi_persistent;
+}
+
 uint64_t qemu_get_cpu_archid(int cpu_index)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index cbb6199ec6..f1bee569d5 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3006,6 +3006,13 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         return;
     }
     virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
+
+    /*
+     * To give persistent presence view of vCPUs to the guest, ACPI might need
+     * to fake the presence of the vCPUs to the guest but keep them disabled.
+     * This shall be used during the init of ACPI Hotplug state and hot-unplug
+     */
+    cs->acpi_persistent = true;
 }
 
 static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index b2201a98ee..dab572c9bd 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -425,6 +425,13 @@ struct CPUState {
      * By default every CPUState is enabled as of now across all archs.
      */
     bool disabled;
+    /*
+     * On certain architectures, to give a persistent view of vCPU 'presence'
+     * to the guest, ACPI might need to fake the 'presence' of the vCPUs but
+     * keep them ACPI-disabled for the guest. This is done by returning
+     * _STA.PRES=True and _STA.ENA=False for the unplugged vCPUs in the QOM.
+     */
+    bool acpi_persistent;
     /* TODO Move common fields from CPUArchState here. */
     int cpu_index;
     int cluster_index;
@@ -814,6 +821,19 @@ bool qemu_present_cpu(CPUState *cpu);
  */
 bool qemu_enabled_cpu(CPUState *cpu);
 
+/**
+ * qemu_persistent_cpu:
+ * @cpu: The vCPU to check
+ *
+ * Checks if the vCPU state should always be reflected as *present* via ACPI
+ * to the guest. By default, this is False on all architectures and has to be
+ * explicitly set during initialization.
+ *
+ * Returns: True if it is an ACPI 'persistent' vCPU
+ *
+ */
+bool qemu_persistent_cpu(CPUState *cpu);
+
 /**
  * qemu_get_cpu_archid:
 * @cpu_index: possible vCPU for which arch-id needs to be retrieved
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (17 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28 23:33   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} " Gavin Shan
  2024-01-17 21:46   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} " Jonathan Cameron via
  2023-09-26 10:04 ` [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan Salil Mehta via
                   ` (14 subsequent siblings)
  33 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI AML changes to properly reflect the _STA.PRES and _STA.ENA bits to the
guest during initialization, when CPUs are hotplugged, and after CPUs are
hot-unplugged.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c                  | 49 +++++++++++++++++++++++++++++++---
 hw/acpi/generic_event_device.c | 11 ++++++++
 include/hw/acpi/cpu.h          |  2 ++
 3 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 232720992d..e1299696d3 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -63,10 +63,11 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
     cdev = &cpu_st->devs[cpu_st->selector];
     switch (addr) {
     case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
-        val |= cdev->cpu ? 1 : 0;
+        val |= cdev->is_enabled ? 1 : 0;
         val |= cdev->is_inserting ? 2 : 0;
         val |= cdev->is_removing  ? 4 : 0;
         val |= cdev->fw_remove  ? 16 : 0;
+        val |= cdev->is_present ? 32 : 0;
         trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
         break;
     case ACPI_CPU_CMD_DATA_OFFSET_RW:
@@ -228,7 +229,21 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
         struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
         if (qemu_present_cpu(cpu)) {
             state->devs[i].cpu = cpu;
+            state->devs[i].is_present = true;
+        } else {
+            if (qemu_persistent_cpu(cpu)) {
+                state->devs[i].is_present = true;
+            } else {
+                state->devs[i].is_present = false;
+            }
         }
+
+        if (qemu_enabled_cpu(cpu)) {
+            state->devs[i].is_enabled = true;
+        } else {
+            state->devs[i].is_enabled = false;
+        }
+
         state->devs[i].arch_id = id_list->cpus[i].arch_id;
     }
     memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
@@ -261,6 +276,8 @@ void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
     }
 
     cdev->cpu = CPU(dev);
+    cdev->is_present = true;
+    cdev->is_enabled = true;
     if (dev->hotplugged) {
         cdev->is_inserting = true;
         acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
@@ -292,6 +309,11 @@ void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
         return;
     }
 
+    cdev->is_enabled = false;
+    if (!qemu_persistent_cpu(CPU(dev))) {
+        cdev->is_present = false;
+    }
+
     cdev->cpu = NULL;
 }
 
@@ -302,6 +324,8 @@ static const VMStateDescription vmstate_cpuhp_sts = {
     .fields      = (VMStateField[]) {
         VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
         VMSTATE_BOOL(is_removing, AcpiCpuStatus),
+        VMSTATE_BOOL(is_present, AcpiCpuStatus),
+        VMSTATE_BOOL(is_enabled, AcpiCpuStatus),
         VMSTATE_UINT32(ost_event, AcpiCpuStatus),
         VMSTATE_UINT32(ost_status, AcpiCpuStatus),
         VMSTATE_END_OF_LIST()
@@ -339,6 +363,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
 #define CPU_REMOVE_EVENT  "CRMV"
 #define CPU_EJECT_EVENT   "CEJ0"
 #define CPU_FW_EJECT_EVENT "CEJF"
+#define CPU_PRESENT       "CPRS"
 
 void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
                     hwaddr base_addr,
@@ -399,7 +424,9 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
         /* tell firmware to do device eject, write only */
         aml_append(field, aml_named_field(CPU_FW_EJECT_EVENT, 1));
-        aml_append(field, aml_reserved_field(3));
+        /* 1 if present, read only */
+        aml_append(field, aml_named_field(CPU_PRESENT, 1));
+        aml_append(field, aml_reserved_field(2));
         aml_append(field, aml_named_field(CPU_COMMAND, 8));
         aml_append(cpu_ctrl_dev, field);
 
@@ -429,6 +456,7 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         Aml *ctrl_lock = aml_name("%s.%s", cphp_res_path, CPU_LOCK);
         Aml *cpu_selector = aml_name("%s.%s", cphp_res_path, CPU_SELECTOR);
         Aml *is_enabled = aml_name("%s.%s", cphp_res_path, CPU_ENABLED);
+        Aml *is_present = aml_name("%s.%s", cphp_res_path, CPU_PRESENT);
         Aml *cpu_cmd = aml_name("%s.%s", cphp_res_path, CPU_COMMAND);
         Aml *cpu_data = aml_name("%s.%s", cphp_res_path, CPU_DATA);
         Aml *ins_evt = aml_name("%s.%s", cphp_res_path, CPU_INSERT_EVENT);
@@ -457,13 +485,26 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
         {
             Aml *idx = aml_arg(0);
             Aml *sta = aml_local(0);
+            Aml *ifctx2;
+            Aml *else_ctx;
 
             aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
             aml_append(method, aml_store(idx, cpu_selector));
             aml_append(method, aml_store(zero, sta));
-            ifctx = aml_if(aml_equal(is_enabled, one));
+            ifctx = aml_if(aml_equal(is_present, one));
             {
-                aml_append(ifctx, aml_store(aml_int(0xF), sta));
+                ifctx2 = aml_if(aml_equal(is_enabled, one));
+                {
+                    /* cpu is present and enabled */
+                    aml_append(ifctx2, aml_store(aml_int(0xF), sta));
+                }
+                aml_append(ifctx, ifctx2);
+                else_ctx = aml_else();
+                {
+                    /* cpu is present but disabled */
+                    aml_append(else_ctx, aml_store(aml_int(0xD), sta));
+                }
+                aml_append(ifctx, else_ctx);
             }
             aml_append(method, ifctx);
             aml_append(method, aml_release(ctrl_lock));
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index d2fa1d0e4a..b84602b238 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -319,6 +319,16 @@ static const VMStateDescription vmstate_memhp_state = {
     }
 };
 
+static const VMStateDescription vmstate_cpuhp_state = {
+    .name = "acpi-ged/cpuhp",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields      = (VMStateField[]) {
+        VMSTATE_CPU_HOTPLUG(cpuhp_state, AcpiGedState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_ged_state = {
     .name = "acpi-ged-state",
     .version_id = 1,
@@ -367,6 +377,7 @@ static const VMStateDescription vmstate_acpi_ged = {
     },
     .subsections = (const VMStateDescription * []) {
         &vmstate_memhp_state,
+        &vmstate_cpuhp_state,
         &vmstate_ghes_state,
         NULL
     }
diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
index b87ebfdf4b..786a30d6d4 100644
--- a/include/hw/acpi/cpu.h
+++ b/include/hw/acpi/cpu.h
@@ -22,6 +22,8 @@ typedef struct AcpiCpuStatus {
     uint64_t arch_id;
     bool is_inserting;
     bool is_removing;
+    bool is_present;
+    bool is_enabled;
     bool fw_remove;
     uint32_t ost_event;
     uint32_t ost_status;
-- 
2.34.1




* [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (18 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28 23:35   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
                   ` (13 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

OSPM evaluates the _EVT method to map the event. A CPU hotplug event
eventually results in the start of a CPU scan. The scan figures out which CPU
is affected and the kind of event (plug/unplug), and notifies it back to the
guest.

This patch updates the GED AML _EVT method with a call to \\_SB.CPUS.CSCN,
which performs the above.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/generic_event_device.c | 4 ++++
 include/hw/acpi/cpu_hotplug.h  | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index b84602b238..ad252e6a91 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -108,6 +108,10 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
                 aml_append(if_ctx, aml_call0(MEMORY_DEVICES_CONTAINER "."
                                              MEMORY_SLOT_SCAN_METHOD));
                 break;
+            case ACPI_GED_CPU_HOTPLUG_EVT:
+                aml_append(if_ctx, aml_call0(ACPI_CPU_CONTAINER "."
+                                             ACPI_CPU_SCAN_METHOD));
+                break;
             case ACPI_GED_PWR_DOWN_EVT:
                 aml_append(if_ctx,
                            aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 48b291e45e..ef631750b4 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -20,6 +20,8 @@
 #include "hw/acpi/cpu.h"
 
 #define ACPI_CPU_HOTPLUG_REG_LEN 12
+#define ACPI_CPU_SCAN_METHOD "CSCN"
+#define ACPI_CPU_CONTAINER "\\_SB.CPUS"
 
 typedef struct AcpiCpuHotplug {
     Object *device;
-- 
2.34.1




* [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (19 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28 23:43   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional Salil Mehta via
                   ` (12 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Changes required while QEMU builds the MADT table to accommodate disabled
possible vCPUs. The guest kernel uses this information to size up its
resources during boot. This pre-sizing of the guest kernel on possible vCPUs
facilitates hotplug of the currently disabled vCPUs.

This change also caters to the ACPI MADT GIC CPU Interface flag changes
recently introduced in the UEFI ACPI 6.5 specification, which allow deferred
virtual CPU onlining in the guest kernel.

Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index d27df5030e..cbccd2ca2d 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -700,6 +700,29 @@ static void build_append_gicr(GArray *table_data, uint64_t base, uint32_t size)
     build_append_int_noprefix(table_data, size, 4); /* Discovery Range Length */
 }
 
+static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+
+    /* can only exist in 'enabled' state */
+    if (!mc->has_hotpluggable_cpus) {
+        return 1;
+    }
+
+    /*
+     * The ARM GIC CPU Interface can be 'online-capable' or 'enabled' at boot.
+     * We MUST set the 'online-capable' bit for all hotpluggable CPUs except
+     * the first/boot CPU. Cold-booted CPUs other than the boot CPU can also
+     * be unplugged, though as of now this is only used as a debugging feature.
+     *
+     *   UEFI ACPI Specification 6.5
+     *   Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
+     *   Table:   5.37 GICC CPU Interface Flags
+     *   Link: https://uefi.org/specs/ACPI/6.5
+     */
+    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
+}
+
 static void
 build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
@@ -726,12 +749,13 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     build_append_int_noprefix(table_data, vms->gic_version, 1);
     build_append_int_noprefix(table_data, 0, 3);   /* Reserved */
 
-    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
-        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
+    for (i = 0; i < MACHINE(vms)->smp.max_cpus; i++) {
+        CPUState *cpu = qemu_get_possible_cpu(i);
         uint64_t physical_base_address = 0, gich = 0, gicv = 0;
         uint32_t vgic_interrupt = vms->virt ? PPI(ARCH_GIC_MAINT_IRQ) : 0;
-        uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
-                                             PPI(VIRTUAL_PMU_IRQ) : 0;
+        uint32_t pmu_interrupt = vms->pmu ? PPI(VIRTUAL_PMU_IRQ) : 0;
+        uint32_t flags = virt_acpi_get_gicc_flags(cpu);
+        uint64_t mpidr = qemu_get_cpu_archid(i);
 
         if (vms->gic_version == VIRT_GIC_VERSION_2) {
             physical_base_address = memmap[VIRT_GIC_CPU].base;
@@ -746,7 +770,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, i, 4);    /* GIC ID */
         build_append_int_noprefix(table_data, i, 4);    /* ACPI Processor UID */
         /* Flags */
-        build_append_int_noprefix(table_data, 1, 4);    /* Enabled */
+        build_append_int_noprefix(table_data, flags, 4);
         /* Parking Protocol Version */
         build_append_int_noprefix(table_data, 0, 4);
         /* Performance Interrupt GSIV */
@@ -760,7 +784,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, vgic_interrupt, 4);
         build_append_int_noprefix(table_data, 0, 8);    /* GICR Base Address*/
         /* MPIDR */
-        build_append_int_noprefix(table_data, armcpu->mp_affinity, 8);
+        build_append_int_noprefix(table_data, mpidr, 8);
         /* Processor Power Efficiency Class */
         build_append_int_noprefix(table_data, 0, 1);
         /* Reserved */
-- 
2.34.1




* [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (20 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28 23:50   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
                   ` (11 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

The GICC interface on arm64 vCPUs is statically defined in the MADT, and
doesn't require a _MAT entry. Although the GICC is indicated as present
by the MADT entry, it can only be used from vCPU sysregs, which aren't
accessible until hot-add.

Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Co-developed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/cpu.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index e1299696d3..217db99538 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -715,11 +715,13 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
             aml_append(dev, method);
 
             /* build _MAT object */
-            assert(adevc && adevc->madt_cpu);
-            adevc->madt_cpu(i, arch_ids, madt_buf,
-                            true); /* set enabled flag */
-            aml_append(dev, aml_name_decl("_MAT",
-                aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
+            if (adevc && adevc->madt_cpu) {
+                assert(adevc && adevc->madt_cpu);
+                adevc->madt_cpu(i, arch_ids, madt_buf,
+                                true); /* set enabled flag */
+                aml_append(dev, aml_name_decl("_MAT",
+                    aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
+            }
             g_array_free(madt_buf, true);
 
             if (CPU(arch_ids->cpus[i].cpu) != first_cpu) {
-- 
2.34.1




* [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (21 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-28 23:57   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug Salil Mehta via
                   ` (10 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

During machvirt_init(), QOM ARMCPU objects are pre-created along with the
corresponding KVM vCPUs in the host for all possible vCPUs. This is necessary
because of an architectural constraint: KVM restricts deferred creation of KVM
vCPUs and VGIC initialization/sizing after VM init. Hence, the VGIC is
pre-sized with possible vCPUs.

After initialization of the machine is complete, the disabled possible KVM
vCPUs are parked in the per-virt-machine list "kvm_parked_vcpus", and we
release the QOM ARMCPU objects for the disabled vCPUs. These are re-created
when the vCPU is hotplugged again, and the new QOM ARMCPU object is then
re-attached to the corresponding parked KVM vCPU.

Alternatively, we could have never released the QOM CPU objects and kept
reusing them. That approach might require some modifications to the
qdev_device_add() interface to fetch the old ARMCPU object instead of creating
a new one for the hotplug request.

Each of the above approaches comes with its own pros and cons. This prototype
uses the first approach (suggestions are welcome!).

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f1bee569d5..3b068534a8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1965,6 +1965,7 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
 {
     CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
     int max_cpus = MACHINE(vms)->smp.max_cpus;
+    MachineState *ms = MACHINE(vms);
     bool aarch64, steal_time;
     CPUState *cpu;
     int n;
@@ -2025,6 +2026,37 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
             }
         }
     }
+
+    if (kvm_enabled() || tcg_enabled()) {
+        for (n = 0; n < possible_cpus->len; n++) {
+            cpu = qemu_get_possible_cpu(n);
+
+            /*
+             * Now that the GIC has been sized with possible CPUs, we don't
+             * require disabled vCPU objects to be represented in the QOM.
+             * Release the disabled ARMCPU objects used during init for sizing.
+             *
+             * Through ACPI, we fake the presence (_STA.PRES=1) of these
+             * non-existent vCPUs to the guest and present them as disabled
+             * vCPUs (_STA.ENA=0) so that they can't be used. These vCPUs can
+             * later be added to the guest through hotplug exchanges, when
+             * ARMCPU objects are created again using the 'device_add' QMP
+             * command.
+             */
+            /*
+             * RFC: Question: Another approach could have been to keep the
+             * objects forever and release them only when QEMU exits as part of
+             * finalize, or when a new vCPU is hotplugged; the old object could
+             * then be released for the newly created one for the same vCPU?
+             */
+            if (!qemu_enabled_cpu(cpu)) {
+                CPUArchId *cpu_slot;
+                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
+                cpu_slot->cpu = NULL;
+                object_unref(OBJECT(cpu));
+            }
+        }
+    }
 }
 
 static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
-- 
2.34.1




* [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (22 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 11:02   ` Michael S. Tsirkin
  2023-09-26 10:04 ` [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
                   ` (9 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

ACPI GED shall be used to convey any CPU hot-(un)plug events to the guest
kernel. Therefore, the existing ACPI GED framework inside QEMU needs to be
enhanced to support CPU hotplug state and events.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/acpi/generic_event_device.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index ad252e6a91..0266733a54 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -12,6 +12,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/cpu.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/irq.h"
 #include "hw/mem/pc-dimm.h"
@@ -239,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler *hotplug_dev,
         } else {
             acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev, errp);
         }
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
     } else {
         error_setg(errp, "virt: device plug request for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -253,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler *hotplug_dev,
     if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
                        !(object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)))) {
         acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug request for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
     } else {
         error_setg(errp, "acpi: device unplug for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
     AcpiGedState *s = ACPI_GED(adev);
 
     acpi_memory_ospm_status(&s->memhp_state, list);
+    acpi_cpu_ospm_status(&s->cpuhp_state, list);
 }
 
 static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
@@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
         sel = ACPI_GED_PWR_DOWN_EVT;
     } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
         sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
+    } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
+        sel = ACPI_GED_CPU_HOTPLUG_EVT;
     } else {
         /* Unknown event. Return without generating interrupt. */
         warn_report("GED: Unsupported event %d. No irq injected", ev);
-- 
2.34.1




* [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (23 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-29  0:20   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 26/37] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug Salil Mehta via
                   ` (8 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Add CPU hot-unplug hooks and update hotplug hooks with additional sanity checks
for use in hotplug paths.

Note: the functional contents of the hooks (now left with TODO comments) will
be filled in gradually by subsequent patches, in an incremental approach to
patch and logic building, roughly as follows:
1. (Un-)wiring of interrupts between vCPU<->GIC.
2. Sending events to the guest for hot-(un)plug so that the guest can take
   appropriate actions.
3. Notifying the GIC about the hot-(un)plug action so that the vCPU can be
   (un-)stitched to the GIC CPU interface.
4. Updating the guest with next-boot info for this vCPU in the firmware.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3b068534a8..dce02136cb 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -81,6 +81,7 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
 #include "qemu/guest-random.h"
+#include "qapi/qmp/qdict.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -2985,12 +2986,23 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
     ARMCPU *cpu = ARM_CPU(dev);
     CPUState *cs = CPU(dev);
     CPUArchId *cpu_slot;
     int32_t min_cpuid = 0;
     int32_t max_cpuid;
 
+    if (dev->hotplugged && !vms->acpi_dev) {
+        error_setg(errp, "GED acpi device does not exist");
+        return;
+    }
+
+    if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
+        error_setg(errp, "CPU hotplug not supported on this machine");
+        return;
+    }
+
     /* sanity check the cpu */
     if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
         error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -3039,6 +3051,22 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     }
     virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
 
+    /*
+     * Fix up the GIC for this newly plugged vCPU. The QOM CPU object for the
+     * new vCPU needs to be updated in the corresponding QOM GICv3CPUState
+     * object. We also need to re-wire the IRQs for this new CPU object. This
+     * update is limited to QOM only and does not affect KVM; the latter has
+     * already been pre-sized with the possible CPUs at VM init time. This is
+     * a workaround for the constraints posed by the ARM architecture w.r.t.
+     * supporting CPU hotplug, for which no specification exists.
+     * This patch-up is required both for {cold,hot}-plugged vCPUs. Cold-inited
+     * vCPUs have their GIC state initialized during machvirt_init().
+     */
+    if (vms->acpi_dev) {
+        /* TODO: update GIC about this hotplug change here */
+        /* TODO: wire the GIC<->CPU irqs */
+    }
+
     /*
      * To give persistent presence view of vCPUs to the guest, ACPI might need
      * to fake the presence of the vCPUs to the guest but keep them disabled.
@@ -3050,6 +3078,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                           Error **errp)
 {
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
     CPUState *cs = CPU(dev);
     CPUArchId *cpu_slot;
@@ -3058,10 +3087,81 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
     cpu_slot->cpu = OBJECT(dev);
 
+    /*
+     * Update the ACPI hotplug state for both hot- and cold-plugged vCPUs.
+     * vCPUs can be cold-plugged using the '-device' option. For hot-plugged
+     * vCPUs, the guest is also notified.
+     */
+    if (vms->acpi_dev) {
+        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
+        /* TODO: register cpu for reset & update F/W info for the next boot */
+    }
+
     cs->disabled = false;
     return;
 }
 
+static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
+                                    DeviceState *dev, Error **errp)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUState *cs = CPU(dev);
+
+    if (!vms->acpi_dev || !dev->realized) {
+        error_setg(errp, "GED does not exist or device is not realized!");
+        return;
+    }
+
+    if (!mc->has_hotpluggable_cpus) {
+        error_setg(errp, "CPU hot(un)plug not supported on this machine");
+        return;
+    }
+
+    if (cs->cpu_index == first_cpu->cpu_index) {
+        error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
+                   first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
+                   cpu->core_id, cpu->thread_id);
+        return;
+    }
+
+    /* TODO: request cpu hotplug from guest */
+
+    return;
+}
+
+static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
+                            Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    MachineState *ms = MACHINE(hotplug_dev);
+    CPUState *cs = CPU(dev);
+    CPUArchId *cpu_slot;
+
+    if (!vms->acpi_dev || !dev->realized) {
+        error_setg(errp, "GED does not exist or device is not realized!");
+        return;
+    }
+
+    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
+
+    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
+
+    /* TODO: unwire the gic-cpu irqs here */
+    /* TODO: update the GIC about this hot unplug change */
+
+    /* TODO: unregister cpu for reset & update F/W info for the next boot */
+
+    qobject_unref(dev->opts);
+    dev->opts = NULL;
+
+    cpu_slot->cpu = NULL;
+    cs->disabled = true;
+
+    return;
+}
+
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                             DeviceState *dev, Error **errp)
 {
@@ -3185,6 +3285,8 @@ static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
                                      errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_unplug_request(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "device unplug request for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -3198,6 +3300,8 @@ static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
         virt_dimm_unplug(hotplug_dev, dev, errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
         virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        virt_cpu_unplug(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "virt: device unplug for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
-- 
2.34.1




* [PATCH RFC V2 26/37] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (24 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 27/37] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification Salil Mehta via
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Refactor the existing GIC creation code to extract the common code that wires
the vcpu<->gic interrupts into its own function. This function can be used both
in the cold-plug case and when a vCPU is hot-plugged. Also introduce a new
function to unwire the vcpu<->gic interrupts for the vCPU hot-unplug case.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c          | 139 ++++++++++++++++++++++++++++-------------
 hw/core/gpio.c         |   2 +-
 include/hw/qdev-core.h |   2 +
 3 files changed, 99 insertions(+), 44 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index dce02136cb..5b829e47b7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -714,6 +714,99 @@ static void create_v2m(VirtMachineState *vms)
     vms->msi_controller = VIRT_MSI_CTRL_GICV2M;
 }
 
+/*
+ * Mapping from the output timer irq lines from the CPU to the GIC PPI inputs
+ * we use for the virt board.
+ */
+const int timer_irq[] = {
+    [GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
+    [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
+    [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
+    [GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
+};
+
+static void unwire_gic_cpu_irqs(VirtMachineState *vms, CPUState *cs)
+{
+    MachineState *ms = MACHINE(vms);
+    unsigned int max_cpus = ms->smp.max_cpus;
+    DeviceState *cpudev = DEVICE(cs);
+    DeviceState *gicdev = vms->gic;
+    int cpu = CPU(cs)->cpu_index;
+    int type = vms->gic_version;
+    int irq;
+
+    for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
+        qdev_disconnect_gpio_out_named(cpudev, NULL, irq);
+    }
+
+    if (type != VIRT_GIC_VERSION_2) {
+        qdev_disconnect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt",
+                                       0);
+    } else if (vms->virt) {
+        qdev_disconnect_gpio_out_named(gicdev, SYSBUS_DEVICE_GPIO_IRQ,
+                                       cpu + 4 * max_cpus);
+    }
+
+    /*
+     * RFC: Question: This currently does not take care of notifying the
+     * devices which might be sitting on the system bus. Do we need a
+     * sysbus_disconnect_irq() which also does the job of notification
+     * besides disconnection?
+     */
+    qdev_disconnect_gpio_out_named(cpudev, "pmu-interrupt", 0);
+    qdev_disconnect_gpio_out_named(gicdev, SYSBUS_DEVICE_GPIO_IRQ, cpu);
+    qdev_disconnect_gpio_out_named(gicdev,
+                                   SYSBUS_DEVICE_GPIO_IRQ, cpu + max_cpus);
+    qdev_disconnect_gpio_out_named(gicdev, SYSBUS_DEVICE_GPIO_IRQ,
+                                   cpu + 2 * max_cpus);
+    qdev_disconnect_gpio_out_named(gicdev, SYSBUS_DEVICE_GPIO_IRQ,
+                                   cpu + 3 * max_cpus);
+}
+
+static void wire_gic_cpu_irqs(VirtMachineState *vms, CPUState *cs)
+{
+    MachineState *ms = MACHINE(vms);
+    unsigned int max_cpus = ms->smp.max_cpus;
+    DeviceState *cpudev = DEVICE(cs);
+    DeviceState *gicdev = vms->gic;
+    int cpu = CPU(cs)->cpu_index;
+    int type = vms->gic_version;
+    SysBusDevice *gicbusdev;
+    int ppibase;
+    int irq;
+
+    ppibase = NUM_IRQS + cpu * GIC_INTERNAL + GIC_NR_SGIS;
+
+    for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
+        qdev_connect_gpio_out(cpudev, irq,
+                              qdev_get_gpio_in(gicdev,
+                                               ppibase + timer_irq[irq]));
+    }
+
+    gicbusdev = SYS_BUS_DEVICE(gicdev);
+    if (type != VIRT_GIC_VERSION_2) {
+        qemu_irq irq = qdev_get_gpio_in(gicdev,
+                                        ppibase + ARCH_GIC_MAINT_IRQ);
+        qdev_connect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt",
+                                    0, irq);
+    } else if (vms->virt) {
+        qemu_irq irq = qdev_get_gpio_in(gicdev,
+                                        ppibase + ARCH_GIC_MAINT_IRQ);
+        sysbus_connect_irq(gicbusdev, cpu + 4 * max_cpus, irq);
+    }
+
+    qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
+                                qdev_get_gpio_in(gicdev,
+                                                 ppibase + VIRTUAL_PMU_IRQ));
+    sysbus_connect_irq(gicbusdev, cpu, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
+    sysbus_connect_irq(gicbusdev, cpu + max_cpus,
+                       qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
+    sysbus_connect_irq(gicbusdev, cpu + 2 * max_cpus,
+                       qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
+    sysbus_connect_irq(gicbusdev, cpu + 3 * max_cpus,
+                       qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
+}
+
 static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
 {
     MachineState *ms = MACHINE(vms);
@@ -809,47 +902,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
      * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
      */
     for (i = 0; i < smp_cpus; i++) {
-        DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
-        int ppibase = NUM_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
-        int irq;
-        /* Mapping from the output timer irq lines from the CPU to the
-         * GIC PPI inputs we use for the virt board.
-         */
-        const int timer_irq[] = {
-            [GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
-            [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
-            [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
-            [GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
-        };
-
-        for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
-            qdev_connect_gpio_out(cpudev, irq,
-                                  qdev_get_gpio_in(vms->gic,
-                                                   ppibase + timer_irq[irq]));
-        }
-
-        if (vms->gic_version != VIRT_GIC_VERSION_2) {
-            qemu_irq irq = qdev_get_gpio_in(vms->gic,
-                                            ppibase + ARCH_GIC_MAINT_IRQ);
-            qdev_connect_gpio_out_named(cpudev, "gicv3-maintenance-interrupt",
-                                        0, irq);
-        } else if (vms->virt) {
-            qemu_irq irq = qdev_get_gpio_in(vms->gic,
-                                            ppibase + ARCH_GIC_MAINT_IRQ);
-            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus, irq);
-        }
-
-        qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
-                                    qdev_get_gpio_in(vms->gic, ppibase
-                                                     + VIRTUAL_PMU_IRQ));
-
-        sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
-        sysbus_connect_irq(gicbusdev, i + max_cpus,
-                           qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
-        sysbus_connect_irq(gicbusdev, i + 2 * max_cpus,
-                           qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
-        sysbus_connect_irq(gicbusdev, i + 3 * max_cpus,
-                           qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
+        wire_gic_cpu_irqs(vms, qemu_get_cpu(i));
     }
 
     fdt_add_gic_node(vms);
@@ -3064,7 +3117,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      */
     if (vms->acpi_dev) {
         /* TODO: update GIC about this hotplug change here */
-        /* TODO: wire the GIC<->CPU irqs */
+        wire_gic_cpu_irqs(vms, cs);
     }
 
     /*
@@ -3148,7 +3201,7 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 
     /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
 
-    /* TODO: unwire the gic-cpu irqs here */
+    unwire_gic_cpu_irqs(vms, cs);
     /* TODO: update the GIC about this hot unplug change */
 
     /* TODO: unregister cpu for reset & update F/W info for the next boot */
diff --git a/hw/core/gpio.c b/hw/core/gpio.c
index 80d07a6ec9..abb164d5c0 100644
--- a/hw/core/gpio.c
+++ b/hw/core/gpio.c
@@ -143,7 +143,7 @@ qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, const char *name, int n)
 
 /* disconnect a GPIO output, returning the disconnected input (if any) */
 
-static qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
+qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
                                                const char *name, int n)
 {
     char *propname = g_strdup_printf("%s[%d]",
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 884c726a87..992f5419fa 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -739,6 +739,8 @@ qemu_irq qdev_get_gpio_out_connector(DeviceState *dev, const char *name, int n);
  */
 qemu_irq qdev_intercept_gpio_out(DeviceState *dev, qemu_irq icpt,
                                  const char *name, int n);
+qemu_irq qdev_disconnect_gpio_out_named(DeviceState *dev,
+                                        const char *name, int n);
 
 BusState *qdev_get_child_bus(DeviceState *dev, const char *name);
 
-- 
2.34.1




* [PATCH RFC V2 27/37] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (25 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 26/37] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 28/37] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info Salil Mehta via
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

The GIC MUST be notified of vCPU hot-(un)plug events. Introduce a notification
mechanism to convey any such events to the GIC so that it can update its vCPU
to GIC CPU interface association.

This is required to implement a workaround for the limitations posed by the ARM
architecture. For details about the constraints and workarounds, please see the
slides below:

Link: https://kvm-forum.qemu.org/2023/talk/9SMPDQ/

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c                      | 27 +++++++++++++--
 hw/intc/arm_gicv3_common.c         | 54 +++++++++++++++++++++++++++++-
 hw/intc/arm_gicv3_cpuif_common.c   |  5 +++
 hw/intc/gicv3_internal.h           |  1 +
 include/hw/arm/virt.h              |  1 +
 include/hw/intc/arm_gicv3_common.h | 22 ++++++++++++
 6 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5b829e47b7..b447e86fb6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -666,6 +666,16 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
     return dev;
 }
 
+static void virt_add_gic_cpuhp_notifier(VirtMachineState *vms)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
+
+    if (mc->has_hotpluggable_cpus) {
+        Notifier *cpuhp_notifier = gicv3_cpuhp_notifier(vms->gic);
+        notifier_list_add(&vms->cpuhp_notifiers, cpuhp_notifier);
+    }
+}
+
 static void create_its(VirtMachineState *vms)
 {
     const char *itsclass = its_class_name();
@@ -912,6 +922,9 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     } else if (vms->gic_version == VIRT_GIC_VERSION_2) {
         create_v2m(vms);
     }
+
+    /* add GIC CPU hot(un)plug update notifier */
+    virt_add_gic_cpuhp_notifier(vms);
 }
 
 static void create_uart(const VirtMachineState *vms, int uart,
@@ -2384,6 +2397,8 @@ static void machvirt_init(MachineState *machine)
 
     create_fdt(vms);
 
+    notifier_list_init(&vms->cpuhp_notifiers);
+    possible_cpus = mc->possible_cpu_arch_ids(machine);
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
         Object *cpuobj;
@@ -3034,6 +3049,14 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
                          dev, &error_abort);
 }
 
+static void virt_update_gic(VirtMachineState *vms, CPUState *cs)
+{
+    GICv3CPUHotplugInfo gic_info = { .gic = vms->gic, .cpu = cs };
+
+    /* notify gic to stitch GICC to this new cpu */
+    notifier_list_notify(&vms->cpuhp_notifiers, &gic_info);
+}
+
 static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                               Error **errp)
 {
@@ -3116,7 +3139,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     * vCPUs have their GIC state initialized during machvirt_init().
      */
     if (vms->acpi_dev) {
-        /* TODO: update GIC about this hotplug change here */
+        virt_update_gic(vms, cs);
         wire_gic_cpu_irqs(vms, cs);
     }
 
@@ -3202,7 +3225,7 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
 
     unwire_gic_cpu_irqs(vms, cs);
-    /* TODO: update the GIC about this hot unplug change */
+    virt_update_gic(vms, cs);
 
     /* TODO: unregister cpu for reset & update F/W info for the next boot */
 
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index ebd99af610..fc87fa9369 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -33,7 +33,6 @@
 #include "hw/arm/linux-boot-if.h"
 #include "sysemu/kvm.h"
 
-
 static void gicv3_gicd_no_migration_shift_bug_post_load(GICv3State *cs)
 {
     if (cs->gicd_no_migration_shift_bug) {
@@ -322,6 +321,56 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
     }
 }
 
+static int arm_gicv3_get_proc_num(GICv3State *s, CPUState *cpu)
+{
+    uint64_t mp_affinity;
+    uint64_t gicr_typer;
+    uint64_t cpu_affid;
+    int i;
+
+    mp_affinity = object_property_get_uint(OBJECT(cpu), "mp-affinity", NULL);
+    /* match the cpu mp-affinity to get the gic cpuif number */
+    for (i = 0; i < s->num_cpu; i++) {
+        gicr_typer = s->cpu[i].gicr_typer;
+        cpu_affid = (gicr_typer >> 32) & 0xFFFFFF;
+        if (cpu_affid == mp_affinity) {
+            return i;
+        }
+    }
+
+    return -1;
+}
+
+static void arm_gicv3_cpu_update_notifier(Notifier *notifier, void * data)
+{
+    GICv3CPUHotplugInfo *gic_info = (GICv3CPUHotplugInfo *)data;
+    CPUState *cpu = gic_info->cpu;
+    int gic_cpuif_num;
+    GICv3State *s;
+
+    s = ARM_GICV3_COMMON(gic_info->gic);
+
+    /* this gets us the mapped gicv3 cpuif corresponding to the mpidr */
+    gic_cpuif_num = arm_gicv3_get_proc_num(s, cpu);
+    if (gic_cpuif_num < 0) {
+        error_report("Failed to associate cpu %d with any GIC cpuif",
+                     cpu->cpu_index);
+        abort();
+    }
+
+    /* check if update is for vcpu hot-unplug */
+    if (qemu_enabled_cpu(cpu)) {
+        s->cpu[gic_cpuif_num].cpu = NULL;
+        return;
+    }
+
+    /* re-stitch the gic cpuif to this new cpu */
+    gicv3_set_gicv3state(cpu, &s->cpu[gic_cpuif_num]);
+    gicv3_set_cpustate(&s->cpu[gic_cpuif_num], cpu);
+
+    /* TODO: initialize the registers info for this newly added cpu */
+}
+
 static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
 {
     GICv3State *s = ARM_GICV3_COMMON(dev);
@@ -444,6 +493,8 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
         s->cpu[cpuidx - 1].gicr_typer |= GICR_TYPER_LAST;
     }
 
+    s->cpu_update_notifier.notify = arm_gicv3_cpu_update_notifier;
+
     s->itslist = g_ptr_array_new();
 }
 
@@ -451,6 +502,7 @@ static void arm_gicv3_finalize(Object *obj)
 {
     GICv3State *s = ARM_GICV3_COMMON(obj);
 
+    notifier_remove(&s->cpu_update_notifier);
     g_free(s->redist_region_count);
 }
 
diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c
index ff1239f65d..381cf2754b 100644
--- a/hw/intc/arm_gicv3_cpuif_common.c
+++ b/hw/intc/arm_gicv3_cpuif_common.c
@@ -20,3 +20,8 @@ void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
 
     env->gicv3state = (void *)s;
 };
+
+void gicv3_set_cpustate(GICv3CPUState *s, CPUState *cpu)
+{
+    s->cpu = cpu;
+}
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 29d5cdc1b6..9d4c1209bd 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -848,5 +848,6 @@ static inline void gicv3_cache_all_target_cpustates(GICv3State *s)
 }
 
 void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s);
+void gicv3_set_cpustate(GICv3CPUState *s, CPUState *cpu);
 
 #endif /* QEMU_ARM_GICV3_INTERNAL_H */
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 09a0b2d4f0..f9a748a5a9 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -189,6 +189,7 @@ struct VirtMachineState {
     PCIBus *bus;
     char *oem_id;
     char *oem_table_id;
+    NotifierList cpuhp_notifiers;
 };
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index 4e2fb518e7..97a48f44b9 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -280,6 +280,7 @@ struct GICv3State {
     GICv3CPUState *gicd_irouter_target[GICV3_MAXIRQ];
     uint32_t gicd_nsacr[DIV_ROUND_UP(GICV3_MAXIRQ, 16)];
 
+    Notifier cpu_update_notifier;
     GICv3CPUState *cpu;
     /* List of all ITSes connected to this GIC */
     GPtrArray *itslist;
@@ -328,6 +329,27 @@ struct ARMGICv3CommonClass {
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
                               const MemoryRegionOps *ops);
+/**
+ * Structure used by GICv3 CPU hotplug notifier
+ */
+typedef struct GICv3CPUHotplugInfo {
+    DeviceState *gic; /* GICv3State */
+    CPUState *cpu;
+} GICv3CPUHotplugInfo;
+
+/**
+ * gicv3_cpuhp_notifier
+ *
+ * Returns CPU hotplug notifier which could be used to update GIC about any
+ * CPU hot(un)plug events.
+ *
+ * Returns: Notifier initialized with CPU Hot(un)plug update function
+ */
+static inline Notifier *gicv3_cpuhp_notifier(DeviceState *dev)
+{
+    GICv3State *s = ARM_GICV3_COMMON(dev);
+    return &s->cpu_update_notifier;
+}
 
 /**
  * gicv3_class_name
-- 
2.34.1




* [PATCH RFC V2 28/37] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (26 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 27/37] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

vCPU register info needs to be re-initialized each time a vCPU is hot-plugged.
This has to be done for both the emulation/TCG and KVM cases, in the context of
the GIC update notification for vCPU hot-(un)plug events. This change adds that
support and refactors the existing code to maximize re-use.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/intc/arm_gicv3.c                |   1 +
 hw/intc/arm_gicv3_common.c         |   7 +-
 hw/intc/arm_gicv3_cpuif.c          | 257 +++++++++++++++--------------
 hw/intc/arm_gicv3_kvm.c            |   7 +-
 hw/intc/gicv3_internal.h           |   1 +
 include/hw/intc/arm_gicv3_common.h |   1 +
 6 files changed, 150 insertions(+), 124 deletions(-)

diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
index 0b8f79a122..e1c7c8c4bc 100644
--- a/hw/intc/arm_gicv3.c
+++ b/hw/intc/arm_gicv3.c
@@ -410,6 +410,7 @@ static void arm_gicv3_class_init(ObjectClass *klass, void *data)
     ARMGICv3Class *agc = ARM_GICV3_CLASS(klass);
 
     agcc->post_load = arm_gicv3_post_load;
+    agcc->init_cpu_reginfo = gicv3_init_cpu_reginfo;
     device_class_set_parent_realize(dc, arm_gic_realize, &agc->parent_realize);
 }
 
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index fc87fa9369..d051024a30 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -345,10 +345,12 @@ static void arm_gicv3_cpu_update_notifier(Notifier *notifier, void * data)
 {
     GICv3CPUHotplugInfo *gic_info = (GICv3CPUHotplugInfo *)data;
     CPUState *cpu = gic_info->cpu;
+    ARMGICv3CommonClass *c;
     int gic_cpuif_num;
     GICv3State *s;
 
     s = ARM_GICV3_COMMON(gic_info->gic);
+    c = ARM_GICV3_COMMON_GET_CLASS(s);
 
     /* this gets us the mapped gicv3 cpuif corresponding to the mpidr */
     gic_cpuif_num = arm_gicv3_get_proc_num(s, cpu);
@@ -368,7 +370,10 @@ static void arm_gicv3_cpu_update_notifier(Notifier *notifier, void * data)
     gicv3_set_gicv3state(cpu, &s->cpu[gic_cpuif_num]);
     gicv3_set_cpustate(&s->cpu[gic_cpuif_num], cpu);
 
-    /* TODO: initialize the registers info for this newly added cpu */
+    /* initialize the registers info for this newly added cpu */
+    if (c->init_cpu_reginfo) {
+        c->init_cpu_reginfo(cpu);
+    }
 }
 
 static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 7b7a0fdb9c..70fc2cc858 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -2782,6 +2782,127 @@ static const ARMCPRegInfo gicv3_cpuif_ich_apxr23_reginfo[] = {
     },
 };
 
+void gicv3_init_cpu_reginfo(CPUState *cs)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    GICv3CPUState *gcs = icc_cs_from_env(&cpu->env);
+
+    /*
+     * If the CPU doesn't define a GICv3 configuration, probably because
+     * in real hardware it doesn't have one, then we use default values
+     * matching the one used by most Arm CPUs. This applies to:
+     *  cpu->gic_num_lrs
+     *  cpu->gic_vpribits
+     *  cpu->gic_vprebits
+     *  cpu->gic_pribits
+     */
+
+    /*
+     * Note that we can't just use the GICv3CPUState as an opaque pointer
+     * in define_arm_cp_regs_with_opaque(), because when we're called back
+     * it might be with code translated by CPU 0 but run by CPU 1, in
+     * which case we'd get the wrong value.
+     * So instead we define the regs with no ri->opaque info, and
+     * get back to the GICv3CPUState from the CPUARMState.
+     */
+    define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
+
+    /*
+     * The CPU implementation specifies the number of supported
+     * bits of physical priority. For backwards compatibility
+     * of migration, we have a compat property that forces use
+     * of 8 priority bits regardless of what the CPU really has.
+     */
+    if (gcs->gic->force_8bit_prio) {
+        gcs->pribits = 8;
+    } else {
+        gcs->pribits = cpu->gic_pribits ?: 5;
+    }
+
+    /*
+     * The GICv3 has separate ID register fields for virtual priority
+     * and preemption bit values, but only a single ID register field
+     * for the physical priority bits. The preemption bit count is
+     * always the same as the priority bit count, except that 8 bits
+     * of priority means 7 preemption bits. We precalculate the
+     * preemption bits because it simplifies the code and makes the
+     * parallels between the virtual and physical bits of the GIC
+     * a bit clearer.
+     */
+    gcs->prebits = gcs->pribits;
+    if (gcs->prebits == 8) {
+        gcs->prebits--;
+    }
+    /*
+     * Check that CPU code defining pribits didn't violate
+     * architectural constraints our implementation relies on.
+     */
+    g_assert(gcs->pribits >= 4 && gcs->pribits <= 8);
+
+    /*
+     * gicv3_cpuif_reginfo[] defines ICC_AP*R0_EL1; add definitions
+     * for ICC_AP*R{1,2,3}_EL1 if the prebits value requires them.
+     */
+    if (gcs->prebits >= 6) {
+        define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr1_reginfo);
+    }
+    if (gcs->prebits == 7) {
+        define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr23_reginfo);
+    }
+
+    if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
+        int j;
+
+        gcs->num_list_regs = cpu->gic_num_lrs ?: 4;
+        gcs->vpribits = cpu->gic_vpribits ?: 5;
+        gcs->vprebits = cpu->gic_vprebits ?: 5;
+
+        /*
+         * Check against architectural constraints: getting these
+         * wrong would be a bug in the CPU code defining these,
+         * and the implementation relies on them holding.
+         */
+        g_assert(gcs->vprebits <= gcs->vpribits);
+        g_assert(gcs->vprebits >= 5 && gcs->vprebits <= 7);
+        g_assert(gcs->vpribits >= 5 && gcs->vpribits <= 8);
+
+        define_arm_cp_regs(cpu, gicv3_cpuif_hcr_reginfo);
+
+        for (j = 0; j < gcs->num_list_regs; j++) {
+            /*
+             * Note that the AArch64 LRs are 64-bit; the AArch32 LRs
+             * are split into two cp15 regs, LR (the low part, with the
+             * same encoding as the AArch64 LR) and LRC (the high part).
+             */
+            ARMCPRegInfo lr_regset[] = {
+                { .name = "ICH_LRn_EL2", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 4, .crn = 12,
+                  .crm = 12 + (j >> 3), .opc2 = j & 7,
+                  .type = ARM_CP_IO | ARM_CP_NO_RAW,
+                  .access = PL2_RW,
+                  .readfn = ich_lr_read,
+                  .writefn = ich_lr_write,
+                },
+                { .name = "ICH_LRCn_EL2", .state = ARM_CP_STATE_AA32,
+                  .cp = 15, .opc1 = 4, .crn = 12,
+                  .crm = 14 + (j >> 3), .opc2 = j & 7,
+                  .type = ARM_CP_IO | ARM_CP_NO_RAW,
+                  .access = PL2_RW,
+                  .readfn = ich_lr_read,
+                  .writefn = ich_lr_write,
+                },
+            };
+            define_arm_cp_regs(cpu, lr_regset);
+        }
+        if (gcs->vprebits >= 6) {
+            define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr1_reginfo);
+        }
+        if (gcs->vprebits == 7) {
+            define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
+        }
+    }
+}
+
 static void gicv3_cpuif_el_change_hook(ARMCPU *cpu, void *opaque)
 {
     GICv3CPUState *cs = opaque;
@@ -2804,131 +2925,23 @@ void gicv3_init_cpuif(GICv3State *s)
 
     for (i = 0; i < s->num_cpu; i++) {
         ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
-        GICv3CPUState *cs = &s->cpu[i];
-
-        /*
-         * If the CPU doesn't define a GICv3 configuration, probably because
-         * in real hardware it doesn't have one, then we use default values
-         * matching the one used by most Arm CPUs. This applies to:
-         *  cpu->gic_num_lrs
-         *  cpu->gic_vpribits
-         *  cpu->gic_vprebits
-         *  cpu->gic_pribits
-         */
-
-        /* Note that we can't just use the GICv3CPUState as an opaque pointer
-         * in define_arm_cp_regs_with_opaque(), because when we're called back
-         * it might be with code translated by CPU 0 but run by CPU 1, in
-         * which case we'd get the wrong value.
-         * So instead we define the regs with no ri->opaque info, and
-         * get back to the GICv3CPUState from the CPUARMState.
-         *
-         * These CP regs callbacks can be called from either TCG or HVF code.
-         */
-        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
-
-        /*
-         * The CPU implementation specifies the number of supported
-         * bits of physical priority. For backwards compatibility
-         * of migration, we have a compat property that forces use
-         * of 8 priority bits regardless of what the CPU really has.
-         */
-        if (s->force_8bit_prio) {
-            cs->pribits = 8;
-        } else {
-            cs->pribits = cpu->gic_pribits ?: 5;
-        }
-
-        /*
-         * The GICv3 has separate ID register fields for virtual priority
-         * and preemption bit values, but only a single ID register field
-         * for the physical priority bits. The preemption bit count is
-         * always the same as the priority bit count, except that 8 bits
-         * of priority means 7 preemption bits. We precalculate the
-         * preemption bits because it simplifies the code and makes the
-         * parallels between the virtual and physical bits of the GIC
-         * a bit clearer.
-         */
-        cs->prebits = cs->pribits;
-        if (cs->prebits == 8) {
-            cs->prebits--;
-        }
-        /*
-         * Check that CPU code defining pribits didn't violate
-         * architectural constraints our implementation relies on.
-         */
-        g_assert(cs->pribits >= 4 && cs->pribits <= 8);
 
-        /*
-         * gicv3_cpuif_reginfo[] defines ICC_AP*R0_EL1; add definitions
-         * for ICC_AP*R{1,2,3}_EL1 if the prebits value requires them.
-         */
-        if (cs->prebits >= 6) {
-            define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr1_reginfo);
-        }
-        if (cs->prebits == 7) {
-            define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr23_reginfo);
-        }
-
-        if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
-            int j;
-
-            cs->num_list_regs = cpu->gic_num_lrs ?: 4;
-            cs->vpribits = cpu->gic_vpribits ?: 5;
-            cs->vprebits = cpu->gic_vprebits ?: 5;
-
-            /* Check against architectural constraints: getting these
-             * wrong would be a bug in the CPU code defining these,
-             * and the implementation relies on them holding.
-             */
-            g_assert(cs->vprebits <= cs->vpribits);
-            g_assert(cs->vprebits >= 5 && cs->vprebits <= 7);
-            g_assert(cs->vpribits >= 5 && cs->vpribits <= 8);
-
-            define_arm_cp_regs(cpu, gicv3_cpuif_hcr_reginfo);
-
-            for (j = 0; j < cs->num_list_regs; j++) {
-                /* Note that the AArch64 LRs are 64-bit; the AArch32 LRs
-                 * are split into two cp15 regs, LR (the low part, with the
-                 * same encoding as the AArch64 LR) and LRC (the high part).
+        if (qemu_enabled_cpu(CPU(cpu))) {
+            GICv3CPUState *cs = icc_cs_from_env(&cpu->env);
+            gicv3_init_cpu_reginfo(CPU(cpu));
+            if (tcg_enabled() || qtest_enabled()) {
+                /*
+                 * We can only trap EL changes with TCG. However the GIC
+                 * interrupt state only changes on EL changes involving EL2 or
+                 * EL3, so for the non-TCG case this is OK, as EL2 and EL3 can't
+                 * exist.
                  */
-                ARMCPRegInfo lr_regset[] = {
-                    { .name = "ICH_LRn_EL2", .state = ARM_CP_STATE_BOTH,
-                      .opc0 = 3, .opc1 = 4, .crn = 12,
-                      .crm = 12 + (j >> 3), .opc2 = j & 7,
-                      .type = ARM_CP_IO | ARM_CP_NO_RAW,
-                      .access = PL2_RW,
-                      .readfn = ich_lr_read,
-                      .writefn = ich_lr_write,
-                    },
-                    { .name = "ICH_LRCn_EL2", .state = ARM_CP_STATE_AA32,
-                      .cp = 15, .opc1 = 4, .crn = 12,
-                      .crm = 14 + (j >> 3), .opc2 = j & 7,
-                      .type = ARM_CP_IO | ARM_CP_NO_RAW,
-                      .access = PL2_RW,
-                      .readfn = ich_lr_read,
-                      .writefn = ich_lr_write,
-                    },
-                };
-                define_arm_cp_regs(cpu, lr_regset);
-            }
-            if (cs->vprebits >= 6) {
-                define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr1_reginfo);
-            }
-            if (cs->vprebits == 7) {
-                define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
+                arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook,
+                                            cs);
+            } else {
+                assert(!arm_feature(&cpu->env, ARM_FEATURE_EL2));
+                assert(!arm_feature(&cpu->env, ARM_FEATURE_EL3));
             }
         }
-        if (tcg_enabled() || qtest_enabled()) {
-            /*
-             * We can only trap EL changes with TCG. However the GIC interrupt
-             * state only changes on EL changes involving EL2 or EL3, so for
-             * the non-TCG case this is OK, as EL2 and EL3 can't exist.
-             */
-            arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, cs);
-        } else {
-            assert(!arm_feature(&cpu->env, ARM_FEATURE_EL2));
-            assert(!arm_feature(&cpu->env, ARM_FEATURE_EL3));
-        }
     }
 }
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index b6f50caf84..67985e4a21 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -804,6 +804,10 @@ static void vm_change_state_handler(void *opaque, bool running,
     }
 }
 
+static void kvm_gicv3_init_cpu_reginfo(CPUState *cs)
+{
+    define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+}
 
 static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
 {
@@ -837,7 +841,7 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
     for (i = 0; i < s->num_cpu; i++) {
         CPUState *cs = qemu_get_cpu(i);
         if (qemu_enabled_cpu(cs)) {
-            define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+            kvm_gicv3_init_cpu_reginfo(cs);
         }
     }
 
@@ -926,6 +930,7 @@ static void kvm_arm_gicv3_class_init(ObjectClass *klass, void *data)
 
     agcc->pre_save = kvm_arm_gicv3_get;
     agcc->post_load = kvm_arm_gicv3_put;
+    agcc->init_cpu_reginfo = kvm_gicv3_init_cpu_reginfo;
     device_class_set_parent_realize(dc, kvm_arm_gicv3_realize,
                                     &kgc->parent_realize);
     resettable_class_set_parent_phases(rc, NULL, kvm_arm_gicv3_reset_hold, NULL,
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 9d4c1209bd..0bed0f6e2a 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -709,6 +709,7 @@ void gicv3_redist_vinvall(GICv3CPUState *cs, uint64_t vptaddr);
 
 void gicv3_redist_send_sgi(GICv3CPUState *cs, int grp, int irq, bool ns);
 void gicv3_init_cpuif(GICv3State *s);
+void gicv3_init_cpu_reginfo(CPUState *cs);
 
 /**
  * gicv3_cpuif_update:
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index 97a48f44b9..b5f8ba17ff 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -325,6 +325,7 @@ struct ARMGICv3CommonClass {
 
     void (*pre_save)(GICv3State *s);
     void (*post_load)(GICv3State *s);
+    void (*init_cpu_reginfo)(CPUState *cs);
 };
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (27 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 28/37] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-29  0:30   ` Gavin Shan
  2023-09-26 10:04 ` [PATCH RFC V2 30/37] hw/arm: Changes required for reset and to support next boot Salil Mehta via
                   ` (4 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

During vCPU hot-(un)plug, the running guest VM needs to be notified about a new
vCPU being added, or asked to release a vCPU that is already part of the guest
VM. This is done using an ACPI GED event, which is eventually demultiplexed into
a CPU hotplug event and further into the specific hot-(un)plug event of a
particular vCPU.

This change adds the ACPI calls to the existing hot-(un)plug hooks so that QEMU
triggers the ACPI GED events towards the guest VM.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt.c | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b447e86fb6..6f5ee4a1c6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3157,6 +3157,7 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
     CPUState *cs = CPU(dev);
+    Error *local_err = NULL;
     CPUArchId *cpu_slot;
 
     /* insert the cold/hot-plugged vcpu in the slot */
@@ -3169,12 +3170,20 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      * plugged, guest is also notified.
      */
     if (vms->acpi_dev) {
-        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
+        HotplugHandlerClass *hhc;
+        /* update acpi hotplug state and send cpu hotplug event to guest */
+        hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
+        hhc->plug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
+        if (local_err) {
+            goto fail;
+        }
         /* TODO: register cpu for reset & update F/W info for the next boot */
     }
 
     cs->disabled = false;
     return;
+fail:
+    error_propagate(errp, local_err);
 }
 
 static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
@@ -3182,8 +3191,10 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
 {
     MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+    HotplugHandlerClass *hhc;
     ARMCPU *cpu = ARM_CPU(dev);
     CPUState *cs = CPU(dev);
+    Error *local_err = NULL;
 
     if (!vms->acpi_dev || !dev->realized) {
         error_setg(errp, "GED does not exists or device is not realized!");
@@ -3202,9 +3213,16 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
         return;
     }
 
-    /* TODO: request cpu hotplug from guest */
+    /* request cpu hotplug from guest */
+    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
+    hhc->unplug_request(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
+    if (local_err) {
+        goto fail;
+    }
 
     return;
+fail:
+    error_propagate(errp, local_err);
 }
 
 static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -3212,7 +3230,9 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
     MachineState *ms = MACHINE(hotplug_dev);
+    HotplugHandlerClass *hhc;
     CPUState *cs = CPU(dev);
+    Error *local_err = NULL;
     CPUArchId *cpu_slot;
 
     if (!vms->acpi_dev || !dev->realized) {
@@ -3222,7 +3242,12 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
 
     cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
 
-    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
+    /* update the acpi cpu hotplug state for cpu hot-unplug */
+    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
+    hhc->unplug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
+    if (local_err) {
+        goto fail;
+    }
 
     unwire_gic_cpu_irqs(vms, cs);
     virt_update_gic(vms, cs);
@@ -3236,6 +3261,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     cs->disabled = true;
 
     return;
+fail:
+    error_propagate(errp, local_err);
 }
 
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
-- 
2.34.1




* [PATCH RFC V2 30/37] hw/arm: Changes required for reset and to support next boot
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (28 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-09-26 10:04 ` [PATCH RFC V2 31/37] physmem, gdbstub: Common helping funcs/changes to *unrealize* vCPU Salil Mehta via
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Update the firmware config with the next-boot CPU count, and register a reset
callback that is invoked to reset the vCPU when the guest reboots.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/boot.c         |  2 +-
 hw/arm/virt.c         | 18 +++++++++++++++---
 include/hw/arm/boot.h |  2 ++
 include/hw/arm/virt.h |  1 +
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 720f22531a..2a2d27c20a 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -682,7 +682,7 @@ fail:
     return -1;
 }
 
-static void do_cpu_reset(void *opaque)
+void do_cpu_reset(void *opaque)
 {
     ARMCPU *cpu = opaque;
     CPUState *cs = CPU(cpu);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6f5ee4a1c6..e46f529801 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -45,6 +45,8 @@
 #include "sysemu/device_tree.h"
 #include "sysemu/numa.h"
 #include "sysemu/runstate.h"
+#include "sysemu/reset.h"
+#include "sysemu/sysemu.h"
 #include "sysemu/tpm.h"
 #include "sysemu/tcg.h"
 #include "sysemu/kvm.h"
@@ -1357,7 +1359,7 @@ static FWCfgState *create_fw_cfg(const VirtMachineState *vms, AddressSpace *as)
     char *nodename;
 
     fw_cfg = fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as);
-    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)ms->smp.cpus);
+    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus);
 
     nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
     qemu_fdt_add_subnode(ms->fdt, nodename);
@@ -3177,7 +3179,13 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         if (local_err) {
             goto fail;
         }
-        /* TODO: register cpu for reset & update F/W info for the next boot */
+        /* register this cpu for reset & update F/W info for the next boot */
+        qemu_register_reset(do_cpu_reset, ARM_CPU(cs));
+    }
+
+    vms->boot_cpus++;
+    if (vms->fw_cfg) {
+        fw_cfg_modify_i16(vms->fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus);
     }
 
     cs->disabled = false;
@@ -3252,7 +3260,11 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     unwire_gic_cpu_irqs(vms, cs);
     virt_update_gic(vms, cs);
 
-    /* TODO: unregister cpu for reset & update F/W info for the next boot */
+    qemu_unregister_reset(do_cpu_reset, ARM_CPU(cs));
+    vms->boot_cpus--;
+    if (vms->fw_cfg) {
+        fw_cfg_modify_i16(vms->fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus);
+    }
 
     qobject_unref(dev->opts);
     dev->opts = NULL;
diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
index 80c492d742..f81326a1dc 100644
--- a/include/hw/arm/boot.h
+++ b/include/hw/arm/boot.h
@@ -178,6 +178,8 @@ AddressSpace *arm_boot_address_space(ARMCPU *cpu,
 int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
                  hwaddr addr_limit, AddressSpace *as, MachineState *ms);
 
+void do_cpu_reset(void *opaque);
+
 /* Write a secure board setup routine with a dummy handler for SMCs */
 void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
                                             const struct arm_boot_info *info,
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index f9a748a5a9..a130fdad52 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -176,6 +176,7 @@ struct VirtMachineState {
     MemMapEntry *memmap;
     char *pciehb_nodename;
     const int *irqmap;
+    uint16_t boot_cpus;
     int fdt_size;
     uint32_t clock_phandle;
     uint32_t gic_phandle;
-- 
2.34.1




* [PATCH RFC V2 31/37] physmem, gdbstub: Common helping funcs/changes to *unrealize* vCPU
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (29 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 30/37] hw/arm: Changes required for reset and to support next boot Salil Mehta via
@ 2023-09-26 10:04 ` Salil Mehta via
  2023-10-03  6:33   ` [PATCH RFC V2 31/37] physmem,gdbstub: " Philippe Mathieu-Daudé
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
                   ` (2 subsequent siblings)
  33 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:04 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Supporting vCPU hotplug for the ARM architecture also means introducing the new
functionality of unrealizing an ARMCPU, which requires some new common functions.

These are defined as part of an architecture-independent change so that the code
can be reused by other interested parties.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 gdbstub/gdbstub.c         | 13 +++++++++++++
 include/exec/cpu-common.h |  8 ++++++++
 include/exec/gdbstub.h    |  1 +
 include/hw/core/cpu.h     |  1 +
 softmmu/physmem.c         | 25 +++++++++++++++++++++++++
 5 files changed, 48 insertions(+)

diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index 5f28d5cf57..ddbcb4f115 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -491,6 +491,19 @@ void gdb_register_coprocessor(CPUState *cpu,
     }
 }
 
+void gdb_unregister_coprocessor_all(CPUState *cpu)
+{
+    GDBRegisterState *s, *p;
+
+    p = cpu->gdb_regs;
+    while (p) {
+        s = p;
+        p = p->next;
+        g_free(s);
+    }
+    cpu->gdb_regs = NULL;
+}
+
 static void gdb_process_breakpoint_remove_all(GDBProcess *p)
 {
     CPUState *cpu = gdb_get_first_cpu_in_process(p);
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 87dc9a752c..27cd4d32b1 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -120,6 +120,14 @@ size_t qemu_ram_pagesize_largest(void);
  */
 void cpu_address_space_init(CPUState *cpu, int asidx,
                             const char *prefix, MemoryRegion *mr);
+/**
+ * cpu_address_space_destroy:
+ * @cpu: CPU for which address space needs to be destroyed
+ * @asidx: integer index of this address space
+ *
+ * Note that with KVM only one address space is supported.
+ */
+void cpu_address_space_destroy(CPUState *cpu, int asidx);
 
 void cpu_physical_memory_rw(hwaddr addr, void *buf,
                             hwaddr len, bool is_write);
diff --git a/include/exec/gdbstub.h b/include/exec/gdbstub.h
index 7d743fe1e9..a22f0875e2 100644
--- a/include/exec/gdbstub.h
+++ b/include/exec/gdbstub.h
@@ -17,6 +17,7 @@ typedef int (*gdb_set_reg_cb)(CPUArchState *env, uint8_t *buf, int reg);
 void gdb_register_coprocessor(CPUState *cpu,
                               gdb_get_reg_cb get_reg, gdb_set_reg_cb set_reg,
                               int num_regs, const char *xml, int g_pos);
+void gdb_unregister_coprocessor_all(CPUState *cpu);
 
 /**
  * gdbserver_start: start the gdb server
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index dab572c9bd..ffd815a0d8 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -366,6 +366,7 @@ struct CPUState {
     QSIMPLEQ_HEAD(, qemu_work_item) work_list;
 
     CPUAddressSpace *cpu_ases;
+    int cpu_ases_ref_count;
     int num_ases;
     AddressSpace *as;
     MemoryRegion *memory;
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..a93ae783af 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -762,6 +762,7 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
 
     if (!cpu->cpu_ases) {
         cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
+        cpu->cpu_ases_ref_count = cpu->num_ases;
     }
 
     newas = &cpu->cpu_ases[asidx];
@@ -775,6 +776,30 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
     }
 }
 
+void cpu_address_space_destroy(CPUState *cpu, int asidx)
+{
+    CPUAddressSpace *cpuas;
+
+    assert(asidx < cpu->num_ases);
+    assert(asidx == 0 || !kvm_enabled());
+    assert(cpu->cpu_ases);
+
+    cpuas = &cpu->cpu_ases[asidx];
+    if (tcg_enabled()) {
+        memory_listener_unregister(&cpuas->tcg_as_listener);
+    }
+
+    address_space_destroy(cpuas->as);
+    g_free_rcu(cpuas->as, rcu);
+
+    if (cpu->cpu_ases_ref_count == 1) {
+        g_free(cpu->cpu_ases);
+        cpu->cpu_ases = NULL;
+    }
+
+    cpu->cpu_ases_ref_count--;
+}
+
 AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
 {
     /* Return the AddressSpace corresponding to the specified index */
-- 
2.34.1




* [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (30 preceding siblings ...)
  2023-09-26 10:04 ` [PATCH RFC V2 31/37] physmem, gdbstub: Common helping funcs/changes to *unrealize* vCPU Salil Mehta via
@ 2023-09-26 10:36 ` Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 33/37] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
                     ` (4 more replies)
  2023-10-11 10:23 ` [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
  2023-10-12 17:02 ` Miguel Luis
  33 siblings, 5 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

vCPU hot-unplug results in unrealization of the QOM CPU object, which undoes all
the vCPU thread creation, allocations, and registrations that happened as part
of the realization process. This change introduces the ARM CPU unrealize
function that takes care of exactly that.

Note that initialized KVM vCPUs are not destroyed in the host KVM; instead,
their QEMU context is parked at the QEMU KVM layer.

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
[VP: Identified CPU stall issue & suggested probable fix]
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 target/arm/cpu-qom.h   |   3 ++
 target/arm/cpu.c       | 101 +++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.h       |  13 ++++++
 target/arm/gdbstub.c   |   6 +++
 target/arm/helper.c    |  25 ++++++++++
 target/arm/internals.h |   1 +
 target/arm/kvm64.c     |   4 ++
 7 files changed, 153 insertions(+)

diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 514c22ced9..2503493710 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -54,6 +54,7 @@ struct ARMCPUClass {
 
     const ARMCPUInfo *info;
     DeviceRealize parent_realize;
+    DeviceUnrealize parent_unrealize;
     ResettablePhases parent_phases;
 };
 
@@ -70,7 +71,9 @@ struct AArch64CPUClass {
 };
 
 void register_cp_regs_for_features(ARMCPU *cpu);
+void unregister_cp_regs_for_features(ARMCPU *cpu);
 void init_cpreg_list(ARMCPU *cpu);
+void destroy_cpreg_list(ARMCPU *cpu);
 
 /* Callback functions for the generic timer's timers. */
 void arm_gt_ptimer_cb(void *opaque);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 3a2e7e64ee..93b00835bf 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -141,6 +141,16 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
     QLIST_INSERT_HEAD(&cpu->pre_el_change_hooks, entry, node);
 }
 
+void arm_unregister_pre_el_change_hooks(ARMCPU *cpu)
+{
+    ARMELChangeHook *entry, *next;
+
+    QLIST_FOREACH_SAFE(entry, &cpu->pre_el_change_hooks, node, next) {
+        QLIST_REMOVE(entry, node);
+        g_free(entry);
+    }
+}
+
 void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
                                  void *opaque)
 {
@@ -152,6 +162,16 @@ void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
     QLIST_INSERT_HEAD(&cpu->el_change_hooks, entry, node);
 }
 
+void arm_unregister_el_change_hooks(ARMCPU *cpu)
+{
+    ARMELChangeHook *entry, *next;
+
+    QLIST_FOREACH_SAFE(entry, &cpu->el_change_hooks, node, next) {
+        QLIST_REMOVE(entry, node);
+        g_free(entry);
+    }
+}
+
 static void cp_reg_reset(gpointer key, gpointer value, gpointer opaque)
 {
     /* Reset a single ARMCPRegInfo register */
@@ -2244,6 +2264,85 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     acc->parent_realize(dev, errp);
 }
 
+static void arm_cpu_unrealizefn(DeviceState *dev)
+{
+    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUARMState *env = &cpu->env;
+    CPUState *cs = CPU(dev);
+    bool has_secure;
+
+    has_secure = cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY);
+
+    /* Undo, in reverse, everything that arm_cpu_realizefn() set up */
+    cpu_address_space_destroy(cs, ARMASIdx_NS);
+
+    if (cpu->tag_memory != NULL) {
+        cpu_address_space_destroy(cs, ARMASIdx_TagNS);
+        if (has_secure) {
+            cpu_address_space_destroy(cs, ARMASIdx_TagS);
+        }
+    }
+
+    if (has_secure) {
+        cpu_address_space_destroy(cs, ARMASIdx_S);
+    }
+
+    destroy_cpreg_list(cpu);
+    arm_cpu_unregister_gdb_regs(cpu);
+    unregister_cp_regs_for_features(cpu);
+
+    if (cpu->sau_sregion && arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+        g_free(env->sau.rbar);
+        g_free(env->sau.rlar);
+    }
+
+    if (arm_feature(env, ARM_FEATURE_PMSA) &&
+        arm_feature(env, ARM_FEATURE_V7) &&
+        cpu->pmsav7_dregion) {
+        if (arm_feature(env, ARM_FEATURE_V8)) {
+            g_free(env->pmsav8.rbar[M_REG_NS]);
+            g_free(env->pmsav8.rlar[M_REG_NS]);
+            if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+                g_free(env->pmsav8.rbar[M_REG_S]);
+                g_free(env->pmsav8.rlar[M_REG_S]);
+            }
+        } else {
+            g_free(env->pmsav7.drbar);
+            g_free(env->pmsav7.drsr);
+            g_free(env->pmsav7.dracr);
+        }
+        if (cpu->pmsav8r_hdregion) {
+            g_free(env->pmsav8.hprbar);
+            g_free(env->pmsav8.hprlar);
+        }
+    }
+
+    if (arm_feature(env, ARM_FEATURE_PMU)) {
+        if (!kvm_enabled()) {
+            arm_unregister_pre_el_change_hooks(cpu);
+            arm_unregister_el_change_hooks(cpu);
+        }
+
+#ifndef CONFIG_USER_ONLY
+        if (cpu->pmu_timer) {
+            timer_del(cpu->pmu_timer);
+        }
+#endif
+    }
+
+    cpu_remove_sync(CPU(dev));
+    acc->parent_unrealize(dev);
+
+#ifndef CONFIG_USER_ONLY
+    timer_del(cpu->gt_timer[GTIMER_PHYS]);
+    timer_del(cpu->gt_timer[GTIMER_VIRT]);
+    timer_del(cpu->gt_timer[GTIMER_HYP]);
+    timer_del(cpu->gt_timer[GTIMER_SEC]);
+    timer_del(cpu->gt_timer[GTIMER_HYPVIRT]);
+#endif
+}
+
 static ObjectClass *arm_cpu_class_by_name(const char *cpu_model)
 {
     ObjectClass *oc;
@@ -2347,6 +2446,8 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
 
     device_class_set_parent_realize(dc, arm_cpu_realizefn,
                                     &acc->parent_realize);
+    device_class_set_parent_unrealize(dc, arm_cpu_unrealizefn,
+                                      &acc->parent_unrealize);
 
     device_class_set_props(dc, arm_cpu_properties);
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index d51d39f621..9fe89cf10a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3264,6 +3264,13 @@ static inline AddressSpace *arm_addressspace(CPUState *cs, MemTxAttrs attrs)
  */
 void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
                                  void *opaque);
+/**
+ * arm_unregister_pre_el_change_hooks:
+ * Unregister all pre-EL change hook functions. Generally called during
+ * the unrealize phase.
+ */
+void arm_unregister_pre_el_change_hooks(ARMCPU *cpu);
+
 /**
  * arm_register_el_change_hook:
  * Register a hook function which will be called immediately after this
@@ -3276,6 +3283,12 @@ void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
  */
 void arm_register_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook, void
         *opaque);
+/**
+ * arm_unregister_el_change_hooks:
+ * Unregister all EL change hook functions. Generally called during
+ * the unrealize phase.
+ */
+void arm_unregister_el_change_hooks(ARMCPU *cpu);
 
 /**
  * arm_rebuild_hflags:
diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index f421c5d041..fecbd84ba6 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -580,3 +580,9 @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
     }
 #endif /* CONFIG_TCG */
 }
+
+void arm_cpu_unregister_gdb_regs(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    gdb_unregister_coprocessor_all(cs);
+}
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 50f61e42ca..272d6ba139 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -262,6 +262,19 @@ void init_cpreg_list(ARMCPU *cpu)
     g_list_free(keys);
 }
 
+void destroy_cpreg_list(ARMCPU *cpu)
+{
+    assert(cpu->cpreg_indexes);
+    assert(cpu->cpreg_values);
+    assert(cpu->cpreg_vmstate_indexes);
+    assert(cpu->cpreg_vmstate_values);
+
+    g_free(cpu->cpreg_indexes);
+    g_free(cpu->cpreg_values);
+    g_free(cpu->cpreg_vmstate_indexes);
+    g_free(cpu->cpreg_vmstate_values);
+}
+
 /*
  * Some registers are not accessible from AArch32 EL3 if SCR.NS == 0.
  */
@@ -9279,6 +9292,18 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 #endif
 }
 
+void unregister_cp_regs_for_features(ARMCPU *cpu)
+{
+    CPUARMState *env = &cpu->env;
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        /* M profile has no coprocessor registers */
+        return;
+    }
+
+    /* Empty the hash table: unregister all coprocessor registers */
+    g_hash_table_remove_all(cpu->cp_regs);
+}
+
 /* Sort alphabetically by type name, except for "any". */
 static gint arm_cpu_list_compare(gconstpointer a, gconstpointer b)
 {
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 0f01bc32a8..fe330e89e7 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -183,6 +183,7 @@ static inline int r14_bank_number(int mode)
 }
 
 void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu);
+void arm_cpu_unregister_gdb_regs(ARMCPU *cpu);
 void arm_translate_init(void);
 
 void arm_restore_state_to_opc(CPUState *cs,
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 364cc21f81..38de0b4148 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -651,6 +651,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
+    if (cs->thread_id) {
+        qemu_del_vm_change_state_handler(cs->vmcse);
+    }
+
     return 0;
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 33/37] target/arm/kvm: Write CPU state back to KVM on reset
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
@ 2023-09-26 10:36   ` Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 34/37] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

When a KVM vCPU is reset following a PSCI CPU_ON call, its power state
is not synchronized with KVM at the moment. Because the vCPU is not
marked dirty, we miss the call to kvm_arch_put_registers() that writes
to KVM's MP_STATE. Force mp_state synchronization.
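The dirty-flag mechanism described above can be sketched with a toy model
(illustrative C only, not QEMU code; `put_registers()` and `run_vcpu()` stand
in for kvm_arch_put_registers() and the KVM run loop):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of a vCPU with a QEMU-side copy of register state. */
struct vcpu {
    bool dirty;     /* QEMU-side state is newer than KVM's copy */
    int qemu_regs;  /* register state held by QEMU */
    int kvm_regs;   /* register state held by the kernel (KVM) */
};

/* Stand-in for kvm_arch_put_registers(): push QEMU state into KVM. */
static void put_registers(struct vcpu *cpu)
{
    cpu->kvm_regs = cpu->qemu_regs;
    cpu->dirty = false;
}

/* Stand-in for the KVM run loop: state is written back only if dirty. */
static void run_vcpu(struct vcpu *cpu)
{
    if (cpu->dirty) {
        put_registers(cpu);
    }
    /* ...KVM_RUN would enter the guest here... */
}
```

Without forcing the flag, the reset values QEMU wrote never reach KVM's
MP_STATE; with it, the next entry into the run loop writes them back.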

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 target/arm/kvm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 0e1d0692b1..8e7c68af6a 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -614,11 +614,12 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu)
 void kvm_arm_reset_vcpu(ARMCPU *cpu)
 {
     int ret;
+    CPUState *cs = CPU(cpu);
 
     /* Re-init VCPU so that all registers are set to
      * their respective reset values.
      */
-    ret = kvm_arm_vcpu_init(CPU(cpu));
+    ret = kvm_arm_vcpu_init(cs);
     if (ret < 0) {
         fprintf(stderr, "kvm_arm_vcpu_init failed: %s\n", strerror(-ret));
         abort();
@@ -635,6 +636,12 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
      * for the same reason we do so in kvm_arch_get_registers().
      */
     write_list_to_cpustate(cpu);
+
+    /*
+     * Ensure we call kvm_arch_put_registers(). The vCPU isn't marked dirty if
+     * it was parked in KVM and is now booting from a PSCI CPU_ON call.
+     */
+    cs->vcpu_dirty = true;
 }
 
 void kvm_arm_create_host_vcpu(ARMCPU *cpu)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 34/37] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 33/37] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
@ 2023-09-26 10:36   ` Salil Mehta via
  2023-09-29  4:15     ` [PATCH RFC V2 34/37] target/arm/kvm,tcg: " Gavin Shan
  2023-09-26 10:36   ` [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

From: Salil Mehta <salil.mehta@huawei.com>

Add registration and handling of HVC/SMC hypercall exits to the VMM.
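The filter this patch installs can be modelled as a range match over SMCCC
function IDs (a sketch only; the struct mirrors the shape of the kernel's
kvm_smccc_filter, but the lookup logic is illustrative, not the kernel
implementation):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of KVM's SMCCC filter: each entry covers a contiguous range
 * of SMCCC function IDs and names the action to take for calls in it. */
enum action { HANDLE_IN_KVM, FWD_TO_USER };

struct filter_entry {
    uint64_t base;          /* first SMCCC function ID covered */
    uint32_t nr_functions;  /* number of IDs in the range */
    enum action action;
};

/* Return the action for a hypercall function ID; calls that match no
 * filter entry keep the default in-kernel handling. */
static enum action lookup(const struct filter_entry *f, int n, uint64_t func)
{
    for (int i = 0; i < n; i++) {
        if (func >= f[i].base && func < f[i].base + f[i].nr_functions) {
            return f[i].action;
        }
    }
    return HANDLE_IN_KVM;
}
```

With CPU_ON and CPU_OFF forwarded to user space, QEMU sees those exits while
the rest of PSCI stays emulated in-kernel.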

Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 target/arm/arm-powerctl.c   | 51 +++++++++++++++++++++++++++++-------
 target/arm/helper.c         |  2 +-
 target/arm/internals.h      | 11 --------
 target/arm/kvm.c            | 52 +++++++++++++++++++++++++++++++++++++
 target/arm/kvm64.c          | 46 +++++++++++++++++++++++++++++---
 target/arm/kvm_arm.h        | 13 ++++++++++
 target/arm/meson.build      |  1 +
 target/arm/{tcg => }/psci.c |  8 ++++++
 target/arm/tcg/meson.build  |  4 ---
 9 files changed, 160 insertions(+), 28 deletions(-)
 rename target/arm/{tcg => }/psci.c (97%)

diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index 326a03153d..0184c7fb09 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -16,6 +16,7 @@
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "sysemu/tcg.h"
+#include "hw/boards.h"
 
 #ifndef DEBUG_ARM_POWERCTL
 #define DEBUG_ARM_POWERCTL 0
@@ -28,18 +29,37 @@
         } \
     } while (0)
 
+static CPUArchId *arm_get_archid_by_id(uint64_t id)
+{
+    int n;
+    CPUArchId *arch_id;
+    MachineState *ms = MACHINE(qdev_get_machine());
+
+    /*
+     * At this point disabled CPUs don't have a CPUState, but their CPUArchId
+     * exists.
+     *
+     * TODO: Is arch_id == mp_affinity? This needs work.
+     */
+    for (n = 0; n < ms->possible_cpus->len; n++) {
+        arch_id = &ms->possible_cpus->cpus[n];
+
+        if (arch_id->arch_id == id) {
+            return arch_id;
+        }
+    }
+    return NULL;
+}
+
 CPUState *arm_get_cpu_by_id(uint64_t id)
 {
-    CPUState *cpu;
+    CPUArchId *arch_id;
 
     DPRINTF("cpu %" PRId64 "\n", id);
 
-    CPU_FOREACH(cpu) {
-        ARMCPU *armcpu = ARM_CPU(cpu);
-
-        if (armcpu->mp_affinity == id) {
-            return cpu;
-        }
+    arch_id = arm_get_archid_by_id(id);
+    if (arch_id && arch_id->cpu) {
+        return CPU(arch_id->cpu);
     }
 
     qemu_log_mask(LOG_GUEST_ERROR,
@@ -148,6 +168,7 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
 {
     CPUState *target_cpu_state;
     ARMCPU *target_cpu;
+    CPUArchId *arch_id;
     struct CpuOnInfo *info;
 
     assert(qemu_mutex_iothread_locked());
@@ -168,12 +189,24 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
     }
 
     /* Retrieve the cpu we are powering up */
-    target_cpu_state = arm_get_cpu_by_id(cpuid);
-    if (!target_cpu_state) {
+    arch_id = arm_get_archid_by_id(cpuid);
+    if (!arch_id) {
         /* The cpu was not found */
         return QEMU_ARM_POWERCTL_INVALID_PARAM;
     }
 
+    target_cpu_state = CPU(arch_id->cpu);
+    if (!qemu_enabled_cpu(target_cpu_state)) {
+        /*
+         * The CPU is not plugged in or is disabled. Return the appropriate
+         * value as introduced in PSCI 1.2 (DEN0022E, issue E).
+         */
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "[ARM]%s: Denying attempt to online removed/disabled "
+                      "CPU%" PRId64"\n", __func__, cpuid);
+        return QEMU_ARM_POWERCTL_IS_OFF;
+    }
+
     target_cpu = ARM_CPU(target_cpu_state);
     if (target_cpu->power_state == PSCI_ON) {
         qemu_log_mask(LOG_GUEST_ERROR,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 272d6ba139..4d396426f2 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11187,7 +11187,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
                       env->exception.syndrome);
     }
 
-    if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) {
+    if (arm_is_psci_call(cpu, cs->exception_index)) {
         arm_handle_psci_call(cpu);
         qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n");
         return;
diff --git a/target/arm/internals.h b/target/arm/internals.h
index fe330e89e7..7ffefc2d58 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -305,21 +305,10 @@ vaddr arm_adjust_watchpoint_address(CPUState *cs, vaddr addr, int len);
 /* Callback function for when a watchpoint or breakpoint triggers. */
 void arm_debug_excp_handler(CPUState *cs);
 
-#if defined(CONFIG_USER_ONLY) || !defined(CONFIG_TCG)
-static inline bool arm_is_psci_call(ARMCPU *cpu, int excp_type)
-{
-    return false;
-}
-static inline void arm_handle_psci_call(ARMCPU *cpu)
-{
-    g_assert_not_reached();
-}
-#else
 /* Return true if the r0/x0 value indicates that this SMC/HVC is a PSCI call. */
 bool arm_is_psci_call(ARMCPU *cpu, int excp_type);
 /* Actually handle a PSCI call */
 void arm_handle_psci_call(ARMCPU *cpu);
-#endif
 
 /**
  * arm_clear_exclusive: clear the exclusive monitor
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 8e7c68af6a..6f3fd5aecd 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -250,6 +250,7 @@ int kvm_arm_get_max_vm_ipa_size(MachineState *ms, bool *fixed_ipa)
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     int ret = 0;
+
     /* For ARM interrupt delivery is always asynchronous,
      * whether we are using an in-kernel VGIC or not.
      */
@@ -280,6 +281,22 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         }
     }
 
+    /*
+     * To be able to handle PSCI CPU_ON calls in QEMU, we need to install an
+     * SMCCC filter in the host KVM. This is required to support features like
+     * virtual CPU hotplug on ARM platforms.
+     */
+    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN64_CPU_ON,
+                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
+        error_report("CPU On PSCI-to-user-space fwd filter install failed");
+        abort();
+    }
+    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN_CPU_OFF,
+                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
+        error_report("CPU Off PSCI-to-user-space fwd filter install failed");
+        abort();
+    }
+
     kvm_arm_init_debug(s);
 
     return ret;
@@ -952,6 +969,38 @@ static int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
     return -1;
 }
 
+static int kvm_arm_handle_hypercall(CPUState *cs, struct kvm_run *run)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+
+    kvm_cpu_synchronize_state(cs);
+
+    /*
+     * Hard-code the immediate to 0, as we do not expect a non-zero value for
+     * now. This might change in future versions; KVM_GET_ONE_REG could then
+     * be used, but the synchronize logic would first have to be enhanced to
+     * also fetch the ESR_EL2 value.
+     */
+    if (run->hypercall.flags == KVM_HYPERCALL_EXIT_SMC) {
+        cs->exception_index = EXCP_SMC;
+        env->exception.syndrome = syn_aa64_smc(0);
+    } else {
+        cs->exception_index = EXCP_HVC;
+        env->exception.syndrome = syn_aa64_hvc(0);
+    }
+    env->exception.target_el = 1;
+    qemu_mutex_lock_iothread();
+    arm_cpu_do_interrupt(cs);
+    qemu_mutex_unlock_iothread();
+
+    /*
+     * For PSCI, exit the kvm_run loop and process the work. Especially
+     * important if this was a CPU_OFF command and we can't return to the guest.
+     */
+    return EXCP_INTERRUPT;
+}
+
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 {
     int ret = 0;
@@ -967,6 +1016,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
                                        run->arm_nisv.fault_ipa);
         break;
+    case KVM_EXIT_HYPERCALL:
+        ret = kvm_arm_handle_hypercall(cs, run);
+        break;
     default:
         qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
                       __func__, run->exit_reason);
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 38de0b4148..efe24e3f90 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -113,6 +113,25 @@ bool kvm_arm_hw_debug_active(CPUState *cs)
     return ((cur_hw_wps > 0) || (cur_hw_bps > 0));
 }
 
+static bool kvm_arm_set_vm_attr(struct kvm_device_attr *attr, const char *name)
+{
+    int err;
+
+    err = kvm_vm_ioctl(kvm_state, KVM_HAS_DEVICE_ATTR, attr);
+    if (err != 0) {
+        error_report("%s: KVM_HAS_DEVICE_ATTR: %s", name, strerror(-err));
+        return false;
+    }
+
+    err = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, attr);
+    if (err != 0) {
+        error_report("%s: KVM_SET_DEVICE_ATTR: %s", name, strerror(-err));
+        return false;
+    }
+
+    return true;
+}
+
 static bool kvm_arm_set_device_attr(CPUState *cs, struct kvm_device_attr *attr,
                                     const char *name)
 {
@@ -183,6 +202,28 @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa)
     }
 }
 
+int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction)
+{
+    struct kvm_smccc_filter filter = {
+        .base = func,
+        .nr_functions = 1,
+        .action = faction,
+    };
+    struct kvm_device_attr attr = {
+        .group = KVM_ARM_VM_SMCCC_CTRL,
+        .attr = KVM_ARM_VM_SMCCC_FILTER,
+        .flags = 0,
+        .addr = (uintptr_t)&filter,
+    };
+
+    if (!kvm_arm_set_vm_attr(&attr, "SMCCC Filter")) {
+        error_report("failed to set SMCCC filter in KVM Host");
+        return -1;
+    }
+
+    return 0;
+}
+
 static int read_sys_reg32(int fd, uint32_t *pret, uint64_t id)
 {
     uint64_t ret;
@@ -633,9 +674,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
     }
 
     /*
-     * When KVM is in use, PSCI is emulated in-kernel and not by qemu.
-     * Currently KVM has its own idea about MPIDR assignment, so we
-     * override our defaults with what we get from KVM.
+     * KVM may emulate PSCI in-kernel. Currently KVM has its own idea about
+     * MPIDR assignment, so we override our defaults with what we get from KVM.
      */
     ret = kvm_get_one_reg(cs, ARM64_SYS_REG(ARM_CPU_ID_MPIDR), &mpidr);
     if (ret) {
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 31408499b3..bf4df54c96 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -388,6 +388,15 @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa);
 
 int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
+/**
+ * kvm_arm_set_smccc_filter
+ * @func: SMCCC function ID
+ * @faction: SMCCC filter action (handle, deny, fwd-to-user) to be deployed
+ *
+ * Sets the ARM SMCCC filter in the KVM host for selective hypercall exits
+ */
+int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction);
+
 #else
 
 /*
@@ -462,6 +471,10 @@ static inline uint32_t kvm_arm_sve_get_vls(CPUState *cs)
     g_assert_not_reached();
 }
 
+static inline int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction)
+{
+    g_assert_not_reached();
+}
 #endif
 
 /**
diff --git a/target/arm/meson.build b/target/arm/meson.build
index e645e456da..fdfc8b958f 100644
--- a/target/arm/meson.build
+++ b/target/arm/meson.build
@@ -23,6 +23,7 @@ arm_system_ss.add(files(
   'arm-qmp-cmds.c',
   'cortex-regs.c',
   'machine.c',
+  'psci.c',
   'ptw.c',
 ))
 
diff --git a/target/arm/tcg/psci.c b/target/arm/psci.c
similarity index 97%
rename from target/arm/tcg/psci.c
rename to target/arm/psci.c
index 6c1239bb96..a8690a16af 100644
--- a/target/arm/tcg/psci.c
+++ b/target/arm/psci.c
@@ -21,7 +21,9 @@
 #include "exec/helper-proto.h"
 #include "kvm-consts.h"
 #include "qemu/main-loop.h"
+#include "qemu/error-report.h"
 #include "sysemu/runstate.h"
+#include "sysemu/tcg.h"
 #include "internals.h"
 #include "arm-powerctl.h"
 
@@ -157,6 +159,11 @@ void arm_handle_psci_call(ARMCPU *cpu)
     case QEMU_PSCI_0_1_FN_CPU_SUSPEND:
     case QEMU_PSCI_0_2_FN_CPU_SUSPEND:
     case QEMU_PSCI_0_2_FN64_CPU_SUSPEND:
+        if (!tcg_enabled()) {
+            warn_report("CPU suspend not supported in non-tcg mode");
+            break;
+        }
+#ifdef CONFIG_TCG
         /* Affinity levels are not supported in QEMU */
         if (param[1] & 0xfffe0000) {
             ret = QEMU_PSCI_RET_INVALID_PARAMS;
@@ -169,6 +176,7 @@ void arm_handle_psci_call(ARMCPU *cpu)
             env->regs[0] = 0;
         }
         helper_wfi(env, 4);
+#endif
         break;
     case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
         switch (param[1]) {
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index 6fca38f2cc..ad3cfcb3bd 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -51,7 +51,3 @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
   'sme_helper.c',
   'sve_helper.c',
 ))
-
-arm_system_ss.add(files(
-  'psci.c',
-))
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 33/37] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 34/37] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
@ 2023-09-26 10:36   ` Salil Mehta via
  2023-09-29  4:23     ` Gavin Shan
  2023-09-26 10:36   ` [PATCH RFC V2 36/37] tcg/mttcg: enable threads to unregister in tcg_ctxs[] Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 37/37] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled Salil Mehta via
  4 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Physical CPU hotplug results in (un)setting of the ACPI _STA.Present bit.
AARCH64 platforms do not support physical CPU hotplug. The virtual CPU hotplug
support being implemented toggles the ACPI _STA.Enabled bit instead to achieve
hotplug functionality. This is not the same as physical CPU hotplug support.

In future, if the ARM architecture gains physical CPU hotplug support, the
current design of virtual CPU hotplug can be used unchanged. Hence, there is a
need for firmware/VMM/Qemu to support evaluation of a platform-wide capability
describing the *type* of CPU hotplug support present on the platform. OSPM
might need this at boot time to correctly initialize the CPUs and other related
components in the kernel.

NOTE: This implementation will be improved to add *query* support in
subsequent versions. This is very minimal support to assist the kernel.

ASL for the implemented _OSC method:

Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
{
    CreateDWordField (Arg3, Zero, CDW1)
    If ((Arg0 == ToUUID ("0811b06e-4a27-44f9-8d60-3cbbc22e7b48") /* Platform-wide Capabilities */))
    {
        CreateDWordField (Arg3, 0x04, CDW2)
        Local0 = CDW2 /* \_SB_._OSC.CDW2 */
        If ((Arg1 != One))
        {
            CDW1 |= 0x08
        }

        Local0 &= 0x00800000
        If ((CDW2 != Local0))
        {
            CDW1 |= 0x10
        }

        CDW2 = Local0
    }
    Else
    {
        CDW1 |= 0x04
    }

    Return (Arg3)
}
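The ASL above can be modelled in C as follows (an illustrative sketch of the
_OSC evaluation only, not QEMU code; the macro names are assumptions, while
the bit values and masking behaviour follow the ASL shown):

```c
#include <assert.h>
#include <stdint.h>

/* CDW1 carries the _OSC status/error bits, CDW2 the platform-wide
 * capability bits. Only bit 23 of CDW2 (0x00800000, the proposed "CPU
 * hotplug type" capability from the ECR) is ever granted. */
#define OSC_BAD_UUID        0x04u        /* CDW1: unrecognized UUID */
#define OSC_BAD_REVISION    0x08u        /* CDW1: unknown revision */
#define OSC_CAPS_MASKED     0x10u        /* CDW1: requested caps denied */
#define OSC_CPU_HOTPLUG_CAP 0x00800000u  /* CDW2 bit 23 */

static void osc_eval(int uuid_ok, uint32_t rev,
                     uint32_t *cdw1, uint32_t *cdw2)
{
    if (!uuid_ok) {
        *cdw1 |= OSC_BAD_UUID;
        return;
    }
    if (rev != 1) {
        *cdw1 |= OSC_BAD_REVISION;
    }
    /* grant only the hotplug-type capability bit */
    uint32_t granted = *cdw2 & OSC_CPU_HOTPLUG_CAP;
    if (granted != *cdw2) {
        *cdw1 |= OSC_CAPS_MASKED;  /* OSPM asked for bits we masked off */
    }
    *cdw2 = granted;
}
```

OSPM calls _OSC with the capabilities it supports; the returned buffer tells
it which were granted and whether any were masked.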

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index cbccd2ca2d..377450dd16 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -861,6 +861,55 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
     build_fadt(table_data, linker, &fadt, vms->oem_id, vms->oem_table_id);
 }
 
+static void build_virt_osc_method(Aml *scope, VirtMachineState *vms)
+{
+    Aml *if_uuid, *else_uuid, *if_rev, *if_caps_masked, *method;
+    Aml *a_cdw1 = aml_name("CDW1");
+    Aml *a_cdw2 = aml_local(0);
+
+    method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
+    aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
+
+    /* match UUID */
+    if_uuid = aml_if(aml_equal(
+        aml_arg(0), aml_touuid("0811B06E-4A27-44F9-8D60-3CBBC22E7B48")));
+
+    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+    aml_append(if_uuid, aml_store(aml_name("CDW2"), a_cdw2));
+
+    /* check unknown revision in arg(1) */
+    if_rev = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1))));
+    /* set revision error bits,  DWORD1 Bit[3] */
+    aml_append(if_rev, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
+    aml_append(if_uuid, if_rev);
+
+    /*
+     * check support for vCPU hotplug type(=enabled) platform-wide capability
+     * in DWORD2 as specified in the below ACPI Specification ECR,
+     *  # https://bugzilla.tianocore.org/show_bug.cgi?id=4481
+     */
+    if (vms->acpi_dev) {
+        aml_append(if_uuid, aml_and(a_cdw2, aml_int(0x800000), a_cdw2));
+        /* check if OSPM specified hotplug capability bits were masked */
+        if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW2"), a_cdw2)));
+        aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
+        aml_append(if_uuid, if_caps_masked);
+    }
+    aml_append(if_uuid, aml_store(a_cdw2, aml_name("CDW2")));
+
+    aml_append(method, if_uuid);
+    else_uuid = aml_else();
+
+    /* set unrecognized UUID error bits, DWORD1 Bit[2] */
+    aml_append(else_uuid, aml_or(a_cdw1, aml_int(4), a_cdw1));
+    aml_append(method, else_uuid);
+
+    aml_append(method, aml_return(aml_arg(3)));
+    aml_append(scope, method);
+
+    return;
+}
+
 /* DSDT */
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
@@ -894,6 +943,9 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     } else {
         acpi_dsdt_add_cpus(scope, vms);
     }
+
+    build_virt_osc_method(scope, vms);
+
     acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
                        (irqmap[VIRT_UART] + ARM_SPI_BASE));
     if (vmc->acpi_expose_flash) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 36/37] tcg/mttcg: enable threads to unregister in tcg_ctxs[]
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
                     ` (2 preceding siblings ...)
  2023-09-26 10:36   ` [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
@ 2023-09-26 10:36   ` Salil Mehta via
  2023-09-26 10:36   ` [PATCH RFC V2 37/37] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled Salil Mehta via
  4 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

From: Miguel Luis <miguel.luis@oracle.com>

[This patch is just for reference. It has problems, as it does not take care of
the TranslationBlocks and their assigned regions during CPU unrealize]

When using TCG acceleration in a multi-threaded context, each vCPU has its own
thread registered in tcg_ctxs[] upon creation, and tcg_cur_ctxs stores the
current number of threads that have been created. However, the lack of a
mechanism to unregister these threads is a problem when exercising vCPU
hotplug/unplug: tcg_cur_ctxs is incremented every time a vCPU is hotplugged
but never decremented when a vCPU is unplugged, eventually breaking the assert
stating tcg_cur_ctxs < tcg_max_ctxs after a certain number of vCPU hotplugs.
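The bookkeeping imbalance described above can be modelled with a toy sketch
(illustrative only, not the TCG code; `register_thread()` and
`unregister_thread()` stand in for tcg_register_thread() and the new
tcg_unregister_thread(), and MAX_CTXS models tcg_max_ctxs):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CTXS 4

static int parent_ctx;                         /* stand-in for the parent TCGContext */
static int cur_ctxs = 1;                       /* slot 0 is the parent context */
static void *ctxs[MAX_CTXS] = { &parent_ctx };

/* Claim the next slot; fails once the table is full, modelling the
 * "tcg_cur_ctxs < tcg_max_ctxs" assertion in tcg_register_thread(). */
static int register_thread(void *ctx)
{
    if (cur_ctxs >= MAX_CTXS) {
        return -1;
    }
    ctxs[cur_ctxs++] = ctx;
    return 0;
}

/* Unclaim the last slot, as tcg_unregister_thread() does. */
static void unregister_thread(void)
{
    assert(cur_ctxs > 1);  /* never unclaim the parent's slot */
    ctxs[--cur_ctxs] = NULL;
}
```

With unregister paired to every unplug, repeated hotplug/unplug cycles never
exhaust the table; without it, the fourth hotplug would fail here.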

Suggested-by: Salil Mehta <salil.mehta@huawei.com>
[SM: Check Things To Do Section, https://lore.kernel.org/all/20200613213629.21984-1-salil.mehta@huawei.com/]
Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
---
 accel/tcg/tcg-accel-ops-mttcg.c |  1 +
 include/tcg/tcg.h               |  1 +
 tcg/tcg.c                       | 23 +++++++++++++++++++++++
 3 files changed, 25 insertions(+)

diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index b276262007..5cf9747ef2 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -127,6 +127,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
     qemu_mutex_unlock_iothread();
     rcu_remove_force_rcu_notifier(&force_rcu.notifier);
     rcu_unregister_thread();
+    tcg_unregister_thread();
     return NULL;
 }
 
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0875971719..6c1cd2a618 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -785,6 +785,7 @@ static inline void *tcg_malloc(int size)
 
 void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
 void tcg_register_thread(void);
+void tcg_unregister_thread(void);
 void tcg_prologue_init(TCGContext *s);
 void tcg_func_start(TCGContext *s);
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index ddfe9a96cb..6760f40823 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -742,6 +742,14 @@ static void alloc_tcg_plugin_context(TCGContext *s)
 #endif
 }
 
+static void free_tcg_plugin_context(TCGContext *s)
+{
+#ifdef CONFIG_PLUGIN
+    g_ptr_array_unref(s->plugin_tb->insns);
+    g_free(s->plugin_tb);
+#endif
+}
+
 /*
  * All TCG threads except the parent (i.e. the one that called tcg_context_init
  * and registered the target's TCG globals) must register with this function
@@ -791,6 +799,21 @@ void tcg_register_thread(void)
 
     tcg_ctx = s;
 }
+
+void tcg_unregister_thread(void)
+{
+    TCGContext *s = tcg_ctx;
+    unsigned int n;
+
+    /* Unclaim an entry in tcg_ctxs */
+    n = qatomic_fetch_dec(&tcg_cur_ctxs);
+    g_assert(n > 1);
+    qatomic_store_release(&tcg_ctxs[n - 1], 0);
+
+    free_tcg_plugin_context(s);
+
+    g_free(s);
+}
 #endif /* !CONFIG_USER_ONLY */
 
 /* pool based memory allocation */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 37/37] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
                     ` (3 preceding siblings ...)
  2023-09-26 10:36   ` [PATCH RFC V2 36/37] tcg/mttcg: enable threads to unregister in tcg_ctxs[] Salil Mehta via
@ 2023-09-26 10:36   ` Salil Mehta via
  4 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 10:36 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, jean-philippe, jonathan.cameron, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hotpluggable CPUs MUST be exposed as 'online-capable', as per the new
specification change. But cold-booted CPUs, if made 'online-capable' at boot
time, might not get detected by a legacy OS, which can cause compatibility
problems.

Original Change Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706

The specification change might take time; hence, support for unplugging any
cold-booted CPUs is disabled to preserve compatibility with legacy OSes.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 hw/arm/virt-acpi-build.c | 19 ++++++++++++++-----
 hw/arm/virt.c            | 16 ++++++++++++++++
 include/hw/core/cpu.h    |  2 ++
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 377450dd16..879c83a337 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -710,17 +710,26 @@ static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
     }
 
     /*
-     * ARM GIC CPU Interface can be 'online-capable' or 'enabled' at boot
-     * We MUST set 'online-capable' Bit for all hotpluggable CPUs except the
-     * first/boot CPU. Cold-booted CPUs without 'Id' can also be unplugged.
-     * Though as-of-now this is only used as a debugging feature.
+     * The ARM GIC CPU interface can be 'online-capable' or 'enabled' at boot.
+     * We MUST set the 'online-capable' bit for all hotpluggable CPUs.
+     * Change Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706
      *
      *   UEFI ACPI Specification 6.5
      *   Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
      *   Table:   5.37 GICC CPU Interface Flags
      *   Link: https://uefi.org/specs/ACPI/6.5
+     *
+     * Cold-booted CPUs, except for the first/boot CPU, SHOULD be allowed to
+     * be hot-(un)plugged as well, but for that they MUST have the
+     * 'online-capable' bit set. That creates a compatibility problem with
+     * legacy OSes, which might ignore the 'online-capable' bit at boot time,
+     * so some CPUs might not get detected. To fix this, the MADT GICC flags
+     * should be allowed to have both the 'online-capable' and 'Enabled' bits
+     * set together. That requires a UEFI ACPI standard change; until it
+     * happens, expose all cold-booted CPUs as 'Enabled' only.
+     *
      */
-    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
+    return cpu && cpu->cold_booted ? 1 : (1 << 3);
 }
 
 static void
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e46f529801..3bfe9b9db3 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3151,6 +3151,10 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
      * This shall be used during the init of ACPI Hotplug state and hot-unplug
      */
      cs->acpi_persistent = true;
+
+    if (!dev->hotplugged) {
+        cs->cold_booted = true;
+    }
 }
 
 static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -3214,6 +3218,18 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
         return;
     }
 
+    /*
+     * UEFI ACPI standard change is required to make both 'enabled' and the
+     * 'online-capable' bit co-exist instead of being mutually exclusive.
+     * check virt_acpi_get_gicc_flags() for more details.
+     *
+     * Disable the unplugging of cold-booted vCPUs as a temporary mitigation.
+     */
+    if (cs->cold_booted) {
+        error_setg(errp, "Hot-unplug of cold-booted CPU not supported!");
+        return;
+    }
+
     if (cs->cpu_index == first_cpu->cpu_index) {
         error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
                    first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index ffd815a0d8..f6b92a3285 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -441,6 +441,8 @@ struct CPUState {
     uint32_t can_do_io;
     int32_t exception_index;
 
+    bool cold_booted;
+
     AccelCPUState *accel;
     /* shared by kvm, hax and hvf */
     bool vcpu_dirty;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  2023-09-26 10:04 ` [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug Salil Mehta via
@ 2023-09-26 11:02   ` Michael S. Tsirkin
  2023-09-26 11:37     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Michael S. Tsirkin @ 2023-09-26 11:02 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel, qemu-arm, maz, jean-philippe, jonathan.cameron,
	lpieralisi, peter.maydell, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, gshan, rafael, borntraeger, alex.bennee,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2,
	maobibo, lixianglai

On Tue, Sep 26, 2023 at 11:04:23AM +0100, Salil Mehta wrote:
> ACPI GED shall be used to convey to the guest kernel about any CPU hot-(un)plug
> events. Therefore, existing ACPI GED framework inside QEMU needs to be enhanced
> to support CPU hotplug state and events.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>

Co-developed with yourself?

didn't you co-develop this with xianglai li?

Just include his S.O.B then, and drop the non-standard Co-developed-by.



> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>



> ---
>  hw/acpi/generic_event_device.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index ad252e6a91..0266733a54 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -12,6 +12,7 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "hw/acpi/acpi.h"
> +#include "hw/acpi/cpu.h"
>  #include "hw/acpi/generic_event_device.h"
>  #include "hw/irq.h"
>  #include "hw/mem/pc-dimm.h"
> @@ -239,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler *hotplug_dev,
>          } else {
>              acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev, errp);
>          }
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
>      } else {
>          error_setg(errp, "virt: device plug request for unsupported device"
>                     " type: %s", object_get_typename(OBJECT(dev)));
> @@ -253,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler *hotplug_dev,
>      if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
>                         !(object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)))) {
>          acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
>      } else {
>          error_setg(errp, "acpi: device unplug request for unsupported device"
>                     " type: %s", object_get_typename(OBJECT(dev)));
> @@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
>  
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>          acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
>      } else {
>          error_setg(errp, "acpi: device unplug for unsupported device"
>                     " type: %s", object_get_typename(OBJECT(dev)));
> @@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
>      AcpiGedState *s = ACPI_GED(adev);
>  
>      acpi_memory_ospm_status(&s->memhp_state, list);
> +    acpi_cpu_ospm_status(&s->cpuhp_state, list);
>  }
>  
>  static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
> @@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>          sel = ACPI_GED_PWR_DOWN_EVT;
>      } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
>          sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
> +    } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
> +        sel = ACPI_GED_CPU_HOTPLUG_EVT;
>      } else {
>          /* Unknown event. Return without generating interrupt. */
>          warn_report("GED: Unsupported event %d. No irq injected", ev);
> -- 
> 2.34.1



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  2023-09-26 11:02   ` Michael S. Tsirkin
@ 2023-09-26 11:37     ` Salil Mehta via
  2023-09-26 12:00       ` Michael S. Tsirkin
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 11:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, qemu-arm, maz, jean-philippe, Jonathan Cameron,
	lpieralisi, peter.maydell, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, gshan, rafael, borntraeger, alex.bennee,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> 
> On Tue, Sep 26, 2023 at 11:04:23AM +0100, Salil Mehta wrote:
> > ACPI GED shall be used to convey to the guest kernel about any CPU hot-
> (un)plug
> > events. Therefore, existing ACPI GED framework inside QEMU needs to be
> enhanced
> > to support CPU hotplug state and events.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> 
> Co-developed with yourself?
> 
> didn't you co-develop this with xianglai li?

No, our effort is quite old; the ARM patch-set has existed since 2020
without any change. Please check the original patch-set here:

https://lore.kernel.org/qemu-devel/20200613213629.21984-11-salil.mehta@huawei.com/


To be fair to the authors, it would not be right to add another SOB here.



> 
> Just include his S.O.B then, and drop the non-standard Co-developed-by.

The Co-developed-by tag has been added to ensure the main authors of the
patch are highlighted clearly.


> 
> 
> 
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> 
> 
> 
> > ---
> >  hw/acpi/generic_event_device.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/hw/acpi/generic_event_device.c
> b/hw/acpi/generic_event_device.c
> > index ad252e6a91..0266733a54 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -12,6 +12,7 @@
> >  #include "qemu/osdep.h"
> >  #include "qapi/error.h"
> >  #include "hw/acpi/acpi.h"
> > +#include "hw/acpi/cpu.h"
> >  #include "hw/acpi/generic_event_device.h"
> >  #include "hw/irq.h"
> >  #include "hw/mem/pc-dimm.h"
> > @@ -239,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler
> *hotplug_dev,
> >          } else {
> >              acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev,
> errp);
> >          }
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > +        acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
> >      } else {
> >          error_setg(errp, "virt: device plug request for unsupported
> device"
> >                     " type: %s", object_get_typename(OBJECT(dev)));
> > @@ -253,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler
> *hotplug_dev,
> >      if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
> >                         !(object_dynamic_cast(OBJECT(dev),
> TYPE_NVDIMM)))) {
> >          acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev,
> errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > +        acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev,
> errp);
> >      } else {
> >          error_setg(errp, "acpi: device unplug request for unsupported
> device"
> >                     " type: %s", object_get_typename(OBJECT(dev)));
> > @@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler
> *hotplug_dev,
> >
> >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> >          acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
> > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > +        acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
> >      } else {
> >          error_setg(errp, "acpi: device unplug for unsupported device"
> >                     " type: %s", object_get_typename(OBJECT(dev)));
> > @@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev,
> ACPIOSTInfoList ***list)
> >      AcpiGedState *s = ACPI_GED(adev);
> >
> >      acpi_memory_ospm_status(&s->memhp_state, list);
> > +    acpi_cpu_ospm_status(&s->cpuhp_state, list);
> >  }
> >
> >  static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits
> ev)
> > @@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev,
> AcpiEventStatusBits ev)
> >          sel = ACPI_GED_PWR_DOWN_EVT;
> >      } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
> >          sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
> > +    } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
> > +        sel = ACPI_GED_CPU_HOTPLUG_EVT;
> >      } else {
> >          /* Unknown event. Return without generating interrupt. */
> >          warn_report("GED: Unsupported event %d. No irq injected", ev);
> > --
> > 2.34.1



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  2023-09-26 11:37     ` Salil Mehta via
@ 2023-09-26 12:00       ` Michael S. Tsirkin
  2023-09-26 12:27         ` Salil Mehta via
  2023-09-26 13:02         ` lixianglai
  0 siblings, 2 replies; 153+ messages in thread
From: Michael S. Tsirkin @ 2023-09-26 12:00 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel, qemu-arm, maz, jean-philippe, Jonathan Cameron,
	lpieralisi, peter.maydell, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, gshan, rafael, borntraeger, alex.bennee,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

On Tue, Sep 26, 2023 at 11:37:38AM +0000, Salil Mehta wrote:
> > 
> > On Tue, Sep 26, 2023 at 11:04:23AM +0100, Salil Mehta wrote:
> > > ACPI GED shall be used to convey to the guest kernel about any CPU hot-
> > (un)plug
> > > events. Therefore, existing ACPI GED framework inside QEMU needs to be
> > enhanced
> > > to support CPU hotplug state and events.
> > >
> > > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > 
> > Co-developed with yourself?
> > 
> > didn't you co-develop this with xianglai li?
> 
> No, our effort is quite old ARM patch-set existed since the year 2020
> without any change. Please check the original patch-set here:
> 
> https://lore.kernel.org/qemu-devel/20200613213629.21984-11-salil.mehta@huawei.com/
> 
> 
> To be fair to the authors, it will not be right to add another SOB here.
> 

I see. And what's the difference with patches that xianglai li posted?
Are they both rebases of the same old patch then?

> 
> > 
> > Just include his S.O.B then, and drop the non-standard Co-developed-by.
> 
> Co-developed-by Tag has been added to ensure main authors of the patch
> get highlighted clearly.

I think I don't know the patch provenance at this point.

> 
> > 
> > 
> > 
> > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > 
> > 
> > 
> > > ---
> > >  hw/acpi/generic_event_device.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/hw/acpi/generic_event_device.c
> > b/hw/acpi/generic_event_device.c
> > > index ad252e6a91..0266733a54 100644
> > > --- a/hw/acpi/generic_event_device.c
> > > +++ b/hw/acpi/generic_event_device.c
> > > @@ -12,6 +12,7 @@
> > >  #include "qemu/osdep.h"
> > >  #include "qapi/error.h"
> > >  #include "hw/acpi/acpi.h"
> > > +#include "hw/acpi/cpu.h"
> > >  #include "hw/acpi/generic_event_device.h"
> > >  #include "hw/irq.h"
> > >  #include "hw/mem/pc-dimm.h"
> > > @@ -239,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler
> > *hotplug_dev,
> > >          } else {
> > >              acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev,
> > errp);
> > >          }
> > > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > +        acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
> > >      } else {
> > >          error_setg(errp, "virt: device plug request for unsupported
> > device"
> > >                     " type: %s", object_get_typename(OBJECT(dev)));
> > > @@ -253,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler
> > *hotplug_dev,
> > >      if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
> > >                         !(object_dynamic_cast(OBJECT(dev),
> > TYPE_NVDIMM)))) {
> > >          acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev,
> > errp);
> > > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > +        acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev,
> > errp);
> > >      } else {
> > >          error_setg(errp, "acpi: device unplug request for unsupported
> > device"
> > >                     " type: %s", object_get_typename(OBJECT(dev)));
> > > @@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler
> > *hotplug_dev,
> > >
> > >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > >          acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
> > > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > +        acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
> > >      } else {
> > >          error_setg(errp, "acpi: device unplug for unsupported device"
> > >                     " type: %s", object_get_typename(OBJECT(dev)));
> > > @@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev,
> > ACPIOSTInfoList ***list)
> > >      AcpiGedState *s = ACPI_GED(adev);
> > >
> > >      acpi_memory_ospm_status(&s->memhp_state, list);
> > > +    acpi_cpu_ospm_status(&s->cpuhp_state, list);
> > >  }
> > >
> > >  static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits
> > ev)
> > > @@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev,
> > AcpiEventStatusBits ev)
> > >          sel = ACPI_GED_PWR_DOWN_EVT;
> > >      } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
> > >          sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
> > > +    } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
> > > +        sel = ACPI_GED_CPU_HOTPLUG_EVT;
> > >      } else {
> > >          /* Unknown event. Return without generating interrupt. */
> > >          warn_report("GED: Unsupported event %d. No irq injected", ev);
> > > --
> > > 2.34.1



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  2023-09-26 12:00       ` Michael S. Tsirkin
@ 2023-09-26 12:27         ` Salil Mehta via
  2023-09-26 13:02         ` lixianglai
  1 sibling, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-26 12:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, qemu-arm, maz, jean-philippe, Jonathan Cameron,
	lpieralisi, peter.maydell, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, gshan, rafael, borntraeger, alex.bennee,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> 
> On Tue, Sep 26, 2023 at 11:37:38AM +0000, Salil Mehta wrote:
> > >
> > > On Tue, Sep 26, 2023 at 11:04:23AM +0100, Salil Mehta wrote:
> > > > ACPI GED shall be used to convey to the guest kernel about any CPU
> hot-
> > > (un)plug
> > > > events. Therefore, existing ACPI GED framework inside QEMU needs to
> be
> > > enhanced
> > > > to support CPU hotplug state and events.
> > > >
> > > > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > >
> > > Co-developed with yourself?
> > >
> > > didn't you co-develop this with xianglai li?
> >
> > No, our effort is quite old ARM patch-set existed since the year 2020
> > without any change. Please check the original patch-set here:
> >
> > https://lore.kernel.org/qemu-devel/20200613213629.21984-11-
> salil.mehta@huawei.com/
> >
> >
> > To be fair to the authors, it will not be right to add another SOB here.
> >
> 
> I see. And what's the difference with patches that xianglai li posted?

I am not sure if there is. But if there is any change, it can be
commented on in the architecture-agnostic patch-set which I shall be
posting later this week. If that change is common to all the
architectures, maybe I can pick it up and add it in the V2 version of
the arch-agnostic patch-set, and then it will make sense to add the
SOBs of the contributing members there.

> Are they both rebases of the same old patch then?

RFC V2 is an extension of RFC V1. We have been working with ARM, Oracle,
Ampere, Linaro and other companies to get to this stage.

It has already been agreed that the Loongson folks will rebase their
patch-set over the ARM RFC V2 patch-set, which is very big.


https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/T/#m523b37819c4811c7827333982004e07a1ef03879

For now, I have pointed to the patches in the exact order that can be
used to rebase their patch-set.

I will be sending a separate arch-agnostic patch-set later this week,
without the RFC tag. This way different companies can work
independently.


Thanks
Salil.

> > > Just include his S.O.B then, and drop the non-standard Co-developed-by.
> >
> > Co-developed-by Tag has been added to ensure main authors of the patch
> > get highlighted clearly.
> 
> I think I don't know the patch provenance at this point.


It was us from Huawei in the year 2020.

https://lore.kernel.org/qemu-devel/20200613213629.21984-10-salil.mehta@huawei.com/




> > > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > > > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > >
> > >
> > >
> > > > ---
> > > >  hw/acpi/generic_event_device.c | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/hw/acpi/generic_event_device.c
> > > b/hw/acpi/generic_event_device.c
> > > > index ad252e6a91..0266733a54 100644
> > > > --- a/hw/acpi/generic_event_device.c
> > > > +++ b/hw/acpi/generic_event_device.c
> > > > @@ -12,6 +12,7 @@
> > > >  #include "qemu/osdep.h"
> > > >  #include "qapi/error.h"
> > > >  #include "hw/acpi/acpi.h"
> > > > +#include "hw/acpi/cpu.h"
> > > >  #include "hw/acpi/generic_event_device.h"
> > > >  #include "hw/irq.h"
> > > >  #include "hw/mem/pc-dimm.h"
> > > > @@ -239,6 +240,8 @@ static void
> acpi_ged_device_plug_cb(HotplugHandler
> > > *hotplug_dev,
> > > >          } else {
> > > >              acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev,
> > > errp);
> > > >          }
> > > > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > > +        acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
> > > >      } else {
> > > >          error_setg(errp, "virt: device plug request for unsupported
> > > device"
> > > >                     " type: %s", object_get_typename(OBJECT(dev)));
> > > > @@ -253,6 +256,8 @@ static void
> acpi_ged_unplug_request_cb(HotplugHandler
> > > *hotplug_dev,
> > > >      if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
> > > >                         !(object_dynamic_cast(OBJECT(dev),
> > > TYPE_NVDIMM)))) {
> > > >          acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state,
> dev,
> > > errp);
> > > > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > > +        acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state,
> dev,
> > > errp);
> > > >      } else {
> > > >          error_setg(errp, "acpi: device unplug request for
> unsupported
> > > device"
> > > >                     " type: %s", object_get_typename(OBJECT(dev)));
> > > > @@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler
> > > *hotplug_dev,
> > > >
> > > >      if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > > >          acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
> > > > +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > > > +        acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
> > > >      } else {
> > > >          error_setg(errp, "acpi: device unplug for unsupported
> device"
> > > >                     " type: %s", object_get_typename(OBJECT(dev)));
> > > > @@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf
> *adev,
> > > ACPIOSTInfoList ***list)
> > > >      AcpiGedState *s = ACPI_GED(adev);
> > > >
> > > >      acpi_memory_ospm_status(&s->memhp_state, list);
> > > > +    acpi_cpu_ospm_status(&s->cpuhp_state, list);
> > > >  }
> > > >
> > > >  static void acpi_ged_send_event(AcpiDeviceIf *adev,
> AcpiEventStatusBits
> > > ev)
> > > > @@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf
> *adev,
> > > AcpiEventStatusBits ev)
> > > >          sel = ACPI_GED_PWR_DOWN_EVT;
> > > >      } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
> > > >          sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
> > > > +    } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
> > > > +        sel = ACPI_GED_CPU_HOTPLUG_EVT;
> > > >      } else {
> > > >          /* Unknown event. Return without generating interrupt. */
> > > >          warn_report("GED: Unsupported event %d. No irq injected",
> ev);
> > > > --
> > > > 2.34.1



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  2023-09-26 12:00       ` Michael S. Tsirkin
  2023-09-26 12:27         ` Salil Mehta via
@ 2023-09-26 13:02         ` lixianglai
  1 sibling, 0 replies; 153+ messages in thread
From: lixianglai @ 2023-09-26 13:02 UTC (permalink / raw)
  To: Michael S. Tsirkin, Salil Mehta
  Cc: qemu-devel, qemu-arm, maz, jean-philippe, Jonathan Cameron,
	lpieralisi, peter.maydell, richard.henderson, imammedo,
	andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, gshan, rafael, borntraeger, alex.bennee,
	linux, darren, ilkka, vishnu, karl.heubaum, miguel.luis,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo


Hi Michael S. Tsirkin,
> On Tue, Sep 26, 2023 at 11:37:38AM +0000, Salil Mehta wrote:
>>> From: Michael S. Tsirkin <mst@redhat.com>
>>> Sent: Tuesday, September 26, 2023 12:02 PM
>>> To: Salil Mehta <salil.mehta@huawei.com>
>>> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; maz@kernel.org; jean-
>>> philippe@linaro.org; Jonathan Cameron <jonathan.cameron@huawei.com>;
>>> lpieralisi@kernel.org; peter.maydell@linaro.org;
>>> richard.henderson@linaro.org; imammedo@redhat.com; andrew.jones@linux.dev;
>>> david@redhat.com; philmd@linaro.org; eric.auger@redhat.com;
>>> will@kernel.org; ardb@kernel.org; oliver.upton@linux.dev;
>>> pbonzini@redhat.com; gshan@redhat.com; rafael@kernel.org;
>>> borntraeger@linux.ibm.com; alex.bennee@linaro.org; linux@armlinux.org.uk;
>>> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
>>> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
>>> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian
>>> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
>>> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com;
>>> maobibo@loongson.cn; lixianglai@loongson.cn
>>> Subject: Re: [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to
>>> support vCPU Hotplug
>>>
>>> On Tue, Sep 26, 2023 at 11:04:23AM +0100, Salil Mehta wrote:
>>>> ACPI GED shall be used to convey to the guest kernel about any CPU hot-
>>> (un)plug
>>>> events. Therefore, existing ACPI GED framework inside QEMU needs to be
>>> enhanced
>>>> to support CPU hotplug state and events.
>>>>
>>>> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
>>> Co-developed with yourself?
>>>
>>> didn't you co-develop this with xianglai li?
>> No, our effort is quite old ARM patch-set existed since the year 2020
>> without any change. Please check the original patch-set here:
>>
>> https://lore.kernel.org/qemu-devel/20200613213629.21984-11-salil.mehta@huawei.com/
>>
>>
>> To be fair to the authors, it will not be right to add another SOB here.
>>
> I see. And what's the difference with patches that xianglai li posted?
> Are they both rebases of the same old patch then?
>
The first two patches in the series I sent were indeed rebases of Salil
Mehta's patches, as was explained in the cover letter of my series.

I apologize for any misunderstanding this may have caused.

Thanks,

Xianglai.


>>> Just include his S.O.B then, and drop the non-standard Co-developed-by.
>> Co-developed-by Tag has been added to ensure main authors of the patch
>> get highlighted clearly.
> I think I don't know the patch provenance at this point.
>
>>>
>>>
>>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>>> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>>
>>>
>>>> ---
>>>>   hw/acpi/generic_event_device.c | 10 ++++++++++
>>>>   1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/hw/acpi/generic_event_device.c
>>> b/hw/acpi/generic_event_device.c
>>>> index ad252e6a91..0266733a54 100644
>>>> --- a/hw/acpi/generic_event_device.c
>>>> +++ b/hw/acpi/generic_event_device.c
>>>> @@ -12,6 +12,7 @@
>>>>   #include "qemu/osdep.h"
>>>>   #include "qapi/error.h"
>>>>   #include "hw/acpi/acpi.h"
>>>> +#include "hw/acpi/cpu.h"
>>>>   #include "hw/acpi/generic_event_device.h"
>>>>   #include "hw/irq.h"
>>>>   #include "hw/mem/pc-dimm.h"
>>>> @@ -239,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler
>>> *hotplug_dev,
>>>>           } else {
>>>>               acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev,
>>> errp);
>>>>           }
>>>> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>>>> +        acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
>>>>       } else {
>>>>           error_setg(errp, "virt: device plug request for unsupported
>>> device"
>>>>                      " type: %s", object_get_typename(OBJECT(dev)));
>>>> @@ -253,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler
>>> *hotplug_dev,
>>>>       if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
>>>>                          !(object_dynamic_cast(OBJECT(dev),
>>> TYPE_NVDIMM)))) {
>>>>           acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev,
>>> errp);
>>>> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>>>> +        acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev,
>>> errp);
>>>>       } else {
>>>>           error_setg(errp, "acpi: device unplug request for unsupported
>>> device"
>>>>                      " type: %s", object_get_typename(OBJECT(dev)));
>>>> @@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler
>>> *hotplug_dev,
>>>>       if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>>>>           acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
>>>> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>>>> +        acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
>>>>       } else {
>>>>           error_setg(errp, "acpi: device unplug for unsupported device"
>>>>                      " type: %s", object_get_typename(OBJECT(dev)));
>>>> @@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev,
>>> ACPIOSTInfoList ***list)
>>>>       AcpiGedState *s = ACPI_GED(adev);
>>>>
>>>>       acpi_memory_ospm_status(&s->memhp_state, list);
>>>> +    acpi_cpu_ospm_status(&s->cpuhp_state, list);
>>>>   }
>>>>
>>>>   static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits
>>> ev)
>>>> @@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev,
>>> AcpiEventStatusBits ev)
>>>>           sel = ACPI_GED_PWR_DOWN_EVT;
>>>>       } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
>>>>           sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
>>>> +    } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
>>>> +        sel = ACPI_GED_CPU_HOTPLUG_EVT;
>>>>       } else {
>>>>           /* Unknown event. Return without generating interrupt. */
>>>>           warn_report("GED: Unsupported event %d. No irq injected", ev);
>>>> --
>>>> 2.34.1




* Re: [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2023-09-26 10:04 ` [PATCH RFC V2 01/37] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
@ 2023-09-26 23:57   ` Gavin Shan
  2023-10-02  9:53     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-26 23:57 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> This shall be used to store user specified topology{socket,cluster,core,thread}
> and shall be converted to a unique 'vcpu-id' which is used as slot-index during
> hot(un)plug of vCPU.
> 

Note that we don't have a 'vcpu-id' property. It's actually the index into the
array ms->possible_cpus->cpus[] and cpu->cpu_index. Please improve the commit
log if that makes sense.

> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c    | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
>   target/arm/cpu.c |  4 +++
>   target/arm/cpu.h |  4 +++
>   3 files changed, 71 insertions(+)
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 7d9dbc2663..57fe97c242 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -221,6 +221,11 @@ static const char *valid_cpus[] = {
>       ARM_CPU_TYPE_NAME("max"),
>   };
>   
> +static int virt_get_socket_id(const MachineState *ms, int cpu_index);
> +static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
> +static int virt_get_core_id(const MachineState *ms, int cpu_index);
> +static int virt_get_thread_id(const MachineState *ms, int cpu_index);
> +
>   static bool cpu_type_valid(const char *cpu)
>   {
>       int i;
> @@ -2168,6 +2173,14 @@ static void machvirt_init(MachineState *machine)
>                             &error_fatal);
>   
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
> +        object_property_set_int(cpuobj, "socket-id",
> +                                virt_get_socket_id(machine, n), NULL);
> +        object_property_set_int(cpuobj, "cluster-id",
> +                                virt_get_cluster_id(machine, n), NULL);
> +        object_property_set_int(cpuobj, "core-id",
> +                                virt_get_core_id(machine, n), NULL);
> +        object_property_set_int(cpuobj, "thread-id",
> +                                virt_get_thread_id(machine, n), NULL);
>   
>           if (!vms->secure) {
>               object_property_set_bool(cpuobj, "has_el3", false, NULL);
> @@ -2652,10 +2665,59 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>       return socket_id % ms->numa_state->num_nodes;
>   }
>   

It seems unnecessary to keep virt_get_{socket, cluster, core, thread}_id()
because each of them is called only once. I would suggest figuring out the
socket, cluster, core and thread IDs through @possible_cpus in machvirt_init(),
like below.

Besides, we can't always expose the "cluster-id" property since the cluster
level in the CPU topology isn't always supported; see MachineClass::smp_props.
Some users may want to hide clusters for their own reasons, and 'cluster-id'
shouldn't be exposed in that case. Otherwise, users may be confused by a
'cluster-id' property while clusters are disabled. For example, when a VM is
started with the following command line, 'cluster-id' shouldn't be supported
in vCPU hot-add.

     -cpu host -smp maxcpus=2,cpus=1,sockets=2,cores=1,threads=1
     (qemu) device_add host,id=cpu1,socket-id=1,cluster-id=0,core-id=0,thread-id=0

     object_property_set_int(cpuobj, "socket-id",
                             possible_cpus->cpus[i].props.socket_id, NULL);
     if (mc->smp_props.cluster_supported && mc->smp_props.has_clusters) {
         object_property_set_int(cpuobj, "cluster-id",
                                 possible_cpus->cpus[i].props.cluster_id, NULL);
     }
     object_property_set_int(cpuobj, "core-id",
                             possible_cpus->cpus[i].props.core_id, NULL);
     object_property_set_int(cpuobj, "thread-id",
                             possible_cpus->cpus[i].props.thread_id, NULL);

> +static int virt_get_socket_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
> +}
> +
> +static int virt_get_cluster_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
> +}
> +
> +static int virt_get_core_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.core_id;
> +}
> +
> +static int virt_get_thread_id(const MachineState *ms, int cpu_index)
> +{
> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> +
> +    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
> +}
> +
> +static int
> +virt_get_cpu_id_from_cpu_topo(const MachineState *ms, DeviceState *dev)
> +{
> +    int cpu_id, sock_vcpu_num, clus_vcpu_num, core_vcpu_num;
> +    ARMCPU *cpu = ARM_CPU(dev);
> +
> +    /* calculate total logical cpus across socket/cluster/core */
> +    sock_vcpu_num = cpu->socket_id * (ms->smp.threads * ms->smp.cores *
> +                    ms->smp.clusters);
> +    clus_vcpu_num = cpu->cluster_id * (ms->smp.threads * ms->smp.cores);
> +    core_vcpu_num = cpu->core_id * ms->smp.threads;
> +
> +    /* get vcpu-id(logical cpu index) for this vcpu from this topology */
> +    cpu_id = (sock_vcpu_num + clus_vcpu_num + core_vcpu_num) + cpu->thread_id;
> +
> +    assert(cpu_id >= 0 && cpu_id < ms->possible_cpus->len);
> +
> +    return cpu_id;
> +}
> +

This function is called only once, in PATCH[04/37]. I think it should be moved
to PATCH[04/37].

[PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug

The function name can be shortened because I don't see the suffix "_from_cpu_topo"
being of much help. I think virt_get_cpu_index() would be good enough since it's
called only once, to return the index into the array
MachineState::possible_cpus::cpus[], and the return value is stored in
CPUState::cpu_index.

static int virt_get_cpu_index(const MachineState *ms, ARMCPU *cpu)
{
     int cpus_in_core = ms->smp.threads;
     int cpus_in_cluster = cpus_in_core * ms->smp.cores;
     int cpus_in_socket = cpus_in_cluster * ms->smp.clusters;

     /*
      * It's fine to take cluster into account even if it's not supported. In
      * this case, ms->smp.clusters is always one.
      */
     return cpu->socket_id * cpus_in_socket +
            cpu->cluster_id * cpus_in_cluster +
            cpu->core_id * cpus_in_core +
            cpu->thread_id;
}

>   static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>   {
>       int n;
>       unsigned int max_cpus = ms->smp.max_cpus;
> +    unsigned int smp_threads = ms->smp.threads;
>       VirtMachineState *vms = VIRT_MACHINE(ms);
>       MachineClass *mc = MACHINE_GET_CLASS(vms);
>   
> @@ -2669,6 +2731,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>       ms->possible_cpus->len = max_cpus;
>       for (n = 0; n < ms->possible_cpus->len; n++) {
>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>           ms->possible_cpus->cpus[n].arch_id =
>               virt_cpu_mp_affinity(vms, n);
>   

This initialization seems to accommodate the HMP command "info hotpluggable-cpus".
It would be nice if that could be mentioned in the commit log.

> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 93c28d50e5..1376350416 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -2277,6 +2277,10 @@ static Property arm_cpu_properties[] = {
>       DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
>                           mp_affinity, ARM64_AFFINITY_INVALID),
>       DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
> +    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
> +    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
> +    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
> +    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
>       DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
>       DEFINE_PROP_END_OF_LIST()
>   };

All four properties are used for vCPU hot-add, meaning they're not needed
when vCPU hotplug isn't supported on a specific board. Even for the hw/virt
board, cluster isn't always supported and 'cluster-id' shouldn't always be
exposed, as explained above. How about registering the properties dynamically,
only when they're needed by vCPU hotplug?

> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 88e5accda6..d51d39f621 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1094,6 +1094,10 @@ struct ArchCPU {
>       QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
>   
>       int32_t node_id; /* NUMA node this CPU belongs to */
> +    int32_t socket_id;
> +    int32_t cluster_id;
> +    int32_t core_id;
> +    int32_t thread_id;

It would be fine to keep those fields even if the corresponding properties are
dynamically registered, though a little memory overhead is incurred :)

>   
>       /* Used to synchronize KVM and QEMU in-kernel device levels */
>       uint8_t device_irq_level;

Thanks,
Gavin




* Re: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs
  2023-09-26 10:04 ` [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs Salil Mehta via
@ 2023-09-27  3:54   ` Gavin Shan
  2023-10-02 10:21     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-27  3:54 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Adds various utility functions which might be required to fetch or check the
> state of the possible vCPUs. This also introduces concept of *disabled* vCPUs,
> which are part of the *possible* vCPUs but are not part of the *present* vCPU.
> This state shall be used during machine init time to check the presence of
> vcpus.
   ^^^^^

   vCPUs

> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   cpus-common.c         | 31 +++++++++++++++++++++++++
>   include/hw/core/cpu.h | 53 +++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 84 insertions(+)
> 
> diff --git a/cpus-common.c b/cpus-common.c
> index 45c745ecf6..24c04199a1 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -24,6 +24,7 @@
>   #include "sysemu/cpus.h"
>   #include "qemu/lockable.h"
>   #include "trace/trace-root.h"
> +#include "hw/boards.h"
>   
>   QemuMutex qemu_cpu_list_lock;
>   static QemuCond exclusive_cond;
> @@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
>       cpu_list_generation_id++;
>   }
>   
> +CPUState *qemu_get_possible_cpu(int index)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> +
> +    assert((index >= 0) && (index < possible_cpus->len));
> +
> +    return CPU(possible_cpus->cpus[index].cpu);
> +}
> +
> +bool qemu_present_cpu(CPUState *cpu)
> +{
> +    return cpu;
> +}
> +
> +bool qemu_enabled_cpu(CPUState *cpu)
> +{
> +    return cpu && !cpu->disabled;
> +}
> +

I do think it's a good idea to have wrappers to check a CPU's state since
these CPU states play an important role in this series to support vCPU hotplug.
However, it would be nice to move them into a header file (include/hw/boards.h)
because all the checks originate from ms->possible_cpus->cpus[]. They sound
like functions belonging to the machine (board) rather than to global scope.
Besides, it would be nice to have the same input (index) for all the functions.
How about something like the below in include/hw/boards.h?

static inline bool machine_has_possible_cpu(int index)
{
     MachineState *ms = MACHINE(qdev_get_machine());

     if (!ms || !ms->possible_cpus || index < 0 ||
         index >= ms->possible_cpus->len) {
         return false;
     }

     return true;
}

static inline bool machine_has_present_cpu(int index)
{
     MachineState *ms = MACHINE(qdev_get_machine());

     if (!machine_has_possible_cpu(index) ||
         !ms->possible_cpus->cpus[index].cpu) {
         return false;
     }

     return true;
}

static inline bool machine_has_enabled_cpu(int index)
{
     MachineState *ms = MACHINE(qdev_get_machine());
     CPUState *cs;

     if (!machine_has_present_cpu(index)) {
         return false;
     }

     cs = CPU(ms->possible_cpus->cpus[index].cpu);
     return !cs->disabled;
}

> +uint64_t qemu_get_cpu_archid(int cpu_index)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> +
> +    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
> +
> +    return possible_cpus->cpus[cpu_index].arch_id;
> +}
> +

I think it's unnecessary to keep it since it's called only once, by
hw/arm/virt-acpi-build.c::build_madt. The architectural ID can be
fetched directly from possible_cpus->cpus[i].arch_id. It's fine
to drop this function and fold the logic into the following patch.

[PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs


>   CPUState *qemu_get_cpu(int index)
>   {
>       CPUState *cpu;
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index fdcbe87352..e5af79950c 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -413,6 +413,17 @@ struct CPUState {
>       SavedIOTLB saved_iotlb;
>   #endif
>   
> +    /*
> +     * Some architectures do not allow *presence* of vCPUs to be changed
> +     * after guest has booted using information specified by VMM/firmware
> +     * via ACPI MADT at the boot time. Thus to enable vCPU hotplug on these
> +     * architectures possible vCPU can have CPUState object in 'disabled'
> +     * state or can also not have CPUState object at all. This is possible
> +     * when vCPU Hotplug is supported and vCPUs are 'yet-to-be-plugged' in
> +     * the QOM or have been hot-unplugged.
> +     * By default every CPUState is enabled as of now across all archs.
> +     */
> +    bool disabled;
>       /* TODO Move common fields from CPUArchState here. */
>       int cpu_index;
>       int cluster_index;

I guess the comments can be simplified a bit. How about something like below?

     /*
      * In order to support vCPU hotplug on architectures like aarch64,
      * the vCPU states fall into possible, present or enabled. This field
      * is added to distinguish present and enabled vCPUs. By default, all
      * vCPUs are present and enabled.
      */

> @@ -770,6 +781,48 @@ static inline bool cpu_in_exclusive_context(const CPUState *cpu)
>    */
>   CPUState *qemu_get_cpu(int index);
>   
> +/**
> + * qemu_get_possible_cpu:
> + * @index: The CPUState@cpu_index value of the CPU to obtain.
> + *         Input index MUST be in range [0, Max Possible CPUs)
> + *
> + * If CPUState object exists,then it gets a CPU matching
> + * @index in the possible CPU array.
> + *
> + * Returns: The possible CPU or %NULL if CPU does not exist.
> + */
> +CPUState *qemu_get_possible_cpu(int index);
> +
> +/**
> + * qemu_present_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU is amongst the present possible vcpus.
> + *
> + * Returns: True if it is present possible vCPU else false
> + */
> +bool qemu_present_cpu(CPUState *cpu);
> +
> +/**
> + * qemu_enabled_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU is enabled.
> + *
> + * Returns: True if it is 'enabled' else false
> + */
> +bool qemu_enabled_cpu(CPUState *cpu);
> +
> +/**
> + * qemu_get_cpu_archid:
> + * @cpu_index: possible vCPU for which arch-id needs to be retreived
> + *
> + * Fetches the vCPU arch-id from the present possible vCPUs.
> + *
> + * Returns: arch-id of the possible vCPU
> + */
> +uint64_t qemu_get_cpu_archid(int cpu_index);
> +

All this descriptive stuff isn't needed once the functions are moved to
include/hw/boards.h and qemu_get_cpu_archid() is dropped.

>   /**
>    * cpu_exists:
>    * @id: Guest-exposed CPU ID to lookup.

Thanks,
Gavin




* Re: [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function
  2023-09-26 10:04 ` [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
@ 2023-09-27  5:16   ` Gavin Shan
  2023-10-02 10:24     ` Salil Mehta via
  2023-10-10  6:46   ` Shaoqin Huang
  1 sibling, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-27  5:16 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Factor out CPU properties code common for {hot,cold}-plugged CPUs. This allows
> code reuse.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c         | 220 ++++++++++++++++++++++++++----------------
>   include/hw/arm/virt.h |   4 +
>   2 files changed, 140 insertions(+), 84 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 57fe97c242..0eb6bf5a18 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2018,16 +2018,130 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>       }
>   }
>   
> +static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
> +                                    Error **errp)
> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    Error *local_err = NULL;
> +    VirtMachineClass *vmc;
> +
> +    vmc = VIRT_MACHINE_GET_CLASS(ms);
> +
> +    /* now, set the cpu object property values */
> +    numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
> +
> +    if (!vms->secure) {
> +        object_property_set_bool(cpuobj, "has_el3", false, NULL);
> +    }
> +
> +    if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
> +        object_property_set_bool(cpuobj, "has_el2", false, NULL);
> +    }
> +
> +    if (vmc->kvm_no_adjvtime &&
> +        object_property_find(cpuobj, "kvm-no-adjvtime")) {
> +        object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
> +    }
> +
> +    if (vmc->no_kvm_steal_time &&
> +        object_property_find(cpuobj, "kvm-steal-time")) {
> +        object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
> +    }
> +
> +    if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
> +        object_property_set_bool(cpuobj, "pmu", false, NULL);
> +    }
> +
> +    if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
> +        object_property_set_bool(cpuobj, "lpa2", false, NULL);
> +    }
> +
> +    if (object_property_find(cpuobj, "reset-cbar")) {
> +        object_property_set_int(cpuobj, "reset-cbar",
> +                                vms->memmap[VIRT_CPUPERIPHS].base,
> +                                &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +    }
> +
> +    /* link already initialized {secure,tag}-memory regions to this cpu */
> +    object_property_set_link(cpuobj, "memory", OBJECT(vms->sysmem), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (vms->secure) {
> +        object_property_set_link(cpuobj, "secure-memory",
> +                                 OBJECT(vms->secure_sysmem), &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +    }
> +
> +    if (vms->mte) {
> +        if (!object_property_find(cpuobj, "tag-memory")) {
> +            error_setg(&local_err, "MTE requested, but not supported "
> +                       "by the guest CPU");
> +            if (local_err) {
> +                goto out;
> +            }
> +        }
> +
> +        object_property_set_link(cpuobj, "tag-memory", OBJECT(vms->tag_sysmem),
> +                                 &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +
> +        if (vms->secure) {
> +            object_property_set_link(cpuobj, "secure-tag-memory",
> +                                     OBJECT(vms->secure_tag_sysmem),
> +                                     &local_err);
> +            if (local_err) {
> +                goto out;
> +            }
> +        }
> +    }
> +
> +    /*
> +     * RFC: Question: this must only be called for the hotplugged cpus. For the
> +     * cold booted secondary cpus this is being taken care in arm_load_kernel()
> +     * in boot.c. Perhaps we should remove that code now?
> +     */
> +    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
> +        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
> +                                NULL);
> +
> +        /* Secondary CPUs start in PSCI powered-down state */
> +        if (CPU(cpuobj)->cpu_index > 0) {
> +            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
> +        }
> +    }
> +
> +out:
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
> +    return;
        ^^^^^^

It's obviously not needed :)

> +}
> +
>   static void machvirt_init(MachineState *machine)
>   {
>       VirtMachineState *vms = VIRT_MACHINE(machine);
>       VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
>       MachineClass *mc = MACHINE_GET_CLASS(machine);
>       const CPUArchIdList *possible_cpus;
> -    MemoryRegion *sysmem = get_system_memory();
> +    MemoryRegion *secure_tag_sysmem = NULL;
>       MemoryRegion *secure_sysmem = NULL;
>       MemoryRegion *tag_sysmem = NULL;
> -    MemoryRegion *secure_tag_sysmem = NULL;
> +    MemoryRegion *sysmem;
>       int n, virt_max_cpus;
>       bool firmware_loaded;
>       bool aarch64 = true;
> @@ -2071,6 +2185,8 @@ static void machvirt_init(MachineState *machine)
>        */
>       finalize_gic_version(vms);
>   
> +    sysmem = vms->sysmem = get_system_memory();
> +
>       if (vms->secure) {
>           /*
>            * The Secure view of the world is the same as the NonSecure,
> @@ -2078,7 +2194,7 @@ static void machvirt_init(MachineState *machine)
>            * containing the system memory at low priority; any secure-only
>            * devices go in at higher priority and take precedence.
>            */
> -        secure_sysmem = g_new(MemoryRegion, 1);
> +        secure_sysmem = vms->secure_sysmem = g_new(MemoryRegion, 1);
>           memory_region_init(secure_sysmem, OBJECT(machine), "secure-memory",
>                              UINT64_MAX);
>           memory_region_add_subregion_overlap(secure_sysmem, 0, sysmem, -1);
> @@ -2151,6 +2267,23 @@ static void machvirt_init(MachineState *machine)
>           exit(1);
>       }
>   
> +    if (vms->mte) {
> +        /* Create the memory region only once, but link to all cpus later */
> +        tag_sysmem = vms->tag_sysmem = g_new(MemoryRegion, 1);
> +        memory_region_init(tag_sysmem, OBJECT(machine),
> +                           "tag-memory", UINT64_MAX / 32);
> +
> +        if (vms->secure) {
> +            secure_tag_sysmem = vms->secure_tag_sysmem = g_new(MemoryRegion, 1);
> +            memory_region_init(secure_tag_sysmem, OBJECT(machine),
> +                               "secure-tag-memory", UINT64_MAX / 32);
> +
> +            /* As with ram, secure-tag takes precedence over tag.  */
> +            memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> +                                                tag_sysmem, -1);
> +        }
> +    }
> +
>       create_fdt(vms);
>   
>       assert(possible_cpus->len == max_cpus);
> @@ -2163,15 +2296,10 @@ static void machvirt_init(MachineState *machine)
>           }
>   
>           cpuobj = object_new(possible_cpus->cpus[n].type);
> -        object_property_set_int(cpuobj, "mp-affinity",
> -                                possible_cpus->cpus[n].arch_id, NULL);
>   
>           cs = CPU(cpuobj);
>           cs->cpu_index = n;
>   
> -        numa_cpu_pre_plug(&possible_cpus->cpus[cs->cpu_index], DEVICE(cpuobj),
> -                          &error_fatal);
> -
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>           object_property_set_int(cpuobj, "socket-id",
>                                   virt_get_socket_id(machine, n), NULL);
> @@ -2182,82 +2310,6 @@ static void machvirt_init(MachineState *machine)
>           object_property_set_int(cpuobj, "thread-id",
>                                   virt_get_thread_id(machine, n), NULL);
>   
> -        if (!vms->secure) {
> -            object_property_set_bool(cpuobj, "has_el3", false, NULL);
> -        }
> -
> -        if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
> -            object_property_set_bool(cpuobj, "has_el2", false, NULL);
> -        }
> -
> -        if (vmc->kvm_no_adjvtime &&
> -            object_property_find(cpuobj, "kvm-no-adjvtime")) {
> -            object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
> -        }
> -
> -        if (vmc->no_kvm_steal_time &&
> -            object_property_find(cpuobj, "kvm-steal-time")) {
> -            object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
> -        }
> -
> -        if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
> -            object_property_set_bool(cpuobj, "pmu", false, NULL);
> -        }
> -
> -        if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
> -            object_property_set_bool(cpuobj, "lpa2", false, NULL);
> -        }
> -
> -        if (object_property_find(cpuobj, "reset-cbar")) {
> -            object_property_set_int(cpuobj, "reset-cbar",
> -                                    vms->memmap[VIRT_CPUPERIPHS].base,
> -                                    &error_abort);
> -        }
> -
> -        object_property_set_link(cpuobj, "memory", OBJECT(sysmem),
> -                                 &error_abort);
> -        if (vms->secure) {
> -            object_property_set_link(cpuobj, "secure-memory",
> -                                     OBJECT(secure_sysmem), &error_abort);
> -        }
> -
> -        if (vms->mte) {
> -            /* Create the memory region only once, but link to all cpus. */
> -            if (!tag_sysmem) {
> -                /*
> -                 * The property exists only if MemTag is supported.
> -                 * If it is, we must allocate the ram to back that up.
> -                 */
> -                if (!object_property_find(cpuobj, "tag-memory")) {
> -                    error_report("MTE requested, but not supported "
> -                                 "by the guest CPU");
> -                    exit(1);
> -                }
> -
> -                tag_sysmem = g_new(MemoryRegion, 1);
> -                memory_region_init(tag_sysmem, OBJECT(machine),
> -                                   "tag-memory", UINT64_MAX / 32);
> -
> -                if (vms->secure) {
> -                    secure_tag_sysmem = g_new(MemoryRegion, 1);
> -                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
> -                                       "secure-tag-memory", UINT64_MAX / 32);
> -
> -                    /* As with ram, secure-tag takes precedence over tag.  */
> -                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> -                                                        tag_sysmem, -1);
> -                }
> -            }
> -
> -            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
> -                                     &error_abort);
> -            if (vms->secure) {
> -                object_property_set_link(cpuobj, "secure-tag-memory",
> -                                         OBJECT(secure_tag_sysmem),
> -                                         &error_abort);
> -            }
> -        }
> -
>           qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
>           object_unref(cpuobj);
>       }
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index e1ddbea96b..13163adb07 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -148,6 +148,10 @@ struct VirtMachineState {
>       DeviceState *platform_bus_dev;
>       FWCfgState *fw_cfg;
>       PFlashCFI01 *flash[2];
> +    MemoryRegion *sysmem;
> +    MemoryRegion *secure_sysmem;
> +    MemoryRegion *tag_sysmem;
> +    MemoryRegion *secure_tag_sysmem;
>       bool secure;
>       bool highmem;
>       bool highmem_compact;

Thanks,
Gavin




* Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-09-26 10:04 ` [PATCH RFC V2 04/37] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
@ 2023-09-27  6:28   ` Gavin Shan
  2023-10-02 16:12     ` Salil Mehta via
  2023-09-27  6:30   ` Gavin Shan
  1 sibling, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-27  6:28 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Refactor and introduce the common logic required during the initialization of
> both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> vCPUs which shall be used further during init phases of various other components
> like GIC, PMU, ACPI etc as part of the virt machine initialization.
> 
> KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in the
> powered-off state in the KVM host and do not run guest code. Plugged vCPUs are
> also kept in the powered-off state, but their vCPU threads exist and are kept sleeping.
> 
> TBD:
> For the cold booted vCPUs, this change also exists in the arm_load_kernel()
> in boot.c but for the hotplugged CPUs this change should still remain part of
> the pre-plug phase. We are duplicating the powering-off of the cold booted CPUs.
> Shall we remove the duplicate change from boot.c?
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Reported-by: Gavin Shan <gavin.shan@redhat.com>
> [GS: pointed the assertion due to wrong range check]
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
>   target/arm/cpu.c   |   7 +++
>   target/arm/cpu64.c |  14 +++++
>   3 files changed, 156 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 0eb6bf5a18..3668ad27ec 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -221,6 +221,7 @@ static const char *valid_cpus[] = {
>       ARM_CPU_TYPE_NAME("max"),
>   };
>   
> +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid);
>   static int virt_get_socket_id(const MachineState *ms, int cpu_index);
>   static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
>   static int virt_get_core_id(const MachineState *ms, int cpu_index);
> @@ -2154,6 +2155,14 @@ static void machvirt_init(MachineState *machine)
>           exit(1);
>       }
>   
> +    finalize_gic_version(vms);
> +    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
> +        (vms->gic_version < VIRT_GIC_VERSION_3)) {
> +        machine->smp.max_cpus = smp_cpus;
> +        mc->has_hotpluggable_cpus = false;
> +        warn_report("cpu hotplug feature has been disabled");
> +    }
> +

A comment is needed here to explain why @mc->has_hotpluggable_cpus is set to
false. I guess it's related to the TODO list mentioned in the cover letter.

>       possible_cpus = mc->possible_cpu_arch_ids(machine);
>   
>       /*
> @@ -2180,11 +2189,6 @@ static void machvirt_init(MachineState *machine)
>           virt_set_memmap(vms, pa_bits);
>       }
>   
> -    /* We can probe only here because during property set
> -     * KVM is not available yet
> -     */
> -    finalize_gic_version(vms);
> -
>       sysmem = vms->sysmem = get_system_memory();
>   
>       if (vms->secure) {
> @@ -2289,17 +2293,9 @@ static void machvirt_init(MachineState *machine)
>       assert(possible_cpus->len == max_cpus);
>       for (n = 0; n < possible_cpus->len; n++) {
>           Object *cpuobj;
> -        CPUState *cs;
> -
> -        if (n >= smp_cpus) {
> -            break;
> -        }
>   
>           cpuobj = object_new(possible_cpus->cpus[n].type);
>   
> -        cs = CPU(cpuobj);
> -        cs->cpu_index = n;
> -
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>           object_property_set_int(cpuobj, "socket-id",
>                                   virt_get_socket_id(machine, n), NULL);
> @@ -2804,6 +2800,50 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>       return ms->possible_cpus;
>   }
>   
> +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    CPUArchId *found_cpu;
> +    uint64_t mp_affinity;
> +
> +    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
> +
> +    /*
> +     * RFC: Question:
> +     * TBD: Should mp-affinity be treated as MPIDR?
> +     */
> +    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
> +    found_cpu = &ms->possible_cpus->cpus[vcpuid];
> +
> +    assert(found_cpu->arch_id == mp_affinity);
> +
> +    /*
> +     * RFC: Question:
> +     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
> +     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
> +     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
> +     * more related to machine? Current code assumes slot-id and vcpu-id are
> +     * same i.e. meaning of slot is bit vague.
> +     *
> +     * Q1: Is there any requirement to clearly represent slot and dissociate it
> +     *     from vcpu-id?
> +     * Q2: Should we make MPIDR within host KVM user configurable?
> +     *
> +     *          +----+----+----+----+----+----+----+----+
> +     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
> +     *          +----+----+----+----+----+----+----+----+
> +     *                     \         \         \   |    |
> +     *                      \   8bit  \   8bit  \  |4bit|
> +     *                       \<------->\<------->\ |<-->|
> +     *                        \         \         \|    |
> +     *          +----+----+----+----+----+----+----+----+
> +     * VCPU-ID  |  Byte4  |  Byte2  |  Byte1  |  Byte0  |
> +     *          +----+----+----+----+----+----+----+----+
> +     */
> +
> +    return found_cpu;
> +}
> +

MPIDR[31] is set to 0b1, looking at linux/arch/arm64/kvm/sys_regs.c::reset_mpidr().

I think this function can be renamed to virt_get_cpu_slot(ms, index) to better
reflect its intention. I had the same concern about why cs->cpu_index can't be
reused as the MPIDR, but it's out of scope for this series. It may be something
to be improved afterwards.

- cs->cpu_index is passed to ioctl(KVM_CREATE_VCPU). On the host, it's translated
   to an MPIDR as you outlined in the comments above.

- cs->cpu_index is translated to ms->possible_cpus->cpus[i].arch_id, which will
   be exposed to the guest kernel through the MADT GIC structures.

- In the guest kernel, CPU0's hardware ID is read from the MPIDR in linux/arch/arm64/kernel/setup.c::smp_setup_processor_id().
   The other CPUs' hardware IDs are fetched from the MADT GIC structures.

So I think we probably just need a function to translate cs->cpu_index to an
MPIDR, mimicking what's done in linux/arch/arm64/kvm/sys_regs.c::reset_mpidr(). In
this way, the hardware IDs originating from the MPIDR and the MADT GIC structure
will be exactly the same.
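Such a helper would look roughly like the sketch below. The helper name is
hypothetical; the field packing follows the kernel's reset_mpidr() in
arch/arm64/kvm/sys_regs.c (4-bit Aff0, 8-bit Aff1/Aff2, MPIDR[31] set as RES1),
matching the ASCII diagram in the patch:

```c
#include <stdint.h>

/*
 * Sketch of a cpu_index -> MPIDR translation mirroring the host kernel's
 * reset_mpidr(): Aff0 takes the low 4 bits of the index, Aff1/Aff2 the
 * next two 8-bit fields, and bit 31 (RES1) is always set.
 */
uint64_t virt_cpu_index_to_mpidr(int index)
{
    uint64_t mpidr = 1ULL << 31;                     /* MPIDR[31] is RES1 */

    mpidr |= (uint64_t)(index & 0x0f);               /* Aff0: 4 bits */
    mpidr |= (uint64_t)((index >> 4) & 0xff) << 8;   /* Aff1: 8 bits */
    mpidr |= (uint64_t)((index >> 12) & 0xff) << 16; /* Aff2: 8 bits */
    return mpidr;
}
```

With this mapping, index 0 yields 0x80000000 and index 16 yields 0x80000100,
so the hardware IDs seen via the MPIDR and via the MADT agree.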


>   static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                    Error **errp)
>   {
> @@ -2847,6 +2887,81 @@ static void virt_memory_plug(HotplugHandler *hotplug_dev,
>                            dev, &error_abort);
>   }
>   
> +static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                              Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    MachineState *ms = MACHINE(hotplug_dev);
> +    ARMCPU *cpu = ARM_CPU(dev);
> +    CPUState *cs = CPU(dev);
> +    CPUArchId *cpu_slot;
> +    int32_t min_cpuid = 0;
> +    int32_t max_cpuid;
> +
> +    /* sanity check the cpu */
> +    if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
> +        error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> +                   ms->cpu_type);
> +        return;
> +    }
> +
> +    if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
> +        error_setg(errp, "Invalid thread-id %u specified, correct range 0:%u",
> +                   cpu->thread_id, ms->smp.threads - 1);
> +        return;
> +    }
> +
> +    max_cpuid = ms->possible_cpus->len - 1;
> +    if (!dev->hotplugged) {
> +        min_cpuid = vms->acpi_dev ? ms->smp.cpus : 0;
> +        max_cpuid = vms->acpi_dev ? max_cpuid : ms->smp.cpus - 1;
> +    }
> +

I don't understand how the range is figured out. cpu->core_id should
be in the range [0, ms->smp.cores). With your code, wouldn't the following
scenario be rejected incorrectly?

-cpu host -smp maxcpus=4,cpus=1,sockets=4,clusters=1,cores=1,threads=1
(qemu) device_add host,id=cpu1,socket-id=1,cluster-id=0,core-id=2,thread-id=0
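For reference, the linear cpu_index that virt_get_cpu_id_from_cpu_topo()
presumably derives from the topology IDs can be sketched as a row-major
flattening over the -smp parameters (an assumption for illustration, not the
series' actual code):

```c
/*
 * Sketch: flatten (socket, cluster, core, thread) into a linear cpu_index,
 * assuming row-major ordering over the -smp topology. Under this scheme
 * core ranges over [0, cores) within a cluster, which is why checking
 * core_id against the whole possible-CPU range looks suspicious.
 */
int cpu_index_from_topo(int socket, int cluster, int core, int thread,
                        int clusters, int cores, int threads)
{
    return ((socket * clusters + cluster) * cores + core) * threads + thread;
}
```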

> +    if ((cpu->core_id < min_cpuid) || (cpu->core_id > max_cpuid)) {
> +        error_setg(errp, "Invalid core-id %d specified, correct range %d:%d",
> +                   cpu->core_id, min_cpuid, max_cpuid);
> +        return;
> +    }
> +
> +    if ((cpu->cluster_id < 0) || (cpu->cluster_id >= ms->smp.clusters)) {
> +        error_setg(errp, "Invalid cluster-id %u specified, correct range 0:%u",
> +                   cpu->cluster_id, ms->smp.clusters - 1);
> +        return;
> +    }
> +
> +    if ((cpu->socket_id < 0) || (cpu->socket_id >= ms->smp.sockets)) {
> +        error_setg(errp, "Invalid socket-id %u specified, correct range 0:%u",
> +                   cpu->socket_id, ms->smp.sockets - 1);
> +        return;
> +    }
> +
> +    cs->cpu_index = virt_get_cpu_id_from_cpu_topo(ms, dev);
> +
> +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> +    if (qemu_present_cpu(CPU(cpu_slot->cpu))) {
> +        error_setg(errp, "cpu(id%d=%d:%d:%d:%d) with arch-id %" PRIu64 " exist",
> +                   cs->cpu_index, cpu->socket_id, cpu->cluster_id, cpu->core_id,
> +                   cpu->thread_id, cpu_slot->arch_id);
> +        return;
> +    }
> +    virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
> +}
> +
> +static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                          Error **errp)
> +{
> +    MachineState *ms = MACHINE(hotplug_dev);
> +    CPUState *cs = CPU(dev);
> +    CPUArchId *cpu_slot;
> +
> +    /* insert the cold/hot-plugged vcpu in the slot */
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

May be:

        /* CPU becomes present */

> +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> +    cpu_slot->cpu = OBJECT(dev);
> +
> +    cs->disabled = false;
> +    return;
        ^^^^^^

        not needed.

It may be worth adding a comment like the one below, correlating with what's
done in aarch64_cpu_initfn():

        /* CPU becomes enabled after it's hot added */

> +}
> +
>   static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>                                               DeviceState *dev, Error **errp)
>   {
> @@ -2888,6 +3003,8 @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>           object_property_set_str(OBJECT(dev), "reserved-regions[0]",
>                                   resv_prop_str, errp);
>           g_free(resv_prop_str);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        virt_cpu_pre_plug(hotplug_dev, dev, errp);
>       }
>   }
>   
> @@ -2909,6 +3026,8 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>           virt_memory_plug(hotplug_dev, dev, errp);
>       } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>           virtio_md_pci_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        virt_cpu_plug(hotplug_dev, dev, errp);
>       }
>   
>       if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
> @@ -2993,7 +3112,8 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
>       if (device_is_dynamic_sysbus(mc, dev) ||
>           object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
>           object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI) ||
> -        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
> +        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
> +        object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>           return HOTPLUG_HANDLER(machine);
>       }
>       return NULL;
> @@ -3070,6 +3190,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>   #endif
>       mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
>       mc->kvm_type = virt_kvm_type;
> +    mc->has_hotpluggable_cpus = true;
>       assert(!mc->get_hotplug_handler);
>       mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
>       hc->pre_plug = virt_machine_device_pre_plug_cb;
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 1376350416..3a2e7e64ee 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -2332,6 +2332,12 @@ static const struct TCGCPUOps arm_tcg_ops = {
>   };
>   #endif /* CONFIG_TCG */
>   
> +static int64_t arm_cpu_get_arch_id(CPUState *cs)
> +{
> +    ARMCPU *cpu = ARM_CPU(cs);
> +    return cpu->mp_affinity;
> +}
> +
>   static void arm_cpu_class_init(ObjectClass *oc, void *data)
>   {
>       ARMCPUClass *acc = ARM_CPU_CLASS(oc);
> @@ -2350,6 +2356,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
>       cc->class_by_name = arm_cpu_class_by_name;
>       cc->has_work = arm_cpu_has_work;
>       cc->dump_state = arm_cpu_dump_state;
> +    cc->get_arch_id = arm_cpu_get_arch_id;
>       cc->set_pc = arm_cpu_set_pc;
>       cc->get_pc = arm_cpu_get_pc;
>       cc->gdb_read_register = arm_cpu_gdb_read_register;
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index 96158093cc..a660e3f483 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -739,6 +739,17 @@ static void aarch64_cpu_set_aarch64(Object *obj, bool value, Error **errp)
>       }
>   }
>   
> +static void aarch64_cpu_initfn(Object *obj)
> +{
> +    CPUState *cs = CPU(obj);
> +
> +    /*
> +     * we start every ARM64 vcpu as disabled possible vCPU. It needs to be
> +     * enabled explicitly
> +     */
> +    cs->disabled = true;
> +}
> +

The comments can be simplified to:

     /* The CPU state isn't enabled until it's hot added completely */

>   static void aarch64_cpu_finalizefn(Object *obj)
>   {
>   }
> @@ -751,7 +762,9 @@ static gchar *aarch64_gdb_arch_name(CPUState *cs)
>   static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
>   {
>       CPUClass *cc = CPU_CLASS(oc);
> +    DeviceClass *dc = DEVICE_CLASS(oc);
>   
> +    dc->user_creatable = true;
>       cc->gdb_read_register = aarch64_cpu_gdb_read_register;
>       cc->gdb_write_register = aarch64_cpu_gdb_write_register;
>       cc->gdb_num_core_regs = 34;
> @@ -800,6 +813,7 @@ static const TypeInfo aarch64_cpu_type_info = {
>       .name = TYPE_AARCH64_CPU,
>       .parent = TYPE_ARM_CPU,
>       .instance_size = sizeof(ARMCPU),
> +    .instance_init = aarch64_cpu_initfn,
>       .instance_finalize = aarch64_cpu_finalizefn,
>       .abstract = true,
>       .class_size = sizeof(AArch64CPUClass),

I'm not sure 'dc->user_creatable' can be set to true here, because the
ARMCPU objects aren't ready to be hot added/removed at this point. The
hacks for GICv3 aren't included so far. I think a separate patch may be
needed at the end of the series to enable the functionality?

Thanks,
Gavin




* Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-09-26 10:04 ` [PATCH RFC V2 04/37] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
  2023-09-27  6:28   ` [PATCH RFC V2 04/37] arm/virt,target/arm: " Gavin Shan
@ 2023-09-27  6:30   ` Gavin Shan
  2023-10-02 10:27     ` Salil Mehta via
  1 sibling, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-27  6:30 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On 9/26/23 20:04, Salil Mehta wrote:
> Refactor and introduce the common logic required during the initialization of
> both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> vCPUs which shall be used further during init phases of various other components
> like GIC, PMU, ACPI etc as part of the virt machine initialization.
> 
> KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in the
> powered-off state in the KVM host and do not run guest code. Plugged vCPUs are
> also kept in the powered-off state, but their vCPU threads exist and are kept sleeping.
> 
> TBD:
> For the cold booted vCPUs, this change also exists in the arm_load_kernel()
> in boot.c but for the hotplugged CPUs this change should still remain part of
> the pre-plug phase. We are duplicating the powering-off of the cold booted CPUs.
> Shall we remove the duplicate change from boot.c?
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Reported-by: Gavin Shan <gavin.shan@redhat.com>
                            ^^^^^^^^^^^^^^^^^^^^^

                            <gshan@redhat.com>

> [GS: pointed the assertion due to wrong range check]
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
>   target/arm/cpu.c   |   7 +++
>   target/arm/cpu64.c |  14 +++++
>   3 files changed, 156 insertions(+), 14 deletions(-)
> 




* Re: [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation,parking} code
  2023-09-26 10:04 ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation, parking} code Salil Mehta via
@ 2023-09-27  6:51   ` Gavin Shan
  2023-10-02 16:20     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-27  6:51 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> KVM vCPU creation is done once during the initialization of the VM when Qemu
> threads are spawned. This is common to all architectures. If the architecture
> supports vCPU hot-{un}plug, then this KVM vCPU creation could be deferred to a
> later point as well. Some architectures might in any case create KVM vCPUs for
> the yet-to-be-plugged vCPUs (i.e. the QOM object & thread do not exist) during
> VM init time and park them.
> 
> Hot-unplug of a vCPU results in the destruction of the vCPU object in QOM, but
> the KVM vCPU object in the host KVM is not destroyed; its representative
> KVM vCPU object in Qemu is parked.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   accel/kvm/kvm-all.c  | 61 ++++++++++++++++++++++++++++++++++----------
>   include/sysemu/kvm.h |  2 ++
>   2 files changed, 49 insertions(+), 14 deletions(-)
> 

The most important point seems to be missing from the commit log: the KVM vCPU
objects, including the hotpluggable ones, need to be in place before the in-host
GICv3 is initialized. So we need to expose kvm_create_vcpu() to put those KVM
vCPU objects in place, even for the non-present vCPUs.

> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 7b3da8dc3a..86e9c9ea60 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -137,6 +137,7 @@ static QemuMutex kml_slots_lock;
>   #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
>   
>   static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
> +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
>   
>   static inline void kvm_resample_fd_remove(int gsi)
>   {
> @@ -320,11 +321,51 @@ err:
>       return ret;
>   }
>   
> +void kvm_park_vcpu(CPUState *cpu)
> +{
> +    unsigned long vcpu_id = cpu->cpu_index;
> +    struct KVMParkedVcpu *vcpu;
> +
> +    vcpu = g_malloc0(sizeof(*vcpu));
> +    vcpu->vcpu_id = vcpu_id;

        vcpu->vcpu_id = cpu->cpu_index;

@vcpu_id can be dropped.

> +    vcpu->kvm_fd = cpu->kvm_fd;
> +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> +}
> +
> +int kvm_create_vcpu(CPUState *cpu)
> +{
> +    unsigned long vcpu_id = cpu->cpu_index;
> +    KVMState *s = kvm_state;
> +    int ret;
> +
> +    DPRINTF("kvm_create_vcpu\n");
> +
> +    /* check if the KVM vCPU already exist but is parked */
> +    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
> +    if (ret > 0) {
> +        goto found;
> +    }
> +
> +    /* create a new KVM vcpu */
> +    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +found:
> +    cpu->vcpu_dirty = true;
> +    cpu->kvm_fd = ret;
> +    cpu->kvm_state = s;
> +    cpu->dirty_pages = 0;
> +    cpu->throttle_us_per_full = 0;
> +
> +    return 0;
> +}
> +

The 'found' label can be dropped. @cpu can be initialized when the vCPU fd is
found, and then we can bail out early:

        /* The KVM vCPU may have been existing */
        ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
        if (ret > 0) {
            cpu->vcpu_dirty = true;
             :
             :
            return 0;
        }

        /* Create a new KVM vCPU */

>   static int do_kvm_destroy_vcpu(CPUState *cpu)
>   {
>       KVMState *s = kvm_state;
>       long mmap_size;
> -    struct KVMParkedVcpu *vcpu = NULL;
>       int ret = 0;
>   
>       DPRINTF("kvm_destroy_vcpu\n");
> @@ -353,10 +394,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
>           }
>       }
>   
> -    vcpu = g_malloc0(sizeof(*vcpu));
> -    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
> -    vcpu->kvm_fd = cpu->kvm_fd;
> -    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> +    kvm_park_vcpu(cpu);
>   err:
>       return ret;
>   }
> @@ -384,7 +422,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>           }
>       }
>   
> -    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
> +    return -1;
>   }
>   
>   int kvm_init_vcpu(CPUState *cpu, Error **errp)
> @@ -395,19 +433,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>   
>       trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>   
> -    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
> +    ret = kvm_create_vcpu(cpu);
>       if (ret < 0) {
> -        error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
> +        error_setg_errno(errp, -ret,
> +                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
>                            kvm_arch_vcpu_id(cpu));
>           goto err;
>       }
>   
> -    cpu->kvm_fd = ret;
> -    cpu->kvm_state = s;
> -    cpu->vcpu_dirty = true;
> -    cpu->dirty_pages = 0;
> -    cpu->throttle_us_per_full = 0;
> -
>       mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
>       if (mmap_size < 0) {
>           ret = mmap_size;
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 115f0cca79..2c34889b01 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -473,6 +473,8 @@ void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len);
>   
>   int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr,
>                                          hwaddr *phys_addr);
> +int kvm_create_vcpu(CPUState *cpu);
> +void kvm_park_vcpu(CPUState *cpu);
>   
>   #endif /* NEED_CPU_H */
>   

Thanks,
Gavin




* Re: [PATCH RFC V2 06/37] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  2023-09-26 10:04 ` [PATCH RFC V2 06/37] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
@ 2023-09-27 10:04   ` Gavin Shan
  2023-10-02 16:39     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-27 10:04 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> In the ARMv8 architecture, the GIC needs all the vCPUs to be created and present
> when it is initialized. This is because:
> 1. The GICC and MPIDR association must be fixed at VM initialization time.
>     This is represented by the register GIC_TYPER(mp_affinity, proc_num).
> 2. The GICCs (CPU interfaces), GICRs (redistributors) etc. must all be initialized
>     at boot time as well.
> 3. Memory regions associated with the GICRs etc. cannot be changed (add/del/mod)
>     after the VM has been initialized.
> 
> This patch adds support to pre-create all such possible vCPUs within the
> host using the KVM interface as part of the virt machine initialization. These
> vCPUs could later be attached to QOM/ACPI when they are actually hot plugged
> and made present.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> [VP: Identified CPU stall issue & suggested probable fix]
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c         | 53 +++++++++++++++++++++++++++++++++++++++++--
>   include/hw/core/cpu.h |  1 +
>   target/arm/cpu64.c    |  1 +
>   target/arm/kvm.c      | 32 ++++++++++++++++++++++++++
>   target/arm/kvm64.c    |  9 +++++++-
>   target/arm/kvm_arm.h  | 11 +++++++++
>   6 files changed, 104 insertions(+), 3 deletions(-)
> 

The subject looks a bit misleading. (possible && disabled) == (disabled). So it
can be simplified to something like below:

arm/virt,kvm: Pre-create KVM objects for hotpluggable vCPUs

I think the commit log can be improved to something like below:

All possible vCPUs are classified as cold-booting or hotpluggable vCPUs.
In the ARMv8 architecture, the GIC needs all the possible vCPUs to exist
and be present when it is initialized, for several reasons. After the
initialization, the CPU instances for the hotpluggable vCPUs aren't needed,
but the KVM objects like the vCPU's file descriptor should be kept as they
have been shared with the host.

1. GICC and MPIDR association must be fixed at the VM initialization time.
    This is represented by register GIC_TYPER(mp_afffinity, proc_num)
2. GICC(cpu interfaces), GICR(redistributors) etc all must be initialized
    at the boot time as well.
3. Memory regions associated with GICR etc. cannot be changed(add/del/mod)
    after VM has inited.

This patch creates and realizes CPU instances for the cold-booting vCPUs;
they eventually become enabled. For the hotpluggable vCPUs, the vCPU instances
are created but not realized; they eventually become present.


> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 3668ad27ec..6ba131b799 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2293,8 +2293,10 @@ static void machvirt_init(MachineState *machine)
>       assert(possible_cpus->len == max_cpus);
>       for (n = 0; n < possible_cpus->len; n++) {
>           Object *cpuobj;
> +        CPUState *cs;
>   
>           cpuobj = object_new(possible_cpus->cpus[n].type);
> +        cs = CPU(cpuobj);
>   
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>           object_property_set_int(cpuobj, "socket-id",
> @@ -2306,8 +2308,55 @@ static void machvirt_init(MachineState *machine)
>           object_property_set_int(cpuobj, "thread-id",
>                                   virt_get_thread_id(machine, n), NULL);
>   
> -        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> -        object_unref(cpuobj);
> +        if (n < smp_cpus) {
> +            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> +            object_unref(cpuobj);
> +        } else {
> +            CPUArchId *cpu_slot;
> +
> +            /* handling for vcpus which are yet to be hot-plugged */
> +            cs->cpu_index = n;
> +            cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
> +
> +            /*
> +             * ARM host vCPU features need to be fixed at the boot time. But as
> +             * per current approach this CPU object will be destroyed during
> +             * cpu_post_init(). During hotplug of vCPUs these properties are
> +             * initialized again.
> +             */
> +            virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
> +
> +            /*
> +             * For KVM, we shall be pre-creating the now disabled/un-plugged
> +             * possbile host vcpus and park them till the time they are
> +             * actually hot plugged. This is required to pre-size the host
> +             * GICC and GICR with the all possible vcpus for this VM.
> +             */
> +            if (kvm_enabled()) {
> +                kvm_arm_create_host_vcpu(ARM_CPU(cs));
> +            }

                /*
                 * For KVM, the associated objects like the vCPU's file
                 * descriptor are reserved so that they can be reused when
                 * the vCPU is hot added.
                 * :
                 */

> +            /*
> +             * Add disabled vCPU to CPU slot during the init phase of the virt
> +             * machine
> +             * 1. We need this ARMCPU object during the GIC init. This object
> +             *    will facilitate in pre-realizing the GIC. Any info like
> +             *    mp-affinity(required to derive gicr_type) etc. could still be
> +             *    fetched while preserving QOM abstraction akin to realized
> +             *    vCPUs.
> +             * 2. Now, after initialization of the virt machine is complete we
> +             *    could use two approaches to deal with this ARMCPU object:
> +             *    (i) re-use this ARMCPU object during hotplug of this vCPU.
> +             *                             OR
> +             *    (ii) defer release this ARMCPU object after gic has been
> +             *         initialized or during pre-plug phase when a vCPU is
> +             *         hotplugged.
> +             *
> +             *    We will use the (ii) approach and release the ARMCPU objects
> +             *    after GIC and machine has been fully initialized during
> +             *    machine_init_done() phase.
> +             */
> +             cpu_slot->cpu = OBJECT(cs);
> +        }

            /*
             * Make the hotpluggable vCPU present because ....
             */
>       }
>       fdt_add_timer_nodes(vms);
>       fdt_add_cpu_nodes(vms);
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index e5af79950c..b2201a98ee 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -401,6 +401,7 @@ struct CPUState {
>       uint32_t kvm_fetch_index;
>       uint64_t dirty_pages;
>       int kvm_vcpu_stats_fd;
> +    VMChangeStateEntry *vmcse;
>   
>       /* Use by accel-block: CPU is executing an ioctl() */
>       QemuLockCnt in_ioctl_lock;
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index a660e3f483..3a38e7ccaf 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -748,6 +748,7 @@ static void aarch64_cpu_initfn(Object *obj)
>        * enabled explicitly
>        */
>       cs->disabled = true;
> +    cs->thread_id = 0;
>   }
>   
>   static void aarch64_cpu_finalizefn(Object *obj)
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index b4c7654f49..0e1d0692b1 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -637,6 +637,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
>       write_list_to_cpustate(cpu);
>   }
>   
> +void kvm_arm_create_host_vcpu(ARMCPU *cpu)
> +{
> +    CPUState *cs = CPU(cpu);
> +    unsigned long vcpu_id = cs->cpu_index;
> +    int ret;
> +
> +    ret = kvm_create_vcpu(cs);
> +    if (ret < 0) {
> +        error_report("Failed to create host vcpu %ld", vcpu_id);
> +        abort();
> +    }
> +
> +    /*
> +     * Initialize the vCPU in the host. This will reset the sys regs
> +     * for this vCPU and related registers like MPIDR_EL1 etc. also
> +     * gets programmed during this call to host. These are referred
> +     * later while setting device attributes of the GICR during GICv3
> +     * reset
> +     */
> +    ret = kvm_arch_init_vcpu(cs);
> +    if (ret < 0) {
> +        error_report("Failed to initialize host vcpu %ld", vcpu_id);
> +        abort();
> +    }
> +
> +    /*
> +     * park the created vCPU. shall be used during kvm_get_vcpu() when
> +     * threads are created during realization of ARM vCPUs.
> +     */
> +    kvm_park_vcpu(cs);
> +}
> +
>   /*
>    * Update KVM's MP_STATE based on what QEMU thinks it is
>    */
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 94bbd9661f..364cc21f81 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -566,7 +566,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
>           return -EINVAL;
>       }
>   
> -    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cs);
> +    /*
> +     * Install VM change handler only when vCPU thread has been spawned
> +     * i.e. vCPU is being realized
> +     */
> +    if (cs->thread_id) {
> +        cs->vmcse = qemu_add_vm_change_state_handler(kvm_arm_vm_state_change,
> +                                                     cs);
> +    }
>   
>       /* Determine init features for this CPU */
>       memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> index 051a0da41c..31408499b3 100644
> --- a/target/arm/kvm_arm.h
> +++ b/target/arm/kvm_arm.h
> @@ -163,6 +163,17 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu);
>    */
>   void kvm_arm_reset_vcpu(ARMCPU *cpu);
>   
> +/**
> + * kvm_arm_create_host_vcpu:
> + * @cpu: ARMCPU
> + *
> + * Called at to pre create all possible kvm vCPUs within the the host at the
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              to create instances for the hotpluggable vCPUs

> + * virt machine init time. This will also init this pre-created vCPU and
> + * hence result in vCPU reset at host. These pre created and inited vCPUs
> + * shall be parked for use when ARM vCPUs are actually realized.
> + */
> +void kvm_arm_create_host_vcpu(ARMCPU *cpu);
> +
>   /**
>    * kvm_arm_init_serror_injection:
>    * @cs: CPUState

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus @machine init
  2023-09-26 10:04 ` [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
@ 2023-09-28  0:14   ` Gavin Shan
  2023-10-16 16:15     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  0:14 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> GIC needs to be pre-sized with possible vcpus at the initialization time. This
> is necessary because Memory regions and resources associated with GICC/GICR
> etc cannot be changed (add/del/modified) after VM has inited. Also, GIC_TYPER
> needs to be initialized with mp_affinity and cpu interface number association.
> This cannot be changed after GIC has initialized.
> 
> Once all the cpu interfaces of the GIC has been inited it needs to be ensured
                                                   ^^^^^^
                                                   initialized,
> that any updates to the GICC during reset only takes place for the present
                                                                  ^^^^^^^^^^^
                                                                  the enabled
> vcpus and not the disabled ones. Therefore, proper checks are required at
> various places.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> [changed the comment in arm_gicv3_icc_reset]
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c              | 15 ++++++++-------
>   hw/intc/arm_gicv3_common.c |  7 +++++--
>   hw/intc/arm_gicv3_cpuif.c  |  8 ++++++++
>   hw/intc/arm_gicv3_kvm.c    | 34 +++++++++++++++++++++++++++++++---
>   include/hw/arm/virt.h      |  2 +-
>   5 files changed, 53 insertions(+), 13 deletions(-)
> 

I guess the subject can be improved to something like below, because it's the preparatory
work to support vCPU hotplug (notifier) in the subsequent patches. In this patch, most
of the code changes are related to the vCPU state, ms->smp.max_cpus, and the CPU interface
instances associated with the GICv3 controller.

arm/virt,gicv3: Prepare for vCPU hotplug by checking GICv3CPUState states

We already have wrappers to check a vCPU's states. I'm wondering if we need another set
of wrappers for GICv3, for a couple of reasons: (a) In this patch, we're actually concerned
with GICv3CPUState's states, disabled or enabled. vCPU states have been classified as
possible, present, and enabled; the two sets of states don't match strictly. (b) With
GICv3's own wrappers, the code can be decoupled from the vCPU at a high level. Please
evaluate whether it's worthwhile for GICv3 to have its own wrappers; we could have the
following wrapper if wanted.

/*
  * The association between GICv3CPUState and ARMCPU happens in
  * arm_gicv3_common_realize(). After that, gicv3_cpuif_is_ready()
  * can be used.
  */
static inline bool gicv3_cpuif_is_ready(GICv3State *s, int index)
{
     if (!s->cpu || index >= s->num_cpu || !s->cpu[index].cpu) {
         return false;
     }

     return true;
}


> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 6ba131b799..a208b4e517 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -718,6 +718,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
>       const char *gictype;
>       int i;
>       unsigned int smp_cpus = ms->smp.cpus;
> +    unsigned int max_cpus = ms->smp.max_cpus;
>       uint32_t nb_redist_regions = 0;
>       int revision;
>   
> @@ -742,7 +743,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
>       }
>       vms->gic = qdev_new(gictype);
>       qdev_prop_set_uint32(vms->gic, "revision", revision);
> -    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
> +    qdev_prop_set_uint32(vms->gic, "num-cpu", max_cpus);
>       /* Note that the num-irq property counts both internal and external
>        * interrupts; there are always 32 of the former (mandated by GIC spec).
>        */
> @@ -753,7 +754,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
>   
>       if (vms->gic_version != VIRT_GIC_VERSION_2) {
>           uint32_t redist0_capacity = virt_redist_capacity(vms, VIRT_GIC_REDIST);
> -        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
> +        uint32_t redist0_count = MIN(max_cpus, redist0_capacity);
>   
>           nb_redist_regions = virt_gicv3_redist_region_count(vms);
>   
> @@ -774,7 +775,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
>                   virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
>   
>               qdev_prop_set_uint32(vms->gic, "redist-region-count[1]",
> -                MIN(smp_cpus - redist0_count, redist1_capacity));
> +                MIN(max_cpus - redist0_count, redist1_capacity));
>           }
>       } else {
>           if (!kvm_irqchip_in_kernel()) {
> @@ -831,7 +832,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
>           } else if (vms->virt) {
>               qemu_irq irq = qdev_get_gpio_in(vms->gic,
>                                               ppibase + ARCH_GIC_MAINT_IRQ);
> -            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
> +            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus, irq);
>           }
>   
>           qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
> @@ -839,11 +840,11 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
>                                                        + VIRTUAL_PMU_IRQ));
>   
>           sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
> -        sysbus_connect_irq(gicbusdev, i + smp_cpus,
> +        sysbus_connect_irq(gicbusdev, i + max_cpus,
>                              qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
> -        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
> +        sysbus_connect_irq(gicbusdev, i + 2 * max_cpus,
>                              qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
> -        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
> +        sysbus_connect_irq(gicbusdev, i + 3 * max_cpus,
>                              qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
>       }
>   
> diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
> index 2ebf880ead..ebd99af610 100644
> --- a/hw/intc/arm_gicv3_common.c
> +++ b/hw/intc/arm_gicv3_common.c
> @@ -392,10 +392,13 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
>       s->cpu = g_new0(GICv3CPUState, s->num_cpu);
>   
>       for (i = 0; i < s->num_cpu; i++) {
> -        CPUState *cpu = qemu_get_cpu(i);
> +        CPUState *cpu = qemu_get_possible_cpu(i);
>           uint64_t cpu_affid;
>   
> -        s->cpu[i].cpu = cpu;
> +        if (qemu_enabled_cpu(cpu)) {
> +            s->cpu[i].cpu = cpu;
> +        }
> +
>           s->cpu[i].gic = s;
>           /* Store GICv3CPUState in CPUARMState gicv3state pointer */
>           gicv3_set_gicv3state(cpu, &s->cpu[i]);

I don't think gicv3_set_gicv3state() is needed for !qemu_enabled_cpu(cpu),
since those disabled vCPUs will be released in hw/arm/virt.c pretty soon.

> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
> index d07b13eb27..7b7a0fdb9c 100644
> --- a/hw/intc/arm_gicv3_cpuif.c
> +++ b/hw/intc/arm_gicv3_cpuif.c
> @@ -934,6 +934,10 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
>       ARMCPU *cpu = ARM_CPU(cs->cpu);
>       CPUARMState *env = &cpu->env;
>   
> +    if (!qemu_enabled_cpu(cs->cpu)) {
> +        return;
> +    }
> +

The question is how this is possible. It seems a bug to update a GICv3CPUState
that isn't ready or is disabled.

>       g_assert(qemu_mutex_iothread_locked());
>   
>       trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
> @@ -1826,6 +1830,10 @@ static void icc_generate_sgi(CPUARMState *env, GICv3CPUState *cs,
>       for (i = 0; i < s->num_cpu; i++) {
>           GICv3CPUState *ocs = &s->cpu[i];
>   
> +        if (!qemu_enabled_cpu(ocs->cpu)) {
> +            continue;
> +        }
> +
>           if (irm) {
>               /* IRM == 1 : route to all CPUs except self */
>               if (cs == ocs) {
> diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
> index 72ad916d3d..b6f50caf84 100644
> --- a/hw/intc/arm_gicv3_kvm.c
> +++ b/hw/intc/arm_gicv3_kvm.c
> @@ -24,6 +24,7 @@
>   #include "hw/intc/arm_gicv3_common.h"
>   #include "qemu/error-report.h"
>   #include "qemu/module.h"
> +#include "sysemu/cpus.h"
>   #include "sysemu/kvm.h"
>   #include "sysemu/runstate.h"
>   #include "kvm_arm.h"
> @@ -458,6 +459,18 @@ static void kvm_arm_gicv3_put(GICv3State *s)
>           GICv3CPUState *c = &s->cpu[ncpu];
>           int num_pri_bits;
>   
> +        /*
> +         * To support hotplug of vcpus we need to make sure all gic cpuif/GICC
> +         * are initialized at machvirt init time. Once the init is done we
> +         * release the ARMCPU object for disabled vcpus but this leg could hit
> +         * during reset of GICC later as well i.e. after init has happened and
> +         * all of the cases we want to make sure we dont acess the GICC for
> +         * the disabled VCPUs.
> +         */
> +        if (!qemu_enabled_cpu(c->cpu)) {
> +            continue;
> +        }
> +
>           kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, true);
>           kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
>                           &c->icc_ctlr_el1[GICV3_NS], true);
> @@ -616,6 +629,11 @@ static void kvm_arm_gicv3_get(GICv3State *s)
>           GICv3CPUState *c = &s->cpu[ncpu];
>           int num_pri_bits;
>   
> +        /* don't access GICC for the disabled vCPUs. */
> +        if (!qemu_enabled_cpu(c->cpu)) {
> +            continue;
> +        }
> +
>           kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, false);
>           kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
>                           &c->icc_ctlr_el1[GICV3_NS], false);
> @@ -695,10 +713,19 @@ static void arm_gicv3_icc_reset(CPUARMState *env, const ARMCPRegInfo *ri)
>           return;
>       }
>   
> +    /*
> +     * This shall be called even when vcpu is being hotplugged or onlined and
> +     * other vcpus might be running. Host kernel KVM code to handle device
> +     * access of IOCTLs KVM_{GET|SET}_DEVICE_ATTR might fail due to inability to
> +     * grab vcpu locks for all the vcpus. Hence, we need to pause all vcpus to
> +     * facilitate locking within host.
> +     */
> +    pause_all_vcpus();
>       /* Initialize to actual HW supported configuration */
>       kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
>                         KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
>                         &c->icc_ctlr_el1[GICV3_NS], false, &error_abort);
> +    resume_all_vcpus();

Please swap the positions of pause_all_vcpus() and the next comment, and
then combine the comments.

>   
>       c->icc_ctlr_el1[GICV3_S] = c->icc_ctlr_el1[GICV3_NS];
>   }
> @@ -808,9 +835,10 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
>       gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL);
>   
>       for (i = 0; i < s->num_cpu; i++) {
> -        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
> -
> -        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
> +        CPUState *cs = qemu_get_cpu(i);
> +        if (qemu_enabled_cpu(cs)) {
> +            define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
> +        }
>       }
>   
>       /* Try to create the device via the device control API */
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 13163adb07..098c7917a4 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -217,7 +217,7 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
>   
>       assert(vms->gic_version != VIRT_GIC_VERSION_2);
>   
> -    return (MACHINE(vms)->smp.cpus > redist0_capacity &&
> +    return (MACHINE(vms)->smp.max_cpus > redist0_capacity &&
>               vms->highmem_redists) ? 2 : 1;
>   }
>   

Thanks,
Gavin




* Re: [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  2023-09-26 10:04 ` [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file Salil Mehta via
@ 2023-09-28  0:19   ` Gavin Shan
  2023-10-16 16:20     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  0:19 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On 9/26/23 20:04, Salil Mehta wrote:
> CPU ctrl-dev MMIO region length could be used in ACPI GED (common ACPI code
> across architectures) and various other architecture specific places. To make
> these code places independent of compilation order, ACPI_CPU_HOTPLUG_REG_LEN
> macro should be moved to a header file.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/cpu.c                 | 2 +-
>   include/hw/acpi/cpu_hotplug.h | 2 ++
>   2 files changed, 3 insertions(+), 1 deletion(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index 19c154d78f..45defdc0e2 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -1,12 +1,12 @@
>   #include "qemu/osdep.h"
>   #include "migration/vmstate.h"
>   #include "hw/acpi/cpu.h"
> +#include "hw/acpi/cpu_hotplug.h"
>   #include "qapi/error.h"
>   #include "qapi/qapi-events-acpi.h"
>   #include "trace.h"
>   #include "sysemu/numa.h"
>   
> -#define ACPI_CPU_HOTPLUG_REG_LEN 12
>   #define ACPI_CPU_SELECTOR_OFFSET_WR 0
>   #define ACPI_CPU_FLAGS_OFFSET_RW 4
>   #define ACPI_CPU_CMD_OFFSET_WR 5
> diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
> index 3b932abbbb..48b291e45e 100644
> --- a/include/hw/acpi/cpu_hotplug.h
> +++ b/include/hw/acpi/cpu_hotplug.h
> @@ -19,6 +19,8 @@
>   #include "hw/hotplug.h"
>   #include "hw/acpi/cpu.h"
>   
> +#define ACPI_CPU_HOTPLUG_REG_LEN 12
> +
>   typedef struct AcpiCpuHotplug {
>       Object *device;
>       MemoryRegion io;




* Re: [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug
  2023-09-26 10:04 ` [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
@ 2023-09-28  0:25   ` Gavin Shan
  2023-10-16 21:23     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  0:25 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> ACPI is required to interface QEMU with the guest. Roughly falls into below
> cases,
> 
> 1. Convey the possible vcpus config at the machine init time to the guest
>     using various DSDT tables like MADT etc.
> 2. Convey vcpu hotplug events to guest(using GED)
> 3. Assist in evaluation of various ACPI methods(like _EVT, _STA, _OST, _EJ0,
>     _MAT etc.)
> 4. Provides ACPI cpu hotplug state and 12 Byte memory mapped cpu hotplug
>     control register interface to the OSPM/guest corresponding to each possible
>     vcpu. The register interface consists of various R/W fields and their
>     handling operations. These are called when ever register fields or memory
>     regions are accessed(i.e. read or written) by OSPM when ever it evaluates
>     various ACPI methods.
> 
> Note: lot of this framework code is inherited from the changes already done for
>        x86 but still some minor changes are required to make it compatible with
>        ARM64.)
> 
> This patch enables the ACPI support for virtual cpu hotplug. ACPI changes
> required will follow in subsequent patches.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/Kconfig | 1 +
>   1 file changed, 1 insertion(+)
> 

I assume this patch needs to be moved to the end of the series, until vCPU
hotplug is supported in the code base.

> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 7e68348440..dae06158cd 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -29,6 +29,7 @@ config ARM_VIRT
>       select ACPI_HW_REDUCED
>       select ACPI_APEI
>       select ACPI_VIOT
> +    select ACPI_CPU_HOTPLUG
>       select VIRTIO_MEM_SUPPORTED
>       select ACPI_CXL
>       select ACPI_HMAT

Thanks,
Gavin




* Re: [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub
  2023-09-26 10:04 ` [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub Salil Mehta via
@ 2023-09-28  0:28   ` Gavin Shan
  2023-10-16 21:27     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  0:28 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On 9/26/23 20:04, Salil Mehta wrote:
> ACPI CPU hotplug related initialization should only happend if ACPI_CPU_HOTPLUG
> support has been enabled for particular architecture. Add cpu_hotplug_hw_init()
> stub to avoid compilation break.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/acpi-cpu-hotplug-stub.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/acpi/acpi-cpu-hotplug-stub.c b/hw/acpi/acpi-cpu-hotplug-stub.c
> index 3fc4b14c26..c6c61bb9cd 100644
> --- a/hw/acpi/acpi-cpu-hotplug-stub.c
> +++ b/hw/acpi/acpi-cpu-hotplug-stub.c
> @@ -19,6 +19,12 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
>       return;
>   }
>   
> +void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
> +                         CPUHotplugState *state, hwaddr base_addr)
> +{
> +    return;
> +}
> +
>   void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list)
>   {
>       return;




* Re: [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
  2023-09-26 10:04 ` [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init Salil Mehta via
@ 2023-09-28  0:40   ` Gavin Shan
  2023-10-16 21:41     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  0:40 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> ACPI CPU Hotplug code assumes a virtual CPU is unplugged if the CPUState object
> is absent in the list of ths possible CPUs(CPUArchIdList *possible_cpus)
> maintained on per-machine basis. Use the earlier introduced qemu_present_cpu()
> API to check this state.
> 
> This change should have no bearing on the functionality of any architecture and
> is mere a representational change.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/cpu.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index 45defdc0e2..d5ba37b209 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -225,7 +225,10 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
>       state->dev_count = id_list->len;
>       state->devs = g_new0(typeof(*state->devs), state->dev_count);
>       for (i = 0; i < id_list->len; i++) {
> -        state->devs[i].cpu =  CPU(id_list->cpus[i].cpu);
> +        struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
> +        if (qemu_present_cpu(cpu)) {
> +            state->devs[i].cpu = cpu;
> +        }
>           state->devs[i].arch_id = id_list->cpus[i].arch_id;
>       }
>       memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,

I don't think qemu_present_cpu() is needed because all possible vCPUs are present
for x86 and arm64 at this point? Besides, we have the assumption all hotpluggable
vCPUs are present, looking at James' kernel series where ACPI_HOTPLUG_PRESENT_CPU
exists in linux/drivers/acpi/Kconfig :)

Thanks,
Gavin




* Re: [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events
  2023-09-26 10:04 ` [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events Salil Mehta via
@ 2023-09-28  0:56   ` Gavin Shan
  2023-10-16 21:44     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  0:56 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> ACPI GED (as described in the ACPI 6.2 spec) can be used to generate ACPI
> events when the OSPM/guest receives an interrupt listed in the _CRS object of
> GED. The OSPM then maps or demultiplexes the event by evaluating the _EVT
> method.
> 
> This change adds support for cpu hotplug event initialization to the existing
> GED framework.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/generic_event_device.c         | 8 ++++++++
>   include/hw/acpi/generic_event_device.h | 5 +++++
>   2 files changed, 13 insertions(+)
> 

It looks a bit strange that you're co-developing the patch with yourself.
It seems all patches follow this particular pattern. It could be changed
to:

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>

The code changes look good to me with the following nits addressed:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index a3d31631fe..d2fa1d0e4a 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -25,6 +25,7 @@ static const uint32_t ged_supported_events[] = {
>       ACPI_GED_MEM_HOTPLUG_EVT,
>       ACPI_GED_PWR_DOWN_EVT,
>       ACPI_GED_NVDIMM_HOTPLUG_EVT,
> +    ACPI_GED_CPU_HOTPLUG_EVT,
>   };
>   

Can we move ACPI_GED_CPU_HOTPLUG_EVT ahead of ACPI_GED_MEM_HOTPLUG_EVT?

>   /*
> @@ -400,6 +401,13 @@ static void acpi_ged_initfn(Object *obj)
>       memory_region_init_io(&ged_st->regs, obj, &ged_regs_ops, ged_st,
>                             TYPE_ACPI_GED "-regs", ACPI_GED_REG_COUNT);
>       sysbus_init_mmio(sbd, &ged_st->regs);
> +
> +    s->cpuhp.device = OBJECT(s);
> +    memory_region_init(&s->container_cpuhp, OBJECT(dev), "cpuhp container",
> +                       ACPI_CPU_HOTPLUG_REG_LEN);
> +    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container_cpuhp);
> +    cpu_hotplug_hw_init(&s->container_cpuhp, OBJECT(dev),
> +                        &s->cpuhp_state, 0);
>   }
>   
>   static void acpi_ged_class_init(ObjectClass *class, void *data)
> diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
> index d831bbd889..d0a5a43abf 100644
> --- a/include/hw/acpi/generic_event_device.h
> +++ b/include/hw/acpi/generic_event_device.h
> @@ -60,6 +60,7 @@
>   #define HW_ACPI_GENERIC_EVENT_DEVICE_H
>   
>   #include "hw/sysbus.h"
> +#include "hw/acpi/cpu_hotplug.h"
>   #include "hw/acpi/memory_hotplug.h"
>   #include "hw/acpi/ghes.h"
>   #include "qom/object.h"
> @@ -97,6 +98,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
>   #define ACPI_GED_MEM_HOTPLUG_EVT   0x1
>   #define ACPI_GED_PWR_DOWN_EVT      0x2
>   #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
> +#define ACPI_GED_CPU_HOTPLUG_EVT    0x8
>   

#define ACPI_GED_CPU_HOTPLUG_EVT  0x1
#define ACPI_GED_MEM_HOTPLUG_EVT  0x2
   :

If the adjustment is friendly to live migration.

>   typedef struct GEDState {
>       MemoryRegion evt;
> @@ -108,6 +110,9 @@ struct AcpiGedState {
>       SysBusDevice parent_obj;
>       MemHotplugState memhp_state;
>       MemoryRegion container_memhp;
> +    CPUHotplugState cpuhp_state;
> +    MemoryRegion container_cpuhp;
> +    AcpiCpuHotplug cpuhp;
>       GEDState ged_state;
>       uint32_t ged_event_bitmap;
>       qemu_irq irq;

Thanks,
Gavin




* Re: [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation
  2023-09-26 10:04 ` [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
@ 2023-09-28  1:03   ` Gavin Shan
  2023-10-16 21:46     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  1:03 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Add CPU Hotplug event to the set of supported ged-events during the creation of
> GED device during VM init. Also initialize the memory map for CPU Hotplug
              ^^^^^^^^^^^^^^
              it can be dropped.
> control device used in event exchanges between Qemu/VMM and the guest.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c         | 5 ++++-
>   include/hw/arm/virt.h | 1 +
>   2 files changed, 5 insertions(+), 1 deletion(-)
> 

The changes look good to me:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 070c36054e..5c8a0672dc 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -76,6 +76,7 @@
>   #include "hw/mem/pc-dimm.h"
>   #include "hw/mem/nvdimm.h"
>   #include "hw/acpi/generic_event_device.h"
> +#include "hw/acpi/cpu_hotplug.h"
>   #include "hw/virtio/virtio-md-pci.h"
>   #include "hw/virtio/virtio-iommu.h"
>   #include "hw/char/pl011.h"
> @@ -155,6 +156,7 @@ static const MemMapEntry base_memmap[] = {
>       [VIRT_NVDIMM_ACPI] =        { 0x09090000, NVDIMM_ACPI_IO_LEN},
>       [VIRT_PVTIME] =             { 0x090a0000, 0x00010000 },
>       [VIRT_SECURE_GPIO] =        { 0x090b0000, 0x00001000 },
> +    [VIRT_CPUHP_ACPI] =         { 0x090c0000, ACPI_CPU_HOTPLUG_REG_LEN},
>       [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
>       /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>       [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
> @@ -640,7 +642,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
>       DeviceState *dev;
>       MachineState *ms = MACHINE(vms);
>       int irq = vms->irqmap[VIRT_ACPI_GED];
> -    uint32_t event = ACPI_GED_PWR_DOWN_EVT;
> +    uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_CPU_HOTPLUG_EVT;
>   
>       if (ms->ram_slots) {
>           event |= ACPI_GED_MEM_HOTPLUG_EVT;
> @@ -655,6 +657,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
>   
>       sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_ACPI_GED].base);
>       sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, vms->memmap[VIRT_PCDIMM_ACPI].base);
> +    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 3, vms->memmap[VIRT_CPUHP_ACPI].base);
>       sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(vms->gic, irq));
>   
>       sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index fc0469c33f..09a0b2d4f0 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -85,6 +85,7 @@ enum {
>       VIRT_PCDIMM_ACPI,
>       VIRT_ACPI_GED,
>       VIRT_NVDIMM_ACPI,
> +    VIRT_CPUHP_ACPI,
>       VIRT_PVTIME,
>       VIRT_LOWMEMMAP_LAST,
>   };

Thanks,
Gavin




* Re: [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2023-09-26 10:04 ` [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
@ 2023-09-28  1:08   ` Gavin Shan
  2023-10-16 21:54     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  1:08 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> ACPI CPU hotplug state (is_present=_STA.PRESENT, is_enabled=_STA.ENABLED) for
> all the possible vCPUs MUST be initialized during machine init. This is done
> during the creation of the GED device. VMM/Qemu MUST expose/fake the ACPI state
> of the disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always,
> i.e. ACPI persistent. If the 'disabled' vCPU objects are destroyed before the
> GED device has been created, then their ACPI hotplug state might not get
> initialized correctly, as the acpi_persistent flag is part of the CPUState. This
> would expose the wrong status of the unplugged vCPUs to the Guest kernel.
> 
> Hence, moving the GED device creation before disabled vCPU objects get destroyed
> as part of the post CPU init routine.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 10 +++++++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 5c8a0672dc..cbb6199ec6 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2376,6 +2376,12 @@ static void machvirt_init(MachineState *machine)
>   
>       create_gic(vms, sysmem);
>   
> +    has_ged = has_ged && aarch64 && firmware_loaded &&
> +              virt_is_acpi_enabled(vms);
> +    if (has_ged) {
> +        vms->acpi_dev = create_acpi_ged(vms);
> +    }
> +

I prefer the old style. Squeezing all conditions to @has_ged changes what's
to be meant by @has_ged itself.

        if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
            :
        }

>       virt_cpu_post_init(vms, sysmem);
>   
>       fdt_add_pmu_nodes(vms);
> @@ -2398,9 +2404,7 @@ static void machvirt_init(MachineState *machine)
>   
>       create_pcie(vms);
>   
> -    if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
> -        vms->acpi_dev = create_acpi_ged(vms);
> -    } else {
> +    if (!has_ged) {
>           create_gpio_devices(vms, VIRT_GPIO, sysmem);
>       }
>   

Thanks,
Gavin




* Re: [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  2023-09-26 10:04 ` [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change Salil Mehta via
@ 2023-09-28  1:26   ` Gavin Shan
  2023-10-16 21:57     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  1:26 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On 9/26/23 20:04, Salil Mehta wrote:
> CPUs Control device(\\_SB.PCI0) register interface for the x86 arch is based on
> PCI and is IO port based and hence existing cpus AML code assumes _CRS objects
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                           . The existing AML code assumes _CRS object
> would evaluate to a system resource which describes IO Port address. But on ARM
   ^^^^^^^^^^^^^^^^^^^
   is evaluated to a
   
> arch CPUs control device(\\_SB.PRES) register interface is memory-mapped hence
> _CRS object should evaluate to system resource which describes memory-mapped
               ^^^^^^
               should be evaluated
> base address.
> 
> This cpus AML code change updates the existing interface of the build cpus AML
> function to accept both IO/MEMORY type regions and updates the _CRS object
> correspondingly.
> 
> NOTE: Besides the above, a CPU scan shall be triggered when OSPM evaluates the
>        _EVT method as part of the GED framework, which is covered in a
>        subsequent patch.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/cpu.c         | 23 ++++++++++++++++-------
>   hw/i386/acpi-build.c  |  2 +-
>   include/hw/acpi/cpu.h |  5 +++--
>   3 files changed, 20 insertions(+), 10 deletions(-)
> 

I guess the commit log can be simplified to:

The CPU hotplug register block is declared as an IO region on x86, or a memory
region on arm64, in build_cpus_aml(), as part of the generic container device
(\\_SB.PCI0 or \\_SB.PRES).

Adapt build_cpus_aml() so that both IO regions and memory regions can be
handled.

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index d5ba37b209..232720992d 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -341,9 +341,10 @@ const VMStateDescription vmstate_cpu_hotplug = {
>   #define CPU_FW_EJECT_EVENT "CEJF"
>   
>   void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
> -                    hwaddr io_base,
> +                    hwaddr base_addr,
>                       const char *res_root,
> -                    const char *event_handler_method)
> +                    const char *event_handler_method,
> +                    AmlRegionSpace rs)
>   {
>       Aml *ifctx;
>       Aml *field;
> @@ -370,13 +371,19 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>           aml_append(cpu_ctrl_dev, aml_mutex(CPU_LOCK, 0));
>   
>           crs = aml_resource_template();
> -        aml_append(crs, aml_io(AML_DECODE16, io_base, io_base, 1,
> +        if (rs == AML_SYSTEM_IO) {
> +            aml_append(crs, aml_io(AML_DECODE16, base_addr, base_addr, 1,
>                                  ACPI_CPU_HOTPLUG_REG_LEN));
> +        } else {
> +            aml_append(crs, aml_memory32_fixed(base_addr,
> +                               ACPI_CPU_HOTPLUG_REG_LEN, AML_READ_WRITE));
> +        }
> +
>           aml_append(cpu_ctrl_dev, aml_name_decl("_CRS", crs));
>   
>           /* declare CPU hotplug MMIO region with related access fields */
>           aml_append(cpu_ctrl_dev,
> -            aml_operation_region("PRST", AML_SYSTEM_IO, aml_int(io_base),
> +            aml_operation_region("PRST", rs, aml_int(base_addr),
>                                    ACPI_CPU_HOTPLUG_REG_LEN));
>   
>           field = aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK,
> @@ -702,9 +709,11 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>       aml_append(sb_scope, cpus_dev);
>       aml_append(table, sb_scope);
>   
> -    method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
> -    aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD));
> -    aml_append(table, method);
> +    if (event_handler_method) {
> +        method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
> +        aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD));
> +        aml_append(table, method);
> +    }
>   
>       g_free(cphp_res_path);
>   }
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index bb12b0ad43..560f108d38 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1550,7 +1550,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>               .fw_unplugs_cpu = pm->smi_on_cpu_unplug,
>           };
>           build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
> -                       "\\_SB.PCI0", "\\_GPE._E02");
> +                       "\\_SB.PCI0", "\\_GPE._E02", AML_SYSTEM_IO);
>       }
>   
>       if (pcms->memhp_io_base && nr_mem) {
> diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
> index 999caaf510..b87ebfdf4b 100644
> --- a/include/hw/acpi/cpu.h
> +++ b/include/hw/acpi/cpu.h
> @@ -56,9 +56,10 @@ typedef struct CPUHotplugFeatures {
>   } CPUHotplugFeatures;
>   
>   void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
> -                    hwaddr io_base,
> +                    hwaddr base_addr,
>                       const char *res_root,
> -                    const char *event_handler_method);
> +                    const char *event_handler_method,
> +                    AmlRegionSpace rs);
>   
>   void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list);
>   

Thanks,
Gavin




* Re: [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  2023-09-26 10:04 ` [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
@ 2023-09-28  1:36   ` Gavin Shan
  2023-10-16 22:05     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28  1:36 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Support of vCPU Hotplug requires a sequence of ACPI handshakes between Qemu and
> the Guest kernel when a vCPU is plugged or unplugged. Most of the AML code to
> support these handshakes already exists. This AML needs to be built during VM
> init for the ARM architecture as well, if GED support exists.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt-acpi-build.c | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 6b674231c2..d27df5030e 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -858,7 +858,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>        * the RTC ACPI device at all when using UEFI.
>        */
>       scope = aml_scope("\\_SB");
> -    acpi_dsdt_add_cpus(scope, vms);
> +    /* if GED is enabled then cpus AML shall be added as part build_cpus_aml */
> +    if (vms->acpi_dev) {
> +        CPUHotplugFeatures opts = {
> +             .acpi_1_compatible = false,
> +             .has_legacy_cphp = false
> +        };
> +
> +        build_cpus_aml(scope, ms, opts, memmap[VIRT_CPUHP_ACPI].base,
> +                       "\\_SB", NULL, AML_SYSTEM_MEMORY);
> +    } else {
> +        acpi_dsdt_add_cpus(scope, vms);
> +    }
>       acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>                          (irqmap[VIRT_UART] + ARM_SPI_BASE));
>       if (vmc->acpi_expose_flash) {

I don't think it's enough to check vms->acpi_dev. vCPU hotplug needn't be
supported even when vms->acpi_dev exists, for example when vGICv2 instead of
vGICv3 is enabled, and so on.

Thanks,
Gavin




* Re: [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2023-09-26 10:04 ` [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
@ 2023-09-28 23:18   ` Gavin Shan
  2023-10-16 22:33     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28 23:18 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> The ARM arch does not allow CPU presence to be changed [1] after the kernel has
> booted. Hence, firmware/ACPI/Qemu must ensure a persistent view of the vCPUs to
> the Guest kernel even when they are not present in the QoM, i.e. are unplugged
> or are yet-to-be-plugged.
> 
> References:
> [1] Check comment 5 in the bugzilla entry
>     Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   cpus-common.c         |  6 ++++++
>   hw/arm/virt.c         |  7 +++++++
>   include/hw/core/cpu.h | 20 ++++++++++++++++++++
>   3 files changed, 33 insertions(+)
> 

hmm, it's another CPU state, in addition to the existing 3 CPU states: possible,
present, enabled. Now we're adding an always-present state on top. I think those
CPU states can be squeezed into the existing present state. What we need is to
ensure all possible vCPUs are present from the beginning. How about doing
something like below?

/*
 * The flag is set for all possible vCPUs in hw/arm/virt.c::virt_possible_cpu_arch_ids().
 * The idea is that the flag is managed by the specific board, because always-present
 * is a special requirement of the hw/arm/virt board.
 */
#define CPU_ARCH_ID_FLAG_ALWAYS_PRESENT   (1UL << 0)
typedef struct CPUArchId {
     uint64_t  flags;
       :
} CPUArchId;

static inline bool machine_has_possible_cpu(int index);
static inline bool machine_has_present_cpu(int index)
{
     if (!machine_has_possible_cpu(index)) {
         return false;
     }

     if (!(ms->possible_cpus->cpus[index].flags & CPU_ARCH_ID_FLAG_ALWAYS_PRESENT)) {
         return false;
     }

     return true;
}

static inline bool machine_has_enabled_cpu(int index)
{
     CPUState *cs;

     if (!machine_has_present_cpu(index)) {
         return false;
     }

     /* I'm thinking cpu.enabled can be replaced by another
      * flag in struct CPUArchId::flags, due to the fact that
      * the vCPU's states are managed by the board and changed
      * at creation time or in the hotplug handlers.
      */
     cs = CPU(ms->possible_cpus->cpus[index].cpu);
     if (!cs || !cs->enabled) {
         return false;
     }

     return true;
}

> diff --git a/cpus-common.c b/cpus-common.c
> index 24c04199a1..d64aa63b19 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -128,6 +128,12 @@ bool qemu_enabled_cpu(CPUState *cpu)
>       return cpu && !cpu->disabled;
>   }
>   
> +bool qemu_persistent_cpu(CPUState *cpu)
> +{
> +    /* cpu state can be faked to the guest via acpi */
> +    return cpu->acpi_persistent;
> +}
> +
>   uint64_t qemu_get_cpu_archid(int cpu_index)
>   {
>       MachineState *ms = MACHINE(qdev_get_machine());
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index cbb6199ec6..f1bee569d5 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -3006,6 +3006,13 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>           return;
>       }
>       virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
> +
> +    /*
> +     * To give persistent presence view of vCPUs to the guest, ACPI might need
> +     * to fake the presence of the vCPUs to the guest but keep them disabled.
> +     * This shall be used during the init of ACPI Hotplug state and hot-unplug
> +     */
> +     cs->acpi_persistent = true;
>   }
>   
>   static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index b2201a98ee..dab572c9bd 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -425,6 +425,13 @@ struct CPUState {
>        * By default every CPUState is enabled as of now across all archs.
>        */
>       bool disabled;
> +    /*
> +     * On certain architectures, to give persistent view of the 'presence' of
> +     * vCPUs to the guest, ACPI might need to fake the 'presence' of the vCPUs
> +     * but keep them ACPI disabled to the guest. This is done by returning
> +     * _STA.PRES=True and _STA.Ena=False for the unplugged vCPUs in QEMU QoM.
> +     */
> +    bool acpi_persistent;
>       /* TODO Move common fields from CPUArchState here. */
>       int cpu_index;
>       int cluster_index;
> @@ -814,6 +821,19 @@ bool qemu_present_cpu(CPUState *cpu);
>    */
>   bool qemu_enabled_cpu(CPUState *cpu);
>   
> +/**
> + * qemu_persistent_cpu:
> + * @cpu: The vCPU to check
> + *
> + * Checks if the vCPU state should always be reflected as *present* via ACPI
> + * to the Guest. By default, this is False on all architectures and has to be
> + * explicitly set during initialization.
> + *
> + * Returns: True if it is ACPI 'persistent' CPU
> + *
> + */
> +bool qemu_persistent_cpu(CPUState *cpu);
> +
>   /**
>    * qemu_get_cpu_archid:
> + * @cpu_index: possible vCPU for which arch-id needs to be retrieved

Thanks,
Gavin




* Re: [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
  2023-09-26 10:04 ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
@ 2023-09-28 23:33   ` Gavin Shan
  2023-10-16 22:59     ` Salil Mehta via
  2024-01-17 21:46   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} " Jonathan Cameron via
  1 sibling, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28 23:33 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> ACPI AML changes to properly reflect the _STA.PRES and _STA.ENA Bits to the
> guest during initialization, when CPUs are hotplugged and after CPUs are
> hot-unplugged.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/cpu.c                  | 49 +++++++++++++++++++++++++++++++---
>   hw/acpi/generic_event_device.c | 11 ++++++++
>   include/hw/acpi/cpu.h          |  2 ++
>   3 files changed, 58 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index 232720992d..e1299696d3 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -63,10 +63,11 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
>       cdev = &cpu_st->devs[cpu_st->selector];
>       switch (addr) {
>       case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
> -        val |= cdev->cpu ? 1 : 0;
> +        val |= cdev->is_enabled ? 1 : 0;
>           val |= cdev->is_inserting ? 2 : 0;
>           val |= cdev->is_removing  ? 4 : 0;
>           val |= cdev->fw_remove  ? 16 : 0;
> +        val |= cdev->is_present ? 32 : 0;
>           trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
>           break;

The vCPU states are synchronized with what we had. It means we're maintaining two
sets of vCPU states, one at the board level and another set for vCPU hotplug here.
They look like duplicates of each other. However, it would need too many code
changes to combine them.

>       case ACPI_CPU_CMD_DATA_OFFSET_RW:
> @@ -228,7 +229,21 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
>           struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
>           if (qemu_present_cpu(cpu)) {
>               state->devs[i].cpu = cpu;
> +            state->devs[i].is_present = true;
> +        } else {
> +            if (qemu_persistent_cpu(cpu)) {
> +                state->devs[i].is_present = true;
> +            } else {
> +                state->devs[i].is_present = false;
> +            }
>           }

state->devs[i].is_present = qemu_persistent_cpu(cpu);

> +
> +        if (qemu_enabled_cpu(cpu)) {
> +            state->devs[i].is_enabled = true;
> +        } else {
> +            state->devs[i].is_enabled = false;
> +        }
> +

state->dev[i].is_enabled = qemu_enabled_cpu(cpu);

>           state->devs[i].arch_id = id_list->cpus[i].arch_id;
>       }
>       memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
> @@ -261,6 +276,8 @@ void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
>       }
>   
>       cdev->cpu = CPU(dev);
> +    cdev->is_present = true;
> +    cdev->is_enabled = true;
>       if (dev->hotplugged) {
>           cdev->is_inserting = true;
>           acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
> @@ -292,6 +309,11 @@ void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
>           return;
>       }
>   
> +    cdev->is_enabled = false;
> +    if (!qemu_persistent_cpu(CPU(dev))) {
> +        cdev->is_present = false;
> +    }
> +
>       cdev->cpu = NULL;
>   }
>   
> @@ -302,6 +324,8 @@ static const VMStateDescription vmstate_cpuhp_sts = {
>       .fields      = (VMStateField[]) {
>           VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
>           VMSTATE_BOOL(is_removing, AcpiCpuStatus),
> +        VMSTATE_BOOL(is_present, AcpiCpuStatus),
> +        VMSTATE_BOOL(is_enabled, AcpiCpuStatus),
>           VMSTATE_UINT32(ost_event, AcpiCpuStatus),
>           VMSTATE_UINT32(ost_status, AcpiCpuStatus),
>           VMSTATE_END_OF_LIST()
> @@ -339,6 +363,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
>   #define CPU_REMOVE_EVENT  "CRMV"
>   #define CPU_EJECT_EVENT   "CEJ0"
>   #define CPU_FW_EJECT_EVENT "CEJF"
> +#define CPU_PRESENT       "CPRS"
>   
>   void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>                       hwaddr base_addr,
> @@ -399,7 +424,9 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>           aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
>           /* tell firmware to do device eject, write only */
>           aml_append(field, aml_named_field(CPU_FW_EJECT_EVENT, 1));
> -        aml_append(field, aml_reserved_field(3));
> +        /* 1 if present, read only */
> +        aml_append(field, aml_named_field(CPU_PRESENT, 1));
> +        aml_append(field, aml_reserved_field(2));
>           aml_append(field, aml_named_field(CPU_COMMAND, 8));
>           aml_append(cpu_ctrl_dev, field);
>   
> @@ -429,6 +456,7 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>           Aml *ctrl_lock = aml_name("%s.%s", cphp_res_path, CPU_LOCK);
>           Aml *cpu_selector = aml_name("%s.%s", cphp_res_path, CPU_SELECTOR);
>           Aml *is_enabled = aml_name("%s.%s", cphp_res_path, CPU_ENABLED);
> +        Aml *is_present = aml_name("%s.%s", cphp_res_path, CPU_PRESENT);
>           Aml *cpu_cmd = aml_name("%s.%s", cphp_res_path, CPU_COMMAND);
>           Aml *cpu_data = aml_name("%s.%s", cphp_res_path, CPU_DATA);
>           Aml *ins_evt = aml_name("%s.%s", cphp_res_path, CPU_INSERT_EVENT);
> @@ -457,13 +485,26 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>           {
>               Aml *idx = aml_arg(0);
>               Aml *sta = aml_local(0);
> +            Aml *ifctx2;
> +            Aml *else_ctx;
>   
>               aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
>               aml_append(method, aml_store(idx, cpu_selector));
>               aml_append(method, aml_store(zero, sta));
> -            ifctx = aml_if(aml_equal(is_enabled, one));
> +            ifctx = aml_if(aml_equal(is_present, one));
>               {
> -                aml_append(ifctx, aml_store(aml_int(0xF), sta));
> +                ifctx2 = aml_if(aml_equal(is_enabled, one));
> +                {
> +                    /* cpu is present and enabled */
> +                    aml_append(ifctx2, aml_store(aml_int(0xF), sta));
> +                }
> +                aml_append(ifctx, ifctx2);
> +                else_ctx = aml_else();
> +                {
> +                    /* cpu is present but disabled */
> +                    aml_append(else_ctx, aml_store(aml_int(0xD), sta));
> +                }
> +                aml_append(ifctx, else_ctx);
>               }
>               aml_append(method, ifctx);
>               aml_append(method, aml_release(ctrl_lock));
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index d2fa1d0e4a..b84602b238 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -319,6 +319,16 @@ static const VMStateDescription vmstate_memhp_state = {
>       }
>   };
>   
> +static const VMStateDescription vmstate_cpuhp_state = {
> +    .name = "acpi-ged/cpuhp",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields      = (VMStateField[]) {
> +        VMSTATE_CPU_HOTPLUG(cpuhp_state, AcpiGedState),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>   static const VMStateDescription vmstate_ged_state = {
>       .name = "acpi-ged-state",
>       .version_id = 1,
> @@ -367,6 +377,7 @@ static const VMStateDescription vmstate_acpi_ged = {
>       },
>       .subsections = (const VMStateDescription * []) {
>           &vmstate_memhp_state,
> +        &vmstate_cpuhp_state,
>           &vmstate_ghes_state,
>           NULL
>       }
> diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
> index b87ebfdf4b..786a30d6d4 100644
> --- a/include/hw/acpi/cpu.h
> +++ b/include/hw/acpi/cpu.h
> @@ -22,6 +22,8 @@ typedef struct AcpiCpuStatus {
>       uint64_t arch_id;
>       bool is_inserting;
>       bool is_removing;
> +    bool is_present;
> +    bool is_enabled;
>       bool fw_remove;
>       uint32_t ost_event;
>       uint32_t ost_status;

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan
  2023-09-26 10:04 ` [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan Salil Mehta via
@ 2023-09-28 23:35   ` Gavin Shan
  2023-10-16 23:01     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28 23:35 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On 9/26/23 20:04, Salil Mehta wrote:
> OSPM evaluates the _EVT method to map the event. The CPU hotplug event
> eventually results in the start of the CPU scan. The scan figures out the CPU
> and the kind of event (plug/unplug) and notifies it back to the guest.
> 
> The change in this patch updates the GED AML _EVT method with a call to
> \\_SB.CPUS.CSCN which will do the above.

> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/generic_event_device.c | 4 ++++
>   include/hw/acpi/cpu_hotplug.h  | 2 ++
>   2 files changed, 6 insertions(+)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index b84602b238..ad252e6a91 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -108,6 +108,10 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
>                   aml_append(if_ctx, aml_call0(MEMORY_DEVICES_CONTAINER "."
>                                                MEMORY_SLOT_SCAN_METHOD));
>                   break;
> +            case ACPI_GED_CPU_HOTPLUG_EVT:
> +                aml_append(if_ctx, aml_call0(ACPI_CPU_CONTAINER "."
> +                                             ACPI_CPU_SCAN_METHOD));
> +                break;
>               case ACPI_GED_PWR_DOWN_EVT:
>                   aml_append(if_ctx,
>                              aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
> diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
> index 48b291e45e..ef631750b4 100644
> --- a/include/hw/acpi/cpu_hotplug.h
> +++ b/include/hw/acpi/cpu_hotplug.h
> @@ -20,6 +20,8 @@
>   #include "hw/acpi/cpu.h"
>   
>   #define ACPI_CPU_HOTPLUG_REG_LEN 12
> +#define ACPI_CPU_SCAN_METHOD "CSCN"
> +#define ACPI_CPU_CONTAINER "\\_SB.CPUS"
>   
>   typedef struct AcpiCpuHotplug {
>       Object *device;



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs
  2023-09-26 10:04 ` [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
@ 2023-09-28 23:43   ` Gavin Shan
  2023-10-16 23:15     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28 23:43 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Changes required while QEMU builds the MADT table to accommodate disabled
> possible vCPUs. This info shall be used by the guest kernel to size up its
> resources during boot time. This pre-sizing of the guest kernel, done on
> possible vCPUs, will facilitate hotplug of the disabled vCPUs.
> 
> This change also caters to the ACPI MADT GIC CPU Interface flag related changes
> recently introduced in the UEFI ACPI 6.5 Specification, which allow deferred
> virtual CPU online'ing in the guest kernel.
> 
> Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt-acpi-build.c | 36 ++++++++++++++++++++++++++++++------
>   1 file changed, 30 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index d27df5030e..cbccd2ca2d 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -700,6 +700,29 @@ static void build_append_gicr(GArray *table_data, uint64_t base, uint32_t size)
>       build_append_int_noprefix(table_data, size, 4); /* Discovery Range Length */
>   }
>   
> +static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> +
> +    /* can only exist in 'enabled' state */
> +    if (!mc->has_hotpluggable_cpus) {
> +        return 1;
> +    }
> +
> +    /*
> +     * ARM GIC CPU Interface can be 'online-capable' or 'enabled' at boot
> +     * We MUST set 'online-capable' Bit for all hotpluggable CPUs except the
                                        ^^^
                                        bit
> +     * first/boot CPU. Cold-booted CPUs without 'Id' can also be unplugged.
> +     * Though as-of-now this is only used as a debugging feature.
> +     *
> +     *   UEFI ACPI Specification 6.5
> +     *   Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
> +     *   Table:   5.37 GICC CPU Interface Flags
> +     *   Link: https://uefi.org/specs/ACPI/6.5
> +     */
> +    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
> +}
> +

I don't understand how a cold-booted CPU can be hot removed if it doesn't
have an ID. Besides, how is cpu->cpu_index guaranteed to be zero for the first
cold-booted CPU?

>   static void
>   build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>   {
> @@ -726,12 +749,13 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>       build_append_int_noprefix(table_data, vms->gic_version, 1);
>       build_append_int_noprefix(table_data, 0, 3);   /* Reserved */
>   
> -    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
> -        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
> +    for (i = 0; i < MACHINE(vms)->smp.max_cpus; i++) {
> +        CPUState *cpu = qemu_get_possible_cpu(i);
>           uint64_t physical_base_address = 0, gich = 0, gicv = 0;
>           uint32_t vgic_interrupt = vms->virt ? PPI(ARCH_GIC_MAINT_IRQ) : 0;
> -        uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
> -                                             PPI(VIRTUAL_PMU_IRQ) : 0;
> +        uint32_t pmu_interrupt = vms->pmu ? PPI(VIRTUAL_PMU_IRQ) : 0;
> +        uint32_t flags = virt_acpi_get_gicc_flags(cpu);
> +        uint64_t mpidr = qemu_get_cpu_archid(i);
>   

qemu_get_cpu_archid() can be dropped since it's called only once. The MPIDR
can be fetched from ms->possible_cpus->cpus[i].arch_id, which has been
initialized beforehand.

>           if (vms->gic_version == VIRT_GIC_VERSION_2) {
>               physical_base_address = memmap[VIRT_GIC_CPU].base;
> @@ -746,7 +770,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>           build_append_int_noprefix(table_data, i, 4);    /* GIC ID */
>           build_append_int_noprefix(table_data, i, 4);    /* ACPI Processor UID */
>           /* Flags */
> -        build_append_int_noprefix(table_data, 1, 4);    /* Enabled */
> +        build_append_int_noprefix(table_data, flags, 4);
>           /* Parking Protocol Version */
>           build_append_int_noprefix(table_data, 0, 4);
>           /* Performance Interrupt GSIV */
> @@ -760,7 +784,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>           build_append_int_noprefix(table_data, vgic_interrupt, 4);
>           build_append_int_noprefix(table_data, 0, 8);    /* GICR Base Address*/
>           /* MPIDR */
> -        build_append_int_noprefix(table_data, armcpu->mp_affinity, 8);
> +        build_append_int_noprefix(table_data, mpidr, 8);
>           /* Processor Power Efficiency Class */
>           build_append_int_noprefix(table_data, 0, 1);
>           /* Reserved */

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional
  2023-09-26 10:04 ` [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional Salil Mehta via
@ 2023-09-28 23:50   ` Gavin Shan
  2023-10-16 23:17     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28 23:50 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On 9/26/23 20:04, Salil Mehta wrote:
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> 
> The GICC interface on arm64 vCPUs is statically defined in the MADT, and
> doesn't require a _MAT entry. Although the GICC is indicated as present
> by the MADT entry, it can only be used from vCPU sysregs, which aren't
> accessible until hot-add.
> 
> Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Co-developed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/acpi/cpu.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 

With following nits addressed:

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index e1299696d3..217db99538 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -715,11 +715,13 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>               aml_append(dev, method);
>   
>               /* build _MAT object */
> -            assert(adevc && adevc->madt_cpu);
> -            adevc->madt_cpu(i, arch_ids, madt_buf,
> -                            true); /* set enabled flag */
> -            aml_append(dev, aml_name_decl("_MAT",
> -                aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
> +            if (adevc && adevc->madt_cpu) {
> +                assert(adevc && adevc->madt_cpu);
> +                adevc->madt_cpu(i, arch_ids, madt_buf,
> +                                true); /* set enabled flag */
> +                aml_append(dev, aml_name_decl("_MAT",
> +                    aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
> +            }
>               g_array_free(madt_buf, true);
>   
>               if (CPU(arch_ids->cpus[i].cpu) != first_cpu) {

It may be worth adding a comment to mention that _MAT isn't needed on aarch64.

                /* Build _MAT object, which isn't needed by aarch64 */

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init
  2023-09-26 10:04 ` [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
@ 2023-09-28 23:57   ` Gavin Shan
  2023-10-16 23:28     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-28 23:57 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> During machvirt_init(), QOM ARMCPU objects are also pre-created along with the
> corresponding KVM vCPUs in the host for all possible vCPUs. This is necessary
> because of an architectural constraint: KVM restricts the deferred creation of
> KVM vCPUs and VGIC initialization/sizing after VM init. Hence, the VGIC is
> pre-sized with possible vCPUs.
> 
> After initialization of the machine is complete, disabled possible KVM vCPUs
> are parked at the per-virt-machine list "kvm_parked_vcpus" and we release the
> QOM ARMCPU objects for the disabled vCPUs. These shall be re-created when a
> vCPU is hotplugged again. The QOM ARMCPU object is then re-attached to the
> corresponding parked KVM vCPU.
> 
> Alternatively, we could have never released the QOM CPU objects and kept on
> reusing them. This approach might require some modifications to the
> qdevice_add() interface to fetch the old ARMCPU object instead of creating a
> new one for the hotplug request.
> 
> Each of the above approaches comes with its own pros and cons. This prototype
> uses the 1st approach (suggestions are welcome!).
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index f1bee569d5..3b068534a8 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1965,6 +1965,7 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>   {
>       CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
>       int max_cpus = MACHINE(vms)->smp.max_cpus;
> +    MachineState *ms = MACHINE(vms);
>       bool aarch64, steal_time;
>       CPUState *cpu;
>       int n;
> @@ -2025,6 +2026,37 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>               }
>           }
>       }
> +
> +    if (kvm_enabled() || tcg_enabled()) {
> +        for (n = 0; n < possible_cpus->len; n++) {
> +            cpu = qemu_get_possible_cpu(n);
> +
> +            /*
> +             * Now, GIC has been sized with possible CPUs and we dont require
> +             * disabled vCPU objects to be represented in the QOM. Release the
> +             * disabled ARMCPU objects earlier used during init for pre-sizing.
> +             *
> +             * We fake to the guest through ACPI about the presence(_STA.PRES=1)
> +             * of these non-existent vCPUs at VMM/qemu and present these as
> +             * disabled vCPUs(_STA.ENA=0) so that they cant be used. These vCPUs
> +             * can be later added to the guest through hotplug exchanges when
> +             * ARMCPU objects are created back again using 'device_add' QMP
> +             * command.
> +             */
> +            /*
> +             * RFC: Question: Other approach could've been to keep them forever
> +             * and release it only once when qemu exits as part of finalize or
> +             * when new vCPU is hotplugged. In the later old could be released
> +             * for the newly created object for the same vCPU?
> +             */
> +            if (!qemu_enabled_cpu(cpu)) {
> +                CPUArchId *cpu_slot;
> +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
> +                cpu_slot->cpu = NULL;
> +                object_unref(OBJECT(cpu));
> +            }
> +        }
> +    }
>   }
>   

Don't we need to release those CPU instances for hvf and qtest as well?
Besides, I think it's hard to reuse those objects because they're managed
by QOM, which is almost transparent to us, correct?

>   static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework
  2023-09-26 10:04 ` [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
@ 2023-09-29  0:20   ` Gavin Shan
  2023-10-16 23:40     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-29  0:20 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> Add CPU hot-unplug hooks and update hotplug hooks with additional sanity checks
> for use in hotplug paths.
> 
> Note: the functional contents of the hooks (now left with TODO comments) shall
> be gradually filled in by subsequent patches, in an incremental approach to
> patch and logic building, roughly as follows:
> 1. (Un-)wiring of interrupts between vCPU<->GIC
> 2. Sending events to Guest for hot-(un)plug so that guest can take appropriate
>     actions.
> 3. Notifying GIC about hot-(un)plug action so that vCPU could be (un-)stitched
>     to the GIC CPU interface.
> 4. Updating the Guest with Next boot info for this vCPU in the firmware.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 104 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 3b068534a8..dce02136cb 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -81,6 +81,7 @@
>   #include "hw/virtio/virtio-iommu.h"
>   #include "hw/char/pl011.h"
>   #include "qemu/guest-random.h"
> +#include "qapi/qmp/qdict.h"
>   
>   #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>       static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> @@ -2985,12 +2986,23 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>   {
>       VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>       MachineState *ms = MACHINE(hotplug_dev);
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>       ARMCPU *cpu = ARM_CPU(dev);
>       CPUState *cs = CPU(dev);
>       CPUArchId *cpu_slot;
>       int32_t min_cpuid = 0;
>       int32_t max_cpuid;
>   
> +    if (dev->hotplugged && !vms->acpi_dev) {
> +        error_setg(errp, "GED acpi device does not exists");
> +        return;
> +    }
> +
> +    if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
> +        error_setg(errp, "CPU hotplug not supported on this machine");
> +        return;
> +    }
> +

I guess these can be combined to:

        if (dev->hotplugged && (!mc->has_hotpluggable_cpus || !vms->acpi_dev)) {
            error_setg(errp, "CPU hotplug not supported or GED ACPI device not exist");
        }

Besides, do we need to check (vms->gic_version == VIRT_GIC_VERSION_3)?

>       /* sanity check the cpu */
>       if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
>           error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> @@ -3039,6 +3051,22 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       }
>       virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
>   
> +    /*
> +     * Fix the GIC for this new vCPU being plugged. The QOM CPU object for the
> +     * new vCPU need to be updated in the corresponding QOM GICv3CPUState object
> +     * We also need to re-wire the IRQs for this new CPU object. This update
> +     * is limited to the QOM only and does not affects the KVM. Later has
> +     * already been pre-sized with possible CPU at VM init time. This is a
> +     * workaround to the constraints posed by ARM architecture w.r.t supporting
> +     * CPU Hotplug. Specification does not exist for the later.
> +     * This patch-up is required both for {cold,hot}-plugged vCPUs. Cold-inited
> +     * vCPUs have their GIC state initialized during machvit_init().
> +     */
> +    if (vms->acpi_dev) {
> +        /* TODO: update GIC about this hotplug change here */
> +        /* TODO: wire the GIC<->CPU irqs */
> +    }
> +

Looking at these 'TODO's, it seems you need to order the patches so that the
preparatory patches come ahead of this one. That way, the 'TODO' comments can
be avoided.

>       /*
>        * To give persistent presence view of vCPUs to the guest, ACPI might need
>        * to fake the presence of the vCPUs to the guest but keep them disabled.
> @@ -3050,6 +3078,7 @@ static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>   static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                             Error **errp)
>   {
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>       MachineState *ms = MACHINE(hotplug_dev);
>       CPUState *cs = CPU(dev);
>       CPUArchId *cpu_slot;
> @@ -3058,10 +3087,81 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
>       cpu_slot->cpu = OBJECT(dev);
>   
> +    /*
> +     * Update the ACPI Hotplug state both for vCPUs being {hot,cold}-plugged.
> +     * vCPUs can be cold-plugged using '-device' option. For vCPUs being hot
> +     * plugged, guest is also notified.
> +     */
> +    if (vms->acpi_dev) {
> +        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
> +        /* TODO: register cpu for reset & update F/W info for the next boot */
> +    }
> +

We needn't validate vms->acpi_dev again since it has been done in pre_plug().

>       cs->disabled = false;
>       return;
>   }
>   
> +static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
> +                                    DeviceState *dev, Error **errp)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    ARMCPU *cpu = ARM_CPU(dev);
> +    CPUState *cs = CPU(dev);
> +
> +    if (!vms->acpi_dev || !dev->realized) {
> +        error_setg(errp, "GED does not exists or device is not realized!");
> +        return;
> +    }
> +
> +    if (!mc->has_hotpluggable_cpus) {
> +        error_setg(errp, "CPU hot(un)plug not supported on this machine");
> +        return;
> +    }
> +
> +    if (cs->cpu_index == first_cpu->cpu_index) {
> +        error_setg(errp, "Boot CPU(id%d=%d:%d:%d:%d) hot-unplug not supported",
> +                   first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id,
> +                   cpu->core_id, cpu->thread_id);
> +        return;
> +    }
> +
> +    /* TODO: request cpu hotplug from guest */
> +
> +    return;
> +}
> +
> +static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                            Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    MachineState *ms = MACHINE(hotplug_dev);
> +    CPUState *cs = CPU(dev);
> +    CPUArchId *cpu_slot;
> +
> +    if (!vms->acpi_dev || !dev->realized) {
> +        error_setg(errp, "GED does not exists or device is not realized!");
> +        return;
> +    }
> +
> +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> +
> +    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
> +
> +    /* TODO: unwire the gic-cpu irqs here */
> +    /* TODO: update the GIC about this hot unplug change */
> +
> +    /* TODO: unregister cpu for reset & update F/W info for the next boot */
> +

Same as above.

> +    qobject_unref(dev->opts);
> +    dev->opts = NULL;
> +
> +    cpu_slot->cpu = NULL;
> +    cs->disabled = true;
> +
> +    return;
> +}
> +
>   static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
>                                               DeviceState *dev, Error **errp)
>   {
> @@ -3185,6 +3285,8 @@ static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
>       } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>           virtio_md_pci_unplug_request(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev),
>                                        errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        virt_cpu_unplug_request(hotplug_dev, dev, errp);
>       } else {
>           error_setg(errp, "device unplug request for unsupported device"
>                      " type: %s", object_get_typename(OBJECT(dev)));
> @@ -3198,6 +3300,8 @@ static void virt_machine_device_unplug_cb(HotplugHandler *hotplug_dev,
>           virt_dimm_unplug(hotplug_dev, dev, errp);
>       } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
>           virtio_md_pci_unplug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +        virt_cpu_unplug(hotplug_dev, dev, errp);
>       } else {
>           error_setg(errp, "virt: device unplug for unsupported device"
>                      " type: %s", object_get_typename(OBJECT(dev)));

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  2023-09-26 10:04 ` [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
@ 2023-09-29  0:30   ` Gavin Shan
  2023-10-16 23:48     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-29  0:30 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:04, Salil Mehta wrote:
> During any vCPU hot-(un)plug, the running guest VM needs to be informed about
> the new vCPU being added, or requested to delete a vCPU which is already part
> of the guest VM. This is done using the ACPI GED event, which eventually gets
> demultiplexed into a CPU hotplug event and further into the specific
> hot-(un)plug event of a particular vCPU.
> 
> This change adds the ACPI calls to the existing hot-(un)plug hooks to trigger
> ACPI GED events from QEMU to the guest VM.
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c | 33 ++++++++++++++++++++++++++++++---
>   1 file changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b447e86fb6..6f5ee4a1c6 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -3157,6 +3157,7 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>       MachineState *ms = MACHINE(hotplug_dev);
>       CPUState *cs = CPU(dev);
> +    Error *local_err = NULL;
>       CPUArchId *cpu_slot;
>   
>       /* insert the cold/hot-plugged vcpu in the slot */
> @@ -3169,12 +3170,20 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>        * plugged, guest is also notified.
>        */
>       if (vms->acpi_dev) {
> -        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
> +        HotplugHandlerClass *hhc;
> +        /* update acpi hotplug state and send cpu hotplug event to guest */
> +        hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
> +        hhc->plug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
> +        if (local_err) {
> +            goto fail;
> +        }
>           /* TODO: register cpu for reset & update F/W info for the next boot */
>       }
>   
>       cs->disabled = false;
>       return;
> +fail:
> +    error_propagate(errp, local_err);
>   }
>   

The 'fail' label isn't needed since it's used only once; we can bail out early:

     if (local_err) {
        error_propagate(errp, local_err);
        return;
     }

>   static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
> @@ -3182,8 +3191,10 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
>   {
>       MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
>       VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> +    HotplugHandlerClass *hhc;
>       ARMCPU *cpu = ARM_CPU(dev);
>       CPUState *cs = CPU(dev);
> +    Error *local_err = NULL;
>   
>       if (!vms->acpi_dev || !dev->realized) {
>           error_setg(errp, "GED does not exists or device is not realized!");
> @@ -3202,9 +3213,16 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
>           return;
>       }
>   
> -    /* TODO: request cpu hotplug from guest */
> +    /* request cpu hotplug from guest */
> +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
> +    hhc->unplug_request(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
> +    if (local_err) {
> +        goto fail;
> +    }
>   
>       return;
> +fail:
> +    error_propagate(errp, local_err);
>   }
>   

Same as above: the 'fail' label isn't needed. Besides, the trailing 'return'
isn't needed.

>   static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> @@ -3212,7 +3230,9 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>   {
>       VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>       MachineState *ms = MACHINE(hotplug_dev);
> +    HotplugHandlerClass *hhc;
>       CPUState *cs = CPU(dev);
> +    Error *local_err = NULL;
>       CPUArchId *cpu_slot;
>   
>       if (!vms->acpi_dev || !dev->realized) {
> @@ -3222,7 +3242,12 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>   
>       cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
>   
> -    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
> +    /* update the acpi cpu hotplug state for cpu hot-unplug */
> +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
> +    hhc->unplug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
> +    if (local_err) {
> +        goto fail;
> +    }
>   
>       unwire_gic_cpu_irqs(vms, cs);
>       virt_update_gic(vms, cs);
> @@ -3236,6 +3261,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>       cs->disabled = true;
>   
>       return;
> +fail:
> +    error_propagate(errp, local_err);
>   }
>   

Same as above.
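For illustration, the early-return shape being suggested can be sketched outside QEMU with stand-in Error helpers (everything below is a mock of the real qapi/error API, just to show the control flow):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for QEMU's Error plumbing -- not the real qapi/error API. */
typedef struct Error { const char *msg; } Error;

static void error_propagate(Error **dst_errp, Error *local_err)
{
    if (dst_errp && !*dst_errp) {
        *dst_errp = local_err;
    }
}

static Error unplug_failed = { "unplug rejected" };

/* Mock handler: fails for a negative cpu index. */
static Error *hhc_unplug(int cpu_index)
{
    return cpu_index < 0 ? &unplug_failed : NULL;
}

/* The suggested shape: propagate and return early, no 'fail' label. */
static void cpu_unplug(int cpu_index, Error **errp)
{
    Error *local_err = hhc_unplug(cpu_index);

    if (local_err) {
        error_propagate(errp, local_err);
        return;
    }
    /* ... the rest of the teardown would go here ... */
}
```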

>   static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 34/37] target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
  2023-09-26 10:36   ` [PATCH RFC V2 34/37] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
@ 2023-09-29  4:15     ` Gavin Shan
  2023-10-17  0:03       ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-29  4:15 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:36, Salil Mehta wrote:
> From: Author Salil Mehta <salil.mehta@huawei.com>
> 
> Add registration and Handling of HVC/SMC hypercall exits to VMM
> 
> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   target/arm/arm-powerctl.c   | 51 +++++++++++++++++++++++++++++-------
>   target/arm/helper.c         |  2 +-
>   target/arm/internals.h      | 11 --------
>   target/arm/kvm.c            | 52 +++++++++++++++++++++++++++++++++++++
>   target/arm/kvm64.c          | 46 +++++++++++++++++++++++++++++---
>   target/arm/kvm_arm.h        | 13 ++++++++++
>   target/arm/meson.build      |  1 +
>   target/arm/{tcg => }/psci.c |  8 ++++++
>   target/arm/tcg/meson.build  |  4 ---
>   9 files changed, 160 insertions(+), 28 deletions(-)
>   rename target/arm/{tcg => }/psci.c (97%)
> 
> diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
> index 326a03153d..0184c7fb09 100644
> --- a/target/arm/arm-powerctl.c
> +++ b/target/arm/arm-powerctl.c
> @@ -16,6 +16,7 @@
>   #include "qemu/log.h"
>   #include "qemu/main-loop.h"
>   #include "sysemu/tcg.h"
> +#include "hw/boards.h"
>   
>   #ifndef DEBUG_ARM_POWERCTL
>   #define DEBUG_ARM_POWERCTL 0
> @@ -28,18 +29,37 @@
>           } \
>       } while (0)
>   
> +static CPUArchId *arm_get_archid_by_id(uint64_t id)
> +{
> +    int n;
> +    CPUArchId *arch_id;
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +
> +    /*
> +     * At this point disabled CPUs don't have a CPUState, but their CPUArchId
> +     * exists.
> +     *
> +     * TODO: Is arch_id == mp_affinity? This needs work.
> +     */
> +    for (n = 0; n < ms->possible_cpus->len; n++) {
> +        arch_id = &ms->possible_cpus->cpus[n];
> +
> +        if (arch_id->arch_id == id) {
> +            return arch_id;
> +        }
> +    }
> +    return NULL;
> +}
> +

The @arch_id should be the same thing as @mp_affinity except for the boot CPU.
For the boot CPU, its value is fetched from MPIDR, which is determined by
cs->cpu_index and passed to the host via ioctl(CREATE_VCPU). Besides, another
similar function, qemu_get_cpu_archid(), exists in cpus-common.c. I think
they can be combined. Again, all this information inherited from
ms->possible_cpus may be better managed at the board level, like the
vCPU states.

>   CPUState *arm_get_cpu_by_id(uint64_t id)
>   {
> -    CPUState *cpu;
> +    CPUArchId *arch_id;
>   
>       DPRINTF("cpu %" PRId64 "\n", id);
>   
> -    CPU_FOREACH(cpu) {
> -        ARMCPU *armcpu = ARM_CPU(cpu);
> -
> -        if (armcpu->mp_affinity == id) {
> -            return cpu;
> -        }
> +    arch_id = arm_get_archid_by_id(id);
> +    if (arch_id && arch_id->cpu) {
> +        return CPU(arch_id->cpu);
>       }
>   
>       qemu_log_mask(LOG_GUEST_ERROR,
> @@ -148,6 +168,7 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
>   {
>       CPUState *target_cpu_state;
>       ARMCPU *target_cpu;
> +    CPUArchId *arch_id;
>       struct CpuOnInfo *info;
>   
>       assert(qemu_mutex_iothread_locked());
> @@ -168,12 +189,24 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uint64_t context_id,
>       }
>   
>       /* Retrieve the cpu we are powering up */
> -    target_cpu_state = arm_get_cpu_by_id(cpuid);
> -    if (!target_cpu_state) {
> +    arch_id = arm_get_archid_by_id(cpuid);
> +    if (!arch_id) {
>           /* The cpu was not found */
>           return QEMU_ARM_POWERCTL_INVALID_PARAM;
>       }
>   
> +    target_cpu_state = CPU(arch_id->cpu);
> +    if (!qemu_enabled_cpu(target_cpu_state)) {
> +        /*
> +         * The cpu is not plugged in or disabled. We should return appropriate
> +         * value as introduced in DEN0022E PSCI 1.2 issue E
                                                        ^^^^^^^
                                                        issue E, which is QEMU_PSCI_RET_DENIED.
> +         */
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "[ARM]%s: Denying attempt to online removed/disabled "
> +                      "CPU%" PRId64"\n", __func__, cpuid);
> +        return QEMU_ARM_POWERCTL_IS_OFF;
> +    }
> +
>       target_cpu = ARM_CPU(target_cpu_state);
>       if (target_cpu->power_state == PSCI_ON) {
>           qemu_log_mask(LOG_GUEST_ERROR,
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 272d6ba139..4d396426f2 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -11187,7 +11187,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
>                         env->exception.syndrome);
>       }
>   
> -    if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) {
> +    if (arm_is_psci_call(cpu, cs->exception_index)) {
>           arm_handle_psci_call(cpu);
>           qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n");
>           return;

We may still limit the capability to handle PSCI calls to TCG and KVM,
meaning HVF and QTest won't have this capability.

> diff --git a/target/arm/internals.h b/target/arm/internals.h
> index fe330e89e7..7ffefc2d58 100644
> --- a/target/arm/internals.h
> +++ b/target/arm/internals.h
> @@ -305,21 +305,10 @@ vaddr arm_adjust_watchpoint_address(CPUState *cs, vaddr addr, int len);
>   /* Callback function for when a watchpoint or breakpoint triggers. */
>   void arm_debug_excp_handler(CPUState *cs);
>   
> -#if defined(CONFIG_USER_ONLY) || !defined(CONFIG_TCG)
> -static inline bool arm_is_psci_call(ARMCPU *cpu, int excp_type)
> -{
> -    return false;
> -}
> -static inline void arm_handle_psci_call(ARMCPU *cpu)
> -{
> -    g_assert_not_reached();
> -}
> -#else
>   /* Return true if the r0/x0 value indicates that this SMC/HVC is a PSCI call. */
>   bool arm_is_psci_call(ARMCPU *cpu, int excp_type);
>   /* Actually handle a PSCI call */
>   void arm_handle_psci_call(ARMCPU *cpu);
> -#endif
>   
>   /**
>    * arm_clear_exclusive: clear the exclusive monitor
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 8e7c68af6a..6f3fd5aecd 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -250,6 +250,7 @@ int kvm_arm_get_max_vm_ipa_size(MachineState *ms, bool *fixed_ipa)
>   int kvm_arch_init(MachineState *ms, KVMState *s)
>   {
>       int ret = 0;
> +
   ^^^^
Unnecessary change.

>       /* For ARM interrupt delivery is always asynchronous,
>        * whether we are using an in-kernel VGIC or not.
>        */
> @@ -280,6 +281,22 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>           }
>       }
>   
> +    /*
> +     * To be able to handle PSCI CPU ON calls in QEMU, we need to install SMCCC
                                         ^^
                                         ON/OFF
> +     * filter in the Host KVM. This is required to support features like
> +     * virtual CPU Hotplug on ARM platforms.
> +     */
> +    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN64_CPU_ON,
> +                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
> +        error_report("CPU On PSCI-to-user-space fwd filter install failed");
> +        abort();
> +    }
> +    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN_CPU_OFF,
> +                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
> +        error_report("CPU Off PSCI-to-user-space fwd filter install failed");
> +        abort();
> +    }
> +
>       kvm_arm_init_debug(s);
>   
>       return ret;

PSCI CPU_ON and CPU_OFF will be unconditionally handled by QEMU if KVM is
enabled, even if vCPU hotplug isn't supported on the hw/arm/virt board. Do we
need to enable this only when vCPU hotplug is supported?

> @@ -952,6 +969,38 @@ static int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
>       return -1;
>   }
>   
> +static int kvm_arm_handle_hypercall(CPUState *cs, struct kvm_run *run)
> +{
> +    ARMCPU *cpu = ARM_CPU(cs);
> +    CPUARMState *env = &cpu->env;
> +
> +    kvm_cpu_synchronize_state(cs);
> +
> +    /*
> +     * hard coding immediate to 0 as we dont expect non-zero value as of now
                                            ^^^^
                                            don't
> +     * This might change in future versions. Hence, KVM_GET_ONE_REG  could be
> +     * used in such cases but it must be enhanced then only synchronize will
> +     * also fetch ESR_EL2 value.
> +     */
> +    if (run->hypercall.flags == KVM_HYPERCALL_EXIT_SMC) {
> +        cs->exception_index = EXCP_SMC;
> +        env->exception.syndrome = syn_aa64_smc(0);
> +    } else {
> +        cs->exception_index = EXCP_HVC;
> +        env->exception.syndrome = syn_aa64_hvc(0);
> +    }
> +    env->exception.target_el = 1;
> +    qemu_mutex_lock_iothread();
> +    arm_cpu_do_interrupt(cs);
> +    qemu_mutex_unlock_iothread();
> +
> +    /*
> +     * For PSCI, exit the kvm_run loop and process the work. Especially
> +     * important if this was a CPU_OFF command and we can't return to the guest.
> +     */
> +    return EXCP_INTERRUPT;
> +}
> +
>   int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>   {
>       int ret = 0;
> @@ -967,6 +1016,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>           ret = kvm_arm_handle_dabt_nisv(cs, run->arm_nisv.esr_iss,
>                                          run->arm_nisv.fault_ipa);
>           break;
> +    case KVM_EXIT_HYPERCALL:
> +          ret = kvm_arm_handle_hypercall(cs, run);
> +        break;
>       default:
>           qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
>                         __func__, run->exit_reason);
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 38de0b4148..efe24e3f90 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -113,6 +113,25 @@ bool kvm_arm_hw_debug_active(CPUState *cs)
>       return ((cur_hw_wps > 0) || (cur_hw_bps > 0));
>   }
>   
> +static bool kvm_arm_set_vm_attr(struct kvm_device_attr *attr, const char *name)
> +{
> +    int err;
> +
> +    err = kvm_vm_ioctl(kvm_state, KVM_HAS_DEVICE_ATTR, attr);
> +    if (err != 0) {
> +        error_report("%s: KVM_HAS_DEVICE_ATTR: %s", name, strerror(-err));
> +        return false;
> +    }
> +
> +    err = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, attr);
> +    if (err != 0) {
> +        error_report("%s: KVM_SET_DEVICE_ATTR: %s", name, strerror(-err));
> +        return false;
> +    }
> +
> +    return true;
> +}
> +
>   static bool kvm_arm_set_device_attr(CPUState *cs, struct kvm_device_attr *attr,
>                                       const char *name)
>   {
> @@ -183,6 +202,28 @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa)
>       }
>   }
>   
> +int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction)
> +{
> +    struct kvm_smccc_filter filter = {
> +        .base = func,
> +        .nr_functions = 1,
> +        .action = faction,
> +    };
> +    struct kvm_device_attr attr = {
> +        .group = KVM_ARM_VM_SMCCC_CTRL,
> +        .attr = KVM_ARM_VM_SMCCC_FILTER,
> +        .flags = 0,
> +        .addr = (uintptr_t)&filter,
> +    };
> +
> +    if (!kvm_arm_set_vm_attr(&attr, "SMCCC Filter")) {
> +        error_report("failed to set SMCCC filter in KVM Host");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>   static int read_sys_reg32(int fd, uint32_t *pret, uint64_t id)
>   {
>       uint64_t ret;
> @@ -633,9 +674,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>       }
>   
>       /*
> -     * When KVM is in use, PSCI is emulated in-kernel and not by qemu.
> -     * Currently KVM has its own idea about MPIDR assignment, so we
> -     * override our defaults with what we get from KVM.
> +     * KVM may emulate PSCI in-kernel. Currently KVM has its own idea about
> +     * MPIDR assignment, so we override our defaults with what we get from KVM.
>        */
>       ret = kvm_get_one_reg(cs, ARM64_SYS_REG(ARM_CPU_ID_MPIDR), &mpidr);
>       if (ret) {
> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> index 31408499b3..bf4df54c96 100644
> --- a/target/arm/kvm_arm.h
> +++ b/target/arm/kvm_arm.h
> @@ -388,6 +388,15 @@ void kvm_arm_pvtime_init(CPUState *cs, uint64_t ipa);
>   
>   int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
>   
> +/**
> + * kvm_arm_set_smccc_filter
> + * @func: funcion
> + * @faction: SMCCC filter action(handle, deny, fwd-to-user) to be deployed
> + *
> + * Sets the ARMs SMC-CC filter in KVM Host for selective hypercall exits
> + */
> +int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction);
> +
>   #else
>   
>   /*
> @@ -462,6 +471,10 @@ static inline uint32_t kvm_arm_sve_get_vls(CPUState *cs)
>       g_assert_not_reached();
>   }
>   
> +static inline int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction)
> +{
> +    g_assert_not_reached();
> +}
>   #endif
>   
>   /**
> diff --git a/target/arm/meson.build b/target/arm/meson.build
> index e645e456da..fdfc8b958f 100644
> --- a/target/arm/meson.build
> +++ b/target/arm/meson.build
> @@ -23,6 +23,7 @@ arm_system_ss.add(files(
>     'arm-qmp-cmds.c',
>     'cortex-regs.c',
>     'machine.c',
> +  'psci.c',
>     'ptw.c',
>   ))
>   
> diff --git a/target/arm/tcg/psci.c b/target/arm/psci.c
> similarity index 97%
> rename from target/arm/tcg/psci.c
> rename to target/arm/psci.c
> index 6c1239bb96..a8690a16af 100644
> --- a/target/arm/tcg/psci.c
> +++ b/target/arm/psci.c
> @@ -21,7 +21,9 @@
>   #include "exec/helper-proto.h"
>   #include "kvm-consts.h"
>   #include "qemu/main-loop.h"
> +#include "qemu/error-report.h"
>   #include "sysemu/runstate.h"
> +#include "sysemu/tcg.h"
>   #include "internals.h"
>   #include "arm-powerctl.h"
>   
> @@ -157,6 +159,11 @@ void arm_handle_psci_call(ARMCPU *cpu)
>       case QEMU_PSCI_0_1_FN_CPU_SUSPEND:
>       case QEMU_PSCI_0_2_FN_CPU_SUSPEND:
>       case QEMU_PSCI_0_2_FN64_CPU_SUSPEND:
> +       if (!tcg_enabled()) {
> +            warn_report("CPU suspend not supported in non-tcg mode");
> +            break;
> +       }
> +#ifdef CONFIG_TCG
>           /* Affinity levels are not supported in QEMU */
>           if (param[1] & 0xfffe0000) {
>               ret = QEMU_PSCI_RET_INVALID_PARAMS;
> @@ -169,6 +176,7 @@ void arm_handle_psci_call(ARMCPU *cpu)
>               env->regs[0] = 0;
>           }
>           helper_wfi(env, 4);
> +#endif
>           break;
>       case QEMU_PSCI_1_0_FN_PSCI_FEATURES:
>           switch (param[1]) {
> diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
> index 6fca38f2cc..ad3cfcb3bd 100644
> --- a/target/arm/tcg/meson.build
> +++ b/target/arm/tcg/meson.build
> @@ -51,7 +51,3 @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
>     'sme_helper.c',
>     'sve_helper.c',
>   ))
> -
> -arm_system_ss.add(files(
> -  'psci.c',
> -))

Thanks,
Gavin




* Re: [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method
  2023-09-26 10:36   ` [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
@ 2023-09-29  4:23     ` Gavin Shan
  2023-10-17  0:13       ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Gavin Shan @ 2023-09-29  4:23 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 20:36, Salil Mehta wrote:
> Physical CPU hotplug results in (un)setting of ACPI _STA.Present bit. AARCH64
> platforms do not support physical CPU hotplug. Virtual CPU hotplug support being
> implemented toggles ACPI _STA.Enabled Bit to achieve hotplug functionality. This
> is not same as physical CPU hotplug support.
> 
> In future, if ARM architecture supports physical CPU hotplug then the current
> design of virtual CPU hotplug can be used unchanged. Hence, there is a need for
> firmware/VMM/Qemu to support evaluation of platform wide capabilitiy related to
> the *type* of CPU hotplug support present on the platform. OSPM might need this
> during boot time to correctly initialize the CPUs and other related components
> in the kernel.
> 
> NOTE: This implementation will be improved to add the support of *query* in the
> subsequent versions. This is very minimal support to assist kernel.
> 
> ASL for the implemented _OSC method:
> 
> Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
> {
>      CreateDWordField (Arg3, Zero, CDW1)
>      If ((Arg0 == ToUUID ("0811b06e-4a27-44f9-8d60-3cbbc22e7b48") /* Platform-wide Capabilities */))
>      {
>          CreateDWordField (Arg3, 0x04, CDW2)
>          Local0 = CDW2 /* \_SB_._OSC.CDW2 */
>          If ((Arg1 != One))
>          {
>              CDW1 |= 0x08
>          }
> 
>          Local0 &= 0x00800000
>          If ((CDW2 != Local0))
>          {
>              CDW1 |= 0x10
>          }
> 
>          CDW2 = Local0
>      }
>      Else
>      {
>          CDW1 |= 0x04
>      }
> 
>      Return (Arg3)
> }
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 52 insertions(+)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index cbccd2ca2d..377450dd16 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -861,6 +861,55 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
>       build_fadt(table_data, linker, &fadt, vms->oem_id, vms->oem_table_id);
>   }
>   
> +static void build_virt_osc_method(Aml *scope, VirtMachineState *vms)
> +{
> +    Aml *if_uuid, *else_uuid, *if_rev, *if_caps_masked, *method;
> +    Aml *a_cdw1 = aml_name("CDW1");
> +    Aml *a_cdw2 = aml_local(0);
> +
> +    method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
> +    aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
> +
> +    /* match UUID */
> +    if_uuid = aml_if(aml_equal(
> +        aml_arg(0), aml_touuid("0811B06E-4A27-44F9-8D60-3CBBC22E7B48")));
> +
> +    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
> +    aml_append(if_uuid, aml_store(aml_name("CDW2"), a_cdw2));
> +
> +    /* check unknown revision in arg(1) */
> +    if_rev = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1))));
> +    /* set revision error bits,  DWORD1 Bit[3] */
> +    aml_append(if_rev, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
> +    aml_append(if_uuid, if_rev);
> +
> +    /*
> +     * check support for vCPU hotplug type(=enabled) platform-wide capability
> +     * in DWORD2 as sepcified in the below ACPI Specification ECR,
> +     *  # https://bugzilla.tianocore.org/show_bug.cgi?id=4481
> +     */
> +    if (vms->acpi_dev) {
> +        aml_append(if_uuid, aml_and(a_cdw2, aml_int(0x800000), a_cdw2));
> +        /* check if OSPM specified hotplug capability bits were masked */
> +        if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW2"), a_cdw2)));
> +        aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
> +        aml_append(if_uuid, if_caps_masked);
> +    }
> +    aml_append(if_uuid, aml_store(a_cdw2, aml_name("CDW2")));
> +
> +    aml_append(method, if_uuid);
> +    else_uuid = aml_else();
> +
> +    /* set unrecognized UUID error bits, DWORD1 Bit[2] */
> +    aml_append(else_uuid, aml_or(a_cdw1, aml_int(4), a_cdw1));
> +    aml_append(method, else_uuid);
> +
> +    aml_append(method, aml_return(aml_arg(3)));
> +    aml_append(scope, method);
> +
> +    return;
> +}
> +

The check on vms->acpi_dev doesn't seem enough. We may still need to check
mc->has_hotpluggable_cpus, vms->gic_version, etc. Besides, the "return"
at the end of the function isn't needed.

>   /* DSDT */
>   static void
>   build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> @@ -894,6 +943,9 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>       } else {
>           acpi_dsdt_add_cpus(scope, vms);
>       }
> +
> +    build_virt_osc_method(scope, vms);
> +
>       acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
>                          (irqmap[VIRT_UART] + ARM_SPI_BASE));
>       if (vmc->acpi_expose_flash) {

Thanks,
Gavin




* RE: [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2023-09-26 23:57   ` [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
@ 2023-10-02  9:53     ` Salil Mehta via
  2023-10-02  9:53       ` Salil Mehta
  2023-10-03  5:05       ` Gavin Shan
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02  9:53 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,
Many thanks for taking pains to review this patch-set.

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 12:57 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU
> {socket,cluster,core,thread}-id property
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > This shall be used to store user specified topology{socket,cluster,core,thread}
> > and shall be converted to a unique 'vcpu-id' which is used as slot-index during
> > hot(un)plug of vCPU.
> >
> 
> Note that we don't have 'vcpu-id' property. It's actually the index to the array
> ms->possible_cpus->cpus[] and cpu->cpu_index. Please improve the commit log if
> it makes sense.

I can change that, but was it mentioned anywhere in the log that 'vcpu-id' is
a property?


> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c    | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   target/arm/cpu.c |  4 +++
> >   target/arm/cpu.h |  4 +++
> >   3 files changed, 71 insertions(+)
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 7d9dbc2663..57fe97c242 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -221,6 +221,11 @@ static const char *valid_cpus[] = {
> >       ARM_CPU_TYPE_NAME("max"),
> >   };
> >
> > +static int virt_get_socket_id(const MachineState *ms, int cpu_index);
> > +static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
> > +static int virt_get_core_id(const MachineState *ms, int cpu_index);
> > +static int virt_get_thread_id(const MachineState *ms, int cpu_index);
> > +
> >   static bool cpu_type_valid(const char *cpu)
> >   {
> >       int i;
> > @@ -2168,6 +2173,14 @@ static void machvirt_init(MachineState *machine)
> >                             &error_fatal);
> >
> >           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
> > +        object_property_set_int(cpuobj, "socket-id",
> > +                                virt_get_socket_id(machine, n), NULL);
> > +        object_property_set_int(cpuobj, "cluster-id",
> > +                                virt_get_cluster_id(machine, n), NULL);
> > +        object_property_set_int(cpuobj, "core-id",
> > +                                virt_get_core_id(machine, n), NULL);
> > +        object_property_set_int(cpuobj, "thread-id",
> > +                                virt_get_thread_id(machine, n), NULL);
> >
> >           if (!vms->secure) {
> >               object_property_set_bool(cpuobj, "has_el3", false, NULL);
> > @@ -2652,10 +2665,59 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
> >       return socket_id % ms->numa_state->num_nodes;
> >   }
> >
> 
> It seems it's not necessary to keep virt_get_{socket, cluster, core, thread}_id()
> because they're called for once. I would suggest to figure out the socket, cluster,
> core and thread ID through @possible_cpus in machvirt_init(), like below.

It is always good to access properties through accessor functions. Besides, the
main purpose here was to keep the code neat, so I would stick with these.

But because these are not specific to the virt machine, I can move them to some
other place or into a header file so that other architectures can also use
them.


> Besides, we can't always expose property "cluster-id" since cluster in the CPU
> topology isn't always supported, seeing MachineClass::smp_props. Some users may
> want to hide cluster for unknown reasons. 'cluster-id' shouldn't be exposed in
> this case. Otherwise, users may be confused by 'cluster-id' property while it
> has been disabled. For example, a VM is started with the following command lines
> and 'cluster-id' shouldn't be supported in vCPU hot-add.

True. All we are talking about is the space of 4 integers. This is to avoid the
complexity of checks everywhere in the code, by having these variables always
exist with a default value of 0. If the architecture does not define the
property, it will not use these variables. It is a small tradeoff of memory
against maintainability of the code. I would prefer the latter.

We can definitely add some comments where they are declared.


>      -cpu host -smp=maxcpus=2,cpus=1,sockets=2,cores=1,threads=1
>      (qemu) device_add host,id=cpu1,socket-id=1,cluster-id=0,core-id=0,thread-id=0
> 
>      object_property_set_int(cpuobj, "socket-id",
>                              possible_cpus->cpus[i].props.socket_id, NULL);
>      if (mc->smp_props.cluster_supported && mc->smp_props.has_clusters) {
>          object_property_set_int(cpuobj, "cluster-id",
>                                  possible_cpus->cpus[i].props.cluster_id, NULL);
>      }

Exactly, these types of checks can be avoided. They make the code look
unnecessarily complex and ugly.


>      object_property_set_int(cpuobj, "core-id",
>                              possible_cpus->cpus[i].props.core_id, NULL);
>      object_property_set_int(cpuobj, "thread-id",
>                              possible_cpus->cpus[i].props.thread_id, NULL);
> 
> > +static int virt_get_socket_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
> > +}
> > +
> > +static int virt_get_cluster_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
> > +}
> > +
> > +static int virt_get_core_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.core_id;
> > +}
> > +
> > +static int virt_get_thread_id(const MachineState *ms, int cpu_index)
> > +{
> > +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
> > +
> > +    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
> > +}
> > +
> > +static int
> > +virt_get_cpu_id_from_cpu_topo(const MachineState *ms, DeviceState *dev)
> > +{
> > +    int cpu_id, sock_vcpu_num, clus_vcpu_num, core_vcpu_num;
> > +    ARMCPU *cpu = ARM_CPU(dev);
> > +
> > +    /* calculate total logical cpus across socket/cluster/core */
> > +    sock_vcpu_num = cpu->socket_id * (ms->smp.threads * ms->smp.cores *
> > +                    ms->smp.clusters);
> > +    clus_vcpu_num = cpu->cluster_id * (ms->smp.threads * ms->smp.cores);
> > +    core_vcpu_num = cpu->core_id * ms->smp.threads;
> > +
> > +    /* get vcpu-id(logical cpu index) for this vcpu from this topology
> */
> > +    cpu_id = (sock_vcpu_num + clus_vcpu_num + core_vcpu_num) + cpu->thread_id;
> > +
> > +    assert(cpu_id >= 0 && cpu_id < ms->possible_cpus->len);
> > +
> > +    return cpu_id;
> > +}
> > +
> 
> This function is called for once in PATCH[04/37]. I think it needs to be moved
> around to PATCH[04/37].


Yes, we can do that.


> [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common
> to vCPU {cold|hot}-plug
> 
> The function name can be shortened because I don't see the suffix "_from_cpu_topo"
> is too much helpful. I think virt_get_cpu_index() would be good enough since it's
> called for once to return the index in array MachineState::possible_cpus::cpus[]
> and the return value is stored to CPUState::cpu_index

This is not an accessor function. This function derives the unique vcpu-id
from the topology. Hence, the naming is correct. Though I can shorten the
name to something like the below if you wish:

virt_get_cpu_id_from_cpu_topo() -> virt_cpu_id_from_topology()


The name virt_get_cpu_index() suggests the function is something like below,
which it is not:

virt_get_cpu_index(CPUState *cs)
{
    return cs->cpu_index;
}
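For reference, the derivation under discussion can be sketched standalone as below. This is only an illustration with plain-int parameters and a made-up name, not the QEMU API:

```c
#include <assert.h>

/*
 * Illustrative sketch (hypothetical names, not QEMU code) of the
 * flattening done by virt_get_cpu_id_from_cpu_topo(): a (socket,
 * cluster, core, thread) tuple maps to a linear vcpu index.
 */
static int topo_to_vcpu_id(int clusters, int cores, int threads,
                           int socket_id, int cluster_id,
                           int core_id, int thread_id)
{
    /*
     * Equivalent to:
     *   socket_id  * (clusters * cores * threads) +
     *   cluster_id * (cores * threads) +
     *   core_id    * threads +
     *   thread_id
     */
    return ((socket_id * clusters + cluster_id) * cores + core_id)
           * threads + thread_id;
}
```

E.g. with 2 clusters, 2 cores and 2 threads, the tuple (socket 1, cluster 0, core 1, thread 1) maps to index 1*8 + 0*4 + 1*2 + 1 = 11.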



> static int virt_get_cpu_index(const MachineState *ms, ARMCPU *cpu)
> {
>      int index, cpus_in_socket, cpus_in_cluster, cpus_in_core;
> 
>      /*
>       * It's fine to take cluster into account even if it's not supported. In
>       * this case, ms->smp.clusters is always one.
>       */
> }
> 
> >   static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >   {
> >       int n;
> >       unsigned int max_cpus = ms->smp.max_cpus;
> > +    unsigned int smp_threads = ms->smp.threads;
> >       VirtMachineState *vms = VIRT_MACHINE(ms);
> >       MachineClass *mc = MACHINE_GET_CLASS(vms);
> >
> > @@ -2669,6 +2731,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >       ms->possible_cpus->len = max_cpus;
> >       for (n = 0; n < ms->possible_cpus->len; n++) {
> >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
> >           ms->possible_cpus->cpus[n].arch_id =
> >               virt_cpu_mp_affinity(vms, n);
> >
> 
> This initialization seems to accommodate the HMP command "info hotpluggable-cpus".
> It would be nice if it could be mentioned in the commit log.
> 
> > diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> > index 93c28d50e5..1376350416 100644
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -2277,6 +2277,10 @@ static Property arm_cpu_properties[] = {
> >       DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
> >                           mp_affinity, ARM64_AFFINITY_INVALID),
> >       DEFINE_PROP_INT32("node-id", ARMCPU, node_id,
> >                         CPU_UNSET_NUMA_NODE_ID),
> > +    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
> > +    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
> > +    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
> > +    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
> >       DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
> >       DEFINE_PROP_END_OF_LIST()
> >   };
> 
> All those 4 properties are used for vCPU hot-add, meaning they're not needed
> when vCPU hotplug isn't supported on the specific board. Even for the hw/virt
> board, cluster isn't always supported and 'cluster-id' shouldn't always be
> exposed, as explained above. How about registering the properties dynamically,
> only when they're needed by vCPU hotplug?


Yes, these are part of arch-specific files, so it is up to the arch whether to
define them or not to define them at all.

Yes, and as mentioned earlier, there is an extra bit of memory (4 integers)
being used. I would trade off this memory against maintainability.



> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> > index 88e5accda6..d51d39f621 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -1094,6 +1094,10 @@ struct ArchCPU {
> >       QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
> >
> >       int32_t node_id; /* NUMA node this CPU belongs to */
> > +    int32_t socket_id;
> > +    int32_t cluster_id;
> > +    int32_t core_id;
> > +    int32_t thread_id;
> 
> It would be fine to keep those fields even if the corresponding properties are
> dynamically registered, but a little bit of memory overhead is incurred :)

You are contradicting yourself here ;)


Thanks
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs
  2023-09-27  3:54   ` Gavin Shan
@ 2023-10-02 10:21     ` Salil Mehta via
  2023-10-02 10:21       ` Salil Mehta
  2023-10-03  5:34       ` Gavin Shan
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02 10:21 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 4:54 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for
> possible vCPUs
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Adds various utility functions which might be required to fetch or check the
> > state of the possible vCPUs. This also introduces concept of *disabled* vCPUs,
> > which are part of the *possible* vCPUs but are not part of the *present* vCPU.
> > This state shall be used during machine init time to check the presence of
> > vcpus.
>   ^^^^^
> 
>   vCPUs


Yes. Thanks.


> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   cpus-common.c         | 31 +++++++++++++++++++++++++
> >   include/hw/core/cpu.h | 53 +++++++++++++++++++++++++++++++++++++++++++
> >   2 files changed, 84 insertions(+)
> >
> > diff --git a/cpus-common.c b/cpus-common.c
> > index 45c745ecf6..24c04199a1 100644
> > --- a/cpus-common.c
> > +++ b/cpus-common.c
> > @@ -24,6 +24,7 @@
> >   #include "sysemu/cpus.h"
> >   #include "qemu/lockable.h"
> >   #include "trace/trace-root.h"
> > +#include "hw/boards.h"
> >
> >   QemuMutex qemu_cpu_list_lock;
> >   static QemuCond exclusive_cond;
> > @@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
> >       cpu_list_generation_id++;
> >   }
> >
> > +CPUState *qemu_get_possible_cpu(int index)
> > +{
> > +    MachineState *ms = MACHINE(qdev_get_machine());
> > +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> > +
> > +    assert((index >= 0) && (index < possible_cpus->len));
> > +
> > +    return CPU(possible_cpus->cpus[index].cpu);
> > +}
> > +
> > +bool qemu_present_cpu(CPUState *cpu)
> > +{
> > +    return cpu;
> > +}
> > +
> > +bool qemu_enabled_cpu(CPUState *cpu)
> > +{
> > +    return cpu && !cpu->disabled;
> > +}
> > +
> 
> I do think it's a good idea to have wrappers to check for CPU states, since
> these CPU states play an important role in this series to support vCPU hotplug.
> However, it would be nice to move them into a header file (include/hw/boards.h)
> because all the checks originate from ms->possible_cpus->cpus[]. They sound like
> functions belonging to a machine (board) rather than global scope. Besides, it
> would be nice to have the same input (index) for all functions. How about
> something like below in include/hw/boards.h?

These are operations related to CPUState, and hence cpus-common.c seems more
appropriate to me. You can see similar functions like qemu_get_cpu() already
exist in the same file.

Yes, some operations do make use of the possible-CPU list, which is maintained
at the board level, but eventually what we return is the CPUState.

I am open to moving some of the above to the board level, but not all; the
present/enabled checks should exist in this file only. I would prefer to keep
all of them in this file.
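As a standalone illustration of the possible/present/enabled distinction (a toy model only; the struct and function names are made up, not the QEMU CPUState API):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (not QEMU code) of the vCPU states under discussion:
 * a "possible" slot may hold no CPU object at all (not present),
 * an object with 'disabled' set (present but not enabled), or an
 * enabled object. By default an object is present and enabled.
 */
struct toy_cpu {
    bool disabled;
};

/* present: the CPU object exists in the possible slot */
static bool toy_present_cpu(const struct toy_cpu *cpu)
{
    return cpu != NULL;
}

/* enabled: the CPU object exists and is not disabled */
static bool toy_enabled_cpu(const struct toy_cpu *cpu)
{
    return cpu != NULL && !cpu->disabled;
}
```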


> 
> static inline bool machine_has_possible_cpu(int index)
> {
>      MachineState *ms = MACHINE(qdev_get_machine());
> 
>      if (!ms || !ms->possible_cpus || index < 0 ||
>          index >= ms->possible_cpus->len) {
>          return false;
>      }
> 
>      return true;
> }
> 
> static inline bool machine_has_present_cpu(int index)
> {
>      MachineState *ms = MACHINE(qdev_get_machine());
> 
>      if (!machine_has_possible_cpu(index) ||
>          !ms->possible_cpus->cpus[index].cpu) {
>          return false;
>      }
> 
>      return true;
> }
> 
> static inline bool machine_has_enabled_cpu(int index)
> {
>      MachineState *ms = MACHINE(qdev_get_machine());
>      CPUState *cs;
> 
>      if (!machine_has_present_cpu(index)) {
>          return false;
>      }
> 
>      cs = CPU(ms->possible_cpus->cpus[index].cpu);
>      return !cs->disabled;
> }
> 
> > +uint64_t qemu_get_cpu_archid(int cpu_index)
> > +{
> > +    MachineState *ms = MACHINE(qdev_get_machine());
> > +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> > +
> > +    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
> > +
> > +    return possible_cpus->cpus[cpu_index].arch_id;
> > +}
> > +
> 
> I think it's unnecessary to keep it since it's called only once, by
> hw/arm/virt-acpi-build.c::build_madt. The architectural ID can be
> directly fetched from possible_cpus->cpus[i].arch_id. It's fine
> to drop this function and fold the logic into the following patch.

It is a very useful accessor API. I can see this code being replicated
everywhere, which also means its related variables are repeatedly defined
many times.

Maybe it is only used once now, but it can be used across architectures
later.

> 
> [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with
> possible vCPUs
> 
> 
> >   CPUState *qemu_get_cpu(int index)
> >   {
> >       CPUState *cpu;
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index fdcbe87352..e5af79950c 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -413,6 +413,17 @@ struct CPUState {
> >       SavedIOTLB saved_iotlb;
> >   #endif
> >
> > +    /*
> > +     * Some architectures do not allow *presence* of vCPUs to be changed
> > +     * after guest has booted using information specified by VMM/firmware
> > +     * via ACPI MADT at the boot time. Thus to enable vCPU hotplug on these
> > +     * architectures possible vCPU can have CPUState object in 'disabled'
> > +     * state or can also not have CPUState object at all. This is possible
> > +     * when vCPU Hotplug is supported and vCPUs are 'yet-to-be-plugged' in
> > +     * the QOM or have been hot-unplugged.
> > +     * By default every CPUState is enabled as of now across all archs.
> > +     */
> > +    bool disabled;
> >       /* TODO Move common fields from CPUArchState here. */
> >       int cpu_index;
> >       int cluster_index;
> 
> I guess the comments can be simplified a bit. How about something like
> below?
>      /*
>       * In order to support vCPU hotplug on architectures like aarch64,
>       * the vCPU states fall into possible, present or enabled. This field
>       * is added to distinguish present and enabled vCPUs. By default, all
>       * vCPUs are present and enabled.
>       */

I can definitely try to simplify it, but the above does not properly convey
the reason why we require the 'disabled' state.


> 
> > @@ -770,6 +781,48 @@ static inline bool cpu_in_exclusive_context(const CPUState *cpu)
> >    */
> >   CPUState *qemu_get_cpu(int index);
> >
> > +/**
> > + * qemu_get_possible_cpu:
> > + * @index: The CPUState@cpu_index value of the CPU to obtain.
> > + *         Input index MUST be in range [0, Max Possible CPUs)
> > + *
> > + * If CPUState object exists,then it gets a CPU matching
> > + * @index in the possible CPU array.
> > + *
> > + * Returns: The possible CPU or %NULL if CPU does not exist.
> > + */
> > +CPUState *qemu_get_possible_cpu(int index);
> > +
> > +/**
> > + * qemu_present_cpu:
> > + * @cpu: The vCPU to check
> > + *
> > + * Checks if the vCPU is amongst the present possible vcpus.
> > + *
> > + * Returns: True if it is present possible vCPU else false
> > + */
> > +bool qemu_present_cpu(CPUState *cpu);
> > +
> > +/**
> > + * qemu_enabled_cpu:
> > + * @cpu: The vCPU to check
> > + *
> > + * Checks if the vCPU is enabled.
> > + *
> > + * Returns: True if it is 'enabled' else false
> > + */
> > +bool qemu_enabled_cpu(CPUState *cpu);
> > +
> > +/**
> > + * qemu_get_cpu_archid:
> > + * @cpu_index: possible vCPU for which arch-id needs to be retreived
> > + *
> > + * Fetches the vCPU arch-id from the present possible vCPUs.
> > + *
> > + * Returns: arch-id of the possible vCPU
> > + */
> > +uint64_t qemu_get_cpu_archid(int cpu_index);
> > +





^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs
  2023-10-02 10:21     ` Salil Mehta via
@ 2023-10-02 10:21       ` Salil Mehta
  2023-10-03  5:34       ` Gavin Shan
  1 sibling, 0 replies; 153+ messages in thread
From: Salil Mehta @ 2023-10-02 10:21 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 4:54 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for
> possible vCPUs
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Adds various utility functions which might be required to fetch or check
> the
> > state of the possible vCPUs. This also introduces concept of *disabled*
> vCPUs,
> > which are part of the *possible* vCPUs but are not part of the *present*
> vCPU.
> > This state shall be used during machine init time to check the presence
> of
> > vcpus.
>    ^^^^^
> 
>    vCPUs


Yes. Thanks.


> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   cpus-common.c         | 31 +++++++++++++++++++++++++
> >   include/hw/core/cpu.h | 53 +++++++++++++++++++++++++++++++++++++++++++
> >   2 files changed, 84 insertions(+)
> >
> > diff --git a/cpus-common.c b/cpus-common.c
> > index 45c745ecf6..24c04199a1 100644
> > --- a/cpus-common.c
> > +++ b/cpus-common.c
> > @@ -24,6 +24,7 @@
> >   #include "sysemu/cpus.h"
> >   #include "qemu/lockable.h"
> >   #include "trace/trace-root.h"
> > +#include "hw/boards.h"
> >
> >   QemuMutex qemu_cpu_list_lock;
> >   static QemuCond exclusive_cond;
> > @@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
> >       cpu_list_generation_id++;
> >   }
> >
> > +CPUState *qemu_get_possible_cpu(int index)
> > +{
> > +    MachineState *ms = MACHINE(qdev_get_machine());
> > +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> > +
> > +    assert((index >= 0) && (index < possible_cpus->len));
> > +
> > +    return CPU(possible_cpus->cpus[index].cpu);
> > +}
> > +
> > +bool qemu_present_cpu(CPUState *cpu)
> > +{
> > +    return cpu;
> > +}
> > +
> > +bool qemu_enabled_cpu(CPUState *cpu)
> > +{
> > +    return cpu && !cpu->disabled;
> > +}
> > +
> 
> I do think it's a good idea to have wrappers to check for CPU's states since
> these CPU states play important role in this series to support vCPU hotplug.
> However, it would be nice to move them around into header file (include/hw/boards.h)
> because all the checks are originated from ms->possible_cpus->cpus[]. It sounds
> functions to a machine (board) instead of global scope. Besides, it would be
> nice to have same input (index) for all functions. How about something like
> below in include/hw/boards.h?

These are operations related to CPUState and hence cpus-common.c seem to be
more appropriate to me. You can see similar functions like qemu_get_cpu()
already exists in the same file.

Yes, some operation do make use of the possible list which is maintained at
board level but eventually what we are returning is the CPUState. 

I am open to move some of above to board level not all like present,
enabled checks should exist in this file only. I would prefer to keep
all of them in this file.


> 
> static inline  bool machine_has_possible_cpu(int index)
> {
>      MachineState *ms = MACHINE(qdev_get_machine());
> 
>      if (!ms || !ms->possible_cpus || index < 0 || index >= ms-
> >possible_cus->len) {
>          return false;
>      }
> 
>      return true;
> }
> 
> static inline bool machine_has_present_cpu(int index)
> {
>      MachineState *ms = MACHINE(qdev_get_machine());
> 
>      if (!machine_is_possible_cpu(index) ||
>          !ms->possible_cpus->cpus[index].cpu) {
>          return false;
>      }
> 
>      return true;
> }
> 
> static inline bool machine_has_enabled_cpu(int index)
> {
>      MachineState *ms = MACHINE(qdev_get_machine());
>      CPUState *cs;
> 
>      if (!machine_is_present_cpu(index)) {
>          return false;
>      }
> 
>      cs = CPU(ms->possible_cpus->cpus[index].cpu);
>      return !cs->disabled
> }
> 
> > +uint64_t qemu_get_cpu_archid(int cpu_index)
> > +{
> > +    MachineState *ms = MACHINE(qdev_get_machine());
> > +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
> > +
> > +    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
> > +
> > +    return possible_cpus->cpus[cpu_index].arch_id;
> > +}
> > +
> 
> I think it's unnecessary to keep it, since it's called only once by
> hw/arm/virt-acpi-build.c::build_madt. The architectural ID can be
> directly fetched from possible_cpus->cpus[i].arch_id. It's fine
> to drop this function and fold the logic to the following patch.

It is a very useful accessor API. I can see this code being replicated
everywhere, which also means its related variables end up being defined
repeatedly.

Maybe it is used only once now, but it can be used across
architectures later.

> 
> [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with
> possible vCPUs
> 
> 
> >   CPUState *qemu_get_cpu(int index)
> >   {
> >       CPUState *cpu;
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index fdcbe87352..e5af79950c 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -413,6 +413,17 @@ struct CPUState {
> >       SavedIOTLB saved_iotlb;
> >   #endif
> >
> > +    /*
> > +     * Some architectures do not allow *presence* of vCPUs to be changed
> > +     * after guest has booted using information specified by VMM/firmware
> > +     * via ACPI MADT at the boot time. Thus to enable vCPU hotplug on these
> > +     * architectures possible vCPU can have CPUState object in 'disabled'
> > +     * state or can also not have CPUState object at all. This is possible
> > +     * when vCPU Hotplug is supported and vCPUs are 'yet-to-be-plugged' in
> > +     * the QOM or have been hot-unplugged.
> > +     * By default every CPUState is enabled as of now across all archs.
> > +     */
> > +    bool disabled;
> >       /* TODO Move common fields from CPUArchState here. */
> >       int cpu_index;
> >       int cluster_index;
> 
> I guess the comments can be simplified a bit. How about something like
> below?
>      /*
>       * In order to support vCPU hotplug on architectures like aarch64,
>       * the vCPU states fall into possible, present or enabled. This field
>       * is added to distinguish present and enabled vCPUs. By default, all
>       * vCPUs are present and enabled.
>       */

I can definitely try to simplify it, but the above does not properly convey
the reason why we require the disabled state.


> 
> > @@ -770,6 +781,48 @@ static inline bool cpu_in_exclusive_context(const
> CPUState *cpu)
> >    */
> >   CPUState *qemu_get_cpu(int index);
> >
> > +/**
> > + * qemu_get_possible_cpu:
> > + * @index: The CPUState@cpu_index value of the CPU to obtain.
> > + *         Input index MUST be in range [0, Max Possible CPUs)
> > + *
> > + * If CPUState object exists,then it gets a CPU matching
> > + * @index in the possible CPU array.
> > + *
> > + * Returns: The possible CPU or %NULL if CPU does not exist.
> > + */
> > +CPUState *qemu_get_possible_cpu(int index);
> > +
> > +/**
> > + * qemu_present_cpu:
> > + * @cpu: The vCPU to check
> > + *
> > + * Checks if the vCPU is amongst the present possible vcpus.
> > + *
> > + * Returns: True if it is present possible vCPU else false
> > + */
> > +bool qemu_present_cpu(CPUState *cpu);
> > +
> > +/**
> > + * qemu_enabled_cpu:
> > + * @cpu: The vCPU to check
> > + *
> > + * Checks if the vCPU is enabled.
> > + *
> > + * Returns: True if it is 'enabled' else false
> > + */
> > +bool qemu_enabled_cpu(CPUState *cpu);
> > +
> > +/**
> > + * qemu_get_cpu_archid:
> > + * @cpu_index: possible vCPU for which arch-id needs to be retrieved
> > + *
> > + * Fetches the vCPU arch-id from the present possible vCPUs.
> > + *
> > + * Returns: arch-id of the possible vCPU
> > + */
> > +uint64_t qemu_get_cpu_archid(int cpu_index);
> > +





^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function
  2023-09-27  5:16   ` Gavin Shan
@ 2023-10-02 10:24     ` Salil Mehta via
  2023-10-02 10:24       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02 10:24 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 6:17 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU
> properties in a function
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Factor out CPU properties code common for {hot,cold}-plugged CPUs. This
> allows
> > code reuse.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c         | 220 ++++++++++++++++++++++++++----------------
> >   include/hw/arm/virt.h |   4 +
> >   2 files changed, 140 insertions(+), 84 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 57fe97c242..0eb6bf5a18 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2018,16 +2018,130 @@ static void virt_cpu_post_init(VirtMachineState
> *vms, MemoryRegion *sysmem)
> >       }
> >   }
> >

[...]


> > +    }
> > +
> > +    /*
> > +     * RFC: Question: this must only be called for the hotplugged cpus. For the
> > +     * cold booted secondary cpus this is being taken care in arm_load_kernel()
> > +     * in boot.c. Perhaps we should remove that code now?
> > +     */
> > +    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
> > +        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
> > +                                NULL);
> > +
> > +        /* Secondary CPUs start in PSCI powered-down state */
> > +        if (CPU(cpuobj)->cpu_index > 0) {
> > +            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
> > +        }
> > +    }
> > +
> > +out:
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +    }
> > +    return;
>         ^^^^^^
> 
> It's not needed obviously :)


Yep, will remove that.


Thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-09-27  6:30   ` Gavin Shan
@ 2023-10-02 10:27     ` Salil Mehta via
  2023-10-02 10:27       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02 10:27 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 7:30 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time
> change common to vCPU {cold|hot}-plug
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Refactor and introduce the common logic required during the
> initialization of
> > both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> > vCPUs which shall be used further during init phases of various other components
> > like GIC, PMU, ACPI etc as part of the virt machine initialization.

[...]

> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Reported-by: Gavin Shan <gavin.shan@redhat.com>
>                             ^^^^^^^^^^^^^^^^^^^^^
> 
>                             <gshan@redhat.com>


Ah. Gross. Sorry about this. Will fix.

Thanks
Salil.

> 
> > [GS: pointed the assertion due to wrong range check]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
> >   target/arm/cpu.c   |   7 +++
> >   target/arm/cpu64.c |  14 +++++
> >   3 files changed, 156 insertions(+), 14 deletions(-)
> >


^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-09-27  6:28   ` [PATCH RFC V2 04/37] arm/virt,target/arm: " Gavin Shan
@ 2023-10-02 16:12     ` Salil Mehta via
  2023-10-02 16:12       ` Salil Mehta
  2024-01-16 15:59       ` Jonathan Cameron via
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02 16:12 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 7:29 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time
> change common to vCPU {cold|hot}-plug
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Refactor and introduce the common logic required during the
> initialization of
> > both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> > vCPUs which shall be used further during init phases of various other components
> > like GIC, PMU, ACPI etc as part of the virt machine initialization.
> >
> > KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in
> > powered-off state in the KVM Host and do not run the guest code. Plugged vCPUs
> > are also kept in powered-off state but vCPU threads exist and is kept sleeping.
> >
> > TBD:
> > For the cold booted vCPUs, this change also exists in the arm_load_kernel()
> > in boot.c but for the hotplugged CPUs this change should still remain part of
> > the pre-plug phase. We are duplicating the powering-off of the cold booted CPUs.
> > Shall we remove the duplicate change from boot.c?
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Reported-by: Gavin Shan <gavin.shan@redhat.com>
> > [GS: pointed the assertion due to wrong range check]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
> >   target/arm/cpu.c   |   7 +++
> >   target/arm/cpu64.c |  14 +++++
> >   3 files changed, 156 insertions(+), 14 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 0eb6bf5a18..3668ad27ec 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -221,6 +221,7 @@ static const char *valid_cpus[] = {
> >       ARM_CPU_TYPE_NAME("max"),
> >   };
> >
> > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid);
> >   static int virt_get_socket_id(const MachineState *ms, int cpu_index);
> >   static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
> >   static int virt_get_core_id(const MachineState *ms, int cpu_index);
> > @@ -2154,6 +2155,14 @@ static void machvirt_init(MachineState *machine)
> >           exit(1);
> >       }
> >
> > +    finalize_gic_version(vms);
> > +    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
> > +        (vms->gic_version < VIRT_GIC_VERSION_3)) {
> > +        machine->smp.max_cpus = smp_cpus;
> > +        mc->has_hotpluggable_cpus = false;
> > +        warn_report("cpu hotplug feature has been disabled");
> > +    }
> > +
> 
> Comments needed here to explain why @mc->has_hotpluggable_cpus is set to false.
> I guess it's something related to TODO list, mentioned in the cover letter.


I can put a comment explaining the checks as to why the feature has been
disabled. BTW, isn't the code self-explanatory here?


[...]

> > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(ms);
> > +    CPUArchId *found_cpu;
> > +    uint64_t mp_affinity;
> > +
> > +    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
> > +
> > +    /*
> > +     * RFC: Question:
> > +     * TBD: Should mp-affinity be treated as MPIDR?
> > +     */
> > +    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
> > +    found_cpu = &ms->possible_cpus->cpus[vcpuid];
> > +
> > +    assert(found_cpu->arch_id == mp_affinity);
> > +
> > +    /*
> > +     * RFC: Question:
> > +     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
> > +     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
> > +     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
> > +     * more related to machine? Current code assumes slot-id and vcpu-id are
> > +     * same i.e. meaning of slot is bit vague.
> > +     *
> > +     * Q1: Is there any requirement to clearly represent slot and dissociate it
> > +     *     from vcpu-id?
> > +     * Q2: Should we make MPIDR within host KVM user configurable?
> > +     *
> > +     *          +----+----+----+----+----+----+----+----+
> > +     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     *                     \         \         \   |    |
> > +     *                      \   8bit  \   8bit  \  |4bit|
> > +     *                       \<------->\<------->\ |<-->|
> > +     *                        \         \         \|    |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     * VCPU-ID  |  Byte4  |  Byte2  |  Byte1  |  Byte0  |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     */
> > +
> > +    return found_cpu;
> > +}
> > +
> 
> MPIDR[31] is set to 0b1, looking at
> linux/arch/arm64/kvm/sys_regs.c::reset_mpidr().
> 
> I think this function can be renamed to virt_get_cpu_slot(ms, index), better to
> reflect its intention. I had same concerns why cs->cpu_index can't be
> reused as MPIDR, but it's out of scope for this series. It maybe something to be
> improved afterwards.

Yes, right now it is a linear mapping, but this might change. I would suggest
keeping it like this with a comment so that it can be addressed in the future.

User configurability of the MPIDR is not in the scope of this patch. Agreed.


[...]

> > +static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState
> *dev,
> > +                              Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    ARMCPU *cpu = ARM_CPU(dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +    int32_t min_cpuid = 0;
> > +    int32_t max_cpuid;
> > +
> > +    /* sanity check the cpu */
> > +    if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
> > +        error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> > +                   ms->cpu_type);
> > +        return;
> > +    }
> > +
> > +    if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
> > +        error_setg(errp, "Invalid thread-id %u specified, correct range
> 0:%u",
> > +                   cpu->thread_id, ms->smp.threads - 1);
> > +        return;
> > +    }
> > +
> > +    max_cpuid = ms->possible_cpus->len - 1;
> > +    if (!dev->hotplugged) {
> > +        min_cpuid = vms->acpi_dev ? ms->smp.cpus : 0;
> > +        max_cpuid = vms->acpi_dev ? max_cpuid : ms->smp.cpus - 1;
> > +    }
> > +
> 
> I don't understand how the range is figured out. cpu->core_id should
> be in range [0, ms->smp.cores).
> With your code, the following scenario
> becomes invalid incorrectly?
> 
> -cpu host -smp maxcpus=4,cpus=1,sockets=4,clusters=1,cores=1,threads=1

Gosh. I am not sure what I was thinking when I added this.

Whatever your circumstances, never drink and code. Deadly
combination! (Repeat offender)

Will correct this.

Thanks
Salil.


[...]

> > +
> > +static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                          Error **errp)
> > +{
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +
> > +    /* insert the cold/hot-plugged vcpu in the slot */
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> May be:
> 
>         /* CPU becomes present */


Not exactly. In this leg the CPU is being plugged, either by user action or
at init time. After the plugging action is complete, the CPU eventually
becomes present.


> 
> > +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> > +    cpu_slot->cpu = OBJECT(dev);
> > +
> > +    cs->disabled = false;
> > +    return;
>         ^^^^^^
> 
>         not needed.

Agreed.

> 
> May be worthy some comments like below, correlating to what's done in
> aarch64_cpu_initfn():
> 
>         /* CPU becomes enabled after it's hot added */


I can add a line above the initialization, if that's what you mean?


> 
> > +}
> > +
> >   static void virt_machine_device_pre_plug_cb(HotplugHandler
> *hotplug_dev,
> >                                               DeviceState *dev, Error
> **errp)

[...]

> > +static void aarch64_cpu_initfn(Object *obj)
> > +{
> > +    CPUState *cs = CPU(obj);
> > +
> > +    /*
> > +     * we start every ARM64 vcpu as disabled possible vCPU. It needs to
> be
> > +     * enabled explicitly
> > +     */
> > +    cs->disabled = true;
> > +}
> > +
> 
> The comments can be simplified to:
> 
>      /* The CPU state isn't enabled until it's hot added completely */


There is a reason why I have worded the comment that way: for the
other architectures 'disabled' would be false by default.


> >   static void aarch64_cpu_finalizefn(Object *obj)
> >   {
> >   }
> > @@ -751,7 +762,9 @@ static gchar *aarch64_gdb_arch_name(CPUState *cs)
> >   static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
> >   {
> >       CPUClass *cc = CPU_CLASS(oc);
> > +    DeviceClass *dc = DEVICE_CLASS(oc);
> >
> > +    dc->user_creatable = true;
> >       cc->gdb_read_register = aarch64_cpu_gdb_read_register;
> >       cc->gdb_write_register = aarch64_cpu_gdb_write_register;
> >       cc->gdb_num_core_regs = 34;
> > @@ -800,6 +813,7 @@ static const TypeInfo aarch64_cpu_type_info = {
> >       .name = TYPE_AARCH64_CPU,
> >       .parent = TYPE_ARM_CPU,
> >       .instance_size = sizeof(ARMCPU),
> > +    .instance_init = aarch64_cpu_initfn,
> >       .instance_finalize = aarch64_cpu_finalizefn,
> >       .abstract = true,
> >       .class_size = sizeof(AArch64CPUClass),
> 
> I'm not sure if 'dc->user_creatable' can be set true here because
> the ARMCPU objects aren't ready for hot added/removed at this point.
> The hacks for GICv3 aren't included so far. I think a separate patch
> may be needed in the last to enable the functionality?

This patch contains common init time changes for CPU {hot,cold} plug.


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-10-02 16:12     ` Salil Mehta via
@ 2023-10-02 16:12       ` Salil Mehta
  2024-01-16 15:59       ` Jonathan Cameron via
  1 sibling, 0 replies; 153+ messages in thread
From: Salil Mehta @ 2023-10-02 16:12 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 7:29 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time
> change common to vCPU {cold|hot}-plug
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Refactor and introduce the common logic required during the
> initialization of
> > both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> > vCPUs which shall be used further during init phases of various other components
> > like GIC, PMU, ACPI etc as part of the virt machine initialization.
> >
> > KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in
> > powered-off state in the KVM Host and do not run the guest code. Plugged vCPUs
> > are also kept in powered-off state but vCPU threads exist and is kept sleeping.
> >
> > TBD:
> > For the cold booted vCPUs, this change also exists in the arm_load_kernel()
> > in boot.c but for the hotplugged CPUs this change should still remain part of
> > the pre-plug phase. We are duplicating the powering-off of the cold booted CPUs.
> > Shall we remove the duplicate change from boot.c?
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Reported-by: Gavin Shan <gavin.shan@redhat.com>
> > [GS: pointed the assertion due to wrong range check]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
> >   target/arm/cpu.c   |   7 +++
> >   target/arm/cpu64.c |  14 +++++
> >   3 files changed, 156 insertions(+), 14 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 0eb6bf5a18..3668ad27ec 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -221,6 +221,7 @@ static const char *valid_cpus[] = {
> >       ARM_CPU_TYPE_NAME("max"),
> >   };
> >
> > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid);
> >   static int virt_get_socket_id(const MachineState *ms, int cpu_index);
> >   static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
> >   static int virt_get_core_id(const MachineState *ms, int cpu_index);
> > @@ -2154,6 +2155,14 @@ static void machvirt_init(MachineState *machine)
> >           exit(1);
> >       }
> >
> > +    finalize_gic_version(vms);
> > +    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
> > +        (vms->gic_version < VIRT_GIC_VERSION_3)) {
> > +        machine->smp.max_cpus = smp_cpus;
> > +        mc->has_hotpluggable_cpus = false;
> > +        warn_report("cpu hotplug feature has been disabled");
> > +    }
> > +
> 
> Comments needed here to explain why @mc->has_hotpluggable_cpus is set to false.
> I guess it's something related to TODO list, mentioned in the cover letter.


I can put in a comment explaining why the feature has been disabled by these checks.
BTW, isn't the code self-explanatory here?


[...]

> > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(ms);
> > +    CPUArchId *found_cpu;
> > +    uint64_t mp_affinity;
> > +
> > +    assert(vcpuid >= 0 && vcpuid < ms->possible_cpus->len);
> > +
> > +    /*
> > +     * RFC: Question:
> > +     * TBD: Should mp-affinity be treated as MPIDR?
> > +     */
> > +    mp_affinity = virt_cpu_mp_affinity(vms, vcpuid);
> > +    found_cpu = &ms->possible_cpus->cpus[vcpuid];
> > +
> > +    assert(found_cpu->arch_id == mp_affinity);
> > +
> > +    /*
> > +     * RFC: Question:
> > +     * Slot-id is the index where vCPU with certain arch-id(=mpidr/ap-affinity)
> > +     * is plugged. For Host KVM, MPIDR for vCPU is derived using vcpu-id.
> > +     * As I understand, MPIDR and vcpu-id are property of vCPU but slot-id is
> > +     * more related to machine? Current code assumes slot-id and vcpu-id are
> > +     * same i.e. meaning of slot is bit vague.
> > +     *
> > +     * Q1: Is there any requirement to clearly represent slot and dissociate it
> > +     *     from vcpu-id?
> > +     * Q2: Should we make MPIDR within host KVM user configurable?
> > +     *
> > +     *          +----+----+----+----+----+----+----+----+
> > +     * MPIDR    |||  Res  |   Aff2  |   Aff1  |  Aff0   |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     *                     \         \         \   |    |
> > +     *                      \   8bit  \   8bit  \  |4bit|
> > +     *                       \<------->\<------->\ |<-->|
> > +     *                        \         \         \|    |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     * VCPU-ID  |  Byte4  |  Byte2  |  Byte1  |  Byte0  |
> > +     *          +----+----+----+----+----+----+----+----+
> > +     */
> > +
> > +    return found_cpu;
> > +}
> > +
> 
> MPIDR[31] is set to 0b1, looking at
> linux/arch/arm64/kvm/sys_regs.c::reset_mpidr().
> 
> I think this function can be renamed to virt_get_cpu_slot(ms, index), better to
> reflect its intention. I had same concerns why cs->cpu_index can't be
> reused as MPIDR, but it's out of scope for this series. It maybe something to be
> improved afterwards.

Yes, right now it is a linear mapping, but this might change. I would suggest keeping
it like this with a comment so that it can be addressed in the future.

User configurability of the MPIDR is not in the scope of this patch. Agreed.


[...]

> > +static void virt_cpu_pre_plug(HotplugHandler *hotplug_dev, DeviceState
> *dev,
> > +                              Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    ARMCPU *cpu = ARM_CPU(dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +    int32_t min_cpuid = 0;
> > +    int32_t max_cpuid;
> > +
> > +    /* sanity check the cpu */
> > +    if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
> > +        error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> > +                   ms->cpu_type);
> > +        return;
> > +    }
> > +
> > +    if ((cpu->thread_id < 0) || (cpu->thread_id >= ms->smp.threads)) {
> > +        error_setg(errp, "Invalid thread-id %u specified, correct range
> 0:%u",
> > +                   cpu->thread_id, ms->smp.threads - 1);
> > +        return;
> > +    }
> > +
> > +    max_cpuid = ms->possible_cpus->len - 1;
> > +    if (!dev->hotplugged) {
> > +        min_cpuid = vms->acpi_dev ? ms->smp.cpus : 0;
> > +        max_cpuid = vms->acpi_dev ? max_cpuid : ms->smp.cpus - 1;
> > +    }
> > +
> 
> I don't understand how the range is figured out. cpu->core_id should
> be in range [0, ms->smp.cores).
> With your code, the following scenario
> becomes invalid incorrectly?
> 
> -cpu host -smp maxcpus=4,cpus=1,sockets=4,clusters=1,cores=1,threads=1

Gosh, I am not sure what I was thinking when I added this.

Whatever your circumstances may be, never drink and code. Deadly
combination! (Repeat offender)

Will correct this.

Thanks
Salil.


[...]

> > +
> > +static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                          Error **errp)
> > +{
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +
> > +    /* insert the cold/hot-plugged vcpu in the slot */
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> May be:
> 
>         /* CPU becomes present */


Not exactly. In this leg the CPU is being plugged, either by user action or
at init time. Only after the plugging action is complete does the CPU
eventually become present.


> 
> > +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> > +    cpu_slot->cpu = OBJECT(dev);
> > +
> > +    cs->disabled = false;
> > +    return;
>         ^^^^^^
> 
>         not needed.

Agreed.

> 
> May be worthy some comments like below, correlating to what's done in
> aarch64_cpu_initfn():
> 
>         /* CPU becomes enabled after it's hot added */


I can add a comment above the initialization, if that's what you mean?


> 
> > +}
> > +
> >   static void virt_machine_device_pre_plug_cb(HotplugHandler
> *hotplug_dev,
> >                                               DeviceState *dev, Error
> **errp)

[...]

> > +static void aarch64_cpu_initfn(Object *obj)
> > +{
> > +    CPUState *cs = CPU(obj);
> > +
> > +    /*
> > +     * we start every ARM64 vcpu as disabled possible vCPU. It needs to
> be
> > +     * enabled explicitly
> > +     */
> > +    cs->disabled = true;
> > +}
> > +
> 
> The comments can be simplified to:
> 
>      /* The CPU state isn't enabled until it's hot added completely */


There is a reason I have worded the comment that way: for other
architectures, 'disabled' would be false by default.


> >   static void aarch64_cpu_finalizefn(Object *obj)
> >   {
> >   }
> > @@ -751,7 +762,9 @@ static gchar *aarch64_gdb_arch_name(CPUState *cs)
> >   static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
> >   {
> >       CPUClass *cc = CPU_CLASS(oc);
> > +    DeviceClass *dc = DEVICE_CLASS(oc);
> >
> > +    dc->user_creatable = true;
> >       cc->gdb_read_register = aarch64_cpu_gdb_read_register;
> >       cc->gdb_write_register = aarch64_cpu_gdb_write_register;
> >       cc->gdb_num_core_regs = 34;
> > @@ -800,6 +813,7 @@ static const TypeInfo aarch64_cpu_type_info = {
> >       .name = TYPE_AARCH64_CPU,
> >       .parent = TYPE_ARM_CPU,
> >       .instance_size = sizeof(ARMCPU),
> > +    .instance_init = aarch64_cpu_initfn,
> >       .instance_finalize = aarch64_cpu_finalizefn,
> >       .abstract = true,
> >       .class_size = sizeof(AArch64CPUClass),
> 
> I'm not sure if 'dc->user_creatable' can be set true here because
> the ARMCPU objects aren't ready for hot added/removed at this point.
> The hacks for GICv3 aren't included so far. I think a separate patch
> may be needed in the last to enable the functionality?

This patch contains the common init-time changes for CPU {hot,cold} plug.


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation,parking} code
  2023-09-27  6:51   ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation,parking} code Gavin Shan
@ 2023-10-02 16:20     ` Salil Mehta via
  2023-10-02 16:20       ` Salil Mehta
  2023-10-03  5:39       ` Gavin Shan
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02 16:20 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 7:52 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU
> {creation,parking} code
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > KVM vCPU creation is done once during the initialization of the VM when Qemu
> > threads are spawned. This is common to all the architectures. If the architecture
> > supports vCPU hot-{un}plug then this KVM vCPU creation could be deferred to
> > later point as well. Some architectures might in any case create KVM vCPUs for
> > the yet-to-be plugged vCPUs (i.e. QoM Object & thread does not exists) during VM
> > init time and park them.
> >
> > Hot-unplug of vCPU results in destruction of the vCPU objects in QOM but
> > the KVM vCPU objects in the Host KVM are not destroyed and their representative
> > KVM vCPU objects in Qemu are parked.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   accel/kvm/kvm-all.c  | 61 ++++++++++++++++++++++++++++++++++----------
> >   include/sysemu/kvm.h |  2 ++
> >   2 files changed, 49 insertions(+), 14 deletions(-)
> >
> 
> The most important point seems missed in the commit log: The KVM vCPU objects,
> including those hotpluggable objects, need to be in place before in-host GICv3
> is initialized. So we need expose kvm_create_vcpu() to make those KVM vCPU
> objects in place, even for those non-present vCPUs.


This is a patch common to all architectures, while the point you are making is
specific to the ARM architecture. This patch is now part of the common patch-set, here:

https://lore.kernel.org/qemu-devel/20230930001933.2660-1-salil.mehta@huawei.com/


> 
> > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > index 7b3da8dc3a..86e9c9ea60 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -137,6 +137,7 @@ static QemuMutex kml_slots_lock;
> >   #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
> >
> >   static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
> > +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
> >
> >   static inline void kvm_resample_fd_remove(int gsi)
> >   {
> > @@ -320,11 +321,51 @@ err:
> >       return ret;
> >   }
> >
> > +void kvm_park_vcpu(CPUState *cpu)
> > +{
> > +    unsigned long vcpu_id = cpu->cpu_index;
> > +    struct KVMParkedVcpu *vcpu;
> > +
> > +    vcpu = g_malloc0(sizeof(*vcpu));
> > +    vcpu->vcpu_id = vcpu_id;
> 
>         vcpu->vcpu_id = cpu->cpu_index;
> 
> @vcpu_id can be dropped.


Yes, agreed.

Thanks
Salil.

> 
> > +    vcpu->kvm_fd = cpu->kvm_fd;
> > +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
> > +}
> > +
> > +int kvm_create_vcpu(CPUState *cpu)
> > +{
> > +    unsigned long vcpu_id = cpu->cpu_index;
> > +    KVMState *s = kvm_state;
> > +    int ret;
> > +
> > +    DPRINTF("kvm_create_vcpu\n");
> > +
> > +    /* check if the KVM vCPU already exist but is parked */
> > +    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
> > +    if (ret > 0) {
> > +        goto found;
> > +    }
> > +
> > +    /* create a new KVM vcpu */
> > +    ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
> > +    if (ret < 0) {
> > +        return ret;
> > +    }
> > +
> > +found:
> > +    cpu->vcpu_dirty = true;
> > +    cpu->kvm_fd = ret;
> > +    cpu->kvm_state = s;
> > +    cpu->dirty_pages = 0;
> > +    cpu->throttle_us_per_full = 0;
> > +
> > +    return 0;
> > +}
> > +
> 
> The found tag can be dropped. @cpu can be initialized if vCPU fd is found
> and then bail early.

Yes, this patch has been refactored and the 'found' label has been dropped.

https://lore.kernel.org/qemu-devel/20230930001933.2660-1-salil.mehta@huawei.com/


Thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 06/37] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  2023-09-27 10:04   ` [PATCH RFC V2 06/37] arm/virt,kvm: " Gavin Shan
@ 2023-10-02 16:39     ` Salil Mehta via
  2023-10-02 16:39       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-02 16:39 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 11:05 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 06/37] arm/virt,kvm: Pre-create disabled
> possible vCPUs @machine init
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > In ARMv8 architecture, GIC needs all the vCPUs to be created and present when
> > it is initialized. This is because:
> > 1. GICC and MPIDR association must be fixed at the VM initialization time.
> >     This is represented by register GIC_TYPER(mp_afffinity, proc_num)
> > 2. GICC(cpu interfaces), GICR(redistributors) etc all must be initialized
> >     at the boot time as well.
> > 3. Memory regions associated with GICR etc. cannot be changed(add/del/mod)
> >     after VM has inited.
> >
> > This patch adds the support to pre-create all such possible vCPUs within the
> > host using the KVM interface as part of the virt machine initialization. These
> > vCPUs could later be attached to QOM/ACPI while they are actually hot plugged
> > and made present.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > [VP: Identified CPU stall issue & suggested probable fix]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c         | 53 +++++++++++++++++++++++++++++++++++++++++--
> >   include/hw/core/cpu.h |  1 +
> >   target/arm/cpu64.c    |  1 +
> >   target/arm/kvm.c      | 32 ++++++++++++++++++++++++++
> >   target/arm/kvm64.c    |  9 +++++++-
> >   target/arm/kvm_arm.h  | 11 +++++++++
> >   6 files changed, 104 insertions(+), 3 deletions(-)
> >
> 
> The subject looks a bit misleading. (possible && disabled) == (disabled). So it
> can be simplified to something like below:


I will improve it.


> arm/virt,kvm: Pre-create KVM objects for hotpluggable vCPUs
> 
> I think the commit log can be improved to something like below:
> 
> All possible vCPUs are classified to cold-booting and hotpluggable vCPUs.
> In ARMv8 architecture, GIC needs all the possible vCPUs to be existing
> and present when it is initialized for several factors. After the
> initializaion,
> the CPU instances for those hotpluggable vCPUs aren't needed, but the
> KVM objects like vCPU's file descriptor should be kept as they have been
> shared to host.
> 
> 1. GICC and MPIDR association must be fixed at the VM initialization time.
>     This is represented by register GIC_TYPER(mp_afffinity, proc_num)
> 2. GICC(cpu interfaces), GICR(redistributors) etc all must be initialized
>     at the boot time as well.
> 3. Memory regions associated with GICR etc. cannot be changed(add/del/mod)
>     after VM has inited.
> 
> This creates and realizes CPU instances for those cold-booting vCPUs. They
> becomes enabled eventually. For these hotpluggable vCPUs, the vCPU
> instances
> are created, but not realized. They become present eventually.


The above is too complex. I'll make it more succinct and fix this.

Thanks
Salil.
 

> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 3668ad27ec..6ba131b799 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2293,8 +2293,10 @@ static void machvirt_init(MachineState *machine)
> >       assert(possible_cpus->len == max_cpus);
> >       for (n = 0; n < possible_cpus->len; n++) {
> >           Object *cpuobj;
> > +        CPUState *cs;
> >
> >           cpuobj = object_new(possible_cpus->cpus[n].type);
> > +        cs = CPU(cpuobj);
> >
> >           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
> >           object_property_set_int(cpuobj, "socket-id",
> > @@ -2306,8 +2308,55 @@ static void machvirt_init(MachineState *machine)
> >           object_property_set_int(cpuobj, "thread-id",
> >                                   virt_get_thread_id(machine, n), NULL);
> >
> > -        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> > -        object_unref(cpuobj);
> > +        if (n < smp_cpus) {
> > +            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> > +            object_unref(cpuobj);
> > +        } else {
> > +            CPUArchId *cpu_slot;
> > +
> > +            /* handling for vcpus which are yet to be hot-plugged */
> > +            cs->cpu_index = n;
> > +            cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
> > +
> > +            /*
> > +             * ARM host vCPU features need to be fixed at the boot time. But as
> > +             * per current approach this CPU object will be destroyed during
> > +             * cpu_post_init(). During hotplug of vCPUs these properties are
> > +             * initialized again.
> > +             */
> > +            virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
> > +
> > +            /*
> > +             * For KVM, we shall be pre-creating the now disabled/un-plugged
> > +             * possbile host vcpus and park them till the time they are
> > +             * actually hot plugged. This is required to pre-size thehost
> > +             * GICC and GICR with the all possible vcpus for this VM.
> > +             */
> > +            if (kvm_enabled()) {
> > +                kvm_arm_create_host_vcpu(ARM_CPU(cs));
> > +            }
> 
>                 /*
>                  * For KVM, the associated objects like vCPU's file descriptor
>                  * is reserved so that they can reused when the vCPU is hot added.
>                  * :
>                  */


I think that's unnecessary.


> > +            /*
> > +             * Add disabled vCPU to CPU slot during the init phase of the virt
> > +             * machine
> > +             * 1. We need this ARMCPU object during the GIC init. This object
> > +             *    will facilitate in pre-realizing the GIC. Any info like
> > +             *    mp-affinity(required to derive gicr_type) etc. could still be
> > +             *    fetched while preserving QOM abstraction akin to realized
> > +             *    vCPUs.
> > +             * 2. Now, after initialization of the virt machine is complete we
> > +             *    could use two approaches to deal with this ARMCPU object:
> > +             *    (i) re-use this ARMCPU object during hotplug of this vCPU.
> > +             *                             OR
> > +             *    (ii) defer release this ARMCPU object after gic has been
> > +             *         initialized or during pre-plug phase when a vCPU is
> > +             *         hotplugged.
> > +             *
> > +             *    We will use the (ii) approach and release the ARMCPU objects
> > +             *    after GIC and machine has been fully initialized during
> > +             *    machine_init_done() phase.
> > +             */
> > +             cpu_slot->cpu = OBJECT(cs);
> > +        }
> 
>             /*
>              * Make the hotpluggable vCPU present because ....
>              */


Sorry, I did not get that. Could you clarify?


> >       }
> >       fdt_add_timer_nodes(vms);
> >       fdt_add_cpu_nodes(vms);
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index e5af79950c..b2201a98ee 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -401,6 +401,7 @@ struct CPUState {
> >       uint32_t kvm_fetch_index;
> >       uint64_t dirty_pages;
> >       int kvm_vcpu_stats_fd;
> > +    VMChangeStateEntry *vmcse;
> >
> >       /* Use by accel-block: CPU is executing an ioctl() */
> >       QemuLockCnt in_ioctl_lock;
> > diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> > index a660e3f483..3a38e7ccaf 100644
> > --- a/target/arm/cpu64.c
> > +++ b/target/arm/cpu64.c
> > @@ -748,6 +748,7 @@ static void aarch64_cpu_initfn(Object *obj)
> >        * enabled explicitly
> >        */
> >       cs->disabled = true;
> > +    cs->thread_id = 0;
> >   }
> >
> >   static void aarch64_cpu_finalizefn(Object *obj)
> > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > index b4c7654f49..0e1d0692b1 100644
> > --- a/target/arm/kvm.c
> > +++ b/target/arm/kvm.c
> > @@ -637,6 +637,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
> >       write_list_to_cpustate(cpu);
> >   }
> >
> > +void kvm_arm_create_host_vcpu(ARMCPU *cpu)
> > +{
> > +    CPUState *cs = CPU(cpu);
> > +    unsigned long vcpu_id = cs->cpu_index;
> > +    int ret;
> > +
> > +    ret = kvm_create_vcpu(cs);
> > +    if (ret < 0) {
> > +        error_report("Failed to create host vcpu %ld", vcpu_id);
> > +        abort();
> > +    }
> > +
> > +    /*

* RE: [PATCH RFC V2 06/37] arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  2023-10-02 16:39     ` Salil Mehta via
@ 2023-10-02 16:39       ` Salil Mehta
  0 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta @ 2023-10-02 16:39 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, September 27, 2023 11:05 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 06/37] arm/virt,kvm: Pre-create disabled
> possible vCPUs @machine init
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > In the ARMv8 architecture, the GIC needs all the vCPUs to be created and present
> > when it is initialized. This is because:
> > 1. GICC and MPIDR association must be fixed at VM initialization time.
> >     This is represented by register GIC_TYPER(mp_affinity, proc_num).
> > 2. GICC (CPU interfaces), GICR (redistributors) etc. must all be initialized
> >     at boot time as well.
> > 3. Memory regions associated with GICR etc. cannot be changed (add/del/mod)
> >     after the VM has been initialized.
> >
> > This patch adds support to pre-create all such possible vCPUs within the
> > host using the KVM interface as part of the virt machine initialization. These
> > vCPUs can later be attached to QOM/ACPI when they are actually hot plugged
> > and made present.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Reported-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > [VP: Identified CPU stall issue & suggested probable fix]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c         | 53 +++++++++++++++++++++++++++++++++++++++++--
> >   include/hw/core/cpu.h |  1 +
> >   target/arm/cpu64.c    |  1 +
> >   target/arm/kvm.c      | 32 ++++++++++++++++++++++++++
> >   target/arm/kvm64.c    |  9 +++++++-
> >   target/arm/kvm_arm.h  | 11 +++++++++
> >   6 files changed, 104 insertions(+), 3 deletions(-)
> >
> 
> The subject looks a bit misleading. (possible && disabled) == (disabled). So it
> can be simplified to something like below:


I will improve it.


> arm/virt,kvm: Pre-create KVM objects for hotpluggable vCPUs
> 
> I think the commit log can be improved to something like below:
> 
> All possible vCPUs are classified into cold-booting and hotpluggable vCPUs.
> In the ARMv8 architecture, the GIC needs all the possible vCPUs to exist
> and be present when it is initialized, for several reasons. After the
> initialization, the CPU instances for those hotpluggable vCPUs aren't
> needed, but the KVM objects like the vCPU's file descriptor should be
> kept as they have been shared with the host.
> 
> 1. GICC and MPIDR association must be fixed at VM initialization time.
>     This is represented by register GIC_TYPER(mp_affinity, proc_num).
> 2. GICC (CPU interfaces), GICR (redistributors) etc. must all be initialized
>     at boot time as well.
> 3. Memory regions associated with GICR etc. cannot be changed (add/del/mod)
>     after the VM has been initialized.
> 
> This creates and realizes CPU instances for the cold-booting vCPUs. They
> become enabled eventually. For the hotpluggable vCPUs, the vCPU instances
> are created, but not realized. They become present eventually.


Above is too complex. I'll make it more succinct. Will fix this.

Thanks
Salil.
 

> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 3668ad27ec..6ba131b799 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2293,8 +2293,10 @@ static void machvirt_init(MachineState *machine)
> >       assert(possible_cpus->len == max_cpus);
> >       for (n = 0; n < possible_cpus->len; n++) {
> >           Object *cpuobj;
> > +        CPUState *cs;
> >
> >           cpuobj = object_new(possible_cpus->cpus[n].type);
> > +        cs = CPU(cpuobj);
> >
> >           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
> >           object_property_set_int(cpuobj, "socket-id",
> > @@ -2306,8 +2308,55 @@ static void machvirt_init(MachineState *machine)
> >           object_property_set_int(cpuobj, "thread-id",
> >                                   virt_get_thread_id(machine, n), NULL);
> >
> > -        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> > -        object_unref(cpuobj);
> > +        if (n < smp_cpus) {
> > +            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
> > +            object_unref(cpuobj);
> > +        } else {
> > +            CPUArchId *cpu_slot;
> > +
> > +            /* handling for vcpus which are yet to be hot-plugged */
> > +            cs->cpu_index = n;
> > +            cpu_slot = virt_find_cpu_slot(machine, cs->cpu_index);
> > +
> > +            /*
> > +             * ARM host vCPU features need to be fixed at boot time. But as
> > +             * per the current approach, this CPU object will be destroyed
> > +             * during cpu_post_init(). During hotplug of vCPUs, these
> > +             * properties are initialized again.
> > +             */
> > +            virt_cpu_set_properties(cpuobj, cpu_slot, &error_fatal);
> > +
> > +            /*
> > +             * For KVM, we shall be pre-creating the now disabled/unplugged
> > +             * possible host vCPUs and park them until they are actually
> > +             * hot plugged. This is required to pre-size the host GICC and
> > +             * GICR with all the possible vCPUs for this VM.
> > +             */
> > +            if (kvm_enabled()) {
> > +                kvm_arm_create_host_vcpu(ARM_CPU(cs));
> > +            }
> 
>                 /*
>                  * For KVM, the associated objects like vCPU's file descriptor
>                  * is reserved so that they can reused when the vCPU is hot added.
>                  * :
>                  */


I think that's unnecessary.


> > +            /*
> > +             * Add disabled vCPU to CPU slot during the init phase of the virt
> > +             * machine
> > +             * 1. We need this ARMCPU object during the GIC init. This object
> > +             *    will facilitate in pre-realizing the GIC. Any info like
> > +             *    mp-affinity(required to derive gicr_type) etc. could still be
> > +             *    fetched while preserving QOM abstraction akin to realized
> > +             *    vCPUs.
> > +             * 2. Now, after initialization of the virt machine is complete we
> > +             *    could use two approaches to deal with this ARMCPU object:
> > +             *    (i) re-use this ARMCPU object during hotplug of this vCPU.
> > +             *                             OR
> > +             *    (ii) defer release this ARMCPU object after gic has been
> > +             *         initialized or during pre-plug phase when a vCPU is
> > +             *         hotplugged.
> > +             *
> > +             *    We will use the (ii) approach and release the ARMCPU objects
> > +             *    after GIC and machine has been fully initialized during
> > +             *    machine_init_done() phase.
> > +             */
> > +             cpu_slot->cpu = OBJECT(cs);
> > +        }
> 
>             /*
>              * Make the hotpluggable vCPU present because ....
>              */


Sorry, I did not get this. Could you clarify?


> >       }
> >       fdt_add_timer_nodes(vms);
> >       fdt_add_cpu_nodes(vms);
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index e5af79950c..b2201a98ee 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -401,6 +401,7 @@ struct CPUState {
> >       uint32_t kvm_fetch_index;
> >       uint64_t dirty_pages;
> >       int kvm_vcpu_stats_fd;
> > +    VMChangeStateEntry *vmcse;
> >
> >       /* Use by accel-block: CPU is executing an ioctl() */
> >       QemuLockCnt in_ioctl_lock;
> > diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> > index a660e3f483..3a38e7ccaf 100644
> > --- a/target/arm/cpu64.c
> > +++ b/target/arm/cpu64.c
> > @@ -748,6 +748,7 @@ static void aarch64_cpu_initfn(Object *obj)
> >        * enabled explicitly
> >        */
> >       cs->disabled = true;
> > +    cs->thread_id = 0;
> >   }
> >
> >   static void aarch64_cpu_finalizefn(Object *obj)
> > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > index b4c7654f49..0e1d0692b1 100644
> > --- a/target/arm/kvm.c
> > +++ b/target/arm/kvm.c
> > @@ -637,6 +637,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
> >       write_list_to_cpustate(cpu);
> >   }
> >
> > +void kvm_arm_create_host_vcpu(ARMCPU *cpu)
> > +{
> > +    CPUState *cs = CPU(cpu);
> > +    unsigned long vcpu_id = cs->cpu_index;
> > +    int ret;
> > +
> > +    ret = kvm_create_vcpu(cs);
> > +    if (ret < 0) {
> > +        error_report("Failed to create host vcpu %ld", vcpu_id);
> > +        abort();
> > +    }
> > +
> > +    /*
> > +     * Initialize the vCPU in the host. This will reset the sys regs
> > +     * for this vCPU, and related registers like MPIDR_EL1 etc. also
> > +     * get programmed during this call into the host. These are referred
> > +     * to later while setting device attributes of the GICR during
> > +     * GICv3 reset.
> > +     */
> > +    ret = kvm_arch_init_vcpu(cs);
> > +    if (ret < 0) {
> > +        error_report("Failed to initialize host vcpu %ld", vcpu_id);
> > +        abort();
> > +    }
> > +
> > +    /*
> > +     * Park the created vCPU. It shall be used by kvm_get_vcpu() when
> > +     * threads are created during realization of ARM vCPUs.
> > +     */
> > +    kvm_park_vcpu(cs);
> > +}
> > +
> >   /*
> >    * Update KVM's MP_STATE based on what QEMU thinks it is
> >    */
> > diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> > index 94bbd9661f..364cc21f81 100644
> > --- a/target/arm/kvm64.c
> > +++ b/target/arm/kvm64.c
> > @@ -566,7 +566,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >           return -EINVAL;
> >       }
> >
> > -    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cs);
> > +    /*
> > +     * Install the VM change state handler only when the vCPU thread has
> > +     * been spawned, i.e. the vCPU is being realized
> > +     */
> > +    if (cs->thread_id) {
> > +        cs->vmcse = qemu_add_vm_change_state_handler(kvm_arm_vm_state_change,
> > +                                                     cs);
> > +    }
> >
> >       /* Determine init features for this CPU */
> >       memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
> > diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> > index 051a0da41c..31408499b3 100644
> > --- a/target/arm/kvm_arm.h
> > +++ b/target/arm/kvm_arm.h
> > @@ -163,6 +163,17 @@ void kvm_arm_cpu_post_load(ARMCPU *cpu);
> >    */
> >   void kvm_arm_reset_vcpu(ARMCPU *cpu);
> >
> > +/**
> > + * kvm_arm_create_host_vcpu:
> > + * @cpu: ARMCPU
> > + *
> > + * Called at to pre create all possible kvm vCPUs within the the host at
> the
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>               to create instances for the hotpluggable vCPUs


No. Hot-plugging in ARM is a higher-level operation which is done at the
QOM level, not at the KVM level. The latter is totally agnostic of what
QEMU is doing as part of the hot(un)plug operations happening at QOM. I
would not want to associate hot-plugging with KVM, as it gives a wrong
impression.


Thanks
Salil.
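As an aside for readers, the create/initialize/park sequence in the quoted kvm_arm_create_host_vcpu() can be modelled with a toy park list. All names, the fixed-size table, and the fd handling below are illustrative assumptions, not code from the series:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the kvm_park_vcpu()/kvm_get_vcpu() pairing discussed
 * above: a pre-created vCPU's file descriptor is stashed by vcpu_id at
 * machine init, and fetched again when the vCPU thread is realized.
 */
#define TOY_MAX_VCPUS 8

static int toy_parked_fd[TOY_MAX_VCPUS];
static int toy_parked_used[TOY_MAX_VCPUS];

/* Park a pre-created vCPU: remember its fd until realization. */
void toy_park_vcpu(unsigned long vcpu_id, int fd)
{
    assert(vcpu_id < TOY_MAX_VCPUS);
    toy_parked_fd[vcpu_id] = fd;
    toy_parked_used[vcpu_id] = 1;
}

/* Retrieve (and remove) a parked fd; -1 if nothing was parked. */
int toy_unpark_vcpu(unsigned long vcpu_id)
{
    if (vcpu_id >= TOY_MAX_VCPUS || !toy_parked_used[vcpu_id]) {
        return -1;
    }
    toy_parked_used[vcpu_id] = 0;
    return toy_parked_fd[vcpu_id];
}
```

The point of the pairing is that the expensive host-side objects survive across hot(un)plug, while the QOM-level CPU object can be created and destroyed independently.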

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  2023-10-02  9:53     ` Salil Mehta via
  2023-10-02  9:53       ` Salil Mehta
@ 2023-10-03  5:05       ` Gavin Shan
  1 sibling, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2023-10-03  5:05 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Salil,

On 10/2/23 19:53, Salil Mehta wrote:
> Many thanks for taking pains to review this patch-set.
> 

No worries.

>> From: Gavin Shan <gshan@redhat.com>
>> Sent: Wednesday, September 27, 2023 12:57 AM
>> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
>> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
>> peter.maydell@linaro.org; richard.henderson@linaro.org;
>> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
>> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
>> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
>> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
>> linux@armlinux.org.uk; darren@os.amperecomputing.com;
>> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
>> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
>> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
>> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
>> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
>> Subject: Re: [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU
>> {socket,cluster,core,thread}-id property
>>
>> On 9/26/23 20:04, Salil Mehta wrote:
>>> This shall be used to store user specified topology{socket,cluster,core,thread}
>>> and shall be converted to a unique 'vcpu-id' which is used as slot-index during
>>> hot(un)plug of vCPU.
>>>
>>
>> Note that we don't have a 'vcpu-id' property. It's actually the index into the
>> array ms->possible_cpus->cpus[] and cpu->cpu_index. Please improve the commit
>> log if it makes sense.
> 
> I can change it, but was it mentioned anywhere in the log that vcpu-id is
> a property?
> 

I was thinking it's a property when vcpu-id is quoted with ''. Besides,
"vcpu-id" is usually understood as the vCPU's ID rather than the vCPU index.
I meant to avoid 'vcpu-id' in the commit log because we're talking about the
vCPU index instead of the vCPU ID. Otherwise, readers or reviewers will be
confused by it.

> 
>>> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>> ---
>>>    hw/arm/virt.c    | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>    target/arm/cpu.c |  4 +++
>>>    target/arm/cpu.h |  4 +++
>>>    3 files changed, 71 insertions(+)
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 7d9dbc2663..57fe97c242 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -221,6 +221,11 @@ static const char *valid_cpus[] = {
>>>        ARM_CPU_TYPE_NAME("max"),
>>>    };
>>>
>>> +static int virt_get_socket_id(const MachineState *ms, int cpu_index);
>>> +static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
>>> +static int virt_get_core_id(const MachineState *ms, int cpu_index);
>>> +static int virt_get_thread_id(const MachineState *ms, int cpu_index);
>>> +
>>>    static bool cpu_type_valid(const char *cpu)
>>>    {
>>>        int i;
>>> @@ -2168,6 +2173,14 @@ static void machvirt_init(MachineState *machine)
>>>                              &error_fatal);
>>>
>>>            aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>>> +        object_property_set_int(cpuobj, "socket-id",
>>> +                                virt_get_socket_id(machine, n), NULL);
>>> +        object_property_set_int(cpuobj, "cluster-id",
>>> +                                virt_get_cluster_id(machine, n), NULL);
>>> +        object_property_set_int(cpuobj, "core-id",
>>> +                                virt_get_core_id(machine, n), NULL);
>>> +        object_property_set_int(cpuobj, "thread-id",
>>> +                                virt_get_thread_id(machine, n), NULL);
>>>
>>>            if (!vms->secure) {
>>>                object_property_set_bool(cpuobj, "has_el3", false, NULL);
>>> @@ -2652,10 +2665,59 @@ static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>>        return socket_id % ms->numa_state->num_nodes;
>>>    }
>>>
>>
>> It seems it's not necessary to keep virt_get_{socket, cluster, core, thread}_id()
>> because they're called only once. I would suggest figuring out the socket, cluster,
>> core and thread IDs through @possible_cpus in machvirt_init(), like below.
> 
> It is always good to access properties through accessor functions. Besides, the
> main purpose here was to keep the code neat, so I would stick with these.
> 
> But because these are not specific to the virt machine, I can move them to
> some other place or a header file so that other architectures can also use
> them.
> 

No, these functions aren't property accessors at all. Actually, they're figuring
out the IDs passed to object_property_set_int(). I don't see the benefit of
keeping functions which are called only once, and their logic is simple enough
to be integrated into the callers.

> 
>> Besides, we can't always expose the "cluster-id" property since the cluster level
>> in the CPU topology isn't always supported; see MachineClass::smp_props. Some users
>> may want to hide clusters for unknown reasons, and 'cluster-id' shouldn't be exposed
>> in this case. Otherwise, users may be confused by the 'cluster-id' property while it
>> has been disabled. For example, a VM is started with the following command lines
>> and 'cluster-id' shouldn't be supported in vCPU hot-add.
> 
> True. All we are talking about is 4 integers' worth of space. This is to avoid the
> complexity of checks everywhere in the code, by having these variables always
> exist with default values of 0. If the architecture does not define the property,
> it will not use these variables. It is a small tradeoff of memory against the
> maintainability of the code. I would prefer the latter.
> 
> We can definitely put some comments in the places of their declaration.
> 

I'm not sure if a comment will resolve the potential issue. It sounds weird that
the 'cluster-id' property exists even when that level of CPU topology isn't
supported at all.

> 
>>       -cpu host -smp=maxcpus=2,cpus=1,sockets=2,cores=1,threads=1
>>       (qemu) device_add host,id=cpu1,socket-id=1,cluster-id=0,core-id=0,thread-id=0
>>
>>       object_property_set_int(cpuobj, "socket-id",
>>                               possible_cpus->cpus[i].props.socket_id, NULL);
>>       if (mc->smp_props.cluster_supported && mc->smp_props.has_clusters) {
>>           object_property_set_int(cpuobj, "cluster-id",
>>                                   possible_cpus->cpus[i].props.cluster_id, NULL);
>>       }
> 
> Exactly, these types of checks can be avoided. They make the code look
> unnecessarily complex and ugly.
> 

As explained above, the 'cluster-id' property shouldn't exist if the cluster
level isn't supported in the CPU topology.

>>       object_property_set_int(cpuobj, "core-id",
>>                               possible_cpus->cpus[i].props.core_id, NULL);
>>       object_property_set_int(cpuobj, "thread-id",
>>                               possible_cpus->cpus[i].props.thread_id, NULL);
>>
>>> +static int virt_get_socket_id(const MachineState *ms, int cpu_index)
>>> +{
>>> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>>> +
>>> +    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
>>> +}
>>> +
>>> +static int virt_get_cluster_id(const MachineState *ms, int cpu_index)
>>> +{
>>> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>>> +
>>> +    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
>>> +}
>>> +
>>> +static int virt_get_core_id(const MachineState *ms, int cpu_index)
>>> +{
>>> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>>> +
>>> +    return ms->possible_cpus->cpus[cpu_index].props.core_id;
>>> +}
>>> +
>>> +static int virt_get_thread_id(const MachineState *ms, int cpu_index)
>>> +{
>>> +    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
>>> +
>>> +    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
>>> +}
>>> +
>>> +static int
>>> +virt_get_cpu_id_from_cpu_topo(const MachineState *ms, DeviceState *dev)
>>> +{
>>> +    int cpu_id, sock_vcpu_num, clus_vcpu_num, core_vcpu_num;
>>> +    ARMCPU *cpu = ARM_CPU(dev);
>>> +
>>> +    /* calculate total logical cpus across socket/cluster/core */
>>> +    sock_vcpu_num = cpu->socket_id * (ms->smp.threads * ms->smp.cores *
>>> +                    ms->smp.clusters);
>>> +    clus_vcpu_num = cpu->cluster_id * (ms->smp.threads * ms->smp.cores);
>>> +    core_vcpu_num = cpu->core_id * ms->smp.threads;
>>> +
>>> +    /* get vcpu-id (logical cpu index) for this vcpu from this topology */
>>> +    cpu_id = (sock_vcpu_num + clus_vcpu_num + core_vcpu_num) + cpu->thread_id;
>>> +
>>> +    assert(cpu_id >= 0 && cpu_id < ms->possible_cpus->len);
>>> +
>>> +    return cpu_id;
>>> +}
>>> +
>>
>> This function is called for once in PATCH[04/37]. I think it needs to be moved
>> around to PATCH[04/37].
> 
> 
> Yes, we can do that.
> 

Ok.

>> [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common
>> to vCPU {cold|hot}-plug
>>
>> The function name can be shortened because I don't see the suffix "_from_cpu_topo"
>> is too much helpful. I think virt_get_cpu_index() would be good enough since it's
>> called for once to return the index in array MachineState::possible_cpus::cpus[]
>> and the return value is stored to CPUState::cpu_index
> 
> This is not an accessor function. It derives the unique vcpu-id from the
> topology, hence the naming is correct. Though I can shorten the name to
> something like the below if you wish:
> 
> virt_get_cpu_id_from_cpu_topo() -> virt_cpu_id_from_topology()
> 
> 
> The name virt_get_cpu_index() suggests the function is something like the
> below, which it is not:
> 
> virt_get_cpu_index()
> {
>     return cs->cpu_index;
> }
> 

Well, the point was to indicate that a 'CPU index' instead of a 'CPU ID' is
returned from this function, as clarified in the commit log. Please don't mix
up 'CPU index' and 'CPU ID' even if they're the same in some situations on
aarch64. So how about renaming it to virt_cpu_index_from_topology()?

> 
> 
>> static int virt_get_cpu_index(const MachineState *ms, ARMCPU *cpu)
>> {
>>       int index, cpus_in_socket, cpus_in_cluster, cpus_in_core;
>>
>>       /*
>>        * It's fine to take cluster into account even it's not supported. In this
>>        * case, ms->smp.clusters is always one.
>>        */
>> }
>>
>>>    static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState
>> *ms)
>>>    {
>>>        int n;
>>>        unsigned int max_cpus = ms->smp.max_cpus;
>>> +    unsigned int smp_threads = ms->smp.threads;
>>>        VirtMachineState *vms = VIRT_MACHINE(ms);
>>>        MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>
>>> @@ -2669,6 +2731,7 @@ static const CPUArchIdList
>> *virt_possible_cpu_arch_ids(MachineState *ms)
>>>        ms->possible_cpus->len = max_cpus;
>>>        for (n = 0; n < ms->possible_cpus->len; n++) {
>>>            ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>> +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>>>            ms->possible_cpus->cpus[n].arch_id =
>>>                virt_cpu_mp_affinity(vms, n);
>>>
>>
>> This initialization seems to accommodate the HMP command
>> "info hotpluggable-cpus". It would be nice if it could be mentioned in the
>> commit log.
>>
>>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>>> index 93c28d50e5..1376350416 100644
>>> --- a/target/arm/cpu.c
>>> +++ b/target/arm/cpu.c
>>> @@ -2277,6 +2277,10 @@ static Property arm_cpu_properties[] = {
>>>        DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
>>>                            mp_affinity, ARM64_AFFINITY_INVALID),
>>>        DEFINE_PROP_INT32("node-id", ARMCPU, node_id,
>> CPU_UNSET_NUMA_NODE_ID),
>>> +    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
>>> +    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
>>> +    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
>>> +    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
>>>        DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
>>>        DEFINE_PROP_END_OF_LIST()
>>>    };
>>
>> All those 4 properties are used for vCPU hot-add, meaning they're not needed
>> when vCPU hotplug isn't supported on the specific board. Even for hw/virt board,
>> cluster isn't always supported and 'cluster-id' shouldn't always be exposed,
>> as explained above. How about to register the properties dynamically only when
>> they're needed by vCPU hotplug?
> 
> 
> Yes, these are part of arch-specific files, so it is up to the arch whether
> to define them at all.
> 
> Yes, and as mentioned earlier, there is an extra bit of memory (4 integers)
> being used. I would trade that off against maintainability.
> 

Right, making sense to me.

>>> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
>>> index 88e5accda6..d51d39f621 100644
>>> --- a/target/arm/cpu.h
>>> +++ b/target/arm/cpu.h
>>> @@ -1094,6 +1094,10 @@ struct ArchCPU {
>>>        QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
>>>
>>>        int32_t node_id; /* NUMA node this CPU belongs to */
>>> +    int32_t socket_id;
>>> +    int32_t cluster_id;
>>> +    int32_t core_id;
>>> +    int32_t thread_id;
>>
>> It would be fine to keep those fields even if the corresponding properties are
>> dynamically registered, but a little bit of memory overhead is incurred :)
> 
> You are contradicting yourself here ;)
> 

Correct. I was wondering if we need the properties even when vCPU hotplug
isn't supported. As you explained above, I agree with you that it's fine
to keep these properties even when vCPU hotplug is unsupported.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs
  2023-10-02 10:21     ` Salil Mehta via
  2023-10-02 10:21       ` Salil Mehta
@ 2023-10-03  5:34       ` Gavin Shan
  1 sibling, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2023-10-03  5:34 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Salil,

On 10/2/23 20:21, Salil Mehta wrote:
>> From: Gavin Shan <gshan@redhat.com>
>> Sent: Wednesday, September 27, 2023 4:54 AM
>> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
>> arm@nongnu.org
>> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
>> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
>> peter.maydell@linaro.org; richard.henderson@linaro.org;
>> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
>> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
>> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
>> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
>> linux@armlinux.org.uk; darren@os.amperecomputing.com;
>> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
>> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
>> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
>> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
>> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
>> Subject: Re: [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for
>> possible vCPUs
>>
>> Hi Salil,
>>
>> On 9/26/23 20:04, Salil Mehta wrote:
>>> Adds various utility functions which might be required to fetch or check
>> the
>>> state of the possible vCPUs. This also introduces concept of *disabled*
>> vCPUs,
>>> which are part of the *possible* vCPUs but are not part of the *present*
>> vCPU.
>>> This state shall be used during machine init time to check the presence
>> of
>>> vcpus.
>>     ^^^^^
>>
>>     vCPUs
> 
> 
> Yes. Thanks.
> 
> 
>>> Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>> ---
>>>    cpus-common.c         | 31 +++++++++++++++++++++++++
>>>    include/hw/core/cpu.h | 53 +++++++++++++++++++++++++++++++++++++++++++
>>>    2 files changed, 84 insertions(+)
>>>
>>> diff --git a/cpus-common.c b/cpus-common.c
>>> index 45c745ecf6..24c04199a1 100644
>>> --- a/cpus-common.c
>>> +++ b/cpus-common.c
>>> @@ -24,6 +24,7 @@
>>>    #include "sysemu/cpus.h"
>>>    #include "qemu/lockable.h"
>>>    #include "trace/trace-root.h"
>>> +#include "hw/boards.h"
>>>
>>>    QemuMutex qemu_cpu_list_lock;
>>>    static QemuCond exclusive_cond;
>>> @@ -107,6 +108,36 @@ void cpu_list_remove(CPUState *cpu)
>>>        cpu_list_generation_id++;
>>>    }
>>>
>>> +CPUState *qemu_get_possible_cpu(int index)
>>> +{
>>> +    MachineState *ms = MACHINE(qdev_get_machine());
>>> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
>>> +
>>> +    assert((index >= 0) && (index < possible_cpus->len));
>>> +
>>> +    return CPU(possible_cpus->cpus[index].cpu);
>>> +}
>>> +
>>> +bool qemu_present_cpu(CPUState *cpu)
>>> +{
>>> +    return cpu;
>>> +}
>>> +
>>> +bool qemu_enabled_cpu(CPUState *cpu)
>>> +{
>>> +    return cpu && !cpu->disabled;
>>> +}
>>> +
>>
>> I do think it's a good idea to have wrappers to check the CPU's states, since
>> these CPU states play an important role in this series to support vCPU hotplug.
>> However, it would be nice to move them into a header file (include/hw/boards.h)
>> because all the checks originate from ms->possible_cpus->cpus[]. They sound like
>> functions belonging to a machine (board) rather than the global scope. Besides,
>> it would be nice to have the same input (index) for all functions. How about
>> something like the below in include/hw/boards.h?
> 
> These are operations related to CPUState and hence cpus-common.c seems
> more appropriate to me. You can see that similar functions like qemu_get_cpu()
> already exist in the same file.
> 
> Yes, some operations do make use of the possible list, which is maintained at
> board level, but eventually what we are returning is the CPUState.
> 
> I am open to moving some of the above to board level, but not all: checks
> like present and enabled should exist in this file only. Overall, I would
> prefer to keep all of them in this file.
> 

There are two lists (arrays): ms->possible_cpus->cpus[] and cpus-common.c::cpus.
The former is a board's property and the latter is a global property. In our
implementation, the vCPU state depends on ms->possible_cpus->cpus[], for example:

- The possible vCPU is determined by checking that its index falls in the range
   [0, ms->possible_cpus->len)
- The present vCPU is determined by checking ms->possible_cpus->cpus[index].cpu

However, the other two states have to be determined by checking CPUState:

- CPUState::acpi_persistent, for always-present vCPUs
- CPUState::disabled, for enabled vCPUs

As suggested in other replies, we may manage the vCPU states at board level
due to the fact that the vCPU state changes on creation, hot-add or hot-remove,
which are all driven by the board. Besides, the hotplug handler is managed by
the board. Lastly, scattering the information that helps to determine vCPU
states across different places (ms->possible_cpus->cpus[] and CPUState) seems
not a good idea.

In order to maintain all the information at board level, 'struct CPUArchId'
needs some adaptation like below. With it, the vCPU states can be determined
from ms->possible_cpus.

#define CPU_ARCH_ID_FLAG_ALWAYS_PRESENT		(1UL << 0)
#define CPU_ARCH_ID_FLAG_ENABLED		(1UL << 1)
typedef struct CPUArchId {
     unsigned long flags;
        :
} CPUArchId;

> 
>>
>> static inline bool machine_has_possible_cpu(int index)
>> {
>>       MachineState *ms = MACHINE(qdev_get_machine());
>>
>>       if (!ms || !ms->possible_cpus || index < 0 ||
>>           index >= ms->possible_cpus->len) {
>>           return false;
>>       }
>>
>>       return true;
>> }
>>
>> static inline bool machine_has_present_cpu(int index)
>> {
>>       MachineState *ms = MACHINE(qdev_get_machine());
>>
>>       if (!machine_has_possible_cpu(index) ||
>>           !ms->possible_cpus->cpus[index].cpu) {
>>           return false;
>>       }
>>
>>       return true;
>> }
>>
>> static inline bool machine_has_enabled_cpu(int index)
>> {
>>       MachineState *ms = MACHINE(qdev_get_machine());
>>       CPUState *cs;
>>
>>       if (!machine_has_present_cpu(index)) {
>>           return false;
>>       }
>>
>>       cs = CPU(ms->possible_cpus->cpus[index].cpu);
>>       return !cs->disabled;
>> }
>>
>>> +uint64_t qemu_get_cpu_archid(int cpu_index)
>>> +{
>>> +    MachineState *ms = MACHINE(qdev_get_machine());
>>> +    const CPUArchIdList *possible_cpus = ms->possible_cpus;
>>> +
>>> +    assert((cpu_index >= 0) && (cpu_index < possible_cpus->len));
>>> +
>>> +    return possible_cpus->cpus[cpu_index].arch_id;
>>> +}
>>> +
>>
>> I think it's unnecessary to keep it since it's called for once by
>> hw/arm/virt-acpi-build.c::build_madt. The architectural ID can be
>> directly fetched from possible_cpus->cpus[i].arch_id. It's fine
>> to drop this function and fold the logic into the following patch.
> 
> It is a very useful accessor API. I can see this code being
> replicated everywhere, which also means its related variables
> are repeatedly defined many times.
> 
> Maybe this is being used only once now. But it can be used across
> architectures later.
> 

Ok, then please make it inline at least.

>>
>> [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with
>> possible vCPUs
>>
>>
>>>    CPUState *qemu_get_cpu(int index)
>>>    {
>>>        CPUState *cpu;
>>> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
>>> index fdcbe87352..e5af79950c 100644
>>> --- a/include/hw/core/cpu.h
>>> +++ b/include/hw/core/cpu.h
>>> @@ -413,6 +413,17 @@ struct CPUState {
>>>        SavedIOTLB saved_iotlb;
>>>    #endif
>>>
>>> +    /*
>>> +     * Some architectures do not allow *presence* of vCPUs to be changed
>>> +     * after guest has booted using information specified by VMM/firmware
>>> +     * via ACPI MADT at the boot time. Thus to enable vCPU hotplug on these
>>> +     * architectures possible vCPU can have CPUState object in 'disabled'
>>> +     * state or can also not have CPUState object at all. This is possible
>>> +     * when vCPU Hotplug is supported and vCPUs are 'yet-to-be-plugged' in
>>> +     * the QOM or have been hot-unplugged.
>>> +     * By default every CPUState is enabled as of now across all archs.
>>> +     */
>>> +    bool disabled;
>>>        /* TODO Move common fields from CPUArchState here. */
>>>        int cpu_index;
>>>        int cluster_index;
>>
>> I guess the comments can be simplified a bit. How about something like
>> below?
>>       /*
>>        * In order to support vCPU hotplug on architectures like aarch64,
>>        * the vCPU states fall into possible, present or enabled. This field
>>        * is added to distinguish present and enabled vCPUs. By default, all
>>        * vCPUs are present and enabled.
>>        */
> 
> I can definitely try to simplify it, but the above does not properly convey
> the reason why we require the disabled state.
> 

Ok, I think the association between this field and MADT still needs to be
mentioned.
  
>>
>>> @@ -770,6 +781,48 @@ static inline bool cpu_in_exclusive_context(const
>> CPUState *cpu)
>>>     */
>>>    CPUState *qemu_get_cpu(int index);
>>>
>>> +/**
>>> + * qemu_get_possible_cpu:
>>> + * @index: The CPUState@cpu_index value of the CPU to obtain.
>>> + *         Input index MUST be in range [0, Max Possible CPUs)
>>> + *
>>> + * If CPUState object exists,then it gets a CPU matching
>>> + * @index in the possible CPU array.
>>> + *
>>> + * Returns: The possible CPU or %NULL if CPU does not exist.
>>> + */
>>> +CPUState *qemu_get_possible_cpu(int index);
>>> +
>>> +/**
>>> + * qemu_present_cpu:
>>> + * @cpu: The vCPU to check
>>> + *
>>> + * Checks if the vCPU is amongst the present possible vcpus.
>>> + *
>>> + * Returns: True if it is present possible vCPU else false
>>> + */
>>> +bool qemu_present_cpu(CPUState *cpu);
>>> +
>>> +/**
>>> + * qemu_enabled_cpu:
>>> + * @cpu: The vCPU to check
>>> + *
>>> + * Checks if the vCPU is enabled.
>>> + *
>>> + * Returns: True if it is 'enabled' else false
>>> + */
>>> +bool qemu_enabled_cpu(CPUState *cpu);
>>> +
>>> +/**
>>> + * qemu_get_cpu_archid:
>>> + * @cpu_index: possible vCPU for which arch-id needs to be retrieved
>>> + *
>>> + * Fetches the vCPU arch-id from the present possible vCPUs.
>>> + *
>>> + * Returns: arch-id of the possible vCPU
>>> + */
>>> +uint64_t qemu_get_cpu_archid(int cpu_index);
>>> +

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation,parking} code
  2023-10-02 16:20     ` Salil Mehta via
  2023-10-02 16:20       ` Salil Mehta
@ 2023-10-03  5:39       ` Gavin Shan
  1 sibling, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2023-10-03  5:39 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai


Hi Salil,

On 10/3/23 02:20, Salil Mehta wrote:
>> From: Gavin Shan <gshan@redhat.com>
>> Sent: Wednesday, September 27, 2023 7:52 AM
>> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
>> arm@nongnu.org
>> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
>> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
>> peter.maydell@linaro.org; richard.henderson@linaro.org;
>> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
>> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
>> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
>> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
>> linux@armlinux.org.uk; darren@os.amperecomputing.com;
>> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
>> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
>> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
>> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
>> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
>> Subject: Re: [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU
>> {creation,parking} code
>>
>> Hi Salil,
>>
>> On 9/26/23 20:04, Salil Mehta wrote:
>>> KVM vCPU creation is done once during the initialization of the VM when Qemu
>>> threads are spawned. This is common to all the architectures. If the architecture
>>> supports vCPU hot-{un}plug then this KVM vCPU creation could be deferred to
>>> later point as well. Some architectures might in any case create KVM vCPUs for
>>> the yet-to-be plugged vCPUs (i.e. QoM Object & thread does not exists) during VM
>>> init time and park them.
>>>
>>> Hot-unplug of vCPU results in destruction of the vCPU objects in QOM but
>>> the KVM vCPU objects in the Host KVM are not destroyed and their representative
>>> KVM vCPU objects in Qemu are parked.
>>>
>>> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>> ---
>>>    accel/kvm/kvm-all.c  | 61 ++++++++++++++++++++++++++++++++++----------
>>>    include/sysemu/kvm.h |  2 ++
>>>    2 files changed, 49 insertions(+), 14 deletions(-)
>>>
>>
>> The most important point seems to be missing from the commit log: The KVM vCPU
>> objects, including the hotpluggable ones, need to be in place before the
>> in-host GICv3 is initialized. So we need to expose kvm_create_vcpu() to put
>> those KVM vCPU objects in place, even for the non-present vCPUs.
> 
> 
> This is a patch common to all architectures. The point you are making is specific
> to the ARM architecture. This patch is now part of the common patch-set. Here,
> 
> https://lore.kernel.org/qemu-devel/20230930001933.2660-1-salil.mehta@huawei.com/
> 
> 

Yes, I reviewed it again. Let's have more discussion over there.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 31/37] physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
  2023-09-26 10:04 ` [PATCH RFC V2 31/37] physmem, gdbstub: Common helping funcs/changes to *unrealize* vCPU Salil Mehta via
@ 2023-10-03  6:33   ` Philippe Mathieu-Daudé
  2023-10-03 10:22     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-10-03  6:33 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, eric.auger,
	will, ardb, oliver.upton, pbonzini, mst, gshan, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 26/9/23 12:04, Salil Mehta wrote:
> Supporting vCPU Hotplug for ARM arch also means introducing new functionality of
> unrealizing the ARMCPU. This requires some new common functions.
> 
> Defining them as part of architecture independent change so that this code could
> be reused by other interested parties.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   gdbstub/gdbstub.c         | 13 +++++++++++++
>   include/exec/cpu-common.h |  8 ++++++++
>   include/exec/gdbstub.h    |  1 +
>   include/hw/core/cpu.h     |  1 +
>   softmmu/physmem.c         | 25 +++++++++++++++++++++++++
>   5 files changed, 48 insertions(+)


> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index dab572c9bd..ffd815a0d8 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -366,6 +366,7 @@ struct CPUState {
>       QSIMPLEQ_HEAD(, qemu_work_item) work_list;
>   
>       CPUAddressSpace *cpu_ases;
> +    int cpu_ases_ref_count;
>       int num_ases;
>       AddressSpace *as;
>       MemoryRegion *memory;
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 3df73542e1..a93ae783af 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -762,6 +762,7 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
>   
>       if (!cpu->cpu_ases) {
>           cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
> +        cpu->cpu_ases_ref_count = cpu->num_ases;
>       }
>   
>       newas = &cpu->cpu_ases[asidx];
> @@ -775,6 +776,30 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
>       }
>   }
>   
> +void cpu_address_space_destroy(CPUState *cpu, int asidx)
> +{
> +    CPUAddressSpace *cpuas;
> +
> +    assert(asidx < cpu->num_ases);
> +    assert(asidx == 0 || !kvm_enabled());
> +    assert(cpu->cpu_ases);
> +
> +    cpuas = &cpu->cpu_ases[asidx];
> +    if (tcg_enabled()) {
> +        memory_listener_unregister(&cpuas->tcg_as_listener);
> +    }
> +
> +    address_space_destroy(cpuas->as);
> +    g_free_rcu(cpuas->as, rcu);
> +
> +    if (cpu->cpu_ases_ref_count == 1) {
> +        g_free(cpu->cpu_ases);
> +        cpu->cpu_ases = NULL;
> +    }
> +
> +    cpu->cpu_ases_ref_count--;

See Richard comment from:
https://lore.kernel.org/qemu-devel/594b2550-9a73-684f-6e54-29401dc6cd7a@linaro.org/

"I think it would be better to destroy all address spaces at once,
"so that you don't need  to invent a reference count that isn't used
"for anything else.

> +}
> +
>   AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
>   {
>       /* Return the AddressSpace corresponding to the specified index */



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 31/37] physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
  2023-10-03  6:33   ` [PATCH RFC V2 31/37] physmem,gdbstub: " Philippe Mathieu-Daudé
@ 2023-10-03 10:22     ` Salil Mehta via
  2023-10-03 10:22       ` Salil Mehta
  2023-10-04  9:17       ` Salil Mehta via
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-03 10:22 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, eric.auger,
	will, ardb, oliver.upton, pbonzini, mst, gshan, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Phil,

> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Sent: Tuesday, October 3, 2023 7:34 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
> alex.bennee@linaro.org; linux@armlinux.org.uk;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian
> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com;
> maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 31/37] physmem,gdbstub: Common helping
> funcs/changes to *unrealize* vCPU
> 
> Hi Salil,
> 
> On 26/9/23 12:04, Salil Mehta wrote:
> > Supporting vCPU Hotplug for ARM arch also means introducing new
> functionality of
> > unrealizing the ARMCPU. This requires some new common functions.
> >
> > Defining them as part of architecture independent change so that this
> code could
> > be reused by other interested parties.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   gdbstub/gdbstub.c         | 13 +++++++++++++
> >   include/exec/cpu-common.h |  8 ++++++++
> >   include/exec/gdbstub.h    |  1 +
> >   include/hw/core/cpu.h     |  1 +
> >   softmmu/physmem.c         | 25 +++++++++++++++++++++++++
> >   5 files changed, 48 insertions(+)
> 
> 
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index dab572c9bd..ffd815a0d8 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -366,6 +366,7 @@ struct CPUState {
> >       QSIMPLEQ_HEAD(, qemu_work_item) work_list;
> >
> >       CPUAddressSpace *cpu_ases;
> > +    int cpu_ases_ref_count;
> >       int num_ases;
> >       AddressSpace *as;
> >       MemoryRegion *memory;
> > diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> > index 3df73542e1..a93ae783af 100644
> > --- a/softmmu/physmem.c
> > +++ b/softmmu/physmem.c
> > @@ -762,6 +762,7 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
> >
> >       if (!cpu->cpu_ases) {
> >           cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
> > +        cpu->cpu_ases_ref_count = cpu->num_ases;
> >       }
> >
> >       newas = &cpu->cpu_ases[asidx];
> > @@ -775,6 +776,30 @@ void cpu_address_space_init(CPUState *cpu, int
> asidx,
> >       }
> >   }
> >
> > +void cpu_address_space_destroy(CPUState *cpu, int asidx)
> > +{
> > +    CPUAddressSpace *cpuas;
> > +
> > +    assert(asidx < cpu->num_ases);
> > +    assert(asidx == 0 || !kvm_enabled());
> > +    assert(cpu->cpu_ases);
> > +
> > +    cpuas = &cpu->cpu_ases[asidx];
> > +    if (tcg_enabled()) {
> > +        memory_listener_unregister(&cpuas->tcg_as_listener);
> > +    }
> > +
> > +    address_space_destroy(cpuas->as);
> > +    g_free_rcu(cpuas->as, rcu);
> > +
> > +    if (cpu->cpu_ases_ref_count == 1) {
> > +        g_free(cpu->cpu_ases);
> > +        cpu->cpu_ases = NULL;
> > +    }
> > +
> > +    cpu->cpu_ases_ref_count--;
> 
> See Richard comment from:
> https://lore.kernel.org/qemu-devel/594b2550-9a73-684f-6e54-
> 29401dc6cd7a@linaro.org/
> 
> "I think it would be better to destroy all address spaces at once,
> "so that you don't need  to invent a reference count that isn't used
> "for anything else.

Yes, we can do that and remove the reference count. The only reason I
added it was that I was not sure whether it is safe to assume that all
the AddressSpaces will always be destroyed *together*. And now that
this is being ported to other architectures, will the same hold
true everywhere?


Thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 31/37] physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
  2023-10-03 10:22     ` Salil Mehta via
  2023-10-03 10:22       ` Salil Mehta
@ 2023-10-04  9:17       ` Salil Mehta via
  2023-10-04  9:17         ` Salil Mehta
  1 sibling, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-04  9:17 UTC (permalink / raw)
  To: Salil Mehta, Philippe Mathieu-Daudé, qemu-devel
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, eric.auger,
	will, ardb, oliver.upton, pbonzini, mst, gshan, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Phil/Richard,

> From: qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org <qemu-arm-
> bounces+salil.mehta=huawei.com@nongnu.org> On Behalf Of Salil Mehta via
> Sent: Tuesday, October 3, 2023 11:23 AM
> 
> Hi Phil,
> 
> > From: Philippe Mathieu-Daudé <philmd@linaro.org>
> > Sent: Tuesday, October 3, 2023 7:34 AM
> > To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> > arm@nongnu.org
> > Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> > <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> > peter.maydell@linaro.org; richard.henderson@linaro.org;
> > imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> > eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> > oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> > gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
> > alex.bennee@linaro.org; linux@armlinux.org.uk;
> > darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> > vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
> > miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian
> > <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
> > wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com;
> > maobibo@loongson.cn; lixianglai@loongson.cn
> > Subject: Re: [PATCH RFC V2 31/37] physmem,gdbstub: Common helping
> > funcs/changes to *unrealize* vCPU
> >
> > Hi Salil,
> >
> > On 26/9/23 12:04, Salil Mehta wrote:
> > > Supporting vCPU Hotplug for ARM arch also means introducing new
> > functionality of
> > > unrealizing the ARMCPU. This requires some new common functions.
> > >
> > > Defining them as part of architecture independent change so that this
> > code could
> > > be reused by other interested parties.
> > >
> > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > > ---
> > >   gdbstub/gdbstub.c         | 13 +++++++++++++
> > >   include/exec/cpu-common.h |  8 ++++++++
> > >   include/exec/gdbstub.h    |  1 +
> > >   include/hw/core/cpu.h     |  1 +
> > >   softmmu/physmem.c         | 25 +++++++++++++++++++++++++
> > >   5 files changed, 48 insertions(+)
> >
> >
> > > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > > index dab572c9bd..ffd815a0d8 100644
> > > --- a/include/hw/core/cpu.h
> > > +++ b/include/hw/core/cpu.h
> > > @@ -366,6 +366,7 @@ struct CPUState {
> > >       QSIMPLEQ_HEAD(, qemu_work_item) work_list;
> > >
> > >       CPUAddressSpace *cpu_ases;
> > > +    int cpu_ases_ref_count;
> > >       int num_ases;
> > >       AddressSpace *as;
> > >       MemoryRegion *memory;
> > > diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> > > index 3df73542e1..a93ae783af 100644
> > > --- a/softmmu/physmem.c
> > > +++ b/softmmu/physmem.c
> > > @@ -762,6 +762,7 @@ void cpu_address_space_init(CPUState *cpu, int
> asidx,
> > >
> > >       if (!cpu->cpu_ases) {
> > >           cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
> > > +        cpu->cpu_ases_ref_count = cpu->num_ases;
> > >       }
> > >
> > >       newas = &cpu->cpu_ases[asidx];
> > > @@ -775,6 +776,30 @@ void cpu_address_space_init(CPUState *cpu, int
> > asidx,
> > >       }
> > >   }
> > >
> > > +void cpu_address_space_destroy(CPUState *cpu, int asidx)
> > > +{
> > > +    CPUAddressSpace *cpuas;
> > > +
> > > +    assert(asidx < cpu->num_ases);
> > > +    assert(asidx == 0 || !kvm_enabled());
> > > +    assert(cpu->cpu_ases);
> > > +
> > > +    cpuas = &cpu->cpu_ases[asidx];
> > > +    if (tcg_enabled()) {
> > > +        memory_listener_unregister(&cpuas->tcg_as_listener);
> > > +    }
> > > +
> > > +    address_space_destroy(cpuas->as);
> > > +    g_free_rcu(cpuas->as, rcu);
> > > +
> > > +    if (cpu->cpu_ases_ref_count == 1) {
> > > +        g_free(cpu->cpu_ases);
> > > +        cpu->cpu_ases = NULL;
> > > +    }
> > > +
> > > +    cpu->cpu_ases_ref_count--;
> >
> > See Richard comment from:
> > https://lore.kernel.org/qemu-devel/594b2550-9a73-684f-6e54-
> > 29401dc6cd7a@linaro.org/
> >
> > "I think it would be better to destroy all address spaces at once,
> > "so that you don't need  to invent a reference count that isn't used
> > "for anything else.
> 
> Yes, we can do that and remove the reference count. The only reason I
> did it was because I was not sure if it is safe to assume that all
> the AddressSpace will always be destroyed *together*. And now since
> this is being ported to other architectures will the same hold
> true everywhere?

(sorry, I missed key point)

To make things clearer: for ARM, the presence of tagged/secure memory is
optional (and I am not even sure all of these are supported with
accel=KVM). The address-space destruction function is common to all
architectures, and hence it is not safe to destroy all of them together.

https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/T/#mfb2a525081c412917a0026d558e72f48875e386d

+static void arm_cpu_unrealizefn(DeviceState *dev)
+{
+    ARMCPUClass *acc = ARM_CPU_GET_CLASS(dev);
+    ARMCPU *cpu = ARM_CPU(dev);
+    CPUARMState *env = &cpu->env;
+    CPUState *cs = CPU(dev);
+    bool has_secure;
+
+    has_secure = cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY);
+
+    /* rock 'n' un-roll, whatever happened in the arm_cpu_realizefn cleanly */
+    cpu_address_space_destroy(cs, ARMASIdx_NS);
+
+    if (cpu->tag_memory != NULL) {
+        cpu_address_space_destroy(cs, ARMASIdx_TagNS);
+        if (has_secure) {
+            cpu_address_space_destroy(cs, ARMASIdx_TagS);
+        }
+    }
+
+    if (has_secure) {
+        cpu_address_space_destroy(cs, ARMASIdx_S);
+    }
 [...]
}



@Richard, please let me know if I understood your comment correctly?


Thanks
Salil.







^ permalink raw reply	[flat|nested] 153+ messages in thread


* Re: [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function
  2023-09-26 10:04 ` [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
  2023-09-27  5:16   ` Gavin Shan
@ 2023-10-10  6:46   ` Shaoqin Huang
  2023-10-10  9:47     ` Salil Mehta via
  1 sibling, 1 reply; 153+ messages in thread
From: Shaoqin Huang @ 2023-10-10  6:46 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai



On 9/26/23 18:04, Salil Mehta via wrote:
> Factor out CPU properties code common for {hot,cold}-plugged CPUs. This allows
> code reuse.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> ---
>   hw/arm/virt.c         | 220 ++++++++++++++++++++++++++----------------
>   include/hw/arm/virt.h |   4 +
>   2 files changed, 140 insertions(+), 84 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 57fe97c242..0eb6bf5a18 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2018,16 +2018,130 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
>       }
>   }
>   
> +static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
> +                                    Error **errp)
> +{

Hi Salil,

This patch seems to break the build: the virt_cpu_set_properties() function 
is defined but not used in this patch, so the original code removed from 
machvirt_init() no longer runs.

We should call this function from machvirt_init().
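
The breakage described here is the generic problem of a static helper being factored out with no caller in the same patch. A minimal standalone C sketch (hypothetical names, not the QEMU code) of why the same patch must also rewire the call site:

```c
#include <assert.h>

/* Hypothetical miniature of the review situation: a helper is factored
 * out of an init function in one patch. */
static int set_cpu_props(int base)
{
    return base + 1;   /* stand-in for the property-setting work */
}

/* The same patch must also rewire the original call site; a static
 * helper with no caller both trips -Wunused-function under
 * -Wall -Werror and leaves the old inline behaviour silently gone. */
int init_cpu(int base)
{
    return set_cpu_props(base);
}
```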

> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    VirtMachineState *vms = VIRT_MACHINE(ms);
> +    Error *local_err = NULL;
> +    VirtMachineClass *vmc;
> +
> +    vmc = VIRT_MACHINE_GET_CLASS(ms);
> +
> +    /* now, set the cpu object property values */
> +    numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
> +
> +    if (!vms->secure) {
> +        object_property_set_bool(cpuobj, "has_el3", false, NULL);
> +    }
> +
> +    if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
> +        object_property_set_bool(cpuobj, "has_el2", false, NULL);
> +    }
> +
> +    if (vmc->kvm_no_adjvtime &&
> +        object_property_find(cpuobj, "kvm-no-adjvtime")) {
> +        object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
> +    }
> +
> +    if (vmc->no_kvm_steal_time &&
> +        object_property_find(cpuobj, "kvm-steal-time")) {
> +        object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
> +    }
> +
> +    if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
> +        object_property_set_bool(cpuobj, "pmu", false, NULL);
> +    }
> +
> +    if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
> +        object_property_set_bool(cpuobj, "lpa2", false, NULL);
> +    }
> +
> +    if (object_property_find(cpuobj, "reset-cbar")) {
> +        object_property_set_int(cpuobj, "reset-cbar",
> +                                vms->memmap[VIRT_CPUPERIPHS].base,
> +                                &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +    }
> +
> +    /* link already initialized {secure,tag}-memory regions to this cpu */
> +    object_property_set_link(cpuobj, "memory", OBJECT(vms->sysmem), &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    if (vms->secure) {
> +        object_property_set_link(cpuobj, "secure-memory",
> +                                 OBJECT(vms->secure_sysmem), &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +    }
> +
> +    if (vms->mte) {
> +        if (!object_property_find(cpuobj, "tag-memory")) {
> +            error_setg(&local_err, "MTE requested, but not supported "
> +                       "by the guest CPU");
> +            if (local_err) {
> +                goto out;
> +            }
> +        }
> +
> +        object_property_set_link(cpuobj, "tag-memory", OBJECT(vms->tag_sysmem),
> +                                 &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +
> +        if (vms->secure) {
> +            object_property_set_link(cpuobj, "secure-tag-memory",
> +                                     OBJECT(vms->secure_tag_sysmem),
> +                                     &local_err);
> +            if (local_err) {
> +                goto out;
> +            }
> +        }
> +    }
> +
> +    /*
> +     * RFC: Question: this must only be called for the hotplugged cpus. For the
> +     * cold booted secondary cpus this is being taken care in arm_load_kernel()
> +     * in boot.c. Perhaps we should remove that code now?
> +     */
> +    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
> +        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
> +                                NULL);
> +
> +        /* Secondary CPUs start in PSCI powered-down state */
> +        if (CPU(cpuobj)->cpu_index > 0) {
> +            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
> +        }
> +    }

Besides, if this patch is meant to be a pure refactor, we could move the 
psci_conduit check to a later patch and keep this patch clean.

Thanks,
Shaoqin

> +
> +out:
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
> +    return;
> +}
> +
>   static void machvirt_init(MachineState *machine)
>   {
>       VirtMachineState *vms = VIRT_MACHINE(machine);
>       VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
>       MachineClass *mc = MACHINE_GET_CLASS(machine);
>       const CPUArchIdList *possible_cpus;
> -    MemoryRegion *sysmem = get_system_memory();
> +    MemoryRegion *secure_tag_sysmem = NULL;
>       MemoryRegion *secure_sysmem = NULL;
>       MemoryRegion *tag_sysmem = NULL;
> -    MemoryRegion *secure_tag_sysmem = NULL;
> +    MemoryRegion *sysmem;
>       int n, virt_max_cpus;
>       bool firmware_loaded;
>       bool aarch64 = true;
> @@ -2071,6 +2185,8 @@ static void machvirt_init(MachineState *machine)
>        */
>       finalize_gic_version(vms);
>   
> +    sysmem = vms->sysmem = get_system_memory();
> +
>       if (vms->secure) {
>           /*
>            * The Secure view of the world is the same as the NonSecure,
> @@ -2078,7 +2194,7 @@ static void machvirt_init(MachineState *machine)
>            * containing the system memory at low priority; any secure-only
>            * devices go in at higher priority and take precedence.
>            */
> -        secure_sysmem = g_new(MemoryRegion, 1);
> +        secure_sysmem = vms->secure_sysmem = g_new(MemoryRegion, 1);
>           memory_region_init(secure_sysmem, OBJECT(machine), "secure-memory",
>                              UINT64_MAX);
>           memory_region_add_subregion_overlap(secure_sysmem, 0, sysmem, -1);
> @@ -2151,6 +2267,23 @@ static void machvirt_init(MachineState *machine)
>           exit(1);
>       }
>   
> +    if (vms->mte) {
> +        /* Create the memory region only once, but link to all cpus later */
> +        tag_sysmem = vms->tag_sysmem = g_new(MemoryRegion, 1);
> +        memory_region_init(tag_sysmem, OBJECT(machine),
> +                           "tag-memory", UINT64_MAX / 32);
> +
> +        if (vms->secure) {
> +            secure_tag_sysmem = vms->secure_tag_sysmem = g_new(MemoryRegion, 1);
> +            memory_region_init(secure_tag_sysmem, OBJECT(machine),
> +                               "secure-tag-memory", UINT64_MAX / 32);
> +
> +            /* As with ram, secure-tag takes precedence over tag.  */
> +            memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> +                                                tag_sysmem, -1);
> +        }
> +    }
> +
>       create_fdt(vms);
>   
>       assert(possible_cpus->len == max_cpus);
> @@ -2163,15 +2296,10 @@ static void machvirt_init(MachineState *machine)
>           }
>   
>           cpuobj = object_new(possible_cpus->cpus[n].type);
> -        object_property_set_int(cpuobj, "mp-affinity",
> -                                possible_cpus->cpus[n].arch_id, NULL);
>   
>           cs = CPU(cpuobj);
>           cs->cpu_index = n;
>   
> -        numa_cpu_pre_plug(&possible_cpus->cpus[cs->cpu_index], DEVICE(cpuobj),
> -                          &error_fatal);
> -
>           aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
>           object_property_set_int(cpuobj, "socket-id",
>                                   virt_get_socket_id(machine, n), NULL);
> @@ -2182,82 +2310,6 @@ static void machvirt_init(MachineState *machine)
>           object_property_set_int(cpuobj, "thread-id",
>                                   virt_get_thread_id(machine, n), NULL);
>   
> -        if (!vms->secure) {
> -            object_property_set_bool(cpuobj, "has_el3", false, NULL);
> -        }
> -
> -        if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
> -            object_property_set_bool(cpuobj, "has_el2", false, NULL);
> -        }
> -
> -        if (vmc->kvm_no_adjvtime &&
> -            object_property_find(cpuobj, "kvm-no-adjvtime")) {
> -            object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
> -        }
> -
> -        if (vmc->no_kvm_steal_time &&
> -            object_property_find(cpuobj, "kvm-steal-time")) {
> -            object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
> -        }
> -
> -        if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
> -            object_property_set_bool(cpuobj, "pmu", false, NULL);
> -        }
> -
> -        if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
> -            object_property_set_bool(cpuobj, "lpa2", false, NULL);
> -        }
> -
> -        if (object_property_find(cpuobj, "reset-cbar")) {
> -            object_property_set_int(cpuobj, "reset-cbar",
> -                                    vms->memmap[VIRT_CPUPERIPHS].base,
> -                                    &error_abort);
> -        }
> -
> -        object_property_set_link(cpuobj, "memory", OBJECT(sysmem),
> -                                 &error_abort);
> -        if (vms->secure) {
> -            object_property_set_link(cpuobj, "secure-memory",
> -                                     OBJECT(secure_sysmem), &error_abort);
> -        }
> -
> -        if (vms->mte) {
> -            /* Create the memory region only once, but link to all cpus. */
> -            if (!tag_sysmem) {
> -                /*
> -                 * The property exists only if MemTag is supported.
> -                 * If it is, we must allocate the ram to back that up.
> -                 */
> -                if (!object_property_find(cpuobj, "tag-memory")) {
> -                    error_report("MTE requested, but not supported "
> -                                 "by the guest CPU");
> -                    exit(1);
> -                }
> -
> -                tag_sysmem = g_new(MemoryRegion, 1);
> -                memory_region_init(tag_sysmem, OBJECT(machine),
> -                                   "tag-memory", UINT64_MAX / 32);
> -
> -                if (vms->secure) {
> -                    secure_tag_sysmem = g_new(MemoryRegion, 1);
> -                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
> -                                       "secure-tag-memory", UINT64_MAX / 32);
> -
> -                    /* As with ram, secure-tag takes precedence over tag.  */
> -                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> -                                                        tag_sysmem, -1);
> -                }
> -            }
> -
> -            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
> -                                     &error_abort);
> -            if (vms->secure) {
> -                object_property_set_link(cpuobj, "secure-tag-memory",
> -                                         OBJECT(secure_tag_sysmem),
> -                                         &error_abort);
> -            }
> -        }
> -
>           qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
>           object_unref(cpuobj);
>       }
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index e1ddbea96b..13163adb07 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -148,6 +148,10 @@ struct VirtMachineState {
>       DeviceState *platform_bus_dev;
>       FWCfgState *fw_cfg;
>       PFlashCFI01 *flash[2];
> +    MemoryRegion *sysmem;
> +    MemoryRegion *secure_sysmem;
> +    MemoryRegion *tag_sysmem;
> +    MemoryRegion *secure_tag_sysmem;
>       bool secure;
>       bool highmem;
>       bool highmem_compact;

-- 
Shaoqin



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function
  2023-10-10  6:46   ` Shaoqin Huang
@ 2023-10-10  9:47     ` Salil Mehta via
  2023-10-10  9:47       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-10  9:47 UTC (permalink / raw)
  To: Shaoqin Huang, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Shaoqin,

> From: Shaoqin Huang <shahuang@redhat.com>
> Sent: Tuesday, October 10, 2023 7:47 AM
> 
> On 9/26/23 18:04, Salil Mehta via wrote:
> > Factor out CPU properties code common for {hot,cold}-plugged CPUs. This allows
> > code reuse.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c         | 220 ++++++++++++++++++++++++++----------------
> >   include/hw/arm/virt.h |   4 +
> >   2 files changed, 140 insertions(+), 84 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 57fe97c242..0eb6bf5a18 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2018,16 +2018,130 @@ static void virt_cpu_post_init(VirtMachineState
> *vms, MemoryRegion *sysmem)
> >       }
> >   }
> >
> > +static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId
> *cpu_slot,
> > +                                    Error **errp)
> > +{
> 
> Hi Salil,
> 
> This patch seems break the code, the virt_cpu_set_properties() function
> being defined but not used in this patch, so those original code in the
> machvirt_init() just not work.


Good catch. 

BTW, leaving the function unused in this patch was intentional, as I wanted
to clearly show the move. But I will fix the compilation break in this patch
with some trick.

Thanks for identifying!
Salil.


> We should use this function in the machvirt_init().
> 
> > +    MachineState *ms = MACHINE(qdev_get_machine());
> > +    VirtMachineState *vms = VIRT_MACHINE(ms);
> > +    Error *local_err = NULL;
> > +    VirtMachineClass *vmc;
> > +
> > +    vmc = VIRT_MACHINE_GET_CLASS(ms);
> > +
> > +    /* now, set the cpu object property values */
> > +    numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
> > +    if (local_err) {
> > +        goto out;
> > +    }
> > +
> > +    object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
> > +

[...]

> > +    /*
> > +     * RFC: Question: this must only be called for the hotplugged cpus. For the
> > +     * cold booted secondary cpus this is being taken care in arm_load_kernel()
> > +     * in boot.c. Perhaps we should remove that code now?
> > +     */
> > +    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
> > +        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
> > +                                NULL);
> > +
> > +        /* Secondary CPUs start in PSCI powered-down state */
> > +        if (CPU(cpuobj)->cpu_index > 0) {
> > +            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
> > +        }
> > +    }
> 
> Besides, if this patch is just factor out the code, we could move the
> check psci_conduit to later patch, and keep this patch clean.

I do not see a reason why we should do that.


Thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread


* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (31 preceding siblings ...)
  2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
@ 2023-10-11 10:23 ` Vishnu Pajjuri
  2023-10-11 10:32   ` Salil Mehta via
  2023-10-12 17:02 ` Miguel Luis
  33 siblings, 1 reply; 153+ messages in thread
From: Vishnu Pajjuri @ 2023-10-11 10:23 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, jonathan.cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

Hi Salil,

On 26-09-2023 15:33, Salil Mehta wrote:
> [ *REPEAT: Sent patches got held at internal server yesterday* ]
>
> PROLOGUE
> ========
>
> To assist in review and set the right expectations from this RFC, please first
> read below sections *APPENDED AT THE END* of this cover letter,
>
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of the leftover work or work-in-progress [Section (IX)]
>
> NOTE: There has been an interest shown by other organizations in adapting
> this series for their architecture. I am planning to split this RFC into
> architecture *agnostic* and *specific* patch-sets in subsequent releases. ARM
> specific patch-set will continue as RFC V3 and architecture agnostic patch-set
> will be floated without RFC tag and can be consumed in this Qemu cycle if
> MAINTAINERs ack it.
>
> [Please check section (XI)B for details of architecture agnostic patches]
>
>
> SECTIONS [I - XIII] are as follows :
>
> (I) Key Changes (RFC V1 -> RFC V2)
>      ==================================
>
>      RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
>
> 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>     *online-capable* or *enabled* to the Guest OS at the boot time. This means
>     associated CPUs can have ACPI _STA as *enabled* or *disabled* even after boot
>     See, UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface Flags[20]
> 2. SMCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>     request. This is required to {dis}allow online'ing a vCPU.
> 3. Always presenting unplugged vCPUs in CPUs ACPI AML code as ACPI _STA.PRESENT
>     to the Guest OS. Toggling ACPI _STA.Enabled to give an effect of the
>     hot{un}plug.
> 4. Live Migration works (some issues are still there)
> 5. TCG/HVF/qtest does not support Hotplug and falls back to default.
> 6. Code for TCG support does exist in this release (it is a work-in-progress)
> 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
>     hotplug capability (_OSC Query support still pending)
> 8. Misc. Bug fixes
>
> (II) Summary
>       =======
>
> This patch-set introduces the virtual CPU hotplug support for ARMv8 architecture
> in QEMU. Idea is to be able to hotplug and hot-unplug the vCPUs while guest VM
> is running and no reboot is required. This does *not* make any assumption about
> the physical CPU hotplug availability within the host system but rather tries to
> solve the problem at virtualizer/QEMU layer. Introduces ACPI CPU hotplug hooks
> and event handling to interface with the guest kernel, code to initialize, plug
> and unplug CPUs. No changes are required within the host kernel/KVM except the
> support of hypercall exit handling in the user-space/Qemu which has recently
> been added to the kernel. Its corresponding Guest kernel changes have been
> posted on the mailing-list [3] [4] by James Morse.
>
> (III) Motivation
>        ==========
>
> This allows scaling the guest VM compute capacity on-demand which would be
> useful for the following example scenarios,
>
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
>     framework which could adjust resource requests (CPU and Mem requests) for
>     the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: Infrastructure provider could allocate and
>     restrict the total number of compute resources available to the guest VM
>     according to the SLA (Service Level Agreement). VM owner could request for
>     more compute to be hot-plugged for some cost.
>
> For example, Kata Container VM starts with a minimum amount of resources (i.e.
> hotplug-everything approach). Why?
>
> 1. Allowing faster *boot time* and
> 2. Reduction in *memory footprint*
>
> Kata Container VM can boot with just 1 vCPU and then later more vCPUs can be
> hot-plugged as per requirement.
>
> (IV) Terminology
>       ===========
>
> (*) Possible CPUs:  Total vCPUs which could ever exist in the VM. This includes
>                      any cold booted CPUs plus any CPUs which could be later
>                      hot-plugged.
>                      - Qemu parameter(-smp maxcpus=N)
> (*) Present CPUs:   Possible CPUs which are ACPI 'present'. These might or might
>                      not be ACPI 'enabled'.
>                      - Present vCPUs = Possible vCPUs (Always on ARM Arch)
> (*) Enabled CPUs:   Possible CPUs which are ACPI ‘present’ and 'enabled' and can
>                      now be ‘onlined’ (PSCI) for use by Guest Kernel. All cold
>                      booted vCPUs are ACPI 'enabled' at boot. Later, using
>                      device_add more vCPUs can be hotplugged and be made ACPI
>                      'enabled'.
>                      - Qemu parameter(-smp cpus=N). Can be used to specify some
> 		      cold booted vCPUs during VM init. Some can be added using
> 		      '-device' option.
>
> (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
>      ===============================================================
>
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>     1. ARMv8 CPU architecture does not support the concept of the physical CPU
>        hotplug.
>        a. There are many per-CPU components like PMU, SVE, MTE, Arch timers etc.
>           whose behaviour needs to be clearly defined when a CPU is hot(un)plugged.
>           There is no specification for this.
>
>     2. Other ARM components like GIC etc. have not been designed to realize
>        physical CPU hotplug capability as of now. For example,
>        a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
>           The architecture does not specify what CPU hot(un)plug would mean in
>           the context of any of these.
>        b. CPUs/GICC are physically connected to unique GICR (GIC Redistributor).
>           GIC Redistributors are always part of always-on power domain. Hence,
>           cannot be powered-off as per specification.
>
> B. Impediments in Firmware/ACPI (Architectural Constraint)
>
>     1. Firmware has to expose the GICC, GICR and other per-CPU features like
>        PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the architectural
>        constraint stated in section A1(a) above, all interrupt controller
>        structures of the MADT describing the GIC CPU Interfaces and the GIC
>        Redistributors MUST be presented by firmware to the OSPM at boot time.
>     2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method
>        to get this kind of information from the firmware even after boot, and
>        the OSPM has the capability to process it. The ARM kernel uses the
>        information in the MADT interrupt controller structures to identify the
>        number of Present CPUs during boot and hence does not allow these to be
>        changed after boot. The number of present CPUs cannot be changed. It is
>        an architectural constraint!
>
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)
>
>     1. KVM VGIC:
>         a. Sizing of various VGIC resources like memory regions etc. related to
>            the redistributor happens only once, is fixed at VM init time, and
>            cannot be changed after initialization has happened. KVM statically
>            configures these resources based on the number of vCPUs and the
>            number/size of redistributor ranges.
>         b. The association between a vCPU and its VGIC redistributor is fixed
>            at VM init time within KVM, i.e. when the redistributor iodevs get
>            registered. The VGIC does not allow this association to be set up
>            or changed after VM initialization has happened. Physically, every
>            CPU/GICC is uniquely connected to its redistributor and there is no
>            architectural way to set this up.
>     2. KVM vCPUs:
>         a. Lack of a specification means destruction of KVM vCPUs does not
>            exist, as there is no reference to tell what to do with other
>            per-vCPU components like redistributors, arch timers etc.
>         b. In fact, KVM does not implement destruction of vCPUs for any
>            architecture. This is independent of whether the architecture
>            actually supports the CPU hotplug feature. For example, even for
>            x86, KVM does not implement destruction of vCPUs.
>
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
>
>     1. Qemu CPU objects MUST be created to initialize all the Host KVM vCPUs to
>        overcome the KVM constraint. KVM vCPUs are created and initialized when
>        Qemu CPU objects are realized. But keeping the QOM CPU objects realized
>        for 'yet-to-be-plugged' vCPUs creates problems when these new vCPUs are
>        later plugged using device_add and a new QOM CPU object has to be
>        created.
>     2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
>        at VM init time, when the QOM GICV3 object is realized. This is because
>        the KVM VGIC can only be initialized once, at init time. But every
>        GICV3CPUState has an associated QOM CPU object, which might correspond
>        to a vCPU that is 'yet-to-be-plugged' (unplugged at init).
>     3. How should new QOM CPU objects be connected back to the GICV3CPUState
>        objects and disconnected from them in case a CPU is being
>        hot(un)plugged?
>     4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in
>        the QOM for which a KVM vCPU already exists? For example, whether to
>        keep,
>         a. No QOM CPU objects, Or
>         b. Unrealized CPU objects
>     5. How should vCPU state be exposed via ACPI to the Guest? Especially for
>        the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
>        within the QOM, while the Guest always expects all possible vCPUs to be
>        identified as ACPI *present* during boot.
>     6. How should Qemu expose the GIC CPU interfaces for the unplugged or
>        yet-to-be-plugged vCPUs to the Guest using the ACPI MADT Table?
>
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
>
>     1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e. even
>        for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
>        powered-off state.
>     2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
>        objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
>        at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
>     3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
>        VM init time i.e. when Qemu GIC is realized. This in turn sizes KVM VGIC
>        resources like memory regions etc. related to the redistributors with the
>        number of possible KVM vCPUs. This never changes after VM has initialized.
>     4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
>        released post Host KVM CPU and GIC/VGIC initialization.
>     5. Build ACPI MADT Table with below updates
>        a. Number of GIC CPU interface entries (=possible vCPUs)
>        b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
>        c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>           - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> 	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy)
> 	 - Some issues with above (details in later sections)
>     6. Expose below ACPI Status to Guest kernel
>        a. Always _STA.Present=1 (all possible vCPUs)
>        b. _STA.Enabled=1 (plugged vCPUs)
>        c. _STA.Enabled=0 (unplugged vCPUs)
>     7. vCPU hotplug *realizes* new QOM CPU object. Following happens,
>        a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread
>        b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
>           - Attaches to QOM CPU object.
>        c. Reinitializes KVM vCPU in the Host
>           - Resets the core and sys regs, sets defaults etc.
>        d. Runs KVM vCPU (created with "start-powered-off")
> 	 - vCPU thread sleeps (waits for vCPU reset via PSCI)
>        e. Updates Qemu GIC
>           - Wires back IRQs related to this vCPU.
>           - GICV3CPUState association with QOM CPU Object.
>        f. Updates [6] ACPI _STA.Enabled=1
>        g. Notifies Guest about new vCPU (via ACPI GED interface)
> 	 - Guest checks _STA.Enabled=1
> 	 - Guest adds processor (registers CPU with LDM) [3]
>        h. Plugs the QOM CPU object in the slot.
>           - slot-number = cpu-index{socket,cluster,core,thread}
>        i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC)
>           - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>           - Qemu powers-on KVM vCPU in the Host
>     8. vCPU hot-unplug *unrealizes* QOM CPU Object. Following happens,
>        a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event
>           - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC)
>        b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>           - Qemu powers-off the KVM vCPU in the Host
>        c. Guest signals *Eject* vCPU to Qemu
>        d. Qemu updates [6] ACPI _STA.Enabled=0
>        e. Updates GIC
>           - Un-wires IRQs related to this vCPU
>           - GICV3CPUState association with new QOM CPU Object is updated.
>        f. Unplugs the vCPU
> 	 - Removes from slot
>           - Parks KVM vCPU ("kvm_parked_vcpus" list)
>           - Unrealizes QOM CPU Object & joins back Qemu vCPU thread
> 	 - Destroys QOM CPU object
>        g. Guest checks ACPI _STA.Enabled=0
>           - Removes processor (unregisters CPU with LDM) [3]
>
> F. Work Presented at KVM Forum Conferences:
>     Details of above work has been presented at KVMForum2020 and KVMForum2023
>     conferences. Slides are available at below links,
>     a. KVMForum 2023
>        - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64)
>          https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
>     b. KVMForum 2020
>        - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei
>          https://sched.co/eE4m
>
> (VI) Commands Used
>       =============
>
>      A. Qemu launch commands to init the machine
>
>      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>      -cpu host -smp cpus=4,maxcpus=6 \
>      -m 300M \
>      -kernel Image \
>      -initrd rootfs.cpio.gz \
>      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>      -nographic \
>      -bios QEMU_EFI.fd
>
>      B. Hot-(un)plug related commands
>
>      # Hotplug a host vCPU(accel=kvm)
>      $ device_add host-arm-cpu,id=core4,core-id=4
>
>      # Hotplug a vCPU(accel=tcg)
>      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>
>      # Delete the vCPU
>      $ device_del core4
>
>      Sample output on guest after boot:
>
>      $ cat /sys/devices/system/cpu/possible
>      0-5
>      $ cat /sys/devices/system/cpu/present
>      0-5
>      $ cat /sys/devices/system/cpu/enabled
>      0-3
>      $ cat /sys/devices/system/cpu/online
>      0-1
>      $ cat /sys/devices/system/cpu/offline
>      2-5
>
>      Sample output on guest after hotplug of vCPU=4:
>
>      $ cat /sys/devices/system/cpu/possible
>      0-5
>      $ cat /sys/devices/system/cpu/present
>      0-5
>      $ cat /sys/devices/system/cpu/enabled
>      0-4
>      $ cat /sys/devices/system/cpu/online
>      0-1,4
>      $ cat /sys/devices/system/cpu/offline
>      2-3,5
>
>      Note: vCPU=4 was explicitly 'onlined' after hot-plug
>      $ echo 1 > /sys/devices/system/cpu/cpu4/online
>
> (VII) Repository
>        ==========
>
>   (*) QEMU changes for vCPU hotplug could be cloned from below site,
>       https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
>   (*) Guest Kernel changes (by James Morse, ARM) are available here:
>       https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
>
>
> (VIII) KNOWN ISSUES
>         ============
>
> 1. Migration has been lightly tested. Below are some of the known issues:
>     - Occasional CPU stall (not always repeatable)
>     - Negative test cases like an asymmetric source/destination VM config
>       cause a dump.
>     - Migration with TCG is not working properly.
> 2. TCG with single-threaded mode is broken.
> 3. HVF and qtest support is broken.
> 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
>     mutually exclusive, i.e. as per the change [6] a vCPU cannot be both
>     GICC.Enabled and GICC.online-capable. This means,
>        [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>     a. If we have to support hot-unplug of the cold-booted vCPUs then these MUST
>        be specified as GICC.online-capable in the MADT Table during boot by the
>        firmware/Qemu. But this requirement conflicts with the requirement to
>        support the new Qemu changes with legacy OSes which don't understand the
>        MADT.GICC.online-capable bit. A legacy OS will ignore this bit during
>        boot and hence these vCPUs will not appear on such an OS. This is
>        unexpected behaviour.
>     b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
>        these cold-booted vCPUs from the OS (which should actually be blocked by
>        returning an error from Qemu), then features like 'kexec' will break.
>     c. As I understand, removal of the cold-booted vCPUs is a required feature
>        and the x86 world allows it.
>     d. Hence, either we need a specification change to make the MADT.GICC.Enabled
>        and MADT.GICC.online-capable bits NOT mutually exclusive, or NOT support
>        removal of cold-booted vCPUs. In the latter case, a check can be
>        introduced to bar users from unplugging cold-booted vCPUs using QMP
>        commands. (Needs discussion!)
>        Please check below patch part of this patch-set:
>            [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
> 5. Code related to the notification to GICV3 about hot(un)plug of a vCPU event
>     might need further discussion.
>
>
> (IX) THINGS TO DO
>       ============
>
> 1. Fix the migration issues.
> 2. Fix issues related to TCG/emulation support.
> 3. Comprehensive testing. Current testing is very basic.
>     a. Negative test cases
> 4. Qemu documentation (.rst) needs to be updated.
> 5. Fix qtest and HVF support.
> 6. Fix the design issue related to the ACPI MADT.GICC flags discussed in known
>     issues. This might require a UEFI ACPI specification change!
> 7. Add ACPI _OSC 'Query' support. Only part of the _OSC support exists now.
>
>   Above is *not* a complete list. Will update later!
>
> Best regards
> Salil.
>
> (X) DISCLAIMER
>      ==========
>
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
> implementation to the community. This is *not* production-level code and might
> have bugs. Only basic testing has been done, on a HiSilicon Kunpeng920 server
> SoC. Once the design and the core idea behind the implementation have been
> verified, more effort can be put into hardening the code.
>
> This work is *mostly* along the lines of the discussions which have happened
> over the previous years [see refs below] across different channels like the
> mailing-list, the Linaro Open Discussions platform, and various conferences
> like KVMForum. This RFC is being used as a way to verify the idea mentioned in
> this cover-letter and to get community views. Once this has been agreed, a
> formal patch shall be posted to the mailing-list for review.
>
> [The concept being presented has been found to work!]
>
> (XI) ORGANIZATION OF PATCHES
>       =======================
>   
>   A. All patches [Architecture 'agnostic' + 'specific']:
>
>     [Patch 1-9, 23, 36] Logic required during machine init
>      (*) Some validation checks
>      (*) Introduces the core-id property and some util functions required later.
>      (*) Refactors the parking logic of vCPUs
>      (*) Logic to pre-create vCPUs
>      (*) GIC initialization pre-sized with possible vCPUs.
>      (*) Some refactoring to have common hot and cold plug logic together.
>      (*) Release of disabled QOM CPU objects in post_cpu_init()
>      (*) Support of the ACPI _OSC method to negotiate platform hotplug capabilities
>     [Patch 10-22] Logic related to ACPI at machine init time
>      (*) Changes required to enable ACPI for CPU hotplug
>      (*) Initialization of the ACPI GED framework to cater for CPU hotplug events
>      (*) Build ACPI AML related to the CPU control device
>      (*) ACPI MADT/MAT changes
>     [Patch 24-35] Logic required during vCPU hot-(un)plug
>      (*) Basic framework changes to support vCPU hot-(un)plug
>      (*) ACPI GED changes for hot-(un)plug hooks.
>      (*) Wire-unwire the IRQs
>      (*) GIC notification logic
>      (*) ARMCPU unrealize logic
>      (*) Handling of SMCCC hypercall exits by KVM to Qemu
>     
>   B. Architecture *agnostic* patches part of patch-set:
>
>     [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
>      (*) Refactors Parking logic of vCPUs
>      (*) Introduces ACPI GED Support for vCPU Hotplug Events
>      (*) Introduces ACPI AML change for CPU Control Device
>
> (XII) REFERENCES
>        ==========
>
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
>
> (XIII) ACKNOWLEDGEMENTS
>         ================
>
> I would like to take this opportunity to thank the people below for various
> discussions with me over different channels during the development:
>
> Marc Zyngier (Google),              Catalin Marinas (ARM),
> James Morse (ARM),                  Will Deacon (Google),
> Jean-Philippe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
> Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei),       Russell King (Oracle),
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
> Zengtao/Prime (Huawei),             And all those whom I have missed!
>
> Many thanks to the people below for their current or past contributions:
>
> 1. James Morse (ARM)
>     (Current Kernel part of vCPU Hotplug Support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>     (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>     (Co-developed Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>     (Co-developed earlier kernel prototype)
> 5. Vishnu Pajjuri (Ampere)
>     (Verification on Ampere ARM64 Platforms + fixes)
> 6. Miguel Luis (Oracle)
>     (Verification on Oracle ARM64 Platforms + fixes)
>
>
> Author Salil Mehta (1):
>    target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
>
> Jean-Philippe Brucker (2):
>    hw/acpi: Make _MAT method optional
>    target/arm/kvm: Write CPU state back to KVM on reset
>
> Miguel Luis (1):
>    tcg/mttcg: enable threads to unregister in tcg_ctxs[]
>
> Salil Mehta (33):
>    arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
>    cpus-common: Add common CPU utility for possible vCPUs
>    hw/arm/virt: Move setting of common CPU properties in a function
>    arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
>    accel/kvm: Extract common KVM vCPU {creation,parking} code
>    arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>    arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
>    arm/virt: Init PMU at host for all possible vcpus
>    hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
>    arm/acpi: Enable ACPI support for vcpu hotplug
>    hw/acpi: Add ACPI CPU hotplug init stub
>    hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
>    hw/acpi: Init GED framework with cpu hotplug events
>    arm/virt: Add cpu hotplug events to GED during creation
>    arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>    hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
>    arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>    arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>    hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
>    hw/acpi: Update GED _EVT method AML with cpu scan
>    hw/arm: MADT Tbl change to size the guest with possible vCPUs
>    arm/virt: Release objects for *disabled* possible vCPUs after init
>    hw/acpi: Update ACPI GED framework to support vCPU Hotplug
>    arm/virt: Add/update basic hot-(un)plug framework
>    arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>    hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>    hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
>    arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>    hw/arm: Changes required for reset and to support next boot
>    physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
>    target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>    hw/arm: Support hotplug capability check using _OSC method
>    hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
>
>   accel/kvm/kvm-all.c                    |  61 +-
>   accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
>   cpus-common.c                          |  37 ++
>   gdbstub/gdbstub.c                      |  13 +
>   hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
>   hw/acpi/cpu.c                          |  91 ++-
>   hw/acpi/generic_event_device.c         |  33 +
>   hw/arm/Kconfig                         |   1 +
>   hw/arm/boot.c                          |   2 +-
>   hw/arm/virt-acpi-build.c               | 110 +++-
>   hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
>   hw/core/gpio.c                         |   2 +-
>   hw/i386/acpi-build.c                   |   2 +-
>   hw/intc/arm_gicv3.c                    |   1 +
>   hw/intc/arm_gicv3_common.c             |  66 +-
>   hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
>   hw/intc/arm_gicv3_cpuif_common.c       |   5 +
>   hw/intc/arm_gicv3_kvm.c                |  39 +-
>   hw/intc/gicv3_internal.h               |   2 +
>   include/exec/cpu-common.h              |   8 +
>   include/exec/gdbstub.h                 |   1 +
>   include/hw/acpi/cpu.h                  |   7 +-
>   include/hw/acpi/cpu_hotplug.h          |   4 +
>   include/hw/acpi/generic_event_device.h |   5 +
>   include/hw/arm/boot.h                  |   2 +
>   include/hw/arm/virt.h                  |  10 +-
>   include/hw/core/cpu.h                  |  77 +++
>   include/hw/intc/arm_gicv3_common.h     |  23 +
>   include/hw/qdev-core.h                 |   2 +
>   include/sysemu/kvm.h                   |   2 +
>   include/tcg/tcg.h                      |   1 +
>   softmmu/physmem.c                      |  25 +
>   target/arm/arm-powerctl.c              |  51 +-
>   target/arm/cpu-qom.h                   |   3 +
>   target/arm/cpu.c                       | 112 ++++
>   target/arm/cpu.h                       |  17 +
>   target/arm/cpu64.c                     |  15 +
>   target/arm/gdbstub.c                   |   6 +
>   target/arm/helper.c                    |  27 +-
>   target/arm/internals.h                 |  12 +-
>   target/arm/kvm.c                       |  93 ++-
>   target/arm/kvm64.c                     |  59 +-
>   target/arm/kvm_arm.h                   |  24 +
>   target/arm/meson.build                 |   1 +
>   target/arm/{tcg => }/psci.c            |   8 +
>   target/arm/tcg/meson.build             |   4 -
>   tcg/tcg.c                              |  23 +
>   47 files changed, 1873 insertions(+), 349 deletions(-)
>   rename target/arm/{tcg => }/psci.c (97%)
Tested on Ampere's platform for vCPU hotplug/unplug with reboot, 
suspend/resume and save/restore.
Also tested for vCPU hotplug/unplug along with VM live migration.

Please feel free to add,
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>

Thanks,
Vishnu


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-10-11 10:23 ` [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
@ 2023-10-11 10:32   ` Salil Mehta via
  2023-10-11 10:32     ` Salil Mehta
  2023-10-11 11:08     ` Vishnu Pajjuri
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-11 10:32 UTC (permalink / raw)
  To: Vishnu Pajjuri, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Vishnu,

> From: Vishnu Pajjuri <vishnu@amperemail.onmicrosoft.com>
> Sent: Wednesday, October 11, 2023 11:23 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
> alex.bennee@linaro.org; linux@armlinux.org.uk;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian
> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com;
> maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
> Arch
> 
> Hi Salil,
> 
> On 26-09-2023 15:33, Salil Mehta wrote:
> > [ *REPEAT: Sent patches got held at internal server yesterday* ]
> >
> > PROLOGUE
> > ========
> >
> > To assist in review and set the right expectations from this RFC, please
> first
> > read below sections *APPENDED AT THE END* of this cover letter,
> >
> > 1. Important *DISCLAIMER* [Section (X)]
> > 2. Work presented at KVMForum Conference (slides available) [Section
> (V)F]
> > 3. Organization of patches [Section (XI)]
> > 4. References [Section (XII)]
> > 5. Detailed TODO list of the leftover work or work-in-progress [Section
> (IX)]
> >
> > NOTE: There has been an interest shown by other organizations in adapting
> > this series for their architecture. I am planning to split this RFC into
> > architecture *agnostic* and *specific* patch-sets in subsequent releases.
> ARM
> > specific patch-set will continue as RFC V3 and architecture agnostic
> patch-set
> > will be floated without RFC tag and can be consumed in this Qemu cycle if
> > MAINTAINERs ack it.
> >
> > [Please check section (XI)B for details of architecture agnostic patches]
> >
> >
> > SECTIONS [I - XIII] are as follows :
> >
> > (I) Key Changes (RFC V1 -> RFC V2)
> >      ==================================
> >
> >      RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-
> salil.mehta@huawei.com/
> >
> > 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
> >     *online-capable* or *enabled* to the Guest OS at the boot time. This
> means
> >     associated CPUs can have ACPI _STA as *enabled* or *disabled* even
> after boot
> >     See, UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface
> Flags[20]
> > 2. SMCC/HVC Hypercall exit handling in userspace/Qemu for PSCI
> CPU_{ON,OFF}
> >     request. This is required to {dis}allow online'ing a vCPU.
> > 3. Always presenting unplugged vCPUs in CPUs ACPI AML code as ACPI
> _STA.PRESENT
> >     to the Guest OS. Toggling ACPI _STA.Enabled to give an effect of the
> >     hot{un}plug.
> > 4. Live Migration works (some issues are still there)
> > 5. TCG/HVF/qtest does not support Hotplug and falls back to default.
> > 6. Code for TCG support do exists in this release (it is a work-in-
> progress)
> > 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
> >     hotplug capability (_OSC Query support still pending)
> > 8. Misc. Bug fixes
> >
> > (II) Summary
> >       =======
> >
> > This patch-set introduces the virtual CPU hotplug support for ARMv8
> architecture
> > in QEMU. Idea is to be able to hotplug and hot-unplug the vCPUs while
> guest VM
> > is running and no reboot is required. This does *not* makes any
> assumption of
> > the physical CPU hotplug availability within the host system but rather
> tries to
> > solve the problem at virtualizer/QEMU layer. Introduces ACPI CPU hotplug
> hooks
> > and event handling to interface with the guest kernel, code to
> initialize, plug
> > and unplug CPUs. No changes are required within the host kernel/KVM
> except the
> > support of hypercall exit handling in the user-space/Qemu which has
> recently
> > been added to the kernel. Its corresponding Guest kernel changes have
> been
> > posted on the mailing-list [3] [4] by James Morse.
> >
> > (III) Motivation
> >        ==========
> >
> > This allows scaling the guest VM compute capacity on-demand which would
> be
> > useful for the following example scenarios,
> >
> > 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the
> orchestration
> >     framework which could adjust resource requests (CPU and Mem requests)
> for
> >     the containers in a pod, based on usage.
> > 2. Pay-as-you-grow Business Model: Infrastructure provider could allocate
> and
> >     restrict the total number of compute resources available to the guest
> VM
> >     according to the SLA (Service Level Agreement). VM owner could
> request for
> >     more compute to be hot-plugged for some cost.
> >
> > For example, Kata Container VM starts with a minimum amount of resources
> (i.e.
> > hotplug everything approach). why?
> >
> > 1. Allowing faster *boot time* and
> > 2. Reduction in *memory footprint*
> >
> > Kata Container VM can boot with just 1 vCPU and then later more vCPUs can
> be
> > hot-plugged as per requirement.
> >
> > (IV) Terminology
> >       ===========
> >
> > (*) Posssible CPUs: Total vCPUs which could ever exist in VM. This
> includes
> >                      any cold booted CPUs plus any CPUs which could be
> later
> >                      hot-plugged.
> >                      - Qemu parameter(-smp maxcpus=N)
> > (*) Present CPUs:   Possible CPUs which are ACPI 'present'. These might
> or might
> >                      not be ACPI 'enabled'.
> >                      - Present vCPUs = Possible vCPUs (Always on ARM
> Arch)
> > (*) Enabled CPUs:   Possible CPUs which are ACPI ‘present’ and 'enabled'
> and can
> >                      now be ‘onlined’ (PSCI) for use by Guest Kernel. All
> cold
> >                      booted vCPUs are ACPI 'enabled' at boot. Later,
> using
> >                      device_add more vCPUs can be hotplugged and be made
> ACPI
> >                      'enabled.
> >                      - Qemu parameter(-smp cpus=N). Can be used to
> specify some
> > 		      cold booted vCPUs during VM init. Some can be added using
> > 		      '-device' option.
> >
> > (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
> >      ===============================================================
> >
> > A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
> >     1. The ARMv8 CPU architecture does not support the concept of physical
> >        CPU hotplug.
> >        a. There are many per-CPU components like PMU, SVE, MTE, Arch
> >           timers etc. whose behaviour needs to be clearly defined when a
> >           CPU is hot(un)plugged. There is no specification for this.
> >
> >     2. Other ARM components like the GIC etc. have not been designed to
> >        realize physical CPU hotplug capability as of now. For example,
> >        a. Every physical CPU has a unique GICC (GIC CPU Interface) by
> >           construction. The architecture does not specify what CPU
> >           hot(un)plug would mean in the context of any of these.
> >        b. CPUs/GICCs are physically connected to unique GICRs (GIC
> >           Redistributors). GIC Redistributors are always part of the
> >           always-on power domain and hence cannot be powered off as per
> >           the specification.
> >
> > B. Impediments in Firmware/ACPI (Architectural Constraint)
> >
> >     1. Firmware has to expose the GICC, GICR and other per-CPU features
> >        like PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the
> >        architectural constraint stated in section A1(a) above, all
> >        interrupt controller structures of the MADT describing the GIC CPU
> >        Interfaces and the GIC Redistributors MUST be presented by the
> >        firmware to the OSPM at boot time.
> >     2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
> >        method to get this kind of information from the firmware even after
> >        boot, and the OSPM has the capability to process these. The ARM
> >        kernel uses the information in the MADT interrupt controller
> >        structures to identify the number of Present CPUs during boot and
> >        hence does not allow these to be changed after boot. The number of
> >        present CPUs cannot be changed. It is an architectural constraint!
> >
> > C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural
> >     Constraint)
> >
> >     1. KVM VGIC:
> >         a. Sizing of various VGIC resources like memory regions etc.
> >            related to the redistributor happens only once, is fixed at VM
> >            init time, and cannot be changed later after initialization has
> >            happened. KVM statically configures these resources based on
> >            the number of vCPUs and the number/size of redistributor
> >            ranges.
> >         b. The association between a vCPU and its VGIC redistributor is
> >            fixed at VM init time within KVM, i.e. when the redistributor
> >            iodevs get registered. The VGIC does not allow this association
> >            to be set up or changed after VM initialization has happened.
> >            Physically, every CPU/GICC is uniquely connected with its
> >            redistributor and there is no architectural way to set this up.
> >     2. KVM vCPUs:
> >         a. Lack of specification means destruction of KVM vCPUs does not
> >            exist, as there is no reference to tell what to do with other
> >            per-vCPU components like redistributors, arch timers etc.
> >         b. In fact, KVM does not implement destruction of vCPUs for any
> >            architecture. This is independent of whether the architecture
> >            actually supports the CPU Hotplug feature. For example, even
> >            for x86, KVM does not implement destruction of vCPUs.
> >
> > D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
> >
> >     1. Qemu CPU Objects MUST be created to initialize all the Host KVM
> >        vCPUs to overcome the KVM constraint. KVM vCPUs are created and
> >        initialized when Qemu CPU Objects are realized. But keeping the QOM
> >        CPU objects realized for 'yet-to-be-plugged' vCPUs can create
> >        problems when these new vCPUs shall be plugged using device_add and
> >        a new QOM CPU object shall be created.
> >     2. GICV3State and GICV3CPUState objects MUST be sized over *possible
> >        vCPUs* at VM init time, while the QOM GICV3 Object is realized.
> >        This is because the KVM VGIC can only be initialized once, during
> >        init time. But every GICV3CPUState has an associated QOM CPU
> >        Object. The latter might correspond to vCPUs which are
> >        'yet-to-be-plugged' (unplugged at init).
> >     3. How should new QOM CPU objects be connected back to the
> >        GICV3CPUState objects and disconnected from them in case a CPU is
> >        being hot(un)plugged?
> >     4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented
> >        in the QOM for which KVM vCPUs already exist? For example, whether
> >        to keep,
> >         a. No QOM CPU objects Or
> >         b. Unrealized CPU Objects
> >     5. How should vCPU state be exposed via ACPI to the Guest? Especially
> >        for the unplugged/yet-to-be-plugged vCPUs whose CPU objects might
> >        not exist within the QOM, but the Guest always expects all possible
> >        vCPUs to be identified as ACPI *present* during boot.
> >     6. How should Qemu expose the GIC CPU interfaces for the unplugged or
> >        yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?
> >
> > E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
> >
> >     1. At VM Init, pre-create all the possible vCPUs in the Host KVM, i.e.
> >        even the vCPUs which are yet-to-be-plugged in Qemu, but keep them
> >        in the powered-off state.
> >     2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
> >        objects corresponding to the unplugged/yet-to-be-plugged vCPUs are
> >        parked at the existing per-VM "kvm_parked_vcpus" list in Qemu
> >        (similar to x86).
> >     3. GICV3State and GICV3CPUState objects are sized over possible vCPUs
> >        at VM init time, i.e. when the Qemu GIC is realized. This in turn
> >        sizes KVM VGIC resources like memory regions etc. related to the
> >        redistributors with the number of possible KVM vCPUs. This never
> >        changes after the VM has initialized.
> >     4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs
> >        are released post Host KVM CPU and GIC/VGIC initialization.
> >     5. Build the ACPI MADT Table with the below updates,
> >        a. Number of GIC CPU interface entries (= possible vCPUs)
> >        b. Present boot vCPUs as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
> >        c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
> >           - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> >           - vCPU can be ACPI enabled+onlined after the Guest boots
> >             (Firmware Policy)
> >           - Some issues with the above (details in later sections)
> >     6. Expose the below ACPI status to the Guest kernel,
> >        a. Always _STA.Present=1 (all possible vCPUs)
> >        b. _STA.Enabled=1 (plugged vCPUs)
> >        c. _STA.Enabled=0 (unplugged vCPUs)
> >     7. vCPU hotplug *realizes* a new QOM CPU object. The following happens,
> >        a. Realizes and initializes the QOM CPU Object & spawns the Qemu
> >           vCPU thread
> >        b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
> >           - Attaches it to the QOM CPU object.
> >        c. Reinitializes the KVM vCPU in the Host
> >           - Resets the core and sys regs, sets defaults etc.
> >        d. Runs the KVM vCPU (created with "start-powered-off")
> >           - vCPU thread sleeps (waits for vCPU reset via PSCI)
> >        e. Updates the Qemu GIC
> >           - Wires back IRQs related to this vCPU.
> >           - GICV3CPUState association with the QOM CPU Object.
> >        f. Updates [6] ACPI _STA.Enabled=1
> >        g. Notifies the Guest about the new vCPU (via the ACPI GED interface)
> >           - Guest checks _STA.Enabled=1
> >           - Guest adds the processor (registers CPU with LDM) [3]
> >        h. Plugs the QOM CPU object into the slot.
> >           - slot-number = cpu-index{socket,cluster,core,thread}
> >        i. Guest onlines the vCPU (CPU_ON PSCI call over HVC/SMC)
> >           - KVM exits the HVC/SMC Hypercall [5] to Qemu (Policy Check).
> >           - Qemu powers on the KVM vCPU in the Host.
> >     8. vCPU hot-unplug *unrealizes* the QOM CPU Object. The following happens,
> >        a. Notifies the Guest (via the ACPI GED interface) of the vCPU
> >           hot-unplug event
> >           - Guest offlines the vCPU (CPU_OFF PSCI call over HVC/SMC)
> >        b. KVM exits the HVC/SMC Hypercall [5] to Qemu (Policy Check).
> >           - Qemu powers off the KVM vCPU in the Host.
> >        c. Guest signals *Eject* vCPU to Qemu.
> >        d. Qemu updates [6] ACPI _STA.Enabled=0
> >        e. Updates the GIC
> >           - Un-wires IRQs related to this vCPU.
> >           - GICV3CPUState association with the QOM CPU Object is updated.
> >        f. Unplugs the vCPU
> >           - Removes it from the slot
> >           - Parks the KVM vCPU ("kvm_parked_vcpus" list)
> >           - Unrealizes the QOM CPU Object & joins back the Qemu vCPU thread
> >           - Destroys the QOM CPU object
> >        g. Guest checks ACPI _STA.Enabled=0
> >           - Removes the processor (unregisters the CPU with LDM) [3]
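The park/unpark handling in steps 2, 7(b) and 8(f) above can be sketched in miniature. This is an illustrative, self-contained model only; the struct and function names here are invented for the sketch and are not QEMU's actual code. The idea it shows: QEMU keeps the KVM vCPU file descriptors of unplugged vCPUs on a per-VM list keyed by vcpu-id, so a later device_add can reattach an existing KVM vCPU instead of asking KVM to create one (which KVM/ARM cannot do after VGIC init).

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for the per-VM "kvm_parked_vcpus" list: a
 * parked entry keeps the KVM vCPU fd keyed by vcpu_id so that a later
 * hotplug (device_add) can reuse it instead of creating a new vCPU. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int kvm_fd;
    struct parked_vcpu *next;
};

static struct parked_vcpu *parked_list;

/* Step 8(f): stash the fd of an unplugged vCPU on the list. */
static void park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    struct parked_vcpu *p = malloc(sizeof(*p));
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = parked_list;
    parked_list = p;
}

/* Step 7(b): return the parked fd for vcpu_id, or -1 if not parked. */
static int unpark_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp = &parked_list;
    while (*pp) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->kvm_fd;
            *pp = p->next;   /* unlink the entry */
            free(p);
            return fd;
        }
        pp = &(*pp)->next;
    }
    return -1;
}
```

At VM init all possible vCPUs are created and the yet-to-be-plugged ones parked; a later device_add unparks the matching entry and attaches it to the freshly realized QOM CPU object, mirroring what x86 already does.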
> >
> > F. Work Presented at KVM Forum Conferences:
> >     Details of the above work have been presented at the KVMForum2020 and
> >     KVMForum2023 conferences. Slides are available at the below links,
> >     a. KVMForum 2023
> >        - Challenges Revisited in Supporting Virt CPU Hotplug on
> >          architectures that don't Support CPU Hotplug (like ARM64)
> >          https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> >     b. KVMForum 2020
> >        - Challenges in Supporting Virtual CPU Hotplug on SoC Based
> >          Systems (like ARM64) - Salil Mehta, Huawei
> >          https://sched.co/eE4m
> >
> > (VI) Commands Used
> >       =============
> >
> >      A. Qemu launch commands to init the machine
> >
> >      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
> >      -cpu host -smp cpus=4,maxcpus=6 \
> >      -m 300M \
> >      -kernel Image \
> >      -initrd rootfs.cpio.gz \
> >      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
> >      -nographic \
> >      -bios QEMU_EFI.fd
> >
> >      B. Hot-(un)plug related commands
> >
> >      # Hotplug a host vCPU (accel=kvm)
> >      $ device_add host-arm-cpu,id=core4,core-id=4
> >
> >      # Hotplug a vCPU (accel=tcg)
> >      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> >
> >      # Delete the vCPU
> >      $ device_del core4
> >
> >      Sample output on guest after boot:
> >
> >      $ cat /sys/devices/system/cpu/possible
> >      0-5
> >      $ cat /sys/devices/system/cpu/present
> >      0-5
> >      $ cat /sys/devices/system/cpu/enabled
> >      0-3
> >      $ cat /sys/devices/system/cpu/online
> >      0-1
> >      $ cat /sys/devices/system/cpu/offline
> >      2-5
> >
> >      Sample output on guest after hotplug of vCPU=4:
> >
> >      $ cat /sys/devices/system/cpu/possible
> >      0-5
> >      $ cat /sys/devices/system/cpu/present
> >      0-5
> >      $ cat /sys/devices/system/cpu/enabled
> >      0-4
> >      $ cat /sys/devices/system/cpu/online
> >      0-1,4
> >      $ cat /sys/devices/system/cpu/offline
> >      2-3,5
> >
> >      Note: vCPU=4 was explicitly 'onlined' after hot-plug
> >      $ echo 1 > /sys/devices/system/cpu/cpu4/online
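The monitor commands above are HMP; over QMP the same operations take the following wire form. This is a sketch: the QMP `device_add`/`device_del` commands mirror their HMP counterparts, and the driver/property names shown are the ones used in this series.

```json
{ "execute": "device_add",
  "arguments": { "driver": "host-arm-cpu", "id": "core4", "core-id": 4 } }

{ "execute": "device_del", "arguments": { "id": "core4" } }
```

As with other hotpluggable devices, `device_del` completes asynchronously once the guest has ejected the vCPU.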
> >
> > (VII) Repository
> >        ==========
> >
> >   (*) QEMU changes for vCPU hotplug can be cloned from the below site,
> >       https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
> >   (*) Guest Kernel changes (by James Morse, ARM) are available here:
> >       https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
> >
> >
> > (VIII) KNOWN ISSUES
> >         ============
> >
> > 1. Migration has been lightly tested. Below are some of the known issues:
> >     - Occasional CPU stall (not always repeatable)
> >     - Negative test cases like asymmetric source/destination VM configs
> >       cause a dump.
> >     - Migration with TCG is not working properly.
> > 2. TCG with single-threaded mode is broken.
> > 3. HVF and qtest support is broken.
> > 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable
> >     are mutually exclusive, i.e. as per the change [6] a vCPU cannot be
> >     both GICC.Enabled and GICC.online-capable. This means,
> >        [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
> >     a. If we have to support hot-unplug of the cold-booted vCPUs then
> >        these MUST be specified as GICC.online-capable in the MADT Table
> >        during boot by the firmware/Qemu. But this requirement conflicts
> >        with the requirement to support the new Qemu changes with legacy
> >        OSes which don't understand the MADT.GICC.online-capable bit. A
> >        legacy OS will ignore this bit during boot and hence these vCPUs
> >        will not appear on such an OS. This is unexpected behaviour.
> >     b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to
> >        unplug these cold-booted vCPUs from the OS (which actually should
> >        be blocked by returning an error at Qemu), then features like
> >        'kexec' will break.
> >     c. As I understand, removal of the cold-booted vCPUs is a required
> >        feature and the x86 world allows it.
> >     d. Hence, either we need a specification change to make the
> >        MADT.GICC.Enabled and MADT.GICC.online-capable bits NOT mutually
> >        exclusive, or NOT support removal of cold-booted vCPUs. In the
> >        latter case, a check can be introduced to bar users from unplugging
> >        vCPUs, which were cold-booted, using QMP commands. (Needs
> >        discussion!)
> >        Please check the below patch, part of this patch-set:
> >            [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
> > 5. Code related to the notification to GICV3 about the hot(un)plug of a
> >     vCPU event might need further discussion.
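The flag relationships at the heart of issue 4 above can be made concrete with a small sketch. The bit positions follow the ACPI 6.5 definitions (GICC flags per [7][20]; the standard _STA device-status bits); the helper functions and the exact _STA composition are illustrative, not the series' actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ACPI 6.5 MADT GICC flags (Table 5.37): bit 0 = Enabled,
 * bit 3 = Online Capable. Per the spec change [6] the two are
 * mutually exclusive. */
#define GICC_ENABLED        (1u << 0)
#define GICC_ONLINE_CAPABLE (1u << 3)

/* ACPI _STA bits: 0 = Present, 1 = Enabled, 3 = Functioning. */
#define STA_PRESENT     (1u << 0)
#define STA_ENABLED     (1u << 1)
#define STA_FUNCTIONING (1u << 3)

static uint32_t gicc_flags(bool cold_booted)
{
    /* Cold-booted vCPUs are advertised as Enabled; hot(un)pluggable
     * ones as Online Capable instead; never both at once. */
    return cold_booted ? GICC_ENABLED : GICC_ONLINE_CAPABLE;
}

static uint32_t cpu_sta(bool plugged)
{
    /* One plausible composition per item 6 of the summary: every
     * possible vCPU is always Present, and only Enabled is toggled
     * on hot(un)plug. The value the series' AML returns may differ. */
    uint32_t sta = STA_PRESENT | STA_FUNCTIONING;
    if (plugged) {
        sta |= STA_ENABLED;
    }
    return sta;
}
```

With the mutual exclusivity enforced, a cold-booted vCPU advertised as Enabled can never also be marked online-capable, which is exactly why hot-unplugging it becomes ambiguous for legacy OSes.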
> >
> >
> > (IX) THINGS TO DO
> >       ============
> >
> > 1. Fix the migration issues.
> > 2. Fix issues related to TCG/Emulation support.
> > 3. Comprehensive testing. Current testing is very basic.
> >     a. Negative test cases
> > 4. Qemu documentation (.rst) needs to be updated.
> > 5. Fix qtest, HVF support.
> > 6. Fix the design issue related to the ACPI MADT.GICC flags discussed in
> >     the known issues. This might require a UEFI ACPI specification change!
> > 7. Add ACPI _OSC 'Query' support. Only part of the _OSC support exists now.
> >
> >   The above is *not* a complete list. Will update later!
> >
> > Best regards
> > Salil.
> >
> > (X) DISCLAIMER
> >      ==========
> >
> > This work is an attempt to present a proof-of-concept of the ARM64 vCPU
> > hotplug implementation to the community. This is *not* production-level
> > code and might have bugs. Only basic testing has been done on a HiSilicon
> > Kunpeng920 SoC for servers. Once the design and the core idea behind the
> > implementation have been verified, more effort can be put into hardening
> > the code.
> >
> > This work is *mostly* in line with the discussions which have happened
> > over the previous years [see refs below] across different channels like
> > the mailing list, the Linaro Open Discussions platform, and various
> > conferences like KVMForum etc. This RFC is being used as a way to verify
> > the idea mentioned in this cover letter and to get community views. Once
> > this has been agreed, a formal patch shall be posted to the mailing list
> > for review.
> >
> > [The concept being presented has been found to work!]
> >
> > (XI) ORGANIZATION OF PATCHES
> >       =======================
> >
> >   A. All patches [Architecture 'agnostic' + 'specific']:
> >
> >     [Patch 1-9, 23, 36] Logic required during machine init
> >      (*) Some validation checks
> >      (*) Introduces the core-id property and some util functions required later.
> >      (*) Refactors the parking logic of vCPUs
> >      (*) Logic to pre-create vCPUs
> >      (*) GIC initialization pre-sized with possible vCPUs.
> >      (*) Some refactoring to have common hot and cold plug logic together.
> >      (*) Release of disabled QOM CPU objects in post_cpu_init()
> >      (*) Support of the ACPI _OSC method to negotiate platform hotplug capabilities
> >     [Patch 10-22] Logic related to ACPI at machine init time
> >      (*) Changes required to enable ACPI for CPU hotplug
> >      (*) Initialization of the ACPI GED framework to cater for CPU Hotplug Events
> >      (*) Build ACPI AML related to the CPU control dev
> >      (*) ACPI MADT/MAT changes
> >     [Patch 24-35] Logic required during vCPU hot-(un)plug
> >      (*) Basic framework changes to support vCPU hot-(un)plug
> >      (*) ACPI GED changes for hot-(un)plug hooks.
> >      (*) Wire-unwire the IRQs
> >      (*) GIC notification logic
> >      (*) ARMCPU unrealize logic
> >      (*) Handling of SMCCC Hypercall Exits by KVM to Qemu
> >
> >   B. Architecture *agnostic* patches part of the patch-set:
> >
> >     [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
> >      (*) Refactors the parking logic of vCPUs
> >      (*) Introduces ACPI GED support for vCPU Hotplug Events
> >      (*) Introduces the ACPI AML change for the CPU Control Device
> >
> > (XII) REFERENCES
> >        ==========
> >
> > [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> > [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> > [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> > [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> > [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> > [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> > [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> > [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> > [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> > [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> > [11] https://lkml.org/lkml/2019/7/10/235
> > [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> > [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> > [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> > [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> > [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> > [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> > [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> > [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> > [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> >
> > (XIII) ACKNOWLEDGEMENTS
> >         ================
> >
> > I would like to take this opportunity to thank below people for various
> > discussions with me over different channels during the development:
> >
> > Marc Zyngier (Google)               Catalin Marinas (ARM),
> > James Morse(ARM),                   Will Deacon (Google),
> > Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
> > Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
> > Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> > Igor Mamedov (Redhat),              Ilkka Koskinen (Ampere),
> > Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> > Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> > Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> > Shameerali Kolothum (Huawei)        Russell King (Oracle)
> > Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
> > Zengtao/Prime (Huawei),             And all those whom I have missed!
> >
> > Many thanks to below people for their current or past contributions:
> >
> > 1. James Morse (ARM)
> >     (Current Kernel part of vCPU Hotplug Support on AARCH64)
> > 2. Jean-Philippe Brucker (Linaro)
> >     (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
> > 3. Keqian Zhu (Huawei)
> >     (Co-developed Qemu prototype)
> > 4. Xiongfeng Wang (Huawei)
> >     (Co-developed earlier kernel prototype)
> > 5. Vishnu Pajjuri (Ampere)
> >     (Verification on Ampere ARM64 Platforms + fixes)
> > 6. Miguel Luis (Oracle)
> >     (Verification on Oracle ARM64 Platforms + fixes)
> >
> >
> > Author Salil Mehta (1):
> >    target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> >
> > Jean-Philippe Brucker (2):
> >    hw/acpi: Make _MAT method optional
> >    target/arm/kvm: Write CPU state back to KVM on reset
> >
> > Miguel Luis (1):
> >    tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> >
> > Salil Mehta (33):
> >    arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
> >    cpus-common: Add common CPU utility for possible vCPUs
> >    hw/arm/virt: Move setting of common CPU properties in a function
> >    arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
> >    accel/kvm: Extract common KVM vCPU {creation,parking} code
> >    arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
> >    arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
> >    arm/virt: Init PMU at host for all possible vcpus
> >    hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
> >    arm/acpi: Enable ACPI support for vcpu hotplug
> >    hw/acpi: Add ACPI CPU hotplug init stub
> >    hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
> >    hw/acpi: Init GED framework with cpu hotplug events
> >    arm/virt: Add cpu hotplug events to GED during creation
> >    arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
> >    hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
> >    arm/virt/acpi: Build CPUs AML with CPU Hotplug support
> >    arm/virt: Make ARM vCPU *present* status ACPI *persistent*
> >    hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
> >    hw/acpi: Update GED _EVT method AML with cpu scan
> >    hw/arm: MADT Tbl change to size the guest with possible vCPUs
> >    arm/virt: Release objects for *disabled* possible vCPUs after init
> >    hw/acpi: Update ACPI GED framework to support vCPU Hotplug
> >    arm/virt: Add/update basic hot-(un)plug framework
> >    arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
> >    hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
> >    hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
> >    arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
> >    hw/arm: Changes required for reset and to support next boot
> >    physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
> >    target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
> >    hw/arm: Support hotplug capability check using _OSC method
> >    hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> >
> >   accel/kvm/kvm-all.c                    |  61 +-
> >   accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
> >   cpus-common.c                          |  37 ++
> >   gdbstub/gdbstub.c                      |  13 +
> >   hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
> >   hw/acpi/cpu.c                          |  91 ++-
> >   hw/acpi/generic_event_device.c         |  33 +
> >   hw/arm/Kconfig                         |   1 +
> >   hw/arm/boot.c                          |   2 +-
> >   hw/arm/virt-acpi-build.c               | 110 +++-
> >   hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
> >   hw/core/gpio.c                         |   2 +-
> >   hw/i386/acpi-build.c                   |   2 +-
> >   hw/intc/arm_gicv3.c                    |   1 +
> >   hw/intc/arm_gicv3_common.c             |  66 +-
> >   hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
> >   hw/intc/arm_gicv3_cpuif_common.c       |   5 +
> >   hw/intc/arm_gicv3_kvm.c                |  39 +-
> >   hw/intc/gicv3_internal.h               |   2 +
> >   include/exec/cpu-common.h              |   8 +
> >   include/exec/gdbstub.h                 |   1 +
> >   include/hw/acpi/cpu.h                  |   7 +-
> >   include/hw/acpi/cpu_hotplug.h          |   4 +
> >   include/hw/acpi/generic_event_device.h |   5 +
> >   include/hw/arm/boot.h                  |   2 +
> >   include/hw/arm/virt.h                  |  10 +-
> >   include/hw/core/cpu.h                  |  77 +++
> >   include/hw/intc/arm_gicv3_common.h     |  23 +
> >   include/hw/qdev-core.h                 |   2 +
> >   include/sysemu/kvm.h                   |   2 +
> >   include/tcg/tcg.h                      |   1 +
> >   softmmu/physmem.c                      |  25 +
> >   target/arm/arm-powerctl.c              |  51 +-
> >   target/arm/cpu-qom.h                   |   3 +
> >   target/arm/cpu.c                       | 112 ++++
> >   target/arm/cpu.h                       |  17 +
> >   target/arm/cpu64.c                     |  15 +
> >   target/arm/gdbstub.c                   |   6 +
> >   target/arm/helper.c                    |  27 +-
> >   target/arm/internals.h                 |  12 +-
> >   target/arm/kvm.c                       |  93 ++-
> >   target/arm/kvm64.c                     |  59 +-
> >   target/arm/kvm_arm.h                   |  24 +
> >   target/arm/meson.build                 |   1 +
> >   target/arm/{tcg => }/psci.c            |   8 +
> >   target/arm/tcg/meson.build             |   4 -
> >   tcg/tcg.c                              |  23 +
> >   47 files changed, 1873 insertions(+), 349 deletions(-)
> >   rename target/arm/{tcg => }/psci.c (97%)
> Tested on Ampere's platform for vCPU hotplug/unplug with reboot,
> suspend/resume and save/restore.
> Also tested for vCPU hotplug/unplug along with VM live migration.
> 
> Please feel free to add,
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>

Many thanks for this.

As you are aware, we have now split the above patch-set into:

1. Architecture-agnostic patch-set (being reviewed below)
   https://lore.kernel.org/qemu-devel/20231009203601.17584-1-salil.mehta@huawei.com/#t
2. ARM-specific patch-set (will soon follow as RFC V3)


If possible, could I request that you sanity-test the architecture-agnostic
patch-set as well for regressions and provide your Tested-by tag for it
too?

This is to ensure that these changes, if accepted, do not break any
existing features.


Many thanks again for all your efforts over this time!


Cheers
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-10-11 10:32   ` Salil Mehta via
@ 2023-10-11 10:32     ` Salil Mehta
  2023-10-11 11:08     ` Vishnu Pajjuri
  1 sibling, 0 replies; 153+ messages in thread
From: Salil Mehta @ 2023-10-11 10:32 UTC (permalink / raw)
  To: Vishnu Pajjuri, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Vishnu,

> From: Vishnu Pajjuri <vishnu@amperemail.onmicrosoft.com>
> Sent: Wednesday, October 11, 2023 11:23 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
> alex.bennee@linaro.org; linux@armlinux.org.uk;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian
> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com;
> maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
> Arch
> 
> Hi Salil,
> 
> On 26-09-2023 15:33, Salil Mehta wrote:
> > [ *REPEAT: Sent patches got held at internal server yesterday* ]
> >
> > PROLOGUE
> > ========
> >
> > To assist in review and set the right expectations from this RFC, please
> first
> > read below sections *APPENDED AT THE END* of this cover letter,
> >
> > 1. Important *DISCLAIMER* [Section (X)]
> > 2. Work presented at KVMForum Conference (slides available) [Section
> (V)F]
> > 3. Organization of patches [Section (XI)]
> > 4. References [Section (XII)]
> > 5. Detailed TODO list of the leftover work or work-in-progress [Section
> (IX)]
> >
> > NOTE: There has been interest shown by other organizations in adapting
> > this series for their architectures. I am planning to split this RFC into
> > architecture *agnostic* and *specific* patch-sets in subsequent releases.
> > The ARM-specific patch-set will continue as RFC V3, and the
> > architecture-agnostic patch-set will be floated without the RFC tag and
> > can be consumed in this Qemu cycle if the MAINTAINERs ack it.
> >
> >
> > SECTIONS [I - XIII] are as follows :
> >
> > (I) Key Changes (RFC V1 -> RFC V2)
> >      ==================================
> >
> >      RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> >
> > 1. The ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
> >     *online-capable* or *enabled* to the Guest OS at boot time. This means
> >     the associated CPUs can have ACPI _STA as *enabled* or *disabled* even
> >     after boot. See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU
> >     Interface Flags [20].
> > 2. SMCCC/HVC Hypercall exit handling in userspace/Qemu for PSCI
> >     CPU_{ON,OFF} requests. This is required to {dis}allow onlining a vCPU.
> > 3. Always presenting unplugged vCPUs in the CPUs ACPI AML code as ACPI
> >     _STA.PRESENT to the Guest OS. Toggling ACPI _STA.Enabled gives the
> >     effect of the hot{un}plug.
> > 4. Live Migration works (some issues are still there).
> > 5. TCG/HVF/qtest do not support Hotplug and fall back to the default.
> > 6. Code for TCG support does exist in this release (it is a work-in-progress).
> > 7. The ACPI _OSC method can now be used by the OSPM to negotiate the Qemu
> >     VM platform hotplug capability (_OSC Query support is still pending).
> > 8. Misc. bug fixes.
> >
> > (II) Summary
> >       =======
> >
> > This patch-set introduces virtual CPU hotplug support for the ARMv8
> > architecture in QEMU. The idea is to be able to hotplug and hot-unplug
> > vCPUs while the guest VM is running, without requiring a reboot. This
> > does *not* make any assumption about the availability of physical CPU
> > hotplug within the host system but rather tries to solve the problem at
> > the virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks and
> > event handling to interface with the guest kernel, and code to
> > initialize, plug and unplug CPUs. No changes are required within the
> > host kernel/KVM except the support for hypercall exit handling in
> > userspace/Qemu, which has recently been added to the kernel. The
> > corresponding Guest kernel changes have been posted on the mailing
> > list [3] [4] by James Morse.
> >
> > (III) Motivation
> >        ==========
> >
> > This allows scaling the guest VM compute capacity on-demand which would
> be
> > useful for the following example scenarios,
> >
> > 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the
> orchestration
> >     framework which could adjust resource requests (CPU and Mem requests)
> for
> >     the containers in a pod, based on usage.
> > 2. Pay-as-you-grow Business Model: Infrastructure provider could allocate
> and
> >     restrict the total number of compute resources available to the guest
> VM
> >     according to the SLA (Service Level Agreement). VM owner could
> request for
> >     more compute to be hot-plugged for some cost.
> >
> > For example, a Kata Container VM starts with a minimum amount of
> > resources (i.e. the "hotplug everything" approach). Why?
> >
> > 1. It allows a faster *boot time* and
> > 2. a reduction in *memory footprint*.
> >
> > A Kata Container VM can boot with just 1 vCPU and more vCPUs can be
> > hot-plugged later as required.
> >
> > (IV) Terminology
> >       ===========
> >
> > (*) Possible CPUs: Total vCPUs which could ever exist in the VM. This
> >                     includes any cold-booted CPUs plus any CPUs which
> >                     could be hot-plugged later.
> >                     - Qemu parameter (-smp maxcpus=N)
> > (*) Present CPUs:  Possible CPUs which are ACPI 'present'. These might
> >                     or might not be ACPI 'enabled'.
> >                     - Present vCPUs = Possible vCPUs (always, on ARM Arch)
> > (*) Enabled CPUs:  Possible CPUs which are ACPI 'present' and 'enabled'
> >                     and can now be 'onlined' (PSCI) for use by the Guest
> >                     Kernel. All cold-booted vCPUs are ACPI 'enabled' at
> >                     boot. Later, using device_add, more vCPUs can be
> >                     hot-plugged and made ACPI 'enabled'.
> >                     - Qemu parameter (-smp cpus=N). Can be used to
> >                       specify some cold-booted vCPUs during VM init.
> >                       Some can be added using the '-device' option.
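The set relationships implied by the terminology above (on ARM: Present == Possible always, Enabled is a subset of Present, and only Enabled vCPUs can be onlined) can be sketched as a small model. Everything here is illustrative: these are plain Python sets, not QEMU data structures.

```python
# Illustrative model of the vCPU state sets described above (not QEMU code).
possible = set(range(6))   # -smp maxcpus=6
enabled  = set(range(4))   # -smp cpus=4 (cold-booted vCPUs)
present  = set(possible)   # On ARM, Present == Possible, always
online   = {0, 1}          # what the guest has onlined via PSCI

# Invariants implied by the terminology section:
assert present == possible          # architectural constraint on ARM
assert enabled <= present           # enabled CPUs are a subset of present
assert online <= enabled            # only enabled CPUs can be onlined

# Hot-plugging vCPU 4 (device_add) makes it ACPI 'enabled':
enabled.add(4)
# The guest may then online it (CPU_ON PSCI call):
online.add(4)
assert online <= enabled <= present == possible
```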
> >
> > (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
> >      ===============================================================
> >
> > A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
> >     1. ARMv8 CPU architecture does not support the concept of physical
> >        CPU hotplug.
> >        a. There are many per-CPU components like PMU, SVE, MTE, Arch
> >           timers etc. whose behaviour needs to be clearly defined when a
> >           CPU is hot(un)plugged. There is no specification for this.
> >
> >     2. Other ARM components like the GIC etc. have not been designed to
> >        realize physical CPU hotplug capability as of now. For example,
> >        a. Every physical CPU has a unique GICC (GIC CPU Interface) by
> >           construction. The architecture does not specify what CPU
> >           hot(un)plug would mean in the context of any of these.
> >        b. CPUs/GICCs are physically connected to unique GICRs (GIC
> >           Redistributors). GIC Redistributors are always part of the
> >           always-on power domain and hence cannot be powered off, as per
> >           the specification.
> >
> > B. Impediments in Firmware/ACPI (Architectural Constraint)
> >
> >     1. Firmware has to expose the GICC, GICR and other per-CPU features
> >        like PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the
> >        architectural constraint stated in section A1(a) above, all
> >        interrupt controller structures of the MADT describing the GIC
> >        CPU Interfaces and the GIC Redistributors MUST be presented by
> >        firmware to the OSPM during boot time.
> >     2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
> >        method to get this kind of information from the firmware even
> >        after boot, and the OSPM has the capability to process it. The
> >        ARM kernel uses information in the MADT interrupt controller
> >        structures to identify the number of Present CPUs during boot and
> >        hence does not allow these to be changed after boot. The number
> >        of present CPUs cannot be changed. It is an architectural
> >        constraint!
> >
> > C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural
> > Constraint)
> >
> >     1. KVM VGIC:
> >         a. Sizing of various VGIC resources like memory regions etc.
> >            related to the redistributor happens only once, at VM init
> >            time, and cannot be changed after initialization. KVM
> >            statically configures these resources based on the number of
> >            vCPUs and the number/size of redistributor ranges.
> >         b. The association between a vCPU and its VGIC redistributor is
> >            fixed at VM init time within KVM, i.e. when the redistributor
> >            iodevs get registered. The VGIC does not allow this
> >            association to be set up or changed after VM initialization
> >            has happened. Physically, every CPU/GICC is uniquely
> >            connected with its redistributor and there is no
> >            architectural way to set this up.
> >     2. KVM vCPUs:
> >         a. The lack of a specification means destruction of KVM vCPUs
> >            does not exist, as there is no reference to tell what to do
> >            with other per-vCPU components like redistributors, arch
> >            timers etc.
> >         b. In fact, KVM does not implement destruction of vCPUs for any
> >            architecture, independent of whether the architecture
> >            actually supports the CPU hotplug feature. For example, even
> >            for x86, KVM does not implement destruction of vCPUs.
> >
> > D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
> >
> >     1. Qemu CPU Objects MUST be created to initialize all the Host KVM
> >        vCPUs to overcome the KVM constraint. KVM vCPUs are created and
> >        initialized when Qemu CPU Objects are realized. But keeping the
> >        QOM CPU objects realized for 'yet-to-be-plugged' vCPUs can create
> >        problems when these new vCPUs shall be plugged using device_add
> >        and a new QOM CPU object shall be created.
> >     2. GICV3State and GICV3CPUState objects MUST be sized over *possible
> >        vCPUs* at VM init time, when the QOM GICV3 Object is realized.
> >        This is because the KVM VGIC can only be initialized once, at
> >        init time. But every GICV3CPUState has an associated QOM CPU
> >        Object; the latter might correspond to vCPUs which are
> >        'yet-to-be-plugged' (unplugged at init).
> >     3. How should new QOM CPU objects be connected back to the
> >        GICV3CPUState objects and disconnected from them in case a CPU is
> >        being hot(un)plugged?
> >     4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be
> >        represented in the QOM for which a KVM vCPU already exists? For
> >        example, whether to keep,
> >         a. No QOM CPU objects, or
> >         b. Unrealized CPU Objects
> >     5. How should vCPU state be exposed via ACPI to the Guest?
> >        Especially for the unplugged/yet-to-be-plugged vCPUs whose CPU
> >        objects might not exist within the QOM, while the Guest always
> >        expects all possible vCPUs to be identified as ACPI *present*
> >        during boot.
> >     6. How should Qemu expose GIC CPU interfaces for the unplugged or
> >        yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?
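The sizing/association problem in points 2 and 3 above can be sketched as a toy model. All class and attribute names below are illustrative stand-ins, not QEMU's actual GICv3State/GICv3CPUState definitions.

```python
# Illustrative model of section D's sizing problem (not QEMU code):
# GICv3 per-CPU state is sized over *possible* vCPUs once at init, while
# QOM CPU objects for yet-to-be-plugged vCPUs may not exist.

class GICv3CPUState:
    def __init__(self):
        self.cpu = None          # associated QOM CPU object, if any

class GICv3State:
    def __init__(self, possible_vcpus):
        # Sized once over possible vCPUs; never resized after VM init
        self.cpu_state = [GICv3CPUState() for _ in range(possible_vcpus)]

gic = GICv3State(possible_vcpus=6)

# Cold-booted vCPUs 0-3 get their QOM CPU objects attached at init:
for i in range(4):
    gic.cpu_state[i].cpu = f"qom-cpu-{i}"    # stand-in for a CPUState *

# Point 3: on hotplug of vCPU 4, the new QOM CPU object is connected back
# to the pre-sized GICV3CPUState slot; on unplug it is disconnected.
gic.cpu_state[4].cpu = "qom-cpu-4"
assert len(gic.cpu_state) == 6               # sizing never changes
gic.cpu_state[4].cpu = None                  # hot-unplug disconnects
```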
> >
> > E. Summary of Approach ([+] Workarounds to problems in sections A, B, C &
> D)
> >
> >     1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e.
> even
> >        for the vCPUs which are yet-to-be-plugged in Qemu but keep them in
> the
> >        powered-off state.
> >     2. After the KVM vCPUs have been initialized in the Host, the KVM
> vCPU
> >        objects corresponding to the unplugged/yet-to-be-plugged vCPUs are
> parked
> >        at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar
> to x86)
> >     3. GICV3State and GICV3CPUState objects are sized over possible vCPUs
> during
> >        VM init time i.e. when Qemu GIC is realized. This in turn sizes
> KVM VGIC
> >        resources like memory regions etc. related to the redistributors
> with the
> >        number of possible KVM vCPUs. This never changes after VM has
> initialized.
> >     4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged
> vCPUs are
> >        released post Host KVM CPU and GIC/VGIC initialization.
> >     5. Build ACPI MADT Table with below updates
> >        a. Number of GIC CPU interface entries (=possible vCPUs)
> >        b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
> >        c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
> >           - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> > 	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware
> Policy)
> > 	 - Some issues with above (details in later sections)
> >     6. Expose below ACPI Status to Guest kernel
> >        a. Always _STA.Present=1 (all possible vCPUs)
> >        b. _STA.Enabled=1 (plugged vCPUs)
> >        c. _STA.Enabled=0 (unplugged vCPUs)
> >     7. vCPU hotplug *realizes* a new QOM CPU object. The following happens,
> >        a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread
> >        b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
> >           - Attaches to QOM CPU object.
> >        c. Reinitializes KVM vCPU in the Host
> >           - Resets the core and sys regs, sets defaults etc.
> >        d. Runs KVM vCPU (created with "start-powered-off")
> > 	 - vCPU thread sleeps (waits for vCPU reset via PSCI)
> >        e. Updates Qemu GIC
> >           - Wires back IRQs related to this vCPU.
> >           - GICV3CPUState association with QOM CPU Object.
> >        f. Updates [6] ACPI _STA.Enabled=1
> >        g. Notifies Guest about new vCPU (via ACPI GED interface)
> > 	 - Guest checks _STA.Enabled=1
> > 	 - Guest adds processor (registers CPU with LDM) [3]
> >        h. Plugs the QOM CPU object in the slot.
> >           - slot-number = cpu-index{socket,cluster,core,thread}
> >        i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC)
> >           - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
> >           - Qemu powers-on KVM vCPU in the Host
> >     8. vCPU hot-unplug *unrealizes* the QOM CPU Object. The following happens,
> >        a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event
> >           - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC)
> >        b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
> >           - Qemu powers-off the KVM vCPU in the Host
> >        c. Guest signals *Eject* vCPU to Qemu
> >        d. Qemu updates [6] ACPI _STA.Enabled=0
> >        e. Updates GIC
> >           - Un-wires IRQs related to this vCPU
> >           - GICV3CPUState association with new QOM CPU Object is updated.
> >        f. Unplugs the vCPU
> > 	 - Removes from slot
> >           - Parks KVM vCPU ("kvm_parked_vcpus" list)
> >           - Unrealizes QOM CPU Object & joins back Qemu vCPU thread
> > 	 - Destroys QOM CPU object
> >        g. Guest checks ACPI _STA.Enabled=0
> >           - Removes processor (unregisters CPU with LDM) [3]
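The plug/unplug flow in steps 1-2, 7 and 8 above can be sketched as a toy state machine. Names like `parked` and `KVMvCPU` are illustrative Python stand-ins; per the diffstat, the real parking logic is touched in accel/kvm/kvm-all.c.

```python
# Toy model of the "pre-create and park" approach described in section E.
# All names are illustrative; this is not QEMU's actual implementation.

class KVMvCPU:
    def __init__(self, vcpu_id):
        self.vcpu_id = vcpu_id
        self.powered_on = False

parked = {}          # models the per-VM "kvm_parked_vcpus" list
realized = {}        # models realized QOM CPU objects

# E.1/E.2: at VM init, pre-create ALL possible vCPUs, park the
# yet-to-be-plugged ones (here: 6 possible, 4 cold-booted).
for vcpu_id in range(6):
    vcpu = KVMvCPU(vcpu_id)
    if vcpu_id < 4:
        realized[vcpu_id] = vcpu   # cold-booted: realized QOM CPU object
        vcpu.powered_on = True
    else:
        parked[vcpu_id] = vcpu     # yet-to-be-plugged: parked, powered off

# E.7: hotplug realizes a QOM CPU object and unparks the KVM vCPU.
def hotplug(vcpu_id):
    vcpu = parked.pop(vcpu_id)     # E.7(b): unpark existing KVM vCPU
    realized[vcpu_id] = vcpu       # E.7(a): attach to (new) QOM CPU object
    vcpu.powered_on = True         # E.7(i): guest CPU_ON via PSCI

# E.8: hot-unplug powers off, parks the KVM vCPU, destroys the QOM object.
def hot_unplug(vcpu_id):
    vcpu = realized.pop(vcpu_id)   # E.8(f): unrealize QOM CPU object
    vcpu.powered_on = False        # E.8(b): guest CPU_OFF via PSCI
    parked[vcpu_id] = vcpu         # E.8(f): park back on the list

hotplug(4)
assert 4 in realized and 4 not in parked
hot_unplug(4)
assert 4 in parked and not parked[4].powered_on
```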
> >
> > F. Work Presented at KVM Forum Conferences:
> >     Details of the above work have been presented at the KVMForum2020
> >     and KVMForum2023 conferences. Slides are available at the links
> >     below:
> >     a. KVMForum 2023
> >        - Challenges Revisited in Supporting Virt CPU Hotplug on
> architectures that don't Support CPU Hotplug (like ARM64)
> >          https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
> >     b. KVMForum 2020
> >        - Challenges in Supporting Virtual CPU Hotplug on SoC Based
> Systems (like ARM64) - Salil Mehta, Huawei
> >          https://sched.co/eE4m
> >
> > (VI) Commands Used
> >       =============
> >
> >      A. Qemu launch commands to init the machine
> >
> >      $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
> >      -cpu host -smp cpus=4,maxcpus=6 \
> >      -m 300M \
> >      -kernel Image \
> >      -initrd rootfs.cpio.gz \
> >      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
> acpi=force" \
> >      -nographic \
> >      -bios QEMU_EFI.fd
> >
> >      B. Hot-(un)plug related commands
> >
> >      # Hotplug a host vCPU (accel=kvm)
> >      $ device_add host-arm-cpu,id=core4,core-id=4
> >
> >      # Hotplug a vCPU (accel=tcg)
> >      $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> >
> >      # Delete the vCPU
> >      $ device_del core4
> >
> >      Sample output on guest after boot:
> >
> >      $ cat /sys/devices/system/cpu/possible
> >      0-5
> >      $ cat /sys/devices/system/cpu/present
> >      0-5
> >      $ cat /sys/devices/system/cpu/enabled
> >      0-3
> >      $ cat /sys/devices/system/cpu/online
> >      0-1
> >      $ cat /sys/devices/system/cpu/offline
> >      2-5
> >
> >      Sample output on guest after hotplug of vCPU=4:
> >
> >      $ cat /sys/devices/system/cpu/possible
> >      0-5
> >      $ cat /sys/devices/system/cpu/present
> >      0-5
> >      $ cat /sys/devices/system/cpu/enabled
> >      0-4
> >      $ cat /sys/devices/system/cpu/online
> >      0-1,4
> >      $ cat /sys/devices/system/cpu/offline
> >      2-3,5
> >
> >      Note: vCPU=4 was explicitly 'onlined' after hot-plug
> >      $ echo 1 > /sys/devices/system/cpu/cpu4/online
> >
> > (VII) Repository
> >        ==========
> >
> >   (*) QEMU changes for vCPU hotplug could be cloned from below site,
> >       https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
> >   (*) Guest Kernel changes (by James Morse, ARM) are available here:
> >       https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
> virtual_cpu_hotplug/rfc/v2
> >
> >
> > (VIII) KNOWN ISSUES
> >         ============
> >
> > 1. Migration has been lightly tested. Below are some of the known issues:
> >     - Occasional CPU stall (not always repeatable)
> >     - Negative test cases like an asymmetric source/destination VM
> >       config cause a dump.
> >     - Migration with TCG is not working properly.
> > 2. TCG with Single threaded mode is broken.
> > 3. HVF and qtest support is broken.
> > 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and
> >     MADT.GICC.online-capable are mutually exclusive, i.e. as per the
> >     change [6] a vCPU cannot be both GICC.Enabled and
> >     GICC.online-capable. This means,
> >        [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
> >     a. If we have to support hot-unplug of the cold-booted vCPUs then
> >        these MUST be specified as GICC.online-capable in the MADT Table
> >        during boot by the firmware/Qemu. But this requirement conflicts
> >        with the requirement to support the new Qemu changes with legacy
> >        OSes which don't understand the MADT.GICC.online-capable bit. A
> >        legacy OS will ignore this bit during boot time and hence these
> >        vCPUs will not appear on such an OS. This is unexpected
> >        behaviour.
> >     b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try
> >        to unplug these cold-booted vCPUs from the OS (which actually
> >        should be blocked by returning an error at Qemu), then features
> >        like 'kexec' will break.
> >     c. As I understand, removal of the cold-booted vCPUs is a required
> >        feature and the x86 world allows it.
> >     d. Hence, either we need a specification change to make the
> >        MADT.GICC.Enabled and MADT.GICC.online-capable bits NOT mutually
> >        exclusive, or NOT support removal of cold-booted vCPUs. In the
> >        latter case, a check can be introduced to bar users from
> >        unplugging cold-booted vCPUs using QMP commands. (Needs
> >        discussion!)
> >        Please check the below patch, part of this patch-set:
> >            [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
> > 5. Code related to the notification to GICV3 about hot(un)plug of a vCPU
> event
> >     might need further discussion.
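The flag conflict described in known issue 4 boils down to a simple bit-level constraint. Bit positions follow ACPI 6.5, Table 5.37 (GICC CPU Interface Flags: Enabled is bit 0, Online Capable is bit 3); the helper function is illustrative, not QEMU code.

```python
# Sketch of the MADT.GICC flags constraint from known issue 4.
# Bit positions per ACPI 6.5, Table 5.37 (GICC CPU Interface Flags).
GICC_ENABLED        = 1 << 0
GICC_ONLINE_CAPABLE = 1 << 3

def gicc_flags_valid(flags):
    # The spec makes Enabled and online-capable mutually exclusive:
    # a GICC cannot advertise both at the same time.
    return not (flags & GICC_ENABLED and flags & GICC_ONLINE_CAPABLE)

# Boot (cold-plugged) vCPU: Enabled=1, online-capable=0  -> valid
assert gicc_flags_valid(GICC_ENABLED)
# Hot-pluggable vCPU: Enabled=0, online-capable=1        -> valid
assert gicc_flags_valid(GICC_ONLINE_CAPABLE)
# What hot-unplugging a cold-booted vCPU would need: both -> invalid today
assert not gicc_flags_valid(GICC_ENABLED | GICC_ONLINE_CAPABLE)
```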
> >
> >
> > (IX) THINGS TO DO
> >       ============
> >
> > 1. Fix the Migration Issues
> > 2. Fix issues related to TCG/Emulation support.
> > 3. Comprehensive Testing. Current testing is very basic.
> >     a. Negative Test cases
> > 4. Qemu Documentation (.rst) needs to be updated.
> > 5. Fix qtest, HVF Support
> > 6. Fix the design issue related to ACPI MADT.GICC flags discussed in
> >     known issues. This might require a UEFI ACPI specification change!
> > 7. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now.
> >
> >   Above is *not* a complete list. Will update later!
> >
> > Best regards
> > Salil.
> >
> > (X) DISCLAIMER
> >      ==========
> >
> > This work is an attempt to present a proof-of-concept of the ARM64 vCPU
> > hotplug implementation to the community. This is *not* production-level
> > code and might have bugs. Only basic testing has been done, on a
> > HiSilicon Kunpeng920 SoC for servers. Once the design and core idea
> > behind the implementation have been verified, more effort can be put
> > into hardening the code.
> >
> > This work is *mostly* along the lines of the discussions which have
> > happened over the previous years [see refs below] across different
> > channels like the mailing-list, the Linaro Open Discussions platform,
> > and various conferences like KVMForum. This RFC is being used as a way
> > to verify the idea mentioned in this cover-letter and to get community
> > views. Once this has been agreed, a formal patch shall be posted to the
> > mailing-list for review.
> >
> > [The concept being presented has been found to work!]
> >
> > (XI) ORGANIZATION OF PATCHES
> >       =======================
> >
> >   A. All patches [Architecture 'agnostic' + 'specific']:
> >
> >     [Patch 1-9, 23, 36] logic required during machine init
> >      (*) Some validation checks
> >      (*) Introduces core-id property and some util functions required
> later.
> >      (*) Refactors Parking logic of vCPUs
> >      (*) Logic to pre-create vCPUs
> >      (*) GIC initialization pre-sized with possible vCPUs.
> >      (*) Some refactoring to have common hot and cold plug logic
> together.
> >      (*) Release of disabled QOM CPU objects in post_cpu_init()
> >      (*) Support of ACPI _OSC method to negotiate platform hotplug
> capabilities
> >     [Patch 10-22] logic related to ACPI at machine init time
> >      (*) Changes required to Enable ACPI for cpu hotplug
> >      (*) Initialization of the ACPI GED framework to cater for CPU Hotplug Events
> >      (*) Build ACPI AML related to CPU control dev
> >      (*) ACPI MADT/MAT changes
> >     [Patch 24-35] Logic required during vCPU hot-(un)plug
> >      (*) Basic framework changes to support vCPU hot-(un)plug
> >      (*) ACPI GED changes for hot-(un)plug hooks.
> >      (*) wire-unwire the IRQs
> >      (*) GIC notification logic
> >      (*) ARMCPU unrealize logic
> >      (*) Handling of SMCC Hypercall Exits by KVM to Qemu
> >
> >   B. Architecture *agnostic* patches part of patch-set:
> >
> >     [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
> >      (*) Refactors Parking logic of vCPUs
> >      (*) Introduces ACPI GED Support for vCPU Hotplug Events
> >      (*) Introduces ACPI AML change for CPU Control Device
> >
> > (XII) REFERENCES
> >        ==========
> >
> > [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-
> salil.mehta@huawei.com/
> > [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-
> salil.mehta@huawei.com/
> > [3] https://lore.kernel.org/lkml/20230203135043.409192-1-
> james.morse@arm.com/
> > [4] https://lore.kernel.org/all/20230913163823.7880-1-
> james.morse@arm.com/
> > [5] https://lore.kernel.org/all/20230404154050.2270077-1-
> oliver.upton@linux.dev/
> > [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> > [7]
> https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic
> -cpu-interface-gicc-structure
> > [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> > [9] https://cloud.google.com/kubernetes-
> engine/docs/concepts/verticalpodautoscaler
> > [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-
> autoscaler.html
> > [11] https://lkml.org/lkml/2019/7/10/235
> > [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> > [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> > [14] https://op-lists.linaro.org/archives/list/linaro-open-
> discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> > [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-
> 07/msg01168.html
> > [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> > [17] https://op-lists.linaro.org/archives/list/linaro-open-
> discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> > [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-
> philippe@linaro.org/
> > [19] https://lore.kernel.org/all/20230913163823.7880-1-
> james.morse@arm.com/
> > [20]
> https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic
> c-cpu-interface-flags
> >
> > (XIII) ACKNOWLEDGEMENTS
> >         ================
> >
> > I would like to take this opportunity to thank below people for various
> > discussions with me over different channels during the development:
> >
> > Marc Zyngier (Google)               Catalin Marinas (ARM),
> > James Morse(ARM),                   Will Deacon (Google),
> > Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
> > Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
> > Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> > Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
> > Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> > Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> > Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> > Shameerali Kolothum (Huawei)        Russell King (Oracle)
> > Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
> > Zengtao/Prime (Huawei),             And all those whom I have missed!
> >
> > Many thanks to below people for their current or past contributions:
> >
> > 1. James Morse (ARM)
> >     (Current Kernel part of vCPU Hotplug Support on AARCH64)
> > 2. Jean-Philippe Brucker (Linaro)
> >     (Prototyped one of the earlier PSCI based POCs [17][18] based on RFC V1)
> > 3. Keqian Zhu (Huawei)
> >     (Co-developed Qemu prototype)
> > 4. Xiongfeng Wang (Huawei)
> >     (Co-developed earlier kernel prototype)
> > 5. Vishnu Pajjuri (Ampere)
> >     (Verification on Ampere ARM64 Platforms + fixes)
> > 6. Miguel Luis (Oracle)
> >     (Verification on Oracle ARM64 Platforms + fixes)
> >
> >
> > Author Salil Mehta (1):
> >    target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> >
> > Jean-Philippe Brucker (2):
> >    hw/acpi: Make _MAT method optional
> >    target/arm/kvm: Write CPU state back to KVM on reset
> >
> > Miguel Luis (1):
> >    tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> >
> > Salil Mehta (33):
> >    arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
> property
> >    cpus-common: Add common CPU utility for possible vCPUs
> >    hw/arm/virt: Move setting of common CPU properties in a function
> >    arm/virt,target/arm: Machine init time change common to vCPU
> {cold|hot}-plug
> >    accel/kvm: Extract common KVM vCPU {creation,parking} code
> >    arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
> >    arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine
> init
> >    arm/virt: Init PMU at host for all possible vcpus
> >    hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
> >    arm/acpi: Enable ACPI support for vcpu hotplug
> >    hw/acpi: Add ACPI CPU hotplug init stub
> >    hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
> >    hw/acpi: Init GED framework with cpu hotplug events
> >    arm/virt: Add cpu hotplug events to GED during creation
> >    arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
> >    hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
> >    arm/virt/acpi: Build CPUs AML with CPU Hotplug support
> >    arm/virt: Make ARM vCPU *present* status ACPI *persistent*
> >    hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits
> to Guest
> >    hw/acpi: Update GED _EVT method AML with cpu scan
> >    hw/arm: MADT Tbl change to size the guest with possible vCPUs
> >    arm/virt: Release objects for *disabled* possible vCPUs after init
> >    hw/acpi: Update ACPI GED framework to support vCPU Hotplug
> >    arm/virt: Add/update basic hot-(un)plug framework
> >    arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
> >    hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
> >    hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register
> info
> >    arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
> >    hw/arm: Changes required for reset and to support next boot
> >    physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
> >    target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
> >    hw/arm: Support hotplug capability check using _OSC method
> >    hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> >
> >   accel/kvm/kvm-all.c                    |  61 +-
> >   accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
> >   cpus-common.c                          |  37 ++
> >   gdbstub/gdbstub.c                      |  13 +
> >   hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
> >   hw/acpi/cpu.c                          |  91 ++-
> >   hw/acpi/generic_event_device.c         |  33 +
> >   hw/arm/Kconfig                         |   1 +
> >   hw/arm/boot.c                          |   2 +-
> >   hw/arm/virt-acpi-build.c               | 110 +++-
> >   hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
> >   hw/core/gpio.c                         |   2 +-
> >   hw/i386/acpi-build.c                   |   2 +-
> >   hw/intc/arm_gicv3.c                    |   1 +
> >   hw/intc/arm_gicv3_common.c             |  66 +-
> >   hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
> >   hw/intc/arm_gicv3_cpuif_common.c       |   5 +
> >   hw/intc/arm_gicv3_kvm.c                |  39 +-
> >   hw/intc/gicv3_internal.h               |   2 +
> >   include/exec/cpu-common.h              |   8 +
> >   include/exec/gdbstub.h                 |   1 +
> >   include/hw/acpi/cpu.h                  |   7 +-
> >   include/hw/acpi/cpu_hotplug.h          |   4 +
> >   include/hw/acpi/generic_event_device.h |   5 +
> >   include/hw/arm/boot.h                  |   2 +
> >   include/hw/arm/virt.h                  |  10 +-
> >   include/hw/core/cpu.h                  |  77 +++
> >   include/hw/intc/arm_gicv3_common.h     |  23 +
> >   include/hw/qdev-core.h                 |   2 +
> >   include/sysemu/kvm.h                   |   2 +
> >   include/tcg/tcg.h                      |   1 +
> >   softmmu/physmem.c                      |  25 +
> >   target/arm/arm-powerctl.c              |  51 +-
> >   target/arm/cpu-qom.h                   |   3 +
> >   target/arm/cpu.c                       | 112 ++++
> >   target/arm/cpu.h                       |  17 +
> >   target/arm/cpu64.c                     |  15 +
> >   target/arm/gdbstub.c                   |   6 +
> >   target/arm/helper.c                    |  27 +-
> >   target/arm/internals.h                 |  12 +-
> >   target/arm/kvm.c                       |  93 ++-
> >   target/arm/kvm64.c                     |  59 +-
> >   target/arm/kvm_arm.h                   |  24 +
> >   target/arm/meson.build                 |   1 +
> >   target/arm/{tcg => }/psci.c            |   8 +
> >   target/arm/tcg/meson.build             |   4 -
> >   tcg/tcg.c                              |  23 +
> >   47 files changed, 1873 insertions(+), 349 deletions(-)
> >   rename target/arm/{tcg => }/psci.c (97%)
> Tested on Ampere's platform for vCPU hotplug/unplug with reboot,
> suspend/resume and save/restore.
> Also tested for vCPU hotplug/unplug along with VM live migration.
> 
> Please feel free to add,
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>

Many thanks for this.

As you are aware, we have now split the above patch-set into:

1. Architecture agnostic patch-set (being reviewed below)
   https://lore.kernel.org/qemu-devel/20231009203601.17584-1-salil.mehta@huawei.com/#t
2. ARM specific patch-set (will soon follow as RFC V3)


If possible, could I request you to sanity-test the architecture
agnostic patch-set as well for regressions and provide your
Tested-by tag for that patch-set too?

This is to ensure these changes, if accepted, do not break any
existing features.


Many thanks again for your past efforts all these times!


Cheers
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-10-11 10:32   ` Salil Mehta via
  2023-10-11 10:32     ` Salil Mehta
@ 2023-10-11 11:08     ` Vishnu Pajjuri
  2023-10-11 20:15       ` Salil Mehta
  1 sibling, 1 reply; 153+ messages in thread
From: Vishnu Pajjuri @ 2023-10-11 11:08 UTC (permalink / raw)
  To: Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai


Hi Salil,

On 11-10-2023 16:02, Salil Mehta wrote:
>
> Hi Vishnu,
>
>> From: Vishnu Pajjuri <vishnu@amperemail.onmicrosoft.com>
>> Sent: Wednesday, October 11, 2023 11:23 AM
>> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>> qemu-arm@nongnu.org
>> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
>> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
>> peter.maydell@linaro.org; richard.henderson@linaro.org;
>> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
>> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
>> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
>> gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
>> alex.bennee@linaro.org; linux@armlinux.org.uk;
>> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
>> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
>> miguel.luis@oracle.com; salil.mehta@opnsrc.net; zhukeqian
>> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
>> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com;
>> maobibo@loongson.cn; lixianglai@loongson.cn
>> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
>> Arch
>>
>> Hi Salil,
>>
>> On 26-09-2023 15:33, Salil Mehta wrote:
>>> [ *REPEAT: Sent patches got held at internal server yesterday* ]
>>>
>>> PROLOGUE
>>> ========
>>>
>>> To assist in review and set the right expectations from this RFC, please
>> first
>>> read below sections *APPENDED AT THE END* of this cover letter,
>>>
>>> 1. Important *DISCLAIMER* [Section (X)]
>>> 2. Work presented at KVMForum Conference (slides available) [Section
>> (V)F]
>>> 3. Organization of patches [Section (XI)]
>>> 4. References [Section (XII)]
>>> 5. Detailed TODO list of the leftover work or work-in-progress [Section
>> (IX)]
>>>
>>> NOTE: There has been interest shown by other organizations in adapting
>>> this series for their architectures. I am planning to split this RFC into
>>> architecture *agnostic* and *specific* patch-sets in subsequent releases.
>>> The ARM-specific patch-set will continue as RFC V3, and the architecture
>>> agnostic patch-set will be floated without the RFC tag and can be
>>> consumed in this QEMU cycle if the MAINTAINERs ack it.
>>>
>>> [Please check section (XI)B for details of the architecture agnostic patches]
>>>
>>>
>>> SECTIONS [I - XIII] are as follows :
>>>
>>> (I) Key Changes (RFC V1 -> RFC V2)
>>>     ==============================
>>>
>>>     RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
>>>
>>> 1. The ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>>>    *online-capable* or *enabled* to the Guest OS at boot time. This means
>>>    the associated CPUs can have ACPI _STA as *enabled* or *disabled* even
>>>    after boot. See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU
>>>    Interface Flags [20].
>>> 2. SMCCC/HVC hypercall exit handling in userspace/QEMU for PSCI
>>>    CPU_{ON,OFF} requests. This is required to {dis}allow onlining a vCPU.
>>> 3. Always presenting unplugged vCPUs in the CPUs ACPI AML code as ACPI
>>>    _STA.PRESENT to the Guest OS. Toggling ACPI _STA.Enabled gives the
>>>    effect of hot{un}plug.
>>> 4. Live Migration works (some issues still remain).
>>> 5. TCG/HVF/qtest do not support hotplug and fall back to the default.
>>> 6. Code for TCG support does exist in this release (it is work in
>>>    progress).
>>> 7. The ACPI _OSC method can now be used by the OSPM to negotiate the QEMU
>>>    VM platform hotplug capability (_OSC Query support is still pending).
>>> 8. Misc. bug fixes.
>>>
>>> (II) Summary
>>>      =======
>>>
>>> This patch-set introduces virtual CPU hotplug support for the ARMv8
>>> architecture in QEMU. The idea is to be able to hotplug and hot-unplug
>>> vCPUs while the guest VM is running, with no reboot required. This does
>>> *not* make any assumption about physical CPU hotplug availability within
>>> the host system, but rather tries to solve the problem at the
>>> virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks and event
>>> handling to interface with the guest kernel, and code to initialize,
>>> plug and unplug CPUs. No changes are required within the host kernel/KVM
>>> except the support for hypercall exit handling in userspace/QEMU, which
>>> has recently been added to the kernel. The corresponding guest kernel
>>> changes have been posted on the mailing-list [3][4] by James Morse.
>>>
>>> (III) Motivation
>>>       ==========
>>>
>>> This allows scaling the guest VM compute capacity on-demand, which would
>>> be useful for the following example scenarios:
>>>
>>> 1. Vertical Pod Autoscaling [9][10] in the cloud: part of the
>>>    orchestration framework which could adjust resource requests (CPU and
>>>    Mem requests) for the containers in a pod, based on usage.
>>> 2. Pay-as-you-grow Business Model: an infrastructure provider could
>>>    allocate and restrict the total number of compute resources available
>>>    to the guest VM according to the SLA (Service Level Agreement). The VM
>>>    owner could request more compute to be hot-plugged for some cost.
>>>
>>> For example, a Kata Container VM starts with a minimum amount of
>>> resources (i.e. the hotplug-everything approach). Why?
>>>
>>> 1. It allows a faster *boot time*, and
>>> 2. a reduction in *memory footprint*.
>>>
>>> A Kata Container VM can boot with just 1 vCPU, and more vCPUs can be
>>> hot-plugged later as per requirement.
>>>
>>> (IV) Terminology
>>>      ===========
>>>
>>> (*) Possible CPUs: Total vCPUs which could ever exist in the VM. This
>>>                    includes any cold-booted CPUs plus any CPUs which
>>>                    could be later hot-plugged.
>>>                    - QEMU parameter: -smp maxcpus=N
>>> (*) Present CPUs:  Possible CPUs which are ACPI 'present'. These might
>>>                    or might not be ACPI 'enabled'.
>>>                    - Present vCPUs = Possible vCPUs (always, on ARM Arch)
>>> (*) Enabled CPUs:  Possible CPUs which are ACPI 'present' and 'enabled'
>>>                    and can now be 'onlined' (PSCI) for use by the Guest
>>>                    Kernel. All cold-booted vCPUs are ACPI 'enabled' at
>>>                    boot. Later, using device_add, more vCPUs can be
>>>                    hot-plugged and made ACPI 'enabled'.
>>>                    - QEMU parameter: -smp cpus=N. Can be used to specify
>>>                    some cold-booted vCPUs during VM init. Some can be
>>>                    added using the '-device' option.
>>>
>>> (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
>>>     ===============================================================
>>>
>>> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>>>    1. The ARMv8 CPU architecture does not support the concept of
>>>       physical CPU hotplug.
>>>       a. There are many per-CPU components like PMU, SVE, MTE, Arch
>>>          timers etc. whose behaviour needs to be clearly defined when a
>>>          CPU is hot(un)plugged. There is no specification for this.
>>>    2. Other ARM components like the GIC etc. have not been designed to
>>>       realize physical CPU hotplug capability as of now. For example,
>>>       a. Every physical CPU has a unique GICC (GIC CPU Interface) by
>>>          construction. The architecture does not specify what CPU
>>>          hot(un)plug would mean in the context of any of these.
>>>       b. CPUs/GICCs are physically connected to unique GICRs (GIC
>>>          Redistributors). GIC Redistributors are always part of the
>>>          always-on power domain and hence cannot be powered-off, as per
>>>          the specification.
>>>
>>> B. Impediments in Firmware/ACPI (Architectural Constraint)
>>>
>>>    1. Firmware has to expose GICC, GICR and other per-CPU features like
>>>       PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the architectural
>>>       constraint stated in section A1(a) above, all interrupt controller
>>>       structures of the MADT describing the GIC CPU Interfaces and the
>>>       GIC Redistributors MUST be presented by firmware to the OSPM during
>>>       boot time.
>>>    2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
>>>       method to get this kind of information from the firmware even
>>>       after boot, and the OSPM has the capability to process these. The
>>>       ARM kernel uses the information in the MADT interrupt controller
>>>       structures to identify the number of Present CPUs during boot and
>>>       hence does not allow these to be changed after boot. The number of
>>>       present CPUs cannot be changed. It is an architectural constraint!
>>>
>>> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural
>>>    Constraint)
>>>    1. KVM VGIC:
>>>       a. Sizing of various VGIC resources like memory regions etc.
>>>          related to the redistributor happens only once and is fixed at
>>>          VM init time; it cannot be changed after initialization has
>>>          happened. KVM statically configures these resources based on
>>>          the number of vCPUs and the number/size of redistributor ranges.
>>>       b. The association between a vCPU and its VGIC redistributor is
>>>          fixed at VM init time within KVM, i.e. when the redistributor
>>>          iodevs get registered. VGIC does not allow this association to
>>>          be set up or changed after VM initialization has happened.
>>>          Physically, every CPU/GICC is uniquely connected with its
>>>          redistributor, and there is no architectural way to set this up.
>>>    2. KVM vCPUs:
>>>       a. The lack of a specification means destruction of KVM vCPUs does
>>>          not exist, as there is no reference to tell what to do with
>>>          other per-vCPU components like redistributors, arch timers etc.
>>>       b. In fact, KVM does not implement destruction of vCPUs for any
>>>          architecture. This is independent of whether the architecture
>>>          actually supports the CPU hotplug feature. For example, even
>>>          for x86, KVM does not implement destruction of vCPUs.
>>>
>>> D. Impediments in QEMU to Support Virtual CPU Hotplug (KVM
>>>    Constraints->Arch)
>>>
>>>    1. QEMU CPU objects MUST be created to initialize all the host KVM
>>>       vCPUs to overcome the KVM constraint. KVM vCPUs are created and
>>>       initialized when QEMU CPU objects are realized. But keeping the
>>>       QOM CPU objects realized for 'yet-to-be-plugged' vCPUs can create
>>>       problems when these new vCPUs shall be plugged using device_add
>>>       and a new QOM CPU object shall be created.
>>>    2. GICV3State and GICV3CPUState objects MUST be sized over *possible
>>>       vCPUs* at VM init time, when the QOM GICV3 object is realized.
>>>       This is because the KVM VGIC can only be initialized once, at init
>>>       time. But every GICV3CPUState has an associated QOM CPU object.
>>>       The latter might correspond to vCPUs which are 'yet-to-be-plugged'
>>>       (unplugged at init).
>>>    3. How should new QOM CPU objects be connected back to the
>>>       GICV3CPUState objects and disconnected from them in case a CPU is
>>>       being hot(un)plugged?
>>>    4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented
>>>       in the QOM for which a KVM vCPU already exists? For example,
>>>       whether to keep,
>>>       a. No QOM CPU objects, or
>>>       b. Unrealized CPU objects
>>>    5. How should vCPU state be exposed via ACPI to the Guest? Especially
>>>       for the unplugged/yet-to-be-plugged vCPUs whose CPU objects might
>>>       not exist within the QOM, while the Guest always expects all
>>>       possible vCPUs to be identified as ACPI *present* during boot.
>>>    6. How should QEMU expose GIC CPU interfaces for the unplugged or
>>>       yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?
>>>
>>> E. Summary of Approach ([+] Workarounds to problems in sections A, B,
>>>    C & D)
>>>    1. At VM init, pre-create all the possible vCPUs in the host KVM,
>>>       i.e. even the vCPUs which are yet-to-be-plugged in QEMU, but keep
>>>       them in the powered-off state.
>>>    2. After the KVM vCPUs have been initialized in the host, the KVM
>>>       vCPU objects corresponding to the unplugged/yet-to-be-plugged
>>>       vCPUs are parked at the existing per-VM "kvm_parked_vcpus" list in
>>>       QEMU (similar to x86).
>>>    3. GICV3State and GICV3CPUState objects are sized over possible vCPUs
>>>       at VM init time, i.e. when the QEMU GIC is realized. This in turn
>>>       sizes KVM VGIC resources like memory regions etc. related to the
>>>       redistributors with the number of possible KVM vCPUs. This never
>>>       changes after the VM has initialized.
>>>    4. QEMU CPU objects corresponding to unplugged/yet-to-be-plugged
>>>       vCPUs are released post host KVM CPU and GIC/VGIC initialization.
>>>    5. Build the ACPI MADT Table with the below updates:
>>>       a. Number of GIC CPU interface entries (= possible vCPUs)
>>>       b. Present boot vCPUs as MADT.GICC.Enabled=1 (not hot[un]pluggable)
>>>       c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>>>          - MADT.GICC.Enabled=0 (mutually exclusive) [6][7]
>>>          - vCPU can be ACPI enabled+onlined after the Guest boots
>>>            (firmware policy)
>>>          - Some issues with the above (details in later sections)
>>>    6. Expose the below ACPI status to the Guest kernel:
>>>       a. Always _STA.Present=1 (all possible vCPUs)
>>>       b. _STA.Enabled=1 (plugged vCPUs)
>>>       c. _STA.Enabled=0 (unplugged vCPUs)
>>>    7. vCPU hotplug *realizes* a new QOM CPU object. The following
>>>       happens:
>>>       a. Realizes and initializes the QOM CPU object & spawns the QEMU
>>>          vCPU thread
>>>       b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
>>>          - Attaches it to the QOM CPU object.
>>>       c. Reinitializes the KVM vCPU in the host
>>>          - Resets the core and sys regs, sets defaults etc.
>>>       d. Runs the KVM vCPU (created with "start-powered-off")
>>>          - vCPU thread sleeps (waits for vCPU reset via PSCI)
>>>       e. Updates the QEMU GIC
>>>          - Wires back IRQs related to this vCPU.
>>>          - GICV3CPUState association with the QOM CPU object.
>>>       f. Updates [6] ACPI _STA.Enabled=1
>>>       g. Notifies the Guest about the new vCPU (via the ACPI GED
>>>          interface)
>>>          - Guest checks _STA.Enabled=1
>>>          - Guest adds the processor (registers CPU with LDM) [3]
>>>       h. Plugs the QOM CPU object into the slot.
>>>          - slot-number = cpu-index{socket,cluster,core,thread}
>>>       i. Guest onlines the vCPU (CPU_ON PSCI call over HVC/SMC)
>>>          - KVM exits the HVC/SMC hypercall [5] to QEMU (policy check).
>>>          - QEMU powers-on the KVM vCPU in the host
>>>    8. vCPU hot-unplug *unrealizes* the QOM CPU object. The following
>>>       happens:
>>>       a. Notifies the Guest (via the ACPI GED interface) of the vCPU
>>>          hot-unplug event
>>>          - Guest offlines the vCPU (CPU_OFF PSCI call over HVC/SMC)
>>>       b. KVM exits the HVC/SMC hypercall [5] to QEMU (policy check).
>>>          - QEMU powers-off the KVM vCPU in the host
>>>       c. Guest signals *Eject* vCPU to QEMU
>>>       d. QEMU updates [6] ACPI _STA.Enabled=0
>>>       e. Updates the GIC
>>>          - Un-wires IRQs related to this vCPU
>>>          - GICV3CPUState association with the QOM CPU object is updated.
>>>       f. Unplugs the vCPU
>>>          - Removes it from the slot
>>>          - Parks the KVM vCPU ("kvm_parked_vcpus" list)
>>>          - Unrealizes the QOM CPU object & joins back the QEMU vCPU
>>>            thread
>>>          - Destroys the QOM CPU object
>>>       g. Guest checks ACPI _STA.Enabled=0
>>>          - Removes the processor (unregisters CPU with LDM) [3]
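The slot-number derivation mentioned in step 7(h) above can be sketched as a
plain linearization of the topology tuple. The helper below is an
illustrative assumption only — the exact mapping used by the patch-set may
differ, and all names here are hypothetical:

```python
# Illustrative sketch (assumption): flattening a (socket, cluster, core,
# thread) tuple into a cpu-index, as hinted by step 7(h) above.
# Not the exact QEMU implementation.

def cpu_index(socket, cluster, core, thread,
              clusters_per_socket, cores_per_cluster, threads_per_core):
    """Map a topology tuple to a flat index (row-major order)."""
    idx = socket
    idx = idx * clusters_per_socket + cluster
    idx = idx * cores_per_cluster + core
    idx = idx * threads_per_core + thread
    return idx

# With 2 sockets x 1 cluster x 3 cores x 1 thread (maxcpus=6, matching the
# example launch command later in this letter), core-id=4 lands at index 4:
print(cpu_index(1, 0, 1, 0, 1, 3, 1))  # -> 4
```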
>>>
>>> F. Work Presented at KVM Forum Conferences:
>>>    Details of the above work have been presented at the KVMForum2020 and
>>>    KVMForum2023 conferences. Slides are available at the below links:
>>>    a. KVMForum 2023
>>>       - Challenges Revisited in Supporting Virt CPU Hotplug on
>>>         architectures that don't Support CPU Hotplug (like ARM64)
>>>         https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
>>>    b. KVMForum 2020
>>>       - Challenges in Supporting Virtual CPU Hotplug on SoC Based
>>>         Systems (like ARM64) - Salil Mehta, Huawei
>>>         https://sched.co/eE4m
>>>
>>> (VI) Commands Used
>>>      =============
>>>
>>>    A. QEMU launch command to init the machine:
>>>
>>>    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>>>      -cpu host -smp cpus=4,maxcpus=6 \
>>>      -m 300M \
>>>      -kernel Image \
>>>      -initrd rootfs.cpio.gz \
>>>      -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>>>      -nographic \
>>>      -bios QEMU_EFI.fd
>>>
>>>    B. Hot-(un)plug related commands:
>>>
>>>    # Hotplug a host vCPU (accel=kvm)
>>>    $ device_add host-arm-cpu,id=core4,core-id=4
>>>
>>>    # Hotplug a vCPU (accel=tcg)
>>>    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
>>>
>>>    # Delete the vCPU
>>>    $ device_del core4
>>>
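The hot-(un)plug commands above are HMP monitor commands; for management
tooling, the equivalent QMP wire commands would look roughly like the
following sketch (driver and property names are carried over from the HMP
examples above, so treat the exact arguments as an assumption):

```json
{ "execute": "device_add",
  "arguments": { "driver": "host-arm-cpu", "id": "core4", "core-id": 4 } }

{ "execute": "device_del",
  "arguments": { "id": "core4" } }
```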
>>>       Sample output on guest after boot:
>>>
>>>       $ cat /sys/devices/system/cpu/possible
>>>       0-5
>>>       $ cat /sys/devices/system/cpu/present
>>>       0-5
>>>       $ cat /sys/devices/system/cpu/enabled
>>>       0-3
>>>       $ cat /sys/devices/system/cpu/online
>>>       0-1
>>>       $ cat /sys/devices/system/cpu/offline
>>>       2-5
>>>
>>>       Sample output on guest after hotplug of vCPU=4:
>>>
>>>       $ cat /sys/devices/system/cpu/possible
>>>       0-5
>>>       $ cat /sys/devices/system/cpu/present
>>>       0-5
>>>       $ cat /sys/devices/system/cpu/enabled
>>>       0-4
>>>       $ cat /sys/devices/system/cpu/online
>>>       0-1,4
>>>       $ cat /sys/devices/system/cpu/offline
>>>       2-3,5
>>>
>>>       Note: vCPU=4 was explicitly 'onlined' after hot-plug
>>>       $ echo 1 > /sys/devices/system/cpu/cpu4/online
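The sysfs files shown above print CPU lists in the kernel's range notation
("0-1,4", "2-3,5", …). A small helper for expanding such strings when
scripting against these files might look like this (an illustrative parser,
not part of the patch-set):

```python
def parse_cpu_list(s):
    """Expand a sysfs CPU list string like '0-1,4' into a sorted list."""
    cpus = set()
    s = s.strip()
    if not s:
        return []
    for chunk in s.split(","):
        if "-" in chunk:
            lo, hi = chunk.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)

# The post-hotplug sample output above: online=0-1,4 and offline=2-3,5
# together cover all six possible vCPUs.
online = parse_cpu_list("0-1,4")
offline = parse_cpu_list("2-3,5")
print(online)                    # -> [0, 1, 4]
print(sorted(online + offline))  # -> [0, 1, 2, 3, 4, 5]
```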
>>>
>>> (VII) Repository
>>>       ==========
>>>
>>> (*) QEMU changes for vCPU hotplug can be cloned from the below site:
>>>     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
>>> (*) Guest kernel changes (by James Morse, ARM) are available here:
>>>     https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
>>>
>>> (VIII) KNOWN ISSUES
>>>        ============
>>>
>>> 1. Migration has been lightly tested. Below are some of the known
>>>    issues:
>>>    - Occasional CPU stall (not always repeatable)
>>>    - Negative test cases like an asymmetric source/destination VM
>>>      config cause a dump.
>>>    - Migration with TCG is not working properly.
>>> 2. TCG with single-threaded mode is broken.
>>> 3. HVF and qtest support is broken.
>>> 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and
>>>    MADT.GICC.online-capable are mutually exclusive, i.e. as per the
>>>    change [6] a vCPU cannot be both GICC.Enabled and
>>>    GICC.online-capable. This means,
>>>    [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>>>    a. If we have to support hot-unplug of the cold-booted vCPUs then
>>>       these MUST be specified as GICC.online-capable in the MADT Table
>>>       during boot by the firmware/QEMU. But this requirement conflicts
>>>       with the requirement to support the new QEMU changes with legacy
>>>       OSes which don't understand the MADT.GICC.online-capable bit. A
>>>       legacy OS will ignore this bit during boot time and hence these
>>>       vCPUs will not appear on such an OS. This is unexpected behaviour.
>>>    b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try
>>>       to unplug these cold-booted vCPUs from the OS (which actually
>>>       should be blocked by returning an error at QEMU), then features
>>>       like 'kexec' will break.
>>>    c. As I understand, removal of the cold-booted vCPUs is a required
>>>       feature, and the x86 world allows it.
>>>    d. Hence, either we need a specification change to make the
>>>       MADT.GICC.Enabled and MADT.GICC.online-capable bits NOT mutually
>>>       exclusive, or we must NOT support removal of cold-booted vCPUs.
>>>       In the latter case, a check can be introduced to bar the users
>>>       from unplugging vCPUs, which were cold-booted, using QMP commands.
>>>       (Needs discussion!)
>>>       Please check the below patch, part of this patch-set:
>>>           [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
>>> 5. Code related to the notification to GICV3 about the hot(un)plug of a
>>>    vCPU event might need further discussion.
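The mutual-exclusivity rule in known issue 4 can be expressed as a one-line
flag check. The sketch below uses bit positions as we read ACPI 6.5 Table
5.37 (Enabled = bit 0, Online Capable = bit 3); treat the positions as an
assumption rather than a normative statement:

```python
# Sketch of the MADT GICC Flags mutual-exclusivity rule from known issue 4.
# Bit positions per our reading of ACPI 6.5 Table 5.37 (assumption).

GICC_ENABLED        = 1 << 0   # Enabled flag
GICC_ONLINE_CAPABLE = 1 << 3   # Online Capable flag

def gicc_flags_valid(flags):
    """A GICC entry must not be both Enabled and Online Capable."""
    return not ((flags & GICC_ENABLED) and (flags & GICC_ONLINE_CAPABLE))

print(gicc_flags_valid(GICC_ENABLED))                        # -> True  (cold-booted vCPU)
print(gicc_flags_valid(GICC_ONLINE_CAPABLE))                 # -> True  (hotpluggable vCPU)
print(gicc_flags_valid(GICC_ENABLED | GICC_ONLINE_CAPABLE))  # -> False (disallowed)
```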
>>>
>>>
>>> (IX) THINGS TO DO
>>>      ============
>>>
>>> 1. Fix the migration issues.
>>> 2. Fix issues related to TCG/emulation support.
>>> 3. Comprehensive testing. Current testing is very basic.
>>>    a. Negative test cases
>>> 4. QEMU documentation (.rst) needs to be updated.
>>> 5. Fix qtest, HVF support.
>>> 6. Fix the design issue related to the ACPI MADT.GICC flags discussed
>>>    in the known issues. This might require a UEFI ACPI specification
>>>    change!
>>> 7. Add ACPI _OSC 'Query' support. Only part of the _OSC support exists
>>>    now.
>>>
>>> The above is *not* a complete list. Will update later!
>>>
>>> Best regards
>>> Salil.
>>>
>>> (X) DISCLAIMER
>>>     ==========
>>>
>>> This work is an attempt to present a proof-of-concept of the ARM64 vCPU
>>> hotplug implementation to the community. This is *not* production level
>>> code and might have bugs. Only basic testing has been done on the
>>> HiSilicon Kunpeng920 SoC for servers. Once the design and the core idea
>>> behind the implementation have been verified, more effort can be put
>>> into hardening the code.
>>>
>>> This work is *mostly* in line with the discussions which have happened
>>> over the previous years [see refs below] across different channels like
>>> the mailing-list, the Linaro Open Discussions platform, and various
>>> conferences like KVMForum etc. This RFC is being used as a way to verify
>>> the idea mentioned in this cover-letter and to get community views. Once
>>> this has been agreed, a formal patch shall be posted to the mailing-list
>>> for review.
>>>
>>> [The concept being presented has been found to work!]
>>>
>>> (XI) ORGANIZATION OF PATCHES
>>>      =======================
>>>
>>> A. All patches [Architecture 'agnostic' + 'specific']:
>>>
>>>    [Patch 1-9, 23, 36] Logic required during machine init
>>>    (*) Some validation checks
>>>    (*) Introduces the core-id property and some util functions required
>>>        later.
>>>    (*) Refactors the parking logic of vCPUs
>>>    (*) Logic to pre-create vCPUs
>>>    (*) GIC initialization pre-sized with possible vCPUs.
>>>    (*) Some refactoring to have the common hot and cold plug logic
>>>        together.
>>>    (*) Release of disabled QOM CPU objects in post_cpu_init()
>>>    (*) Support of the ACPI _OSC method to negotiate platform hotplug
>>>        capabilities
>>>    [Patch 10-22] Logic related to ACPI at machine init time
>>>    (*) Changes required to enable ACPI for CPU hotplug
>>>    (*) Initialization of the ACPI GED framework to cater for CPU
>>>        hotplug events
>>>    (*) Build ACPI AML related to the CPU control device
>>>    (*) ACPI MADT/MAT changes
>>>    [Patch 24-35] Logic required during vCPU hot-(un)plug
>>>    (*) Basic framework changes to support vCPU hot-(un)plug
>>>    (*) ACPI GED changes for hot-(un)plug hooks.
>>>    (*) Wire-unwire the IRQs
>>>    (*) GIC notification logic
>>>    (*) ARMCPU unrealize logic
>>>    (*) Handling of SMCCC hypercall exits by KVM to QEMU
>>>
>>> B. Architecture *agnostic* patches part of the patch-set:
>>>
>>>    [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
>>>    (*) Refactors the parking logic of vCPUs
>>>    (*) Introduces ACPI GED support for vCPU hotplug events
>>>    (*) Introduces the ACPI AML change for the CPU control device
>>>
>>> (XII) REFERENCES
>>>       ==========
>>>
>>> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
>>> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
>>> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
>>> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
>>> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
>>> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
>>> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
>>> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
>>> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
>>> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
>>> [11] https://lkml.org/lkml/2019/7/10/235
>>> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
>>> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
>>> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
>>> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
>>> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
>>> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
>>> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
>>> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
>>> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
>>>
>>> (XIII) ACKNOWLEDGEMENTS
>>>        ================
>>>
>>> I would like to take this opportunity to thank the below people for
>>> various discussions with me over different channels during the
>>> development:
>>>
>>> Marc Zyngier (Google),              Catalin Marinas (ARM),
>>> James Morse (ARM),                  Will Deacon (Google),
>>> Jean-Philippe Brucker (Linaro),     Sudeep Holla (ARM),
>>> Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
>>> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
>>> Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
>>> Andrew Jones (Redhat),              Karl Heubaum (Oracle),
>>> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
>>> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
>>> Shameerali Kolothum (Huawei),       Russell King (Oracle),
>>> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
>>> Zengtao/Prime (Huawei),             and all those whom I have missed!
>>>
>>> Many thanks to the below people for their current or past contributions:
>>>
>>> 1. James Morse (ARM)
>>>    (Current kernel part of vCPU hotplug support on AARCH64)
>>> 2. Jean-Philippe Brucker (Linaro)
>>>    (Prototyped one of the earlier PSCI-based PoCs [17][18] based on
>>>    RFC V1)
>>> 3. Keqian Zhu (Huawei)
>>>    (Co-developed the QEMU prototype)
>>> 4. Xiongfeng Wang (Huawei)
>>>    (Co-developed the earlier kernel prototype)
>>> 5. Vishnu Pajjuri (Ampere)
>>>    (Verification on Ampere ARM64 platforms + fixes)
>>> 6. Miguel Luis (Oracle)
>>>    (Verification on Oracle ARM64 platforms + fixes)
>>>
>>>
>>> Author Salil Mehta (1):
>>>   target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
>>>
>>> Jean-Philippe Brucker (2):
>>>   hw/acpi: Make _MAT method optional
>>>   target/arm/kvm: Write CPU state back to KVM on reset
>>>
>>> Miguel Luis (1):
>>>   tcg/mttcg: enable threads to unregister in tcg_ctxs[]
>>>
>>> Salil Mehta (33):
>>>   arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
>>>   cpus-common: Add common CPU utility for possible vCPUs
>>>   hw/arm/virt: Move setting of common CPU properties in a function
>>>   arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
>>>   accel/kvm: Extract common KVM vCPU {creation,parking} code
>>>   arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>>>   arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
>>>   arm/virt: Init PMU at host for all possible vcpus
>>>   hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
>>>   arm/acpi: Enable ACPI support for vcpu hotplug
>>>   hw/acpi: Add ACPI CPU hotplug init stub
>>>   hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
>>>   hw/acpi: Init GED framework with cpu hotplug events
>>>   arm/virt: Add cpu hotplug events to GED during creation
>>>   arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>>>   hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
>>>   arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>>>   arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>>>   hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
>>>   hw/acpi: Update GED _EVT method AML with cpu scan
>>>   hw/arm: MADT Tbl change to size the guest with possible vCPUs
>>>   arm/virt: Release objects for *disabled* possible vCPUs after init
>>>   hw/acpi: Update ACPI GED framework to support vCPU Hotplug
>>>   arm/virt: Add/update basic hot-(un)plug framework
>>>   arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>>>   hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>>>   hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
>>>   arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>>>   hw/arm: Changes required for reset and to support next boot
>>>   physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
>>>   target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>>>   hw/arm: Support hotplug capability check using _OSC method
>>>   hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
>>>
>>>    accel/kvm/kvm-all.c                    |  61 +-
>>>    accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
>>>    cpus-common.c                          |  37 ++
>>>    gdbstub/gdbstub.c                      |  13 +
>>>    hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
>>>    hw/acpi/cpu.c                          |  91 ++-
>>>    hw/acpi/generic_event_device.c         |  33 +
>>>    hw/arm/Kconfig                         |   1 +
>>>    hw/arm/boot.c                          |   2 +-
>>>    hw/arm/virt-acpi-build.c               | 110 +++-
>>>    hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
>>>    hw/core/gpio.c                         |   2 +-
>>>    hw/i386/acpi-build.c                   |   2 +-
>>>    hw/intc/arm_gicv3.c                    |   1 +
>>>    hw/intc/arm_gicv3_common.c             |  66 +-
>>>    hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
>>>    hw/intc/arm_gicv3_cpuif_common.c       |   5 +
>>>    hw/intc/arm_gicv3_kvm.c                |  39 +-
>>>    hw/intc/gicv3_internal.h               |   2 +
>>>    include/exec/cpu-common.h              |   8 +
>>>    include/exec/gdbstub.h                 |   1 +
>>>    include/hw/acpi/cpu.h                  |   7 +-
>>>    include/hw/acpi/cpu_hotplug.h          |   4 +
>>>    include/hw/acpi/generic_event_device.h |   5 +
>>>    include/hw/arm/boot.h                  |   2 +
>>>    include/hw/arm/virt.h                  |  10 +-
>>>    include/hw/core/cpu.h                  |  77 +++
>>>    include/hw/intc/arm_gicv3_common.h     |  23 +
>>>    include/hw/qdev-core.h                 |   2 +
>>>    include/sysemu/kvm.h                   |   2 +
>>>    include/tcg/tcg.h                      |   1 +
>>>    softmmu/physmem.c                      |  25 +
>>>    target/arm/arm-powerctl.c              |  51 +-
>>>    target/arm/cpu-qom.h                   |   3 +
>>>    target/arm/cpu.c                       | 112 ++++
>>>    target/arm/cpu.h                       |  17 +
>>>    target/arm/cpu64.c                     |  15 +
>>>    target/arm/gdbstub.c                   |   6 +
>>>    target/arm/helper.c                    |  27 +-
>>>    target/arm/internals.h                 |  12 +-
>>>    target/arm/kvm.c                       |  93 ++-
>>>    target/arm/kvm64.c                     |  59 +-
>>>    target/arm/kvm_arm.h                   |  24 +
>>>    target/arm/meson.build                 |   1 +
>>>    target/arm/{tcg => }/psci.c            |   8 +
>>>    target/arm/tcg/meson.build             |   4 -
>>>    tcg/tcg.c                              |  23 +
>>>    47 files changed, 1873 insertions(+), 349 deletions(-)
>>>    rename target/arm/{tcg => }/psci.c (97%)
>> Tested on Ampere's platform for vCPU hotplug/unplug with reboot,
>> suspend/resume and save/restore.
>> Also tested for vCPU hotplug/unplug along with VM live migration.
>>
>> Please feel free to add,
>> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Many thanks for this.
>
> As you are aware, we have now split the above patch-set into:
>
> 1. Architecture agnostic patch-set (being reviewed below)
>     https://lore.kernel.org/qemu-devel/20231009203601.17584-1-salil.mehta@huawei.com/#t
> 2. ARM specific patch-set (would soon follow as RFC V3)
>
>
> If possible, can I request you to sanity-test the architecture
> agnostic patch-set as well for regression, and provide the
> Tested-by tag for that patch-set too?

Sure, I'll do that.

> This is to ensure these changes if accepted do not break any
> existing features.
>
>
> Many thanks again for your past efforts all these times!
It was a great working experience with you,
and my pleasure to contribute to a new feature like vCPU hotplug on the
ARM64 platform.

_Regards_,
-Vishnu
>
> Cheers
> Salil.
>
>

[-- Attachment #2: Type: text/html, Size: 64204 bytes --]

^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-10-11 11:08     ` Vishnu Pajjuri
@ 2023-10-11 20:15       ` Salil Mehta
  0 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta @ 2023-10-11 20:15 UTC (permalink / raw)
  To: Vishnu Pajjuri, Salil Mehta, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, gshan,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, zhukeqian, wangxiongfeng (C),
	wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Vishnu

On 11/10/2023 12:08, Vishnu Pajjuri wrote:
> Hi Salil,
> 
> On 11-10-2023 16:02, Salil Mehta wrote:

[...]

>>> From: Vishnu Pajjuri<vishnu@amperemail.onmicrosoft.com>
>>> Sent: Wednesday, October 11, 2023 11:23 AM
>>> To: Salil Mehta<salil.mehta@huawei.com>;qemu-devel@nongnu.org; qemu-
>>> arm@nongnu.org
>>> Cc:maz@kernel.org;jean-philippe@linaro.org; Jonathan Cameron
>>> <jonathan.cameron@huawei.com>;lpieralisi@kernel.org;
>>> peter.maydell@linaro.org;richard.henderson@linaro.org;
>>> imammedo@redhat.com;andrew.jones@linux.dev;david@redhat.com;
>>> philmd@linaro.org;eric.auger@redhat.com;will@kernel.org;ardb@kernel.org;
>>> oliver.upton@linux.dev;pbonzini@redhat.com;mst@redhat.com;
>>> gshan@redhat.com;rafael@kernel.org;borntraeger@linux.ibm.com;
>>> alex.bennee@linaro.org;linux@armlinux.org.uk;
>>> darren@os.amperecomputing.com;ilkka@os.amperecomputing.com;
>>> vishnu@os.amperecomputing.com;karl.heubaum@oracle.com;
>>> miguel.luis@oracle.com;salil.mehta@opnsrc.net; zhukeqian
>>> <zhukeqian1@huawei.com>; wangxiongfeng (C)<wangxiongfeng2@huawei.com>;
>>> wangyanan (Y)<wangyanan55@huawei.com>;jiakernel2@gmail.com;
>>> maobibo@loongson.cn;lixianglai@loongson.cn
>>> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
>>>
>>> Hi Salil,
>>>
>>> On 26-09-2023 15:33, Salil Mehta wrote:
>>>> [ *REPEAT: Sent patches got held at internal server yesterday* ]
>>>>
>>>> PROLOGUE
>>>> ========

[...]

>>> Tested on Ampere's platform for vCPU hotplug/unplug with reboot,
>>> suspend/resume and save/restore.
>>> Also tested for vCPU hotplug/unplug along with VM live migration.
>>>
>>> Please feel free to add,
>>> Tested-by: Vishnu Pajjuri<vishnu@os.amperecomputing.com>
>> Many thanks for this.
>>
>> As you are aware, we have now split the above patch-set into:
>>
>> 1. Architecture agnostic patch-set (being reviewed below)
>>     https://lore.kernel.org/qemu-devel/20231009203601.17584-1-salil.mehta@huawei.com/#t
>> 2. ARM specific patch-set (will soon follow as RFC V3)
>>
>>
>> If possible, can I request you to sanity test the architecture
>> agnostic patch-set for regressions and provide your Tested-by
>> tag for it as well?
> 
> Sure, I'll do.


Thanks. I have added your tag in the architecture agnostic patch-set

https://lore.kernel.org/qemu-devel/20231011194355.15628-1-salil.mehta@huawei.com/ 



>> This is to ensure that these changes, if accepted, do not break
>> any existing features.
>>
>>
>> Many thanks again for your past efforts all these times!
> It was a great working experience with you, and my pleasure to
> contribute to a new feature like vCPU hotplug on the ARM64 platform.


You are welcome.


Cheers
Salil






^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
                   ` (32 preceding siblings ...)
  2023-10-11 10:23 ` [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
@ 2023-10-12 17:02 ` Miguel Luis
  2023-10-12 17:54   ` Salil Mehta via
  33 siblings, 1 reply; 153+ messages in thread
From: Miguel Luis @ 2023-10-12 17:02 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel, qemu-arm, Marc Zyngier, jean-philippe,
	jonathan.cameron, lpieralisi, Peter Maydell, Richard Henderson,
	imammedo, andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, mst, gshan, rafael, borntraeger,
	alex.bennee, linux, darren, ilkka, vishnu, Karl Heubaum,
	salil.mehta, zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2,
	maobibo, lixianglai

Hi Salil,

> On 26 Sep 2023, at 10:03, Salil Mehta <salil.mehta@huawei.com> wrote:
> 
> [ *REPEAT: Sent patches got held at internal server yesterday* ]
> 
> PROLOGUE
> ========
> 
> To assist in review and set the right expectations from this RFC, please first
> read below sections *APPENDED AT THE END* of this cover letter,
> 
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of the leftover work or work-in-progress [Section (IX)]
> 
> NOTE: There has been an interest shown by other organizations in adapting
> this series for their architecture. I am planning to split this RFC into
> architecture *agnostic* and *specific* patch-sets in subsequent releases. ARM
> specific patch-set will continue as RFC V3 and architecture agnostic patch-set
> will be floated without RFC tag and can be consumed in this Qemu cycle if
> MAINTAINERs ack it.
> 
> [Please check section (XI)B for details of architecture agnostic patches]
> 
> 
> SECTIONS [I - XIII] are as follows :
> 
> (I) Key Changes (RFC V1 -> RFC V2)
>    ==================================
> 
>    RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> 
> 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>   *online-capable* or *enabled* to the Guest OS at boot time. This means
>   associated CPUs can have ACPI _STA as *enabled* or *disabled* even after boot.
>   See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface Flags [20].
> 2. SMCCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>   requests. This is required to {dis}allow online'ing a vCPU.
> 3. Always presenting unplugged vCPUs in the CPUs' ACPI AML code as ACPI
>   _STA.PRESENT to the Guest OS. Toggling ACPI _STA.Enabled to give the effect
>   of hot{un}plug.
> 4. Live Migration works (some issues are still there)
> 5. TCG/HVF/qtest does not support Hotplug and falls back to default.
> 6. Code for TCG support does exist in this release (it is a work-in-progress)
> 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
>   hotplug capability (_OSC Query support still pending)
> 8. Misc. Bug fixes
> 
> (II) Summary
>     =======
> 
> This patch-set introduces virtual CPU hotplug support for the ARMv8 architecture
> in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest VM
> is running, without requiring a reboot. This does *not* make any assumption about
> the availability of physical CPU hotplug within the host system but rather tries to
> solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks
> and event handling to interface with the guest kernel, and code to initialize, plug
> and unplug CPUs. No changes are required within the host kernel/KVM except the
> support of hypercall exit handling in the user-space/Qemu, which has recently
> been added to the kernel. The corresponding guest kernel changes have been
> posted on the mailing-list [3][4] by James Morse.
> 
> (III) Motivation
>      ==========
> 
> This allows scaling the guest VM compute capacity on-demand which would be
> useful for the following example scenarios,
> 
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
>   framework which could adjust resource requests (CPU and Mem requests) for
>   the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: Infrastructure provider could allocate and
>   restrict the total number of compute resources available to the guest VM
>   according to the SLA (Service Level Agreement). The VM owner could request
>   more compute to be hot-plugged at some cost.
> 
> For example, a Kata Container VM starts with a minimum amount of resources (i.e.
> the hotplug-everything approach). Why?
> 
> 1. Allowing faster *boot time* and
> 2. Reduction in *memory footprint*
> 
> Kata Container VM can boot with just 1 vCPU and then later more vCPUs can be
> hot-plugged as per requirement.
> 
> (IV) Terminology
>     ===========
> 
> (*) Possible CPUs:  Total vCPUs which could ever exist in the VM. This includes
>                    any cold booted CPUs plus any CPUs which could be later
>                    hot-plugged.
>                    - Qemu parameter(-smp maxcpus=N)
> (*) Present CPUs:   Possible CPUs which are ACPI 'present'. These might or might
>                    not be ACPI 'enabled'. 
>                    - Present vCPUs = Possible vCPUs (Always on ARM Arch)
> (*) Enabled CPUs:   Possible CPUs which are ACPI 'present' and 'enabled' and can
>                    now be 'onlined' (PSCI) for use by the Guest Kernel. All cold
>                    booted vCPUs are ACPI 'enabled' at boot. Later, using
>                    device_add, more vCPUs can be hotplugged and made ACPI
>                    'enabled'.
>                    - Qemu parameter(-smp cpus=N). Can be used to specify some
>                      cold booted vCPUs during VM init. Some can be added using
>                      the '-device' option.
> 
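[Editor's note: to make the terminology above concrete, here is a small, purely
illustrative Python model of the CPU sets and the ARM-specific invariants
(present == possible always; enabled is a subset of present). The names and
helper functions are invented for this sketch and are not QEMU code.]

```python
# Toy model of the vCPU sets described above (illustrative only, not QEMU code).
maxcpus = 6   # -smp maxcpus=6 -> possible vCPUs
cpus = 4      # -smp cpus=4    -> cold-booted, ACPI 'enabled' vCPUs

possible = set(range(maxcpus))
present = set(possible)         # on ARM, all possible vCPUs stay ACPI 'present'
enabled = set(range(cpus))      # cold-booted vCPUs are ACPI 'enabled' at boot

def hotplug(cpu_id):
    """device_add makes a possible-but-disabled vCPU ACPI 'enabled'."""
    assert cpu_id in possible and cpu_id not in enabled
    enabled.add(cpu_id)

def hot_unplug(cpu_id):
    """device_del clears ACPI 'enabled'; the vCPU stays 'present'."""
    assert cpu_id in enabled
    enabled.discard(cpu_id)

hotplug(4)
assert enabled == {0, 1, 2, 3, 4}
assert present == possible      # the present set never changes on ARM
```
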
> (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
>    ===============================================================
> 
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>   1. The ARMv8 CPU architecture does not support the concept of physical CPU
>      hotplug.
>      a. There are many per-CPU components like PMU, SVE, MTE, Arch timers etc.
>         whose behaviour needs to be clearly defined when a CPU is hot(un)plugged.
>         There is no specification for this.
> 
>   2. Other ARM components like the GIC etc. have not been designed to realize
>      physical CPU hotplug capability as of now. For example,
>      a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
>         The architecture does not specify what CPU hot(un)plug would mean in
>         the context of any of these.
>      b. CPUs/GICC are physically connected to a unique GICR (GIC Redistributor).
>         GIC Redistributors are always part of the always-on power domain and
>         hence cannot be powered off per the specification.
> 
> B. Impediments in Firmware/ACPI (Architectural Constraint)
> 
>   1. Firmware has to expose GICC, GICR and other per-CPU features like PMU,
>      SVE, MTE, Arch Timers etc. to the OS. Due to the architectural constraint
>      stated in section A1(a) above, all interrupt controller structures of the
>      MADT describing GIC CPU Interfaces and the GIC Redistributors MUST be
>      presented by firmware to the OSPM during boot time.
>   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method
>      to get this kind of information from the firmware even after boot, and
>      the OSPM has the capability to process these. The ARM kernel uses the
>      information in the MADT interrupt controller structures to identify the
>      number of Present CPUs during boot and hence does not allow these to
>      change after boot. The number of present CPUs cannot be changed. It is an
>      architectural constraint!
> 
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)
> 
>   1. KVM VGIC:
>       a. Sizing of various VGIC resources like memory regions etc. related to
>          the redistributor happens only once, at VM init time, and cannot be
>          changed after initialization has happened. KVM statically configures
>          these resources based on the number of vCPUs and the number/size of
>          redistributor ranges.
>       b. The association between a vCPU and its VGIC redistributor is fixed at
>          VM init time within KVM, i.e. when the redistributor iodevs get
>          registered. VGIC does not allow this association to be set up or
>          changed after VM initialization has happened. Physically, every
>          CPU/GICC is uniquely connected with its redistributor and there is no
>          architectural way to set this up.
>   2. KVM vCPUs:
>       a. The lack of a specification means destruction of KVM vCPUs does not
>          exist, as there is no reference describing what to do with other
>          per-vCPU components like redistributors, arch timers etc.
>       b. In fact, KVM does not implement destruction of vCPUs for any
>          architecture, regardless of whether the architecture actually
>          supports the CPU hotplug feature. For example, even for x86, KVM
>          does not implement destruction of vCPUs.
> 
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
> 
>   1. Qemu CPU Objects MUST be created to initialize all the Host KVM vCPUs to
>      overcome the KVM constraint. KVM vCPUs are created and initialized when
>      Qemu CPU Objects are realized. But keeping the QOM CPU objects realized
>      for 'yet-to-be-plugged' vCPUs can create problems when these new vCPUs
>      are later plugged using device_add and a new QOM CPU object is created.
>   2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
>      during VM init time while the QOM GICV3 Object is realized. This is
>      because the KVM VGIC can only be initialized once, at init time. But every
>      GICV3CPUState has an associated QOM CPU Object. The latter might
>      correspond to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
>   3. How should new QOM CPU objects be connected back to the GICV3CPUState
>      objects and disconnected from them in case a CPU is being hot(un)plugged?
>   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
>      QOM for which a KVM vCPU already exists? For example, whether to keep,
>       a. No QOM CPU objects, or
>       b. Unrealized CPU Objects
>   5. How should vCPU state be exposed via ACPI to the Guest? Especially for
>      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
>      within the QOM, but the Guest always expects all possible vCPUs to be
>      identified as ACPI *present* during boot.
>   6. How should Qemu expose GIC CPU interfaces for the unplugged or
>      yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?
> 
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
> 
>   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e. even
>      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
>      powered-off state.
>   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
>      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
>      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
>   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
>      VM init time i.e. when Qemu GIC is realized. This in turn sizes KVM VGIC
>      resources like memory regions etc. related to the redistributors with the
>      number of possible KVM vCPUs. This never changes after VM has initialized.
>   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
>      released post Host KVM CPU and GIC/VGIC initialization.
>   5. Build ACPI MADT Table with below updates:
>      a. Number of GIC CPU interface entries (=possible vCPUs)
>      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
>      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
>         - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy)
>         - Some issues with above (details in later sections)
>   6. Expose below ACPI Status to Guest kernel
>      a. Always _STA.Present=1 (all possible vCPUs)
>      b. _STA.Enabled=1 (plugged vCPUs)
>      c. _STA.Enabled=0 (unplugged vCPUs)
>   7. vCPU hotplug *realizes* a new QOM CPU object. The following happens:
>      a. Realizes and initializes the QOM CPU Object & spawns the Qemu vCPU thread
>      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
>         - Attaches it to the QOM CPU object.
>      c. Reinitializes the KVM vCPU in the Host
>         - Resets the core and sys regs, sets defaults etc.
>      d. Runs the KVM vCPU (created with "start-powered-off")
>         - vCPU thread sleeps (waits for vCPU reset via PSCI)
>      e. Updates the Qemu GIC
>         - Wires back IRQs related to this vCPU.
>         - GICV3CPUState association with the QOM CPU Object.
>      f. Updates [6] ACPI _STA.Enabled=1
>      g. Notifies the Guest about the new vCPU (via ACPI GED interface)
>         - Guest checks _STA.Enabled=1
>         - Guest adds processor (registers CPU with LDM) [3]
>      h. Plugs the QOM CPU object into the slot.
>         - slot-number = cpu-index{socket,cluster,core,thread}
>      i. Guest onlines the vCPU (CPU_ON PSCI call over HVC/SMC)
>         - KVM exits the HVC/SMC Hypercall [5] to Qemu (Policy Check).
>         - Qemu powers on the KVM vCPU in the Host
>   8. vCPU hot-unplug *unrealizes* the QOM CPU Object. The following happens:
>      a. Notifies the Guest (via ACPI GED interface) of the vCPU hot-unplug event
>         - Guest offlines the vCPU (CPU_OFF PSCI call over HVC/SMC)
>      b. KVM exits the HVC/SMC Hypercall [5] to Qemu (Policy Check).
>         - Qemu powers off the KVM vCPU in the Host
>      c. Guest signals *Eject* vCPU to Qemu
>      d. Qemu updates [6] ACPI _STA.Enabled=0
>      e. Updates the GIC
>         - Un-wires IRQs related to this vCPU
>         - GICV3CPUState association with the QOM CPU Object is updated.
>      f. Unplugs the vCPU
>         - Removes it from the slot
>         - Parks the KVM vCPU ("kvm_parked_vcpus" list)
>         - Unrealizes the QOM CPU Object & joins back the Qemu vCPU thread
>         - Destroys the QOM CPU object
>      g. Guest checks ACPI _STA.Enabled=0
>         - Removes processor (unregisters CPU with LDM) [3]
> 
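[Editor's note: the pre-create/park/unpark lifecycle in steps 1-2, 7(b) and 8(f)
above can be sketched as a minimal Python analogue of the per-VM
"kvm_parked_vcpus" list. All class and function names here are invented for
illustration; this is not QEMU code.]

```python
# Illustrative sketch of the pre-create/park/unpark vCPU lifecycle (not QEMU code).

class KVMvCPU:
    def __init__(self, vcpu_id):
        self.vcpu_id = vcpu_id
        self.powered_on = False   # "start-powered-off" semantics

parked_vcpus = {}   # analogue of the per-VM "kvm_parked_vcpus" list
active_vcpus = {}   # vCPUs attached to realized QOM CPU objects

def vm_init(possible, cold_booted):
    # Steps 1-2: pre-create ALL possible vCPUs in the host; park the disabled ones.
    for vcpu_id in range(possible):
        vcpu = KVMvCPU(vcpu_id)
        if vcpu_id < cold_booted:
            active_vcpus[vcpu_id] = vcpu
        else:
            parked_vcpus[vcpu_id] = vcpu

def hotplug(vcpu_id):
    # Step 7(b): unpark the pre-created KVM vCPU instead of creating a new one.
    vcpu = parked_vcpus.pop(vcpu_id)
    active_vcpus[vcpu_id] = vcpu
    return vcpu

def hot_unplug(vcpu_id):
    # Step 8(f): park the KVM vCPU again rather than destroying it.
    vcpu = active_vcpus.pop(vcpu_id)
    vcpu.powered_on = False
    parked_vcpus[vcpu_id] = vcpu

vm_init(possible=6, cold_booted=4)
hotplug(4)
```

The key design point this models: KVM never destroys vCPUs, so hot-(un)plug moves
them between the parked and active sets rather than creating/destroying them.
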
> F. Work Presented at KVM Forum Conferences:
>   Details of above work has been presented at KVMForum2020 and KVMForum2023
>   conferences. Slides are available at below links,
>   a. KVMForum 2023
>      - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64)
>        https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
>   b. KVMForum 2020
>      - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei
>        https://sched.co/eE4m
> 
> (VI) Commands Used
>     =============
> 
>    A. Qemu launch commands to init the machine
> 
>    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>    -cpu host -smp cpus=4,maxcpus=6 \
>    -m 300M \
>    -kernel Image \
>    -initrd rootfs.cpio.gz \
>    -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>    -nographic \
>    -bios QEMU_EFI.fd
> 
>    B. Hot-(un)plug related commands
> 
>    # Hotplug a host vCPU (accel=kvm)
>    $ device_add host-arm-cpu,id=core4,core-id=4
> 
>    # Hotplug a vCPU (accel=tcg)
>    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> 
>    # Delete the vCPU
>    $ device_del core4
> 
>    Sample output on guest after boot:
> 
>    $ cat /sys/devices/system/cpu/possible
>    0-5
>    $ cat /sys/devices/system/cpu/present
>    0-5
>    $ cat /sys/devices/system/cpu/enabled
>    0-3
>    $ cat /sys/devices/system/cpu/online
>    0-1
>    $ cat /sys/devices/system/cpu/offline
>    2-5
> 
>    Sample output on guest after hotplug of vCPU=4:
> 
>    $ cat /sys/devices/system/cpu/possible
>    0-5
>    $ cat /sys/devices/system/cpu/present
>    0-5
>    $ cat /sys/devices/system/cpu/enabled
>    0-4
>    $ cat /sys/devices/system/cpu/online
>    0-1,4
>    $ cat /sys/devices/system/cpu/offline
>    2-3,5
> 
>    Note: vCPU=4 was explicitly 'onlined' after hot-plug
>    $ echo 1 > /sys/devices/system/cpu/cpu4/online
> 
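[Editor's note: the sysfs files in the sample output above use the kernel's CPU
range-list format (e.g. "0-1,4"). When scripting hotplug tests like these, a
small parser for that format is handy. The sketch below assumes only the
standard kernel list format and is not part of this patch-set.]

```python
def parse_cpu_list(s):
    """Parse a kernel CPU range list like '0-1,4' into a set of ints."""
    cpus = set()
    s = s.strip()
    if not s:
        return cpus
    for part in s.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

# Matches the post-hotplug sample output above:
assert parse_cpu_list("0-1,4") == {0, 1, 4}   # online
assert parse_cpu_list("2-3,5") == {2, 3, 5}   # offline
```
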
> (VII) Repository
>      ==========
> 
> (*) QEMU changes for vCPU hotplug could be cloned from below site,
>     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
> (*) Guest Kernel changes (by James Morse, ARM) are available here:
>     https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2
> 
> 
> (VIII) KNOWN ISSUES
>       ============
> 
> 1. Migration has been lightly tested. Below are some of the known issues:
>   - Occasional CPU stall (not always repeatable)
>   - Negative test cases like an asymmetric source/destination VM config cause a dump.
>   - Migration with TCG is not working properly.
> 2. TCG in single-threaded mode is broken.
> 3. HVF and qtest support is broken.
> 4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
>   mutually exclusive i.e. as per the change [6] a vCPU cannot be both
>   GICC.Enabled and GICC.online-capable. This means,
>      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>   a. If we have to support hot-unplug of the cold-booted vCPUs then these MUST
>      be specified as GICC.online-capable in the MADT Table during boot by the
>      firmware/Qemu. But this requirement conflicts with the requirement to
>      support the new Qemu changes with legacy OSes which don't understand the
>      MADT.GICC.online-capable bit. A legacy OS will ignore this bit during
>      boot and hence these vCPUs will not appear on such an OS. This is
>      unexpected behaviour.
>   b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
>      these cold-booted vCPUs from the OS (which should actually be blocked by
>      returning an error from Qemu) then features like 'kexec' will break.
>   c. As I understand, removal of the cold-booted vCPUs is a required feature
>      and the x86 world allows it.
>   d. Hence, either we need a specification change to make the MADT.GICC.Enabled
>      and MADT.GICC.online-capable bits NOT mutually exclusive, or NOT support
>      removal of cold-booted vCPUs. In the latter case, a check can be
>      introduced to bar users from unplugging vCPUs, which were cold-booted,
>      using QMP commands. (Needs discussion!)
>      Please check the below patch, part of this patch-set:
>          [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
> 5. Code related to the notification to GICV3 about hot(un)plug of a vCPU event
>   might need further discussion.
> 
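[Editor's note: the mutual exclusivity of the MADT.GICC flags described in
known issue 4 can be stated as a simple predicate. This is illustrative only;
the real flag encoding is defined by the ACPI specification [7] and the
function below is invented for this sketch.]

```python
def gicc_flags_valid(enabled, online_capable):
    """Per [6][7], MADT.GICC.Enabled and online-capable are mutually exclusive."""
    return not (enabled and online_capable)

# Cold-booted vCPU presented as Enabled:
assert gicc_flags_valid(enabled=True, online_capable=False)
# Hot(un)pluggable vCPU presented as online-capable:
assert gicc_flags_valid(enabled=False, online_capable=True)
# A vCPU cannot be both -- this is the conflict discussed above:
assert not gicc_flags_valid(enabled=True, online_capable=True)
```
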
> 
> (IX) THINGS TO DO
>     ============
> 
> 1. Fix the Migration Issues
> 2. Fix issues related to TCG/Emulation support.
> 3. Comprehensive Testing. Current testing is very basic.
>   a. Negative Test cases
> 4. Qemu Documentation (.rst) needs to be updated.
> 5. Fix qtest, HVF Support
> 6. Fix the design issue related to ACPI MADT.GICC flags discussed in known
>   issues. This might require UEFI ACPI specification change!
> 7. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now.
> 
> Above is *not* a complete list. Will update later!
> 
> Best regards
> Salil.
> 
> (X) DISCLAIMER
>    ==========
> 
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
> implementation to the community. This is *not* production-level code and might
> have bugs. Only basic testing has been done on a HiSilicon Kunpeng920 SoC for
> servers. Once the design and the core idea behind the implementation have been
> verified, more effort can be put into hardening the code.
> 
> This work is *mostly* in line with the discussions which have happened over the
> previous years [see refs below] across different channels like the mailing-list,
> the Linaro Open Discussions platform, and various conferences like KVMForum etc.
> This RFC is being used as a way to verify the idea mentioned in this cover-letter
> and to get community views. Once this has been agreed, a formal patch shall be
> posted to the mailing-list for review.
> 
> [The concept being presented has been found to work!]
> 
> (XI) ORGANIZATION OF PATCHES
>     =======================
> 
> A. All patches [Architecture 'agnostic' + 'specific']:
> 
>   [Patch 1-9, 23, 36] logic required during machine init
>    (*) Some validation checks
>    (*) Introduces core-id property and some util functions required later.
>    (*) Refactors Parking logic of vCPUs    
>    (*) Logic to pre-create vCPUs
>    (*) GIC initialization pre-sized with possible vCPUs.
>    (*) Some refactoring to have common hot and cold plug logic together.
>    (*) Release of disabled QOM CPU objects in post_cpu_init()
>    (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities
>   [Patch 10-22] logic related to ACPI at machine init time
>    (*) Changes required to Enable ACPI for cpu hotplug
>    (*) Initialization of the ACPI GED framework to cater for CPU Hotplug Events
>    (*) Build ACPI AML related to CPU control dev 
>    (*) ACPI MADT/MAT changes
>   [Patch 24-35] Logic required during vCPU hot-(un)plug
>    (*) Basic framework changes to support vCPU hot-(un)plug
>    (*) ACPI GED changes for hot-(un)plug hooks.
>    (*) wire-unwire the IRQs
>    (*) GIC notification logic
>    (*) ARMCPU unrealize logic
>    (*) Handling of SMCCC Hypercall Exits by KVM to Qemu
> 
> B. Architecture *agnostic* patches part of patch-set:
> 
>   [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug 
>    (*) Refactors Parking logic of vCPUs
>    (*) Introduces ACPI GED Support for vCPU Hotplug Events
>    (*) Introduces ACPI AML change for CPU Control Device     
> 
> (XII) REFERENCES
>      ==========
> 
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/ 
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> 
> (XIII) ACKNOWLEDGEMENTS
>       ================
> 
> I would like to take this opportunity to thank below people for various
> discussions with me over different channels during the development:
> 
> Marc Zyngier (Google)               Catalin Marinas (ARM),         
> James Morse(ARM),                   Will Deacon (Google), 
> Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat), 
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mamedov (Redhat),              Ilkka Koskinen (Ampere),
> Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei)        Russell King (Oracle)
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
> Zengtao/Prime (Huawei),             And all those whom I have missed! 
> 
> Many thanks to below people for their current or past contributions:
> 
> 1. James Morse (ARM)
>   (Current Kernel part of vCPU Hotplug Support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>   (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>   (Co-developed Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>   (Co-developed earlier kernel prototype)
> 5. Vishnu Pajjuri (Ampere)
>   (Verification on Ampere ARM64 Platforms + fixes)
> 6. Miguel Luis (Oracle)
>   (Verification on Oracle ARM64 Platforms + fixes)
> 
> 
> Author Salil Mehta (1):
>  target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> 
> Jean-Philippe Brucker (2):
>  hw/acpi: Make _MAT method optional
>  target/arm/kvm: Write CPU state back to KVM on reset
> 
> Miguel Luis (1):
>  tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> 
> Salil Mehta (33):
>  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
>  cpus-common: Add common CPU utility for possible vCPUs
>  hw/arm/virt: Move setting of common CPU properties in a function
>  arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
>  accel/kvm: Extract common KVM vCPU {creation,parking} code
>  arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>  arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
>  arm/virt: Init PMU at host for all possible vcpus
>  hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
>  arm/acpi: Enable ACPI support for vcpu hotplug
>  hw/acpi: Add ACPI CPU hotplug init stub
>  hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
>  hw/acpi: Init GED framework with cpu hotplug events
>  arm/virt: Add cpu hotplug events to GED during creation
>  arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>  hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
>  arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>  arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>  hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
>  hw/acpi: Update GED _EVT method AML with cpu scan
>  hw/arm: MADT Tbl change to size the guest with possible vCPUs
>  arm/virt: Release objects for *disabled* possible vCPUs after init
>  hw/acpi: Update ACPI GED framework to support vCPU Hotplug
>  arm/virt: Add/update basic hot-(un)plug framework
>  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>  hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>  hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
>  arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>  hw/arm: Changes required for reset and to support next boot
>  physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
>  target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>  hw/arm: Support hotplug capability check using _OSC method
>  hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> 
> accel/kvm/kvm-all.c                    |  61 +-
> accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
> cpus-common.c                          |  37 ++
> gdbstub/gdbstub.c                      |  13 +
> hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
> hw/acpi/cpu.c                          |  91 ++-
> hw/acpi/generic_event_device.c         |  33 +
> hw/arm/Kconfig                         |   1 +
> hw/arm/boot.c                          |   2 +-
> hw/arm/virt-acpi-build.c               | 110 +++-
> hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
> hw/core/gpio.c                         |   2 +-
> hw/i386/acpi-build.c                   |   2 +-
> hw/intc/arm_gicv3.c                    |   1 +
> hw/intc/arm_gicv3_common.c             |  66 +-
> hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
> hw/intc/arm_gicv3_cpuif_common.c       |   5 +
> hw/intc/arm_gicv3_kvm.c                |  39 +-
> hw/intc/gicv3_internal.h               |   2 +
> include/exec/cpu-common.h              |   8 +
> include/exec/gdbstub.h                 |   1 +
> include/hw/acpi/cpu.h                  |   7 +-
> include/hw/acpi/cpu_hotplug.h          |   4 +
> include/hw/acpi/generic_event_device.h |   5 +
> include/hw/arm/boot.h                  |   2 +
> include/hw/arm/virt.h                  |  10 +-
> include/hw/core/cpu.h                  |  77 +++
> include/hw/intc/arm_gicv3_common.h     |  23 +
> include/hw/qdev-core.h                 |   2 +
> include/sysemu/kvm.h                   |   2 +
> include/tcg/tcg.h                      |   1 +
> softmmu/physmem.c                      |  25 +
> target/arm/arm-powerctl.c              |  51 +-
> target/arm/cpu-qom.h                   |   3 +
> target/arm/cpu.c                       | 112 ++++
> target/arm/cpu.h                       |  17 +
> target/arm/cpu64.c                     |  15 +
> target/arm/gdbstub.c                   |   6 +
> target/arm/helper.c                    |  27 +-
> target/arm/internals.h                 |  12 +-
> target/arm/kvm.c                       |  93 ++-
> target/arm/kvm64.c                     |  59 +-
> target/arm/kvm_arm.h                   |  24 +
> target/arm/meson.build                 |   1 +
> target/arm/{tcg => }/psci.c            |   8 +
> target/arm/tcg/meson.build             |   4 -
> tcg/tcg.c                              |  23 +
> 47 files changed, 1873 insertions(+), 349 deletions(-)
> rename target/arm/{tcg => }/psci.c (97%)
> 

Tested the vCPU hotplug/unplug features along with VM live migration
on Oracle platforms with Ampere processors.

Please feel free to add,
Tested-by: Miguel Luis <miguel.luis@oracle.com>

Thanks,
Miguel

> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-10-12 17:02 ` Miguel Luis
@ 2023-10-12 17:54   ` Salil Mehta via
  2023-10-12 17:54     ` Salil Mehta
  2023-10-13 10:43     ` Miguel Luis
  0 siblings, 2 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-10-12 17:54 UTC (permalink / raw)
  To: Miguel Luis
  Cc: qemu-devel, qemu-arm, Marc Zyngier, jean-philippe,
	Jonathan Cameron, lpieralisi, Peter Maydell, Richard Henderson,
	imammedo, andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, mst, gshan, rafael, borntraeger,
	alex.bennee, linux, darren, ilkka, vishnu, Karl Heubaum,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Miguel,

> From: Miguel Luis <miguel.luis@oracle.com>
> Sent: Thursday, October 12, 2023 6:02 PM
> To: Salil Mehta <salil.mehta@huawei.com>
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; Marc Zyngier
> <maz@kernel.org>; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; Peter Maydell
> <peter.maydell@linaro.org>; Richard Henderson
> <richard.henderson@linaro.org>; imammedo@redhat.com;
> andrew.jones@linux.dev; david@redhat.com; philmd@linaro.org;
> eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
> alex.bennee@linaro.org; linux@armlinux.org.uk;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> vishnu@os.amperecomputing.com; Karl Heubaum <karl.heubaum@oracle.com>;
> salil.mehta@opnsrc.net; zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng
> (C) <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
> Arch
> 
> Hi Salil,
> 
> > On 26 Sep 2023, at 10:03, Salil Mehta <salil.mehta@huawei.com> wrote:
> >
> > [ *REPEAT: Sent patches got held at internal server yesterday* ]
> >
> > PROLOGUE
> > ========

[...]


> Tested the vCPU hotplug/unplug features along with VM live migration
> on Oracle platforms with Ampere processors.
> 
> Please feel free to add,
> Tested-by: Miguel Luis <miguel.luis@oracle.com>

This is a great help.

Many thanks for your persistent efforts over the past few months. They
have really helped in expediting fixes, weeding out many major bugs,
and also with the TCG part. Really appreciate it!

I look forward to collaborating on fixing the TCG part next.

Cheers
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread


* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-10-12 17:54   ` Salil Mehta via
  2023-10-12 17:54     ` Salil Mehta
@ 2023-10-13 10:43     ` Miguel Luis
  1 sibling, 0 replies; 153+ messages in thread
From: Miguel Luis @ 2023-10-13 10:43 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel, qemu-arm, Marc Zyngier, jean-philippe,
	Jonathan Cameron, lpieralisi, Peter Maydell, Richard Henderson,
	imammedo, andrew.jones, david, philmd, eric.auger, will, ardb,
	oliver.upton, pbonzini, mst, gshan, rafael, borntraeger,
	alex.bennee, linux, darren, ilkka, vishnu, Karl Heubaum,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Salil,

> On 12 Oct 2023, at 17:54, Salil Mehta <salil.mehta@huawei.com> wrote:
> 
> Hi Miguel,
> 
>> From: Miguel Luis <miguel.luis@oracle.com>
>> Sent: Thursday, October 12, 2023 6:02 PM
>> To: Salil Mehta <salil.mehta@huawei.com>
>> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; Marc Zyngier
>> <maz@kernel.org>; jean-philippe@linaro.org; Jonathan Cameron
>> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org; Peter Maydell
>> <peter.maydell@linaro.org>; Richard Henderson
>> <richard.henderson@linaro.org>; imammedo@redhat.com;
>> andrew.jones@linux.dev; david@redhat.com; philmd@linaro.org;
>> eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
>> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
>> gshan@redhat.com; rafael@kernel.org; borntraeger@linux.ibm.com;
>> alex.bennee@linaro.org; linux@armlinux.org.uk;
>> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
>> vishnu@os.amperecomputing.com; Karl Heubaum <karl.heubaum@oracle.com>;
>> salil.mehta@opnsrc.net; zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng
>> (C) <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
>> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
>> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
>> Arch
>> 
>> Hi Salil,
>> 
>>> On 26 Sep 2023, at 10:03, Salil Mehta <salil.mehta@huawei.com> wrote:
>>> 
>>> [ *REPEAT: Sent patches got held at internal server yesterday* ]
>>> 
>>> PROLOGUE
>>> ========
> 
> [...]
> 
> 
>> Tested the vCPU hotplug/unplug features along with VM live migration
>> on Oracle platforms with Ampere processors.
>> 
>> Please feel free to add,
>> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> 
> This is a great help.
> 
> Many thanks for your persistent efforts over the past few months. They
> have really helped in expediting fixes, weeding out many major bugs,
> and also with the TCG part. Really appreciate it!
> 

You are welcome!

Likewise, I really appreciate you driving this forward and being
open to suggestions. It makes it easy to collaborate while helping
the community come together.

> I look forward to collaborating on fixing the TCG part next.
> 

That is great! Looking forward to it.

Cheers
Miguel

> Cheers
> Salil.
> 
> 



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus @machine init
  2023-09-28  0:14   ` Gavin Shan
@ 2023-10-16 16:15     ` Salil Mehta via
  2023-10-16 16:15       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 16:15 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:14 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 07/37] arm/virt,gicv3: Changes to pre-size GIC
> with possible vcpus @machine init
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > GIC needs to be pre-sized with possible vcpus at the initialization time. This
> > is necessary because Memory regions and resources associated with GICC/GICR
> > etc cannot be changed (add/del/modified) after VM has inited. Also, GIC_TYPER
> > needs to be initialized with mp_affinity and cpu interface number association.
> > This cannot be changed after GIC has initialized.
> >
> > Once all the cpu interfaces of the GIC has been inited it needs to be ensured
>                                                    ^^^^^^
>                                                    initialized,

Sure. Thanks!


> > that any updates to the GICC during reset only takes place for the present
> 
> ^^^^^^^^^^^
>                                                                   the
> enabled


Yes. I will fix the sentence.

Thanks
Salil.


> > vcpus and not the disabled ones. Therefore, proper checks are required at
> > various places.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > [changed the comment in arm_gicv3_icc_reset]
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c              | 15 ++++++++-------
> >   hw/intc/arm_gicv3_common.c |  7 +++++--
> >   hw/intc/arm_gicv3_cpuif.c  |  8 ++++++++
> >   hw/intc/arm_gicv3_kvm.c    | 34 +++++++++++++++++++++++++++++++---
> >   include/hw/arm/virt.h      |  2 +-
> >   5 files changed, 53 insertions(+), 13 deletions(-)
> >
> 
> I guess the subject can be improved to something like below because it's the preparatory
> work to support vCPU hotplug (notifier) in the subsequent patches. In this patch, most
> of the code changes is related to vCPU state, ms->smp_pros.max_cpus and the CPU interface
> instances associated to GICv3 controller.

In general, any commit log should state 'why' rather than 'what' the patch is doing.

> 
> arm/virt,gicv3: Prepare for vCPU hotplug by checking GICv3CPUState states

The subject is a summary of what the patch intends to do. It pre-sizes the GIC in
QOM and in KVM as well, i.e. the VGIC. Please check page 6 of the KVMForum 2023
slides for more details.


> We already had wrappers to check vCPU's states. I'm wondering if we need another set
> of wrappers for GICv3, for several reasons: (a) In this patch, we're actually concerned
> with GICv3CPUState's states, disabled or enabled. vCPU states have been classified as
> possible, present, and enabled. Their states aren't matching strictly. (b) With GICv3's
> own wrappers, the code can be detached from vCPUs at a high level. Please evaluate
> whether it's worth having GICv3's own wrappers; we can have the following wrappers
> if wanted.

It would open Pandora's box if we tinker with the GIC, as we cannot play with VGIC
states in KVM due to ARM architecture constraints: these states are always part
of the always-on power domain. Even if we decided to model the QEMU GIC to relax
the latter aspect, we would still need to pre-size the VGIC in KVM and still expose
all of the GIC CPU interfaces to the guest OS at boot time. Hence, to be frank,
those GICv3CPUState-specific APIs do not add much value to the logic.

Plus, once we have these states formally described as part of QOM, we would need
to explicitly handle new cases, for example: what will these new states mean in
terms of Live/Pseudo Migration?


Thanks
Salil.


[...]

> > diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
> > index 2ebf880ead..ebd99af610 100644
> > --- a/hw/intc/arm_gicv3_common.c
> > +++ b/hw/intc/arm_gicv3_common.c
> > @@ -392,10 +392,13 @@ static void arm_gicv3_common_realize(DeviceState
> *dev, Error **errp)
> >       s->cpu = g_new0(GICv3CPUState, s->num_cpu);
> >
> >       for (i = 0; i < s->num_cpu; i++) {
> > -        CPUState *cpu = qemu_get_cpu(i);
> > +        CPUState *cpu = qemu_get_possible_cpu(i);
> >           uint64_t cpu_affid;
> >
> > -        s->cpu[i].cpu = cpu;
> > +        if (qemu_enabled_cpu(cpu)) {
> > +            s->cpu[i].cpu = cpu;
> > +        }
> > +
> >           s->cpu[i].gic = s;
> >           /* Store GICv3CPUState in CPUARMState gicv3state pointer */
> >           gicv3_set_gicv3state(cpu, &s->cpu[i]);
> 
> I don't think gicv3_set_gicv3state() is needed for !qemu_enabled_cpu(cpu)
> since those disabled vCPUs will be released in hw/arm/virt.c pretty soon.


For disabled CPUs, the GICv3CPUState's cpu pointer will be initialized to
NULL, and the GICv3CPUState itself will continue to exist even after its
corresponding QOM CPUState object has been released.


> > diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
> > index d07b13eb27..7b7a0fdb9c 100644
> > --- a/hw/intc/arm_gicv3_cpuif.c
> > +++ b/hw/intc/arm_gicv3_cpuif.c
> > @@ -934,6 +934,10 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
> >       ARMCPU *cpu = ARM_CPU(cs->cpu);
> >       CPUARMState *env = &cpu->env;
> >
> > +    if (!qemu_enabled_cpu(cs->cpu)) {
> > +        return;
> > +    }
> > +
> 
> The question is how this is possible. It seems a bug to update a GICv3CPUState
> that isn't ready or is disabled.

Ideally, it should not. This code is meant for TCG. For any updates that span
the entire GICv3, as in the gicv3_update() function which iterates over the
number of GICv3 CPUs, this check is required to ensure updates do not happen
if the corresponding CPUState object does not exist.

[...]

> >
> > +    /*
> > +     * This shall be called even when vcpu is being hotplugged or onlined and
> > +     * other vcpus might be running. Host kernel KVM code to handle device
> > +     * access of IOCTLs KVM_{GET|SET}_DEVICE_ATTR might fail due to inability to
> > +     * grab vcpu locks for all the vcpus. Hence, we need to pause all vcpus to
> > +     * facilitate locking within host.
> > +     */
> > +    pause_all_vcpus();
> >       /* Initialize to actual HW supported configuration */
> >       kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
> >                         KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
> >                         &c->icc_ctlr_el1[GICV3_NS], false, &error_abort);
> > +    resume_all_vcpus();
> 
> Please swap the positions of pause_all_vcpus() and the next comment, and
> then combine the comments.


No issues.


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  2023-09-28  0:19   ` Gavin Shan
@ 2023-10-16 16:20     ` Salil Mehta via
  2023-10-16 16:20       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 16:20 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:20 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region
> len macro to common header file
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > CPU ctrl-dev MMIO region length could be used in ACPI GED (common ACPI
> code
> > across architectures) and various other architecture specific places. To
> make
> > these code places independent of compilation order,
> ACPI_CPU_HOTPLUG_REG_LEN
> > macro should be moved to a header file.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/cpu.c                 | 2 +-
> >   include/hw/acpi/cpu_hotplug.h | 2 ++
> >   2 files changed, 3 insertions(+), 1 deletion(-)
> >
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>

Already dealt with. This is now part of the architecture-agnostic patch-set:

https://lore.kernel.org/qemu-devel/4764CF47-47CA-4685-805C-BBE6310BE164@oracle.com/T/#m2331682ed9aa1b764a42ed9fa1b85a849f8acd76

Thanks
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  2023-10-16 16:20     ` Salil Mehta via
@ 2023-10-16 16:20       ` Salil Mehta
  0 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta @ 2023-10-16 16:20 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:20 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region
> len macro to common header file
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > CPU ctrl-dev MMIO region length could be used in ACPI GED (common ACPI
> code
> > across architectures) and various other architecture specific places. To
> make
> > these code places independent of compilation order,
> ACPI_CPU_HOTPLUG_REG_LEN
> > macro should be moved to a header file.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/cpu.c                 | 2 +-
> >   include/hw/acpi/cpu_hotplug.h | 2 ++
> >   2 files changed, 3 insertions(+), 1 deletion(-)
> >
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>

Already dealt with. This is now part of the architecture-agnostic patch-set:

https://lore.kernel.org/qemu-devel/4764CF47-47CA-4685-805C-BBE6310BE164@oracle.com/T/#m2331682ed9aa1b764a42ed9fa1b85a849f8acd76

Thanks
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug
  2023-09-28  0:25   ` Gavin Shan
@ 2023-10-16 21:23     ` Salil Mehta via
  2023-10-16 21:23       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:23 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:26 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron 
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu
> hotplug
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ACPI is required to interface QEMU with the guest. Roughly falls into below
> > cases,
> >
> > 1. Convey the possible vcpus config at the machine init time to the guest
> >     using various DSDT tables like MADT etc.
> > 2. Convey vcpu hotplug events to guest(using GED)
> > 3. Assist in evaluation of various ACPI methods(like _EVT, _STA, _OST, _EJ0,
> >     _MAT etc.)
> > 4. Provides ACPI cpu hotplug state and 12 Byte memory mapped cpu hotplug
> >     control register interface to the OSPM/guest corresponding to each possible
> >     vcpu. The register interface consists of various R/W fields and their
> >     handling operations. These are called whenever register fields or memory
> >     regions are accessed(i.e. read or written) by OSPM when ever it evaluates
> >     various ACPI methods.
> >
> > Note: lot of this framework code is inherited from the changes already done for
> >        x86 but still some minor changes are required to make it compatible with
> >        ARM64.)
> >
> > This patch enables the ACPI support for virtual cpu hotplug. ACPI changes
> > required will follow in subsequent patches.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/Kconfig | 1 +
> >   1 file changed, 1 insertion(+)
> >
> 
> I assume this patch needs to be moved around to last one, until vCPU hotplug
> is supported in the code base.

In that case, the subsequent patches will not get compiled until this
switch is enabled. git-bisect won't work correctly if a compilation
failure in one of the earlier patches only shows up once this switch
finally gets enabled.

With the current order, every subsequent patch gets compiled. In fact,
in this case you can even bring up the system, although the latter was
never a requirement.

Also, this patch cannot come after the point where functions defined in
the subsequent patches start getting called, for example
build_cpus_aml() in [Patch RFC V2 17/37].
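For context, the one-line hw/arm/Kconfig change under discussion (the
diffstat above shows 1 insertion) presumably has this shape — a sketch
under that assumption only; the exact hunk is in the patch itself:

```kconfig
config ARM_VIRT
    ...
    select ACPI_CPU_HOTPLUG
```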

Thanks
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub
  2023-09-28  0:28   ` Gavin Shan
@ 2023-10-16 21:27     ` Salil Mehta via
  2023-10-16 21:27       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:27 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:28 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ACPI CPU hotplug related initialization should only happend if
> ACPI_CPU_HOTPLUG
> > support has been enabled for particular architecture. Add
> cpu_hotplug_hw_init()
> > stub to avoid compilation break.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/acpi-cpu-hotplug-stub.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> >
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>

Already dealt with. This is now part of the architecture-agnostic patch-set:

https://lore.kernel.org/qemu-devel/4764CF47-47CA-4685-805C-BBE6310BE164@oracle.com/T/#m8d14dedab8dbd9fdc66324ff013b9b9966569813


Thanks
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
  2023-09-28  0:40   ` Gavin Shan
@ 2023-10-16 21:41     ` Salil Mehta via
  2023-10-16 21:41       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:41 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:41 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in
> ACPI CPU hotplug init
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ACPI CPU Hotplug code assumes a virtual CPU is unplugged if the CPUState object
> > is absent in the list of ths possible CPUs(CPUArchIdList *possible_cpus)
> > maintained on per-machine basis. Use the earlier introduced qemu_present_cpu()
> > API to check this state.
> >
> > This change should have no bearing on the functionality of any architecture and
> > is mere a representational change.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/cpu.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> > index 45defdc0e2..d5ba37b209 100644
> > --- a/hw/acpi/cpu.c
> > +++ b/hw/acpi/cpu.c
> > @@ -225,7 +225,10 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
> >       state->dev_count = id_list->len;
> >       state->devs = g_new0(typeof(*state->devs), state->dev_count);
> >       for (i = 0; i < id_list->len; i++) {
> > -        state->devs[i].cpu =  CPU(id_list->cpus[i].cpu);
> > +        struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
> > +        if (qemu_present_cpu(cpu)) {
> > +            state->devs[i].cpu = cpu;
> > +        }
> >           state->devs[i].arch_id = id_list->cpus[i].arch_id;
> >       }
> >       memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
> 
> I don't think qemu_present_cpu() is needed because all possible vCPUs are present
> for x86 and arm64 at this point? Besides, we have the assumption all hotpluggable
> vCPUs are present, looking at James' kernel series where ACPI_HOTPLUG_PRESENT_CPU
> exists in linux/drivers/acpi/Kconfig :)

No, on x86 not all possible vCPUs need to be present at VM init. The
IOAPIC has no limitation like the GIC's, so hot-pluggable vCPUs can be
created in a deferred manner. Their QOM CPUState objects will therefore
not exist, and ACPI will expose them as NOT PRESENT to the guest OS.

This is not the case with ARM. Not all possible vCPUs are present at the
QOM level either, but QEMU *fakes* the presence of the unplugged vCPUs
to the guest OS by exposing them as PRESENT via the ACPI _STA method.
The latter is required due to an architectural limitation of ARM: the
GIC CPU interfaces for all possible vCPUs need to be present in the MADT
at guest boot time.

Hence, we require the above change.


Thanks
Salil.







^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events
  2023-09-28  0:56   ` Gavin Shan
@ 2023-10-16 21:44     ` Salil Mehta via
  2023-10-16 21:44       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:44 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 1:57 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu
> hotplug events
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ACPI GED(as described in the ACPI 6.2 spec) can be used to generate ACPI events
> > when OSPM/guest receives an interrupt listed in the _CRS object of GED. OSPM
> > then maps or demultiplexes the event by evaluating _EVT method.
> >
> > This change adds the support of cpu hotplug event initialization in the
> > existing GED framework.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/generic_event_device.c         | 8 ++++++++
> >   include/hw/acpi/generic_event_device.h | 5 +++++
> >   2 files changed, 13 insertions(+)
> >
> 
> It looks a bit strange you're co-developing the patch with yourself.
> It seems all patches follow this particular pattern. I could be changed
> to:
> 
> Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> 
> The code changes look good to me with the following nits addressed:
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>


All of these comments have been addressed in the architecture-agnostic patch-set:

https://lore.kernel.org/qemu-devel/4764CF47-47CA-4685-805C-BBE6310BE164@oracle.com/T/#m904eb56658a07c0302d6c6fbba6c9b9ddc35cc1b

Thanks
Salil.





> 
> > diff --git a/hw/acpi/generic_event_device.c
> b/hw/acpi/generic_event_device.c
> > index a3d31631fe..d2fa1d0e4a 100644
> > --- a/hw/acpi/generic_event_device.c
> > +++ b/hw/acpi/generic_event_device.c
> > @@ -25,6 +25,7 @@ static const uint32_t ged_supported_events[] = {
> >       ACPI_GED_MEM_HOTPLUG_EVT,
> >       ACPI_GED_PWR_DOWN_EVT,
> >       ACPI_GED_NVDIMM_HOTPLUG_EVT,
> > +    ACPI_GED_CPU_HOTPLUG_EVT,
> >   };
> >
> 
> Can we move ACPI_GED_CPU_HOTPLUG_EVT ahead of ACPI_GED_MEM_HOTPLUG_EVT?
> 
> >   /*
> > @@ -400,6 +401,13 @@ static void acpi_ged_initfn(Object *obj)
> >       memory_region_init_io(&ged_st->regs, obj, &ged_regs_ops, ged_st,
> >                             TYPE_ACPI_GED "-regs", ACPI_GED_REG_COUNT);
> >       sysbus_init_mmio(sbd, &ged_st->regs);
> > +
> > +    s->cpuhp.device = OBJECT(s);
> > +    memory_region_init(&s->container_cpuhp, OBJECT(dev), "cpuhp
> container",
> > +                       ACPI_CPU_HOTPLUG_REG_LEN);
> > +    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container_cpuhp);
> > +    cpu_hotplug_hw_init(&s->container_cpuhp, OBJECT(dev),
> > +                        &s->cpuhp_state, 0);
> >   }
> >
> >   static void acpi_ged_class_init(ObjectClass *class, void *data)
> > diff --git a/include/hw/acpi/generic_event_device.h
> b/include/hw/acpi/generic_event_device.h
> > index d831bbd889..d0a5a43abf 100644
> > --- a/include/hw/acpi/generic_event_device.h
> > +++ b/include/hw/acpi/generic_event_device.h
> > @@ -60,6 +60,7 @@
> >   #define HW_ACPI_GENERIC_EVENT_DEVICE_H
> >
> >   #include "hw/sysbus.h"
> > +#include "hw/acpi/cpu_hotplug.h"
> >   #include "hw/acpi/memory_hotplug.h"
> >   #include "hw/acpi/ghes.h"
> >   #include "qom/object.h"
> > @@ -97,6 +98,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
> >   #define ACPI_GED_MEM_HOTPLUG_EVT   0x1
> >   #define ACPI_GED_PWR_DOWN_EVT      0x2
> >   #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
> > +#define ACPI_GED_CPU_HOTPLUG_EVT    0x8
> >
> 
> #define ACPI_GED_CPU_HOTPLUG_EVT  0x1
> #define ACPI_GED_MEM_HOTPLUG_EVT  0x2
>    :
> 
> If the adjustment is friendly to live migration.
> 
> >   typedef struct GEDState {
> >       MemoryRegion evt;
> > @@ -108,6 +110,9 @@ struct AcpiGedState {
> >       SysBusDevice parent_obj;
> >       MemHotplugState memhp_state;
> >       MemoryRegion container_memhp;
> > +    CPUHotplugState cpuhp_state;
> > +    MemoryRegion container_cpuhp;
> > +    AcpiCpuHotplug cpuhp;
> >       GEDState ged_state;
> >       uint32_t ged_event_bitmap;
> >       qemu_irq irq;
> 
> Thanks,
> Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation
  2023-09-28  1:03   ` Gavin Shan
@ 2023-10-16 21:46     ` Salil Mehta via
  2023-10-16 21:46       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:46 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 2:04 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED
> during creation
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Add CPU Hotplug event to the set of supported ged-events during the creation of
> > GED device during VM init. Also initialize the memory map for CPU Hotplug
>               ^^^^^^^^^^^^^^
>               it can be dropped.

Ok :)

> > control device used in event exchanges between Qemu/VMM and the guest.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c         | 5 ++++-
> >   include/hw/arm/virt.h | 1 +
> >   2 files changed, 5 insertions(+), 1 deletion(-)
> >
> 
> The changes look good to me:
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>


Great. Thanks

cheers
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  2023-09-28  1:08   ` Gavin Shan
@ 2023-10-16 21:54     ` Salil Mehta via
  2023-10-16 21:54       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:54 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 2:08 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 15/37] arm/virt: Create GED dev before
> *disabled* CPU Objs are destroyed
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ACPI CPU hotplug state (is_present=_STA.PRESENT, is_enabled=_STA.ENABLED) for
> > all the possible vCPUs MUST be initialized during machine init. This is done
> > during the creation of the GED device. VMM/Qemu MUST expose/fake the ACPI state
> > of the disabled vCPUs to the Guest kernel as 'present' (_STA.PRESENT) always
> > i.e. ACPI persistent. If the 'disabled' vCPU objects are destroyed before the
> > GED device has been created then their ACPI hotplug state might not get
> > initialized correctly as acpi_persistent flag is part of the CPUState. This will
> > expose wrong status of the unplugged vCPUs to the Guest kernel.
> >
> > Hence, moving the GED device creation before disabled vCPU objects get destroyed
> > as part of the post CPU init routine.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c | 10 +++++++---
> >   1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 5c8a0672dc..cbb6199ec6 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2376,6 +2376,12 @@ static void machvirt_init(MachineState *machine)
> >
> >       create_gic(vms, sysmem);
> >
> > +    has_ged = has_ged && aarch64 && firmware_loaded &&
> > +              virt_is_acpi_enabled(vms);
> > +    if (has_ged) {
> > +        vms->acpi_dev = create_acpi_ged(vms);
> > +    }
> > +
> 
> I prefer the old style. Squeezing all conditions to @has_ged changes what's
> to be meant by @has_ged itself.

The check is too long and a similar piece of code has to be used again
further down. Hence, to avoid repeating it, it's better to store the
result in a variable.

I can see your point though. To mitigate that, I can use a new variable
for this?


Thanks
Salil.


> 
>         if (has_ged && aarch64 && firmware_loaded &&
> virt_is_acpi_enabled(vms)) {
>             :
>         }
> 
> >       virt_cpu_post_init(vms, sysmem);
> >
> >       fdt_add_pmu_nodes(vms);
> > @@ -2398,9 +2404,7 @@ static void machvirt_init(MachineState *machine)
> >
> >       create_pcie(vms);
> >
> > -    if (has_ged && aarch64 && firmware_loaded &&
> virt_is_acpi_enabled(vms)) {
> > -        vms->acpi_dev = create_acpi_ged(vms);
> > -    } else {
> > +    if (!has_ged) {
> >           create_gpio_devices(vms, VIRT_GPIO, sysmem);
> >       }
> >
> 
> Thanks,
> Gavin


^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  2023-09-28  1:26   ` Gavin Shan
@ 2023-10-16 21:57     ` Salil Mehta via
  2023-10-16 21:57       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 21:57 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 2:26 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-
> (ctrl)dev change
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > CPUs Control device(\\_SB.PCI0) register interface for the x86 arch is
> based on
> > PCI and is IO port based and hence existing cpus AML code assumes _CRS objects
> 
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                            . The existing AML code assumes _CRS object
> > would evaluate to a system resource which describes IO Port address. But
> on ARM
>    ^^^^^^^^^^^^^^^^^^^
>    is evaluated to a
> 
> > arch CPUs control device(\\_SB.PRES) register interface is memory-mapped
> hence
> > _CRS object should evaluate to system resource which describes memory-mapped
>                ^^^^^^
>                should be evaluated
> > base address.
> >
> > This cpus AML code change updates the existing interface of the build cpus AML
> > function to accept both IO/MEMORY type regions and update the _CRS object
> > correspondingly.
> >
> > NOTE: Beside above CPU scan shall be triggered when OSPM evaluates _EVT method
> >        part of the GED framework which is covered in subsequent patch.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/cpu.c         | 23 ++++++++++++++++-------
> >   hw/i386/acpi-build.c  |  2 +-
> >   include/hw/acpi/cpu.h |  5 +++--
> >   3 files changed, 20 insertions(+), 10 deletions(-)
> >
> 
> I guess the commit log can be simplified to:
> 
> The CPU hotplug register block is declared as a IO region on x86, or a memory
> region on arm64 in build_cpus_aml(), as part of the generic container device
> (\\_SB.PCI0 or \\_SB.PRES).
> 
> Adapt build_cpus_aml() so that IO region and memory region can be handled
> in the mean while.
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>

This has already been reviewed as part of the architecture-agnostic patch-set:


https://lore.kernel.org/qemu-devel/4764CF47-47CA-4685-805C-BBE6310BE164@oracle.com/T/#md615c6d3464e7178214785501e7035bf977886f2


Thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  2023-09-28  1:36   ` Gavin Shan
@ 2023-10-16 22:05     ` Salil Mehta via
  2023-10-16 22:05       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 22:05 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Thursday, September 28, 2023 2:36 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU
> Hotplug support
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Support of vCPU Hotplug requires sequence of ACPI handshakes between Qemu
> and
> > Guest kernel when a vCPU is plugged or unplugged. Most of the AML code to
> > support these handshakes already exists. This AML need to be build during VM
> > init for ARM architecture as well if the GED support exists.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt-acpi-build.c | 13 ++++++++++++-
> >   1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index 6b674231c2..d27df5030e 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -858,7 +858,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
> >        * the RTC ACPI device at all when using UEFI.
> >        */
> >       scope = aml_scope("\\_SB");
> > -    acpi_dsdt_add_cpus(scope, vms);
> > +    /* if GED is enabled then cpus AML shall be added as part build_cpus_aml */
> > +    if (vms->acpi_dev) {
> > +        CPUHotplugFeatures opts = {
> > +             .acpi_1_compatible = false,
> > +             .has_legacy_cphp = false
> > +        };
> > +
> > +        build_cpus_aml(scope, ms, opts, memmap[VIRT_CPUHP_ACPI].base,
> > +                       "\\_SB", NULL, AML_SYSTEM_MEMORY);
> > +    } else {
> > +        acpi_dsdt_add_cpus(scope, vms);
> > +    }
> >       acpi_dsdt_add_uart(scope, &memmap[VIRT_UART],
> >                          (irqmap[VIRT_UART] + ARM_SPI_BASE));
> >       if (vmc->acpi_expose_flash) {
> 
> I don't think it's enough to check vms->acpi_dev. vCPU hotplug needn't to be
> supported even vms->acpi_dev exists. For example when vGICv2 instead of
> vGICv3 is enabled, and so on.

Good catch.

'mc->has_hotpluggable_cpus' has to be added. Will fix.

Thanks
Salil.







^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  2023-09-28 23:18   ` Gavin Shan
@ 2023-10-16 22:33     ` Salil Mehta via
  2023-10-16 22:33       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 22:33 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 12:18 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status
> ACPI *persistent*
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ARM arch does not allow CPUs presence to be changed [1] after kernel has booted.
> > Hence, firmware/ACPI/Qemu must ensure persistent view of the vCPUs to the Guest
> > kernel even when they are not present in the QoM i.e. are unplugged or are
> > yet-to-be-plugged
> >
> > References:
> > [1] Check comment 5 in the bugzilla entry
> >     Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   cpus-common.c         |  6 ++++++
> >   hw/arm/virt.c         |  7 +++++++
> >   include/hw/core/cpu.h | 20 ++++++++++++++++++++
> >   3 files changed, 33 insertions(+)
> >
> 
> hmm, it's another CPU state. There are 4 CPU states, plus other 3 CPU states:
> possible, present, enabled. Now we're having always-present state.

A possible vCPU is not a QOM CPUState object, nor is it represented to
the guest OS through the ACPI _STA method. A device, in this case a
CPU, can be in the ENABLED state, or it can merely be PRESENT but not
ENABLED.

The guest OS detects all possible vCPUs simply through the presence of
their GICC entries (Enabled/online-capable) in the ACPI MADT table.

A plugged vCPU is 'present' in QOM and is ACPI _STA.PRESENT as well. An
unplugged vCPU is 'not present' in QOM, i.e. its CPUState object is
NULL. This is akin to x86 or any other architecture, and we are not
changing any of this at the QOM level.

What changes is the representation of an unplugged vCPU to the guest OS
via the ACPI _STA method. For ARM, we *fake* the presence of a
QOM-unplugged vCPU to the guest OS through _STA. To detect this case we
check the bool 'acpi_persistent' in CPUState, which has to be set
explicitly by architectures like ARM that do not allow CPU presence to
change after boot. Hence, such CPUs are in an 'always present' state
ACPI-wise but cannot be used, as they are ACPI-disabled, i.e.
_STA.ENABLED=0.

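The state mapping described above can be sketched as a tiny stand-alone model (not QEMU code; it assumes the standard ACPI _STA bit positions, bit 0 = PRESENT and bit 1 = ENABLED):

```c
#include <stdbool.h>
#include <stdint.h>

#define STA_PRESENT (1u << 0)
#define STA_ENABLED (1u << 1)

/* The _STA value an ARM guest would observe for a vCPU: a plugged vCPU
 * is present+enabled; an unplugged but "persistent" vCPU reads back as
 * present-but-disabled instead of absent. */
static uint32_t vcpu_sta(bool qom_present, bool acpi_persistent)
{
    if (qom_present) {
        return STA_PRESENT | STA_ENABLED;     /* plugged vCPU */
    }
    return acpi_persistent ? STA_PRESENT : 0; /* faked presence on ARM */
}
```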

> I think
> those CPU states can be squeezed into the previous present state. What we
> need is to ensure all possible vCPUs are present from the beginning.

All of this is unnecessary, really.


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
  2023-09-28 23:33   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} " Gavin Shan
@ 2023-10-16 22:59     ` Salil Mehta via
  2023-10-16 22:59       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 22:59 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 12:34 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the
> correct _STA.{PRES,ENA} Bits to Guest
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > ACPI AML changes to properly reflect the _STA.PRES and _STA.ENA Bits to
> the
> > guest during initialzation, when CPUs are hotplugged and after CPUs are
> > hot-unplugged.
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/cpu.c                  | 49 +++++++++++++++++++++++++++++++---
> >   hw/acpi/generic_event_device.c | 11 ++++++++
> >   include/hw/acpi/cpu.h          |  2 ++
> >   3 files changed, 58 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> > index 232720992d..e1299696d3 100644
> > --- a/hw/acpi/cpu.c
> > +++ b/hw/acpi/cpu.c
> > @@ -63,10 +63,11 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr
> addr, unsigned size)
> >       cdev = &cpu_st->devs[cpu_st->selector];
> >       switch (addr) {
> >       case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
> > -        val |= cdev->cpu ? 1 : 0;
> > +        val |= cdev->is_enabled ? 1 : 0;
> >           val |= cdev->is_inserting ? 2 : 0;
> >           val |= cdev->is_removing  ? 4 : 0;
> >           val |= cdev->fw_remove  ? 16 : 0;
> > +        val |= cdev->is_present ? 32 : 0;
> >           trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
> >           break;
> 
> The vCPU states are synchronized to what we had. It means we're maintaining two set
> vCPU states, one for board level and another set for vCPU hotplug here. They look
> duplicate to each other. However, it will need too much code changes to combine
> them.

We need to distinguish between the ACPI CPU hotplug state and the QOM
CPU state. ACPI exposes the QOM CPU state to the guest OS. Until now,
the two were consistent with each other: ACPI inferred the presence of
a QOM CPUState object as _STA.PRESENT=1 and _STA.ENABLED=1.

With the ARM CPU hot(un)plug changes, this assumption no longer holds.
The QOM CPUState object might still be present or might be NULL, but
the ACPI CPU hotplug state will always expose an unplugged CPU, i.e.
one with a NULL CPUState object, as _STA.PRESENT=1 and _STA.ENABLED=0.
This is the key change that lets us enable the hot(un)plug mechanism
on ARM.

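The new flags-register layout from the quoted hunk can be modelled in isolation (bit values taken directly from the diff: bit 0 now reports "enabled" rather than mere QOM presence, and the new bit 5 reports "present"):

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the byte read back at ACPI_CPU_FLAGS_OFFSET_RW after this
 * patch; mirrors the bit packing in the quoted cpu_hotplug_rd() hunk. */
static uint64_t pack_cpu_flags(bool is_enabled, bool is_inserting,
                               bool is_removing, bool fw_remove,
                               bool is_present)
{
    uint64_t val = 0;
    val |= is_enabled   ? 1  : 0;  /* was: cdev->cpu ? 1 : 0 */
    val |= is_inserting ? 2  : 0;
    val |= is_removing  ? 4  : 0;
    val |= fw_remove    ? 16 : 0;
    val |= is_present   ? 32 : 0;  /* new bit for "present" */
    return val;
}
```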

> 
> >       case ACPI_CPU_CMD_DATA_OFFSET_RW:
> > @@ -228,7 +229,21 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object
> *owner,
> >           struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
> >           if (qemu_present_cpu(cpu)) {
> >               state->devs[i].cpu = cpu;
> > +            state->devs[i].is_present = true;
> > +        } else {
> > +            if (qemu_persistent_cpu(cpu)) {
> > +                state->devs[i].is_present = true;
> > +            } else {
> > +                state->devs[i].is_present = false;
> > +            }
> >           }
> 
> state->devs[i].is_present = qemu_persistent_cpu(cpu);


Sure. Thanks for pointing that out. :)


> > +
> > +        if (qemu_enabled_cpu(cpu)) {
> > +            state->devs[i].is_enabled = true;
> > +        } else {
> > +            state->devs[i].is_enabled = false;
> > +        }
> > +
> 
> state->dev[i].is_enabled = qemu_enabled_cpu(cpu);


Sure. Yes.
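The simplification agreed above can be sketched with stand-in predicates (the exact semantics of qemu_present_cpu()/qemu_persistent_cpu() live in the patch-set and are assumed here, not reproduced): each nested if/else in the quoted hunk collapses to a single boolean expression per field.

```c
#include <stdbool.h>

/* was: if (present) { is_present = true; }
 *      else if (persistent) { is_present = true; }
 *      else { is_present = false; }
 * A vCPU is ACPI-present if it exists in QOM or is architecturally
 * persistent (the ARM "always present" case). */
static bool acpi_is_present(bool qom_present, bool persistent)
{
    return qom_present || persistent;
}
```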


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan
  2023-09-28 23:35   ` Gavin Shan
@ 2023-10-16 23:01     ` Salil Mehta via
  2023-10-16 23:01       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 23:01 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 12:36 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with
> cpu scan
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > OSPM evaluates _EVT method to map the event. The cpu hotplug event
> eventually
> > results in start of the cpu scan. Scan figures out the cpu and the kind
> of
> > event(plug/unplug) and notifies it back to the guest.
> >
> > The change in this patch updates the GED AML _EVT method with the call to
> > \\_SB.CPUS.CSCN which will do above.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/generic_event_device.c | 4 ++++
> >   include/hw/acpi/cpu_hotplug.h  | 2 ++
> >   2 files changed, 6 insertions(+)
> >
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>

Thanks. This was already reviewed as part of the architecture-agnostic patch-set:

https://lore.kernel.org/qemu-devel/4764CF47-47CA-4685-805C-BBE6310BE164@oracle.com/T/#m40b72077ad8c797597588600bff5e5abca44150c

cheers
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs
  2023-09-28 23:43   ` Gavin Shan
@ 2023-10-16 23:15     ` Salil Mehta via
  2023-10-16 23:15       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 23:15 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 12:44 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest
> with possible vCPUs
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Changes required during building of MADT Table by QEMU to accommodate disabled
> > possible vCPUs. This info shall be used by the guest kernel to size up its
> > resources during boot time. This pre-sizing of the guest kernel done on
> > possible vCPUs will facilitate hotplug of the disabled vCPUs.
> >
> > This change also caters ACPI MADT GIC CPU Interface flag related changes
> > recently introduced in the UEFI ACPI 6.5 Specification which allows deferred
> > virtual CPU online'ing in the Guest Kernel.
> >
> > Link:
> https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt-acpi-build.c | 36 ++++++++++++++++++++++++++++++------
> >   1 file changed, 30 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index d27df5030e..cbccd2ca2d 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -700,6 +700,29 @@ static void build_append_gicr(GArray *table_data, uint64_t base, uint32_t size)
> >       build_append_int_noprefix(table_data, size, 4); /* Discovery Range Length */
> >   }
> >
> > +static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
> > +{
> > +    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> > +
> > +    /* can only exist in 'enabled' state */
> > +    if (!mc->has_hotpluggable_cpus) {
> > +        return 1;
> > +    }
> > +
> > +    /*
> > +     * ARM GIC CPU Interface can be 'online-capable' or 'enabled' at boot
> > +     * We MUST set 'online-capable' Bit for all hotpluggable CPUs except the
>                                         ^^^
>                                         bit

:)

> > +     * first/boot CPU. Cold-booted CPUs without 'Id' can also be unplugged.
> > +     * Though as-of-now this is only used as a debugging feature.
> > +     *
> > +     *   UEFI ACPI Specification 6.5
> > +     *   Section: 5.2.12.14. GIC CPU Interface (GICC) Structure
> > +     *   Table:   5.37 GICC CPU Interface Flags
> > +     *   Link: https://uefi.org/specs/ACPI/6.5
> > +     */
> > +    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
> > +}
> > +
> 
> I don't understand how a cold-booted CPU can be hot removed if it doesn't
> have a ID? Besides, how cpu->cpu_index is zero for the first cold-booted
> CPU?

Some cold-booted CPUs can be 'pluggable' and hence can have an 'ID'
specified as part of the QEMU command line. This 'ID' can later be
used to hot(un)plug them (if supported).

You can also start QEMU with the '-S' option, which pauses QEMU so
that you can cold-plug CPUs during VM initialization.

Good point about the boot CPU. But it is a default assumption to have
the boot CPU as CPU 0 on ARM - I think? I will need to cross-check
this part.

Thanks for pointing this out, though.
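The flag selection under discussion can be modelled stand-alone (bit values from ACPI 6.5, Table 5.37; the boot-CPU-is-index-0 assumption baked in below is exactly the open question above):

```c
#include <stdbool.h>
#include <stdint.h>

#define GICC_ENABLED        (1u << 0)
#define GICC_ONLINE_CAPABLE (1u << 3)  /* ACPI 6.5, Table 5.37 */

/* Condensed model of virt_acpi_get_gicc_flags(): without hotpluggable
 * CPU support every GICC is plain "enabled"; otherwise only the boot
 * CPU (assumed index 0) stays "enabled" and every other vCPU is marked
 * "online-capable". */
static uint32_t gicc_flags(bool has_hotpluggable_cpus, bool cpu_exists,
                           int cpu_index)
{
    if (!has_hotpluggable_cpus) {
        return GICC_ENABLED;
    }
    return (cpu_exists && cpu_index == 0) ? GICC_ENABLED
                                          : GICC_ONLINE_CAPABLE;
}
```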

> 
> >   static void
> >   build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState
> *vms)
> >   {
> > @@ -726,12 +749,13 @@ build_madt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
> >       build_append_int_noprefix(table_data, vms->gic_version, 1);
> >       build_append_int_noprefix(table_data, 0, 3);   /* Reserved */
> >
> > -    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
> > -        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
> > +    for (i = 0; i < MACHINE(vms)->smp.max_cpus; i++) {
> > +        CPUState *cpu = qemu_get_possible_cpu(i);
> >           uint64_t physical_base_address = 0, gich = 0, gicv = 0;
> >           uint32_t vgic_interrupt = vms->virt ? PPI(ARCH_GIC_MAINT_IRQ) :
> 0;
> > -        uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
> > -                                             PPI(VIRTUAL_PMU_IRQ) : 0;
> > +        uint32_t pmu_interrupt = vms->pmu ? PPI(VIRTUAL_PMU_IRQ) : 0;
> > +        uint32_t flags = virt_acpi_get_gicc_flags(cpu);
> > +        uint64_t mpidr = qemu_get_cpu_archid(i);
> >
> 
> qemu_get_cpu_archid() can be dropped since it's called for once. MPIDR
> can be fetched from ms->possible_cpus->cpus[i].arch_id, which has been
> initialized pre-hand.

I want to expose this API to other parts of QEMU and to other
architectures as well. It is an accessor API and should be encouraged
everywhere, as it reduces unnecessary boilerplate code.

So I would like to keep it, but I can move it to board.h if you wish?
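A toy model of the accessor argument (types and names below are illustrative only, not the real MachineState layout): callers ask a one-line helper for the arch id instead of open-coding ms->possible_cpus->cpus[i].arch_id at every site.

```c
#include <stdint.h>

typedef struct {
    uint64_t arch_id;  /* MPIDR on ARM */
} PossibleCpu;

/* The accessor hides the possible-CPU list layout from call sites, so
 * a later change to the layout touches one function, not every caller. */
static uint64_t possible_cpu_archid(const PossibleCpu *cpus, int index)
{
    return cpus[index].arch_id;
}
```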


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

> > +     *   Table:   5.37 GICC CPU Interface Flags
> > +     *   Link: https://uefi.org/specs/ACPI/6.5
> > +     */
> > +    return cpu && !cpu->cpu_index ? 1 : (1 << 3);
> > +}
> > +
> 
> I don't understand how a cold-booted CPU can be hot removed if it doesn't
> have a ID? Besides, how cpu->cpu_index is zero for the first cold-booted
> CPU?

Some cold-booted CPUs can be 'pluggable'. Hence, they can have an 'ID'
specified as part of the QEMU command line. This 'ID' can later be used
to hot-(un)plug them (if supported).

You can also start QEMU with the '-S' option, which pauses QEMU at startup
so that you can cold-plug the CPUs during VM initialization.

Good point about the boot CPU.
But it is a default assumption to have the boot CPU as CPU 0 on ARM - I think?

I will need to cross-check this part.

Thanks for pointing this out, though.

> 
> >   static void
> >   build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState
> *vms)
> >   {
> > @@ -726,12 +749,13 @@ build_madt(GArray *table_data, BIOSLinker *linker,
> VirtMachineState *vms)
> >       build_append_int_noprefix(table_data, vms->gic_version, 1);
> >       build_append_int_noprefix(table_data, 0, 3);   /* Reserved */
> >
> > -    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
> > -        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
> > +    for (i = 0; i < MACHINE(vms)->smp.max_cpus; i++) {
> > +        CPUState *cpu = qemu_get_possible_cpu(i);
> >           uint64_t physical_base_address = 0, gich = 0, gicv = 0;
> >           uint32_t vgic_interrupt = vms->virt ? PPI(ARCH_GIC_MAINT_IRQ) :
> 0;
> > -        uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
> > -                                             PPI(VIRTUAL_PMU_IRQ) : 0;
> > +        uint32_t pmu_interrupt = vms->pmu ? PPI(VIRTUAL_PMU_IRQ) : 0;
> > +        uint32_t flags = virt_acpi_get_gicc_flags(cpu);
> > +        uint64_t mpidr = qemu_get_cpu_archid(i);
> >
> 
> qemu_get_cpu_archid() can be dropped since it's called for once. MPIDR
> can be fetched from ms->possible_cpus->cpus[i].arch_id, which has been
> initialized pre-hand.

I want to expose this API to other parts of QEMU and to other architectures
as well. It is an accessor API, and its use should be encouraged; it reduces
unnecessary boilerplate code.

So I would like to keep it, but I can move it to board.h if you wish?


Thanks
Salil.






^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional
  2023-09-28 23:50   ` Gavin Shan
@ 2023-10-16 23:17     ` Salil Mehta via
  2023-10-16 23:17       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 23:17 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 12:50 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >
> > The GICC interface on arm64 vCPUs is statically defined in the MADT, and
> > doesn't require a _MAT entry. Although the GICC is indicated as present
> > by the MADT entry, it can only be used from vCPU sysregs, which aren't
> > accessible until hot-add.
> >
> > Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Co-developed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> > Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/acpi/cpu.c | 12 +++++++-----
> >   1 file changed, 7 insertions(+), 5 deletions(-)
> >
> 
> With following nits addressed:
> 
> Reviewed-by: Gavin Shan <gshan@redhat.com>

Thanks.


> 
> > diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> > index e1299696d3..217db99538 100644
> > --- a/hw/acpi/cpu.c
> > +++ b/hw/acpi/cpu.c
> > @@ -715,11 +715,13 @@ void build_cpus_aml(Aml *table, MachineState
> *machine, CPUHotplugFeatures opts,
> >               aml_append(dev, method);
> >
> >               /* build _MAT object */
> > -            assert(adevc && adevc->madt_cpu);
> > -            adevc->madt_cpu(i, arch_ids, madt_buf,
> > -                            true); /* set enabled flag */
> > -            aml_append(dev, aml_name_decl("_MAT",
> > -                aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
> > +            if (adevc && adevc->madt_cpu) {
> > +                assert(adevc && adevc->madt_cpu);
> > +                adevc->madt_cpu(i, arch_ids, madt_buf,
> > +                                true); /* set enabled flag */
> > +                aml_append(dev, aml_name_decl("_MAT",
> > +                    aml_buffer(madt_buf->len, (uint8_t *)madt_buf->data)));
> > +            }
> >               g_array_free(madt_buf, true);
> >
> >               if (CPU(arch_ids->cpus[i].cpu) != first_cpu) {
> 
> May be worthy to have comment to mention _MAT isn't needed on aarch64.
> 
>                 /* Build _MAT object, which isn't needed by aarch64 */

This file is not architecture-specific, so it is not a good idea to
mention aarch64 here.


Thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init
  2023-09-28 23:57   ` Gavin Shan
@ 2023-10-16 23:28     ` Salil Mehta via
  2023-10-16 23:28       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 23:28 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 12:58 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled*
> possible vCPUs after init
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > During machvirt_init(), QOM ARMCPU objects are also pre-created along
> with the
> > corresponding KVM vCPUs in the host for all possible vCPUs. This necessary
> > because of the architectural constraint, KVM restricts the deferred creation of
> > the KVM vCPUs and VGIC initialization/sizing after VM init. Hence, VGIC is
> > pre-sized with possible vCPUs.
> >
> > After initialization of the machine is complete disabled possible KVM vCPUs are
> > then parked at the per-virt-machine list "kvm_parked_vcpus" and we release the
> > QOM ARMCPU objects for the disabled vCPUs. These shall be re-created at the time
> > when vCPU is hotplugged again. QOM ARMCPU object is then re-attached with
> > corresponding parked KVM vCPU.
> >
> > Alternatively, we could've never released the QOM CPU objects and kept on
> > reusing. This approach might require some modifications of qdevice_add()
> > interface to get old ARMCPU object instead of creating a new one for the hotplug
> > request.
> >
> > Each of the above approaches come with their own pros and cons. This prototype
> > uses the 1st approach.(suggestions are welcome!)
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
> >   1 file changed, 32 insertions(+)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index f1bee569d5..3b068534a8 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1965,6 +1965,7 @@ static void virt_cpu_post_init(VirtMachineState
> *vms, MemoryRegion *sysmem)
> >   {
> >       CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
> >       int max_cpus = MACHINE(vms)->smp.max_cpus;
> > +    MachineState *ms = MACHINE(vms);
> >       bool aarch64, steal_time;
> >       CPUState *cpu;
> >       int n;
> > @@ -2025,6 +2026,37 @@ static void virt_cpu_post_init(VirtMachineState
> *vms, MemoryRegion *sysmem)
> >               }
> >           }
> >       }
> > +
> > +    if (kvm_enabled() || tcg_enabled()) {
> > +        for (n = 0; n < possible_cpus->len; n++) {
> > +            cpu = qemu_get_possible_cpu(n);
> > +
> > +            /*
> > +             * Now, GIC has been sized with possible CPUs and we don’t require
> > +             * disabled vCPU objects to be represented in the QOM. Release the
> > +             * disabled ARMCPU objects earlier used during init for pre-sizing.
> > +             *
> > +             * We fake to the guest through ACPI about the presence(_STA.PRES=1)
> > +             * of these non-existent vCPUs at VMM/qemu and present these as
> > +             * disabled vCPUs(_STA.ENA=0) so that they cant be used. These vCPUs
> > +             * can be later added to the guest through hotplug exchanges when
> > +             * ARMCPU objects are created back again using 'device_add' QMP
> > +             * command.
> > +             */
> > +            /*
> > +             * RFC: Question: Other approach could've been to keep them forever
> > +             * and release it only once when qemu exits as part of finalize or
> > +             * when new vCPU is hotplugged. In the later old could be released
> > +             * for the newly created object for the same vCPU?
> > +             */
> > +            if (!qemu_enabled_cpu(cpu)) {
> > +                CPUArchId *cpu_slot;
> > +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
> > +                cpu_slot->cpu = NULL;
> > +                object_unref(OBJECT(cpu));
> > +            }
> > +        }
> > +    }
> >   }
> >
> 
> Needn't we release those CPU instances for hve and qtest? Besides, I think it's
> hard for reuse those objects because they're managed by QOM, which is almost
> transparent to us, correct?

For now, this code path won't be hit for TCG, HVF, or qtest. These are not
supported in this release. I might enable TCG support in the coming months;
some fixing and testing is still required.

I had created two working prototypes earlier, in 2020. One did not release
the CPU objects, so the ACPI CPU hotplug state was always in sync with the
QOM CPUState. But it required some changes in the qdev creation path to
reuse the existing CPU object - which Igor was not in favor of. It also had
some issues with live migration (which were left unresolved at the time).
Hence, I used the current approach as the primary one.

Thanks
Salil.







^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework
  2023-09-29  0:20   ` Gavin Shan
@ 2023-10-16 23:40     ` Salil Mehta via
  2023-10-16 23:40       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 23:40 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 1:21 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug
> framework
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > Add CPU hot-unplug hooks and update hotplug hooks with additional sanity checks
> > for use in hotplug paths.
> >
> > Note, Functional contents of the hooks(now left with TODO comment) shall be
> > gradually filled in the subsequent patches in an incremental approach to patch
> > and logic building which would be roughly as follows:
> > 1. (Un-)wiring of interrupts between vCPU<->GIC
> > 2. Sending events to Guest for hot-(un)plug so that guest can take appropriate
> >     actions.
> > 3. Notifying GIC about hot-(un)plug action so that vCPU could be (un-)stitched
> >     to the GIC CPU interface.
> > 4. Updating the Guest with Next boot info for this vCPU in the firmware.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>

[...]

> > @@ -2985,12 +2986,23 @@ static void virt_cpu_pre_plug(HotplugHandler
> *hotplug_dev, DeviceState *dev,
> >   {
> >       VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> >       MachineState *ms = MACHINE(hotplug_dev);
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> >       ARMCPU *cpu = ARM_CPU(dev);
> >       CPUState *cs = CPU(dev);
> >       CPUArchId *cpu_slot;
> >       int32_t min_cpuid = 0;
> >       int32_t max_cpuid;
> >
> > +    if (dev->hotplugged && !vms->acpi_dev) {
> > +        error_setg(errp, "GED acpi device does not exists");
> > +        return;
> > +    }
> > +
> > +    if (dev->hotplugged && !mc->has_hotpluggable_cpus) {
> > +        error_setg(errp, "CPU hotplug not supported on this machine");
> > +        return;
> > +    }
> > +
> 
> I guess these can be combined to:
> 
>         if (dev->hotplugged && (!mc->has_hotpluggable_cpus || !vms->acpi_dev)) {
>             error_setg(errp, "CPU hotplug not supported or GED ACPI device not exist");
>         }


The above checks exist separately because I wanted a different error string for each.


> 
> Besides, need we check (vms->gic_version == VIRT_GIC_VERSION_3)?

The 'mc->has_hotpluggable_cpus' flag takes care of all of that.


> 
> >       /* sanity check the cpu */
> >       if (!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
> >           error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> > @@ -3039,6 +3051,22 @@ static void virt_cpu_pre_plug(HotplugHandler
> *hotplug_dev, DeviceState *dev,
> >       }
> >       virt_cpu_set_properties(OBJECT(cs), cpu_slot, errp);
> >
> > +    /*
> > +     * Fix the GIC for this new vCPU being plugged. The QOM CPU object for the
> > +     * new vCPU need to be updated in the corresponding QOM GICv3CPUState object
> > +     * We also need to re-wire the IRQs for this new CPU object. This update
> > +     * is limited to the QOM only and does not affects the KVM. Later has
> > +     * already been pre-sized with possible CPU at VM init time. This is a
> > +     * workaround to the constraints posed by ARM architecture w.r.t supporting
> > +     * CPU Hotplug. Specification does not exist for the later.
> > +     * This patch-up is required both for {cold,hot}-plugged vCPUs. Cold-inited
> > +     * vCPUs have their GIC state initialized during machvit_init().
> > +     */
> > +    if (vms->acpi_dev) {
> > +        /* TODO: update GIC about this hotplug change here */
> > +        /* TODO: wire the GIC<->CPU irqs */
> > +    }
> > +
> 
> When looking at these 'TODO', it seems you need order the patches to make those
> preparatory patches ahead of this one. In this way, the 'TODO' can be avoided.

Maybe, but it would break the step-wise flow of the patch-set.


[...]

> > @@ -3058,10 +3087,81 @@ static void virt_cpu_plug(HotplugHandler
> *hotplug_dev, DeviceState *dev,
> >       cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> >       cpu_slot->cpu = OBJECT(dev);
> >
> > +    /*
> > +     * Update the ACPI Hotplug state both for vCPUs being {hot,cold}-plugged.
> > +     * vCPUs can be cold-plugged using '-device' option. For vCPUs being hot
> > +     * plugged, guest is also notified.
> > +     */
> > +    if (vms->acpi_dev) {
> > +        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
> > +        /* TODO: register cpu for reset & update F/W info for the next boot */
> > +    }
> > +
> 
> We needn't validate vms->acpi_dev again since it has been done in pre_plug().

We want this leg to be conditional for cold-inited vCPUs.


[...]

> > +static void virt_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +                            Error **errp)
> > +{
> > +    VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
> > +    MachineState *ms = MACHINE(hotplug_dev);
> > +    CPUState *cs = CPU(dev);
> > +    CPUArchId *cpu_slot;
> > +
> > +    if (!vms->acpi_dev || !dev->realized) {
> > +        error_setg(errp, "GED does not exist or device is not realized!");
> > +        return;
> > +    }
> > +
> > +    cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> > +
> > +    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
> > +
> > +    /* TODO: unwire the gic-cpu irqs here */
> > +    /* TODO: update the GIC about this hot unplug change */
> > +
> > +    /* TODO: unregister cpu for reset & update F/W info for the next boot */
> > +
> 
> Same as above.

I understand your point, but that would spoil the flow of the patch-set.

Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  2023-09-29  0:30   ` Gavin Shan
@ 2023-10-16 23:48     ` Salil Mehta via
  2023-10-16 23:48       ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-16 23:48 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 1:31 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about
> CPU hot-(un)plug events
> 
> Hi Salil,
> 
> On 9/26/23 20:04, Salil Mehta wrote:
> > During any vCPU hot-(un)plug, running guest VM needs to be intimated about the
> > new vCPU being added or request the deletion of the vCPU which is already part
> > of the guest VM. This is done using the ACPI GED event which eventually gets
> > demultiplexed to a CPU hotplug event and further to specific hot-(un)plug event
> > of a particular vCPU.
> >
> > This change adds the ACPI calls to the existing hot-(un)plug hooks to trigger
> > ACPI GED events from QEMU to guest VM.
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---

[...]

> > @@ -3169,12 +3170,20 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >        * plugged, guest is also notified.
> >        */
> >       if (vms->acpi_dev) {
> > -        /* TODO: update acpi hotplug state. Send cpu hotplug event to guest */
> > +        HotplugHandlerClass *hhc;
> > +        /* update acpi hotplug state and send cpu hotplug event to guest */
> > +        hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
> > +        hhc->plug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
> > +        if (local_err) {
> > +            goto fail;
> > +        }
> >           /* TODO: register cpu for reset & update F/W info for the next boot */
> >       }
> >
> >       cs->disabled = false;
> >       return;
> > +fail:
> > +    error_propagate(errp, local_err);
> >   }
> >
> 
> 'fail' tag isn't needed since it's used for once. we can bail early:
> 
>      if (local_err) {
>         error_propagate(errp, local_err);
>         return;
>      }


Agreed. Indeed, we can remove the goto.
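For illustration, the early-return shape Gavin suggests can be sketched as below. The `_mock` helpers and `virt_cpu_plug_sketch()` are placeholders standing in for QEMU's Error API and the real hook, not actual QEMU code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal stand-ins for QEMU's Error API -- just enough to show the
 * control flow; these mocks are not the real error_setg()/error_propagate(). */
typedef struct Error { char msg[64]; } Error;

static void error_setg_mock(Error **errp, const char *msg)
{
    *errp = (Error *)malloc(sizeof(Error));
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

static void error_propagate_mock(Error **dst, Error *src)
{
    if (src) {
        *dst = src;
    }
}

/* Stand-in for hhc->plug(); fails on demand. */
static void plug_mock(int should_fail, Error **errp)
{
    if (should_fail) {
        error_setg_mock(errp, "acpi plug failed");
    }
}

/* Early-return shape of virt_cpu_plug()'s error path: no 'fail:' label
 * and no trailing 'return' before it. 'disabled' mimics cs->disabled. */
static void virt_cpu_plug_sketch(int plug_fails, int *disabled, Error **errp)
{
    Error *local_err = NULL;

    plug_mock(plug_fails, &local_err);
    if (local_err) {
        error_propagate_mock(errp, local_err);
        return; /* bail early instead of 'goto fail' */
    }

    *disabled = 0; /* cs->disabled = false; */
}
```

The same transformation applies to virt_cpu_unplug_request() and virt_cpu_unplug(): propagate and return at the point of failure, so no label and no unreachable trailing `return` remain.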


[...]

> > @@ -3202,9 +3213,16 @@ static void virt_cpu_unplug_request(HotplugHandler *hotplug_dev,
> >           return;
> >       }
> >
> > -    /* TODO: request cpu hotplug from guest */
> > +    /* request cpu hotplug from guest */
> > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
> > +    hhc->unplug_request(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
> > +    if (local_err) {
> > +        goto fail;
> > +    }
> >
> >       return;
> > +fail:
> > +    error_propagate(errp, local_err);
> >   }
> >
> 
> Same as above, 'fail' tag isn't needed. Besides, 'return' isn't needed.


Agreed. We can remove the goto.



[...]

> > @@ -3222,7 +3242,12 @@ static void virt_cpu_unplug(HotplugHandler
> *hotplug_dev, DeviceState *dev,
> >
> >       cpu_slot = virt_find_cpu_slot(ms, cs->cpu_index);
> >
> > -    /* TODO: update the acpi cpu hotplug state for cpu hot-unplug */
> > +    /* update the acpi cpu hotplug state for cpu hot-unplug */
> > +    hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
> > +    hhc->unplug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &local_err);
> > +    if (local_err) {
> > +        goto fail;
> > +    }
> >
> >       unwire_gic_cpu_irqs(vms, cs);
> >       virt_update_gic(vms, cs);
> > @@ -3236,6 +3261,8 @@ static void virt_cpu_unplug(HotplugHandler
> *hotplug_dev, DeviceState *dev,
> >       cs->disabled = true;
> >
> >       return;
> > +fail:
> > +    error_propagate(errp, local_err);
> >   }
> >
> 
> Same as above.


Agreed. Will fix.


Thanks
Salil.

^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 34/37] target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
  2023-09-29  4:15     ` [PATCH RFC V2 34/37] target/arm/kvm,tcg: " Gavin Shan
@ 2023-10-17  0:03       ` Salil Mehta via
  2023-10-17  0:03         ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-17  0:03 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 5:15 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 34/37] target/arm/kvm,tcg: Register/Handle SMCCC
> hypercall exits to VMM/Qemu
> 
> Hi Salil,
> 
> On 9/26/23 20:36, Salil Mehta wrote:
> > From: Author Salil Mehta <salil.mehta@huawei.com>
> >
> > Add registration and Handling of HVC/SMC hypercall exits to VMM
> >
> > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>

[...]

> > +static CPUArchId *arm_get_archid_by_id(uint64_t id)
> > +{
> > +    int n;
> > +    CPUArchId *arch_id;
> > +    MachineState *ms = MACHINE(qdev_get_machine());
> > +
> > +    /*
> > +     * At this point disabled CPUs don't have a CPUState, but their CPUArchId
> > +     * exists.
> > +     *
> > +     * TODO: Is arch_id == mp_affinity? This needs work.
> > +     */
> > +    for (n = 0; n < ms->possible_cpus->len; n++) {
> > +        arch_id = &ms->possible_cpus->cpus[n];
> > +
> > +        if (arch_id->arch_id == id) {
> > +            return arch_id;
> > +        }
> > +    }
> > +    return NULL;
> > +}
> > +
> 
> The @arch_id should be same thing to @mp_affinity except for the boot CPU.
> For the boot CPU, its value is fetched from MPIDR, which is determined by
> cs->cpu_index, passed to host via ioctl(CREATE_VCPU). Besides, another
> similiar function qemu_get_cpu_archid() exists in cpus-common.c. I think
> they can be combined. Again, all these information inherited from
> ms->possible_cpus may be better to be managed in board level, like the
> vCPU states.

Yes, good catch. This code has existed for so long that my eyes got biased.


Thanks
Salil.


[...]

> > @@ -168,12 +189,24 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry,
> uint64_t context_id,
> >       }
> >
> >       /* Retrieve the cpu we are powering up */
> > -    target_cpu_state = arm_get_cpu_by_id(cpuid);
> > -    if (!target_cpu_state) {
> > +    arch_id = arm_get_archid_by_id(cpuid);
> > +    if (!arch_id) {
> >           /* The cpu was not found */
> >           return QEMU_ARM_POWERCTL_INVALID_PARAM;
> >       }
> >
> > +    target_cpu_state = CPU(arch_id->cpu);
> > +    if (!qemu_enabled_cpu(target_cpu_state)) {
> > +        /*
> > +         * The cpu is not plugged in or disabled. We should return appropriate
> > +         * value as introduced in DEN0022E PSCI 1.2 issue E
>                                                         ^^^^^^^
>                                                         issue E, which is QEMU_PSCI_RET_DENIED.

   PSCI_RET_DENIED


[...]

> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -11187,7 +11187,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
> >                         env->exception.syndrome);
> >       }
> >
> > -    if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) {
> > +    if (arm_is_psci_call(cpu, cs->exception_index)) {
> >           arm_handle_psci_call(cpu);
> >           qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n");
> >           return;
> 
> We may still limit the capability to handle PSCI calls to TCG and KVM,
> meaning HVF and QTest won't have this capability.


We do not support them now. I need to conditionally register the SMCCC
calls with KVM. Will check this. Good point, though.

Thanks
Salil.


[...]

> > diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> > index 8e7c68af6a..6f3fd5aecd 100644
> > --- a/target/arm/kvm.c
> > +++ b/target/arm/kvm.c
> > @@ -250,6 +250,7 @@ int kvm_arm_get_max_vm_ipa_size(MachineState *ms,
> bool *fixed_ipa)
> >   int kvm_arch_init(MachineState *ms, KVMState *s)
> >   {
> >       int ret = 0;
> > +
>    ^^^^
> Unnecessary change.

Yes.

Thanks.


[...]

> > @@ -280,6 +281,22 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >           }
> >       }
> >
> > +    /*
> > +     * To be able to handle PSCI CPU ON calls in QEMU, we need to install SMCCC
>                                          ^^
>                                          ON/OFF

Yes.

> > +     * filter in the Host KVM. This is required to support features like
> > +     * virtual CPU Hotplug on ARM platforms.
> > +     */
> > +    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN64_CPU_ON,
> > +                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
> > +        error_report("CPU On PSCI-to-user-space fwd filter install failed");
> > +        abort();
> > +    }
> > +    if (kvm_arm_set_smccc_filter(PSCI_0_2_FN_CPU_OFF,
> > +                                 KVM_SMCCC_FILTER_FWD_TO_USER)) {
> > +        error_report("CPU Off PSCI-to-user-space fwd filter install failed");
> > +        abort();
> > +    }
> > +
> >       kvm_arm_init_debug(s);
> >
> >       return ret;
> 
> The PSCI_ON and PSCI_OFF will be unconditionally handled by QEMU if KVM is
> enabled, even vCPU hotplug isn't supported on hw/arm/virt board. Do we need to
> enable it only when vCPU hotplug is supported?

Yes. True. I missed this earlier. It should be conditional.
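For illustration, the agreed fix can be sketched as below: install the PSCI CPU_ON/CPU_OFF user-space forwarding filters only when the machine supports vCPU hotplug. The filter call is mocked so the sketch is self-contained; `has_hotpluggable_cpus` stands in for mc->has_hotpluggable_cpus:

```c
#include <assert.h>

/* PSCI function IDs as used in the quoted patch; the numeric values are
 * the standard SMCCC encodings. */
#define PSCI_0_2_FN64_CPU_ON 0xC4000003u
#define PSCI_0_2_FN_CPU_OFF  0x84000002u

/* Mock of kvm_arm_set_smccc_filter(): just count installs and succeed. */
static int filters_installed;

static int set_smccc_filter_mock(unsigned int func_id)
{
    (void)func_id;
    filters_installed++;
    return 0;
}

/* Sketch: forward PSCI CPU_ON/CPU_OFF to user space only when the
 * machine actually supports hotpluggable CPUs. */
static int kvm_arch_init_sketch(int has_hotpluggable_cpus)
{
    if (!has_hotpluggable_cpus) {
        /* Leave PSCI CPU_ON/CPU_OFF handling entirely to the host KVM. */
        return 0;
    }
    if (set_smccc_filter_mock(PSCI_0_2_FN64_CPU_ON)) {
        return -1;
    }
    if (set_smccc_filter_mock(PSCI_0_2_FN_CPU_OFF)) {
        return -1;
    }
    return 0;
}
```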


Thanks
Salil.


> 
> > @@ -952,6 +969,38 @@ static int kvm_arm_handle_dabt_nisv(CPUState *cs,
> uint64_t esr_iss,
> >       return -1;
> >   }
> >
> > +static int kvm_arm_handle_hypercall(CPUState *cs, struct kvm_run *run)
> > +{
> > +    ARMCPU *cpu = ARM_CPU(cs);
> > +    CPUARMState *env = &cpu->env;
> > +
> > +    kvm_cpu_synchronize_state(cs);
> > +
> > +    /*
> > +     * hard coding immediate to 0 as we dont expect non-zero value as of now
>                                             ^^^^
>                                             don't

Thanks
Salil.


^ permalink raw reply	[flat|nested] 153+ messages in thread


* RE: [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method
  2023-09-29  4:23     ` Gavin Shan
@ 2023-10-17  0:13       ` Salil Mehta via
  2023-10-17  0:13         ` Salil Mehta
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-10-17  0:13 UTC (permalink / raw)
  To: Gavin Shan, qemu-devel, qemu-arm
  Cc: maz, jean-philippe, Jonathan Cameron, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, ardb, oliver.upton, pbonzini, mst, rafael,
	borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, September 29, 2023 5:23 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> peter.maydell@linaro.org; richard.henderson@linaro.org;
> imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> linux@armlinux.org.uk; darren@os.amperecomputing.com;
> ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check
> using _OSC method
> 
> Hi Salil,
> 
> On 9/26/23 20:36, Salil Mehta wrote:
> > Physical CPU hotplug results in (un)setting of ACPI _STA.Present bit. AARCH64
> > platforms do not support physical CPU hotplug. Virtual CPU Hotplug support being
> > implemented toggles ACPI _STA.Enabled Bit to achieve Hotplug functionality. This
> > is not same as physical CPU hotplug support.
> >
> > In future, if ARM architecture supports physical CPU hotplug then the current
> > design of virtual CPU hotplug can be used unchanged. Hence, there is a need for
> > firmware/VMM/Qemu to support evaluation of the platform-wide capability related to
> > the *type* of CPU hotplug support present on the platform. OSPM might need this
> > during boot time to correctly initialize the CPUs and other related components
> > in the kernel.
> >
> > NOTE: This implementation will be improved to add support for *query* in the
> > subsequent versions. This is very minimal support to assist the kernel.
> >
> > ASL for the implemented _OSC method:
> >
> > Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
> > {
> >      CreateDWordField (Arg3, Zero, CDW1)
> >      If ((Arg0 == ToUUID ("0811b06e-4a27-44f9-8d60-3cbbc22e7b48") /* Platform-wide Capabilities */))
> >      {
> >          CreateDWordField (Arg3, 0x04, CDW2)
> >          Local0 = CDW2 /* \_SB_._OSC.CDW2 */
> >          If ((Arg1 != One))
> >          {
> >              CDW1 |= 0x08
> >          }
> >
> >          Local0 &= 0x00800000
> >          If ((CDW2 != Local0))
> >          {
> >              CDW1 |= 0x10
> >          }
> >
> >          CDW2 = Local0
> >      }
> >      Else
> >      {
> >          CDW1 |= 0x04
> >      }
> >
> >      Return (Arg3)
> > }
> >
> > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > ---
> >   hw/arm/virt-acpi-build.c | 52 ++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 52 insertions(+)
> >
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index cbccd2ca2d..377450dd16 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -861,6 +861,55 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker *linker,
> >       build_fadt(table_data, linker, &fadt, vms->oem_id, vms-
> >oem_table_id);
> >   }
> >
> > +static void build_virt_osc_method(Aml *scope, VirtMachineState *vms)
> > +{
> > +    Aml *if_uuid, *else_uuid, *if_rev, *if_caps_masked, *method;
> > +    Aml *a_cdw1 = aml_name("CDW1");
> > +    Aml *a_cdw2 = aml_local(0);
> > +
> > +    method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
> > +    aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
> > +
> > +    /* match UUID */
> > +    if_uuid = aml_if(aml_equal(
> > +        aml_arg(0), aml_touuid("0811B06E-4A27-44F9-8D60-3CBBC22E7B48")));
> > +
> > +    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
> > +    aml_append(if_uuid, aml_store(aml_name("CDW2"), a_cdw2));
> > +
> > +    /* check unknown revision in arg(1) */
> > +    if_rev = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1))));
> > +    /* set revision error bits,  DWORD1 Bit[3] */
> > +    aml_append(if_rev, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
> > +    aml_append(if_uuid, if_rev);
> > +
> > +    /*
> > +     * check support for vCPU hotplug type(=enabled) platform-wide capability
> > +     * in DWORD2 as sepcified in the below ACPI Specification ECR,
> > +     *  # https://bugzilla.tianocore.org/show_bug.cgi?id=4481
> > +     */
> > +    if (vms->acpi_dev) {
> > +        aml_append(if_uuid, aml_and(a_cdw2, aml_int(0x800000), a_cdw2));
> > +        /* check if OSPM specified hotplug capability bits were masked */
> > +        if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW2"), a_cdw2)));
> > +        aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
> > +        aml_append(if_uuid, if_caps_masked);
> > +    }
> > +    aml_append(if_uuid, aml_store(a_cdw2, aml_name("CDW2")));
> > +
> > +    aml_append(method, if_uuid);
> > +    else_uuid = aml_else();
> > +
> > +    /* set unrecognized UUID error bits, DWORD1 Bit[2] */
> > +    aml_append(else_uuid, aml_or(a_cdw1, aml_int(4), a_cdw1));
> > +    aml_append(method, else_uuid);
> > +
> > +    aml_append(method, aml_return(aml_arg(3)));
> > +    aml_append(scope, method);
> > +
> > +    return;
> > +}
> > +
> 
> The check on vms->acpi_dev seems not enough. We may still need to check
> mc->has_hotpluggable_cpus and vms->gic_version etc. Besides, the "return"
> at end of the function isn't needed.

Agreed. We just need to check 'mc->has_hotpluggable_cpus'. It will cover
everything.

A legacy copy-and-paste mistake everywhere. Thanks for pointing it out.

Cheers
Salil.



* Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  2023-10-02 16:12     ` Salil Mehta via
  2023-10-02 16:12       ` Salil Mehta
@ 2024-01-16 15:59       ` Jonathan Cameron via
  1 sibling, 0 replies; 153+ messages in thread
From: Jonathan Cameron via @ 2024-01-16 15:59 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Gavin Shan, qemu-devel, qemu-arm, maz, jean-philippe, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	rafael, borntraeger, alex.bennee, linux, darren, ilkka, vishnu,
	karl.heubaum, miguel.luis, salil.mehta, zhukeqian,
	wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

On Mon, 2 Oct 2023 17:12:43 +0100
Salil Mehta <salil.mehta@huawei.com> wrote:

> Hi Gavin,
> 
> > From: Gavin Shan <gshan@redhat.com>
> > Sent: Wednesday, September 27, 2023 7:29 AM
> > To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org; qemu-
> > arm@nongnu.org
> > Cc: maz@kernel.org; jean-philippe@linaro.org; Jonathan Cameron
> > <jonathan.cameron@huawei.com>; lpieralisi@kernel.org;
> > peter.maydell@linaro.org; richard.henderson@linaro.org;
> > imammedo@redhat.com; andrew.jones@linux.dev; david@redhat.com;
> > philmd@linaro.org; eric.auger@redhat.com; will@kernel.org; ardb@kernel.org;
> > oliver.upton@linux.dev; pbonzini@redhat.com; mst@redhat.com;
> > rafael@kernel.org; borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> > linux@armlinux.org.uk; darren@os.amperecomputing.com;
> > ilkka@os.amperecomputing.com; vishnu@os.amperecomputing.com;
> > karl.heubaum@oracle.com; miguel.luis@oracle.com; salil.mehta@opnsrc.net;
> > zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> > <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> > jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> > Subject: Re: [PATCH RFC V2 04/37] arm/virt,target/arm: Machine init time
> > change common to vCPU {cold|hot}-plug
> > 
> > Hi Salil,
> > 
> > On 9/26/23 20:04, Salil Mehta wrote:  
> > > Refactor and introduce the common logic required during the  
> > initialization of  
> > > both cold and hot plugged vCPUs. Also initialize the *disabled* state of the
> > > vCPUs which shall be used further during init phases of various other components
> > > like GIC, PMU, ACPI etc as part of the virt machine initialization.
> > >
> > > KVM vCPUs corresponding to unplugged/yet-to-be-plugged QOM CPUs are kept in
> > > powered-off state in the KVM Host and do not run the guest code. Plugged vCPUs
> > > are also kept in powered-off state but vCPU threads exist and are kept sleeping.
> > >
> > > TBD:
> > > For the cold booted vCPUs, this change also exists in the arm_load_kernel()
> > > in boot.c but for the hotplugged CPUs this change should still remain part of
> > > the pre-plug phase. We are duplicating the powering-off of the cold booted CPUs.
> > > Shall we remove the duplicate change from boot.c?
> > >
> > > Co-developed-by: Salil Mehta <salil.mehta@huawei.com>
> > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > > Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > Reported-by: Gavin Shan <gavin.shan@redhat.com>
> > > [GS: pointed the assertion due to wrong range check]
> > > Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
> > > ---
> > >   hw/arm/virt.c      | 149 ++++++++++++++++++++++++++++++++++++++++-----
> > >   target/arm/cpu.c   |   7 +++
> > >   target/arm/cpu64.c |  14 +++++
> > >   3 files changed, 156 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > index 0eb6bf5a18..3668ad27ec 100644
> > > --- a/hw/arm/virt.c
> > > +++ b/hw/arm/virt.c
> > > @@ -221,6 +221,7 @@ static const char *valid_cpus[] = {
> > >       ARM_CPU_TYPE_NAME("max"),
> > >   };
> > >
> > > +static CPUArchId *virt_find_cpu_slot(MachineState *ms, int vcpuid);
> > >   static int virt_get_socket_id(const MachineState *ms, int cpu_index);
> > >   static int virt_get_cluster_id(const MachineState *ms, int cpu_index);
> > >   static int virt_get_core_id(const MachineState *ms, int cpu_index);
> > > @@ -2154,6 +2155,14 @@ static void machvirt_init(MachineState *machine)
> > >           exit(1);
> > >       }
> > >
> > > +    finalize_gic_version(vms);
> > > +    if (tcg_enabled() || hvf_enabled() || qtest_enabled() ||
> > > +        (vms->gic_version < VIRT_GIC_VERSION_3)) {
> > > +        machine->smp.max_cpus = smp_cpus;
> > > +        mc->has_hotpluggable_cpus = false;
> > > +        warn_report("cpu hotplug feature has been disabled");
> > > +    }
> > > +  
> > 
> > Comments needed here to explain why @mc->has_hotpluggable_cpus is set to false.
> > I guess it's something related to TODO list, mentioned in the cover letter.  
> 
> 
I can put a comment explaining the checks as to why the feature has been disabled.
BTW, isn't the code self-explanatory here?

Would be good to gate that warn_report() on whether any attempt to enable
CPU hotplug has been made (if max_cpus > smp_cpus, for example).
Right now it's noise on a lot of pre-existing configurations.




* Re: [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest
  2023-09-26 10:04 ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
  2023-09-28 23:33   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} " Gavin Shan
@ 2024-01-17 21:46   ` Jonathan Cameron via
  1 sibling, 0 replies; 153+ messages in thread
From: Jonathan Cameron via @ 2024-01-17 21:46 UTC (permalink / raw)
  To: Salil Mehta via
  Cc: Salil Mehta, qemu-arm, maz, jean-philippe, lpieralisi,
	peter.maydell, richard.henderson, imammedo, andrew.jones, david,
	philmd, eric.auger, will, ardb, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, salil.mehta, zhukeqian1,
	wangxiongfeng2, wangyanan55, jiakernel2, maobibo, lixianglai

On Tue, 26 Sep 2023 11:04:18 +0100
Salil Mehta via <qemu-devel@nongnu.org> wrote:

> ACPI AML changes to properly reflect the _STA.PRES and _STA.ENA Bits to the
> guest during initialization, when CPUs are hotplugged and after CPUs are
> hot-unplugged.
> 
> Signed-off-by: Salil Mehta <salil.mehta@huawei.com>

Hi Salil,

Just brought up a QEMU-on-QEMU test setup again to poke
the kernel series and hopefully resolve a few questions there.
I ran into a trivial problem in which the kernel was trying and
failing to attach an ACPI handler before these CPUs were hotplugged.
It came down to the kernel code now treating the Functional bit in
_STA as meaning the device can be enumerated, effectively
ignoring all the other bits.

This requires a change to the value presented by default...
See below. The fix may well be completely wrong even though it works ;)

Thanks,

Jonathan

> ---
>  hw/acpi/cpu.c                  | 49 +++++++++++++++++++++++++++++++---
>  hw/acpi/generic_event_device.c | 11 ++++++++
>  include/hw/acpi/cpu.h          |  2 ++
>  3 files changed, 58 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index 232720992d..e1299696d3 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -63,10 +63,11 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr addr, unsigned size)
>      cdev = &cpu_st->devs[cpu_st->selector];
>      switch (addr) {
>      case ACPI_CPU_FLAGS_OFFSET_RW: /* pack and return is_* fields */
> -        val |= cdev->cpu ? 1 : 0;
> +        val |= cdev->is_enabled ? 1 : 0;
>          val |= cdev->is_inserting ? 2 : 0;
>          val |= cdev->is_removing  ? 4 : 0;
>          val |= cdev->fw_remove  ? 16 : 0;
> +        val |= cdev->is_present ? 32 : 0;
>          trace_cpuhp_acpi_read_flags(cpu_st->selector, val);
>          break;
>      case ACPI_CPU_CMD_DATA_OFFSET_RW:
> @@ -228,7 +229,21 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
>          struct CPUState *cpu = CPU(id_list->cpus[i].cpu);
>          if (qemu_present_cpu(cpu)) {
>              state->devs[i].cpu = cpu;
> +            state->devs[i].is_present = true;
> +        } else {
> +            if (qemu_persistent_cpu(cpu)) {
> +                state->devs[i].is_present = true;
> +            } else {
> +                state->devs[i].is_present = false;
> +            }
>          }
> +
> +        if (qemu_enabled_cpu(cpu)) {
> +            state->devs[i].is_enabled = true;
> +        } else {
> +            state->devs[i].is_enabled = false;
> +        }
> +
>          state->devs[i].arch_id = id_list->cpus[i].arch_id;
>      }
>      memory_region_init_io(&state->ctrl_reg, owner, &cpu_hotplug_ops, state,
> @@ -261,6 +276,8 @@ void acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
>      }
>  
>      cdev->cpu = CPU(dev);
> +    cdev->is_present = true;
> +    cdev->is_enabled = true;
>      if (dev->hotplugged) {
>          cdev->is_inserting = true;
>          acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
> @@ -292,6 +309,11 @@ void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
>          return;
>      }
>  
> +    cdev->is_enabled = false;
> +    if (!qemu_persistent_cpu(CPU(dev))) {
> +        cdev->is_present = false;
> +    }
> +
>      cdev->cpu = NULL;
>  }
>  
> @@ -302,6 +324,8 @@ static const VMStateDescription vmstate_cpuhp_sts = {
>      .fields      = (VMStateField[]) {
>          VMSTATE_BOOL(is_inserting, AcpiCpuStatus),
>          VMSTATE_BOOL(is_removing, AcpiCpuStatus),
> +        VMSTATE_BOOL(is_present, AcpiCpuStatus),
> +        VMSTATE_BOOL(is_enabled, AcpiCpuStatus),
>          VMSTATE_UINT32(ost_event, AcpiCpuStatus),
>          VMSTATE_UINT32(ost_status, AcpiCpuStatus),
>          VMSTATE_END_OF_LIST()
> @@ -339,6 +363,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
>  #define CPU_REMOVE_EVENT  "CRMV"
>  #define CPU_EJECT_EVENT   "CEJ0"
>  #define CPU_FW_EJECT_EVENT "CEJF"
> +#define CPU_PRESENT       "CPRS"
>  
>  void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>                      hwaddr base_addr,
> @@ -399,7 +424,9 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>          aml_append(field, aml_named_field(CPU_EJECT_EVENT, 1));
>          /* tell firmware to do device eject, write only */
>          aml_append(field, aml_named_field(CPU_FW_EJECT_EVENT, 1));
> -        aml_append(field, aml_reserved_field(3));
> +        /* 1 if present, read only */
> +        aml_append(field, aml_named_field(CPU_PRESENT, 1));
> +        aml_append(field, aml_reserved_field(2));
>          aml_append(field, aml_named_field(CPU_COMMAND, 8));
>          aml_append(cpu_ctrl_dev, field);
>  
> @@ -429,6 +456,7 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>          Aml *ctrl_lock = aml_name("%s.%s", cphp_res_path, CPU_LOCK);
>          Aml *cpu_selector = aml_name("%s.%s", cphp_res_path, CPU_SELECTOR);
>          Aml *is_enabled = aml_name("%s.%s", cphp_res_path, CPU_ENABLED);
> +        Aml *is_present = aml_name("%s.%s", cphp_res_path, CPU_PRESENT);
>          Aml *cpu_cmd = aml_name("%s.%s", cphp_res_path, CPU_COMMAND);
>          Aml *cpu_data = aml_name("%s.%s", cphp_res_path, CPU_DATA);
>          Aml *ins_evt = aml_name("%s.%s", cphp_res_path, CPU_INSERT_EVENT);
> @@ -457,13 +485,26 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
>          {
>              Aml *idx = aml_arg(0);
>              Aml *sta = aml_local(0);
> +            Aml *ifctx2;
> +            Aml *else_ctx;
>  
>              aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
>              aml_append(method, aml_store(idx, cpu_selector));
>              aml_append(method, aml_store(zero, sta));
> -            ifctx = aml_if(aml_equal(is_enabled, one));
> +            ifctx = aml_if(aml_equal(is_present, one));
>              {
> -                aml_append(ifctx, aml_store(aml_int(0xF), sta));
> +                ifctx2 = aml_if(aml_equal(is_enabled, one));
> +                {
> +                    /* cpu is present and enabled */
> +                    aml_append(ifctx2, aml_store(aml_int(0xF), sta));
> +                }
> +                aml_append(ifctx, ifctx2);
> +                else_ctx = aml_else();
> +                {
> +                    /* cpu is present but disabled */
> +                    aml_append(else_ctx, aml_store(aml_int(0xD), sta));
For the current kernel patches, functional should not be set. So this should
be something like 0x5.

> +                }
> +                aml_append(ifctx, else_ctx);
>              }
>              aml_append(method, ifctx);
>              aml_append(method, aml_release(ctrl_lock));



* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-25 20:21     ` Salil Mehta via
@ 2023-09-25 23:58       ` Gavin Shan
  0 siblings, 0 replies; 153+ messages in thread
From: Gavin Shan @ 2023-09-25 23:58 UTC (permalink / raw)
  To: Salil Mehta, Russell King
  Cc: qemu-devel, qemu-arm, maz, james.morse, jean-philippe,
	Jonathan Cameron, lorenzo.pieralisi, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, catalin.marinas, ardb, justin.he, oliver.upton,
	pbonzini, mst, rafael, borntraeger, alex.bennee, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, sudeep.holla, salil.mehta,
	zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Salil,

On 9/26/23 06:21, Salil Mehta wrote:
>> From: Russell King <linux@armlinux.org.uk>
>> Sent: Monday, September 25, 2023 9:13 PM
>> To: Salil Mehta <salil.mehta@huawei.com>
>> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; maz@kernel.org;
>> james.morse@arm.com; jean-philippe@linaro.org; Jonathan Cameron
>> <jonathan.cameron@huawei.com>; lorenzo.pieralisi@linaro.com;
>> lpieralisi@kernel.org; peter.maydell@linaro.org;
>> richard.henderson@linaro.org; imammedo@redhat.com; andrew.jones@linux.dev;
>> david@redhat.com; philmd@linaro.org; eric.auger@redhat.com;
>> will@kernel.org; catalin.marinas@arm.com; ardb@kernel.org;
>> justin.he@arm.com; oliver.upton@linux.dev; pbonzini@redhat.com;
>> mst@redhat.com; gshan@redhat.com; rafael@kernel.org;
>> borntraeger@linux.ibm.com; alex.bennee@linaro.org;
>> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
>> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
>> miguel.luis@oracle.com; sudeep.holla@arm.com; salil.mehta@opnsrc.net;
>> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
>> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
>> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
>> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
>> Arch
>>
>> On Mon, Sep 25, 2023 at 08:03:56PM +0000, Salil Mehta wrote:
>>> Looks like some problem with Huawei's mail server server. No patches
>>> except the cover letter are reaching the qemu-devel mailing-list.
>>
>> I haven't seen any of the actual patches - just the cover letters.
>> Was that intentional?
> 
> No. All the patches are getting held, either by the server or
> some other problem. This has happened for both instances of the
> patch-set I pushed to the mailing list within 2 hours.
> 
> I am not sure how to sort it out without the help of IT. China is
> asleep now.
> 
> Any suggestions are welcome to debug this. Or should I wait till tomorrow?
> 

Thanks for your efforts to continue working on the feature. Hope the mail
server issue can be fixed soon so that the patches can be posted. I don't
see the attached patches either. However, the code is available for early
access in your private repository, as clarified in the cover letter :)

   https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2

Thanks,
Gavin




* RE: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-25 20:12   ` Russell King (Oracle)
@ 2023-09-25 20:21     ` Salil Mehta via
  2023-09-25 23:58       ` Gavin Shan
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-25 20:21 UTC (permalink / raw)
  To: Russell King
  Cc: qemu-devel, qemu-arm, maz, james.morse, jean-philippe,
	Jonathan Cameron, lorenzo.pieralisi, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, catalin.marinas, ardb, justin.he, oliver.upton,
	pbonzini, mst, gshan, rafael, borntraeger, alex.bennee, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, sudeep.holla,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Hi Russell,

> From: Russell King <linux@armlinux.org.uk>
> Sent: Monday, September 25, 2023 9:13 PM
> To: Salil Mehta <salil.mehta@huawei.com>
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; maz@kernel.org;
> james.morse@arm.com; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lorenzo.pieralisi@linaro.com;
> lpieralisi@kernel.org; peter.maydell@linaro.org;
> richard.henderson@linaro.org; imammedo@redhat.com; andrew.jones@linux.dev;
> david@redhat.com; philmd@linaro.org; eric.auger@redhat.com;
> will@kernel.org; catalin.marinas@arm.com; ardb@kernel.org;
> justin.he@arm.com; oliver.upton@linux.dev; pbonzini@redhat.com;
> mst@redhat.com; gshan@redhat.com; rafael@kernel.org;
> borntraeger@linux.ibm.com; alex.bennee@linaro.org;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
> miguel.luis@oracle.com; sudeep.holla@arm.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8
> Arch
> 
> On Mon, Sep 25, 2023 at 08:03:56PM +0000, Salil Mehta wrote:
> > Looks like some problem with Huawei's mail server. No patches
> > except the cover letter are reaching the qemu-devel mailing list.
> 
> I haven't seen any of the actual patches - just the cover letters.
> Was that intentional?

No. All the patches are getting held either by the server or by
some other problem. This has happened for both instances of the
patch-set I pushed to the mailing list within the last 2 hours.

I am not sure how to sort it out without the help of IT. China is
asleep now.

Any suggestions to debug this are welcome. Or should I wait till tomorrow?

Many thanks
Salil.



^ permalink raw reply	[flat|nested] 153+ messages in thread

* Re: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-25 20:03 ` Salil Mehta via
@ 2023-09-25 20:12   ` Russell King (Oracle)
  2023-09-25 20:21     ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Russell King (Oracle) @ 2023-09-25 20:12 UTC (permalink / raw)
  To: Salil Mehta
  Cc: qemu-devel, qemu-arm, maz, james.morse, jean-philippe,
	Jonathan Cameron, lorenzo.pieralisi, lpieralisi, peter.maydell,
	richard.henderson, imammedo, andrew.jones, david, philmd,
	eric.auger, will, catalin.marinas, ardb, justin.he, oliver.upton,
	pbonzini, mst, gshan, rafael, borntraeger, alex.bennee, darren,
	ilkka, vishnu, karl.heubaum, miguel.luis, sudeep.holla,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

On Mon, Sep 25, 2023 at 08:03:56PM +0000, Salil Mehta wrote:
> Looks like some problem with Huawei's mail server. No patches
> except the cover letter are reaching the qemu-devel mailing list.

I haven't seen any of the actual patches - just the cover letters.
Was that intentional?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-25 19:43 Salil Mehta via
@ 2023-09-25 20:03 ` Salil Mehta via
  2023-09-25 20:12   ` Russell King (Oracle)
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-25 20:03 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: maz, james.morse, jean-philippe, Jonathan Cameron,
	lorenzo.pieralisi, lpieralisi, peter.maydell, richard.henderson,
	imammedo, andrew.jones, david, philmd, eric.auger, will,
	catalin.marinas, ardb, justin.he, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, sudeep.holla, salil.mehta,
	zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2, maobibo, lixianglai

Looks like some problem with Huawei's mail server. No patches
except the cover letter are reaching the qemu-devel mailing list.


> From: Salil Mehta <salil.mehta@huawei.com>
> Sent: Monday, September 25, 2023 8:44 PM
> To: qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: Salil Mehta <salil.mehta@huawei.com>; maz@kernel.org;
> james.morse@arm.com; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; lorenzo.pieralisi@linaro.com;
> lpieralisi@kernel.org; peter.maydell@linaro.org;
> richard.henderson@linaro.org; imammedo@redhat.com; andrew.jones@linux.dev;
> david@redhat.com; philmd@linaro.org; eric.auger@redhat.com;
> will@kernel.org; catalin.marinas@arm.com; ardb@kernel.org;
> justin.he@arm.com; oliver.upton@linux.dev; pbonzini@redhat.com;
> mst@redhat.com; gshan@redhat.com; rafael@kernel.org;
> borntraeger@linux.ibm.com; alex.bennee@linaro.org; linux@armlinux.org.uk;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com;
> vishnu@os.amperecomputing.com; karl.heubaum@oracle.com;
> miguel.luis@oracle.com; sudeep.holla@arm.com; salil.mehta@opnsrc.net;
> zhukeqian <zhukeqian1@huawei.com>; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>; wangyanan (Y) <wangyanan55@huawei.com>;
> jiakernel2@gmail.com; maobibo@loongson.cn; lixianglai@loongson.cn
> Subject: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
> 
> PROLOGUE
> ========
> 
> To assist in the review and to set the right expectations from this RFC,
> please first read the below sections, *APPENDED AT THE END* of this cover
> letter:
> 
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of the leftover work or work-in-progress [Section
> (IX)]
> 
> NOTE: Other organizations have shown interest in adapting this series for
> their architectures. I am planning to split this RFC into architecture
> *agnostic* and *specific* patch-sets in subsequent releases. The
> ARM-specific patch-set will continue as RFC V3, and the
> architecture-agnostic patch-set will be floated without the RFC tag and can
> be consumed in this QEMU cycle if the MAINTAINERs ack it.
> 
> [Please check section (XI)B for details of the architecture-agnostic patches]
> 
> 
> SECTIONS [I - XIII] are as follows:
> 
> (I) Key Changes (RFC V1 -> RFC V2)
>     ==================================
> 
>     RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-
> salil.mehta@huawei.com/
> 
> 1. The ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>    *online-capable* or *enabled* to the Guest OS at boot time. This means
>    the associated CPUs can have their ACPI _STA as *enabled* or *disabled*
>    even after boot.
>    See UEFI ACPI 6.5 Spec, Section 5, Table 5.37, GICC CPU Interface Flags [20]
> 2. SMCCC/HVC Hypercall exit handling in userspace/QEMU for PSCI CPU_{ON,OFF}
>    requests. This is required to {dis}allow onlining a vCPU.
> 3. Unplugged vCPUs are always presented in the CPUs' ACPI AML code as ACPI
>    _STA.PRESENT to the Guest OS. ACPI _STA.Enabled is toggled to give the
>    effect of hot(un)plug.
> 4. Live Migration works (some issues remain).
> 5. TCG/HVF/qtest do not support hotplug and fall back to the default.
> 6. Code for TCG support does exist in this release (it is a work-in-progress).
> 7. The ACPI _OSC method can now be used by the OSPM to negotiate the QEMU VM
>    platform hotplug capability (_OSC Query support is still pending).
> 8. Misc. bug fixes
> 
> (II) Summary
>      =======
> 
> This patch-set introduces virtual CPU hotplug support for the ARMv8
> architecture in QEMU. The idea is to be able to hotplug and hot-unplug
> vCPUs while the guest VM is running, without requiring a reboot. This does
> *not* make any assumption about the availability of physical CPU hotplug
> within the host system but rather tries to solve the problem at the
> virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks and event
> handling to interface with the guest kernel, and code to initialize, plug
> and unplug CPUs. No changes are required within the host kernel/KVM except
> support for hypercall exit handling in the user-space/QEMU, which has
> recently been added to the kernel. The corresponding guest kernel changes
> have been posted on the mailing-list [3] [4] by James Morse.
> 
> (III) Motivation
>       ==========
> 
> This allows scaling the guest VM compute capacity on-demand, which would be
> useful for the following example scenarios:
> 
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
>    framework which could adjust resource requests (CPU and memory requests)
>    for the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: The infrastructure provider could
>    allocate and restrict the total number of compute resources available to
>    the guest VM according to the SLA (Service Level Agreement). The VM
>    owner could request more compute to be hot-plugged at some cost.
> 
> For example, a Kata Container VM starts with a minimum amount of resources
> (i.e. the hotplug-everything approach). Why?
> 
> 1. It allows a faster *boot time*, and
> 2. Reduces the *memory footprint*.
> 
> A Kata Container VM can boot with just 1 vCPU, and more vCPUs can be
> hot-plugged later as per requirement.
> 
> (IV) Terminology
>      ===========
> 
> (*) Possible CPUs: Total vCPUs which could ever exist in the VM. This
>                    includes any cold-booted CPUs plus any CPUs which could
>                    be hot-plugged later.
>                    - QEMU parameter: -smp maxcpus=N
> (*) Present CPUs:  Possible CPUs which are ACPI 'present'. These might or
>                    might not be ACPI 'enabled'.
>                    - Present vCPUs = Possible vCPUs (always, on the ARM arch)
> (*) Enabled CPUs:  Possible CPUs which are ACPI 'present' and 'enabled' and
>                    can now be 'onlined' (PSCI) for use by the Guest Kernel.
>                    All cold-booted vCPUs are ACPI 'enabled' at boot. Later,
>                    more vCPUs can be hot-plugged using device_add and made
>                    ACPI 'enabled'.
>                    - QEMU parameter: -smp cpus=N. Can be used to specify
>                      some cold-booted vCPUs during VM init. Some can be
>                      added using the '-device' option.
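The terminology above implies a strict containment between the vCPU sets (online is a subset of enabled, which is a subset of present, which equals possible on ARM). A minimal Python model of these invariants (illustrative only, not QEMU code; the function name and arguments are invented for this sketch):

```python
# Illustrative model of the vCPU sets defined above (not QEMU code).
# On ARM, the 'present' set always equals the 'possible' set; only the
# 'enabled' subset grows/shrinks via hot(un)plug, and the guest picks
# which enabled vCPUs to bring 'online' via PSCI.

def vcpu_sets(maxcpus, cold_booted, hotplugged, onlined):
    possible = set(range(maxcpus))            # -smp maxcpus=N
    present = possible                        # ARM: present == possible
    enabled = set(cold_booted) | set(hotplugged)
    online = set(onlined)
    assert enabled <= present <= possible
    assert online <= enabled
    return possible, present, enabled, online

# VM booted with -smp cpus=4,maxcpus=6, then vCPU 4 hot-plugged and onlined
possible, present, enabled, online = vcpu_sets(
    maxcpus=6, cold_booted=range(4), hotplugged=[4], onlined=[0, 1, 4])

print(sorted(enabled))   # the guest's /sys/devices/system/cpu/enabled set
print(sorted(online))    # the guest's /sys/devices/system/cpu/online set
```

The sample sysfs outputs in section (VI) below follow exactly these set relationships.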
> 
> (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
>     ===============================================================
> 
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>    1. The ARMv8 CPU architecture does not support the concept of physical
>       CPU hotplug.
>       a. There are many per-CPU components like PMU, SVE, MTE, arch timers
>          etc. whose behaviour needs to be clearly defined when a CPU is
>          hot(un)plugged. There is no specification for this.
> 
>    2. Other ARM components like the GIC have not been designed to realize
>       physical CPU hotplug capability as of now. For example,
>       a. Every physical CPU has a unique GICC (GIC CPU Interface) by
>          construction. The architecture does not specify what CPU
>          hot(un)plug would mean in the context of any of these.
>       b. CPUs/GICCs are physically connected to unique GICRs (GIC
>          Redistributors). GIC Redistributors are always part of the
>          always-on power domain and hence cannot be powered off, as per
>          the specification.
> 
> B. Impediments in Firmware/ACPI (Architectural Constraint)
> 
>    1. The firmware has to expose the GICC, GICR and other per-CPU features
>       like PMU, SVE, MTE, arch timers etc. to the OS. Due to the
>       architectural constraint stated in section A1(a) above, all interrupt
>       controller structures of the MADT describing the GIC CPU Interfaces
>       and the GIC Redistributors MUST be presented by the firmware to the
>       OSPM at boot time.
>    2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
>       method to get this kind of information from the firmware even after
>       boot, and the OSPM has the capability to process it. The ARM kernel
>       uses the information in the MADT interrupt controller structures to
>       identify the number of present CPUs during boot, and hence does not
>       allow it to change after boot. The number of present CPUs cannot be
>       changed. It is an architectural constraint!
> 
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)
> 
>    1. KVM VGIC:
>        a. Sizing of various VGIC resources like memory regions etc. related
>           to the redistributor happens only once, is fixed at VM init time,
>           and cannot be changed after initialization has happened. KVM
>           statically configures these resources based on the number of
>           vCPUs and the number/size of the redistributor ranges.
>        b. The association between a vCPU and its VGIC redistributor is
>           fixed at VM init time within KVM, i.e. when the redistributor
>           iodevs get registered. The VGIC does not allow this association
>           to be set up or changed after VM initialization has happened.
>           Physically, every CPU/GICC is uniquely connected with its
>           redistributor and there is no architectural way to set this up.
>    2. KVM vCPUs:
>        a. The lack of a specification means destruction of KVM vCPUs does
>           not exist, as there is no reference to tell what to do with the
>           other per-vCPU components like redistributors, arch timers etc.
>        b. In fact, KVM does not implement destruction of vCPUs for any
>           architecture. This is independent of whether the architecture
>           actually supports the CPU hotplug feature. For example, even for
>           x86, KVM does not implement destruction of vCPUs.
> 
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)
> 
>    1. QEMU CPU objects MUST be created to initialize all the host KVM vCPUs
>       to overcome the KVM constraint. KVM vCPUs are created and initialized
>       when QEMU CPU objects are realized. But keeping the QOM CPU objects
>       realized for 'yet-to-be-plugged' vCPUs creates problems when these
>       new vCPUs are plugged using device_add and a new QOM CPU object is
>       created.
>    2. GICV3State and GICV3CPUState objects MUST be sized over *possible
>       vCPUs* at VM init time, when the QOM GICV3 object is realized. This
>       is because the KVM VGIC can only be initialized once, at init time.
>       But every GICV3CPUState has an associated QOM CPU object, which might
>       correspond to a vCPU which is 'yet-to-be-plugged' (unplugged at init).
>    3. How should new QOM CPU objects be connected back to the GICV3CPUState
>       objects, and disconnected from them, when a CPU is hot(un)plugged?
>    4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in
>       the QOM, for which a KVM vCPU already exists? For example, whether to
>       keep,
>        a. no QOM CPU objects, or
>        b. unrealized CPU objects
>    5. How should vCPU state be exposed via ACPI to the Guest? Especially
>       for the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not
>       exist within the QOM, while the Guest always expects all possible
>       vCPUs to be identified as ACPI *present* during boot.
>    6. How should QEMU expose GIC CPU interfaces for the unplugged or
>       yet-to-be-plugged vCPUs using the ACPI MADT table to the Guest?
> 
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
> 
>    1. At VM init, pre-create all the possible vCPUs in the host KVM, i.e.
>       even the vCPUs which are yet-to-be-plugged in QEMU, but keep them in
>       the powered-off state.
>    2. After the KVM vCPUs have been initialized in the host, the KVM vCPU
>       objects corresponding to the unplugged/yet-to-be-plugged vCPUs are
>       parked on the existing per-VM "kvm_parked_vcpus" list in QEMU
>       (similar to x86).
>    3. GICV3State and GICV3CPUState objects are sized over possible vCPUs at
>       VM init time, i.e. when the QEMU GIC is realized. This in turn sizes
>       the KVM VGIC resources like memory regions etc. related to the
>       redistributors with the number of possible KVM vCPUs. This never
>       changes after the VM has initialized.
>    4. QEMU CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs
>       are released after host KVM vCPU and GIC/VGIC initialization.
>    5. Build the ACPI MADT table with the below updates:
>       a. Number of GIC CPU interface entries (= possible vCPUs)
>       b. Present boot vCPUs as MADT.GICC.Enabled=1 (not hot[un]pluggable)
>       c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>          - MADT.GICC.Enabled=0 (mutually exclusive) [6][7]
>          - vCPU can be ACPI enabled+onlined after the Guest boots (firmware
>            policy)
>          - Some issues with the above (details in later sections)
>    6. Expose the below ACPI status to the Guest kernel:
>       a. Always _STA.Present=1 (all possible vCPUs)
>       b. _STA.Enabled=1 (plugged vCPUs)
>       c. _STA.Enabled=0 (unplugged vCPUs)
>    7. vCPU hotplug *realizes* a new QOM CPU object. The following happens:
>       a. Realizes and initializes the QOM CPU object & spawns the QEMU
>          vCPU thread
>       b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
>          - Attaches it to the QOM CPU object.
>       c. Reinitializes the KVM vCPU in the host
>          - Resets the core and sys regs, sets defaults etc.
>       d. Runs the KVM vCPU (created with "start-powered-off")
>          - The vCPU thread sleeps (waits for vCPU reset via PSCI)
>       e. Updates the QEMU GIC
>          - Wires back the IRQs related to this vCPU.
>          - GICV3CPUState association with the QOM CPU object.
>       f. Updates [6] ACPI _STA.Enabled=1
>       g. Notifies the Guest about the new vCPU (via the ACPI GED interface)
>          - Guest checks _STA.Enabled=1
>          - Guest adds the processor (registers the CPU with the LDM) [3]
>       h. Plugs the QOM CPU object into the slot.
>          - slot-number = cpu-index{socket,cluster,core,thread}
>       i. Guest onlines the vCPU (CPU_ON PSCI call over HVC/SMC)
>          - KVM exits the HVC/SMC Hypercall [5] to QEMU (policy check).
>          - QEMU powers on the KVM vCPU in the host.
>    8. vCPU hot-unplug *unrealizes* the QOM CPU object. The following happens:
>       a. Notifies the Guest (via the ACPI GED interface) of the vCPU
>          hot-unplug event
>          - Guest offlines the vCPU (CPU_OFF PSCI call over HVC/SMC)
>       b. KVM exits the HVC/SMC Hypercall [5] to QEMU (policy check).
>          - QEMU powers off the KVM vCPU in the host.
>       c. Guest signals *Eject* vCPU to QEMU
>       d. QEMU updates [6] ACPI _STA.Enabled=0
>       e. Updates the GIC
>          - Un-wires the IRQs related to this vCPU
>          - GICV3CPUState association with the QOM CPU object is updated.
>       f. Unplugs the vCPU
>          - Removes it from the slot
>          - Parks the KVM vCPU ("kvm_parked_vcpus" list)
>          - Unrealizes the QOM CPU object & joins back the QEMU vCPU thread
>          - Destroys the QOM CPU object
>       g. Guest checks ACPI _STA.Enabled=0
>          - Removes the processor (unregisters the CPU with the LDM) [3]
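The plug/unplug sequences in E.7 and E.8 can be viewed as a small state machine over a vCPU's lifecycle. A toy Python model of the ordering constraints (illustrative only; the state and event names are invented for this sketch, not QEMU identifiers):

```python
# Toy model of the vCPU lifecycle implied by steps E.7/E.8 above
# (illustrative only; state names are invented, not QEMU identifiers).
# A possible vCPU starts PARKED (KVM vCPU exists, no QOM object); hotplug
# realizes a QOM object and ACPI-enables it, and only then can the guest
# online it via PSCI CPU_ON.

TRANSITIONS = {
    ("parked", "device_add"): "enabled",    # E.7: realize + unpark + _STA.Enabled=1
    ("enabled", "psci_cpu_on"): "online",   # E.7.i: guest onlines via PSCI
    ("online", "psci_cpu_off"): "enabled",  # E.8.a: guest offlines first
    ("enabled", "device_del"): "parked",    # E.8: eject, park KVM vCPU again
}

def step(state, event):
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal: {event} while {state}")

state = "parked"
for ev in ["device_add", "psci_cpu_on", "psci_cpu_off", "device_del"]:
    state = step(state, ev)
print(state)  # back to "parked"
```

Note in particular that `device_del` while the vCPU is still online is illegal in this model: the guest must first offline the vCPU (CPU_OFF) before the eject, which mirrors the GED notify / eject handshake in E.8.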
> 
> F. Work Presented at KVM Forum Conferences:
>    Details of the above work have been presented at the KVMForum2020 and
>    KVMForum2023 conferences. Slides are available at the below links:
>    a. KVMForum 2023
>       - Challenges Revisited in Supporting Virt CPU Hotplug on
>         architectures that don't Support CPU Hotplug (like ARM64)
>         https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
>    b. KVMForum 2020
>       - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems
>         (like ARM64) - Salil Mehta, Huawei
>         https://sched.co/eE4m
> 
> (VI) Commands Used
>      =============
> 
>     A. Qemu launch commands to init the machine
> 
>     $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>     -cpu host -smp cpus=4,maxcpus=6 \
>     -m 300M \
>     -kernel Image \
>     -initrd rootfs.cpio.gz \
>     -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2
> acpi=force" \
>     -nographic \
>     -bios QEMU_EFI.fd
> 
>     B. Hot-(un)plug related commands
> 
>     # Hotplug a host vCPU (accel=kvm)
>     $ device_add host-arm-cpu,id=core4,core-id=4
> 
>     # Hotplug a vCPU (accel=tcg)
>     $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> 
>     # Delete the vCPU
>     $ device_del core4
> 
>     Sample output on guest after boot:
> 
>     $ cat /sys/devices/system/cpu/possible
>     0-5
>     $ cat /sys/devices/system/cpu/present
>     0-5
>     $ cat /sys/devices/system/cpu/enabled
>     0-3
>     $ cat /sys/devices/system/cpu/online
>     0-1
>     $ cat /sys/devices/system/cpu/offline
>     2-5
> 
>     Sample output on guest after hotplug of vCPU=4:
> 
>     $ cat /sys/devices/system/cpu/possible
>     0-5
>     $ cat /sys/devices/system/cpu/present
>     0-5
>     $ cat /sys/devices/system/cpu/enabled
>     0-4
>     $ cat /sys/devices/system/cpu/online
>     0-1,4
>     $ cat /sys/devices/system/cpu/offline
>     2-3,5
> 
>     Note: vCPU=4 was explicitly 'onlined' after hot-plug
>     $ echo 1 > /sys/devices/system/cpu/cpu4/online
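The device_add/device_del monitor commands above can equally be driven over QMP. A sketch of the corresponding QMP JSON payloads (illustrative; the driver/id/core-id values mirror the HMP examples above, and the helper names are invented):

```python
# Build the QMP equivalents of the HMP hotplug commands shown above.
# The JSON shapes follow the generic QMP device_add/device_del commands;
# driver/id/core-id values mirror the examples in section (VI).
import json

def qmp_device_add(driver, dev_id, core_id):
    return {"execute": "device_add",
            "arguments": {"driver": driver, "id": dev_id, "core-id": core_id}}

def qmp_device_del(dev_id):
    return {"execute": "device_del", "arguments": {"id": dev_id}}

# Hotplug a host vCPU (accel=kvm), then unplug it again
add_cmd = qmp_device_add("host-arm-cpu", "core4", 4)
del_cmd = qmp_device_del("core4")

print(json.dumps(add_cmd))
print(json.dumps(del_cmd))
```

In practice these payloads would be written to a QMP socket (e.g. a VM started with `-qmp unix:/tmp/qmp.sock,server,nowait`, which is an assumed path) after the usual `qmp_capabilities` handshake.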
> 
> (VII) Repository
>       ==========
> 
>  (*) QEMU changes for vCPU hotplug can be cloned from:
>      https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
>  (*) Guest kernel changes (by James Morse, ARM) are available here:
>      https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
>      virtual_cpu_hotplug/rfc/v2
> 
> 
> (VIII) KNOWN ISSUES
>        ============
> 
> 1. Migration has been lightly tested. Below are some of the known issues:
>    - Occasional CPU stall (not always repeatable)
>    - Negative test cases like an asymmetric source/destination VM config
>      cause a dump.
>    - Migration with TCG is not working properly.
> 2. TCG with single-threaded mode is broken.
> 3. HVF and qtest support is broken.
> 4. The ACPI MADT table flags [7] MADT.GICC.Enabled and
>    MADT.GICC.online-capable are mutually exclusive, i.e. as per the change
>    [6] a vCPU cannot be both GICC.Enabled and GICC.online-capable. This
>    means,
>       [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>    a. If we have to support hot-unplug of the cold-booted vCPUs then these
>       MUST be specified as GICC.online-capable in the MADT table during
>       boot by the firmware/QEMU. But this requirement conflicts with the
>       requirement to support the new QEMU changes on legacy OSes which
>       don't understand the MADT.GICC.online-capable bit. A legacy OS will
>       ignore this bit during boot and hence these vCPUs will not appear on
>       such an OS. This is unexpected behaviour.
>    b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to
>       unplug these cold-booted vCPUs from the OS (which actually should be
>       blocked by QEMU returning an error), then features like 'kexec' will
>       break.
>    c. As I understand, removal of the cold-booted vCPUs is a required
>       feature and the x86 world allows it.
>    d. Hence, either we need a specification change to make the
>       MADT.GICC.Enabled and MADT.GICC.online-capable bits NOT mutually
>       exclusive, or we must NOT support removal of the cold-booted vCPUs.
>       In the latter case, a check can be introduced to bar users from
>       unplugging cold-booted vCPUs using QMP commands. (Needs discussion!)
>       Please check the below patch, part of this patch-set:
>           [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
> 5. Code related to the notification of GICV3 about a vCPU hot(un)plug event
>    includes virt.h in arm_gicv3_common.c, which is not correct. We need a
>    better way to notify the GIC about a CPU event, independent of
>    VirtMachineState.
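Known issue 4 above hinges on the GICC flags encoding. A small sketch of the check that firmware/QEMU would have to honour, assuming the ACPI 6.5 bit layout (Enabled = bit 0, Online Capable = bit 3; verify against the spec [7][20] before relying on these positions):

```python
# Sketch of the MADT.GICC flags constraint from known issue 4.
# Bit positions assume the ACPI 6.5 GICC flags layout (Enabled = bit 0,
# Online Capable = bit 3); verify against the spec before relying on this.
GICC_ENABLED        = 1 << 0
GICC_ONLINE_CAPABLE = 1 << 3

def gicc_flags(cold_booted):
    # Current spec: the two bits are mutually exclusive, so a cold-booted
    # vCPU is Enabled and a hotpluggable one is only Online Capable.
    return GICC_ENABLED if cold_booted else GICC_ONLINE_CAPABLE

def flags_valid(flags):
    # The mutual-exclusivity rule [6]: both bits set is invalid.
    return (flags & GICC_ENABLED) == 0 or (flags & GICC_ONLINE_CAPABLE) == 0

assert flags_valid(gicc_flags(cold_booted=True))
assert flags_valid(gicc_flags(cold_booted=False))
# The relaxation argued for in issue 4.d would remove exactly this check:
assert not flags_valid(GICC_ENABLED | GICC_ONLINE_CAPABLE)
```

The specification change discussed in issue 4.d amounts to allowing both bits to be set for a cold-booted but hot-unpluggable vCPU.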
> 
> 
> (IX) THINGS TO DO
>      ============
> 
> 1. Fix the migration issues.
> 2. Fix issues related to TCG/emulation support.
> 3. Comprehensive testing. Current testing is very basic.
>    a. Negative test cases
> 4. QEMU documentation (.rst) needs to be updated.
> 5. Fix qtest and HVF support.
> 6. Fix the design issue related to the ACPI MADT.GICC flags discussed in
>    the known issues. This might require a UEFI ACPI specification change!
> 7. Add ACPI _OSC 'Query' support. Only part of the _OSC support exists now.
> 
>  The above is *not* a complete list. It will be updated later!
> 
> Best regards
> Salil.
> 
> (X) DISCLAIMER
>     ==========
> 
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU
> hotplug implementation to the community. This is *not* production-level
> code and might have bugs. Only basic testing has been done, on a HiSilicon
> Kunpeng920 SoC for servers. Once the design and the core idea behind the
> implementation have been verified, more effort can be put into hardening
> the code.
> 
> This work is *mostly* along the lines of the discussions which have
> happened over the previous years [see refs below] across different
> channels like the mailing-list, the Linaro Open Discussions platform, and
> various conferences like KVMForum. This RFC is being used as a way to
> verify the idea mentioned in this cover letter and to get community views.
> Once this has been agreed upon, a formal patch shall be posted on the
> mailing-list.
> 
> The concept being presented has been found to work!
> 
> (XI) ORGANIZATION OF PATCHES
>      =======================
> 
>  A. All patches [architecture 'agnostic' + 'specific']:
> 
>    [Patch 1-9, 23, 36] Logic required during machine init
>     (*) Some validation checks
>     (*) Introduces the core-id property and some util functions required later.
>     (*) Refactors the parking logic of vCPUs
>     (*) Logic to pre-create vCPUs
>     (*) GIC initialization pre-sized with possible vCPUs.
>     (*) Some refactoring to have the common hot and cold plug logic together.
>     (*) Release of disabled QOM CPU objects in post_cpu_init()
>     (*) Support for the ACPI _OSC method to negotiate platform hotplug
>         capabilities
>    [Patch 10-22] Logic related to ACPI at machine init time
>     (*) Changes required to enable ACPI for CPU hotplug
>     (*) Initialization of the ACPI GED framework to cater for CPU hotplug
>         events
>     (*) Build of the ACPI AML related to the CPU control device
>     (*) ACPI MADT/MAT changes
>    [Patch 24-35] Logic required during vCPU hot-(un)plug
>     (*) Basic framework changes to support vCPU hot-(un)plug
>     (*) ACPI GED changes for hot-(un)plug hooks.
>     (*) Wire-unwire of the IRQs
>     (*) GIC notification logic
>     (*) ARMCPU unrealize logic
>     (*) Handling of SMCCC hypercall exits by KVM to QEMU
> 
>  B. Architecture *agnostic* patches part of this patch-set:
> 
>    [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
>     (*) Refactors the parking logic of vCPUs
>     (*) Introduces ACPI GED support for vCPU hotplug events
>     (*) Introduces the ACPI AML change for the CPU Control Device
> 
> (XII) REFERENCES
>       ==========
> 
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> 
> (XIII) ACKNOWLEDGEMENTS
>        ================
> 
> I would like to take this opportunity to thank the below people for various
> discussions with me over different channels during the development:
> 
> Marc Zyngier (Google),              Catalin Marinas (ARM),
> James Morse (ARM),                  Will Deacon (Google),
> Jean-Philippe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (Red Hat),
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mammedov (Red Hat),            Ilkka Koskinen (Ampere),
> Andrew Jones (Red Hat),             Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei),       Russell King (Oracle),
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
> Zengtao/Prime (Huawei),             and all those whom I have missed!
> 
> Many thanks to the below people for their current or past contributions:
> 
> 1. James Morse (ARM)
>    (Current kernel part of vCPU hotplug support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>    (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>    (Co-developed the QEMU prototype)
> 4. Xiongfeng Wang (Huawei)
>    (Co-developed the earlier kernel prototype)
> 5. Vishnu Pajjuri (Ampere)
>    (Verification on Ampere ARM64 platforms + fixes)
> 6. Miguel Luis (Oracle)
>    (Verification on Oracle ARM64 platforms + fixes)
> 
> 
> Author Salil Mehta (1):
>   target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> 
> Jean-Philippe Brucker (2):
>   hw/acpi: Make _MAT method optional
>   target/arm/kvm: Write CPU state back to KVM on reset
> 
> Miguel Luis (1):
>   tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> 
> Salil Mehta (33):
>   arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id
> property
>   cpus-common: Add common CPU utility for possible vCPUs
>   hw/arm/virt: Move setting of common CPU properties in a function
>   arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-
> plug
>   accel/kvm: Extract common KVM vCPU {creation,parking} code
>   arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>   arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
>   arm/virt: Init PMU at host for all possible vcpus
>   hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
>   arm/acpi: Enable ACPI support for vcpu hotplug
>   hw/acpi: Add ACPI CPU hotplug init stub
>   hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
>   hw/acpi: Init GED framework with cpu hotplug events
>   arm/virt: Add cpu hotplug events to GED during creation
>   arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>   hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
>   arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>   arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>   hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to
> Guest
>   hw/acpi: Update GED _EVT method AML with cpu scan
>   hw/arm: MADT Tbl change to size the guest with possible vCPUs
>   arm/virt: Release objects for *disabled* possible vCPUs after init
>   hw/acpi: Update ACPI GED framework to support vCPU Hotplug
>   arm/virt: Add/update basic hot-(un)plug framework
>   arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>   hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>   hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
>   arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>   hw/arm: Changes required for reset and to support next boot
>   physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
>   target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>   hw/arm: Support hotplug capability check using _OSC method
>   hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> 
>  accel/kvm/kvm-all.c                    |  61 +-
>  accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
>  cpus-common.c                          |  37 ++
>  gdbstub/gdbstub.c                      |  13 +
>  hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
>  hw/acpi/cpu.c                          |  91 ++-
>  hw/acpi/generic_event_device.c         |  33 +
>  hw/arm/Kconfig                         |   1 +
>  hw/arm/boot.c                          |   2 +-
>  hw/arm/virt-acpi-build.c               | 110 +++-
>  hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
>  hw/core/gpio.c                         |   2 +-
>  hw/i386/acpi-build.c                   |   2 +-
>  hw/intc/arm_gicv3.c                    |   1 +
>  hw/intc/arm_gicv3_common.c             |  66 +-
>  hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
>  hw/intc/arm_gicv3_cpuif_common.c       |   5 +
>  hw/intc/arm_gicv3_kvm.c                |  39 +-
>  hw/intc/gicv3_internal.h               |   2 +
>  include/exec/cpu-common.h              |   8 +
>  include/exec/gdbstub.h                 |   1 +
>  include/hw/acpi/cpu.h                  |   7 +-
>  include/hw/acpi/cpu_hotplug.h          |   4 +
>  include/hw/acpi/generic_event_device.h |   5 +
>  include/hw/arm/boot.h                  |   2 +
>  include/hw/arm/virt.h                  |  10 +-
>  include/hw/core/cpu.h                  |  77 +++
>  include/hw/intc/arm_gicv3_common.h     |  23 +
>  include/hw/qdev-core.h                 |   2 +
>  include/sysemu/kvm.h                   |   2 +
>  include/tcg/tcg.h                      |   1 +
>  softmmu/physmem.c                      |  25 +
>  target/arm/arm-powerctl.c              |  51 +-
>  target/arm/cpu-qom.h                   |   3 +
>  target/arm/cpu.c                       | 112 ++++
>  target/arm/cpu.h                       |  17 +
>  target/arm/cpu64.c                     |  15 +
>  target/arm/gdbstub.c                   |   6 +
>  target/arm/helper.c                    |  27 +-
>  target/arm/internals.h                 |  12 +-
>  target/arm/kvm.c                       |  93 ++-
>  target/arm/kvm64.c                     |  59 +-
>  target/arm/kvm_arm.h                   |  24 +
>  target/arm/meson.build                 |   1 +
>  target/arm/{tcg => }/psci.c            |   8 +
>  target/arm/tcg/meson.build             |   4 -
>  tcg/tcg.c                              |  23 +
>  47 files changed, 1873 insertions(+), 349 deletions(-)
>  rename target/arm/{tcg => }/psci.c (97%)
> 
> --
> 2.34.1


^ permalink raw reply	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
@ 2023-09-25 19:43 Salil Mehta via
  2023-09-25 20:03 ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-25 19:43 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, james.morse, jean-philippe, jonathan.cameron,
	lorenzo.pieralisi, lpieralisi, peter.maydell, richard.henderson,
	imammedo, andrew.jones, david, philmd, eric.auger, will,
	catalin.marinas, ardb, justin.he, oliver.upton, pbonzini, mst,
	gshan, rafael, borntraeger, alex.bennee, linux, darren, ilkka,
	vishnu, karl.heubaum, miguel.luis, sudeep.holla, salil.mehta,
	zhukeqian1, wangxiongfeng2, wangyanan55, jiakernel2, maobibo,
	lixianglai

PROLOGUE
========

To assist in review and set the right expectations from this RFC, please first
read below sections *APPENDED AT THE END* of this cover letter,

1. Important *DISCLAIMER* [Section (X)]
2. Work presented at KVMForum Conference (slides available) [Section (V)F]
3. Organization of patches [Section (XI)]
4. References [Section (XII)]
5. Detailed TODO list of the leftover work or work-in-progress [Section (IX)]

NOTE: There has been an interest shown by other organizations in adapting
this series for their architecture. I am planning to split this RFC into
architecture *agnostic* and *specific* patch-sets in subsequent releases. ARM
specific patch-set will continue as RFC V3 and architecture agnostic patch-set
will be floated without RFC tag and can be consumed in this Qemu cycle if
MAINTAINERs ack it.

[Please check section (XI)B for details of architecture agnostic patches]


SECTIONS [I - XIII] are as follows :

(I) Key Changes (RFC V1 -> RFC V2)
    ==================================

    RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/

1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
   *online-capable* or *enabled* to the Guest OS at boot time. This means
   associated CPUs can have ACPI _STA as *enabled* or *disabled* even after
   boot. See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU Interface
   Flags [20].
2. SMCCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
   requests. This is required to {dis}allow online'ing a vCPU.
3. Unplugged vCPUs are always presented in the CPUs ACPI AML code as ACPI
   _STA.PRESENT to the Guest OS. ACPI _STA.Enabled is toggled to give the
   effect of hot(un)plug.
4. Live Migration works (some issues still remain).
5. TCG/HVF/qtest do not support Hotplug and fall back to the default behaviour.
6. Code for TCG support does exist in this release (it is a work-in-progress).
7. ACPI _OSC method can now be used by OSPM to negotiate the Qemu VM platform
   hotplug capability (_OSC Query support still pending).
8. Misc. bug fixes.

(II) Summary
     =======

This patch-set introduces virtual CPU hotplug support for the ARMv8 architecture
in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs while the guest
VM is running, without requiring a reboot. This does *not* make any assumption
about the availability of physical CPU hotplug within the host system but rather
tries to solve the problem at the virtualizer/QEMU layer. It introduces ACPI CPU
hotplug hooks and event handling to interface with the guest kernel, plus code
to initialize, plug and unplug CPUs. No changes are required within the host
kernel/KVM except the support of hypercall exit handling in the user-space/Qemu,
which has recently been added to the kernel. The corresponding Guest kernel
changes have been posted on the mailing-list [3] [4] by James Morse.

(III) Motivation
      ==========

This allows scaling the guest VM compute capacity on-demand which would be
useful for the following example scenarios,

1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
   framework which could adjust resource requests (CPU and Mem requests) for
   the containers in a pod, based on usage.
2. Pay-as-you-grow Business Model: Infrastructure provider could allocate and
   restrict the total number of compute resources available to the guest VM
   according to the SLA (Service Level Agreement). VM owner could request more
   compute to be hot-plugged for some cost.

For example, a Kata Container VM starts with a minimum amount of resources (i.e.
the hotplug-everything approach). Why?

1. Allowing faster *boot time* and
2. Reduction in *memory footprint*

Kata Container VM can boot with just 1 vCPU and then later more vCPUs can be
hot-plugged as per requirement.

(IV) Terminology
     ===========

(*) Possible CPUs:  Total vCPUs which could ever exist in the VM. This includes
                    any cold-booted CPUs plus any CPUs which could be later
                    hot-plugged.
                    - Qemu parameter (-smp maxcpus=N)
(*) Present CPUs:   Possible CPUs which are ACPI 'present'. These might or might
                    not be ACPI 'enabled'.
                    - Present vCPUs = Possible vCPUs (Always, on ARM Arch)
(*) Enabled CPUs:   Possible CPUs which are ACPI 'present' and 'enabled' and can
                    now be 'onlined' (PSCI) for use by the Guest Kernel. All
                    cold-booted vCPUs are ACPI 'enabled' at boot. Later, using
                    device_add, more vCPUs can be hotplugged and made ACPI
                    'enabled'.
                    - Qemu parameter (-smp cpus=N). Can be used to specify some
                      cold-booted vCPUs during VM init. Some can be added using
                      the '-device' option.

(V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
    ===============================================================

A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
   1. ARMv8 CPU architecture does not support the concept of physical CPU
      hotplug.
      a. There are many per-CPU components like PMU, SVE, MTE, Arch timers etc.
         whose behaviour needs to be clearly defined when a CPU is
         hot(un)plugged. There is no specification for this.

   2. Other ARM components like the GIC etc. have not been designed to realize
      physical CPU hotplug capability as of now. For example,
      a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
         The architecture does not specify what CPU hot(un)plug would mean in
         the context of any of these.
      b. CPUs/GICCs are physically connected to unique GICRs (GIC
         Redistributors). GIC Redistributors are always part of the always-on
         power domain and hence cannot be powered off, as per the specification.

B. Impediments in Firmware/ACPI (Architectural Constraint)

   1. Firmware has to expose the GICC, GICR and other per-CPU features like
      PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the architectural
      constraint stated in section A1(a) above, all interrupt controller
      structures of the MADT describing the GIC CPU Interfaces and the GIC
      Redistributors MUST be presented by firmware to the OSPM during boot
      time.
   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method
      to get this kind of information from the firmware even after boot, and
      the OSPM has the capability to process it. The ARM kernel uses the
      information in the MADT interrupt controller structures to identify the
      number of Present CPUs during boot and hence does not allow these to be
      changed after boot. The number of present CPUs cannot be changed. It is
      an architectural constraint!

C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)

   1. KVM VGIC:
       a. Sizing of various VGIC resources like memory regions etc. related to
          the redistributor happens only once, is fixed at VM init time, and
          cannot be changed after initialization has happened. KVM statically
          configures these resources based on the number of vCPUs and the
          number/size of redistributor ranges.
       b. The association between a vCPU and its VGIC redistributor is fixed at
          VM init time within KVM i.e. when the redistributor iodevs get
          registered. VGIC does not allow setting up or changing this
          association after VM initialization has happened. Physically, every
          CPU/GICC is uniquely connected with its redistributor, and there is
          no architectural way to set this up.
   2. KVM vCPUs:
       a. Lack of specification means destruction of KVM vCPUs does not exist,
          as there is no reference to tell what to do with other per-vCPU
          components like redistributors, arch timers etc.
       b. In fact, KVM does not implement destruction of vCPUs for any
          architecture. This is independent of whether the architecture
          actually supports the CPU Hotplug feature. For example, even for x86,
          KVM does not implement destruction of vCPUs.

D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)

   1. Qemu CPU Objects MUST be created to initialize all the Host KVM vCPUs to
      overcome the KVM constraint. KVM vCPUs are created and initialized when
      Qemu CPU Objects are realized. But keeping the QOM CPU objects realized
      for 'yet-to-be-plugged' vCPUs can create problems when these new vCPUs
      shall be plugged using device_add and a new QOM CPU object shall be
      created.
   2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
      during VM init time while the QOM GICV3 Object is realized. This is
      because KVM VGIC can only be initialized once, at init time. But every
      GICV3CPUState has an associated QOM CPU Object. The latter might
      correspond to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
   3. How should new QOM CPU objects be connected back to the GICV3CPUState
      objects and disconnected from them in case a CPU is being
      hot(un)plugged?
   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
      QOM for which a KVM vCPU already exists? For example, whether to keep,
       a. No QOM CPU objects Or
       b. Unrealized CPU Objects
   5. How should vCPU state be exposed via ACPI to the Guest? Especially for
      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
      within the QOM, while the Guest always expects all possible vCPUs to be
      identified as ACPI *present* during boot.
   6. How should Qemu expose GIC CPU interfaces for the unplugged or
      yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?

E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)

   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e. even
      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
      powered-off state.
   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
      VM init time i.e. when Qemu GIC is realized. This in turn sizes KVM VGIC
      resources like memory regions etc. related to the redistributors with the
      number of possible KVM vCPUs. This never changes after VM has initialized.
   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
      released post Host KVM CPU and GIC/VGIC initialization.
   5. Build ACPI MADT Table with below updates 
      a. Number of GIC CPU interface entries (=possible vCPUs)
      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable) 
      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1  
         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy) 
	 - Some issues with above (details in later sections)
   6. Expose below ACPI Status to Guest kernel
      a. Always _STA.Present=1 (all possible vCPUs)
      b. _STA.Enabled=1 (plugged vCPUs)
      c. _STA.Enabled=0 (unplugged vCPUs)
   7. vCPU hotplug *realizes* new QOM CPU object. Following happens,
      a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread
      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
         - Attaches to QOM CPU object.
      c. Reinitializes KVM vCPU in the Host
         - Resets the core and sys regs, sets defaults etc.
      d. Runs KVM vCPU (created with "start-powered-off")
	 - vCPU thread sleeps (waits for vCPU reset via PSCI) 
      e. Updates Qemu GIC
         - Wires back IRQs related to this vCPU.
         - GICV3CPUState association with QOM CPU Object.
      f. Updates [6] ACPI _STA.Enabled=1
      g. Notifies Guest about new vCPU (via ACPI GED interface)
	 - Guest checks _STA.Enabled=1
	 - Guest adds processor (registers CPU with LDM) [3]
      h. Plugs the QOM CPU object in the slot.
         - slot-number = cpu-index{socket,cluster,core,thread}
      i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC)
         - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-on KVM vCPU in the Host
   8. vCPU hot-unplug *unrealizes* QOM CPU Object. Following happens,
      a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event
         - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC) 
      b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-off the KVM vCPU in the Host
      c. Guest signals *Eject* of the vCPU to Qemu
      d. Qemu updates [6] ACPI _STA.Enabled=0
      e. Updates GIC
         - Un-wires IRQs related to this vCPU
         - GICV3CPUState association with new QOM CPU Object is updated.
      f. Unplugs the vCPU
	 - Removes from slot
         - Parks KVM vCPU ("kvm_parked_vcpus" list)
         - Unrealizes QOM CPU Object & joins back Qemu vCPU thread
	 - Destroys QOM CPU object 
      g. Guest checks ACPI _STA.Enabled=0
         - Removes processor (unregisters CPU with LDM) [3]
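The pre-create/park/unpark lifecycle described in steps 1-2, 7 and 8 above can
be sketched as a minimal, self-contained simulation. This is illustrative
Python, not the actual QEMU C code; names like `Vcpu`, `parked` and `realized`
are invented stand-ins for the KVM vCPU objects, the per-VM "kvm_parked_vcpus"
list and the realized QOM CPU objects:

```python
# Illustrative sketch of the vCPU parking approach above (not QEMU code).
class Vcpu:
    def __init__(self, vcpu_id):
        self.vcpu_id = vcpu_id
        self.powered_on = False

parked = {}     # stand-in for the per-VM "kvm_parked_vcpus" list
realized = {}   # stand-in for realized QOM CPU objects

def vm_init(possible, cold_booted):
    # Steps 1-2: pre-create ALL possible vCPUs in the host;
    # park the not-yet-plugged ones in the powered-off state.
    for i in range(possible):
        vcpu = Vcpu(i)
        if i < cold_booted:
            realized[i] = vcpu
            vcpu.powered_on = True
        else:
            parked[i] = vcpu

def hotplug(i):
    # Step 7: unpark the pre-created KVM vCPU and attach it to a new
    # QOM CPU object; the guest powers it on later via PSCI CPU_ON.
    realized[i] = parked.pop(i)

def hotunplug(i):
    # Step 8: power off, detach from QOM, and park the KVM vCPU again.
    vcpu = realized.pop(i)
    vcpu.powered_on = False
    parked[i] = vcpu

vm_init(possible=6, cold_booted=4)
hotplug(4)
hotunplug(4)
```

Note that the KVM vCPU itself is never destroyed; only the QOM object comes and
goes, which is the core workaround for KVM constraint C.2 above.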

F. Work Presented at KVM Forum Conferences:
   Details of above work has been presented at KVMForum2020 and KVMForum2023
   conferences. Slides are available at below links,
   a. KVMForum 2023
      - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64)
        https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
   b. KVMForum 2020
      - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei
        https://sched.co/eE4m

(VI) Commands Used
     =============

    A. Qemu launch commands to init the machine

    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
    -cpu host -smp cpus=4,maxcpus=6 \
    -m 300M \
    -kernel Image \
    -initrd rootfs.cpio.gz \
    -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
    -nographic \
    -bios QEMU_EFI.fd

    B. Hot-(un)plug related commands

    # Hotplug a host vCPU (accel=kvm)
    $ device_add host-arm-cpu,id=core4,core-id=4

    # Hotplug a vCPU (accel=tcg)
    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4

    # Delete the vCPU
    $ device_del core4

    Sample output on guest after boot:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-3
    $ cat /sys/devices/system/cpu/online
    0-1
    $ cat /sys/devices/system/cpu/offline
    2-5

    Sample output on guest after hotplug of vCPU=4:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-4
    $ cat /sys/devices/system/cpu/online
    0-1,4
    $ cat /sys/devices/system/cpu/offline
    2-3,5

    Note: vCPU=4 was explicitly 'onlined' after hot-plug
    $ echo 1 > /sys/devices/system/cpu/cpu4/online
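    The range lists in these sysfs files (e.g. "0-1,4") use the standard Linux
    CPU-list format. A small helper to expand them, handy when scripting checks
    of the hotplug state; this parser is a convenience sketch, not part of the
    patch-set:

```python
def parse_cpu_list(s):
    """Expand a Linux sysfs CPU range list like '0-1,4' into a sorted list."""
    cpus = []
    for part in s.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)

# Matches the post-hotplug sample above: online=0-1,4 and offline=2-3,5
assert parse_cpu_list("0-1,4") == [0, 1, 4]
assert parse_cpu_list("2-3,5") == [2, 3, 5]
assert parse_cpu_list("0-5") == [0, 1, 2, 3, 4, 5]
```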

(VII) Repository
      ==========

 (*) QEMU changes for vCPU hotplug could be cloned from below site,
     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
 (*) Guest Kernel changes (by James Morse, ARM) are available here:
     https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2


(VIII) KNOWN ISSUES
       ============

1. Migration has been lightly tested. Below are some of the known issues:
   - Occasional CPU stall (not always repeatable)
   - Negative test cases like an asymmetric source/destination VM config cause
     a crash dump.
   - Migration with TCG is not working properly.
2. TCG with single-threaded mode is broken.
3. HVF and qtest support is broken.
4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
   mutually exclusive i.e. as per the change [6] a vCPU cannot be both
   GICC.Enabled and GICC.online-capable. This means,
      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
   a. If we have to support hot-unplug of the cold-booted vCPUs then these MUST
      be specified as GICC.online-capable in the MADT Table during boot by the
      firmware/Qemu. But this requirement conflicts with the requirement to
      support the new Qemu changes with legacy OSes which don't understand the
      MADT.GICC.online-capable Bit. A legacy OS during boot time will ignore
      this bit and hence these vCPUs will not appear on such an OS. This is
      unexpected behaviour.
   b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
      these cold-booted vCPUs from the OS (which actually should be blocked by
      returning an error at Qemu), then features like 'kexec' will break.
   c. As I understand, removal of the cold-booted vCPUs is a required feature
      and the x86 world allows it.
   d. Hence, either we need a specification change to make the MADT.GICC.Enabled
      and MADT.GICC.online-capable Bits NOT mutually exclusive, or NOT support
      removal of cold-booted vCPUs. In the latter case, a check can be
      introduced to bar users from unplugging cold-booted vCPUs using QMP
      commands. (Needs discussion!)
      Please check the below patch, part of this patch-set:
          [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
5. Code related to the notification to GICV3 about hot(un)plug of a vCPU event
   includes virt.h in arm_gicv3_common.c, which is not correct. This needs a
   better way to notify the GIC about a CPU event, independent of
   VirtMachineState.
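The MADT.GICC flag policy at the heart of issue 4 can be sketched as below. The
bit positions follow the GICC CPU Interface Flags table in ACPI 6.5 [20]
(Enabled = bit 0, Online Capable = bit 3); the policy function is an
illustration of the approach taken in this series, not the actual code:

```python
# Sketch of the MADT.GICC flag policy discussed in known issue 4 above.
# Bit positions per ACPI 6.5, Table 5.37 (GICC CPU Interface Flags) [20].
GICC_ENABLED = 1 << 0
GICC_ONLINE_CAPABLE = 1 << 3   # mutually exclusive with GICC_ENABLED [6]

def gicc_flags(cold_booted):
    # Policy of this series: cold-booted vCPUs are Enabled (and therefore not
    # hot-unpluggable); hotpluggable vCPUs are online-capable only.
    return GICC_ENABLED if cold_booted else GICC_ONLINE_CAPABLE

# The two flags are never set together, per the spec change [6].
assert gicc_flags(True) & gicc_flags(False) == 0
```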


(IX) THINGS TO DO
     ============

1. Fix the Migration issues.
2. Fix issues related to TCG/Emulation support.
3. Comprehensive testing. Current testing is very basic.
   a. Negative test cases
4. Qemu documentation (.rst) needs to be updated.
5. Fix qtest, HVF support.
6. Fix the design issue related to the ACPI MADT.GICC flags discussed in known
   issues. This might require a UEFI ACPI specification change!
7. Add ACPI _OSC 'Query' support. Only part of the _OSC support exists now.

 The above is *not* a complete list. Will update later!

Best regards
Salil.

(X) DISCLAIMER
    ==========

This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
implementation to the community. This is *not* production-level code and might
have bugs. Only basic testing has been done on a HiSilicon Kunpeng920 SoC for
servers. Once the design and the core idea behind the implementation have been
verified, more effort can be put into hardening the code.

This work is *mostly* along the lines of the discussions which have happened
over the previous years [see refs below] across different channels like the
mailing-list, the Linaro Open Discussions platform, and various conferences
like KVMForum. This RFC is being used as a way to verify the idea mentioned in
this cover-letter and to get community views. Once this has been agreed upon, a
formal patch shall be posted on the mailing-list.

The concept being presented has been found to work!

(XI) ORGANIZATION OF PATCHES
     =======================
 
 A. All patches [Architecture 'agnostic' + 'specific']:

   [Patch 1-9, 23, 36] Logic required during machine init
    (*) Some validation checks
    (*) Introduces the core-id property and some util functions required later.
    (*) Refactors the parking logic of vCPUs
    (*) Logic to pre-create vCPUs
    (*) GIC initialization pre-sized with possible vCPUs.
    (*) Some refactoring to have common hot and cold plug logic together.
    (*) Release of disabled QOM CPU objects in post_cpu_init()
    (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities
   [Patch 10-22] Logic related to ACPI at machine init time
    (*) Changes required to enable ACPI for cpu hotplug
    (*) Initialization of the ACPI GED framework to cater for CPU Hotplug Events
    (*) Build ACPI AML related to the CPU control dev
    (*) ACPI MADT/MAT changes
   [Patch 24-35] Logic required during vCPU hot-(un)plug
    (*) Basic framework changes to support vCPU hot-(un)plug
    (*) ACPI GED changes for hot-(un)plug hooks.
    (*) Wire-unwire the IRQs
    (*) GIC notification logic
    (*) ARMCPU unrealize logic
    (*) Handling of SMCCC Hypercall Exits by KVM to Qemu

 B. Architecture *agnostic* patches part of the patch-set:

   [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
    (*) Refactors the parking logic of vCPUs
    (*) Introduces ACPI GED Support for vCPU Hotplug Events
    (*) Introduces the ACPI AML change for the CPU Control Device

(XII) REFERENCES
      ==========

[1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
[2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
[3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
[4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
[5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
[6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
[7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
[8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
[10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
[11] https://lkml.org/lkml/2019/7/10/235
[12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
[13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
[14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
[15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
[16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
[17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
[18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
[19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/ 
[20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags

(XIII) ACKNOWLEDGEMENTS
       ================

I would like to take this opportunity to thank below people for various
discussions with me over different channels during the development:

Marc Zyngier (Google),              Catalin Marinas (ARM),
James Morse (ARM),                  Will Deacon (Google),
Jean-Philippe Brucker (Linaro),     Sudeep Holla (ARM),
Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
Jonathan Cameron (Huawei),          Darren Hart (Ampere),
Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
Andrew Jones (Redhat),              Karl Heubaum (Oracle),
Keqian Zhu (Huawei),                Miguel Luis (Oracle),
Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
Shameerali Kolothum (Huawei),       Russell King (Oracle),
Xuwei/Joy (Huawei),                 Peter Maydell (Linaro),
Zengtao/Prime (Huawei),             and all those whom I have missed!

Many thanks to below people for their current or past contributions:

1. James Morse (ARM)
   (Current Kernel part of vCPU Hotplug Support on AARCH64)
2. Jean-Philippe Brucker (Linaro)
   (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
3. Keqian Zhu (Huawei)
   (Co-developed Qemu prototype)
4. Xiongfeng Wang (Huawei)
   (Co-developed earlier kernel prototype)
5. Vishnu Pajjuri (Ampere)
   (Verification on Ampere ARM64 Platforms + fixes)
6. Miguel Luis (Oracle)
   (Verification on Oracle ARM64 Platforms + fixes)


Author Salil Mehta (1):
  target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu

Jean-Philippe Brucker (2):
  hw/acpi: Make _MAT method optional
  target/arm/kvm: Write CPU state back to KVM on reset

Miguel Luis (1):
  tcg/mttcg: enable threads to unregister in tcg_ctxs[]

Salil Mehta (33):
  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  cpus-common: Add common CPU utility for possible vCPUs
  hw/arm/virt: Move setting of common CPU properties in a function
  arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  accel/kvm: Extract common KVM vCPU {creation,parking} code
  arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
  arm/virt: Init PMU at host for all possible vcpus
  hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  arm/acpi: Enable ACPI support for vcpu hotplug
  hw/acpi: Add ACPI CPU hotplug init stub
  hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
  hw/acpi: Init GED framework with cpu hotplug events
  arm/virt: Add cpu hotplug events to GED during creation
  arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
  hw/acpi: Update GED _EVT method AML with cpu scan
  hw/arm: MADT Tbl change to size the guest with possible vCPUs
  arm/virt: Release objects for *disabled* possible vCPUs after init
  hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  arm/virt: Add/update basic hot-(un)plug framework
  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
  hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
  hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
  arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  hw/arm: Changes required for reset and to support next boot
  physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
  target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  hw/arm: Support hotplug capability check using _OSC method
  hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled

 accel/kvm/kvm-all.c                    |  61 +-
 accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
 cpus-common.c                          |  37 ++
 gdbstub/gdbstub.c                      |  13 +
 hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
 hw/acpi/cpu.c                          |  91 ++-
 hw/acpi/generic_event_device.c         |  33 +
 hw/arm/Kconfig                         |   1 +
 hw/arm/boot.c                          |   2 +-
 hw/arm/virt-acpi-build.c               | 110 +++-
 hw/arm/virt.c                          | 863 ++++++++++++++++++++-----
 hw/core/gpio.c                         |   2 +-
 hw/i386/acpi-build.c                   |   2 +-
 hw/intc/arm_gicv3.c                    |   1 +
 hw/intc/arm_gicv3_common.c             |  66 +-
 hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
 hw/intc/arm_gicv3_cpuif_common.c       |   5 +
 hw/intc/arm_gicv3_kvm.c                |  39 +-
 hw/intc/gicv3_internal.h               |   2 +
 include/exec/cpu-common.h              |   8 +
 include/exec/gdbstub.h                 |   1 +
 include/hw/acpi/cpu.h                  |   7 +-
 include/hw/acpi/cpu_hotplug.h          |   4 +
 include/hw/acpi/generic_event_device.h |   5 +
 include/hw/arm/boot.h                  |   2 +
 include/hw/arm/virt.h                  |  10 +-
 include/hw/core/cpu.h                  |  77 +++
 include/hw/intc/arm_gicv3_common.h     |  23 +
 include/hw/qdev-core.h                 |   2 +
 include/sysemu/kvm.h                   |   2 +
 include/tcg/tcg.h                      |   1 +
 softmmu/physmem.c                      |  25 +
 target/arm/arm-powerctl.c              |  51 +-
 target/arm/cpu-qom.h                   |   3 +
 target/arm/cpu.c                       | 112 ++++
 target/arm/cpu.h                       |  17 +
 target/arm/cpu64.c                     |  15 +
 target/arm/gdbstub.c                   |   6 +
 target/arm/helper.c                    |  27 +-
 target/arm/internals.h                 |  12 +-
 target/arm/kvm.c                       |  93 ++-
 target/arm/kvm64.c                     |  59 +-
 target/arm/kvm_arm.h                   |  24 +
 target/arm/meson.build                 |   1 +
 target/arm/{tcg => }/psci.c            |   8 +
 target/arm/tcg/meson.build             |   4 -
 tcg/tcg.c                              |  23 +
 47 files changed, 1873 insertions(+), 349 deletions(-)
 rename target/arm/{tcg => }/psci.c (97%)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 153+ messages in thread

* RE: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
  2023-09-25 17:11 Salil Mehta via
@ 2023-09-25 17:17 ` Salil Mehta via
  0 siblings, 0 replies; 153+ messages in thread
From: Salil Mehta via @ 2023-09-25 17:17 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: maz, james.morse, jean-philippe, Jonathan Cameron, peter.maydell,
	richard.henderson, imammedo, drjones, andrew.jones, david,
	philmd, eric.auger, will, catalin.marinas, ardb, justin.he,
	oliver.upton, pbonzini, mst, gshan, rafael, borntraeger,
	alex.bennee, linux, vishnu, miguel.luis, sudeep.holla,
	salil.mehta, zhukeqian, wangxiongfeng (C), wangyanan (Y),
	jiakernel2

Hello,
Please ignore this series. Forgot to add RFC V2 Tag in all of the patches.
Will send again shortly. Sorry for inconvenience.


Thanks
Salil.

> From: Salil Mehta <salil.mehta@huawei.com>
> Sent: Monday, September 25, 2023 6:12 PM
> To: qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Cc: Salil Mehta <salil.mehta@huawei.com>; maz@kernel.org;
> james.morse@arm.com; jean-philippe@linaro.org; Jonathan Cameron
> <jonathan.cameron@huawei.com>; peter.maydell@linaro.org;
> richard.henderson@linaro.org; imammedo@redhat.com; drjones@redhat.com;
> andrew.jones@linux.dev; david@redhat.com; philmd@linaro.org;
> eric.auger@redhat.com; will@kernel.org; catalin.marinas@arm.com;
> ardb@kernel.org; justin.he@arm.com; oliver.upton@linux.dev;
> pbonzini@redhat.com; mst@redhat.com; gshan@redhat.com; rafael@kernel.org;
> borntraeger@linux.ibm.com; alex.bennee@linaro.org; linux@armlinux.org.uk;
> vishnu@os.amperecomputing.com; miguel.luis@oracle.com;
> sudeep.holla@arm.com; salil.mehta@opnsrc.net; zhukeqian
> <zhukeqian1@huawei.com>; wangxiongfeng (C) <wangxiongfeng2@huawei.com>;
> wangyanan (Y) <wangyanan55@huawei.com>; jiakernel2@gmail.com
> Subject: [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
> 
> PROLOGUE
> ========
> 
> To assist in review and set the right expectations from this RFC, please
> first
> read below sections *APPENDED AT THE END* of this cover letter,
> 
> 1. Important *DISCLAIMER* [Section (X)]
> 2. Work presented at KVMForum Conference (slides available) [Section (V)F]
> 3. Organization of patches [Section (XI)]
> 4. References [Section (XII)]
> 5. Detailed TODO list of the leftover work or work-in-progress [Section
> (IX)]
> 
> NOTE: There has been interest shown by other organizations in adapting
> this series for their architectures. I am planning to split this RFC
> into architecture *agnostic* and *specific* patch-sets in subsequent
> releases. The ARM specific patch-set will continue as RFC V3, and the
> architecture agnostic patch-set will be floated without the RFC tag and
> can be consumed in this Qemu cycle if the MAINTAINERs ack it.
> 
> [Please check section (XI)B for details of architecture agnostic patches]
> 
> 
> SECTIONS [I - XIII] are as follows :
> 
> (I) Key Changes (RFC V1 -> RFC V2)
>     ==================================
> 
>     RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> 
> 1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
>    *online-capable* or *enabled* to the Guest OS at boot time. This means
>    associated CPUs can have ACPI _STA as *enabled* or *disabled* even
>    after boot. See UEFI ACPI 6.5 Spec, Section 05, Table 5.37 GICC CPU
>    Interface Flags [20].
> 2. SMCCC/HVC hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
>    requests. This is required to {dis}allow online'ing a vCPU.
> 3. Always presenting unplugged vCPUs in the CPUs' ACPI AML code as ACPI
>    _STA.PRESENT to the Guest OS. Toggling ACPI _STA.Enabled gives the
>    effect of hot-(un)plug.
> 4. Live Migration works (some issues are still there)
> 5. TCG/HVF/qtest do not support hotplug and fall back to the default.
> 6. Code for TCG support does exist in this release (it is a
>    work-in-progress).
> 7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
>    hotplug capability (_OSC Query support still pending)
> 8. Misc. Bug fixes
> 
> (II) Summary
>      =======
> 
> This patch-set introduces virtual CPU hotplug support for the ARMv8
> architecture in QEMU. The idea is to be able to hotplug and hot-unplug
> vCPUs while the guest VM is running, without requiring a reboot. This
> does *not* make any assumptions about physical CPU hotplug availability
> within the host system, but rather tries to solve the problem at the
> virtualizer/QEMU layer. It introduces ACPI CPU hotplug hooks and event
> handling to interface with the guest kernel, and code to initialize,
> plug and unplug CPUs. No changes are required within the host kernel/KVM
> except support for hypercall exit handling in userspace/Qemu, which has
> recently been added to the kernel. The corresponding guest kernel
> changes have been posted on the mailing-list [3] [4] by James Morse.
> 
> (III) Motivation
>       ==========
> 
> This allows scaling the guest VM compute capacity on demand, which would
> be useful in the following example scenarios:
> 
> 1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the
>    orchestration framework which could adjust resource requests (CPU and
>    memory requests) for the containers in a pod, based on usage.
> 2. Pay-as-you-grow Business Model: An infrastructure provider could
>    allocate and restrict the total number of compute resources available
>    to the guest VM according to the SLA (Service Level Agreement). The VM
>    owner could request more compute to be hot-plugged for some cost.
> 
> For example, a Kata Container VM starts with a minimum amount of
> resources (i.e. the hotplug-everything approach). Why?
> 
> 1. It allows faster *boot time*, and
> 2. Reduction in *memory footprint*.
> 
> A Kata Container VM can boot with just 1 vCPU, and more vCPUs can be
> hot-plugged later as per requirement.
> 
> (IV) Terminology
>      ===========
> 
> (*) Possible CPUs: Total vCPUs which could ever exist in the VM. This
>                    includes any cold-booted CPUs plus any CPUs which
>                    could be hot-plugged later.
>                    - Qemu parameter (-smp maxcpus=N)
> (*) Present CPUs:  Possible CPUs which are ACPI 'present'. These might
>                    or might not be ACPI 'enabled'.
>                    - Present vCPUs = Possible vCPUs (always, on ARM Arch)
> (*) Enabled CPUs:  Possible CPUs which are ACPI 'present' and 'enabled'
>                    and can now be 'onlined' (PSCI) for use by the Guest
>                    Kernel. All cold-booted vCPUs are ACPI 'enabled' at
>                    boot. Later, using device_add, more vCPUs can be
>                    hot-plugged and made ACPI 'enabled'.
>                    - Qemu parameter (-smp cpus=N). Can be used to specify
>                      some cold-booted vCPUs during VM init. Some can be
>                      added using the '-device' option.
> 
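The terminology above forms a strict containment chain. As a purely illustrative sketch (the set names and the `hotplug` helper are ours, not QEMU code), the relationship between possible, present, enabled and onlined vCPUs on ARM can be modeled as:

```python
# Illustrative model of the vCPU state hierarchy described above.
# On ARM, per this design, present == possible; "enabled" and "online"
# are progressively smaller subsets. All names here are hypothetical.

possible = set(range(6))   # -smp maxcpus=6
present = set(possible)    # always equal to possible on ARM
enabled = set(range(4))    # -smp cpus=4 (cold-booted vCPUs)
online = {0, 1}            # guest has onlined only CPUs 0-1 (via PSCI)

# The invariant the design maintains (subset chain):
assert online <= enabled <= present <= possible

def hotplug(cpu: int) -> None:
    """device_add: a vCPU becomes ACPI 'enabled'; guest may online it later."""
    assert cpu in possible and cpu not in enabled
    enabled.add(cpu)

hotplug(4)
print(sorted(enabled))  # [0, 1, 2, 3, 4]
```

The key point the model captures is that on ARM the *present* set can never shrink or grow after boot; only the *enabled* and *online* subsets change.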
> (V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
>     ===============================================================
> 
> A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
>    1. The ARMv8 CPU architecture does not support the concept of physical
>       CPU hotplug.
>       a. There are many per-CPU components like PMU, SVE, MTE, Arch
>          timers etc. whose behaviour needs to be clearly defined when a
>          CPU is hot-(un)plugged. There is no specification for this.
> 
>    2. Other ARM components like the GIC etc. have not been designed to
>       realize physical CPU hotplug capability as of now. For example,
>       a. Every physical CPU has a unique GICC (GIC CPU Interface) by
>          construction. The architecture does not specify what CPU
>          hot-(un)plug would mean in the context of any of these.
>       b. CPUs/GICCs are physically connected to unique GICRs (GIC
>          Redistributors). GIC Redistributors are always part of the
>          always-on power domain and hence cannot be powered off as per
>          the specification.
> 
> B. Impediments in Firmware/ACPI (Architectural Constraint)
> 
>    1. Firmware has to expose the GICC, GICR and other per-CPU features
>       like PMU, SVE, MTE, Arch Timers etc. to the OS. Due to the
>       architectural constraint stated in section A1(a) above, all
>       interrupt controller structures of the MADT describing the GIC CPU
>       Interfaces and the GIC Redistributors MUST be presented by the
>       firmware to the OSPM during boot time.
>    2. Architectures that support CPU hotplug can evaluate the ACPI _MAT
>       method to get this kind of information from the firmware even
>       after boot, and the OSPM has the capability to process it. The ARM
>       kernel uses information in the MADT interrupt controller
>       structures to identify the number of Present CPUs during boot and
>       hence does not allow these to be changed after boot. The number of
>       present CPUs cannot be changed: it is an architectural constraint!
> 
> C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural
>    Constraint)
> 
>    1. KVM VGIC:
>        a. Sizing of various VGIC resources, like memory regions etc.
>           related to the redistributor, happens only once and is fixed
>           at VM init time; it cannot be changed after initialization has
>           happened. KVM statically configures these resources based on
>           the number of vCPUs and the number/size of redistributor
>           ranges.
>        b. The association between a vCPU and its VGIC redistributor is
>           fixed at VM init time within KVM, i.e. when the redistributor
>           iodevs get registered. The VGIC does not allow this
>           association to be set up or changed after VM initialization
>           has happened. Physically, every CPU/GICC is uniquely connected
>           with its redistributor, and there is no architectural way to
>           set this up.
>    2. KVM vCPUs:
>        a. The lack of a specification means destruction of KVM vCPUs
>           does not exist, as there is no reference to tell what to do
>           with other per-vCPU components like redistributors, arch
>           timers etc.
>        b. In fact, KVM does not implement destruction of vCPUs for any
>           architecture, independent of whether the architecture actually
>           supports the CPU hotplug feature. For example, even for x86,
>           KVM does not implement destruction of vCPUs.
> 
> D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM
>    Constraints->Arch)
> 
>    1. Qemu CPU Objects MUST be created to initialize all the Host KVM
>       vCPUs, to overcome the KVM constraint. KVM vCPUs are created and
>       initialized when Qemu CPU Objects are realized. But keeping the
>       QOM CPU objects realized for 'yet-to-be-plugged' vCPUs can create
>       problems when these new vCPUs shall be plugged using device_add
>       and a new QOM CPU object shall be created.
>    2. GICV3State and GICV3CPUState objects MUST be sized over *possible
>       vCPUs* at VM init time, while the QOM GICV3 Object is realized.
>       This is because the KVM VGIC can only be initialized once, during
>       init time. But every GICV3CPUState has an associated QOM CPU
>       Object; the latter might correspond to vCPUs which are
>       'yet-to-be-plugged' (unplugged at init).
>    3. How should new QOM CPU objects be connected back to the
>       GICV3CPUState objects and disconnected from them when a CPU is
>       being hot-(un)plugged?
>    4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented
>       in the QOM, for which a KVM vCPU already exists? For example,
>       whether to keep,
>        a. No QOM CPU objects, or
>        b. Unrealized CPU Objects
>    5. How should vCPU state be exposed via ACPI to the Guest, especially
>       for the unplugged/yet-to-be-plugged vCPUs whose CPU objects might
>       not exist within the QOM, while the Guest always expects all
>       possible vCPUs to be identified as ACPI *present* during boot?
>    6. How should Qemu expose GIC CPU interfaces for the unplugged or
>       yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?
> 
> E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)
> 
>    1. At VM init, pre-create all the possible vCPUs in the Host KVM,
>       i.e. even the vCPUs which are yet-to-be-plugged in Qemu, but keep
>       them in the powered-off state.
>    2. After the KVM vCPUs have been initialized in the Host, the KVM
>       vCPU objects corresponding to the unplugged/yet-to-be-plugged
>       vCPUs are parked at the existing per-VM "kvm_parked_vcpus" list in
>       Qemu. (similar to x86)
>    3. GICV3State and GICV3CPUState objects are sized over possible vCPUs
>       at VM init time, i.e. when the Qemu GIC is realized. This in turn
>       sizes KVM VGIC resources, like memory regions etc. related to the
>       redistributors, with the number of possible KVM vCPUs. This never
>       changes after the VM has initialized.
>    4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged
>       vCPUs are released after Host KVM vCPU and GIC/VGIC
>       initialization.
>    5. Build ACPI MADT Table with below updates
>       a. Number of GIC CPU interface entries (=possible vCPUs)
>       b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable)
>       c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1
>          - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
> 	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware
> Policy)
> 	 - Some issues with above (details in later sections)
>    6. Expose below ACPI Status to Guest kernel
>       a. Always _STA.Present=1 (all possible vCPUs)
>       b. _STA.Enabled=1 (plugged vCPUs)
>       c. _STA.Enabled=0 (unplugged vCPUs)
>    7. vCPU hotplug *realizes* new QOM CPU object. Following happens,
>       a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread
>       b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
>          - Attaches to QOM CPU object.
>       c. Reinitializes KVM vCPU in the Host
>          - Resets the core and sys regs, sets defaults etc.
>       d. Runs KVM vCPU (created with "start-powered-off")
> 	 - vCPU thread sleeps (waits for vCPU reset via PSCI)
>       e. Updates Qemu GIC
>          - Wires back IRQs related to this vCPU.
>          - GICV3CPUState association with QOM CPU Object.
>       f. Updates [6] ACPI _STA.Enabled=1
>       g. Notifies Guest about new vCPU (via ACPI GED interface)
> 	 - Guest checks _STA.Enabled=1
> 	 - Guest adds processor (registers CPU with LDM) [3]
>       h. Plugs the QOM CPU object in the slot.
>          - slot-number = cpu-index{socket,cluster,core,thread}
>       i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC)
>          - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>          - Qemu powers-on KVM vCPU in the Host
>    8. vCPU hot-unplug *unrealizes* QOM CPU Object. Following happens,
>       a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event
>          - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC)
>       b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
>          - Qemu powers-off the KVM vCPU in the Host
>       c. Guest signals *Eject* vCPU to Qemu
>       d. Qemu updates [6] ACPI _STA.Enabled=0
>       e. Updates GIC
>          - Un-wires IRQs related to this vCPU
>          - GICV3CPUState association with new QOM CPU Object is updated.
>       f. Unplugs the vCPU
> 	 - Removes from slot
>          - Parks KVM vCPU ("kvm_parked_vcpus" list)
>          - Unrealizes QOM CPU Object & joins back Qemu vCPU thread
> 	 - Destroys QOM CPU object
>       g. Guest checks ACPI _STA.Enabled=0
>          - Removes processor (unregisters CPU with LDM) [3]
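The park/unpark idea at the heart of steps 1, 2 and 7(b) above can be sketched abstractly. This is a hedged illustration only: the class, function names and structure are hypothetical and do not reflect actual QEMU internals; only the "kvm_parked_vcpus" list name comes from the cover letter.

```python
# Abstract sketch of the parked-vCPU approach: all possible KVM vCPUs are
# created once at VM init; the not-yet-plugged ones are parked, and
# device_add later reuses (unparks) them instead of creating new ones.
# Names are illustrative, not QEMU code.

class KVMvCPU:
    def __init__(self, vcpu_id: int):
        self.vcpu_id = vcpu_id
        self.powered_on = False

kvm_parked_vcpus: dict[int, KVMvCPU] = {}

def precreate_possible_vcpus(max_cpus: int, boot_cpus: int) -> list:
    """Steps 1-2: create all possible vCPUs; park the not-yet-plugged ones."""
    booted = []
    for i in range(max_cpus):
        vcpu = KVMvCPU(i)
        if i < boot_cpus:
            vcpu.powered_on = True  # cold-booted vCPU
            booted.append(vcpu)
        else:
            kvm_parked_vcpus[i] = vcpu  # parked, powered off
    return booted

def unpark(vcpu_id: int) -> KVMvCPU:
    """Step 7(b): device_add reuses the parked KVM vCPU rather than creating one."""
    return kvm_parked_vcpus.pop(vcpu_id)

booted = precreate_possible_vcpus(max_cpus=6, boot_cpus=4)
cpu4 = unpark(4)  # hotplug of vCPU 4
assert len(booted) == 4 and cpu4.vcpu_id == 4 and 4 not in kvm_parked_vcpus
```

The design choice this models is that the expensive, unchangeable work (KVM vCPU and VGIC sizing) is done once for the whole possible set, and hotplug only toggles which members of that set are attached to live QOM objects.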
> 
> F. Work Presented at KVM Forum Conferences:
>    Details of the above work have been presented at the KVMForum2020 and
>    KVMForum2023 conferences. Slides are available at the links below,
>    a. KVMForum 2023
>       - Challenges Revisited in Supporting Virt CPU Hotplug on
>         architectures that don't Support CPU Hotplug (like ARM64)
>         https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
>    b. KVMForum 2020
>       - Challenges in Supporting Virtual CPU Hotplug on SoC Based
>         Systems (like ARM64) - Salil Mehta, Huawei
>         https://sched.co/eE4m
> 
> (VI) Commands Used
>      =============
> 
>     A. Qemu launch commands to init the machine
> 
>     $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
>     -cpu host -smp cpus=4,maxcpus=6 \
>     -m 300M \
>     -kernel Image \
>     -initrd rootfs.cpio.gz \
>     -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
>     -nographic \
>     -bios  QEMU_EFI.fd \
> 
>     B. Hot-(un)plug related commands
> 
>     # Hotplug a host vCPU(accel=kvm)
>     $ device_add host-arm-cpu,id=core4,core-id=4
> 
>     # Hotplug a vCPU(accel=tcg)
>     $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> 
>     # Delete the vCPU
>     $ device_del core4
> 
>     Sample output on guest after boot:
> 
>     $ cat /sys/devices/system/cpu/possible
>     0-5
>     $ cat /sys/devices/system/cpu/present
>     0-5
>     $ cat /sys/devices/system/cpu/enabled
>     0-3
>     $ cat /sys/devices/system/cpu/online
>     0-1
>     $ cat /sys/devices/system/cpu/offline
>     2-5
> 
>     Sample output on guest after hotplug of vCPU=4:
> 
>     $ cat /sys/devices/system/cpu/possible
>     0-5
>     $ cat /sys/devices/system/cpu/present
>     0-5
>     $ cat /sys/devices/system/cpu/enabled
>     0-4
>     $ cat /sys/devices/system/cpu/online
>     0-1,4
>     $ cat /sys/devices/system/cpu/offline
>     2-3,5
> 
>     Note: vCPU=4 was explicitly 'onlined' after hot-plug
>     $ echo 1 > /sys/devices/system/cpu/cpu4/online
> 
> (VII) Repository
>       ==========
> 
>  (*) QEMU changes for vCPU hotplug could be cloned from below site,
>      https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
>  (*) Guest Kernel changes (by James Morse, ARM) are available here:
>      https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
>      virtual_cpu_hotplug/rfc/v2
> 
> 
> (VIII) KNOWN ISSUES
>        ============
> 
> 1. Migration has been lightly tested. Below are some of the known issues:
>    - Occasional CPU stall (not always repeatable)
>    - Negative test cases, like an asymmetric source/destination VM
>      config, cause a dump.
>    - Migration with TCG is not working properly.
> 2. TCG with single-threaded mode is broken.
> 3. HVF and qtest support is broken.
> 4. The ACPI MADT Table flags [7] MADT.GICC.Enabled and
>    MADT.GICC.online-capable are mutually exclusive, i.e. as per the
>    change [6] a vCPU cannot be both GICC.Enabled and
>    GICC.online-capable. This means,
>       [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
>    a. If we have to support hot-unplug of the cold-booted vCPUs then
>       these MUST be specified as GICC.online-capable in the MADT Table
>       during boot by the firmware/Qemu. But this requirement conflicts
>       with the requirement to support the new Qemu changes with legacy
>       OSes which don't understand the MADT.GICC.online-capable bit. A
>       legacy OS will ignore this bit during boot and hence these vCPUs
>       will not appear on such an OS. This is unexpected behaviour.
>    b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to
>       unplug these cold-booted vCPUs from the OS (which in fact should
>       be blocked by returning an error at Qemu), then features like
>       'kexec' will break.
>    c. As I understand, removal of the cold-booted vCPUs is a required
>       feature and the x86 world allows it.
>    d. Hence, either we need a specification change to make the
>       MADT.GICC.Enabled and MADT.GICC.online-capable bits NOT mutually
>       exclusive, or NOT support removal of cold-booted vCPUs. In the
>       latter case, a check can be introduced to bar users from
>       unplugging vCPUs which were cold-booted, using QMP commands.
>       (Needs discussion!)
>       Please check the below patch, part of this patch-set:
>           [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
> 5. Code related to the notification to GICV3 about the hot-(un)plug of a
>    vCPU event includes virt.h in arm_gicv3_common.c, which is not
>    correct. A better way is needed to notify the GIC about a CPU event,
>    independent of VirtMachineState.
> 
> 
> (IX) THINGS TO DO
>      ============
> 
> 1. Fix the Migration Issues
> 2. Fix issues related to TCG/Emulation support.
> 3. Comprehensive Testing. Current testing is very basic.
>    a. Negative Test cases
> 4. Qemu Documentation (.rst) needs to be updated.
> 5. Fix qtest, HVF Support
> 6. Fix the design issue related to ACPI MADT.GICC flags discussed in known
>    issues. This might require UEFI ACPI specification change!
> 7. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now.
> 
>  Above is *not* a complete list. Will update later!
> 
> Best regards
> Salil.
> 
> (X) DISCLAIMER
>     ==========
> 
> This work is an attempt to present a proof-of-concept of the ARM64 vCPU
> hotplug implementation to the community. This is *not* production level
> code and might have bugs. Only basic testing has been done on a
> HiSilicon Kunpeng920 SoC for servers. Once the design and the core idea
> behind the implementation have been verified, more effort can be put
> into hardening the code.
> 
> This work is *mostly* in line with the discussions which have happened
> over the previous years [see refs below] across different channels like
> the mailing-list, the Linaro Open Discussions platform, and various
> conferences like KVMForum. This RFC is being used as a way to verify the
> idea mentioned in this cover-letter and to get community views. Once
> this has been agreed upon, a formal patch shall be posted on the
> mailing-list.
> 
> The concept being presented has been found to work!
> 
> (XI) ORGANIZATION OF PATCHES
>      =======================
> 
>  A. All patches [Architecture 'agnostic' + 'specific']:
> 
>    [Patch 1-9, 23, 36] Logic required during machine init
>     (*) Some validation checks
>     (*) Introduces core-id property and some util functions required later.
>     (*) Refactors parking logic of vCPUs
>     (*) Logic to pre-create vCPUs
>     (*) GIC initialization pre-sized with possible vCPUs.
>     (*) Some refactoring to have common hot and cold plug logic together.
>     (*) Release of disabled QOM CPU objects in post_cpu_init()
>     (*) Support of ACPI _OSC method to negotiate platform hotplug
>         capabilities
>    [Patch 10-22] Logic related to ACPI at machine init time
>     (*) Changes required to enable ACPI for CPU hotplug
>     (*) Initialization of ACPI GED framework to cater for CPU Hotplug Events
>     (*) Build ACPI AML related to CPU control dev
>     (*) ACPI MADT/MAT changes
>    [Patch 24-35] Logic required during vCPU hot-(un)plug
>     (*) Basic framework changes to support vCPU hot-(un)plug
>     (*) ACPI GED changes for hot-(un)plug hooks.
>     (*) Wire-unwire the IRQs
>     (*) GIC notification logic
>     (*) ARMCPU unrealize logic
>     (*) Handling of SMCCC Hypercall Exits by KVM to Qemu
> 
>  B. Architecture *agnostic* patches part of patch-set:
> 
>    [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug
>     (*) Refactors Parking logic of vCPUs
>     (*) Introduces ACPI GED Support for vCPU Hotplug Events
>     (*) Introduces ACPI AML change for CPU Control Device
> 
> (XII) REFERENCES
>       ==========
> 
> [1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
> [3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
> [4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
> [6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
> [7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
> [8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
> [9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
> [10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
> [11] https://lkml.org/lkml/2019/7/10/235
> [12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
> [13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
> [14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
> [15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
> [16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
> [17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
> [18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
> [19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
> [20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
> 
> (XIII) ACKNOWLEDGEMENTS
>        ================
> 
> I would like to take this opportunity to thank below people for various
> discussions with me over different channels during the development:
> 
> Marc Zyngier (Google)               Catalin Marinas (ARM),
> James Morse(ARM),                   Will Deacon (Google),
> Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
> Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat),
> Jonathan Cameron (Huawei),          Darren Hart (Ampere),
> Igor Mamedov (Redhat),              Ilkka Koskinen (Ampere),
> Andrew Jones (Redhat),              Karl Heubaum (Oracle),
> Keqian Zhu (Huawei),                Miguel Luis (Oracle),
> Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
> Shameerali Kolothum (Huawei)        Russell King (Oracle)
> Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
> Zengtao/Prime (Huawei),             And all those whom I have missed!
> 
> Many thanks to below people for their current or past contributions:
> 
> 1. James Morse (ARM)
>    (Current Kernel part of vCPU Hotplug Support on AARCH64)
> 2. Jean-Philippe Brucker (Linaro)
>    (Prototyped one of the earlier PSCI based POCs [17][18] based on RFC V1)
> 3. Keqian Zhu (Huawei)
>    (Co-developed Qemu prototype)
> 4. Xiongfeng Wang (Huawei)
>    (Co-developed earlier kernel prototype)
> 5. Vishnu Pajjuri (Ampere)
>    (Verification on Ampere ARM64 Platforms + fixes)
> 6. Miguel Luis (Oracle)
>    (Verification on Oracle ARM64 Platforms + fixes)
> 
> 
> Author Salil Mehta (1):
>   target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu
> 
> Jean-Philippe Brucker (2):
>   hw/acpi: Make _MAT method optional
>   target/arm/kvm: Write CPU state back to KVM on reset
> 
> Miguel Luis (1):
>   tcg/mttcg: enable threads to unregister in tcg_ctxs[]
> 
> Salil Mehta (33):
>   arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
>   cpus-common: Add common CPU utility for possible vCPUs
>   hw/arm/virt: Move setting of common CPU properties in a function
>   arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
>   accel/kvm: Extract common KVM vCPU {creation,parking} code
>   arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
>   arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
>   arm/virt: Init PMU at host for all possible vcpus
>   hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
>   arm/acpi: Enable ACPI support for vcpu hotplug
>   hw/acpi: Add ACPI CPU hotplug init stub
>   hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
>   hw/acpi: Init GED framework with cpu hotplug events
>   arm/virt: Add cpu hotplug events to GED during creation
>   arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
>   hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
>   arm/virt/acpi: Build CPUs AML with CPU Hotplug support
>   arm/virt: Make ARM vCPU *present* status ACPI *persistent*
>   hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
>   hw/acpi: Update GED _EVT method AML with cpu scan
>   hw/arm: MADT Tbl change to size the guest with possible vCPUs
>   arm/virt: Release objects for *disabled* possible vCPUs after init
>   hw/acpi: Update ACPI GED framework to support vCPU Hotplug
>   arm/virt: Add/update basic hot-(un)plug framework
>   arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
>   hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
>   hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
>   arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
>   hw/arm: Changes required for reset and to support next boot
>   physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
>   target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
>   hw/arm: Support hotplug capability check using _OSC method
>   hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled
> 
>  accel/kvm/kvm-all.c                    |  61 +-
>  accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
>  cpus-common.c                          |  37 ++
>  gdbstub/gdbstub.c                      |  13 +
>  hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
>  hw/acpi/cpu.c                          |  91 ++-
>  hw/acpi/generic_event_device.c         |  33 +
>  hw/arm/Kconfig                         |   1 +
>  hw/arm/boot.c                          |   2 +-
>  hw/arm/virt-acpi-build.c               | 110 +++-
>  hw/arm/virt.c                          | 868 ++++++++++++++++++++-----
>  hw/core/gpio.c                         |   2 +-
>  hw/i386/acpi-build.c                   |   2 +-
>  hw/intc/arm_gicv3.c                    |   1 +
>  hw/intc/arm_gicv3_common.c             |  66 +-
>  hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
>  hw/intc/arm_gicv3_cpuif_common.c       |   5 +
>  hw/intc/arm_gicv3_kvm.c                |  39 +-
>  hw/intc/gicv3_internal.h               |   2 +
>  include/exec/cpu-common.h              |   8 +
>  include/exec/gdbstub.h                 |   1 +
>  include/hw/acpi/cpu.h                  |   7 +-
>  include/hw/acpi/cpu_hotplug.h          |   4 +
>  include/hw/acpi/generic_event_device.h |   5 +
>  include/hw/arm/boot.h                  |   2 +
>  include/hw/arm/virt.h                  |  10 +-
>  include/hw/core/cpu.h                  |  77 +++
>  include/hw/intc/arm_gicv3_common.h     |  23 +
>  include/hw/qdev-core.h                 |   2 +
>  include/sysemu/kvm.h                   |   2 +
>  include/tcg/tcg.h                      |   1 +
>  softmmu/physmem.c                      |  25 +
>  target/arm/arm-powerctl.c              |  51 +-
>  target/arm/cpu-qom.h                   |   3 +
>  target/arm/cpu.c                       | 112 ++++
>  target/arm/cpu.h                       |  17 +
>  target/arm/cpu64.c                     |  15 +
>  target/arm/gdbstub.c                   |   6 +
>  target/arm/helper.c                    |  27 +-
>  target/arm/internals.h                 |  12 +-
>  target/arm/kvm.c                       |  93 ++-
>  target/arm/kvm64.c                     |  59 +-
>  target/arm/kvm_arm.h                   |  24 +
>  target/arm/meson.build                 |   1 +
>  target/arm/{tcg => }/psci.c            |   8 +
>  target/arm/tcg/meson.build             |   4 -
>  tcg/tcg.c                              |  23 +
>  47 files changed, 1878 insertions(+), 349 deletions(-)
>  rename target/arm/{tcg => }/psci.c (97%)
> 
> --
> 2.34.1


^ permalink raw reply	[flat|nested] 153+ messages in thread

* [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch
@ 2023-09-25 17:11 Salil Mehta via
  2023-09-25 17:17 ` Salil Mehta via
  0 siblings, 1 reply; 153+ messages in thread
From: Salil Mehta via @ 2023-09-25 17:11 UTC (permalink / raw)
  To: qemu-devel, qemu-arm
  Cc: salil.mehta, maz, james.morse, jean-philippe, jonathan.cameron,
	peter.maydell, richard.henderson, imammedo, drjones,
	andrew.jones, david, philmd, eric.auger, will, catalin.marinas,
	ardb, justin.he, oliver.upton, pbonzini, mst, gshan, rafael,
	borntraeger, alex.bennee, linux, vishnu, miguel.luis,
	sudeep.holla, salil.mehta, zhukeqian1, wangxiongfeng2,
	wangyanan55, jiakernel2

PROLOGUE
========

To assist in review and to set the right expectations from this RFC, please
first read the below sections, *APPENDED AT THE END* of this cover letter:

1. Important *DISCLAIMER* [Section (X)]
2. Work presented at KVMForum Conference (slides available) [Section (V)F]
3. Organization of patches [Section (XI)]
4. References [Section (XII)]
5. Detailed TODO list of the leftover work or work-in-progress [Section (IX)]

NOTE: Interest has been shown by other organizations in adapting this series
for their architectures. I am planning to split this RFC into architecture
*agnostic* and *specific* patch-sets in subsequent releases. The ARM-specific
patch-set will continue as RFC V3, and the architecture-agnostic patch-set will
be floated without the RFC tag and can be consumed in this Qemu cycle if the
MAINTAINERs ack it.

[Please check section (XI)B for details of architecture agnostic patches]


SECTIONS [I - XIII] are as follows :

(I) Key Changes (RFC V1 -> RFC V2)
    ==================================

    RFC V1: https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/

1. ACPI MADT Table GIC CPU Interface can now be presented [6] as ACPI
   *online-capable* or *enabled* to the Guest OS at boot time. This means the
   associated CPUs can have their ACPI _STA as *enabled* or *disabled* even
   after boot. See UEFI ACPI 6.5 Spec, Section 5, Table 5.37, GICC CPU
   Interface Flags [20].
2. SMCCC/HVC Hypercall exit handling in userspace/Qemu for PSCI CPU_{ON,OFF}
   requests. This is required to {dis}allow onlining a vCPU.
3. Unplugged vCPUs are always presented in the CPUs' ACPI AML code as ACPI
   _STA.PRESENT to the Guest OS. ACPI _STA.Enabled is toggled to give the
   effect of hot(un)plug.
4. Live Migration works (some issues are still there)
5. TCG/HVF/qtest do not support hotplug and fall back to the default behaviour.
6. Code for TCG support does exist in this release (it is a work-in-progress)
7. ACPI _OSC method can now be used by OSPM to negotiate Qemu VM platform
   hotplug capability (_OSC Query support still pending)
8. Misc. Bug fixes

(II) Summary
     =======

This patch-set introduces virtual CPU hotplug support for the ARMv8
architecture in QEMU. The idea is to be able to hotplug and hot-unplug vCPUs
while the guest VM is running, without requiring a reboot. This does *not* make
any assumption about the availability of physical CPU hotplug within the host
system but rather tries to solve the problem at the virtualizer/QEMU layer. It
introduces ACPI CPU hotplug hooks and event handling to interface with the
guest kernel, plus code to initialize, plug and unplug CPUs. No changes are
required within the host kernel/KVM except the support for hypercall exit
handling in the user-space/Qemu, which has recently been added to the kernel.
The corresponding guest kernel changes have been posted on the mailing-list
[3] [4] by James Morse.

(III) Motivation
      ==========

This allows scaling the guest VM compute capacity on-demand, which would be
useful in the following example scenarios:

1. Vertical Pod Autoscaling [9][10] in the cloud: Part of the orchestration
   framework which could adjust resource requests (CPU and Mem requests) for
   the containers in a pod, based on usage.
2. Pay-as-you-grow Business Model: Infrastructure provider could allocate and
   restrict the total number of compute resources available to the guest VM
   according to the SLA (Service Level Agreement). VM owner could request for
   more compute to be hot-plugged for some cost.

For example, a Kata Container VM starts with a minimum amount of resources
(i.e. the hotplug-everything approach). Why?

1. Allowing faster *boot time* and
2. Reduction in *memory footprint*

A Kata Container VM can boot with just 1 vCPU, and more vCPUs can be
hot-plugged later as per requirement.

(IV) Terminology
     ===========

(*) Possible CPUs:  Total vCPUs which could ever exist in the VM. This includes
                    any cold-booted CPUs plus any CPUs which could be later
                    hot-plugged.
                    - Qemu parameter (-smp maxcpus=N)
(*) Present CPUs:   Possible CPUs which are ACPI 'present'. These might or might
                    not be ACPI 'enabled'.
                    - Present vCPUs = Possible vCPUs (Always, on ARM Arch)
(*) Enabled CPUs:   Possible CPUs which are ACPI 'present' and 'enabled' and can
                    now be 'onlined' (PSCI) for use by the Guest Kernel. All
                    cold-booted vCPUs are ACPI 'enabled' at boot. Later, using
                    device_add, more vCPUs can be hot-plugged and made ACPI
                    'enabled'.
                    - Qemu parameter (-smp cpus=N). Can be used to specify some
                      cold-booted vCPUs during VM init. Some can be added using
                      the '-device' option.
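
    The relationship between these vCPU sets can be sketched in C as below.
    This is illustrative only; the struct and function names are invented here
    and are not part of QEMU:

    ```c
    #include <assert.h>

    /* Illustrative only: models -smp cpus=N,maxcpus=M on the ARM virt machine. */
    typedef struct {
        int possible;  /* -smp maxcpus=M: all vCPUs that can ever exist      */
        int present;   /* ACPI 'present'; on ARM always equals possible      */
        int enabled;   /* ACPI 'enabled'; cold-booted plus hot-plugged vCPUs */
    } VMCpuCounts;

    static VMCpuCounts vm_cpu_counts(int smp_cpus, int max_cpus)
    {
        VMCpuCounts c = {
            .possible = max_cpus,
            .present  = max_cpus,   /* architectural constraint on ARM */
            .enabled  = smp_cpus,   /* grows/shrinks with hot(un)plug  */
        };
        return c;
    }
    ```

    For the launch command used later in this letter (-smp cpus=4,maxcpus=6),
    this would give 6 possible, 6 present and 4 enabled vCPUs at boot.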

(V) Constraints Due To ARMv8 CPU Architecture [+] Other Impediments
    ===============================================================

A. Physical Limitation to Support CPU Hotplug: (Architectural Constraint)
   1. ARMv8 CPU architecture does not support the concept of physical CPU
      hotplug.
      a. There are many per-CPU components like PMU, SVE, MTE, Arch timers etc.
         whose behaviour needs to be clearly defined when a CPU is
         hot(un)plugged. There is no specification for this.

   2. Other ARM components like GIC etc. have not been designed to realize
      physical CPU hotplug capability as of now. For example,
      a. Every physical CPU has a unique GICC (GIC CPU Interface) by construct.
         The architecture does not specify what CPU hot(un)plug would mean in
         the context of any of these.
      b. CPUs/GICC are physically connected to unique GICRs (GIC
         Redistributors). GIC Redistributors are always part of the always-on
         power domain and hence cannot be powered off, as per the
         specification.

B. Impediments in Firmware/ACPI (Architectural Constraint)

   1. Firmware has to expose GICC, GICR and other per-CPU features like PMU,
      SVE, MTE, Arch Timers etc. to the OS. Due to the architectural constraint
      stated in section A1(a) above, all interrupt controller structures of
      the MADT describing the GIC CPU Interfaces and the GIC Redistributors
      MUST be presented by the firmware to the OSPM during boot time.
   2. Architectures that support CPU hotplug can evaluate the ACPI _MAT method
      to get this kind of information from the firmware even after boot, and
      the OSPM has the capability to process it. The ARM kernel uses the
      information in the MADT interrupt controller structures to identify the
      number of Present CPUs during boot and hence does not allow these to be
      changed after boot. The number of present CPUs cannot be changed: it is
      an architectural constraint!
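
The GICC Flags mentioned above can be sketched as below. The bit positions are
taken from the ACPI 6.5 GICC Flags table referenced in [20]; the macro and
helper names are invented here for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* GICC structure Flags, per ACPI 6.5 Table 5.37 (bit positions as given
 * in the spec referenced in [20]). */
#define GICC_FLAG_ENABLED        (1u << 0)
#define GICC_FLAG_ONLINE_CAPABLE (1u << 3)

/* A vCPU is usable at boot only if Enabled. */
static bool gicc_usable_at_boot(uint32_t flags)
{
    return flags & GICC_FLAG_ENABLED;
}

/* A vCPU may be enabled after boot only if Online Capable; per [6] the
 * two flags are currently mutually exclusive. */
static bool gicc_hotpluggable(uint32_t flags)
{
    return (flags & GICC_FLAG_ONLINE_CAPABLE) &&
           !(flags & GICC_FLAG_ENABLED);
}
```

A legacy OS that predates the online-capable bit simply ignores it, which is
exactly the compatibility problem discussed in the KNOWN ISSUES section.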

C. Impediments in KVM to Support Virtual CPU Hotplug (Architectural Constraint)

   1. KVM VGIC:
       a. Sizing of various VGIC resources like memory regions etc. related to
          the redistributor happens only once and is fixed at the VM init time
          and cannot be changed later after initialization has happened.
          KVM statically configures these resources based on the number of vCPUs
          and the number/size of redistributor ranges.
       b. The association between a vCPU and its VGIC redistributor is fixed
          at VM init time within KVM i.e. when the redistributor iodevs get
          registered. VGIC does not allow this association to be set up or
          changed after VM initialization has happened. Physically, every
          CPU/GICC is uniquely connected with its redistributor, and there is
          no architectural way to change this.
   2. KVM vCPUs:
       a. The lack of a specification means destruction of KVM vCPUs is not
          defined, as there is no reference to tell what to do with other
          per-vCPU components like redistributors, arch timers etc.
       b. In fact, KVM does not implement destruction of vCPUs for any
          architecture, regardless of whether the architecture actually
          supports the CPU hotplug feature. For example, even for x86, KVM
          does not implement destruction of vCPUs.

D. Impediments in Qemu to Support Virtual CPU Hotplug (KVM Constraints->Arch)

   1. Qemu CPU Objects MUST be created to initialize all the Host KVM vCPUs to
      overcome the KVM constraint. KVM vCPUs are created and initialized when
      Qemu CPU Objects are realized. But keeping the QOM CPU objects realized
      for 'yet-to-be-plugged' vCPUs can create problems when these new vCPUs
      shall be plugged using device_add and a new QOM CPU object shall be
      created.
   2. GICV3State and GICV3CPUState objects MUST be sized over *possible vCPUs*
      at VM init time, when the QOM GICV3 Object is realized. This is because
      the KVM VGIC can only be initialized once, at init time. But every
      GICV3CPUState has an associated QOM CPU Object, and the latter might
      correspond to vCPUs which are 'yet-to-be-plugged' (unplugged at init).
   3. How should new QOM CPU objects be connected back to the GICV3CPUState
      objects and disconnected from it in case CPU is being hot(un)plugged?
   4. How should 'unplugged' or 'yet-to-be-plugged' vCPUs be represented in the
      QOM for which KVM vCPU already exists? For example, whether to keep,
       a. No QOM CPU objects Or
       b. Unrealized CPU Objects
   5. How should vCPU state be exposed via ACPI to the Guest? Especially for
      the unplugged/yet-to-be-plugged vCPUs whose CPU objects might not exist
      within the QOM but which the Guest always expects to be identified as
      ACPI *present* during boot.
   6. How should Qemu expose GIC CPU interfaces for the unplugged or
      yet-to-be-plugged vCPUs using the ACPI MADT Table to the Guest?
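
The sizing constraint in points 2-4 above can be sketched as follows. These
are illustrative shapes only; the real GICv3State/GICv3CPUState structures
(include/hw/intc/arm_gicv3_common.h) carry far more state, and the names below
are invented:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct CPUState CPUState;   /* QOM vCPU object; may not exist yet */

typedef struct {
    CPUState *cpu;   /* NULL while the corresponding vCPU is unplugged */
    /* ... redistributor registers, IRQ lines, etc. ... */
} SketchGICv3CPUState;

typedef struct {
    int num_cpu;               /* sized to *possible* vCPUs at init */
    SketchGICv3CPUState *cpu;  /* fixed array; never resized later  */
} SketchGICv3State;

/* Allocate per-CPU GIC state for all possible vCPUs once, at init. */
static SketchGICv3State *gic_create(int possible_cpus)
{
    SketchGICv3State *s = calloc(1, sizeof(*s));
    s->num_cpu = possible_cpus;
    s->cpu = calloc(possible_cpus, sizeof(*s->cpu));  /* cpu ptrs start NULL */
    return s;
}
```

Hot(un)plug then only flips the per-entry `cpu` pointer between NULL and a
realized QOM CPU object; the array itself is never resized after VM init.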

E. Summary of Approach ([+] Workarounds to problems in sections A, B, C & D)

   1. At VM Init, pre-create all the possible vCPUs in the Host KVM i.e. even
      for the vCPUs which are yet-to-be-plugged in Qemu but keep them in the
      powered-off state.
   2. After the KVM vCPUs have been initialized in the Host, the KVM vCPU
      objects corresponding to the unplugged/yet-to-be-plugged vCPUs are parked
      at the existing per-VM "kvm_parked_vcpus" list in Qemu. (similar to x86)
   3. GICV3State and GICV3CPUState objects are sized over possible vCPUs during
      VM init time i.e. when Qemu GIC is realized. This in turn sizes KVM VGIC
      resources like memory regions etc. related to the redistributors with the
      number of possible KVM vCPUs. This never changes after VM has initialized.
   4. Qemu CPU objects corresponding to unplugged/yet-to-be-plugged vCPUs are
      released post Host KVM CPU and GIC/VGIC initialization.
   5. Build ACPI MADT Table with below updates 
      a. Number of GIC CPU interface entries (=possible vCPUs)
      b. Present Boot vCPU as MADT.GICC.Enabled=1 (Not hot[un]pluggable) 
      c. Present hot(un)pluggable vCPUs as MADT.GICC.online-capable=1  
         - MADT.GICC.Enabled=0 (Mutually exclusive) [6][7]
	 - vCPU can be ACPI enabled+onlined after Guest boots (Firmware Policy) 
	 - Some issues with above (details in later sections)
   6. Expose below ACPI Status to Guest kernel
      a. Always _STA.Present=1 (all possible vCPUs)
      b. _STA.Enabled=1 (plugged vCPUs)
      c. _STA.Enabled=0 (unplugged vCPUs)
   7. vCPU hotplug *realizes* new QOM CPU object. Following happens,
      a. Realizes, initializes QOM CPU Object & spawns Qemu vCPU thread
      b. Unparks the existing KVM vCPU ("kvm_parked_vcpus" list)
         - Attaches to QOM CPU object.
      c. Reinitializes KVM vCPU in the Host
         - Resets the core and sys regs, sets defaults etc.
      d. Runs KVM vCPU (created with "start-powered-off")
	 - vCPU thread sleeps (waits for vCPU reset via PSCI) 
      e. Updates Qemu GIC
         - Wires back IRQs related to this vCPU.
         - GICV3CPUState association with QOM CPU Object.
      f. Updates [6] ACPI _STA.Enabled=1
      g. Notifies Guest about new vCPU (via ACPI GED interface)
	 - Guest checks _STA.Enabled=1
	 - Guest adds processor (registers CPU with LDM) [3]
      h. Plugs the QOM CPU object in the slot.
         - slot-number = cpu-index{socket,cluster,core,thread}
      i. Guest online's vCPU (CPU_ON PSCI call over HVC/SMC)
         - KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-on KVM vCPU in the Host
   8. vCPU hot-unplug *unrealizes* QOM CPU Object. Following happens,
      a. Notifies Guest (via ACPI GED interface) vCPU hot-unplug event
         - Guest offline's vCPU (CPU_OFF PSCI call over HVC/SMC) 
      b. KVM exits HVC/SMC Hypercall [5] to Qemu (Policy Check).
         - Qemu powers-off the KVM vCPU in the Host
      c. Guest signals *Eject* vCPU to Qemu
      d. Qemu updates [6] ACPI _STA.Enabled=0
      e. Updates GIC
         - Un-wires IRQs related to this vCPU
         - GICV3CPUState association with new QOM CPU Object is updated.
      f. Unplugs the vCPU
	 - Removes from slot
         - Parks KVM vCPU ("kvm_parked_vcpus" list)
         - Unrealizes QOM CPU Object & joins back Qemu vCPU thread
	 - Destroys QOM CPU object 
      g. Guest checks ACPI _STA.Enabled=0
         - Removes processor (unregisters CPU with LDM) [3]
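
The park/unpark step at the heart of the flows above (points 2, 7b and 8f) can
be sketched as a simple list keyed by vCPU id. This is a minimal stand-in for
the per-VM "kvm_parked_vcpus" list: since KVM never destroys vCPUs, the KVM
vCPU fd of an unplugged vCPU is kept parked and reused on the next hotplug.
The names below are invented for illustration:

```c
#include <assert.h>
#include <stdlib.h>

struct ParkedVcpu {
    unsigned long vcpu_id;
    int kvm_fd;                  /* fd returned by KVM_CREATE_VCPU */
    struct ParkedVcpu *next;
};

static struct ParkedVcpu *parked_head;

/* Called on hot-unplug: stash the KVM vCPU fd instead of destroying it. */
static void park_vcpu(unsigned long vcpu_id, int kvm_fd)
{
    struct ParkedVcpu *p = malloc(sizeof(*p));
    p->vcpu_id = vcpu_id;
    p->kvm_fd = kvm_fd;
    p->next = parked_head;
    parked_head = p;
}

/* Called on hotplug: return the parked fd for vcpu_id, or -1 if absent. */
static int unpark_vcpu(unsigned long vcpu_id)
{
    struct ParkedVcpu **pp = &parked_head;
    while (*pp) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct ParkedVcpu *p = *pp;
            int fd = p->kvm_fd;
            *pp = p->next;
            free(p);
            return fd;
        }
        pp = &(*pp)->next;
    }
    return -1;
}
```

The real list in QEMU's KVM accel code works similarly (x86 already uses it);
the refactoring patches in this series make that logic architecture-common.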

F. Work Presented at KVM Forum Conferences:
   Details of the above work have been presented at the KVMForum2020 and
   KVMForum2023 conferences. Slides are available at the links below,
   a. KVMForum 2023
      - Challenges Revisited in Supporting Virt CPU Hotplug on architectures that don't Support CPU Hotplug (like ARM64)
        https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
   b. KVMForum 2020
      - Challenges in Supporting Virtual CPU Hotplug on SoC Based Systems (like ARM64) - Salil Mehta, Huawei
        https://sched.co/eE4m

(VI) Commands Used
     =============

    A. Qemu launch commands to init the machine

    $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
    -cpu host -smp cpus=4,maxcpus=6 \
    -m 300M \
    -kernel Image \
    -initrd rootfs.cpio.gz \
    -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
    -nographic \
    -bios QEMU_EFI.fd

    B. Hot-(un)plug related commands

    # Hotplug a host vCPU(accel=kvm)
    $ device_add host-arm-cpu,id=core4,core-id=4

    # Hotplug a vCPU(accel=tcg)
    $ device_add cortex-a57-arm-cpu,id=core4,core-id=4

    # Delete the vCPU
    $ device_del core4

    Sample output on guest after boot:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-3
    $ cat /sys/devices/system/cpu/online
    0-1
    $ cat /sys/devices/system/cpu/offline
    2-5

    Sample output on guest after hotplug of vCPU=4:

    $ cat /sys/devices/system/cpu/possible
    0-5
    $ cat /sys/devices/system/cpu/present
    0-5
    $ cat /sys/devices/system/cpu/enabled
    0-4
    $ cat /sys/devices/system/cpu/online
    0-1,4
    $ cat /sys/devices/system/cpu/offline
    2-3,5

    Note: vCPU=4 was explicitly 'onlined' after hot-plug
    $ echo 1 > /sys/devices/system/cpu/cpu4/online

(VII) Repository
      ==========

 (*) QEMU changes for vCPU hotplug could be cloned from below site,
     https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
 (*) Guest Kernel changes (by James Morse, ARM) are available here:
     https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git virtual_cpu_hotplug/rfc/v2


(VIII) KNOWN ISSUES
       ============

1. Migration has been lightly tested. Below are some of the known issues:
   - Occasional CPU stall (not always repeatable)
   - Negative test case like asymmetric source/destination VM config causes dump.
   - Migration with TCG is not working properly.
2. TCG with Single threaded mode is broken.
3. HVF and qtest support is broken. 
4. ACPI MADT Table flags [7] MADT.GICC.Enabled and MADT.GICC.online-capable are
   mutually exclusive i.e. as per the change [6] a vCPU cannot be both
   GICC.Enabled and GICC.online-capable. This means,
      [ Link: https://bugzilla.tianocore.org/show_bug.cgi?id=3706 ]
   a. If we have to support hot-unplug of the cold-booted vCPUs then these MUST
      be specified as GICC.online-capable in the MADT Table during boot by the
      firmware/Qemu. But this requirement conflicts with the requirement to
      support the new Qemu changes on legacy OSes which don't understand the
      MADT.GICC.online-capable Bit. A legacy OS will ignore this bit during
      boot and hence these vCPUs will not appear on such an OS. This is
      unexpected behaviour.
   b. In case we decide to specify vCPUs as MADT.GICC.Enabled and try to unplug
      these cold-booted vCPUs from the OS (which in fact should be blocked by
      returning an error at Qemu) then features like 'kexec' will break.
   c. As I understand, removal of the cold-booted vCPUs is a required feature
      and the x86 world allows it.
   d. Hence, either we need a specification change to make the MADT.GICC.Enabled
      and MADT.GICC.online-capable Bits NOT mutually exclusive, or we must NOT
      support removal of cold-booted vCPUs. In the latter case, a check can be
      introduced to bar users from unplugging vCPUs which were cold-booted,
      using QMP commands. (Needs discussion!)
      Please check the below patch, part of this patch-set:
          [hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled]
5. Code related to the notification to GICV3 about the hot(un)plug of a vCPU
   event includes virt.h in arm_gicv3_common.c, which is not correct. A better
   way is needed to notify the GIC about a CPU event, independent of
   VirtMachineState.


(IX) THINGS TO DO
     ============

1. Fix the Migration Issues
2. Fix issues related to TCG/Emulation support.
3. Comprehensive Testing. Current testing is very basic.
   a. Negative Test cases
4. Qemu Documentation (.rst) needs to be updated.
5. Fix qtest, HVF Support
6. Fix the design issue related to ACPI MADT.GICC flags discussed in known
   issues. This might require UEFI ACPI specification change!
7. Add ACPI _OSC 'Query' support. Only part of _OSC support exists now.

 Above is *not* a complete list. Will update later!

Best regards
Salil.

(X) DISCLAIMER
    ==========

This work is an attempt to present a proof-of-concept of the ARM64 vCPU hotplug
implementation to the community. This is *not* production-level code and might
have bugs. Only basic testing has been done on a HiSilicon Kunpeng920 SoC for
servers. Once the design and the core idea behind the implementation have been
verified, more effort can be put into hardening the code.

This work is *mostly* in line with the discussions which have happened over the
previous years [see refs below] across different channels like the mailing-list,
the Linaro Open Discussions platform, various conferences like KVMForum etc.
This RFC is being used as a way to verify the idea mentioned in this
cover-letter and to get community views. Once this has been agreed upon, a
formal patch shall be posted on the mailing-list.

The concept being presented has been found to work!

(XI) ORGANIZATION OF PATCHES
     =======================
 
 A. All patches [Architecture 'agnostic' + 'specific']:

   [Patch 1-9, 23, 36] logic required during machine init
    (*) Some validation checks
    (*) Introduces core-id property and some util functions required later.
    (*) Refactors Parking logic of vCPUs    
    (*) Logic to pre-create vCPUs
    (*) GIC initialization pre-sized with possible vCPUs.
    (*) Some refactoring to have common hot and cold plug logic together.
    (*) Release of disabled QOM CPU objects in post_cpu_init()
    (*) Support of ACPI _OSC method to negotiate platform hotplug capabilities
   [Patch 10-22] logic related to ACPI at machine init time
    (*) Changes required to Enable ACPI for cpu hotplug
    (*) Initialization of ACPI GED framework to cater for CPU Hotplug Events
    (*) Build ACPI AML related to CPU control dev 
    (*) ACPI MADT/MAT changes
   [Patch 24-35] Logic required during vCPU hot-(un)plug
    (*) Basic framework changes to support vCPU hot-(un)plug
    (*) ACPI GED changes for hot-(un)plug hooks.
    (*) wire-unwire the IRQs
    (*) GIC notification logic
    (*) ARMCPU unrealize logic
    (*) Handling of SMCCC Hypercall Exits by KVM to Qemu
   
 B. Architecture *agnostic* patches part of patch-set:

   [Patch 5,9,11,13,16,20,24,31,33] Common logic to support hotplug 
    (*) Refactors Parking logic of vCPUs
    (*) Introduces ACPI GED Support for vCPU Hotplug Events
    (*) Introduces ACPI AML change for CPU Control Device     

(XII) REFERENCES
      ==========

[1] https://lore.kernel.org/qemu-devel/20200613213629.21984-1-salil.mehta@huawei.com/
[2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
[3] https://lore.kernel.org/lkml/20230203135043.409192-1-james.morse@arm.com/
[4] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/
[5] https://lore.kernel.org/all/20230404154050.2270077-1-oliver.upton@linux.dev/
[6] https://bugzilla.tianocore.org/show_bug.cgi?id=3706
[7] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure
[8] https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[9] https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
[10] https://docs.aws.amazon.com/eks/latest/userguide/vertical-pod-autoscaler.html
[11] https://lkml.org/lkml/2019/7/10/235
[12] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-July/032316.html
[13] https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg06517.html
[14] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/thread/7CGL6JTACPUZEYQC34CZ2ZBWJGSR74WE/
[15] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01168.html
[16] https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00131.html
[17] https://op-lists.linaro.org/archives/list/linaro-open-discussions@op-lists.linaro.org/message/X74JS6P2N4AUWHHATJJVVFDI2EMDZJ74/
[18] https://lore.kernel.org/lkml/20210608154805.216869-1-jean-philippe@linaro.org/
[19] https://lore.kernel.org/all/20230913163823.7880-1-james.morse@arm.com/ 
[20] https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags

(XIII) ACKNOWLEDGEMENTS
       ================

I would like to take this opportunity to thank below people for various
discussions with me over different channels during the development:

Marc Zyngier (Google)               Catalin Marinas (ARM),         
James Morse(ARM),                   Will Deacon (Google), 
Jean-Phillipe Brucker (Linaro),     Sudeep Holla (ARM),
Lorenzo Pieralisi (Linaro),         Gavin Shan (Redhat), 
Jonathan Cameron (Huawei),          Darren Hart (Ampere),
Igor Mammedov (Redhat),             Ilkka Koskinen (Ampere),
Andrew Jones (Redhat),              Karl Heubaum (Oracle),
Keqian Zhu (Huawei),                Miguel Luis (Oracle),
Xiongfeng Wang (Huawei),            Vishnu Pajjuri (Ampere),
Shameerali Kolothum (Huawei)        Russell King (Oracle)
Xuwei/Joy (Huawei),                 Peter Maydell (Linaro)
Zengtao/Prime (Huawei),             And all those whom I have missed! 

Many thanks to below people for their current or past contributions:

1. James Morse (ARM)
   (Current Kernel part of vCPU Hotplug Support on AARCH64)
2. Jean-Philippe Brucker (Linaro)
   (Prototyped one of the earlier PSCI-based POCs [17][18] based on RFC V1)
3. Keqian Zhu (Huawei)
   (Co-developed Qemu prototype)
4. Xiongfeng Wang (Huawei)
   (Co-developed earlier kernel prototype)
5. Vishnu Pajjuri (Ampere)
   (Verification on Ampere ARM64 Platforms + fixes)
6. Miguel Luis (Oracle)
   (Verification on Oracle ARM64 Platforms + fixes)


Author Salil Mehta (1):
  target/arm/kvm,tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu

Jean-Philippe Brucker (2):
  hw/acpi: Make _MAT method optional
  target/arm/kvm: Write CPU state back to KVM on reset

Miguel Luis (1):
  tcg/mttcg: enable threads to unregister in tcg_ctxs[]

Salil Mehta (33):
  arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
  cpus-common: Add common CPU utility for possible vCPUs
  hw/arm/virt: Move setting of common CPU properties in a function
  arm/virt,target/arm: Machine init time change common to vCPU {cold|hot}-plug
  accel/kvm: Extract common KVM vCPU {creation,parking} code
  arm/virt,kvm: Pre-create disabled possible vCPUs @machine init
  arm/virt,gicv3: Changes to pre-size GIC with possible vcpus @machine init
  arm/virt: Init PMU at host for all possible vcpus
  hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
  arm/acpi: Enable ACPI support for vcpu hotplug
  hw/acpi: Add ACPI CPU hotplug init stub
  hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init
  hw/acpi: Init GED framework with cpu hotplug events
  arm/virt: Add cpu hotplug events to GED during creation
  arm/virt: Create GED dev before *disabled* CPU Objs are destroyed
  hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
  arm/virt/acpi: Build CPUs AML with CPU Hotplug support
  arm/virt: Make ARM vCPU *present* status ACPI *persistent*
  hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} Bits to Guest
  hw/acpi: Update GED _EVT method AML with cpu scan
  hw/arm: MADT Tbl change to size the guest with possible vCPUs
  arm/virt: Release objects for *disabled* possible vCPUs after init
  hw/acpi: Update ACPI GED framework to support vCPU Hotplug
  arm/virt: Add/update basic hot-(un)plug framework
  arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug
  hw/arm,gicv3: Changes to update GIC with vCPU hot-plug notification
  hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info
  arm/virt: Update the guest(via GED) about CPU hot-(un)plug events
  hw/arm: Changes required for reset and to support next boot
  physmem,gdbstub: Common helping funcs/changes to *unrealize* vCPU
  target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug
  hw/arm: Support hotplug capability check using _OSC method
  hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled

 accel/kvm/kvm-all.c                    |  61 +-
 accel/tcg/tcg-accel-ops-mttcg.c        |   1 +
 cpus-common.c                          |  37 ++
 gdbstub/gdbstub.c                      |  13 +
 hw/acpi/acpi-cpu-hotplug-stub.c        |   6 +
 hw/acpi/cpu.c                          |  91 ++-
 hw/acpi/generic_event_device.c         |  33 +
 hw/arm/Kconfig                         |   1 +
 hw/arm/boot.c                          |   2 +-
 hw/arm/virt-acpi-build.c               | 110 +++-
 hw/arm/virt.c                          | 868 ++++++++++++++++++++-----
 hw/core/gpio.c                         |   2 +-
 hw/i386/acpi-build.c                   |   2 +-
 hw/intc/arm_gicv3.c                    |   1 +
 hw/intc/arm_gicv3_common.c             |  66 +-
 hw/intc/arm_gicv3_cpuif.c              | 265 ++++----
 hw/intc/arm_gicv3_cpuif_common.c       |   5 +
 hw/intc/arm_gicv3_kvm.c                |  39 +-
 hw/intc/gicv3_internal.h               |   2 +
 include/exec/cpu-common.h              |   8 +
 include/exec/gdbstub.h                 |   1 +
 include/hw/acpi/cpu.h                  |   7 +-
 include/hw/acpi/cpu_hotplug.h          |   4 +
 include/hw/acpi/generic_event_device.h |   5 +
 include/hw/arm/boot.h                  |   2 +
 include/hw/arm/virt.h                  |  10 +-
 include/hw/core/cpu.h                  |  77 +++
 include/hw/intc/arm_gicv3_common.h     |  23 +
 include/hw/qdev-core.h                 |   2 +
 include/sysemu/kvm.h                   |   2 +
 include/tcg/tcg.h                      |   1 +
 softmmu/physmem.c                      |  25 +
 target/arm/arm-powerctl.c              |  51 +-
 target/arm/cpu-qom.h                   |   3 +
 target/arm/cpu.c                       | 112 ++++
 target/arm/cpu.h                       |  17 +
 target/arm/cpu64.c                     |  15 +
 target/arm/gdbstub.c                   |   6 +
 target/arm/helper.c                    |  27 +-
 target/arm/internals.h                 |  12 +-
 target/arm/kvm.c                       |  93 ++-
 target/arm/kvm64.c                     |  59 +-
 target/arm/kvm_arm.h                   |  24 +
 target/arm/meson.build                 |   1 +
 target/arm/{tcg => }/psci.c            |   8 +
 target/arm/tcg/meson.build             |   4 -
 tcg/tcg.c                              |  23 +
 47 files changed, 1878 insertions(+), 349 deletions(-)
 rename target/arm/{tcg => }/psci.c (97%)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 153+ messages in thread

end of thread, other threads:[~2024-01-17 21:47 UTC | newest]

Thread overview: 153+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-26 10:03 [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Salil Mehta via
2023-09-26 10:04 ` [PATCH RFC V2 01/37] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property Salil Mehta via
2023-09-26 23:57   ` [PATCH RFC V2 01/37] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property Gavin Shan
2023-10-02  9:53     ` Salil Mehta via
2023-10-02  9:53       ` Salil Mehta
2023-10-03  5:05       ` Gavin Shan
2023-09-26 10:04 ` [PATCH RFC V2 02/37] cpus-common: Add common CPU utility for possible vCPUs Salil Mehta via
2023-09-27  3:54   ` Gavin Shan
2023-10-02 10:21     ` Salil Mehta via
2023-10-02 10:21       ` Salil Mehta
2023-10-03  5:34       ` Gavin Shan
2023-09-26 10:04 ` [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function Salil Mehta via
2023-09-27  5:16   ` Gavin Shan
2023-10-02 10:24     ` Salil Mehta via
2023-10-02 10:24       ` Salil Mehta
2023-10-10  6:46   ` Shaoqin Huang
2023-10-10  9:47     ` Salil Mehta via
2023-10-10  9:47       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 04/37] arm/virt, target/arm: Machine init time change common to vCPU {cold|hot}-plug Salil Mehta via
2023-09-27  6:28   ` [PATCH RFC V2 04/37] arm/virt,target/arm: " Gavin Shan
2023-10-02 16:12     ` Salil Mehta via
2023-10-02 16:12       ` Salil Mehta
2024-01-16 15:59       ` Jonathan Cameron via
2023-09-27  6:30   ` Gavin Shan
2023-10-02 10:27     ` Salil Mehta via
2023-10-02 10:27       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation, parking} code Salil Mehta via
2023-09-27  6:51   ` [PATCH RFC V2 05/37] accel/kvm: Extract common KVM vCPU {creation,parking} code Gavin Shan
2023-10-02 16:20     ` Salil Mehta via
2023-10-02 16:20       ` Salil Mehta
2023-10-03  5:39       ` Gavin Shan
2023-09-26 10:04 ` [PATCH RFC V2 06/37] arm/virt, kvm: Pre-create disabled possible vCPUs @machine init Salil Mehta via
2023-09-27 10:04   ` [PATCH RFC V2 06/37] arm/virt,kvm: " Gavin Shan
2023-10-02 16:39     ` Salil Mehta via
2023-10-02 16:39       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 07/37] arm/virt, gicv3: Changes to pre-size GIC with possible vcpus " Salil Mehta via
2023-09-28  0:14   ` Gavin Shan
2023-10-16 16:15     ` Salil Mehta via
2023-10-16 16:15       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 08/37] arm/virt: Init PMU at host for all possible vcpus Salil Mehta via
2023-09-26 10:04 ` [PATCH RFC V2 09/37] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file Salil Mehta via
2023-09-28  0:19   ` Gavin Shan
2023-10-16 16:20     ` Salil Mehta via
2023-10-16 16:20       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 10/37] arm/acpi: Enable ACPI support for vcpu hotplug Salil Mehta via
2023-09-28  0:25   ` Gavin Shan
2023-10-16 21:23     ` Salil Mehta via
2023-10-16 21:23       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 11/37] hw/acpi: Add ACPI CPU hotplug init stub Salil Mehta via
2023-09-28  0:28   ` Gavin Shan
2023-10-16 21:27     ` Salil Mehta via
2023-10-16 21:27       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 12/37] hw/acpi: Use qemu_present_cpu() API in ACPI CPU hotplug init Salil Mehta via
2023-09-28  0:40   ` Gavin Shan
2023-10-16 21:41     ` Salil Mehta via
2023-10-16 21:41       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 13/37] hw/acpi: Init GED framework with cpu hotplug events Salil Mehta via
2023-09-28  0:56   ` Gavin Shan
2023-10-16 21:44     ` Salil Mehta via
2023-10-16 21:44       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 14/37] arm/virt: Add cpu hotplug events to GED during creation Salil Mehta via
2023-09-28  1:03   ` Gavin Shan
2023-10-16 21:46     ` Salil Mehta via
2023-10-16 21:46       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 15/37] arm/virt: Create GED dev before *disabled* CPU Objs are destroyed Salil Mehta via
2023-09-28  1:08   ` Gavin Shan
2023-10-16 21:54     ` Salil Mehta via
2023-10-16 21:54       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 16/37] hw/acpi: Update CPUs AML with cpu-(ctrl)dev change Salil Mehta via
2023-09-28  1:26   ` Gavin Shan
2023-10-16 21:57     ` Salil Mehta via
2023-10-16 21:57       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 17/37] arm/virt/acpi: Build CPUs AML with CPU Hotplug support Salil Mehta via
2023-09-28  1:36   ` Gavin Shan
2023-10-16 22:05     ` Salil Mehta via
2023-10-16 22:05       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 18/37] arm/virt: Make ARM vCPU *present* status ACPI *persistent* Salil Mehta via
2023-09-28 23:18   ` Gavin Shan
2023-10-16 22:33     ` Salil Mehta via
2023-10-16 22:33       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} Bits to Guest Salil Mehta via
2023-09-28 23:33   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES,ENA} " Gavin Shan
2023-10-16 22:59     ` Salil Mehta via
2023-10-16 22:59       ` Salil Mehta
2024-01-17 21:46   ` [PATCH RFC V2 19/37] hw/acpi: ACPI/AML Changes to reflect the correct _STA.{PRES, ENA} " Jonathan Cameron via
2023-09-26 10:04 ` [PATCH RFC V2 20/37] hw/acpi: Update GED _EVT method AML with cpu scan Salil Mehta via
2023-09-28 23:35   ` Gavin Shan
2023-10-16 23:01     ` Salil Mehta via
2023-10-16 23:01       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 21/37] hw/arm: MADT Tbl change to size the guest with possible vCPUs Salil Mehta via
2023-09-28 23:43   ` Gavin Shan
2023-10-16 23:15     ` Salil Mehta via
2023-10-16 23:15       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 22/37] hw/acpi: Make _MAT method optional Salil Mehta via
2023-09-28 23:50   ` Gavin Shan
2023-10-16 23:17     ` Salil Mehta via
2023-10-16 23:17       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 23/37] arm/virt: Release objects for *disabled* possible vCPUs after init Salil Mehta via
2023-09-28 23:57   ` Gavin Shan
2023-10-16 23:28     ` Salil Mehta via
2023-10-16 23:28       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 24/37] hw/acpi: Update ACPI GED framework to support vCPU Hotplug Salil Mehta via
2023-09-26 11:02   ` Michael S. Tsirkin
2023-09-26 11:37     ` Salil Mehta via
2023-09-26 12:00       ` Michael S. Tsirkin
2023-09-26 12:27         ` Salil Mehta via
2023-09-26 13:02         ` lixianglai
2023-09-26 10:04 ` [PATCH RFC V2 25/37] arm/virt: Add/update basic hot-(un)plug framework Salil Mehta via
2023-09-29  0:20   ` Gavin Shan
2023-10-16 23:40     ` Salil Mehta via
2023-10-16 23:40       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 26/37] arm/virt: Changes to (un)wire GICC<->vCPU IRQs during hot-(un)plug Salil Mehta via
2023-09-26 10:04 ` [PATCH RFC V2 27/37] hw/arm, gicv3: Changes to update GIC with vCPU hot-plug notification Salil Mehta via
2023-09-26 10:04 ` [PATCH RFC V2 28/37] hw/intc/arm-gicv3*: Changes required to (re)init the vCPU register info Salil Mehta via
2023-09-26 10:04 ` [PATCH RFC V2 29/37] arm/virt: Update the guest(via GED) about CPU hot-(un)plug events Salil Mehta via
2023-09-29  0:30   ` Gavin Shan
2023-10-16 23:48     ` Salil Mehta via
2023-10-16 23:48       ` Salil Mehta
2023-09-26 10:04 ` [PATCH RFC V2 30/37] hw/arm: Changes required for reset and to support next boot Salil Mehta via
2023-09-26 10:04 ` [PATCH RFC V2 31/37] physmem, gdbstub: Common helping funcs/changes to *unrealize* vCPU Salil Mehta via
2023-10-03  6:33   ` [PATCH RFC V2 31/37] physmem,gdbstub: " Philippe Mathieu-Daudé
2023-10-03 10:22     ` Salil Mehta via
2023-10-03 10:22       ` Salil Mehta
2023-10-04  9:17       ` Salil Mehta via
2023-10-04  9:17         ` Salil Mehta
2023-09-26 10:36 ` [PATCH RFC V2 32/37] target/arm: Add support of *unrealize* ARMCPU during vCPU Hot-unplug Salil Mehta via
2023-09-26 10:36   ` [PATCH RFC V2 33/37] target/arm/kvm: Write CPU state back to KVM on reset Salil Mehta via
2023-09-26 10:36   ` [PATCH RFC V2 34/37] target/arm/kvm, tcg: Register/Handle SMCCC hypercall exits to VMM/Qemu Salil Mehta via
2023-09-29  4:15     ` [PATCH RFC V2 34/37] target/arm/kvm,tcg: " Gavin Shan
2023-10-17  0:03       ` Salil Mehta via
2023-10-17  0:03         ` Salil Mehta
2023-09-26 10:36   ` [PATCH RFC V2 35/37] hw/arm: Support hotplug capability check using _OSC method Salil Mehta via
2023-09-29  4:23     ` Gavin Shan
2023-10-17  0:13       ` Salil Mehta via
2023-10-17  0:13         ` Salil Mehta
2023-09-26 10:36   ` [PATCH RFC V2 36/37] tcg/mttcg: enable threads to unregister in tcg_ctxs[] Salil Mehta via
2023-09-26 10:36   ` [PATCH RFC V2 37/37] hw/arm/virt: Expose cold-booted CPUs as MADT GICC Enabled Salil Mehta via
2023-10-11 10:23 ` [PATCH RFC V2 00/37] Support of Virtual CPU Hotplug for ARMv8 Arch Vishnu Pajjuri
2023-10-11 10:32   ` Salil Mehta via
2023-10-11 10:32     ` Salil Mehta
2023-10-11 11:08     ` Vishnu Pajjuri
2023-10-11 20:15       ` Salil Mehta
2023-10-12 17:02 ` Miguel Luis
2023-10-12 17:54   ` Salil Mehta via
2023-10-12 17:54     ` Salil Mehta
2023-10-13 10:43     ` Miguel Luis
  -- strict thread matches above, loose matches on Subject: below --
2023-09-25 19:43 Salil Mehta via
2023-09-25 20:03 ` Salil Mehta via
2023-09-25 20:12   ` Russell King (Oracle)
2023-09-25 20:21     ` Salil Mehta via
2023-09-25 23:58       ` Gavin Shan
2023-09-25 17:11 Salil Mehta via
2023-09-25 17:17 ` Salil Mehta via
