* PVH CPU hotplug design document
@ 2017-01-12 12:13 Roger Pau Monné
  2017-01-12 19:00 ` Andrew Cooper
  2017-01-13 15:51 ` Jan Beulich
  0 siblings, 2 replies; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-12 12:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone, Andrew Cooper,
	Anshul Makkar, Julien Grall, Jan Beulich, Boris Ostrovsky

Hello,

Below is a draft of a design document for PVHv2 CPU hotplug. It should cover
both vCPU and pCPU hotplug. It's mainly centered around the hardware domain,
since for unprivileged PVH guests the vCPU hotplug mechanism is already
described in Boris' series [0], and it's shared with HVM.

The aim here is to find a way to use ACPI vCPU hotplug for the hardware domain,
while still being able to properly detect and notify Xen of pCPU hotplug.

Thanks, Roger.

[0] https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html

---8<---
% CPU hotplug support for PVH
% Roger Pau Monné <roger.pau@citrix.com>
% Draft B

# Revision History

| Version | Date        | Changes                                           |
|---------|-------------|---------------------------------------------------|
| Draft A | 5 Jan 2017  | Initial draft.                                    |
|---------|-------------|---------------------------------------------------|
| Draft B | 12 Jan 2017 | Removed the XXX comments and clarified some       |
|         |             | sections.                                         |
|         |             |                                                   |
|         |             | Added a sample of the SSDT ASL code that would be |
|         |             | appended to the hardware domain.                  |

# Preface

This document aims to describe the interface to use in order to implement CPU
hotplug for PVH guests. This applies to hotplug of both physical and virtual
CPUs.

# Introduction

One of the design goals of PVH is to be able to remove as much Xen PV-specific
code as possible, thus limiting the number of Xen PV interfaces used by guests,
and tending to use native interfaces (as used by bare metal) as much as
possible. This is in line with the efforts also done by Xen on ARM and helps
reduce the burden of maintaining huge amounts of Xen PV code inside of guest
kernels.

This however presents some challenges due to the model used by the Xen
Hypervisor, where some devices are handled by Xen while others are left for the
hardware domain to manage. The fact that Xen lacks an AML parser also makes it
harder, since it cannot get the full hardware description from dynamic ACPI
tables (DSDT, SSDT) without the hardware domain's collaboration.

One such issue is CPU enumeration and hotplug, for both the hardware and
unprivileged domains. The aim is to be able to use the same enumeration and
hotplug interface for all PVH guests, regardless of their privilege.

This document aims to describe the interface used in order to fulfill the
following actions:

 * Virtual CPU (vCPU) enumeration at boot time.
 * Hotplug of vCPUs.
 * Hotplug of physical CPUs (pCPUs) to Xen.

# Prior work

## PV CPU hotplug

CPU hotplug for Xen PV guests is implemented using xenstore and hypercalls. The
guest has to set up a watch on the "cpu/" xenstore node, and react to changes
in this directory. CPUs are added by creating a new node and setting its
"availability" to online:

    cpu/X/availability = "online"

Where X is the vCPU ID. This is an out-of-band method that relies on
Xen-specific interfaces in order to perform CPU hotplug.
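
For illustration, the following is a minimal guest-side sketch of this flow
using libxenstore. It is a hedged sketch rather than the actual kernel
implementation: the relative "cpu" path is assumed to be resolved against the
guest's xenstore home directory, and the online_vcpu() helper is hypothetical.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <xenstore.h>

    /* Hypothetical helper: bring vCPU 'id' online by whatever means the
     * guest kernel uses (this is not a libxenstore function). */
    extern void online_vcpu(unsigned int id);

    int main(void)
    {
        struct xs_handle *xsh = xs_open(0);

        if ( !xsh || !xs_watch(xsh, "cpu", "cpu-hotplug") )
            return 1;

        for ( ;; )
        {
            unsigned int num, len, id;
            char **ev = xs_read_watch(xsh, &num); /* blocks until a change */
            const char *path;
            char *val;

            if ( !ev )
                continue;

            path = ev[XS_WATCH_PATH]; /* e.g. "cpu/3/availability" */
            if ( sscanf(path, "cpu/%u/", &id) == 1 &&
                 strstr(path, "/availability") )
            {
                val = xs_read(xsh, XBT_NULL, path, &len);
                if ( val && !strcmp(val, "online") )
                    online_vcpu(id);
                free(val);
            }
            free(ev);
        }
    }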

## QEMU CPU hotplug using ACPI

The ACPI tables provided to HVM guests contain processor objects, as created by
libacpi. The number of processor objects in the ACPI namespace matches the
maximum number of processors supported by HVM guests (up to 128 at the time of
writing). Processors currently disabled are marked as such in the MADT and in
their \_MAT and \_STA methods.

A PRST operation region in I/O space is also defined, with a size of 128 bits,
that's used as a bitmap of enabled vCPUs on the system. A PRSC method is
provided in order to check for updates to the PRST region and trigger
notifications on the affected processor objects. The PRSC method is executed
from a GPE event handler. The OSPM then checks the value returned by \_STA for
the ACPI\_STA\_DEVICE\_PRESENT flag in order to know whether the vCPU has been
enabled.
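
For illustration, the following hedged sketch shows the emulator side of that
flow with entirely hypothetical names (this is not QEMU's actual interface):
when a vCPU is enabled or disabled, the corresponding bit of the bitmap backing
the PRST region is updated and a GPE is raised so that the OSPM executes PRSC.

    #include <stdint.h>

    #define PRST_BITS        128  /* one bit per potential vCPU */
    #define CPU_HOTPLUG_GPE  2    /* illustrative GPE number */

    static uint8_t prst_bitmap[PRST_BITS / 8]; /* backs the PRST I/O region */

    /* Hypothetical helpers: assert a GPE status bit and inject an SCI. */
    extern void gpe_set_status(unsigned int gpe);
    extern void inject_sci(void);

    static void vcpu_set_enabled(unsigned int vcpu, int enabled)
    {
        if ( enabled )
            prst_bitmap[vcpu / 8] |= 1u << (vcpu % 8);
        else
            prst_bitmap[vcpu / 8] &= ~(1u << (vcpu % 8));

        /* Fire the GPE so that the OSPM runs PRSC and re-evaluates _STA. */
        gpe_set_status(CPU_HOTPLUG_GPE);
        inject_sci();
    }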

## Native CPU hotplug

The OSPM waits for a notification from ACPI on the processor object, and when
an event is received the return value from \_STA is checked in order to see
whether ACPI\_STA\_DEVICE\_PRESENT has been set. This notification is triggered
from the method of a GPE block.

# PVH CPU hotplug

The aim, as stated in the introduction, is to use a method as similar as
possible to bare metal CPU hotplug for PVH. This is feasible for unprivileged
domains, since the ACPI tables can be created by the toolstack and provided to
the guest. Then a minimal I/O or memory handler is added to Xen in order to
report the bitmap of enabled vCPUs. There's already a [series][0] posted to
xen-devel that implements this functionality for unprivileged PVH guests.
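
To illustrate how small such a handler can be, the following hedged sketch
(not the code from the referenced series; the port base, the handler prototype
and the domain accessors are illustrative assumptions) returns one byte of the
vCPU availability bitmap per read:

    #include <stdbool.h>
    #include <stdint.h>

    #define XEN_ACPI_CPU_MAP      0xaf00 /* illustrative; see PRST below */
    #define XEN_ACPI_CPU_MAP_LEN  16     /* 128 bits */

    struct domain;                       /* opaque in this sketch */
    extern unsigned int domain_max_vcpus(const struct domain *d);
    extern bool vcpu_is_online(const struct domain *d, unsigned int id);

    /* Read handler: return the requested bytes of the availability bitmap. */
    static int vcpu_map_ioport_read(const struct domain *d, unsigned int port,
                                    unsigned int bytes, uint32_t *val)
    {
        unsigned int base = (port - XEN_ACPI_CPU_MAP) * 8, i;
        uint32_t v = 0;

        for ( i = 0; i < bytes * 8; i++ )
        {
            unsigned int id = base + i;

            if ( id < domain_max_vcpus(d) && vcpu_is_online(d, id) )
                v |= 1u << i;
        }

        *val = v;
        return 0;
    }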

This however has proven to be quite difficult to implement for the hardware
domain, since it has to manage both pCPUs and vCPUs. The hardware domain should
be able to notify Xen of the addition of new pCPUs, so that they can be used by
the Hypervisor, and also be able to hotplug new vCPUs for its own usage. Since
Xen cannot access the dynamic (AML) ACPI tables, because it lacks an AML
parser, it is the duty of the hardware domain to parse those tables and notify
Xen of relevant events.

There are several related issues here that prevent a straightforward
solution:

 * Xen cannot parse AML tables, and thus cannot get notifications from ACPI
   events. And even if Xen could parse those tables, there can only be one
   OSPM registered with ACPI.
 * Xen can provide a valid MADT table to the hardware domain that describes the
   environment in which the hardware domain is running, but it cannot prevent
   the hardware domain from seeing the real processor devices in the ACPI
   namespace, nor can Xen provide the hardware domain with processor devices
   that match the vCPUs at the moment.

[0]: https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html

## Proposed solution using the STAO

The general idea of this method is to use the STAO in order to hide the pCPUs
from the hardware domain, and provide processor objects for vCPUs in an extra
SSDT table.

This method requires one change to the STAO, in order to be able to notify the
hardware domain of which processors found in ACPI tables are pCPUs. The
description of the new STAO field is as follows:

 |   Field            | Byte Length | Byte Offset |     Description          |
 |--------------------|:-----------:|:-----------:|--------------------------|
 | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
 |                    |             |             | where each number is the |
 |                    |             |             | Processor UID of a       |
 |                    |             |             | physical CPU, and should |
 |                    |             |             | be treated specially by  |
 |                    |             |             | the OSPM                 |

The list of UIDs in this new field would be matched against the ACPI Processor
UID field found in local/x2 APIC MADT structs and Processor objects in the ACPI
namespace, and the OSPM should either ignore those objects or, in case it
implements pCPU hotplug, notify Xen of changes to these objects.
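
A hedged sketch of the OSPM-side check follows, assuming the MADT "Processor
Local APIC" structure layout from the ACPI specification and a hypothetical
stao_processor_list[] already parsed from the new STAO field:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* ACPI MADT "Processor Local APIC" structure (type 0, length 8). */
    struct acpi_madt_local_apic {
        uint8_t  type;
        uint8_t  length;
        uint8_t  acpi_processor_uid;
        uint8_t  apic_id;
        uint32_t flags;               /* bit 0: enabled */
    };

    /* Hypothetical: the UID list parsed from the proposed STAO field. */
    extern const uint32_t stao_processor_list[];
    extern const size_t stao_processor_count;

    /* True if this MADT entry describes a pCPU listed in the STAO, which the
     * OSPM must therefore not try to online itself. */
    static bool madt_entry_is_pcpu(const struct acpi_madt_local_apic *lapic)
    {
        size_t i;

        for ( i = 0; i < stao_processor_count; i++ )
            if ( stao_processor_list[i] == lapic->acpi_processor_uid )
                return true;

        return false;
    }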

The contents of the MADT provided to the hardware domain are also going to be
different from the contents of the MADT as found in native ACPI. The local/x2
APIC entries for all the pCPUs are going to be marked as disabled.

Extra entries are going to be added for each vCPU available to the hardware
domain, up to the maximum number of supported vCPUs. Note that the number of
supported vCPUs might be different from the number of enabled vCPUs, so it's
possible that some of these entries are also going to be marked as disabled.
The entries for vCPUs in the MADT are going to use a processor local x2 APIC
structure, and the ACPI processor ID of the first vCPU is going to be
UINT32_MAX - HVM_MAX_VCPUS, in order to avoid clashes with IDs of pCPUs.
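
Expressed as code (HVM_MAX_VCPUS is 128 at the time of writing, which is what
yields the UIDs 4294967167 and 4294967168 used for VP00 and VP01 in the ASL
sample below):

    #include <stdint.h>

    #define HVM_MAX_VCPUS 128

    /* ACPI processor UID assigned to vCPU 'n' of the hardware domain. */
    static inline uint32_t vcpu_acpi_uid(unsigned int n)
    {
        return (UINT32_MAX - HVM_MAX_VCPUS) + n;
    }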

In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
processor object in the ACPI namespace, so that the OSPM can request
notifications and get the value of the \_STA and \_MAT methods. This can be
problematic because Xen doesn't know the ACPI name of the other processor
objects, so blindly adding new ones can create namespace clashes.

This can be solved by using a different ACPI name in order to describe vCPUs in
the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
the processor objects, so using a 'VP' (i.e., Virtual Processor) prefix should
prevent clashes.

A Xen GPE device block will be used in order to deliver events related to the
vCPUs available to the guest, since Xen doesn't know if there are any bits
available in the native GPEs. An SCI interrupt will be injected into the guest
in order to trigger the event.

The following snippet is a representation of the ASL SSDT code that is proposed
for the hardware domain:

    DefinitionBlock ("SSDT.aml", "SSDT", 5, "Xen", "HVM", 0)
    {
        Scope (\_SB)
        {
           OperationRegion(XEN, SystemMemory, 0xDEADBEEF, 40)
           Field(XEN, ByteAcc, NoLock, Preserve) {
               NCPU, 16, /* Number of vCPUs */
               MSUA, 32, /* MADT checksum address */
               MAPA, 32, /* MADT LAPIC0 address */
           }
        }
        Scope ( \_SB ) {
            OperationRegion ( MSUM, SystemMemory, \_SB.MSUA, 1 )
            Field ( MSUM, ByteAcc, NoLock, Preserve ) {
                MSU, 8
            }
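            /*
             * PMAT(Arg0 = vCPU index, Arg1 = MADT LAPIC entry): return the
             * real MADT entry when the vCPU exists (Arg0 < NCPU), otherwise
             * a dummy, disabled local APIC entry (type 0, length 8, flags 0).
             */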
            Method ( PMAT, 2 ) {
                If ( LLess(Arg0, NCPU) ) {
                    Return ( ToBuffer(Arg1) )
                }
                Return ( Buffer() {0, 8, 0xff, 0xff, 0, 0, 0, 0} )
            }
            Processor ( VP00, 0, 0x0000b010, 0x06 ) {
                Name ( _HID, "ACPI0007" )
                Name ( _UID, 4294967167 )
                OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 0), 8 )
                Field ( MATR, ByteAcc, NoLock, Preserve ) {
                    MAT, 64
                }
                Field ( MATR, ByteAcc, NoLock, Preserve ) {
                    Offset(4),
                    FLG, 1
                }
                Method ( _MAT, 0 ) {
                    Return ( ToBuffer(MAT) )
                }
                Method ( _STA ) {
                    If ( FLG ) {
                        Return ( 0xF )
                    }
                    Return ( 0x0 )
                }
                Method ( _EJ0, 1, NotSerialized ) {
                    Sleep ( 0xC8 )
                }
            }
            Processor ( VP01, 1, 0x0000b010, 0x06 ) {
                Name ( _HID, "ACPI0007" )
                Name ( _UID, 4294967168 )
                OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 8), 8 )
                Field ( MATR, ByteAcc, NoLock, Preserve ) {
                    MAT, 64
                }
                Field ( MATR, ByteAcc, NoLock, Preserve ) {
                    Offset(4),
                    FLG, 1
                }
                Method ( _MAT, 0 ) {
                    Return ( PMAT (1, MAT) )
                }
                Method ( _STA ) {
                    If ( LLess(1, \_SB.NCPU) ) {
                        If ( FLG ) {
                            Return ( 0xF )
                        }
                    }
                    Return ( 0x0 )
                }
                Method ( _EJ0, 1, NotSerialized ) {
                    Sleep ( 0xC8 )
                }
            }
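            /*
             * PRST: bitmap of enabled vCPUs, one bit per vCPU (two here).
             * PRSC scans it and, for every vCPU whose state changed, stores
             * the new state in FLG, sends a Device Check (1) or Eject
             * Request (3) notification, and adjusts the MADT checksum via
             * MSU so the table stays consistent.
             */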
            OperationRegion ( PRST, SystemIO, 0xaf00, 1 )
            Field ( PRST, ByteAcc, NoLock, Preserve ) {
                PRS, 2
            }
            Method ( PRSC, 0 ) {
                Store ( ToBuffer(PRS), Local0 )
                Store ( DerefOf(Index(Local0, 0)), Local1 )
                And ( Local1, 1, Local2 )
                If ( LNotEqual(Local2, \_SB.VP00.FLG) ) {
                    Store ( Local2, \_SB.VP00.FLG )
                    If ( LEqual(Local2, 1) ) {
                        Notify ( VP00, 1 )
                        Subtract ( \_SB.MSU, 1, \_SB.MSU )
                    }
                    Else {
                        Notify ( VP00, 3 )
                        Add ( \_SB.MSU, 1, \_SB.MSU )
                    }
                }
                ShiftRight ( Local1, 1, Local1 )
                And ( Local1, 1, Local2 )
                If ( LNotEqual(Local2, \_SB.VP01.FLG) ) {
                    Store ( Local2, \_SB.VP01.FLG )
                    If ( LEqual(Local2, 1) ) {
                        Notify ( VP01, 1 )
                        Subtract ( \_SB.MSU, 1, \_SB.MSU )
                    }
                    Else {
                        Notify ( VP01, 3 )
                        Add ( \_SB.MSU, 1, \_SB.MSU )
                    }
                }
                Return ( One )
            }
        }
        Device ( \_SB.GPEX ) {
            Name ( _HID, "ACPI0006" )
            Name ( _UID, "XENGPE" )
            Name ( _CRS, ResourceTemplate() {
                IO (Decode16, 0xafe0 , 0xafe0, 0x00, 0x4)
            } )
            Method ( _E02 ) {
                \_SB.PRSC ()
            }
        }
    }

Since the position of the XEN data memory area is not known in advance, the
hypervisor will have to replace the address 0xDEADBEEF with the actual memory
address where this structure has been copied. This will involve a memory search
of the AML code resulting from the compilation of the above ASL snippet.

In order to implement this, the hypervisor build is going to use part of
libacpi and the iasl compiler.
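
A hedged sketch of that search-and-replace step: in AML a 32-bit integer
constant is encoded as the DWordPrefix opcode (0x0C) followed by the value in
little-endian byte order, so the search can simply look for that pattern (the
function name is illustrative):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* DWordPrefix (0x0c) followed by 0xDEADBEEF in little-endian order. */
    static const uint8_t placeholder[] = { 0x0c, 0xef, 0xbe, 0xad, 0xde };

    /* Patch the first occurrence of the placeholder with 'addr'.
     * Returns 0 on success, -1 if the placeholder was not found. */
    static int ssdt_patch_xen_region(uint8_t *aml, size_t len, uint32_t addr)
    {
        size_t i;

        for ( i = 0; i + sizeof(placeholder) <= len; i++ )
        {
            if ( memcmp(&aml[i], placeholder, sizeof(placeholder)) )
                continue;

            /* Keep the DWordPrefix byte, overwrite only the operand. */
            aml[i + 1] = addr & 0xff;
            aml[i + 2] = (addr >> 8) & 0xff;
            aml[i + 3] = (addr >> 16) & 0xff;
            aml[i + 4] = (addr >> 24) & 0xff;
            return 0;
        }

        return -1;
    }

Note that the SSDT checksum would have to be recomputed after patching the
address.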



* Re: PVH CPU hotplug design document
  2017-01-12 12:13 PVH CPU hotplug design document Roger Pau Monné
@ 2017-01-12 19:00 ` Andrew Cooper
  2017-01-13  3:06   ` Boris Ostrovsky
                     ` (2 more replies)
  2017-01-13 15:51 ` Jan Beulich
  1 sibling, 3 replies; 46+ messages in thread
From: Andrew Cooper @ 2017-01-12 19:00 UTC (permalink / raw)
  To: Roger Pau Monné, xen-devel
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone, Anshul Makkar,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 12/01/17 12:13, Roger Pau Monné wrote:
> Hello,
>
> Below is a draft of a design document for PVHv2 CPU hotplug. It should cover
> both vCPU and pCPU hotplug. It's mainly centered around the hardware domain,
> since for unprivileged PVH guests the vCPU hotplug mechanism is already
> described in Boris series [0], and it's shared with HVM.
>
> The aim here is to find a way to use ACPI vCPU hotplug for the hardware domain,
> while still being able to properly detect and notify Xen of pCPU hotplug.
>
> Thanks, Roger.
>
> [0] https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html
>
> ---8<---
> % CPU hotplug support for PVH
> % Roger Pau Monné <roger.pau@citrix.com>
> % Draft B
>
> # Revision History
>
> | Version | Date        | Changes                                           |
> |---------|-------------|---------------------------------------------------|
> | Draft A | 5 Jan 2017  | Initial draft.                                    |
> |---------|-------------|---------------------------------------------------|
> | Draft B | 12 Jan 2017 | Removed the XXX comments and clarify some         |
> |         |             | sections.                                         |
> |         |             |                                                   |
> |         |             | Added a sample of the SSDT ASL code that would be |
> |         |             | appended to the hardware domain.                  |
>
> # Preface
>
> This document aims to describe the interface to use in order to implement CPU
> hotplug for PVH guests, this applies to hotplug of both physical and virtual
> CPUs.
>
> # Introduction
>
> One of the design goals of PVH is to be able to remove as much Xen PV specific
> code as possible, thus limiting the number of Xen PV interfaces used by guests,
> and tending to use native interfaces (as used by bare metal) as much as
> possible. This is in line with the efforts also done by Xen on ARM and helps
> reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> kernels.
>
> This however presents some challenges due to the model used by the Xen
> Hypervisor, where some devices are handled by Xen while others are left for the
> hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> harder, since it cannot get the full hardware description from dynamic ACPI
> tables (DSDT, SSDT) without the hardware domain collaboration.
>
> One of such issues is CPU enumeration and hotplug, for both the hardware and
> unprivileged domains. The aim is to be able to use the same enumeration and
> hotplug interface for all PVH guests, regardless of their privilege.
>
> This document aims to describe the interface used in order to fulfill the
> following actions:
>
>  * Virtual CPU (vCPU) enumeration at boot time.
>  * Hotplug of vCPUs.
>  * Hotplug of physical CPUs (pCPUs) to Xen.
>
> # Prior work
>
> ## PV CPU hotplug
>
> CPU hotplug for Xen PV guests is implemented using xenstore and hypercalls. The
> guest has to setup a watch event on the "cpu/" xenstore node, and react to
> changes in this directory. CPUs are added creating a new node and setting it's
> "availability" to online:
>
>     cpu/X/availability = "online"
>
> Where X is the vCPU ID. This is an out-of-band method, that relies on Xen
> specific interfaces in order to perform CPU hotplug.

It is also worth pointing out the shortcomings of this model, i.e. that
there is no mechanism to prevent a guest from onlining more processors if it
ignores the xenstore values.

>
> ## QEMU CPU hotplug using ACPI
>
> The ACPI tables provided to HVM guests contain processor objects, as created by
> libacpi. The number of processor objects in the ACPI namespace matches the
> maximum number of processors supported by HVM guests (up to 128 at the time of
> writing). Processors currently disabled are marked as so in the MADT and in
> their \_MAT and \_STA methods.
>
> A PRST operation region in I/O space is also defined, with a size of 128bits,
> that's used as a bitmap of enabled vCPUs on the system. A PRSC method is
> provided in order to check for updates to the PRST region and trigger
> notifications on the affected processor objects. The execution of the PRSC
> method is done by a GPE event. Then OSPM checks the value returned by \_STA for
> the ACPI\_STA\_DEVICE\_PRESENT flag in order to check if the vCPU has been
> enabled.

Is it worth describing the toolstack side of hotplug? It is equally
relevant IMO.

>
> ## Native CPU hotplug
>
> OSPM waits for a notification from ACPI on the processor object and when an
> event is received the return value from _STA is checked in order to see if
> ACPI\_STA\_DEVICE\_PRESENT has been enabled. This notification is triggered
> from the method of a GPE block.
>
> # PVH CPU hotplug
>
> The aim as stated in the introduction is to use a method as similar as possible
> to bare metal CPU hotplug for PVH, this is feasible for unprivileged domains,
> since the ACPI tables can be created by the toolstack and provided to the
> guest. Then a minimal I/O or memory handler will be added to Xen in order to
> report the bitmap of enabled vCPUs. There's already a [series][0] posted to
> xen-devel that implement this functionality for unprivileged PVH guests.
>
> This however is proven to be quite difficult to implement for the hardware
> domain, since it has to manage both pCPUs and vCPUs. The hardware domain should
> be able to notify Xen of the addition of new pCPUs, so that they can be used by
> the Hypervisor, and also be able to hotplug new vCPUs for it's own usage. Since
> Xen cannot access the dynamic (AML) ACPI tables, because it lacks an AML
> parser, it is the duty of the hardware domain to parse those tables and notify
> Xen of relevant events.
>
> There are several related issues here that prevent a straightforward solution
> to this issue:
>
>  * Xen cannot parse AML tables, and thus cannot get notifications from ACPI
>    events. And even in the case that Xen could parse those tables, there can
>    only be one OSPM registered with ACPI

There can indeed only be one OSPM, which is the entity that executes AML
methods and receives external interrupts from ACPI-related things.

However, dom0 being OSPM does not prohibit Xen from reading and parsing
the AML (should we choose to include that functionality in the
hypervisor).  Xen is fine to do anything it wants in terms of reading
and interpreting the tables, so long as it doesn't start executing AML
bytecode.


>  * Xen can provide a valid MADT table to the hardware domain that describes the
>    environment in which the hardware domain is running, but it cannot prevent
>    the hardware domain from seeing the real processor devices in the ACPI
>    namespace, neither Xen can provide the hardware domain with processor

", nor can Xen provide the..."

>    devices that match the vCPUs at the moment.
>
> [0]: https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html
>
> ## Proposed solution using the STAO
>
> The general idea of this method is to use the STAO in order to hide the pCPUs
> from the hardware domain, and provide processor objects for vCPUs in an extra
> SSDT table.
>
> This method requires one change to the STAO, in order to be able to notify the
> hardware domain of which processors found in ACPI tables are pCPUs. The
> description of the new STAO field is as follows:
>
>  |   Field            | Byte Length | Byte Offset |     Description          |
>  |--------------------|:-----------:|:-----------:|--------------------------|
>  | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
>  |                    |             |             | where each number is the |
>  |                    |             |             | Processor UID of a       |
>  |                    |             |             | physical CPU, and should |
>  |                    |             |             | be treated specially by  |
>  |                    |             |             | the OSPM                 |
>
> The list of UIDs in this new field would be matched against the ACPI Processor
> UID field found in local/x2 APIC MADT structs and Processor objects in the ACPI
> namespace, and the OSPM should either ignore those objects, or in case it
> implements pCPU hotplug, it should notify Xen of changes to these objects.
>
> The contents of the MADT provided to the hardware domain are also going to be
> different from the contents of the MADT as found in native ACPI. The local/x2
> APIC entries for all the pCPUs are going to be marked as disabled.
>
> Extra entries are going to be added for each vCPU available to the hardware
> domain, up to the maximum number of supported vCPUs. Note that supported vCPUs
> might be different than enabled vCPUs, so it's possible that some of these
> entries are also going to be marked as disabled. The entries for vCPUs on the
> MADT are going to use a processor local x2 APIC structure, and the ACPI
> processor ID of the first vCPU is going to be UINT32_MAX - HVM_MAX_VCPUS, in
> order to avoid clashes with IDs of pCPUs.

This is slightly problematic.  There is no restriction (so far as I am
aware) on which ACPI IDs the firmware picks for its objects.  They need
not be consecutive, logical, or start from 0.

If STAO is being extended to list the IDs of the physical processor
objects, we should go one step further and explicitly list the IDs of
the virtual processor objects.  This leaves us flexibility if we have to
avoid awkward firmware ID layouts.

It is also worth stating that this puts an upper limit on nr_pcpus +
nr_dom0_vcpus (but 4 billion processors really ought to be enough for
anyone...)

> In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
> processor object in the ACPI namespace, so that the OSPM can request
> notifications and get the value of the \_STA and \_MAT methods. This can be
> problematic because Xen doesn't know the ACPI name of the other processor
> objects, so blindly adding new ones can create namespace clashes.
>
> This can be solved by using a different ACPI name in order to describe vCPUs in
> the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
> the processor objects, so using a 'VP' (ie: Virtual Processor) prefix should
> prevent clashes.

One system I have to hand (with more than 255 pcpus) uses Cxxx

To avoid namespace collisions, I can't see any option but to parse the
DSDT/SSDTs to at least confirm that VPxx is available to use.

>
> A Xen GPE device block will be used in order to deliver events related to the
> vCPUs available to the guest, since Xen doesn't know if there are any bits
> available in the native GPEs. A SCI interrupt will be injected into the guest
> in order to trigger the event.
>
> The following snippet is a representation of the ASL SSDT code that is proposed
> for the hardware domain:
>
>     DefinitionBlock ("SSDT.aml", "SSDT", 5, "Xen", "HVM", 0)
>     {
>         Scope (\_SB)
>         {
>            OperationRegion(XEN, SystemMemory, 0xDEADBEEF, 40)
>            Field(XEN, ByteAcc, NoLock, Preserve) {
>                NCPU, 16, /* Number of vCPUs */
>                MSUA, 32, /* MADT checksum address */
>                MAPA, 32, /* MADT LAPIC0 address */
>            }
>         }
>         Scope ( \_SB ) {
>             OperationRegion ( MSUM, SystemMemory, \_SB.MSUA, 1 )
>             Field ( MSUM, ByteAcc, NoLock, Preserve ) {
>                 MSU, 8
>             }
>             Method ( PMAT, 2 ) {
>                 If ( LLess(Arg0, NCPU) ) {
>                     Return ( ToBuffer(Arg1) )
>                 }
>                 Return ( Buffer() {0, 8, 0xff, 0xff, 0, 0, 0, 0} )
>             }
>             Processor ( VP00, 0, 0x0000b010, 0x06 ) {
>                 Name ( _HID, "ACPI0007" )
>                 Name ( _UID, 4294967167 )
>                 OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 0), 8 )
>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>                     MAT, 64
>                 }
>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>                     Offset(4),
>                     FLG, 1
>                 }
>                 Method ( _MAT, 0 ) {
>                     Return ( ToBuffer(MAT) )
>                 }
>                 Method ( _STA ) {
>                     If ( FLG ) {
>                         Return ( 0xF )
>                     }
>                     Return ( 0x0 )
>                 }
>                 Method ( _EJ0, 1, NotSerialized ) {
>                     Sleep ( 0xC8 )
>                 }
>             }
>             Processor ( VP01, 1, 0x0000b010, 0x06 ) {
>                 Name ( _HID, "ACPI0007" )
>                 Name ( _UID, 4294967168 )
>                 OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 8), 8 )
>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>                     MAT, 64
>                 }
>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>                     Offset(4),
>                     FLG, 1
>                 }
>                 Method ( _MAT, 0 ) {
>                     Return ( PMAT (1, MAT) )
>                 }
>                 Method ( _STA ) {
>                     If ( LLess(1, \_SB.NCPU) ) {
>                         If ( FLG ) {
>                             Return ( 0xF )
>                         }
>                     }
>                     Return ( 0x0 )
>                 }
>                 Method ( _EJ0, 1, NotSerialized ) {
>                     Sleep ( 0xC8 )
>                 }
>             }
>             OperationRegion ( PRST, SystemIO, 0xaf00, 1 )

This also has a chance of collision, both with the system ACPI
controller, and also with PCIe devices advertising IO-BARs.  (All
graphics cards ever have IO-BARs, because windows refuses to bind a
graphics driver to a PCI graphics device if the PCI device doesn't have
at least one IO-BAR.  Because PCIe requires 4k alignment on the upstream
bridge IO-windows, there is a surprisingly low limit on the number of
graphics cards you can put in a server and have functioning to windows
satisfaction.)

As with the other risks of collisions, Xen is going to have to search
the system to find a free area to use.

>             Field ( PRST, ByteAcc, NoLock, Preserve ) {
>                 PRS, 2
>             }
>             Method ( PRSC, 0 ) {
>                 Store ( ToBuffer(PRS), Local0 )
>                 Store ( DerefOf(Index(Local0, 0)), Local1 )
>                 And ( Local1, 1, Local2 )
>                 If ( LNotEqual(Local2, \_SB.VP00.FLG) ) {
>                     Store ( Local2, \_SB.VP00.FLG )
>                     If ( LEqual(Local2, 1) ) {
>                         Notify ( VP00, 1 )
>                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
>                     }
>                     Else {
>                         Notify ( VP00, 3 )
>                         Add ( \_SB.MSU, 1, \_SB.MSU )
>                     }
>                 }
>                 ShiftRight ( Local1, 1, Local1 )
>                 And ( Local1, 1, Local2 )
>                 If ( LNotEqual(Local2, \_SB.VP01.FLG) ) {
>                     Store ( Local2, \_SB.VP01.FLG )
>                     If ( LEqual(Local2, 1) ) {
>                         Notify ( VP01, 1 )
>                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
>                     }
>                     Else {
>                         Notify ( VP01, 3 )
>                         Add ( \_SB.MSU, 1, \_SB.MSU )
>                     }
>                 }
>                 Return ( One )
>             }
>         }
>         Device ( \_SB.GPEX ) {
>             Name ( _HID, "ACPI0006" )
>             Name ( _UID, "XENGPE" )
>             Name ( _CRS, ResourceTemplate() {
>                 IO (Decode16, 0xafe0 , 0xafe0, 0x00, 0x4)
>             } )
>             Method ( _E02 ) {
>                 \_SB.PRSC ()
>             }
>         }
>     }
>
> Since the position of the XEN data memory area is not know, the hypervisor will
> have to replace the address 0xdeadbeef with the actual memory address where
> this structure has been copied. This will involve a memory search of the AML
> code resulting from the compilation of the above ASL snippet.

This is also slightly risky.  If we need to do this, can we get a
relocation list from the compiled table from iasl?

~Andrew

>
> In order to implement this, the hypervisor build is going to use part of
> libacpi and the iasl compiler.
>



* Re: PVH CPU hotplug design document
  2017-01-12 19:00 ` Andrew Cooper
@ 2017-01-13  3:06   ` Boris Ostrovsky
  2017-01-13 15:27   ` Jan Beulich
  2017-01-16 14:50   ` Roger Pau Monné
  2 siblings, 0 replies; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-13  3:06 UTC (permalink / raw)
  To: Andrew Cooper, Roger Pau Monné, xen-devel
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory, Anshul Makkar,
	Julien Grall, Jan Beulich



On 01/12/2017 02:00 PM, Andrew Cooper wrote:
> On 12/01/17 12:13, Roger Pau Monné wrote:

>> ## Proposed solution using the STAO
>>
>> The general idea of this method is to use the STAO in order to hide the pCPUs
>> from the hardware domain, and provide processor objects for vCPUs in an extra
>> SSDT table.
>>
>> This method requires one change to the STAO, in order to be able to notify the
>> hardware domain of which processors found in ACPI tables are pCPUs. The
>> description of the new STAO field is as follows:
>>
>>  |   Field            | Byte Length | Byte Offset |     Description          |
>>  |--------------------|:-----------:|:-----------:|--------------------------|
>>  | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
>>  |                    |             |             | where each number is the |
>>  |                    |             |             | Processor UID of a       |
>>  |                    |             |             | physical CPU, and should |
>>  |                    |             |             | be treated specially by  |
>>  |                    |             |             | the OSPM                 |
>>
>> The list of UIDs in this new field would be matched against the ACPI Processor
>> UID field found in local/x2 APIC MADT structs and Processor objects in the ACPI
>> namespace, and the OSPM should either ignore those objects, or in case it
>> implements pCPU hotplug, it should notify Xen of changes to these objects.
>>
>> The contents of the MADT provided to the hardware domain are also going to be
>> different from the contents of the MADT as found in native ACPI. The local/x2
>> APIC entries for all the pCPUs are going to be marked as disabled.
>>
>> Extra entries are going to be added for each vCPU available to the hardware
>> domain, up to the maximum number of supported vCPUs. Note that supported vCPUs
>> might be different than enabled vCPUs, so it's possible that some of these
>> entries are also going to be marked as disabled. The entries for vCPUs on the
>> MADT are going to use a processor local x2 APIC structure, and the ACPI
>> processor ID of the first vCPU is going to be UINT32_MAX - HVM_MAX_VCPUS, in
>> order to avoid clashes with IDs of pCPUs.
>
> This is slightly problematic.  There is no restriction (so far as I am
> aware) on which ACPI IDs the firmware picks for its objects.  They need
> not be consecutive, logical, or start from 0.
>
> If STAO is being extended to list the IDs of the physical processor
> objects, we should go one step further and explicitly list the IDs of
> the virtual processor objects.  This leaves us flexibility if we have to
> avoid awkward firmware ID layouts.


I don't think I understand how we'd use VCPU list in STAO. Can you 
explain this?


>
> It is also work stating that this puts an upper limit on nr_pcpus +
> nr_dom0_vcpus (but 4 billion processors really ought to be enough for
> anyone...)
>
>> In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
>> processor object in the ACPI namespace, so that the OSPM can request
>> notifications and get the value of the \_STA and \_MAT methods. This can be
>> problematic because Xen doesn't know the ACPI name of the other processor
>> objects, so blindly adding new ones can create namespace clashes.
>>
>> This can be solved by using a different ACPI name in order to describe vCPUs in
>> the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
>> the processor objects, so using a 'VP' (ie: Virtual Processor) prefix should
>> prevent clashes.
>
> One system I have to hand (with more than 255 pcpus) uses Cxxx
>
> To avoid namespace collisions, I can't see any option but to parse the
> DSDT/SSDTs to at least confirm that VPxx is available to use.

You are talking about Xen doing this, right? Meaning that we'd need to add an
AML parser to the hypervisor?

If we do that, I wonder whether this will also help us to deal with _PSS 
and _CST, which we now have to pass down from dom0.

>
>>
>> A Xen GPE device block will be used in order to deliver events related to the
>> vCPUs available to the guest, since Xen doesn't know if there are any bits
>> available in the native GPEs. A SCI interrupt will be injected into the guest
>> in order to trigger the event.
>>
>> The following snippet is a representation of the ASL SSDT code that is proposed
>> for the hardware domain:
>>
>>     DefinitionBlock ("SSDT.aml", "SSDT", 5, "Xen", "HVM", 0)
>>     {
>>         Scope (\_SB)
>>         {
>>            OperationRegion(XEN, SystemMemory, 0xDEADBEEF, 40)
>>            Field(XEN, ByteAcc, NoLock, Preserve) {
>>                NCPU, 16, /* Number of vCPUs */
>>                MSUA, 32, /* MADT checksum address */
>>                MAPA, 32, /* MADT LAPIC0 address */
>>            }
>>         }
>>         Scope ( \_SB ) {
>>             OperationRegion ( MSUM, SystemMemory, \_SB.MSUA, 1 )
>>             Field ( MSUM, ByteAcc, NoLock, Preserve ) {
>>                 MSU, 8
>>             }
>>             Method ( PMAT, 2 ) {
>>                 If ( LLess(Arg0, NCPU) ) {
>>                     Return ( ToBuffer(Arg1) )
>>                 }
>>                 Return ( Buffer() {0, 8, 0xff, 0xff, 0, 0, 0, 0} )
>>             }
>>             Processor ( VP00, 0, 0x0000b010, 0x06 ) {
>>                 Name ( _HID, "ACPI0007" )
>>                 Name ( _UID, 4294967167 )
>>                 OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 0), 8 )
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     MAT, 64
>>                 }
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     Offset(4),
>>                     FLG, 1
>>                 }
>>                 Method ( _MAT, 0 ) {
>>                     Return ( ToBuffer(MAT) )
>>                 }
>>                 Method ( _STA ) {
>>                     If ( FLG ) {
>>                         Return ( 0xF )
>>                     }
>>                     Return ( 0x0 )
>>                 }
>>                 Method ( _EJ0, 1, NotSerialized ) {
>>                     Sleep ( 0xC8 )
>>                 }
>>             }
>>             Processor ( VP01, 1, 0x0000b010, 0x06 ) {
>>                 Name ( _HID, "ACPI0007" )
>>                 Name ( _UID, 4294967168 )
>>                 OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 8), 8 )
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     MAT, 64
>>                 }
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     Offset(4),
>>                     FLG, 1
>>                 }
>>                 Method ( _MAT, 0 ) {
>>                     Return ( PMAT (1, MAT) )
>>                 }
>>                 Method ( _STA ) {
>>                     If ( LLess(1, \_SB.NCPU) ) {
>>                         If ( FLG ) {
>>                             Return ( 0xF )
>>                         }
>>                     }
>>                     Return ( 0x0 )
>>                 }
>>                 Method ( _EJ0, 1, NotSerialized ) {
>>                     Sleep ( 0xC8 )
>>                 }
>>             }
>>             OperationRegion ( PRST, SystemIO, 0xaf00, 1 )
>
> This also has a chance of collision, both with the system ACPI
> controller, and also with PCIe devices advertising IO-BARs.  (All
> graphics cards ever have IO-BARs, because windows refuses to bind a
> graphics driver to a PCI graphics device if the PCI device doesn't have
> at least one IO-BAR.  Because PCIe requires 4k alignment on the upstream
> bridge IO-windows, there is a surprisingly low limit on the number of
> graphics cards you can put in a server and have functioning to windows
> satisfaction.)
>
> As with the other risks of collisions, Xen is going to have to search
> the system to find a free area to use.


I am pretty ignorant about AML but is it possible to have AML 
dynamically determine the address? Or is it a compile-time value?


-boris

>
>>             Field ( PRST, ByteAcc, NoLock, Preserve ) {
>>                 PRS, 2
>>             }
>>             Method ( PRSC, 0 ) {
>>                 Store ( ToBuffer(PRS), Local0 )
>>                 Store ( DerefOf(Index(Local0, 0)), Local1 )
>>                 And ( Local1, 1, Local2 )
>>                 If ( LNotEqual(Local2, \_SB.VP00.FLG) ) {
>>                     Store ( Local2, \_SB.VP00.FLG )
>>                     If ( LEqual(Local2, 1) ) {
>>                         Notify ( VP00, 1 )
>>                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                     Else {
>>                         Notify ( VP00, 3 )
>>                         Add ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                 }
>>                 ShiftRight ( Local1, 1, Local1 )
>>                 And ( Local1, 1, Local2 )
>>                 If ( LNotEqual(Local2, \_SB.VP01.FLG) ) {
>>                     Store ( Local2, \_SB.VP01.FLG )
>>                     If ( LEqual(Local2, 1) ) {
>>                         Notify ( VP01, 1 )
>>                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                     Else {
>>                         Notify ( VP01, 3 )
>>                         Add ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                 }
>>                 Return ( One )
>>             }
>>         }
>>         Device ( \_SB.GPEX ) {
>>             Name ( _HID, "ACPI0006" )
>>             Name ( _UID, "XENGPE" )
>>             Name ( _CRS, ResourceTemplate() {
>>                 IO (Decode16, 0xafe0 , 0xafe0, 0x00, 0x4)
>>             } )
>>             Method ( _E02 ) {
>>                 \_SB.PRSC ()
>>             }
>>         }
>>     }
>>
>> Since the position of the XEN data memory area is not know, the hypervisor will
>> have to replace the address 0xdeadbeef with the actual memory address where
>> this structure has been copied. This will involve a memory search of the AML
>> code resulting from the compilation of the above ASL snippet.
>
> This is also slightly risky.  If we need to do this, can we get a
> relocation list from the compiled table from iasl?
>
> ~Andrew
>
>>
>> In order to implement this, the hypervisor build is going to use part of
>> libacpi and the iasl compiler.
>>
>


* Re: PVH CPU hotplug design document
  2017-01-12 19:00 ` Andrew Cooper
  2017-01-13  3:06   ` Boris Ostrovsky
@ 2017-01-13 15:27   ` Jan Beulich
  2017-01-16 14:59     ` Roger Pau Monné
  2017-01-16 14:50   ` Roger Pau Monné
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-13 15:27 UTC (permalink / raw)
  To: Andrew Cooper, roger.pau
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

>>> On 12.01.17 at 20:00, <andrew.cooper3@citrix.com> wrote:
> On 12/01/17 12:13, Roger Pau Monné wrote:
>> ## Proposed solution using the STAO
>>
>> The general idea of this method is to use the STAO in order to hide the pCPUs
>> from the hardware domain, and provide processor objects for vCPUs in an extra
>> SSDT table.
>>
>> This method requires one change to the STAO, in order to be able to notify the
>> hardware domain of which processors found in ACPI tables are pCPUs. The
>> description of the new STAO field is as follows:
>>
>>  |   Field            | Byte Length | Byte Offset |     Description          |
>>  |--------------------|:-----------:|:-----------:|--------------------------|
>>  | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
>>  |                    |             |             | where each number is the |
>>  |                    |             |             | Processor UID of a       |
>>  |                    |             |             | physical CPU, and should |
>>  |                    |             |             | be treated specially by  |
>>  |                    |             |             | the OSPM                 |
>>
>> The list of UIDs in this new field would be matched against the ACPI Processor
>> UID field found in local/x2 APIC MADT structs and Processor objects in the ACPI
>> namespace, and the OSPM should either ignore those objects, or in case it
>> implements pCPU hotplug, it should notify Xen of changes to these objects.
>>
>> The contents of the MADT provided to the hardware domain are also going to be
>> different from the contents of the MADT as found in native ACPI. The local/x2
>> APIC entries for all the pCPUs are going to be marked as disabled.
>>
>> Extra entries are going to be added for each vCPU available to the hardware
>> domain, up to the maximum number of supported vCPUs. Note that supported vCPUs
>> might be different than enabled vCPUs, so it's possible that some of these
>> entries are also going to be marked as disabled. The entries for vCPUs on the
>> MADT are going to use a processor local x2 APIC structure, and the ACPI
>> processor ID of the first vCPU is going to be UINT32_MAX - HVM_MAX_VCPUS, in
>> order to avoid clashes with IDs of pCPUs.
> 
> This is slightly problematic.  There is no restriction (so far as I am
> aware) on which ACPI IDs the firmware picks for its objects.  They need
> not be consecutive, logical, or start from 0.
> 
> If STAO is being extended to list the IDs of the physical processor
> objects, we should go one step further and explicitly list the IDs of
> the virtual processor objects.  This leaves us flexibility if we have to
> avoid awkward firmware ID layouts.

I don't think we should do this - vCPU IDs are already in MADT. I do,
however, think that we shouldn't name any specific IDs we mean to
use for the vCPU-s, but rather merely guarantee that there won't be
any overlap with the pCPU ones.

>> In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
>> processor object in the ACPI namespace, so that the OSPM can request
>> notifications and get the value of the \_STA and \_MAT methods. This can be
>> problematic because Xen doesn't know the ACPI name of the other processor
>> objects, so blindly adding new ones can create namespace clashes.
>>
>> This can be solved by using a different ACPI name in order to describe vCPUs in
>> the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
>> the processor objects, so using a 'VP' (ie: Virtual Processor) prefix should
>> prevent clashes.
> 
> One system I have to hand (with more than 255 pcpus) uses Cxxx
> 
> To avoid namespace collisions, I can't see any option but to parse the
> DSDT/SSDTs to at least confirm that VPxx is available to use.

And additionally using a two character name prefix would significantly
limit the number of vCPU-s we would be able to support going forward.
Just like above, I don't think we should specify the name here at all,
allowing dynamic picking of suitable ones.

>> A Xen GPE device block will be used in order to deliver events related to the
>> vCPUs available to the guest, since Xen doesn't know if there are any bits
>> available in the native GPEs. A SCI interrupt will be injected into the guest
>> in order to trigger the event.
>>
>> The following snippet is a representation of the ASL SSDT code that is proposed
>> for the hardware domain:
>>
>>     DefinitionBlock ("SSDT.aml", "SSDT", 5, "Xen", "HVM", 0)
>>     {
>>         Scope (\_SB)
>>         {
>>            OperationRegion(XEN, SystemMemory, 0xDEADBEEF, 40)
>>            Field(XEN, ByteAcc, NoLock, Preserve) {
>>                NCPU, 16, /* Number of vCPUs */
>>                MSUA, 32, /* MADT checksum address */
>>                MAPA, 32, /* MADT LAPIC0 address */
>>            }
[...]
>> Since the position of the XEN data memory area is not know, the hypervisor will
>> have to replace the address 0xdeadbeef with the actual memory address where
>> this structure has been copied. This will involve a memory search of the AML
>> code resulting from the compilation of the above ASL snippet.
> 
> This is also slightly risky.  If we need to do this, can we get a
> relocation list from the compiled table from iasl?

I expect iasl can't do that, especially as there's not actually any
relocation involved here. I guess we'd need a double compilation
approach, where a different address is specified for each. The diff of
the two would then allow creating a relocation list.

Jan
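
A hedged sketch of the double-compilation diff described above: compile the
ASL twice with two different placeholder addresses and record every offset at
which the two blobs differ (the assumption that both blobs have equal length
is illustrative; the table checksum byte will also differ and would have to be
filtered out, or simply recomputed after patching):

    #include <stddef.h>
    #include <stdio.h>

    /* Print the byte offsets at which two compiled AML blobs differ; those
     * offsets form the "relocation list" for the placeholder address. */
    static void aml_reloc_list(const unsigned char *a, const unsigned char *b,
                               size_t len)
    {
        size_t i;

        for ( i = 0; i < len; i++ )
            if ( a[i] != b[i] )
                printf("reloc at offset %zu\n", i);
    }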


* Re: PVH CPU hotplug design document
  2017-01-12 12:13 PVH CPU hotplug design document Roger Pau Monné
  2017-01-12 19:00 ` Andrew Cooper
@ 2017-01-13 15:51 ` Jan Beulich
  2017-01-13 19:41   ` Stefano Stabellini
                     ` (2 more replies)
  1 sibling, 3 replies; 46+ messages in thread
From: Jan Beulich @ 2017-01-13 15:51 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory, Andrew Cooper,
	Anshul Makkar, Julien Grall, xen-devel, Boris Ostrovsky

>>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
> # Introduction
> 
> One of the design goals of PVH is to be able to remove as much Xen PV specific
> code as possible, thus limiting the number of Xen PV interfaces used by guests,
> and tending to use native interfaces (as used by bare metal) as much as
> possible. This is in line with the efforts also done by Xen on ARM and helps
> reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> kernels.
> 
> This however presents some challenges due to the model used by the Xen
> Hypervisor, where some devices are handled by Xen while others are left for the
> hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> harder, since it cannot get the full hardware description from dynamic ACPI
> tables (DSDT, SSDT) without the hardware domain collaboration.

Considering all the difficulties with the proposed model plus the little
code the PV vCPU hotplug logic requires in the kernel (assuming a
xenbus driver to be there anyway), I'm rather unconvinced that
going this route instead of sticking to the PV model is actually
desirable. And clearly, for consistency within the kernel, in such a
case I'd then also favor sticking to this model for DomU.

Jan



* Re: PVH CPU hotplug design document
  2017-01-13 15:51 ` Jan Beulich
@ 2017-01-13 19:41   ` Stefano Stabellini
  2017-01-14  1:44   ` Boris Ostrovsky
  2017-01-16 15:14   ` Roger Pau Monné
  2 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2017-01-13 19:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky, Roger Pau Monné

On Fri, 13 Jan 2017, Jan Beulich wrote:
> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
> > # Introduction
> > 
> > One of the design goals of PVH is to be able to remove as much Xen PV specific
> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
> > and tending to use native interfaces (as used by bare metal) as much as
> > possible. This is in line with the efforts also done by Xen on ARM and helps
> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> > kernels.
> > 
> > This however presents some challenges due to the model used by the Xen
> > Hypervisor, where some devices are handled by Xen while others are left for the
> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> > harder, since it cannot get the full hardware description from dynamic ACPI
> > tables (DSDT, SSDT) without the hardware domain collaboration.
> 
> Considering all the difficulties with the proposed model plus the little
> code the PV vCPU hotplug logic requires in the kernel (assuming a
> xenbus driver to be there anyway), I'm rather unconvinced that
> going this route instead of sticking to the PV model is actually
> desirable. And clearly, for consistency within the kernel, in such a
> case I'd then also favor sticking to this model for DomU.

+1


* Re: PVH CPU hotplug design document
  2017-01-13 15:51 ` Jan Beulich
  2017-01-13 19:41   ` Stefano Stabellini
@ 2017-01-14  1:44   ` Boris Ostrovsky
  2017-01-16 11:03     ` Jan Beulich
  2017-01-16 15:14   ` Roger Pau Monné
  2 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-14  1:44 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel

On 01/13/2017 10:51 AM, Jan Beulich wrote:
>>>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
>> # Introduction
>>
>> One of the design goals of PVH is to be able to remove as much Xen PV specific
>> code as possible, thus limiting the number of Xen PV interfaces used by guests,
>> and tending to use native interfaces (as used by bare metal) as much as
>> possible. This is in line with the efforts also done by Xen on ARM and helps
>> reduce the burden of maintaining huge amounts of Xen PV code inside of guests
>> kernels.
>>
>> This however presents some challenges due to the model used by the Xen
>> Hypervisor, where some devices are handled by Xen while others are left for the
>> hardware domain to manage. The fact that Xen lacks and AML parser also makes it
>> harder, since it cannot get the full hardware description from dynamic ACPI
>> tables (DSDT, SSDT) without the hardware domain collaboration.
> Considering all the difficulties with the proposed model plus the little
> code the PV vCPU hotplug logic requires in the kernel (assuming a
> xenbus driver to be there anyway), I'm rather unconvinced that
> going this route instead of sticking to the PV model is actually
> desirable. And clearly, for consistency within the kernel, in such a
> case I'd then also favor sticking to this model for DomU.


Can the changes that Roger proposed here be added later? Will they
require a major rewrite of what we have now (or will soon have) for dom0
or is it just adding new code (mostly)?

-boris



* Re: PVH CPU hotplug design document
  2017-01-14  1:44   ` Boris Ostrovsky
@ 2017-01-16 11:03     ` Jan Beulich
  0 siblings, 0 replies; 46+ messages in thread
From: Jan Beulich @ 2017-01-16 11:03 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, roger.pau

>>> On 14.01.17 at 02:44, <boris.ostrovsky@oracle.com> wrote:
> On 01/13/2017 10:51 AM, Jan Beulich wrote:
>>>>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
>>> # Introduction
>>>
>>> One of the design goals of PVH is to be able to remove as much Xen PV specific
>>> code as possible, thus limiting the number of Xen PV interfaces used by guests,
>>> and tending to use native interfaces (as used by bare metal) as much as
>>> possible. This is in line with the efforts also done by Xen on ARM and helps
>>> reduce the burden of maintaining huge amounts of Xen PV code inside of guests
>>> kernels.
>>>
>>> This however presents some challenges due to the model used by the Xen
>>> Hypervisor, where some devices are handled by Xen while others are left for the
>>> hardware domain to manage. The fact that Xen lacks and AML parser also makes it
>>> harder, since it cannot get the full hardware description from dynamic ACPI
>>> tables (DSDT, SSDT) without the hardware domain collaboration.
>> Considering all the difficulties with the proposed model plus the little
>> code the PV vCPU hotplug logic requires in the kernel (assuming a
>> xenbus driver to be there anyway), I'm rather unconvinced that
>> going this route instead of sticking to the PV model is actually
>> desirable. And clearly, for consistency within the kernel, in such a
>> case I'd then also favor sticking to this model for DomU.
> 
> 
> Can the changes that Roger proposed here be added later? Will they
> require a major rewrite of what we have now (or will soon have) for dom0
> or is it just adding new code (mostly)?

I'm not sure I understand the first question in particular, and it's not
clear what rewrite (of Xen, tools, or OSes) you allude to in the second.
In the end, anything can be added later (so the answer to question 1
would be yes no matter what, without further qualification).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-12 19:00 ` Andrew Cooper
  2017-01-13  3:06   ` Boris Ostrovsky
  2017-01-13 15:27   ` Jan Beulich
@ 2017-01-16 14:50   ` Roger Pau Monné
  2 siblings, 0 replies; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-16 14:50 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Anshul Makkar, Julien Grall, Jan Beulich,
	xen-devel, Boris Ostrovsky

On Thu, Jan 12, 2017 at 07:00:57PM +0000, Andrew Cooper wrote:
> On 12/01/17 12:13, Roger Pau Monné wrote:
[...]
> > ## QEMU CPU hotplug using ACPI
> >
> > The ACPI tables provided to HVM guests contain processor objects, as created by
> > libacpi. The number of processor objects in the ACPI namespace matches the
> > maximum number of processors supported by HVM guests (up to 128 at the time of
> > writing). Processors currently disabled are marked as so in the MADT and in
> > their \_MAT and \_STA methods.
> >
> > A PRST operation region in I/O space is also defined, with a size of 128bits,
> > that's used as a bitmap of enabled vCPUs on the system. A PRSC method is
> > provided in order to check for updates to the PRST region and trigger
> > notifications on the affected processor objects. The execution of the PRSC
> > method is done by a GPE event. Then OSPM checks the value returned by \_STA for
> > the ACPI\_STA\_DEVICE\_PRESENT flag in order to check if the vCPU has been
> > enabled.
> 
> Is it worth describing the toolstack side of hotplug? It is equally
> relevant IMO.

By toolstack I assume you mean the hypercalls or xenstore writes that are
performed in order to notify QEMU or Xen of new vCPUs?

I haven't looked much into this, but I guess Boris could fill in some of the
details.

> >
> > ## Native CPU hotplug
> >
> > OSPM waits for a notification from ACPI on the processor object and when an
> > event is received the return value from _STA is checked in order to see if
> > ACPI\_STA\_DEVICE\_PRESENT has been enabled. This notification is triggered
> > from the method of a GPE block.
> >
> > # PVH CPU hotplug
> >
> > The aim as stated in the introduction is to use a method as similar as possible
> > to bare metal CPU hotplug for PVH, this is feasible for unprivileged domains,
> > since the ACPI tables can be created by the toolstack and provided to the
> > guest. Then a minimal I/O or memory handler will be added to Xen in order to
> > report the bitmap of enabled vCPUs. There's already a [series][0] posted to
> > xen-devel that implement this functionality for unprivileged PVH guests.
> >
> > This however is proven to be quite difficult to implement for the hardware
> > domain, since it has to manage both pCPUs and vCPUs. The hardware domain should
> > be able to notify Xen of the addition of new pCPUs, so that they can be used by
> > the Hypervisor, and also be able to hotplug new vCPUs for it's own usage. Since
> > Xen cannot access the dynamic (AML) ACPI tables, because it lacks an AML
> > parser, it is the duty of the hardware domain to parse those tables and notify
> > Xen of relevant events.
> >
> > There are several related issues here that prevent a straightforward solution
> > to this issue:
> >
> >  * Xen cannot parse AML tables, and thus cannot get notifications from ACPI
> >    events. And even in the case that Xen could parse those tables, there can
> >    only be one OSPM registered with ACPI
> 
> There can indeed only be one OSPM, which is the entity that executes AML
> methods and receives external interrupts from ACPI-related things.
> 
> However, dom0 being OSPM does not prohibit Xen from reading and parsing
> the AML (should we choose to include that functionality in the
> hypervisor).  Xen is fine to do anything it wants in terms of reading
> and interpreting the tables, so long as it doesn't start executing AML
> bytecode.

I would like to see this too, since it would allow Xen to see the CPU power
states and shut down the hardware without using the complicated mess that we
currently have in order to perform an ACPI shutdown.

> >  * Xen can provide a valid MADT table to the hardware domain that describes the
> >    environment in which the hardware domain is running, but it cannot prevent
> >    the hardware domain from seeing the real processor devices in the ACPI
> >    namespace, neither Xen can provide the hardware domain with processor
> 
> ", nor can Xen provide the..."
> 
> >    devices that match the vCPUs at the moment.
> >
> > [0]: https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html
> >
> > ## Proposed solution using the STAO
> >
> > The general idea of this method is to use the STAO in order to hide the pCPUs
> > from the hardware domain, and provide processor objects for vCPUs in an extra
> > SSDT table.
> >
> > This method requires one change to the STAO, in order to be able to notify the
> > hardware domain of which processors found in ACPI tables are pCPUs. The
> > description of the new STAO field is as follows:
> >
> >  |   Field            | Byte Length | Byte Offset |     Description          |
> >  |--------------------|:-----------:|:-----------:|--------------------------|
> >  | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
> >  |                    |             |             | where each number is the |
> >  |                    |             |             | Processor UID of a       |
> >  |                    |             |             | physical CPU, and should |
> >  |                    |             |             | be treated specially by  |
> >  |                    |             |             | the OSPM                 |
> >
> > The list of UIDs in this new field would be matched against the ACPI Processor
> > UID field found in local/x2 APIC MADT structs and Processor objects in the ACPI
> > namespace, and the OSPM should either ignore those objects, or in case it
> > implements pCPU hotplug, it should notify Xen of changes to these objects.
> >
> > The contents of the MADT provided to the hardware domain are also going to be
> > different from the contents of the MADT as found in native ACPI. The local/x2
> > APIC entries for all the pCPUs are going to be marked as disabled.
> >
> > Extra entries are going to be added for each vCPU available to the hardware
> > domain, up to the maximum number of supported vCPUs. Note that supported vCPUs
> > might be different than enabled vCPUs, so it's possible that some of these
> > entries are also going to be marked as disabled. The entries for vCPUs on the
> > MADT are going to use a processor local x2 APIC structure, and the ACPI
> > processor ID of the first vCPU is going to be UINT32_MAX - HVM_MAX_VCPUS, in
> > order to avoid clashes with IDs of pCPUs.
> 
> This is slightly problematic.  There is no restriction (so far as I am
> aware) on which ACPI IDs the firmware picks for its objects.  They need
> not be consecutive, logical, or start from 0.
> 
> If STAO is being extended to list the IDs of the physical processor
> objects, we should go one step further and explicitly list the IDs of
> the virtual processor objects.  This leaves us flexibility if we have to
> avoid awkward firmware ID layouts.
> 
> It is also worth stating that this puts an upper limit on nr_pcpus +
> nr_dom0_vcpus (but 4 billion processors really ought to be enough for
> anyone...)

Right, I think that I will change that to instead use dynamic ACPI processor
UIDs, and have Xen replace them from the AML.

> > In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
> > processor object in the ACPI namespace, so that the OSPM can request
> > notifications and get the value of the \_STA and \_MAT methods. This can be
> > problematic because Xen doesn't know the ACPI name of the other processor
> > objects, so blindly adding new ones can create namespace clashes.
> >
> > This can be solved by using a different ACPI name in order to describe vCPUs in
> > the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
> > the processor objects, so using a 'VP' (ie: Virtual Processor) prefix should
> > prevent clashes.
> 
> One system I have to hand (with more than 255 pcpus) uses Cxxx
> 
> To avoid namespace collisions, I can't see any option but to parse the
> DSDT/SSDTs to at least confirm that VPxx is available to use.

Hm, what about defining a new bus for Xen, so the SSDT would look like:

Device ( \_SB.XEN ) {
    Name ( _HID, "ACPI0004" ) /* ACPI Module Device */
}
Scope ( \_SB.XEN ) {
    OperationRegion ( ... )
    Processor ( VP00, 0, 0x0000b010, 0x06 ) {
        ...
    }
    Processor ( VP01, 1, 0x0000b010, 0x06 ) {
        [...]
    }
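    /* PRST: Xen-provided bitmap of currently enabled vCPUs (one bit per vCPU; 2 bits here for VP00/VP01) */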
    OperationRegion ( PRST, SystemIO, 0xaf00, 1 )
    Field ( PRST, ByteAcc, NoLock, Preserve ) {
        PRS, 2
    }
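    /* PRSC: run from the GPE handler (_E02) below; compares each PRS bit with the
       cached FLG value and issues Notify 1 (device check) or Notify 3 (eject request) */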
    Method ( PRSC, 0 ) {
        Store ( ToBuffer(PRS), Local0 )
        Store ( DerefOf(Index(Local0, 0)), Local1 )
        And ( Local1, 1, Local2 )
        If ( LNotEqual(Local2, \_SB.XEN.VP00.FLG) ) {
            Store ( Local2, \_SB.XEN.VP00.FLG )
            If ( LEqual(Local2, 1) ) {
                Notify ( VP00, 1 )
                Subtract ( \_SB.XEN.MSU, 1, \_SB.XEN.MSU )
            }
            Else {
                Notify ( VP00, 3 )
                Add ( \_SB.XEN.MSU, 1, \_SB.XEN.MSU )
            }
        }
        [...]
        Return ( One )
    }
}
Device ( \_SB.XEN.GPEX ) {
    Name ( _HID, "ACPI0006" )
    Name ( _UID, "XENGPE" )
    Name ( _CRS, ResourceTemplate() { IO (Decode16, 0xafe0, 0xafe0, 0x00, 0x4) } )
    Method ( _E02 ) {
        \_SB.XEN.PRSC ()
    }
}

With this I think we should be able to prevent any ACPI namespace clash; TBH, I
don't think vendors will ever use a "XEN" bus. Is there any way we could reserve
such a namespace with ACPI (_SB.XEN*)?

> This also has a chance of collision, both with the system ACPI
> controller, and also with PCIe devices advertising IO-BARs.  (All
> graphics cards ever have IO-BARs, because windows refuses to bind a
> graphics driver to a PCI graphics device if the PCI device doesn't have
> at least one IO-BAR.  Because PCIe requires 4k alignment on the upstream
> bridge IO-windows, there is a surprisingly low limit on the number of
> graphics cards you can put in a server and have functioning to windows
> satisfaction.)

Yes, I'm thinking about using SystemMemory instead of SystemIO, so that we can
use a guest RAM region that surely will not clash with anything else. This will
require transforming the IO handler into a memory handler, but it doesn't look
that complicated (and we will also waste a full memory page for it, but alas).
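
As a minimal sketch (the one-page length is an assumption, and 0xdeadbeef is
just the same to-be-patched placeholder used elsewhere in the document), the
region declaration inside the Scope would then become something like:

    OperationRegion ( PRST, SystemMemory, 0xdeadbeef, 0x1000 )
    Field ( PRST, ByteAcc, NoLock, Preserve ) {
        PRS, 2
    }

The PRSC logic itself wouldn't need to change; only the backing store of PRS
moves from I/O space to a RAM page.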

> As with the other risks of collisions, Xen is going to have to search
> the system to find a free area to use.
> 
> >             Field ( PRST, ByteAcc, NoLock, Preserve ) {
> >                 PRS, 2
> >             }
> >             Method ( PRSC, 0 ) {
> >                 Store ( ToBuffer(PRS), Local0 )
> >                 Store ( DerefOf(Index(Local0, 0)), Local1 )
> >                 And ( Local1, 1, Local2 )
> >                 If ( LNotEqual(Local2, \_SB.VP00.FLG) ) {
> >                     Store ( Local2, \_SB.VP00.FLG )
> >                     If ( LEqual(Local2, 1) ) {
> >                         Notify ( VP00, 1 )
> >                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
> >                     }
> >                     Else {
> >                         Notify ( VP00, 3 )
> >                         Add ( \_SB.MSU, 1, \_SB.MSU )
> >                     }
> >                 }
> >                 ShiftRight ( Local1, 1, Local1 )
> >                 And ( Local1, 1, Local2 )
> >                 If ( LNotEqual(Local2, \_SB.VP01.FLG) ) {
> >                     Store ( Local2, \_SB.VP01.FLG )
> >                     If ( LEqual(Local2, 1) ) {
> >                         Notify ( VP01, 1 )
> >                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
> >                     }
> >                     Else {
> >                         Notify ( VP01, 3 )
> >                         Add ( \_SB.MSU, 1, \_SB.MSU )
> >                     }
> >                 }
> >                 Return ( One )
> >             }
> >         }
> >         Device ( \_SB.GPEX ) {
> >             Name ( _HID, "ACPI0006" )
> >             Name ( _UID, "XENGPE" )
> >             Name ( _CRS, ResourceTemplate() {
> >                 IO (Decode16, 0xafe0 , 0xafe0, 0x00, 0x4)
> >             } )
> >             Method ( _E02 ) {
> >                 \_SB.PRSC ()
> >             }
> >         }
> >     }
> >
> > Since the position of the XEN data memory area is not known, the hypervisor will
> > have to replace the address 0xdeadbeef with the actual memory address where
> > this structure has been copied. This will involve a memory search of the AML
> > code resulting from the compilation of the above ASL snippet.
> 
> This is also slightly risky.  If we need to do this, can we get a
> relocation list from the compiled table from iasl?

I will look into ways to do this relocation. Jan's suggestion to compare two
different AML outputs seems feasible. I will check whether iasl supports
something similar to relocation (although I think it doesn't).

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-13 15:27   ` Jan Beulich
@ 2017-01-16 14:59     ` Roger Pau Monné
  0 siblings, 0 replies; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-16 14:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar, Julien Grall,
	xen-devel, Boris Ostrovsky

On Fri, Jan 13, 2017 at 08:27:30AM -0700, Jan Beulich wrote:
> >>> On 12.01.17 at 20:00, <andrew.cooper3@citrix.com> wrote:
> > On 12/01/17 12:13, Roger Pau Monné wrote:
> >> Extra entries are going to be added for each vCPU available to the hardware
> >> domain, up to the maximum number of supported vCPUs. Note that supported vCPUs
> >> might be different than enabled vCPUs, so it's possible that some of these
> >> entries are also going to be marked as disabled. The entries for vCPUs on the
> >> MADT are going to use a processor local x2 APIC structure, and the ACPI
> >> processor ID of the first vCPU is going to be UINT32_MAX - HVM_MAX_VCPUS, in
> >> order to avoid clashes with IDs of pCPUs.
> > 
> > This is slightly problematic.  There is no restriction (so far as I am
> > aware) on which ACPI IDs the firmware picks for its objects.  They need
> > not be consecutive, logical, or start from 0.
> > 
> > If STAO is being extended to list the IDs of the physical processor
> > objects, we should go one step further and explicitly list the IDs of
> > the virtual processor objects.  This leaves us flexibility if we have to
> > avoid awkward firmware ID layouts.
> 
> I don't think we should do this - vCPU IDs are already in MADT. I do,
> however, think that we shouldn't name any specific IDs we mean to
> use for the vCPU-s, but rather merely guarantee that there won't be
> any overlap with the pCPU ones.

I also don't see the point in listing both pCPUs and vCPUs in the STAO. If a
processor ACPI ID is not listed as a pCPU, then it's a vCPU. I don't see a
case where a processor object won't be listed as either a pCPU or a vCPU, which
renders one of the lists moot, because it can be derived from the other one.

> >> In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
> >> processor object in the ACPI namespace, so that the OSPM can request
> >> notifications and get the value of the \_STA and \_MAT methods. This can be
> >> problematic because Xen doesn't know the ACPI name of the other processor
> >> objects, so blindly adding new ones can create namespace clashes.
> >>
> >> This can be solved by using a different ACPI name in order to describe vCPUs in
> >> the ACPI namespace. Most hardware vendors tend to use CPU or PR prefixes for
> >> the processor objects, so using a 'VP' (ie: Virtual Processor) prefix should
> >> prevent clashes.
> > 
> > One system I have to hand (with more than 255 pcpus) uses Cxxx
> > 
> > To avoid namespace collisions, I can't see any option but to parse the
> > DSDT/SSDTs to at least confirm that VPxx is available to use.
> 
> And additionally using a two character name prefix would significantly
> limit the number of vCPU-s we would be able to support going forward.
> Just like above, I don't think we should specify the name here at all,
> allowing dynamic picking of suitable ones.

See my suggestion in another reply about introducing a _SB.XEN bus.

> [...]
> >> Since the position of the XEN data memory area is not known, the hypervisor will
> >> have to replace the address 0xdeadbeef with the actual memory address where
> >> this structure has been copied. This will involve a memory search of the AML
> >> code resulting from the compilation of the above ASL snippet.
> > 
> > This is also slightly risky.  If we need to do this, can we get a
> > relocation list from the compiled table from iasl?
> 
> I expect iasl can't do that, the more that there's not actually any
> relocation involved here. I guess we'd need a double compilation
> approach, where for both a different address is being specified.
> The diff of the two would then allow to create a relocation list.

That sounds sensible, thanks for the suggestion.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-13 15:51 ` Jan Beulich
  2017-01-13 19:41   ` Stefano Stabellini
  2017-01-14  1:44   ` Boris Ostrovsky
@ 2017-01-16 15:14   ` Roger Pau Monné
  2017-01-16 16:09     ` Jan Beulich
  2 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-16 15:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
> > # Introduction
> > 
> > One of the design goals of PVH is to be able to remove as much Xen PV specific
> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
> > and tending to use native interfaces (as used by bare metal) as much as
> > possible. This is in line with the efforts also done by Xen on ARM and helps
> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> > kernels.
> > 
> > This however presents some challenges due to the model used by the Xen
> > Hypervisor, where some devices are handled by Xen while others are left for the
> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> > harder, since it cannot get the full hardware description from dynamic ACPI
> > tables (DSDT, SSDT) without the hardware domain collaboration.
> 
> Considering all the difficulties with the proposed model plus the little
> code the PV vCPU hotplug logic requires in the kernel (assuming a
> xenbus driver to be there anyway), I'm rather unconvinced that
> going this route instead of sticking to the PV model is actually
> desirable. And clearly, for consistency within the kernel, in such a
> case I'd then also favor sticking to this model for DomU.

We would at least have to pass the APIC ID in order to perform vCPU hotplug for
PVH, and the ACPI spec mandates that when using x2APIC structures in the MADT,
there must be a matching processor object in the DSDT (5.2.12.12).

Declaring processor objects in the DSDT won't be possible for Xen, but we can
at least declare them in an SSDT, which seems better than not doing it at all.
Maybe we can get the ACPI folks to loosen the spec so that it no longer
mandates the DSDT.
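
As a purely illustrative sketch (table IDs, device name and UID below are
placeholders, not a proposal for the final layout), such an SSDT-declared
processor device matching an x2APIC MADT entry could look like:

DefinitionBlock ( "", "SSDT", 2, "Xen", "VCPUS", 1 )
{
    Scope ( \_SB )
    {
        Device ( VP00 )
        {
            Name ( _HID, "ACPI0007" ) /* Processor Device */
            Name ( _UID, Zero )       /* must match the ACPI Processor UID in the x2APIC MADT entry */
        }
    }
}

Whether OSPM implementations are happy with processor devices that only live in
an SSDT is precisely the point that would need clarifying on the spec side.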

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-16 15:14   ` Roger Pau Monné
@ 2017-01-16 16:09     ` Jan Beulich
  2017-01-16 16:31       ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-16 16:09 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

>>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
> On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
>> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
>> > # Introduction
>> > 
>> > One of the design goals of PVH is to be able to remove as much Xen PV specific
>> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
>> > and tending to use native interfaces (as used by bare metal) as much as
>> > possible. This is in line with the efforts also done by Xen on ARM and helps
>> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
>> > kernels.
>> > 
>> > This however presents some challenges due to the model used by the Xen
>> > Hypervisor, where some devices are handled by Xen while others are left for the
>> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
>> > harder, since it cannot get the full hardware description from dynamic ACPI
>> > tables (DSDT, SSDT) without the hardware domain collaboration.
>> 
>> Considering all the difficulties with the proposed model plus the little
>> code the PV vCPU hotplug logic requires in the kernel (assuming a
>> xenbus driver to be there anyway), I'm rather unconvinced that
>> going this route instead of sticking to the PV model is actually
>> desirable. And clearly, for consistency within the kernel, in such a
>> case I'd then also favor sticking to this model for DomU.
> 
> We would at least have to pass the APIC ID in order to perform vCPU hotplug for
> PVH, and the ACPI spec mandates that when using x2APIC structures in the MADT,
> there must be a matching processor object in the DSDT (5.2.12.12).
> 
> Declaring processor objects in the DSDT won't be possible for Xen, but we can
> at least declare them in a SSDT, which seems better than not doing it at all.
> Maybe we can get ACPI to loosen the spec and don't mandate DSDT anymore.

I don't understand this reply of yours: How do any ACPI requirements
come into play when using the PV hotplug mechanism for vCPU-s?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-16 16:09     ` Jan Beulich
@ 2017-01-16 16:31       ` Roger Pau Monné
  2017-01-16 16:50         ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-16 16:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
> > On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
> >> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
> >> > # Introduction
> >> > 
> >> > One of the design goals of PVH is to be able to remove as much Xen PV specific
> >> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
> >> > and tending to use native interfaces (as used by bare metal) as much as
> >> > possible. This is in line with the efforts also done by Xen on ARM and helps
> >> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> >> > kernels.
> >> > 
> >> > This however presents some challenges due to the model used by the Xen
> >> > Hypervisor, where some devices are handled by Xen while others are left for the
> >> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> >> > harder, since it cannot get the full hardware description from dynamic ACPI
> >> > tables (DSDT, SSDT) without the hardware domain collaboration.
> >> 
> >> Considering all the difficulties with the proposed model plus the little
> >> code the PV vCPU hotplug logic requires in the kernel (assuming a
> >> xenbus driver to be there anyway), I'm rather unconvinced that
> >> going this route instead of sticking to the PV model is actually
> >> desirable. And clearly, for consistency within the kernel, in such a
> >> case I'd then also favor sticking to this model for DomU.
> > 
> > We would at least have to pass the APIC ID in order to perform vCPU hotplug for
> > PVH, and the ACPI spec mandates that when using x2APIC structures in the MADT,
> > there must be a matching processor object in the DSDT (5.2.12.12).
> > 
> > Declaring processor objects in the DSDT won't be possible for Xen, but we can
> > at least declare them in a SSDT, which seems better than not doing it at all.
> > Maybe we can get ACPI to loosen the spec and don't mandate DSDT anymore.
> 
> I don't understand this reply of yours: How do any ACPI requirements
> come into play when using the PV hotplug mechanism for vCPU-s?

This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
violation of the spec (providing x2APIC entries without matching processor
objects), so I wouldn't be surprised if ACPICA or any other ACPI implementation
refuses to boot on systems with x2APIC entries but no processor objects.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-16 16:31       ` Roger Pau Monné
@ 2017-01-16 16:50         ` Jan Beulich
  2017-01-16 17:44           ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-16 16:50 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

>>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
> On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
>> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
>> > On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
>> >> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
>> >> > # Introduction
>> >> > 
>> >> > One of the design goals of PVH is to be able to remove as much Xen PV specific
>> >> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
>> >> > and tending to use native interfaces (as used by bare metal) as much as
>> >> > possible. This is in line with the efforts also done by Xen on ARM and helps
>> >> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
>> >> > kernels.
>> >> > 
>> >> > This however presents some challenges due to the model used by the Xen
>> >> > Hypervisor, where some devices are handled by Xen while others are left for the
>> >> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
>> >> > harder, since it cannot get the full hardware description from dynamic ACPI
>> >> > tables (DSDT, SSDT) without the hardware domain collaboration.
>> >> 
>> >> Considering all the difficulties with the proposed model plus the little
>> >> code the PV vCPU hotplug logic requires in the kernel (assuming a
>> >> xenbus driver to be there anyway), I'm rather unconvinced that
>> >> going this route instead of sticking to the PV model is actually
>> >> desirable. And clearly, for consistency within the kernel, in such a
>> >> case I'd then also favor sticking to this model for DomU.
>> > 
>> > We would at least have to pass the APIC ID in order to perform vCPU hotplug for
>> > PVH, and the ACPI spec mandates that when using x2APIC structures in the MADT,
>> > there must be a matching processor object in the DSDT (5.2.12.12).
>> > 
>> > Declaring processor objects in the DSDT won't be possible for Xen, but we can
>> > at least declare them in a SSDT, which seems better than not doing it at all.
>> > Maybe we can get ACPI to loosen the spec and don't mandate DSDT anymore.
>> 
>> I don't understand this reply of yours: How do any ACPI requirements
>> come into play when using the PV hotplug mechanism for vCPU-s?
> 
> This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
> violation of the spec (proving x2APIC entries without matching processor
> objects), so I wouldn't be surprised if ACPICA or any other ACPI implementation
> refuses to boot on systems with x2APIC entries but no processor objects.

Good point, but what do you suggest short of declaring PVH v2 Dom0
impossible to properly implement? I think that the idea of multiplexing
ACPI for different purposes is simply going too far. For PV there's no
such problem, as the Dom0 OS is expected to be aware that processor
information coming from ACPI is not applicable to the view on CPUs it
has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
I think PVH will need to retain this requirement (at which point there's
no spec violation anymore).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-16 16:50         ` Jan Beulich
@ 2017-01-16 17:44           ` Roger Pau Monné
  2017-01-16 18:16             ` Stefano Stabellini
  2017-01-17  9:12             ` Jan Beulich
  0 siblings, 2 replies; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-16 17:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

On Mon, Jan 16, 2017 at 09:50:53AM -0700, Jan Beulich wrote:
> >>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
> > On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
> >> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
> >> > On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
> >> >> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
> >> >> > # Introduction
> >> >> > 
> >> >> > One of the design goals of PVH is to be able to remove as much Xen PV specific
> >> >> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
> >> >> > and tending to use native interfaces (as used by bare metal) as much as
> >> >> > possible. This is in line with the efforts also done by Xen on ARM and helps
> >> >> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> >> >> > kernels.
> >> >> > 
> >> >> > This however presents some challenges due to the model used by the Xen
> >> >> > Hypervisor, where some devices are handled by Xen while others are left for the
> >> >> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> >> >> > harder, since it cannot get the full hardware description from dynamic ACPI
> >> >> > tables (DSDT, SSDT) without the hardware domain collaboration.
> >> >> 
> >> >> Considering all the difficulties with the proposed model plus the little
> >> >> code the PV vCPU hotplug logic requires in the kernel (assuming a
> >> >> xenbus driver to be there anyway), I'm rather unconvinced that
> >> >> going this route instead of sticking to the PV model is actually
> >> >> desirable. And clearly, for consistency within the kernel, in such a
> >> >> case I'd then also favor sticking to this model for DomU.
> >> > 
> >> > We would at least have to pass the APIC ID in order to perform vCPU hotplug for
> >> > PVH, and the ACPI spec mandates that when using x2APIC structures in the MADT,
> >> > there must be a matching processor object in the DSDT (5.2.12.12).
> >> > 
> >> > Declaring processor objects in the DSDT won't be possible for Xen, but we can
> >> > at least declare them in a SSDT, which seems better than not doing it at all.
> >> > Maybe we can get ACPI to loosen the spec and don't mandate DSDT anymore.
> >> 
> >> I don't understand this reply of yours: How do any ACPI requirements
> >> come into play when using the PV hotplug mechanism for vCPU-s?
> > 
> > This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
> > violation of the spec (proving x2APIC entries without matching processor
> > objects), so I wouldn't be surprised if ACPICA or any other ACPI implementation
> > refuses to boot on systems with x2APIC entries but no processor objects.
> 
> Good point, but what do you suggest short of declaring PVH v2 Dom0
> impossible to properly implement? I think that the idea of multiplexing
> ACPI for different purposes is simply going too far. For PV there's no
> such problem, as the Dom0 OS is expected to be aware that processor
> information coming from ACPI is not applicable to the view on CPUs it
> has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
> I think PVH will need to retain this requirement (at which point there's
> no spec violation anymore).

But we definitely want to use ACPI to pass the boot vCPU information, using the
MADT for both DomU and Dom0.

Then for PVH DomU, using ACPI vCPU hotplug makes perfect sense: it requires less
Xen-specific code in the OS and it's fairly easy to implement inside
Xen/toolstack. But I understand that using different methods for DomU vs Dom0
is very awkward. I still think that ACPI vCPU hotplug for Dom0 is not so
far-fetched, and that it could be doable.

Could we introduce a new CPUID flag to notify the guest of whether it should
expect ACPI vCPU hotplug or PV vCPU hotplug?

I don't really like having Xen-specific checks inside of OSes, like "it's a PVH
guest" and then short-circuiting a bunch of native logic. For example, the
ACPICA shutdown hooks for Xen Dom0 never made it upstream, and it's very hard
for me to argue with the FreeBSD ACPICA maintainer about why those are needed,
and why he has to maintain a patch on top of upstream ACPICA only for Xen.

I will nonetheless send a new version of the design document, adding the
comments that I have received on the last draft.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-16 17:44           ` Roger Pau Monné
@ 2017-01-16 18:16             ` Stefano Stabellini
  2017-01-17  9:12             ` Jan Beulich
  1 sibling, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2017-01-16 18:16 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, Jan Beulich, xen-devel, Boris Ostrovsky


On Mon, 16 Jan 2017, Roger Pau Monné wrote:
> On Mon, Jan 16, 2017 at 09:50:53AM -0700, Jan Beulich wrote:
> > >>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
> > > On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
> > >> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
> > >> > On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
> > >> >> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
> > >> >> > # Introduction
> > >> >> > 
> > >> >> > One of the design goals of PVH is to be able to remove as much Xen PV specific
> > >> >> > code as possible, thus limiting the number of Xen PV interfaces used by guests,
> > >> >> > and tending to use native interfaces (as used by bare metal) as much as
> > >> >> > possible. This is in line with the efforts also done by Xen on ARM and helps
> > >> >> > reduce the burden of maintaining huge amounts of Xen PV code inside of guests
> > >> >> > kernels.
> > >> >> > 
> > >> >> > This however presents some challenges due to the model used by the Xen
> > >> >> > Hypervisor, where some devices are handled by Xen while others are left for the
> > >> >> > hardware domain to manage. The fact that Xen lacks and AML parser also makes it
> > >> >> > harder, since it cannot get the full hardware description from dynamic ACPI
> > >> >> > tables (DSDT, SSDT) without the hardware domain collaboration.
> > >> >> 
> > >> >> Considering all the difficulties with the proposed model plus the little
> > >> >> code the PV vCPU hotplug logic requires in the kernel (assuming a
> > >> >> xenbus driver to be there anyway), I'm rather unconvinced that
> > >> >> going this route instead of sticking to the PV model is actually
> > >> >> desirable. And clearly, for consistency within the kernel, in such a
> > >> >> case I'd then also favor sticking to this model for DomU.
> > >> > 
> > >> > We would at least have to pass the APIC ID in order to perform vCPU hotplug for
> > >> > PVH, and the ACPI spec mandates that when using x2APIC structures in the MADT,
> > >> > there must be a matching processor object in the DSDT (5.2.12.12).
> > >> > 
> > >> > Declaring processor objects in the DSDT won't be possible for Xen, but we can
> > >> > at least declare them in a SSDT, which seems better than not doing it at all.
> > >> > Maybe we can get ACPI to loosen the spec and don't mandate DSDT anymore.
> > >> 
> > >> I don't understand this reply of yours: How do any ACPI requirements
> > >> come into play when using the PV hotplug mechanism for vCPU-s?
> > > 
> > > This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
> > > violation of the spec (proving x2APIC entries without matching processor
> > > objects), so I wouldn't be surprised if ACPICA or any other ACPI implementation
> > > refuses to boot on systems with x2APIC entries but no processor objects.
> > 
> > Good point, but what do you suggest short of declaring PVH v2 Dom0
> > impossible to properly implement? I think that the idea of multiplexing
> > ACPI for different purposes is simply going too far. For PV there's no
> > such problem, as the Dom0 OS is expected to be aware that processor
> > information coming from ACPI is not applicable to the view on CPUs it
> > has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
> > I think PVH will need to retain this requirement (at which point there's
> > no spec violation anymore).
> 
> But we definitely want to use ACPI to pass the boot vCPU information, using the
> MADT for both DomU and Dom0.
> 
> Then for PVH DomU using ACPI vCPU hotplug makes perfect sense, it requires less
> Xen specific code in the OS and it's fairly easy to implement inside of
> Xen/toolstack. But I understand that using different methods for DomU vs Dom0
> is very awkward. I still think that ACPI vCPU hotplug for Dom0 this is not so
> far-fetched, and that it could be doable.
> 
> Could we introduce a new CPUID flag to notify the guest of whether it should
> expect ACPI vCPU hotplug or PV vCPU hotplug?

This is a good idea.


> I don't really like having Xen-specific checks inside of OSes, like "it's PVH
> guest" then short circuiting a bunch of native logic. For example the ACPICA
> ACPI shutdown hooks for Xen Dom0 never made it upstream, and it's very hard for
> me to argue with the FreeBSD ACPICA maintainer about why those are needed,
> and why he has to maintain a patch on top of upstream ACPICA only for Xen.

Nobody likes those. However, I don't think that PV CPU hotplug requires
any of those.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-16 17:44           ` Roger Pau Monné
  2017-01-16 18:16             ` Stefano Stabellini
@ 2017-01-17  9:12             ` Jan Beulich
  2017-01-17 11:43               ` Roger Pau Monné
  1 sibling, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-17  9:12 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

>>> On 16.01.17 at 18:44, <roger.pau@citrix.com> wrote:
> On Mon, Jan 16, 2017 at 09:50:53AM -0700, Jan Beulich wrote:
>> >>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
>> > On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
>> >> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
>> >> > On Fri, Jan 13, 2017 at 08:51:57AM -0700, Jan Beulich wrote:
>> >> >> >>> On 12.01.17 at 13:13, <roger.pau@citrix.com> wrote:
>> >> >> > # Introduction
>> >> >> > 
>> >> >> > One of the design goals of PVH is to be able to remove as much Xen PV 
> specific
>> >> >> > code as possible, thus limiting the number of Xen PV interfaces used by 
> guests,
>> >> >> > and tending to use native interfaces (as used by bare metal) as much as
>> >> >> > possible. This is in line with the efforts also done by Xen on ARM and 
> helps
>> >> >> > reduce the burden of maintaining huge amounts of Xen PV code inside of 
> guests
>> >> >> > kernels.
>> >> >> > 
>> >> >> > This however presents some challenges due to the model used by the Xen
>> >> >> > Hypervisor, where some devices are handled by Xen while others are left 
> for the
>> >> >> > hardware domain to manage. The fact that Xen lacks and AML parser also 
> makes it
>> >> >> > harder, since it cannot get the full hardware description from dynamic 
> ACPI
>> >> >> > tables (DSDT, SSDT) without the hardware domain collaboration.
>> >> >> 
>> >> >> Considering all the difficulties with the proposed model plus the little
>> >> >> code the PV vCPU hotplug logic requires in the kernel (assuming a
>> >> >> xenbus driver to be there anyway), I'm rather unconvinced that
>> >> >> going this route instead of sticking to the PV model is actually
>> >> >> desirable. And clearly, for consistency within the kernel, in such a
>> >> >> case I'd then also favor sticking to this model for DomU.
>> >> > 
>> >> > We would at least have to pass the APIC ID in order to perform vCPU 
> hotplug for
>> >> > PVH, and the ACPI spec mandates that when using x2APIC structures in the 
> MADT,
>> >> > there must be a matching processor object in the DSDT (5.2.12.12).
>> >> > 
>> >> > Declaring processor objects in the DSDT won't be possible for Xen, but we 
> can
>> >> > at least declare them in a SSDT, which seems better than not doing it at 
> all.
>> >> > Maybe we can get ACPI to loosen the spec and don't mandate DSDT anymore.
>> >> 
>> >> I don't understand this reply of yours: How do any ACPI requirements
>> >> come into play when using the PV hotplug mechanism for vCPU-s?
>> > 
>> > This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
>> > violation of the spec (proving x2APIC entries without matching processor
>> > objects), so I wouldn't be surprised if ACPICA or any other ACPI 
> implementation
>> > refuses to boot on systems with x2APIC entries but no processor objects.
>> 
>> Good point, but what do you suggest short of declaring PVH v2 Dom0
>> impossible to properly implement? I think that the idea of multiplexing
>> ACPI for different purposes is simply going too far. For PV there's no
>> such problem, as the Dom0 OS is expected to be aware that processor
>> information coming from ACPI is not applicable to the view on CPUs it
>> has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
>> I think PVH will need to retain this requirement (at which point there's
>> no spec violation anymore).
> 
> But we definitely want to use ACPI to pass the boot vCPU information, using the
> MADT for both DomU and Dom0.

Is that really set in stone?

> Then for PVH DomU using ACPI vCPU hotplug makes perfect sense, it requires less
> Xen specific code in the OS and it's fairly easy to implement inside of
> Xen/toolstack. But I understand that using different methods for DomU vs Dom0
> is very awkward. I still think that ACPI vCPU hotplug for Dom0 this is not so
> far-fetched, and that it could be doable.
> 
> Could we introduce a new CPUID flag to notify the guest of whether it should
> expect ACPI vCPU hotplug or PV vCPU hotplug?

That would be an easy addition.

> I don't really like having Xen-specific checks inside of OSes, like "it's PVH
> guest" then short circuiting a bunch of native logic. For example the ACPICA
> ACPI shutdown hooks for Xen Dom0 never made it upstream, and it's very hard for
> me to argue with the FreeBSD ACPICA maintainer about why those are needed,
> and why he has to maintain a patch on top of upstream ACPICA only for Xen.

I understand all those concerns, but we shouldn't replace one ugliness
by another. I.e. without a reasonably clean concept of how to use
ACPI here I can't help thinking that the PV model here is the cleaner one
despite the (little) extra code it requires in OSes.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-17  9:12             ` Jan Beulich
@ 2017-01-17 11:43               ` Roger Pau Monné
  2017-01-17 12:33                 ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-17 11:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

On Tue, Jan 17, 2017 at 02:12:59AM -0700, Jan Beulich wrote:
> >>> On 16.01.17 at 18:44, <roger.pau@citrix.com> wrote:
> > On Mon, Jan 16, 2017 at 09:50:53AM -0700, Jan Beulich wrote:
> >> >>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
> >> > On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
> >> >> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
> >> > This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
> >> > violation of the spec (proving x2APIC entries without matching processor
> >> > objects), so I wouldn't be surprised if ACPICA or any other ACPI 
> > implementation
> >> > refuses to boot on systems with x2APIC entries but no processor objects.
> >> 
> >> Good point, but what do you suggest short of declaring PVH v2 Dom0
> >> impossible to properly implement? I think that the idea of multiplexing
> >> ACPI for different purposes is simply going too far. For PV there's no
> >> such problem, as the Dom0 OS is expected to be aware that processor
> >> information coming from ACPI is not applicable to the view on CPUs it
> >> has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
> >> I think PVH will need to retain this requirement (at which point there's
> >> no spec violation anymore).
> > 
> > But we definitely want to use ACPI to pass the boot vCPU information, using the
> > MADT for both DomU and Dom0.
> 
> Is that really set in stone?

If the PVH domain has access to an APIC and wants to use it, it must parse the
info from the MADT, or else it cannot get the APIC address or the APIC ID (you
could guess those, since their position is quite standard, but what's the
point?).

> > Then for PVH DomU using ACPI vCPU hotplug makes perfect sense, it requires less
> > Xen specific code in the OS and it's fairly easy to implement inside of
> > Xen/toolstack. But I understand that using different methods for DomU vs Dom0
> > is very awkward. I still think that ACPI vCPU hotplug for Dom0 this is not so
> > far-fetched, and that it could be doable.
> > 
> > Could we introduce a new CPUID flag to notify the guest of whether it should
> > expect ACPI vCPU hotplug or PV vCPU hotplug?
> 
> That would be an easy addition.

My proposal would be to advertise the use of PV vCPU hotplug, and not to
advertise anything when using ACPI vCPU hotplug.

> > I don't really like having Xen-specific checks inside of OSes, like "it's PVH
> > guest" then short circuiting a bunch of native logic. For example the ACPICA
> > ACPI shutdown hooks for Xen Dom0 never made it upstream, and it's very hard for
> > me to argue with the FreeBSD ACPICA maintainer about why those are needed,
> > and why he has to maintain a patch on top of upstream ACPICA only for Xen.
> 
> I understand all those concerns, but we shouldn't replace one ugliness
> by another. I.e. without a reasonably clean concept of how to use
> ACPI here I can't help thinking that the PV model here is the cleaner one
> despite the (little) extra code it requires in OSes.

Right. Do you agree to allow Boris' DomU ACPI CPU hotplug series to go in when
ready, and the PVH Dom0 series to continue using the same approach (MADT entries
for vCPUs, unmodified processor objects in the DSDT, PV hotplug for vCPUs)?

Is there any way to get in touch with the ACPI guys in order to see whether this
can be solved in a nice way using ACPI? I know that's not something that's
going to change in the near future, but maybe by bringing it up with them we
can make our life easier in the future?

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-17 11:43               ` Roger Pau Monné
@ 2017-01-17 12:33                 ` Jan Beulich
  2017-01-17 14:13                   ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-17 12:33 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

>>> On 17.01.17 at 12:43, <roger.pau@citrix.com> wrote:
> On Tue, Jan 17, 2017 at 02:12:59AM -0700, Jan Beulich wrote:
>> >>> On 16.01.17 at 18:44, <roger.pau@citrix.com> wrote:
>> > On Mon, Jan 16, 2017 at 09:50:53AM -0700, Jan Beulich wrote:
>> >> >>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
>> >> > On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
>> >> >> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
>> >> > This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
>> >> > violation of the spec (proving x2APIC entries without matching processor
>> >> > objects), so I wouldn't be surprised if ACPICA or any other ACPI 
>> > implementation
>> >> > refuses to boot on systems with x2APIC entries but no processor objects.
>> >> 
>> >> Good point, but what do you suggest short of declaring PVH v2 Dom0
>> >> impossible to properly implement? I think that the idea of multiplexing
>> >> ACPI for different purposes is simply going too far. For PV there's no
>> >> such problem, as the Dom0 OS is expected to be aware that processor
>> >> information coming from ACPI is not applicable to the view on CPUs it
>> >> has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
>> >> I think PVH will need to retain this requirement (at which point there's
>> >> no spec violation anymore).
>> > 
>> > But we definitely want to use ACPI to pass the boot vCPU information, using the
>> > MADT for both DomU and Dom0.
>> 
>> Is that really set in stone?
> 
> If the PVH domain has access to an APIC and wants to use it it must parse the
> info from the MADT, or else it cannot get the APIC address or the APIC ID (you
> could guess those, since their position is quite standard, but what's the
> point?)

There's always the option of obtaining needed information via hypercall.

>> > Then for PVH DomU using ACPI vCPU hotplug makes perfect sense, it requires less
>> > Xen specific code in the OS and it's fairly easy to implement inside of
>> > Xen/toolstack. But I understand that using different methods for DomU vs Dom0
>> > is very awkward. I still think that ACPI vCPU hotplug for Dom0 this is not so
>> > far-fetched, and that it could be doable.
>> > 
>> > Could we introduce a new CPUID flag to notify the guest of whether it should
>> > expect ACPI vCPU hotplug or PV vCPU hotplug?
>> 
>> That would be an easy addition.
> 
> My proposition would be to notify the usage of PV vCPU hotplug, and not notify
> anything when using ACPI vCPU hotplug.

I think there should be two flags, each indicating availability of the
respective variant.

>> > I don't really like having Xen-specific checks inside of OSes, like "it's PVH
>> > guest" then short circuiting a bunch of native logic. For example the ACPICA
>> > ACPI shutdown hooks for Xen Dom0 never made it upstream, and it's very hard for
>> > me to argue with the FreeBSD ACPICA maintainer about why those are needed,
>> > and why he has to maintain a patch on top of upstream ACPICA only for Xen.
>> 
>> I understand all those concerns, but we shouldn't replace one ugliness
>> by another. I.e. without a reasonably clean concept of how to use
>> ACPI here I can't help thinking that the PV model here is the cleaner one
>> despite the (little) extra code it requires in OSes.
> 
> Right. Do you agree to allow Boris DomU ACPI CPU hotplug to go in when ready,
> and the PVH Dom0 series to continue using the same approach? (MADT entries for
> vCPUs, unmodified processor objects in the DSDT, PV hotplug for vCPUs).

To be honest, until I see a clear picture of how everything is going to
fit together, I don't think I'm willing to give my okay for any of these
patches to go in. That's because without at least a clear road towards
full functionality we'd again end up with something half baked like PVHv1
was. Nor do I view having two vCPU hotplug models as a desirable goal.

> Is there anyway to get in touch with the ACPI guys in order to see whether this
> can be solved in a nice way using ACPI? I know that's not something that's
> going to change in the near future, but maybe by bringing it up with them we
> can make our life easier in the future?

I have no idea who the folks are who update the ACPI spec. I don't,
however, see the point here: ACPI wants to provide an abstraction
between hardware and OS. I don't think it was ever meant to be
able to split things at some arbitrary boundary. Or to say it
differently, I could fully understand them responding back "What the
heck?" That's also why I wouldn't (currently) consider myself bringing
up anything like this with them, even if I knew who they were. Otoh
they've accepted STAO and XENV ...

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: PVH CPU hotplug design document
  2017-01-17 12:33                 ` Jan Beulich
@ 2017-01-17 14:13                   ` Roger Pau Monné
  2017-01-17 14:44                     ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-17 14:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

On Tue, Jan 17, 2017 at 05:33:41AM -0700, Jan Beulich wrote:
> >>> On 17.01.17 at 12:43, <roger.pau@citrix.com> wrote:
> > On Tue, Jan 17, 2017 at 02:12:59AM -0700, Jan Beulich wrote:
> >> >>> On 16.01.17 at 18:44, <roger.pau@citrix.com> wrote:
> >> > On Mon, Jan 16, 2017 at 09:50:53AM -0700, Jan Beulich wrote:
> >> >> >>> On 16.01.17 at 17:31, <roger.pau@citrix.com> wrote:
> >> >> > On Mon, Jan 16, 2017 at 09:09:55AM -0700, Jan Beulich wrote:
> >> >> >> >>> On 16.01.17 at 16:14, <roger.pau@citrix.com> wrote:
> >> >> > This clearly isn't a requirement when doing PV vCPU hotplug, but it's a
> >> >> > violation of the spec (providing x2APIC entries without matching processor
> >> >> > objects), so I wouldn't be surprised if ACPICA or any other ACPI implementation
> >> >> > refuses to boot on systems with x2APIC entries but no processor objects.
> >> >> 
> >> >> Good point, but what do you suggest short of declaring PVH v2 Dom0
> >> >> impossible to properly implement? I think that the idea of multiplexing
> >> >> ACPI for different purposes is simply going too far. For PV there's no
> >> >> such problem, as the Dom0 OS is expected to be aware that processor
> >> >> information coming from ACPI is not applicable to the view on CPUs it
> >> >> has (read: vCPU-s). And therefore, unless clean multiplexing is possible,
> >> >> I think PVH will need to retain this requirement (at which point there's
> >> >> no spec violation anymore).
> >> > 
> >> > But we definitely want to use ACPI to pass the boot vCPU information, using the
> >> > MADT for both DomU and Dom0.
> >> 
> >> Is that really set in stone?
> > 
> > If the PVH domain has access to an APIC and wants to use it it must parse the
> > info from the MADT, or else it cannot get the APIC address or the APIC ID (you
> > could guess those, since their position is quite standard, but what's the
> > point?)
> 
> There's always the option of obtaining needed information via hypercall.

I think we should avoid that and instead use ACPI only, or else we are
duplicating the information provided in ACPI using another interface, which is
pointless IMHO.
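
(To make "use ACPI only" concrete, here is a rough sketch of the MADT walk a
guest would do to pick up the local APIC address and the boot-time APIC IDs.
The struct layouts are abbreviated and every name below is invented for the
example rather than taken from ACPICA.)

/* Illustrative only: minimal MADT walk to recover the local APIC address and
 * the APIC IDs of the processors present at boot.  A real guest would use its
 * own ACPI table definitions, and would also handle the x2APIC subtable
 * (type 9) for APIC IDs above 254. */
#include <stdint.h>
#include <stddef.h>

struct acpi_sdt_hdr { char sig[4]; uint32_t length; uint8_t misc[28]; }
    __attribute__((packed));
struct acpi_madt    { struct acpi_sdt_hdr hdr; uint32_t lapic_addr;
                      uint32_t flags; uint8_t entries[]; }
    __attribute__((packed));
struct madt_entry   { uint8_t type, length; } __attribute__((packed));
struct madt_lapic   { struct madt_entry h; uint8_t acpi_uid, apic_id;
                      uint32_t flags; } __attribute__((packed));

#define MADT_TYPE_LAPIC     0
#define MADT_LAPIC_ENABLED  0x1

static uint8_t boot_apic_ids[128];
static unsigned int nr_boot_cpus;

static void parse_madt(const struct acpi_madt *m, uint32_t *lapic_addr)
{
    const uint8_t *p = m->entries;
    const uint8_t *end = (const uint8_t *)m + m->hdr.length;

    *lapic_addr = m->lapic_addr;          /* local APIC MMIO base */

    while (p + sizeof(struct madt_entry) <= end) {
        const struct madt_entry *e = (const void *)p;

        if (e->length < sizeof(struct madt_entry))
            break;                        /* malformed table, stop */

        if (e->type == MADT_TYPE_LAPIC &&
            e->length >= sizeof(struct madt_lapic)) {
            const struct madt_lapic *l = (const void *)p;

            if ((l->flags & MADT_LAPIC_ENABLED) &&
                nr_boot_cpus < sizeof(boot_apic_ids))
                boot_apic_ids[nr_boot_cpus++] = l->apic_id;
        }
        p += e->length;
    }
}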

There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
type also doesn't have emulated local APICs. We agreed that this model was
interesting for things like unikernel DomUs, but that's the only reason why
we are providing it. Not that full OSes couldn't use it, but it seems
pointless.

> >> > Then for PVH DomU using ACPI vCPU hotplug makes perfect sense, it requires less
> >> > Xen specific code in the OS and it's fairly easy to implement inside of
> >> > Xen/toolstack. But I understand that using different methods for DomU vs Dom0
> >> > is very awkward. I still think that ACPI vCPU hotplug for Dom0 this is not so
> >> > far-fetched, and that it could be doable.
> >> > 
> >> > Could we introduce a new CPUID flag to notify the guest of whether it should
> >> > expect ACPI vCPU hotplug or PV vCPU hotplug?
> >> 
> >> That would be an easy addition.
> > 
> > My proposition would be to notify the usage of PV vCPU hotplug, and not notify
> > anything when using ACPI vCPU hotplug.
> 
> I think there should be two flags, each indicating availability of the
> respective variant.

I'm OK with this.
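
(For illustration, a rough sketch of how a guest could probe such flags: find
the Xen CPUID leaves by signature, then test two bits in a feature leaf. The
leaf number and bit positions below are placeholders, not an agreed ABI.)

/* Purely illustrative: how a guest might test two hypothetical Xen CPUID
 * feature bits, one advertising xenstore-based (PV) vCPU on/offlining and
 * one advertising ACPI-based vCPU hotplug. */
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

static inline void cpuid(uint32_t leaf, uint32_t *a, uint32_t *b,
                         uint32_t *c, uint32_t *d)
{
    asm volatile ("cpuid" : "=a"(*a), "=b"(*b), "=c"(*c), "=d"(*d)
                          : "0"(leaf), "2"(0));
}

static uint32_t xen_cpuid_base(void)
{
    uint32_t base, a, b, c, d;
    char sig[13];

    for (base = 0x40000000; base < 0x40010000; base += 0x100) {
        cpuid(base, &a, &b, &c, &d);
        memcpy(sig + 0, &b, 4);
        memcpy(sig + 4, &c, 4);
        memcpy(sig + 8, &d, 4);
        sig[12] = '\0';
        if (!strcmp(sig, "XenVMMXenVMM"))
            return base;
    }
    return 0;
}

/* Placeholder leaf/bit assignments for the two proposed flags. */
#define XEN_CPUID_FEAT_LEAF          4          /* hypothetical */
#define XEN_FEAT_PV_VCPU_HOTPLUG     (1u << 0)  /* hypothetical */
#define XEN_FEAT_ACPI_VCPU_HOTPLUG   (1u << 1)  /* hypothetical */

static bool xen_has_feat(uint32_t feat)
{
    uint32_t base = xen_cpuid_base(), a, b, c, d;

    if (!base)
        return false;
    cpuid(base + XEN_CPUID_FEAT_LEAF, &a, &b, &c, &d);
    return a & feat;
}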

> >> > I don't really like having Xen-specific checks inside of OSes, like "it's PVH
> >> > guest" then short circuiting a bunch of native logic. For example the ACPICA
> >> > ACPI shutdown hooks for Xen Dom0 never made it upstream, and it's very hard for
> >> > me to argue with the FreeBSD ACPICA maintainer about why those are needed,
> >> > and why he has to maintain a patch on top of upstream ACPICA only for Xen.
> >> 
> >> I understand all those concerns, but we shouldn't replace one ugliness
> >> by another. I.e. without a reasonably clean concept of how to use
> >> ACPI here I can't help thinking that the PV model here is the cleaner one
> >> despite the (little) extra code it requires in OSes.
> > 
> > Right. Do you agree to allow Boris DomU ACPI CPU hotplug to go in when ready,
> > and the PVH Dom0 series to continue using the same approach? (MADT entries for
> > vCPUs, unmodified processor objects in the DSDT, PV hotplug for vCPUs).
> 
> To be honest, until I see a clear picture of how everything is going to
> fit together, I don't think I'm willing to give my okay for any of these
> patches to go in. That's because without at least a clear road towards
> full functionality we'd again end up with something half baked like PVHv1
> was. Nor do I view having two vCPU hotplug models as a desirable goal.

Right, let me send a new version of the ACPI vCPU hotplug for Dom0, and then we
can continue there.

> > Is there any way to get in touch with the ACPI guys in order to see whether this
> > can be solved in a nice way using ACPI? I know that's not something that's
> > going to change in the near future, but maybe by bringing it up with them we
> > can make our life easier in the future?
> 
> I have no idea who the folks are who update the ACPI spec. I don't,
> however, see the point here: ACPI wants to provide an abstraction
> between hardware and OS. I don't think it was ever meant to be
> able to split things at some arbitrary boundary. Or to say it
> differently, I could fully understand them responding back "What the
> heck?" That's also why I wouldn't (currently) consider myself bringing
> up anything like this with them, even if I knew who they were. Otoh
> they've accepted STAO and XENV ...

Well, the ACPI spec seems to be driven by a working group inside the UEFI
Forum. I admit it's possible that the reply is just in the form of "What the
heck?", but IMHO it wouldn't hurt to ask for their opinion. Xen is not like
Linux regarding its user base, but it's also not negligible. In the end Xen is
(ab)using ACPI in some way, so maybe we can get some suggestions.

Roger.


* Re: PVH CPU hotplug design document
  2017-01-17 14:13                   ` Roger Pau Monné
@ 2017-01-17 14:44                     ` Jan Beulich
  2017-01-17 15:27                       ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-17 14:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel, Boris Ostrovsky

>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
> On Tue, Jan 17, 2017 at 05:33:41AM -0700, Jan Beulich wrote:
>> >>> On 17.01.17 at 12:43, <roger.pau@citrix.com> wrote:
>> > If the PVH domain has access to an APIC and wants to use it it must parse the
>> > info from the MADT, or else it cannot get the APIC address or the APIC ID (you
>> > could guess those, since their position is quite standard, but what's the
>> > point?)
>> 
>> There's always the option of obtaining needed information via hypercall.
> 
> I think we should avoid that and instead use ACPI only, or else we are
> duplicating the information provided in ACPI using another interface, which is
> pointless IMHO.
> 
> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
> type also doesn't have emulated local APICs. We agreed that this model was
> interesting from things like unikernels DomUs, but that's the only reason why
> we are providing it. Not that full OSes couldn't use it, but it seems
> pointless.

You writing things this way makes me notice another possible design
issue here: Requiring ACPI is a bad thing imo, with even bare hardware
going different directions for at least some use cases (SFI being one
example). Hence I think ACPI should - like on bare hardware - remain
an optional thing. Which in turn requires _all_ information obtained from
ACPI (if available) to also be available another way. And this other
way might be hypercalls in our case.
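
(For what it's worth, the existing VCPUOP_is_up sub-op already lets a guest
enumerate its vCPUs without ACPI; a minimal sketch along the lines of what PV
Linux does at boot today:)

/*
 * Sketch only: enumerating vCPUs without ACPI by probing the existing
 * VCPUOP_is_up sub-op for every possible vCPU id.  This is roughly what
 * PV Linux already does at boot (see xen_fill_possible_map()).
 */
#include <linux/init.h>
#include <linux/cpumask.h>          /* nr_cpu_ids, set_cpu_possible() */
#include <xen/interface/vcpu.h>     /* VCPUOP_is_up */
#include <asm/xen/hypercall.h>      /* HYPERVISOR_vcpu_op() */

static void __init probe_vcpus_via_hypercall(void)
{
    unsigned int cpu;

    for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
        /* < 0 means the vCPU doesn't exist; 0/1 mean it exists (down/up). */
        int rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL);

        if (rc >= 0)
            set_cpu_possible(cpu, true);
    }
}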

Jan



* Re: PVH CPU hotplug design document
  2017-01-17 14:44                     ` Jan Beulich
@ 2017-01-17 15:27                       ` Boris Ostrovsky
  2017-01-17 15:33                         ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-17 15:27 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel

On 01/17/2017 09:44 AM, Jan Beulich wrote:
>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
>> On Tue, Jan 17, 2017 at 05:33:41AM -0700, Jan Beulich wrote:
>>>>>> On 17.01.17 at 12:43, <roger.pau@citrix.com> wrote:
>>>> If the PVH domain has access to an APIC and wants to use it it must parse the
>>>> info from the MADT, or else it cannot get the APIC address or the APIC ID (you
>>>> could guess those, since their position is quite standard, but what's the
>>>> point?)
>>> There's always the option of obtaining needed information via hypercall.
>> I think we should avoid that and instead use ACPI only, or else we are
>> duplicating the information provided in ACPI using another interface, which is
>> pointless IMHO.
>>
>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
>> type also doesn't have emulated local APICs. We agreed that this model was
>> interesting from things like unikernels DomUs, but that's the only reason why
>> we are providing it. Not that full OSes couldn't use it, but it seems
>> pointless.
> You writing things this way makes me notice another possible design
> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
> going different directions for at least some use cases (SFI being one
> example). Hence I think ACPI should - like on bare hardware - remain
> an optional thing. Which in turn require _all_ information obtained from
> ACPI (if available) to also be available another way. And this other
> way might by hypercalls in our case.


At the risk of derailing this thread: why do we need vCPU hotplug for
dom0 in the first place? What do we gain over "echo {1|0} >
/sys/devices/system/cpu/cpuX/online" ?

I can see why this may be needed for domUs, where Xen can enforce the number
of vCPUs that are allowed to run (which we don't enforce now anyway), but
why for dom0?

-boris



* Re: PVH CPU hotplug design document
  2017-01-17 15:27                       ` Boris Ostrovsky
@ 2017-01-17 15:33                         ` Jan Beulich
  2017-01-17 15:50                           ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-17 15:33 UTC (permalink / raw)
  To: roger.pau, Boris Ostrovsky
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel

>>> On 17.01.17 at 16:27, <boris.ostrovsky@oracle.com> wrote:
> On 01/17/2017 09:44 AM, Jan Beulich wrote:
>>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
>>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
>>> type also doesn't have emulated local APICs. We agreed that this model was
>>> interesting from things like unikernels DomUs, but that's the only reason why
>>> we are providing it. Not that full OSes couldn't use it, but it seems
>>> pointless.
>> You writing things this way makes me notice another possible design
>> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
>> going different directions for at least some use cases (SFI being one
>> example). Hence I think ACPI should - like on bare hardware - remain
>> an optional thing. Which in turn require _all_ information obtained from
>> ACPI (if available) to also be available another way. And this other
>> way might by hypercalls in our case.
> 
> 
> At the risk of derailing this thread: why do we need vCPU hotplug for
> dom0 in the first place? What do we gain over "echo {1|0} >
> /sys/devices/system/cpu/cpuX/online" ?
> 
> I can see why this may be needed for domUs where Xen can enforce number
> of vCPUs that are allowed to run (which we don't enforce now anyway) but
> why for dom0?

Good that you now ask this too - that's the PV hotplug mechanism,
and I've been saying all the time that this should be just fine for PVH
(Dom0 and DomU).

Jan



* Re: PVH CPU hotplug design document
  2017-01-17 15:33                         ` Jan Beulich
@ 2017-01-17 15:50                           ` Boris Ostrovsky
  2017-01-17 17:45                             ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-17 15:50 UTC (permalink / raw)
  To: Jan Beulich, roger.pau
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, xen-devel

On 01/17/2017 10:33 AM, Jan Beulich wrote:
>>>> On 17.01.17 at 16:27, <boris.ostrovsky@oracle.com> wrote:
>> On 01/17/2017 09:44 AM, Jan Beulich wrote:
>>>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
>>>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
>>>> type also doesn't have emulated local APICs. We agreed that this model was
>>>> interesting from things like unikernels DomUs, but that's the only reason why
>>>> we are providing it. Not that full OSes couldn't use it, but it seems
>>>> pointless.
>>> You writing things this way makes me notice another possible design
>>> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
>>> going different directions for at least some use cases (SFI being one
>>> example). Hence I think ACPI should - like on bare hardware - remain
>>> an optional thing. Which in turn require _all_ information obtained from
>>> ACPI (if available) to also be available another way. And this other
>>> way might by hypercalls in our case.
>>
>> At the risk of derailing this thread: why do we need vCPU hotplug for
>> dom0 in the first place? What do we gain over "echo {1|0} >
>> /sys/devices/system/cpu/cpuX/online" ?
>>
>> I can see why this may be needed for domUs where Xen can enforce number
>> of vCPUs that are allowed to run (which we don't enforce now anyway) but
>> why for dom0?
> Good that you now ask this too - that's the PV hotplug mechanism,
> and I've been saying all the time that this should be just fine for PVH
> (Dom0 and DomU).

I think domU hotplug has some value in that we can change the number of vCPUs
that the guest sees, and ACPI-based hotplug allows us to do that in a
"standard" manner.

For dom0 this doesn't seem to be necessary, as it's a special domain
available only to the platform administrator.

Part of the confusion, I think, is that PV hotplug is not really hotplug
as far as the Linux kernel is concerned.


-boris



* Re: PVH CPU hotplug design document
  2017-01-17 15:50                           ` Boris Ostrovsky
@ 2017-01-17 17:45                             ` Roger Pau Monné
  2017-01-17 18:50                               ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-17 17:45 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, Jan Beulich, xen-devel

On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
> On 01/17/2017 10:33 AM, Jan Beulich wrote:
> >>>> On 17.01.17 at 16:27, <boris.ostrovsky@oracle.com> wrote:
> >> On 01/17/2017 09:44 AM, Jan Beulich wrote:
> >>>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
> >>>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
> >>>> type also doesn't have emulated local APICs. We agreed that this model was
> >>>> interesting from things like unikernels DomUs, but that's the only reason why
> >>>> we are providing it. Not that full OSes couldn't use it, but it seems
> >>>> pointless.
> >>> You writing things this way makes me notice another possible design
> >>> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
> >>> going different directions for at least some use cases (SFI being one
> >>> example). Hence I think ACPI should - like on bare hardware - remain
> >>> an optional thing. Which in turn require _all_ information obtained from
> >>> ACPI (if available) to also be available another way. And this other
> >>> way might by hypercalls in our case.
> >>
> >> At the risk of derailing this thread: why do we need vCPU hotplug for
> >> dom0 in the first place? What do we gain over "echo {1|0} >
> >> /sys/devices/system/cpu/cpuX/online" ?
> >>
> >> I can see why this may be needed for domUs where Xen can enforce number
> >> of vCPUs that are allowed to run (which we don't enforce now anyway) but
> >> why for dom0?
> > Good that you now ask this too - that's the PV hotplug mechanism,
> > and I've been saying all the time that this should be just fine for PVH
> > (Dom0 and DomU).
> 
> I think domU hotplug has some value in that we can change number VCPUs
> that the guest sees and ACPI-based hotplug allows us to do that in a
> "standard" manner.
> 
> For dom0 this doesn't seem to be necessary as it's a special domain
> available only to platform administrator.
> 
> Part of confusion I think is because PV hotplug is not hotplug, really,
> as far as Linux kernel is concerned.

Hm, I'm not really sure I'm following, but I think that we could translate this
Dom0 PV hotplug mechanism to PVH as:

 - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
   the entries > dom0_max_vcpus are marked as disabled.
 - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
   APIC or an hypercall.

Would that match what's done for classic PV Dom0?
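
(Roughly, on the Xen/toolstack side the proposal above would amount to
something like the sketch below when building Dom0's MADT; the constants,
helper name and APIC-ID assignment are illustrative only.)

/*
 * Illustrative sketch: emit HVM_MAX_VCPUS local APIC entries in Dom0's MADT
 * and mark everything past dom0_max_vcpus disabled.  The struct layout
 * follows the ACPI local APIC subtable, but the helper and constants are
 * made up for the example.
 */
#include <stdint.h>

#define HVM_MAX_VCPUS       128   /* illustrative value */
#define MADT_TYPE_LAPIC     0
#define MADT_LAPIC_ENABLED  0x1

struct madt_lapic {
    uint8_t  type, length;
    uint8_t  acpi_uid, apic_id;
    uint32_t flags;
} __attribute__((packed));

static void build_dom0_madt_lapics(struct madt_lapic *lapic,
                                   unsigned int dom0_max_vcpus)
{
    unsigned int i;

    for (i = 0; i < HVM_MAX_VCPUS; i++) {
        lapic[i].type     = MADT_TYPE_LAPIC;
        lapic[i].length   = sizeof(lapic[i]);
        lapic[i].acpi_uid = i;
        lapic[i].apic_id  = i * 2;   /* must match the APIC IDs Xen assigns */
        lapic[i].flags    = (i < dom0_max_vcpus) ? MADT_LAPIC_ENABLED : 0;
    }
}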

Roger.


* Re: PVH CPU hotplug design document
  2017-01-17 17:45                             ` Roger Pau Monné
@ 2017-01-17 18:50                               ` Boris Ostrovsky
  2017-01-18 10:34                                 ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-17 18:50 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, Jan Beulich, xen-devel

On 01/17/2017 12:45 PM, Roger Pau Monné wrote:
> On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
>> On 01/17/2017 10:33 AM, Jan Beulich wrote:
>>>>>> On 17.01.17 at 16:27, <boris.ostrovsky@oracle.com> wrote:
>>>> On 01/17/2017 09:44 AM, Jan Beulich wrote:
>>>>>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
>>>>>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
>>>>>> type also doesn't have emulated local APICs. We agreed that this model was
>>>>>> interesting from things like unikernels DomUs, but that's the only reason why
>>>>>> we are providing it. Not that full OSes couldn't use it, but it seems
>>>>>> pointless.
>>>>> You writing things this way makes me notice another possible design
>>>>> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
>>>>> going different directions for at least some use cases (SFI being one
>>>>> example). Hence I think ACPI should - like on bare hardware - remain
>>>>> an optional thing. Which in turn require _all_ information obtained from
>>>>> ACPI (if available) to also be available another way. And this other
>>>>> way might by hypercalls in our case.
>>>> At the risk of derailing this thread: why do we need vCPU hotplug for
>>>> dom0 in the first place? What do we gain over "echo {1|0} >
>>>> /sys/devices/system/cpu/cpuX/online" ?
>>>>
>>>> I can see why this may be needed for domUs where Xen can enforce number
>>>> of vCPUs that are allowed to run (which we don't enforce now anyway) but
>>>> why for dom0?
>>> Good that you now ask this too - that's the PV hotplug mechanism,
>>> and I've been saying all the time that this should be just fine for PVH
>>> (Dom0 and DomU).
>> I think domU hotplug has some value in that we can change number VCPUs
>> that the guest sees and ACPI-based hotplug allows us to do that in a
>> "standard" manner.
>>
>> For dom0 this doesn't seem to be necessary as it's a special domain
>> available only to platform administrator.
>>
>> Part of confusion I think is because PV hotplug is not hotplug, really,
>> as far as Linux kernel is concerned.
> Hm, I'm not really sure I'm following, but I think that we could translate this
> Dom0 PV hotplug mechanism to PVH as:
>
>  - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
>    the entries > dom0_max_vcpus are marked as disabled.
>  - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
>    APIC or an hypercall.
>
> Would that match what's done for classic PV Dom0?

To match what we have for PV dom0 I believe you'd provide MADT with
opt_dom0_max_vcpus_max entries and mark all of them enabled.

dom0 brings up all opt_dom0_max_vcpus_max VCPUs, and then offlines
(opt_dom0_max_vcpus_min+1)..opt_dom0_max_vcpus_max. See
drivers/xen/cpu_hotplug.c:setup_cpu_watcher(). That's why I said it's
not a hotplug but rather on/off-lining.

-boris



* Re: PVH CPU hotplug design document
  2017-01-17 18:50                               ` Boris Ostrovsky
@ 2017-01-18 10:34                                 ` Roger Pau Monné
  2017-01-18 10:44                                   ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-18 10:34 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Stefano Stabellini, Graeme Gregory, Al Stone,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar,
	Julien Grall, Jan Beulich, xen-devel

On Tue, Jan 17, 2017 at 01:50:14PM -0500, Boris Ostrovsky wrote:
> On 01/17/2017 12:45 PM, Roger Pau Monné wrote:
> > On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
> >> On 01/17/2017 10:33 AM, Jan Beulich wrote:
> >>>>>> On 17.01.17 at 16:27, <boris.ostrovsky@oracle.com> wrote:
> >>>> On 01/17/2017 09:44 AM, Jan Beulich wrote:
> >>>>>>>> On 17.01.17 at 15:13, <roger.pau@citrix.com> wrote:
> >>>>>> There's only one kind of PVHv2 guest that doesn't require ACPI, and that guest
> >>>>>> type also doesn't have emulated local APICs. We agreed that this model was
> >>>>>> interesting from things like unikernels DomUs, but that's the only reason why
> >>>>>> we are providing it. Not that full OSes couldn't use it, but it seems
> >>>>>> pointless.
> >>>>> You writing things this way makes me notice another possible design
> >>>>> issue here: Requiring ACPI is a bad thing imo, with even bare hardware
> >>>>> going different directions for at least some use cases (SFI being one
> >>>>> example). Hence I think ACPI should - like on bare hardware - remain
> >>>>> an optional thing. Which in turn require _all_ information obtained from
> >>>>> ACPI (if available) to also be available another way. And this other
> >>>>> way might by hypercalls in our case.
> >>>> At the risk of derailing this thread: why do we need vCPU hotplug for
> >>>> dom0 in the first place? What do we gain over "echo {1|0} >
> >>>> /sys/devices/system/cpu/cpuX/online" ?
> >>>>
> >>>> I can see why this may be needed for domUs where Xen can enforce number
> >>>> of vCPUs that are allowed to run (which we don't enforce now anyway) but
> >>>> why for dom0?
> >>> Good that you now ask this too - that's the PV hotplug mechanism,
> >>> and I've been saying all the time that this should be just fine for PVH
> >>> (Dom0 and DomU).
> >> I think domU hotplug has some value in that we can change number VCPUs
> >> that the guest sees and ACPI-based hotplug allows us to do that in a
> >> "standard" manner.
> >>
> >> For dom0 this doesn't seem to be necessary as it's a special domain
> >> available only to platform administrator.
> >>
> >> Part of confusion I think is because PV hotplug is not hotplug, really,
> >> as far as Linux kernel is concerned.
> > Hm, I'm not really sure I'm following, but I think that we could translate this
> > Dom0 PV hotplug mechanism to PVH as:
> >
> >  - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
> >    the entries > dom0_max_vcpus are marked as disabled.
> >  - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
> >    APIC or an hypercall.
> >
> > Would that match what's done for classic PV Dom0?
> 
> To match what we have for PV dom0 I believe you'd provide MADT with
> opt_dom0_max_vcpus_max entries and mark all of them enabled.
> 
> dom0 brings up all opt_dom0_max_vcpus_max VCPUs, and then offlines
> (opt_dom0_max_vcpus_min+1)..opt_dom0_max_vcpus_max. See
> drivers/xen/cpu_hotplug.c:setup_cpu_watcher(). That's why I said it's
> not a hotplug but rather on/off-lining.

But how does Dom0 get the value of opt_dom0_max_vcpus_min? It doesn't seem to
be propagated anywhere from domain_build.

Also the logic in cpu_hotplug.c is weird IMHO:

static int vcpu_online(unsigned int cpu)
{
        int err;
        char dir[16], state[16];

        sprintf(dir, "cpu/%u", cpu);
        err = xenbus_scanf(XBT_NIL, dir, "availability", "%15s", state);
        if (err != 1) {
                if (!xen_initial_domain())
                        pr_err("Unable to read cpu state\n");
                return err;
        }

        if (strcmp(state, "online") == 0)
                return 1;
        else if (strcmp(state, "offline") == 0)
                return 0;

        pr_err("unknown state(%s) on CPU%d\n", state, cpu);
        return -EINVAL;
}
[...]
static int setup_cpu_watcher(struct notifier_block *notifier,
                              unsigned long event, void *data)
{
        int cpu;
        static struct xenbus_watch cpu_watch = {
                .node = "cpu",
                .callback = handle_vcpu_hotplug_event};

        (void)register_xenbus_watch(&cpu_watch);

        for_each_possible_cpu(cpu) {
                if (vcpu_online(cpu) == 0) {
                        (void)cpu_down(cpu);
                        set_cpu_present(cpu, false);
                }
        }

        return NOTIFY_DONE;
}

xenbus_scanf should return ENOENT for Dom0, because those paths don't exist,
so all the vCPUs are going to be left enabled? I'm quite sure I'm missing
something here...

Roger.


* Re: PVH CPU hotplug design document
  2017-01-18 10:34                                 ` Roger Pau Monné
@ 2017-01-18 10:44                                   ` Jan Beulich
  2017-01-18 11:54                                     ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-18 10:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar, Julien Grall,
	xen-devel, Boris Ostrovsky

>>> On 18.01.17 at 11:34, <roger.pau@citrix.com> wrote:
> On Tue, Jan 17, 2017 at 01:50:14PM -0500, Boris Ostrovsky wrote:
>> On 01/17/2017 12:45 PM, Roger Pau Monné wrote:
>> > On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
>> >> Part of confusion I think is because PV hotplug is not hotplug, really,
>> >> as far as Linux kernel is concerned.
>> > Hm, I'm not really sure I'm following, but I think that we could translate this
>> > Dom0 PV hotplug mechanism to PVH as:
>> >
>> >  - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
>> >    the entries > dom0_max_vcpus are marked as disabled.
>> >  - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
>> >    APIC or an hypercall.
>> >
>> > Would that match what's done for classic PV Dom0?
>> 
>> To match what we have for PV dom0 I believe you'd provide MADT with
>> opt_dom0_max_vcpus_max entries and mark all of them enabled.
>> 
>> dom0 brings up all opt_dom0_max_vcpus_max VCPUs, and then offlines
>> (opt_dom0_max_vcpus_min+1)..opt_dom0_max_vcpus_max. See
>> drivers/xen/cpu_hotplug.c:setup_cpu_watcher(). That's why I said it's
>> not a hotplug but rather on/off-lining.
> 
> But how does Dom0 get the value of opt_dom0_max_vcpus_min? It doesn't seem to
> be propagated anywhere from domain_build.

I'm afraid Boris has given a meaning to that (Xen) command line
option which it doesn't have. Please see that option's description
in xen-command-line.markdown. How many vCPU-s should be
offlined is - iirc - being established by a system boot setting
inside Dom0.

> Also the logic in cpu_hotplug.c is weird IMHO:
> 
> static int vcpu_online(unsigned int cpu)
> {
>         int err;
>         char dir[16], state[16];
> 
>         sprintf(dir, "cpu/%u", cpu);
>         err = xenbus_scanf(XBT_NIL, dir, "availability", "%15s", state);
>         if (err != 1) {
>                 if (!xen_initial_domain())
>                         pr_err("Unable to read cpu state\n");
>                 return err;
>         }
> 
>         if (strcmp(state, "online") == 0)
>                 return 1;
>         else if (strcmp(state, "offline") == 0)
>                 return 0;
> 
>         pr_err("unknown state(%s) on CPU%d\n", state, cpu);
>         return -EINVAL;
> }
> [...]
> static int setup_cpu_watcher(struct notifier_block *notifier,
>                               unsigned long event, void *data)
> {
>         int cpu;
>         static struct xenbus_watch cpu_watch = {
>                 .node = "cpu",
>                 .callback = handle_vcpu_hotplug_event};
> 
>         (void)register_xenbus_watch(&cpu_watch);
> 
>         for_each_possible_cpu(cpu) {
>                 if (vcpu_online(cpu) == 0) {
>                         (void)cpu_down(cpu);
>                         set_cpu_present(cpu, false);
>                 }
>         }
> 
>         return NOTIFY_DONE;
> }
> 
> xenbus_scanf should return ENOENT for Dom0, because those paths don't exist,
> and the all vcpus are going to be left enabled? I'm quite sure I'm missing
> something here...

Well, the watch ought to trigger once the paths appear, at which
point the offlining should be happening. The explicit check in
setup_cpu_watcher() is indeed useful only for DomU.

Jan


* Re: PVH CPU hotplug design document
  2017-01-18 10:44                                   ` Jan Beulich
@ 2017-01-18 11:54                                     ` Roger Pau Monné
  2017-01-18 13:25                                       ` Jan Beulich
  2017-01-18 13:34                                       ` Boris Ostrovsky
  0 siblings, 2 replies; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-18 11:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar, Julien Grall,
	xen-devel, Boris Ostrovsky

On Wed, Jan 18, 2017 at 03:44:19AM -0700, Jan Beulich wrote:
> >>> On 18.01.17 at 11:34, <roger.pau@citrix.com> wrote:
> > On Tue, Jan 17, 2017 at 01:50:14PM -0500, Boris Ostrovsky wrote:
> >> On 01/17/2017 12:45 PM, Roger Pau Monné wrote:
> >> > On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
> >> >> Part of confusion I think is because PV hotplug is not hotplug, really,
> >> >> as far as Linux kernel is concerned.
> >> > Hm, I'm not really sure I'm following, but I think that we could translate this
> >> > Dom0 PV hotplug mechanism to PVH as:
> >> >
> >> >  - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
> >> >    the entries > dom0_max_vcpus are marked as disabled.
> >> >  - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
> >> >    APIC or an hypercall.
> >> >
> >> > Would that match what's done for classic PV Dom0?
> >> 
> >> To match what we have for PV dom0 I believe you'd provide MADT with
> >> opt_dom0_max_vcpus_max entries and mark all of them enabled.
> >> 
> >> dom0 brings up all opt_dom0_max_vcpus_max VCPUs, and then offlines
> >> (opt_dom0_max_vcpus_min+1)..opt_dom0_max_vcpus_max. See
> >> drivers/xen/cpu_hotplug.c:setup_cpu_watcher(). That's why I said it's
> >> not a hotplug but rather on/off-lining.
> > 
> > But how does Dom0 get the value of opt_dom0_max_vcpus_min? It doesn't seem to
> > be propagated anywhere from domain_build.
> 
> I'm afraid Boris has given a meaning to that (Xen) command line
> option which it doesn't have. Please see that option's description
> in xen-command-line.markdown. How many vCPU-s should be
> offlined is - iirc - being established by a system boot setting
> inside Dom0.

So, would it be fine to start a PVH Dom0 with as many vCPUs as returned by
dom0_max_vcpus, and mark them all as enabled in the MADT? That's basically all
we need in order to match current PV Dom0 functionality.

Roger.


* Re: PVH CPU hotplug design document
  2017-01-18 11:54                                     ` Roger Pau Monné
@ 2017-01-18 13:25                                       ` Jan Beulich
  2017-01-22 18:39                                         ` Boris Ostrovsky
  2017-01-18 13:34                                       ` Boris Ostrovsky
  1 sibling, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-18 13:25 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar, Julien Grall,
	xen-devel, Boris Ostrovsky

>>> On 18.01.17 at 12:54, <roger.pau@citrix.com> wrote:
> So, would it be fine to start a PVH Dom0 with as many vCPUs as what's returned
> from dom0_max_vcpus, and mark them as enabled in the MADT. That's basically all
> we need in order to match current PV Dom0 functionality?

Yes, I think so.

Jan



* Re: PVH CPU hotplug design document
  2017-01-18 11:54                                     ` Roger Pau Monné
  2017-01-18 13:25                                       ` Jan Beulich
@ 2017-01-18 13:34                                       ` Boris Ostrovsky
  1 sibling, 0 replies; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-18 13:34 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory,
	Konrad Rzeszutek Wilk, Andrew Cooper, Anshul Makkar, Julien Grall,
	xen-devel

On 01/18/2017 06:54 AM, Roger Pau Monné wrote:
> On Wed, Jan 18, 2017 at 03:44:19AM -0700, Jan Beulich wrote:
>>>>> On 18.01.17 at 11:34, <roger.pau@citrix.com> wrote:
>>> On Tue, Jan 17, 2017 at 01:50:14PM -0500, Boris Ostrovsky wrote:
>>>> On 01/17/2017 12:45 PM, Roger Pau Monné wrote:
>>>>> On Tue, Jan 17, 2017 at 10:50:44AM -0500, Boris Ostrovsky wrote:
>>>>>> Part of confusion I think is because PV hotplug is not hotplug, really,
>>>>>> as far as Linux kernel is concerned.
>>>>> Hm, I'm not really sure I'm following, but I think that we could translate this
>>>>> Dom0 PV hotplug mechanism to PVH as:
>>>>>
>>>>>  - Dom0 is provided with up to HVM_MAX_VCPUS local APIC entries in the MADT, and
>>>>>    the entries > dom0_max_vcpus are marked as disabled.
>>>>>  - Dom0 has HVM_MAX_VCPUS vCPUs ready to be started, either by using the local
>>>>>    APIC or an hypercall.
>>>>>
>>>>> Would that match what's done for classic PV Dom0?
>>>> To match what we have for PV dom0 I believe you'd provide MADT with
>>>> opt_dom0_max_vcpus_max entries and mark all of them enabled.
>>>>
>>>> dom0 brings up all opt_dom0_max_vcpus_max VCPUs, and then offlines
>>>> (opt_dom0_max_vcpus_min+1)..opt_dom0_max_vcpus_max. See
>>>> drivers/xen/cpu_hotplug.c:setup_cpu_watcher(). That's why I said it's
>>>> not a hotplug but rather on/off-lining.
>>> But how does Dom0 get the value of opt_dom0_max_vcpus_min? It doesn't seem to
>>> be propagated anywhere from domain_build.
>> I'm afraid Boris has given a meaning to that (Xen) command line
>> option which it doesn't have. Please see that option's description
>> in xen-command-line.markdown. How many vCPU-s should be
>> offlined is - iirc - being established by a system boot setting
>> inside Dom0.

Yes, I was wrong on both counts --- opt_dom0_max_vcpus_{max|min} use and
offlining at boot time (I was indeed thinking of domU). Sorry about that.

-boris





* Re: PVH CPU hotplug design document
  2017-01-18 13:25                                       ` Jan Beulich
@ 2017-01-22 18:39                                         ` Boris Ostrovsky
  2017-01-23 10:35                                           ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-22 18:39 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory, Andrew Cooper,
	Anshul Makkar, Julien Grall, xen-devel



On 01/18/2017 08:25 AM, Jan Beulich wrote:
>>>> On 18.01.17 at 12:54, <roger.pau@citrix.com> wrote:
>> So, would it be fine to start a PVH Dom0 with as many vCPUs as what's returned
>> from dom0_max_vcpus, and mark them as enabled in the MADT. That's basically all
>> we need in order to match current PV Dom0 functionality?
> Yes, I think so.


Have we then decided that we are not supporting ACPI hotplug for both 
dom0 and domU?

-boris


* Re: PVH CPU hotplug design document
  2017-01-22 18:39                                         ` Boris Ostrovsky
@ 2017-01-23 10:35                                           ` Jan Beulich
  2017-01-23 14:28                                             ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-23 10:35 UTC (permalink / raw)
  To: roger.pau, Boris Ostrovsky
  Cc: Stefano Stabellini, Al Stone, Graeme Gregory, Andrew Cooper,
	Anshul Makkar, Julien Grall, xen-devel

>>> On 22.01.17 at 19:39, <boris.ostrovsky@oracle.com> wrote:
> On 01/18/2017 08:25 AM, Jan Beulich wrote:
>>>>> On 18.01.17 at 12:54, <roger.pau@citrix.com> wrote:
>>> So, would it be fine to start a PVH Dom0 with as many vCPUs as what's returned
>>> from dom0_max_vcpus, and mark them as enabled in the MADT. That's basically all
>>> we need in order to match current PV Dom0 functionality?
>> Yes, I think so.
> 
> 
> Have we then decided that we are not supporting ACPI hotplug for both 
> dom0 and domU?

I don't think anything has been decided yet, it's just that the road to
(consistent) ACPI hotplug support still seems somewhat cloudy, so
getting there (if we're convinced this is a worthwhile goal) may still
take some time.

Jan



* Re: PVH CPU hotplug design document
  2017-01-23 10:35                                           ` Jan Beulich
@ 2017-01-23 14:28                                             ` Boris Ostrovsky
  2017-01-23 15:17                                               ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-23 14:28 UTC (permalink / raw)
  To: Jan Beulich, roger.pau
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

On 01/23/2017 05:35 AM, Jan Beulich wrote:
>>>> On 22.01.17 at 19:39, <boris.ostrovsky@oracle.com> wrote:
>> On 01/18/2017 08:25 AM, Jan Beulich wrote:
>>>>>> On 18.01.17 at 12:54, <roger.pau@citrix.com> wrote:
>>>> So, would it be fine to start a PVH Dom0 with as many vCPUs as what's returned
>>>> from dom0_max_vcpus, and mark them as enabled in the MADT. That's basically all
>>>> we need in order to match current PV Dom0 functionality?
>>> Yes, I think so.
>>
>> Have we then decided that we are not supporting ACPI hotplug for both 
>> dom0 and domU?
> I don't think anything has been decided yet, it's just that the road to
> (consistent) ACPI hotplug support still seems somewhat cloudy, so
> getting there (if we're convinced this is a worthwhile goal) may still
> take some time.


I am mostly interested in domU support at this point. This is the only
feature that I can think of right now that blocks domU support in Linux.

As I said in an earlier message I don't think dom0 needs ACPI hotplug
(or hotplug at all) while domU would benefit from it. But perhaps by
"consistent" you meant that you prefer both to behave in the same manner.

From the Linux perspective, one option could be to have domU start with
PV-style vCPU on/offlining based on xenstore and switch to ACPI hotplug
if/when it becomes available. This, however, will need an indication from
the hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
discussed earlier.
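
(A sketch of what that selection could look like at guest boot, reusing the
hypothetical CPUID feature bit discussed earlier in the thread; none of the
names below are an existing interface.)

/*
 * Sketch only: picking the vCPU hotplug mechanism at guest boot.  The
 * xen_has_feat() helper and the feature bit reuse the hypothetical CPUID
 * flag from earlier in this thread; register_xen_cpu_watch() stands in
 * for the existing xenstore watch setup in drivers/xen/cpu_hotplug.c.
 */
#include <stdbool.h>

#define XEN_FEAT_ACPI_VCPU_HOTPLUG  (1u << 1)   /* hypothetical bit */

extern bool xen_has_feat(unsigned int feat);    /* see earlier sketch */
extern void register_xen_cpu_watch(void);       /* placeholder */

static void init_vcpu_hotplug(void)
{
    if (xen_has_feat(XEN_FEAT_ACPI_VCPU_HOTPLUG)) {
        /* Rely on the native ACPI processor driver for hot(un)plug. */
        return;
    }

    /*
     * Fall back to PV-style on/offlining: watch "cpu/" in xenstore and
     * online/offline vCPUs as their "availability" nodes change.
     */
    register_xen_cpu_watch();
}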

-boris




* Re: PVH CPU hotplug design document
  2017-01-23 14:28                                             ` Boris Ostrovsky
@ 2017-01-23 15:17                                               ` Jan Beulich
  2017-01-23 15:43                                                 ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-23 15:17 UTC (permalink / raw)
  To: roger.pau, Boris Ostrovsky
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

>>> On 23.01.17 at 15:28, <boris.ostrovsky@oracle.com> wrote:
> On 01/23/2017 05:35 AM, Jan Beulich wrote:
>>>>> On 22.01.17 at 19:39, <boris.ostrovsky@oracle.com> wrote:
>>> On 01/18/2017 08:25 AM, Jan Beulich wrote:
>>>>>>> On 18.01.17 at 12:54, <roger.pau@citrix.com> wrote:
>>>>> So, would it be fine to start a PVH Dom0 with as many vCPUs as what's returned
>>>>> from dom0_max_vcpus, and mark them as enabled in the MADT. That's basically all
>>>>> we need in order to match current PV Dom0 functionality?
>>>> Yes, I think so.
>>>
>>> Have we then decided that we are not supporting ACPI hotplug for both 
>>> dom0 and domU?
>> I don't think anything has been decided yet, it's just that the road to
>> (consistent) ACPI hotplug support still seems somewhat cloudy, so
>> getting there (if we're convinced this is a worthwhile goal) may still
>> take some time.
> 
> 
> I am mostly interested in domU support at this point. This is the only
> feature that I can think of right now that blocks domU support in Linux.
> 
> As I said in an earlier message I don't think dom0 needs ACPI hotplug
> (or hotplug at all) while domU would benefit from it. But perhaps by
> "consistent" you meant that you prefer both to behave in the same manner.

Yes.

> From Linux perspective one option could be to have domU with PV-style
> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
> it becomes available. This, however, will need an indication from the
> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
> discussed earlier.

I think we shouldn't overload that flag. Didn't we settle already on using
two CPUID flags (one for PV-style onlining/offlining, the other for ACPI
hot(un)plug)? With that I think I could then be talked into accepting the
existence of two different models (and kernels could pick which one(s)
they would like to support).

Jan



* Re: PVH CPU hotplug design document
  2017-01-23 15:17                                               ` Jan Beulich
@ 2017-01-23 15:43                                                 ` Boris Ostrovsky
  2017-01-23 15:49                                                   ` Boris Ostrovsky
  2017-01-23 15:49                                                   ` Jan Beulich
  0 siblings, 2 replies; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-23 15:43 UTC (permalink / raw)
  To: Jan Beulich, roger.pau
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel


>> From Linux perspective one option could be to have domU with PV-style
>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
>> it becomes available. This, however, will need an indication from the
>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
>> discussed earlier.
> I think we shouldn't overload that flag. Didn't we settle already on using
> two CPUID flags (of for PV-style onlining/offlining, the other for ACPI
> hot(un)plug)? With that I think I could then be talked into accepting the
> existence of two different models (and kernels could pick which one(s)
> they would like to support).

I forgot about the existence of ACPI_FADT_HW_REDUCED until this morning,
which is why I mentioned it now.

We can go with CPUID flags, although I am not sure why we'd need two. I'd
think that an OS can be expected to always support PV-style, so the flag
would indicate support for ACPI-based hotplug.

-boris



* Re: PVH CPU hotplug design document
  2017-01-23 15:43                                                 ` Boris Ostrovsky
@ 2017-01-23 15:49                                                   ` Boris Ostrovsky
  2017-01-23 15:58                                                     ` Jan Beulich
  2017-01-23 15:49                                                   ` Jan Beulich
  1 sibling, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-23 15:49 UTC (permalink / raw)
  To: Jan Beulich, roger.pau
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

On 01/23/2017 10:43 AM, Boris Ostrovsky wrote:
>>> From Linux perspective one option could be to have domU with PV-style
>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
>>> it becomes available. This, however, will need an indication from the
>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
>>> discussed earlier.
>> I think we shouldn't overload that flag. Didn't we settle already on using
>> two CPUID flags (of for PV-style onlining/offlining, the other for ACPI
>> hot(un)plug)? With that I think I could then be talked into accepting the
>> existence of two different models (and kernels could pick which one(s)
>> they would like to support).
> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
> which is why I mentioned it now.
>
> We can go with CPUID flags although I am not sure why we'd need two. I'd
> think that OS can be expected to always support PV-style so the flag
> would indicate support for ACPI-based hotplug.

In fact, it doesn't matter whether the OS supports PV-style hotplug: Xen
will always set the appropriate xenstore entries, and it's up to the OS
whether to watch them and act upon this.

-boris



* Re: PVH CPU hotplug design document
  2017-01-23 15:43                                                 ` Boris Ostrovsky
  2017-01-23 15:49                                                   ` Boris Ostrovsky
@ 2017-01-23 15:49                                                   ` Jan Beulich
  1 sibling, 0 replies; 46+ messages in thread
From: Jan Beulich @ 2017-01-23 15:49 UTC (permalink / raw)
  To: roger.pau, Boris Ostrovsky
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

>>> On 23.01.17 at 16:43, <boris.ostrovsky@oracle.com> wrote:
>>> From Linux perspective one option could be to have domU with PV-style
>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
>>> it becomes available. This, however, will need an indication from the
>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
>>> discussed earlier.
>> I think we shouldn't overload that flag. Didn't we settle already on using
>> two CPUID flags (of for PV-style onlining/offlining, the other for ACPI
>> hot(un)plug)? With that I think I could then be talked into accepting the
>> existence of two different models (and kernels could pick which one(s)
>> they would like to support).
> 
> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
> which is why I mentioned it now.
> 
> We can go with CPUID flags although I am not sure why we'd need two. I'd
> think that OS can be expected to always support PV-style so the flag
> would indicate support for ACPI-based hotplug.

Well, to date PV style wasn't meant to be supported, was it? In which
case it would be legitimate to put it behind a feature flag. That would
then also pave a road towards removing the support for this (and
simply no longer setting that flag).

Jan



* Re: PVH CPU hotplug design document
  2017-01-23 15:49                                                   ` Boris Ostrovsky
@ 2017-01-23 15:58                                                     ` Jan Beulich
  2017-01-23 16:24                                                       ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-23 15:58 UTC (permalink / raw)
  To: roger.pau, Boris Ostrovsky
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

>>> On 23.01.17 at 16:49, <boris.ostrovsky@oracle.com> wrote:
> On 01/23/2017 10:43 AM, Boris Ostrovsky wrote:
>>>> From Linux perspective one option could be to have domU with PV-style
>>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
>>>> it becomes available. This, however, will need an indication from the
>>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
>>>> discussed earlier.
>>> I think we shouldn't overload that flag. Didn't we settle already on using
>>> two CPUID flags (of for PV-style onlining/offlining, the other for ACPI
>>> hot(un)plug)? With that I think I could then be talked into accepting the
>>> existence of two different models (and kernels could pick which one(s)
>>> they would like to support).
>> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
>> which is why I mentioned it now.
>>
>> We can go with CPUID flags although I am not sure why we'd need two. I'd
>> think that OS can be expected to always support PV-style so the flag
>> would indicate support for ACPI-based hotplug.
> 
> In fact, it doesn't matter whether OS supports PV-style hotplug. It's
> that Xen will always set appropriate xenstore entry. It's up to the OS
> whether to watch it and act upon this.

That's a good point, perhaps just one CPUID flag will do then indeed.

Jan



* Re: PVH CPU hotplug design document
  2017-01-23 15:58                                                     ` Jan Beulich
@ 2017-01-23 16:24                                                       ` Boris Ostrovsky
  2017-01-23 16:50                                                         ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-23 16:24 UTC (permalink / raw)
  To: Jan Beulich, roger.pau
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

On 01/23/2017 10:58 AM, Jan Beulich wrote:
>>>> On 23.01.17 at 16:49, <boris.ostrovsky@oracle.com> wrote:
>> On 01/23/2017 10:43 AM, Boris Ostrovsky wrote:
>>>>> From Linux perspective one option could be to have domU with PV-style
>>>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
>>>>> it becomes available. This, however, will need an indication from the
>>>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
>>>>> discussed earlier.
>>>> I think we shouldn't overload that flag. Didn't we settle already on using
>>>> two CPUID flags (of for PV-style onlining/offlining, the other for ACPI
>>>> hot(un)plug)? With that I think I could then be talked into accepting the
>>>> existence of two different models (and kernels could pick which one(s)
>>>> they would like to support).
>>> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
>>> which is why I mentioned it now.
>>>
>>> We can go with CPUID flags although I am not sure why we'd need two. I'd
>>> think that OS can be expected to always support PV-style so the flag
>>> would indicate support for ACPI-based hotplug.
>> In fact, it doesn't matter whether OS supports PV-style hotplug. It's
>> that Xen will always set appropriate xenstore entry. It's up to the OS
>> whether to watch it and act upon this.
> That's a good point, perhaps just one CPUID flag will do then indeed.

Roger, are you going to include this in your patchset or do you want me
to do it?

-boris



* Re: PVH CPU hotplug design document
  2017-01-23 16:24                                                       ` Boris Ostrovsky
@ 2017-01-23 16:50                                                         ` Roger Pau Monné
  2017-01-23 17:05                                                           ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-23 16:50 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, Stefano Stabellini, Graeme Gregory, Al Stone,
	Andrew Cooper, Anshul Makkar, Julien Grall, Jan Beulich,
	xen-devel

On Mon, Jan 23, 2017 at 11:24:19AM -0500, Boris Ostrovsky wrote:
> On 01/23/2017 10:58 AM, Jan Beulich wrote:
> >>>> On 23.01.17 at 16:49, <boris.ostrovsky@oracle.com> wrote:
> >> On 01/23/2017 10:43 AM, Boris Ostrovsky wrote:
> >>>>> From Linux perspective one option could be to have domU with PV-style
> >>>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
> >>>>> it becomes available. This, however, will need an indication from the
> >>>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
> >>>>> discussed earlier.
> >>>> I think we shouldn't overload that flag. Didn't we settle already on using
> >>>> two CPUID flags (of for PV-style onlining/offlining, the other for ACPI
> >>>> hot(un)plug)? With that I think I could then be talked into accepting the
> >>>> existence of two different models (and kernels could pick which one(s)
> >>>> they would like to support).
> >>> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
> >>> which is why I mentioned it now.
> >>>
> >>> We can go with CPUID flags although I am not sure why we'd need two. I'd
> >>> think that OS can be expected to always support PV-style so the flag
> >>> would indicate support for ACPI-based hotplug.
> >> In fact, it doesn't matter whether OS supports PV-style hotplug. It's
> >> that Xen will always set appropriate xenstore entry. It's up to the OS
> >> whether to watch it and act upon this.
> > That's a good point, perhaps just one CPUID flag will do then indeed.
> 
> Roger, are you going to include this in your patchset or do you want me
> to do it?

I think this should be introduced in your DomU ACPI CPU hotplug series, and not
set for Dom0 until we find a way to perform ACPI vCPU hotplug for Dom0 also.
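
(For illustration only: a minimal userspace sketch of how a guest could key
off such a CPUID flag to choose between the two hotplug models. The Xen
signature scan is the usual one; the feature leaf offset and bit below are
placeholders invented for this sketch, not anything defined by Xen or by the
series.)

    /*
     * Placeholder example: pick a vCPU hotplug model from a hypothetical
     * Xen CPUID feature bit.  XEN_HYP_FEAT_LEAF_OFF and XEN_HYP_ACPI_HOTPLUG
     * are invented names; build with: gcc -O2 -o hotplug-probe probe.c
     */
    #include <cpuid.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define XEN_HYP_FEAT_LEAF_OFF 4u          /* hypothetical leaf offset */
    #define XEN_HYP_ACPI_HOTPLUG  (1u << 0)   /* hypothetical feature bit */

    /* Scan the hypervisor CPUID range for the Xen signature. */
    static uint32_t xen_cpuid_base(void)
    {
        uint32_t base, eax, ebx, ecx, edx;
        char sig[13];

        for (base = 0x40000000u; base < 0x40010000u; base += 0x100) {
            __cpuid(base, eax, ebx, ecx, edx);
            memcpy(sig + 0, &ebx, 4);
            memcpy(sig + 4, &ecx, 4);
            memcpy(sig + 8, &edx, 4);
            sig[12] = '\0';
            if (!strcmp(sig, "XenVMMXenVMM"))
                return base;
        }
        return 0;
    }

    int main(void)
    {
        uint32_t base = xen_cpuid_base(), eax, ebx, ecx, edx;

        if (!base) {
            puts("not running on Xen");
            return 1;
        }

        __cpuid(base + XEN_HYP_FEAT_LEAF_OFF, eax, ebx, ecx, edx);
        if (eax & XEN_HYP_ACPI_HOTPLUG)
            puts("ACPI vCPU hotplug advertised");
        else
            puts("fall back to xenstore-based vCPU onlining");
        return 0;
    }

A kernel would do the equivalent check once at boot and register either the
xenstore watch or the ACPI processor handling accordingly.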

Roger.


* Re: PVH CPU hotplug design document
  2017-01-23 16:50                                                         ` Roger Pau Monné
@ 2017-01-23 17:05                                                           ` Boris Ostrovsky
  2017-01-23 17:14                                                             ` Roger Pau Monné
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-23 17:05 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Stefano Stabellini, Graeme Gregory, Al Stone,
	Andrew Cooper, Anshul Makkar, Julien Grall, Jan Beulich,
	xen-devel

On 01/23/2017 11:50 AM, Roger Pau Monné wrote:
> On Mon, Jan 23, 2017 at 11:24:19AM -0500, Boris Ostrovsky wrote:
>> On 01/23/2017 10:58 AM, Jan Beulich wrote:
>>>>>> On 23.01.17 at 16:49, <boris.ostrovsky@oracle.com> wrote:
>>>> On 01/23/2017 10:43 AM, Boris Ostrovsky wrote:
>>>>>>> From Linux perspective one option could be to have domU with PV-style
>>>>>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
>>>>>>> it becomes available. This, however, will need an indication from the
>>>>>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
>>>>>>> discussed earlier.
>>>>>> I think we shouldn't overload that flag. Didn't we settle already on using
>>>>>> two CPUID flags (one for PV-style onlining/offlining, the other for ACPI
>>>>>> hot(un)plug)? With that I think I could then be talked into accepting the
>>>>>> existence of two different models (and kernels could pick which one(s)
>>>>>> they would like to support).
>>>>> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
>>>>> which is why I mentioned it now.
>>>>>
>>>>> We can go with CPUID flags although I am not sure why we'd need two. I'd
>>>>> think that OS can be expected to always support PV-style so the flag
>>>>> would indicate support for ACPI-based hotplug.
>>>> In fact, it doesn't matter whether OS supports PV-style hotplug. It's
>>>> that Xen will always set appropriate xenstore entry. It's up to the OS
>>>> whether to watch it and act upon this.
>>> That's a good point, perhaps just one CPUID flag will do then indeed.
>> Roger, are you going to include this in your patchset or do you want me
>> to do it?
> I think this should be introduced in your DomU ACPI CPU hotplug series, and not
> set for Dom0 until we find a way to perform ACPI vCPU hotplug for Dom0 also.
>

Right, although my understanding is that the series is on hold until we
come to agreement about what to do about dom0. But I guess if we agree
that we'll have a single ACPI hotplug feature bit then we can go ahead
with Linux domU patches without waiting for Xen ACPI support.


-boris



* Re: PVH CPU hotplug design document
  2017-01-23 17:05                                                           ` Boris Ostrovsky
@ 2017-01-23 17:14                                                             ` Roger Pau Monné
  2017-01-23 18:36                                                               ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Roger Pau Monné @ 2017-01-23 17:14 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, Stefano Stabellini, Graeme Gregory, Al Stone,
	Andrew Cooper, Anshul Makkar, Julien Grall, Jan Beulich,
	xen-devel

On Mon, Jan 23, 2017 at 12:05:11PM -0500, Boris Ostrovsky wrote:
> On 01/23/2017 11:50 AM, Roger Pau Monné wrote:
> > On Mon, Jan 23, 2017 at 11:24:19AM -0500, Boris Ostrovsky wrote:
> >> On 01/23/2017 10:58 AM, Jan Beulich wrote:
> >>>>>> On 23.01.17 at 16:49, <boris.ostrovsky@oracle.com> wrote:
> >>>> On 01/23/2017 10:43 AM, Boris Ostrovsky wrote:
> >>>>>>> From Linux perspective one option could be to have domU with PV-style
> >>>>>>> vCPU on/offlining based on xenstore and switch to ACPI hotplug if/when
> >>>>>>> it becomes available. This, however, will need an indication from the
> >>>>>>> hypervisor. We could, for example, set ACPI_FADT_HW_REDUCED, as we
> >>>>>>> discussed earlier.
> >>>>>> I think we shouldn't overload that flag. Didn't we settle already on using
> >>>>>> two CPUID flags (one for PV-style onlining/offlining, the other for ACPI
> >>>>>> hot(un)plug)? With that I think I could then be talked into accepting the
> >>>>>> existence of two different models (and kernels could pick which one(s)
> >>>>>> they would like to support).
> >>>>> I forgot about existence of ACPI_FADT_HW_REDUCED until this morning,
> >>>>> which is why I mentioned it now.
> >>>>>
> >>>>> We can go with CPUID flags although I am not sure why we'd need two. I'd
> >>>>> think that OS can be expected to always support PV-style so the flag
> >>>>> would indicate support for ACPI-based hotplug.
> >>>> In fact, it doesn't matter whether OS supports PV-style hotplug. It's
> >>>> that Xen will always set appropriate xenstore entry. It's up to the OS
> >>>> whether to watch it and act upon this.
> >>> That's a good point, perhaps just one CPUID flag will do then indeed.
> >> Roger, are you going to include this in your patchset or do you want me
> >> to do it?
> > I think this should be introduced in your DomU ACPI CPU hotplug series, and not
> > set for Dom0 until we find a way to perform ACPI vCPU hotplug for Dom0 also.
> >
> 
> Right, although my understanding is that the series is on hold until we
> come to agreement about what to do about dom0. But I guess if we agree
> that we'll have a single ACPI hotplug feature bit then we can go ahead
> with Linux domU patches without waiting for Xen ACPI support.

My understanding is that the way forward now is to introduce this bit, use it
in your series and decide what we do with Dom0 when we get to a point that we
can start to test/implement vCPU hotplug there. There's still work to do until
we can get to the point of simply booting a static no-hotplug PVHv2 Dom0.

Roger.


* Re: PVH CPU hotplug design document
  2017-01-23 17:14                                                             ` Roger Pau Monné
@ 2017-01-23 18:36                                                               ` Boris Ostrovsky
  2017-01-24  7:43                                                                 ` Jan Beulich
  0 siblings, 1 reply; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-23 18:36 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel


>>> I think this should be introduced in your DomU ACPI CPU hotplug series, and not
>>> set for Dom0 until we find a way to perform ACPI vCPU hotplug for Dom0 also.
>>>
>> Right, although my understanding is that the series is on hold until we
>> come to agreement about what to do about dom0. But I guess if we agree
>> that we'll have a single ACPI hotplug feature bit then we can go ahead
>> with Linux domU patches without waiting for Xen ACPI support.
> My understanding is that the way forward now is to introduce this bit, use it
> in your series and decide what we do with Dom0 when we get to a point that we
> can start to test/implement vCPU hotplug there. There's still work to do until
> we can get to the point of simply booting a static no-hotplug PVHv2 Dom0.

Jan, is that how you want to proceed?

If yes then do you want me to resubmit the series with CPUID support or
should I wait for you to review v6
(https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html)
first?

-boris



* Re: PVH CPU hotplug design document
  2017-01-23 18:36                                                               ` Boris Ostrovsky
@ 2017-01-24  7:43                                                                 ` Jan Beulich
  2017-01-24 14:15                                                                   ` Boris Ostrovsky
  0 siblings, 1 reply; 46+ messages in thread
From: Jan Beulich @ 2017-01-24  7:43 UTC (permalink / raw)
  To: roger.pau, Boris Ostrovsky
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

>>> On 23.01.17 at 19:36, <boris.ostrovsky@oracle.com> wrote:

>>>> I think this should be introduced in your DomU ACPI CPU hotplug series, and not
>>>> set for Dom0 until we find a way to perform ACPI vCPU hotplug for Dom0 also.
>>>>
>>> Right, although my understanding is that the series is on hold until we
>>> come to agreement about what to do about dom0. But I guess if we agree
>>> that we'll have a single ACPI hotplug feature bit then we can go ahead
>>> with Linux domU patches without waiting for Xen ACPI support.

How would you test such Linux patches if the Xen side is missing? Or
am I misunderstanding the statement above?

>> My understanding is that the way forward now is to introduce this bit, use it
>> in your series and decide what we do with Dom0 when we get to a point that we
>> can start to test/implement vCPU hotplug there. There's still work to do until
>> we can get to the point of simply booting a static no-hotplug PVHv2 Dom0.
> 
> Jan, is that how you want to proceed?

Well, it doesn't depend just on my opinion, I think. As said, I could
live with the bimodal approach, but I'm not really happy with it. Hence
imo we should wait for at least a second opinion (perhaps Andrew's, but
others' input would be more than welcome).

> If yes then do you want me to resubmit the series with CPUID support or
> should I wait for you to review v6
> (https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html)
> first?

Well, as you may have seen, quite a few patch series have been
submitted over the last few days. With the desire to also work on my
own series, and the need to also work on other stuff, I can't really
predict when I'd get to look at your or any other patches (it is, to
be honest, quite discouraging to see so much being submitted for
review, yet so little participation in reviews, but that's not a new
or unusual situation at all). Hence I think not waiting for the
review of v6 would be more reasonable, unless the aforementioned
further input gathering turns out to take overly long.

Jan



* Re: PVH CPU hotplug design document
  2017-01-24  7:43                                                                 ` Jan Beulich
@ 2017-01-24 14:15                                                                   ` Boris Ostrovsky
  0 siblings, 0 replies; 46+ messages in thread
From: Boris Ostrovsky @ 2017-01-24 14:15 UTC (permalink / raw)
  To: Jan Beulich, roger.pau
  Cc: Juergen Gross, Stefano Stabellini, Al Stone, Graeme Gregory,
	Andrew Cooper, Anshul Makkar, Julien Grall, xen-devel

On 01/24/2017 02:43 AM, Jan Beulich wrote:
>>>> On 23.01.17 at 19:36, <boris.ostrovsky@oracle.com> wrote:
>>>>> I think this should be introduced in your DomU ACPI CPU hotplug series, and not
>>>>> set for Dom0 until we find a way to perform ACPI vCPU hotplug for Dom0 also.
>>>>>
>>>> Right, although my understanding is that the series is on hold until we
>>>> come to agreement about what to do about dom0. But I guess if we agree
>>>> that we'll have a single ACPI hotplug feature bit then we can go ahead
>>>> with Linux domU patches without waiting for Xen ACPI support.
> How would you test such Linux patches if the Xen side is missing? Or
> am I misunderstanding the statement above?

PV-style hotplug is already supported by the toolstack, so that's the
only type of hotplug that domU will use. Once the ACPI series (with the
CPUID flag) goes in, Linux will add code to implement ACPI hotplug based
on whatever leaf/bit we decide to use.
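
(For reference, a rough userspace sketch of that xenstore side, using
libxenstore and linked with -lxenstore; in a real guest this lives in the
kernel, and the relative "cpu/..." paths are assumed to resolve against the
guest's own xenstore directory.)

    /*
     * Sketch: watch the "cpu" subtree the toolstack already uses for
     * PV-style vCPU on/offlining and report availability changes.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <xenstore.h>

    int main(void)
    {
        struct xs_handle *xsh = xs_open(0);
        unsigned int num, len;

        if (!xsh) {
            perror("xs_open");
            return 1;
        }
        if (!xs_watch(xsh, "cpu", "cpu-hotplug")) {
            fprintf(stderr, "xs_watch failed\n");
            return 1;
        }

        for (;;) {
            char **ev = xs_read_watch(xsh, &num);   /* blocks for an event */

            if (!ev)
                continue;
            if (strstr(ev[XS_WATCH_PATH], "availability")) {
                char *avail = xs_read(xsh, XBT_NULL, ev[XS_WATCH_PATH], &len);

                if (avail) {
                    /* A kernel would online/offline the vCPU here. */
                    printf("%s -> %.*s\n", ev[XS_WATCH_PATH], (int)len, avail);
                    free(avail);
                }
            }
            free(ev);
        }
    }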

>
>>> My understanding is that the way forward now is to introduce this bit, use it
>>> in your series and decide what we do with Dom0 when we get to a point that we
>>> can start to test/implement vCPU hotplug there. There's still work to do until
>>> we can get to the point of simply booting a static no-hotplug PVHv2 Dom0.
>> Jan, is that how you want to proceed?
> Well, it's not just depending on my opinion, I think. As said, I could
> live with the bimodal approach, but I'm not really happy with it.
> Hence at least a second opinion (perhaps Andrew's, but others'
> input would be more than welcome) should be waited for imo.

I asked you specifically only because you started reviewing v6 at some
point in the past.

But yes, I will wait. As I said, given that we agreed on using a CPUID
bit (which I think we did?), the lack of ACPI hotplug support doesn't
block domU Linux patches.


-boris


>
>> If yes then do you want me to resubmit the series with CPUID support or
>> should I wait for you to review v6
>> (https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html)
>> first?
> Well, as you may have seen there've been quite a few patch series
> submitted during the last days. With the desire to also work on my
> own series, and with the need to also work on other stuff, I can't
> really predict when I'd get to look at your or any other patches (it
> is, to be honest, quite a bit discouraging to see so much stuff
> being submitted for review, yet so little participation in reviews,
> but that's not a new/unusual situation at all). Hence I think not
> waiting for review of v6 would be more reasonable, unless
> aforementioned further input gathering turns out to take overly
> long.
>
> Jan
>




Thread overview: 46+ messages
2017-01-12 12:13 PVH CPU hotplug design document Roger Pau Monné
2017-01-12 19:00 ` Andrew Cooper
2017-01-13  3:06   ` Boris Ostrovsky
2017-01-13 15:27   ` Jan Beulich
2017-01-16 14:59     ` Roger Pau Monné
2017-01-16 14:50   ` Roger Pau Monné
2017-01-13 15:51 ` Jan Beulich
2017-01-13 19:41   ` Stefano Stabellini
2017-01-14  1:44   ` Boris Ostrovsky
2017-01-16 11:03     ` Jan Beulich
2017-01-16 15:14   ` Roger Pau Monné
2017-01-16 16:09     ` Jan Beulich
2017-01-16 16:31       ` Roger Pau Monné
2017-01-16 16:50         ` Jan Beulich
2017-01-16 17:44           ` Roger Pau Monné
2017-01-16 18:16             ` Stefano Stabellini
2017-01-17  9:12             ` Jan Beulich
2017-01-17 11:43               ` Roger Pau Monné
2017-01-17 12:33                 ` Jan Beulich
2017-01-17 14:13                   ` Roger Pau Monné
2017-01-17 14:44                     ` Jan Beulich
2017-01-17 15:27                       ` Boris Ostrovsky
2017-01-17 15:33                         ` Jan Beulich
2017-01-17 15:50                           ` Boris Ostrovsky
2017-01-17 17:45                             ` Roger Pau Monné
2017-01-17 18:50                               ` Boris Ostrovsky
2017-01-18 10:34                                 ` Roger Pau Monné
2017-01-18 10:44                                   ` Jan Beulich
2017-01-18 11:54                                     ` Roger Pau Monné
2017-01-18 13:25                                       ` Jan Beulich
2017-01-22 18:39                                         ` Boris Ostrovsky
2017-01-23 10:35                                           ` Jan Beulich
2017-01-23 14:28                                             ` Boris Ostrovsky
2017-01-23 15:17                                               ` Jan Beulich
2017-01-23 15:43                                                 ` Boris Ostrovsky
2017-01-23 15:49                                                   ` Boris Ostrovsky
2017-01-23 15:58                                                     ` Jan Beulich
2017-01-23 16:24                                                       ` Boris Ostrovsky
2017-01-23 16:50                                                         ` Roger Pau Monné
2017-01-23 17:05                                                           ` Boris Ostrovsky
2017-01-23 17:14                                                             ` Roger Pau Monné
2017-01-23 18:36                                                               ` Boris Ostrovsky
2017-01-24  7:43                                                                 ` Jan Beulich
2017-01-24 14:15                                                                   ` Boris Ostrovsky
2017-01-23 15:49                                                   ` Jan Beulich
2017-01-18 13:34                                       ` Boris Ostrovsky
