linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Price <steven.price@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: peter.maydell@linaro.org, kvm@vger.kernel.org,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Suzuki K Pouloze" <suzuki.poulose@arm.com>,
	linux-doc@vger.kernel.org, "Russell King" <linux@armlinux.org.uk>,
	linux-kernel@vger.kernel.org, "James Morse" <james.morse@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Will Deacon" <will@kernel.org>,
	kvmarm@lists.cs.columbia.edu,
	"Julien Thierry" <julien.thierry.kdev@gmail.com>
Subject: Re: [PATCH 1/9] KVM: arm64: Document PV-time interface
Date: Mon, 5 Aug 2019 14:06:08 +0100	[thread overview]
Message-ID: <9b2077d1-29f2-549a-fd61-f9c8d3730c9c@arm.com> (raw)
In-Reply-To: <20190803121343.2f482200@why>

On 03/08/2019 12:13, Marc Zyngier wrote:
> On Fri,  2 Aug 2019 15:50:09 +0100
> Steven Price <steven.price@arm.com> wrote:
> 
> [+Peter for the userspace aspect of things]
> 
> Hi Steve,
> 
>> Introduce a paravirtualization interface for KVM/arm64 based on the
>> "Arm Paravirtualized Time for Arm-Base Systems" specification DEN 0057A.
>>
>> This only adds the details about "Stolen Time" as the details of "Live
>> Physical Time" have not been fully agreed.
>>
>> User space can specify a reserved area of memory for the guest and
>> inform KVM to populate the memory with information on time that the host
>> kernel has stolen from the guest.
>>
>> A hypercall interface is provided for the guest to interrogate the
>> hypervisor's support for this interface and the location of the shared
>> memory structures.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  Documentation/virtual/kvm/arm/pvtime.txt | 107 +++++++++++++++++++++++
>>  1 file changed, 107 insertions(+)
>>  create mode 100644 Documentation/virtual/kvm/arm/pvtime.txt
>>
>> diff --git a/Documentation/virtual/kvm/arm/pvtime.txt b/Documentation/virtual/kvm/arm/pvtime.txt
>> new file mode 100644
>> index 000000000000..e6ae9799e1d5
>> --- /dev/null
>> +++ b/Documentation/virtual/kvm/arm/pvtime.txt
>> @@ -0,0 +1,107 @@
>> +Paravirtualized time support for arm64
>> +======================================
>> +
>> +Arm specification DEN0057/A defined a standard for paravirtualised time
>> +support for Aarch64 guests:
> 
> nit: AArch64
> 
>> +
>> +https://developer.arm.com/docs/den0057/a
> 
> Between this file and the above document, which one is authoritative?

The above document should be authoritative - although I'm still waiting
for the final version to be published. I'm not expecting any changes to
the stolen time part though.

>> +
>> +KVM/Arm64 implements the stolen time part of this specification by providing
> 
> nit: KVM/arm64
> 
>> +some hypervisor service calls to support a paravirtualized guest obtaining a
>> +view of the amount of time stolen from its execution.
>> +
>> +Two new SMCCC compatible hypercalls are defined:
>> +
>> +PV_FEATURES 0xC5000020
>> +PV_TIME_ST  0xC5000022
>> +
>> +These are only available in the SMC64/HVC64 calling convention as
>> +paravirtualized time is not available to 32 bit Arm guests.
>> +
>> +PV_FEATURES
>> +    Function ID:  (uint32)  : 0xC5000020
>> +    PV_func_id:   (uint32)  : Either PV_TIME_LPT or PV_TIME_ST
>> +    Return value: (int32)   : NOT_SUPPORTED (-1) or SUCCESS (0) if the relevant
>> +                              PV-time feature is supported by the hypervisor.
> 
> How is PV_FEATURES discovered? Is the intention to make it a generic
> ARM-wide PV discovery mechanism, not specific to PV time?

SMCCC is mandated for PV time. So, assuming the hypervisor supports
SMCCC, the "NOT_SUPPORTED" return is mandated by SMCCC if PV time isn't
supported.

However, we do also use the SMCCC_ARCH_FEATURES mechanism to check the
existence of PV_FEATURES before use. I'll update the document to call
this out.

>> +
>> +PV_TIME_ST
>> +    Function ID:  (uint32)  : 0xC5000022
>> +    Return value: (int64)   : IPA of the stolen time data structure for this
>> +                              (V)CPU. On failure:
>> +                              NOT_SUPPORTED (-1)
>> +
> 
> Is the size implicit? What are the memory attributes? This either needs
> documenting here, or point to the right bit to the spec.

The size is implicit - it's a pointer to the below structure, so the
guest can only rely on the first 16 bytes being valid. The memory
attributes are described in the specification as:

"The calling guest can map the IPA into normal memory with inner and
outer write back caching attributes, in the inner sharable domain"

I'll put those details in this document for completeness.

>> +Stolen Time
>> +-----------
>> +
>> +The structure pointed to by the PV_TIME_ST hypercall is as follows:
>> +
>> +  Field       | Byte Length | Byte Offset | Description
>> +  ----------- | ----------- | ----------- | --------------------------
>> +  Revision    |      4      |      0      | Must be 0 for version 0.1
>> +  Attributes  |      4      |      4      | Must be 0
>> +  Stolen time |      8      |      8      | Stolen time in unsigned
>> +              |             |             | nanoseconds indicating how
>> +              |             |             | much time this VCPU thread
>> +              |             |             | was involuntarily not
>> +              |             |             | running on a physical CPU.
>> +
>> +The structure will be updated by the hypervisor periodically as time is stolen
> 
> Is it really periodic? If so, when is the update frequency?

Hmm, periodic might be the wrong term - there is no guaranteed frequency
of update. The spec says:

"The hypervisor must update this value prior to scheduling a virtual CPU"

I guess that's probably the best description.

>> +from the VCPU. It will be present within a reserved region of the normal
>> +memory given to the guest. The guest should not attempt to write into this
>> +memory. There is a structure by VCPU of the guest.
> 
> What if the vcpu writes to it? Does it get a fault?

From the perspective from the specification this is undefined. A fault
would therefore be acceptable but isn't generated in the implementation
defined here.

> If there is a
> structure per vcpu, what is the layout in memory? How does a vcpu find
> its own data structure? Is that the address returned by PV_TIME_ST?

A call to PV_TIME_ST returns the structure for the calling vCPU - I'll
make that explicit. The layout is therefore defined by the hypervisor
and cannot be relied on by the guest. As below this implementation uses
a simple array of structures.

>> +
>> +User space interface
>> +====================
>> +
>> +User space can request that KVM provide the paravirtualized time interface to
>> +a guest by creating a KVM_DEV_TYPE_ARM_PV_TIME device, for example:
>> +
>> +    struct kvm_create_device pvtime_device = {
>> +            .type = KVM_DEV_TYPE_ARM_PV_TIME,
>> +            .attr = 0,
>> +            .flags = 0,
>> +    };
>> +
>> +    pvtime_fd = ioctl(vm_fd, KVM_CREATE_DEVICE, &pvtime_device);
>> +
>> +The guest IPA of the structures must be given to KVM. This is the base address
> 
> nit: s/guest //
> 
>> +of an array of stolen time structures (one for each VCPU). For example:
>> +
>> +    struct kvm_device_attr st_base = {
>> +            .group = KVM_DEV_ARM_PV_TIME_PADDR,
>> +            .attr = KVM_DEV_ARM_PV_TIME_ST,
>> +            .addr = (u64)(unsigned long)&st_paddr
>> +    };
>> +
>> +    ioctl(pvtime_fd, KVM_SET_DEVICE_ATTR, &st_base);
> 
> So the allocation itself is performed by the kernel? What are the
> ordering requirements between creating vcpus and the device? What are
> the alignment requirements for the base address?

The base address should be page aligned - I'll spell that out.

There are currently no ordering requirements between creating vcpus and
the device. However...

>> +
>> +For migration (or save/restore) of a guest it is necessary to save the contents
>> +of the shared page(s) and later restore them. KVM_DEV_ARM_PV_TIME_STATE_SIZE
>> +provides the size of this data and KVM_DEV_ARM_PV_TIME_STATE allows the state
>> +to be read/written.
> 
> Is the size variable depending on the number of vcpus?

...yes - so restoring the state after migration must be done after
creating the vcpus. I'll point out that the device should created after.

Thanks for the review,

Steve

  reply	other threads:[~2019-08-05 13:16 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-02 14:50 [PATCH 0/9] arm64: Stolen time support Steven Price
2019-08-02 14:50 ` [PATCH 1/9] KVM: arm64: Document PV-time interface Steven Price
2019-08-03 11:13   ` Marc Zyngier
2019-08-05 13:06     ` Steven Price [this message]
2019-08-05  3:23   ` Zenghui Yu
2019-08-05 13:06     ` Steven Price
2019-08-05 16:40   ` Christophe de Dinechin
2019-08-07 13:21     ` Steven Price
     [not found]       ` <9F77FA64-C71B-4025-A58D-3AC07E6688DE@dinechin.org>
2019-08-07 15:26         ` Steven Price
2019-08-02 14:50 ` [PATCH 2/9] KVM: arm/arm64: Factor out hypercall handling from PSCI code Steven Price
2019-08-02 14:50 ` [PATCH 3/9] KVM: arm64: Implement PV_FEATURES call Steven Price
2019-08-03 11:21   ` Marc Zyngier
2019-08-05 13:14     ` Steven Price
2019-08-02 14:50 ` [PATCH 4/9] KVM: arm64: Support stolen time reporting via shared structure Steven Price
2019-08-03 11:55   ` Marc Zyngier
2019-08-05 14:09     ` Steven Price
2019-08-03 17:58   ` Marc Zyngier
2019-08-03 18:13     ` Marc Zyngier
2019-08-05 14:18       ` Steven Price
2019-08-02 14:50 ` [PATCH 5/9] KVM: Allow kvm_device_ops to be const Steven Price
2019-08-02 14:50 ` [PATCH 6/9] KVM: arm64: Provide a PV_TIME device to user space Steven Price
2019-08-03 12:51   ` Marc Zyngier
2019-08-03 17:34     ` Marc Zyngier
2019-08-07 13:39       ` Steven Price
2019-08-07 13:51         ` Marc Zyngier
2019-08-05 16:10     ` Steven Price
2019-08-05 16:28       ` Marc Zyngier
2019-08-02 14:50 ` [PATCH 7/9] arm/arm64: Provide a wrapper for SMCCC 1.1 calls Steven Price
2019-08-05 10:03   ` Will Deacon
2019-08-02 14:50 ` [PATCH 8/9] arm/arm64: Make use of the SMCCC 1.1 wrapper Steven Price
2019-08-02 14:50 ` [PATCH 9/9] arm64: Retrieve stolen time as paravirtualized guest Steven Price
2019-08-04  9:53   ` Marc Zyngier
2019-08-08 15:29     ` Steven Price
2019-08-08 15:49       ` Marc Zyngier
2019-08-09 13:51   ` Zenghui Yu
2019-08-12 10:39     ` Steven Price
2019-08-13  6:06       ` Zenghui Yu
2019-08-03 18:05 ` [PATCH 0/9] arm64: Stolen time support Marc Zyngier
2019-08-05 13:06   ` Steven Price
2019-08-05 13:26     ` Marc Zyngier
2019-08-14 13:02     ` Alexander Graf
     [not found]       ` <8636i3omnd.wl-maz@kernel.org>
2019-08-14 14:52         ` [UNVERIFIED SENDER] " Alexander Graf
2019-08-16 10:23           ` Steven Price
2020-07-21  3:26 ` zhukeqian
2020-07-27 10:48   ` Steven Price
2020-07-29  2:57     ` zhukeqian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9b2077d1-29f2-549a-fd61-f9c8d3730c9c@arm.com \
    --to=steven.price@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=james.morse@arm.com \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=maz@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=rkrcmar@redhat.com \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).