From: Steven Price <steven.price@arm.com>
To: Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu
Cc: "Catalin Marinas" <catalin.marinas@arm.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Russell King" <linux@armlinux.org.uk>,
"James Morse" <james.morse@arm.com>,
"Julien Thierry" <julien.thierry.kdev@gmail.com>,
"Suzuki K Pouloze" <suzuki.poulose@arm.com>,
"Mark Rutland" <mark.rutland@arm.com>,
kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 4/9] KVM: arm64: Support stolen time reporting via shared structure
Date: Wed, 21 Aug 2019 11:27:51 +0100 [thread overview]
Message-ID: <4703baa7-0116-f5d6-291e-1e669a36545d@arm.com> (raw)
In-Reply-To: <f6fad4fa-323d-306c-c582-de07464f4d00@kernel.org>
On 19/08/2019 17:40, Marc Zyngier wrote:
> Hi Steven,
>
> On 19/08/2019 15:04, Steven Price wrote:
>> Implement the service call for configuring a shared structure between a
>> VCPU and the hypervisor in which the hypervisor can write the time
>> stolen from the VCPU's execution time by other tasks on the host.
>>
>> The hypervisor allocates memory which is placed at an IPA chosen by user
>> space. The hypervisor then uses WRITE_ONCE() to update the shared
>> structure ensuring single copy atomicity of the 64-bit unsigned value
>> that reports stolen time in nanoseconds.
>>
>> Whenever stolen time is enabled by the guest, the stolen time counter is
>> reset.
>>
>> The stolen time itself is retrieved from the sched_info structure
>> maintained by the Linux scheduler code. We enable SCHEDSTATS when
>> selecting KVM Kconfig to ensure this value is meaningful.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>> arch/arm/include/asm/kvm_host.h | 15 +++++++
>> arch/arm64/include/asm/kvm_host.h | 16 ++++++-
>> arch/arm64/kvm/Kconfig | 1 +
>> include/linux/kvm_types.h | 2 +
>> virt/kvm/arm/arm.c | 19 +++++++++
>> virt/kvm/arm/hypercalls.c | 3 ++
>> virt/kvm/arm/pvtime.c | 71 +++++++++++++++++++++++++++++++
>> 7 files changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 369b5d2d54bf..14d61a84c270 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -39,6 +39,7 @@
>> KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>> #define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
>> #define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
>> +#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
>>
>> DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>>
>> @@ -77,6 +78,12 @@ struct kvm_arch {
>>
>> /* Mandated version of PSCI */
>> u32 psci_version;
>> +
>> + struct kvm_arch_pvtime {
>> + struct gfn_to_hva_cache st_ghc;
>> + gpa_t st_base;
>> + u64 st_size;
>> + } pvtime;
>
> It'd be good if we could avoid having this in the 32bit vcpu structure,
> given that it serves no real purpose (other than being able to compile
> things).
Good point - I think I can fix that with a couple more static inline
functions... It's a little tricky due to header file include order, but
I think I can make it work.
[...]
>> +int kvm_update_stolen_time(struct kvm_vcpu *vcpu, bool init)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_arch_pvtime *pvtime = &kvm->arch.pvtime;
>> + u64 steal;
>> + u64 steal_le;
>> + u64 offset;
>> + int idx;
>> + const int stride = sizeof(struct pvclock_vcpu_stolen_time);
>> +
>> + if (pvtime->st_base == GPA_INVALID)
>> + return -ENOTSUPP;
>> +
>> + /* Let's do the local bookkeeping */
>> + steal = vcpu->arch.steal.steal;
>> + steal += current->sched_info.run_delay - vcpu->arch.steal.last_steal;
>> + vcpu->arch.steal.last_steal = current->sched_info.run_delay;
>> + vcpu->arch.steal.steal = steal;
>> +
>> + offset = stride * kvm_vcpu_get_idx(vcpu);
>> +
>> + if (unlikely(offset + stride > pvtime->st_size))
>> + return -EINVAL;
>> +
>> + steal_le = cpu_to_le64(steal);
>> + pagefault_disable();
>
> What's the reason for doing a pagefault_disable()? What I'd expect is
> for the userspace page to be faulted in and written to, and doing a
> pagefault_disable() seems to be going against this idea.
Umm... this is me screwing up the locking...
The current code is very confused about which locks should/can be held
when kvm_update_stolen_time() is called. vcpu_req_record_steal()
explicitly takes the kvm->srcu read lock - which is then taken again
here. But kvm_hypercall_stolen_time doesn't hold any lock. And obviously
at some point in time I expected this to be called in atomic context...
In general the page is likely to be faulted in (as a guest which is
using stolen time is surely looking at the numbers there). But there's
no need for the pagefault_disable(). It also shouldn't be the callers
responsibility to hold kvm->srcu.
Steve
next prev parent reply other threads:[~2019-08-21 10:28 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-19 14:04 [PATCH v2 0/9] arm64: Stolen time support Steven Price
2019-08-19 14:04 ` [PATCH v2 1/9] KVM: arm64: Document PV-time interface Steven Price
2019-08-19 14:04 ` [PATCH v2 2/9] KVM: arm/arm64: Factor out hypercall handling from PSCI code Steven Price
2019-08-19 14:04 ` [PATCH v2 3/9] KVM: arm64: Implement PV_FEATURES call Steven Price
2019-08-19 14:04 ` [PATCH v2 4/9] KVM: arm64: Support stolen time reporting via shared structure Steven Price
2019-08-19 16:40 ` Marc Zyngier
2019-08-21 10:27 ` Steven Price [this message]
2019-08-19 14:04 ` [PATCH v2 5/9] KVM: Allow kvm_device_ops to be const Steven Price
2019-08-19 14:04 ` [PATCH v2 6/9] KVM: arm64: Provide a PV_TIME device to user space Steven Price
2019-08-19 14:04 ` [PATCH v2 7/9] arm/arm64: Provide a wrapper for SMCCC 1.1 calls Steven Price
2019-08-19 14:04 ` [PATCH v2 8/9] arm/arm64: Make use of the SMCCC 1.1 wrapper Steven Price
2019-08-19 14:04 ` [PATCH v2 9/9] arm64: Retrieve stolen time as paravirtualized guest Steven Price
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4703baa7-0116-f5d6-291e-1e669a36545d@arm.com \
--to=steven.price@arm.com \
--cc=catalin.marinas@arm.com \
--cc=james.morse@arm.com \
--cc=julien.thierry.kdev@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=suzuki.poulose@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).