kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 3 preempted variables in kvm
@ 2021-01-22 15:11 Alex Shi
  2021-01-22 19:30 ` Sean Christopherson
  0 siblings, 1 reply; 4+ messages in thread
From: Alex Shi @ 2021-01-22 15:11 UTC (permalink / raw)
  To: Paolo Bonzini, kvm, Marc Zyngier, yj226063

Hi All,

I am newbie on KVM side, so probably I am wrong on the following.
Please correct me if it is.

There are 3 preempted variables in kvm:
     1, kvm_vcpu.preempted  in include/linux/kvm_host.h
     2, kvm_steal_time.preempted 3, kvm_vcpu_arch.st.preempted in arch/x86
Seems all of them are set or cleared at the same time. Like,

vcpu_put:
        kvm_sched_out()-> set 3 preempted
                kvm_arch_vcpu_put():
                        kvm_steal_time_set_preempted

vcpu_load:
        kvm_sched_in() : clear above 3 preempted
                kvm_arch_vcpu_load() -> kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
                request dealed in vcpu_enter_guest() -> record_steal_time

Except the 2nd one reuse with KVM_FEATURE_PV_TLB_FLUSH bit which could be used
separately, Could we combine them into one, like just bool kvm_vcpu.preempted? and 
move out the KVM_FEATURE_PV_TLB_FLUSH. Believe all arch need this for a vcpu overcommit.

Thanks
Alex

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 3 preempted variables in kvm
  2021-01-22 15:11 3 preempted variables in kvm Alex Shi
@ 2021-01-22 19:30 ` Sean Christopherson
  2021-01-25  5:49   ` Alex Shi
  0 siblings, 1 reply; 4+ messages in thread
From: Sean Christopherson @ 2021-01-22 19:30 UTC (permalink / raw)
  To: Alex Shi; +Cc: Paolo Bonzini, kvm, Marc Zyngier, yj226063

On Fri, Jan 22, 2021, Alex Shi wrote:
> Hi All,
> 
> I am newbie on KVM side, so probably I am wrong on the following.
> Please correct me if it is.
> 
> There are 3 preempted variables in kvm:
>      1, kvm_vcpu.preempted  in include/linux/kvm_host.h
>      2, kvm_steal_time.preempted
>      3, kvm_vcpu_arch.st.preempted in arch/x86
> Seems all of them are set or cleared at the same time. Like,

Not quite.  kvm_vcpu.preempted is set only in kvm_sched_out(), i.e. when the
vCPU was running and preempted by the host scheduler.  This is used by KVM when
KVM detects that a guest task appears to be waiting on a lock, in which case KVM
will bump the priority of preempted guest kernel threads in the hope that
scheduling in the preempted vCPU will release the lock.

kvm_steal_time.preempted is a paravirt struct that is shared with the guest.  It
is set on any call to kvm_arch_vcpu_put(), which covers kvm_sched_out() and adds
the case where the vCPU exits to userspace, e.g. for IO.  KVM itself hasn't been
preempted, but from the guest's perspective the CPU has been "preempted" in the
sense that CPU (from the guest's perspective) is not executing guest code.
Similar to KVM's vCPU scheduling heuristics, the guest kernel uses this info to
inform its scheduling, e.g. to avoid waiting on a lock owner to drop the lock
since the lock owner is not actively running.

kvm_vcpu_arch.st.preempted is effectively a host-side cache of
kvm_steal_time.preempted that's used to optimize kvm_arch_vcpu_put() by avoiding
the moderately costly mapping of guest.  It could be dropped, but it's a single
byte per vCPU so worth keeping even if the performance benefits are modest.

> vcpu_put:
>         kvm_sched_out()-> set 3 preempted
>                 kvm_arch_vcpu_put():
>                         kvm_steal_time_set_preempted
> 
> vcpu_load:
>         kvm_sched_in() : clear above 3 preempted
>                 kvm_arch_vcpu_load() -> kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
>                 request dealed in vcpu_enter_guest() -> record_steal_time
> 
> Except the 2nd one reuse with KVM_FEATURE_PV_TLB_FLUSH bit which could be used
> separately, Could we combine them into one, like just bool kvm_vcpu.preempted? and 
> move out the KVM_FEATURE_PV_TLB_FLUSH. Believe all arch need this for a vcpu overcommit.

Moving KVM_VCPU_FLUSH_TLB out of kvm_steal_time.preempted isn't viable. The
guest kernel is only allowed to rely on the host to flush the vCPU's TLB if it
knows the vCPU is preempted (from its perspective), as that's the only way it
can guarantee that KVM will observe the TLB flush request before enterring the
vCPU.  KVM_VCPU_FLUSH_TLB and KVM_VCPU_PREEMPTED need to be in the same word so
KVM can read and clear them atomically, otherwise there would be a window where
KVM would miss the flush request.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 3 preempted variables in kvm
  2021-01-22 19:30 ` Sean Christopherson
@ 2021-01-25  5:49   ` Alex Shi
  2021-01-25  8:46     ` Marc Zyngier
  0 siblings, 1 reply; 4+ messages in thread
From: Alex Shi @ 2021-01-25  5:49 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, Marc Zyngier, yj226063

Hi Sean,

Thanks a lot for the detailed explanations.
Yes, they are all meaningful variables on x86. But for more archs, guess the lock
issue on different arch's guest are similar. Maybe a abstraction on the point would
be very helpful. Any comments?

Thanks
Alex

在 2021/1/23 上午3:30, Sean Christopherson 写道:
> On Fri, Jan 22, 2021, Alex Shi wrote:
>> Hi All,
>>
>> I am newbie on KVM side, so probably I am wrong on the following.
>> Please correct me if it is.
>>
>> There are 3 preempted variables in kvm:
>>      1, kvm_vcpu.preempted  in include/linux/kvm_host.h
>>      2, kvm_steal_time.preempted
>>      3, kvm_vcpu_arch.st.preempted in arch/x86
>> Seems all of them are set or cleared at the same time. Like,
> 
> Not quite.  kvm_vcpu.preempted is set only in kvm_sched_out(), i.e. when the
> vCPU was running and preempted by the host scheduler.  This is used by KVM when
> KVM detects that a guest task appears to be waiting on a lock, in which case KVM
> will bump the priority of preempted guest kernel threads in the hope that
> scheduling in the preempted vCPU will release the lock.
> 
> kvm_steal_time.preempted is a paravirt struct that is shared with the guest.  It
> is set on any call to kvm_arch_vcpu_put(), which covers kvm_sched_out() and adds
> the case where the vCPU exits to userspace, e.g. for IO.  KVM itself hasn't been
> preempted, but from the guest's perspective the CPU has been "preempted" in the
> sense that CPU (from the guest's perspective) is not executing guest code.
> Similar to KVM's vCPU scheduling heuristics, the guest kernel uses this info to
> inform its scheduling, e.g. to avoid waiting on a lock owner to drop the lock
> since the lock owner is not actively running.
> 
> kvm_vcpu_arch.st.preempted is effectively a host-side cache of
> kvm_steal_time.preempted that's used to optimize kvm_arch_vcpu_put() by avoiding
> the moderately costly mapping of guest.  It could be dropped, but it's a single
> byte per vCPU so worth keeping even if the performance benefits are modest.
> 
>> vcpu_put:
>>         kvm_sched_out()-> set 3 preempted
>>                 kvm_arch_vcpu_put():
>>                         kvm_steal_time_set_preempted
>>
>> vcpu_load:
>>         kvm_sched_in() : clear above 3 preempted
>>                 kvm_arch_vcpu_load() -> kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
>>                 request dealed in vcpu_enter_guest() -> record_steal_time
>>
>> Except the 2nd one reuse with KVM_FEATURE_PV_TLB_FLUSH bit which could be used
>> separately, Could we combine them into one, like just bool kvm_vcpu.preempted? and 
>> move out the KVM_FEATURE_PV_TLB_FLUSH. Believe all arch need this for a vcpu overcommit.
> 
> Moving KVM_VCPU_FLUSH_TLB out of kvm_steal_time.preempted isn't viable. The
> guest kernel is only allowed to rely on the host to flush the vCPU's TLB if it
> knows the vCPU is preempted (from its perspective), as that's the only way it
> can guarantee that KVM will observe the TLB flush request before enterring the
> vCPU.  KVM_VCPU_FLUSH_TLB and KVM_VCPU_PREEMPTED need to be in the same word so
> KVM can read and clear them atomically, otherwise there would be a window where
> KVM would miss the flush request.
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: 3 preempted variables in kvm
  2021-01-25  5:49   ` Alex Shi
@ 2021-01-25  8:46     ` Marc Zyngier
  0 siblings, 0 replies; 4+ messages in thread
From: Marc Zyngier @ 2021-01-25  8:46 UTC (permalink / raw)
  To: Alex Shi; +Cc: Sean Christopherson, Paolo Bonzini, kvm, yj226063

On 2021-01-25 05:49, Alex Shi wrote:
> Hi Sean,
> 
> Thanks a lot for the detailed explanations.
> Yes, they are all meaningful variables on x86. But for more archs,
> guess the lock
> issue on different arch's guest are similar. Maybe a abstraction on
> the point would
> be very helpful. Any comments?

It depends on what you have in mind. The whole "yield to another
vcpu" mechanism is already abstracted behind kvm_vcpu_on_spin(),
which at arm64, s390 and x86 are using. You could investigate
whether this is a valid approach for Power and MIPS.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-01-26 20:29 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-22 15:11 3 preempted variables in kvm Alex Shi
2021-01-22 19:30 ` Sean Christopherson
2021-01-25  5:49   ` Alex Shi
2021-01-25  8:46     ` Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).