* [RFC] Question about async TLB flush and KVM pv tlb improvements
@ 2020-02-25  4:12 何容光(邦采)
  2020-02-25  6:31 ` Wanpeng Li
       [not found] ` <C6DFCB69-25A6-43FE-92E9-C9E675CAB615@vmware.com>
  0 siblings, 2 replies; 6+ messages in thread
From: 何容光(邦采) @ 2020-02-25  4:12 UTC (permalink / raw)
  To: namit, peterz, kernellwp, pbonzini
  Cc: dave.hansen, mingo, tglx, x86, linux-kernel, dave.hansen, bp,
	luto, kvm, yongting.lyt,
	吴启翾(启翾)

Hi there,

I saw the async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and now that a year has passed I am wondering: do you consider this patch practical, or does it have functional flaws?
From my point of view, Nadav's patch has no obvious flaw. But I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to programmers. Under which conditions would a machine check occur? Is there a reference I can study?
BTW, I am trying to improve the KVM PV TLB flush: when a vCPU is preempted, the initiating CPU does not send an IPI to it and wait for it. When the preempted vCPU resumes, I want the VMM to inject an interrupt, perhaps an NMI, into the vCPU and let the vCPU flush its own TLB, instead of the VMM flushing the TLB on its behalf; if the vCPU is not in kernel mode or has interrupts disabled, we stick to the VMM flush. A VMM flush using INVVPID flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. So, does this have the same problem as the async TLB flush patch?
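
To make this concrete, below is a minimal userspace sketch of the decision
I have in mind. All names are made up for illustration; this is not existing
KVM code, only the policy described above.

/* Sketch only: models the proposed flush decision, not existing KVM code. */
#include <stdbool.h>
#include <stdio.h>

enum flush_action {
        FLUSH_BY_IPI,           /* vCPU is running: normal IPI-based shootdown */
        FLUSH_BY_INJECT,        /* defer: inject an interrupt/NMI when the vCPU resumes */
        FLUSH_BY_VMM,           /* defer: VMM flushes the whole VPID via INVVPID */
};

struct vcpu_state {
        bool preempted;         /* vCPU is scheduled out on the host */
        bool in_kernel_mode;    /* guest was in CPL0 when it was preempted */
        bool irqs_enabled;      /* guest interrupts were enabled */
};

static enum flush_action pick_flush_action(const struct vcpu_state *v)
{
        if (!v->preempted)
                return FLUSH_BY_IPI;
        /*
         * Let the vCPU flush its own TLB only if it can safely take the
         * injected interrupt; otherwise fall back to a VMM-side flush,
         * which invalidates every PCID under the guest's VPID.
         */
        if (v->in_kernel_mode && v->irqs_enabled)
                return FLUSH_BY_INJECT;
        return FLUSH_BY_VMM;
}

int main(void)
{
        struct vcpu_state preempted_user   = { true, false, true };
        struct vcpu_state preempted_kernel = { true, true,  true };

        printf("preempted in user mode:   action %d (FLUSH_BY_VMM)\n",
               pick_flush_action(&preempted_user));
        printf("preempted in kernel mode: action %d (FLUSH_BY_INJECT)\n",
               pick_flush_action(&preempted_kernel));
        return 0;
}
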
Thanks in advance.


* Re: [RFC] Question about async TLB flush and KVM pv tlb improvements
  2020-02-25  4:12 [RFC] Question about async TLB flush and KVM pv tlb improvements 何容光(邦采)
@ 2020-02-25  6:31 ` Wanpeng Li
  2020-02-25  7:53   ` Re: [RFC] " 何容光(邦采)
       [not found] ` <C6DFCB69-25A6-43FE-92E9-C9E675CAB615@vmware.com>
  1 sibling, 1 reply; 6+ messages in thread
From: Wanpeng Li @ 2020-02-25  6:31 UTC (permalink / raw)
  To: 何容光(邦采)
  Cc: namit, peterz, pbonzini, dave.hansen, mingo, tglx, x86,
	linux-kernel, dave.hansen, bp, luto, kvm, yongting.lyt,
	吴启翾(启翾)

On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
>
> Hi there,
>
> I saw the async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and now that a year has passed I am wondering: do you consider this patch practical, or does it have functional flaws?
> From my point of view, Nadav's patch has no obvious flaw. But I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to programmers. Under which conditions would a machine check occur? Is there a reference I can study?
> BTW, I am trying to improve the KVM PV TLB flush: when a vCPU is preempted, the initiating CPU does not send an IPI to it and wait for it. When the preempted vCPU resumes, I want the VMM to inject an interrupt, perhaps an NMI, into the vCPU and let the vCPU flush its own TLB, instead of the VMM flushing the TLB on its behalf; if the vCPU is not in kernel mode or has interrupts disabled, we stick to the VMM flush. A VMM flush using INVVPID flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. So, does this have the same problem as the async TLB flush patch?

PV TLB shootdown is disabled in the dedicated scenario, and I believe
there were already heavy TLB misses in overcommit scenarios before this
feature, so flushing all TLB entries associated with one specific VPID
will not make things that much worse.
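
For reference, "flush all TLB entries associated with one specific VPID"
corresponds to a single-context INVVPID, which drops every translation
tagged with that VPID regardless of PCID. A rough sketch of the descriptor
and the invalidation types, based on my reading of the Intel SDM; the
wrapper below is only illustrative, it is not KVM's code, and it can
execute only in VMX root operation:

/* Illustrative only: INVVPID descriptor/types as documented in the Intel SDM. */
#include <stdint.h>

enum invvpid_type {
        INVVPID_INDIVIDUAL_ADDR               = 0, /* one linear address, one VPID */
        INVVPID_SINGLE_CONTEXT                = 1, /* all translations for one VPID,
                                                      i.e. every PCID of that guest */
        INVVPID_ALL_CONTEXT                   = 2, /* all VPIDs */
        INVVPID_SINGLE_CONTEXT_RETAIN_GLOBALS = 3,
};

struct invvpid_desc {                   /* 128-bit INVVPID descriptor */
        uint64_t vpid : 16;
        uint64_t rsvd : 48;
        uint64_t linear_addr;
};

/* Hypothetical wrapper; faults with #UD outside VMX root operation. */
static inline void invvpid(enum invvpid_type type, uint16_t vpid,
                           uint64_t linear_addr)
{
        struct invvpid_desc desc = { .vpid = vpid, .linear_addr = linear_addr };

        asm volatile("invvpid %1, %0"
                     : : "r"((uint64_t)type), "m"(desc) : "cc", "memory");
}

A host-side flush for a preempted vCPU would then be something like
invvpid(INVVPID_SINGLE_CONTEXT, vpid, 0), which is exactly the
"all PCIDs of that guest" cost being discussed.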

    Wanpeng


* Re: [RFC] Question about async TLB flush and KVM pv tlb improvements
       [not found] ` <C6DFCB69-25A6-43FE-92E9-C9E675CAB615@vmware.com>
@ 2020-02-25  7:25   ` 何容光(邦采)
  0 siblings, 0 replies; 6+ messages in thread
From: 何容光(邦采) @ 2020-02-25  7:25 UTC (permalink / raw)
  To: Nadav Amit
  Cc: peterz, kernellwp, pbonzini, dave.hansen, mingo, tglx, x86,
	linux-kernel, dave.hansen, bp, luto, kvm,
	林永听(海枫),
	吴启翾(启翾),
	herongguang

>> On Feb 24, 2020, at 8:12 PM, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
>> 
>> Hi there,
>> 
>> I saw the async TLB flush patch at
>> https://lore.kernel.org/patchwork/patch/1082481/ , and now that a year
>> has passed I am wondering: do you consider this patch practical, or
>> does it have functional flaws? From my point of view, Nadav's patch has
>> no obvious flaw. But I am not familiar with the relationship between
>> the CPU's speculative execution and stale TLB entries, since it is
>> usually transparent to programmers. Under which conditions would a
>> machine check occur? Is there a reference I can study?

> I was/am held back by personal issues that consume my free time, which
> prevented me from sending a new version so far.

Good to hear that :)

> As for the patch-set - the greatest benefit in performance comes from
> running local/remote TLB flushes concurrently, and I will respin another
> version of that in two weeks' time. I will send the async flushes afterwards.

In a non-overcommitted virtualization environment, I think this will be
beneficial. Since the async implementation still needs the remote CPU's
IPI response, the PV TLB flush can address the needs of that scenario.
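
As I understand the current guest side (kvm_flush_tlb_others() in
arch/x86/kernel/kvm.c), the IPI to a preempted vCPU is already skipped by
setting a flush-request bit next to the preempted bit in the shared
steal-time area, so the host flushes before that vCPU runs again. A
simplified standalone sketch of that idea, with made-up names and C11
atomics standing in for the real per-cpu cmpxchg code:

/*
 * Simplified sketch of the PV TLB flush idea: skip the IPI to a preempted
 * vCPU and ask the host to flush for it before it runs again.
 * Names and types are illustrative, not the kernel's.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define VCPU_PREEMPTED  (1u << 0)  /* set by the host when it schedules the vCPU out */
#define VCPU_FLUSH_TLB  (1u << 1)  /* set by a flush initiator instead of sending an IPI */

struct shared_state {              /* stands in for the shared steal-time area */
        _Atomic unsigned int flags;
};

/* Returns true if the IPI to this vCPU can be skipped. */
static bool defer_flush_if_preempted(struct shared_state *s)
{
        unsigned int old = atomic_load(&s->flags);

        while (old & VCPU_PREEMPTED) {
                /* Ask the host to flush before the vCPU runs again. */
                if (atomic_compare_exchange_weak(&s->flags, &old,
                                                 old | VCPU_FLUSH_TLB))
                        return true;
                /* Raced with the host clearing VCPU_PREEMPTED; re-check. */
        }
        return false;              /* vCPU is running: fall back to the normal IPI */
}

int main(void)
{
        struct shared_state running   = { 0 };
        struct shared_state preempted = { VCPU_PREEMPTED };

        printf("running vCPU:   skip IPI? %d\n", defer_flush_if_preempted(&running));
        printf("preempted vCPU: skip IPI? %d\n", defer_flush_if_preempted(&preempted));
        return 0;
}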

Do you have any reference on the relationship between the CPU's speculative
execution and stale TLB entries, especially on how they can cause a machine
check? I would like to learn about this, thanks.


* Re: [RFC] Question about async TLB flush and KVM pv tlb improvements
  2020-02-25  6:31 ` Wanpeng Li
@ 2020-02-25  7:53   ` 何容光(邦采)
  2020-02-25  8:41     ` [RFC] " Wanpeng Li
  0 siblings, 1 reply; 6+ messages in thread
From: 何容光(邦采) @ 2020-02-25  7:53 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: namit, peterz, pbonzini, dave.hansen, mingo, tglx, x86,
	linux-kernel, dave.hansen, bp, luto, kvm,
	林永听(海枫),
	吴启翾(启翾),
	herongguang

> On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
>>
>> Hi there,
>>
>> I saw the async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and now that a year has passed I am wondering: do you consider this patch practical, or does it have functional flaws?
>> From my point of view, Nadav's patch has no obvious flaw. But I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to programmers. Under which conditions would a machine check occur? Is there a reference I can study?
>> BTW, I am trying to improve the KVM PV TLB flush: when a vCPU is preempted, the initiating CPU does not send an IPI to it and wait for it. When the preempted vCPU resumes, I want the VMM to inject an interrupt, perhaps an NMI, into the vCPU and let the vCPU flush its own TLB, instead of the VMM flushing the TLB on its behalf; if the vCPU is not in kernel mode or has interrupts disabled, we stick to the VMM flush. A VMM flush using INVVPID flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. So, does this have the same problem as the async TLB flush patch?

> PV TLB shootdown is disabled in the dedicated scenario, and I believe
> there were already heavy TLB misses in overcommit scenarios before this
> feature, so flushing all TLB entries associated with one specific VPID
> will not make things that much worse.

If the number of vCPUs running on one pCPU is limited to a few, my tests
show there can still be some benefit, especially if we can move all the
logic into the VMM and eliminate waiting for the IPI. However, functional
correctness is a concern, and that is also why I found Nadav's patch.
Do you have any advice on this?


* Re: [RFC] Question about async TLB flush and KVM pv tlb improvements
  2020-02-25  7:53   ` Re: [RFC] " 何容光(邦采)
@ 2020-02-25  8:41     ` Wanpeng Li
  2020-02-25  9:26       ` He Rongguang
  0 siblings, 1 reply; 6+ messages in thread
From: Wanpeng Li @ 2020-02-25  8:41 UTC (permalink / raw)
  To: 何容光(邦采)
  Cc: namit, peterz, pbonzini, dave.hansen, mingo, tglx, x86,
	linux-kernel, dave.hansen, bp, luto, kvm,
	林永听(海枫),
	吴启翾(启翾),
	herongguang

On Tue, 25 Feb 2020 at 15:53, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
>
> > On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
> >>
> >> Hi there,
> >>
> >> I saw the async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and now that a year has passed I am wondering: do you consider this patch practical, or does it have functional flaws?
> >> From my point of view, Nadav's patch has no obvious flaw. But I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to programmers. Under which conditions would a machine check occur? Is there a reference I can study?
> >> BTW, I am trying to improve the KVM PV TLB flush: when a vCPU is preempted, the initiating CPU does not send an IPI to it and wait for it. When the preempted vCPU resumes, I want the VMM to inject an interrupt, perhaps an NMI, into the vCPU and let the vCPU flush its own TLB, instead of the VMM flushing the TLB on its behalf; if the vCPU is not in kernel mode or has interrupts disabled, we stick to the VMM flush. A VMM flush using INVVPID flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. So, does this have the same problem as the async TLB flush patch?
>
> > PV TLB shootdown is disabled in the dedicated scenario, and I believe
> > there were already heavy TLB misses in overcommit scenarios before this
> > feature, so flushing all TLB entries associated with one specific VPID
> > will not make things that much worse.
>
> If the number of vCPUs running on one pCPU is limited to a few, my tests
> show there can still be some benefit, especially if we can move all the

Unless the vCPU is preempted.

> logic into the VMM and eliminate waiting for the IPI. However, functional
> correctness is a concern, and that is also why I found Nadav's patch.
> Do you have any advice on this?


* Re: [RFC] Question about async TLB flush and KVM pv tlb improvements
  2020-02-25  8:41     ` [RFC] " Wanpeng Li
@ 2020-02-25  9:26       ` He Rongguang
  0 siblings, 0 replies; 6+ messages in thread
From: He Rongguang @ 2020-02-25  9:26 UTC (permalink / raw)
  To: Wanpeng Li, 何容光(邦采)
  Cc: namit, peterz, pbonzini, dave.hansen, mingo, tglx, x86,
	linux-kernel, dave.hansen, bp, luto, kvm,
	林永听(海枫),
	吴启翾(启翾)


On 2020/2/25 16:41, Wanpeng Li wrote:
> On Tue, 25 Feb 2020 at 15:53, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
>>> On Tue, 25 Feb 2020 at 12:12, 何容光(邦采) <bangcai.hrg@alibaba-inc.com> wrote:
>>>> Hi there,
>>>>
>>>> I saw the async TLB flush patch at https://lore.kernel.org/patchwork/patch/1082481/ , and now that a year has passed I am wondering: do you consider this patch practical, or does it have functional flaws?
>>>> From my point of view, Nadav's patch has no obvious flaw. But I am not familiar with the relationship between the CPU's speculative execution and stale TLB entries, since it is usually transparent to programmers. Under which conditions would a machine check occur? Is there a reference I can study?
>>>> BTW, I am trying to improve the KVM PV TLB flush: when a vCPU is preempted, the initiating CPU does not send an IPI to it and wait for it. When the preempted vCPU resumes, I want the VMM to inject an interrupt, perhaps an NMI, into the vCPU and let the vCPU flush its own TLB, instead of the VMM flushing the TLB on its behalf; if the vCPU is not in kernel mode or has interrupts disabled, we stick to the VMM flush. A VMM flush using INVVPID flushes the TLB entries of all PCIDs and therefore has some negative performance impact on the preempted vCPU. So, does this have the same problem as the async TLB flush patch?
>>> PV TLB shootdown is disabled in the dedicated scenario, and I believe
>>> there were already heavy TLB misses in overcommit scenarios before this
>>> feature, so flushing all TLB entries associated with one specific VPID
>>> will not make things that much worse.
>> If the number of vCPUs running on one pCPU is limited to a few, my tests
>> show there can still be some benefit, especially if we can move all the
> Unless the vCPU is preempted.

Correct; in fact I am using a no-IPI-in-VM approach, which is why I am
asking about the async approach.

>> logic into the VMM and eliminate waiting for the IPI. However, functional
>> correctness is a concern, and that is also why I found Nadav's patch.
>> Do you have any advice on this?

