linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/2] Enable HvNotifyLongSpinWait for Hyper-V
@ 2018-10-19 13:13 Yi Sun
  2018-10-19 13:13 ` [PATCH v1 1/2] x86/hyperv: get spinlock retry number on Hyper-V Yi Sun
  2018-10-19 13:13 ` [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall Yi Sun
  0 siblings, 2 replies; 25+ messages in thread
From: Yi Sun @ 2018-10-19 13:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, tglx, jgross, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, Yi Sun, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Michael Kelley (EOSG)

The HvNotifyLongSpinWait hypercall is used by a guest
OS to notify the hypervisor that the calling virtual
processor is attempting to acquire a resource that is
potentially held by another virtual processor within
the same Virtual Machine. This scheduling hint improves
the scalability of VMs with more than one virtual
processor on Hyper-V.

Per MSFT TLFS, the retry number is sent to hypervisor
only when the retry number exceeds the recommended
number. If recommended number is 0xFFFFFFFF, never retry.

The recommended number if got from EBX of Implementation
Recommendations MSR (0x40000004).

This patch set bases on latest tip branch because below
patch set has not been merged into main branch.
    [PATCH v4 0/2] Enable PV qspinlock for Hyper-V

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>
Cc: Juergen Gross <jgross@suse.com>

Yi Sun (2):
  x86/hyperv: get number of spinlock retry on Hyper-V
  x86/hyperv: call HVCALL_NOTIFY_LONG_SPIN_WAIT

 arch/x86/hyperv/hv_spinlock.c       | 18 ++++++++++++++++++
 arch/x86/include/asm/mshyperv.h     |  4 ++++
 arch/x86/kernel/cpu/mshyperv.c      |  1 +
 kernel/locking/qspinlock_paravirt.h | 10 ++++++++++
 4 files changed, 33 insertions(+)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v1 1/2] x86/hyperv: get spinlock retry number on Hyper-V
  2018-10-19 13:13 [PATCH v1 0/2] Enable HvNotifyLongSpinWait for Hyper-V Yi Sun
@ 2018-10-19 13:13 ` Yi Sun
  2018-10-19 13:13 ` [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall Yi Sun
  1 sibling, 0 replies; 25+ messages in thread
From: Yi Sun @ 2018-10-19 13:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, tglx, jgross, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, Yi Sun, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Michael Kelley (EOSG)

EBX of Implementation Recommendations MSR (0x40000004) indicates
recommended number of attempts to retry a spinlock failure before
notifying the hypervisor about the failures.

0xFFFFFFFF indicates never to retry.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>
Cc: Juergen Gross <jgross@suse.com>
---
 arch/x86/include/asm/mshyperv.h | 3 +++
 arch/x86/kernel/cpu/mshyperv.c  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 0d6271c..f909365 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -11,10 +11,13 @@
 
 #define VP_INVAL	U32_MAX
 
+#define HYPERV_SPINLOCK_RETRY_NEVER	U32_MAX
+
 struct ms_hyperv_info {
 	u32 features;
 	u32 misc_features;
 	u32 hints;
+	u32 num_spin_retry;
 	u32 nested_features;
 	u32 max_vp_index;
 	u32 max_lp_index;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 1c72f38..04f480a 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -222,6 +222,7 @@ static void __init ms_hyperv_init_platform(void)
 	ms_hyperv.features = cpuid_eax(HYPERV_CPUID_FEATURES);
 	ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
 	ms_hyperv.hints    = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);
+	ms_hyperv.num_spin_retry = cpuid_ebx(HYPERV_CPUID_ENLIGHTMENT_INFO);
 
 	pr_info("Hyper-V: features 0x%x, hints 0x%x\n",
 		ms_hyperv.features, ms_hyperv.hints);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-19 13:13 [PATCH v1 0/2] Enable HvNotifyLongSpinWait for Hyper-V Yi Sun
  2018-10-19 13:13 ` [PATCH v1 1/2] x86/hyperv: get spinlock retry number on Hyper-V Yi Sun
@ 2018-10-19 13:13 ` Yi Sun
  2018-10-19 14:20   ` Juergen Gross
  2018-10-24 16:53   ` Michael Kelley
  1 sibling, 2 replies; 25+ messages in thread
From: Yi Sun @ 2018-10-19 13:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, tglx, jgross, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, Yi Sun, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Michael Kelley (EOSG)

The HvNotifyLongSpinWait hypercall (HVCALL_NOTIFY_LONG_SPIN_WAIT)
is used by a guest OS to notify the hypervisor that the calling
virtual processor is attempting to acquire a resource that is
potentially held by another virtual processor within the same
Virtual Machine. This scheduling hint improves the scalability of
VMs with more than one virtual processor on Hyper-V.

Per MSFT TLFS, the retry number (SpinWaitInfo) is sent to hypervisor
only when the retry number exceeds the recommended number. If
recommended number is 0xFFFFFFFF, never retry.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>
Cc: Juergen Gross <jgross@suse.com>
---
 arch/x86/hyperv/hv_spinlock.c       | 18 ++++++++++++++++++
 arch/x86/include/asm/mshyperv.h     |  1 +
 kernel/locking/qspinlock_paravirt.h | 10 ++++++++++
 3 files changed, 29 insertions(+)

diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index a861b04..723dccb 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -18,6 +18,24 @@
 
 static bool __initdata hv_pvspin = true;
 
+bool hv_notify_long_spin_wait(int retry_num)
+{
+	/*
+	 * Per MSFT TLFS, the SpinWaitInfo is sent to hypervisor only when
+	 * the retry number exceeds the recommended number.
+	 *
+	 * If recommended number is 0xFFFFFFFF, never retry.
+	 */
+	if (ms_hyperv.num_spin_retry == HYPERV_SPINLOCK_RETRY_NEVER)
+		return false;
+
+	if ((0 == retry_num % ms_hyperv.num_spin_retry) && retry_num)
+		hv_do_fast_hypercall8(HVCALL_NOTIFY_LONG_SPIN_WAIT,
+				      retry_num);
+
+	return true;
+}
+
 static void hv_qlock_kick(int cpu)
 {
 	apic->send_IPI(cpu, X86_PLATFORM_IPI_VECTOR);
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index f909365..bd87868 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -356,6 +356,7 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset,
 void hv_apic_init(void);
 void __init hv_init_spinlocks(void);
 bool hv_vcpu_is_preempted(int vcpu);
+bool hv_notify_long_spin_wait(int retry_num);
 #else
 static inline void hv_apic_init(void) {}
 #endif
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 0130e48..9e88c7e 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -7,6 +7,8 @@
 #include <linux/bootmem.h>
 #include <linux/debug_locks.h>
 
+#include <asm/mshyperv.h>
+
 /*
  * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
  * of spinning them.
@@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
 				wait_early = true;
 				break;
 			}
+#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
+			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
+				break;
+#endif
 			cpu_relax();
 		}
 
@@ -433,6 +439,10 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
 		for (loop = SPIN_THRESHOLD; loop; loop--) {
 			if (trylock_clear_pending(lock))
 				goto gotlock;
+#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
+			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
+				break;
+#endif
 			cpu_relax();
 		}
 		clear_pending(lock);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-19 13:13 ` [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall Yi Sun
@ 2018-10-19 14:20   ` Juergen Gross
  2018-10-22  1:53     ` Yi Sun
  2018-10-24 16:53   ` Michael Kelley
  1 sibling, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2018-10-19 14:20 UTC (permalink / raw)
  To: Yi Sun, linux-kernel
  Cc: x86, tglx, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger

On 19/10/2018 15:13, Yi Sun wrote:
> The HvNotifyLongSpinWait hypercall (HVCALL_NOTIFY_LONG_SPIN_WAIT)
> is used by a guest OS to notify the hypervisor that the calling
> virtual processor is attempting to acquire a resource that is
> potentially held by another virtual processor within the same
> Virtual Machine. This scheduling hint improves the scalability of
> VMs with more than one virtual processor on Hyper-V.
> 
> Per MSFT TLFS, the retry number (SpinWaitInfo) is sent to hypervisor
> only when the retry number exceeds the recommended number. If
> recommended number is 0xFFFFFFFF, never retry.
> 
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>
> Cc: Juergen Gross <jgross@suse.com>
> ---
>  arch/x86/hyperv/hv_spinlock.c       | 18 ++++++++++++++++++
>  arch/x86/include/asm/mshyperv.h     |  1 +
>  kernel/locking/qspinlock_paravirt.h | 10 ++++++++++
>  3 files changed, 29 insertions(+)
> 
> diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
> index a861b04..723dccb 100644
> --- a/arch/x86/hyperv/hv_spinlock.c
> +++ b/arch/x86/hyperv/hv_spinlock.c
> @@ -18,6 +18,24 @@
>  
>  static bool __initdata hv_pvspin = true;
>  
> +bool hv_notify_long_spin_wait(int retry_num)
> +{
> +	/*
> +	 * Per MSFT TLFS, the SpinWaitInfo is sent to hypervisor only when
> +	 * the retry number exceeds the recommended number.
> +	 *
> +	 * If recommended number is 0xFFFFFFFF, never retry.
> +	 */
> +	if (ms_hyperv.num_spin_retry == HYPERV_SPINLOCK_RETRY_NEVER)
> +		return false;
> +
> +	if ((0 == retry_num % ms_hyperv.num_spin_retry) && retry_num)
> +		hv_do_fast_hypercall8(HVCALL_NOTIFY_LONG_SPIN_WAIT,
> +				      retry_num);
> +
> +	return true;
> +}
> +
>  static void hv_qlock_kick(int cpu)
>  {
>  	apic->send_IPI(cpu, X86_PLATFORM_IPI_VECTOR);
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index f909365..bd87868 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -356,6 +356,7 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset,
>  void hv_apic_init(void);
>  void __init hv_init_spinlocks(void);
>  bool hv_vcpu_is_preempted(int vcpu);
> +bool hv_notify_long_spin_wait(int retry_num);
>  #else
>  static inline void hv_apic_init(void) {}
>  #endif
> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> index 0130e48..9e88c7e 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -7,6 +7,8 @@
>  #include <linux/bootmem.h>
>  #include <linux/debug_locks.h>
>  
> +#include <asm/mshyperv.h>
> +
>  /*
>   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
>   * of spinning them.
> @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>  				wait_early = true;
>  				break;
>  			}
> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> +				break;
> +#endif

I don't like that. Why should a KVM or Xen guest call into a hyperv
specific function?

Can't you move this to existing hyperv specific paravirt hooks?


Juergen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-19 14:20   ` Juergen Gross
@ 2018-10-22  1:53     ` Yi Sun
  2018-10-22  7:32       ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Yi Sun @ 2018-10-22  1:53 UTC (permalink / raw)
  To: Juergen Gross
  Cc: linux-kernel, x86, tglx, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger

On 18-10-19 16:20:52, Juergen Gross wrote:
> On 19/10/2018 15:13, Yi Sun wrote:

[...]

> > diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> > index 0130e48..9e88c7e 100644
> > --- a/kernel/locking/qspinlock_paravirt.h
> > +++ b/kernel/locking/qspinlock_paravirt.h
> > @@ -7,6 +7,8 @@
> >  #include <linux/bootmem.h>
> >  #include <linux/debug_locks.h>
> >  
> > +#include <asm/mshyperv.h>
> > +
> >  /*
> >   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
> >   * of spinning them.
> > @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> >  				wait_early = true;
> >  				break;
> >  			}
> > +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> > +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> > +				break;
> > +#endif
> 
> I don't like that. Why should a KVM or Xen guest call into a hyperv
> specific function?
> 
> Can't you move this to existing hyperv specific paravirt hooks?
> 
hv_notify_long_spin_wait() must be called in this loop but it seems
there is no appropriate existing paravirt hook here. So, can I add
one more hook in pv_lock_ops to do this notification?

> 
> Juergen

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22  1:53     ` Yi Sun
@ 2018-10-22  7:32       ` Juergen Gross
  2018-10-22 16:31         ` Waiman Long
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2018-10-22  7:32 UTC (permalink / raw)
  To: Yi Sun
  Cc: linux-kernel, x86, tglx, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Peter Zijlstra, mingo, Will Deacon,
	Waiman Long

On 22/10/2018 03:53, Yi Sun wrote:
> On 18-10-19 16:20:52, Juergen Gross wrote:
>> On 19/10/2018 15:13, Yi Sun wrote:
> 
> [...]
> 
>>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
>>> index 0130e48..9e88c7e 100644
>>> --- a/kernel/locking/qspinlock_paravirt.h
>>> +++ b/kernel/locking/qspinlock_paravirt.h
>>> @@ -7,6 +7,8 @@
>>>  #include <linux/bootmem.h>
>>>  #include <linux/debug_locks.h>
>>>  
>>> +#include <asm/mshyperv.h>
>>> +
>>>  /*
>>>   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
>>>   * of spinning them.
>>> @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>>  				wait_early = true;
>>>  				break;
>>>  			}
>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
>>> +				break;
>>> +#endif
>>
>> I don't like that. Why should a KVM or Xen guest call into a hyperv
>> specific function?
>>
>> Can't you move this to existing hyperv specific paravirt hooks?
>>
> hv_notify_long_spin_wait() must be called in this loop but it seems
> there is no appropriate existing paravirt hook here. So, can I add
> one more hook in pv_lock_ops to do this notification?

vcpu_is_preempted() is already part of this loop. And this is a paravirt
hook. Can't you make use of that? This might require adding another
parameter to this hook, but I'd prefer that over another pv-spinlock
hook.

Adding some more locking maintainers and Waiman to the Cc: list.


Juergen


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22  7:32       ` Juergen Gross
@ 2018-10-22 16:31         ` Waiman Long
  2018-10-22 17:15           ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Waiman Long @ 2018-10-22 16:31 UTC (permalink / raw)
  To: Juergen Gross, Yi Sun
  Cc: linux-kernel, x86, tglx, chao.p.peng, chao.gao, isaku.yamahata,
	michael.h.kelley, tianyu.lan, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Peter Zijlstra, mingo, Will Deacon

On 10/22/2018 03:32 AM, Juergen Gross wrote:
> On 22/10/2018 03:53, Yi Sun wrote:
>> On 18-10-19 16:20:52, Juergen Gross wrote:
>>> On 19/10/2018 15:13, Yi Sun wrote:
>> [...]
>>
>>>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
>>>> index 0130e48..9e88c7e 100644
>>>> --- a/kernel/locking/qspinlock_paravirt.h
>>>> +++ b/kernel/locking/qspinlock_paravirt.h
>>>> @@ -7,6 +7,8 @@
>>>>  #include <linux/bootmem.h>
>>>>  #include <linux/debug_locks.h>
>>>>  
>>>> +#include <asm/mshyperv.h>
>>>> +
>>>>  /*
>>>>   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
>>>>   * of spinning them.
>>>> @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>>>  				wait_early = true;
>>>>  				break;
>>>>  			}
>>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
>>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
>>>> +				break;
>>>> +#endif
>>> I don't like that. Why should a KVM or Xen guest call into a hyperv
>>> specific function?
>>>
>>> Can't you move this to existing hyperv specific paravirt hooks?
>>>
>> hv_notify_long_spin_wait() must be called in this loop but it seems
>> there is no appropriate existing paravirt hook here. So, can I add
>> one more hook in pv_lock_ops to do this notification?
> vcpu_is_preempted() is already part of this loop. And this is a paravirt
> hook. Can't you make use of that? This might require adding another
> parameter to this hook, but I'd prefer that over another pv-spinlock
> hook.
>
> Adding some more locking maintainers and Waiman to the Cc: list.
>
>
> Juergen
>
I agree with Juergen on that. I would suggest rename the
vcpu_is_preempted hook into a more generic vcpu_stop_spinning, perhaps,
so different hypervisors can act on the information accordingly. Adding
an extra parameter is fine.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22 16:31         ` Waiman Long
@ 2018-10-22 17:15           ` Peter Zijlstra
  2018-10-22 17:27             ` Waiman Long
  2018-10-23  2:57             ` Yi Sun
  0 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2018-10-22 17:15 UTC (permalink / raw)
  To: Waiman Long
  Cc: Juergen Gross, Yi Sun, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon


Firstly, who come a patch that is grubbing around in kernel/locking/ has
an x86/hyperv subject and isn't Cc'ed to the locking maintainers?

On Mon, Oct 22, 2018 at 12:31:45PM -0400, Waiman Long wrote:
> On 10/22/2018 03:32 AM, Juergen Gross wrote:
> > On 22/10/2018 03:53, Yi Sun wrote:
> >> On 18-10-19 16:20:52, Juergen Gross wrote:
> >>> On 19/10/2018 15:13, Yi Sun wrote:
> >> [...]
> >>
> >>>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> >>>> index 0130e48..9e88c7e 100644
> >>>> --- a/kernel/locking/qspinlock_paravirt.h
> >>>> +++ b/kernel/locking/qspinlock_paravirt.h
> >>>> @@ -7,6 +7,8 @@
> >>>>  #include <linux/bootmem.h>
> >>>>  #include <linux/debug_locks.h>
> >>>>  
> >>>> +#include <asm/mshyperv.h>
> >>>> +
> >>>>  /*
> >>>>   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
> >>>>   * of spinning them.
> >>>> @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> >>>>  				wait_early = true;
> >>>>  				break;
> >>>>  			}
> >>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> >>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> >>>> +				break;
> >>>> +#endif

Secondly; how come you thought that was acceptable in any way shape or
form?

> > vcpu_is_preempted() is already part of this loop. And this is a paravirt
> > hook. Can't you make use of that? This might require adding another
> > parameter to this hook, but I'd prefer that over another pv-spinlock
> > hook.

> I agree with Juergen on that. I would suggest rename the
> vcpu_is_preempted hook into a more generic vcpu_stop_spinning, perhaps,
> so different hypervisors can act on the information accordingly. Adding
> an extra parameter is fine.

No; no extra parameters. vcpu_is_preempted() is a simple and intuitive
interface. Why would we want to make it complicated?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22 17:15           ` Peter Zijlstra
@ 2018-10-22 17:27             ` Waiman Long
  2018-10-22 17:31               ` Peter Zijlstra
  2018-10-23  2:57             ` Yi Sun
  1 sibling, 1 reply; 25+ messages in thread
From: Waiman Long @ 2018-10-22 17:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juergen Gross, Yi Sun, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On 10/22/2018 01:15 PM, Peter Zijlstra wrote:
> Firstly, who come a patch that is grubbing around in kernel/locking/ has
> an x86/hyperv subject and isn't Cc'ed to the locking maintainers?
>
> On Mon, Oct 22, 2018 at 12:31:45PM -0400, Waiman Long wrote:
>> On 10/22/2018 03:32 AM, Juergen Gross wrote:
>>> On 22/10/2018 03:53, Yi Sun wrote:
>>>> On 18-10-19 16:20:52, Juergen Gross wrote:
>>>>> On 19/10/2018 15:13, Yi Sun wrote:
>>>> [...]
>>>>
>>>>>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
>>>>>> index 0130e48..9e88c7e 100644
>>>>>> --- a/kernel/locking/qspinlock_paravirt.h
>>>>>> +++ b/kernel/locking/qspinlock_paravirt.h
>>>>>> @@ -7,6 +7,8 @@
>>>>>>  #include <linux/bootmem.h>
>>>>>>  #include <linux/debug_locks.h>
>>>>>>  
>>>>>> +#include <asm/mshyperv.h>
>>>>>> +
>>>>>>  /*
>>>>>>   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
>>>>>>   * of spinning them.
>>>>>> @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
>>>>>>  				wait_early = true;
>>>>>>  				break;
>>>>>>  			}
>>>>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
>>>>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
>>>>>> +				break;
>>>>>> +#endif
> Secondly; how come you thought that was acceptable in any way shape or
> form?
>
>>> vcpu_is_preempted() is already part of this loop. And this is a paravirt
>>> hook. Can't you make use of that? This might require adding another
>>> parameter to this hook, but I'd prefer that over another pv-spinlock
>>> hook.
>> I agree with Juergen on that. I would suggest rename the
>> vcpu_is_preempted hook into a more generic vcpu_stop_spinning, perhaps,
>> so different hypervisors can act on the information accordingly. Adding
>> an extra parameter is fine.
> No; no extra parameters. vcpu_is_preempted() is a simple and intuitive
> interface. Why would we want to make it complicated?

Hyperv seems to do it in a somewhat different way by looking at the spin
counter and decide if it should continue. I don't know why they do that
and what advantage it has.

The current patch is definitely not OK. A revised patch that makes use
of an existing paravirt hook will be more acceptable. Again, I would
like to see some performance figure and why they do it this way to see
if it is worthwhile to change the existing interface.

Cheers,
Longman



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22 17:27             ` Waiman Long
@ 2018-10-22 17:31               ` Peter Zijlstra
  2018-10-22 18:01                 ` Waiman Long
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2018-10-22 17:31 UTC (permalink / raw)
  To: Waiman Long
  Cc: Juergen Gross, Yi Sun, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On Mon, Oct 22, 2018 at 01:27:27PM -0400, Waiman Long wrote:

> >> I agree with Juergen on that. I would suggest rename the
> >> vcpu_is_preempted hook into a more generic vcpu_stop_spinning, perhaps,
> >> so different hypervisors can act on the information accordingly. Adding
> >> an extra parameter is fine.
> > No; no extra parameters. vcpu_is_preempted() is a simple and intuitive
> > interface. Why would we want to make it complicated?
> 
> Hyperv seems to do it in a somewhat different way by looking at the spin
> counter and decide if it should continue. I don't know why they do that
> and what advantage it has.
> 
> The current patch is definitely not OK. A revised patch that makes use
> of an existing paravirt hook will be more acceptable. Again, I would
> like to see some performance figure and why they do it this way to see
> if it is worthwhile to change the existing interface.

Note that there are vcpu_is_preempted() users that are not in a
spin-loop.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22 17:31               ` Peter Zijlstra
@ 2018-10-22 18:01                 ` Waiman Long
  0 siblings, 0 replies; 25+ messages in thread
From: Waiman Long @ 2018-10-22 18:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juergen Gross, Yi Sun, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On 10/22/2018 01:31 PM, Peter Zijlstra wrote:
> On Mon, Oct 22, 2018 at 01:27:27PM -0400, Waiman Long wrote:
>
>>>> I agree with Juergen on that. I would suggest rename the
>>>> vcpu_is_preempted hook into a more generic vcpu_stop_spinning, perhaps,
>>>> so different hypervisors can act on the information accordingly. Adding
>>>> an extra parameter is fine.
>>> No; no extra parameters. vcpu_is_preempted() is a simple and intuitive
>>> interface. Why would we want to make it complicated?
>> Hyperv seems to do it in a somewhat different way by looking at the spin
>> counter and decide if it should continue. I don't know why they do that
>> and what advantage it has.
>>
>> The current patch is definitely not OK. A revised patch that makes use
>> of an existing paravirt hook will be more acceptable. Again, I would
>> like to see some performance figure and why they do it this way to see
>> if it is worthwhile to change the existing interface.
> Note that there are vcpu_is_preempted() users that are not in a
> spin-loop.

You are right. I forgot about that. In that case, someone has to prove
that using an alternative way to stop spinning really has a performance
advantage compared to what we already have today.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-22 17:15           ` Peter Zijlstra
  2018-10-22 17:27             ` Waiman Long
@ 2018-10-23  2:57             ` Yi Sun
  2018-10-23  8:51               ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Yi Sun @ 2018-10-23  2:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On 18-10-22 19:15:16, Peter Zijlstra wrote:
> 
> Firstly, who come a patch that is grubbing around in kernel/locking/ has
> an x86/hyperv subject and isn't Cc'ed to the locking maintainers?
> 
I am sorry. That is my fault to forget to add locking maintainers.

> On Mon, Oct 22, 2018 at 12:31:45PM -0400, Waiman Long wrote:
> > On 10/22/2018 03:32 AM, Juergen Gross wrote:
> > > On 22/10/2018 03:53, Yi Sun wrote:
> > >> On 18-10-19 16:20:52, Juergen Gross wrote:
> > >>> On 19/10/2018 15:13, Yi Sun wrote:
> > >> [...]
> > >>
> > >>>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> > >>>> index 0130e48..9e88c7e 100644
> > >>>> --- a/kernel/locking/qspinlock_paravirt.h
> > >>>> +++ b/kernel/locking/qspinlock_paravirt.h
> > >>>> @@ -7,6 +7,8 @@
> > >>>>  #include <linux/bootmem.h>
> > >>>>  #include <linux/debug_locks.h>
> > >>>>  
> > >>>> +#include <asm/mshyperv.h>
> > >>>> +
> > >>>>  /*
> > >>>>   * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
> > >>>>   * of spinning them.
> > >>>> @@ -305,6 +307,10 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> > >>>>  				wait_early = true;
> > >>>>  				break;
> > >>>>  			}
> > >>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> > >>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> > >>>> +				break;
> > >>>> +#endif
> 
> Secondly; how come you thought that was acceptable in any way shape or
> form?
> 
Sorry for that. Will try another way.

> > > vcpu_is_preempted() is already part of this loop. And this is a paravirt
> > > hook. Can't you make use of that? This might require adding another
> > > parameter to this hook, but I'd prefer that over another pv-spinlock
> > > hook.
> 
> > I agree with Juergen on that. I would suggest rename the
> > vcpu_is_preempted hook into a more generic vcpu_stop_spinning, perhaps,
> > so different hypervisors can act on the information accordingly. Adding
> > an extra parameter is fine.
> 
> No; no extra parameters. vcpu_is_preempted() is a simple and intuitive
> interface. Why would we want to make it complicated?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-23  2:57             ` Yi Sun
@ 2018-10-23  8:51               ` Peter Zijlstra
  2018-10-23  9:33                 ` Yi Sun
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2018-10-23  8:51 UTC (permalink / raw)
  To: Yi Sun
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On Tue, Oct 23, 2018 at 10:57:40AM +0800, Yi Sun wrote:
> On 18-10-22 19:15:16, Peter Zijlstra wrote:

> > > >>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> > > >>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> > > >>>> +				break;
> > > >>>> +#endif
> > 
> > Secondly; how come you thought that was acceptable in any way shape or
> > form?
> > 
> Sorry for that. Will try another way.

Can you try and explain why vcpu_is_preempted() doesn't work for you?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-23  8:51               ` Peter Zijlstra
@ 2018-10-23  9:33                 ` Yi Sun
  2018-10-31  1:54                   ` Yi Sun
  0 siblings, 1 reply; 25+ messages in thread
From: Yi Sun @ 2018-10-23  9:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On 18-10-23 10:51:27, Peter Zijlstra wrote:
> On Tue, Oct 23, 2018 at 10:57:40AM +0800, Yi Sun wrote:
> > On 18-10-22 19:15:16, Peter Zijlstra wrote:
> 
> > > > >>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> > > > >>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> > > > >>>> +				break;
> > > > >>>> +#endif
> > > 
> > > Secondly; how come you thought that was acceptable in any way shape or
> > > form?
> > > 
> > Sorry for that. Will try another way.
> 
> Can you try and explain why vcpu_is_preempted() doesn't work for you?

I thought HvSpinWaitInfo is used to notify hypervisor the spin number
which is different with vcpu_is_preempted. So I did not consider
vcpu_is_preempted.

But HvSpinWaitInfo is a quite simple function and could be combined
with vcpu_is_preempted together. So I think it is OK to use
vcpu_is_preempted to make codes clean. I will have a try.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-19 13:13 ` [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall Yi Sun
  2018-10-19 14:20   ` Juergen Gross
@ 2018-10-24 16:53   ` Michael Kelley
  2018-10-25  2:23     ` Yi Sun
  2018-10-31  2:06     ` Yi Sun
  1 sibling, 2 replies; 25+ messages in thread
From: Michael Kelley @ 2018-10-24 16:53 UTC (permalink / raw)
  To: Yi Sun, linux-kernel
  Cc: x86, tglx, jgross, chao.p.peng, chao.gao, isaku.yamahata,
	Tianyu Lan, KY Srinivasan, Haiyang Zhang, Stephen Hemminger

From: Yi Sun <yi.y.sun@linux.intel.com>  Sent: Friday, October 19, 2018 6:14 AM
> 
> The HvNotifyLongSpinWait hypercall (HVCALL_NOTIFY_LONG_SPIN_WAIT)
> is used by a guest OS to notify the hypervisor that the calling
> virtual processor is attempting to acquire a resource that is
> potentially held by another virtual processor within the same
> Virtual Machine. This scheduling hint improves the scalability of
> VMs with more than one virtual processor on Hyper-V.
> 
> Per MSFT TLFS, the retry number (SpinWaitInfo) is sent to hypervisor
> only when the retry number exceeds the recommended number. If
> recommended number is 0xFFFFFFFF, never retry.

The HvNotifyLongSpinWait hypercall should be understood to be
advisory only.  As you noted, it is a scheduling hint to the
hypervisor that some virtual CPU in the VM holds a spin lock.  Even
though Linux knows which vCPU holds the spin lock, the hypercall
does not provide a way to give that information to Hyper-V.  The
hypercall always returns immediately.

The "retry number" is a bit mis-named in the Hyper-V Top Level
Functional Spec (TLFS).  It is essentially a threshold value.  Hyper-V is
saying "don't bother to advise me about the spin lock until you have
done a certain number of spins."  This threshold prevents
over-notifying Hyper-V such that the notification becomes somewhat
meaningless.   It's not immediately clear to me why the hypercall passes
that value as an input, but maybe it lets the Hyper-V scheduler prioritize
among vCPUs based on how many times they have spun for a lock.  I
think we were told that current Hyper-V implementations ignore this
input value anyway.

I believe the description of the sentinel value 0xFFFFFFFF in the
Hyper-V TLFS is incorrect.  Because it is the max possible threshold
value, that value in the EBX register just means to not ever bother to
notify.   The description should be "0xFFFFFFFF indicates never to notify."
The value does *not* indicate anything about retrying to obtain the
spin lock.

>  static bool __initdata hv_pvspin = true;
> 
> +bool hv_notify_long_spin_wait(int retry_num)

retry_num should probably be declared as unsigned int.  You
don't want it to be treated as a negative number if the high
order bit is set.

> +{
> +	/*
> +	 * Per MSFT TLFS, the SpinWaitInfo is sent to hypervisor only when
> +	 * the retry number exceeds the recommended number.
> +	 *
> +	 * If recommended number is 0xFFFFFFFF, never retry.
> +	 */
> +	if (ms_hyperv.num_spin_retry == HYPERV_SPINLOCK_RETRY_NEVER)
> +		return false;
> +
> +	if ((0 == retry_num % ms_hyperv.num_spin_retry) && retry_num)

I don't know if the "%" function is right here.  Your implementation will
notify Hyper-V on every multiple of num_spin_retry.   The alternative is to
notify once when the threshold is exceeded, and never again for this
particular attempt to obtain a spin lock.  We should check with the Hyper-V
team for which approach they expect to be used.

> +		hv_do_fast_hypercall8(HVCALL_NOTIFY_LONG_SPIN_WAIT,
> +				      retry_num);

The Hyper-V TLFS seems to be inconsistent on whether the input parameter
is 32-bits or 64-bits.   In one place it is typed as UINT64, but in another place
it is shown as only 4 bytes.  Need to clear this up with the Hyper-V team as
well.

> +
> +	return true;

I don't see a need for this function to return true vs. false.  Any calling code
should not change its behavior based on num_spin_retry.   This function will
either notify Hyper-V or not notify Hyper-V, depending on whether the number
of attempts to obtain the spinlock meets the threshold.  But calling code will
do the same thing regardless of whether such a notification is made. 

Michael


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-24 16:53   ` Michael Kelley
@ 2018-10-25  2:23     ` Yi Sun
  2018-10-31  2:06     ` Yi Sun
  1 sibling, 0 replies; 25+ messages in thread
From: Yi Sun @ 2018-10-25  2:23 UTC (permalink / raw)
  To: Michael Kelley
  Cc: linux-kernel, x86, tglx, jgross, chao.p.peng, chao.gao,
	isaku.yamahata, Tianyu Lan, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger

Hi, Michael,

Thanks a lot for the review and comments! Let us sync with Hyper-V team
to confirm these suspicious points.

BRs,
Sun Yi

On 18-10-24 16:53:00, Michael Kelley wrote:
> From: Yi Sun <yi.y.sun@linux.intel.com>  Sent: Friday, October 19, 2018 6:14 AM
> > 
> > The HvNotifyLongSpinWait hypercall (HVCALL_NOTIFY_LONG_SPIN_WAIT)
> > is used by a guest OS to notify the hypervisor that the calling
> > virtual processor is attempting to acquire a resource that is
> > potentially held by another virtual processor within the same
> > Virtual Machine. This scheduling hint improves the scalability of
> > VMs with more than one virtual processor on Hyper-V.
> > 
> > Per MSFT TLFS, the retry number (SpinWaitInfo) is sent to hypervisor
> > only when the retry number exceeds the recommended number. If
> > recommended number is 0xFFFFFFFF, never retry.
> 
> The HvNotifyLongSpinWait hypercall should be understood to be
> advisory only.  As you noted, it is a scheduling hint to the
> hypervisor that some virtual CPU in the VM holds a spin lock.  Even
> though Linux knows which vCPU holds the spin lock, the hypercall
> does not provide a way to give that information to Hyper-V.  The
> hypercall always returns immediately.
> 
> The "retry number" is a bit mis-named in the Hyper-V Top Level
> Functional Spec (TLFS).  It is essentially a threshold value.  Hyper-V is
> saying "don't bother to advise me about the spin lock until you have
> done a certain number of spins."  This threshold prevents
> over-notifying Hyper-V such that the notification becomes somewhat
> meaningless.   It's not immediately clear to me why the hypercall passes
> that value as an input, but maybe it lets the Hyper-V scheduler prioritize
> among vCPUs based on how many times they have spun for a lock.  I
> think we were told that current Hyper-V implementations ignore this
> input value anyway.
> 
> I believe the description of the sentinel value 0xFFFFFFFF in the
> Hyper-V TLFS is incorrect.  Because it is the max possible threshold
> value, that value in the EBX register just means to not ever bother to
> notify.   The description should be "0xFFFFFFFF indicates never to notify."
> The value does *not* indicate anything about retrying to obtain the
> spin lock.
> 
I will send mail to Hyper-V team to clarify these.

> >  static bool __initdata hv_pvspin = true;
> > 
> > +bool hv_notify_long_spin_wait(int retry_num)
> 
> retry_num should probably be declared as unsigned int.  You
> don't want it to be treated as a negative number if the high
> order bit is set.
> 
Yes, I should declare it as 'unsigned int'. Thanks!

> > +{
> > +	/*
> > +	 * Per MSFT TLFS, the SpinWaitInfo is sent to hypervisor only when
> > +	 * the retry number exceeds the recommended number.
> > +	 *
> > +	 * If recommended number is 0xFFFFFFFF, never retry.
> > +	 */
> > +	if (ms_hyperv.num_spin_retry == HYPERV_SPINLOCK_RETRY_NEVER)
> > +		return false;
> > +
> > +	if ((0 == retry_num % ms_hyperv.num_spin_retry) && retry_num)
> 
> I don't know if the "%" function is right here.  Your implementation will
> notify Hyper-V on every multiple of num_spin_retry.   The alternative is to
> notify once when the threshold is exceeded, and never again for this
> particular attempt to obtain a spin lock.  We should check with the Hyper-V
> team for which approach they expect to be used.
> 
> > +		hv_do_fast_hypercall8(HVCALL_NOTIFY_LONG_SPIN_WAIT,
> > +				      retry_num);
> 
> The Hyper-V TLFS seems to be inconsistent on whether the input parameter
> is 32-bits or 64-bits.   In one place it is typed as UINT64, but in another place
> it is shown as only 4 bytes.  Need to clear this up with the Hyper-V team as
> well.
> 
> > +
> > +	return true;
> 
> I don't see a need for this function to return true vs. false.  Any calling code
> should not change its behavior based on num_spin_retry.   This function will
> either notify Hyper-V or not notify Hyper-V, depending on whether the number
> of attempts to obtain the spinlock meets the threshold.  But calling code will
> do the same thing regardless of whether such a notification is made. 
> 
> Michael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-23  9:33                 ` Yi Sun
@ 2018-10-31  1:54                   ` Yi Sun
  2018-10-31 14:10                     ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Yi Sun @ 2018-10-31  1:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On 18-10-23 17:33:28, Yi Sun wrote:
> On 18-10-23 10:51:27, Peter Zijlstra wrote:
> > On Tue, Oct 23, 2018 at 10:57:40AM +0800, Yi Sun wrote:
> > > On 18-10-22 19:15:16, Peter Zijlstra wrote:
> > 
> > > > > >>>> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_SPINLOCKS) && IS_ENABLED(CONFIG_HYPERV)
> > > > > >>>> +			if (!hv_notify_long_spin_wait(SPIN_THRESHOLD - loop))
> > > > > >>>> +				break;
> > > > > >>>> +#endif
> > > > 
> > > > Secondly; how come you thought that was acceptable in any way shape or
> > > > form?
> > > > 
> > > Sorry for that. Will try another way.
> > 
> > Can you try and explain why vcpu_is_preempted() doesn't work for you?
> 
> I thought HvSpinWaitInfo is used to notify hypervisor the spin number
> which is different with vcpu_is_preempted. So I did not consider
> vcpu_is_preempted.
> 
> But HvSpinWaitInfo is a quite simple function and could be combined
> with vcpu_is_preempted together. So I think it is OK to use
> vcpu_is_preempted to make codes clean. I will have a try.

After checking codes, there is one issue to call vcpu_is_preempted.
There are two spin loops in qspinlock_paravirt.h. One loop in
'pv_wait_node' calls vcpu_is_preempted. But another loop in
'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
we have to add one more ops in 'pv_lock_ops' to do this.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-24 16:53   ` Michael Kelley
  2018-10-25  2:23     ` Yi Sun
@ 2018-10-31  2:06     ` Yi Sun
  1 sibling, 0 replies; 25+ messages in thread
From: Yi Sun @ 2018-10-31  2:06 UTC (permalink / raw)
  To: Michael Kelley
  Cc: linux-kernel, x86, tglx, jgross, chao.p.peng, chao.gao,
	isaku.yamahata, Tianyu Lan, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger

After syncing with Hyper-V team, we have got answers as below.

On 18-10-24 16:53:00, Michael Kelley wrote:
> From: Yi Sun <yi.y.sun@linux.intel.com>  Sent: Friday, October 19, 2018 6:14 AM
> > 
> > The HvNotifyLongSpinWait hypercall (HVCALL_NOTIFY_LONG_SPIN_WAIT)
> > is used by a guest OS to notify the hypervisor that the calling
> > virtual processor is attempting to acquire a resource that is
> > potentially held by another virtual processor within the same
> > Virtual Machine. This scheduling hint improves the scalability of
> > VMs with more than one virtual processor on Hyper-V.
> > 
> > Per MSFT TLFS, the retry number (SpinWaitInfo) is sent to hypervisor
> > only when the retry number exceeds the recommended number. If
> > recommended number is 0xFFFFFFFF, never retry.
> 
> The HvNotifyLongSpinWait hypercall should be understood to be
> advisory only.  As you noted, it is a scheduling hint to the
> hypervisor that some virtual CPU in the VM holds a spin lock.  Even
> though Linux knows which vCPU holds the spin lock, the hypercall
> does not provide a way to give that information to Hyper-V.  The
> hypercall always returns immediately.
> 
> The "retry number" is a bit mis-named in the Hyper-V Top Level
> Functional Spec (TLFS).  It is essentially a threshold value.  Hyper-V is
> saying "don't bother to advise me about the spin lock until you have
> done a certain number of spins."  This threshold prevents
> over-notifying Hyper-V such that the notification becomes somewhat
> meaningless.   It's not immediately clear to me why the hypercall passes
> that value as an input, but maybe it lets the Hyper-V scheduler prioritize
> among vCPUs based on how many times they have spun for a lock.  I
> think we were told that current Hyper-V implementations ignore this
> input value anyway.
> 
> I believe the description of the sentinel value 0xFFFFFFFF in the
> Hyper-V TLFS is incorrect.  Because it is the max possible threshold
> value, that value in the EBX register just means to not ever bother to
> notify.   The description should be "0xFFFFFFFF indicates never to notify."
> The value does *not* indicate anything about retrying to obtain the
> spin lock.
> 
You are right. 0xFFFFFFFF only indicates never to notify. We should not
break the spin loop.

> >  static bool __initdata hv_pvspin = true;
> > 
> > +bool hv_notify_long_spin_wait(int retry_num)
> 
> retry_num should probably be declared as unsigned int.  You
> don't want it to be treated as a negative number if the high
> order bit is set.
> 
Yes, thanks!

> > +{
> > +	/*
> > +	 * Per MSFT TLFS, the SpinWaitInfo is sent to hypervisor only when
> > +	 * the retry number exceeds the recommended number.
> > +	 *
> > +	 * If recommended number is 0xFFFFFFFF, never retry.
> > +	 */
> > +	if (ms_hyperv.num_spin_retry == HYPERV_SPINLOCK_RETRY_NEVER)
> > +		return false;
> > +
> > +	if ((0 == retry_num % ms_hyperv.num_spin_retry) && retry_num)
> 
> I don't know if the "%" function is right here.  Your implementation will
> notify Hyper-V on every multiple of num_spin_retry.   The alternative is to
> notify once when the threshold is exceeded, and never again for this
> particular attempt to obtain a spin lock.  We should check with the Hyper-V
> team for which approach they expect to be used.
> 
We should send the notification on every multiple of the recommended
number.

> > +		hv_do_fast_hypercall8(HVCALL_NOTIFY_LONG_SPIN_WAIT,
> > +				      retry_num);
> 
> The Hyper-V TLFS seems to be inconsistent on whether the input parameter
> is 32-bits or 64-bits.   In one place it is typed as UINT64, but in another place
> it is shown as only 4 bytes.  Need to clear this up with the Hyper-V team as
> well.
> 
It is 32-bits.

> > +
> > +	return true;
> 
> I don't see a need for this function to return true vs. false.  Any calling code
> should not change its behavior based on num_spin_retry.   This function will
> either notify Hyper-V or not notify Hyper-V, depending on whether the number
> of attempts to obtain the spinlock meets the threshold.  But calling code will
> do the same thing regardless of whether such a notification is made. 
> 
You are right. I will change it to 'void'.

> Michael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-31  1:54                   ` Yi Sun
@ 2018-10-31 14:10                     ` Peter Zijlstra
  2018-10-31 15:07                       ` Waiman Long
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2018-10-31 14:10 UTC (permalink / raw)
  To: Yi Sun
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On Wed, Oct 31, 2018 at 09:54:17AM +0800, Yi Sun wrote:
> On 18-10-23 17:33:28, Yi Sun wrote:
> > On 18-10-23 10:51:27, Peter Zijlstra wrote:

> > > Can you try and explain why vcpu_is_preempted() doesn't work for you?
> > 
> > I thought HvSpinWaitInfo is used to notify hypervisor the spin number
> > which is different with vcpu_is_preempted. So I did not consider
> > vcpu_is_preempted.
> > 
> > But HvSpinWaitInfo is a quite simple function and could be combined
> > with vcpu_is_preempted together. So I think it is OK to use
> > vcpu_is_preempted to make codes clean. I will have a try.
> 
> After checking codes, there is one issue to call vcpu_is_preempted.
> There are two spin loops in qspinlock_paravirt.h. One loop in
> 'pv_wait_node' calls vcpu_is_preempted. But another loop in
> 'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
> not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
> we have to add one more ops in 'pv_lock_ops' to do this.

Why? Would not something like the below cure that? Waiman, can you have
a look at this; I always forget how that paravirt crud works.

---
 kernel/locking/qspinlock.c          | 5 +++--
 kernel/locking/qspinlock_paravirt.h | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 8a8c3c208c5e..a4ab80f95176 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -286,7 +286,8 @@ static __always_inline void __pv_wait_node(struct mcs_spinlock *node,
 static __always_inline void __pv_kick_node(struct qspinlock *lock,
 					   struct mcs_spinlock *node) { }
 static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
-						   struct mcs_spinlock *node)
+						   struct mcs_spinlock *node,
+						   struct mcs_spinlock *prev)
 						   { return 0; }
 
 #define pv_enabled()		false
@@ -500,7 +501,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * If PV isn't active, 0 will be returned instead.
 	 *
 	 */
-	if ((val = pv_wait_head_or_lock(lock, node)))
+	if ((val = pv_wait_head_or_lock(lock, node, prev)))
 		goto locked;
 
 	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 0130e488ebfe..531dadc955fb 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -399,9 +399,10 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
  * The current value of the lock will be returned for additional processing.
  */
 static u32
-pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node)
+pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node, struct mcs_spinlock *prev)
 {
 	struct pv_node *pn = (struct pv_node *)node;
+	struct pv_node *pp = (struct pv_node *)prev;
 	struct qspinlock **lp = NULL;
 	int waitcnt = 0;
 	int loop;
@@ -430,7 +431,7 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node)
 		 * disable lock stealing before attempting to acquire the lock.
 		 */
 		set_pending(lock);
-		for (loop = SPIN_THRESHOLD; loop; loop--) {
+		for (loop = SPIN_THRESHOLD; loop && !vcpu_is_preempted(prev->cpu); loop--) {
 			if (trylock_clear_pending(lock))
 				goto gotlock;
 			cpu_relax();


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-31 14:10                     ` Peter Zijlstra
@ 2018-10-31 15:07                       ` Waiman Long
  2018-10-31 17:15                         ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Waiman Long @ 2018-10-31 15:07 UTC (permalink / raw)
  To: Peter Zijlstra, Yi Sun
  Cc: Juergen Gross, linux-kernel, x86, tglx, chao.p.peng, chao.gao,
	isaku.yamahata, michael.h.kelley, tianyu.lan, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, mingo, Will Deacon

On 10/31/2018 10:10 AM, Peter Zijlstra wrote:
> On Wed, Oct 31, 2018 at 09:54:17AM +0800, Yi Sun wrote:
>> On 18-10-23 17:33:28, Yi Sun wrote:
>>> On 18-10-23 10:51:27, Peter Zijlstra wrote:
>>>> Can you try and explain why vcpu_is_preempted() doesn't work for you?
>>> I thought HvSpinWaitInfo is used to notify hypervisor the spin number
>>> which is different with vcpu_is_preempted. So I did not consider
>>> vcpu_is_preempted.
>>>
>>> But HvSpinWaitInfo is a quite simple function and could be combined
>>> with vcpu_is_preempted together. So I think it is OK to use
>>> vcpu_is_preempted to make codes clean. I will have a try.
>> After checking codes, there is one issue to call vcpu_is_preempted.
>> There are two spin loops in qspinlock_paravirt.h. One loop in
>> 'pv_wait_node' calls vcpu_is_preempted. But another loop in
>> 'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
>> not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
>> we have to add one more ops in 'pv_lock_ops' to do this.
> Why? Would not something like the below cure that? Waiman, can you have
> a look at this; I always forget how that paravirt crud works.

There are two major reasons why the vcpu_is_preempt() test isn't done at
pv_wait_head_or_lock(). First of all, we may not have a valid prev
pointer after all if it is the first one to enter the queue while the
lock is busy. Secondly, because of lock stealing, the cpu number pointed
by a valid prev pointer may not be the actual cpu that is currently
holding the lock. Another minor reason is that we want to minimize the
lock transfer latency and so don't want to sleep too early while waiting
at the queue head.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-31 15:07                       ` Waiman Long
@ 2018-10-31 17:15                         ` Peter Zijlstra
  2018-11-01  3:20                           ` Yi Sun
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2018-10-31 17:15 UTC (permalink / raw)
  To: Waiman Long
  Cc: Yi Sun, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On Wed, Oct 31, 2018 at 11:07:22AM -0400, Waiman Long wrote:
> On 10/31/2018 10:10 AM, Peter Zijlstra wrote:
> > On Wed, Oct 31, 2018 at 09:54:17AM +0800, Yi Sun wrote:
> >> On 18-10-23 17:33:28, Yi Sun wrote:
> >>> On 18-10-23 10:51:27, Peter Zijlstra wrote:
> >>>> Can you try and explain why vcpu_is_preempted() doesn't work for you?
> >>> I thought HvSpinWaitInfo is used to notify hypervisor the spin number
> >>> which is different with vcpu_is_preempted. So I did not consider
> >>> vcpu_is_preempted.
> >>>
> >>> But HvSpinWaitInfo is a quite simple function and could be combined
> >>> with vcpu_is_preempted together. So I think it is OK to use
> >>> vcpu_is_preempted to make codes clean. I will have a try.
> >> After checking codes, there is one issue to call vcpu_is_preempted.
> >> There are two spin loops in qspinlock_paravirt.h. One loop in
> >> 'pv_wait_node' calls vcpu_is_preempted. But another loop in
> >> 'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
> >> not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
> >> we have to add one more ops in 'pv_lock_ops' to do this.
> > Why? Would not something like the below cure that? Waiman, can you have
> > a look at this; I always forget how that paravirt crud works.
> 
> There are two major reasons why the vcpu_is_preempt() test isn't done at
> pv_wait_head_or_lock(). First of all, we may not have a valid prev
> pointer after all if it is the first one to enter the queue while the
> lock is busy. Secondly, because of lock stealing, the cpu number pointed
> by a valid prev pointer may not be the actual cpu that is currently
> holding the lock. Another minor reason is that we want to minimize the
> lock transfer latency and so don't want to sleep too early while waiting
> at the queue head.

So Yi, are you actually seeing a problem? If so, can you give details?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-10-31 17:15                         ` Peter Zijlstra
@ 2018-11-01  3:20                           ` Yi Sun
  2018-11-01  8:59                             ` Peter Zijlstra
  2018-11-01 12:59                             ` Waiman Long
  0 siblings, 2 replies; 25+ messages in thread
From: Yi Sun @ 2018-11-01  3:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On 18-10-31 18:15:39, Peter Zijlstra wrote:
> On Wed, Oct 31, 2018 at 11:07:22AM -0400, Waiman Long wrote:
> > On 10/31/2018 10:10 AM, Peter Zijlstra wrote:
> > > On Wed, Oct 31, 2018 at 09:54:17AM +0800, Yi Sun wrote:
> > >> On 18-10-23 17:33:28, Yi Sun wrote:
> > >>> On 18-10-23 10:51:27, Peter Zijlstra wrote:
> > >>>> Can you try and explain why vcpu_is_preempted() doesn't work for you?
> > >>> I thought HvSpinWaitInfo is used to notify hypervisor the spin number
> > >>> which is different with vcpu_is_preempted. So I did not consider
> > >>> vcpu_is_preempted.
> > >>>
> > >>> But HvSpinWaitInfo is a quite simple function and could be combined
> > >>> with vcpu_is_preempted together. So I think it is OK to use
> > >>> vcpu_is_preempted to make codes clean. I will have a try.
> > >> After checking codes, there is one issue to call vcpu_is_preempted.
> > >> There are two spin loops in qspinlock_paravirt.h. One loop in
> > >> 'pv_wait_node' calls vcpu_is_preempted. But another loop in
> > >> 'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
> > >> not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
> > >> we have to add one more ops in 'pv_lock_ops' to do this.
> > > Why? Would not something like the below cure that? Waiman, can you have
> > > a look at this; I always forget how that paravirt crud works.
> > 
> > There are two major reasons why the vcpu_is_preempt() test isn't done at
> > pv_wait_head_or_lock(). First of all, we may not have a valid prev
> > pointer after all if it is the first one to enter the queue while the
> > lock is busy. Secondly, because of lock stealing, the cpu number pointed
> > by a valid prev pointer may not be the actual cpu that is currently
> > holding the lock. Another minor reason is that we want to minimize the
> > lock transfer latency and so don't want to sleep too early while waiting
> > at the queue head.
> 
> So Yi, are you actually seeing a problem? If so, can you give details?

Where does the patch come from? I cannot find it through google.

Per Waiman's comment, it seems not suitable to call vcpu_is_preempted()
in pv_wait_head_or_lock(). So, we cannot make HvSpinWaitInfo notification
through vcpu_is_preempted() for such case. Based on that, I suggest to
add one more callback function in pv_lock_ops.

BTW, which performance test do you suggest? I am trying to test it by
AIM7.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-11-01  3:20                           ` Yi Sun
@ 2018-11-01  8:59                             ` Peter Zijlstra
  2018-11-01 12:59                             ` Waiman Long
  1 sibling, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2018-11-01  8:59 UTC (permalink / raw)
  To: Yi Sun
  Cc: Waiman Long, Juergen Gross, linux-kernel, x86, tglx, chao.p.peng,
	chao.gao, isaku.yamahata, michael.h.kelley, tianyu.lan,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, mingo,
	Will Deacon

On Thu, Nov 01, 2018 at 11:20:21AM +0800, Yi Sun wrote:
> On 18-10-31 18:15:39, Peter Zijlstra wrote:

> > So Yi, are you actually seeing a problem? If so, can you give details?
> 
> Where does the patch come from? I cannot find it through google.

What patch!? The one I posted:

  https://lkml.kernel.org/r/20181031141030.GB13219@hirez.programming.kicks-ass.net

I made that up real quick.

> Per Waiman's comment, it seems not suitable to call vcpu_is_preempted()
> in pv_wait_head_or_lock(). So, we cannot make HvSpinWaitInfo notification
> through vcpu_is_preempted() for such case. Based on that, I suggest to
> add one more callback function in pv_lock_ops.

We're not doing anything until you can show a problem. If it ain't
broken, don't fix it etc..

> BTW, which performance test do you suggest? I am trying to test it by
> AIM7.

Waiman?


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-11-01  3:20                           ` Yi Sun
  2018-11-01  8:59                             ` Peter Zijlstra
@ 2018-11-01 12:59                             ` Waiman Long
  2018-11-05  6:54                               ` Yi Sun
  1 sibling, 1 reply; 25+ messages in thread
From: Waiman Long @ 2018-11-01 12:59 UTC (permalink / raw)
  To: Yi Sun, Peter Zijlstra
  Cc: Juergen Gross, linux-kernel, x86, tglx, chao.p.peng, chao.gao,
	isaku.yamahata, michael.h.kelley, tianyu.lan, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, mingo, Will Deacon

On 10/31/2018 11:20 PM, Yi Sun wrote:
> On 18-10-31 18:15:39, Peter Zijlstra wrote:
>> On Wed, Oct 31, 2018 at 11:07:22AM -0400, Waiman Long wrote:
>>> On 10/31/2018 10:10 AM, Peter Zijlstra wrote:
>>>> On Wed, Oct 31, 2018 at 09:54:17AM +0800, Yi Sun wrote:
>>>>> On 18-10-23 17:33:28, Yi Sun wrote:
>>>>>> On 18-10-23 10:51:27, Peter Zijlstra wrote:
>>>>>>> Can you try and explain why vcpu_is_preempted() doesn't work for you?
>>>>>> I thought HvSpinWaitInfo is used to notify hypervisor the spin number
>>>>>> which is different with vcpu_is_preempted. So I did not consider
>>>>>> vcpu_is_preempted.
>>>>>>
>>>>>> But HvSpinWaitInfo is a quite simple function and could be combined
>>>>>> with vcpu_is_preempted together. So I think it is OK to use
>>>>>> vcpu_is_preempted to make codes clean. I will have a try.
>>>>> After checking codes, there is one issue to call vcpu_is_preempted.
>>>>> There are two spin loops in qspinlock_paravirt.h. One loop in
>>>>> 'pv_wait_node' calls vcpu_is_preempted. But another loop in
>>>>> 'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
>>>>> not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
>>>>> we have to add one more ops in 'pv_lock_ops' to do this.
>>>> Why? Would not something like the below cure that? Waiman, can you have
>>>> a look at this; I always forget how that paravirt crud works.
>>> There are two major reasons why the vcpu_is_preempt() test isn't done at
>>> pv_wait_head_or_lock(). First of all, we may not have a valid prev
>>> pointer after all if it is the first one to enter the queue while the
>>> lock is busy. Secondly, because of lock stealing, the cpu number pointed
>>> by a valid prev pointer may not be the actual cpu that is currently
>>> holding the lock. Another minor reason is that we want to minimize the
>>> lock transfer latency and so don't want to sleep too early while waiting
>>> at the queue head.
>> So Yi, are you actually seeing a problem? If so, can you give details?
> Where does the patch come from? I cannot find it through google.
>
> Per Waiman's comment, it seems not suitable to call vcpu_is_preempted()
> in pv_wait_head_or_lock(). So, we cannot make HvSpinWaitInfo notification
> through vcpu_is_preempted() for such case. Based on that, I suggest to
> add one more callback function in pv_lock_ops.

I am hesitant to add any additional check at the spinning loop in
pv_wait_head_or_lock() especially one that is a hypercall or a callback
that will take time to execute. The testing that I had done in the past
indicated that it would slow down locking performance especially if the
VM wasn't overcommitted at all.

Any additional slack in pv_wait_node() can be mitigated by the lock
stealing that can happen. Slack in pv_wait_head_or_lock(), on the other
hand, will certainly increase the lock transfer latency and impact
performance. So you need performance data to show that it is worthwhile
to do so.

As for performance test, the kernel has a builtin locktorture test if
you configured it in. So show us the performance data with and without
the patch.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall
  2018-11-01 12:59                             ` Waiman Long
@ 2018-11-05  6:54                               ` Yi Sun
  0 siblings, 0 replies; 25+ messages in thread
From: Yi Sun @ 2018-11-05  6:54 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Juergen Gross, linux-kernel, x86, tglx,
	chao.p.peng, chao.gao, isaku.yamahata, michael.h.kelley,
	tianyu.lan, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	mingo, Will Deacon

On 18-11-01 08:59:08, Waiman Long wrote:
> On 10/31/2018 11:20 PM, Yi Sun wrote:
> > On 18-10-31 18:15:39, Peter Zijlstra wrote:
> >> On Wed, Oct 31, 2018 at 11:07:22AM -0400, Waiman Long wrote:
> >>> On 10/31/2018 10:10 AM, Peter Zijlstra wrote:
> >>>> On Wed, Oct 31, 2018 at 09:54:17AM +0800, Yi Sun wrote:
> >>>>> On 18-10-23 17:33:28, Yi Sun wrote:
> >>>>>> On 18-10-23 10:51:27, Peter Zijlstra wrote:
> >>>>>>> Can you try and explain why vcpu_is_preempted() doesn't work for you?
> >>>>>> I thought HvSpinWaitInfo is used to notify hypervisor the spin number
> >>>>>> which is different with vcpu_is_preempted. So I did not consider
> >>>>>> vcpu_is_preempted.
> >>>>>>
> >>>>>> But HvSpinWaitInfo is a quite simple function and could be combined
> >>>>>> with vcpu_is_preempted together. So I think it is OK to use
> >>>>>> vcpu_is_preempted to make codes clean. I will have a try.
> >>>>> After checking codes, there is one issue to call vcpu_is_preempted.
> >>>>> There are two spin loops in qspinlock_paravirt.h. One loop in
> >>>>> 'pv_wait_node' calls vcpu_is_preempted. But another loop in
> >>>>> 'pv_wait_head_or_lock' does not call vcpu_is_preempted. It also does
> >>>>> not call any other ops of 'pv_lock_ops' in the loop. So I am afraid
> >>>>> we have to add one more ops in 'pv_lock_ops' to do this.
> >>>> Why? Would not something like the below cure that? Waiman, can you have
> >>>> a look at this; I always forget how that paravirt crud works.
> >>> There are two major reasons why the vcpu_is_preempt() test isn't done at
> >>> pv_wait_head_or_lock(). First of all, we may not have a valid prev
> >>> pointer after all if it is the first one to enter the queue while the
> >>> lock is busy. Secondly, because of lock stealing, the cpu number pointed
> >>> by a valid prev pointer may not be the actual cpu that is currently
> >>> holding the lock. Another minor reason is that we want to minimize the
> >>> lock transfer latency and so don't want to sleep too early while waiting
> >>> at the queue head.
> >> So Yi, are you actually seeing a problem? If so, can you give details?
> > Where does the patch come from? I cannot find it through google.
> >
> > Per Waiman's comment, it seems not suitable to call vcpu_is_preempted()
> > in pv_wait_head_or_lock(). So, we cannot make HvSpinWaitInfo notification
> > through vcpu_is_preempted() for such case. Based on that, I suggest to
> > add one more callback function in pv_lock_ops.
> 
> I am hesitant to add any additional check at the spinning loop in
> pv_wait_head_or_lock() especially one that is a hypercall or a callback
> that will take time to execute. The testing that I had done in the past
> indicated that it would slow down locking performance especially if the
> VM wasn't overcommitted at all.
> 
> Any additional slack in pv_wait_node() can be mitigated by the lock
> stealing that can happen. Slack in pv_wait_head_or_lock(), on the other
> hand, will certainly increase the lock transfer latency and impact
> performance. So you need performance data to show that it is worthwhile
> to do so.
> 
Ok, I will make performance test to show if it is worthwhile to call
SpinWaitInfo in pv_wait_head_or_lock.

> As for performance test, the kernel has a builtin locktorture test if
> you configured it in. So show us the performance data with and without
> the patch.

Thank you! I will make performance test for whole patch.

> 
> Cheers,
> Longman

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-11-05  6:57 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-19 13:13 [PATCH v1 0/2] Enable HvNotifyLongSpinWait for Hyper-V Yi Sun
2018-10-19 13:13 ` [PATCH v1 1/2] x86/hyperv: get spinlock retry number on Hyper-V Yi Sun
2018-10-19 13:13 ` [PATCH v1 2/2] x86/hyperv: make HvNotifyLongSpinWait hypercall Yi Sun
2018-10-19 14:20   ` Juergen Gross
2018-10-22  1:53     ` Yi Sun
2018-10-22  7:32       ` Juergen Gross
2018-10-22 16:31         ` Waiman Long
2018-10-22 17:15           ` Peter Zijlstra
2018-10-22 17:27             ` Waiman Long
2018-10-22 17:31               ` Peter Zijlstra
2018-10-22 18:01                 ` Waiman Long
2018-10-23  2:57             ` Yi Sun
2018-10-23  8:51               ` Peter Zijlstra
2018-10-23  9:33                 ` Yi Sun
2018-10-31  1:54                   ` Yi Sun
2018-10-31 14:10                     ` Peter Zijlstra
2018-10-31 15:07                       ` Waiman Long
2018-10-31 17:15                         ` Peter Zijlstra
2018-11-01  3:20                           ` Yi Sun
2018-11-01  8:59                             ` Peter Zijlstra
2018-11-01 12:59                             ` Waiman Long
2018-11-05  6:54                               ` Yi Sun
2018-10-24 16:53   ` Michael Kelley
2018-10-25  2:23     ` Yi Sun
2018-10-31  2:06     ` Yi Sun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).