All of lore.kernel.org
* [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
@ 2019-06-28  0:43 Wanpeng Li
  2019-06-28  0:43 ` [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
  2019-06-28  1:10 ` [PATCH v2] sched/nohz: Optimize get_nohz_timer_target() Frederic Weisbecker
  0 siblings, 2 replies; 13+ messages in thread
From: Wanpeng Li @ 2019-06-28  0:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, Frederic Weisbecker,
	Thomas Gleixner

From: Wanpeng Li <wanpengli@tencent.com>

On a machine where cpu 0 is used for housekeeping and the other 39 cpus in
the same socket are in nohz_full mode, ftrace shows a huge amount of time
burned in the loop searching for the nearest busy housekeeping cpu.

  2)               |       get_nohz_timer_target() {
  2)   0.240 us    |         housekeeping_test_cpu();
  2)   0.458 us    |         housekeeping_test_cpu();

  ...

  2)   0.292 us    |         housekeeping_test_cpu();
  2)   0.240 us    |         housekeeping_test_cpu();
  2)   0.227 us    |         housekeeping_any_cpu();
  2) + 43.460 us   |       }
  
This patch optimizes the search logic by looking for the nearest busy cpu
directly in the housekeeping cpumask, which reduces the worst-case search
time from ~44us to < 10us in my testing. In addition, instead of falling
back to a random housekeeping cpu from the last iteration, the current CPU
is preferred as the fallback if it is itself a housekeeper.

Cc: Ingo Molnar <mingo@redhat.com> 
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
v1 -> v2:
 * current CPU is a better fallback if it is a housekeeper

 kernel/sched/core.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 102dfcf..04a0f6a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -539,27 +539,32 @@ void resched_cpu(int cpu)
  */
 int get_nohz_timer_target(void)
 {
-	int i, cpu = smp_processor_id();
+	int i, cpu = smp_processor_id(), default_cpu = -1;
 	struct sched_domain *sd;
 
-	if (!idle_cpu(cpu) && housekeeping_cpu(cpu, HK_FLAG_TIMER))
-		return cpu;
+	if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) {
+		if (!idle_cpu(cpu))
+			return cpu;
+		default_cpu = cpu;
+	}
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd),
+			housekeeping_cpumask(HK_FLAG_TIMER)) {
 			if (cpu == i)
 				continue;
 
-			if (!idle_cpu(i) && housekeeping_cpu(i, HK_FLAG_TIMER)) {
+			if (!idle_cpu(i)) {
 				cpu = i;
 				goto unlock;
 			}
 		}
 	}
 
-	if (!housekeeping_cpu(cpu, HK_FLAG_TIMER))
-		cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
+	if (default_cpu == -1)
+		default_cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
+	cpu = default_cpu;
 unlock:
 	rcu_read_unlock();
 	return cpu;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  0:43 [PATCH v2] sched/nohz: Optimize get_nohz_timer_target() Wanpeng Li
@ 2019-06-28  0:43 ` Wanpeng Li
  2019-06-28  1:18   ` Frederic Weisbecker
  2019-06-28  6:58   ` Srikar Dronamraju
  2019-06-28  1:10 ` [PATCH v2] sched/nohz: Optimize get_nohz_timer_target() Frederic Weisbecker
  1 sibling, 2 replies; 13+ messages in thread
From: Wanpeng Li @ 2019-06-28  0:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, Frederic Weisbecker,
	Thomas Gleixner

From: Wanpeng Li <wanpengli@tencent.com>

In a real production setup, there will be housekeeping cpus in each node. It 
is preferable to do housekeeping from the local node, falling back to the 
global online cpumask if no housekeeping cpu is found in the local node.

Cc: Ingo Molnar <mingo@redhat.com> 
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
v2 -> v3:
 * add sched_numa_find_closest comments
v1 -> v2:
 * introduce sched_numa_find_closest

 kernel/sched/isolation.c | 12 ++++++++++--
 kernel/sched/sched.h     |  5 ++---
 kernel/sched/topology.c  | 22 ++++++++++++++++++++++
 3 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 123ea07..589afba 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -16,9 +16,17 @@ static unsigned int housekeeping_flags;
 
 int housekeeping_any_cpu(enum hk_flags flags)
 {
-	if (static_branch_unlikely(&housekeeping_overridden))
-		if (housekeeping_flags & flags)
+	int cpu;
+
+	if (static_branch_unlikely(&housekeeping_overridden)) {
+		if (housekeeping_flags & flags) {
+			cpu = sched_numa_find_closest(housekeeping_mask, smp_processor_id());
+			if (cpu < nr_cpu_ids)
+				return cpu;
+
 			return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+		}
+	}
 	return smp_processor_id();
 }
 EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b08dee2..0db7431 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1212,9 +1212,6 @@ enum numa_topology_type {
 extern enum numa_topology_type sched_numa_topology_type;
 extern int sched_max_numa_distance;
 extern bool find_numa_distance(int distance);
-#endif
-
-#ifdef CONFIG_NUMA
 extern void sched_init_numa(void);
 extern void sched_domains_numa_masks_set(unsigned int cpu);
 extern void sched_domains_numa_masks_clear(unsigned int cpu);
@@ -1224,6 +1221,8 @@ static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
 static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
 #endif
 
+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
+
 #ifdef CONFIG_NUMA_BALANCING
 /* The regions in numa_faults array from task_struct */
 enum numa_faults_stats {
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 63184cf..083ef23 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1726,6 +1726,28 @@ void sched_domains_numa_masks_clear(unsigned int cpu)
 
 #endif /* CONFIG_NUMA */
 
+/*
+ * sched_numa_find_closest() - given the NUMA topology, find the cpu
+ *                             closest to @cpu from @cpumask.
+ * cpumask: cpumask to find a cpu from
+ * cpu: cpu to be close to
+ *
+ * returns: cpu, or >= nr_cpu_ids when nothing found (or !NUMA).
+ */
+int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
+{
+#ifdef CONFIG_NUMA
+	int i, j = cpu_to_node(cpu);
+
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]);
+		if (cpu < nr_cpu_ids)
+			return cpu;
+	}
+#endif
+	return nr_cpu_ids;
+}
+
 static int __sdt_alloc(const struct cpumask *cpu_map)
 {
 	struct sched_domain_topology_level *tl;
-- 
2.7.4



* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2019-06-28  0:43 [PATCH v2] sched/nohz: Optimize get_nohz_timer_target() Wanpeng Li
  2019-06-28  0:43 ` [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
@ 2019-06-28  1:10 ` Frederic Weisbecker
  2019-10-23  8:16   ` Wanpeng Li
  1 sibling, 1 reply; 13+ messages in thread
From: Frederic Weisbecker @ 2019-06-28  1:10 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Ingo Molnar, Thomas Gleixner

On Fri, Jun 28, 2019 at 08:43:12AM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> On a machine where cpu 0 is used for housekeeping and the other 39 cpus in
> the same socket are in nohz_full mode, ftrace shows a huge amount of time
> burned in the loop searching for the nearest busy housekeeping cpu.
> 
>   2)               |       get_nohz_timer_target() {
>   2)   0.240 us    |         housekeeping_test_cpu();
>   2)   0.458 us    |         housekeeping_test_cpu();
> 
>   ...
> 
>   2)   0.292 us    |         housekeeping_test_cpu();
>   2)   0.240 us    |         housekeeping_test_cpu();
>   2)   0.227 us    |         housekeeping_any_cpu();
>   2) + 43.460 us   |       }
>   
> This patch optimizes the search logic by looking for the nearest busy cpu
> directly in the housekeeping cpumask, which reduces the worst-case search
> time from ~44us to < 10us in my testing. In addition, instead of falling
> back to a random housekeeping cpu from the last iteration, the current CPU
> is preferred as the fallback if it is itself a housekeeper.
> 
> Cc: Ingo Molnar <mingo@redhat.com> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks!


* Re: [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  0:43 ` [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
@ 2019-06-28  1:18   ` Frederic Weisbecker
  2019-06-28  6:58   ` Srikar Dronamraju
  1 sibling, 0 replies; 13+ messages in thread
From: Frederic Weisbecker @ 2019-06-28  1:18 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Ingo Molnar, Thomas Gleixner

On Fri, Jun 28, 2019 at 08:43:13AM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> In a real production setup, there will be housekeeping cpus in each node. It 
> is preferable to do housekeeping from the local node, falling back to the 
> global online cpumask if no housekeeping cpu is found in the local node.
> 
> Cc: Ingo Molnar <mingo@redhat.com> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks!


* Re: [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  0:43 ` [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
  2019-06-28  1:18   ` Frederic Weisbecker
@ 2019-06-28  6:58   ` Srikar Dronamraju
  2019-06-28  7:19     ` Wanpeng Li
  1 sibling, 1 reply; 13+ messages in thread
From: Srikar Dronamraju @ 2019-06-28  6:58 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Ingo Molnar,
	Frederic Weisbecker, Thomas Gleixner

* Wanpeng Li <kernellwp@gmail.com> [2019-06-28 08:43:13]:


>  
> +/*
> + * sched_numa_find_closest() - given the NUMA topology, find the cpu
> + *                             closest to @cpu from @cpumask.
> + * cpumask: cpumask to find a cpu from
> + * cpu: cpu to be close to
> + *
> + * returns: cpu, or >= nr_cpu_ids when nothing found (or !NUMA).

One nit:
I don't see sched_numa_find_closest() returning anything greater than
nr_cpu_ids, so 's/>= //' in the above comment.

> + */
> +int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
> +{
> +#ifdef CONFIG_NUMA
> +	int i, j = cpu_to_node(cpu);
> +
> +	for (i = 0; i < sched_domains_numa_levels; i++) {
> +		cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]);
> +		if (cpu < nr_cpu_ids)
> +			return cpu;
> +	}
> +#endif
> +	return nr_cpu_ids;
> +}
> +

Should we have a static inline stub for sched_numa_find_closest() instead
of having an #ifdef inside the function?

>  static int __sdt_alloc(const struct cpumask *cpu_map)
>  {
>  	struct sched_domain_topology_level *tl;

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  6:58   ` Srikar Dronamraju
@ 2019-06-28  7:19     ` Wanpeng Li
  2019-06-28  8:44       ` Srikar Dronamraju
  0 siblings, 1 reply; 13+ messages in thread
From: Wanpeng Li @ 2019-06-28  7:19 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar,
	Frederic Weisbecker, Thomas Gleixner

Hi Srikar,
On Fri, 28 Jun 2019 at 14:58, Srikar Dronamraju
<srikar@linux.vnet.ibm.com> wrote:
>
> * Wanpeng Li <kernellwp@gmail.com> [2019-06-28 08:43:13]:
>
>
> >
> > +/*
> > + * sched_numa_find_closest() - given the NUMA topology, find the cpu
> > + *                             closest to @cpu from @cpumask.
> > + * cpumask: cpumask to find a cpu from
> > + * cpu: cpu to be close to
> > + *
> > + * returns: cpu, or >= nr_cpu_ids when nothing found (or !NUMA).
>
> One nit:
> I dont see sched_numa_find_closest returning anything greater than
> nr_cpu_ids. So 's/>= //' for the above comment.
>
> > + */
> > +int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
> > +{
> > +#ifdef CONFIG_NUMA
> > +     int i, j = cpu_to_node(cpu);
> > +
> > +     for (i = 0; i < sched_domains_numa_levels; i++) {
> > +             cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]);
> > +             if (cpu < nr_cpu_ids)
> > +                     return cpu;
> > +     }
> > +#endif
> > +     return nr_cpu_ids;
> > +}
> > +
>
> Should we have a static function for sched_numa_find_closest instead of
> having #ifdef in the function?
>
> >  static int __sdt_alloc(const struct cpumask *cpu_map)
> >  {
> >       struct sched_domain_topology_level *tl;

So, how about adding this?

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a7e7d8c..5f2b262 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1225,13 +1225,17 @@ enum numa_topology_type {
 extern void sched_init_numa(void);
 extern void sched_domains_numa_masks_set(unsigned int cpu);
 extern void sched_domains_numa_masks_clear(unsigned int cpu);
+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
 #else
 static inline void sched_init_numa(void) { }
 static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
 static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
+static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
+{
+    return nr_cpu_ids;
+}
 #endif

-extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);

 #ifdef CONFIG_NUMA_BALANCING
 /* The regions in numa_faults array from task_struct */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 72731ed..9372c18 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1734,19 +1734,16 @@ void sched_domains_numa_masks_clear(unsigned int cpu)
     }
 }

-#endif /* CONFIG_NUMA */
-
 /*
  * sched_numa_find_closest() - given the NUMA topology, find the cpu
  *                             closest to @cpu from @cpumask.
  * cpumask: cpumask to find a cpu from
  * cpu: cpu to be close to
  *
- * returns: cpu, or >= nr_cpu_ids when nothing found (or !NUMA).
+ * returns: cpu, or nr_cpu_ids when nothing found.
  */
 int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
 {
-#ifdef CONFIG_NUMA
     int i, j = cpu_to_node(cpu);

     for (i = 0; i < sched_domains_numa_levels; i++) {
@@ -1754,10 +1751,11 @@ int sched_numa_find_closest(const struct
cpumask *cpus, int cpu)
         if (cpu < nr_cpu_ids)
             return cpu;
     }
-#endif
     return nr_cpu_ids;
 }

+#endif /* CONFIG_NUMA */
+
 static int __sdt_alloc(const struct cpumask *cpu_map)
 {
     struct sched_domain_topology_level *tl;


* Re: [PATCH RESEND v3] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  7:19     ` Wanpeng Li
@ 2019-06-28  8:44       ` Srikar Dronamraju
  0 siblings, 0 replies; 13+ messages in thread
From: Srikar Dronamraju @ 2019-06-28  8:44 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar,
	Frederic Weisbecker, Thomas Gleixner

> 
> -#endif /* CONFIG_NUMA */
> -
>  /*
>   * sched_numa_find_closest() - given the NUMA topology, find the cpu
>   *                             closest to @cpu from @cpumask.
>   * cpumask: cpumask to find a cpu from
>   * cpu: cpu to be close to
>   *
> - * returns: cpu, or >= nr_cpu_ids when nothing found (or !NUMA).
> + * returns: cpu, or nr_cpu_ids when nothing found.
>   */
>  int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
>  {
> -#ifdef CONFIG_NUMA
>      int i, j = cpu_to_node(cpu);
> 
>      for (i = 0; i < sched_domains_numa_levels; i++) {
> @@ -1754,10 +1751,11 @@ int sched_numa_find_closest(const struct
> cpumask *cpus, int cpu)
>          if (cpu < nr_cpu_ids)
>              return cpu;
>      }
> -#endif
>      return nr_cpu_ids;
>  }
> 
> +#endif /* CONFIG_NUMA */
> +
>  static int __sdt_alloc(const struct cpumask *cpu_map)
>  {
>      struct sched_domain_topology_level *tl;
> 

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju



* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2019-06-28  1:10 ` [PATCH v2] sched/nohz: Optimize get_nohz_timer_target() Frederic Weisbecker
@ 2019-10-23  8:16   ` Wanpeng Li
  2019-10-23  8:29     ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Wanpeng Li @ 2019-10-23  8:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar, kvm, Frederic Weisbecker

On Fri, 28 Jun 2019 at 09:10, Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Fri, Jun 28, 2019 at 08:43:12AM +0800, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > On a machine where cpu 0 is used for housekeeping and the other 39 cpus in
> > the same socket are in nohz_full mode, ftrace shows a huge amount of time
> > burned in the loop searching for the nearest busy housekeeping cpu.
> >
> >   2)               |       get_nohz_timer_target() {
> >   2)   0.240 us    |         housekeeping_test_cpu();
> >   2)   0.458 us    |         housekeeping_test_cpu();
> >
> >   ...
> >
> >   2)   0.292 us    |         housekeeping_test_cpu();
> >   2)   0.240 us    |         housekeeping_test_cpu();
> >   2)   0.227 us    |         housekeeping_any_cpu();
> >   2) + 43.460 us   |       }
> >
> > This patch optimizes the search logic by looking for the nearest busy cpu
> > directly in the housekeeping cpumask, which reduces the worst-case search
> > time from ~44us to < 10us in my testing. In addition, instead of falling
> > back to a random housekeeping cpu from the last iteration, the current CPU
> > is preferred as the fallback if it is itself a housekeeper.
> >
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Hi Thomas,

I haven't seen the refactor of get_nohz_timer_target() which you
mentioned on IRC four months ago. Without this patch, in the case where
there is no busy housekeeping cpu, I can observe cyclictest latency
degrade from 4~5us to 8us in a kvm guest (we offload the lapic timer
emulation to a housekeeping cpu, to avoid the timer firing an external
interrupt on the pCPU where the vCPU resides, which would incur a
vmexit). The score recovers once I apply stress to create a busy
housekeeping cpu.

Could you consider applying this patch in the meantime, since I'm not
sure when the refactor will be ready.

    Wanpeng


* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2019-10-23  8:16   ` Wanpeng Li
@ 2019-10-23  8:29     ` Thomas Gleixner
  2019-10-23  9:25       ` Wanpeng Li
  2020-01-06  6:21       ` Wanpeng Li
  0 siblings, 2 replies; 13+ messages in thread
From: Thomas Gleixner @ 2019-10-23  8:29 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar, kvm, Frederic Weisbecker

On Wed, 23 Oct 2019, Wanpeng Li wrote:
> I haven't seen the refactor of get_nohz_timer_target() which you
> mentioned on IRC four months ago. Without this patch, in the case where
> there is no busy housekeeping cpu, I can observe cyclictest latency
> degrade from 4~5us to 8us in a kvm guest (we offload the lapic timer
> emulation to a housekeeping cpu, to avoid the timer firing an external
> interrupt on the pCPU where the vCPU resides, which would incur a
> vmexit). The score recovers once I apply stress to create a busy
> housekeeping cpu.
> 
> Could you consider applying this patch in the meantime, since I'm not
> sure when the refactor will be ready.

Yeah. It's delayed (again).... Will pick that up.

Thanks,

	tglx


* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2019-10-23  8:29     ` Thomas Gleixner
@ 2019-10-23  9:25       ` Wanpeng Li
  2020-01-06  6:21       ` Wanpeng Li
  1 sibling, 0 replies; 13+ messages in thread
From: Wanpeng Li @ 2019-10-23  9:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar, kvm, Frederic Weisbecker

On Wed, 23 Oct 2019 at 16:29, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 23 Oct 2019, Wanpeng Li wrote:
> > I haven't seen the refactor of get_nohz_timer_target() which you
> > mentioned on IRC four months ago. Without this patch, in the case where
> > there is no busy housekeeping cpu, I can observe cyclictest latency
> > degrade from 4~5us to 8us in a kvm guest (we offload the lapic timer
> > emulation to a housekeeping cpu, to avoid the timer firing an external
> > interrupt on the pCPU where the vCPU resides, which would incur a
> > vmexit). The score recovers once I apply stress to create a busy
> > housekeeping cpu.
> >
> > Could you consider applying this patch in the meantime, since I'm not
> > sure when the refactor will be ready.
>
> Yeah. It's delayed (again).... Will pick that up.

Sorry, will you pick up the patch or the refactor? :)

    Wanpeng


* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2019-10-23  8:29     ` Thomas Gleixner
  2019-10-23  9:25       ` Wanpeng Li
@ 2020-01-06  6:21       ` Wanpeng Li
  2020-01-10 14:12         ` Thomas Gleixner
  1 sibling, 1 reply; 13+ messages in thread
From: Wanpeng Li @ 2020-01-06  6:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar, kvm, Frederic Weisbecker

Hi Thomas,
On Wed, 23 Oct 2019 at 16:29, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 23 Oct 2019, Wanpeng Li wrote:
> > I haven't seen the refactor of get_nohz_timer_target() which you
> > mentioned on IRC four months ago. Without this patch, in the case where
> > there is no busy housekeeping cpu, I can observe cyclictest latency
> > degrade from 4~5us to 8us in a kvm guest (we offload the lapic timer
> > emulation to a housekeeping cpu, to avoid the timer firing an external
> > interrupt on the pCPU where the vCPU resides, which would incur a
> > vmexit). The score recovers once I apply stress to create a busy
> > housekeeping cpu.
> >
> > Could you consider applying this patch in the meantime, since I'm not
> > sure when the refactor will be ready.
>
> Yeah. It's delayed (again).... Will pick that up.

I haven't found a WIP tag for this work in the ~half year since v4 was
posted: https://lkml.org/lkml/2019/6/28/231. Could you apply this patch
in the meantime, since the completion time of the refactor is not
deterministic?

    Wanpeng


* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2020-01-06  6:21       ` Wanpeng Li
@ 2020-01-10 14:12         ` Thomas Gleixner
  2020-01-13  1:32           ` Wanpeng Li
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2020-01-10 14:12 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar, kvm, Frederic Weisbecker

Wanpeng,

Wanpeng Li <kernellwp@gmail.com> writes:

> Hi Thomas,
> On Wed, 23 Oct 2019 at 16:29, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> On Wed, 23 Oct 2019, Wanpeng Li wrote:
>> > I haven't seen the refactor of get_nohz_timer_target() which you
>> > mentioned on IRC four months ago. Without this patch, in the case where
>> > there is no busy housekeeping cpu, I can observe cyclictest latency
>> > degrade from 4~5us to 8us in a kvm guest (we offload the lapic timer
>> > emulation to a housekeeping cpu, to avoid the timer firing an external
>> > interrupt on the pCPU where the vCPU resides, which would incur a
>> > vmexit). The score recovers once I apply stress to create a busy
>> > housekeeping cpu.
>> >
>> > Could you consider applying this patch in the meantime, since I'm not
>> > sure when the refactor will be ready.
>>
>> Yeah. It's delayed (again).... Will pick that up.
>
> I haven't found a WIP tag for this work in the ~half year since v4 was
> posted: https://lkml.org/lkml/2019/6/28/231. Could you apply this patch
> in the meantime, since the completion time of the refactor is not
> deterministic?

Could you please repost it?

Thanks,

        tglx


* Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()
  2020-01-10 14:12         ` Thomas Gleixner
@ 2020-01-13  1:32           ` Wanpeng Li
  0 siblings, 0 replies; 13+ messages in thread
From: Wanpeng Li @ 2020-01-13  1:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Ingo Molnar, kvm, Frederic Weisbecker

On Fri, 10 Jan 2020 at 22:12, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Wanpeng,
>
> Wanpeng Li <kernellwp@gmail.com> writes:
>
> > Hi Thomas,
> > On Wed, 23 Oct 2019 at 16:29, Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> On Wed, 23 Oct 2019, Wanpeng Li wrote:
> >> > I haven't seen the refactor of get_nohz_timer_target() which you
> >> > mentioned on IRC four months ago. Without this patch, in the case where
> >> > there is no busy housekeeping cpu, I can observe cyclictest latency
> >> > degrade from 4~5us to 8us in a kvm guest (we offload the lapic timer
> >> > emulation to a housekeeping cpu, to avoid the timer firing an external
> >> > interrupt on the pCPU where the vCPU resides, which would incur a
> >> > vmexit). The score recovers once I apply stress to create a busy
> >> > housekeeping cpu.
> >> >
> >> > Could you consider applying this patch in the meantime, since I'm not
> >> > sure when the refactor will be ready.
> >>
> >> Yeah. It's delayed (again).... Will pick that up.
> >
> > I haven't found a WIP tag for this work in the ~half year since v4 was
> > posted: https://lkml.org/lkml/2019/6/28/231. Could you apply this patch
> > in the meantime, since the completion time of the refactor is not
> > deterministic?
>
> Could you please repost it?

Just reposted, thanks Thomas.

    Wanpeng


end of thread, other threads:[~2020-01-13  1:32 UTC | newest]
