* vmstat kthreads
@ 2013-06-18 15:23 Paul E. McKenney
  2013-06-18 17:46 ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2013-06-18 15:23 UTC (permalink / raw)
  To: linux-mm; +Cc: ghaskins, niv, kravetz

Hello!

I have been digging around the vmstat kthreads a bit, and it appears to
me that there is no reason to run a given CPU's vmstat kthread unless
that CPU spends some time executing in the kernel.  If correct, this
observation indicates that one way to safely reduce OS jitter due to the
vmstat kthreads is to prevent them from executing on a given CPU if that
CPU has been executing in usermode since the last time that this CPU's
vmstat kthread executed.

Does this seem like a sensible course of action, or did I miss something
when I went through the code?

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread
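
The idea in the message above can be sketched in userspace C. This is a minimal model, assuming a hypothetical per-CPU counter that is bumped on every entry into the kernel; struct cpu_sim, sim_enter_kernel, sim_vmstat_needed and sim_count_runs are illustration-only names, not kernel symbols, and the patches later in this thread detect activity differently (by looking for pending per-CPU counter deltas).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU bookkeeping; none of these names exist in the kernel. */
struct cpu_sim {
	unsigned long kernel_entries;   /* bumped on every entry into the kernel */
	unsigned long seen_at_last_run; /* kernel_entries as of the last check */
};

/* Model a syscall, interrupt, or fault taken on this CPU. */
static void sim_enter_kernel(struct cpu_sim *c)
{
	c->kernel_entries++;
}

/* True if the CPU entered the kernel since the previous check. */
static bool sim_vmstat_needed(struct cpu_sim *c)
{
	bool needed = c->kernel_entries != c->seen_at_last_run;

	c->seen_at_last_run = c->kernel_entries;
	return needed;
}

/* Count the CPUs whose vmstat work would run in this interval. */
static int sim_count_runs(struct cpu_sim *cpus, int ncpus)
{
	int runs = 0;

	assert(cpus && ncpus > 0);
	for (int i = 0; i < ncpus; i++)
		if (sim_vmstat_needed(&cpus[i]))
			runs++;
	return runs;
}
```

A CPU that stays in usermode for a whole interval never changes its counter, so the model skips it until it next enters the kernel.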

* Re: vmstat kthreads
  2013-06-18 15:23 vmstat kthreads Paul E. McKenney
@ 2013-06-18 17:46 ` Christoph Lameter
  2013-06-18 18:26   ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2013-06-18 17:46 UTC (permalink / raw)
  To: gilad; +Cc: linux-mm, ghaskins, niv, Paul E. McKenney, kravetz

On Tue, 18 Jun 2013, Paul E. McKenney wrote:

> I have been digging around the vmstat kthreads a bit, and it appears to
> me that there is no reason to run a given CPU's vmstat kthread unless
> that CPU spends some time executing in the kernel.  If correct, this
> observation indicates that one way to safely reduce OS jitter due to the
> vmstat kthreads is to prevent them from executing on a given CPU if that
> CPU has been executing in usermode since the last time that this CPU's
> vmstat kthread executed.

Right and we have patches to that effect.

> Does this seem like a sensible course of action, or did I miss something
> when I went through the code?

Nope, you are right on.

Gilad Ben-Yossef posted patches that address this issue in Feb
2012. Ccing him. Can we see your latest work, Gilad?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-18 17:46 ` Christoph Lameter
@ 2013-06-18 18:26   ` Paul E. McKenney
  2013-06-19 14:23     ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2013-06-18 18:26 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: gilad, linux-mm, ghaskins, niv, kravetz

On Tue, Jun 18, 2013 at 05:46:50PM +0000, Christoph Lameter wrote:
> On Tue, 18 Jun 2013, Paul E. McKenney wrote:
> 
> > I have been digging around the vmstat kthreads a bit, and it appears to
> > me that there is no reason to run a given CPU's vmstat kthread unless
> > that CPU spends some time executing in the kernel.  If correct, this
> > observation indicates that one way to safely reduce OS jitter due to the
> > vmstat kthreads is to prevent them from executing on a given CPU if that
> > CPU has been executing in usermode since the last time that this CPU's
> > vmstat kthread executed.
> 
> Right and we have patches to that effect.

Even better!

> > Does this seem like a sensible course of action, or did I miss something
> > when I went through the code?
> 
> Nope you are right on.
> 
> Gilad Ben-Yossef posted patches that address this issue in Feb
> 2012. Ccing him. Can we see your latest work, Gilad?

Is it this one?

https://lkml.org/lkml/2012/5/3/269

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-18 18:26   ` Paul E. McKenney
@ 2013-06-19 14:23     ` Christoph Lameter
  2013-06-19 14:50       ` Gilad Ben-Yossef
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2013-06-19 14:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: gilad, linux-mm, ghaskins, niv, kravetz, Frederic Weisbecker

On Tue, 18 Jun 2013, Paul E. McKenney wrote:

> > Gilad Ben-Yossef posted patches that address this issue in Feb
> > 2012. Ccing him. Can we see your latest work, Gilad?
>
> Is it this one?
>
> https://lkml.org/lkml/2012/5/3/269

Yes, that is it. Maybe the scheme there could be generalized so that other
subsystems can also use this to disable their threads if nothing is going
on? Or integrate the monitoring into the notick logic somehow?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-19 14:23     ` Christoph Lameter
@ 2013-06-19 14:50       ` Gilad Ben-Yossef
  2013-06-19 14:57         ` Gilad Ben-Yossef
                           ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-06-19 14:50 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, linux-mm, ghaskins, niv, kravetz, Frederic Weisbecker

On Wed, Jun 19, 2013 at 5:23 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 18 Jun 2013, Paul E. McKenney wrote:
>
>> > Gilad Ben-Yossef posted patches that address this issue in Feb
>> > 2012. Ccing him. Can we see your latest work, Gilad?
>>
>> Is it this one?
>>
>> https://lkml.org/lkml/2012/5/3/269
>
> Yes that is it. Maybe the scheme there could be generalized so that other
> subsystems can also use this to disable their threads if nothing is going
> on? Or integrate the monitoring into the notick logic somehow?
>

I respun the original patches based on feedback from Christoph for
3.2 and even did some light testing then, but got distracted and never
posted the result.

I've just ported them over to 3.10; they merge (with a small fix
due to deferred workqueue API changes) and build. I have not tried
running this version, though.
I'll post them as replies to this message.

I'd be happy to rescue them from the "TODO" pile... :-)

-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-19 14:50       ` Gilad Ben-Yossef
@ 2013-06-19 14:57         ` Gilad Ben-Yossef
  2013-06-20  5:06           ` Gilad Ben-Yossef
  2013-06-19 14:59         ` Paul E. McKenney
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-06-19 14:57 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, linux-mm, ghaskins, niv, kravetz, Frederic Weisbecker

On Wed, Jun 19, 2013 at 5:50 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
> On Wed, Jun 19, 2013 at 5:23 PM, Christoph Lameter <cl@linux.com> wrote:
>> On Tue, 18 Jun 2013, Paul E. McKenney wrote:
>>
>>> > Gilad Ben-Yossef posted patches that address this issue in Feb
>>> > 2012. Ccing him. Can we see your latest work, Gilad?
>>>
>>> Is it this one?
>>>
>>> https://lkml.org/lkml/2012/5/3/269
>>
>> Yes that is it. Maybe the scheme there could be generalized so that other
>> subsystems can also use this to disable their threads if nothing is going
>> on? Or integrate the monitoring into the notick logic somehow?
>>
>
> I respun the original patches based on feedback from Christoph for
> 3.2 and even did some light testing then, but got distracted and never
> posted the result.
>
> I've just ported them over to 3.10; they merge (with a small fix
> due to deferred workqueue API changes) and build. I have not tried
> running this version, though.
> I'll post them as replies to this message.

... or rather, I'll send them later this evening when I'm not behind a
firewall that is blocking SMTP connections... :-)

Gilad

>
> I'd be happy to rescue them from the "TODO" pile... :-)
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-19 14:50       ` Gilad Ben-Yossef
  2013-06-19 14:57         ` Gilad Ben-Yossef
@ 2013-06-19 14:59         ` Paul E. McKenney
  2013-06-19 20:18           ` Christoph Lameter
  2013-06-19 20:02           ` Gilad Ben-Yossef
  2013-06-19 20:02           ` Gilad Ben-Yossef
  3 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2013-06-19 14:59 UTC (permalink / raw)
  To: Gilad Ben-Yossef
  Cc: Christoph Lameter, linux-mm, ghaskins, niv, kravetz, Frederic Weisbecker

On Wed, Jun 19, 2013 at 05:50:46PM +0300, Gilad Ben-Yossef wrote:
> On Wed, Jun 19, 2013 at 5:23 PM, Christoph Lameter <cl@linux.com> wrote:
> > On Tue, 18 Jun 2013, Paul E. McKenney wrote:
> >
> >> > Gilad Ben-Yossef posted patches that address this issue in Feb
> >> > 2012. Ccing him. Can we see your latest work, Gilad?
> >>
> >> Is it this one?
> >>
> >> https://lkml.org/lkml/2012/5/3/269
> >
> > Yes that is it. Maybe the scheme there could be generalized so that other
> > subsystems can also use this to disable their threads if nothing is going
> > on? Or integrate the monitoring into the notick logic somehow?
> >
> 
> I respun the original patches based on feedback from Christoph for
> 3.2 and even did some light testing then, but got distracted and never
> posted the result.
> 
> I've just ported them over to 3.10; they merge (with a small fix
> due to deferred workqueue API changes) and build. I have not tried
> running this version, though.
> I'll post them as replies to this message.
> 
> I'd be happy to rescue them from the "TODO" pile... :-)

Please!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-06-19 14:50       ` Gilad Ben-Yossef
@ 2013-06-19 20:02           ` Gilad Ben-Yossef
  2013-06-19 14:59         ` Paul E. McKenney
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-06-19 20:02 UTC (permalink / raw)
  To: paulmck; +Cc: Gilad Ben-Yossef, Christoph Lameter, linux-kernel, linux-mm

vmstat_update runs every second from the work queue to update statistics
and drain per-CPU pages back into the global page allocator.

This is useful in most circumstances but is wasteful if the CPU does not
actually generate any VM activity. This can happen when the CPU is idle
or is running a long-term CPU-bound task (e.g. CPU isolation), in which
case the periodic vmstat_update timer needlessly interrupts the CPU.

This patch makes vmstat_update schedule itself for the next round only
if there was any work for it to do in the previous run. The assumption
is that if we saw no VM activity for a whole second, it is reasonable
to assume that the CPU is not using the VM because it is idle or is
running a long-term, single CPU-bound task.

A scapegoat CPU is picked to periodically monitor the CPUs that have
their vmstat_update work stopped and to re-schedule them if VM activity
is detected. The scapegoat CPU never stops its vmstat_update work item
instance.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: Christoph Lameter <cl@linux.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: linux-kernel@vger.kernel.org
CC: linux-mm@kvack.org
---
 include/linux/vmstat.h |    2 +-
 mm/vmstat.c            |   92 ++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 79 insertions(+), 15 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index c586679..a30ab79 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -198,7 +198,7 @@ extern void __inc_zone_state(struct zone *, enum zone_stat_item);
 extern void dec_zone_state(struct zone *, enum zone_stat_item);
 extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 
-void refresh_cpu_vm_stats(int);
+bool refresh_cpu_vm_stats(int);
 void refresh_zone_stat_thresholds(void);
 
 void drain_zonestat(struct zone *zone, struct per_cpu_pageset *);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index f42745e..6143c70 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -14,6 +14,7 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/cpu.h>
+#include <linux/cpumask.h>
 #include <linux/vmstat.h>
 #include <linux/sched.h>
 #include <linux/math64.h>
@@ -432,11 +433,12 @@ EXPORT_SYMBOL(dec_zone_page_state);
  * with the global counters. These could cause remote node cache line
  * bouncing and will have to be only done when necessary.
  */
-void refresh_cpu_vm_stats(int cpu)
+bool refresh_cpu_vm_stats(int cpu)
 {
 	struct zone *zone;
 	int i;
 	int global_diff[NR_VM_ZONE_STAT_ITEMS] = { 0, };
+	bool vm_activity = false;
 
 	for_each_populated_zone(zone) {
 		struct per_cpu_pageset *p;
@@ -483,14 +485,21 @@ void refresh_cpu_vm_stats(int cpu)
 		if (p->expire)
 			continue;
 
-		if (p->pcp.count)
+		if (p->pcp.count) {
+			vm_activity = true;
 			drain_zone_pages(zone, &p->pcp);
+		}
 #endif
 	}
 
 	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
-		if (global_diff[i])
+		if (global_diff[i]) {
 			atomic_long_add(global_diff[i], &vm_stat[i]);
+			vm_activity = true;
+		}
+
+	return vm_activity;
+
 }
 
 /*
@@ -1172,24 +1181,69 @@ static const struct file_operations proc_vmstat_file_operations = {
 #endif /* CONFIG_PROC_FS */
 
 #ifdef CONFIG_SMP
+
+#define VMSTAT_NO_CPU (-1)
+
 static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
 int sysctl_stat_interval __read_mostly = HZ;
+static struct cpumask vmstat_cpus;
+static int vmstat_monitor_cpu __read_mostly = VMSTAT_NO_CPU;
 
-static void vmstat_update(struct work_struct *w)
+static inline bool need_vmstat(int cpu)
 {
-	refresh_cpu_vm_stats(smp_processor_id());
-	schedule_delayed_work(&__get_cpu_var(vmstat_work),
-		round_jiffies_relative(sysctl_stat_interval));
+	struct zone *zone;
+	int i;
+
+	for_each_populated_zone(zone) {
+		struct per_cpu_pageset *p;
+
+		p = per_cpu_ptr(zone->pageset, cpu);
+
+		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
+			if (p->vm_stat_diff[i])
+				return true;
+
+		if (zone_to_nid(zone) != numa_node_id() && p->pcp.count)
+			return true;
+	}
+
+	return false;
 }
 
-static void __cpuinit start_cpu_timer(int cpu)
+static void vmstat_update(struct work_struct *w);
+
+static void start_cpu_timer(int cpu)
 {
 	struct delayed_work *work = &per_cpu(vmstat_work, cpu);
 
-	INIT_DEFERRABLE_WORK(work, vmstat_update);
+	cpumask_set_cpu(cpu, &vmstat_cpus);
 	schedule_delayed_work_on(cpu, work, __round_jiffies_relative(HZ, cpu));
 }
 
+static void __cpuinit setup_cpu_timer(int cpu)
+{
+	struct delayed_work *work = &per_cpu(vmstat_work, cpu);
+
+	INIT_DEFERRABLE_WORK(work, vmstat_update);
+	start_cpu_timer(cpu);
+}
+
+static void vmstat_update(struct work_struct *w)
+{
+	int cpu, this_cpu = smp_processor_id();
+
+	if (unlikely(this_cpu == vmstat_monitor_cpu))
+		for_each_cpu_not(cpu, &vmstat_cpus)
+			if (need_vmstat(cpu))
+				start_cpu_timer(cpu);
+
+	if (likely(refresh_cpu_vm_stats(this_cpu) || (this_cpu == vmstat_monitor_cpu)))
+		schedule_delayed_work(&__get_cpu_var(vmstat_work),
+				round_jiffies_relative(sysctl_stat_interval));
+	else
+		cpumask_clear_cpu(this_cpu, &vmstat_cpus);
+}
+
 /*
  * Use the cpu notifier to insure that the thresholds are recalculated
  * when necessary.
@@ -1204,17 +1258,25 @@ static int __cpuinit vmstat_cpuup_callback(struct notifier_block *nfb,
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
 		refresh_zone_stat_thresholds();
-		start_cpu_timer(cpu);
+		setup_cpu_timer(cpu);
 		node_set_state(cpu_to_node(cpu), N_CPU);
 		break;
 	case CPU_DOWN_PREPARE:
 	case CPU_DOWN_PREPARE_FROZEN:
-		cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
-		per_cpu(vmstat_work, cpu).work.func = NULL;
+		if (cpumask_test_cpu(cpu, &vmstat_cpus)) {
+			cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
+			per_cpu(vmstat_work, cpu).work.func = NULL;
+			if(cpu == vmstat_monitor_cpu) {
+				int this_cpu = smp_processor_id();
+				vmstat_monitor_cpu = this_cpu;
+				if (!cpumask_test_cpu(this_cpu, &vmstat_cpus))
+					start_cpu_timer(this_cpu);
+			}
+		}
 		break;
 	case CPU_DOWN_FAILED:
 	case CPU_DOWN_FAILED_FROZEN:
-		start_cpu_timer(cpu);
+		setup_cpu_timer(cpu);
 		break;
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
@@ -1237,8 +1299,10 @@ static int __init setup_vmstat(void)
 
 	register_cpu_notifier(&vmstat_notifier);
 
+	vmstat_monitor_cpu = smp_processor_id();
+
 	for_each_online_cpu(cpu)
-		start_cpu_timer(cpu);
+		setup_cpu_timer(cpu);
 #endif
 #ifdef CONFIG_PROC_FS
 	proc_create("buddyinfo", S_IRUGO, NULL, &fragmentation_file_operations);
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread
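
The rescheduling policy in the patch above can be modelled in userspace. This is a minimal sketch, assuming one counter per CPU and a fixed CPU count; struct vmstat_sim, sim_refresh, sim_vmstat_update, sim_tick and NR_CPUS_SIM are hypothetical names for illustration, not kernel symbols, and the model leaves out per-CPU page draining, NUMA and CPU hotplug.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4	/* hypothetical CPU count for the simulation */

struct vmstat_sim {
	int  pending[NR_CPUS_SIM];	/* per-CPU deltas not yet folded */
	bool armed[NR_CPUS_SIM];	/* models the vmstat_cpus mask */
	long global;			/* models one global vm_stat counter */
	int  monitor_cpu;		/* models vmstat_monitor_cpu */
};

/* Fold this CPU's delta into the global counter; report if there was any. */
static bool sim_refresh(struct vmstat_sim *s, int cpu)
{
	bool activity = s->pending[cpu] != 0;

	s->global += s->pending[cpu];
	s->pending[cpu] = 0;
	return activity;
}

/* One run of the periodic work item on the given CPU. */
static void sim_vmstat_update(struct vmstat_sim *s, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);

	/* Monitor duty: re-arm any stopped CPU that has pending deltas. */
	if (cpu == s->monitor_cpu)
		for (int other = 0; other < NR_CPUS_SIM; other++)
			if (!s->armed[other] && s->pending[other])
				s->armed[other] = true;

	/* Stay armed only if there was work, or if this is the monitor. */
	if (!sim_refresh(s, cpu) && cpu != s->monitor_cpu)
		s->armed[cpu] = false;
}

/* One interval: run the work item on every CPU that was armed at its start. */
static void sim_tick(struct vmstat_sim *s)
{
	bool was_armed[NR_CPUS_SIM];

	/* Snapshot first: a CPU re-armed in this tick runs on the next one. */
	for (int cpu = 0; cpu < NR_CPUS_SIM; cpu++)
		was_armed[cpu] = s->armed[cpu];

	for (int cpu = 0; cpu < NR_CPUS_SIM; cpu++)
		if (was_armed[cpu])
			sim_vmstat_update(s, cpu);
}
```

In this model a quiet CPU drops out after one idle interval, and new activity on it is folded in up to two intervals later: one for the monitor to notice, one for the re-armed work item to run.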

* [PATCH v2 2/2] mm: add sysctl to pick vmstat monitor cpu
  2013-06-19 14:50       ` Gilad Ben-Yossef
@ 2013-06-19 20:02           ` Gilad Ben-Yossef
  2013-06-19 14:59         ` Paul E. McKenney
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-06-19 20:02 UTC (permalink / raw)
  To: paulmck; +Cc: Gilad Ben-Yossef, Christoph Lameter, linux-kernel, linux-mm

Add a sysctl knob to let the admin hand-pick the scapegoat CPU
that will perform the extra work of periodically checking for
new VM activity on CPUs that have switched off their vmstat_update
work item scheduling.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: Christoph Lameter <cl@linux.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: linux-kernel@vger.kernel.org
CC: linux-mm@kvack.org

---
 include/linux/vmstat.h |    1 +
 kernel/sysctl.c        |    7 ++++
 mm/vmstat.c            |   72 ++++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index a30ab79..470f1d0 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -9,6 +9,7 @@
 #include <linux/atomic.h>
 
 extern int sysctl_stat_interval;
+extern int sysctl_vmstat_monitor_cpu;
 
 #ifdef CONFIG_VM_EVENT_COUNTERS
 /*
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 9edcf45..58c889e 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1361,6 +1361,13 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_jiffies,
 	},
+	{
+		.procname	= "stat_monitor_cpu",
+		.data		= &sysctl_vmstat_monitor_cpu,
+		.maxlen		= sizeof(sysctl_vmstat_monitor_cpu),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 #endif
 #ifdef CONFIG_MMU
 	{
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6143c70..767412e 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1187,7 +1187,7 @@ static const struct file_operations proc_vmstat_file_operations = {
 static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
 int sysctl_stat_interval __read_mostly = HZ;
 static struct cpumask vmstat_cpus;
-static int vmstat_monitor_cpu __read_mostly = VMSTAT_NO_CPU;
+int sysctl_vmstat_monitor_cpu __read_mostly = VMSTAT_NO_CPU;
 
 static inline bool need_vmstat(int cpu)
 {
@@ -1232,12 +1232,13 @@ static void vmstat_update(struct work_struct *w)
 {
 	int cpu, this_cpu = smp_processor_id();
 
-	if (unlikely(this_cpu == vmstat_monitor_cpu))
+	if (unlikely(this_cpu == sysctl_vmstat_monitor_cpu))
 		for_each_cpu_not(cpu, &vmstat_cpus)
 			if (need_vmstat(cpu))
 				start_cpu_timer(cpu);
 
-	if (likely(refresh_cpu_vm_stats(this_cpu) || (this_cpu == vmstat_monitor_cpu)))
+	if (likely(refresh_cpu_vm_stats(this_cpu) ||
+		(this_cpu == sysctl_vmstat_monitor_cpu)))
 		schedule_delayed_work(&__get_cpu_var(vmstat_work),
 				round_jiffies_relative(sysctl_stat_interval));
 	else
@@ -1266,9 +1267,9 @@ static int __cpuinit vmstat_cpuup_callback(struct notifier_block *nfb,
 		if (cpumask_test_cpu(cpu, &vmstat_cpus)) {
 			cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
 			per_cpu(vmstat_work, cpu).work.func = NULL;
-			if(cpu == vmstat_monitor_cpu) {
+			if (cpu == sysctl_vmstat_monitor_cpu) {
 				int this_cpu = smp_processor_id();
-				vmstat_monitor_cpu = this_cpu;
+				sysctl_vmstat_monitor_cpu = this_cpu;
 				if (!cpumask_test_cpu(this_cpu, &vmstat_cpus))
 					start_cpu_timer(this_cpu);
 			}
@@ -1299,7 +1300,7 @@ static int __init setup_vmstat(void)
 
 	register_cpu_notifier(&vmstat_notifier);
 
-	vmstat_monitor_cpu = smp_processor_id();
+	sysctl_vmstat_monitor_cpu = smp_processor_id();
 
 	for_each_online_cpu(cpu)
 		setup_cpu_timer(cpu);
@@ -1474,5 +1475,64 @@ fail:
 	return -ENOMEM;
 }
 
+#ifdef CONFIG_SYSCTL
+/*
+ * proc handler for /proc/sys/vm/stat_monitor_cpu
+ *
+ * Note that there is a harmless race condition here:
+ * If you concurrently try to change the monitor CPU to
+ * a new valid one and an invalid (offline) one at the
+ * same time, you can get a success indication for the
+ * valid one, a failure for the invalid one, but end up
+ * with the old value. It's easily fixable but hardly
+ * worth the added complexity.
+ */
+
+int proc_monitor_cpu(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret;
+	int tmp;
+
+	/*
+	 * We need to make sure the chosen and old monitor cpus don't
+	 * go offline on us during the transition.
+	 */
+	get_online_cpus();
+
+	tmp = sysctl_vmstat_monitor_cpu;
+
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	if (ret || !write)
+		goto out;
+
+	/*
+	 * An offline CPU is a bad choice for monitoring duty.
+	 * Abort.
+	 */
+	if (!cpu_online(sysctl_vmstat_monitor_cpu)) {
+		sysctl_vmstat_monitor_cpu = tmp;
+		ret = -ERANGE;
+		/*
+		 * Note! we fall through here on purpose, since
+		 * the old CPU monitor might have switched off
+		 * its vmstat_update by this time.
+		 */
+	}
+
+	/*
+	 * If the new monitor cpu had the vmstat_update off,
+	 * bring it back on.
+	 */
+	if (!cpumask_test_cpu(sysctl_vmstat_monitor_cpu, &vmstat_cpus))
+		start_cpu_timer(sysctl_vmstat_monitor_cpu);
+
+out:
+	put_online_cpus();
+	return ret;
+}
+#endif /* CONFIG_SYSCTL */
+
 module_init(extfrag_debug_init);
 #endif
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread
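
The monitor hand-over in the patch above can be modelled in userspace. This is a minimal sketch, assuming a fixed CPU count; struct monitor_sim, sim_set_monitor_cpu and NR_CPUS_SIM are hypothetical names, not kernel symbols. Unlike the patch, which stores the new value and rolls it back if the CPU is offline, this sketch validates before storing, and its bounds check is an addition of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4	/* hypothetical CPU count for the simulation */

struct monitor_sim {
	bool online[NR_CPUS_SIM];	/* models cpu_online() */
	bool armed[NR_CPUS_SIM];	/* models the vmstat_cpus mask */
	int  monitor_cpu;		/* models sysctl_vmstat_monitor_cpu */
};

/* Returns 0 on success, -ERANGE if the requested CPU cannot be monitor. */
static int sim_set_monitor_cpu(struct monitor_sim *s, int requested)
{
	int ret = 0;

	assert(s->monitor_cpu >= 0 && s->monitor_cpu < NR_CPUS_SIM);

	if (requested < 0 || requested >= NR_CPUS_SIM || !s->online[requested])
		ret = -ERANGE;		/* keep the old monitor */
	else
		s->monitor_cpu = requested;

	/*
	 * Done on both paths, as in the patch: whichever CPU ends up as
	 * monitor must have its periodic work armed, because the old
	 * monitor may have gone quiet in the meantime.
	 */
	s->armed[s->monitor_cpu] = true;
	return ret;
}
```

A rejected request leaves the previous monitor in place and still re-arms it, which mirrors the deliberate fall-through noted in the handler's comments.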

* [PATCH v2 2/2] mm: add sysctl to pick vmstat monitor cpu
@ 2013-06-19 20:02           ` Gilad Ben-Yossef
  0 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-06-19 20:02 UTC (permalink / raw)
  To: paulmck; +Cc: Gilad Ben-Yossef, Christoph Lameter, linux-kernel, linux-mm

Add a sysctl knob to let the admin hand-pick the scapegoat CPU
that will perform the extra work of periodically checking for
new VM activity on CPUs that have switched off their vmstat_update
work item scheduling.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
CC: Christoph Lameter <cl@linux.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: linux-kernel@vger.kernel.org
CC: linux-mm@kvack.org

---
 include/linux/vmstat.h |    1 +
 kernel/sysctl.c        |    7 ++++
 mm/vmstat.c            |   72 ++++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index a30ab79..470f1d0 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -9,6 +9,7 @@
 #include <linux/atomic.h>
 
 extern int sysctl_stat_interval;
+extern int sysctl_vmstat_monitor_cpu;
 
 #ifdef CONFIG_VM_EVENT_COUNTERS
 /*
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 9edcf45..58c889e 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1361,6 +1361,13 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_jiffies,
 	},
+	{
+		.procname	= "stat_monitor_cpu",
+		.data		= &sysctl_vmstat_monitor_cpu,
+		.maxlen		= sizeof(sysctl_vmstat_monitor_cpu),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 #endif
 #ifdef CONFIG_MMU
 	{
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6143c70..767412e 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1187,7 +1187,7 @@ static const struct file_operations proc_vmstat_file_operations = {
 static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
 int sysctl_stat_interval __read_mostly = HZ;
 static struct cpumask vmstat_cpus;
-static int vmstat_monitor_cpu __read_mostly = VMSTAT_NO_CPU;
+int sysctl_vmstat_monitor_cpu __read_mostly = VMSTAT_NO_CPU;
 
 static inline bool need_vmstat(int cpu)
 {
@@ -1232,12 +1232,13 @@ static void vmstat_update(struct work_struct *w)
 {
 	int cpu, this_cpu = smp_processor_id();
 
-	if (unlikely(this_cpu == vmstat_monitor_cpu))
+	if (unlikely(this_cpu == sysctl_vmstat_monitor_cpu))
 		for_each_cpu_not(cpu, &vmstat_cpus)
 			if (need_vmstat(cpu))
 				start_cpu_timer(cpu);
 
-	if (likely(refresh_cpu_vm_stats(this_cpu) || (this_cpu == vmstat_monitor_cpu)))
+	if (likely(refresh_cpu_vm_stats(this_cpu) ||
+		(this_cpu == sysctl_vmstat_monitor_cpu)))
 		schedule_delayed_work(&__get_cpu_var(vmstat_work),
 				round_jiffies_relative(sysctl_stat_interval));
 	else
@@ -1266,9 +1267,9 @@ static int __cpuinit vmstat_cpuup_callback(struct notifier_block *nfb,
 		if (cpumask_test_cpu(cpu, &vmstat_cpus)) {
 			cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
 			per_cpu(vmstat_work, cpu).work.func = NULL;
-			if(cpu == vmstat_monitor_cpu) {
+			if (cpu == sysctl_vmstat_monitor_cpu) {
 				int this_cpu = smp_processor_id();
-				vmstat_monitor_cpu = this_cpu;
+				sysctl_vmstat_monitor_cpu = this_cpu;
 				if (!cpumask_test_cpu(this_cpu, &vmstat_cpus))
 					start_cpu_timer(this_cpu);
 			}
@@ -1299,7 +1300,7 @@ static int __init setup_vmstat(void)
 
 	register_cpu_notifier(&vmstat_notifier);
 
-	vmstat_monitor_cpu = smp_processor_id();
+	sysctl_vmstat_monitor_cpu = smp_processor_id();
 
 	for_each_online_cpu(cpu)
 		setup_cpu_timer(cpu);
@@ -1474,5 +1475,64 @@ fail:
 	return -ENOMEM;
 }
 
+#ifdef CONFIG_SYSCTL
+/*
+ * proc handler for /proc/sys/mm/stat_monitor_cpu
+ *
+ * Note that there is a harmless race condition here:
+ * If you concurrently try to change the monitor CPU to
+ * a new valid one and an invalid (offline) one at the
+ * same time, you can get a success indication for the
+ * valid one, a failure for the invalid one, but end up
+ * with the old value. It's easily fixable but hardly
+ * worth the added complexity.
+ */
+
+int proc_monitor_cpu(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret;
+	int tmp;
+
+	/*
+	 * We need to make sure the chosen and old monitor cpus don't
+	 * go offline on us during the transition.
+	 */
+	get_online_cpus();
+
+	tmp = sysctl_vmstat_monitor_cpu;
+
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	if (ret || !write)
+		goto out;
+
+	/*
+	 * An offline CPU is a bad choice for monitoring duty.
+	 * Abort.
+	 */
+	if (!cpu_online(sysctl_vmstat_monitor_cpu)) {
+		sysctl_vmstat_monitor_cpu = tmp;
+		ret = -ERANGE;
+		/*
+		 * Note! we fall through here on purpose, since
+		 * the old CPU monitor might have switched off
+		 * its vmstat_update by this time.
+		 */
+	}
+
+	/*
+	 * If the new monitor cpu had the vmstat_update off,
+	 * bring it back on.
+	 */
+	if (!cpumask_test_cpu(sysctl_vmstat_monitor_cpu, &vmstat_cpus))
+		start_cpu_timer(sysctl_vmstat_monitor_cpu);
+
+out:
+	put_online_cpus();
+	return ret;
+}
+#endif /* CONFIG_SYSCTL */
+
 module_init(extfrag_debug_init);
 #endif
-- 
1.7.0.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-19 14:59         ` Paul E. McKenney
@ 2013-06-19 20:18           ` Christoph Lameter
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-06-19 20:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Gilad Ben-Yossef, linux-mm, ghaskins, niv, kravetz, Frederic Weisbecker

On Wed, 19 Jun 2013, Paul E. McKenney wrote:

> > I've just ported them over to 3.10 and they merge (with a small fix
> > due to deferred workqueue API changes) and build. I did not try to run
> > this version though.
> > I'll post them as replies to this message.
> >
> > I'd be happy to rescue them from the "TODO" pile... :-)
>
> Please!  ;-)

Well if we are going into vmstat mods then I'd also like to throw in this
old patch:

Subject: vmstat: Avoid interrupt disable in vm stats loop

There is no need to disable interrupts if we use xchg().

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/mm/vmstat.c
===================================================================
--- linux.orig/mm/vmstat.c	2013-05-20 15:19:28.000000000 -0500
+++ linux/mm/vmstat.c	2013-06-19 10:09:09.954024071 -0500
@@ -445,13 +445,8 @@ void refresh_cpu_vm_stats(int cpu)

 		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
 			if (p->vm_stat_diff[i]) {
-				unsigned long flags;
-				int v;
+				int v = xchg(p->vm_stat_diff + i, 0);

-				local_irq_save(flags);
-				v = p->vm_stat_diff[i];
-				p->vm_stat_diff[i] = 0;
-				local_irq_restore(flags);
 				atomic_long_add(v, &zone->vm_stat[i]);
 				global_diff[i] += v;
 #ifdef CONFIG_NUMA
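
As a rough illustration of why the swap makes the interrupt disabling
unnecessary: reading and zeroing the per-cpu delta in one atomic step
leaves no window in which an update could land between the read and the
clear. The userspace sketch below uses C11 atomic_exchange() as a stand-in
for the kernel's xchg(); the sketch_* names and array sizes are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_ITEMS 3	/* hypothetical number of counters */

static atomic_int sketch_diff[SKETCH_NR_ITEMS];		/* per-cpu deltas */
static atomic_long sketch_global[SKETCH_NR_ITEMS];	/* global counters */

/*
 * Fold one per-cpu delta into the global counter.  The exchange reads
 * the delta and resets it to zero in a single atomic operation, so a
 * concurrent update is either included in v or left for the next fold.
 */
static long sketch_fold_item(int i)
{
	int v;

	assert(i >= 0 && i < SKETCH_NR_ITEMS);

	v = atomic_exchange(&sketch_diff[i], 0);
	atomic_fetch_add(&sketch_global[i], v);
	return v;
}
```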

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: vmstat kthreads
  2013-06-19 14:57         ` Gilad Ben-Yossef
@ 2013-06-20  5:06           ` Gilad Ben-Yossef
  0 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-06-20  5:06 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, linux-mm, ghaskins, niv, kravetz, Frederic Weisbecker

On Wed, Jun 19, 2013 at 5:57 PM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:

>> I'll post them as replies to this message.
>
> ... or rather, I'll send them later this evening when I'm not behind a
> firewall that is blocking SMTP connections... :-)
>

Well, I must have botched the reply-to header, so they ended up as
separate posts:

https://lkml.org/lkml/2013/6/19/583
https://lkml.org/lkml/2013/6/19/580


-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] mm: add sysctl to pick vmstat monitor cpu
  2013-06-19 20:02           ` Gilad Ben-Yossef
@ 2013-06-20 13:58             ` Christoph Lameter
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-06-20 13:58 UTC (permalink / raw)
  To: Gilad Ben-Yossef
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Wed, 19 Jun 2013, Gilad Ben-Yossef wrote:

> Add a sysctl knob to enable admin to hand pick the scapegoat cpu
> that will perform the extra work of periodically checking for
> new VM activity on CPUs that have switched off their vmstat_update
> work item scheduling.

Not necessary if we use the dynticks sacrificial processor
(boot cpu). Seems to be also used for RCU.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-06-19 20:02           ` Gilad Ben-Yossef
@ 2013-06-20 14:05             ` Christoph Lameter
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-06-20 14:05 UTC (permalink / raw)
  To: Gilad Ben-Yossef
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Wed, 19 Jun 2013, Gilad Ben-Yossef wrote:

> +static void vmstat_update(struct work_struct *w)
> +{
> +	int cpu, this_cpu = smp_processor_id();
> +
> +	if (unlikely(this_cpu == vmstat_monitor_cpu))
> +		for_each_cpu_not(cpu, &vmstat_cpus)
> +			if (need_vmstat(cpu))
> +				start_cpu_timer(cpu);
> +
> +	if (likely(refresh_cpu_vm_stats(this_cpu) || (this_cpu == vmstat_monitor_cpu)))
> +		schedule_delayed_work(&__get_cpu_var(vmstat_work),
> +				round_jiffies_relative(sysctl_stat_interval));
> +	else
> +		cpumask_clear_cpu(this_cpu, &vmstat_cpus);

The clearing of vmstat_cpus could be avoided if this processor is not
running tickless. Frequent updates to vmstat_cpus could become an issue.

>  	case CPU_DOWN_PREPARE_FROZEN:
> -		cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> -		per_cpu(vmstat_work, cpu).work.func = NULL;
> +		if (cpumask_test_cpu(cpu, &vmstat_cpus)) {
> +			cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> +			per_cpu(vmstat_work, cpu).work.func = NULL;
> +			if(cpu == vmstat_monitor_cpu) {
> +				int this_cpu = smp_processor_id();
> +				vmstat_monitor_cpu = this_cpu;
> +				if (!cpumask_test_cpu(this_cpu, &vmstat_cpus))
> +					start_cpu_timer(this_cpu);
> +			}
> +		}
>  		break;

If the disabling of vmstat is tied into the nohz logic then these portions
are no longer necessary.

> @@ -1237,8 +1299,10 @@ static int __init setup_vmstat(void)
>
>  	register_cpu_notifier(&vmstat_notifier);
>
> +	vmstat_monitor_cpu = smp_processor_id();
> +

Drop the vmstat_monitor_cpu and use the dynticks processor.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-06-20 14:05             ` Christoph Lameter
@ 2013-08-07 18:16               ` Christoph Lameter
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-08-07 18:16 UTC (permalink / raw)
  To: Gilad Ben-Yossef
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

Is there any work in progress on this issue?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-08-07 18:16               ` Christoph Lameter
  (?)
@ 2013-08-08  6:28               ` Gilad Ben-Yossef
  -1 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-08-08  6:28 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Wed, Aug 7, 2013 at 9:16 PM, Christoph Lameter <cl@gentwo.org> wrote:

> Is there any work in progress on this issue?
>

Sorry, I dropped the ball following up on this one. Let me check up on it
now...

Gilad

-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a situation
where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-06-20 14:05             ` Christoph Lameter
@ 2013-08-08  6:54               ` Gilad Ben-Yossef
  -1 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-08-08  6:54 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Thu, Jun 20, 2013 at 5:05 PM, Christoph Lameter <cl@linux.com> wrote:
>
> On Wed, 19 Jun 2013, Gilad Ben-Yossef wrote:
>
> > +static void vmstat_update(struct work_struct *w)
> > +{
> > +     int cpu, this_cpu = smp_processor_id();
> > +
> > +     if (unlikely(this_cpu == vmstat_monitor_cpu))
> > +             for_each_cpu_not(cpu, &vmstat_cpus)
> > +                     if (need_vmstat(cpu))
> > +                             start_cpu_timer(cpu);
> > +
> > +     if (likely(refresh_cpu_vm_stats(this_cpu) || (this_cpu == vmstat_monitor_cpu)))
> > +             schedule_delayed_work(&__get_cpu_var(vmstat_work),
> > +                             round_jiffies_relative(sysctl_stat_interval));
> > +     else
> > +             cpumask_clear_cpu(this_cpu, &vmstat_cpus);
>
> The clearing of vmstat_cpus could be avoided if this processor is not
> running tickless. Frequent updates to vmstat_cpus could become an issue.

I like the idea of tying the vmstat disabling to the tickless logic,
but I seem to have run into a bit of a chicken-and-egg problem here:

vmstat_update runs as a work item executed by the workqueue kernel
thread.

If this code is running, it means there are at least two schedulable tasks:
1. The workqueue kernel thread, because it is running.
2. At least one more task; otherwise we would have been idle and the
   workqueue kernel thread would not be executing this work item.

Unfortunately, having two schedulable tasks means we're not running
tickless, so the check will never trigger. Or have I missed something
obvious?

Thanks,
Gilad


--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a situation
where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-08-08  6:54               ` Gilad Ben-Yossef
@ 2013-08-08 14:59                 ` Christoph Lameter
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-08-08 14:59 UTC (permalink / raw)
  To: Gilad Ben-Yossef
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Thu, 8 Aug 2013, Gilad Ben-Yossef wrote:

> vmstat_update runs as a work item executed by the workqueue kernel
> thread.
>
> If this code is running, it means there are at least two schedulable tasks:
> 1. The workqueue kernel thread, because it is running.
> 2. At least one more task; otherwise we would have been idle and the
>    workqueue kernel thread would not be executing this work item.
>
> Unfortunately, having two schedulable tasks means we're not running
> tickless, so the check will never trigger. Or have I missed something
> obvious?

The vmstat update is deferrable work. As such it is not required to run
and can be pushed off. It will not be considered for the calculation of
the next timer interrupt. See __next_timer_interrupt().
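
To make the point about deferrable work concrete, here is a minimal
sketch of a next-wakeup calculation that skips deferrable timers. It is
not the kernel's __next_timer_interrupt(); struct sketch_timer,
sketch_next_wakeup() and SKETCH_NO_TIMER are hypothetical names, and
jiffies wraparound is ignored for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified timer: absolute expiry plus a deferrable flag. */
struct sketch_timer {
	unsigned long expires;
	bool deferrable;
};

#define SKETCH_NO_TIMER ULONG_MAX	/* nothing needs to wake the CPU */

/*
 * Return the earliest expiry among the non-deferrable timers.
 * Deferrable timers never pull the next wakeup earlier; they simply run
 * the next time the CPU is awake for some other reason.
 */
static unsigned long sketch_next_wakeup(const struct sketch_timer *timers,
					size_t n)
{
	unsigned long next = SKETCH_NO_TIMER;
	size_t i;

	assert(timers != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (timers[i].deferrable)
			continue;
		if (timers[i].expires < next)
			next = timers[i].expires;
	}
	return next;
}
```

With one deferrable timer at 100 and regular timers at 250 and 400, the
sketch reports 250 as the next wakeup, so the deferrable timer alone would
not bring the CPU out of its quiet state.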




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-08-08 14:59                 ` Christoph Lameter
@ 2013-08-09 18:56                   ` Gilad Ben-Yossef
  -1 siblings, 0 replies; 28+ messages in thread
From: Gilad Ben-Yossef @ 2013-08-09 18:56 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Thu, Aug 8, 2013 at 5:59 PM, Christoph Lameter <cl@gentwo.org> wrote:
> On Thu, 8 Aug 2013, Gilad Ben-Yossef wrote:
>
>> vmstat_update runs as a work item executed by the workqueue kernel
>> thread.
>>
>> If this code is running, it means there are at least two schedulable tasks:
>> 1. The workqueue kernel thread, because it is running.
>> 2. At least one more task; otherwise we would have been idle and the
>>    workqueue kernel thread would not be executing this work item.
>>
>> Unfortunately, having two schedulable tasks means we're not running
>> tickless, so the check will never trigger. Or have I missed something
>> obvious?
>
> The vmstat update is deferrable work. As such it is not required to run
> and can be pushed off. It will not be considered for the calculation of
> the next timer interrupt. See __next_timer_interrupt().

Yes, I understand that. I was trying to say something else:

If the code does not consider setting the vmstat_cpus bit in the mask
unless we are running on a CPU in tickless state, then we will (almost)
never set vmstat_cpus, since we will (almost) never be tickless while
running a deferrable work item:

If there is no other task, we will be idle and the deferrable work
will not be scheduled, since the timer will not fire.

If there is one task originally, the work item gets executed in the
workqueue kernel thread, so we have two tasks and tickless will
disengage.

If there is more than one task, tickless is not engaged.

Bottom line: we will be in active tickless mode when running a
deferrable work item only if the timer that scheduled the work happened
to fire and the previously running task happened to block. This is rare
enough that in practice we will almost never be in active tickless mode
when running the vmstat_update function.
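
The case analysis above can be written down as a small decision function.
This is only a model of the argument, not of the scheduler: it assumes, as
the text does, that the tick is stopped exactly when one task is runnable.
The enum and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_outcome {
	SKETCH_WORK_NOT_RUN,	/* CPU idle, the deferrable timer does not fire */
	SKETCH_TICK_RUNNING,	/* work item runs, but the tick is on */
	SKETCH_TICKLESS,	/* work item runs while the CPU is tickless */
};

/*
 * tasks_before: runnable tasks on the CPU before the work item is queued.
 * task_blocked: whether the previously running task blocked in the meantime.
 */
static enum sketch_outcome sketch_state_during_work(int tasks_before,
						    bool task_blocked)
{
	int runnable;

	assert(tasks_before >= 0);

	if (tasks_before == 0)
		return SKETCH_WORK_NOT_RUN;

	/* The workqueue thread adds a task; a blocked task removes one. */
	runnable = tasks_before + 1 - (task_blocked ? 1 : 0);

	return runnable == 1 ? SKETCH_TICKLESS : SKETCH_TICK_RUNNING;
}
```

Only the combination of exactly one task before the work item and that
task blocking yields SKETCH_TICKLESS, which is the rare case described in
the bottom line.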

I hope I managed to explain myself better this time.

Thanks,
Gilad



-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/2] mm: make vmstat_update periodic run conditional
  2013-08-09 18:56                   ` Gilad Ben-Yossef
@ 2013-08-28 19:28                     ` Christoph Lameter
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-08-28 19:28 UTC (permalink / raw)
  To: Gilad Ben-Yossef
  Cc: Paul E. McKenney, linux-kernel, linux-mm, Frederic Weisbecker

On Fri, 9 Aug 2013, Gilad Ben-Yossef wrote:

> If the code does not consider setting the vmstat_cpus bit in the mask
> unless we are running on a CPU in tickless state, then we will (almost)
> never set vmstat_cpus, since we will (almost) never be tickless while
> running a deferrable work item:

Sorry, I never got around to answering this one. I am not sure what to do
about it.

How about this: disable the vmstat work when there is no diff to handle
instead? That means the OS was quiet during the preceding period. That
way you have a criterion for switching vmstat work off that is independent
of tickless. It would even work when there are multiple processes running
on the processor, if none of them causes counter updates.
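
A minimal sketch of that criterion, assuming the per-cpu deltas are
available as a plain array: the decision to keep the periodic work armed
looks only at the deltas and deliberately ignores how many tasks are
runnable. The function names and the runnable_tasks parameter are
hypothetical and only there to show the independence from tickless state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if any per-cpu delta is non-zero, i.e. there is a diff to fold. */
static bool sketch_has_pending_diffs(const signed char *diff, size_t n)
{
	size_t i;

	assert(diff != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (diff[i] != 0)
			return true;
	return false;
}

/*
 * Keep the periodic vmstat work armed only while there is something to
 * fold.  runnable_tasks is intentionally unused: the criterion does not
 * depend on whether the CPU could run tickless.
 */
static bool sketch_should_rearm(const signed char *diff, size_t n,
				int runnable_tasks)
{
	(void)runnable_tasks;
	return sketch_has_pending_diffs(diff, n);
}
```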

In the meantime there are additional patches for the vmstat function
pending for merge from me (not related to the conditional running of
vmstat, but they may make it easier to implement). So if you want to do
any work, please do it on top of the newer release available from
Andrew's tree.


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2013-08-28 19:34 UTC | newest]

Thread overview: 28+ messages
2013-06-18 15:23 vmstat kthreads Paul E. McKenney
2013-06-18 17:46 ` Christoph Lameter
2013-06-18 18:26   ` Paul E. McKenney
2013-06-19 14:23     ` Christoph Lameter
2013-06-19 14:50       ` Gilad Ben-Yossef
2013-06-19 14:57         ` Gilad Ben-Yossef
2013-06-20  5:06           ` Gilad Ben-Yossef
2013-06-19 14:59         ` Paul E. McKenney
2013-06-19 20:18           ` Christoph Lameter
2013-06-19 20:02         ` [PATCH v2 1/2] mm: make vmstat_update periodic run conditional Gilad Ben-Yossef
2013-06-19 20:02           ` Gilad Ben-Yossef
2013-06-20 14:05           ` Christoph Lameter
2013-06-20 14:05             ` Christoph Lameter
2013-08-07 18:16             ` Christoph Lameter
2013-08-07 18:16               ` Christoph Lameter
2013-08-08  6:28               ` Gilad Ben-Yossef
2013-08-08  6:54             ` Gilad Ben-Yossef
2013-08-08  6:54               ` Gilad Ben-Yossef
2013-08-08 14:59               ` Christoph Lameter
2013-08-08 14:59                 ` Christoph Lameter
2013-08-09 18:56                 ` Gilad Ben-Yossef
2013-08-09 18:56                   ` Gilad Ben-Yossef
2013-08-28 19:28                   ` Christoph Lameter
2013-08-28 19:28                     ` Christoph Lameter
2013-06-19 20:02         ` [PATCH v2 2/2] mm: add sysctl to pick vmstat monitor cpu Gilad Ben-Yossef
2013-06-19 20:02           ` Gilad Ben-Yossef
2013-06-20 13:58           ` Christoph Lameter
2013-06-20 13:58             ` Christoph Lameter
