From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: <akpm@linux-foundation.org>,
	Gilad Ben-Yossef <gilad@benyossef.com>,
	Thomas Gleixner <tglx@linutronix.de>, Tejun Heo <tj@kernel.org>,
	John Stultz <johnstul@us.ibm.com>,
	Mike Frysinger <vapier@gentoo.org>,
	Minchan Kim <minchan.kim@gmail.com>,
	Hakan Akkan <hakanakkan@gmail.com>,
	Max Krasnyansky <maxk@qualcomm.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<hughd@google.com>, <viresh.kumar@linaro.org>, <hpa@zytor.com>,
	<mingo@kernel.org>, <peterz@infradead.org>
Subject: Re: vmstat: On demand vmstat workers V8
Date: Wed, 30 Jul 2014 10:57:36 +0800
Message-ID: <53D85F20.7020206@cn.fujitsu.com>
In-Reply-To: <alpine.DEB.2.11.1407100903130.12483@gentwo.org>

Please read on, assuming I understand the semantics of cpu_stat_off correctly.

cpu_stat_off = the set of CPUs that are online && whose vmstat_work is off.
I think some of the code fails to guarantee that every CPU in cpu_stat_off is online.
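
The invariant above can be written down as a small userspace check. This is only a sketch for illustration: cpumask_model_t, stat_off_is_consistent() and check_stat_off() are hypothetical names, not kernel APIs, and the masks are modelled as plain 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a kernel cpumask (up to 64 CPUs). */
typedef uint64_t cpumask_model_t;

/* True if every CPU set in stat_off is also set in online. */
static bool stat_off_is_consistent(cpumask_model_t stat_off,
				   cpumask_model_t online)
{
	return (stat_off & ~online) == 0;
}

/* Userspace analogue of a VM_BUG_ON()-style check on the invariant. */
static void check_stat_off(cpumask_model_t stat_off, cpumask_model_t online)
{
	assert(stat_off_is_consistent(stat_off, online));
}
```

With CPUs 0-2 online (mask 0x7), a stat-off mask of 0x5 is consistent, while 0x8 (CPU 3, not online) is not.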

Thanks,
Lai

On 07/10/2014 10:04 PM, Christoph Lameter wrote:

> +
> +/*
> + * Shepherd worker thread that checks the
> + * differentials of processors that have their worker
> + * threads for vm statistics updates disabled because of
> + * inactivity.
> + */
> +static void vmstat_shepherd(struct work_struct *w);
> +
> +static DECLARE_DELAYED_WORK(shepherd, vmstat_shepherd);
> +
> +static void vmstat_shepherd(struct work_struct *w)
> +{
> +	int cpu;
> +
> +	/* Check processors whose vmstat worker threads have been disabled */

I think the bug is here: the shepherd re-queues per_cpu(vmstat_work, cpu) for a CPU
that is offline (after vmstat_cpuup_callback(CPU_DOWN_PREPARE)).  In addition,
cpu_stat_off is accessed without proper locking.

I suggest using get_online_cpus() or a new cpu_stat_off_mutex to protect it:


	get_online_cpus(); /* or mutex_lock(&cpu_stat_off_mutex); */
	for_each_cpu(cpu, cpu_stat_off)
		if (need_update(cpu) &&
			cpumask_test_and_clear_cpu(cpu, cpu_stat_off))
			schedule_delayed_work_on(cpu, &per_cpu(vmstat_work, cpu),
				__round_jiffies_relative(sysctl_stat_interval, cpu));
	put_online_cpus(); /* or mutex_unlock(&cpu_stat_off_mutex); */



> +
> +
> +	schedule_delayed_work(&shepherd,
>  		round_jiffies_relative(sysctl_stat_interval));
> +
>  }
> 
> -static void start_cpu_timer(int cpu)
> +static void __init start_shepherd_timer(void)
>  {
> -	struct delayed_work *work = &per_cpu(vmstat_work, cpu);
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu)
> +		INIT_DEFERRABLE_WORK(per_cpu_ptr(&vmstat_work, cpu),
> +			vmstat_update);
> +
> +	cpu_stat_off = kmalloc(cpumask_size(), GFP_KERNEL);
> +	cpumask_copy(cpu_stat_off, cpu_online_mask);
> 
> -	INIT_DEFERRABLE_WORK(work, vmstat_update);
> -	schedule_delayed_work_on(cpu, work, __round_jiffies_relative(HZ, cpu));
> +	schedule_delayed_work(&shepherd,
> +		round_jiffies_relative(sysctl_stat_interval));
>  }
> 
>  static void vmstat_cpu_dead(int node)
> @@ -1272,17 +1373,17 @@ static int vmstat_cpuup_callback(struct
>  	case CPU_ONLINE:
>  	case CPU_ONLINE_FROZEN:
>  		refresh_zone_stat_thresholds();
> -		start_cpu_timer(cpu);
>  		node_set_state(cpu_to_node(cpu), N_CPU);
> +		cpumask_set_cpu(cpu, cpu_stat_off);
>  		break;
>  	case CPU_DOWN_PREPARE:
>  	case CPU_DOWN_PREPARE_FROZEN:
> -		cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> -		per_cpu(vmstat_work, cpu).work.func = NULL;
> +		if (!cpumask_test_and_set_cpu(cpu, cpu_stat_off))
> +			cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));

I suggest that cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu)) be called
unconditionally, and that the CPU be cleared from cpu_stat_off.  (You set it here,
which is a bug according to vmstat_shepherd() and the semantics of cpu_stat_off.)

	/* mutex_lock(&cpu_stat_off_mutex); */
		/* only if you use cpu_stat_off_mutex instead of get_online_cpus() */
	cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
	cpumask_clear_cpu(cpu, cpu_stat_off);
	/* mutex_unlock(&cpu_stat_off_mutex); */

	/* Don't forget to take cpu_stat_off_mutex at every other place that
	   accesses cpu_stat_off, except the one in vmstat_update(), which
	   is protected by cancel_delayed_work_sync() plus other mechanisms.
	   Please also update the comments there and keep that VM_BUG_ON(). */
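
To make the intended ordering concrete, here is a minimal userspace sketch of the mutex variant. It is an assumption-laden model, not kernel code: the pthread mutex stands in for cpu_stat_off_mutex, the bool arrays stand in for the cpumasks and the per-CPU delayed work, and every model_* name is hypothetical. It also folds the later "CPU goes offline" transition into the down-prepare step for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 8	/* hypothetical size for this sketch */

/* Hypothetical userspace stand-ins for the kernel state. */
static pthread_mutex_t stat_off_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cpu_online_model[NR_CPUS_MODEL];
static bool stat_off_model[NR_CPUS_MODEL];	/* online && worker off */
static bool work_queued_model[NR_CPUS_MODEL];	/* per-CPU work pending */

/* CPU comes online: its worker starts off; the shepherd may enable it. */
static void model_cpu_online(int cpu)
{
	pthread_mutex_lock(&stat_off_lock);
	cpu_online_model[cpu] = true;
	stat_off_model[cpu] = true;
	pthread_mutex_unlock(&stat_off_lock);
}

/* Down-prepare as suggested: cancel unconditionally, then clear the bit. */
static void model_cpu_down_prepare(int cpu)
{
	pthread_mutex_lock(&stat_off_lock);
	work_queued_model[cpu] = false;	/* models cancel_delayed_work_sync() */
	stat_off_model[cpu] = false;
	cpu_online_model[cpu] = false;	/* folded in from the later offline step */
	pthread_mutex_unlock(&stat_off_lock);
}

/* Shepherd pass: re-queue work only for CPUs still in the stat-off set. */
static void model_shepherd(const bool need_update[NR_CPUS_MODEL])
{
	pthread_mutex_lock(&stat_off_lock);
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (stat_off_model[cpu] && need_update[cpu]) {
			assert(cpu_online_model[cpu]);	/* the invariant */
			stat_off_model[cpu] = false;
			work_queued_model[cpu] = true;
		}
	}
	pthread_mutex_unlock(&stat_off_lock);
}
```

Because the down-prepare step clears the bit under the same lock the shepherd takes, a shepherd pass that runs afterwards never sees the departing CPU and so never re-queues its work.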


>  		break;
>  	case CPU_DOWN_FAILED:
>  	case CPU_DOWN_FAILED_FROZEN:
> -		start_cpu_timer(cpu);
> +		cpumask_set_cpu(cpu, cpu_stat_off);
>  		break;
>  	case CPU_DEAD:
>  	case CPU_DEAD_FROZEN:
> @@ -1302,15 +1403,10 @@ static struct notifier_block vmstat_noti
>  static int __init setup_vmstat(void)
>  {
>  #ifdef CONFIG_SMP
> -	int cpu;
> -
>  	cpu_notifier_register_begin();
>  	__register_cpu_notifier(&vmstat_notifier);
> 
> -	for_each_online_cpu(cpu) {
> -		start_cpu_timer(cpu);
> -		node_set_state(cpu_to_node(cpu), N_CPU);
> -	}
> +	start_shepherd_timer();
>  	cpu_notifier_register_done();
>  #endif
>  #ifdef CONFIG_PROC_FS



