All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Christoph Lameter <cl@linux.com>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm, vmstat: make quiet_vmstat lighter (was: Re: fast path cycle muncher (vmstat: make vmstat_updater deferrable) again and shut down on idle)
Date: Wed, 27 Jan 2016 18:03:54 +0100	[thread overview]
Message-ID: <1453914234.3517.32.camel@gmail.com> (raw)
In-Reply-To: <20160127164825.GF13951@dhcp22.suse.cz>

On Wed, 2016-01-27 at 17:48 +0100, Michal Hocko wrote:
> On Sat 23-01-16 17:21:55, Mike Galbraith wrote:
> > Hi Christoph,
> > 
> > While you're fixing that commit up, can you perhaps find a better
> > home
> > for quiet_vmstat()?  It not only munches cycles when switching
> > cross
> > -core mightily, for -rt it injects a sleeping lock into the idle
> > task.
> > 
> >     12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
> >      4.75%  [kernel]       [k] __schedule                  
> >      4.70%  [kernel]       [k] mutex_unlock                
> >      3.14%  [kernel]       [k] __switch_to                 
> 
> What about the following fix?

I haven't done _extensive_ testing, but 4.5-rc1-rt (with NO_HZ_FULL
enabling hacks restored, as it's otherwise a dead .config item) is
gripe free whether nohz_full CPUs are in use or not, and overhead went
away in it and desktop configured master kernel (PREEMPT_VOLUNTARY).

> ---
> From c74a04c4fdfe1fa67933bb1ac83d3de0532aaab2 Mon Sep 17 00:00:00
> 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Wed, 27 Jan 2016 17:24:22 +0100
> Subject: [PATCH] mm, vmstat: make quiet_vmstat lighter
> 
> Mike has reported a considerable overhead of refresh_cpu_vm_stats
> from
> the idle entry during pipe test:
>     12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
>      4.75%  [kernel]       [k] __schedule
>      4.70%  [kernel]       [k] mutex_unlock
>      3.14%  [kernel]       [k] __switch_to
> 
> This is caused by 0eb77e988032 ("vmstat: make vmstat_updater
> deferrable
> again and shut down on idle") which has placed quiet_vmstat into
> cpu_idle_loop. The main reason here seems to be that the idle entry
> has
> to get over all zones and perform atomic operations for each vmstat
> entry even though there might be no per cpu diffs. This is a
> pointless
> overhead for _each_ idle entry.
> 
> Make sure that quiet_vmstat is as light as possible.
> 
>  First of all it doesn't make any sense to do any local sync if the
> current cpu is already set in oncpu_stat_off because vmstat_update
> puts
> itself there only if there is nothing to do.
> 
> Then we can check need_update which should be a cheap way to check
> for
> potential per-cpu diffs and only then do refresh_cpu_vm_stats.
> 
> The original patch also did cancel_delayed_work which we are not
> doing
> here. There are two reasons for that. Firstly cancel_delayed_work
> from
> idle context will blow up on RT kernels (reported by Mike):
> [    2.279582] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
> [    2.280444] Hardware name: MEDION MS-7848/MS-7848, BIOS
> M7848W08.20C 09/23/2013
> [    2.281316]  ffff88040b00d640 ffff88040b01fe10 ffffffff812d20e2
> 0000000000000000
> [    2.282202]  ffff88040b01fe30 ffffffff81081095 ffff88041ec4cee0
> ffff88041ec501e0
> [    2.283073]  ffff88040b01fe48 ffffffff815ff910 ffff88041ec4cee0
> ffff88040b01fe88
> [    2.283941] Call Trace:
> [    2.284797]  [<ffffffff812d20e2>] dump_stack+0x49/0x67
> [    2.285658]  [<ffffffff81081095>] ___might_sleep+0xf5/0x180
> [    2.286521]  [<ffffffff815ff910>] rt_spin_lock+0x20/0x50
> [    2.287382]  [<ffffffff81075919>] try_to_grab_pending+0x69/0x240
> [    2.288239]  [<ffffffff81075b16>] cancel_delayed_work+0x26/0xe0
> [    2.289094]  [<ffffffff8115ec05>] quiet_vmstat+0x75/0xa0
> [    2.289949]  [<ffffffff8109ab38>] cpu_idle_loop+0x38/0x3e0
> [    2.290800]  [<ffffffff8109aef3>] cpu_startup_entry+0x13/0x20
> [    2.291647]  [<ffffffff81036164>] start_secondary+0x114/0x140
> 
> And secondly, even on !RT kernels it might add some non trivial
> overhead
> which is not necessary. Even if the vmstat worker wakes up and
> preempts
> idle then it will be most likely a single shot noop because the stats
> were already synced and so it would end up on the oncpu_stat_off
> anyway.
> We just need to teach both vmstat_shepherd and vmstat_update to stop
> scheduling the worker if there is nothing to do.
> 
> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/vmstat.c | 60 ++++++++++++++++++++++++++++++++++++++++-----------
> ---------
>  1 file changed, 40 insertions(+), 20 deletions(-)
> 
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 40b2c74ddf16..7747f85537b6 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1396,10 +1396,15 @@ static void vmstat_update(struct work_struct
> *w)
>  		 * Counters were updated so we expect more updates
>  		 * to occur in the future. Keep on running the
>  		 * update worker thread.
> +		 * If we were marked on cpu_stat_off clear the flag
> +		 * so that vmstat_shepherd doesn't schedule us
> again.
>  		 */
> -		queue_delayed_work_on(smp_processor_id(), vmstat_wq,
> -			this_cpu_ptr(&vmstat_work),
> -			round_jiffies_relative(sysctl_stat_interval)
> );
> +		if (!cpumask_test_and_clear_cpu(smp_processor_id(),
> +						cpu_stat_off)) {
> +			queue_delayed_work_on(smp_processor_id(),
> vmstat_wq,
> +				this_cpu_ptr(&vmstat_work),
> +				round_jiffies_relative(sysctl_stat_i
> nterval));
> +		}
>  	} else {
>  		/*
>  		 * We did not update any counters so the app may be
> in
> @@ -1417,18 +1422,6 @@ static void vmstat_update(struct work_struct
> *w)
>   * until the diffs stay at zero. The function is used by NOHZ and
> can only be
>   * invoked when tick processing is not active.
>   */
> -void quiet_vmstat(void)
> -{
> -	if (system_state != SYSTEM_RUNNING)
> -		return;
> -
> -	do {
> -		if (!cpumask_test_and_set_cpu(smp_processor_id(),
> cpu_stat_off))
> -			cancel_delayed_work(this_cpu_ptr(&vmstat_wor
> k));
> -
> -	} while (refresh_cpu_vm_stats(false));
> -}
> -
>  /*
>   * Check if the diffs for a certain cpu indicate that
>   * an update is needed.
> @@ -1452,6 +1445,30 @@ static bool need_update(int cpu)
>  	return false;
>  }
>  
> +void quiet_vmstat(void)
> +{
> +	if (system_state != SYSTEM_RUNNING)
> +		return;
> +
> +	/*
> +	 * If we are already in hands of the shepherd then there
> +	 * is nothing for us to do here.
> +	 */
> +	if (cpumask_test_and_set_cpu(smp_processor_id(),
> cpu_stat_off))
> +		return;
> +
> +	if (!need_update(smp_processor_id()))
> +		return;
> +
> +	/*
> +	 * Just refresh counters and do not care about the pending
> delayed
> +	 * vmstat_update. It doesn't fire that often to matter and
> canceling
> +	 * it would be too expensive from this path.
> +	 * vmstat_shepherd will take care about that for us.
> +	 */
> +	refresh_cpu_vm_stats(false);
> +}
> +
>  
>  /*
>   * Shepherd worker thread that checks the
> @@ -1470,11 +1487,14 @@ static void vmstat_shepherd(struct
> work_struct *w)
>  	get_online_cpus();
>  	/* Check processors whose vmstat worker threads have been
> disabled */
>  	for_each_cpu(cpu, cpu_stat_off)
> -		if (need_update(cpu) &&
> -			cpumask_test_and_clear_cpu(cpu,
> cpu_stat_off))
> -
> -			queue_delayed_work_on(cpu, vmstat_wq,
> -				&per_cpu(vmstat_work, cpu), 0);
> +		if (need_update(cpu)) {
> +			if (cpumask_test_and_clear_cpu(cpu,
> cpu_stat_off))
> +				queue_delayed_work_on(cpu,
> vmstat_wq,
> +					&per_cpu(vmstat_work, cpu),
> 0);
> +		} else {
> +			cpumask_set_cpu(cpu, cpu_stat_off);
> +			cancel_delayed_work(this_cpu_ptr(&vmstat_wor
> k));
> +		}
>  
>  	put_online_cpus();
>  
> -- 
> 2.7.0.rc3
> 

  reply	other threads:[~2016-01-27 17:04 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-19  0:33 mm, vmstat: kernel BUG at mm/vmstat.c:1408! Sasha Levin
2015-12-19  0:33 ` Sasha Levin
2015-12-21 13:08 ` Christoph Lameter
2015-12-21 13:08   ` Christoph Lameter
2015-12-21 20:28   ` Sasha Levin
2015-12-21 20:28     ` Sasha Levin
2015-12-21 21:07     ` Sasha Levin
2015-12-21 21:07       ` Sasha Levin
2015-12-21 21:14       ` Christoph Lameter
2015-12-21 21:14         ` Christoph Lameter
2015-12-22 17:21         ` Christoph Lameter
2015-12-22 17:21           ` Christoph Lameter
2015-12-24 20:14         ` Sasha Levin
2015-12-29 17:01           ` Christoph Lameter
2015-12-29 17:01             ` Christoph Lameter
2015-12-29 17:18             ` Christoph Lameter
2015-12-29 17:18               ` Christoph Lameter
2016-01-04 18:05           ` Christoph Lameter
2016-01-04 18:05             ` Christoph Lameter
2016-01-04 18:46             ` Sasha Levin
2016-01-04 18:46               ` Sasha Levin
2016-01-12 11:31   ` Shiraz Hashim
2016-01-12 11:31     ` Shiraz Hashim
2016-01-12 12:23     ` Christoph Lameter
2016-01-12 12:23       ` Christoph Lameter
2016-01-12 12:27       ` Shiraz Hashim
2016-01-12 12:27         ` Shiraz Hashim
2016-01-13 11:36       ` Shiraz Hashim
2016-01-13 11:36         ` Shiraz Hashim
2016-01-13 12:32         ` Shiraz Hashim
2016-01-13 12:32           ` Shiraz Hashim
2016-01-14 21:06         ` Sasha Levin
2016-01-14 21:06           ` Sasha Levin
2016-01-20 14:37 ` Michal Hocko
2016-01-20 14:37   ` Michal Hocko
2016-01-20 14:56   ` Sasha Levin
2016-01-20 14:56     ` Sasha Levin
2016-01-20 15:10     ` Michal Hocko
2016-01-20 15:10       ` Michal Hocko
2016-01-20 15:20       ` Christoph Lameter
2016-01-20 15:20         ` Christoph Lameter
2016-01-20 15:49         ` Sasha Levin
2016-01-20 15:49           ` Sasha Levin
2016-01-20 15:55           ` Christoph Lameter
2016-01-20 15:55             ` Christoph Lameter
2016-01-20 21:28             ` Michal Hocko
2016-01-20 21:28               ` Michal Hocko
2016-01-20 21:57               ` Christoph Lameter
2016-01-20 21:57                 ` Christoph Lameter
2016-01-21  8:24                 ` Michal Hocko
2016-01-21  8:24                   ` Michal Hocko
2016-01-21 15:45                   ` Christoph Lameter
2016-01-21 15:45                     ` Christoph Lameter
2016-01-21 16:51                     ` Michal Hocko
2016-01-21 16:51                       ` Michal Hocko
2016-01-21 17:38                       ` Christoph Lameter
2016-01-21 17:38                         ` Christoph Lameter
2016-01-22 11:00                         ` Shiraz Hashim
2016-01-22 11:00                           ` Shiraz Hashim
2016-01-22 14:04                         ` Michal Hocko
2016-01-22 14:04                           ` Michal Hocko
2016-01-22 16:07                           ` Christoph Lameter
2016-01-22 16:07                             ` Christoph Lameter
2016-01-22 16:12                             ` Michal Hocko
2016-01-22 16:12                               ` Michal Hocko
2016-01-22 16:46                               ` Christoph Lameter
2016-01-22 16:46                                 ` Christoph Lameter
2016-01-22 17:12                                 ` Michal Hocko
2016-01-22 17:12                                   ` Michal Hocko
2016-01-23 16:21                                 ` fast path cycle muncher (vmstat: make vmstat_updater deferrable again and shut down on idle) Mike Galbraith
2016-01-24  0:33                                   ` Christoph Lameter
2016-01-24  2:46                                     ` Mike Galbraith
2016-01-24  3:46                                       ` Christoph Lameter
2016-01-24  5:36                                         ` Mike Galbraith
2016-01-25 17:42                                   ` Michal Hocko
2016-01-25 18:02                                     ` Christoph Lameter
2016-01-25 20:13                                       ` Michal Hocko
2016-01-26 16:25                                         ` Christoph Lameter
2016-01-26 18:31                                           ` Mike Galbraith
2016-01-26 18:34                                             ` Christoph Lameter
2016-01-26 18:45                                               ` Mike Galbraith
2016-01-26 19:20                                                 ` Christoph Lameter
2016-01-27  3:12                                                   ` Mike Galbraith
2016-01-27  4:15                                                     ` Mike Galbraith
2016-01-27 16:28                                                     ` Christoph Lameter
2016-01-28 15:36                                                       ` Frederic Weisbecker
2016-01-28 16:42                                                         ` Christoph Lameter
2016-01-26  2:14                                       ` Mike Galbraith
2016-01-26  2:25                                         ` Mike Galbraith
2016-01-26 16:26                                           ` Christoph Lameter
2016-01-26 17:39                                             ` Mike Galbraith
2016-01-26 18:19                                               ` Christoph Lameter
2016-01-26 16:26                                         ` Christoph Lameter
2016-01-26 17:08                                           ` Mike Galbraith
2016-01-26 18:22                                             ` Christoph Lameter
2016-01-26 19:09                                               ` Mike Galbraith
2016-01-26 19:22                                                 ` Christoph Lameter
2016-01-27 16:48                                   ` [PATCH] mm, vmstat: make quiet_vmstat lighter (was: Re: fast path cycle muncher (vmstat: make vmstat_updater deferrable) " Michal Hocko
2016-01-27 17:03                                     ` Mike Galbraith [this message]
2016-01-27 18:26                                     ` Christoph Lameter
2016-01-28 15:21                                       ` Michal Hocko
2016-01-28 16:40                                         ` Christoph Lameter
2016-01-28 16:53                                           ` Michal Hocko
2016-01-28 17:05                                             ` Christoph Lameter
2016-01-28 15:31                                       ` Michal Hocko
2016-01-28 15:37                                         ` [PATCH] vmstat: make vmstat_update deferrable (was: Re: [PATCH] mm, vmstat: make quiet_vmstat lighter) " Michal Hocko
2016-01-28 16:48                                           ` Christoph Lameter
2016-01-28 16:42                                         ` [PATCH] mm, vmstat: make quiet_vmstat lighter (was: Re: fast " Christoph Lameter
2016-01-24 16:57                                 ` mm, vmstat: kernel BUG at mm/vmstat.c:1408! Linus Torvalds
2016-01-24 16:57                                   ` Linus Torvalds
2016-01-20 15:14   ` Christoph Lameter
2016-01-20 15:14     ` Christoph Lameter
2016-01-20 15:20     ` Michal Hocko
2016-01-20 15:20       ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1453914234.3517.32.camel@gmail.com \
    --to=umgwanakikbuti@gmail.com \
    --cc=cl@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.