All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/4] mm: /proc/sys/vm/stat_refresh skip checking known negative stats
Date: Sun, 28 Feb 2021 16:53:02 -0800	[thread overview]
Message-ID: <YDw67lSx5vLTgx/O@carbon.dhcp.thefacebook.com> (raw)
In-Reply-To: <alpine.LSU.2.11.2102251512170.13363@eggly.anvils>

On Thu, Feb 25, 2021 at 03:14:03PM -0800, Hugh Dickins wrote:
> vmstat_refresh() can occasionally catch nr_zone_write_pending and
> nr_writeback when they are transiently negative.  The reason is partly
> that the interrupt which decrements them in test_clear_page_writeback()
> can come in before __test_set_page_writeback() got to increment them;
> but transient negatives are still seen even when that is prevented, and
> we have not yet resolved why (Roman believes that it is an unavoidable
> consequence of the refresh scheduled on each cpu).  But those stats are
> not buggy, they have never been seen to drift away from 0 permanently:
> so just avoid the annoyance of showing a warning on them.
> 
> Similarly avoid showing a warning on nr_free_cma: CMA users have seen
> that one reported negative from /proc/sys/vm/stat_refresh too, but it
> does drift away permanently: I believe that's because its incrementation
> and decrementation are decided by page migratetype, but the migratetype
> of a pageblock is not guaranteed to be constant.
> 
> Use switch statements so we can most easily add or remove cases later.

I'm OK with the code, but I can't fully agree with the commit log. I don't think
there is any mystery around negative values. Let me copy-paste the explanation
from my original patch:

    These warnings* are generated by the vmstat_refresh() function, which
    assumes that atomic zone and numa counters can't go below zero.  However,
    on a SMP machine it's not quite right: due to per-cpu caching it can in
    theory be as low as -(zone threshold) * NR_CPUs.
    
    For instance, let's say all cma pages are in use and NR_FREE_CMA_PAGES
    reached 0.  Then we've reclaimed a small number of cma pages on each CPU
    except CPU0, so that most percpu NR_FREE_CMA_PAGES counters are slightly
    positive (the atomic counter is still 0).  Then somebody on CPU0 consumes
    all these pages.  The number of pages can easily exceed the threshold and
    a negative value will be committed to the atomic counter.

    * warnings about negative NR_FREE_CMA_PAGES

Actually, the same is almost true for ANY other counter. What differs CMA, dirty
and write pending counters is that they can reach 0 value under normal conditions.
Other counters are usually not reaching values small enough to see negative values
on a reasonable sized machine.

Does it makes sense?

> 
> Link: https://lore.kernel.org/linux-mm/20200714173747.3315771-1-guro@fb.com/
> Reported-by: Roman Gushchin <guro@fb.com>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> 
>  mm/vmstat.c |   15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> --- vmstat2/mm/vmstat.c	2021-02-25 11:56:18.000000000 -0800
> +++ vmstat3/mm/vmstat.c	2021-02-25 12:42:15.000000000 -0800
> @@ -1840,6 +1840,14 @@ int vmstat_refresh(struct ctl_table *tab
>  	if (err)
>  		return err;
>  	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
> +		/*
> +		 * Skip checking stats known to go negative occasionally.
> +		 */
> +		switch (i) {
> +		case NR_ZONE_WRITE_PENDING:
> +		case NR_FREE_CMA_PAGES:
> +			continue;
> +		}
>  		val = atomic_long_read(&vm_zone_stat[i]);
>  		if (val < 0) {
>  			pr_warn("%s: %s %ld\n",
> @@ -1856,6 +1864,13 @@ int vmstat_refresh(struct ctl_table *tab
>  	}
>  #endif
>  	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
> +		/*
> +		 * Skip checking stats known to go negative occasionally.
> +		 */
> +		switch (i) {
> +		case NR_WRITEBACK:
> +			continue;
> +		}
>  		val = atomic_long_read(&vm_node_stat[i]);
>  		if (val < 0) {
>  			pr_warn("%s: %s %ld\n",

  reply	other threads:[~2021-03-01  0:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-25 23:10 [PATCH 1/4] mm: restore node stat checking in /proc/sys/vm/stat_refresh Hugh Dickins
2021-02-25 23:10 ` Hugh Dickins
2021-02-25 23:12 ` [PATCH 2/4] mm: no more EINVAL from /proc/sys/vm/stat_refresh Hugh Dickins
2021-02-25 23:12   ` Hugh Dickins
2021-03-01  0:38   ` Roman Gushchin
2021-02-25 23:14 ` [PATCH 3/4] mm: /proc/sys/vm/stat_refresh skip checking known negative stats Hugh Dickins
2021-02-25 23:14   ` Hugh Dickins
2021-03-01  0:53   ` Roman Gushchin [this message]
2021-03-01 22:08     ` Hugh Dickins
2021-03-01 22:08       ` Hugh Dickins
2021-03-02  0:34       ` Roman Gushchin
2021-03-02  6:03         ` [PATCH v2 " Hugh Dickins
2021-03-02  6:03           ` Hugh Dickins
2021-03-04  2:25           ` Roman Gushchin
2021-02-25 23:15 ` [PATCH 4/4] mm: /proc//sys/vm/stat_refresh stop checking monotonic numa stats Hugh Dickins
2021-02-25 23:15   ` Hugh Dickins
2021-03-01  0:53   ` Roman Gushchin
2021-03-01  0:37 ` [PATCH 1/4] mm: restore node stat checking in /proc/sys/vm/stat_refresh Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDw67lSx5vLTgx/O@carbon.dhcp.thefacebook.com \
    --to=guro@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.