All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Roman Gushchin <guro@fb.com>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, kernel-team@fb.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings
Date: Thu, 6 Aug 2020 09:41:05 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.11.2008060824380.9630@eggly.anvils> (raw)
In-Reply-To: <20200806035140.GA1162560@carbon.dhcp.thefacebook.com>

On Wed, 5 Aug 2020, Roman Gushchin wrote:
> On Wed, Aug 05, 2020 at 08:01:33PM -0700, Hugh Dickins wrote:
> > 
> > I shall certainly want to reintroduce those stats to checking for
> > negatives, even if it's in a patch that never earns your approval,
> > and just ends up kept internal for debugging.  But equally certainly,
> > I must not suddenly reintroduce that checking without gaining some
> > experience of it (and perhaps getting as irritated as you by more
> > transient negatives).
> > 
> > I said earlier that I'd prefer you to rip out all that checking for
> > negatives, rather than retaining it with the uselessly over-generous
> > 125 * nr_cpus leeway.  Please, Roman, would you send Andrew a patch
> > doing that, to replace the patch in this thread?  Or if you prefer,
> > I can do so.
> 
> Sure, I can do it. But can you, please, explain why you want them to be
> eliminated? Is this because they will never fire, you think?

Yes, I've never seen a machine on which vm/stat_refresh reported -16000
or less, or anything approaching that (true, this laptop does not have
128 cpus, but...).  Maybe they get reinstalled or otherwise rebooted
before reaching numbers of that magnitude.  Maybe having the warnings
shown at much lower magnitudes has helped to root-cause and eliminate
some of them.

Waiting until the heat-death-of-the-universe theoretical worst case is
so unhelpful as to defeat the purpose of the warning.  I think you do
understand, but perhaps not all readers of this thread understand,
that vm/stat_refresh merges all per-cpu counts into the global atomic
immediately before deciding negative.  The only problem is that
"immediately" is not instantaneous across cpus, so the possibility
of work started on one cpu but completed on another during the course
of the refresh, causing false negatives, is real though not great.

> 
> In my humble opinion they might be quite useful: any systematic under- or
> overflow will eventually trigger this warning, and we'll know for sure that
> something is wrong. So I'd add a similar check for node counters without
> any hesitation.

It's true that while developing, we can all make mistakes so big that
"eventually" will show up quickly, and even that warning could help.
But since you'll only show them when they reach -16000 (sorry, I keep
going back to the 128 cpu case, just to make it more vivid), it won't
be of any use to catch races later in development, or in production.

Whereas my own patch would just fix the missing items, and continue
to annoy you with occasional ignorable warnings - so I cannot submit
that, and wouldn't want to submit it right now, without having tried
it out for a while, to check what kind of noise it will generate.

So I thought it best, either to leave mm/vmstat.c alone for the moment,
or else just delete the disputed and incomplete code; coming back to
it later when we have something we can agree upon.

But I think you're preferring to resubmit your 125*nr_cpus patch to akpm,
with the missing NR_VM_NODE_STAT_ITEMS added in (as either one or two
patches), with foreshortened testing but the reassurance that since
it's so hard to reach the point of showing the warnings, new negatives
will either be quiet, or easily fixed before v5.10 released: okay,
I can more easily Hack that to my preference than Ack it or Nak it.

Hugh

  reply	other threads:[~2020-08-06 16:49 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-14 17:39 [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings Roman Gushchin
2020-07-20  8:03 ` Michal Hocko
2020-07-20 20:20   ` Roman Gushchin
2020-07-30  3:45 ` Hugh Dickins
2020-07-30  3:45   ` Hugh Dickins
2020-07-30 16:23   ` Roman Gushchin
2020-07-31  4:06     ` Hugh Dickins
2020-07-31  4:06       ` Hugh Dickins
2020-08-01  1:18       ` Roman Gushchin
2020-08-01  2:17         ` Hugh Dickins
2020-08-01  2:17           ` Hugh Dickins
2020-08-04  0:40           ` Roman Gushchin
2020-08-06  3:01             ` Hugh Dickins
2020-08-06  3:01               ` Hugh Dickins
2020-08-06  3:51               ` Roman Gushchin
2020-08-06 16:41                 ` Hugh Dickins [this message]
2020-08-06 16:41                   ` Hugh Dickins
2020-08-06 23:38               ` Roman Gushchin
2020-08-07  0:16                 ` Hugh Dickins
2020-08-07  0:16                   ` Hugh Dickins
2020-08-07  1:25                 ` Andrew Morton
2021-02-24  7:24                   ` Hugh Dickins
2021-02-24  7:24                     ` Hugh Dickins
2021-02-25  1:53                     ` Roman Gushchin
2021-02-25 17:21                       ` Hugh Dickins
2021-02-25 17:21                         ` Hugh Dickins
2021-02-25 18:06                         ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.2008060824380.9630@eggly.anvils \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.