From: Roman Gushchin <guro@fb.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Vlastimil Babka <vbabka@suse.cz>, <linux-mm@kvack.org>,
<kernel-team@fb.com>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings
Date: Fri, 31 Jul 2020 18:18:21 -0700
Message-ID: <20200801011821.GA859734@carbon.dhcp.thefacebook.com>
In-Reply-To: <alpine.LSU.2.11.2007302018350.2410@eggly.anvils>

On Thu, Jul 30, 2020 at 09:06:55PM -0700, Hugh Dickins wrote:
> On Thu, 30 Jul 2020, Roman Gushchin wrote:
> > On Wed, Jul 29, 2020 at 08:45:47PM -0700, Hugh Dickins wrote:
> > >
> > > But a better idea is perhaps to redefine the behavior of
> > > "echo >/proc/sys/vm/stat_refresh". What if
> > > "echo someparticularstring >/proc/sys/vm/stat_refresh" were to
> > > disable or enable the warning (permanently? or just that time?):
> > > disable would be more "back-compatible", but I think it's okay
> > > if you prefer enable. Or "someparticularstring" could actually
> > > specify the warning threshold you want to use - you might echo
> > > 125 or 16000, I might echo 0. We can haggle over the default.
> >
> > May I ask what kind of problems you have in mind,
> > which can be revealed by these warnings? Or maybe there is some
> > history attached?
>
> Yes: 52b6f46bc163 mentions finding a bug of mine in NR_ISOLATED_FILE
> accounting, but IIRC (though I might be making this up) there was
> also a bug in the NR_ACTIVE or NR_INACTIVE FILE or ANON accounting.
>
> When one of the stats used for balancing or limiting in vmscan.c
> trends increasingly negative, it becomes increasingly difficult
> for those heuristics (adding on to others, comparing with others)
> to do what they're intended to do: they behave increasingly weirdly.
>
> Now the same (or the opposite) is true if one of those stats trends
> increasingly positive: but if it leaks positive, it's visible in
> /proc/vmstat; whereas if it leaks negative, it's presented there as 0.
>
> And most of the time (when unsynchronized) showing 0 is much better
> than showing a transient negative. But to help fix bugs, we do need
> some way of seeing the negatives, and vm/stat_refresh provides an
> opportunity to do so, when it synchronizes.
>
> I'd be glad not to show the transients if I knew them: set a flag
> on any that go negative, and only show if negative twice or more
> in a row? Perhaps, but I don't relish adding that, and think it
> would be over-engineering.
>
> It does sound to me like echoing the warning threshold into
> /proc/sys/vm/stat_refresh is the best way to satisfy us both.
>
> Though another alternative did occur to me overnight: we could
> scrap the logged warning, and show "nr_whatever -53" as output
> from /proc/sys/vm/stat_refresh: that too would be acceptable
> to me, and you redirect to /dev/null.

It sounds like a good idea to me. Do you want me to prepare a patch?

>
> (Why did I choose -53 in my example? An in-joke: when I looked
> through our machines for these warnings, on old kernels with my
> old shmem hugepage implementation, there were a striking number
> with "nr_shmem_freeholes -53"; but I'm a few years too late to
> investigate what was going on there.)

:)

>
> >
> > If it's all about some particular counters, which are known to be
> > strictly positive, maybe we should do the opposite, and check only
> > those counters? Because in general it's not an indication of a problem.
>
> Yet it's very curious how few stats ever generate such warnings:
> you're convinced they're just transient noise, and you're probably right;
> but I am a little suspicious of whether they are accounted correctly.

Yeah, I was initially very suspicious too, but I didn't find any issues
and still think it's not an indication of a problem.

Thank you!
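
The two options discussed in this thread (a warning threshold echoed into
/proc/sys/vm/stat_refresh, or printing lines such as "nr_whatever -53" as
its output) can be modeled roughly as below. This is only a userspace
sketch in Python, not kernel code: the function names shown_in_vmstat and
refresh_report are hypothetical, the counter values are made up for
illustration, and the "more negative than -threshold" rule is an assumed
reading of the threshold proposal.

```python
from typing import Dict, List


def shown_in_vmstat(value: int) -> int:
    """Model of the /proc/vmstat presentation: a negative value reads as 0."""
    return max(value, 0)


def refresh_report(counters: Dict[str, int], threshold: int = 0) -> List[str]:
    """Hypothetical model of a synchronized stat_refresh pass.

    Returns one "name value" line for each counter that is more negative
    than -threshold. With threshold=0 every negative counter is reported.
    """
    return [
        f"{name} {value}"
        for name, value in counters.items()
        if value < -threshold
    ]


# Illustrative values only; not taken from any real machine.
counters = {
    "nr_shmem_freeholes": -53,
    "nr_isolated_file": -2,
    "nr_active_file": 1200,
}

# What a reader of /proc/vmstat would see: negatives are hidden as 0.
print({name: shown_in_vmstat(v) for name, v in counters.items()})

# What a synchronized refresh could report, with and without a threshold.
print(refresh_report(counters))                 # every negative counter
print(refresh_report(counters, threshold=125))  # only large negatives
```

With a threshold of 0 the sketch reports every negative counter, which
matches the "I might echo 0" case; a larger threshold such as 125 filters
out small values that are likely to be transient.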