linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, <linux-mm@kvack.org>,
	<kernel-team@fb.com>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings
Date: Thu, 6 Aug 2020 16:38:04 -0700	[thread overview]
Message-ID: <20200806233804.GB1217906@carbon.dhcp.thefacebook.com> (raw)
In-Reply-To: <alpine.LSU.2.11.2008051913580.8184@eggly.anvils>


On Wed, Aug 05, 2020 at 08:01:33PM -0700, Hugh Dickins wrote:
> On Mon, 3 Aug 2020, Roman Gushchin wrote:
> > On Fri, Jul 31, 2020 at 07:17:05PM -0700, Hugh Dickins wrote:
> > > On Fri, 31 Jul 2020, Roman Gushchin wrote:
> > > > On Thu, Jul 30, 2020 at 09:06:55PM -0700, Hugh Dickins wrote:
> > > > > 
> > > > > Though another alternative did occur to me overnight: we could
> > > > > scrap the logged warning, and show "nr_whatever -53" as output
> > > > > from /proc/sys/vm/stat_refresh: that too would be acceptable
> > > > > to me, and you redirect to /dev/null.
> > > > 
> > > > It sounds like a good idea to me. Do you want me to prepare a patch?
> > > 
> > > Yes, if you like that one best, please do prepare a patch - thanks!
> > 
> > Hi Hugh,
> > 
> > I mastered a patch (attached below), but honestly I can't say I like it.
> > The resulting interface is confusing: we don't generally use sysctls to
> > print debug data and/or warnings.
> 
> Since you confessed to not liking it yourself, I paid it very little
> attention.  Yes, when I made that suggestion, I wasn't really thinking
> of how stat_refresh is a /proc/sys/vm sysctl thing; and I'm not at all
> sure how issuing output from a /proc file intended for input works out
> (perhaps there are plenty of good examples, and you followed one, but
> it smells fishy to me now).
> 
> > 
> > I thought about treating a write to this sysctls as setting the threshold,
> > so that "echo 0 > /proc/sys/vm/stat_refresh" would warn on all negative
> > entries, and "cat /proc/sys/vm/stat_refresh" would use the default threshold
> > as in my patch. But this breaks  to some extent the current ABI, as passing
> > an incorrect value will result in -EINVAL instead of passing (as now).
> 
> I expect we could handle that well enough, by more lenient validation
> of the input; though my comment above on output versus input sheds doubt.
> 
> > 
> > Overall I still think we shouldn't warn on any values inside the possible
> > range, as it's not an indication of any kind of error. The only reason
> > why we see some values going negative and some not, is that some of them
> > are updated more frequently than others, and some are bouncing around
> > zero, while other can't reach zero too easily (like the number of free pages).
> 
> We continue to disagree on that (and it amuses me that you who are so
> sure they can be ignored, cannot ignore them; whereas I who am so curious
> to investigate them, have not actually found the time to do so in years).
> It was looking as if nothing could satisfy us both, but...
> 
> > 
> > Actually, if someone wants to ensure that numbers are accurate,
> > we have to temporarily set the threshold to 0, then flush the percpu data
> > and only then check atomics. In the current design flushing percpu data
> > matters for only slowly updated counters, as all others will run away while
> > we're waiting for the flush. So if we're targeting some slowly updating
> > counters, maybe we should warn only on them being negative, Idk.
> 
> I was going to look into that angle, though it would probably add a little
> unjustifiable overhead to fast paths, and be rejected on that basis.
> 
> But in going to do so, came up against an earlier comment of yours, of
> which I had misunderstood the significance. I had said and you replied:
> 
> > > nr_zone_write_pending: yes, I've looked at our machines, and see that
> > > showing up for us too (-49 was the worst I saw).  Not at all common,
> > > but seen.  And not followed by increasingly worse numbers, so a state
> > > that corrects itself.  nr_dirty too (fewer instances, bigger numbers);
> > > but never nr_writeback, which you'd expect to go along with those.
> > 
> > NR_DIRTY and NR_WRITEBACK are node counters, so we don't check them?
> 
> Wow. Now I see what you were pointing out: when v4.8's 75ef71840539
> ("mm, vmstat: add infrastructure for per-node vmstats") went in, it
> missed updating vmstat_refresh() to check all the NR_VM_NODE_STAT items.
> 
> And I've never noticed, and have interpreted its silence on those items
> as meaning they're all good (and the nr_dirty ones I mentioned above,
> must have been from residual old kernels, hence the fewer instances).
> I see the particularly tricky NR_ISOLATED ones are in that category.
> Maybe they are all good, but I have been mistaken.
> 
> I shall certainly want to reintroduce those stats to checking for
> negatives, even if it's in a patch that never earns your approval,
> and just ends up kept internal for debugging.  But equally certainly,
> I must not suddenly reintroduce that checking without gaining some
> experience of it (and perhaps getting as irritated as you by more
> transient negatives).
> 
> I said earlier that I'd prefer you to rip out all that checking for
> negatives, rather than retaining it with the uselessly over-generous
> 125 * nr_cpus leeway.  Please, Roman, would you send Andrew a patch
> doing that, to replace the patch in this thread?  Or if you prefer,
> I can do so.

Hi Andrew,

it seems that Hugh and me haven't reached a consensus here.
Can, you, please, not merge this patch into 5.9, so we would have
more time to find a solution, acceptable for all?

Thank you!

Roman


  parent reply	other threads:[~2020-08-06 23:38 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-14 17:39 [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings Roman Gushchin
2020-07-20  8:03 ` Michal Hocko
2020-07-20 20:20   ` Roman Gushchin
2020-07-30  3:45 ` Hugh Dickins
2020-07-30 16:23   ` Roman Gushchin
2020-07-31  4:06     ` Hugh Dickins
2020-08-01  1:18       ` Roman Gushchin
2020-08-01  2:17         ` Hugh Dickins
2020-08-04  0:40           ` Roman Gushchin
2020-08-06  3:01             ` Hugh Dickins
2020-08-06  3:51               ` Roman Gushchin
2020-08-06 16:41                 ` Hugh Dickins
2020-08-06 23:38               ` Roman Gushchin [this message]
2020-08-07  0:16                 ` Hugh Dickins
2020-08-07  1:25                 ` Andrew Morton
2021-02-24  7:24                   ` Hugh Dickins
2021-02-25  1:53                     ` Roman Gushchin
2021-02-25 17:21                       ` Hugh Dickins
2021-02-25 18:06                         ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200806233804.GB1217906@carbon.dhcp.thefacebook.com \
    --to=guro@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).