From: Greg Thelen <gthelen@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Vladimir Davydov <vdavydov.dev@gmail.com>,
	Tejun Heo <tj@kernel.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH] writeback: sum memcg dirty counters as needed
Date: Fri, 29 Mar 2019 10:50:17 -0700
Message-ID: <xr93imw1wox2.fsf@gthelen.svl.corp.google.com>
In-Reply-To: <20190328142016.GA15763@cmpxchg.org>

Johannes Weiner <hannes@cmpxchg.org> wrote:

> On Thu, Mar 07, 2019 at 08:56:32AM -0800, Greg Thelen wrote:
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -3880,6 +3880,7 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
>>   * @pheadroom: out parameter for number of allocatable pages according to memcg
>>   * @pdirty: out parameter for number of dirty pages
>>   * @pwriteback: out parameter for number of pages under writeback
>> + * @exact: whether exact counters are required; exact counters cost more work.
>>   *
>>   * Determine the numbers of file, headroom, dirty, and writeback pages in
>>   * @wb's memcg.  File, dirty and writeback are self-explanatory.  Headroom
>> @@ -3890,18 +3891,29 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
>>   * ancestors.  Note that this doesn't consider the actual amount of
>>   * available memory in the system.  The caller should further cap
>>   * *@pheadroom accordingly.
>> + *
>> + * Return value is the error precision associated with *@pdirty
>> + * and *@pwriteback.  When @exact is set this is a minimal value.
>>   */
>> -void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>> -			 unsigned long *pheadroom, unsigned long *pdirty,
>> -			 unsigned long *pwriteback)
>> +unsigned long
>> +mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>> +		    unsigned long *pheadroom, unsigned long *pdirty,
>> +		    unsigned long *pwriteback, bool exact)
>>  {
>>  	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
>>  	struct mem_cgroup *parent;
>> +	unsigned long precision;
>>  
>> -	*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
>> -
>> +	if (exact) {
>> +		precision = 0;
>> +		*pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
>> +		*pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
>> +	} else {
>> +		precision = MEMCG_CHARGE_BATCH * num_online_cpus();
>> +		*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
>> +		*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
>> +	}
>>  	/* this should eventually include NR_UNSTABLE_NFS */
>> -	*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
>>  	*pfilepages = mem_cgroup_nr_lru_pages(memcg, (1 << LRU_INACTIVE_FILE) |
>>  						     (1 << LRU_ACTIVE_FILE));
>>  	*pheadroom = PAGE_COUNTER_MAX;
>> @@ -3913,6 +3925,8 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>>  		*pheadroom = min(*pheadroom, ceiling - min(ceiling, used));
>>  		memcg = parent;
>>  	}
>> +
>> +	return precision;
>
> Have you considered unconditionally using the exact version here?
>
> It does for_each_online_cpu(), but until very, very recently we did
> this per default for all stats, for years. It only became a problem in
> conjunction with the for_each_memcg loops when frequently reading
> memory stats at the top of a very large hierarchy.
>
> balance_dirty_pages() is called against memcgs that actually own the
> inodes/memory and doesn't do the additional recursive tree collection.
>
> It's also not *that* hot of a function, and in the io path...
>
> It would simplify this patch immensely.

Good idea.  Done in -v2 of the patch:
https://lore.kernel.org/lkml/20190329174609.164344-1-gthelen@google.com/


Thread overview:
2019-03-07 16:56 [PATCH] writeback: sum memcg dirty counters as needed Greg Thelen
2019-03-21 23:44 ` Andrew Morton
2019-03-29 17:47   ` Greg Thelen
2019-03-22 18:15 ` Roman Gushchin
2019-03-27 22:29   ` Greg Thelen
2019-03-28 14:05     ` Johannes Weiner
2019-03-28 14:20 ` Johannes Weiner
2019-03-29 17:50   ` Greg Thelen [this message]
