From: Tejun Heo <htejun@fb.com>
To: Josef Bacik <jbacik@fb.com>
Cc: <linux-btrfs@vger.kernel.org>, <kernel-team@fb.com>,
	<david@fromorbit.com>, <jack@suse.cz>,
	<linux-fsdevel@vger.kernel.org>, <viro@zeniv.linux.org.uk>,
	<hch@infradead.org>, <jweiner@fb.com>
Subject: Re: [PATCH 3/5] writeback: add counters for metadata usage
Date: Tue, 25 Oct 2016 15:50:37 -0400
Message-ID: <20161025195037.GC13857@htj.duckdns.org>
In-Reply-To: <1477420904-1399-4-git-send-email-jbacik@fb.com>

Hello,

On Tue, Oct 25, 2016 at 02:41:42PM -0400, Josef Bacik wrote:
> Btrfs has no bounds except memory on the amount of dirty memory that we have in
> use for metadata.  Historically we have used a special inode so we could take
> advantage of the balance_dirty_pages throttling that comes with using pagecache.
> However as we'd like to support different blocksizes it would be nice to not
> have to rely on pagecache, but still get the balance_dirty_pages throttling
> without having to do it ourselves.
> 
> So introduce *METADATA_DIRTY_BYTES and *METADATA_WRITEBACK_BYTES.  These are
> zone and bdi_writeback counters to keep track of how many bytes we have in
> flight for METADATA.  We need to count in bytes as blocksizes could be
> percentages of pagesize.  We simply convert the bytes to number of pages where
> it is needed for the throttling.
> 
> Also introduce NR_METADATA_BYTES so we can keep track of the total amount of
> pages used for metadata on the system.  This is also needed so things like dirty
> throttling know that this is dirtyable memory as well and easily reclaimed.
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>

Some nits.

It'd be nice to note that this patch just introduces new fields
without using them and thus doesn't cause any behavioral changes.

> @@ -51,6 +51,8 @@ static DEVICE_ATTR(cpumap,  S_IRUGO, node_read_cpumask, NULL);
>  static DEVICE_ATTR(cpulist, S_IRUGO, node_read_cpulist, NULL);
>  
>  #define K(x) ((x) << (PAGE_SHIFT - 10))
> +#define BtoK(x) ((x) >> 10)

This would belong in a separate patch, but any chance we can share
these definitions?  Having them in a couple of places is fine, but
they are getting duplicated in multiple spots and are becoming
confusing, with K() converting pages to kilobytes and BtoK()
converting bytes.  I'm not sure exactly what it should look like,
though.
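
As a rough sketch of what a shared definition could look like: spelling
the source unit out in the macro name avoids the ambiguity of a bare
K().  The names PAGES_TO_KB, BYTES_TO_KB and show_counters, and the
fixed DEMO_PAGE_SHIFT, are hypothetical and chosen only for
illustration; none of them exist in the tree, and this is userspace
code rather than a proposed patch.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's PAGE_SHIFT; assumes 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/* Pages -> kB: the conversion the existing K() performs. */
#define PAGES_TO_KB(x) ((x) << (DEMO_PAGE_SHIFT - 10))

/* Bytes -> kB: the conversion the new BtoK() performs. */
#define BYTES_TO_KB(x) ((x) >> 10)

/* Print a page-based and a byte-based counter in the same unit. */
static void show_counters(unsigned long nr_pages, unsigned long nr_bytes)
{
	printf("pagecache: %lu kB\n", PAGES_TO_KB(nr_pages));
	printf("metadata:  %lu kB\n", BYTES_TO_KB(nr_bytes));
}
```

With names like these the call site says which unit the counter is in,
so a byte counter can't silently be run through the page conversion.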

> @@ -2473,6 +2504,100 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
>  EXPORT_SYMBOL(account_page_dirtied);
>  
>  /*

/** (so that this is picked up as a kernel-doc comment)

> + * account_metadata_dirtied
> + * @page - the page being dirited
> + * @bdi - the bdi that owns this page
> + * @bytes - the number of bytes being dirtied
> + *
> + * Do the dirty page accounting for metadata pages that aren't backed by an
> + * address_space.
> + */
> +void account_metadata_dirtied(struct page *page, struct backing_dev_info *bdi,
> +			      long bytes)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	__mod_node_page_state(page_pgdat(page), NR_METADATA_DIRTY_BYTES,
> +			      bytes);
> +	__add_wb_stat(&bdi->wb, WB_DIRTIED_BYTES, bytes);
> +	__add_wb_stat(&bdi->wb, WB_METADATA_DIRTY_BYTES, bytes);
> +	current->nr_dirtied++;
> +	task_io_account_write(bytes);
> +	this_cpu_inc(bdp_ratelimits);
> +	local_irq_restore(flags);

Again, I'm not sure about the explicit irq ops especially as some of
the counters are already irq safe.

> +}
> +EXPORT_SYMBOL(account_metadata_dirtied);
> +
> +/*

/**

> + * account_metadata_cleaned
> + * @page - the page being cleaned
> + * @bdi - the bdi that owns this page
> + * @bytes - the number of bytes cleaned
> + *
> + * Called on a no longer dirty metadata page.
> + */
> +void account_metadata_cleaned(struct page *page, struct backing_dev_info *bdi,
> +			      long bytes)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	__mod_node_page_state(page_pgdat(page), NR_METADATA_DIRTY_BYTES,
> +			      -bytes);
> +	__add_wb_stat(&bdi->wb, WB_METADATA_DIRTY_BYTES, -bytes);
> +	task_io_account_cancelled_write(bytes);
> +	local_irq_restore(flags);

Ditto with irq and the following functions.

> @@ -3701,7 +3703,20 @@ static unsigned long node_pagecache_reclaimable(struct pglist_data *pgdat)
>  	if (unlikely(delta > nr_pagecache_reclaimable))
>  		delta = nr_pagecache_reclaimable;
>  
> -	return nr_pagecache_reclaimable - delta;
> +	nr_metadata_reclaimable =
> +		node_page_state(pgdat, NR_METADATA_BYTES) >> PAGE_SHIFT;
> +	/*
> +	 * We don't do writeout through the shrinkers so subtract any
> +	 * dirty/writeback metadata bytes from the reclaimable count.
> +	 */

Hmm... up until this point, the dirty metadata was handled the same
way as regular dirty data, but it deviates here.  Is this right?  The
calculations in the writeback code also assume that dirty pages are
reclaimable.  If metadata is inherently different, it'd be nice to
explain explicitly why.
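
For reference, the arithmetic the quoted comment describes comes down
to roughly the following.  This is a userspace sketch only: the
function name, the plain unsigned long parameters standing in for the
node counters, and the fixed 4 KiB page shift are all assumptions, and
the rest of the hunk isn't quoted here, so the real code may differ.

```c
/* Hypothetical stand-in for the kernel's PAGE_SHIFT; assumes 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/*
 * Pages of metadata counted as reclaimable once dirty and
 * under-writeback bytes are excluded.  All three inputs are byte counts.
 */
static unsigned long metadata_reclaimable_pages(unsigned long total_bytes,
						unsigned long dirty_bytes,
						unsigned long writeback_bytes)
{
	unsigned long total = total_bytes >> DEMO_PAGE_SHIFT;
	unsigned long busy = (dirty_bytes + writeback_bytes) >> DEMO_PAGE_SHIFT;

	/* Clamp so the unsigned subtraction can't wrap around. */
	if (busy > total)
		busy = total;
	return total - busy;
}
```

The open question is whether subtracting the dirty and writeback
portions unconditionally like this is the right behavior.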

Thanks.

-- 
tejun
