From: Wu Fengguang <fengguang.wu@intel.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <jens.axboe@oracle.com>, Jan Kara <jack@suse.cz>,
	Theodore Tso <tytso@mit.edu>, Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] writeback: don't delay inodes redirtied by a fast dirtier
Date: Wed, 23 Sep 2009 21:40:39 +0800
Message-ID: <20090923134039.GA1196@localhost>
In-Reply-To: <20090923132351.GA32404@infradead.org>

On Wed, Sep 23, 2009 at 09:23:51PM +0800, Christoph Hellwig wrote:
> On Wed, Sep 23, 2009 at 09:20:08PM +0800, Wu Fengguang wrote:
> > I noticed that
> > - the write chunk size of balance_dirty_pages() is 12, which is pretty
> >   small and inefficient.
> > - during copy, the inode is sometimes redirty_tail (old behavior) and
> >   sometimes requeue_io (new behavior).
> > - during copy, the directory inode will always be synced and then
> >   redirty_tail.
> > - after copy, the inode will be redirtied after sync.
> 
> Yeah, XFS uses generic_file_buffered_write and the heuristics in there
> for balance_dirty_pages turned out to be really bad.  So far we didn't
> manage to successfully get that fixed, though.

Ah sorry. It's because of the first patch: it does not always "bump up"
the write chunk. In your case the chunk is obviously decreased (the original
ratelimit_pages=4096 is a much larger value).  I'll fix it.
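
To illustrate the difference, here is a rough model in Python rather than
kernel C. It assumes the chunk is 1.5 times a page count; that factor and
both function names are assumptions for illustration, not taken from the
patch. Under that assumption a burst of 8 dirtied pages gives the chunk of
12 reported above, while clamping to ratelimit_pages first means the chunk
can only be bumped up:

```python
def naive_write_chunk(pages_dirtied: int) -> int:
    """Chunk that scales only with the pages just dirtied (hypothetical model)."""
    # A small burst of dirtied pages yields a tiny, inefficient chunk.
    return pages_dirtied + pages_dirtied // 2


def write_chunk(pages_dirtied: int, ratelimit_pages: int = 4096) -> int:
    """Chunk that is only ever bumped up, never decreased (hypothetical model)."""
    # Clamp to ratelimit_pages first so the result never falls below
    # what the fixed ratelimit_pages alone would give.
    base = max(pages_dirtied, ratelimit_pages)
    return base + base // 2


if __name__ == "__main__":
    print(naive_write_chunk(8))   # small chunk
    print(write_chunk(8))         # floor derived from ratelimit_pages
    print(write_chunk(8192))      # grows for a fast dirtier
```

The floor keeps slow dirtiers at the old chunk size, and only a dirtier that
exceeds ratelimit_pages gets a larger one.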

> > It shall not be a problem to use requeue_io for XFS, because whether
> > it be requeue_io or redirty_tail, write_inode() will be called once
> > for every 4MB.
> > 
> > It would be inefficient if XFS really tries to write the inode's and
> > directory inode's metadata every time it syncs 4MB of pages. If
> > that write attempt is turned into _real_ IO, that would be bad
> > and kill performance. Increasing MAX_WRITEBACK_PAGES may help
> > reduce the frequency of write_inode() though.
> 
> The way we call write_inode for XFS is extremely inefficient.  As
> you noticed XFS tends to redirty the inode on I/O completion, and we
> also cluster inode writeouts.  For XFS we'd really prefer to not
> intermix data and inode writeout, but first do the data writeout and
> then later push out the inodes, preferably with as many as possible
> inodes to sweep out in one go.
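
For scale, a back-of-the-envelope sketch of how often write_inode() fires
under the once-per-4MB pattern quoted above. It assumes 4KB pages and
MAX_WRITEBACK_PAGES = 1024 (chosen because 1024 * 4KB matches the 4MB
figure); the helper name is hypothetical:

```python
PAGE_SIZE = 4096            # bytes; assumed 4KB pages
MAX_WRITEBACK_PAGES = 1024  # assumed; 1024 * 4KB matches the 4MB figure


def write_inode_calls(file_bytes: int, batch_pages: int = MAX_WRITEBACK_PAGES) -> int:
    """Estimate write_inode() calls if one is issued per writeback batch."""
    batch_bytes = batch_pages * PAGE_SIZE
    # Ceiling division: a partial final batch still triggers one call.
    return -(-file_bytes // batch_bytes)


if __name__ == "__main__":
    one_gib = 1 << 30
    print(write_inode_calls(one_gib))         # with the assumed 4MB batch
    print(write_inode_calls(one_gib, 4096))   # with a 4x larger batch
```

In this model a larger batch reduces the call count proportionally, but it
still grows linearly with file size.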

I guess the difficult part would be possible policy requirements
on the max batch size (max number of inodes or pages to write before
switching to write metadata) and the delay time (between the sync of
data and metadata). It may take a long time to make a full scan of
the dirty list.
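
A toy model of the two policy knobs, in Python and not kernel code. It
defers inode metadata writes after data writeout and sweeps them out in one
go when either a batch limit or a delay limit is hit. The class name,
max_batch and max_delay are all hypothetical, and time is passed in
explicitly to keep the sketch deterministic:

```python
from dataclasses import dataclass, field


@dataclass
class DeferredInodeWriter:
    """Toy model: write data first, push inode metadata out later in batches."""
    max_batch: int      # hypothetical policy: sweep after this many inodes
    max_delay: float    # hypothetical policy: sweep once the oldest waits this long (seconds)
    pending: list = field(default_factory=list)   # (inode number, data-sync time)
    flushes: list = field(default_factory=list)   # each entry is one metadata sweep

    def data_written(self, ino: int, now: float) -> None:
        # Data for `ino` was just synced; defer its metadata write.
        self.pending.append((ino, now))
        self.maybe_flush(now)

    def maybe_flush(self, now: float) -> None:
        if not self.pending:
            return
        oldest = self.pending[0][1]
        # Sweep when the batch is full or the oldest inode has waited too long.
        if len(self.pending) >= self.max_batch or now - oldest >= self.max_delay:
            self.flushes.append([ino for ino, _ in self.pending])
            self.pending.clear()
```

A larger max_batch gives bigger sweeps, while max_delay bounds how long
metadata can lag behind its data; choosing the two values is the policy
question.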

Thanks,
Fengguang
