From: David Chinner <dgc@sgi.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: dgc@sgi.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [patch 3/8] per backing_dev dirty and writeback page accounting
Date: Tue, 13 Mar 2007 10:12:56 +1100
Message-ID: <20070312231256.GT6095633@melbourne.sgi.com>
In-Reply-To: <E1HQt7o-00042r-00@dorka.pomaz.szeredi.hu>

On Mon, Mar 12, 2007 at 11:36:16PM +0100, Miklos Szeredi wrote:
> I'll try to explain the reason for the deadlock first.

Ah, thanks for that.

> > IIUC, your problem is that there's another bdi that holds all the
> > dirty pages, and this throttle loop never flushes pages from that
> > other bdi and we sleep instead. It seems to me that the fundamental
> > problem is that to clean the pages we need to flush both bdi's, not
> > just the bdi we are directly dirtying.
> 
> This is what happens:
> 
> write fault on upper filesystem
>   balance_dirty_pages
>     submit write requests
>   loop ...

Isn't this loop transferring the dirty state from the upper
filesystem to the lower filesystem? What I don't see here is
how the pages on this filesystem are not getting cleaned if
the lower filesystem is being flushed properly.

I'm probably missing something big and obvious, but I'm not
familiar with the exact workings of FUSE so please excuse my
ignorance....

> ------- fuse IPC ---------------
> [fuse loopback fs thread 1]

This is the lower filesystem? Or a callback thread for
doing the write requests to the lower filesystem?

> read request
> sys_write
>   mutex_lock(i_mutex)
>   ...
>      balance_dirty_pages
>         submit write requests
>         loop ... write requests completed ... dirty still over limit ... 
> 	... loop forever

Hmmm - the situation in balance_dirty_pages() after an attempt
to writeback_inodes(&wbc) that has written nothing because there
is nothing to write would be:

	wbc->nr_to_write == write_chunk &&
	wbc->pages_skipped == 0 &&
	wbc->encountered_congestion == 0 &&
	!bdi_congested(wbc->bdi)

What happens if you make that an exit condition for the loop?
Alternatively, add another bit to the wbc structure that says
"there was nothing to do", and set it if we find
list_empty(&sb->s_dirty) when trying to flush dirty inodes.
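
As a rough sketch of the first suggestion, the four-part check could be
wrapped in a small predicate. This is a user-space model, not kernel code:
both structs are cut-down stand-ins, the helper name wbc_wrote_nothing() and
the bdi_congested() stub are hypothetical, and the remaining-budget field is
assumed to be nr_to_write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's backing_dev_info (illustration only). */
struct backing_dev_info {
	bool congested;
};

/* Cut-down stand-in for struct writeback_control (illustration only). */
struct writeback_control {
	struct backing_dev_info *bdi;
	long nr_to_write;		/* budget left after the writeback pass */
	long pages_skipped;		/* pages the pass could not write */
	bool encountered_congestion;	/* pass ran into a congested queue */
};

/* Stub: the real bdi_congested() inspects the device's queue state. */
static bool bdi_congested(const struct backing_dev_info *bdi)
{
	return bdi->congested;
}

/*
 * Hypothetical helper: true when the pass wrote nothing because there
 * was nothing to write, rather than because it was blocked or skipped
 * pages. write_chunk is the budget the pass started with.
 */
static bool wbc_wrote_nothing(const struct writeback_control *wbc,
			      long write_chunk)
{
	assert(wbc != NULL && wbc->bdi != NULL);

	return wbc->nr_to_write == write_chunk &&
	       wbc->pages_skipped == 0 &&
	       !wbc->encountered_congestion &&
	       !bdi_congested(wbc->bdi);
}
```

A caller in the throttle loop would test this right after the writeback
attempt and break out when it returns true. Whether breaking out there is
safe for every caller is exactly the open question above.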

[ FWIW, this may also solve another problem of fast block devices
being throttled incorrectly when a slow block dev is consuming
all the dirty pages... ]
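
To make the scenario concrete, here is a toy user-space model of a throttle
loop, under heavy assumptions. The names toy_bdi, toy_writeback() and
toy_throttle() are invented for illustration; the dirty limit is a single
global count; the other device is never flushed (standing in for a slow or
stuck device); and the sleep between passes is left out. It only shows that
a writer whose own device has nothing dirty never gets under the limit
unless "nothing to write" is treated as an exit condition.

```c
#include <stdbool.h>

/* Toy device: just a count of dirty pages queued against it. */
struct toy_bdi {
	long nr_dirty;
};

/* Flush up to `budget` pages from one device; return the unused budget. */
static long toy_writeback(struct toy_bdi *bdi, long budget)
{
	long written = bdi->nr_dirty < budget ? bdi->nr_dirty : budget;

	bdi->nr_dirty -= written;
	return budget - written;
}

/*
 * Toy throttle loop for a writer on `self`. `other` holds dirty pages
 * that this loop never flushes. Returns the number of passes taken, or
 * -1 if the loop was still spinning after max_passes.
 */
static int toy_throttle(struct toy_bdi *self, const struct toy_bdi *other,
			long limit, long write_chunk,
			bool exit_when_idle, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		long left = toy_writeback(self, write_chunk);

		if (self->nr_dirty + other->nr_dirty <= limit)
			return pass;	/* back under the global limit */
		if (exit_when_idle && left == write_chunk)
			return pass;	/* nothing of ours left to flush */
	}
	return -1;			/* models "loop forever" */
}
```

With self holding 0 dirty pages, other holding 5000, a limit of 1000 and a
write_chunk of 1024, the loop runs out of passes when exit_when_idle is
false and returns after one pass when it is true.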

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
