linux-kernel.vger.kernel.org archive mirror
From: Chris Mason <chris.mason@oracle.com>
To: Nick Piggin <npiggin@suse.de>
Cc: David Chinner <dgc@sgi.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC] fsblock
Date: Wed, 27 Jun 2007 07:50:56 -0400
Message-ID: <20070627115056.GW14224@think.oraclecorp.com>
In-Reply-To: <20070627053245.GA6033@wotan.suse.de>

On Wed, Jun 27, 2007 at 07:32:45AM +0200, Nick Piggin wrote:
> On Tue, Jun 26, 2007 at 08:34:49AM -0400, Chris Mason wrote:
> > On Tue, Jun 26, 2007 at 07:23:09PM +1000, David Chinner wrote:
> > > On Tue, Jun 26, 2007 at 01:55:11PM +1000, Nick Piggin wrote:
> > 
> > [ ... fsblocks vs extent range mapping ]
> > 
> > > iomaps can double as range locks simply because iomaps are
> > > expressions of ranges within the file.  Seeing as you can only
> > > access a given range exclusively to modify it, inserting an empty
> > > mapping into the tree as a range lock gives an effective method of
> > > allowing safe parallel reads, writes and allocation into the file.
> > > 
> > > The fsblocks and the vm page cache interface cannot be used to
> > > facilitate this because a radix tree is the wrong type of tree to
> > > store this information in. A sparse, range based tree (e.g. btree)
> > > is the right way to do this and it matches very well with
> > > a range based API.
> > 
> > I'm really not against the extent based page cache idea, but I kind of
> > assumed it would be too big a change for this kind of generic setup.  At
> > any rate, if we'd like to do it, it may be best to ditch the idea of
> > "attach mapping information to a page", and switch to "lookup mapping
> > information and range locking for a page".
> 
> Well the get_block equivalent API is extent based one now, and I'll
> look at what is required in making map_fsblock a more generic call
> that could be used for an extent-based scheme.
> 
> An extent based thing IMO really isn't appropriate as the main generic
> layer here though. If it is really useful and popular, then it could
> be turned into generic code and sit along side fsblock or underneath
> fsblock...

Let's look at a typical example of how IO actually gets done today,
starting with sys_write():

sys_write(file, buffer, 1MB)
for each page:
    prepare_write()
        allocate contiguous chunks of disk
        attach buffers
    copy_from_user()
    commit_write()
        dirty buffers

pdflush:
    writepages()
        find pages with contiguous chunks of disk
        build and submit large bios

So, we replace prepare_write and commit_write with an extent-based API,
but we keep the part that dirties each buffer individually.  writepages
then has to turn those buffers back into (bio-sized) extents, and the
result is completely full of dark, dark corner cases.
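
To make that last step concrete, here is a minimal sketch of the work
writepages is left with under such a scheme: scanning per-buffer dirty
state and merging it back into runs that are contiguous both in the file
and on disk, capped at a maximum bio size. This is an illustration only,
not kernel code. The names Buffer, Extent, coalesce_dirty and max_blocks
are hypothetical, and it assumes every buffer already carries a disk
mapping.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Buffer:
    """Hypothetical per-block state attached to a page."""
    file_block: int  # logical block index within the file
    disk_block: int  # physical block it was mapped to at prepare time
    dirty: bool


@dataclass(frozen=True)
class Extent:
    """Hypothetical contiguous run, roughly what one bio would cover."""
    file_start: int
    disk_start: int
    length: int  # in blocks


def coalesce_dirty(buffers: Iterable[Buffer], max_blocks: int) -> List[Extent]:
    """Merge dirty buffers into extents of at most max_blocks blocks.

    A buffer joins the current run only if it directly follows it both
    in the file and on disk; otherwise a new run is started.
    """
    extents: List[Extent] = []
    run = None  # [file_start, disk_start, length] of the run being built
    for buf in sorted(buffers, key=lambda b: b.file_block):
        if not buf.dirty:
            continue  # clean buffers are never written back
        if (run is not None
                and run[2] < max_blocks
                and buf.file_block == run[0] + run[2]
                and buf.disk_block == run[1] + run[2]):
            run[2] += 1  # extends the current run
        else:
            if run is not None:
                extents.append(Extent(*run))
            run = [buf.file_block, buf.disk_block, 1]
    if run is not None:
        extents.append(Extent(*run))
    return extents
```

Even this toy version has to rediscover boundaries (a clean block, a
jump on disk, the bio size limit) that the allocator already knew about
when it handed out the extent in the first place. The real thing also
has to cope with locking, partial pages and unmapped buffers, which this
sketch ignores entirely.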

I do think fsblocks is a nice cleanup on its own, but Dave has a good
point that it makes sense to look for ways to generalize things even more.

-chris

