From: "Jörn Engel" <joern@logfs.org>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mtd@lists.infradead.org, Arnd Bergmann <arnd@arndb.de>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 12/17] [LogFS] readwrite.c
Date: Mon, 23 Nov 2009 14:15:47 +0100
Message-ID: <20091123131547.GA18889@logfs.org>
In-Reply-To: <84144f020911230433q5fd90321m9b463cb425fafe7d@mail.gmail.com>

Hello Pekka!

On Mon, 23 November 2009 14:33:15 +0200, Pekka Enberg wrote:
> 
> (Dunno who to CC, really, so lets see if I can trick Andrew or Hugh
> into looking at the issue.)

That would be nice.

> On Fri, Nov 20, 2009 at 9:38 PM, Joern Engel <joern@logfs.org> wrote:
> > +static void logfs_lock_write_page(struct page *page)
> > +{
> > +       int loop = 0;
> > +
> > +       while (unlikely(!trylock_page(page))) {
> > +               if (loop++ > 0x1000) {
> > +                       /* Has been observed once so far... */
> > +                       printk(KERN_ERR "stack at %p\n", &loop);
> > +                       BUG();
> > +               }
> > +               if (PagePreLocked(page)) {
> > +                       /* Holder of page lock is waiting for us, it
> > +                        * is safe to use this page. */
> > +                       break;
> > +               }
> > +               /* Some other process has this page locked and has
> > +                * nothing to do with us.  Wait for it to finish.
> > +                */
> > +               schedule();
> > +       }
> > +       BUG_ON(!PageLocked(page));
> > +}
> 
> What's the purpose of PagePreLocked()? The above function looks pretty
> fragile for a filesystem to me.

Avoiding deadlocks.  Garbage collection is the cause of almost all
headaches in logfs, including this.  Any write may require some amount
of garbage collection to free up some space.  Garbage collection
consists of basically random reads followed by random writes.  So in
order to write any page, it may be required to first read and write some
other inconvenient page.

Simple case:
	Thread A writes page 1, GC then reads/writes page 1.
More complicated:
	Thread A writes page 1, thread B writes page 2.
	Thread A gets the write lock, thread B blocks on write lock.
	Thread A does GC which reads/writes page 2.

The more complicated case requires that any thread holding the write
lock must be able to read/write any page belonging to this filesystem.
If those pages are locked by someone else, the thread should wait and
only use a page once its page lock is released.  But if the pages are
locked in a logfs write path, waiting would cause a deadlock.

So PagePreLocked(page) indicates that this page is in a logfs write
path.  Such a page won't change until the current thread releases the
logfs write lock, so it is fair game for GC.
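
To make the three outcomes concrete, here is a minimal user-space sketch
of one iteration of the loop in logfs_lock_write_page().  It is a model,
not kernel code: struct model_page, the enum and all model_* helpers are
hypothetical names invented for illustration, and the assumption that a
writer sets the flag right after taking the page lock and before queuing
for the write lock is mine, not something the patch excerpt confirms.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space stand-in for a page; NOT the kernel's struct page. */
struct model_page {
	bool locked;    /* models the page lock (PG_locked) */
	bool prelocked; /* models the PagePreLocked() flag */
};

enum lock_result {
	LOCK_TAKEN,    /* we acquired the page lock ourselves */
	LOCK_BORROWED, /* holder is a logfs writer waiting for us */
	LOCK_BUSY,     /* unrelated holder: wait and retry */
};

/* Models trylock_page(): take the lock only if it is free. */
static bool model_trylock(struct model_page *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

/* One pass through the loop body of logfs_lock_write_page(). */
static enum lock_result model_lock_write_page_once(struct model_page *page)
{
	if (model_trylock(page))
		return LOCK_TAKEN;
	if (page->prelocked)
		return LOCK_BORROWED; /* safe: the holder cannot proceed */
	return LOCK_BUSY;             /* the real code calls schedule() */
}

/*
 * Thread B's side in the "more complicated" case (assumed ordering):
 * lock the page, mark it, then block on the filesystem write lock.
 */
static void model_writer_prelock(struct model_page *page)
{
	bool ok = model_trylock(page);

	assert(ok);
	(void)ok; /* silence unused warning when NDEBUG is defined */
	page->prelocked = true;
}
```

In this model, thread A's GC calling model_lock_write_page_once() on
page 2 after thread B ran model_writer_prelock() on it gets
LOCK_BORROWED instead of waiting forever, which is the deadlock the
flag is meant to avoid.  A page locked by an unrelated reader reports
LOCK_BUSY, so the caller waits as usual.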

Jörn

-- 
The cheapest, fastest and most reliable components of a computer
system are those that aren't there.
-- Gordon Bell, DEC laboratories
