* [PATCH 0/8] Fix assorted FS-Cache issues
@ 2013-05-03  0:33 David Howells
  2013-05-03  0:33 ` [PATCH 1/8] fs/fscache: remove spin_lock() from the condition in while() David Howells
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel



Following this mail is a series of patches to fix a number of FS-Cache issues,
including several oopses.  The patches can also be found at:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache

The patches are as follows:

 (1) Don't put spin_lock() in a while-condition as spin_lock() may be wrapped
     with do {} while(0) (cleanup).

 (2) Name i_mutex lock classes rather than using numbers in CacheFiles (cleanup).

 (3) Don't sleep in page release if __GFP_FS is not set (deadlock vs ext4).

 (4) Uninline fscache_object_init() (cleanup).

 (5) Wrap checks on object state (cleanup).

 (6) Provide a system wait_on_atomic_t() and wake_up_atomic_t() (enhancement).

 (7) Simplify the object state machine (needs #4 and #5).

 (8) Simplify cookie retention by objects (oops fix; needs #6 and #7).

David
---
David Howells (6):
      FS-Cache: Don't sleep in page release if __GFP_FS is not set
      FS-Cache: Uninline fscache_object_init()
      FS-Cache: Wrap checks on object state
      Add wait_on_atomic_t() and wake_up_atomic_t()
      FS-Cache: Fix object state machine to have separate work and wait states
      FS-Cache: Simplify cookie retention for fscache_objects, fixing access problems

J. Bruce Fields (1):
      CacheFiles: name i_mutex lock class explicitly

Sebastian Andrzej Siewior (1):
      fs/fscache: remove spin_lock() from the condition in while()


 fs/cachefiles/interface.c     |   11 
 fs/cachefiles/namei.c         |   10 
 fs/cachefiles/xattr.c         |    6 
 fs/fscache/cache.c            |   34 +
 fs/fscache/cookie.c           |   93 +---
 fs/fscache/fsdef.c            |    1 
 fs/fscache/internal.h         |   11 
 fs/fscache/main.c             |   11 
 fs/fscache/netfs.c            |    1 
 fs/fscache/object-list.c      |  103 ++--
 fs/fscache/object.c           | 1077 +++++++++++++++++++++--------------------
 fs/fscache/operation.c        |   36 +
 fs/fscache/page.c             |   55 +-
 include/linux/fscache-cache.h |  170 +++---
 include/linux/wait.h          |   29 +
 kernel/wait.c                 |   85 +++
 16 files changed, 922 insertions(+), 811 deletions(-)



* [PATCH 1/8] fs/fscache: remove spin_lock() from the condition in while()
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 2/8] CacheFiles: name i_mutex lock class explicitly David Howells
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

The spin_lock() within the condition in while() will cause a compile error
if it is not a function.  This is not a problem on mainline, but it does not
look pretty and there is no reason to do it that way.
This patch writes it a little differently and avoids the double condition.
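
To illustrate why (userspace sketch, not part of the patch; LOCK() and
drain() are made-up names): a lock operation defined as a statement macro
cannot be used as an expression, so it cannot sit in front of the comma
operator in a while() condition, whereas the for(;;) form always compiles.

	#include <pthread.h>

	/* A lock op wrapped in do {} while (0) is a statement, not an
	 * expression. */
	#define LOCK(m) do { pthread_mutex_lock(m); } while (0)

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static int pending = 3;

	static void drain(void)
	{
	#if 0
		/* Fails to compile: LOCK() cannot appear where an
		 * expression is required. */
		while (LOCK(&lock), pending > 0) {
			pending--;
			pthread_mutex_unlock(&lock);
		}
		pthread_mutex_unlock(&lock);
	#endif
		/* Equivalent loop with no comma expression: */
		for (;;) {
			LOCK(&lock);
			if (pending == 0) {
				pthread_mutex_unlock(&lock);
				break;
			}
			pending--;
			pthread_mutex_unlock(&lock);
		}
	}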

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/page.c |   16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index ff000e5..4882c80 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -796,11 +796,16 @@ void fscache_invalidate_writes(struct fscache_cookie *cookie)
 
 	_enter("");
 
-	while (spin_lock(&cookie->stores_lock),
-	       n = radix_tree_gang_lookup_tag(&cookie->stores, results, 0,
-					      ARRAY_SIZE(results),
-					      FSCACHE_COOKIE_PENDING_TAG),
-	       n > 0) {
+	for (;;) {
+		spin_lock(&cookie->stores_lock);
+		n = radix_tree_gang_lookup_tag(&cookie->stores, results, 0,
+					       ARRAY_SIZE(results),
+					       FSCACHE_COOKIE_PENDING_TAG);
+		if (n == 0) {
+			spin_unlock(&cookie->stores_lock);
+			break;
+		}
+
 		for (i = n - 1; i >= 0; i--) {
 			page = results[i];
 			radix_tree_delete(&cookie->stores, page->index);
@@ -812,7 +817,6 @@ void fscache_invalidate_writes(struct fscache_cookie *cookie)
 			page_cache_release(results[i]);
 	}
 
-	spin_unlock(&cookie->stores_lock);
 	_leave("");
 }
 



* [PATCH 2/8] CacheFiles: name i_mutex lock class explicitly
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
  2013-05-03  0:33 ` [PATCH 1/8] fs/fscache: remove spin_lock() from the condition in while() David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 3/8] FS-Cache: Don't sleep in page release if __GFP_FS is not set David Howells
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

From: J. Bruce Fields <bfields@redhat.com>

Just some cleanup.

(And note the caller of this function may, for example, call vfs_unlink
on a child, so the "1" (I_MUTEX_PARENT) really was what was intended
here.)
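
For illustration only (not part of the patch; example_unlink_child() is a
made-up helper): the nesting that the I_MUTEX_PARENT lockdep subclass
describes is the usual parent-then-child i_mutex order, e.g. around
vfs_unlink():

	#include <linux/err.h>
	#include <linux/fs.h>
	#include <linux/mutex.h>
	#include <linux/namei.h>
	#include <linux/string.h>

	/* Unlink @name from @dir.  The directory's i_mutex is taken with the
	 * I_MUTEX_PARENT subclass so that the child's i_mutex, taken inside
	 * vfs_unlink(), nests under it without upsetting lockdep. */
	static int example_unlink_child(struct dentry *dir, const char *name)
	{
		struct dentry *victim;
		int ret;

		mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
		victim = lookup_one_len(name, dir, strlen(name));
		if (IS_ERR(victim)) {
			ret = PTR_ERR(victim);
		} else {
			ret = vfs_unlink(dir->d_inode, victim);
			dput(victim);
		}
		mutex_unlock(&dir->d_inode->i_mutex);
		return ret;
	}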

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/namei.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 8c01c5fc..07cbd44 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -836,7 +836,7 @@ static struct dentry *cachefiles_check_active(struct cachefiles_cache *cache,
 	//       dir->d_name.len, dir->d_name.len, dir->d_name.name, filename);
 
 	/* look up the victim */
-	mutex_lock_nested(&dir->d_inode->i_mutex, 1);
+	mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
 
 	start = jiffies;
 	victim = lookup_one_len(filename, dir, strlen(filename));



* [PATCH 3/8] FS-Cache: Don't sleep in page release if __GFP_FS is not set
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
  2013-05-03  0:33 ` [PATCH 1/8] fs/fscache: remove spin_lock() from the condition in while() David Howells
  2013-05-03  0:33 ` [PATCH 2/8] CacheFiles: name i_mutex lock class explicitly David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 4/8] FS-Cache: Uninline fscache_object_init() David Howells
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

Don't sleep in __fscache_maybe_release_page() if __GFP_FS is not set.  This
goes some way towards mitigating fscache deadlocking against ext4 by way of
the allocator, eg:

INFO: task flush-8:0:24427 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0       D ffff88003e2b9fd8     0 24427      2 0x00000000
 ffff88003e2b9138 0000000000000046 ffff880012e3a040 ffff88003e2b9fd8
 0000000000011c80 ffff88003e2b9fd8 ffffffff81a10400 ffff880012e3a040
 0000000000000002 ffff880012e3a040 ffff88003e2b9098 ffffffff8106dcf5
Call Trace:
 [<ffffffff8106dcf5>] ? __lock_is_held+0x31/0x53
 [<ffffffff81219b61>] ? radix_tree_lookup_element+0xf4/0x12a
 [<ffffffff81454bed>] schedule+0x60/0x62
 [<ffffffffa01d349c>] __fscache_wait_on_page_write+0x8b/0xa5 [fscache]
 [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d
 [<ffffffffa01d393a>] __fscache_maybe_release_page+0x30c/0x324 [fscache]
 [<ffffffffa01d369a>] ? __fscache_maybe_release_page+0x6c/0x324 [fscache]
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffffa01fd7b2>] nfs_fscache_release_page+0x68/0x94 [nfs]
 [<ffffffffa01ef73e>] nfs_release_page+0x7e/0x86 [nfs]
 [<ffffffff810aa553>] try_to_release_page+0x32/0x3b
 [<ffffffff810b6c70>] shrink_page_list+0x535/0x71a
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffff810b7352>] shrink_inactive_list+0x20a/0x2dd
 [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea
 [<ffffffff810b7a65>] shrink_lruvec+0x34c/0x3eb
 [<ffffffff810b7bd3>] do_try_to_free_pages+0xcf/0x355
 [<ffffffff810b7fc8>] try_to_free_pages+0x9a/0xa1
 [<ffffffff810b08d2>] __alloc_pages_nodemask+0x494/0x6f7
 [<ffffffff810d9a07>] kmem_getpages+0x58/0x155
 [<ffffffff810dc002>] fallback_alloc+0x120/0x1f3
 [<ffffffff8106db23>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff810dbed3>] ____cache_alloc_node+0x177/0x186
 [<ffffffff81162a6c>] ? ext4_init_io_end+0x1c/0x37
 [<ffffffff810dc403>] kmem_cache_alloc+0xf1/0x176
 [<ffffffff810b17ac>] ? test_set_page_writeback+0x101/0x113
 [<ffffffff81162a6c>] ext4_init_io_end+0x1c/0x37
 [<ffffffff81162ce4>] ext4_bio_write_page+0x20f/0x3af
 [<ffffffff8115cc02>] mpage_da_submit_io+0x26e/0x2f6
 [<ffffffff811088e5>] ? __find_get_block_slow+0x38/0x133
 [<ffffffff81161348>] mpage_da_map_and_submit+0x3a7/0x3bd
 [<ffffffff81161a60>] ext4_da_writepages+0x30d/0x426
 [<ffffffff810b3359>] do_writepages+0x1c/0x2a
 [<ffffffff81102f4d>] __writeback_single_inode+0x3e/0xe5
 [<ffffffff81103995>] writeback_sb_inodes+0x1bd/0x2f4
 [<ffffffff81103b3b>] __writeback_inodes_wb+0x6f/0xb4
 [<ffffffff81103c81>] wb_writeback+0x101/0x195
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffff811043aa>] ? wb_do_writeback+0xaa/0x173
 [<ffffffff8110434a>] wb_do_writeback+0x4a/0x173
 [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81038554>] ? del_timer+0x4b/0x5b
 [<ffffffff811044e0>] bdi_writeback_thread+0x6d/0x147
 [<ffffffff81104473>] ? wb_do_writeback+0x173/0x173
 [<ffffffff81048fbc>] kthread+0xd0/0xd8
 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
2 locks held by flush-8:0/24427:
 #0:  (&type->s_umount_key#41){.+.+..}, at: [<ffffffff810e3b73>] grab_super_passive+0x4c/0x76
 #1:  (jbd2_handle){+.+...}, at: [<ffffffff81190d81>] start_this_handle+0x475/0x4ea


The problem here is that another thread, which is attempting to write the
to-be-stored NFS page to the on-ext4 cache file, is waiting for the journal
lock, eg:

INFO: task kworker/u:2:24437 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u:2     D ffff880039589768     0 24437      2 0x00000000
 ffff8800395896d8 0000000000000046 ffff8800283bf040 ffff880039589fd8
 0000000000011c80 ffff880039589fd8 ffff880039f0b040 ffff8800283bf040
 0000000000000006 ffff8800283bf6b8 ffff880039589658 ffffffff81071a13
Call Trace:
 [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea
 [<ffffffff81455e73>] ? _raw_spin_unlock_irqrestore+0x3a/0x50
 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170
 [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff81454bed>] schedule+0x60/0x62
 [<ffffffff81190c23>] start_this_handle+0x317/0x4ea
 [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d
 [<ffffffff81190fcc>] jbd2__journal_start+0xb3/0x12e
 [<ffffffff81176606>] __ext4_journal_start_sb+0xb2/0xc6
 [<ffffffff8115f137>] ext4_da_write_begin+0x109/0x233
 [<ffffffff810a964d>] generic_file_buffered_write+0x11a/0x264
 [<ffffffff811032cf>] ? __mark_inode_dirty+0x2d/0x1ee
 [<ffffffff810ab1ab>] __generic_file_aio_write+0x2a5/0x2d5
 [<ffffffff810ab24a>] generic_file_aio_write+0x6f/0xd0
 [<ffffffff81159a2c>] ext4_file_write+0x38c/0x3c4
 [<ffffffff810e0915>] do_sync_write+0x91/0xd1
 [<ffffffffa00a17f0>] cachefiles_write_page+0x26f/0x310 [cachefiles]
 [<ffffffffa01d470b>] fscache_write_op+0x21e/0x37a [fscache]
 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffffa01d2479>] fscache_op_work_func+0x78/0xd7 [fscache]
 [<ffffffff8104455a>] process_one_work+0x232/0x3a8
 [<ffffffff810444ff>] ? process_one_work+0x1d7/0x3a8
 [<ffffffff81044ee0>] worker_thread+0x214/0x303
 [<ffffffff81044ccc>] ? manage_workers+0x245/0x245
 [<ffffffff81048fbc>] kthread+0xd0/0xd8
 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55
4 locks held by kworker/u:2/24437:
 #0:  (fscache_operation){.+.+.+}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8
 #1:  ((&op->work)){+.+.+.}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8
 #2:  (sb_writers#14){.+.+.+}, at: [<ffffffff810ab22c>] generic_file_aio_write+0x51/0xd0
 #3:  (&sb->s_type->i_mutex_key#19){+.+.+.}, at: [<ffffffff810ab236>] generic_file_aio_write+0x5b/0x

fscache already tries to cancel pending stores, but it can't cancel a write
for which I/O is already in progress.

An alternative would be to accept writing garbage to the cache under extreme
circumstances and to kill the afflicted cache object if we have to do this.
However, we really need to know how strapped the allocator is before deciding
to do that.
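
As a generic illustration of the rule being applied (sketch only; the
example_* names are made up and this is not the fscache code): a
release-page style hook called from reclaim must look at the gfp context
before blocking on filesystem work.

	#include <linux/gfp.h>
	#include <linux/types.h>

	/* Hypothetical per-page tracking state. */
	struct example_page_state {
		bool write_in_progress;
	};

	static void example_wait_for_write(struct example_page_state *ps)
	{
		/* a wait_on_page_writeback()-style sleep would go here */
	}

	/* Only block on cache I/O if the caller's allocation context allows
	 * both sleeping and recursion into filesystem code. */
	static bool example_maybe_release_page(struct example_page_state *ps,
					       gfp_t gfp)
	{
		if (!ps->write_in_progress)
			return true;		/* nothing to wait for */

		if (!(gfp & __GFP_WAIT) || !(gfp & __GFP_FS))
			return false;		/* can't sleep here */

		example_wait_for_write(ps);	/* safe to block */
		return true;
	}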

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/page.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 4882c80..42f8f2d 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -109,7 +109,7 @@ page_busy:
 	 * allocator as the work threads writing to the cache may all end up
 	 * sleeping on memory allocation, so we may need to impose a timeout
 	 * too. */
-	if (!(gfp & __GFP_WAIT)) {
+	if (!(gfp & __GFP_WAIT) || !(gfp & __GFP_FS)) {
 		fscache_stat(&fscache_n_store_vmscan_busy);
 		return false;
 	}



* [PATCH 4/8] FS-Cache: Uninline fscache_object_init()
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
                   ` (2 preceding siblings ...)
  2013-05-03  0:33 ` [PATCH 3/8] FS-Cache: Don't sleep in page release if __GFP_FS is not set David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 5/8] FS-Cache: Wrap checks on object state David Howells
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

Uninline fscache_object_init() so as not to expose some of the FS-Cache
internals to the cache backend.
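
For reference (sketch, not from this series; the example_* names are
invented), a cache backend's allocation hook calls the same function as
before, it just no longer sees the initialiser's innards through the
header:

	#include <linux/fscache-cache.h>
	#include <linux/slab.h>

	/* Hypothetical backend object with the fscache object embedded. */
	struct example_backend_object {
		struct fscache_object	fscache;
		/* backend-private fields would follow */
	};

	static struct fscache_object *example_alloc_object(
		struct fscache_cache *cache,
		struct fscache_cookie *cookie)
	{
		struct example_backend_object *obj;

		obj = kzalloc(sizeof(*obj), GFP_KERNEL);
		if (!obj)
			return NULL;

		/* Now an exported function rather than a static inline. */
		fscache_object_init(&obj->fscache, cookie, cache);
		return &obj->fscache;
	}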

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/object.c           |   40 ++++++++++++++++++++++++++++++++++++++--
 include/linux/fscache-cache.h |   38 ++------------------------------------
 2 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 50d41c1..0133699 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -363,7 +363,7 @@ unsupported_event:
 /*
  * execute an object
  */
-void fscache_object_work_func(struct work_struct *work)
+static void fscache_object_work_func(struct work_struct *work)
 {
 	struct fscache_object *object =
 		container_of(work, struct fscache_object, work);
@@ -379,7 +379,43 @@ void fscache_object_work_func(struct work_struct *work)
 	clear_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
 	fscache_put_object(object);
 }
-EXPORT_SYMBOL(fscache_object_work_func);
+
+/**
+ * fscache_object_init - Initialise a cache object description
+ * @object: Object description
+ * @cookie: Cookie object will be attached to
+ * @cache: Cache in which backing object will be found
+ *
+ * Initialise a cache object description to its basic values.
+ *
+ * See Documentation/filesystems/caching/backend-api.txt for a complete
+ * description.
+ */
+void fscache_object_init(struct fscache_object *object,
+			 struct fscache_cookie *cookie,
+			 struct fscache_cache *cache)
+{
+	atomic_inc(&cache->object_count);
+
+	object->state = FSCACHE_OBJECT_INIT;
+	spin_lock_init(&object->lock);
+	INIT_LIST_HEAD(&object->cache_link);
+	INIT_HLIST_NODE(&object->cookie_link);
+	INIT_WORK(&object->work, fscache_object_work_func);
+	INIT_LIST_HEAD(&object->dependents);
+	INIT_LIST_HEAD(&object->dep_link);
+	INIT_LIST_HEAD(&object->pending_ops);
+	object->n_children = 0;
+	object->n_ops = object->n_in_progress = object->n_exclusive = 0;
+	object->events = object->event_mask = 0;
+	object->flags = 0;
+	object->store_limit = 0;
+	object->store_limit_l = 0;
+	object->cache = cache;
+	object->cookie = cookie;
+	object->parent = NULL;
+}
+EXPORT_SYMBOL(fscache_object_init);
 
 /*
  * initialise an object
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 5dfa0aa..9b9c1de 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -426,42 +426,8 @@ extern const char *fscache_object_states[];
 	(test_bit(FSCACHE_IOERROR, &(obj)->cache->flags) &&	\
 	 (obj)->state >= FSCACHE_OBJECT_DYING)
 
-extern void fscache_object_work_func(struct work_struct *work);
-
-/**
- * fscache_object_init - Initialise a cache object description
- * @object: Object description
- *
- * Initialise a cache object description to its basic values.
- *
- * See Documentation/filesystems/caching/backend-api.txt for a complete
- * description.
- */
-static inline
-void fscache_object_init(struct fscache_object *object,
-			 struct fscache_cookie *cookie,
-			 struct fscache_cache *cache)
-{
-	atomic_inc(&cache->object_count);
-
-	object->state = FSCACHE_OBJECT_INIT;
-	spin_lock_init(&object->lock);
-	INIT_LIST_HEAD(&object->cache_link);
-	INIT_HLIST_NODE(&object->cookie_link);
-	INIT_WORK(&object->work, fscache_object_work_func);
-	INIT_LIST_HEAD(&object->dependents);
-	INIT_LIST_HEAD(&object->dep_link);
-	INIT_LIST_HEAD(&object->pending_ops);
-	object->n_children = 0;
-	object->n_ops = object->n_in_progress = object->n_exclusive = 0;
-	object->events = object->event_mask = 0;
-	object->flags = 0;
-	object->store_limit = 0;
-	object->store_limit_l = 0;
-	object->cache = cache;
-	object->cookie = cookie;
-	object->parent = NULL;
-}
+extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
+				struct fscache_cache *);
 
 extern void fscache_object_lookup_negative(struct fscache_object *object);
 extern void fscache_obtained_object(struct fscache_object *object);



* [PATCH 5/8] FS-Cache: Wrap checks on object state
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
                   ` (3 preceding siblings ...)
  2013-05-03  0:33 ` [PATCH 4/8] FS-Cache: Uninline fscache_object_init() David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 6/8] Add wait_on_atomic_t() and wake_up_atomic_t() David Howells
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

Wrap checks on object state (mostly outside of fs/fscache/object.c) with
inline functions so that the mechanism can be replaced.

Some of the state checks within object.c are left as-is as they will be
replaced.
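
Illustrative only (example_check_live() is a made-up name): callers now use
the predicates rather than comparing object->state against enum values:

	#include <linux/errno.h>
	#include <linux/fscache-cache.h>

	static int example_check_live(struct fscache_object *object)
	{
		if (fscache_object_is_dying(object))
			return -ENOBUFS;	/* being torn down */

		if (!fscache_object_is_available(object))
			return -EAGAIN;		/* lookup not yet complete */

		/* Usable only if available, live and the cache is error-free. */
		return fscache_object_is_active(object) ? 0 : -EIO;
	}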

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/namei.c         |    4 ++--
 fs/fscache/cache.c            |    2 +-
 fs/fscache/cookie.c           |    8 ++++----
 fs/fscache/object.c           |    8 ++++----
 fs/fscache/operation.c        |    2 +-
 include/linux/fscache-cache.h |   37 ++++++++++++++++++++++++++++---------
 6 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 07cbd44..01979a3 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -130,7 +130,7 @@ found_dentry:
 	       fscache_object_states[object->fscache.state],
 	       dentry);
 
-	if (object->fscache.state < FSCACHE_OBJECT_DYING) {
+	if (fscache_object_is_live(&object->fscache)) {
 		printk(KERN_ERR "\n");
 		printk(KERN_ERR "CacheFiles: Error:"
 		       " Can't preemptively bury live object\n");
@@ -192,7 +192,7 @@ try_again:
 	/* an old object from a previous incarnation is hogging the slot - we
 	 * need to wait for it to be destroyed */
 wait_for_old_object:
-	if (xobject->fscache.state < FSCACHE_OBJECT_DYING) {
+	if (fscache_object_is_live(&xobject->fscache)) {
 		printk(KERN_ERR "\n");
 		printk(KERN_ERR "CacheFiles: Error:"
 		       " Unexpected object collision\n");
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index b52aed1..129ea53 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -115,7 +115,7 @@ struct fscache_cache *fscache_select_cache_for_object(
 				     struct fscache_object, cookie_link);
 
 		cache = object->cache;
-		if (object->state >= FSCACHE_OBJECT_DYING ||
+		if (fscache_object_is_dying(object) ||
 		    test_bit(FSCACHE_IOERROR, &cache->flags))
 			cache = NULL;
 
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index e2cba1f..a5f36c9 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -285,7 +285,7 @@ static int fscache_alloc_object(struct fscache_cache *cache,
 
 object_already_extant:
 	ret = -ENOBUFS;
-	if (object->state >= FSCACHE_OBJECT_DYING) {
+	if (fscache_object_is_dead(object)) {
 		spin_unlock(&cookie->lock);
 		goto error;
 	}
@@ -321,7 +321,7 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
 	ret = -EEXIST;
 	hlist_for_each_entry(p, &cookie->backing_objects, cookie_link) {
 		if (p->cache == object->cache) {
-			if (p->state >= FSCACHE_OBJECT_DYING)
+			if (fscache_object_is_dying(p))
 				ret = -ENOBUFS;
 			goto cant_attach_object;
 		}
@@ -332,7 +332,7 @@ static int fscache_attach_object(struct fscache_cookie *cookie,
 	hlist_for_each_entry(p, &cookie->parent->backing_objects,
 			     cookie_link) {
 		if (p->cache == object->cache) {
-			if (p->state >= FSCACHE_OBJECT_DYING) {
+			if (fscache_object_is_dying(p)) {
 				ret = -ENOBUFS;
 				spin_unlock(&cookie->parent->lock);
 				goto cant_attach_object;
@@ -400,7 +400,7 @@ void __fscache_invalidate(struct fscache_cookie *cookie)
 			object = hlist_entry(cookie->backing_objects.first,
 					     struct fscache_object,
 					     cookie_link);
-			if (object->state < FSCACHE_OBJECT_DYING)
+			if (fscache_object_is_live(object))
 				fscache_raise_event(
 					object, FSCACHE_OBJECT_EV_INVALIDATE);
 		}
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 0133699..863f687 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -457,10 +457,10 @@ static void fscache_initialise_object(struct fscache_object *object)
 		spin_lock_nested(&parent->lock, 1);
 		_debug("parent %s", fscache_object_states[parent->state]);
 
-		if (parent->state >= FSCACHE_OBJECT_DYING) {
+		if (fscache_object_is_dying(parent)) {
 			_debug("bad parent");
 			set_bit(FSCACHE_OBJECT_EV_WITHDRAW, &object->events);
-		} else if (parent->state < FSCACHE_OBJECT_AVAILABLE) {
+		} else if (!fscache_object_is_available(parent)) {
 			_debug("wait");
 
 			/* we may get woken up in this state by child objects
@@ -517,9 +517,9 @@ static void fscache_lookup_object(struct fscache_object *object)
 	ASSERTCMP(parent->n_obj_ops, >, 0);
 
 	/* make sure the parent is still available */
-	ASSERTCMP(parent->state, >=, FSCACHE_OBJECT_AVAILABLE);
+	ASSERT(fscache_object_is_available(parent));
 
-	if (parent->state >= FSCACHE_OBJECT_DYING ||
+	if (fscache_object_is_dying(parent) ||
 	    test_bit(FSCACHE_IOERROR, &object->cache->flags)) {
 		_debug("unavailable");
 		set_bit(FSCACHE_OBJECT_EV_WITHDRAW, &object->events);
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 762a9ec..ccf0219 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -35,7 +35,7 @@ void fscache_enqueue_operation(struct fscache_operation *op)
 
 	ASSERT(list_empty(&op->pend_link));
 	ASSERT(op->processor != NULL);
-	ASSERTCMP(op->object->state, >=, FSCACHE_OBJECT_AVAILABLE);
+	ASSERT(fscache_object_is_available(op->object));
 	ASSERTCMP(atomic_read(&op->usage), >, 0);
 	ASSERTCMP(op->state, ==, FSCACHE_OP_ST_IN_PROGRESS);
 
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 9b9c1de..c5f9234 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -417,15 +417,6 @@ struct fscache_object {
 
 extern const char *fscache_object_states[];
 
-#define fscache_object_is_active(obj)			      \
-	(!test_bit(FSCACHE_IOERROR, &(obj)->cache->flags) &&  \
-	 (obj)->state >= FSCACHE_OBJECT_AVAILABLE &&	      \
-	 (obj)->state < FSCACHE_OBJECT_DYING)
-
-#define fscache_object_is_dead(obj)				\
-	(test_bit(FSCACHE_IOERROR, &(obj)->cache->flags) &&	\
-	 (obj)->state >= FSCACHE_OBJECT_DYING)
-
 extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
 				struct fscache_cache *);
 
@@ -438,6 +429,34 @@ extern void fscache_object_destroy(struct fscache_object *object);
 #define fscache_object_destroy(object) do {} while(0)
 #endif
 
+static inline bool fscache_object_is_live(struct fscache_object *object)
+{
+	return object->state < FSCACHE_OBJECT_DYING;
+}
+
+static inline bool fscache_object_is_dying(struct fscache_object *object)
+{
+	return !fscache_object_is_live(object);
+}
+
+static inline bool fscache_object_is_available(struct fscache_object *object)
+{
+	return object->state >= FSCACHE_OBJECT_AVAILABLE;
+}
+
+static inline bool fscache_object_is_active(struct fscache_object *object)
+{
+	return fscache_object_is_available(object) &&
+		fscache_object_is_live(object) &&
+		!test_bit(FSCACHE_IOERROR, &object->cache->flags);
+}
+
+static inline bool fscache_object_is_dead(struct fscache_object *object)
+{
+	return fscache_object_is_dying(object) &&
+		test_bit(FSCACHE_IOERROR, &object->cache->flags);
+}
+
 /**
  * fscache_object_destroyed - Note destruction of an object in a cache
  * @cache: The cache from which the object came



* [PATCH 6/8] Add wait_on_atomic_t() and wake_up_atomic_t()
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
                   ` (4 preceding siblings ...)
  2013-05-03  0:33 ` [PATCH 5/8] FS-Cache: Wrap checks on object state David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 7/8] FS-Cache: Fix object state machine to have separate work and wait states David Howells
  2013-05-03  0:33 ` [PATCH 8/8] FS-Cache: Simplify cookie retention for fscache_objects, fixing access problems David Howells
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

Add wait_on_atomic_t() and wake_up_atomic_t() to allow waiting for, and
signalling, an atomic_t counter reaching zero.  This uses the bit-wait
waitqueue table.
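
A minimal usage sketch (the example_* type and function names are invented),
following the wait_on_bit()-style calling convention added below: the waiter
supplies an action function that actually sleeps, and whoever drops the
count to zero wakes the queue.

	#include <linux/atomic.h>
	#include <linux/sched.h>
	#include <linux/wait.h>

	struct example_obj {
		atomic_t	n_active;
	};

	/* Action function: sleep; returning 0 keeps waiting until the
	 * atomic_t reaches zero. */
	static int example_wait_atomic_t(atomic_t *p)
	{
		schedule();
		return 0;
	}

	static void example_put(struct example_obj *obj)
	{
		/* Wake any waiter once the count drains to zero. */
		if (atomic_dec_and_test(&obj->n_active))
			wake_up_atomic_t(&obj->n_active);
	}

	static void example_teardown(struct example_obj *obj)
	{
		/* Block until all activity on the object has gone away. */
		wait_on_atomic_t(&obj->n_active, example_wait_atomic_t,
				 TASK_UNINTERRUPTIBLE);
	}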

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/wait.h |   29 ++++++++++++++++-
 kernel/wait.c        |   85 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 7cb64d4..1379562 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -21,8 +21,12 @@ struct __wait_queue {
 };
 
 struct wait_bit_key {
-	void *flags;
+	union {
+		void *flags;
+		atomic_t *atomic_val;
+	};
 	int bit_nr;
+#define WAIT_ATOMIC_T_BIT_NR -1
 };
 
 struct wait_bit_queue {
@@ -60,6 +64,9 @@ struct task_struct;
 #define __WAIT_BIT_KEY_INITIALIZER(word, bit)				\
 	{ .flags = word, .bit_nr = bit, }
 
+#define __WAIT_ATOMIC_T_KEY_INITIALIZER(p)				\
+	{ .atomic_val = p, .bit_nr = WAIT_ATOMIC_T_BIT_NR, }
+
 extern void __init_waitqueue_head(wait_queue_head_t *q, const char *name, struct lock_class_key *);
 
 #define init_waitqueue_head(q)				\
@@ -146,8 +153,10 @@ void __wake_up_bit(wait_queue_head_t *, void *, int);
 int __wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
 int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
 void wake_up_bit(void *, int);
+void wake_up_atomic_t(atomic_t *);
 int out_of_line_wait_on_bit(void *, int, int (*)(void *), unsigned);
 int out_of_line_wait_on_bit_lock(void *, int, int (*)(void *), unsigned);
+int out_of_line_wait_on_atomic_t(atomic_t *, int (*)(atomic_t *), unsigned);
 wait_queue_head_t *bit_waitqueue(void *, int);
 
 #define wake_up(x)			__wake_up(x, TASK_NORMAL, 1, NULL)
@@ -810,5 +819,23 @@ static inline int wait_on_bit_lock(void *word, int bit,
 		return 0;
 	return out_of_line_wait_on_bit_lock(word, bit, action, mode);
 }
+
+/**
+ * wait_on_atomic_t - Wait for an atomic_t to become 0
+ * @val: The atomic value being waited on, a kernel virtual address
+ * @action: the function used to sleep, which may take special actions
+ * @mode: the task state to sleep in
+ *
+ * Wait for an atomic_t to become 0.  We abuse the bit-wait waitqueue table for
+ * the purpose of getting a waitqueue, but we set the key to a bit number
+ * outside of the target 'word'.
+ */
+static inline
+int wait_on_atomic_t(atomic_t *val, int (*action)(atomic_t *), unsigned mode)
+{
+	if (atomic_read(val) == 0)
+		return 0;
+	return out_of_line_wait_on_atomic_t(val, action, mode);
+}
 	
 #endif
diff --git a/kernel/wait.c b/kernel/wait.c
index 6698e0c..ec5b23c 100644
--- a/kernel/wait.c
+++ b/kernel/wait.c
@@ -287,3 +287,88 @@ wait_queue_head_t *bit_waitqueue(void *word, int bit)
 	return &zone->wait_table[hash_long(val, zone->wait_table_bits)];
 }
 EXPORT_SYMBOL(bit_waitqueue);
+
+/*
+ * Manipulate the atomic_t address to produce a better bit waitqueue table hash
+ * index (we're keying off bit -1, but that would produce a horrible hash
+ * value).
+ */
+static inline wait_queue_head_t *atomic_t_waitqueue(atomic_t *p)
+{
+	if (BITS_PER_LONG == 64) {
+		unsigned long q = (unsigned long)p;
+		return bit_waitqueue((void *)(q & ~1), q & 1);
+	}
+	return bit_waitqueue(p, 0);
+}
+
+static int wake_atomic_t_function(wait_queue_t *wait, unsigned mode, int sync,
+				  void *arg)
+{
+	struct wait_bit_key *key = arg;
+	struct wait_bit_queue *wait_bit
+		= container_of(wait, struct wait_bit_queue, wait);
+
+	if (wait_bit->key.atomic_val != key->atomic_val ||
+	    wait_bit->key.bit_nr != key->bit_nr ||
+	    atomic_read(key->atomic_val) != 0)
+		return 0;
+	return autoremove_wake_function(wait, mode, sync, key);
+}
+
+/*
+ * To allow interruptible waiting and asynchronous (i.e. nonblocking) waiting,
+ * the action functions passed to __wait_on_atomic_t() may return nonzero
+ * codes.  A nonzero return code halts waiting and is returned to the caller.
+ */
+static __sched
+int __wait_on_atomic_t(wait_queue_head_t *wq, struct wait_bit_queue *q,
+		       int (*action)(atomic_t *), unsigned mode)
+{
+	int ret = 0;
+
+	do {
+		prepare_to_wait(wq, &q->wait, mode);
+		if (atomic_read(q->key.atomic_val) == 0)
+			ret = (*action)(q->key.atomic_val);
+	} while (!ret && atomic_read(q->key.atomic_val) != 0);
+	finish_wait(wq, &q->wait);
+	return ret;
+}
+
+#define DEFINE_WAIT_ATOMIC_T(name, p)					\
+	struct wait_bit_queue name = {					\
+		.key = __WAIT_ATOMIC_T_KEY_INITIALIZER(p),		\
+		.wait	= {						\
+			.private	= current,			\
+			.func		= wake_atomic_t_function,	\
+			.task_list	=				\
+				LIST_HEAD_INIT((name).wait.task_list),	\
+		},							\
+	}
+
+__sched int out_of_line_wait_on_atomic_t(atomic_t *p, int (*action)(atomic_t *),
+					 unsigned mode)
+{
+	wait_queue_head_t *wq = atomic_t_waitqueue(p);
+	DEFINE_WAIT_ATOMIC_T(wait, p);
+
+	return __wait_on_atomic_t(wq, &wait, action, mode);
+}
+EXPORT_SYMBOL(out_of_line_wait_on_atomic_t);
+
+/**
+ * wake_up_atomic_t - Wake up a waiter on an atomic_t
+ * @p: The atomic_t being waited on, a kernel virtual address
+ *
+ * Wake up anyone waiting for the atomic_t to go to zero.
+ *
+ * Abuse the bit-waker function and its waitqueue hash table set (the atomic_t
+ * check is done by the waiter's wake function, not by the waker itself).
+ */
+void wake_up_atomic_t(atomic_t *p)
+{
+	__wake_up_bit(atomic_t_waitqueue(p), p, WAIT_ATOMIC_T_BIT_NR);
+}
+EXPORT_SYMBOL(wake_up_atomic_t);



* [PATCH 7/8] FS-Cache: Fix object state machine to have separate work and wait states
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
                   ` (5 preceding siblings ...)
  2013-05-03  0:33 ` [PATCH 6/8] Add wait_on_atomic_t() and wake_up_atomic_t() David Howells
@ 2013-05-03  0:33 ` David Howells
  2013-05-03  0:33 ` [PATCH 8/8] FS-Cache: Simplify cookie retention for fscache_objects, fixing access problems David Howells
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

Fix object state machine to have separate work and wait states as that makes
it easier to envision.

There are now three kinds of state:

 (1) Work state.  This is an execution state.  No event processing is performed
     by a work state.  The function attached to a work state returns a pointer
     indicating the next state to which the OSM should transition.  Returning
     NO_TRANSIT repeats the current state, but goes back to the scheduler
     first.

 (2) Wait state.  This is an event processing state.  No execution is
     performed by a wait state.  Wait states are just tables of "if event X
     occurs, clear it and transition to state Y".  The dispatcher returns to
     the scheduler if none of the events in which the wait state has an
     interest are currently pending.

 (3) Out-of-band state.  This is a special work state.  Transitions to normal
     states can be overridden when an unexpected event occurs (eg. I/O error).
     Instead the dispatcher disables and clears the OOB event and transits to
     the specified work state.  This then acts as an ordinary work state,
     though object->state points to the overridden destination.  Returning
     NO_TRANSIT resumes the overridden transition.

In addition, the states have names in their definitions, so there's no need for
tables of state names.  Further, the EV_REQUEUE event is no longer necessary as
that is automatic for work states.

Since the states are now separate structs rather than values in an enum, it's
not possible to use comparisons other than (non-)equality between them, so use
some object->flags to indicate what phase an object is in.

The EV_RELEASE, EV_RETIRE and EV_WITHDRAW events have been squished into one
(EV_KILL).  An object flag now carries the information about retirement.

Similarly, the RELEASING, RECYCLING and WITHDRAWING states have been merged
into a KILL_OBJECT state, and additional states have been added for handling
waiting dependent objects (JUMPSTART_DEPS and KILL_DEPENDENTS).

A state has also been added for synchronising with parent object initialisation
(WAIT_FOR_PARENT) and another for initiating look up (PARENT_READY).
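
To make the shape of the new machine concrete, here is a sketch (not taken
from the patch; do_example_work() is invented) of a work-state handler and
its declaration using the WORK_STATE()/TRANSIT_TO() helpers defined in
object.c below:

	/* Do one unit of backend work, then either retry later (NO_TRANSIT
	 * requeues the object via the scheduler) or hand control to the
	 * WAIT_FOR_CMD wait state, whose transition table consumes the next
	 * event. */
	static const struct fscache_state *example_work(struct fscache_object *object,
							int event)
	{
		if (!do_example_work(object))	/* invented helper */
			return NO_TRANSIT;

		return transit_to(WAIT_FOR_CMD);
	}

	static WORK_STATE(EXAMPLE_WORK, "EXWK", example_work);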

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c     |    2 
 fs/cachefiles/namei.c         |    4 
 fs/fscache/cache.c            |   32 +
 fs/fscache/cookie.c           |    9 
 fs/fscache/internal.h         |    8 
 fs/fscache/object-list.c      |   10 
 fs/fscache/object.c           |  987 +++++++++++++++++++++--------------------
 fs/fscache/operation.c        |   22 -
 fs/fscache/page.c             |   11 
 include/linux/fscache-cache.h |   66 +--
 10 files changed, 575 insertions(+), 576 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 746ce53..3d76321 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -263,7 +263,7 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 #endif
 
 	/* delete retired objects */
-	if (object->fscache.state == FSCACHE_OBJECT_RECYCLING &&
+	if (test_bit(FSCACHE_OBJECT_RETIRE, &object->fscache.flags) &&
 	    _object != cache->cache.fsdef
 	    ) {
 		_debug("- retire object OBJ%x", object->fscache.debug_id);
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 01979a3..25badd1 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -38,7 +38,7 @@ void __cachefiles_printk_object(struct cachefiles_object *object,
 	printk(KERN_ERR "%sobject: OBJ%x\n",
 	       prefix, object->fscache.debug_id);
 	printk(KERN_ERR "%sobjstate=%s fl=%lx wbusy=%x ev=%lx[%lx]\n",
-	       prefix, fscache_object_states[object->fscache.state],
+	       prefix, object->fscache.state->name,
 	       object->fscache.flags, work_busy(&object->fscache.work),
 	       object->fscache.events, object->fscache.event_mask);
 	printk(KERN_ERR "%sops=%u inp=%u exc=%u\n",
@@ -127,7 +127,7 @@ static void cachefiles_mark_object_buried(struct cachefiles_cache *cache,
 found_dentry:
 	kdebug("preemptive burial: OBJ%x [%s] %p",
 	       object->fscache.debug_id,
-	       fscache_object_states[object->fscache.state],
+	       object->fscache.state->name,
 	       dentry);
 
 	if (fscache_object_is_live(&object->fscache)) {
diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c
index 129ea53..f7cff36 100644
--- a/fs/fscache/cache.c
+++ b/fs/fscache/cache.c
@@ -224,8 +224,10 @@ int fscache_add_cache(struct fscache_cache *cache,
 	BUG_ON(!ifsdef);
 
 	cache->flags = 0;
-	ifsdef->event_mask = ULONG_MAX & ~(1 << FSCACHE_OBJECT_EV_CLEARED);
-	ifsdef->state = FSCACHE_OBJECT_ACTIVE;
+	ifsdef->event_mask =
+		((1 << NR_FSCACHE_OBJECT_EVENTS) - 1) &
+		~(1 << FSCACHE_OBJECT_EV_CLEARED);
+	__set_bit(FSCACHE_OBJECT_IS_AVAILABLE, &ifsdef->flags);
 
 	if (!tagname)
 		tagname = cache->identifier;
@@ -330,25 +332,25 @@ static void fscache_withdraw_all_objects(struct fscache_cache *cache,
 {
 	struct fscache_object *object;
 
-	spin_lock(&cache->object_list_lock);
-
 	while (!list_empty(&cache->object_list)) {
-		object = list_entry(cache->object_list.next,
-				    struct fscache_object, cache_link);
-		list_move_tail(&object->cache_link, dying_objects);
+		spin_lock(&cache->object_list_lock);
 
-		_debug("withdraw %p", object->cookie);
+		if (!list_empty(&cache->object_list)) {
+			object = list_entry(cache->object_list.next,
+					    struct fscache_object, cache_link);
+			list_move_tail(&object->cache_link, dying_objects);
 
-		spin_lock(&object->lock);
-		spin_unlock(&cache->object_list_lock);
-		fscache_raise_event(object, FSCACHE_OBJECT_EV_WITHDRAW);
-		spin_unlock(&object->lock);
+			_debug("withdraw %p", object->cookie);
+
+			/* This must be done under object_list_lock to prevent
+			 * a race with fscache_drop_object().
+			 */
+			fscache_raise_event(object, FSCACHE_OBJECT_EV_KILL);
+		}
 
+		spin_unlock(&cache->object_list_lock);
 		cond_resched();
-		spin_lock(&cache->object_list_lock);
 	}
-
-	spin_unlock(&cache->object_list_lock);
 }
 
 /**
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index a5f36c9..eee4366 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -205,7 +205,7 @@ static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
 
 	/* initiate the process of looking up all the objects in the chain
 	 * (done by fscache_initialise_object()) */
-	fscache_enqueue_object(object);
+	fscache_raise_event(object, FSCACHE_OBJECT_EV_NEW_CHILD);
 
 	spin_unlock(&cookie->lock);
 
@@ -469,7 +469,6 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie, int retire)
 {
 	struct fscache_cache *cache;
 	struct fscache_object *object;
-	unsigned long event;
 
 	fscache_stat(&fscache_n_relinquishes);
 	if (retire)
@@ -497,8 +496,6 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie, int retire)
 			    fscache_wait_bit, TASK_UNINTERRUPTIBLE);
 	}
 
-	event = retire ? FSCACHE_OBJECT_EV_RETIRE : FSCACHE_OBJECT_EV_RELEASE;
-
 try_again:
 	spin_lock(&cookie->lock);
 
@@ -533,7 +530,9 @@ try_again:
 
 		cache = object->cache;
 		object->cookie = NULL;
-		fscache_raise_event(object, event);
+		if (retire)
+			set_bit(FSCACHE_OBJECT_RETIRE, &object->flags);
+		fscache_raise_event(object, FSCACHE_OBJECT_EV_KILL);
 		spin_unlock(&object->lock);
 
 		if (atomic_dec_and_test(&cookie->usage))
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index ee38fef..3322d3c 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -97,10 +97,6 @@ extern int fscache_wait_bit_interruptible(void *);
 /*
  * object.c
  */
-extern const char fscache_object_states_short[FSCACHE_OBJECT__NSTATES][5];
-
-extern void fscache_withdrawing_object(struct fscache_cache *,
-				       struct fscache_object *);
 extern void fscache_enqueue_object(struct fscache_object *);
 
 /*
@@ -291,6 +287,10 @@ static inline void fscache_raise_event(struct fscache_object *object,
 				       unsigned event)
 {
 	BUG_ON(event >= NR_FSCACHE_OBJECT_EVENTS);
+#if 0
+	printk("*** fscache_raise_event(OBJ%d{%lx},%x)\n",
+	       object->debug_id, object->event_mask, (1 << event));
+#endif
 	if (!test_and_set_bit(event, &object->events) &&
 	    test_bit(event, &object->event_mask))
 		fscache_enqueue_object(object);
diff --git a/fs/fscache/object-list.c b/fs/fscache/object-list.c
index f27c89d..4a386b0 100644
--- a/fs/fscache/object-list.c
+++ b/fs/fscache/object-list.c
@@ -174,7 +174,7 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 
 	if ((unsigned long) v == 1) {
 		seq_puts(m, "OBJECT   PARENT   STAT CHLDN OPS OOP IPR EX READS"
-			 " EM EV F S"
+			 " EM EV FL S"
 			 " | NETFS_COOKIE_DEF TY FL NETFS_DATA");
 		if (config & (FSCACHE_OBJLIST_CONFIG_KEY |
 			      FSCACHE_OBJLIST_CONFIG_AUX))
@@ -193,7 +193,7 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 
 	if ((unsigned long) v == 2) {
 		seq_puts(m, "======== ======== ==== ===== === === === == ====="
-			 " == == = ="
+			 " == == == ="
 			 " | ================ == == ================");
 		if (config & (FSCACHE_OBJLIST_CONFIG_KEY |
 			      FSCACHE_OBJLIST_CONFIG_AUX))
@@ -219,7 +219,7 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 	if (~config) {
 		FILTER(obj->cookie,
 		       COOKIE, NOCOOKIE);
-		FILTER(obj->state != FSCACHE_OBJECT_ACTIVE ||
+		FILTER(fscache_object_is_active(obj) ||
 		       obj->n_ops != 0 ||
 		       obj->n_obj_ops != 0 ||
 		       obj->flags ||
@@ -235,10 +235,10 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 	}
 
 	seq_printf(m,
-		   "%8x %8x %s %5u %3u %3u %3u %2u %5u %2lx %2lx %1lx %1x | ",
+		   "%8x %8x %s %5u %3u %3u %3u %2u %5u %2lx %2lx %2lx %1x | ",
 		   obj->debug_id,
 		   obj->parent ? obj->parent->debug_id : -1,
-		   fscache_object_states_short[obj->state],
+		   obj->state->short_name,
 		   obj->n_children,
 		   obj->n_ops,
 		   obj->n_obj_ops,
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 863f687..c2d9de0 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -17,50 +17,102 @@
 #include <linux/slab.h>
 #include "internal.h"
 
-const char *fscache_object_states[FSCACHE_OBJECT__NSTATES] = {
-	[FSCACHE_OBJECT_INIT]		= "OBJECT_INIT",
-	[FSCACHE_OBJECT_LOOKING_UP]	= "OBJECT_LOOKING_UP",
-	[FSCACHE_OBJECT_CREATING]	= "OBJECT_CREATING",
-	[FSCACHE_OBJECT_AVAILABLE]	= "OBJECT_AVAILABLE",
-	[FSCACHE_OBJECT_ACTIVE]		= "OBJECT_ACTIVE",
-	[FSCACHE_OBJECT_INVALIDATING]	= "OBJECT_INVALIDATING",
-	[FSCACHE_OBJECT_UPDATING]	= "OBJECT_UPDATING",
-	[FSCACHE_OBJECT_DYING]		= "OBJECT_DYING",
-	[FSCACHE_OBJECT_LC_DYING]	= "OBJECT_LC_DYING",
-	[FSCACHE_OBJECT_ABORT_INIT]	= "OBJECT_ABORT_INIT",
-	[FSCACHE_OBJECT_RELEASING]	= "OBJECT_RELEASING",
-	[FSCACHE_OBJECT_RECYCLING]	= "OBJECT_RECYCLING",
-	[FSCACHE_OBJECT_WITHDRAWING]	= "OBJECT_WITHDRAWING",
-	[FSCACHE_OBJECT_DEAD]		= "OBJECT_DEAD",
+static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *, int);
+static const struct fscache_state *fscache_kill_dependents(struct fscache_object *, int);
+static const struct fscache_state *fscache_drop_object(struct fscache_object *, int);
+static const struct fscache_state *fscache_initialise_object(struct fscache_object *, int);
+static const struct fscache_state *fscache_invalidate_object(struct fscache_object *, int);
+static const struct fscache_state *fscache_jumpstart_dependents(struct fscache_object *, int);
+static const struct fscache_state *fscache_kill_object(struct fscache_object *, int);
+static const struct fscache_state *fscache_lookup_failure(struct fscache_object *, int);
+static const struct fscache_state *fscache_look_up_object(struct fscache_object *, int);
+static const struct fscache_state *fscache_object_available(struct fscache_object *, int);
+static const struct fscache_state *fscache_parent_ready(struct fscache_object *, int);
+static const struct fscache_state *fscache_update_object(struct fscache_object *, int);
+static const struct fscache_state *fscache_detach_from_cookie(struct fscache_object *, int);
+
+#define __STATE_NAME(n) fscache_osm_##n
+#define STATE(n) (&__STATE_NAME(n))
+
+#define WORK_STATE(n, sn, f) \
+	const struct fscache_state __STATE_NAME(n) = {			\
+		.name = #n,						\
+		.short_name = sn,					\
+		.work = f						\
+	}
+#define WAIT_STATE(n, sn, ...) \
+	const struct fscache_state __STATE_NAME(n) = {			\
+		.name = #n,						\
+		.short_name = sn,					\
+		.work = NULL,						\
+		.transitions = { __VA_ARGS__, { 0, NULL } }		\
+	}
+
+#define TRANSIT_TO(state, emask) \
+	{ .events = (emask), .transit_to = STATE(state) }
+
+#define NO_TRANSIT ((struct fscache_state *)NULL)
+
+#define transit_to(state) ({ prefetch(STATE(state)); STATE(state); })
+
+
+static WORK_STATE(INIT_OBJECT,		"INIT", fscache_initialise_object);
+static WORK_STATE(PARENT_READY,		"PRDY", fscache_parent_ready);
+static WORK_STATE(ABORT_INIT,		"ABRT", fscache_abort_initialisation);
+static WORK_STATE(LOOK_UP_OBJECT,	"LOOK", fscache_look_up_object);
+static WORK_STATE(CREATE_OBJECT,	"CRTO", fscache_look_up_object);
+static WORK_STATE(OBJECT_AVAILABLE,	"AVBL", fscache_object_available);
+static WORK_STATE(JUMPSTART_DEPS,	"JUMP", fscache_jumpstart_dependents);
+
+static WORK_STATE(INVALIDATE_OBJECT,	"INVL", fscache_invalidate_object);
+static WORK_STATE(UPDATE_OBJECT,	"UPDT", fscache_update_object);
+
+static WORK_STATE(LOOKUP_FAILURE,	"LCFL", fscache_lookup_failure);
+static WORK_STATE(KILL_OBJECT,		"KILL", fscache_kill_object);
+static WORK_STATE(KILL_DEPENDENTS,	"KDEP", fscache_kill_dependents);
+static WORK_STATE(DROP_OBJECT,		"DROP", fscache_drop_object);
+static WORK_STATE(DETACH_FROM_COOKIE,	"DTCH", fscache_detach_from_cookie);
+static WORK_STATE(OBJECT_DEAD,		"DEAD", (void*)2UL);
+
+static WAIT_STATE(WAIT_FOR_INIT,	"?INI",
+		  TRANSIT_TO(INIT_OBJECT,	1 << FSCACHE_OBJECT_EV_NEW_CHILD));
+
+static WAIT_STATE(WAIT_FOR_PARENT,	"?PRN",
+		  TRANSIT_TO(PARENT_READY,	1 << FSCACHE_OBJECT_EV_PARENT_READY));
+
+static WAIT_STATE(WAIT_FOR_CMD,		"?CMD",
+		  TRANSIT_TO(INVALIDATE_OBJECT,	1 << FSCACHE_OBJECT_EV_INVALIDATE),
+		  TRANSIT_TO(UPDATE_OBJECT,	1 << FSCACHE_OBJECT_EV_UPDATE),
+		  TRANSIT_TO(JUMPSTART_DEPS,	1 << FSCACHE_OBJECT_EV_NEW_CHILD));
+
+static WAIT_STATE(WAIT_FOR_CLEARANCE,	"?CLR",
+		  TRANSIT_TO(KILL_OBJECT,	1 << FSCACHE_OBJECT_EV_CLEARED));
+
+/* How to deal with unexpected events */
+static const struct fscache_transition fscache_osm_init_oob[] = {
+	   TRANSIT_TO(ABORT_INIT,
+		      (1 << FSCACHE_OBJECT_EV_ERROR) |
+		      (1 << FSCACHE_OBJECT_EV_KILL)),
+	   { 0, NULL }
 };
-EXPORT_SYMBOL(fscache_object_states);
-
-const char fscache_object_states_short[FSCACHE_OBJECT__NSTATES][5] = {
-	[FSCACHE_OBJECT_INIT]		= "INIT",
-	[FSCACHE_OBJECT_LOOKING_UP]	= "LOOK",
-	[FSCACHE_OBJECT_CREATING]	= "CRTN",
-	[FSCACHE_OBJECT_AVAILABLE]	= "AVBL",
-	[FSCACHE_OBJECT_ACTIVE]		= "ACTV",
-	[FSCACHE_OBJECT_INVALIDATING]	= "INVL",
-	[FSCACHE_OBJECT_UPDATING]	= "UPDT",
-	[FSCACHE_OBJECT_DYING]		= "DYNG",
-	[FSCACHE_OBJECT_LC_DYING]	= "LCDY",
-	[FSCACHE_OBJECT_ABORT_INIT]	= "ABTI",
-	[FSCACHE_OBJECT_RELEASING]	= "RELS",
-	[FSCACHE_OBJECT_RECYCLING]	= "RCYC",
-	[FSCACHE_OBJECT_WITHDRAWING]	= "WTHD",
-	[FSCACHE_OBJECT_DEAD]		= "DEAD",
+
+static const struct fscache_transition fscache_osm_lookup_oob[] = {
+	   TRANSIT_TO(LOOKUP_FAILURE,
+		      (1 << FSCACHE_OBJECT_EV_ERROR) |
+		      (1 << FSCACHE_OBJECT_EV_KILL)),
+	   { 0, NULL }
+};
+
+static const struct fscache_transition fscache_osm_run_oob[] = {
+	   TRANSIT_TO(KILL_OBJECT,
+		      (1 << FSCACHE_OBJECT_EV_ERROR) |
+		      (1 << FSCACHE_OBJECT_EV_KILL)),
+	   { 0, NULL }
 };
 
 static int  fscache_get_object(struct fscache_object *);
 static void fscache_put_object(struct fscache_object *);
-static void fscache_initialise_object(struct fscache_object *);
-static void fscache_lookup_object(struct fscache_object *);
-static void fscache_object_available(struct fscache_object *);
-static void fscache_invalidate_object(struct fscache_object *);
-static void fscache_release_object(struct fscache_object *);
-static void fscache_withdraw_object(struct fscache_object *);
-static void fscache_enqueue_dependents(struct fscache_object *);
+static bool fscache_enqueue_dependents(struct fscache_object *, int);
 static void fscache_dequeue_object(struct fscache_object *);
 
 /*
@@ -83,281 +135,102 @@ static inline void fscache_done_parent_op(struct fscache_object *object)
 }
 
 /*
- * Notify netfs of invalidation completion.
+ * Object state machine dispatcher.
  */
-static inline void fscache_invalidation_complete(struct fscache_cookie *cookie)
+static void fscache_object_sm_dispatcher(struct fscache_object *object)
 {
-	if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
-}
-
-/*
- * process events that have been sent to an object's state machine
- * - initiates parent lookup
- * - does object lookup
- * - does object creation
- * - does object recycling and retirement
- * - does object withdrawal
- */
-static void fscache_object_state_machine(struct fscache_object *object)
-{
-	enum fscache_object_state new_state;
-	struct fscache_cookie *cookie;
-	int event;
+	const struct fscache_transition *t;
+	const struct fscache_state *state, *new_state;
+	unsigned long events, event_mask;
+	int event = -1;
 
 	ASSERT(object != NULL);
 
 	_enter("{OBJ%x,%s,%lx}",
-	       object->debug_id, fscache_object_states[object->state],
-	       object->events);
-
-	switch (object->state) {
-		/* wait for the parent object to become ready */
-	case FSCACHE_OBJECT_INIT:
-		object->event_mask =
-			FSCACHE_OBJECT_EVENTS_MASK &
-			~(1 << FSCACHE_OBJECT_EV_CLEARED);
-		fscache_initialise_object(object);
-		goto done;
-
-		/* look up the object metadata on disk */
-	case FSCACHE_OBJECT_LOOKING_UP:
-		fscache_lookup_object(object);
-		goto lookup_transit;
-
-		/* create the object metadata on disk */
-	case FSCACHE_OBJECT_CREATING:
-		fscache_lookup_object(object);
-		goto lookup_transit;
-
-		/* handle an object becoming available; start pending
-		 * operations and queue dependent operations for processing */
-	case FSCACHE_OBJECT_AVAILABLE:
-		fscache_object_available(object);
-		goto active_transit;
-
-		/* normal running state */
-	case FSCACHE_OBJECT_ACTIVE:
-		goto active_transit;
-
-		/* Invalidate an object on disk */
-	case FSCACHE_OBJECT_INVALIDATING:
-		clear_bit(FSCACHE_OBJECT_EV_INVALIDATE, &object->events);
-		fscache_stat(&fscache_n_invalidates_run);
-		fscache_stat(&fscache_n_cop_invalidate_object);
-		fscache_invalidate_object(object);
-		fscache_stat_d(&fscache_n_cop_invalidate_object);
-		fscache_raise_event(object, FSCACHE_OBJECT_EV_UPDATE);
-		goto active_transit;
-
-		/* update the object metadata on disk */
-	case FSCACHE_OBJECT_UPDATING:
-		clear_bit(FSCACHE_OBJECT_EV_UPDATE, &object->events);
-		fscache_stat(&fscache_n_updates_run);
-		fscache_stat(&fscache_n_cop_update_object);
-		object->cache->ops->update_object(object);
-		fscache_stat_d(&fscache_n_cop_update_object);
-		goto active_transit;
-
-		/* handle an object dying during lookup or creation */
-	case FSCACHE_OBJECT_LC_DYING:
-		object->event_mask &= ~(1 << FSCACHE_OBJECT_EV_UPDATE);
-		fscache_stat(&fscache_n_cop_lookup_complete);
-		object->cache->ops->lookup_complete(object);
-		fscache_stat_d(&fscache_n_cop_lookup_complete);
-
-		spin_lock(&object->lock);
-		object->state = FSCACHE_OBJECT_DYING;
-		cookie = object->cookie;
-		if (cookie) {
-			if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP,
-					       &cookie->flags))
-				wake_up_bit(&cookie->flags,
-					    FSCACHE_COOKIE_LOOKING_UP);
-			if (test_and_clear_bit(FSCACHE_COOKIE_CREATING,
-					       &cookie->flags))
-				wake_up_bit(&cookie->flags,
-					    FSCACHE_COOKIE_CREATING);
+	       object->debug_id, object->state->name, object->events);
+
+	event_mask = object->event_mask;
+restart:
+	object->event_mask = 0; /* Mask normal event handling */
+	state = object->state;
+restart_masked:
+	events = object->events;
+
+	/* Handle any out-of-band events (typically an error) */
+	if (events & object->oob_event_mask) {
+		_debug("{OBJ%x} oob %lx",
+		       object->debug_id, events & object->oob_event_mask);
+		for (t = object->oob_table; t->events; t++) {
+			if (events & t->events) {
+				state = t->transit_to;
+				ASSERT(state->work != NULL);
+				event = fls(events & t->events) - 1;
+				__clear_bit(event, &object->oob_event_mask);
+				clear_bit(event, &object->events);
+				goto execute_work_state;
+			}
 		}
-		spin_unlock(&object->lock);
+	}
 
-		fscache_done_parent_op(object);
+	/* Wait states are just transition tables */
+	if (!state->work) {
+		if (events & event_mask) {
+			for (t = state->transitions; t->events; t++) {
+				if (events & t->events) {
+					new_state = t->transit_to;
+					event = fls(events & t->events) - 1;
+					clear_bit(event, &object->events);
+					_debug("{OBJ%x} ev %d: %s -> %s",
+					       object->debug_id, event,
+					       state->name, new_state->name);
+					object->state = state = new_state;
+					goto execute_work_state;
+				}
+			}
 
-		/* wait for completion of all active operations on this object
-		 * and the death of all child objects of this object */
-	case FSCACHE_OBJECT_DYING:
-	dying:
-		clear_bit(FSCACHE_OBJECT_EV_CLEARED, &object->events);
-		spin_lock(&object->lock);
-		_debug("dying OBJ%x {%d,%d}",
-		       object->debug_id, object->n_ops, object->n_children);
-		if (object->n_ops == 0 && object->n_children == 0) {
-			object->event_mask &=
-				~(1 << FSCACHE_OBJECT_EV_CLEARED);
-			object->event_mask |=
-				(1 << FSCACHE_OBJECT_EV_WITHDRAW) |
-				(1 << FSCACHE_OBJECT_EV_RETIRE) |
-				(1 << FSCACHE_OBJECT_EV_RELEASE) |
-				(1 << FSCACHE_OBJECT_EV_ERROR);
-		} else {
-			object->event_mask &=
-				~((1 << FSCACHE_OBJECT_EV_WITHDRAW) |
-				  (1 << FSCACHE_OBJECT_EV_RETIRE) |
-				  (1 << FSCACHE_OBJECT_EV_RELEASE) |
-				  (1 << FSCACHE_OBJECT_EV_ERROR));
-			object->event_mask |=
-				1 << FSCACHE_OBJECT_EV_CLEARED;
+			/* The event mask didn't include all the tabled bits */
+			BUG();
 		}
-		spin_unlock(&object->lock);
-		fscache_enqueue_dependents(object);
-		fscache_start_operations(object);
-		goto terminal_transit;
-
-		/* handle an abort during initialisation */
-	case FSCACHE_OBJECT_ABORT_INIT:
-		_debug("handle abort init %lx", object->events);
-		object->event_mask &= ~(1 << FSCACHE_OBJECT_EV_UPDATE);
-
-		spin_lock(&object->lock);
-		fscache_dequeue_object(object);
-
-		object->state = FSCACHE_OBJECT_DYING;
-		if (test_and_clear_bit(FSCACHE_COOKIE_CREATING,
-				       &object->cookie->flags))
-			wake_up_bit(&object->cookie->flags,
-				    FSCACHE_COOKIE_CREATING);
-		spin_unlock(&object->lock);
-		goto dying;
-
-		/* handle the netfs releasing an object and possibly marking it
-		 * obsolete too */
-	case FSCACHE_OBJECT_RELEASING:
-	case FSCACHE_OBJECT_RECYCLING:
-		object->event_mask &=
-			~((1 << FSCACHE_OBJECT_EV_WITHDRAW) |
-			  (1 << FSCACHE_OBJECT_EV_RETIRE) |
-			  (1 << FSCACHE_OBJECT_EV_RELEASE) |
-			  (1 << FSCACHE_OBJECT_EV_ERROR));
-		fscache_release_object(object);
-		spin_lock(&object->lock);
-		object->state = FSCACHE_OBJECT_DEAD;
-		spin_unlock(&object->lock);
-		fscache_stat(&fscache_n_object_dead);
-		goto terminal_transit;
-
-		/* handle the parent cache of this object being withdrawn from
-		 * active service */
-	case FSCACHE_OBJECT_WITHDRAWING:
-		object->event_mask &=
-			~((1 << FSCACHE_OBJECT_EV_WITHDRAW) |
-			  (1 << FSCACHE_OBJECT_EV_RETIRE) |
-			  (1 << FSCACHE_OBJECT_EV_RELEASE) |
-			  (1 << FSCACHE_OBJECT_EV_ERROR));
-		fscache_withdraw_object(object);
-		spin_lock(&object->lock);
-		object->state = FSCACHE_OBJECT_DEAD;
-		spin_unlock(&object->lock);
-		fscache_stat(&fscache_n_object_dead);
-		goto terminal_transit;
-
-		/* complain about the object being woken up once it is
-		 * deceased */
-	case FSCACHE_OBJECT_DEAD:
-		printk(KERN_ERR "FS-Cache:"
-		       " Unexpected event in dead state %lx\n",
-		       object->events & object->event_mask);
-		BUG();
-
-	default:
-		printk(KERN_ERR "FS-Cache: Unknown object state %u\n",
-		       object->state);
-		BUG();
-	}
-
-	/* determine the transition from a lookup state */
-lookup_transit:
-	event = fls(object->events & object->event_mask) - 1;
-	switch (event) {
-	case FSCACHE_OBJECT_EV_WITHDRAW:
-	case FSCACHE_OBJECT_EV_RETIRE:
-	case FSCACHE_OBJECT_EV_RELEASE:
-	case FSCACHE_OBJECT_EV_ERROR:
-		new_state = FSCACHE_OBJECT_LC_DYING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_INVALIDATE:
-		new_state = FSCACHE_OBJECT_INVALIDATING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_REQUEUE:
-		goto done;
-	case -1:
-		goto done; /* sleep until event */
-	default:
-		goto unsupported_event;
+		/* Randomly woke up */
+		goto unmask_events;
 	}
 
-	/* determine the transition from an active state */
-active_transit:
-	event = fls(object->events & object->event_mask) - 1;
-	switch (event) {
-	case FSCACHE_OBJECT_EV_WITHDRAW:
-	case FSCACHE_OBJECT_EV_RETIRE:
-	case FSCACHE_OBJECT_EV_RELEASE:
-	case FSCACHE_OBJECT_EV_ERROR:
-		new_state = FSCACHE_OBJECT_DYING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_INVALIDATE:
-		new_state = FSCACHE_OBJECT_INVALIDATING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_UPDATE:
-		new_state = FSCACHE_OBJECT_UPDATING;
-		goto change_state;
-	case -1:
-		new_state = FSCACHE_OBJECT_ACTIVE;
-		goto change_state; /* sleep until event */
-	default:
-		goto unsupported_event;
-	}
+execute_work_state:
+	_debug("{OBJ%x} exec %s", object->debug_id, state->name);
 
-	/* determine the transition from a terminal state */
-terminal_transit:
-	event = fls(object->events & object->event_mask) - 1;
-	switch (event) {
-	case FSCACHE_OBJECT_EV_WITHDRAW:
-		new_state = FSCACHE_OBJECT_WITHDRAWING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_RETIRE:
-		new_state = FSCACHE_OBJECT_RECYCLING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_RELEASE:
-		new_state = FSCACHE_OBJECT_RELEASING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_ERROR:
-		new_state = FSCACHE_OBJECT_WITHDRAWING;
-		goto change_state;
-	case FSCACHE_OBJECT_EV_CLEARED:
-		new_state = FSCACHE_OBJECT_DYING;
-		goto change_state;
-	case -1:
-		goto done; /* sleep until event */
-	default:
-		goto unsupported_event;
+	new_state = state->work(object, event);
+	event = -1;
+	if (new_state == NO_TRANSIT) {
+		_debug("{OBJ%x} %s notrans", object->debug_id, state->name);
+		fscache_enqueue_object(object);
+		event_mask = object->oob_event_mask;
+		goto unmask_events;
 	}
 
-change_state:
-	spin_lock(&object->lock);
-	object->state = new_state;
-	spin_unlock(&object->lock);
+	_debug("{OBJ%x} %s -> %s",
+	       object->debug_id, state->name, new_state->name);
+	object->state = state = new_state;
 
-done:
-	_leave(" [->%s]", fscache_object_states[object->state]);
-	return;
+	if (state->work) {
+		if (unlikely(state->work == ((void *)2UL))) {
+			_leave(" [dead]");
+			return;
+		}
+		goto restart_masked;
+	}
 
-unsupported_event:
-	printk(KERN_ERR "FS-Cache:"
-	       " Unsupported event %d [%lx/%lx] in state %s\n",
-	       event, object->events, object->event_mask,
-	       fscache_object_states[object->state]);
-	BUG();
+	/* Transited to wait state */
+	event_mask = object->oob_event_mask;
+	for (t = state->transitions; t->events; t++)
+		event_mask |= t->events;
+
+unmask_events:
+	object->event_mask = event_mask;
+	smp_mb();
+	events = object->events;
+	if (events & event_mask)
+		goto restart;
+	_leave(" [msk %lx]", event_mask);
 }
 
 /*
@@ -372,11 +245,8 @@ static void fscache_object_work_func(struct work_struct *work)
 	_enter("{OBJ%x}", object->debug_id);
 
 	start = jiffies;
-	fscache_object_state_machine(object);
+	fscache_object_sm_dispatcher(object);
 	fscache_hist(fscache_objs_histogram, start);
-	if (object->events & object->event_mask)
-		fscache_enqueue_object(object);
-	clear_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
 	fscache_put_object(object);
 }
 
@@ -395,9 +265,13 @@ void fscache_object_init(struct fscache_object *object,
 			 struct fscache_cookie *cookie,
 			 struct fscache_cache *cache)
 {
+	const struct fscache_transition *t;
+
 	atomic_inc(&cache->object_count);
 
-	object->state = FSCACHE_OBJECT_INIT;
+	object->state = STATE(WAIT_FOR_INIT);
+	object->oob_table = fscache_osm_init_oob;
+	object->flags = 1 << FSCACHE_OBJECT_IS_LIVE;
 	spin_lock_init(&object->lock);
 	INIT_LIST_HEAD(&object->cache_link);
 	INIT_HLIST_NODE(&object->cookie_link);
@@ -407,17 +281,48 @@ void fscache_object_init(struct fscache_object *object,
 	INIT_LIST_HEAD(&object->pending_ops);
 	object->n_children = 0;
 	object->n_ops = object->n_in_progress = object->n_exclusive = 0;
-	object->events = object->event_mask = 0;
-	object->flags = 0;
+	object->events = 0;
 	object->store_limit = 0;
 	object->store_limit_l = 0;
 	object->cache = cache;
 	object->cookie = cookie;
 	object->parent = NULL;
+
+	object->oob_event_mask = 0;
+	for (t = object->oob_table; t->events; t++)
+		object->oob_event_mask |= t->events;
+	object->event_mask = object->oob_event_mask;
+	for (t = object->state->transitions; t->events; t++)
+		object->event_mask |= t->events;
 }
 EXPORT_SYMBOL(fscache_object_init);
 
 /*
+ * Abort object initialisation before we start it.
+ */
+static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *object,
+								int event)
+{
+	struct fscache_cookie *cookie;
+
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
+	object->oob_event_mask = 0;
+	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+
+	fscache_dequeue_object(object);
+
+	spin_lock(&object->lock);
+	cookie = object->cookie;
+	clear_bit_unlock(FSCACHE_COOKIE_CREATING, &cookie->flags);
+	spin_unlock(&object->lock);
+
+	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
+
+	return transit_to(KILL_OBJECT);
+}
+
+/*
  * initialise an object
  * - check the specified object's parent to see if we can make use of it
  *   immediately to do a creation
@@ -426,74 +331,78 @@ EXPORT_SYMBOL(fscache_object_init);
  * - an object's cookie is pinned until we clear FSCACHE_COOKIE_CREATING on the
  *   leaf-most cookies of the object and all its children
  */
-static void fscache_initialise_object(struct fscache_object *object)
+static const struct fscache_state *fscache_initialise_object(struct fscache_object *object,
+							     int event)
 {
 	struct fscache_object *parent;
+	bool success;
 
-	_enter("");
-	ASSERT(object->cookie != NULL);
-	ASSERT(object->cookie->parent != NULL);
-
-	if (object->events & ((1 << FSCACHE_OBJECT_EV_ERROR) |
-			      (1 << FSCACHE_OBJECT_EV_RELEASE) |
-			      (1 << FSCACHE_OBJECT_EV_RETIRE) |
-			      (1 << FSCACHE_OBJECT_EV_WITHDRAW))) {
-		_debug("abort init %lx", object->events);
-		spin_lock(&object->lock);
-		object->state = FSCACHE_OBJECT_ABORT_INIT;
-		spin_unlock(&object->lock);
-		return;
-	}
+	_enter("{OBJ%x},%d", object->debug_id, event);
 
-	spin_lock(&object->cookie->lock);
-	spin_lock_nested(&object->cookie->parent->lock, 1);
+	ASSERT(list_empty(&object->dep_link));
 
 	parent = object->parent;
 	if (!parent) {
-		_debug("no parent");
-		set_bit(FSCACHE_OBJECT_EV_WITHDRAW, &object->events);
-	} else {
-		spin_lock(&object->lock);
-		spin_lock_nested(&parent->lock, 1);
-		_debug("parent %s", fscache_object_states[parent->state]);
-
-		if (fscache_object_is_dying(parent)) {
-			_debug("bad parent");
-			set_bit(FSCACHE_OBJECT_EV_WITHDRAW, &object->events);
-		} else if (!fscache_object_is_available(parent)) {
-			_debug("wait");
-
-			/* we may get woken up in this state by child objects
-			 * binding on to us, so we need to make sure we don't
-			 * add ourself to the list multiple times */
-			if (list_empty(&object->dep_link)) {
-				fscache_stat(&fscache_n_cop_grab_object);
-				object->cache->ops->grab_object(object);
-				fscache_stat_d(&fscache_n_cop_grab_object);
-				list_add(&object->dep_link,
-					 &parent->dependents);
-
-				/* fscache_acquire_non_index_cookie() uses this
-				 * to wake the chain up */
-				if (parent->state == FSCACHE_OBJECT_INIT)
-					fscache_enqueue_object(parent);
-			}
-		} else {
-			_debug("go");
-			parent->n_ops++;
-			parent->n_obj_ops++;
-			object->lookup_jif = jiffies;
-			object->state = FSCACHE_OBJECT_LOOKING_UP;
-			set_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
-		}
+		_leave(" [no parent]");
+		return transit_to(DETACH_FROM_COOKIE);
+	}
 
-		spin_unlock(&parent->lock);
-		spin_unlock(&object->lock);
+	_debug("parent %s", parent->state->name);
+
+	if (fscache_object_is_dying(parent)) {
+		_leave(" [bad parent]");
+		return transit_to(DETACH_FROM_COOKIE);
 	}
 
-	spin_unlock(&object->cookie->parent->lock);
-	spin_unlock(&object->cookie->lock);
+	if (fscache_object_is_available(parent)) {
+		_leave(" [ready]");
+		return transit_to(PARENT_READY);
+	}
+
+	_debug("wait");
+
+	spin_lock(&parent->lock);
+	fscache_stat(&fscache_n_cop_grab_object);
+	success = false;
+	if (fscache_object_is_live(parent) &&
+	    object->cache->ops->grab_object(object)) {
+		list_add(&object->dep_link, &parent->dependents);
+		success = true;
+	}
+	fscache_stat_d(&fscache_n_cop_grab_object);
+	spin_unlock(&parent->lock);
+	if (!success) {
+		_leave(" [grab failed]");
+		return transit_to(DETACH_FROM_COOKIE);
+	}
+
+	/* fscache_acquire_non_index_cookie() uses this
+	 * to wake the chain up */
+	fscache_raise_event(parent, FSCACHE_OBJECT_EV_NEW_CHILD);
+	_leave(" [wait]");
+	return transit_to(WAIT_FOR_PARENT);
+}
+
+/*
+ * Once the parent object is ready, we should kick off our lookup op.
+ */
+static const struct fscache_state *fscache_parent_ready(struct fscache_object *object,
+							int event)
+{
+	struct fscache_object *parent = object->parent;
+
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
+	ASSERT(parent != NULL);
+
+	spin_lock(&parent->lock);
+	parent->n_ops++;
+	parent->n_obj_ops++;
+	object->lookup_jif = jiffies;
+	spin_unlock(&parent->lock);
+
 	_leave("");
+	return transit_to(LOOK_UP_OBJECT);
 }
 
 /*
@@ -503,15 +412,17 @@ static void fscache_initialise_object(struct fscache_object *object)
  * - an object's cookie is pinned until we clear FSCACHE_COOKIE_CREATING on the
  *   leaf-most cookies of the object and all its children
  */
-static void fscache_lookup_object(struct fscache_object *object)
+static const struct fscache_state *fscache_look_up_object(struct fscache_object *object,
+							  int event)
 {
 	struct fscache_cookie *cookie = object->cookie;
-	struct fscache_object *parent;
+	struct fscache_object *parent = object->parent;
 	int ret;
 
-	_enter("");
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
+	object->oob_table = fscache_osm_lookup_oob;
 
-	parent = object->parent;
 	ASSERT(parent != NULL);
 	ASSERTCMP(parent->n_ops, >, 0);
 	ASSERTCMP(parent->n_obj_ops, >, 0);
@@ -521,10 +432,8 @@ static void fscache_lookup_object(struct fscache_object *object)
 
 	if (fscache_object_is_dying(parent) ||
 	    test_bit(FSCACHE_IOERROR, &object->cache->flags)) {
-		_debug("unavailable");
-		set_bit(FSCACHE_OBJECT_EV_WITHDRAW, &object->events);
-		_leave("");
-		return;
+		_leave(" [unavailable]");
+		return transit_to(LOOKUP_FAILURE);
 	}
 
 	_debug("LOOKUP \"%s/%s\" in \"%s\"",
@@ -543,10 +452,17 @@ static void fscache_lookup_object(struct fscache_object *object)
 		/* probably stuck behind another object, so move this one to
 		 * the back of the queue */
 		fscache_stat(&fscache_n_object_lookups_timed_out);
-		set_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
+		_leave(" [timeout]");
+		return NO_TRANSIT;
 	}
 
-	_leave("");
+	if (ret < 0) {
+		_leave(" [error]");
+		return transit_to(LOOKUP_FAILURE);
+	}
+
+	_leave(" [ok]");
+	return transit_to(OBJECT_AVAILABLE);
 }
 
 /**
@@ -560,32 +476,20 @@ void fscache_object_lookup_negative(struct fscache_object *object)
 {
 	struct fscache_cookie *cookie = object->cookie;
 
-	_enter("{OBJ%x,%s}",
-	       object->debug_id, fscache_object_states[object->state]);
+	_enter("{OBJ%x,%s}", object->debug_id, object->state->name);
 
-	spin_lock(&object->lock);
-	if (object->state == FSCACHE_OBJECT_LOOKING_UP) {
+	if (!test_and_set_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
 		fscache_stat(&fscache_n_object_lookups_negative);
 
-		/* transit here to allow write requests to begin stacking up
-		 * and read requests to begin returning ENODATA */
-		object->state = FSCACHE_OBJECT_CREATING;
-		spin_unlock(&object->lock);
-
-		set_bit(FSCACHE_COOKIE_PENDING_FILL, &cookie->flags);
+		/* Allow write requests to begin stacking up and read requests to begin
+		 * returning ENODATA.
+		 */
 		set_bit(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
 
 		_debug("wake up lookup %p", &cookie->flags);
-		smp_mb__before_clear_bit();
-		clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
-		smp_mb__after_clear_bit();
+		clear_bit_unlock(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
 		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-		set_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
-	} else {
-		ASSERTCMP(object->state, ==, FSCACHE_OBJECT_CREATING);
-		spin_unlock(&object->lock);
 	}
-
 	_leave("");
 }
 EXPORT_SYMBOL(fscache_object_lookup_negative);
@@ -604,37 +508,30 @@ void fscache_obtained_object(struct fscache_object *object)
 {
 	struct fscache_cookie *cookie = object->cookie;
 
-	_enter("{OBJ%x,%s}",
-	       object->debug_id, fscache_object_states[object->state]);
+	_enter("{OBJ%x,%s}", object->debug_id, object->state->name);
 
 	/* if we were still looking up, then we must have a positive lookup
 	 * result, in which case there may be data available */
-	spin_lock(&object->lock);
-	if (object->state == FSCACHE_OBJECT_LOOKING_UP) {
+	if (!test_and_set_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
 		fscache_stat(&fscache_n_object_lookups_positive);
 
-		clear_bit(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
-
-		object->state = FSCACHE_OBJECT_AVAILABLE;
-		spin_unlock(&object->lock);
+		/* We do (presumably) have data */
+		clear_bit_unlock(FSCACHE_COOKIE_NO_DATA_YET, &cookie->flags);
 
-		smp_mb__before_clear_bit();
-		clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
-		smp_mb__after_clear_bit();
+		/* Allow write requests to begin stacking up and read requests
+		 * to begin shovelling data.
+		 */
+		clear_bit_unlock(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags);
 		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-		set_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
 	} else {
-		ASSERTCMP(object->state, ==, FSCACHE_OBJECT_CREATING);
 		fscache_stat(&fscache_n_object_created);
-
-		object->state = FSCACHE_OBJECT_AVAILABLE;
-		spin_unlock(&object->lock);
-		set_bit(FSCACHE_OBJECT_EV_REQUEUE, &object->events);
-		smp_wmb();
 	}
 
-	if (test_and_clear_bit(FSCACHE_COOKIE_CREATING, &cookie->flags))
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
+	set_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
+
+	/* Permit __fscache_relinquish_cookie() to proceed */
+	clear_bit_unlock(FSCACHE_COOKIE_CREATING, &cookie->flags);
+	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
 
 	_leave("");
 }
@@ -643,15 +540,18 @@ EXPORT_SYMBOL(fscache_obtained_object);
 /*
  * handle an object that has just become available
  */
-static void fscache_object_available(struct fscache_object *object)
+static const struct fscache_state *fscache_object_available(struct fscache_object *object,
+							    int event)
 {
-	_enter("{OBJ%x}", object->debug_id);
+	struct fscache_cookie *cookie = object->cookie;
+
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
+	object->oob_table = fscache_osm_run_oob;
 
 	spin_lock(&object->lock);
 
-	if (object->cookie &&
-	    test_and_clear_bit(FSCACHE_COOKIE_CREATING, &object->cookie->flags))
-		wake_up_bit(&object->cookie->flags, FSCACHE_COOKIE_CREATING);
+	ASSERTIF(cookie, !test_bit(FSCACHE_COOKIE_CREATING, &object->cookie->flags));
 
 	fscache_done_parent_op(object);
 	if (object->n_in_progress == 0) {
@@ -667,72 +567,117 @@ static void fscache_object_available(struct fscache_object *object)
 	fscache_stat(&fscache_n_cop_lookup_complete);
 	object->cache->ops->lookup_complete(object);
 	fscache_stat_d(&fscache_n_cop_lookup_complete);
-	fscache_enqueue_dependents(object);
 
 	fscache_hist(fscache_obj_instantiate_histogram, object->lookup_jif);
 	fscache_stat(&fscache_n_object_avail);
 
 	_leave("");
+	return transit_to(JUMPSTART_DEPS);
 }
 
 /*
- * drop an object's attachments
+ * Wake up this object's dependent objects now that we've become available.
  */
-static void fscache_drop_object(struct fscache_object *object)
+static const struct fscache_state *fscache_jumpstart_dependents(struct fscache_object *object,
+								int event)
 {
-	struct fscache_object *parent = object->parent;
-	struct fscache_cache *cache = object->cache;
+	_enter("{OBJ%x},%d", object->debug_id, event);
 
-	_enter("{OBJ%x,%d}", object->debug_id, object->n_children);
+	if (!fscache_enqueue_dependents(object, FSCACHE_OBJECT_EV_PARENT_READY))
+		return NO_TRANSIT; /* Not finished; requeue */
+	return transit_to(WAIT_FOR_CMD);
+}
 
-	ASSERTCMP(object->cookie, ==, NULL);
-	ASSERT(hlist_unhashed(&object->cookie_link));
+/*
+ * Handle lookup or creation failure.
+ */
+static const struct fscache_state *fscache_lookup_failure(struct fscache_object *object,
+							  int event)
+{
+	struct fscache_cookie *cookie;
+	bool wake_looking_up = false;
 
-	spin_lock(&cache->object_list_lock);
-	list_del_init(&object->cache_link);
-	spin_unlock(&cache->object_list_lock);
+	_enter("{OBJ%x},%d", object->debug_id, event);
 
-	fscache_stat(&fscache_n_cop_drop_object);
-	cache->ops->drop_object(object);
-	fscache_stat_d(&fscache_n_cop_drop_object);
+	object->oob_event_mask = 0;
 
-	if (parent) {
-		_debug("release parent OBJ%x {%d}",
-		       parent->debug_id, parent->n_children);
+	fscache_stat(&fscache_n_cop_lookup_complete);
+	object->cache->ops->lookup_complete(object);
+	fscache_stat_d(&fscache_n_cop_lookup_complete);
 
-		spin_lock(&parent->lock);
-		parent->n_children--;
-		if (parent->n_children == 0)
-			fscache_raise_event(parent, FSCACHE_OBJECT_EV_CLEARED);
-		spin_unlock(&parent->lock);
-		object->parent = NULL;
+	spin_lock(&object->lock);
+	cookie = object->cookie;
+	if (cookie) {
+		set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
+		if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
+			wake_looking_up = true;
+		clear_bit_unlock(FSCACHE_COOKIE_CREATING, &cookie->flags);
 	}
+	spin_unlock(&object->lock);
 
-	/* this just shifts the object release to the work processor */
-	fscache_put_object(object);
+	if (wake_looking_up)
+		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
+	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
 
-	_leave("");
+	fscache_done_parent_op(object);
+	return transit_to(KILL_OBJECT);
 }
 
 /*
- * release or recycle an object that the netfs has discarded
+ * Wait for completion of all active operations on this object and the death of
+ * all child objects of this object.
  */
-static void fscache_release_object(struct fscache_object *object)
+static const struct fscache_state *fscache_kill_object(struct fscache_object *object,
+						       int event)
 {
-	_enter("");
+	_enter("{OBJ%x,%d,%d},%d",
+	       object->debug_id, object->n_ops, object->n_children, event);
+
+	object->oob_event_mask = 0;
+
+	spin_lock(&object->lock);
+	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	spin_unlock(&object->lock);
 
-	fscache_drop_object(object);
+	if (list_empty(&object->dependents) &&
+	    object->n_ops == 0 &&
+	    object->n_children == 0)
+		return object->cookie ?
+			transit_to(DETACH_FROM_COOKIE) : transit_to(DROP_OBJECT);
+
+	spin_lock(&object->lock);
+	fscache_start_operations(object);
+	spin_unlock(&object->lock);
+
+	if (!list_empty(&object->dependents))
+		return transit_to(KILL_DEPENDENTS);
+
+	return transit_to(WAIT_FOR_CLEARANCE);
+}
+
+/*
+ * Kill dependent objects.
+ */
+static const struct fscache_state *fscache_kill_dependents(struct fscache_object *object,
+							   int event)
+{
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
+	if (!fscache_enqueue_dependents(object, FSCACHE_OBJECT_EV_KILL))
+		return NO_TRANSIT; /* Not finished */
+	return transit_to(WAIT_FOR_CLEARANCE);
 }
 
 /*
  * withdraw an object from active service
  */
-static void fscache_withdraw_object(struct fscache_object *object)
+static const struct fscache_state *fscache_detach_from_cookie(struct fscache_object *object,
+							      int event)
 {
 	struct fscache_cookie *cookie;
-	bool detached;
+	bool detached = false, awaken = false;
 
-	_enter("");
+	_enter("{OBJ%x},%d", object->debug_id, event);
 
 	spin_lock(&object->lock);
 	cookie = object->cookie;
@@ -742,14 +687,15 @@ static void fscache_withdraw_object(struct fscache_object *object)
 		atomic_inc(&cookie->usage);
 		spin_unlock(&object->lock);
 
-		detached = false;
 		spin_lock(&cookie->lock);
 		spin_lock(&object->lock);
 
 		if (object->cookie == cookie) {
 			hlist_del_init(&object->cookie_link);
 			object->cookie = NULL;
-			fscache_invalidation_complete(cookie);
+			if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING,
+					       &cookie->flags))
+				awaken = true;
 			detached = true;
 		}
 		spin_unlock(&cookie->lock);
@@ -760,37 +706,62 @@ static void fscache_withdraw_object(struct fscache_object *object)
 
 	spin_unlock(&object->lock);
 
-	fscache_drop_object(object);
+	if (awaken)
+		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
+
+	fscache_stat(&fscache_n_object_dead);
+	_leave("");
+	return transit_to(DROP_OBJECT);
 }
 
 /*
- * withdraw an object from active service at the behest of the cache
- * - need break the links to a cached object cookie
- * - called under two situations:
- *   (1) recycler decides to reclaim an in-use object
- *   (2) a cache is unmounted
- * - have to take care as the cookie can be being relinquished by the netfs
- *   simultaneously
- * - the object is pinned by the caller holding a refcount on it
+ * Drop an object's attachments
  */
-void fscache_withdrawing_object(struct fscache_cache *cache,
-				struct fscache_object *object)
+static const struct fscache_state *fscache_drop_object(struct fscache_object *object,
+						       int event)
 {
-	bool enqueue = false;
+	struct fscache_object *parent = object->parent;
+	struct fscache_cache *cache = object->cache;
 
-	_enter(",OBJ%x", object->debug_id);
+	_enter("{OBJ%x,%d},%d", object->debug_id, object->n_children, event);
 
+	ASSERTCMP(object->cookie, ==, NULL);
+	ASSERT(hlist_unhashed(&object->cookie_link));
+
+	/* Prevent a race with our last child, which has to signal EV_CLEARED
+	 * before dropping our spinlock.
+	 */
 	spin_lock(&object->lock);
-	if (object->state < FSCACHE_OBJECT_WITHDRAWING) {
-		object->state = FSCACHE_OBJECT_WITHDRAWING;
-		enqueue = true;
-	}
 	spin_unlock(&object->lock);
 
-	if (enqueue)
-		fscache_enqueue_object(object);
+	/* Discard from the cache's collection of objects */
+	spin_lock(&cache->object_list_lock);
+	list_del_init(&object->cache_link);
+	spin_unlock(&cache->object_list_lock);
+
+	fscache_stat(&fscache_n_cop_drop_object);
+	cache->ops->drop_object(object);
+	fscache_stat_d(&fscache_n_cop_drop_object);
+
+	/* The parent object wants to know when all its dependents have gone */
+	if (parent) {
+		_debug("release parent OBJ%x {%d}",
+		       parent->debug_id, parent->n_children);
+
+		spin_lock(&parent->lock);
+		parent->n_children--;
+		if (parent->n_children == 0)
+			fscache_raise_event(parent, FSCACHE_OBJECT_EV_CLEARED);
+		spin_unlock(&parent->lock);
+		object->parent = NULL;
+	}
+
+	/* this just shifts the object release to the work processor */
+	fscache_put_object(object);
+	fscache_stat(&fscache_n_object_dead);
 
 	_leave("");
+	return transit_to(OBJECT_DEAD);
 }
 
 /*
@@ -807,7 +778,7 @@ static int fscache_get_object(struct fscache_object *object)
 }
 
 /*
- * discard a ref on a work item
+ * Discard a ref on an object
  */
 static void fscache_put_object(struct fscache_object *object)
 {
@@ -839,7 +810,7 @@ void fscache_enqueue_object(struct fscache_object *object)
 
 /**
  * fscache_object_sleep_till_congested - Sleep until object wq is congested
- * @timoutp: Scheduler sleep timeout
+ * @timeoutp: Scheduler sleep timeout
  *
  * Allow an object handler to sleep until the object workqueue is congested.
  *
@@ -867,18 +838,21 @@ bool fscache_object_sleep_till_congested(signed long *timeoutp)
 EXPORT_SYMBOL_GPL(fscache_object_sleep_till_congested);
 
 /*
- * enqueue the dependents of an object for metadata-type processing
- * - the caller must hold the object's lock
- * - this may cause an already locked object to wind up being processed again
+ * Enqueue the dependents of an object for metadata-type processing.
+ *
+ * If we don't manage to finish the list before the scheduler wants to run
+ * again then return false immediately.  We return true if the list was
+ * cleared.
  */
-static void fscache_enqueue_dependents(struct fscache_object *object)
+static bool fscache_enqueue_dependents(struct fscache_object *object, int event)
 {
 	struct fscache_object *dep;
+	bool ret = true;
 
 	_enter("{OBJ%x}", object->debug_id);
 
 	if (list_empty(&object->dependents))
-		return;
+		return true;
 
 	spin_lock(&object->lock);
 
@@ -887,23 +861,23 @@ static void fscache_enqueue_dependents(struct fscache_object *object)
 				 struct fscache_object, dep_link);
 		list_del_init(&dep->dep_link);
 
-
-		/* sort onto appropriate lists */
-		fscache_enqueue_object(dep);
+		fscache_raise_event(dep, event);
 		fscache_put_object(dep);
 
-		if (!list_empty(&object->dependents))
-			cond_resched_lock(&object->lock);
+		if (!list_empty(&object->dependents) && need_resched()) {
+			ret = false;
+			break;
+		}
 	}
 
 	spin_unlock(&object->lock);
+	return ret;
 }
 
 /*
  * remove an object from whatever queue it's waiting on
- * - the caller must hold object->lock
  */
-void fscache_dequeue_object(struct fscache_object *object)
+static void fscache_dequeue_object(struct fscache_object *object)
 {
 	_enter("{OBJ%x}", object->debug_id);
 
@@ -963,12 +937,14 @@ EXPORT_SYMBOL(fscache_check_aux);
 /*
  * Asynchronously invalidate an object.
  */
-static void fscache_invalidate_object(struct fscache_object *object)
+static const struct fscache_state *_fscache_invalidate_object(struct fscache_object *object,
+							      int event)
 {
 	struct fscache_operation *op;
 	struct fscache_cookie *cookie = object->cookie;
 
-	_enter("{OBJ%x}", object->debug_id);
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
 
 	/* Reject any new read/write ops and abort any that are pending. */
 	fscache_invalidate_writes(cookie);
@@ -978,9 +954,9 @@ static void fscache_invalidate_object(struct fscache_object *object)
 	/* Now we have to wait for in-progress reads and writes */
 	op = kzalloc(sizeof(*op), GFP_KERNEL);
 	if (!op) {
-		fscache_raise_event(object, FSCACHE_OBJECT_EV_ERROR);
+		clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
 		_leave(" [ENOMEM]");
-		return;
+		return transit_to(KILL_OBJECT);
 	}
 
 	fscache_operation_init(op, object->cache->ops->invalidate_object, NULL);
@@ -1001,13 +977,44 @@ static void fscache_invalidate_object(struct fscache_object *object)
 	/* We can allow read and write requests to come in once again.  They'll
 	 * queue up behind our exclusive invalidation operation.
 	 */
-	fscache_invalidation_complete(cookie);
-	_leave("");
-	return;
+	if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
+		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
+	_leave(" [ok]");
+	return transit_to(UPDATE_OBJECT);
 
 submit_op_failed:
+	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
 	spin_unlock(&cookie->lock);
 	kfree(op);
-	fscache_raise_event(object, FSCACHE_OBJECT_EV_ERROR);
 	_leave(" [EIO]");
+	return transit_to(KILL_OBJECT);
+}
+
+static const struct fscache_state *fscache_invalidate_object(struct fscache_object *object,
+							     int event)
+{
+	const struct fscache_state *s;
+
+	fscache_stat(&fscache_n_invalidates_run);
+	fscache_stat(&fscache_n_cop_invalidate_object);
+	s = _fscache_invalidate_object(object, event);
+	fscache_stat_d(&fscache_n_cop_invalidate_object);
+	return s;
+}
+
+/*
+ * Asynchronously update an object.
+ */
+static const struct fscache_state *fscache_update_object(struct fscache_object *object,
+							 int event)
+{
+	_enter("{OBJ%x},%d", object->debug_id, event);
+
+	fscache_stat(&fscache_n_updates_run);
+	fscache_stat(&fscache_n_cop_update_object);
+	object->cache->ops->update_object(object);
+	fscache_stat_d(&fscache_n_cop_update_object);
+
+	_leave("");
+	return transit_to(WAIT_FOR_CMD);
 }
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index ccf0219..4da211b 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -119,7 +119,7 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		/* need to issue a new write op after this */
 		clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
 		ret = 0;
-	} else if (object->state == FSCACHE_OBJECT_CREATING) {
+	} else if (test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
 		op->object = object;
 		object->n_ops++;
 		object->n_exclusive++;	/* reads and writes must wait */
@@ -144,7 +144,7 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
  */
 static void fscache_report_unexpected_submission(struct fscache_object *object,
 						 struct fscache_operation *op,
-						 unsigned long ostate)
+						 const struct fscache_state *ostate)
 {
 	static bool once_only;
 	struct fscache_operation *p;
@@ -155,11 +155,8 @@ static void fscache_report_unexpected_submission(struct fscache_object *object,
 	once_only = true;
 
 	kdebug("unexpected submission OP%x [OBJ%x %s]",
-	       op->debug_id, object->debug_id,
-	       fscache_object_states[object->state]);
-	kdebug("objstate=%s [%s]",
-	       fscache_object_states[object->state],
-	       fscache_object_states[ostate]);
+	       op->debug_id, object->debug_id, object->state->name);
+	kdebug("objstate=%s [%s]", object->state->name, ostate->name);
 	kdebug("objflags=%lx", object->flags);
 	kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
 	kdebug("ops=%u inp=%u exc=%u",
@@ -190,7 +187,7 @@ static void fscache_report_unexpected_submission(struct fscache_object *object,
 int fscache_submit_op(struct fscache_object *object,
 		      struct fscache_operation *op)
 {
-	unsigned long ostate;
+	const struct fscache_state *ostate;
 	int ret;
 
 	_enter("{OBJ%x OP%x},{%u}",
@@ -226,16 +223,14 @@ int fscache_submit_op(struct fscache_object *object,
 			fscache_run_op(object, op);
 		}
 		ret = 0;
-	} else if (object->state == FSCACHE_OBJECT_CREATING) {
+	} else if (test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
 		op->object = object;
 		object->n_ops++;
 		atomic_inc(&op->usage);
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
-	} else if (object->state == FSCACHE_OBJECT_DYING ||
-		   object->state == FSCACHE_OBJECT_LC_DYING ||
-		   object->state == FSCACHE_OBJECT_WITHDRAWING) {
+	} else if (fscache_object_is_dying(object)) {
 		fscache_stat(&fscache_n_op_rejected);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
@@ -266,13 +261,14 @@ void fscache_abort_object(struct fscache_object *object)
 
 /*
  * jump start the operation processing on an object
- * - caller must hold object->lock
  */
 void fscache_start_operations(struct fscache_object *object)
 {
 	struct fscache_operation *op;
 	bool stop = false;
 
+	ASSERT(spin_is_locked(&object->lock));
+
 	while (!list_empty(&object->pending_ops) && !stop) {
 		op = list_entry(object->pending_ops.next,
 				struct fscache_operation, pend_link);
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 42f8f2d..b4e4b42 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -408,7 +408,7 @@ int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
 	object = hlist_entry(cookie->backing_objects.first,
 			     struct fscache_object, cookie_link);
 
-	ASSERTCMP(object->state, >, FSCACHE_OBJECT_LOOKING_UP);
+	ASSERT(test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags));
 
 	atomic_inc(&object->n_reads);
 	__set_bit(FSCACHE_OP_DEC_READ_CNT, &op->op.flags);
@@ -729,8 +729,9 @@ static void fscache_write_op(struct fscache_operation *_op)
 		 */
 		spin_unlock(&object->lock);
 		fscache_op_complete(&op->op, false);
-		_leave(" [cancel] op{f=%lx s=%u} obj{s=%u f=%lx}",
-		       _op->flags, _op->state, object->state, object->flags);
+		_leave(" [cancel] op{f=%lx s=%u} obj{s=%s f=%lx}",
+		       _op->flags, _op->state, object->state->short_name,
+		       object->flags);
 		return;
 	}
 
@@ -833,14 +834,12 @@ void fscache_invalidate_writes(struct fscache_cookie *cookie)
  *  (1) negative lookup, object not yet created (FSCACHE_COOKIE_CREATING is
  *      set)
  *
- *	(a) no writes yet (set FSCACHE_COOKIE_PENDING_FILL and queue deferred
- *	    fill op)
+ *	(a) no writes yet
  *
  *	(b) writes deferred till post-creation (mark page for writing and
  *	    return immediately)
  *
  *  (2) negative lookup, object created, initial fill being made from netfs
- *      (FSCACHE_COOKIE_INITIAL_FILL is set)
  *
  *	(a) fill point not yet reached this page (mark page for writing and
  *          return)
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index c5f9234..9ff516b 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -328,11 +328,9 @@ struct fscache_cookie {
 #define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */
 #define FSCACHE_COOKIE_CREATING		1	/* T if non-index object being created still */
 #define FSCACHE_COOKIE_NO_DATA_YET	2	/* T if new object with no cached data yet */
-#define FSCACHE_COOKIE_PENDING_FILL	3	/* T if pending initial fill on object */
-#define FSCACHE_COOKIE_FILLING		4	/* T if filling object incrementally */
-#define FSCACHE_COOKIE_UNAVAILABLE	5	/* T if cookie is unavailable (error, etc) */
-#define FSCACHE_COOKIE_WAITING_ON_READS	6	/* T if cookie is waiting on reads */
-#define FSCACHE_COOKIE_INVALIDATING	7	/* T if cookie is being invalidated */
+#define FSCACHE_COOKIE_UNAVAILABLE	3	/* T if cookie is unavailable (error, etc) */
+#define FSCACHE_COOKIE_WAITING_ON_READS	4	/* T if cookie is waiting on reads */
+#define FSCACHE_COOKIE_INVALIDATING	5	/* T if cookie is being invalidated */
 };
 
 extern struct fscache_cookie fscache_fsdef_index;
@@ -341,45 +339,40 @@ extern struct fscache_cookie fscache_fsdef_index;
  * Event list for fscache_object::{event_mask,events}
  */
 enum {
-	FSCACHE_OBJECT_EV_REQUEUE,	/* T if object should be requeued */
+	FSCACHE_OBJECT_EV_NEW_CHILD,	/* T if object has a new child */
+	FSCACHE_OBJECT_EV_PARENT_READY,	/* T if object's parent is ready */
 	FSCACHE_OBJECT_EV_UPDATE,	/* T if object should be updated */
 	FSCACHE_OBJECT_EV_INVALIDATE,	/* T if cache requested object invalidation */
 	FSCACHE_OBJECT_EV_CLEARED,	/* T if accessors all gone */
 	FSCACHE_OBJECT_EV_ERROR,	/* T if fatal error occurred during processing */
-	FSCACHE_OBJECT_EV_RELEASE,	/* T if netfs requested object release */
-	FSCACHE_OBJECT_EV_RETIRE,	/* T if netfs requested object retirement */
-	FSCACHE_OBJECT_EV_WITHDRAW,	/* T if cache requested object withdrawal */
+	FSCACHE_OBJECT_EV_KILL,		/* T if netfs relinquished or cache withdrew object */
 	NR_FSCACHE_OBJECT_EVENTS
 };
 
 #define FSCACHE_OBJECT_EVENTS_MASK ((1UL << NR_FSCACHE_OBJECT_EVENTS) - 1)
 
 /*
+ * States for object state machine.
+ */
+struct fscache_transition {
+	unsigned long events;
+	const struct fscache_state *transit_to;
+};
+
+struct fscache_state {
+	char name[24];
+	char short_name[8];
+	const struct fscache_state *(*work)(struct fscache_object *object,
+					    int event);
+	const struct fscache_transition transitions[];
+};
+
+/*
  * on-disk cache file or index handle
  */
 struct fscache_object {
-	enum fscache_object_state {
-		FSCACHE_OBJECT_INIT,		/* object in initial unbound state */
-		FSCACHE_OBJECT_LOOKING_UP,	/* looking up object */
-		FSCACHE_OBJECT_CREATING,	/* creating object */
-
-		/* active states */
-		FSCACHE_OBJECT_AVAILABLE,	/* cleaning up object after creation */
-		FSCACHE_OBJECT_ACTIVE,		/* object is usable */
-		FSCACHE_OBJECT_INVALIDATING,	/* object is invalidating */
-		FSCACHE_OBJECT_UPDATING,	/* object is updating */
-
-		/* terminal states */
-		FSCACHE_OBJECT_DYING,		/* object waiting for accessors to finish */
-		FSCACHE_OBJECT_LC_DYING,	/* object cleaning up after lookup/create */
-		FSCACHE_OBJECT_ABORT_INIT,	/* abort the init state */
-		FSCACHE_OBJECT_RELEASING,	/* releasing object */
-		FSCACHE_OBJECT_RECYCLING,	/* retiring object */
-		FSCACHE_OBJECT_WITHDRAWING,	/* withdrawing object */
-		FSCACHE_OBJECT_DEAD,		/* object is now dead */
-		FSCACHE_OBJECT__NSTATES
-	} state;
-
+	const struct fscache_state *state;	/* Object state machine state */
+	const struct fscache_transition *oob_table; /* OOB state transition table */
 	int			debug_id;	/* debugging ID */
 	int			n_children;	/* number of child objects */
 	int			n_ops;		/* number of extant ops on object */
@@ -390,6 +383,7 @@ struct fscache_object {
 	spinlock_t		lock;		/* state and operations lock */
 
 	unsigned long		lookup_jif;	/* time at which lookup started */
+	unsigned long		oob_event_mask;	/* OOB events this object is interested in */
 	unsigned long		event_mask;	/* events this object is interested in */
 	unsigned long		events;		/* events to be processed by this object
 						 * (order is important - using fls) */
@@ -398,6 +392,10 @@ struct fscache_object {
 #define FSCACHE_OBJECT_LOCK		0	/* T if object is busy being processed */
 #define FSCACHE_OBJECT_PENDING_WRITE	1	/* T if object has pending write */
 #define FSCACHE_OBJECT_WAITING		2	/* T if object is waiting on its parent */
+#define FSCACHE_OBJECT_RETIRE		3	/* T if object should be retired */
+#define FSCACHE_OBJECT_IS_LIVE		4	/* T if object is not withdrawn or relinquished */
+#define FSCACHE_OBJECT_IS_LOOKED_UP	5	/* T if object has been looked up */
+#define FSCACHE_OBJECT_IS_AVAILABLE	6	/* T if object has become active */
 
 	struct list_head	cache_link;	/* link in cache->object_list */
 	struct hlist_node	cookie_link;	/* link in cookie->backing_objects */
@@ -415,8 +413,6 @@ struct fscache_object {
 	loff_t			store_limit_l;	/* current storage limit */
 };
 
-extern const char *fscache_object_states[];
-
 extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
 				struct fscache_cache *);
 
@@ -431,7 +427,7 @@ extern void fscache_object_destroy(struct fscache_object *object);
 
 static inline bool fscache_object_is_live(struct fscache_object *object)
 {
-	return object->state < FSCACHE_OBJECT_DYING;
+	return test_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
 }
 
 static inline bool fscache_object_is_dying(struct fscache_object *object)
@@ -441,7 +437,7 @@ static inline bool fscache_object_is_dying(struct fscache_object *object)
 
 static inline bool fscache_object_is_available(struct fscache_object *object)
 {
-	return object->state >= FSCACHE_OBJECT_AVAILABLE;
+	return test_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
 }
 
 static inline bool fscache_object_is_active(struct fscache_object *object)


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 8/8] FS-Cache: Simplify cookie retention for fscache_objects, fixing access problems
  2013-05-03  0:33 [PATCH 0/8] Fix assorted FS-Cache issues David Howells
                   ` (6 preceding siblings ...)
  2013-05-03  0:33 ` [PATCH 7/8] FS-Cache: Fix object state machine to have separate work and wait states David Howells
@ 2013-05-03  0:33 ` David Howells
  7 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2013-05-03  0:33 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, hjayasur, jlayton, linux-kernel

Simplify the way fscache cache objects retain their cookie.  The way I
implemented the cookie storage handling made synchronisation a pain (ie. the
object state machine can't rely on the cookie actually still being there).

Instead of the object being detached from the cookie and the cookie being
freed in __fscache_relinquish_cookie(), we defer both operations:

 (*) The detachment of the object from the list in the cookie now takes place
     in fscache_drop_object() and is thus governed by the object state machine
     (fscache_detach_from_cookie() has been removed).

 (*) The release of the cookie is now in fscache_object_destroy() - which is
     called by the cache backend just before it frees the object.

This means that the fscache_cookie struct is now available to the cache all the
way through from ->alloc_object() to ->drop_object() and ->put_object() -
meaning that it's no longer necessary to take object->lock to guarantee access.

However, __fscache_relinquish_cookie() doesn't wait for the object to go all
the way through to destruction before letting the netfs proceed.  That would
massively slow down the netfs.  Since __fscache_relinquish_cookie() leaves the
cookie around, it must therefore break all attachments to the netfs - which
includes ->def, ->netfs_data and any outstanding page read/writes.

To handle this, struct fscache_cookie now has an n_active counter:

 (1) This starts off initialised to 1.

 (2) Any time the cache needs to get at the netfs data, it calls
     fscache_use_cookie() to increment it - if it is not zero.  If it was zero,
     then access is not permitted.

 (3) When the cache has finished with the data, it calls fscache_unuse_cookie()
     to decrement it.  This does a wake-up on it if it reaches 0.

 (4) __fscache_relinquish_cookie() decrements n_active and then waits for it to
     reach 0.  The initialisation to 1 in step (1) ensures that we only get
     wake ups when we're trying to get rid of the cookie.

This leaves __fscache_relinquish_cookie() a lot simpler.
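
As an illustration of steps (2)-(4), the helpers the cache uses to pin the
netfs data could look roughly like this.  This is a sketch only:
fscache_use_cookie() and fscache_unuse_cookie() are the names used in the
patch, but the bodies below are an approximation layered on the
wait_on_atomic_t()/wake_up_atomic_t() facility added earlier in the series,
not the patch text itself.

	/* Sketch: pin the cookie's netfs data for the cache's use, failing
	 * if the netfs has already relinquished it (n_active fell to 0).
	 */
	static inline bool fscache_use_cookie(struct fscache_object *object)
	{
		struct fscache_cookie *cookie = object->cookie;

		return atomic_inc_not_zero(&cookie->n_active) != 0;
	}

	/* Sketch: drop the pin; the final decrement wakes anyone sleeping in
	 * __fscache_relinquish_cookie() on wait_on_atomic_t(&cookie->n_active).
	 */
	static inline void fscache_unuse_cookie(struct fscache_object *object)
	{
		struct fscache_cookie *cookie = object->cookie;

		if (atomic_dec_and_test(&cookie->n_active))
			wake_up_atomic_t(&cookie->n_active);
	}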


***
This fixes a problem in the current code whereby if fscache_invalidate() is
followed sufficiently quickly by fscache_relinquish_cookie() then it is
possible for __fscache_relinquish_cookie() to have detached the cookie from the
object and cleared the pointer before a thread is dispatched to process the
invalidation state in the object state machine.

Since the pending write clearance was deferred to the invalidation state to
make it asynchronous, we must either wait in relinquishment for the stores
tree to be cleared in the invalidation state or handle the clearance in
relinquishment itself.

Further, if the relinquishment code does clear the tree, then the invalidation
state needs to make the clearance contingent on still having the cookie to hand
(since that's where the tree is rooted) and we have to prevent the cookie from
disappearing for the duration.

This can lead to an oops like the following:

BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
...
RIP: 0010:[<ffffffff8151023e>] _spin_lock+0xe/0x30
...
CR2: 000000000000000c ...
...
Process kslowd002 (...)
....
Call Trace:
 [<ffffffffa01c3278>] fscache_invalidate_writes+0x38/0xd0 [fscache]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff8105e759>] ? find_busiest_queue+0x69/0x150
 [<ffffffff8110ddd4>] ? slow_work_enqueue+0x104/0x180
 [<ffffffffa01c1303>] fscache_object_slow_work_execute+0x5e3/0x9d0 [fscache]
 [<ffffffff81096b67>] ? bit_waitqueue+0x17/0xd0
 [<ffffffff8110e233>] slow_work_execute+0x233/0x310
 [<ffffffff8110e515>] slow_work_thread+0x205/0x360
 [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8110e310>] ? slow_work_thread+0x0/0x360
 [<ffffffff81096936>] kthread+0x96/0xa0
 [<ffffffff8100c0ca>] child_rip+0xa/0x20
 [<ffffffff810968a0>] ? kthread+0x0/0xa0
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

The parameter to fscache_invalidate_writes() was object->cookie, which is NULL.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/cachefiles/interface.c     |   11 ++
 fs/cachefiles/xattr.c         |    6 -
 fs/fscache/cookie.c           |   80 ++++++-----------
 fs/fscache/fsdef.c            |    1 
 fs/fscache/internal.h         |    3 +
 fs/fscache/main.c             |   11 ++
 fs/fscache/netfs.c            |    1 
 fs/fscache/object-list.c      |   93 +++++++++-----------
 fs/fscache/object.c           |  188 ++++++++++++++++-------------------------
 fs/fscache/operation.c        |   12 +--
 fs/fscache/page.c             |   26 ++++--
 include/linux/fscache-cache.h |   57 +++++++++---
 12 files changed, 232 insertions(+), 257 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 3d76321..eeb3f7d 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -212,20 +212,29 @@ static void cachefiles_update_object(struct fscache_object *_object)
 	object = container_of(_object, struct cachefiles_object, fscache);
 	cache = container_of(object->fscache.cache, struct cachefiles_cache,
 			     cache);
+
+	if (!fscache_use_cookie(_object)) {
+		_leave(" [relinq]");
+		return;
+	}
+
 	cookie = object->fscache.cookie;
 
 	if (!cookie->def->get_aux) {
+		fscache_unuse_cookie(_object);
 		_leave(" [no aux]");
 		return;
 	}
 
 	auxdata = kmalloc(2 + 512 + 3, cachefiles_gfp);
 	if (!auxdata) {
+		fscache_unuse_cookie(_object);
 		_leave(" [nomem]");
 		return;
 	}
 
 	auxlen = cookie->def->get_aux(cookie->netfs_data, auxdata->data, 511);
+	fscache_unuse_cookie(_object);
 	ASSERTCMP(auxlen, <, 511);
 
 	auxdata->len = auxlen + 1;
@@ -263,7 +272,7 @@ static void cachefiles_drop_object(struct fscache_object *_object)
 #endif
 
 	/* delete retired objects */
-	if (test_bit(FSCACHE_OBJECT_RETIRE, &object->fscache.flags) &&
+	if (test_bit(FSCACHE_COOKIE_RETIRED, &object->fscache.cookie->flags) &&
 	    _object != cache->cache.fsdef
 	    ) {
 		_debug("- retire object OBJ%x", object->fscache.debug_id);
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c
index 73b4628..2476e51 100644
--- a/fs/cachefiles/xattr.c
+++ b/fs/cachefiles/xattr.c
@@ -109,13 +109,12 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object,
 	struct dentry *dentry = object->dentry;
 	int ret;
 
-	ASSERT(object->fscache.cookie);
 	ASSERT(dentry);
 
 	_enter("%p,#%d", object, auxdata->len);
 
 	/* attempt to install the cache metadata directly */
-	_debug("SET %s #%u", object->fscache.cookie->def->name, auxdata->len);
+	_debug("SET #%u", auxdata->len);
 
 	ret = vfs_setxattr(dentry, cachefiles_xattr_cache,
 			   &auxdata->type, auxdata->len,
@@ -138,13 +137,12 @@ int cachefiles_update_object_xattr(struct cachefiles_object *object,
 	struct dentry *dentry = object->dentry;
 	int ret;
 
-	ASSERT(object->fscache.cookie);
 	ASSERT(dentry);
 
 	_enter("%p,#%d", object, auxdata->len);
 
 	/* attempt to install the cache metadata directly */
-	_debug("SET %s #%u", object->fscache.cookie->def->name, auxdata->len);
+	_debug("SET #%u", auxdata->len);
 
 	ret = vfs_setxattr(dentry, cachefiles_xattr_cache,
 			   &auxdata->type, auxdata->len,
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index eee4366..0e91a3c 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -95,6 +95,11 @@ struct fscache_cookie *__fscache_acquire_cookie(
 	atomic_set(&cookie->usage, 1);
 	atomic_set(&cookie->n_children, 0);
 
+	/* We keep the active count elevated until relinquishment to prevent an
+	 * attempt to wake up every time the object operations queue quiesces.
+	 */
+	atomic_set(&cookie->n_active, 1);
+
 	atomic_inc(&parent->usage);
 	atomic_inc(&parent->n_children);
 
@@ -177,7 +182,6 @@ static int fscache_acquire_non_index_cookie(struct fscache_cookie *cookie)
 
 	cookie->flags =
 		(1 << FSCACHE_COOKIE_LOOKING_UP) |
-		(1 << FSCACHE_COOKIE_CREATING) |
 		(1 << FSCACHE_COOKIE_NO_DATA_YET);
 
 	/* ask the cache to allocate objects for this cookie and its parent
@@ -467,7 +471,6 @@ EXPORT_SYMBOL(__fscache_update_cookie);
  */
 void __fscache_relinquish_cookie(struct fscache_cookie *cookie, int retire)
 {
-	struct fscache_cache *cache;
 	struct fscache_object *object;
 
 	fscache_stat(&fscache_n_relinquishes);
@@ -480,8 +483,11 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie, int retire)
 		return;
 	}
 
-	_enter("%p{%s,%p},%d",
-	       cookie, cookie->def->name, cookie->netfs_data, retire);
+	_enter("%p{%s,%p,%d},%d",
+	       cookie, cookie->def->name, cookie->netfs_data,
+	       atomic_read(&cookie->n_active), retire);
+
+	ASSERTCMP(atomic_read(&cookie->n_active), >, 0);
 
 	if (atomic_read(&cookie->n_children) != 0) {
 		printk(KERN_ERR "FS-Cache: Cookie '%s' still has children\n",
@@ -489,62 +495,28 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie, int retire)
 		BUG();
 	}
 
-	/* wait for the cookie to finish being instantiated (or to fail) */
-	if (test_bit(FSCACHE_COOKIE_CREATING, &cookie->flags)) {
-		fscache_stat(&fscache_n_relinquishes_waitcrt);
-		wait_on_bit(&cookie->flags, FSCACHE_COOKIE_CREATING,
-			    fscache_wait_bit, TASK_UNINTERRUPTIBLE);
-	}
+	/* No further netfs-accessing operations on this cookie permitted */
+	set_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags);
+	if (retire)
+		set_bit(FSCACHE_COOKIE_RETIRED, &cookie->flags);
 
-try_again:
 	spin_lock(&cookie->lock);
-
-	/* break links with all the active objects */
-	while (!hlist_empty(&cookie->backing_objects)) {
-		int n_reads;
-		object = hlist_entry(cookie->backing_objects.first,
-				     struct fscache_object,
-				     cookie_link);
-
-		_debug("RELEASE OBJ%x", object->debug_id);
-
-		set_bit(FSCACHE_COOKIE_WAITING_ON_READS, &cookie->flags);
-		n_reads = atomic_read(&object->n_reads);
-		if (n_reads) {
-			int n_ops = object->n_ops;
-			int n_in_progress = object->n_in_progress;
-			spin_unlock(&cookie->lock);
-			printk(KERN_ERR "FS-Cache:"
-			       " Cookie '%s' still has %d outstanding reads (%d,%d)\n",
-			       cookie->def->name,
-			       n_reads, n_ops, n_in_progress);
-			wait_on_bit(&cookie->flags, FSCACHE_COOKIE_WAITING_ON_READS,
-				    fscache_wait_bit, TASK_UNINTERRUPTIBLE);
-			printk("Wait finished\n");
-			goto try_again;
-		}
-
-		/* detach each cache object from the object cookie */
-		spin_lock(&object->lock);
-		hlist_del_init(&object->cookie_link);
-
-		cache = object->cache;
-		object->cookie = NULL;
-		if (retire)
-			set_bit(FSCACHE_OBJECT_RETIRE, &object->flags);
+	hlist_for_each_entry(object, &cookie->backing_objects, cookie_link) {
 		fscache_raise_event(object, FSCACHE_OBJECT_EV_KILL);
-		spin_unlock(&object->lock);
-
-		if (atomic_dec_and_test(&cookie->usage))
-			/* the cookie refcount shouldn't be reduced to 0 yet */
-			BUG();
 	}
+	spin_unlock(&cookie->lock);
 
-	/* detach pointers back to the netfs */
+	/* Wait for cessation of activity requiring access to the netfs (when
+	 * n_active reaches 0).
+	 */
+	if (!atomic_dec_and_test(&cookie->n_active))
+		wait_on_atomic_t(&cookie->n_active, fscache_wait_atomic_t,
+				 TASK_UNINTERRUPTIBLE);
+
+	/* Clear pointers back to the netfs */
 	cookie->netfs_data	= NULL;
 	cookie->def		= NULL;
-
-	spin_unlock(&cookie->lock);
+	BUG_ON(cookie->stores.rnode);
 
 	if (cookie->parent) {
 		ASSERTCMP(atomic_read(&cookie->parent->usage), >, 0);
@@ -552,7 +524,7 @@ try_again:
 		atomic_dec(&cookie->parent->n_children);
 	}
 
-	/* finally dispose of the cookie */
+	/* Dispose of the netfs's link to the cookie */
 	ASSERTCMP(atomic_read(&cookie->usage), >, 0);
 	fscache_cookie_put(cookie);
 
diff --git a/fs/fscache/fsdef.c b/fs/fscache/fsdef.c
index f5b4bae..10a2ade 100644
--- a/fs/fscache/fsdef.c
+++ b/fs/fscache/fsdef.c
@@ -55,6 +55,7 @@ static struct fscache_cookie_def fscache_fsdef_index_def = {
 
 struct fscache_cookie fscache_fsdef_index = {
 	.usage		= ATOMIC_INIT(1),
+	.n_active	= ATOMIC_INIT(1),
 	.lock		= __SPIN_LOCK_UNLOCKED(fscache_fsdef_index.lock),
 	.backing_objects = HLIST_HEAD_INIT,
 	.def		= &fscache_fsdef_index_def,
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 3322d3c..12d505b 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -93,6 +93,7 @@ static inline bool fscache_object_congested(void)
 
 extern int fscache_wait_bit(void *);
 extern int fscache_wait_bit_interruptible(void *);
+extern int fscache_wait_atomic_t(atomic_t *);
 
 /*
  * object.c
@@ -106,8 +107,10 @@ extern void fscache_enqueue_object(struct fscache_object *);
 extern const struct file_operations fscache_objlist_fops;
 
 extern void fscache_objlist_add(struct fscache_object *);
+extern void fscache_objlist_remove(struct fscache_object *);
 #else
 #define fscache_objlist_add(object) do {} while(0)
+#define fscache_objlist_remove(object) do {} while(0)
 #endif
 
 /*
diff --git a/fs/fscache/main.c b/fs/fscache/main.c
index f9d8567..7c27907 100644
--- a/fs/fscache/main.c
+++ b/fs/fscache/main.c
@@ -205,7 +205,6 @@ int fscache_wait_bit(void *flags)
 	schedule();
 	return 0;
 }
-EXPORT_SYMBOL(fscache_wait_bit);
 
 /*
  * wait_on_bit() sleep function for interruptible waiting
@@ -215,4 +214,12 @@ int fscache_wait_bit_interruptible(void *flags)
 	schedule();
 	return signal_pending(current);
 }
-EXPORT_SYMBOL(fscache_wait_bit_interruptible);
+
+/*
+ * wait_on_atomic_t() sleep function for uninterruptible waiting
+ */
+int fscache_wait_atomic_t(atomic_t *p)
+{
+	schedule();
+	return 0;
+}
diff --git a/fs/fscache/netfs.c b/fs/fscache/netfs.c
index e028b8e..b1bb611 100644
--- a/fs/fscache/netfs.c
+++ b/fs/fscache/netfs.c
@@ -40,6 +40,7 @@ int __fscache_register_netfs(struct fscache_netfs *netfs)
 	/* initialise the primary index cookie */
 	atomic_set(&netfs->primary_index->usage, 1);
 	atomic_set(&netfs->primary_index->n_children, 0);
+	atomic_set(&netfs->primary_index->n_active, 1);
 
 	netfs->primary_index->def		= &fscache_fsdef_netfs_def;
 	netfs->primary_index->parent		= &fscache_fsdef_index;
diff --git a/fs/fscache/object-list.c b/fs/fscache/object-list.c
index 4a386b0..e1959ef 100644
--- a/fs/fscache/object-list.c
+++ b/fs/fscache/object-list.c
@@ -70,13 +70,10 @@ void fscache_objlist_add(struct fscache_object *obj)
 	write_unlock(&fscache_object_list_lock);
 }
 
-/**
- * fscache_object_destroy - Note that a cache object is about to be destroyed
- * @object: The object to be destroyed
- *
- * Note the imminent destruction and deallocation of a cache object record.
+/*
+ * Remove an object from the object list.
  */
-void fscache_object_destroy(struct fscache_object *obj)
+void fscache_objlist_remove(struct fscache_object *obj)
 {
 	write_lock(&fscache_object_list_lock);
 
@@ -85,7 +82,6 @@ void fscache_object_destroy(struct fscache_object *obj)
 
 	write_unlock(&fscache_object_list_lock);
 }
-EXPORT_SYMBOL(fscache_object_destroy);
 
 /*
  * find the object in the tree on or after the specified index
@@ -166,10 +162,9 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 {
 	struct fscache_objlist_data *data = m->private;
 	struct fscache_object *obj = v;
+	struct fscache_cookie *cookie;
 	unsigned long config = data->config;
-	uint16_t keylen, auxlen;
 	char _type[3], *type;
-	bool no_cookie;
 	u8 *buf = data->buf, *p;
 
 	if ((unsigned long) v == 1) {
@@ -216,8 +211,9 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 		}							\
 	} while(0)
 
+	cookie = obj->cookie;
 	if (~config) {
-		FILTER(obj->cookie,
+		FILTER(cookie->def,
 		       COOKIE, NOCOOKIE);
 		FILTER(fscache_object_is_active(obj) ||
 		       obj->n_ops != 0 ||
@@ -250,48 +246,40 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 		   obj->flags,
 		   work_busy(&obj->work));
 
-	no_cookie = true;
-	keylen = auxlen = 0;
-	if (obj->cookie) {
-		spin_lock(&obj->lock);
-		if (obj->cookie) {
-			switch (obj->cookie->def->type) {
-			case 0:
-				type = "IX";
-				break;
-			case 1:
-				type = "DT";
-				break;
-			default:
-				sprintf(_type, "%02u",
-					obj->cookie->def->type);
-				type = _type;
-				break;
-			}
+	if (fscache_use_cookie(obj)) {
+		uint16_t keylen = 0, auxlen = 0;
 
-			seq_printf(m, "%-16s %s %2lx %16p",
-				   obj->cookie->def->name,
-				   type,
-				   obj->cookie->flags,
-				   obj->cookie->netfs_data);
-
-			if (obj->cookie->def->get_key &&
-			    config & FSCACHE_OBJLIST_CONFIG_KEY)
-				keylen = obj->cookie->def->get_key(
-					obj->cookie->netfs_data,
-					buf, 400);
-
-			if (obj->cookie->def->get_aux &&
-			    config & FSCACHE_OBJLIST_CONFIG_AUX)
-				auxlen = obj->cookie->def->get_aux(
-					obj->cookie->netfs_data,
-					buf + keylen, 512 - keylen);
-
-			no_cookie = false;
+		switch (cookie->def->type) {
+		case 0:
+			type = "IX";
+			break;
+		case 1:
+			type = "DT";
+			break;
+		default:
+			sprintf(_type, "%02u", cookie->def->type);
+			type = _type;
+			break;
 		}
-		spin_unlock(&obj->lock);
 
-		if (!no_cookie && (keylen > 0 || auxlen > 0)) {
+		seq_printf(m, "%-16s %s %2lx %16p",
+			   cookie->def->name,
+			   type,
+			   cookie->flags,
+			   cookie->netfs_data);
+
+		if (cookie->def->get_key &&
+		    config & FSCACHE_OBJLIST_CONFIG_KEY)
+			keylen = cookie->def->get_key(cookie->netfs_data,
+						      buf, 400);
+
+		if (cookie->def->get_aux &&
+		    config & FSCACHE_OBJLIST_CONFIG_AUX)
+			auxlen = cookie->def->get_aux(cookie->netfs_data,
+						      buf + keylen, 512 - keylen);
+		fscache_unuse_cookie(obj);
+
+		if (keylen > 0 || auxlen > 0) {
 			seq_printf(m, " ");
 			for (p = buf; keylen > 0; keylen--)
 				seq_printf(m, "%02x", *p++);
@@ -302,12 +290,11 @@ static int fscache_objlist_show(struct seq_file *m, void *v)
 					seq_printf(m, "%02x", *p++);
 			}
 		}
-	}
 
-	if (no_cookie)
-		seq_printf(m, "<no_cookie>\n");
-	else
 		seq_printf(m, "\n");
+	} else {
+		seq_printf(m, "<no_netfs>\n");
+	}
 	return 0;
 }
 
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index c2d9de0..c2c8c23 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -29,7 +29,6 @@ static const struct fscache_state *fscache_look_up_object(struct fscache_object
 static const struct fscache_state *fscache_object_available(struct fscache_object *, int);
 static const struct fscache_state *fscache_parent_ready(struct fscache_object *, int);
 static const struct fscache_state *fscache_update_object(struct fscache_object *, int);
-static const struct fscache_state *fscache_detach_from_cookie(struct fscache_object *, int);
 
 #define __STATE_NAME(n) fscache_osm_##n
 #define STATE(n) (&__STATE_NAME(n))
@@ -71,7 +70,6 @@ static WORK_STATE(LOOKUP_FAILURE,	"LCFL", fscache_lookup_failure);
 static WORK_STATE(KILL_OBJECT,		"KILL", fscache_kill_object);
 static WORK_STATE(KILL_DEPENDENTS,	"KDEP", fscache_kill_dependents);
 static WORK_STATE(DROP_OBJECT,		"DROP", fscache_drop_object);
-static WORK_STATE(DETACH_FROM_COOKIE,	"DTCH", fscache_detach_from_cookie);
 static WORK_STATE(OBJECT_DEAD,		"DEAD", (void*)2UL);
 
 static WAIT_STATE(WAIT_FOR_INIT,	"?INI",
@@ -127,8 +125,8 @@ static inline void fscache_done_parent_op(struct fscache_object *object)
 	       object->debug_id, parent->debug_id, parent->n_ops);
 
 	spin_lock_nested(&parent->lock, 1);
-	parent->n_ops--;
 	parent->n_obj_ops--;
+	parent->n_ops--;
 	if (parent->n_ops == 0)
 		fscache_raise_event(parent, FSCACHE_OBJECT_EV_CLEARED);
 	spin_unlock(&parent->lock);
@@ -303,22 +301,10 @@ EXPORT_SYMBOL(fscache_object_init);
 static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *object,
 								int event)
 {
-	struct fscache_cookie *cookie;
-
 	_enter("{OBJ%x},%d", object->debug_id, event);
 
 	object->oob_event_mask = 0;
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
-
 	fscache_dequeue_object(object);
-
-	spin_lock(&object->lock);
-	cookie = object->cookie;
-	clear_bit_unlock(FSCACHE_COOKIE_CREATING, &cookie->flags);
-	spin_unlock(&object->lock);
-
-	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
-
 	return transit_to(KILL_OBJECT);
 }
 
@@ -328,8 +314,6 @@ static const struct fscache_state *fscache_abort_initialisation(struct fscache_o
  *   immediately to do a creation
  * - we may need to start the process of creating a parent and we need to wait
  *   for the parent's lookup and creation to complete if it's not there yet
- * - an object's cookie is pinned until we clear FSCACHE_COOKIE_CREATING on the
- *   leaf-most cookies of the object and all its children
  */
 static const struct fscache_state *fscache_initialise_object(struct fscache_object *object,
 							     int event)
@@ -344,14 +328,14 @@ static const struct fscache_state *fscache_initialise_object(struct fscache_obje
 	parent = object->parent;
 	if (!parent) {
 		_leave(" [no parent]");
-		return transit_to(DETACH_FROM_COOKIE);
+		return transit_to(DROP_OBJECT);
 	}
 
-	_debug("parent %s", parent->state->name);
+	_debug("parent: %s of:%lx", parent->state->name, parent->flags);
 
 	if (fscache_object_is_dying(parent)) {
 		_leave(" [bad parent]");
-		return transit_to(DETACH_FROM_COOKIE);
+		return transit_to(DROP_OBJECT);
 	}
 
 	if (fscache_object_is_available(parent)) {
@@ -373,7 +357,7 @@ static const struct fscache_state *fscache_initialise_object(struct fscache_obje
 	spin_unlock(&parent->lock);
 	if (!success) {
 		_leave(" [grab failed]");
-		return transit_to(DETACH_FROM_COOKIE);
+		return transit_to(DROP_OBJECT);
 	}
 
 	/* fscache_acquire_non_index_cookie() uses this
@@ -409,8 +393,6 @@ static const struct fscache_state *fscache_parent_ready(struct fscache_object *o
  * look an object up in the cache from which it was allocated
  * - we hold an "access lock" on the parent object, so the parent object cannot
  *   be withdrawn by either party till we've finished
- * - an object's cookie is pinned until we clear FSCACHE_COOKIE_CREATING on the
- *   leaf-most cookies of the object and all its children
  */
 static const struct fscache_state *fscache_look_up_object(struct fscache_object *object,
 							  int event)
@@ -431,22 +413,21 @@ static const struct fscache_state *fscache_look_up_object(struct fscache_object
 	ASSERT(fscache_object_is_available(parent));
 
 	if (fscache_object_is_dying(parent) ||
-	    test_bit(FSCACHE_IOERROR, &object->cache->flags)) {
+	    test_bit(FSCACHE_IOERROR, &object->cache->flags) ||
+	    !fscache_use_cookie(object)) {
 		_leave(" [unavailable]");
 		return transit_to(LOOKUP_FAILURE);
 	}
 
-	_debug("LOOKUP \"%s/%s\" in \"%s\"",
-	       parent->cookie->def->name, cookie->def->name,
-	       object->cache->tag->name);
+	_debug("LOOKUP \"%s\" in \"%s\"",
+	       cookie->def->name, object->cache->tag->name);
 
 	fscache_stat(&fscache_n_object_lookups);
 	fscache_stat(&fscache_n_cop_lookup_object);
 	ret = object->cache->ops->lookup_object(object);
 	fscache_stat_d(&fscache_n_cop_lookup_object);
 
-	if (test_bit(FSCACHE_OBJECT_EV_ERROR, &object->events))
-		set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
+	fscache_unuse_cookie(object);
 
 	if (ret == -ETIMEDOUT) {
 		/* probably stuck behind another object, so move this one to
@@ -528,11 +509,6 @@ void fscache_obtained_object(struct fscache_object *object)
 	}
 
 	set_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
-
-	/* Permit __fscache_relinquish_cookie() to proceed */
-	clear_bit_unlock(FSCACHE_COOKIE_CREATING, &cookie->flags);
-	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
-
 	_leave("");
 }
 EXPORT_SYMBOL(fscache_obtained_object);
@@ -543,16 +519,12 @@ EXPORT_SYMBOL(fscache_obtained_object);
 static const struct fscache_state *fscache_object_available(struct fscache_object *object,
 							    int event)
 {
-	struct fscache_cookie *cookie = object->cookie;
-
 	_enter("{OBJ%x},%d", object->debug_id, event);
 
 	object->oob_table = fscache_osm_run_oob;
 
 	spin_lock(&object->lock);
 
-	ASSERTIF(cookie, !test_bit(FSCACHE_COOKIE_CREATING, &object->cookie->flags));
-
 	fscache_done_parent_op(object);
 	if (object->n_in_progress == 0) {
 		if (object->n_ops > 0) {
@@ -595,7 +567,6 @@ static const struct fscache_state *fscache_lookup_failure(struct fscache_object
 							  int event)
 {
 	struct fscache_cookie *cookie;
-	bool wake_looking_up = false;
 
 	_enter("{OBJ%x},%d", object->debug_id, event);
 
@@ -605,19 +576,10 @@ static const struct fscache_state *fscache_lookup_failure(struct fscache_object
 	object->cache->ops->lookup_complete(object);
 	fscache_stat_d(&fscache_n_cop_lookup_complete);
 
-	spin_lock(&object->lock);
 	cookie = object->cookie;
 	set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
-	if (cookie) {
-		if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
-			wake_looking_up = true;
-		clear_bit_unlock(FSCACHE_COOKIE_CREATING, &cookie->flags);
-	}
-	spin_unlock(&object->lock);
-
-	if (wake_looking_up)
+	if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
 		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);
-	wake_up_bit(&cookie->flags, FSCACHE_COOKIE_CREATING);
 
 	fscache_done_parent_op(object);
 	return transit_to(KILL_OBJECT);
@@ -633,21 +595,20 @@ static const struct fscache_state *fscache_kill_object(struct fscache_object *ob
 	_enter("{OBJ%x,%d,%d},%d",
 	       object->debug_id, object->n_ops, object->n_children, event);
 
-	object->oob_event_mask = 0;
-
-	spin_lock(&object->lock);
 	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
-	spin_unlock(&object->lock);
+	object->oob_event_mask = 0;
 
 	if (list_empty(&object->dependents) &&
 	    object->n_ops == 0 &&
 	    object->n_children == 0)
-		return object->cookie ?
-			transit_to(DETACH_FROM_COOKIE) : transit_to(DROP_OBJECT);
+		return transit_to(DROP_OBJECT);
 
-	spin_lock(&object->lock);
-	fscache_start_operations(object);
-	spin_unlock(&object->lock);
+	if (object->n_in_progress == 0) {
+		spin_lock(&object->lock);
+		if (object->n_ops > 0 && object->n_in_progress == 0)
+			fscache_start_operations(object);
+		spin_unlock(&object->lock);
+	}
 
 	if (!list_empty(&object->dependents))
 		return transit_to(KILL_DEPENDENTS);
@@ -669,64 +630,32 @@ static const struct fscache_state *fscache_kill_dependents(struct fscache_object
 }
 
 /*
- * withdraw an object from active service
- */
-static const struct fscache_state *fscache_detach_from_cookie(struct fscache_object *object,
-							      int event)
-{
-	struct fscache_cookie *cookie;
-	bool detached = false, awaken = false;
-
-	_enter("{OBJ%x},%d", object->debug_id, event);
-
-	spin_lock(&object->lock);
-	cookie = object->cookie;
-	if (cookie) {
-		/* need to get the cookie lock before the object lock, starting
-		 * from the object pointer */
-		atomic_inc(&cookie->usage);
-		spin_unlock(&object->lock);
-
-		spin_lock(&cookie->lock);
-		spin_lock(&object->lock);
-
-		if (object->cookie == cookie) {
-			hlist_del_init(&object->cookie_link);
-			object->cookie = NULL;
-			if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING,
-					       &cookie->flags))
-				awaken = true;
-			detached = true;
-		}
-		spin_unlock(&cookie->lock);
-		fscache_cookie_put(cookie);
-		if (detached)
-			fscache_cookie_put(cookie);
-	}
-
-	spin_unlock(&object->lock);
-
-	if (awaken)
-		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
-
-	fscache_stat(&fscache_n_object_dead);
-	_leave("");
-	return transit_to(DROP_OBJECT);
-}
-
-/*
  * Drop an object's attachments
  */
 static const struct fscache_state *fscache_drop_object(struct fscache_object *object,
 						       int event)
 {
 	struct fscache_object *parent = object->parent;
+	struct fscache_cookie *cookie = object->cookie;
 	struct fscache_cache *cache = object->cache;
+	bool awaken = false;
 
 	_enter("{OBJ%x,%d},%d", object->debug_id, object->n_children, event);
 
-	ASSERTCMP(object->cookie, ==, NULL);
-	ASSERT(hlist_unhashed(&object->cookie_link));
+	ASSERT(cookie != NULL);
+	ASSERT(!hlist_unhashed(&object->cookie_link));
+
+	/* Make sure the cookie no longer points here and that the netfs isn't
+	 * waiting for us.
+	 */
+	spin_lock(&cookie->lock);
+	hlist_del_init(&object->cookie_link);
+	if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
+		awaken = true;
+	spin_unlock(&cookie->lock);
+
+	if (awaken)
+		wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
 
 	/* Prevent a race with our last child, which has to signal EV_CLEARED
 	 * before dropping our spinlock.
@@ -787,6 +716,22 @@ static void fscache_put_object(struct fscache_object *object)
 	fscache_stat_d(&fscache_n_cop_put_object);
 }
 
+/**
+ * fscache_object_destroy - Note that a cache object is about to be destroyed
+ * @object: The object to be destroyed
+ *
+ * Note the imminent destruction and deallocation of a cache object record.
+ */
+void fscache_object_destroy(struct fscache_object *object)
+{
+	fscache_objlist_remove(object);
+
+	/* We can get rid of the cookie now */
+	fscache_cookie_put(object->cookie);
+	object->cookie = NULL;
+}
+EXPORT_SYMBOL(fscache_object_destroy);
+
 /*
  * enqueue an object for metadata-type processing
  */
@@ -896,7 +841,10 @@ static void fscache_dequeue_object(struct fscache_object *object)
  * @data: The auxiliary data for the object
  * @datalen: The size of the auxiliary data
  *
- * This function consults the netfs about the coherency state of an object
+ * This function consults the netfs about the coherency state of an object.
+ * The caller must be holding a ref on cookie->n_active (held by
+ * fscache_look_up_object() on behalf of the cache backend during object lookup
+ * and creation).
  */
 enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
 					const void *data, uint16_t datalen)
@@ -945,6 +893,15 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 
 	_enter("{OBJ%x},%d", object->debug_id, event);
 
+	/* We're going to need the cookie.  If the cookie is not available then
+	 * retire the object instead.
+	 */
+	if (!fscache_use_cookie(object)) {
+		ASSERT(object->cookie->stores.rnode == NULL);
+		set_bit(FSCACHE_COOKIE_RETIRED, &cookie->flags);
+		_leave(" [no cookie]");
+		return transit_to(KILL_OBJECT);
+	}
 
 	/* Reject any new read/write ops and abort any that are pending. */
 	fscache_invalidate_writes(cookie);
@@ -953,14 +910,13 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 
 	/* Now we have to wait for in-progress reads and writes */
 	op = kzalloc(sizeof(*op), GFP_KERNEL);
-	if (!op) {
-		clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
-		_leave(" [ENOMEM]");
-		return transit_to(KILL_OBJECT);
-	}
+	if (!op)
+		goto nomem;
 
 	fscache_operation_init(op, object->cache->ops->invalidate_object, NULL);
-	op->flags = FSCACHE_OP_ASYNC | (1 << FSCACHE_OP_EXCLUSIVE);
+	op->flags = FSCACHE_OP_ASYNC |
+		(1 << FSCACHE_OP_EXCLUSIVE) |
+		(1 << FSCACHE_OP_UNUSE_COOKIE);
 
 	spin_lock(&cookie->lock);
 	if (fscache_submit_exclusive_op(object, op) < 0)
@@ -982,6 +938,12 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 	_leave(" [ok]");
 	return transit_to(UPDATE_OBJECT);
 
+nomem:
+	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_unuse_cookie(object);
+	_leave(" [ENOMEM]");
+	return transit_to(KILL_OBJECT);
+
 submit_op_failed:
 	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
 	spin_unlock(&cookie->lock);
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 4da211b..6935901 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -424,14 +424,10 @@ void fscache_put_operation(struct fscache_operation *op)
 
 	object = op->object;
 
-	if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags)) {
-		if (atomic_dec_and_test(&object->n_reads)) {
-			clear_bit(FSCACHE_COOKIE_WAITING_ON_READS,
-				  &object->cookie->flags);
-			wake_up_bit(&object->cookie->flags,
-				    FSCACHE_COOKIE_WAITING_ON_READS);
-		}
-	}
+	if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags))
+		atomic_dec(&object->n_reads);
+	if (test_bit(FSCACHE_OP_UNUSE_COOKIE, &op->flags))
+		fscache_unuse_cookie(object);
 
 	/* now... we may get called with the object spinlock held, so we
 	 * complete the cleanup here only if we can immediately acquire the
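
Note (illustration, not part of the diff): FSCACHE_OP_UNUSE_COOKIE moves the
cookie "unuse" step into op teardown.  A sketch of the pairing an op submitter
is expected to maintain, using the names that appear elsewhere in this series:

	/* The submitter takes the active count before queueing the op... */
	atomic_inc(&cookie->n_active);
	op->flags |= 1UL << FSCACHE_OP_UNUSE_COOKIE;

	/* ...and fscache_put_operation() drops it again by calling
	 * fscache_unuse_cookie(op->object) when the op is destroyed, so the
	 * reference lives exactly as long as the operation does.
	 */
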
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index b4e4b42..780bac6 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -163,10 +163,12 @@ static void fscache_attr_changed_op(struct fscache_operation *op)
 
 	fscache_stat(&fscache_n_attr_changed_calls);
 
-	if (fscache_object_is_active(object)) {
+	if (fscache_object_is_active(object) &&
+	    fscache_use_cookie(object)) {
 		fscache_stat(&fscache_n_cop_attr_changed);
 		ret = object->cache->ops->attr_changed(object);
 		fscache_stat_d(&fscache_n_cop_attr_changed);
+		fscache_unuse_cookie(object);
 		if (ret < 0)
 			fscache_abort_object(object);
 	}
@@ -246,6 +248,7 @@ static void fscache_release_retrieval_op(struct fscache_operation *_op)
  * allocate a retrieval op
  */
 static struct fscache_retrieval *fscache_alloc_retrieval(
+	struct fscache_cookie *cookie,
 	struct address_space *mapping,
 	fscache_rw_complete_t end_io_func,
 	void *context)
@@ -260,7 +263,10 @@ static struct fscache_retrieval *fscache_alloc_retrieval(
 	}
 
 	fscache_operation_init(&op->op, NULL, fscache_release_retrieval_op);
-	op->op.flags	= FSCACHE_OP_MYTHREAD | (1 << FSCACHE_OP_WAITING);
+	atomic_inc(&cookie->n_active);
+	op->op.flags	= FSCACHE_OP_MYTHREAD |
+		(1UL << FSCACHE_OP_WAITING) |
+		(1UL << FSCACHE_OP_UNUSE_COOKIE);
 	op->mapping	= mapping;
 	op->end_io_func	= end_io_func;
 	op->context	= context;
@@ -394,7 +400,8 @@ int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
 	if (fscache_wait_for_deferred_lookup(cookie) < 0)
 		return -ERESTARTSYS;
 
-	op = fscache_alloc_retrieval(page->mapping, end_io_func, context);
+	op = fscache_alloc_retrieval(cookie, page->mapping,
+				     end_io_func, context);
 	if (!op) {
 		_leave(" = -ENOMEM");
 		return -ENOMEM;
@@ -465,6 +472,7 @@ nobufs_unlock_dec:
 	atomic_dec(&object->n_reads);
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
+	atomic_dec(&cookie->n_active);
 	kfree(op);
 nobufs:
 	fscache_stat(&fscache_n_retrievals_nobufs);
@@ -522,7 +530,7 @@ int __fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
 	if (fscache_wait_for_deferred_lookup(cookie) < 0)
 		return -ERESTARTSYS;
 
-	op = fscache_alloc_retrieval(mapping, end_io_func, context);
+	op = fscache_alloc_retrieval(cookie, mapping, end_io_func, context);
 	if (!op)
 		return -ENOMEM;
 	op->n_pages = *nr_pages;
@@ -589,6 +597,7 @@ nobufs_unlock_dec:
 	atomic_dec(&object->n_reads);
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
+	atomic_dec(&cookie->n_active);
 	kfree(op);
 nobufs:
 	fscache_stat(&fscache_n_retrievals_nobufs);
@@ -631,7 +640,7 @@ int __fscache_alloc_page(struct fscache_cookie *cookie,
 	if (fscache_wait_for_deferred_lookup(cookie) < 0)
 		return -ERESTARTSYS;
 
-	op = fscache_alloc_retrieval(page->mapping, NULL, NULL);
+	op = fscache_alloc_retrieval(cookie, page->mapping, NULL, NULL);
 	if (!op)
 		return -ENOMEM;
 	op->n_pages = 1;
@@ -675,6 +684,7 @@ error:
 
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
+	atomic_dec(&cookie->n_active);
 	kfree(op);
 nobufs:
 	fscache_stat(&fscache_n_allocs_nobufs);
@@ -876,7 +886,9 @@ int __fscache_write_page(struct fscache_cookie *cookie,
 
 	fscache_operation_init(&op->op, fscache_write_op,
 			       fscache_release_write_op);
-	op->op.flags = FSCACHE_OP_ASYNC | (1 << FSCACHE_OP_WAITING);
+	op->op.flags = FSCACHE_OP_ASYNC |
+		(1 << FSCACHE_OP_WAITING) |
+		(1 << FSCACHE_OP_UNUSE_COOKIE);
 
 	ret = radix_tree_preload(gfp & ~__GFP_HIGHMEM);
 	if (ret < 0)
@@ -922,6 +934,7 @@ int __fscache_write_page(struct fscache_cookie *cookie,
 	op->op.debug_id	= atomic_inc_return(&fscache_op_debug_id);
 	op->store_limit = object->store_limit;
 
+	atomic_inc(&cookie->n_active);
 	if (fscache_submit_op(object, &op->op) < 0)
 		goto submit_failed;
 
@@ -948,6 +961,7 @@ already_pending:
 	return 0;
 
 submit_failed:
+	atomic_dec(&cookie->n_active);
 	spin_lock(&cookie->stores_lock);
 	radix_tree_delete(&cookie->stores, page->index);
 	spin_unlock(&cookie->stores_lock);
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 9ff516b..e7802a5 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -97,7 +97,8 @@ struct fscache_operation {
 #define FSCACHE_OP_WAITING	4	/* cleared when op is woken */
 #define FSCACHE_OP_EXCLUSIVE	5	/* exclusive op, other ops must wait */
 #define FSCACHE_OP_DEC_READ_CNT	6	/* decrement object->n_reads on destruction */
-#define FSCACHE_OP_KEEP_FLAGS	0x0070	/* flags to keep when repurposing an op */
+#define FSCACHE_OP_UNUSE_COOKIE	7	/* call fscache_unuse_cookie() on completion */
+#define FSCACHE_OP_KEEP_FLAGS	0x00f0	/* flags to keep when repurposing an op */
 
 	enum fscache_operation_state state;
 	atomic_t		usage;
@@ -314,6 +315,7 @@ struct fscache_cache_ops {
 struct fscache_cookie {
 	atomic_t			usage;		/* number of users of this cookie */
 	atomic_t			n_children;	/* number of children of this cookie */
+	atomic_t			n_active;	/* number of active users of netfs ptrs */
 	spinlock_t			lock;
 	spinlock_t			stores_lock;	/* lock on page store tree */
 	struct hlist_head		backing_objects; /* object(s) backing this file/index */
@@ -326,11 +328,11 @@ struct fscache_cookie {
 
 	unsigned long			flags;
 #define FSCACHE_COOKIE_LOOKING_UP	0	/* T if non-index cookie being looked up still */
-#define FSCACHE_COOKIE_CREATING		1	/* T if non-index object being created still */
-#define FSCACHE_COOKIE_NO_DATA_YET	2	/* T if new object with no cached data yet */
-#define FSCACHE_COOKIE_UNAVAILABLE	3	/* T if cookie is unavailable (error, etc) */
-#define FSCACHE_COOKIE_WAITING_ON_READS	4	/* T if cookie is waiting on reads */
-#define FSCACHE_COOKIE_INVALIDATING	5	/* T if cookie is being invalidated */
+#define FSCACHE_COOKIE_NO_DATA_YET	1	/* T if new object with no cached data yet */
+#define FSCACHE_COOKIE_UNAVAILABLE	2	/* T if cookie is unavailable (error, etc) */
+#define FSCACHE_COOKIE_INVALIDATING	3	/* T if cookie is being invalidated */
+#define FSCACHE_COOKIE_RELINQUISHED	4	/* T if cookie has been relinquished */
+#define FSCACHE_COOKIE_RETIRED		5	/* T if cookie was retired */
 };
 
 extern struct fscache_cookie fscache_fsdef_index;
@@ -392,10 +394,9 @@ struct fscache_object {
 #define FSCACHE_OBJECT_LOCK		0	/* T if object is busy being processed */
 #define FSCACHE_OBJECT_PENDING_WRITE	1	/* T if object has pending write */
 #define FSCACHE_OBJECT_WAITING		2	/* T if object is waiting on its parent */
-#define FSCACHE_OBJECT_RETIRE		3	/* T if object should be retired */
-#define FSCACHE_OBJECT_IS_LIVE		4	/* T if object is not withdrawn or relinquished */
-#define FSCACHE_OBJECT_IS_LOOKED_UP	5	/* T if object has been looked up */
-#define FSCACHE_OBJECT_IS_AVAILABLE	6	/* T if object has become active */
+#define FSCACHE_OBJECT_IS_LIVE		3	/* T if object is not withdrawn or relinquished */
+#define FSCACHE_OBJECT_IS_LOOKED_UP	4	/* T if object has been looked up */
+#define FSCACHE_OBJECT_IS_AVAILABLE	5	/* T if object has become active */
 
 	struct list_head	cache_link;	/* link in cache->object_list */
 	struct hlist_node	cookie_link;	/* link in cookie->backing_objects */
@@ -415,16 +416,11 @@ struct fscache_object {
 
 extern void fscache_object_init(struct fscache_object *, struct fscache_cookie *,
 				struct fscache_cache *);
+extern void fscache_object_destroy(struct fscache_object *);
 
 extern void fscache_object_lookup_negative(struct fscache_object *object);
 extern void fscache_obtained_object(struct fscache_object *object);
 
-#ifdef CONFIG_FSCACHE_OBJECT_LIST
-extern void fscache_object_destroy(struct fscache_object *object);
-#else
-#define fscache_object_destroy(object) do {} while(0)
-#endif
-
 static inline bool fscache_object_is_live(struct fscache_object *object)
 {
 	return test_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
@@ -512,6 +508,33 @@ static inline void fscache_end_io(struct fscache_retrieval *op,
 	op->end_io_func(page, op->context, error);
 }
 
+/**
+ * fscache_use_cookie - Request usage of cookie attached to an object
+ * @object: Object description
+ * 
+ * Request usage of the cookie attached to an object.  false is returned if
+ * relinquishment has already reduced the cookie's active count to 0.
+ */
+static inline bool fscache_use_cookie(struct fscache_object *object)
+{
+	struct fscache_cookie *cookie = object->cookie;
+	return atomic_inc_not_zero(&cookie->n_active) != 0;
+}
+
+/**
+ * fscache_unuse_cookie - Cease usage of cookie attached to an object
+ * @object: Object description
+ * 
+ * Cease usage of the cookie attached to an object.  When the active-user
+ * count reaches zero, relinquishment of the cookie is permitted to proceed.
+ */
+static inline void fscache_unuse_cookie(struct fscache_object *object)
+{
+	struct fscache_cookie *cookie = object->cookie;
+	if (atomic_dec_and_test(&cookie->n_active))
+		wake_up_atomic_t(&cookie->n_active);
+}
+
 /*
  * out-of-line cache backend functions
  */
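
Note (illustration, not part of the diff): the intended calling convention for
the two helpers above, matching the pattern already used in object-list.c.
Anything that reads through cookie->def or cookie->netfs_data should be
bracketed like this; do_something_with() is a hypothetical placeholder:

	if (fscache_use_cookie(object)) {
		/* cookie->def and cookie->netfs_data cannot go away here */
		do_something_with(object->cookie->netfs_data);
		fscache_unuse_cookie(object);
	} else {
		/* The cookie is being relinquished; treat it as absent */
	}
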


^ permalink raw reply related	[flat|nested] 9+ messages in thread
