All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space
@ 2015-02-26 14:01 David Howells
  2015-02-26 14:02 ` [PATCH 01/13] FS-Cache: Count culled objects and objects rejected due to lack " David Howells
                   ` (13 more replies)
  0 siblings, 14 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:01 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel


Attached are a bunch of patches that progressively fix bugs that can occur
when the cache hits the hard "must maintain X free space" limit and starts
rejecting requests.

The commit ensubjected "FS-Cache: Synchronise object death state change vs
operation submission" is the main bug I was looking for.  Some of the other
patches fix other bugs I found along the way and the remaining patches are
changes pulled out of the bugfixes to make the bugfix commits more narrowly
targeted.

Note that I added some new stats:

	Ops    : ini=836360 dfr=19 rel=836360 gc=19

The ini=N field here indicates the number of fscache_operation structs that
were initialised and should match the rel=M value.

	CacheEv: nsp=3425 stl=0 rtr=0 cul=0

Indicates counts of certain cache backend events: nsp=N shows how many objects
were rejected due to lack of space in the cache, stl=N shows how many objects
were found to be stale upon reuse and thus discarded, rtr=N how many objects
were retired and cul=N shows how many inactive objects were culled.

These can also be found here:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-fixes

tagged with:

	fscache-fixes-20150226

David
---
David Howells (13):
      FS-Cache: Count culled objects and objects rejected due to lack of space
      FS-Cache: Move fscache_report_unexpected_submission() to make it more available
      FS-Cache: When submitting an op, cancel it if the target object is dying
      FS-Cache: Handle a new operation submitted against a killed object
      FS-Cache: Synchronise object death state change vs operation submission
      FS-Cache: fscache_object_is_dead() has wrong logic, kill it
      FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too
      FS-Cache: Out of line fscache_operation_init()
      FS-Cache: Count the number of initialised operations
      FS-Cache: Fix cancellation of in-progress operation
      FS-Cache: Put an aborted initialised op so that it is accounted correctly
      FS-Cache: The operation cancellation method needs calling in more places
      FS-Cache: Retain the netfs context in the retrieval op earlier


 Documentation/filesystems/caching/backend-api.txt |   23 ++
 Documentation/filesystems/caching/fscache.txt     |    7 -
 fs/cachefiles/internal.h                          |    1 
 fs/cachefiles/namei.c                             |   33 ++-
 fs/fscache/cookie.c                               |    8 -
 fs/fscache/internal.h                             |   12 +
 fs/fscache/object.c                               |   69 +++++-
 fs/fscache/operation.c                            |  254 ++++++++++++++-------
 fs/fscache/page.c                                 |   86 ++++---
 fs/fscache/stats.c                                |   14 +
 include/linux/fscache-cache.h                     |   55 ++---
 11 files changed, 378 insertions(+), 184 deletions(-)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 01/13] FS-Cache: Count culled objects and objects rejected due to lack of space
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02 ` David Howells
  2015-02-26 14:02   ` David Howells
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Count the number of objects that get culled by the cache backend and the
number of objects that the cache backend declines to instantiate due to lack
of space in the cache.

These numbers are made available through /proc/fs/fscache/stats

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/filesystems/caching/backend-api.txt |   23 ++++++++++
 Documentation/filesystems/caching/fscache.txt     |    4 ++
 fs/cachefiles/internal.h                          |    1 
 fs/cachefiles/namei.c                             |   33 +++++++++------
 fs/fscache/internal.h                             |    5 ++
 fs/fscache/object.c                               |   47 +++++++++++++++++++++
 fs/fscache/stats.c                                |   10 ++++
 include/linux/fscache-cache.h                     |   12 +++++
 8 files changed, 122 insertions(+), 13 deletions(-)

diff --git a/Documentation/filesystems/caching/backend-api.txt b/Documentation/filesystems/caching/backend-api.txt
index 277d1e810670..c0bd5677271b 100644
--- a/Documentation/filesystems/caching/backend-api.txt
+++ b/Documentation/filesystems/caching/backend-api.txt
@@ -676,6 +676,29 @@ FS-Cache provides some utilities that a cache backend may make use of:
      as possible.
 
 
+ (*) Indicate that a stale object was found and discarded:
+
+	void fscache_object_retrying_stale(struct fscache_object *object);
+
+     This is called to indicate that the lookup procedure found an object in
+     the cache that the netfs decided was stale.  The object has been
+     discarded from the cache and the lookup will be performed again.
+
+
+ (*) Indicate that the caching backend killed an object:
+
+	void fscache_object_mark_killed(struct fscache_object *object,
+					enum fscache_why_object_killed why);
+
+     This is called to indicate that the cache backend preemptively killed an
+     object.  The why parameter should be set to indicate the reason:
+
+	FSCACHE_OBJECT_IS_STALE - the object was stale and needs discarding.
+	FSCACHE_OBJECT_NO_SPACE - there was insufficient cache space
+	FSCACHE_OBJECT_WAS_RETIRED - the object was retired when relinquished.
+	FSCACHE_OBJECT_WAS_CULLED - the object was culled to make space.
+
+
  (*) Get and release references on a retrieval record:
 
 	void fscache_get_retrieval(struct fscache_retrieval *op);
diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 770267af5b3e..66fa7fbccfa4 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -303,6 +303,10 @@ proc files.
 		wrp=N	Number of in-progress write_page() cache ops
 		ucp=N	Number of in-progress uncache_page() cache ops
 		dsp=N	Number of in-progress dissociate_pages() cache ops
+	CacheEv	nsp=N	Number of object lookups/creations rejected due to lack of space
+		stl=N	Number of stale objects deleted
+		rtr=N	Number of objects retired when relinquished
+		cul=N	Number of objects culled
 
 
  (*) /proc/fs/fscache/histogram
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 8c52472d2efa..aecd0859eacb 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -43,7 +43,6 @@ struct cachefiles_object {
 	loff_t				i_size;		/* object size */
 	unsigned long			flags;
 #define CACHEFILES_OBJECT_ACTIVE	0		/* T if marked active */
-#define CACHEFILES_OBJECT_BURIED	1		/* T if preemptively buried */
 	atomic_t			usage;		/* object usage count */
 	uint8_t				type;		/* object type */
 	uint8_t				new;		/* T if object new */
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7f8e83f9d74e..1d60195fc518 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -97,7 +97,8 @@ static noinline void cachefiles_printk_object(struct cachefiles_object *object,
  *   call vfs_unlink(), vfs_rmdir() or vfs_rename()
  */
 static void cachefiles_mark_object_buried(struct cachefiles_cache *cache,
-					  struct dentry *dentry)
+					  struct dentry *dentry,
+					  enum fscache_why_object_killed why)
 {
 	struct cachefiles_object *object;
 	struct rb_node *p;
@@ -132,8 +133,9 @@ found_dentry:
 		pr_err("\n");
 		pr_err("Error: Can't preemptively bury live object\n");
 		cachefiles_printk_object(object, NULL);
-	} else if (test_and_set_bit(CACHEFILES_OBJECT_BURIED, &object->flags)) {
-		pr_err("Error: Object already preemptively buried\n");
+	} else {
+		if (why != FSCACHE_OBJECT_IS_STALE)
+			fscache_object_mark_killed(&object->fscache, why);
 	}
 
 	write_unlock(&cache->active_lock);
@@ -265,7 +267,8 @@ requeue:
 static int cachefiles_bury_object(struct cachefiles_cache *cache,
 				  struct dentry *dir,
 				  struct dentry *rep,
-				  bool preemptive)
+				  bool preemptive,
+				  enum fscache_why_object_killed why)
 {
 	struct dentry *grave, *trap;
 	struct path path, path_to_graveyard;
@@ -289,7 +292,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache,
 			ret = vfs_unlink(dir->d_inode, rep, NULL);
 
 			if (preemptive)
-				cachefiles_mark_object_buried(cache, rep);
+				cachefiles_mark_object_buried(cache, rep, why);
 		}
 
 		mutex_unlock(&dir->d_inode->i_mutex);
@@ -394,7 +397,7 @@ try_again:
 					    "Rename failed with error %d", ret);
 
 		if (preemptive)
-			cachefiles_mark_object_buried(cache, rep);
+			cachefiles_mark_object_buried(cache, rep, why);
 	}
 
 	unlock_rename(cache->graveyard, dir);
@@ -422,7 +425,7 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
 
 	mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
 
-	if (test_bit(CACHEFILES_OBJECT_BURIED, &object->flags)) {
+	if (test_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->fscache.flags)) {
 		/* object allocation for the same key preemptively deleted this
 		 * object's file so that it could create its own file */
 		_debug("object preemptively buried");
@@ -433,7 +436,8 @@ int cachefiles_delete_object(struct cachefiles_cache *cache,
 		 * may have been renamed */
 		if (dir == object->dentry->d_parent) {
 			ret = cachefiles_bury_object(cache, dir,
-						     object->dentry, false);
+						     object->dentry, false,
+						     FSCACHE_OBJECT_WAS_RETIRED);
 		} else {
 			/* it got moved, presumably by cachefilesd culling it,
 			 * so it's no longer in the key path and we can ignore
@@ -522,7 +526,7 @@ lookup_again:
 		if (!next->d_inode) {
 			ret = cachefiles_has_space(cache, 1, 0);
 			if (ret < 0)
-				goto create_error;
+				goto no_space_error;
 
 			path.dentry = dir;
 			ret = security_path_mkdir(&path, next, 0);
@@ -551,7 +555,7 @@ lookup_again:
 		if (!next->d_inode) {
 			ret = cachefiles_has_space(cache, 1, 0);
 			if (ret < 0)
-				goto create_error;
+				goto no_space_error;
 
 			path.dentry = dir;
 			ret = security_path_mknod(&path, next, S_IFREG, 0);
@@ -602,7 +606,8 @@ lookup_again:
 			 * mutex) */
 			object->dentry = NULL;
 
-			ret = cachefiles_bury_object(cache, dir, next, true);
+			ret = cachefiles_bury_object(cache, dir, next, true,
+						     FSCACHE_OBJECT_IS_STALE);
 			dput(next);
 			next = NULL;
 
@@ -610,6 +615,7 @@ lookup_again:
 				goto delete_error;
 
 			_debug("redo lookup");
+			fscache_object_retrying_stale(&object->fscache);
 			goto lookup_again;
 		}
 	}
@@ -662,6 +668,8 @@ lookup_again:
 	_leave(" = 0 [%lu]", object->dentry->d_inode->i_ino);
 	return 0;
 
+no_space_error:
+	fscache_object_mark_killed(&object->fscache, FSCACHE_OBJECT_NO_SPACE);
 create_error:
 	_debug("create error %d", ret);
 	if (ret == -EIO)
@@ -927,7 +935,8 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	/*  actually remove the victim (drops the dir mutex) */
 	_debug("bury");
 
-	ret = cachefiles_bury_object(cache, dir, victim, false);
+	ret = cachefiles_bury_object(cache, dir, victim, false,
+				     FSCACHE_OBJECT_WAS_CULLED);
 	if (ret < 0)
 		goto error;
 
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 7872a62ef30c..3063a58b7d3d 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -271,6 +271,11 @@ extern atomic_t fscache_n_cop_write_page;
 extern atomic_t fscache_n_cop_uncache_page;
 extern atomic_t fscache_n_cop_dissociate_pages;
 
+extern atomic_t fscache_n_cache_no_space_reject;
+extern atomic_t fscache_n_cache_stale_objects;
+extern atomic_t fscache_n_cache_retired_objects;
+extern atomic_t fscache_n_cache_culled_objects;
+
 static inline void fscache_stat(atomic_t *stat)
 {
 	atomic_inc(stat);
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index da032daf0e0d..12bb468bf0ae 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -1016,3 +1016,50 @@ static const struct fscache_state *fscache_update_object(struct fscache_object *
 	_leave("");
 	return transit_to(WAIT_FOR_CMD);
 }
+
+/**
+ * fscache_object_retrying_stale - Note retrying stale object
+ * @object: The object that will be retried
+ *
+ * Note that an object lookup found an on-disk object that was adjudged to be
+ * stale and has been deleted.  The lookup will be retried.
+ */
+void fscache_object_retrying_stale(struct fscache_object *object)
+{
+	fscache_stat(&fscache_n_cache_no_space_reject);
+}
+EXPORT_SYMBOL(fscache_object_retrying_stale);
+
+/**
+ * fscache_object_mark_killed - Note that an object was killed
+ * @object: The object that was culled
+ * @why: The reason the object was killed.
+ *
+ * Note that an object was killed.  Returns true if the object was
+ * already marked killed, false if it wasn't.
+ */
+void fscache_object_mark_killed(struct fscache_object *object,
+				enum fscache_why_object_killed why)
+{
+	if (test_and_set_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->flags)) {
+		pr_err("Error: Object already killed by cache [%s]\n",
+		       object->cache->identifier);
+		return;
+	}
+
+	switch (why) {
+	case FSCACHE_OBJECT_NO_SPACE:
+		fscache_stat(&fscache_n_cache_no_space_reject);
+		break;
+	case FSCACHE_OBJECT_IS_STALE:
+		fscache_stat(&fscache_n_cache_stale_objects);
+		break;
+	case FSCACHE_OBJECT_WAS_RETIRED:
+		fscache_stat(&fscache_n_cache_retired_objects);
+		break;
+	case FSCACHE_OBJECT_WAS_CULLED:
+		fscache_stat(&fscache_n_cache_culled_objects);
+		break;
+	}
+}
+EXPORT_SYMBOL(fscache_object_mark_killed);
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 40d13c70ef51..3a722e8f2307 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -130,6 +130,11 @@ atomic_t fscache_n_cop_write_page;
 atomic_t fscache_n_cop_uncache_page;
 atomic_t fscache_n_cop_dissociate_pages;
 
+atomic_t fscache_n_cache_no_space_reject;
+atomic_t fscache_n_cache_stale_objects;
+atomic_t fscache_n_cache_retired_objects;
+atomic_t fscache_n_cache_culled_objects;
+
 /*
  * display the general statistics
  */
@@ -271,6 +276,11 @@ static int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_cop_write_page),
 		   atomic_read(&fscache_n_cop_uncache_page),
 		   atomic_read(&fscache_n_cop_dissociate_pages));
+	seq_printf(m, "CacheEv: nsp=%d stl=%d rtr=%d cul=%d\n",
+		   atomic_read(&fscache_n_cache_no_space_reject),
+		   atomic_read(&fscache_n_cache_stale_objects),
+		   atomic_read(&fscache_n_cache_retired_objects),
+		   atomic_read(&fscache_n_cache_culled_objects));
 	return 0;
 }
 
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 771484993ca7..c9dafdaf3347 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -371,6 +371,7 @@ struct fscache_object {
 #define FSCACHE_OBJECT_IS_LOOKED_UP	4	/* T if object has been looked up */
 #define FSCACHE_OBJECT_IS_AVAILABLE	5	/* T if object has become active */
 #define FSCACHE_OBJECT_RETIRED		6	/* T if object was retired on relinquishment */
+#define FSCACHE_OBJECT_KILLED_BY_CACHE	7	/* T if object was killed by the cache */
 
 	struct list_head	cache_link;	/* link in cache->object_list */
 	struct hlist_node	cookie_link;	/* link in cookie->backing_objects */
@@ -551,4 +552,15 @@ extern enum fscache_checkaux fscache_check_aux(struct fscache_object *object,
 					       const void *data,
 					       uint16_t datalen);
 
+extern void fscache_object_retrying_stale(struct fscache_object *object);
+
+enum fscache_why_object_killed {
+	FSCACHE_OBJECT_IS_STALE,
+	FSCACHE_OBJECT_NO_SPACE,
+	FSCACHE_OBJECT_WAS_RETIRED,
+	FSCACHE_OBJECT_WAS_CULLED,
+};
+extern void fscache_object_mark_killed(struct fscache_object *object,
+				       enum fscache_why_object_killed why);
+
 #endif /* _LINUX_FSCACHE_CACHE_H */


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 02/13] FS-Cache: Move fscache_report_unexpected_submission() to make it more available
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Move fscache_report_unexpected_submission() up within operation.c so that it
can be called from fscache_submit_exclusive_op() too.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c |   74 ++++++++++++++++++++++++------------------------
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index e7b87a0e5185..68ead8482617 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -76,6 +76,43 @@ static void fscache_run_op(struct fscache_object *object,
 }
 
 /*
+ * report an unexpected submission
+ */
+static void fscache_report_unexpected_submission(struct fscache_object *object,
+						 struct fscache_operation *op,
+						 const struct fscache_state *ostate)
+{
+	static bool once_only;
+	struct fscache_operation *p;
+	unsigned n;
+
+	if (once_only)
+		return;
+	once_only = true;
+
+	kdebug("unexpected submission OP%x [OBJ%x %s]",
+	       op->debug_id, object->debug_id, object->state->name);
+	kdebug("objstate=%s [%s]", object->state->name, ostate->name);
+	kdebug("objflags=%lx", object->flags);
+	kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
+	kdebug("ops=%u inp=%u exc=%u",
+	       object->n_ops, object->n_in_progress, object->n_exclusive);
+
+	if (!list_empty(&object->pending_ops)) {
+		n = 0;
+		list_for_each_entry(p, &object->pending_ops, pend_link) {
+			ASSERTCMP(p->object, ==, object);
+			kdebug("%p %p", op->processor, op->release);
+			n++;
+		}
+
+		kdebug("n=%u", n);
+	}
+
+	dump_stack();
+}
+
+/*
  * submit an exclusive operation for an object
  * - other ops are excluded from running simultaneously with this one
  * - this gets any extra refs it needs on an op
@@ -139,43 +176,6 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 }
 
 /*
- * report an unexpected submission
- */
-static void fscache_report_unexpected_submission(struct fscache_object *object,
-						 struct fscache_operation *op,
-						 const struct fscache_state *ostate)
-{
-	static bool once_only;
-	struct fscache_operation *p;
-	unsigned n;
-
-	if (once_only)
-		return;
-	once_only = true;
-
-	kdebug("unexpected submission OP%x [OBJ%x %s]",
-	       op->debug_id, object->debug_id, object->state->name);
-	kdebug("objstate=%s [%s]", object->state->name, ostate->name);
-	kdebug("objflags=%lx", object->flags);
-	kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
-	kdebug("ops=%u inp=%u exc=%u",
-	       object->n_ops, object->n_in_progress, object->n_exclusive);
-
-	if (!list_empty(&object->pending_ops)) {
-		n = 0;
-		list_for_each_entry(p, &object->pending_ops, pend_link) {
-			ASSERTCMP(p->object, ==, object);
-			kdebug("%p %p", op->processor, op->release);
-			n++;
-		}
-
-		kdebug("n=%u", n);
-	}
-
-	dump_stack();
-}
-
-/*
  * submit an operation for an object
  * - objects may be submitted only in the following states:
  *   - during object creation (write ops may be submitted)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 02/13] FS-Cache: Move fscache_report_unexpected_submission() to make it more available
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Move fscache_report_unexpected_submission() up within operation.c so that it
can be called from fscache_submit_exclusive_op() too.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c |   74 ++++++++++++++++++++++++------------------------
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index e7b87a0e5185..68ead8482617 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -76,6 +76,43 @@ static void fscache_run_op(struct fscache_object *object,
 }
 
 /*
+ * report an unexpected submission
+ */
+static void fscache_report_unexpected_submission(struct fscache_object *object,
+						 struct fscache_operation *op,
+						 const struct fscache_state *ostate)
+{
+	static bool once_only;
+	struct fscache_operation *p;
+	unsigned n;
+
+	if (once_only)
+		return;
+	once_only = true;
+
+	kdebug("unexpected submission OP%x [OBJ%x %s]",
+	       op->debug_id, object->debug_id, object->state->name);
+	kdebug("objstate=%s [%s]", object->state->name, ostate->name);
+	kdebug("objflags=%lx", object->flags);
+	kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
+	kdebug("ops=%u inp=%u exc=%u",
+	       object->n_ops, object->n_in_progress, object->n_exclusive);
+
+	if (!list_empty(&object->pending_ops)) {
+		n = 0;
+		list_for_each_entry(p, &object->pending_ops, pend_link) {
+			ASSERTCMP(p->object, ==, object);
+			kdebug("%p %p", op->processor, op->release);
+			n++;
+		}
+
+		kdebug("n=%u", n);
+	}
+
+	dump_stack();
+}
+
+/*
  * submit an exclusive operation for an object
  * - other ops are excluded from running simultaneously with this one
  * - this gets any extra refs it needs on an op
@@ -139,43 +176,6 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 }
 
 /*
- * report an unexpected submission
- */
-static void fscache_report_unexpected_submission(struct fscache_object *object,
-						 struct fscache_operation *op,
-						 const struct fscache_state *ostate)
-{
-	static bool once_only;
-	struct fscache_operation *p;
-	unsigned n;
-
-	if (once_only)
-		return;
-	once_only = true;
-
-	kdebug("unexpected submission OP%x [OBJ%x %s]",
-	       op->debug_id, object->debug_id, object->state->name);
-	kdebug("objstate=%s [%s]", object->state->name, ostate->name);
-	kdebug("objflags=%lx", object->flags);
-	kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
-	kdebug("ops=%u inp=%u exc=%u",
-	       object->n_ops, object->n_in_progress, object->n_exclusive);
-
-	if (!list_empty(&object->pending_ops)) {
-		n = 0;
-		list_for_each_entry(p, &object->pending_ops, pend_link) {
-			ASSERTCMP(p->object, ==, object);
-			kdebug("%p %p", op->processor, op->release);
-			n++;
-		}
-
-		kdebug("n=%u", n);
-	}
-
-	dump_stack();
-}
-
-/*
  * submit an operation for an object
  * - objects may be submitted only in the following states:
  *   - during object creation (write ops may be submitted)

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 03/13] FS-Cache: When submitting an op, cancel it if the target object is dying
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

When submitting an operation, prefer to cancel the operation immediately
rather than queuing it for later processing if the object is marked as dying
(ie. the object state machine has reached the KILL_OBJECT state).

Whilst we're at it, change the series of related test_bit() calls into a
READ_ONCE() and bitwise-AND operators to reduce the number of load
instructions (test_bit() has a volatile address).

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c        |   47 ++++++++++++++++++++++++++---------------
 include/linux/fscache-cache.h |    9 ++++++--
 2 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 68ead8482617..dec6defe3be3 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -120,6 +120,8 @@ static void fscache_report_unexpected_submission(struct fscache_object *object,
 int fscache_submit_exclusive_op(struct fscache_object *object,
 				struct fscache_operation *op)
 {
+	const struct fscache_state *ostate;
+	unsigned long flags;
 	int ret;
 
 	_enter("{OBJ%x OP%x},", object->debug_id, op->debug_id);
@@ -132,8 +134,19 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 	ASSERTCMP(object->n_ops, >=, object->n_exclusive);
 	ASSERT(list_empty(&op->pend_link));
 
+	ostate = object->state;
+	smp_rmb();
+
 	op->state = FSCACHE_OP_ST_PENDING;
-	if (fscache_object_is_active(object)) {
+	flags = READ_ONCE(object->flags);
+	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
+		fscache_stat(&fscache_n_op_rejected);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
+	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -EIO;
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
 		op->object = object;
 		object->n_ops++;
 		object->n_exclusive++;	/* reads and writes must wait */
@@ -155,7 +168,7 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		/* need to issue a new write op after this */
 		clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
 		ret = 0;
-	} else if (test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_LOOKED_UP)) {
 		op->object = object;
 		object->n_ops++;
 		object->n_exclusive++;	/* reads and writes must wait */
@@ -164,11 +177,9 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
 	} else {
-		/* If we're in any other state, there must have been an I/O
-		 * error of some nature.
-		 */
-		ASSERT(test_bit(FSCACHE_IOERROR, &object->cache->flags));
-		ret = -EIO;
+		fscache_report_unexpected_submission(object, op, ostate);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
 	}
 
 	spin_unlock(&object->lock);
@@ -187,6 +198,7 @@ int fscache_submit_op(struct fscache_object *object,
 		      struct fscache_operation *op)
 {
 	const struct fscache_state *ostate;
+	unsigned long flags;
 	int ret;
 
 	_enter("{OBJ%x OP%x},{%u}",
@@ -204,7 +216,15 @@ int fscache_submit_op(struct fscache_object *object,
 	smp_rmb();
 
 	op->state = FSCACHE_OP_ST_PENDING;
-	if (fscache_object_is_active(object)) {
+	flags = READ_ONCE(object->flags);
+	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
+		fscache_stat(&fscache_n_op_rejected);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
+	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -EIO;
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
 		op->object = object;
 		object->n_ops++;
 
@@ -222,25 +242,18 @@ int fscache_submit_op(struct fscache_object *object,
 			fscache_run_op(object, op);
 		}
 		ret = 0;
-	} else if (test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_LOOKED_UP)) {
 		op->object = object;
 		object->n_ops++;
 		atomic_inc(&op->usage);
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
-	} else if (fscache_object_is_dying(object)) {
-		fscache_stat(&fscache_n_op_rejected);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	} else if (!test_bit(FSCACHE_IOERROR, &object->cache->flags)) {
+	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		ASSERT(!fscache_object_is_active(object));
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
-	} else {
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
 	}
 
 	spin_unlock(&object->lock);
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index c9dafdaf3347..2e83a141e465 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -411,17 +411,22 @@ static inline bool fscache_object_is_available(struct fscache_object *object)
 	return test_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
 }
 
+static inline bool fscache_cache_is_broken(struct fscache_object *object)
+{
+	return test_bit(FSCACHE_IOERROR, &object->cache->flags);
+}
+
 static inline bool fscache_object_is_active(struct fscache_object *object)
 {
 	return fscache_object_is_available(object) &&
 		fscache_object_is_live(object) &&
-		!test_bit(FSCACHE_IOERROR, &object->cache->flags);
+		!fscache_cache_is_broken(object);
 }
 
 static inline bool fscache_object_is_dead(struct fscache_object *object)
 {
 	return fscache_object_is_dying(object) &&
-		test_bit(FSCACHE_IOERROR, &object->cache->flags);
+		fscache_cache_is_broken(object);
 }
 
 /**


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 03/13] FS-Cache: When submitting an op, cancel it if the target object is dying
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

When submitting an operation, prefer to cancel the operation immediately
rather than queuing it for later processing if the object is marked as dying
(ie. the object state machine has reached the KILL_OBJECT state).

Whilst we're at it, change the series of related test_bit() calls into a
READ_ONCE() and bitwise-AND operators to reduce the number of load
instructions (test_bit() has a volatile address).

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c        |   47 ++++++++++++++++++++++++++---------------
 include/linux/fscache-cache.h |    9 ++++++--
 2 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 68ead8482617..dec6defe3be3 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -120,6 +120,8 @@ static void fscache_report_unexpected_submission(struct fscache_object *object,
 int fscache_submit_exclusive_op(struct fscache_object *object,
 				struct fscache_operation *op)
 {
+	const struct fscache_state *ostate;
+	unsigned long flags;
 	int ret;
 
 	_enter("{OBJ%x OP%x},", object->debug_id, op->debug_id);
@@ -132,8 +134,19 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 	ASSERTCMP(object->n_ops, >=, object->n_exclusive);
 	ASSERT(list_empty(&op->pend_link));
 
+	ostate = object->state;
+	smp_rmb();
+
 	op->state = FSCACHE_OP_ST_PENDING;
-	if (fscache_object_is_active(object)) {
+	flags = READ_ONCE(object->flags);
+	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
+		fscache_stat(&fscache_n_op_rejected);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
+	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -EIO;
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
 		op->object = object;
 		object->n_ops++;
 		object->n_exclusive++;	/* reads and writes must wait */
@@ -155,7 +168,7 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		/* need to issue a new write op after this */
 		clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
 		ret = 0;
-	} else if (test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_LOOKED_UP)) {
 		op->object = object;
 		object->n_ops++;
 		object->n_exclusive++;	/* reads and writes must wait */
@@ -164,11 +177,9 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
 	} else {
-		/* If we're in any other state, there must have been an I/O
-		 * error of some nature.
-		 */
-		ASSERT(test_bit(FSCACHE_IOERROR, &object->cache->flags));
-		ret = -EIO;
+		fscache_report_unexpected_submission(object, op, ostate);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
 	}
 
 	spin_unlock(&object->lock);
@@ -187,6 +198,7 @@ int fscache_submit_op(struct fscache_object *object,
 		      struct fscache_operation *op)
 {
 	const struct fscache_state *ostate;
+	unsigned long flags;
 	int ret;
 
 	_enter("{OBJ%x OP%x},{%u}",
@@ -204,7 +216,15 @@ int fscache_submit_op(struct fscache_object *object,
 	smp_rmb();
 
 	op->state = FSCACHE_OP_ST_PENDING;
-	if (fscache_object_is_active(object)) {
+	flags = READ_ONCE(object->flags);
+	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
+		fscache_stat(&fscache_n_op_rejected);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
+	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -EIO;
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
 		op->object = object;
 		object->n_ops++;
 
@@ -222,25 +242,18 @@ int fscache_submit_op(struct fscache_object *object,
 			fscache_run_op(object, op);
 		}
 		ret = 0;
-	} else if (test_bit(FSCACHE_OBJECT_IS_LOOKED_UP, &object->flags)) {
+	} else if (flags & BIT(FSCACHE_OBJECT_IS_LOOKED_UP)) {
 		op->object = object;
 		object->n_ops++;
 		atomic_inc(&op->usage);
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
-	} else if (fscache_object_is_dying(object)) {
-		fscache_stat(&fscache_n_op_rejected);
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
-	} else if (!test_bit(FSCACHE_IOERROR, &object->cache->flags)) {
+	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		ASSERT(!fscache_object_is_active(object));
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
-	} else {
-		op->state = FSCACHE_OP_ST_CANCELLED;
-		ret = -ENOBUFS;
 	}
 
 	spin_unlock(&object->lock);
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index c9dafdaf3347..2e83a141e465 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -411,17 +411,22 @@ static inline bool fscache_object_is_available(struct fscache_object *object)
 	return test_bit(FSCACHE_OBJECT_IS_AVAILABLE, &object->flags);
 }
 
+static inline bool fscache_cache_is_broken(struct fscache_object *object)
+{
+	return test_bit(FSCACHE_IOERROR, &object->cache->flags);
+}
+
 static inline bool fscache_object_is_active(struct fscache_object *object)
 {
 	return fscache_object_is_available(object) &&
 		fscache_object_is_live(object) &&
-		!test_bit(FSCACHE_IOERROR, &object->cache->flags);
+		!fscache_cache_is_broken(object);
 }
 
 static inline bool fscache_object_is_dead(struct fscache_object *object)
 {
 	return fscache_object_is_dying(object) &&
-		test_bit(FSCACHE_IOERROR, &object->cache->flags);
+		fscache_cache_is_broken(object);
 }
 
 /**

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 04/13] FS-Cache: Handle a new operation submitted against a killed object
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Reject new operations that are being submitted against an object if that
object has failed its lookup or creation states or has been killed by the
cache backend for some other reason, such as having been culled.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/object.c    |    2 ++
 fs/fscache/operation.c |    6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 12bb468bf0ae..9b79fc9a1464 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -610,6 +610,8 @@ static const struct fscache_state *fscache_lookup_failure(struct fscache_object
 	object->cache->ops->lookup_complete(object);
 	fscache_stat_d(&fscache_n_cop_lookup_complete);
 
+	set_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->flags);
+
 	cookie = object->cookie;
 	set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
 	if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index dec6defe3be3..18658fffbba1 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -176,6 +176,9 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
+	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		op->state = FSCACHE_OP_ST_CANCELLED;
@@ -249,6 +252,9 @@ int fscache_submit_op(struct fscache_object *object,
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
+	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		ASSERT(!fscache_object_is_active(object));


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 04/13] FS-Cache: Handle a new operation submitted against a killed object
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Reject new operations that are being submitted against an object if that
object has failed its lookup or creation states or has been killed by the
cache backend for some other reason, such as having been culled.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/object.c    |    2 ++
 fs/fscache/operation.c |    6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 12bb468bf0ae..9b79fc9a1464 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -610,6 +610,8 @@ static const struct fscache_state *fscache_lookup_failure(struct fscache_object
 	object->cache->ops->lookup_complete(object);
 	fscache_stat_d(&fscache_n_cop_lookup_complete);
 
+	set_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->flags);
+
 	cookie = object->cookie;
 	set_bit(FSCACHE_COOKIE_UNAVAILABLE, &cookie->flags);
 	if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index dec6defe3be3..18658fffbba1 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -176,6 +176,9 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
+	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		op->state = FSCACHE_OP_ST_CANCELLED;
@@ -249,6 +252,9 @@ int fscache_submit_op(struct fscache_object *object,
 		list_add_tail(&op->pend_link, &object->pending_ops);
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
+	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		ASSERT(!fscache_object_is_active(object));

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 05/13] FS-Cache: Synchronise object death state change vs operation submission
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

When an object is being marked as no longer live, do this under the object
spinlock to prevent a race with operation submission targeted on that object.

The problem occurs due to the following pair of intertwined sequences when the
cache tries to create an object that would take it over the hard available
space limit:

 NETFS INTERFACE
 ===============
 (A) The netfs calls fscache_acquire_cookie().  object creation is deferred to
     the object state machine and the netfs is allowed to continue.

	OBJECT STATE MACHINE KTHREAD
	============================
	(1) The object is looked up on disk by fscache_look_up_object()
	    calling cachefiles_walk_to_object().  The latter finds that the
	    object is not yet represented on disk and calls
	    fscache_object_lookup_negative().

	(2) fscache_object_lookup_negative() sets FSCACHE_COOKIE_NO_DATA_YET
	    and clears FSCACHE_COOKIE_LOOKING_UP, thus allowing the netfs to
	    start queuing read operations.

 (B) The netfs calls fscache_read_or_alloc_pages().  This calls
     fscache_wait_for_deferred_lookup() which sees FSCACHE_COOKIE_LOOKING_UP
     become clear, allowing the read to begin.

 (C) A read operation is set up and passed to fscache_submit_op() to deal
     with.

	(3) cachefiles_walk_to_object() calls cachefiles_has_space(), which
	    fails (or one of the file operations to create stuff fails).
	    cachefiles returns an error to fscache.

	(4) fscache_look_up_object() transits to the LOOKUP_FAILURE state,

	(5) fscache_lookup_failure() sets FSCACHE_OBJECT_LOOKED_UP and
	    FSCACHE_COOKIE_UNAVAILABLE and clears FSCACHE_COOKIE_LOOKING_UP
	    then transits to the KILL_OBJECT state.

	(6) fscache_kill_object() clears FSCACHE_OBJECT_IS_LIVE in an attempt
	    to reject any further requests from the netfs.

	(7) object->n_ops is examined and found to be 0.
	    fscache_kill_object() transits to the DROP_OBJECT state.

 (D) fscache_submit_op() locks the object spinlock, sees if it can dispatch
     the op immediately by calling fscache_object_is_active() - which fails
     since FSCACHE_OBJECT_IS_AVAILABLE has not yet been set.

 (E) fscache_submit_op() then tests FSCACHE_OBJECT_LOOKED_UP - which is set.
     It then queues the object and increments object->n_ops.

	(8) fscache_drop_object() releases the object and eventually
	    fscache_put_object() calls cachefiles_put_object() which suffers
	    an assertion failure here:

		ASSERTCMP(object->fscache.n_ops, ==, 0);

Locking the object spinlock in step (6) around the clearance of
FSCACHE_OBJECT_IS_LIVE ensures that the the decision trees in
fscache_submit_op() and fscache_submit_exclusive_op() don't see the IS_LIVE
flag being cleared mid-decision: either the op is queued before step (7) - in
which case fscache_kill_object() will see n_ops>0 and will deal with the op -
or the op will be rejected.

This, combined with rejecting op submission if the target object is dying, fix
the problem.

The problem shows up as the following oops:

CacheFiles: Assertion failed
CacheFiles: 1 == 0 is false
------------[ cut here ]------------
kernel BUG at ../fs/cachefiles/interface.c:339!
...
RIP: 0010:[<ffffffffa014fd9c>]  [<ffffffffa014fd9c>] cachefiles_put_object+0x2a4/0x301 [cachefiles]
...
Call Trace:
 [<ffffffffa008674b>] fscache_put_object+0x18/0x21 [fscache]
 [<ffffffffa00883e6>] fscache_object_work_func+0x3ba/0x3c9 [fscache]
 [<ffffffff81054dad>] process_one_work+0x226/0x441
 [<ffffffff81055d91>] worker_thread+0x273/0x36b
 [<ffffffff81055b1e>] ? rescuer_thread+0x2e1/0x2e1
 [<ffffffff81059b9d>] kthread+0x10e/0x116
 [<ffffffff81059a8f>] ? kthread_create_on_node+0x1bb/0x1bb
 [<ffffffff815579ac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81059a8f>] ? kthread_create_on_node+0x1bb/0x1bb

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/object.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 9b79fc9a1464..40049f7505f0 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -328,6 +328,17 @@ void fscache_object_init(struct fscache_object *object,
 EXPORT_SYMBOL(fscache_object_init);
 
 /*
+ * Mark the object as no longer being live, making sure that we synchronise
+ * against op submission.
+ */
+static inline void fscache_mark_object_dead(struct fscache_object *object)
+{
+	spin_lock(&object->lock);
+	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	spin_unlock(&object->lock);
+}
+
+/*
  * Abort object initialisation before we start it.
  */
 static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *object,
@@ -631,7 +642,7 @@ static const struct fscache_state *fscache_kill_object(struct fscache_object *ob
 	_enter("{OBJ%x,%d,%d},%d",
 	       object->debug_id, object->n_ops, object->n_children, event);
 
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_mark_object_dead(object);
 	object->oob_event_mask = 0;
 
 	if (list_empty(&object->dependents) &&
@@ -976,13 +987,13 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 	return transit_to(UPDATE_OBJECT);
 
 nomem:
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_mark_object_dead(object);
 	fscache_unuse_cookie(object);
 	_leave(" [ENOMEM]");
 	return transit_to(KILL_OBJECT);
 
 submit_op_failed:
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_mark_object_dead(object);
 	spin_unlock(&cookie->lock);
 	fscache_unuse_cookie(object);
 	kfree(op);


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 05/13] FS-Cache: Synchronise object death state change vs operation submission
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

When an object is being marked as no longer live, do this under the object
spinlock to prevent a race with operation submission targeted on that object.

The problem occurs due to the following pair of intertwined sequences when the
cache tries to create an object that would take it over the hard available
space limit:

 NETFS INTERFACE
 ===============
 (A) The netfs calls fscache_acquire_cookie().  object creation is deferred to
     the object state machine and the netfs is allowed to continue.

	OBJECT STATE MACHINE KTHREAD
	============================
	(1) The object is looked up on disk by fscache_look_up_object()
	    calling cachefiles_walk_to_object().  The latter finds that the
	    object is not yet represented on disk and calls
	    fscache_object_lookup_negative().

	(2) fscache_object_lookup_negative() sets FSCACHE_COOKIE_NO_DATA_YET
	    and clears FSCACHE_COOKIE_LOOKING_UP, thus allowing the netfs to
	    start queuing read operations.

 (B) The netfs calls fscache_read_or_alloc_pages().  This calls
     fscache_wait_for_deferred_lookup() which sees FSCACHE_COOKIE_LOOKING_UP
     become clear, allowing the read to begin.

 (C) A read operation is set up and passed to fscache_submit_op() to deal
     with.

	(3) cachefiles_walk_to_object() calls cachefiles_has_space(), which
	    fails (or one of the file operations to create stuff fails).
	    cachefiles returns an error to fscache.

	(4) fscache_look_up_object() transits to the LOOKUP_FAILURE state,

	(5) fscache_lookup_failure() sets FSCACHE_OBJECT_LOOKED_UP and
	    FSCACHE_COOKIE_UNAVAILABLE and clears FSCACHE_COOKIE_LOOKING_UP
	    then transits to the KILL_OBJECT state.

	(6) fscache_kill_object() clears FSCACHE_OBJECT_IS_LIVE in an attempt
	    to reject any further requests from the netfs.

	(7) object->n_ops is examined and found to be 0.
	    fscache_kill_object() transits to the DROP_OBJECT state.

 (D) fscache_submit_op() locks the object spinlock, sees if it can dispatch
     the op immediately by calling fscache_object_is_active() - which fails
     since FSCACHE_OBJECT_IS_AVAILABLE has not yet been set.

 (E) fscache_submit_op() then tests FSCACHE_OBJECT_LOOKED_UP - which is set.
     It then queues the object and increments object->n_ops.

	(8) fscache_drop_object() releases the object and eventually
	    fscache_put_object() calls cachefiles_put_object() which suffers
	    an assertion failure here:

		ASSERTCMP(object->fscache.n_ops, ==, 0);

Locking the object spinlock in step (6) around the clearance of
FSCACHE_OBJECT_IS_LIVE ensures that the the decision trees in
fscache_submit_op() and fscache_submit_exclusive_op() don't see the IS_LIVE
flag being cleared mid-decision: either the op is queued before step (7) - in
which case fscache_kill_object() will see n_ops>0 and will deal with the op -
or the op will be rejected.

This, combined with rejecting op submission if the target object is dying, fix
the problem.

The problem shows up as the following oops:

CacheFiles: Assertion failed
CacheFiles: 1 == 0 is false
------------[ cut here ]------------
kernel BUG at ../fs/cachefiles/interface.c:339!
...
RIP: 0010:[<ffffffffa014fd9c>]  [<ffffffffa014fd9c>] cachefiles_put_object+0x2a4/0x301 [cachefiles]
...
Call Trace:
 [<ffffffffa008674b>] fscache_put_object+0x18/0x21 [fscache]
 [<ffffffffa00883e6>] fscache_object_work_func+0x3ba/0x3c9 [fscache]
 [<ffffffff81054dad>] process_one_work+0x226/0x441
 [<ffffffff81055d91>] worker_thread+0x273/0x36b
 [<ffffffff81055b1e>] ? rescuer_thread+0x2e1/0x2e1
 [<ffffffff81059b9d>] kthread+0x10e/0x116
 [<ffffffff81059a8f>] ? kthread_create_on_node+0x1bb/0x1bb
 [<ffffffff815579ac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81059a8f>] ? kthread_create_on_node+0x1bb/0x1bb

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/object.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 9b79fc9a1464..40049f7505f0 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -328,6 +328,17 @@ void fscache_object_init(struct fscache_object *object,
 EXPORT_SYMBOL(fscache_object_init);
 
 /*
+ * Mark the object as no longer being live, making sure that we synchronise
+ * against op submission.
+ */
+static inline void fscache_mark_object_dead(struct fscache_object *object)
+{
+	spin_lock(&object->lock);
+	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	spin_unlock(&object->lock);
+}
+
+/*
  * Abort object initialisation before we start it.
  */
 static const struct fscache_state *fscache_abort_initialisation(struct fscache_object *object,
@@ -631,7 +642,7 @@ static const struct fscache_state *fscache_kill_object(struct fscache_object *ob
 	_enter("{OBJ%x,%d,%d},%d",
 	       object->debug_id, object->n_ops, object->n_children, event);
 
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_mark_object_dead(object);
 	object->oob_event_mask = 0;
 
 	if (list_empty(&object->dependents) &&
@@ -976,13 +987,13 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 	return transit_to(UPDATE_OBJECT);
 
 nomem:
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_mark_object_dead(object);
 	fscache_unuse_cookie(object);
 	_leave(" [ENOMEM]");
 	return transit_to(KILL_OBJECT);
 
 submit_op_failed:
-	clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
+	fscache_mark_object_dead(object);
 	spin_unlock(&cookie->lock);
 	fscache_unuse_cookie(object);
 	kfree(op);

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 06/13] FS-Cache: fscache_object_is_dead() has wrong logic, kill it
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

fscache_object_is_dead() returns true only if the object is marked dead and
the cache got an I/O error.  This should be a logical OR instead.  Since two
of the callers got split up into handling for separate subcases, expand the
other callers and kill the function.  This is probably the right thing to do
anyway since one of the subcases isn't about the object at all, but rather
about the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c           |    3 ++-
 fs/fscache/page.c             |    6 ++++--
 include/linux/fscache-cache.h |    6 ------
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 89acec742e0b..8de22164f5fb 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -327,7 +327,8 @@ static int fscache_alloc_object(struct fscache_cache *cache,
 
 object_already_extant:
 	ret = -ENOBUFS;
-	if (fscache_object_is_dead(object)) {
+	if (fscache_object_is_dying(object) ||
+	    fscache_cache_is_broken(object)) {
 		spin_unlock(&cookie->lock);
 		goto error;
 	}
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index de33b3fccca6..d0805e31361c 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -377,11 +377,13 @@ check_if_dead:
 		_leave(" = -ENOBUFS [cancelled]");
 		return -ENOBUFS;
 	}
-	if (unlikely(fscache_object_is_dead(object))) {
-		pr_err("%s() = -ENOBUFS [obj dead %d]\n", __func__, op->state);
+	if (unlikely(fscache_object_is_dying(object) ||
+		     fscache_cache_is_broken(object))) {
+		enum fscache_operation_state state = op->state;
 		fscache_cancel_op(op, do_cancel);
 		if (stat_object_dead)
 			fscache_stat(stat_object_dead);
+		_leave(" = -ENOBUFS [obj dead %d]", state);
 		return -ENOBUFS;
 	}
 	return 0;
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 2e83a141e465..ca3d550da11e 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -423,12 +423,6 @@ static inline bool fscache_object_is_active(struct fscache_object *object)
 		!fscache_cache_is_broken(object);
 }
 
-static inline bool fscache_object_is_dead(struct fscache_object *object)
-{
-	return fscache_object_is_dying(object) &&
-		fscache_cache_is_broken(object);
-}
-
 /**
  * fscache_object_destroyed - Note destruction of an object in a cache
  * @cache: The cache from which the object came


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 06/13] FS-Cache: fscache_object_is_dead() has wrong logic, kill it
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

fscache_object_is_dead() returns true only if the object is marked dead and
the cache got an I/O error.  This should be a logical OR instead.  Since two
of the callers got split up into handling for separate subcases, expand the
other callers and kill the function.  This is probably the right thing to do
anyway since one of the subcases isn't about the object at all, but rather
about the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c           |    3 ++-
 fs/fscache/page.c             |    6 ++++--
 include/linux/fscache-cache.h |    6 ------
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 89acec742e0b..8de22164f5fb 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -327,7 +327,8 @@ static int fscache_alloc_object(struct fscache_cache *cache,
 
 object_already_extant:
 	ret = -ENOBUFS;
-	if (fscache_object_is_dead(object)) {
+	if (fscache_object_is_dying(object) ||
+	    fscache_cache_is_broken(object)) {
 		spin_unlock(&cookie->lock);
 		goto error;
 	}
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index de33b3fccca6..d0805e31361c 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -377,11 +377,13 @@ check_if_dead:
 		_leave(" = -ENOBUFS [cancelled]");
 		return -ENOBUFS;
 	}
-	if (unlikely(fscache_object_is_dead(object))) {
-		pr_err("%s() = -ENOBUFS [obj dead %d]\n", __func__, op->state);
+	if (unlikely(fscache_object_is_dying(object) ||
+		     fscache_cache_is_broken(object))) {
+		enum fscache_operation_state state = op->state;
 		fscache_cancel_op(op, do_cancel);
 		if (stat_object_dead)
 			fscache_stat(stat_object_dead);
+		_leave(" = -ENOBUFS [obj dead %d]", state);
 		return -ENOBUFS;
 	}
 	return 0;
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 2e83a141e465..ca3d550da11e 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -423,12 +423,6 @@ static inline bool fscache_object_is_active(struct fscache_object *object)
 		!fscache_cache_is_broken(object);
 }
 
-static inline bool fscache_object_is_dead(struct fscache_object *object)
-{
-	return fscache_object_is_dying(object) &&
-		fscache_cache_is_broken(object);
-}
-
 /**
  * fscache_object_destroyed - Note destruction of an object in a cache
  * @cache: The cache from which the object came

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 07/13] FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Currently, fscache_cancel_op() only cancels pending operations - attempts to
cancel in-progress operations are ignored.  This leads to a problem in
fscache_wait_for_operation_activation() whereby the wait is terminated, but
the object has been killed.

The check at the end of the function now triggers because it's no longer
contingent on the cache having produced an I/O error since the commit that
fixed the logic error in fscache_object_is_dead().

The result of the check is that it tries to cancel the operation - but since
the object may not be pending by this point, the cancellation request may be
ignored - with the result that the the object is just put by the caller and
fscache_put_operation has an assertion failure because the operation isn't in
either the COMPLETE or the CANCELLED states.

To fix this, we permit in-progress ops to be cancelled under some
circumstances.

The bug results in an oops that looks something like this:

	FS-Cache: fscache_wait_for_operation_activation() = -ENOBUFS [obj dead 3]
	FS-Cache:
	FS-Cache: Assertion failed
	FS-Cache: 3 == 5 is false
	------------[ cut here ]------------
	kernel BUG at ../fs/fscache/operation.c:432!
	...
	RIP: 0010:[<ffffffffa0088574>] fscache_put_operation+0xf2/0x2cd
	Call Trace:
	 [<ffffffffa008b92a>] __fscache_read_or_alloc_pages+0x2ec/0x3b3
	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
	 [<ffffffff81137164>] __vfs_read+0x13/0x38
	 [<ffffffff8113721e>] vfs_read+0x95/0x121
	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/internal.h  |    3 ++-
 fs/fscache/operation.c |   20 +++++++++++++++++---
 fs/fscache/page.c      |    4 ++--
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 3063a58b7d3d..87c4544ec912 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -125,7 +125,8 @@ extern int fscache_submit_exclusive_op(struct fscache_object *,
 extern int fscache_submit_op(struct fscache_object *,
 			     struct fscache_operation *);
 extern int fscache_cancel_op(struct fscache_operation *,
-			     void (*)(struct fscache_operation *));
+			     void (*)(struct fscache_operation *),
+			     bool);
 extern void fscache_cancel_all_ops(struct fscache_object *);
 extern void fscache_abort_object(struct fscache_object *);
 extern void fscache_start_operations(struct fscache_object *);
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 18658fffbba1..67594a8d791a 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -312,9 +312,11 @@ void fscache_start_operations(struct fscache_object *object)
  * cancel an operation that's pending on an object
  */
 int fscache_cancel_op(struct fscache_operation *op,
-		      void (*do_cancel)(struct fscache_operation *))
+		      void (*do_cancel)(struct fscache_operation *),
+		      bool cancel_in_progress_op)
 {
 	struct fscache_object *object = op->object;
+	bool put = false;
 	int ret;
 
 	_enter("OBJ%x OP%x}", op->object->debug_id, op->debug_id);
@@ -328,8 +330,19 @@ int fscache_cancel_op(struct fscache_operation *op,
 	ret = -EBUSY;
 	if (op->state == FSCACHE_OP_ST_PENDING) {
 		ASSERT(!list_empty(&op->pend_link));
-		fscache_stat(&fscache_n_op_cancelled);
 		list_del_init(&op->pend_link);
+		put = true;
+		fscache_stat(&fscache_n_op_cancelled);
+		if (do_cancel)
+			do_cancel(op);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
+			object->n_exclusive--;
+		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
+			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
+		ret = 0;
+	} else if (op->state == FSCACHE_OP_ST_IN_PROGRESS && cancel_in_progress_op) {
+		fscache_stat(&fscache_n_op_cancelled);
 		if (do_cancel)
 			do_cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
@@ -337,10 +350,11 @@ int fscache_cancel_op(struct fscache_operation *op,
 			object->n_exclusive--;
 		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
 			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
-		fscache_put_operation(op);
 		ret = 0;
 	}
 
+	if (put)
+		fscache_put_operation(op);
 	spin_unlock(&object->lock);
 	_leave(" = %d", ret);
 	return ret;
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index d0805e31361c..433cae927eca 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -359,7 +359,7 @@ int fscache_wait_for_operation_activation(struct fscache_object *object,
 		fscache_stat(stat_op_waits);
 	if (wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
 			TASK_INTERRUPTIBLE) != 0) {
-		ret = fscache_cancel_op(op, do_cancel);
+		ret = fscache_cancel_op(op, do_cancel, false);
 		if (ret == 0)
 			return -ERESTARTSYS;
 
@@ -380,7 +380,7 @@ check_if_dead:
 	if (unlikely(fscache_object_is_dying(object) ||
 		     fscache_cache_is_broken(object))) {
 		enum fscache_operation_state state = op->state;
-		fscache_cancel_op(op, do_cancel);
+		fscache_cancel_op(op, do_cancel, true);
 		if (stat_object_dead)
 			fscache_stat(stat_object_dead);
 		_leave(" = -ENOBUFS [obj dead %d]", state);


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 07/13] FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Currently, fscache_cancel_op() only cancels pending operations - attempts to
cancel in-progress operations are ignored.  This leads to a problem in
fscache_wait_for_operation_activation() whereby the wait is terminated, but
the object has been killed.

The check at the end of the function now triggers because it's no longer
contingent on the cache having produced an I/O error since the commit that
fixed the logic error in fscache_object_is_dead().

The result of the check is that it tries to cancel the operation - but since
the object may not be pending by this point, the cancellation request may be
ignored - with the result that the the object is just put by the caller and
fscache_put_operation has an assertion failure because the operation isn't in
either the COMPLETE or the CANCELLED states.

To fix this, we permit in-progress ops to be cancelled under some
circumstances.

The bug results in an oops that looks something like this:

	FS-Cache: fscache_wait_for_operation_activation() = -ENOBUFS [obj dead 3]
	FS-Cache:
	FS-Cache: Assertion failed
	FS-Cache: 3 == 5 is false
	------------[ cut here ]------------
	kernel BUG at ../fs/fscache/operation.c:432!
	...
	RIP: 0010:[<ffffffffa0088574>] fscache_put_operation+0xf2/0x2cd
	Call Trace:
	 [<ffffffffa008b92a>] __fscache_read_or_alloc_pages+0x2ec/0x3b3
	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
	 [<ffffffff81137164>] __vfs_read+0x13/0x38
	 [<ffffffff8113721e>] vfs_read+0x95/0x121
	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/internal.h  |    3 ++-
 fs/fscache/operation.c |   20 +++++++++++++++++---
 fs/fscache/page.c      |    4 ++--
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 3063a58b7d3d..87c4544ec912 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -125,7 +125,8 @@ extern int fscache_submit_exclusive_op(struct fscache_object *,
 extern int fscache_submit_op(struct fscache_object *,
 			     struct fscache_operation *);
 extern int fscache_cancel_op(struct fscache_operation *,
-			     void (*)(struct fscache_operation *));
+			     void (*)(struct fscache_operation *),
+			     bool);
 extern void fscache_cancel_all_ops(struct fscache_object *);
 extern void fscache_abort_object(struct fscache_object *);
 extern void fscache_start_operations(struct fscache_object *);
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 18658fffbba1..67594a8d791a 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -312,9 +312,11 @@ void fscache_start_operations(struct fscache_object *object)
  * cancel an operation that's pending on an object
  */
 int fscache_cancel_op(struct fscache_operation *op,
-		      void (*do_cancel)(struct fscache_operation *))
+		      void (*do_cancel)(struct fscache_operation *),
+		      bool cancel_in_progress_op)
 {
 	struct fscache_object *object = op->object;
+	bool put = false;
 	int ret;
 
 	_enter("OBJ%x OP%x}", op->object->debug_id, op->debug_id);
@@ -328,8 +330,19 @@ int fscache_cancel_op(struct fscache_operation *op,
 	ret = -EBUSY;
 	if (op->state == FSCACHE_OP_ST_PENDING) {
 		ASSERT(!list_empty(&op->pend_link));
-		fscache_stat(&fscache_n_op_cancelled);
 		list_del_init(&op->pend_link);
+		put = true;
+		fscache_stat(&fscache_n_op_cancelled);
+		if (do_cancel)
+			do_cancel(op);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
+			object->n_exclusive--;
+		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
+			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
+		ret = 0;
+	} else if (op->state == FSCACHE_OP_ST_IN_PROGRESS && cancel_in_progress_op) {
+		fscache_stat(&fscache_n_op_cancelled);
 		if (do_cancel)
 			do_cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
@@ -337,10 +350,11 @@ int fscache_cancel_op(struct fscache_operation *op,
 			object->n_exclusive--;
 		if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
 			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
-		fscache_put_operation(op);
 		ret = 0;
 	}
 
+	if (put)
+		fscache_put_operation(op);
 	spin_unlock(&object->lock);
 	_leave(" = %d", ret);
 	return ret;
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index d0805e31361c..433cae927eca 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -359,7 +359,7 @@ int fscache_wait_for_operation_activation(struct fscache_object *object,
 		fscache_stat(stat_op_waits);
 	if (wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
 			TASK_INTERRUPTIBLE) != 0) {
-		ret = fscache_cancel_op(op, do_cancel);
+		ret = fscache_cancel_op(op, do_cancel, false);
 		if (ret == 0)
 			return -ERESTARTSYS;
 
@@ -380,7 +380,7 @@ check_if_dead:
 	if (unlikely(fscache_object_is_dying(object) ||
 		     fscache_cache_is_broken(object))) {
 		enum fscache_operation_state state = op->state;
-		fscache_cancel_op(op, do_cancel);
+		fscache_cancel_op(op, do_cancel, true);
 		if (stat_object_dead)
 			fscache_stat(stat_object_dead);
 		_leave(" = -ENOBUFS [obj dead %d]", state);

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 08/13] FS-Cache: Out of line fscache_operation_init()
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Out of line fscache_operation_init() so that it can access internal FS-Cache
features, such as stats, in a later commit.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c        |   22 ++++++++++++++++++++++
 include/linux/fscache-cache.h |   24 +++---------------------
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 67594a8d791a..61a6e78b85fa 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -21,6 +21,28 @@ atomic_t fscache_op_debug_id;
 EXPORT_SYMBOL(fscache_op_debug_id);
 
 /**
+ * fscache_operation_init - Do basic initialisation of an operation
+ * @op: The operation to initialise
+ * @release: The release function to assign
+ *
+ * Do basic initialisation of an operation.  The caller must still set flags,
+ * object and processor if needed.
+ */
+void fscache_operation_init(struct fscache_operation *op,
+			    fscache_operation_processor_t processor,
+			    fscache_operation_release_t release)
+{
+	INIT_WORK(&op->work, fscache_op_work_func);
+	atomic_set(&op->usage, 1);
+	op->state = FSCACHE_OP_ST_INITIALISED;
+	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
+	op->processor = processor;
+	op->release = release;
+	INIT_LIST_HEAD(&op->pend_link);
+}
+EXPORT_SYMBOL(fscache_operation_init);
+
+/**
  * fscache_enqueue_operation - Enqueue an operation for processing
  * @op: The operation to enqueue
  *
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index ca3d550da11e..0e26d49972e3 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -119,27 +119,9 @@ extern void fscache_op_work_func(struct work_struct *work);
 extern void fscache_enqueue_operation(struct fscache_operation *);
 extern void fscache_op_complete(struct fscache_operation *, bool);
 extern void fscache_put_operation(struct fscache_operation *);
-
-/**
- * fscache_operation_init - Do basic initialisation of an operation
- * @op: The operation to initialise
- * @release: The release function to assign
- *
- * Do basic initialisation of an operation.  The caller must still set flags,
- * object and processor if needed.
- */
-static inline void fscache_operation_init(struct fscache_operation *op,
-					fscache_operation_processor_t processor,
-					fscache_operation_release_t release)
-{
-	INIT_WORK(&op->work, fscache_op_work_func);
-	atomic_set(&op->usage, 1);
-	op->state = FSCACHE_OP_ST_INITIALISED;
-	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
-	op->processor = processor;
-	op->release = release;
-	INIT_LIST_HEAD(&op->pend_link);
-}
+extern void fscache_operation_init(struct fscache_operation *,
+				   fscache_operation_processor_t,
+				   fscache_operation_release_t);
 
 /*
  * data read operation


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 08/13] FS-Cache: Out of line fscache_operation_init()
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Out of line fscache_operation_init() so that it can access internal FS-Cache
features, such as stats, in a later commit.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c        |   22 ++++++++++++++++++++++
 include/linux/fscache-cache.h |   24 +++---------------------
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 67594a8d791a..61a6e78b85fa 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -21,6 +21,28 @@ atomic_t fscache_op_debug_id;
 EXPORT_SYMBOL(fscache_op_debug_id);
 
 /**
+ * fscache_operation_init - Do basic initialisation of an operation
+ * @op: The operation to initialise
+ * @release: The release function to assign
+ *
+ * Do basic initialisation of an operation.  The caller must still set flags,
+ * object and processor if needed.
+ */
+void fscache_operation_init(struct fscache_operation *op,
+			    fscache_operation_processor_t processor,
+			    fscache_operation_release_t release)
+{
+	INIT_WORK(&op->work, fscache_op_work_func);
+	atomic_set(&op->usage, 1);
+	op->state = FSCACHE_OP_ST_INITIALISED;
+	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
+	op->processor = processor;
+	op->release = release;
+	INIT_LIST_HEAD(&op->pend_link);
+}
+EXPORT_SYMBOL(fscache_operation_init);
+
+/**
  * fscache_enqueue_operation - Enqueue an operation for processing
  * @op: The operation to enqueue
  *
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index ca3d550da11e..0e26d49972e3 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -119,27 +119,9 @@ extern void fscache_op_work_func(struct work_struct *work);
 extern void fscache_enqueue_operation(struct fscache_operation *);
 extern void fscache_op_complete(struct fscache_operation *, bool);
 extern void fscache_put_operation(struct fscache_operation *);
-
-/**
- * fscache_operation_init - Do basic initialisation of an operation
- * @op: The operation to initialise
- * @release: The release function to assign
- *
- * Do basic initialisation of an operation.  The caller must still set flags,
- * object and processor if needed.
- */
-static inline void fscache_operation_init(struct fscache_operation *op,
-					fscache_operation_processor_t processor,
-					fscache_operation_release_t release)
-{
-	INIT_WORK(&op->work, fscache_op_work_func);
-	atomic_set(&op->usage, 1);
-	op->state = FSCACHE_OP_ST_INITIALISED;
-	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
-	op->processor = processor;
-	op->release = release;
-	INIT_LIST_HEAD(&op->pend_link);
-}
+extern void fscache_operation_init(struct fscache_operation *,
+				   fscache_operation_processor_t,
+				   fscache_operation_release_t);
 
 /*
  * data read operation

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 09/13] FS-Cache: Count the number of initialised operations
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Count and display through /proc/fs/fscache/stats the number of initialised
operations.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/filesystems/caching/fscache.txt |    3 ++-
 fs/fscache/internal.h                         |    1 +
 fs/fscache/operation.c                        |    1 +
 fs/fscache/stats.c                            |    4 +++-
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 66fa7fbccfa4..50f0a5757f48 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -284,8 +284,9 @@ proc files.
 		enq=N	Number of times async ops queued for processing
 		can=N	Number of async ops cancelled
 		rej=N	Number of async ops rejected due to object lookup/create failure
+		ini=N	Number of async ops initialised
 		dfr=N	Number of async ops queued for deferred release
-		rel=N	Number of async ops released
+		rel=N	Number of async ops released (should equal ini=N when idle)
 		gc=N	Number of deferred-release async ops garbage collected
 	CacheOp	alo=N	Number of in-progress alloc_object() cache ops
 		luo=N	Number of in-progress lookup_object() cache ops
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 87c4544ec912..a63225116db6 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -165,6 +165,7 @@ extern atomic_t fscache_n_op_pend;
 extern atomic_t fscache_n_op_run;
 extern atomic_t fscache_n_op_enqueue;
 extern atomic_t fscache_n_op_deferred_release;
+extern atomic_t fscache_n_op_initialised;
 extern atomic_t fscache_n_op_release;
 extern atomic_t fscache_n_op_gc;
 extern atomic_t fscache_n_op_cancelled;
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 61a6e78b85fa..9761df4fc2ab 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -39,6 +39,7 @@ void fscache_operation_init(struct fscache_operation *op,
 	op->processor = processor;
 	op->release = release;
 	INIT_LIST_HEAD(&op->pend_link);
+	fscache_stat(&fscache_n_op_initialised);
 }
 EXPORT_SYMBOL(fscache_operation_init);
 
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 3a722e8f2307..7cfa0aacdf6d 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -23,6 +23,7 @@ atomic_t fscache_n_op_run;
 atomic_t fscache_n_op_enqueue;
 atomic_t fscache_n_op_requeue;
 atomic_t fscache_n_op_deferred_release;
+atomic_t fscache_n_op_initialised;
 atomic_t fscache_n_op_release;
 atomic_t fscache_n_op_gc;
 atomic_t fscache_n_op_cancelled;
@@ -251,7 +252,8 @@ static int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_op_enqueue),
 		   atomic_read(&fscache_n_op_cancelled),
 		   atomic_read(&fscache_n_op_rejected));
-	seq_printf(m, "Ops    : dfr=%u rel=%u gc=%u\n",
+	seq_printf(m, "Ops    : ini=%u dfr=%u rel=%u gc=%u\n",
+		   atomic_read(&fscache_n_op_initialised),
 		   atomic_read(&fscache_n_op_deferred_release),
 		   atomic_read(&fscache_n_op_release),
 		   atomic_read(&fscache_n_op_gc));


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 09/13] FS-Cache: Count the number of initialised operations
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Count and display through /proc/fs/fscache/stats the number of initialised
operations.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/filesystems/caching/fscache.txt |    3 ++-
 fs/fscache/internal.h                         |    1 +
 fs/fscache/operation.c                        |    1 +
 fs/fscache/stats.c                            |    4 +++-
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 66fa7fbccfa4..50f0a5757f48 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -284,8 +284,9 @@ proc files.
 		enq=N	Number of times async ops queued for processing
 		can=N	Number of async ops cancelled
 		rej=N	Number of async ops rejected due to object lookup/create failure
+		ini=N	Number of async ops initialised
 		dfr=N	Number of async ops queued for deferred release
-		rel=N	Number of async ops released
+		rel=N	Number of async ops released (should equal ini=N when idle)
 		gc=N	Number of deferred-release async ops garbage collected
 	CacheOp	alo=N	Number of in-progress alloc_object() cache ops
 		luo=N	Number of in-progress lookup_object() cache ops
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index 87c4544ec912..a63225116db6 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -165,6 +165,7 @@ extern atomic_t fscache_n_op_pend;
 extern atomic_t fscache_n_op_run;
 extern atomic_t fscache_n_op_enqueue;
 extern atomic_t fscache_n_op_deferred_release;
+extern atomic_t fscache_n_op_initialised;
 extern atomic_t fscache_n_op_release;
 extern atomic_t fscache_n_op_gc;
 extern atomic_t fscache_n_op_cancelled;
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 61a6e78b85fa..9761df4fc2ab 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -39,6 +39,7 @@ void fscache_operation_init(struct fscache_operation *op,
 	op->processor = processor;
 	op->release = release;
 	INIT_LIST_HEAD(&op->pend_link);
+	fscache_stat(&fscache_n_op_initialised);
 }
 EXPORT_SYMBOL(fscache_operation_init);
 
diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 3a722e8f2307..7cfa0aacdf6d 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -23,6 +23,7 @@ atomic_t fscache_n_op_run;
 atomic_t fscache_n_op_enqueue;
 atomic_t fscache_n_op_requeue;
 atomic_t fscache_n_op_deferred_release;
+atomic_t fscache_n_op_initialised;
 atomic_t fscache_n_op_release;
 atomic_t fscache_n_op_gc;
 atomic_t fscache_n_op_cancelled;
@@ -251,7 +252,8 @@ static int fscache_stats_show(struct seq_file *m, void *v)
 		   atomic_read(&fscache_n_op_enqueue),
 		   atomic_read(&fscache_n_op_cancelled),
 		   atomic_read(&fscache_n_op_rejected));
-	seq_printf(m, "Ops    : dfr=%u rel=%u gc=%u\n",
+	seq_printf(m, "Ops    : ini=%u dfr=%u rel=%u gc=%u\n",
+		   atomic_read(&fscache_n_op_initialised),
 		   atomic_read(&fscache_n_op_deferred_release),
 		   atomic_read(&fscache_n_op_release),
 		   atomic_read(&fscache_n_op_gc));

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 10/13] FS-Cache: Fix cancellation of in-progress operation
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:02   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Cancellation of an in-progress operation needs to update the relevant counters
and start any operations that are pending waiting on this one.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 9761df4fc2ab..b6bf5f399d70 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -365,6 +365,13 @@ int fscache_cancel_op(struct fscache_operation *op,
 			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
 		ret = 0;
 	} else if (op->state == FSCACHE_OP_ST_IN_PROGRESS && cancel_in_progress_op) {
+		ASSERTCMP(object->n_in_progress, >, 0);
+		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
+			object->n_exclusive--;
+		object->n_in_progress--;
+		if (object->n_in_progress == 0)
+			fscache_start_operations(object);
+
 		fscache_stat(&fscache_n_op_cancelled);
 		if (do_cancel)
 			do_cancel(op);


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 10/13] FS-Cache: Fix cancellation of in-progress operation
@ 2015-02-26 14:02   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:02 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Cancellation of an in-progress operation needs to update the relevant counters
and start any operations that are pending waiting on this one.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 9761df4fc2ab..b6bf5f399d70 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -365,6 +365,13 @@ int fscache_cancel_op(struct fscache_operation *op,
 			wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
 		ret = 0;
 	} else if (op->state == FSCACHE_OP_ST_IN_PROGRESS && cancel_in_progress_op) {
+		ASSERTCMP(object->n_in_progress, >, 0);
+		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
+			object->n_exclusive--;
+		object->n_in_progress--;
+		if (object->n_in_progress == 0)
+			fscache_start_operations(object);
+
 		fscache_stat(&fscache_n_op_cancelled);
 		if (do_cancel)
 			do_cancel(op);

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 11/13] FS-Cache: Put an aborted initialised op so that it is accounted correctly
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:03   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:03 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Call fscache_put_operation() or a wrapper on any op that has gone through
fscache_operation_init() so that the accounting shown in /proc is done
correctly, specifically fscache_n_op_release.

fscache_put_operation() therefore now allows an op in the INITIALISED state as
well as in the CANCELLED and COMPLETE states.

Note that this means that an operation can get put that doesn't have its
->object pointer filled in, so anything that depends on the object needs to be
conditional in fscache_put_operation().

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c |   54 +++++++++++++++++++++++++-----------------------
 fs/fscache/page.c      |   14 ++++++------
 2 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index b6bf5f399d70..c76c09730768 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -472,7 +472,8 @@ void fscache_put_operation(struct fscache_operation *op)
 		return;
 
 	_debug("PUT OP");
-	ASSERTIFCMP(op->state != FSCACHE_OP_ST_COMPLETE,
+	ASSERTIFCMP(op->state != FSCACHE_OP_ST_INITIALISED &&
+		    op->state != FSCACHE_OP_ST_COMPLETE,
 		    op->state, ==, FSCACHE_OP_ST_CANCELLED);
 	op->state = FSCACHE_OP_ST_DEAD;
 
@@ -484,35 +485,36 @@ void fscache_put_operation(struct fscache_operation *op)
 	}
 
 	object = op->object;
+	if (likely(object)) {
+		if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags))
+			atomic_dec(&object->n_reads);
+		if (test_bit(FSCACHE_OP_UNUSE_COOKIE, &op->flags))
+			fscache_unuse_cookie(object);
+
+		/* now... we may get called with the object spinlock held, so we
+		 * complete the cleanup here only if we can immediately acquire the
+		 * lock, and defer it otherwise */
+		if (!spin_trylock(&object->lock)) {
+			_debug("defer put");
+			fscache_stat(&fscache_n_op_deferred_release);
+
+			cache = object->cache;
+			spin_lock(&cache->op_gc_list_lock);
+			list_add_tail(&op->pend_link, &cache->op_gc_list);
+			spin_unlock(&cache->op_gc_list_lock);
+			schedule_work(&cache->op_gc);
+			_leave(" [defer]");
+			return;
+		}
 
-	if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags))
-		atomic_dec(&object->n_reads);
-	if (test_bit(FSCACHE_OP_UNUSE_COOKIE, &op->flags))
-		fscache_unuse_cookie(object);
-
-	/* now... we may get called with the object spinlock held, so we
-	 * complete the cleanup here only if we can immediately acquire the
-	 * lock, and defer it otherwise */
-	if (!spin_trylock(&object->lock)) {
-		_debug("defer put");
-		fscache_stat(&fscache_n_op_deferred_release);
+		ASSERTCMP(object->n_ops, >, 0);
+		object->n_ops--;
+		if (object->n_ops == 0)
+			fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
 
-		cache = object->cache;
-		spin_lock(&cache->op_gc_list_lock);
-		list_add_tail(&op->pend_link, &cache->op_gc_list);
-		spin_unlock(&cache->op_gc_list_lock);
-		schedule_work(&cache->op_gc);
-		_leave(" [defer]");
-		return;
+		spin_unlock(&object->lock);
 	}
 
-	ASSERTCMP(object->n_ops, >, 0);
-	object->n_ops--;
-	if (object->n_ops == 0)
-		fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
-
-	spin_unlock(&object->lock);
-
 	kfree(op);
 	_leave(" [done]");
 }
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 433cae927eca..7db9752252ef 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -239,7 +239,7 @@ nobufs_dec:
 	wake_cookie = __fscache_unuse_cookie(cookie);
 nobufs:
 	spin_unlock(&cookie->lock);
-	kfree(op);
+	fscache_put_operation(op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 	fscache_stat(&fscache_n_attr_changed_nobufs);
@@ -505,7 +505,7 @@ nobufs_unlock:
 	spin_unlock(&cookie->lock);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
-	kfree(op);
+	fscache_put_retrieval(op);
 nobufs:
 	fscache_stat(&fscache_n_retrievals_nobufs);
 	_leave(" = -ENOBUFS");
@@ -634,7 +634,7 @@ nobufs_unlock_dec:
 	wake_cookie = __fscache_unuse_cookie(cookie);
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
-	kfree(op);
+	fscache_put_retrieval(op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 nobufs:
@@ -728,7 +728,7 @@ nobufs_unlock_dec:
 	wake_cookie = __fscache_unuse_cookie(cookie);
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
-	kfree(op);
+	fscache_put_retrieval(op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 nobufs:
@@ -1018,7 +1018,7 @@ already_pending:
 	spin_unlock(&object->lock);
 	spin_unlock(&cookie->lock);
 	radix_tree_preload_end();
-	kfree(op);
+	fscache_put_operation(&op->op);
 	fscache_stat(&fscache_n_stores_ok);
 	_leave(" = 0");
 	return 0;
@@ -1038,7 +1038,7 @@ nobufs_unlock_obj:
 nobufs:
 	spin_unlock(&cookie->lock);
 	radix_tree_preload_end();
-	kfree(op);
+	fscache_put_operation(&op->op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 	fscache_stat(&fscache_n_stores_nobufs);
@@ -1046,7 +1046,7 @@ nobufs:
 	return -ENOBUFS;
 
 nomem_free:
-	kfree(op);
+	fscache_put_operation(&op->op);
 nomem:
 	fscache_stat(&fscache_n_stores_oom);
 	_leave(" = -ENOMEM");


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 11/13] FS-Cache: Put an aborted initialised op so that it is accounted correctly
@ 2015-02-26 14:03   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:03 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Call fscache_put_operation() or a wrapper on any op that has gone through
fscache_operation_init() so that the accounting shown in /proc is done
correctly, specifically fscache_n_op_release.

fscache_put_operation() therefore now allows an op in the INITIALISED state as
well as in the CANCELLED and COMPLETE states.

Note that this means that an operation can get put that doesn't have its
->object pointer filled in, so anything that depends on the object needs to be
conditional in fscache_put_operation().

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c |   54 +++++++++++++++++++++++++-----------------------
 fs/fscache/page.c      |   14 ++++++------
 2 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index b6bf5f399d70..c76c09730768 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -472,7 +472,8 @@ void fscache_put_operation(struct fscache_operation *op)
 		return;
 
 	_debug("PUT OP");
-	ASSERTIFCMP(op->state != FSCACHE_OP_ST_COMPLETE,
+	ASSERTIFCMP(op->state != FSCACHE_OP_ST_INITIALISED &&
+		    op->state != FSCACHE_OP_ST_COMPLETE,
 		    op->state, ==, FSCACHE_OP_ST_CANCELLED);
 	op->state = FSCACHE_OP_ST_DEAD;
 
@@ -484,35 +485,36 @@ void fscache_put_operation(struct fscache_operation *op)
 	}
 
 	object = op->object;
+	if (likely(object)) {
+		if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags))
+			atomic_dec(&object->n_reads);
+		if (test_bit(FSCACHE_OP_UNUSE_COOKIE, &op->flags))
+			fscache_unuse_cookie(object);
+
+		/* now... we may get called with the object spinlock held, so we
+		 * complete the cleanup here only if we can immediately acquire the
+		 * lock, and defer it otherwise */
+		if (!spin_trylock(&object->lock)) {
+			_debug("defer put");
+			fscache_stat(&fscache_n_op_deferred_release);
+
+			cache = object->cache;
+			spin_lock(&cache->op_gc_list_lock);
+			list_add_tail(&op->pend_link, &cache->op_gc_list);
+			spin_unlock(&cache->op_gc_list_lock);
+			schedule_work(&cache->op_gc);
+			_leave(" [defer]");
+			return;
+		}
 
-	if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags))
-		atomic_dec(&object->n_reads);
-	if (test_bit(FSCACHE_OP_UNUSE_COOKIE, &op->flags))
-		fscache_unuse_cookie(object);
-
-	/* now... we may get called with the object spinlock held, so we
-	 * complete the cleanup here only if we can immediately acquire the
-	 * lock, and defer it otherwise */
-	if (!spin_trylock(&object->lock)) {
-		_debug("defer put");
-		fscache_stat(&fscache_n_op_deferred_release);
+		ASSERTCMP(object->n_ops, >, 0);
+		object->n_ops--;
+		if (object->n_ops == 0)
+			fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
 
-		cache = object->cache;
-		spin_lock(&cache->op_gc_list_lock);
-		list_add_tail(&op->pend_link, &cache->op_gc_list);
-		spin_unlock(&cache->op_gc_list_lock);
-		schedule_work(&cache->op_gc);
-		_leave(" [defer]");
-		return;
+		spin_unlock(&object->lock);
 	}
 
-	ASSERTCMP(object->n_ops, >, 0);
-	object->n_ops--;
-	if (object->n_ops == 0)
-		fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
-
-	spin_unlock(&object->lock);
-
 	kfree(op);
 	_leave(" [done]");
 }
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 433cae927eca..7db9752252ef 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -239,7 +239,7 @@ nobufs_dec:
 	wake_cookie = __fscache_unuse_cookie(cookie);
 nobufs:
 	spin_unlock(&cookie->lock);
-	kfree(op);
+	fscache_put_operation(op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 	fscache_stat(&fscache_n_attr_changed_nobufs);
@@ -505,7 +505,7 @@ nobufs_unlock:
 	spin_unlock(&cookie->lock);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
-	kfree(op);
+	fscache_put_retrieval(op);
 nobufs:
 	fscache_stat(&fscache_n_retrievals_nobufs);
 	_leave(" = -ENOBUFS");
@@ -634,7 +634,7 @@ nobufs_unlock_dec:
 	wake_cookie = __fscache_unuse_cookie(cookie);
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
-	kfree(op);
+	fscache_put_retrieval(op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 nobufs:
@@ -728,7 +728,7 @@ nobufs_unlock_dec:
 	wake_cookie = __fscache_unuse_cookie(cookie);
 nobufs_unlock:
 	spin_unlock(&cookie->lock);
-	kfree(op);
+	fscache_put_retrieval(op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 nobufs:
@@ -1018,7 +1018,7 @@ already_pending:
 	spin_unlock(&object->lock);
 	spin_unlock(&cookie->lock);
 	radix_tree_preload_end();
-	kfree(op);
+	fscache_put_operation(&op->op);
 	fscache_stat(&fscache_n_stores_ok);
 	_leave(" = 0");
 	return 0;
@@ -1038,7 +1038,7 @@ nobufs_unlock_obj:
 nobufs:
 	spin_unlock(&cookie->lock);
 	radix_tree_preload_end();
-	kfree(op);
+	fscache_put_operation(&op->op);
 	if (wake_cookie)
 		__fscache_wake_unused_cookie(cookie);
 	fscache_stat(&fscache_n_stores_nobufs);
@@ -1046,7 +1046,7 @@ nobufs:
 	return -ENOBUFS;
 
 nomem_free:
-	kfree(op);
+	fscache_put_operation(&op->op);
 nomem:
 	fscache_stat(&fscache_n_stores_oom);
 	_leave(" = -ENOMEM");

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 12/13] FS-Cache: The operation cancellation method needs calling in more places
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:03   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:03 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Any time an incomplete operation is cancelled, the operation cancellation
function needs to be called to clean up.  This is currently being passed
directly to some of the functions that might want to call it, but not all.

Instead, pass the cancellation method pointer to the fscache_operation_init()
and have that cache it in the operation struct.  Further, plug in a dummy
cancellation handler if the caller declines to set one as this allows us to
call the function unconditionally (the extra overhead isn't worth bothering
about as we don't expect to be calling this typically).

The cancellation method must thence be called everywhere the CANCELLED state
is set.  Note that we call it *before* setting the CANCELLED state such that
the method can use the old state value to guide its operation.

fscache_do_cancel_retrieval() needs moving higher up in the sources so that
the init function can use it now.

Without this, the following oops may be seen:

	FS-Cache: Assertion failed
	FS-Cache: 3 == 0 is false
	------------[ cut here ]------------
	kernel BUG at ../fs/fscache/page.c:261!
	...
	RIP: 0010:[<ffffffffa0089c1b>]  fscache_release_retrieval_op+0x77/0x100
	 [<ffffffffa008853d>] fscache_put_operation+0x114/0x2da
	 [<ffffffffa008b8c2>] __fscache_read_or_alloc_pages+0x358/0x3b3
	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
	 [<ffffffff81137164>] __vfs_read+0x13/0x38
	 [<ffffffff8113721e>] vfs_read+0x95/0x121
	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17

The assertion is showing that the remaining number of pages (n_pages) is not 0
when the operation is being released.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c           |    5 ++--
 fs/fscache/internal.h         |    7 ++----
 fs/fscache/object.c           |    3 ++-
 fs/fscache/operation.c        |   31 +++++++++++++++++++++-------
 fs/fscache/page.c             |   46 ++++++++++++++++++++---------------------
 include/linux/fscache-cache.h |    5 ++++
 6 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 8de22164f5fb..d403c69bee08 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -672,7 +672,7 @@ int __fscache_check_consistency(struct fscache_cookie *cookie)
 	if (!op)
 		return -ENOMEM;
 
-	fscache_operation_init(op, NULL, NULL);
+	fscache_operation_init(op, NULL, NULL, NULL);
 	op->flags = FSCACHE_OP_MYTHREAD |
 		(1 << FSCACHE_OP_WAITING) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
@@ -696,8 +696,7 @@ int __fscache_check_consistency(struct fscache_cookie *cookie)
 	/* the work queue now carries its own ref on the object */
 	spin_unlock(&cookie->lock);
 
-	ret = fscache_wait_for_operation_activation(object, op,
-						    NULL, NULL, NULL);
+	ret = fscache_wait_for_operation_activation(object, op, NULL, NULL);
 	if (ret == 0) {
 		/* ask the cache to honour the operation */
 		ret = object->cache->ops->check_consistency(op);
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index a63225116db6..97ec45110957 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -124,9 +124,7 @@ extern int fscache_submit_exclusive_op(struct fscache_object *,
 				       struct fscache_operation *);
 extern int fscache_submit_op(struct fscache_object *,
 			     struct fscache_operation *);
-extern int fscache_cancel_op(struct fscache_operation *,
-			     void (*)(struct fscache_operation *),
-			     bool);
+extern int fscache_cancel_op(struct fscache_operation *, bool);
 extern void fscache_cancel_all_ops(struct fscache_object *);
 extern void fscache_abort_object(struct fscache_object *);
 extern void fscache_start_operations(struct fscache_object *);
@@ -139,8 +137,7 @@ extern int fscache_wait_for_deferred_lookup(struct fscache_cookie *);
 extern int fscache_wait_for_operation_activation(struct fscache_object *,
 						 struct fscache_operation *,
 						 atomic_t *,
-						 atomic_t *,
-						 void (*)(struct fscache_operation *));
+						 atomic_t *);
 extern void fscache_invalidate_writes(struct fscache_cookie *);
 
 /*
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 40049f7505f0..9e792e30f4db 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -961,7 +961,8 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 	if (!op)
 		goto nomem;
 
-	fscache_operation_init(op, object->cache->ops->invalidate_object, NULL);
+	fscache_operation_init(op, object->cache->ops->invalidate_object,
+			       NULL, NULL);
 	op->flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_EXCLUSIVE) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index c76c09730768..57d4abb68656 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -20,6 +20,10 @@
 atomic_t fscache_op_debug_id;
 EXPORT_SYMBOL(fscache_op_debug_id);
 
+static void fscache_operation_dummy_cancel(struct fscache_operation *op)
+{
+}
+
 /**
  * fscache_operation_init - Do basic initialisation of an operation
  * @op: The operation to initialise
@@ -30,6 +34,7 @@ EXPORT_SYMBOL(fscache_op_debug_id);
  */
 void fscache_operation_init(struct fscache_operation *op,
 			    fscache_operation_processor_t processor,
+			    fscache_operation_cancel_t cancel,
 			    fscache_operation_release_t release)
 {
 	INIT_WORK(&op->work, fscache_op_work_func);
@@ -37,6 +42,7 @@ void fscache_operation_init(struct fscache_operation *op,
 	op->state = FSCACHE_OP_ST_INITIALISED;
 	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
 	op->processor = processor;
+	op->cancel = cancel ?: fscache_operation_dummy_cancel;
 	op->release = release;
 	INIT_LIST_HEAD(&op->pend_link);
 	fscache_stat(&fscache_n_op_initialised);
@@ -164,9 +170,11 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 	flags = READ_ONCE(object->flags);
 	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
 		fscache_stat(&fscache_n_op_rejected);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -EIO;
 	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
@@ -200,10 +208,12 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
 	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	}
@@ -245,9 +255,11 @@ int fscache_submit_op(struct fscache_object *object,
 	flags = READ_ONCE(object->flags);
 	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
 		fscache_stat(&fscache_n_op_rejected);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -EIO;
 	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
@@ -276,11 +288,13 @@ int fscache_submit_op(struct fscache_object *object,
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
 	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		ASSERT(!fscache_object_is_active(object));
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	}
@@ -335,7 +349,6 @@ void fscache_start_operations(struct fscache_object *object)
  * cancel an operation that's pending on an object
  */
 int fscache_cancel_op(struct fscache_operation *op,
-		      void (*do_cancel)(struct fscache_operation *),
 		      bool cancel_in_progress_op)
 {
 	struct fscache_object *object = op->object;
@@ -355,9 +368,9 @@ int fscache_cancel_op(struct fscache_operation *op,
 		ASSERT(!list_empty(&op->pend_link));
 		list_del_init(&op->pend_link);
 		put = true;
+
 		fscache_stat(&fscache_n_op_cancelled);
-		if (do_cancel)
-			do_cancel(op);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
 			object->n_exclusive--;
@@ -373,8 +386,7 @@ int fscache_cancel_op(struct fscache_operation *op,
 			fscache_start_operations(object);
 
 		fscache_stat(&fscache_n_op_cancelled);
-		if (do_cancel)
-			do_cancel(op);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
 			object->n_exclusive--;
@@ -408,6 +420,7 @@ void fscache_cancel_all_ops(struct fscache_object *object)
 		list_del_init(&op->pend_link);
 
 		ASSERTCMP(op->state, ==, FSCACHE_OP_ST_PENDING);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 
 		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
@@ -440,8 +453,12 @@ void fscache_op_complete(struct fscache_operation *op, bool cancelled)
 
 	spin_lock(&object->lock);
 
-	op->state = cancelled ?
-		FSCACHE_OP_ST_CANCELLED : FSCACHE_OP_ST_COMPLETE;
+	if (!cancelled) {
+		op->state = FSCACHE_OP_ST_COMPLETE;
+	} else {
+		op->cancel(op);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+	}
 
 	if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
 		object->n_exclusive--;
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 7db9752252ef..d1b23a67c031 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -213,7 +213,7 @@ int __fscache_attr_changed(struct fscache_cookie *cookie)
 		return -ENOMEM;
 	}
 
-	fscache_operation_init(op, fscache_attr_changed_op, NULL);
+	fscache_operation_init(op, fscache_attr_changed_op, NULL, NULL);
 	op->flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_EXCLUSIVE) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
@@ -249,6 +249,17 @@ nobufs:
 EXPORT_SYMBOL(__fscache_attr_changed);
 
 /*
+ * Handle cancellation of a pending retrieval op
+ */
+static void fscache_do_cancel_retrieval(struct fscache_operation *_op)
+{
+	struct fscache_retrieval *op =
+		container_of(_op, struct fscache_retrieval, op);
+
+	atomic_set(&op->n_pages, 0);
+}
+
+/*
  * release a retrieval op reference
  */
 static void fscache_release_retrieval_op(struct fscache_operation *_op)
@@ -285,7 +296,9 @@ static struct fscache_retrieval *fscache_alloc_retrieval(
 		return NULL;
 	}
 
-	fscache_operation_init(&op->op, NULL, fscache_release_retrieval_op);
+	fscache_operation_init(&op->op, NULL,
+			       fscache_do_cancel_retrieval,
+			       fscache_release_retrieval_op);
 	op->op.flags	= FSCACHE_OP_MYTHREAD |
 		(1UL << FSCACHE_OP_WAITING) |
 		(1UL << FSCACHE_OP_UNUSE_COOKIE);
@@ -330,24 +343,12 @@ int fscache_wait_for_deferred_lookup(struct fscache_cookie *cookie)
 }
 
 /*
- * Handle cancellation of a pending retrieval op
- */
-static void fscache_do_cancel_retrieval(struct fscache_operation *_op)
-{
-	struct fscache_retrieval *op =
-		container_of(_op, struct fscache_retrieval, op);
-
-	atomic_set(&op->n_pages, 0);
-}
-
-/*
  * wait for an object to become active (or dead)
  */
 int fscache_wait_for_operation_activation(struct fscache_object *object,
 					  struct fscache_operation *op,
 					  atomic_t *stat_op_waits,
-					  atomic_t *stat_object_dead,
-					  void (*do_cancel)(struct fscache_operation *))
+					  atomic_t *stat_object_dead)
 {
 	int ret;
 
@@ -359,7 +360,7 @@ int fscache_wait_for_operation_activation(struct fscache_object *object,
 		fscache_stat(stat_op_waits);
 	if (wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
 			TASK_INTERRUPTIBLE) != 0) {
-		ret = fscache_cancel_op(op, do_cancel, false);
+		ret = fscache_cancel_op(op, false);
 		if (ret == 0)
 			return -ERESTARTSYS;
 
@@ -380,7 +381,7 @@ check_if_dead:
 	if (unlikely(fscache_object_is_dying(object) ||
 		     fscache_cache_is_broken(object))) {
 		enum fscache_operation_state state = op->state;
-		fscache_cancel_op(op, do_cancel, true);
+		fscache_cancel_op(op, true);
 		if (stat_object_dead)
 			fscache_stat(stat_object_dead);
 		_leave(" = -ENOBUFS [obj dead %d]", state);
@@ -464,8 +465,7 @@ int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
 	ret = fscache_wait_for_operation_activation(
 		object, &op->op,
 		__fscache_stat(&fscache_n_retrieval_op_waits),
-		__fscache_stat(&fscache_n_retrievals_object_dead),
-		fscache_do_cancel_retrieval);
+		__fscache_stat(&fscache_n_retrievals_object_dead));
 	if (ret < 0)
 		goto error;
 
@@ -595,8 +595,7 @@ int __fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
 	ret = fscache_wait_for_operation_activation(
 		object, &op->op,
 		__fscache_stat(&fscache_n_retrieval_op_waits),
-		__fscache_stat(&fscache_n_retrievals_object_dead),
-		fscache_do_cancel_retrieval);
+		__fscache_stat(&fscache_n_retrievals_object_dead));
 	if (ret < 0)
 		goto error;
 
@@ -702,8 +701,7 @@ int __fscache_alloc_page(struct fscache_cookie *cookie,
 	ret = fscache_wait_for_operation_activation(
 		object, &op->op,
 		__fscache_stat(&fscache_n_alloc_op_waits),
-		__fscache_stat(&fscache_n_allocs_object_dead),
-		fscache_do_cancel_retrieval);
+		__fscache_stat(&fscache_n_allocs_object_dead));
 	if (ret < 0)
 		goto error;
 
@@ -946,7 +944,7 @@ int __fscache_write_page(struct fscache_cookie *cookie,
 	if (!op)
 		goto nomem;
 
-	fscache_operation_init(&op->op, fscache_write_op,
+	fscache_operation_init(&op->op, fscache_write_op, NULL,
 			       fscache_release_write_op);
 	op->op.flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_WAITING) |
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 0e26d49972e3..69c6eb10d858 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -74,6 +74,7 @@ extern wait_queue_head_t fscache_cache_cleared_wq;
  */
 typedef void (*fscache_operation_release_t)(struct fscache_operation *op);
 typedef void (*fscache_operation_processor_t)(struct fscache_operation *op);
+typedef void (*fscache_operation_cancel_t)(struct fscache_operation *op);
 
 enum fscache_operation_state {
 	FSCACHE_OP_ST_BLANK,		/* Op is not yet submitted */
@@ -109,6 +110,9 @@ struct fscache_operation {
 	 *   the op in a non-pool thread */
 	fscache_operation_processor_t processor;
 
+	/* Operation cancellation cleanup (optional) */
+	fscache_operation_cancel_t cancel;
+
 	/* operation releaser */
 	fscache_operation_release_t release;
 };
@@ -121,6 +125,7 @@ extern void fscache_op_complete(struct fscache_operation *, bool);
 extern void fscache_put_operation(struct fscache_operation *);
 extern void fscache_operation_init(struct fscache_operation *,
 				   fscache_operation_processor_t,
+				   fscache_operation_cancel_t,
 				   fscache_operation_release_t);
 
 /*


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 12/13] FS-Cache: The operation cancellation method needs calling in more places
@ 2015-02-26 14:03   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:03 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Any time an incomplete operation is cancelled, the operation cancellation
function needs to be called to clean up.  This is currently being passed
directly to some of the functions that might want to call it, but not all.

Instead, pass the cancellation method pointer to the fscache_operation_init()
and have that cache it in the operation struct.  Further, plug in a dummy
cancellation handler if the caller declines to set one as this allows us to
call the function unconditionally (the extra overhead isn't worth bothering
about as we don't expect to be calling this typically).

The cancellation method must thence be called everywhere the CANCELLED state
is set.  Note that we call it *before* setting the CANCELLED state such that
the method can use the old state value to guide its operation.

fscache_do_cancel_retrieval() needs moving higher up in the sources so that
the init function can use it now.

Without this, the following oops may be seen:

	FS-Cache: Assertion failed
	FS-Cache: 3 == 0 is false
	------------[ cut here ]------------
	kernel BUG at ../fs/fscache/page.c:261!
	...
	RIP: 0010:[<ffffffffa0089c1b>]  fscache_release_retrieval_op+0x77/0x100
	 [<ffffffffa008853d>] fscache_put_operation+0x114/0x2da
	 [<ffffffffa008b8c2>] __fscache_read_or_alloc_pages+0x358/0x3b3
	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
	 [<ffffffff81137164>] __vfs_read+0x13/0x38
	 [<ffffffff8113721e>] vfs_read+0x95/0x121
	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17

The assertion is showing that the remaining number of pages (n_pages) is not 0
when the operation is being released.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/cookie.c           |    5 ++--
 fs/fscache/internal.h         |    7 ++----
 fs/fscache/object.c           |    3 ++-
 fs/fscache/operation.c        |   31 +++++++++++++++++++++-------
 fs/fscache/page.c             |   46 ++++++++++++++++++++---------------------
 include/linux/fscache-cache.h |    5 ++++
 6 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 8de22164f5fb..d403c69bee08 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -672,7 +672,7 @@ int __fscache_check_consistency(struct fscache_cookie *cookie)
 	if (!op)
 		return -ENOMEM;
 
-	fscache_operation_init(op, NULL, NULL);
+	fscache_operation_init(op, NULL, NULL, NULL);
 	op->flags = FSCACHE_OP_MYTHREAD |
 		(1 << FSCACHE_OP_WAITING) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
@@ -696,8 +696,7 @@ int __fscache_check_consistency(struct fscache_cookie *cookie)
 	/* the work queue now carries its own ref on the object */
 	spin_unlock(&cookie->lock);
 
-	ret = fscache_wait_for_operation_activation(object, op,
-						    NULL, NULL, NULL);
+	ret = fscache_wait_for_operation_activation(object, op, NULL, NULL);
 	if (ret == 0) {
 		/* ask the cache to honour the operation */
 		ret = object->cache->ops->check_consistency(op);
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index a63225116db6..97ec45110957 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -124,9 +124,7 @@ extern int fscache_submit_exclusive_op(struct fscache_object *,
 				       struct fscache_operation *);
 extern int fscache_submit_op(struct fscache_object *,
 			     struct fscache_operation *);
-extern int fscache_cancel_op(struct fscache_operation *,
-			     void (*)(struct fscache_operation *),
-			     bool);
+extern int fscache_cancel_op(struct fscache_operation *, bool);
 extern void fscache_cancel_all_ops(struct fscache_object *);
 extern void fscache_abort_object(struct fscache_object *);
 extern void fscache_start_operations(struct fscache_object *);
@@ -139,8 +137,7 @@ extern int fscache_wait_for_deferred_lookup(struct fscache_cookie *);
 extern int fscache_wait_for_operation_activation(struct fscache_object *,
 						 struct fscache_operation *,
 						 atomic_t *,
-						 atomic_t *,
-						 void (*)(struct fscache_operation *));
+						 atomic_t *);
 extern void fscache_invalidate_writes(struct fscache_cookie *);
 
 /*
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 40049f7505f0..9e792e30f4db 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -961,7 +961,8 @@ static const struct fscache_state *_fscache_invalidate_object(struct fscache_obj
 	if (!op)
 		goto nomem;
 
-	fscache_operation_init(op, object->cache->ops->invalidate_object, NULL);
+	fscache_operation_init(op, object->cache->ops->invalidate_object,
+			       NULL, NULL);
 	op->flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_EXCLUSIVE) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index c76c09730768..57d4abb68656 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -20,6 +20,10 @@
 atomic_t fscache_op_debug_id;
 EXPORT_SYMBOL(fscache_op_debug_id);
 
+static void fscache_operation_dummy_cancel(struct fscache_operation *op)
+{
+}
+
 /**
  * fscache_operation_init - Do basic initialisation of an operation
  * @op: The operation to initialise
@@ -30,6 +34,7 @@ EXPORT_SYMBOL(fscache_op_debug_id);
  */
 void fscache_operation_init(struct fscache_operation *op,
 			    fscache_operation_processor_t processor,
+			    fscache_operation_cancel_t cancel,
 			    fscache_operation_release_t release)
 {
 	INIT_WORK(&op->work, fscache_op_work_func);
@@ -37,6 +42,7 @@ void fscache_operation_init(struct fscache_operation *op,
 	op->state = FSCACHE_OP_ST_INITIALISED;
 	op->debug_id = atomic_inc_return(&fscache_op_debug_id);
 	op->processor = processor;
+	op->cancel = cancel ?: fscache_operation_dummy_cancel;
 	op->release = release;
 	INIT_LIST_HEAD(&op->pend_link);
 	fscache_stat(&fscache_n_op_initialised);
@@ -164,9 +170,11 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 	flags = READ_ONCE(object->flags);
 	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
 		fscache_stat(&fscache_n_op_rejected);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -EIO;
 	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
@@ -200,10 +208,12 @@ int fscache_submit_exclusive_op(struct fscache_object *object,
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
 	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	}
@@ -245,9 +255,11 @@ int fscache_submit_op(struct fscache_object *object,
 	flags = READ_ONCE(object->flags);
 	if (unlikely(!(flags & BIT(FSCACHE_OBJECT_IS_LIVE)))) {
 		fscache_stat(&fscache_n_op_rejected);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else if (unlikely(fscache_cache_is_broken(object))) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -EIO;
 	} else if (flags & BIT(FSCACHE_OBJECT_IS_AVAILABLE)) {
@@ -276,11 +288,13 @@ int fscache_submit_op(struct fscache_object *object,
 		fscache_stat(&fscache_n_op_pend);
 		ret = 0;
 	} else if (flags & BIT(FSCACHE_OBJECT_KILLED_BY_CACHE)) {
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	} else {
 		fscache_report_unexpected_submission(object, op, ostate);
 		ASSERT(!fscache_object_is_active(object));
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		ret = -ENOBUFS;
 	}
@@ -335,7 +349,6 @@ void fscache_start_operations(struct fscache_object *object)
  * cancel an operation that's pending on an object
  */
 int fscache_cancel_op(struct fscache_operation *op,
-		      void (*do_cancel)(struct fscache_operation *),
 		      bool cancel_in_progress_op)
 {
 	struct fscache_object *object = op->object;
@@ -355,9 +368,9 @@ int fscache_cancel_op(struct fscache_operation *op,
 		ASSERT(!list_empty(&op->pend_link));
 		list_del_init(&op->pend_link);
 		put = true;
+
 		fscache_stat(&fscache_n_op_cancelled);
-		if (do_cancel)
-			do_cancel(op);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
 			object->n_exclusive--;
@@ -373,8 +386,7 @@ int fscache_cancel_op(struct fscache_operation *op,
 			fscache_start_operations(object);
 
 		fscache_stat(&fscache_n_op_cancelled);
-		if (do_cancel)
-			do_cancel(op);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
 			object->n_exclusive--;
@@ -408,6 +420,7 @@ void fscache_cancel_all_ops(struct fscache_object *object)
 		list_del_init(&op->pend_link);
 
 		ASSERTCMP(op->state, ==, FSCACHE_OP_ST_PENDING);
+		op->cancel(op);
 		op->state = FSCACHE_OP_ST_CANCELLED;
 
 		if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
@@ -440,8 +453,12 @@ void fscache_op_complete(struct fscache_operation *op, bool cancelled)
 
 	spin_lock(&object->lock);
 
-	op->state = cancelled ?
-		FSCACHE_OP_ST_CANCELLED : FSCACHE_OP_ST_COMPLETE;
+	if (!cancelled) {
+		op->state = FSCACHE_OP_ST_COMPLETE;
+	} else {
+		op->cancel(op);
+		op->state = FSCACHE_OP_ST_CANCELLED;
+	}
 
 	if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
 		object->n_exclusive--;
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index 7db9752252ef..d1b23a67c031 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -213,7 +213,7 @@ int __fscache_attr_changed(struct fscache_cookie *cookie)
 		return -ENOMEM;
 	}
 
-	fscache_operation_init(op, fscache_attr_changed_op, NULL);
+	fscache_operation_init(op, fscache_attr_changed_op, NULL, NULL);
 	op->flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_EXCLUSIVE) |
 		(1 << FSCACHE_OP_UNUSE_COOKIE);
@@ -249,6 +249,17 @@ nobufs:
 EXPORT_SYMBOL(__fscache_attr_changed);
 
 /*
+ * Handle cancellation of a pending retrieval op
+ */
+static void fscache_do_cancel_retrieval(struct fscache_operation *_op)
+{
+	struct fscache_retrieval *op =
+		container_of(_op, struct fscache_retrieval, op);
+
+	atomic_set(&op->n_pages, 0);
+}
+
+/*
  * release a retrieval op reference
  */
 static void fscache_release_retrieval_op(struct fscache_operation *_op)
@@ -285,7 +296,9 @@ static struct fscache_retrieval *fscache_alloc_retrieval(
 		return NULL;
 	}
 
-	fscache_operation_init(&op->op, NULL, fscache_release_retrieval_op);
+	fscache_operation_init(&op->op, NULL,
+			       fscache_do_cancel_retrieval,
+			       fscache_release_retrieval_op);
 	op->op.flags	= FSCACHE_OP_MYTHREAD |
 		(1UL << FSCACHE_OP_WAITING) |
 		(1UL << FSCACHE_OP_UNUSE_COOKIE);
@@ -330,24 +343,12 @@ int fscache_wait_for_deferred_lookup(struct fscache_cookie *cookie)
 }
 
 /*
- * Handle cancellation of a pending retrieval op
- */
-static void fscache_do_cancel_retrieval(struct fscache_operation *_op)
-{
-	struct fscache_retrieval *op =
-		container_of(_op, struct fscache_retrieval, op);
-
-	atomic_set(&op->n_pages, 0);
-}
-
-/*
  * wait for an object to become active (or dead)
  */
 int fscache_wait_for_operation_activation(struct fscache_object *object,
 					  struct fscache_operation *op,
 					  atomic_t *stat_op_waits,
-					  atomic_t *stat_object_dead,
-					  void (*do_cancel)(struct fscache_operation *))
+					  atomic_t *stat_object_dead)
 {
 	int ret;
 
@@ -359,7 +360,7 @@ int fscache_wait_for_operation_activation(struct fscache_object *object,
 		fscache_stat(stat_op_waits);
 	if (wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
 			TASK_INTERRUPTIBLE) != 0) {
-		ret = fscache_cancel_op(op, do_cancel, false);
+		ret = fscache_cancel_op(op, false);
 		if (ret == 0)
 			return -ERESTARTSYS;
 
@@ -380,7 +381,7 @@ check_if_dead:
 	if (unlikely(fscache_object_is_dying(object) ||
 		     fscache_cache_is_broken(object))) {
 		enum fscache_operation_state state = op->state;
-		fscache_cancel_op(op, do_cancel, true);
+		fscache_cancel_op(op, true);
 		if (stat_object_dead)
 			fscache_stat(stat_object_dead);
 		_leave(" = -ENOBUFS [obj dead %d]", state);
@@ -464,8 +465,7 @@ int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
 	ret = fscache_wait_for_operation_activation(
 		object, &op->op,
 		__fscache_stat(&fscache_n_retrieval_op_waits),
-		__fscache_stat(&fscache_n_retrievals_object_dead),
-		fscache_do_cancel_retrieval);
+		__fscache_stat(&fscache_n_retrievals_object_dead));
 	if (ret < 0)
 		goto error;
 
@@ -595,8 +595,7 @@ int __fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
 	ret = fscache_wait_for_operation_activation(
 		object, &op->op,
 		__fscache_stat(&fscache_n_retrieval_op_waits),
-		__fscache_stat(&fscache_n_retrievals_object_dead),
-		fscache_do_cancel_retrieval);
+		__fscache_stat(&fscache_n_retrievals_object_dead));
 	if (ret < 0)
 		goto error;
 
@@ -702,8 +701,7 @@ int __fscache_alloc_page(struct fscache_cookie *cookie,
 	ret = fscache_wait_for_operation_activation(
 		object, &op->op,
 		__fscache_stat(&fscache_n_alloc_op_waits),
-		__fscache_stat(&fscache_n_allocs_object_dead),
-		fscache_do_cancel_retrieval);
+		__fscache_stat(&fscache_n_allocs_object_dead));
 	if (ret < 0)
 		goto error;
 
@@ -946,7 +944,7 @@ int __fscache_write_page(struct fscache_cookie *cookie,
 	if (!op)
 		goto nomem;
 
-	fscache_operation_init(&op->op, fscache_write_op,
+	fscache_operation_init(&op->op, fscache_write_op, NULL,
 			       fscache_release_write_op);
 	op->op.flags = FSCACHE_OP_ASYNC |
 		(1 << FSCACHE_OP_WAITING) |
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 0e26d49972e3..69c6eb10d858 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -74,6 +74,7 @@ extern wait_queue_head_t fscache_cache_cleared_wq;
  */
 typedef void (*fscache_operation_release_t)(struct fscache_operation *op);
 typedef void (*fscache_operation_processor_t)(struct fscache_operation *op);
+typedef void (*fscache_operation_cancel_t)(struct fscache_operation *op);
 
 enum fscache_operation_state {
 	FSCACHE_OP_ST_BLANK,		/* Op is not yet submitted */
@@ -109,6 +110,9 @@ struct fscache_operation {
 	 *   the op in a non-pool thread */
 	fscache_operation_processor_t processor;
 
+	/* Operation cancellation cleanup (optional) */
+	fscache_operation_cancel_t cancel;
+
 	/* operation releaser */
 	fscache_operation_release_t release;
 };
@@ -121,6 +125,7 @@ extern void fscache_op_complete(struct fscache_operation *, bool);
 extern void fscache_put_operation(struct fscache_operation *);
 extern void fscache_operation_init(struct fscache_operation *,
 				   fscache_operation_processor_t,
+				   fscache_operation_cancel_t,
 				   fscache_operation_release_t);
 
 /*

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 13/13] FS-Cache: Retain the netfs context in the retrieval op earlier
  2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
@ 2015-02-26 14:03   ` David Howells
  2015-02-26 14:02   ` David Howells
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:03 UTC (permalink / raw)
  To: linux-cachefs; +Cc: dhowells, linux-fsdevel, linux-nfs, linux-kernel

Now that the retrieval operation may be disposed of by fscache_put_operation()
before we actually set the context, the retrieval-specific cleanup operation
can produce a NULL-pointer dereference when it tries to unconditionally clean
up the netfs context.

Given that it is expected that we'll get at least as far as the place where we
currently set the context pointer and it is unlikely we'll go through the
error handling paths prior to that point, retain the context right from the
point that the retrieval op is allocated.

Concomitant to this, we need to retain the cookie pointer in the retrieval op
also so that we can call the netfs to release its context in the release
method.

In addition, we might now get into fscache_release_retrieval_op() with the op
only initialised.  To this end, set the operation to DEAD only after the
release method has been called and skip the n_pages test upon cleanup if the
op is still in the INITIALISED state.

Without these changes, the following oops might be seen:

	BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
	...
	RIP: 0010:[<ffffffffa0089c98>] fscache_release_retrieval_op+0xae/0x100
	...
	Call Trace:
	 [<ffffffffa0088560>] fscache_put_operation+0x117/0x2e0
	 [<ffffffffa008b8f5>] __fscache_read_or_alloc_pages+0x351/0x3ac
	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
	 [<ffffffff81137164>] __vfs_read+0x13/0x38
	 [<ffffffff8113721e>] vfs_read+0x95/0x121
	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c        |    2 +-
 fs/fscache/page.c             |   20 ++++++++++----------
 include/linux/fscache-cache.h |    1 +
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 57d4abb68656..de67745e1cd7 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -492,7 +492,6 @@ void fscache_put_operation(struct fscache_operation *op)
 	ASSERTIFCMP(op->state != FSCACHE_OP_ST_INITIALISED &&
 		    op->state != FSCACHE_OP_ST_COMPLETE,
 		    op->state, ==, FSCACHE_OP_ST_CANCELLED);
-	op->state = FSCACHE_OP_ST_DEAD;
 
 	fscache_stat(&fscache_n_op_release);
 
@@ -500,6 +499,7 @@ void fscache_put_operation(struct fscache_operation *op)
 		op->release(op);
 		op->release = NULL;
 	}
+	op->state = FSCACHE_OP_ST_DEAD;
 
 	object = op->object;
 	if (likely(object)) {
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index d1b23a67c031..483bbc613bf0 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -269,11 +269,12 @@ static void fscache_release_retrieval_op(struct fscache_operation *_op)
 
 	_enter("{OP%x}", op->op.debug_id);
 
-	ASSERTCMP(atomic_read(&op->n_pages), ==, 0);
+	ASSERTIFCMP(op->op.state != FSCACHE_OP_ST_INITIALISED,
+		    atomic_read(&op->n_pages), ==, 0);
 
 	fscache_hist(fscache_retrieval_histogram, op->start_time);
 	if (op->context)
-		fscache_put_context(op->op.object->cookie, op->context);
+		fscache_put_context(op->cookie, op->context);
 
 	_leave("");
 }
@@ -302,11 +303,18 @@ static struct fscache_retrieval *fscache_alloc_retrieval(
 	op->op.flags	= FSCACHE_OP_MYTHREAD |
 		(1UL << FSCACHE_OP_WAITING) |
 		(1UL << FSCACHE_OP_UNUSE_COOKIE);
+	op->cookie	= cookie;
 	op->mapping	= mapping;
 	op->end_io_func	= end_io_func;
 	op->context	= context;
 	op->start_time	= jiffies;
 	INIT_LIST_HEAD(&op->to_do);
+
+	/* Pin the netfs read context in case we need to do the actual netfs
+	 * read because we've encountered a cache read failure.
+	 */
+	if (context)
+		fscache_get_context(op->cookie, context);
 	return op;
 }
 
@@ -456,10 +464,6 @@ int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
 
 	fscache_stat(&fscache_n_retrieval_ops);
 
-	/* pin the netfs read context in case we need to do the actual netfs
-	 * read because we've encountered a cache read failure */
-	fscache_get_context(object->cookie, op->context);
-
 	/* we wait for the operation to become active, and then process it
 	 * *here*, in this thread, and not in the thread pool */
 	ret = fscache_wait_for_operation_activation(
@@ -586,10 +590,6 @@ int __fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
 
 	fscache_stat(&fscache_n_retrieval_ops);
 
-	/* pin the netfs read context in case we need to do the actual netfs
-	 * read because we've encountered a cache read failure */
-	fscache_get_context(object->cookie, op->context);
-
 	/* we wait for the operation to become active, and then process it
 	 * *here*, in this thread, and not in the thread pool */
 	ret = fscache_wait_for_operation_activation(
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 69c6eb10d858..604e1526cd00 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -133,6 +133,7 @@ extern void fscache_operation_init(struct fscache_operation *,
  */
 struct fscache_retrieval {
 	struct fscache_operation op;
+	struct fscache_cookie	*cookie;	/* The netfs cookie */
 	struct address_space	*mapping;	/* netfs pages */
 	fscache_rw_complete_t	end_io_func;	/* function to call on I/O completion */
 	void			*context;	/* netfs read context (pinned) */


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 13/13] FS-Cache: Retain the netfs context in the retrieval op earlier
@ 2015-02-26 14:03   ` David Howells
  0 siblings, 0 replies; 28+ messages in thread
From: David Howells @ 2015-02-26 14:03 UTC (permalink / raw)
  To: linux-cachefs; +Cc: linux-fsdevel, linux-nfs, linux-kernel

Now that the retrieval operation may be disposed of by fscache_put_operation()
before we actually set the context, the retrieval-specific cleanup operation
can produce a NULL-pointer dereference when it tries to unconditionally clean
up the netfs context.

Given that it is expected that we'll get at least as far as the place where we
currently set the context pointer and it is unlikely we'll go through the
error handling paths prior to that point, retain the context right from the
point that the retrieval op is allocated.

Concomitant to this, we need to retain the cookie pointer in the retrieval op
also so that we can call the netfs to release its context in the release
method.

In addition, we might now get into fscache_release_retrieval_op() with the op
only initialised.  To this end, set the operation to DEAD only after the
release method has been called and skip the n_pages test upon cleanup if the
op is still in the INITIALISED state.

Without these changes, the following oops might be seen:

	BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
	...
	RIP: 0010:[<ffffffffa0089c98>] fscache_release_retrieval_op+0xae/0x100
	...
	Call Trace:
	 [<ffffffffa0088560>] fscache_put_operation+0x117/0x2e0
	 [<ffffffffa008b8f5>] __fscache_read_or_alloc_pages+0x351/0x3ac
	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
	 [<ffffffff81137164>] __vfs_read+0x13/0x38
	 [<ffffffff8113721e>] vfs_read+0x95/0x121
	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fscache/operation.c        |    2 +-
 fs/fscache/page.c             |   20 ++++++++++----------
 include/linux/fscache-cache.h |    1 +
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/fscache/operation.c b/fs/fscache/operation.c
index 57d4abb68656..de67745e1cd7 100644
--- a/fs/fscache/operation.c
+++ b/fs/fscache/operation.c
@@ -492,7 +492,6 @@ void fscache_put_operation(struct fscache_operation *op)
 	ASSERTIFCMP(op->state != FSCACHE_OP_ST_INITIALISED &&
 		    op->state != FSCACHE_OP_ST_COMPLETE,
 		    op->state, ==, FSCACHE_OP_ST_CANCELLED);
-	op->state = FSCACHE_OP_ST_DEAD;
 
 	fscache_stat(&fscache_n_op_release);
 
@@ -500,6 +499,7 @@ void fscache_put_operation(struct fscache_operation *op)
 		op->release(op);
 		op->release = NULL;
 	}
+	op->state = FSCACHE_OP_ST_DEAD;
 
 	object = op->object;
 	if (likely(object)) {
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index d1b23a67c031..483bbc613bf0 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -269,11 +269,12 @@ static void fscache_release_retrieval_op(struct fscache_operation *_op)
 
 	_enter("{OP%x}", op->op.debug_id);
 
-	ASSERTCMP(atomic_read(&op->n_pages), ==, 0);
+	ASSERTIFCMP(op->op.state != FSCACHE_OP_ST_INITIALISED,
+		    atomic_read(&op->n_pages), ==, 0);
 
 	fscache_hist(fscache_retrieval_histogram, op->start_time);
 	if (op->context)
-		fscache_put_context(op->op.object->cookie, op->context);
+		fscache_put_context(op->cookie, op->context);
 
 	_leave("");
 }
@@ -302,11 +303,18 @@ static struct fscache_retrieval *fscache_alloc_retrieval(
 	op->op.flags	= FSCACHE_OP_MYTHREAD |
 		(1UL << FSCACHE_OP_WAITING) |
 		(1UL << FSCACHE_OP_UNUSE_COOKIE);
+	op->cookie	= cookie;
 	op->mapping	= mapping;
 	op->end_io_func	= end_io_func;
 	op->context	= context;
 	op->start_time	= jiffies;
 	INIT_LIST_HEAD(&op->to_do);
+
+	/* Pin the netfs read context in case we need to do the actual netfs
+	 * read because we've encountered a cache read failure.
+	 */
+	if (context)
+		fscache_get_context(op->cookie, context);
 	return op;
 }
 
@@ -456,10 +464,6 @@ int __fscache_read_or_alloc_page(struct fscache_cookie *cookie,
 
 	fscache_stat(&fscache_n_retrieval_ops);
 
-	/* pin the netfs read context in case we need to do the actual netfs
-	 * read because we've encountered a cache read failure */
-	fscache_get_context(object->cookie, op->context);
-
 	/* we wait for the operation to become active, and then process it
 	 * *here*, in this thread, and not in the thread pool */
 	ret = fscache_wait_for_operation_activation(
@@ -586,10 +590,6 @@ int __fscache_read_or_alloc_pages(struct fscache_cookie *cookie,
 
 	fscache_stat(&fscache_n_retrieval_ops);
 
-	/* pin the netfs read context in case we need to do the actual netfs
-	 * read because we've encountered a cache read failure */
-	fscache_get_context(object->cookie, op->context);
-
 	/* we wait for the operation to become active, and then process it
 	 * *here*, in this thread, and not in the thread pool */
 	ret = fscache_wait_for_operation_activation(
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 69c6eb10d858..604e1526cd00 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -133,6 +133,7 @@ extern void fscache_operation_init(struct fscache_operation *,
  */
 struct fscache_retrieval {
 	struct fscache_operation op;
+	struct fscache_cookie	*cookie;	/* The netfs cookie */
 	struct address_space	*mapping;	/* netfs pages */
 	fscache_rw_complete_t	end_io_func;	/* function to call on I/O completion */
 	void			*context;	/* netfs read context (pinned) */

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space
@ 2015-03-02 23:58   ` Jeff Layton
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff Layton @ 2015-03-02 23:58 UTC (permalink / raw)
  To: David Howells; +Cc: linux-cachefs, linux-fsdevel, linux-nfs, linux-kernel

On Thu, 26 Feb 2015 14:01:55 +0000
David Howells <dhowells@redhat.com> wrote:

> 
> Attached are a bunch of patches that progressively fix bugs that can occur
> when the cache hits the hard "must maintain X free space" limit and starts
> rejecting requests.
> 
> The commit ensubjected "FS-Cache: Synchronise object death state change vs
> operation submission" is the main bug I was looking for.  Some of the other
> patches fix other bugs I found along the way and the remaining patches are
> changes pulled out of the bugfixes to make the bugfix commits more narrowly
> targeted.
> 
> Note that I added some new stats:
> 
> 	Ops    : ini=836360 dfr=19 rel=836360 gc=19
> 
> The ini=N field here indicates the number of fscache_operation structs that
> were initialised and should match the rel=M value.
> 
> 	CacheEv: nsp=3425 stl=0 rtr=0 cul=0
> 
> Indicates counts of certain cache backend events: nsp=N shows how many objects
> were rejected due to lack of space in the cache, stl=N shows how many objects
> were found to be stale upon reuse and thus discarded, rtr=N how many objects
> were retired and cul=N shows how many inactive objects were culled.
> 
> These can also be found here:
> 
> 	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-fixes
> 
> tagged with:
> 
> 	fscache-fixes-20150226
> 
> David
> ---
> David Howells (13):
>       FS-Cache: Count culled objects and objects rejected due to lack of space
>       FS-Cache: Move fscache_report_unexpected_submission() to make it more available
>       FS-Cache: When submitting an op, cancel it if the target object is dying
>       FS-Cache: Handle a new operation submitted against a killed object
>       FS-Cache: Synchronise object death state change vs operation submission
>       FS-Cache: fscache_object_is_dead() has wrong logic, kill it
>       FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too
>       FS-Cache: Out of line fscache_operation_init()
>       FS-Cache: Count the number of initialised operations
>       FS-Cache: Fix cancellation of in-progress operation
>       FS-Cache: Put an aborted initialised op so that it is accounted correctly
>       FS-Cache: The operation cancellation method needs calling in more places
>       FS-Cache: Retain the netfs context in the retrieval op earlier
> 
> 
>  Documentation/filesystems/caching/backend-api.txt |   23 ++
>  Documentation/filesystems/caching/fscache.txt     |    7 -
>  fs/cachefiles/internal.h                          |    1 
>  fs/cachefiles/namei.c                             |   33 ++-
>  fs/fscache/cookie.c                               |    8 -
>  fs/fscache/internal.h                             |   12 +
>  fs/fscache/object.c                               |   69 +++++-
>  fs/fscache/operation.c                            |  254 ++++++++++++++-------
>  fs/fscache/page.c                                 |   86 ++++---
>  fs/fscache/stats.c                                |   14 +
>  include/linux/fscache-cache.h                     |   55 ++---
>  11 files changed, 378 insertions(+), 184 deletions(-)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

I've looked over these patches. While I haven't looked at the fscache
code in quite a while, these all look reasonably sane to me (and it
looks like there are some good bugfixes in here).

Acked-by: Jeff Layton <jeff.layton@primarydata.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space
@ 2015-03-02 23:58   ` Jeff Layton
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff Layton @ 2015-03-02 23:58 UTC (permalink / raw)
  To: David Howells
  Cc: linux-cachefs-H+wXaHxf7aLQT0dZR+AlfA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Thu, 26 Feb 2015 14:01:55 +0000
David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> 
> Attached are a bunch of patches that progressively fix bugs that can occur
> when the cache hits the hard "must maintain X free space" limit and starts
> rejecting requests.
> 
> The commit ensubjected "FS-Cache: Synchronise object death state change vs
> operation submission" is the main bug I was looking for.  Some of the other
> patches fix other bugs I found along the way and the remaining patches are
> changes pulled out of the bugfixes to make the bugfix commits more narrowly
> targeted.
> 
> Note that I added some new stats:
> 
> 	Ops    : ini=836360 dfr=19 rel=836360 gc=19
> 
> The ini=N field here indicates the number of fscache_operation structs that
> were initialised and should match the rel=M value.
> 
> 	CacheEv: nsp=3425 stl=0 rtr=0 cul=0
> 
> Indicates counts of certain cache backend events: nsp=N shows how many objects
> were rejected due to lack of space in the cache, stl=N shows how many objects
> were found to be stale upon reuse and thus discarded, rtr=N how many objects
> were retired and cul=N shows how many inactive objects were culled.
> 
> These can also be found here:
> 
> 	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-fixes
> 
> tagged with:
> 
> 	fscache-fixes-20150226
> 
> David
> ---
> David Howells (13):
>       FS-Cache: Count culled objects and objects rejected due to lack of space
>       FS-Cache: Move fscache_report_unexpected_submission() to make it more available
>       FS-Cache: When submitting an op, cancel it if the target object is dying
>       FS-Cache: Handle a new operation submitted against a killed object
>       FS-Cache: Synchronise object death state change vs operation submission
>       FS-Cache: fscache_object_is_dead() has wrong logic, kill it
>       FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too
>       FS-Cache: Out of line fscache_operation_init()
>       FS-Cache: Count the number of initialised operations
>       FS-Cache: Fix cancellation of in-progress operation
>       FS-Cache: Put an aborted initialised op so that it is accounted correctly
>       FS-Cache: The operation cancellation method needs calling in more places
>       FS-Cache: Retain the netfs context in the retrieval op earlier
> 
> 
>  Documentation/filesystems/caching/backend-api.txt |   23 ++
>  Documentation/filesystems/caching/fscache.txt     |    7 -
>  fs/cachefiles/internal.h                          |    1 
>  fs/cachefiles/namei.c                             |   33 ++-
>  fs/fscache/cookie.c                               |    8 -
>  fs/fscache/internal.h                             |   12 +
>  fs/fscache/object.c                               |   69 +++++-
>  fs/fscache/operation.c                            |  254 ++++++++++++++-------
>  fs/fscache/page.c                                 |   86 ++++---
>  fs/fscache/stats.c                                |   14 +
>  include/linux/fscache-cache.h                     |   55 ++---
>  11 files changed, 378 insertions(+), 184 deletions(-)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

I've looked over these patches. While I haven't looked at the fscache
code in quite a while, these all look reasonably sane to me (and it
looks like there are some good bugfixes in here).

Acked-by: Jeff Layton <jeff.layton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2015-03-02 23:58 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-26 14:01 [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space David Howells
2015-02-26 14:02 ` [PATCH 01/13] FS-Cache: Count culled objects and objects rejected due to lack " David Howells
2015-02-26 14:02 ` [PATCH 02/13] FS-Cache: Move fscache_report_unexpected_submission() to make it more available David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 03/13] FS-Cache: When submitting an op, cancel it if the target object is dying David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 04/13] FS-Cache: Handle a new operation submitted against a killed object David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 05/13] FS-Cache: Synchronise object death state change vs operation submission David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 06/13] FS-Cache: fscache_object_is_dead() has wrong logic, kill it David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 07/13] FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 08/13] FS-Cache: Out of line fscache_operation_init() David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 09/13] FS-Cache: Count the number of initialised operations David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:02 ` [PATCH 10/13] FS-Cache: Fix cancellation of in-progress operation David Howells
2015-02-26 14:02   ` David Howells
2015-02-26 14:03 ` [PATCH 11/13] FS-Cache: Put an aborted initialised op so that it is accounted correctly David Howells
2015-02-26 14:03   ` David Howells
2015-02-26 14:03 ` [PATCH 12/13] FS-Cache: The operation cancellation method needs calling in more places David Howells
2015-02-26 14:03   ` David Howells
2015-02-26 14:03 ` [PATCH 13/13] FS-Cache: Retain the netfs context in the retrieval op earlier David Howells
2015-02-26 14:03   ` David Howells
2015-03-02 23:58 ` [PATCH 00/13] FS-Cache: Fix a number of bugs that occur when the cache runs out of space Jeff Layton
2015-03-02 23:58   ` Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.