* [PATCH 0/9] Some improvements to request deferral and related code
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

Hi,
 here again are those patches from request deferral improvement set
 that have not yet been applied.

 I had previously missed the cache lookups in svcauth_gss, and have now
 made them work correctly.

 I also discovered that for NFSv4.1 and later, the current deferral
 scheme is disabled once the request gets into the NFS layer (i.e. out
 of the RPC layer).  I don't know that 4.0 doesn't disable it as well
 - maybe the problems it can cause are more severe with 4.1?  With
 this patch set included it might be acceptable to disable the current
 scheme for 4.0.

 I believe this is ready for wider testing and probably mainline
 inclusion.

Thanks,
NeilBrown


---

NeilBrown (9):
      sunrpc: close connection when a request is irretrievably lost.
      sunrpc/cache: change deferred-request hash table to use hlist.
      nfsd/idmap: drop special request deferal in favour of improved default.
      sunrpc/cache: retry cache lookups that return -ETIMEDOUT
      sunrpc/cache: allow threads to block while waiting for cache update.
      sunrpc/cache: avoid variable over-loading in cache_defer_req
      sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
      sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
      sunrpc/cache: change cache_defer_req to return -ve error, not boolean.


 fs/nfsd/export.c                  |   18 ++++++
 fs/nfsd/nfs4idmap.c               |  105 ++++------------------------------
 include/linux/sunrpc/cache.h      |    5 +-
 include/linux/sunrpc/svcauth.h    |   10 ++-
 net/sunrpc/auth_gss/svcauth_gss.c |   29 ++++++---
 net/sunrpc/cache.c                |  115 ++++++++++++++++++++++++-------------
 net/sunrpc/svc.c                  |    3 +
 net/sunrpc/svc_xprt.c             |   11 ++++
 net/sunrpc/svcauth_unix.c         |   34 +++++++++--
 9 files changed, 176 insertions(+), 154 deletions(-)

-- 
Signature



* [PATCH 1/9] sunrpc/cache: change cache_defer_req to return -ve error, not boolean.
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

As "cache_defer_req" does not sound like a predicate, having it return
a boolean value can be confusing.  It is more consistent to return
0 for success and negative for error.

Exactly what error code to return is not important as we don't
differentiate between reasons why the request wasn't deferred,
we only care about whether it was deferred or not.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 1a61401..aafb0e7 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -255,7 +255,7 @@ int cache_check(struct cache_detail *detail,
 	}
 
 	if (rv == -EAGAIN) {
-		if (cache_defer_req(rqstp, h) == 0) {
+		if (cache_defer_req(rqstp, h) < 0) {
 			/* Request is not deferred */
 			rv = cache_is_valid(detail, h);
 			if (rv == -EAGAIN)
@@ -511,11 +511,11 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		 * or continue and drop the oldest below
 		 */
 		if (net_random()&1)
-			return 0;
+			return -ENOMEM;
 	}
 	dreq = req->defer(req);
 	if (dreq == NULL)
-		return 0;
+		return -ENOMEM;
 
 	dreq->item = item;
 
@@ -545,9 +545,9 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	if (!test_bit(CACHE_PENDING, &item->flags)) {
 		/* must have just been validated... */
 		cache_revisit_request(item);
-		return 0;
+		return -EAGAIN;
 	}
-	return 1;
+	return 0;
 }
 
 static void cache_revisit_request(struct cache_head *item)




* [PATCH 2/9] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The extra call to cache_revisit_request in cache_fresh_unlocked is not
needed, as should have been fairly clear at the time of
   commit 4013edea9a0b6cdcb1fdf5d4011e47e068fd6efb

If there are requests to be revisited, then we can be sure that
CACHE_PENDING is set, so the second call is sufficient.

So remove the first call.
Then remove the 'new' parameter,
then remove the return value for cache_fresh_locked which is only used
to provide the value for 'new'.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |   23 ++++++++++-------------
 1 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index aafb0e7..baddb23 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -105,18 +105,16 @@ EXPORT_SYMBOL_GPL(sunrpc_cache_lookup);
 
 static void cache_dequeue(struct cache_detail *detail, struct cache_head *ch);
 
-static int cache_fresh_locked(struct cache_head *head, time_t expiry)
+static void cache_fresh_locked(struct cache_head *head, time_t expiry)
 {
 	head->expiry_time = expiry;
 	head->last_refresh = get_seconds();
-	return !test_and_set_bit(CACHE_VALID, &head->flags);
+	set_bit(CACHE_VALID, &head->flags);
 }
 
 static void cache_fresh_unlocked(struct cache_head *head,
-			struct cache_detail *detail, int new)
+				 struct cache_detail *detail)
 {
-	if (new)
-		cache_revisit_request(head);
 	if (test_and_clear_bit(CACHE_PENDING, &head->flags)) {
 		cache_revisit_request(head);
 		cache_dequeue(detail, head);
@@ -132,7 +130,6 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	 */
 	struct cache_head **head;
 	struct cache_head *tmp;
-	int is_new;
 
 	if (!test_bit(CACHE_VALID, &old->flags)) {
 		write_lock(&detail->hash_lock);
@@ -141,9 +138,9 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 				set_bit(CACHE_NEGATIVE, &old->flags);
 			else
 				detail->update(old, new);
-			is_new = cache_fresh_locked(old, new->expiry_time);
+			cache_fresh_locked(old, new->expiry_time);
 			write_unlock(&detail->hash_lock);
-			cache_fresh_unlocked(old, detail, is_new);
+			cache_fresh_unlocked(old, detail);
 			return old;
 		}
 		write_unlock(&detail->hash_lock);
@@ -167,11 +164,11 @@ struct cache_head *sunrpc_cache_update(struct cache_detail *detail,
 	*head = tmp;
 	detail->entries++;
 	cache_get(tmp);
-	is_new = cache_fresh_locked(tmp, new->expiry_time);
+	cache_fresh_locked(tmp, new->expiry_time);
 	cache_fresh_locked(old, 0);
 	write_unlock(&detail->hash_lock);
-	cache_fresh_unlocked(tmp, detail, is_new);
-	cache_fresh_unlocked(old, detail, 0);
+	cache_fresh_unlocked(tmp, detail);
+	cache_fresh_unlocked(old, detail);
 	cache_put(old, detail);
 	return tmp;
 }
@@ -240,8 +237,8 @@ int cache_check(struct cache_detail *detail,
 				cache_revisit_request(h);
 				if (rv == -EAGAIN) {
 					set_bit(CACHE_NEGATIVE, &h->flags);
-					cache_fresh_unlocked(h, detail,
-					     cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY));
+					cache_fresh_locked(h, get_seconds()+CACHE_NEW_EXPIRY);
+					cache_fresh_unlocked(h, detail);
 					rv = -ENOENT;
 				}
 				break;




* [PATCH 3/9] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

Using list_del_init is generally safer than list_del, and it will
allow us, in a subsequent patch, to see if an entry has already been
processed or not.
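
For illustration only (hypothetical helper, not part of this patch):
list_del_init() leaves the entry pointing at itself, so list_empty()
on the entry can later answer "is this still queued?", whereas plain
list_del() poisons the pointers and allows no such test.

	/* sketch: only valid because removal uses list_del_init() */
	static int cache_defer_req_is_queued(struct cache_deferred_req *dreq)
	{
		return !list_empty(&dreq->hash);
	}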

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index baddb23..9c8f0cc 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -529,8 +529,8 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	if (++cache_defer_cnt > DFR_MAX) {
 		dreq = list_entry(cache_defer_list.prev,
 				  struct cache_deferred_req, recent);
-		list_del(&dreq->recent);
-		list_del(&dreq->hash);
+		list_del_init(&dreq->recent);
+		list_del_init(&dreq->hash);
 		cache_defer_cnt--;
 	}
 	spin_unlock(&cache_defer_lock);
@@ -564,7 +564,7 @@ static void cache_revisit_request(struct cache_head *item)
 			dreq = list_entry(lp, struct cache_deferred_req, hash);
 			lp = lp->next;
 			if (dreq->item == item) {
-				list_del(&dreq->hash);
+				list_del_init(&dreq->hash);
 				list_move(&dreq->recent, &pending);
 				cache_defer_cnt--;
 			}
@@ -590,7 +590,7 @@ void cache_clean_deferred(void *owner)
 
 	list_for_each_entry_safe(dreq, tmp, &cache_defer_list, recent) {
 		if (dreq->owner == owner) {
-			list_del(&dreq->hash);
+			list_del_init(&dreq->hash);
 			list_move(&dreq->recent, &pending);
 			cache_defer_cnt--;
 		}




* [PATCH 4/9] sunrpc/cache: avoid variable over-loading in cache_defer_req
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

In cache_defer_req, 'dreq' is used for two significantly different
values that happen to be of the same type.

This is both confusing, and makes it hard to extend the range of one of
the values as we will in the next patch.
So introduce 'discard' to take one of the values.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 net/sunrpc/cache.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 9c8f0cc..54bbd83 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -500,7 +500,7 @@ static int cache_defer_cnt;
 
 static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 {
-	struct cache_deferred_req *dreq;
+	struct cache_deferred_req *dreq, *discard;
 	int hash = DFR_HASH(item);
 
 	if (cache_defer_cnt >= DFR_MAX) {
@@ -525,20 +525,20 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 	list_add(&dreq->hash, &cache_defer_hash[hash]);
 
 	/* it is in, now maybe clean up */
-	dreq = NULL;
+	discard = NULL;
 	if (++cache_defer_cnt > DFR_MAX) {
-		dreq = list_entry(cache_defer_list.prev,
-				  struct cache_deferred_req, recent);
-		list_del_init(&dreq->recent);
-		list_del_init(&dreq->hash);
+		discard = list_entry(cache_defer_list.prev,
+				     struct cache_deferred_req, recent);
+		list_del_init(&discard->recent);
+		list_del_init(&discard->hash);
 		cache_defer_cnt--;
 	}
 	spin_unlock(&cache_defer_lock);
 
-	if (dreq) {
+	if (discard)
 		/* there was one too many */
-		dreq->revisit(dreq, 1);
-	}
+		discard->revisit(discard, 1);
+
 	if (!test_bit(CACHE_PENDING, &item->flags)) {
 		/* must have just been validated... */
 		cache_revisit_request(item);




* [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The current practice of waiting for cache updates by queueing the
whole request to be retried has (at least) two problems.

1/ With NFSv4, requests can be quite complex and re-trying a whole
  request when a later part fails should only be a last resort, not a
  normal practice.

2/ Large requests, and in particular any 'write' request, will not be
  queued by the current code and doing so would be undesirable.

In many cases only a very short wait is needed before the cache gets
valid data.

So, providing the underlying transport permits it by setting
 ->thread_wait,
arrange to wait briefly for an upcall to be completed (as reflected in
the clearing of CACHE_PENDING).
If the short wait was not long enough and CACHE_PENDING is still set,
fall back on the old approach.

The 'thread_wait' value is set to 5 seconds when there are spare
threads, and 1 second when there are no spare threads.

These values are probably much higher than needed, but will ensure
some forward progress.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/cache.h |    3 +++
 net/sunrpc/cache.c           |   44 +++++++++++++++++++++++++++++++++++++++++-
 net/sunrpc/svc_xprt.c        |   11 +++++++++++
 3 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 6f52b4d..ef3db11 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -125,6 +125,9 @@ struct cache_detail {
  */
 struct cache_req {
 	struct cache_deferred_req *(*defer)(struct cache_req *req);
+	int thread_wait;  /* How long (jiffies) we can block the
+			   * current thread to wait for updates.
+			   */
 };
 /* this must be embedded in a deferred_request that is being
  * delayed awaiting cache-fill
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 54bbd83..46e9e2b 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -498,10 +498,22 @@ static LIST_HEAD(cache_defer_list);
 static struct list_head cache_defer_hash[DFR_HASHSIZE];
 static int cache_defer_cnt;
 
+struct thread_deferred_req {
+	struct cache_deferred_req handle;
+	wait_queue_head_t wait;
+};
+static void cache_restart_thread(struct cache_deferred_req *dreq, int too_many)
+{
+	struct thread_deferred_req *dr =
+		container_of(dreq, struct thread_deferred_req, handle);
+	wake_up(&dr->wait);
+}
+
 static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 {
 	struct cache_deferred_req *dreq, *discard;
 	int hash = DFR_HASH(item);
+	struct thread_deferred_req sleeper;
 
 	if (cache_defer_cnt >= DFR_MAX) {
 		/* too much in the cache, randomly drop this one,
@@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		if (net_random()&1)
 			return -ENOMEM;
 	}
-	dreq = req->defer(req);
+	if (req->thread_wait) {
+		dreq = &sleeper.handle;
+		init_waitqueue_head(&sleeper.wait);
+		dreq->revisit = cache_restart_thread;
+	} else
+		dreq = req->defer(req);
+
+ retry:
 	if (dreq == NULL)
 		return -ENOMEM;
 
@@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		cache_revisit_request(item);
 		return -EAGAIN;
 	}
+
+	if (dreq == &sleeper.handle) {
+		wait_event_interruptible_timeout(
+			sleeper.wait,
+			!test_bit(CACHE_PENDING, &item->flags)
+			|| list_empty(&sleeper.handle.hash),
+			req->thread_wait);
+		spin_lock(&cache_defer_lock);
+		if (!list_empty(&sleeper.handle.hash)) {
+			list_del_init(&sleeper.handle.recent);
+			list_del_init(&sleeper.handle.hash);
+			cache_defer_cnt--;
+		}
+		spin_unlock(&cache_defer_lock);
+		if (test_bit(CACHE_PENDING, &item->flags)) {
+			/* item is still pending, try request
+			 * deferral
+			 */
+			dreq = req->defer(req);
+			goto retry;
+		}
+		return 0;
+	}
 	return 0;
 }
 
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 912dea5..65f6a25 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -655,12 +655,23 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
 		pool->sp_nwaking--;
 		BUG_ON(pool->sp_nwaking < 0);
 	}
+
+	/* Normally we will wait up to 5 seconds for any required
+	 * cache information to be provided.
+	 */
+	rqstp->rq_chandle.thread_wait = 5*HZ;
 	xprt = svc_xprt_dequeue(pool);
 	if (xprt) {
 		rqstp->rq_xprt = xprt;
 		svc_xprt_get(xprt);
 		rqstp->rq_reserved = serv->sv_max_mesg;
 		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+
+		/* As there is a shortage of threads and this request
+	 * had to be queued, don't allow the thread to wait so
+		 * long for cache updates.
+		 */
+		rqstp->rq_chandle.thread_wait = 1*HZ;
 	} else {
 		/* No data pending. Go to sleep */
 		svc_thread_enqueue(pool, rqstp);




* [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

If cache_check returns -ETIMEDOUT, then the cache item is not
up-to-date, but there is no pending upcall.
This could mean the data is not available, or it could mean that the
good data has been stored in a new cache item.

So re-do the lookup and if that returns a new item, proceed using that
item.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 fs/nfsd/export.c                  |   18 ++++++++++++++++++
 net/sunrpc/auth_gss/svcauth_gss.c |   18 ++++++++++++++----
 net/sunrpc/svcauth_unix.c         |   22 ++++++++++++++++++++--
 3 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 984a5eb..ec0f0d9 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -793,9 +793,18 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
 	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
 
 	ek = svc_expkey_lookup(&key);
+ again:
 	if (ek == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
+	if (err == -ETIMEDOUT) {
+		struct svc_expkey *prev_ek = ek;
+		ek = svc_expkey_lookup(&key);
+		if (ek != prev_ek)
+			goto again;
+		if (ek)
+			cache_put(&ek->h, &svc_expkey_cache);
+	}
 	if (err)
 		return ERR_PTR(err);
 	return ek;
@@ -865,9 +874,18 @@ static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
 	key.ex_path = *path;
 
 	exp = svc_export_lookup(&key);
+ retry:
 	if (exp == NULL)
 		return ERR_PTR(-ENOMEM);
 	err = cache_check(&svc_export_cache, &exp->h, reqp);
+	if (err == -ETIMEDOUT) {
+		struct svc_export *prev_exp = exp;
+		exp = svc_export_lookup(&key);
+		if (exp != prev_exp)
+			goto retry;
+		if (exp)
+			cache_put(&exp->h, &svc_export_cache);
+	}
 	if (err)
 		return ERR_PTR(err);
 	return exp;
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index f6c51e5..c4e9177 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -999,7 +999,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	struct kvec *argv = &rqstp->rq_arg.head[0];
 	struct kvec *resv = &rqstp->rq_res.head[0];
 	struct xdr_netobj tmpobj;
-	struct rsi *rsip, rsikey;
+	struct rsi *rsip, rsikey, *prev_rsi;
 	int ret;
 
 	/* Read the verifier; should be NULL: */
@@ -1029,15 +1029,24 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	}
 
 	/* Perform upcall, or find upcall result: */
+ retry:
 	rsip = rsi_lookup(&rsikey);
-	rsi_free(&rsikey);
-	if (!rsip)
+	if (!rsip) {
+		rsi_free(&rsikey);
 		return SVC_DROP;
+	}
 	switch (cache_check(&rsi_cache, &rsip->h, &rqstp->rq_chandle)) {
-	case -EAGAIN:
 	case -ETIMEDOUT:
+		prev_rsi = rsip;
+		rsip = rsi_lookup(&rsikey);
+		if (rsip != prev_rsi)
+			goto retry;
+		if (rsip)
+			cache_put(&rsip->h, &rsi_cache);
+	case -EAGAIN:
 	case -ENOENT:
 		/* No upcall result: */
+		rsi_free(&rsikey);
 		return SVC_DROP;
 	case 0:
 		ret = SVC_DROP;
@@ -1059,6 +1068,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	}
 	ret = SVC_COMPLETE;
 out:
+	rsi_free(&rsikey);
 	cache_put(&rsip->h, &rsi_cache);
 	return ret;
 }
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 117f68a..f9122df 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -659,8 +659,10 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 			 struct svc_rqst *rqstp)
 {
 	struct unix_gid *ug = unix_gid_lookup(uid);
+	struct unix_gid *prevug;
 	if (!ug)
 		return -EAGAIN;
+ retry:
 	switch (cache_check(&unix_gid_cache, &ug->h, &rqstp->rq_chandle)) {
 	case -ENOENT:
 		*gip = NULL;
@@ -670,6 +672,13 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 		get_group_info(*gip);
 		cache_put(&ug->h, &unix_gid_cache);
 		return 0;
+	case -ETIMEDOUT:
+		prevug = ug;
+		ug = unix_gid_lookup(uid);
+		if (ug != prevug)
+			goto retry;
+		if (ug)
+			cache_put(&ug->h, &unix_gid_cache);
 	default:
 		return -EAGAIN;
 	}
@@ -680,7 +689,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 {
 	struct sockaddr_in *sin;
 	struct sockaddr_in6 *sin6, sin6_storage;
-	struct ip_map *ipm;
+	struct ip_map *ipm, *prev_ipm;
 
 	switch (rqstp->rq_addr.ss_family) {
 	case AF_INET:
@@ -705,14 +714,23 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 		ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
 				    &sin6->sin6_addr);
 
+ retry:
 	if (ipm == NULL)
 		return SVC_DENIED;
 
 	switch (cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle)) {
 		default:
 			BUG();
-		case -EAGAIN:
 		case -ETIMEDOUT:
+			prev_ipm = ipm;
+			ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
+					    &sin6->sin6_addr);
+			if (ipm != prev_ipm)
+				goto retry;
+			if (ipm)
+				cache_put(&ipm->h, &ip_map_cache);
+
+		case -EAGAIN:
 			return SVC_DROP;
 		case -ENOENT:
 			return SVC_DENIED;




* [PATCH 7/9] nfsd/idmap: drop special request deferal in favour of improved default.
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

The idmap code manages request deferral by waiting for a reply from
userspace rather than putting the NFS request on a queue to be retried
from the start.
Now that the common deferral code does this, there is no need for the
special code in idmap.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 fs/nfsd/nfs4idmap.c |  105 +++++----------------------------------------------
 1 files changed, 11 insertions(+), 94 deletions(-)

diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
index cdfa86f..22574f3 100644
--- a/fs/nfsd/nfs4idmap.c
+++ b/fs/nfsd/nfs4idmap.c
@@ -497,109 +497,26 @@ nfsd_idmap_shutdown(void)
 	cache_unregister(&nametoid_cache);
 }
 
-/*
- * Deferred request handling
- */
-
-struct idmap_defer_req {
-       struct cache_req		req;
-       struct cache_deferred_req deferred_req;
-       wait_queue_head_t	waitq;
-       atomic_t			count;
-};
-
-static inline void
-put_mdr(struct idmap_defer_req *mdr)
-{
-	if (atomic_dec_and_test(&mdr->count))
-		kfree(mdr);
-}
-
-static inline void
-get_mdr(struct idmap_defer_req *mdr)
-{
-	atomic_inc(&mdr->count);
-}
-
-static void
-idmap_revisit(struct cache_deferred_req *dreq, int toomany)
-{
-	struct idmap_defer_req *mdr =
-		container_of(dreq, struct idmap_defer_req, deferred_req);
-
-	wake_up(&mdr->waitq);
-	put_mdr(mdr);
-}
-
-static struct cache_deferred_req *
-idmap_defer(struct cache_req *req)
-{
-	struct idmap_defer_req *mdr =
-		container_of(req, struct idmap_defer_req, req);
-
-	mdr->deferred_req.revisit = idmap_revisit;
-	get_mdr(mdr);
-	return (&mdr->deferred_req);
-}
-
-static inline int
-do_idmap_lookup(struct ent *(*lookup_fn)(struct ent *), struct ent *key,
-		struct cache_detail *detail, struct ent **item,
-		struct idmap_defer_req *mdr)
-{
-	*item = lookup_fn(key);
-	if (!*item)
-		return -ENOMEM;
-	return cache_check(detail, &(*item)->h, &mdr->req);
-}
-
-static inline int
-do_idmap_lookup_nowait(struct ent *(*lookup_fn)(struct ent *),
-			struct ent *key, struct cache_detail *detail,
-			struct ent **item)
-{
-	int ret = -ENOMEM;
-
-	*item = lookup_fn(key);
-	if (!*item)
-		goto out_err;
-	ret = -ETIMEDOUT;
-	if (!test_bit(CACHE_VALID, &(*item)->h.flags)
-			|| (*item)->h.expiry_time < get_seconds()
-			|| detail->flush_time > (*item)->h.last_refresh)
-		goto out_put;
-	ret = -ENOENT;
-	if (test_bit(CACHE_NEGATIVE, &(*item)->h.flags))
-		goto out_put;
-	return 0;
-out_put:
-	cache_put(&(*item)->h, detail);
-out_err:
-	*item = NULL;
-	return ret;
-}
-
 static int
 idmap_lookup(struct svc_rqst *rqstp,
 		struct ent *(*lookup_fn)(struct ent *), struct ent *key,
 		struct cache_detail *detail, struct ent **item)
 {
-	struct idmap_defer_req *mdr;
 	int ret;
 
-	mdr = kzalloc(sizeof(*mdr), GFP_KERNEL);
-	if (!mdr)
+	*item = lookup_fn(key);
+	if (!*item)
 		return -ENOMEM;
-	atomic_set(&mdr->count, 1);
-	init_waitqueue_head(&mdr->waitq);
-	mdr->req.defer = idmap_defer;
-	ret = do_idmap_lookup(lookup_fn, key, detail, item, mdr);
-	if (ret == -EAGAIN) {
-		wait_event_interruptible_timeout(mdr->waitq,
-			test_bit(CACHE_VALID, &(*item)->h.flags), 1 * HZ);
-		ret = do_idmap_lookup_nowait(lookup_fn, key, detail, item);
+ retry:
+	ret = cache_check(detail, &(*item)->h, &rqstp->rq_chandle);
+
+	if (ret == -ETIMEDOUT) {
+		struct ent *prev_item = *item;
+		*item = lookup_fn(key);
+		if (*item != prev_item)
+			goto retry;
+		cache_put(&(*item)->h, detail);
 	}
-	put_mdr(mdr);
 	return ret;
 }
 




* [PATCH 8/9] sunrpc/cache: change deferred-request hash table to use hlist.
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

Since this is a hash table, hlist is the natural choice of list type.

There is currently some ugliness where we treat "->next == NULL" as
a special case to avoid having to initialise the whole array.
This change nicely gets rid of that case.
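
For illustration only (stand-alone sketch, not part of the patch): a
zero-filled array of hlist_head is already an array of valid empty
chains, while an array of list_head would need INIT_LIST_HEAD() run on
every slot first -- which is what the old "->next == NULL" test was
working around.

	static struct hlist_head h_table[16];	/* all-NULL: ready to use */
	static struct list_head  l_table[16];	/* would need INIT_LIST_HEAD()
						 * on each slot before use */

	static void h_table_add(struct cache_deferred_req *dreq, int hash)
	{
		hlist_add_head(&dreq->hash, &h_table[hash]);
	}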

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/cache.h |    2 +-
 net/sunrpc/cache.c           |   30 ++++++++++++------------------
 2 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index ef3db11..ba9d72c 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -133,7 +133,7 @@ struct cache_req {
  * delayed awaiting cache-fill
  */
 struct cache_deferred_req {
-	struct list_head	hash;	/* on hash chain */
+	struct hlist_node	hash;	/* on hash chain */
 	struct list_head	recent; /* on fifo */
 	struct cache_head	*item;  /* cache item we wait on */
 	void			*owner; /* we might need to discard all defered requests
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 46e9e2b..dab417c 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -495,7 +495,7 @@ EXPORT_SYMBOL_GPL(cache_purge);
 
 static DEFINE_SPINLOCK(cache_defer_lock);
 static LIST_HEAD(cache_defer_list);
-static struct list_head cache_defer_hash[DFR_HASHSIZE];
+static struct hlist_head cache_defer_hash[DFR_HASHSIZE];
 static int cache_defer_cnt;
 
 struct thread_deferred_req {
@@ -539,9 +539,7 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 
 	list_add(&dreq->recent, &cache_defer_list);
 
-	if (cache_defer_hash[hash].next == NULL)
-		INIT_LIST_HEAD(&cache_defer_hash[hash]);
-	list_add(&dreq->hash, &cache_defer_hash[hash]);
+	hlist_add_head(&dreq->hash, &cache_defer_hash[hash]);
 
 	/* it is in, now maybe clean up */
 	discard = NULL;
@@ -549,7 +547,7 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		discard = list_entry(cache_defer_list.prev,
 				     struct cache_deferred_req, recent);
 		list_del_init(&discard->recent);
-		list_del_init(&discard->hash);
+		hlist_del_init(&discard->hash);
 		cache_defer_cnt--;
 	}
 	spin_unlock(&cache_defer_lock);
@@ -568,12 +566,12 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		wait_event_interruptible_timeout(
 			sleeper.wait,
 			!test_bit(CACHE_PENDING, &item->flags)
-			|| list_empty(&sleeper.handle.hash),
+			|| hlist_unhashed(&sleeper.handle.hash),
 			req->thread_wait);
 		spin_lock(&cache_defer_lock);
-		if (!list_empty(&sleeper.handle.hash)) {
+		if (!hlist_unhashed(&sleeper.handle.hash)) {
 			list_del_init(&sleeper.handle.recent);
-			list_del_init(&sleeper.handle.hash);
+			hlist_del_init(&sleeper.handle.hash);
 			cache_defer_cnt--;
 		}
 		spin_unlock(&cache_defer_lock);
@@ -594,24 +592,20 @@ static void cache_revisit_request(struct cache_head *item)
 	struct cache_deferred_req *dreq;
 	struct list_head pending;
 
-	struct list_head *lp;
+	struct hlist_node *lp, *tmp;
 	int hash = DFR_HASH(item);
 
 	INIT_LIST_HEAD(&pending);
 	spin_lock(&cache_defer_lock);
 
-	lp = cache_defer_hash[hash].next;
-	if (lp) {
-		while (lp != &cache_defer_hash[hash]) {
-			dreq = list_entry(lp, struct cache_deferred_req, hash);
-			lp = lp->next;
+	hlist_for_each_entry_safe(dreq, lp, tmp,
+				  &cache_defer_hash[hash], hash)
 			if (dreq->item == item) {
-				list_del_init(&dreq->hash);
+				hlist_del_init(&dreq->hash);
 				list_move(&dreq->recent, &pending);
 				cache_defer_cnt--;
 			}
-		}
-	}
+
 	spin_unlock(&cache_defer_lock);
 
 	while (!list_empty(&pending)) {
@@ -632,7 +626,7 @@ void cache_clean_deferred(void *owner)
 
 	list_for_each_entry_safe(dreq, tmp, &cache_defer_list, recent) {
 		if (dreq->owner == owner) {
-			list_del_init(&dreq->hash);
+			hlist_del_init(&dreq->hash);
 			list_move(&dreq->recent, &pending);
 			cache_defer_cnt--;
 		}




* [PATCH 9/9] sunrpc: close connection when a request is irretrievably lost.
From: NeilBrown @ 2009-09-09  6:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, NeilBrown

If we drop a request in the sunrpc layer, either due to kmalloc failure,
or due to a cache miss when we could not queue the request for later
replay, then close the connection to encourage the client to retry sooner.

Note that if the drop happens in the NFS layer, NFSERR_JUKEBOX
(aka NFS4ERR_DELAY) is returned to guide the client concerning
replay.

Signed-off-by: NeilBrown <neilb@suse.de>
---

 include/linux/sunrpc/svcauth.h    |   10 +++++++---
 net/sunrpc/auth_gss/svcauth_gss.c |   11 ++++++-----
 net/sunrpc/svc.c                  |    3 +++
 net/sunrpc/svcauth_unix.c         |   14 +++++++++-----
 4 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index d39dbdc..1126693 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -108,9 +108,13 @@ struct auth_ops {
 #define	SVC_NEGATIVE	4
 #define	SVC_OK		5
 #define	SVC_DROP	6
-#define	SVC_DENIED	7
-#define	SVC_PENDING	8
-#define	SVC_COMPLETE	9
+#define	SVC_CLOSE	7	/* Like SVC_DROP, but request is definitely
+				 * lost so if there is a tcp connection, it
+				 * should be closed
+				 */
+#define	SVC_DENIED	8
+#define	SVC_PENDING	9
+#define	SVC_COMPLETE	10
 
 
 extern int	svc_authenticate(struct svc_rqst *rqstp, __be32 *authp);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index c4e9177..03eae41 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -963,7 +963,7 @@ svcauth_gss_set_client(struct svc_rqst *rqstp)
 	if (rqstp->rq_gssclient == NULL)
 		return SVC_DENIED;
 	stat = svcauth_unix_set_client(rqstp);
-	if (stat == SVC_DROP)
+	if (stat == SVC_DROP || stat == SVC_CLOSE)
 		return stat;
 	return SVC_OK;
 }
@@ -1017,7 +1017,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 		return SVC_DENIED;
 	memset(&rsikey, 0, sizeof(rsikey));
 	if (dup_netobj(&rsikey.in_handle, &gc->gc_ctx))
-		return SVC_DROP;
+		return SVC_CLOSE;
 	*authp = rpc_autherr_badverf;
 	if (svc_safe_getnetobj(argv, &tmpobj)) {
 		kfree(rsikey.in_handle.data);
@@ -1025,7 +1025,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	}
 	if (dup_netobj(&rsikey.in_token, &tmpobj)) {
 		kfree(rsikey.in_handle.data);
-		return SVC_DROP;
+		return SVC_CLOSE;
 	}
 
 	/* Perform upcall, or find upcall result: */
@@ -1033,7 +1033,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	rsip = rsi_lookup(&rsikey);
 	if (!rsip) {
 		rsi_free(&rsikey);
-		return SVC_DROP;
+		return SVC_CLOSE;
 	}
 	switch (cache_check(&rsi_cache, &rsip->h, &rqstp->rq_chandle)) {
 	case -ETIMEDOUT:
@@ -1043,13 +1043,14 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 			goto retry;
 		if (rsip)
 			cache_put(&rsip->h, &rsi_cache);
+		return SVC_CLOSE;
 	case -EAGAIN:
 	case -ENOENT:
 		/* No upcall result: */
 		rsi_free(&rsikey);
 		return SVC_DROP;
 	case 0:
-		ret = SVC_DROP;
+		ret = SVC_CLOSE;
 		/* Got an answer to the upcall; use it: */
 		if (gss_write_init_verf(rqstp, rsip))
 			goto out;
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 952f206..b47aa21 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1050,6 +1050,9 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
 		goto err_bad;
 	case SVC_DENIED:
 		goto err_bad_auth;
+	case SVC_CLOSE:
+		if (test_bit(XPT_TEMP, &rqstp->rq_xprt->xpt_flags))
+			svc_close_xprt(rqstp->rq_xprt);
 	case SVC_DROP:
 		goto dropit;
 	case SVC_COMPLETE:
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index f9122df..4e4e10b 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -679,6 +679,7 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
 			goto retry;
 		if (ug)
 			cache_put(&ug->h, &unix_gid_cache);
+		return -ESHUTDOWN;
 	default:
 		return -EAGAIN;
 	}
@@ -729,7 +730,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 				goto retry;
 			if (ipm)
 				cache_put(&ipm->h, &ip_map_cache);
-
+			return SVC_CLOSE;
 		case -EAGAIN:
 			return SVC_DROP;
 		case -ENOENT:
@@ -774,7 +775,7 @@ svcauth_null_accept(struct svc_rqst *rqstp, __be32 *authp)
 	cred->cr_gid = (gid_t) -1;
 	cred->cr_group_info = groups_alloc(0);
 	if (cred->cr_group_info == NULL)
-		return SVC_DROP; /* kmalloc failure - client must retry */
+		return SVC_CLOSE; /* kmalloc failure - client must retry */
 
 	/* Put NULL verifier */
 	svc_putnl(resv, RPC_AUTH_NULL);
@@ -836,13 +837,16 @@ svcauth_unix_accept(struct svc_rqst *rqstp, __be32 *authp)
 	slen = svc_getnl(argv);			/* gids length */
 	if (slen > 16 || (len -= (slen + 2)*4) < 0)
 		goto badcred;
-	if (unix_gid_find(cred->cr_uid, &cred->cr_group_info, rqstp)
-	    == -EAGAIN)
+	switch (unix_gid_find(cred->cr_uid, &cred->cr_group_info, rqstp)) {
+	case -EAGAIN:
 		return SVC_DROP;
+	case -ESHUTDOWN:
+		return SVC_CLOSE;
+	}
 	if (cred->cr_group_info == NULL) {
 		cred->cr_group_info = groups_alloc(slen);
 		if (cred->cr_group_info == NULL)
-			return SVC_DROP;
+			return SVC_CLOSE;
 		for (i = 0; i < slen; i++)
 			GROUP_AT(cred->cr_group_info, i) = svc_getnl(argv);
 	} else {




* Re: [PATCH 1/9] sunrpc/cache: change cache_defer_req to return -ve error, not boolean.
From: J. Bruce Fields @ 2009-09-11 21:03 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> As "cache_defer_req" does not sound like a predicate, having it return
> a boolean value can be confusing.  It is more consistent to return
> 0 for success and negative for error.
> 
> Exactly what error code to return is not important as we don't
> differentiate between reasons why the request wasn't deferred,
> we only care about whether it was deferred or not.

Thanks, applied.

--b.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |   10 +++++-----
>  1 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 1a61401..aafb0e7 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -255,7 +255,7 @@ int cache_check(struct cache_detail *detail,
>  	}
>  
>  	if (rv == -EAGAIN) {
> -		if (cache_defer_req(rqstp, h) == 0) {
> +		if (cache_defer_req(rqstp, h) < 0) {
>  			/* Request is not deferred */
>  			rv = cache_is_valid(detail, h);
>  			if (rv == -EAGAIN)
> @@ -511,11 +511,11 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		 * or continue and drop the oldest below
>  		 */
>  		if (net_random()&1)
> -			return 0;
> +			return -ENOMEM;
>  	}
>  	dreq = req->defer(req);
>  	if (dreq == NULL)
> -		return 0;
> +		return -ENOMEM;
>  
>  	dreq->item = item;
>  
> @@ -545,9 +545,9 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  	if (!test_bit(CACHE_PENDING, &item->flags)) {
>  		/* must have just been validated... */
>  		cache_revisit_request(item);
> -		return 0;
> +		return -EAGAIN;
>  	}
> -	return 1;
> +	return 0;
>  }
>  
>  static void cache_revisit_request(struct cache_head *item)
> 
> 


* Re: [PATCH 0/9] Some improvements to request deferral and related code
From: J. Bruce Fields @ 2009-09-11 21:07 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> Hi,
>  here again are those patches from request deferral improvement set
>  that have not yet been applied.
> 
>  I had previously missed the cache lookups in svcauth_gss, and have now
>  made them work correctly.
> 
>  I also discovered that for NFSv4.1 and later, the current deferral
>  scheme is disabled once the request gets into the NFS layer (i.e. out
>  of the RPC layer).  I don't know that 4.0 doesn't disable it as well
>  - maybe the problems it can cause are more severe with 4.1?

Yes, I think we'd need special handling of the sequence operation, at
least.  I can't recall exactly what the issue was, but we just decided
to give up on deferrals entirely until they were handled in some better
way....

> With
>  this patch set included it might be acceptable to disable the current
>  scheme for 4.0.

Great, thanks.

--b.

> 
>  I believe this is ready for wider testing and probably mainline
>  inclusion.
> 
> Thanks,
> NeilBrown
> 
> 
> ---
> 
> NeilBrown (9):
>       sunrpc: close connection when a request is irretrievably lost.
>       sunrpc/cache: change deferred-request hash table to use hlist.
>       nfsd/idmap: drop special request deferal in favour of improved default.
>       sunrpc/cache: retry cache lookups that return -ETIMEDOUT
>       sunrpc/cache: allow threads to block while waiting for cache update.
>       sunrpc/cache: avoid variable over-loading in cache_defer_req
>       sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
>       sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
>       sunrpc/cache: change cache_defer_req to return -ve error, not boolean.
> 
> 
>  fs/nfsd/export.c                  |   18 ++++++
>  fs/nfsd/nfs4idmap.c               |  105 ++++------------------------------
>  include/linux/sunrpc/cache.h      |    5 +-
>  include/linux/sunrpc/svcauth.h    |   10 ++-
>  net/sunrpc/auth_gss/svcauth_gss.c |   29 ++++++---
>  net/sunrpc/cache.c                |  115 ++++++++++++++++++++++++-------------
>  net/sunrpc/svc.c                  |    3 +
>  net/sunrpc/svc_xprt.c             |   11 ++++
>  net/sunrpc/svcauth_unix.c         |   34 +++++++++--
>  9 files changed, 176 insertions(+), 154 deletions(-)
> 
> -- 
> Signature
> 


* Re: [PATCH 3/9] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
From: J. Bruce Fields @ 2009-09-18 15:48 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> Using list_del_init is generally safer than list_del, and it will
> allow us, in a subsequent patch, to see if an entry has already been
> processed or not.

Also applied this (and the previous)--thanks.--b.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index baddb23..9c8f0cc 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -529,8 +529,8 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  	if (++cache_defer_cnt > DFR_MAX) {
>  		dreq = list_entry(cache_defer_list.prev,
>  				  struct cache_deferred_req, recent);
> -		list_del(&dreq->recent);
> -		list_del(&dreq->hash);
> +		list_del_init(&dreq->recent);
> +		list_del_init(&dreq->hash);
>  		cache_defer_cnt--;
>  	}
>  	spin_unlock(&cache_defer_lock);
> @@ -564,7 +564,7 @@ static void cache_revisit_request(struct cache_head *item)
>  			dreq = list_entry(lp, struct cache_deferred_req, hash);
>  			lp = lp->next;
>  			if (dreq->item == item) {
> -				list_del(&dreq->hash);
> +				list_del_init(&dreq->hash);
>  				list_move(&dreq->recent, &pending);
>  				cache_defer_cnt--;
>  			}
> @@ -590,7 +590,7 @@ void cache_clean_deferred(void *owner)
>  
>  	list_for_each_entry_safe(dreq, tmp, &cache_defer_list, recent) {
>  		if (dreq->owner == owner) {
> -			list_del(&dreq->hash);
> +			list_del_init(&dreq->hash);
>  			list_move(&dreq->recent, &pending);
>  			cache_defer_cnt--;
>  		}
> 
> 


* Re: [PATCH 4/9] sunrpc/cache: avoid variable over-loading in cache_defer_req
From: J. Bruce Fields @ 2009-09-18 21:24 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> In cache_defer_req, 'dreq' is used for two significantly different
> values that happen to be of the same type.
> 
> This is both confusing, and makes it hard to extend the range of one of
> the values as we will in the next patch.
> So introduce 'discard' to take one of the values.

Applied, thanks.--b.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  net/sunrpc/cache.c |   18 +++++++++---------
>  1 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 9c8f0cc..54bbd83 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -500,7 +500,7 @@ static int cache_defer_cnt;
>  
>  static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  {
> -	struct cache_deferred_req *dreq;
> +	struct cache_deferred_req *dreq, *discard;
>  	int hash = DFR_HASH(item);
>  
>  	if (cache_defer_cnt >= DFR_MAX) {
> @@ -525,20 +525,20 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  	list_add(&dreq->hash, &cache_defer_hash[hash]);
>  
>  	/* it is in, now maybe clean up */
> -	dreq = NULL;
> +	discard = NULL;
>  	if (++cache_defer_cnt > DFR_MAX) {
> -		dreq = list_entry(cache_defer_list.prev,
> -				  struct cache_deferred_req, recent);
> -		list_del_init(&dreq->recent);
> -		list_del_init(&dreq->hash);
> +		discard = list_entry(cache_defer_list.prev,
> +				     struct cache_deferred_req, recent);
> +		list_del_init(&discard->recent);
> +		list_del_init(&discard->hash);
>  		cache_defer_cnt--;
>  	}
>  	spin_unlock(&cache_defer_lock);
>  
> -	if (dreq) {
> +	if (discard)
>  		/* there was one too many */
> -		dreq->revisit(dreq, 1);
> -	}
> +		discard->revisit(discard, 1);
> +
>  	if (!test_bit(CACHE_PENDING, &item->flags)) {
>  		/* must have just been validated... */
>  		cache_revisit_request(item);
> 
> 


* Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.
From: J. Bruce Fields @ 2009-12-02 20:59 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> The current practice of waiting for cache updates by queueing the
> whole request to be retried has (at least) two problems.

Apologies for the delay!

> 1/ With NFSv4, requests can be quite complex and re-trying a whole
> >   request when a later part fails should only be a last resort, not a
>   normal practice.
> 
> 2/ Large requests, and in particular any 'write' request, will not be
>   queued by the current code and doing so would be undesirable.
> 
> > In many cases only a very short wait is needed before the cache gets
> valid data.
> 
> So, providing the underlying transport permits it by setting
>  ->thread_wait,
> arrange to wait briefly for an upcall to be completed (as reflected in
> the clearing of CACHE_PENDING).
> If the short wait was not long enough and CACHE_PENDING is still set,
> fall back on the old approach.
> 
> The 'thread_wait' value is set to 5 seconds when there are spare
> threads, and 1 second when there are no spare threads.
> 
> These values are probably much higher than needed, but will ensure
> some forward progress.

This looks fine, and I want to merge it.  One mainly superficial
complaint:

>  static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  {
>  	struct cache_deferred_req *dreq, *discard;
>  	int hash = DFR_HASH(item);
> +	struct thread_deferred_req sleeper;
>  
>  	if (cache_defer_cnt >= DFR_MAX) {
>  		/* too much in the cache, randomly drop this one,
> @@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		if (net_random()&1)
>  			return -ENOMEM;
>  	}
> -	dreq = req->defer(req);
> +	if (req->thread_wait) {
> +		dreq = &sleeper.handle;
> +		init_waitqueue_head(&sleeper.wait);
> +		dreq->revisit = cache_restart_thread;
> +	} else
> +		dreq = req->defer(req);
> +
> + retry:
>  	if (dreq == NULL)
>  		return -ENOMEM;
>  
> @@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		cache_revisit_request(item);
>  		return -EAGAIN;
>  	}
> +
> +	if (dreq == &sleeper.handle) {
> +		wait_event_interruptible_timeout(
> +			sleeper.wait,
> +			!test_bit(CACHE_PENDING, &item->flags)
> +			|| list_empty(&sleeper.handle.hash),
> +			req->thread_wait);
> +		spin_lock(&cache_defer_lock);
> +		if (!list_empty(&sleeper.handle.hash)) {
> +			list_del_init(&sleeper.handle.recent);
> +			list_del_init(&sleeper.handle.hash);
> +			cache_defer_cnt--;
> +		}
> +		spin_unlock(&cache_defer_lock);
> +		if (test_bit(CACHE_PENDING, &item->flags)) {
> +			/* item is still pending, try request
> +			 * deferral
> +			 */
> +			dreq = req->defer(req);
> +			goto retry;
> +		}
> +		return 0;
> +	}

With this, cache_defer_req is tending towards the long and complicated
side.  It'd probably suffice to do something as simple as moving some of
the code into helper functions to hide the details.
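
For example, something along these lines (rough sketch, hypothetical
helper name, just to illustrate the kind of split I mean):

	static void cache_wait_req(struct cache_req *req, struct cache_head *item,
				   struct thread_deferred_req *sleeper)
	{
		wait_event_interruptible_timeout(
			sleeper->wait,
			!test_bit(CACHE_PENDING, &item->flags)
			|| list_empty(&sleeper->handle.hash),
			req->thread_wait);
		spin_lock(&cache_defer_lock);
		if (!list_empty(&sleeper->handle.hash)) {
			/* still hashed: nobody revisited us, dequeue ourselves */
			list_del_init(&sleeper->handle.recent);
			list_del_init(&sleeper->handle.hash);
			cache_defer_cnt--;
		}
		spin_unlock(&cache_defer_lock);
	}

so that the tail of cache_defer_req() reduces to:

	if (dreq == &sleeper.handle) {
		cache_wait_req(req, item, &sleeper);
		if (test_bit(CACHE_PENDING, &item->flags)) {
			/* item is still pending, try request deferral */
			dreq = req->defer(req);
			goto retry;
		}
		return 0;
	}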

--b.


* Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.
From: Trond Myklebust @ 2009-12-02 21:23 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: NeilBrown, linux-nfs

On Wed, 2009-12-02 at 15:59 -0500, J. Bruce Fields wrote: 
> On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> > The current practice of waiting for cache updates by queueing the
> > whole request to be retried has (at least) two problems.
> 
> Apologies for the delay!
> 
> > 1/ With NFSv4, requests can be quite complex and re-trying a whole
> > >   request when a later part fails should only be a last resort, not a
> >   normal practice.
> > 
> > 2/ Large requests, and in particular any 'write' request, will not be
> >   queued by the current code and doing so would be undesirable.
> > 
> > > In many cases only a very short wait is needed before the cache gets
> > valid data.
> > 
> > So, providing the underlying transport permits it by setting
> >  ->thread_wait,
> > arrange to wait briefly for an upcall to be completed (as reflected in
> > the clearing of CACHE_PENDING).
> > If the short wait was not long enough and CACHE_PENDING is still set,
> > fall back on the old approach.
> > 
> > The 'thread_wait' value is set to 5 seconds when there are spare
> > threads, and 1 second when there are no spare threads.
> > 
> > These values are probably much higher than needed, but will ensure
> > some forward progress.
> 
> This looks fine, and I want to merge it.  One mainly superficial
> complaint:
> 
> >  static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> >  {
> >  	struct cache_deferred_req *dreq, *discard;
> >  	int hash = DFR_HASH(item);
> > +	struct thread_deferred_req sleeper;
> >  
> >  	if (cache_defer_cnt >= DFR_MAX) {
> >  		/* too much in the cache, randomly drop this one,
> > @@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> >  		if (net_random()&1)
> >  			return -ENOMEM;
> >  	}
> > -	dreq = req->defer(req);
> > +	if (req->thread_wait) {
> > +		dreq = &sleeper.handle;
> > +		init_waitqueue_head(&sleeper.wait);
> > +		dreq->revisit = cache_restart_thread;
> > +	} else
> > +		dreq = req->defer(req);
> > +
> > + retry:
> >  	if (dreq == NULL)
> >  		return -ENOMEM;
> >  
> > @@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> >  		cache_revisit_request(item);
> >  		return -EAGAIN;
> >  	}
> > +
> > +	if (dreq == &sleeper.handle) {
> > +		wait_event_interruptible_timeout(
> > +			sleeper.wait,
> > +			!test_bit(CACHE_PENDING, &item->flags)
> > +			|| list_empty(&sleeper.handle.hash),
> > +			req->thread_wait);
> > +		spin_lock(&cache_defer_lock);
> > +		if (!list_empty(&sleeper.handle.hash)) {
> > +			list_del_init(&sleeper.handle.recent);
> > +			list_del_init(&sleeper.handle.hash);
> > +			cache_defer_cnt--;
> > +		}
> > +		spin_unlock(&cache_defer_lock);
> > +		if (test_bit(CACHE_PENDING, &item->flags)) {
> > +			/* item is still pending, try request
> > +			 * deferral
> > +			 */
> > +			dreq = req->defer(req);
> > +			goto retry;
> > +		}
> > +		return 0;
> > +	}
> 
> With this, cache_defer_req is tending towards the long and complicated
> side.  It'd probably suffice to do something as simple as moving some of
> the code into helper functions to hide the details.

Couldn't you also simplify things a good deal by just adding a
completion to struct cache_deferred_req, and then letting
cache_defer_req() call wait_for_completion_timeout()?

That's pretty much all we do in fs/nfs/cache_lib.c...
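
Roughly (sketch only; assumes a new "done" member added to struct
cache_deferred_req, which is not in the posted patch):

	static void cache_restart_thread(struct cache_deferred_req *dreq,
					 int too_many)
	{
		complete(&dreq->done);
	}

	/* in cache_defer_req(), with init_completion(&dreq->done) and
	 * dreq->revisit = cache_restart_thread set before dreq is hashed:
	 */
	if (wait_for_completion_timeout(&dreq->done, req->thread_wait) == 0) {
		/* timed out; CACHE_PENDING may still be set, so fall back
		 * to deferring the whole request as before.
		 */
	}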

Cheers
  Trond



* Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.
From: Trond Myklebust @ 2009-12-02 21:50 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: NeilBrown, linux-nfs

On Wed, 2009-12-02 at 16:23 -0500, Trond Myklebust wrote: 
> Couldn't you also simplify things a good deal by just adding a
> completion to struct cache_deferred_req, and then letting
> cache_defer_req() call wait_for_completion_timeout()?

Better yet, the caller of cache_check() could decide whether or not to
also call wait_for_completion_timeout().

In addition to allowing the rpc server thread to wait upon the
authentication upcall, this could also simplify the nfsv4 server's
idmapper code, and the nfsv4 client's dns resolver code.
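
A rough illustration of that split of responsibility; cache_wait_for_upcall()
and caller_can_sleep below are hypothetical, and the -EAGAIN handling is
deliberately simplified:

	err = cache_check(detail, &item->h, &rqstp->rq_chandle);
	if (err == -EAGAIN && caller_can_sleep) {
		/* this caller prefers to block in place rather than have
		 * the whole RPC request deferred and replayed later */
		cache_wait_for_upcall(detail, &item->h, 5 * HZ); /* hypothetical */
		err = cache_check(detail, &item->h, &rqstp->rq_chandle);
	}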



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
       [not found]     ` <20090909063254.20462.41616.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-12-02 22:11       ` J. Bruce Fields
  2009-12-03 16:57         ` J. Bruce Fields
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2009-12-02 22:11 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> If cache_check returns -ETIMEDOUT, then the cache item is not
> up-to-date, but there is no pending upcall.
> This could mean the data is not available, or it could mean that the
> good data has been stored in a new cache item.
> 
> So re-do the lookup and if that returns a new item, proceed using that
> item.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  fs/nfsd/export.c                  |   18 ++++++++++++++++++
>  net/sunrpc/auth_gss/svcauth_gss.c |   18 ++++++++++++++----
>  net/sunrpc/svcauth_unix.c         |   22 ++++++++++++++++++++--
>  3 files changed, 52 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index 984a5eb..ec0f0d9 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -793,9 +793,18 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
>  	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
>  
>  	ek = svc_expkey_lookup(&key);
> + again:
>  	if (ek == NULL)
>  		return ERR_PTR(-ENOMEM);
>  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> +	if (err == -ETIMEDOUT) {
> +		struct svc_expkey *prev_ek = ek;
> +		ek = svc_expkey_lookup(&key);
> +		if (ek != prev_ek)
> +			goto again;
> +		if (ek)
> +			cache_put(&ek->h, &svc_expkey_cache);
> +	}

This is very subtle.  (We're comparing to a pointer which may not
actually point to anything any more.)  And it's repeated in every
caller.  Is there any simpler way to handle this?

--b.

>  	if (err)
>  		return ERR_PTR(err);
>  	return ek;
> @@ -865,9 +874,18 @@ static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
>  	key.ex_path = *path;
>  
>  	exp = svc_export_lookup(&key);
> + retry:
>  	if (exp == NULL)
>  		return ERR_PTR(-ENOMEM);
>  	err = cache_check(&svc_export_cache, &exp->h, reqp);
> +	if (err == -ETIMEDOUT) {
> +		struct svc_export *prev_exp = exp;
> +		exp = svc_export_lookup(&key);
> +		if (exp != prev_exp)
> +			goto retry;
> +		if (exp)
> +			cache_put(&exp->h, &svc_export_cache);
> +	}
>  	if (err)
>  		return ERR_PTR(err);
>  	return exp;
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index f6c51e5..c4e9177 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -999,7 +999,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
>  	struct kvec *argv = &rqstp->rq_arg.head[0];
>  	struct kvec *resv = &rqstp->rq_res.head[0];
>  	struct xdr_netobj tmpobj;
> -	struct rsi *rsip, rsikey;
> +	struct rsi *rsip, rsikey, *prev_rsi;
>  	int ret;
>  
>  	/* Read the verifier; should be NULL: */
> @@ -1029,15 +1029,24 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
>  	}
>  
>  	/* Perform upcall, or find upcall result: */
> + retry:
>  	rsip = rsi_lookup(&rsikey);
> -	rsi_free(&rsikey);
> -	if (!rsip)
> +	if (!rsip) {
> +		rsi_free(&rsikey);
>  		return SVC_DROP;
> +	}
>  	switch (cache_check(&rsi_cache, &rsip->h, &rqstp->rq_chandle)) {
> -	case -EAGAIN:
>  	case -ETIMEDOUT:
> +		prev_rsi = rsip;
> +		rsip = rsi_lookup(&rsikey);
> +		if (rsip != prev_rsi)
> +			goto retry;
> +		if (rsip)
> +			cache_put(&rsip->h, &rsi_cache);
> +	case -EAGAIN:
>  	case -ENOENT:
>  		/* No upcall result: */
> +		rsi_free(&rsikey);
>  		return SVC_DROP;
>  	case 0:
>  		ret = SVC_DROP;
> @@ -1059,6 +1068,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
>  	}
>  	ret = SVC_COMPLETE;
>  out:
> +	rsi_free(&rsikey);
>  	cache_put(&rsip->h, &rsi_cache);
>  	return ret;
>  }
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index 117f68a..f9122df 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -659,8 +659,10 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
>  			 struct svc_rqst *rqstp)
>  {
>  	struct unix_gid *ug = unix_gid_lookup(uid);
> +	struct unix_gid *prevug;
>  	if (!ug)
>  		return -EAGAIN;
> + retry:
>  	switch (cache_check(&unix_gid_cache, &ug->h, &rqstp->rq_chandle)) {
>  	case -ENOENT:
>  		*gip = NULL;
> @@ -670,6 +672,13 @@ static int unix_gid_find(uid_t uid, struct group_info **gip,
>  		get_group_info(*gip);
>  		cache_put(&ug->h, &unix_gid_cache);
>  		return 0;
> +	case -ETIMEDOUT:
> +		prevug = ug;
> +		ug = unix_gid_lookup(uid);
> +		if (ug != prevug)
> +			goto retry;
> +		if (ug)
> +			cache_put(&ug->h, &unix_gid_cache);
>  	default:
>  		return -EAGAIN;
>  	}
> @@ -680,7 +689,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
>  {
>  	struct sockaddr_in *sin;
>  	struct sockaddr_in6 *sin6, sin6_storage;
> -	struct ip_map *ipm;
> +	struct ip_map *ipm, *prev_ipm;
>  
>  	switch (rqstp->rq_addr.ss_family) {
>  	case AF_INET:
> @@ -705,14 +714,23 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
>  		ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
>  				    &sin6->sin6_addr);
>  
> + retry:
>  	if (ipm == NULL)
>  		return SVC_DENIED;
>  
>  	switch (cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle)) {
>  		default:
>  			BUG();
> -		case -EAGAIN:
>  		case -ETIMEDOUT:
> +			prev_ipm = ipm;
> +			ipm = ip_map_lookup(rqstp->rq_server->sv_program->pg_class,
> +					    &sin6->sin6_addr);
> +			if (ipm != prev_ipm)
> +				goto retry;
> +			if (ipm)
> +				cache_put(&ipm->h, &ip_map_cache);
> +
> +		case -EAGAIN:
>  			return SVC_DROP;
>  		case -ENOENT:
>  			return SVC_DENIED;
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/9] nfsd/idmap: drop special request deferal in favour of improved default.
       [not found]     ` <20090909063254.20462.80299.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-12-02 22:18       ` J. Bruce Fields
  0 siblings, 0 replies; 25+ messages in thread
From: J. Bruce Fields @ 2009-12-02 22:18 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> The idmap code manages request deferral by waiting for a reply from
> userspace rather than putting the NFS request on a queue to be retried
> from the start.
> Now that the common deferral code does this, there is no need for the
> special code in idmap.

This patch makes me happy....

--b.

> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> 
>  fs/nfsd/nfs4idmap.c |  105 +++++----------------------------------------------
>  1 files changed, 11 insertions(+), 94 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
> index cdfa86f..22574f3 100644
> --- a/fs/nfsd/nfs4idmap.c
> +++ b/fs/nfsd/nfs4idmap.c
> @@ -497,109 +497,26 @@ nfsd_idmap_shutdown(void)
>  	cache_unregister(&nametoid_cache);
>  }
>  
> -/*
> - * Deferred request handling
> - */
> -
> -struct idmap_defer_req {
> -       struct cache_req		req;
> -       struct cache_deferred_req deferred_req;
> -       wait_queue_head_t	waitq;
> -       atomic_t			count;
> -};
> -
> -static inline void
> -put_mdr(struct idmap_defer_req *mdr)
> -{
> -	if (atomic_dec_and_test(&mdr->count))
> -		kfree(mdr);
> -}
> -
> -static inline void
> -get_mdr(struct idmap_defer_req *mdr)
> -{
> -	atomic_inc(&mdr->count);
> -}
> -
> -static void
> -idmap_revisit(struct cache_deferred_req *dreq, int toomany)
> -{
> -	struct idmap_defer_req *mdr =
> -		container_of(dreq, struct idmap_defer_req, deferred_req);
> -
> -	wake_up(&mdr->waitq);
> -	put_mdr(mdr);
> -}
> -
> -static struct cache_deferred_req *
> -idmap_defer(struct cache_req *req)
> -{
> -	struct idmap_defer_req *mdr =
> -		container_of(req, struct idmap_defer_req, req);
> -
> -	mdr->deferred_req.revisit = idmap_revisit;
> -	get_mdr(mdr);
> -	return (&mdr->deferred_req);
> -}
> -
> -static inline int
> -do_idmap_lookup(struct ent *(*lookup_fn)(struct ent *), struct ent *key,
> -		struct cache_detail *detail, struct ent **item,
> -		struct idmap_defer_req *mdr)
> -{
> -	*item = lookup_fn(key);
> -	if (!*item)
> -		return -ENOMEM;
> -	return cache_check(detail, &(*item)->h, &mdr->req);
> -}
> -
> -static inline int
> -do_idmap_lookup_nowait(struct ent *(*lookup_fn)(struct ent *),
> -			struct ent *key, struct cache_detail *detail,
> -			struct ent **item)
> -{
> -	int ret = -ENOMEM;
> -
> -	*item = lookup_fn(key);
> -	if (!*item)
> -		goto out_err;
> -	ret = -ETIMEDOUT;
> -	if (!test_bit(CACHE_VALID, &(*item)->h.flags)
> -			|| (*item)->h.expiry_time < get_seconds()
> -			|| detail->flush_time > (*item)->h.last_refresh)
> -		goto out_put;
> -	ret = -ENOENT;
> -	if (test_bit(CACHE_NEGATIVE, &(*item)->h.flags))
> -		goto out_put;
> -	return 0;
> -out_put:
> -	cache_put(&(*item)->h, detail);
> -out_err:
> -	*item = NULL;
> -	return ret;
> -}
> -
>  static int
>  idmap_lookup(struct svc_rqst *rqstp,
>  		struct ent *(*lookup_fn)(struct ent *), struct ent *key,
>  		struct cache_detail *detail, struct ent **item)
>  {
> -	struct idmap_defer_req *mdr;
>  	int ret;
>  
> -	mdr = kzalloc(sizeof(*mdr), GFP_KERNEL);
> -	if (!mdr)
> +	*item = lookup_fn(key);
> +	if (!*item)
>  		return -ENOMEM;
> -	atomic_set(&mdr->count, 1);
> -	init_waitqueue_head(&mdr->waitq);
> -	mdr->req.defer = idmap_defer;
> -	ret = do_idmap_lookup(lookup_fn, key, detail, item, mdr);
> -	if (ret == -EAGAIN) {
> -		wait_event_interruptible_timeout(mdr->waitq,
> -			test_bit(CACHE_VALID, &(*item)->h.flags), 1 * HZ);
> -		ret = do_idmap_lookup_nowait(lookup_fn, key, detail, item);
> + retry:
> +	ret = cache_check(detail, &(*item)->h, &rqstp->rq_chandle);
> +
> +	if (ret == -ETIMEDOUT) {
> +		struct ent *prev_item = *item;
> +		*item = lookup_fn(key);
> +		if (*item != prev_item)
> +			goto retry;
> +		cache_put(&(*item)->h, detail);
>  	}
> -	put_mdr(mdr);
>  	return ret;
>  }
>  
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
  2009-12-02 22:11       ` J. Bruce Fields
@ 2009-12-03 16:57         ` J. Bruce Fields
  2009-12-04  4:38           ` Neil Brown
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2009-12-03 16:57 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs

On Wed, Dec 02, 2009 at 05:11:27PM -0500, J. Bruce Fields wrote:
> On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> > @@ -793,9 +793,18 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
> >  	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
> >  
> >  	ek = svc_expkey_lookup(&key);
> > + again:
> >  	if (ek == NULL)
> >  		return ERR_PTR(-ENOMEM);
> >  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> > +	if (err == -ETIMEDOUT) {
> > +		struct svc_expkey *prev_ek = ek;
> > +		ek = svc_expkey_lookup(&key);
> > +		if (ek != prev_ek)
> > +			goto again;
> > +		if (ek)
> > +			cache_put(&ek->h, &svc_expkey_cache);
> > +	}
> 
> This is very subtle.  (We're comparing to a pointer which may not
> actually point to anything any more.)  And it's repeated in every
> caller.  Is there any simpler way to handle this?

Actually, is it even right?  After the cache_check:

> >  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);

we no longer hold a reference on ek, so it could be freed.  There's no
reason that address couldn't then be reused for something else.  It's
even possible that a new entry could be created at the same address
here.  So:

> > +	if (err == -ETIMEDOUT) {
> > +		struct svc_expkey *prev_ek = ek;
> > +		ek = svc_expkey_lookup(&key);

the "ek" that's returned might well equal prev_ek,

> > +		if (ek != prev_ek)
> > +			goto again;

but that doesn't necessarily imply that this is the same object that
used to exist at that address.  So we could still return an ek which
isn't actually a positive cache entry.

Am I missing something?

--b.

> > +		if (ek)
> > +			cache_put(&ek->h, &svc_expkey_cache);
> > +	}

> 
> --b.
> 
> >  	if (err)
> >  		return ERR_PTR(err);
> >  	return ek;
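
Spelled out as a timeline, the hazard described above is roughly this
(an annotation of the existing flow from the patch, nothing new):

	ek = svc_expkey_lookup(&key);		/* we hold a reference on ek */
	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
						/* on -ETIMEDOUT cache_check()
						 * has dropped that reference,
						 * so ek may now be freed */
	/* meanwhile another thread frees the old entry, and a brand-new
	 * svc_expkey happens to be allocated at the same address */
	prev_ek = ek;				/* stale address, kept only
						 * for the comparison below */
	ek = svc_expkey_lookup(&key);		/* may return that new entry */
	if (ek != prev_ek)			/* equal addresses do not imply
						 * the same object, so we can
						 * fall through with an entry
						 * that is not a valid,
						 * positive one */
		goto again;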

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
  2009-12-03 16:57         ` J. Bruce Fields
@ 2009-12-04  4:38           ` Neil Brown
       [not found]             ` <20091204153845.1ec83de5-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Neil Brown @ 2009-12-04  4:38 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Thu, 3 Dec 2009 11:57:29 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Wed, Dec 02, 2009 at 05:11:27PM -0500, J. Bruce Fields wrote:
> > On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> > > @@ -793,9 +793,18 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
> > >  	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
> > >  
> > >  	ek = svc_expkey_lookup(&key);
> > > + again:
> > >  	if (ek == NULL)
> > >  		return ERR_PTR(-ENOMEM);
> > >  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> > > +	if (err == -ETIMEDOUT) {
> > > +		struct svc_expkey *prev_ek = ek;
> > > +		ek = svc_expkey_lookup(&key);
> > > +		if (ek != prev_ek)
> > > +			goto again;
> > > +		if (ek)
> > > +			cache_put(&ek->h, &svc_expkey_cache);
> > > +	}
> > 
> > This is very subtle.  (We're comparing to a pointer which may not
> > actually point to anything any more.)  And it's repeated in every
> > caller.  Is there any simpler way to handle this?
> 
> Actually, is it even right?  After the cache_check:
> 
> > >  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> 
> we no longer hold a reference on ek, so it could be freed.  There's no
> reason that address couldn't then be reused for something else.  It's
> even possible that a new entry could be created at the same address
> here.  So:
> 
> > > +	if (err == -ETIMEDOUT) {
> > > +		struct svc_expkey *prev_ek = ek;
> > > +		ek = svc_expkey_lookup(&key);
> 
> the "ek" that's returned might well equal prev_ek,
> 
> > > +		if (ek != prev_ek)
> > > +			goto again;
> 
> but that doesn't necessarily imply that this is the same object that
> used to exist at that address.  So we could still return an ek which
> isn't actually a positive cache entry.
> 
> Am I missing something?

No, I don't think so.

How about this as an alternate.  I have only compile tested it, nothing more.
But if it looks good to you I'll make sure it really works.

I don't much like the way that the ipmap lookups came out, but they are a bit
awkward because the previous value is cached for improved performance.

Thanks for the review.
NeilBrown

From 6eb6129ee478951ba1f77cdef0f13cf5a7e3f2cd Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 9 Sep 2009 16:22:53 +1000
Subject: [PATCH] sunrpc/cache: retry cache lookups that return -ETIMEDOUT

If cache_check returns -ETIMEDOUT, then the cache item is not
up-to-date, but there is no pending upcall.
This could mean the data is not available, or it could mean that the
good data has been stored in a new cache item.

So re-do the lookup and if that returns a new item, proceed using that
item.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/export.c                  |   74 ++++++++++++++++++++----------------
 include/linux/sunrpc/cache.h      |    5 ++
 net/sunrpc/auth_gss/svcauth_gss.c |   50 +++++++++++++------------
 net/sunrpc/cache.c                |   30 +++++++++++++++
 net/sunrpc/svcauth_unix.c         |   47 +++++++++++++++++++----
 5 files changed, 140 insertions(+), 66 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 984a5eb..99144d5 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -273,10 +273,9 @@ static struct cache_detail svc_expkey_cache = {
 	.alloc		= expkey_alloc,
 };
 
-static struct svc_expkey *
-svc_expkey_lookup(struct svc_expkey *item)
+static int
+svc_expkey_hash(struct svc_expkey *item)
 {
-	struct cache_head *ch;
 	int hash = item->ek_fsidtype;
 	char * cp = (char*)item->ek_fsid;
 	int len = key_len(item->ek_fsidtype);
@@ -284,6 +283,14 @@ svc_expkey_lookup(struct svc_expkey *item)
 	hash ^= hash_mem(cp, len, EXPKEY_HASHBITS);
 	hash ^= hash_ptr(item->ek_client, EXPKEY_HASHBITS);
 	hash &= EXPKEY_HASHMASK;
+	return hash;
+}
+
+static struct svc_expkey *
+svc_expkey_lookup(struct svc_expkey *item)
+{
+	struct cache_head *ch;
+	int hash = svc_expkey_hash(item);
 
 	ch = sunrpc_cache_lookup(&svc_expkey_cache, &item->h,
 				 hash);
@@ -297,13 +304,7 @@ static struct svc_expkey *
 svc_expkey_update(struct svc_expkey *new, struct svc_expkey *old)
 {
 	struct cache_head *ch;
-	int hash = new->ek_fsidtype;
-	char * cp = (char*)new->ek_fsid;
-	int len = key_len(new->ek_fsidtype);
-
-	hash ^= hash_mem(cp, len, EXPKEY_HASHBITS);
-	hash ^= hash_ptr(new->ek_client, EXPKEY_HASHBITS);
-	hash &= EXPKEY_HASHMASK;
+	int hash = svc_expkey_hash(new);
 
 	ch = sunrpc_cache_update(&svc_expkey_cache, &new->h,
 				 &old->h, hash);
@@ -743,14 +744,22 @@ struct cache_detail svc_export_cache = {
 	.alloc		= svc_export_alloc,
 };
 
-static struct svc_export *
-svc_export_lookup(struct svc_export *exp)
+static int
+svc_export_hash(struct svc_export *exp)
 {
-	struct cache_head *ch;
 	int hash;
+
 	hash = hash_ptr(exp->ex_client, EXPORT_HASHBITS);
 	hash ^= hash_ptr(exp->ex_path.dentry, EXPORT_HASHBITS);
 	hash ^= hash_ptr(exp->ex_path.mnt, EXPORT_HASHBITS);
+	return hash;
+}
+
+static struct svc_export *
+svc_export_lookup(struct svc_export *exp)
+{
+	struct cache_head *ch;
+	int hash = svc_export_hash(exp);
 
 	ch = sunrpc_cache_lookup(&svc_export_cache, &exp->h,
 				 hash);
@@ -764,10 +773,7 @@ static struct svc_export *
 svc_export_update(struct svc_export *new, struct svc_export *old)
 {
 	struct cache_head *ch;
-	int hash;
-	hash = hash_ptr(old->ex_client, EXPORT_HASHBITS);
-	hash ^= hash_ptr(old->ex_path.dentry, EXPORT_HASHBITS);
-	hash ^= hash_ptr(old->ex_path.mnt, EXPORT_HASHBITS);
+	int hash = svc_export_hash(old);
 
 	ch = sunrpc_cache_update(&svc_export_cache, &new->h,
 				 &old->h,
@@ -782,8 +788,8 @@ svc_export_update(struct svc_export *new, struct svc_export *old)
 static struct svc_expkey *
 exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
 {
-	struct svc_expkey key, *ek;
-	int err;
+	struct svc_expkey key;
+	struct cache_head *h;
 	
 	if (!clp)
 		return ERR_PTR(-ENOENT);
@@ -792,13 +798,14 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
 	key.ek_fsidtype = fsid_type;
 	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
 
-	ek = svc_expkey_lookup(&key);
-	if (ek == NULL)
+	h = sunrpc_lookup_check(&svc_expkey_cache, &key.h,
+				reqp, svc_expkey_hash(&key));
+
+	if (h == NULL)
 		return ERR_PTR(-ENOMEM);
-	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
-	if (err)
-		return ERR_PTR(err);
-	return ek;
+	if (IS_ERR(h))
+		return ERR_PTR(PTR_ERR(h));
+	return container_of(h, struct svc_expkey, h);
 }
 
 static int exp_set_key(svc_client *clp, int fsid_type, u32 *fsidv,
@@ -855,8 +862,8 @@ exp_get_fsid_key(svc_client *clp, int fsid)
 static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
 				     struct cache_req *reqp)
 {
-	struct svc_export *exp, key;
-	int err;
+	struct svc_export key;
+	struct cache_head *h;
 
 	if (!clp)
 		return ERR_PTR(-ENOENT);
@@ -864,13 +871,14 @@ static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
 	key.ex_client = clp;
 	key.ex_path = *path;
 
-	exp = svc_export_lookup(&key);
-	if (exp == NULL)
+	h = sunrpc_lookup_check(&svc_export_cache, &key.h,
+				reqp, svc_export_hash(&key));
+	if (h == NULL)
 		return ERR_PTR(-ENOMEM);
-	err = cache_check(&svc_export_cache, &exp->h, reqp);
-	if (err)
-		return ERR_PTR(err);
-	return exp;
+	if (IS_ERR(h))
+		return ERR_PTR(PTR_ERR(h));
+
+	return container_of(h, struct svc_export, h);
 }
 
 /*
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index ef3db11..31d1687 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -154,6 +154,11 @@ extern struct cache_head *
 sunrpc_cache_update(struct cache_detail *detail,
 		    struct cache_head *new, struct cache_head *old, int hash);
 
+extern struct cache_head *
+sunrpc_lookup_check(struct cache_detail *detail,
+		    struct cache_head *item,
+		    struct cache_req *rqstp,
+		    int hash);
 extern int
 sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
 		void (*cache_request)(struct cache_detail *,
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index f6c51e5..ab05580 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1000,6 +1000,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	struct kvec *resv = &rqstp->rq_res.head[0];
 	struct xdr_netobj tmpobj;
 	struct rsi *rsip, rsikey;
+	struct cache_head *h;
 	int ret;
 
 	/* Read the verifier; should be NULL: */
@@ -1029,34 +1030,35 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
 	}
 
 	/* Perform upcall, or find upcall result: */
-	rsip = rsi_lookup(&rsikey);
+	h = sunrpc_lookup_check(&rsi_cache, &rsikey.h, &rqstp->rq_chandle,
+				rsi_hash(&rsikey));
 	rsi_free(&rsikey);
-	if (!rsip)
+
+	if (!h)
 		return SVC_DROP;
-	switch (cache_check(&rsi_cache, &rsip->h, &rqstp->rq_chandle)) {
-	case -EAGAIN:
-	case -ETIMEDOUT:
-	case -ENOENT:
+
+	if (IS_ERR(h))
 		/* No upcall result: */
 		return SVC_DROP;
-	case 0:
-		ret = SVC_DROP;
-		/* Got an answer to the upcall; use it: */
-		if (gss_write_init_verf(rqstp, rsip))
-			goto out;
-		if (resv->iov_len + 4 > PAGE_SIZE)
-			goto out;
-		svc_putnl(resv, RPC_SUCCESS);
-		if (svc_safe_putnetobj(resv, &rsip->out_handle))
-			goto out;
-		if (resv->iov_len + 3 * 4 > PAGE_SIZE)
-			goto out;
-		svc_putnl(resv, rsip->major_status);
-		svc_putnl(resv, rsip->minor_status);
-		svc_putnl(resv, GSS_SEQ_WIN);
-		if (svc_safe_putnetobj(resv, &rsip->out_token))
-			goto out;
-	}
+
+	rsip = container_of(h, struct rsi, h);
+	ret = SVC_DROP;
+	/* Got an answer to the upcall; use it: */
+	if (gss_write_init_verf(rqstp, rsip))
+		goto out;
+	if (resv->iov_len + 4 > PAGE_SIZE)
+		goto out;
+	svc_putnl(resv, RPC_SUCCESS);
+	if (svc_safe_putnetobj(resv, &rsip->out_handle))
+		goto out;
+	if (resv->iov_len + 3 * 4 > PAGE_SIZE)
+		goto out;
+	svc_putnl(resv, rsip->major_status);
+	svc_putnl(resv, rsip->minor_status);
+	svc_putnl(resv, GSS_SEQ_WIN);
+	if (svc_safe_putnetobj(resv, &rsip->out_token))
+		goto out;
+
 	ret = SVC_COMPLETE;
 out:
 	cache_put(&rsip->h, &rsi_cache);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 46e9e2b..fbd38d4 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -265,6 +265,36 @@ int cache_check(struct cache_detail *detail,
 }
 EXPORT_SYMBOL_GPL(cache_check);
 
+struct cache_head *sunrpc_lookup_check(struct cache_detail *detail,
+				       struct cache_head *item,
+				       struct cache_req *rqstp,
+				       int hash)
+{
+	struct cache_head *rv;
+	struct cache_head *prev = NULL;
+	int err = -ETIMEDOUT;
+
+	while (err == -ETIMEDOUT) {
+		rv = sunrpc_cache_lookup(detail, item, hash);
+		if (rv && rv == prev)
+			break;
+		if (prev)
+			cache_put(prev, detail);
+		if (rv == NULL)
+			return NULL;
+
+		prev = cache_get(rv);
+		err = cache_check(detail, rv, rqstp);
+	}
+
+
+	if (prev)
+		cache_put(prev, detail);
+	if (err)
+		return ERR_PTR(err);
+	return rv;
+}
+EXPORT_SYMBOL_GPL(sunrpc_lookup_check);
 /*
  * caches need to be periodically cleaned.
  * For this we maintain a list of cache_detail and
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 117f68a..f4e5908 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -658,18 +658,27 @@ static struct unix_gid *unix_gid_lookup(uid_t uid)
 static int unix_gid_find(uid_t uid, struct group_info **gip,
 			 struct svc_rqst *rqstp)
 {
-	struct unix_gid *ug = unix_gid_lookup(uid);
-	if (!ug)
+	struct unix_gid ugid;
+	struct cache_head *h;
+
+	ugid.uid = uid;
+	h = sunrpc_lookup_check(&unix_gid_cache, &ugid.h, 
+				&rqstp->rq_chandle, hash_long(uid, GID_HASHBITS));
+
+	if (!h)
 		return -EAGAIN;
-	switch (cache_check(&unix_gid_cache, &ug->h, &rqstp->rq_chandle)) {
-	case -ENOENT:
-		*gip = NULL;
-		return 0;
-	case 0:
+	if (!IS_ERR(h)) {
+		struct unix_gid *ug = container_of(h, struct unix_gid, h);
 		*gip = ug->gi;
 		get_group_info(*gip);
 		cache_put(&ug->h, &unix_gid_cache);
 		return 0;
+	}
+
+	switch (PTR_ERR(h)) {
+	case -ENOENT:
+		*gip = NULL;
+		return 0;
 	default:
 		return -EAGAIN;
 	}
@@ -681,6 +690,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 	struct sockaddr_in *sin;
 	struct sockaddr_in6 *sin6, sin6_storage;
 	struct ip_map *ipm;
+	int err;
 
 	switch (rqstp->rq_addr.ss_family) {
 	case AF_INET:
@@ -708,11 +718,30 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
 	if (ipm == NULL)
 		return SVC_DENIED;
 
-	switch (cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle)) {
+	err = cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle);
+
+	if (err == -ETIMEDOUT) {
+		struct ip_map ip;
+		struct cache_head *h;
+
+		strcpy(ip.m_class, rqstp->rq_server->sv_program->pg_class);
+		ipv6_addr_copy(&ip.m_addr, &sin6->sin6_addr);
+
+		h = sunrpc_lookup_check(&ip_map_cache, &ip.h,
+					&rqstp->rq_chandle, hash_ip6(sin6->sin6_addr));
+		if (h == NULL)
+			return SVC_DENIED;
+		if (IS_ERR(h))
+			err = PTR_ERR(h);
+		else {
+			err = 0;
+			ipm = container_of(h, struct ip_map, h);
+		}
+	}
+	switch(err) {
 		default:
 			BUG();
 		case -EAGAIN:
-		case -ETIMEDOUT:
 			return SVC_DROP;
 		case -ENOENT:
 			return SVC_DENIED;
-- 
1.6.5.3
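
The essential difference from the earlier per-caller retry is that
sunrpc_lookup_check() pins the previous entry across the retry, so the
pointer comparison is meaningful.  An annotated restatement of the loop
above (no new code):

	while (err == -ETIMEDOUT) {
		rv = sunrpc_cache_lookup(detail, item, hash);
		if (rv && rv == prev)
			break;		/* still the same, pinned entry: the
					 * upcall produced nothing newer */
		if (prev)
			cache_put(prev, detail);
		if (rv == NULL)
			return NULL;
		prev = cache_get(rv);	/* pin rv: even if cache_check() drops
					 * its reference, the entry cannot be
					 * freed and its address cannot be
					 * recycled before the comparison above */
		err = cache_check(detail, rv, rqstp);
	}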


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
       [not found]             ` <20091204153845.1ec83de5-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2009-12-05  1:17               ` J. Bruce Fields
  2009-12-15  6:27                 ` Neil Brown
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2009-12-05  1:17 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs

On Fri, Dec 04, 2009 at 03:38:45PM +1100, Neil Brown wrote:
> On Thu, 3 Dec 2009 11:57:29 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Wed, Dec 02, 2009 at 05:11:27PM -0500, J. Bruce Fields wrote:
> > > On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> > > > @@ -793,9 +793,18 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
> > > >  	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
> > > >  
> > > >  	ek = svc_expkey_lookup(&key);
> > > > + again:
> > > >  	if (ek == NULL)
> > > >  		return ERR_PTR(-ENOMEM);
> > > >  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> > > > +	if (err == -ETIMEDOUT) {
> > > > +		struct svc_expkey *prev_ek = ek;
> > > > +		ek = svc_expkey_lookup(&key);
> > > > +		if (ek != prev_ek)
> > > > +			goto again;
> > > > +		if (ek)
> > > > +			cache_put(&ek->h, &svc_expkey_cache);
> > > > +	}
> > > 
> > > This is very subtle.  (We're comparing to a pointer which may not
> > > actually point to anything any more.)  And it's repeated in every
> > > caller.  Is there any simpler way to handle this?
> > 
> > Actually, is it even right?  After the cache_check:
> > 
> > > >  	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> > 
> > we no longer hold a reference on ek, so it could be freed.  There's no
> > reason that address couldn't then be reused for something else.  It's
> > even possible that a new entry could be created at the same address
> > here.  So:
> > 
> > > > +	if (err == -ETIMEDOUT) {
> > > > +		struct svc_expkey *prev_ek = ek;
> > > > +		ek = svc_expkey_lookup(&key);
> > 
> > the "ek" that's returned might well equal prev_ek,
> > 
> > > > +		if (ek != prev_ek)
> > > > +			goto again;
> > 
> > but that doesn't necessarily imply that this is the same object that
> > used to exist at that address.  So we could still return an ek which
> > isn't actually a positive cache entry.
> > 
> > Am I missing something?
> 
> No, I don't think so.
> 
> How about this as an alternate.  I have only compile tested it, nothing more.
> But if it looks good to you I'll make sure it really works.

Well, without having really thought about it:

	- If this were two separate patches, I'd have an easier time
	  sorting out the interesting stuff from the trivial (though
	  nevertheless good) hash-function reshuffling.
	- Adding code to the common lookup_and_check() instead of to
	  every caller certainly seems better, but too bad about the
	  special cases that remain.
	- Something still seems odd here: we shouldn't ever have
	  duplicate cache entries with the same key, because during
	  their lifetimes cache entries are always kept in the hash.  So
	  why do we need extra code to check for that case?  I may just
	  be forgetting what we're doing here.  Should I go reread the
	  rest of the series?

--b.

> 
> I don't much like the way that the ipmap lookups came out, but they are a bit
> awkward because the previous value is cached for improved performance.
> 
> Thanks for the review.
> NeilBrown
> 
> From 6eb6129ee478951ba1f77cdef0f13cf5a7e3f2cd Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Wed, 9 Sep 2009 16:22:53 +1000
> Subject: [PATCH] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
> 
> If cache_check returns -ETIMEDOUT, then the cache item is not
> up-to-date, but there is no pending upcall.
> This could mean the data is not available, or it could mean that the
> good data has been stored in a new cache item.
> 
> So re-do the lookup and if that returns a new item, proceed using that
> item.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/export.c                  |   74 ++++++++++++++++++++----------------
>  include/linux/sunrpc/cache.h      |    5 ++
>  net/sunrpc/auth_gss/svcauth_gss.c |   50 +++++++++++++------------
>  net/sunrpc/cache.c                |   30 +++++++++++++++
>  net/sunrpc/svcauth_unix.c         |   47 +++++++++++++++++++----
>  5 files changed, 140 insertions(+), 66 deletions(-)
> 
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index 984a5eb..99144d5 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -273,10 +273,9 @@ static struct cache_detail svc_expkey_cache = {
>  	.alloc		= expkey_alloc,
>  };
>  
> -static struct svc_expkey *
> -svc_expkey_lookup(struct svc_expkey *item)
> +static int
> +svc_expkey_hash(struct svc_expkey *item)
>  {
> -	struct cache_head *ch;
>  	int hash = item->ek_fsidtype;
>  	char * cp = (char*)item->ek_fsid;
>  	int len = key_len(item->ek_fsidtype);
> @@ -284,6 +283,14 @@ svc_expkey_lookup(struct svc_expkey *item)
>  	hash ^= hash_mem(cp, len, EXPKEY_HASHBITS);
>  	hash ^= hash_ptr(item->ek_client, EXPKEY_HASHBITS);
>  	hash &= EXPKEY_HASHMASK;
> +	return hash;
> +}
> +
> +static struct svc_expkey *
> +svc_expkey_lookup(struct svc_expkey *item)
> +{
> +	struct cache_head *ch;
> +	int hash = svc_expkey_hash(item);
>  
>  	ch = sunrpc_cache_lookup(&svc_expkey_cache, &item->h,
>  				 hash);
> @@ -297,13 +304,7 @@ static struct svc_expkey *
>  svc_expkey_update(struct svc_expkey *new, struct svc_expkey *old)
>  {
>  	struct cache_head *ch;
> -	int hash = new->ek_fsidtype;
> -	char * cp = (char*)new->ek_fsid;
> -	int len = key_len(new->ek_fsidtype);
> -
> -	hash ^= hash_mem(cp, len, EXPKEY_HASHBITS);
> -	hash ^= hash_ptr(new->ek_client, EXPKEY_HASHBITS);
> -	hash &= EXPKEY_HASHMASK;
> +	int hash = svc_expkey_hash(new);
>  
>  	ch = sunrpc_cache_update(&svc_expkey_cache, &new->h,
>  				 &old->h, hash);
> @@ -743,14 +744,22 @@ struct cache_detail svc_export_cache = {
>  	.alloc		= svc_export_alloc,
>  };
>  
> -static struct svc_export *
> -svc_export_lookup(struct svc_export *exp)
> +static int
> +svc_export_hash(struct svc_export *exp)
>  {
> -	struct cache_head *ch;
>  	int hash;
> +
>  	hash = hash_ptr(exp->ex_client, EXPORT_HASHBITS);
>  	hash ^= hash_ptr(exp->ex_path.dentry, EXPORT_HASHBITS);
>  	hash ^= hash_ptr(exp->ex_path.mnt, EXPORT_HASHBITS);
> +	return hash;
> +}
> +
> +static struct svc_export *
> +svc_export_lookup(struct svc_export *exp)
> +{
> +	struct cache_head *ch;
> +	int hash = svc_export_hash(exp);
>  
>  	ch = sunrpc_cache_lookup(&svc_export_cache, &exp->h,
>  				 hash);
> @@ -764,10 +773,7 @@ static struct svc_export *
>  svc_export_update(struct svc_export *new, struct svc_export *old)
>  {
>  	struct cache_head *ch;
> -	int hash;
> -	hash = hash_ptr(old->ex_client, EXPORT_HASHBITS);
> -	hash ^= hash_ptr(old->ex_path.dentry, EXPORT_HASHBITS);
> -	hash ^= hash_ptr(old->ex_path.mnt, EXPORT_HASHBITS);
> +	int hash = svc_export_hash(old);
>  
>  	ch = sunrpc_cache_update(&svc_export_cache, &new->h,
>  				 &old->h,
> @@ -782,8 +788,8 @@ svc_export_update(struct svc_export *new, struct svc_export *old)
>  static struct svc_expkey *
>  exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
>  {
> -	struct svc_expkey key, *ek;
> -	int err;
> +	struct svc_expkey key;
> +	struct cache_head *h;
>  	
>  	if (!clp)
>  		return ERR_PTR(-ENOENT);
> @@ -792,13 +798,14 @@ exp_find_key(svc_client *clp, int fsid_type, u32 *fsidv, struct cache_req *reqp)
>  	key.ek_fsidtype = fsid_type;
>  	memcpy(key.ek_fsid, fsidv, key_len(fsid_type));
>  
> -	ek = svc_expkey_lookup(&key);
> -	if (ek == NULL)
> +	h = sunrpc_lookup_check(&svc_expkey_cache, &key.h,
> +				reqp, svc_expkey_hash(&key));
> +
> +	if (h == NULL)
>  		return ERR_PTR(-ENOMEM);
> -	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
> -	if (err)
> -		return ERR_PTR(err);
> -	return ek;
> +	if (IS_ERR(h))
> +		return ERR_PTR(PTR_ERR(h));
> +	return container_of(h, struct svc_expkey, h);
>  }
>  
>  static int exp_set_key(svc_client *clp, int fsid_type, u32 *fsidv,
> @@ -855,8 +862,8 @@ exp_get_fsid_key(svc_client *clp, int fsid)
>  static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
>  				     struct cache_req *reqp)
>  {
> -	struct svc_export *exp, key;
> -	int err;
> +	struct svc_export key;
> +	struct cache_head *h;
>  
>  	if (!clp)
>  		return ERR_PTR(-ENOENT);
> @@ -864,13 +871,14 @@ static svc_export *exp_get_by_name(svc_client *clp, const struct path *path,
>  	key.ex_client = clp;
>  	key.ex_path = *path;
>  
> -	exp = svc_export_lookup(&key);
> -	if (exp == NULL)
> +	h = sunrpc_lookup_check(&svc_export_cache, &key.h,
> +				reqp, svc_export_hash(&key));
> +	if (h == NULL)
>  		return ERR_PTR(-ENOMEM);
> -	err = cache_check(&svc_export_cache, &exp->h, reqp);
> -	if (err)
> -		return ERR_PTR(err);
> -	return exp;
> +	if (IS_ERR(h))
> +		return ERR_PTR(PTR_ERR(h));
> +
> +	return container_of(h, struct svc_export, h);
>  }
>  
>  /*
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index ef3db11..31d1687 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -154,6 +154,11 @@ extern struct cache_head *
>  sunrpc_cache_update(struct cache_detail *detail,
>  		    struct cache_head *new, struct cache_head *old, int hash);
>  
> +extern struct cache_head *
> +sunrpc_lookup_check(struct cache_detail *detail,
> +		    struct cache_head *item,
> +		    struct cache_req *rqstp,
> +		    int hash);
>  extern int
>  sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
>  		void (*cache_request)(struct cache_detail *,
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index f6c51e5..ab05580 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1000,6 +1000,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
>  	struct kvec *resv = &rqstp->rq_res.head[0];
>  	struct xdr_netobj tmpobj;
>  	struct rsi *rsip, rsikey;
> +	struct cache_head *h;
>  	int ret;
>  
>  	/* Read the verifier; should be NULL: */
> @@ -1029,34 +1030,35 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
>  	}
>  
>  	/* Perform upcall, or find upcall result: */
> -	rsip = rsi_lookup(&rsikey);
> +	h = sunrpc_lookup_check(&rsi_cache, &rsikey.h, &rqstp->rq_chandle,
> +				rsi_hash(&rsikey));
>  	rsi_free(&rsikey);
> -	if (!rsip)
> +
> +	if (!h)
>  		return SVC_DROP;
> -	switch (cache_check(&rsi_cache, &rsip->h, &rqstp->rq_chandle)) {
> -	case -EAGAIN:
> -	case -ETIMEDOUT:
> -	case -ENOENT:
> +
> +	if (IS_ERR(h))
>  		/* No upcall result: */
>  		return SVC_DROP;
> -	case 0:
> -		ret = SVC_DROP;
> -		/* Got an answer to the upcall; use it: */
> -		if (gss_write_init_verf(rqstp, rsip))
> -			goto out;
> -		if (resv->iov_len + 4 > PAGE_SIZE)
> -			goto out;
> -		svc_putnl(resv, RPC_SUCCESS);
> -		if (svc_safe_putnetobj(resv, &rsip->out_handle))
> -			goto out;
> -		if (resv->iov_len + 3 * 4 > PAGE_SIZE)
> -			goto out;
> -		svc_putnl(resv, rsip->major_status);
> -		svc_putnl(resv, rsip->minor_status);
> -		svc_putnl(resv, GSS_SEQ_WIN);
> -		if (svc_safe_putnetobj(resv, &rsip->out_token))
> -			goto out;
> -	}
> +
> +	rsip = container_of(h, struct rsi, h);
> +	ret = SVC_DROP;
> +	/* Got an answer to the upcall; use it: */
> +	if (gss_write_init_verf(rqstp, rsip))
> +		goto out;
> +	if (resv->iov_len + 4 > PAGE_SIZE)
> +		goto out;
> +	svc_putnl(resv, RPC_SUCCESS);
> +	if (svc_safe_putnetobj(resv, &rsip->out_handle))
> +		goto out;
> +	if (resv->iov_len + 3 * 4 > PAGE_SIZE)
> +		goto out;
> +	svc_putnl(resv, rsip->major_status);
> +	svc_putnl(resv, rsip->minor_status);
> +	svc_putnl(resv, GSS_SEQ_WIN);
> +	if (svc_safe_putnetobj(resv, &rsip->out_token))
> +		goto out;
> +
>  	ret = SVC_COMPLETE;
>  out:
>  	cache_put(&rsip->h, &rsi_cache);
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 46e9e2b..fbd38d4 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -265,6 +265,36 @@ int cache_check(struct cache_detail *detail,
>  }
>  EXPORT_SYMBOL_GPL(cache_check);
>  
> +struct cache_head *sunrpc_lookup_check(struct cache_detail *detail,
> +				       struct cache_head *item,
> +				       struct cache_req *rqstp,
> +				       int hash)
> +{
> +	struct cache_head *rv;
> +	struct cache_head *prev = NULL;
> +	int err = -ETIMEDOUT;
> +
> +	while (err == -ETIMEDOUT) {
> +		rv = sunrpc_cache_lookup(detail, item, hash);
> +		if (rv && rv == prev)
> +			break;
> +		if (prev)
> +			cache_put(prev, detail);
> +		if (rv == NULL)
> +			return NULL;
> +
> +		prev = cache_get(rv);
> +		err = cache_check(detail, rv, rqstp);
> +	}
> +
> +
> +	if (prev)
> +		cache_put(prev, detail);
> +	if (err)
> +		return PTR_ERR(err);
> +	return rv;
> +}
> +EXPORT_SYMBOL_GPL(sunrpc_lookup_check);
>  /*
>   * caches need to be periodically cleaned.
>   * For this we maintain a list of cache_detail and
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index 117f68a..f4e5908 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -658,18 +658,27 @@ static struct unix_gid *unix_gid_lookup(uid_t uid)
>  static int unix_gid_find(uid_t uid, struct group_info **gip,
>  			 struct svc_rqst *rqstp)
>  {
> -	struct unix_gid *ug = unix_gid_lookup(uid);
> -	if (!ug)
> +	struct unix_gid ugid;
> +	struct cache_head *h;
> +
> +	ugid.uid = uid;
> +	h = sunrpc_lookup_check(&unix_gid_cache, &ugid.h, 
> +				&rqstp->rq_chandle, hash_long(uid, GID_HASHBITS));
> +
> +	if (!h)
>  		return -EAGAIN;
> -	switch (cache_check(&unix_gid_cache, &ug->h, &rqstp->rq_chandle)) {
> -	case -ENOENT:
> -		*gip = NULL;
> -		return 0;
> -	case 0:
> +	if (!IS_ERR(h)) {
> +		struct unix_gid *ug = container_of(h, struct unix_gid, h);
>  		*gip = ug->gi;
>  		get_group_info(*gip);
>  		cache_put(&ug->h, &unix_gid_cache);
>  		return 0;
> +	}
> +
> +	switch (PTR_ERR(h)) {
> +	case -ENOENT:
> +		*gip = NULL;
> +		return 0;
>  	default:
>  		return -EAGAIN;
>  	}
> @@ -681,6 +690,7 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
>  	struct sockaddr_in *sin;
>  	struct sockaddr_in6 *sin6, sin6_storage;
>  	struct ip_map *ipm;
> +	int err;
>  
>  	switch (rqstp->rq_addr.ss_family) {
>  	case AF_INET:
> @@ -708,11 +718,30 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
>  	if (ipm == NULL)
>  		return SVC_DENIED;
>  
> -	switch (cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle)) {
> +	err = cache_check(&ip_map_cache, &ipm->h, &rqstp->rq_chandle);
> +
> +	if (err == -ETIMEDOUT) {
> +		struct ip_map ip;
> +		struct cache_head *h;
> +
> +		strcpy(ip.m_class, rqstp->rq_server->sv_program->pg_class);
> +		ipv6_addr_copy(&ip.m_addr, &sin6->sin6_addr);
> +
> +		h = sunrpc_lookup_check(&ip_map_cache, &ip.h,
> +					&rqstp->rq_chandle, hash_ip6(sin6->sin6_addr));
> +		if (h == NULL)
> +			return SVC_DENIED;
> +		if (IS_ERR(h))
> +			err = PTR_ERR(h);
> +		else {
> +			err = 0;
> +			ipm = container_of(h, struct ip_map, h);
> +		}
> +	}
> +	switch(err) {
>  		default:
>  			BUG();
>  		case -EAGAIN:
> -		case -ETIMEDOUT:
>  			return SVC_DROP;
>  		case -ENOENT:
>  			return SVC_DENIED;
> -- 
> 1.6.5.3
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
  2009-12-05  1:17               ` J. Bruce Fields
@ 2009-12-15  6:27                 ` Neil Brown
       [not found]                   ` <20091215172729.5e1d0190-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Neil Brown @ 2009-12-15  6:27 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Fri, 4 Dec 2009 20:17:42 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> > 
> > How about this as an alternate.  I have only compile tested it, nothing more.
> > But if it looks good to you I'll make sure it really works.  
> 
> Well, without having really thought about it:
> 
> 	- If this were two separate patches, I'd have an easier time
> 	  sorting out the interesting stuff from the trivial (though
> 	  nevertheless good) hash-function reshuffling.

I'll see what I can come up with...

> 	- Adding code to the common lookup_and_check() instead of to
> 	  every caller certainly seems better, but too bad about the
> 	  special cases that remain.

yeah.... I could possibly add a pass-by-reference to lookup_and_check
which points to a possible cached value, but that would have
only one user, so the special case would be moved elsewhere...
??

> 	- Something still seems odd here: we shouldn't ever have
> 	  duplicate cache entries with the same key, because during
> 	  their lifetimes cache entries are always kept in the hash.  So
> 	  why do we need extra code to check for that case?  I may just
> 	  be forgetting what we're doing here.  Should I go reread the
> 	  rest of the series?

When sunrpc_update_cache is called to update an item that is
already valid, it unhashes that item and creates a new one.
(The unhashed item disappears once all the refcounts go).
So if we wait for user-space to update an entry for us, we
might find out that it has been unhashed, so we need to find
the new one.

NeilBrown
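
In code terms the sequence being described is roughly this (E1 and E2 are
just labels for two cache entries; the comments restate the behaviour of
sunrpc_cache_update() rather than quoting it):

	ek = svc_expkey_lookup(&key);		/* returns entry E1, hashed */
	err = cache_check(&svc_expkey_cache, &ek->h, reqp);
	/* userspace answers the upcall; because E1 is already CACHE_VALID,
	 * sunrpc_cache_update() allocates a replacement E2 with the new
	 * contents, hashes E2 and unhashes E1.  E1 lingers only until its
	 * last reference is dropped. */
	ek = svc_expkey_lookup(&key);		/* now returns E2, not E1 */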

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
       [not found]                   ` <20091215172729.5e1d0190-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2010-02-01 17:11                     ` J. Bruce Fields
  2010-02-02 21:33                       ` Neil Brown
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2010-02-01 17:11 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-nfs

On Tue, Dec 15, 2009 at 05:27:29PM +1100, Neil Brown wrote:
> On Fri, 4 Dec 2009 20:17:42 -0500
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > > 
> > > How about this as an alternate.  I have only compile tested it, nothing more.
> > > But if it looks good to you I'll make sure it really works.  
> > 
> > Well, without having really thought about it:
> > 
> > 	- If this were two separate patches, I'd have an easier time
> > 	  sorting out the interesting stuff from the trivial (though
> > 	  nevertheless good) hash-function reshuffling.
> 
> I'll see what I can come up with...

Have you had a chance to get back to this?

> 
> > 	- Adding code to the common lookup_and_check() instead of to
> > 	  every caller certainly seems better, but too bad about the
> > 	  special cases that remain.
> 
> yeah.... I could possibly add a pass-by-reference to lookup_and_check
> which points to a possible cached value, but that would have
> only one user, so the special case would be moved elsewhere...
> ??

Yeah, that doesn't sound so great.

> > 	- Something still seems odd here: we shouldn't ever have
> > 	  duplicate cache entries with the same key, because during
> > 	  their lifetimes cache entries are always kept in the hash.  So
> > 	  why do we need extra code to check for that case?  I may just
> > 	  be forgetting what we're doing here.  Should I go reread the
> > 	  rest of the series?
> 
> When sunrpc_update_cache is called to update an item that is
> already valid, it unhashes that item and creates a new one.
> (The unhashed item disappears once all the refcounts go).
> So if we wait for user-space to update an entry for us, we
> might find out that it has been unhashed, so we need to find
> the new one.

But nobody ever waits on a valid entry, right?  So isn't the only case
we care about the invalid case?  I'll admit I haven't thought this
through.

--b.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT
  2010-02-01 17:11                     ` J. Bruce Fields
@ 2010-02-02 21:33                       ` Neil Brown
  0 siblings, 0 replies; 25+ messages in thread
From: Neil Brown @ 2010-02-02 21:33 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs

On Mon, 1 Feb 2010 12:11:48 -0500
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Dec 15, 2009 at 05:27:29PM +1100, Neil Brown wrote:
> > On Fri, 4 Dec 2009 20:17:42 -0500
> > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > 
> > > > 
> > > > How about this as an alternate.  I have only compile tested it, nothing more.
> > > > But if it looks good to you I'll make sure it really works.  
> > > 
> > > Well, without having really thought about it:
> > > 
> > > 	- If this were two separate patches, I'd have an easier time
> > > 	  sorting out the interesting stuff from the trivial (though
> > > 	  nevertheless good) hash-function reshuffling.
> > 
> > I'll see what I can come up with...
> 
> Have you had a chance to get back to this?

It looks like I did the initial separation into three patches, then promptly
went on holiday and forgot about it...
I'll get back to it.

> > > 	- Something still seems odd here: we shouldn't ever have
> > > 	  duplicate cache entries with the same key, because during
> > > 	  their lifetimes cache entries are always kept in the hash.  So
> > > 	  why do we need extra code to check for that case?  I may just
> > > 	  be forgetting what we're doing here.  Should I go reread the
> > > 	  rest of the series?
> > 
> > When sunrpc_update_cache is called to update an item that is
> > already valid, it unhashes that item and creates a new one.
> > (The unhashed item disappears once all the refcounts go).
> > So if we wait for user-space to update an entry for us, we
> > might find out that it has been unhashed, so we need to find
> > the new one.
> 
> But nobody ever waits on a valid entry, right?  So isn't the only case
> we care about the invalid case?  I'll admit I haven't thought this
> through.

We might wait on a valid entry that has expired (->expiry_time has passed).
However, we could possibly unlink such entries whenever we find them, then
repeat the lookup, thus creating a new non-CACHE_VALID entry, and wait on
that...
I'll see if I can make that work.

NeilBrown
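
One possible shape of that idea, inside the hash-chain walk of
sunrpc_cache_lookup() (purely a sketch: hp and tmp follow the existing
walk, freeme is a local assumed for deferring the final cache_put() until
the hash lock is dropped, and locking details are elided):

	if (detail->match(tmp, key) &&
	    test_bit(CACHE_VALID, &tmp->flags) &&
	    tmp->expiry_time < get_seconds()) {
		/* the matching entry has expired: unhash it rather than
		 * return it */
		*hp = tmp->next;
		tmp->next = NULL;
		detail->entries--;
		freeme = tmp;	/* cache_put() once the lock is dropped */
		break;		/* the lookup then falls through to create a
				 * fresh, not-yet-valid entry that callers
				 * can wait on */
	}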

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2010-02-02 21:33 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-09  6:32 [PATCH 0/9] Some improvements to request deferral and related code NeilBrown
     [not found] ` <20090909062539.20462.67466.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-09-09  6:32   ` [PATCH 1/9] sunrpc/cache: change cache_defer_req to return -ve error, not boolean NeilBrown
     [not found]     ` <20090909063254.20462.57204.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-09-11 21:03       ` J. Bruce Fields
2009-09-09  6:32   ` [PATCH 4/9] sunrpc/cache: avoid variable over-loading in cache_defer_req NeilBrown
     [not found]     ` <20090909063254.20462.68582.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-09-18 21:24       ` J. Bruce Fields
2009-09-09  6:32   ` [PATCH 6/9] sunrpc/cache: retry cache lookups that return -ETIMEDOUT NeilBrown
     [not found]     ` <20090909063254.20462.41616.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-12-02 22:11       ` J. Bruce Fields
2009-12-03 16:57         ` J. Bruce Fields
2009-12-04  4:38           ` Neil Brown
     [not found]             ` <20091204153845.1ec83de5-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-12-05  1:17               ` J. Bruce Fields
2009-12-15  6:27                 ` Neil Brown
     [not found]                   ` <20091215172729.5e1d0190-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2010-02-01 17:11                     ` J. Bruce Fields
2010-02-02 21:33                       ` Neil Brown
2009-09-09  6:32   ` [PATCH 3/9] sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req NeilBrown
     [not found]     ` <20090909063254.20462.7969.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-09-18 15:48       ` J. Bruce Fields
2009-09-09  6:32   ` [PATCH 2/9] sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked NeilBrown
2009-09-09  6:32   ` [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update NeilBrown
     [not found]     ` <20090909063254.20462.99277.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-12-02 20:59       ` J. Bruce Fields
2009-12-02 21:23         ` Trond Myklebust
2009-12-02 21:50           ` Trond Myklebust
2009-09-09  6:32   ` [PATCH 7/9] nfsd/idmap: drop special request deferal in favour of improved default NeilBrown
     [not found]     ` <20090909063254.20462.80299.stgit-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2009-12-02 22:18       ` J. Bruce Fields
2009-09-09  6:32   ` [PATCH 9/9] sunrpc: close connection when a request is irretrievably lost NeilBrown
2009-09-09  6:32   ` [PATCH 8/9] sunrpc/cache: change deferred-request hash table to use hlist NeilBrown
2009-09-11 21:07   ` [PATCH 0/9] Some improvements to request deferral and related code J. Bruce Fields
