* [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
@ 2010-12-29 20:47 J. Bruce Fields
2010-12-29 20:59 ` J. Bruce Fields
2011-01-03 22:26 ` J. Bruce Fields
0 siblings, 2 replies; 20+ messages in thread
From: J. Bruce Fields @ 2010-12-29 20:47 UTC (permalink / raw)
To: linux-nfs; +Cc: Neil Brown
From: J. Bruce Fields <bfields@redhat.com>
Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
(and allowing any concurrent users to destroy it on last put) instead of
trying to update it in place.
Otherwise someone referencing the ip_map we're modifying here could try
to use the m_client just as we're putting the last reference.
The bug should only be seen by users of the legacy nfsd interfaces.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++--
1 files changed, 16 insertions(+), 2 deletions(-)
Intended to apply for 2.6.38 if this looks right....
--b.
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index a04ac91..70586ad 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -386,6 +386,21 @@ int auth_unix_forget_old(struct auth_domain *dom)
}
EXPORT_SYMBOL_GPL(auth_unix_forget_old);
+static void auth_unix_invalidate_ip_map(struct cache_detail *cd, struct ip_map *ipm)
+{
+ struct cache_head *ch;
+ struct ip_map ip;
+
+ ip.m_client = ipm->m_client;
+ ip.h.flags = CACHE_NEGATIVE;
+ ip.h.expiry_time = ipm->h.expiry_time;
+ ch = sunrpc_cache_update(cd, &ip.h, &ipm->h,
+ hash_str(ipm->m_class, IP_HASHBITS) ^
+ hash_ip6(ipm->m_addr));
+ if (ch)
+ cache_put(ch, cd);
+}
+
struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
{
struct ip_map *ipm;
@@ -401,8 +416,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
return NULL;
if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) {
- if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0)
- auth_domain_put(&ipm->m_client->h);
+ auth_unix_invalidate_ip_map(sn->ip_map_cache, ipm);
rv = NULL;
} else {
rv = &ipm->m_client->h;
--
1.7.3.4
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2010-12-29 20:47 [PATCH] svcrpc: modifying positive sunrpc cache entries is racy J. Bruce Fields
@ 2010-12-29 20:59 ` J. Bruce Fields
2010-12-30 1:19 ` Neil Brown
2011-01-03 22:26 ` J. Bruce Fields
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-12-29 20:59 UTC (permalink / raw)
To: linux-nfs; +Cc: Neil Brown
On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> From: J. Bruce Fields <bfields@redhat.com>
>
> Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> (and allowing any concurrent users to destroy it on last put) instead of
> trying to update it in place.
>
> Otherwise someone referencing the ip_map we're modifying here could try
> to use the m_client just as we're putting the last reference.
>
> The bug should only be seen by users of the legacy nfsd interfaces.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
> net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++--
> 1 files changed, 16 insertions(+), 2 deletions(-)
>
> Intended to apply for 2.6.38 if this looks right....
Also noticed while trying to track down an rhel5 oops in
svcauth_unix_set_client():
- cache_check() can set an entry negative in place, which if
nothing else must cause a leak in some cases. (Because when
the entry is eventually destroyed, it will be assumed to not
have any contents.) I suppose the fix is again to try to
adding a new negative entry instead.
- since cache_check() doesn't use any locking, I can't see what
guarantees that when it sees the CACHE_VALID bit set and
CACHE_NEGATIVE cleared, it must necessarily see the new
contents. I think that'd be fixed by a wmb() before setting
those bits and a rmb() after checking them. I don't know if
it's actually possible to hit that bug....
--b.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2010-12-29 20:59 ` J. Bruce Fields
@ 2010-12-30 1:19 ` Neil Brown
2010-12-30 1:57 ` J. Bruce Fields
0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-12-30 1:19 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> > From: J. Bruce Fields <bfields@redhat.com>
> >
> > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> > (and allowing any concurrent users to destroy it on last put) instead of
> > trying to update it in place.
> >
> > Otherwise someone referencing the ip_map we're modifying here could try
> > to use the m_client just as we're putting the last reference.
> >
> > The bug should only be seen by users of the legacy nfsd interfaces.
> >
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> > net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++--
> > 1 files changed, 16 insertions(+), 2 deletions(-)
> >
> > Intended to apply for 2.6.38 if this looks right....
>
> Also noticed while trying to track down an rhel5 oops in
> svcauth_unix_set_client():
>
> - cache_check() can set an entry negative in place, which if
> nothing else must cause a leak in some cases. (Because when
> the entry is eventually destroyed, it will be assumed to not
> have any contents.) I suppose the fix is again to try to
> adding a new negative entry instead.
cache_check should only set an entry 'negative' if it is not already valid
(rv == -EAGAIN) and there is no up-call pending.
Maybe we should check CACHE_VALID again after the test_and_set of
CACHE_PENDING, but is a very unlikely race (if it is actually a race at all)
>
> - since cache_check() doesn't use any locking, I can't see what
> guarantees that when it sees the CACHE_VALID bit set and
> CACHE_NEGATIVE cleared, it must necessarily see the new
> contents. I think that'd be fixed by a wmb() before setting
> those bits and a rmb() after checking them. I don't know if
> it's actually possible to hit that bug....
Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't
think that exists, so we could use test_and_set_bit_locked() instead.
But it does feel like maybe we should add some locking to cache_check.
Take the lock at the the start, and release it after the
test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ???
I think when I wrote this I might have thought that bit ops implied memory
ordering ... or maybe I just didn't think through the issues properly at all.
Thanks,
NeilBrown
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2010-12-30 1:19 ` Neil Brown
@ 2010-12-30 1:57 ` J. Bruce Fields
2011-01-03 20:55 ` J. Bruce Fields
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2010-12-30 1:57 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-nfs
On Thu, Dec 30, 2010 at 12:19:40PM +1100, Neil Brown wrote:
> On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
>
> > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> > > From: J. Bruce Fields <bfields@redhat.com>
> > >
> > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> > > (and allowing any concurrent users to destroy it on last put) instead of
> > > trying to update it in place.
> > >
> > > Otherwise someone referencing the ip_map we're modifying here could try
> > > to use the m_client just as we're putting the last reference.
> > >
> > > The bug should only be seen by users of the legacy nfsd interfaces.
> > >
> > > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > > ---
> > > net/sunrpc/svcauth_unix.c | 18 ++++++++++++++++--
> > > 1 files changed, 16 insertions(+), 2 deletions(-)
> > >
> > > Intended to apply for 2.6.38 if this looks right....
> >
> > Also noticed while trying to track down an rhel5 oops in
> > svcauth_unix_set_client():
> >
> > - cache_check() can set an entry negative in place, which if
> > nothing else must cause a leak in some cases. (Because when
> > the entry is eventually destroyed, it will be assumed to not
> > have any contents.) I suppose the fix is again to try to
> > adding a new negative entry instead.
>
> cache_check should only set an entry 'negative' if it is not already valid
> (rv == -EAGAIN) and there is no up-call pending.
I don't think anything keeps VALID from being set after the
cache_is_valid check but before the code that does the
set_bit(CACHE_NEGATIVE).
> Maybe we should check CACHE_VALID again after the test_and_set of
> CACHE_PENDING, but is a very unlikely race (if it is actually a race at all)
>
> >
> > - since cache_check() doesn't use any locking, I can't see what
> > guarantees that when it sees the CACHE_VALID bit set and
> > CACHE_NEGATIVE cleared, it must necessarily see the new
> > contents. I think that'd be fixed by a wmb() before setting
> > those bits and a rmb() after checking them. I don't know if
> > it's actually possible to hit that bug....
>
> Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't
> think that exists, so we could use test_and_set_bit_locked() instead.
>
> But it does feel like maybe we should add some locking to cache_check.
> Take the lock at the the start, and release it after the
> test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ???
Maybe so.
--b.
>
> I think when I wrote this I might have thought that bit ops implied memory
> ordering ... or maybe I just didn't think through the issues properly at all.
>
> Thanks,
> NeilBrown
>
>
> >
> > --b.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2010-12-30 1:57 ` J. Bruce Fields
@ 2011-01-03 20:55 ` J. Bruce Fields
2011-01-04 5:01 ` NeilBrown
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-03 20:55 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-nfs
On Wed, Dec 29, 2010 at 08:57:19PM -0500, J. Bruce Fields wrote:
> On Thu, Dec 30, 2010 at 12:19:40PM +1100, Neil Brown wrote:
> > On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> > > Also noticed while trying to track down an rhel5 oops in
> > > svcauth_unix_set_client():
> > >
> > > - cache_check() can set an entry negative in place, which if
> > > nothing else must cause a leak in some cases. (Because when
> > > the entry is eventually destroyed, it will be assumed to not
> > > have any contents.) I suppose the fix is again to try to
> > > adding a new negative entry instead.
> >
> > cache_check should only set an entry 'negative' if it is not already valid
> > (rv == -EAGAIN) and there is no up-call pending.
>
> I don't think anything keeps VALID from being set after the
> cache_is_valid check but before the code that does the
> set_bit(CACHE_NEGATIVE).
>
> > Maybe we should check CACHE_VALID again after the test_and_set of
> > CACHE_PENDING, but is a very unlikely race (if it is actually a race at all)
> >
> > >
> > > - since cache_check() doesn't use any locking, I can't see what
> > > guarantees that when it sees the CACHE_VALID bit set and
> > > CACHE_NEGATIVE cleared, it must necessarily see the new
> > > contents. I think that'd be fixed by a wmb() before setting
> > > those bits and a rmb() after checking them. I don't know if
> > > it's actually possible to hit that bug....
> >
> > Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't
> > think that exists, so we could use test_and_set_bit_locked() instead.
> >
> > But it does feel like maybe we should add some locking to cache_check.
> > Take the lock at the the start, and release it after the
> > test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ???
>
> Maybe so.
Here's one attempt.
--b.
commit 55563023f85d01698ccf72325c87e3a7039a189b
Author: J. Bruce Fields <bfields@redhat.com>
Date: Mon Jan 3 15:10:27 2011 -0500
svcrpc: take locks to fix cache_check races
There are at least a couple races in cache_check:
- We attempt to turn a cache entry negative in place. But that
entry may already have been filled in by some other task since
we last checked whether it was valid, so we could be modifying
an already-valid entry. If nothing else there's a likely leak
in such a case when the entry is eventually put() and contents
are not freed because it has CACHE_NEGATIVE set.
- If cache_check races with an update that is turning the entry
CACHE_VALID, then it's possible that the CACHE_VALID bit could
become visible on this CPU before the actual contents do, so
we could tell the caller this entry is ready to use when in
fact the caller could still get invalid contents.
Some memory barriers might be sufficient to fix at least the latter; but
for now let's keep things simple and take the hash_lock when we turn an
entry negative or check the CACHE_VALID bit.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 0d6002f..2105b40 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -200,8 +200,9 @@ static int cache_make_upcall(struct cache_detail *cd, struct cache_head *h)
return cd->cache_upcall(cd, h);
}
-static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
+static int __cache_is_valid(struct cache_detail *detail, struct cache_head *h)
{
+
if (!test_bit(CACHE_VALID, &h->flags))
return -EAGAIN;
else {
@@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head
}
}
+static int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
+{
+ int rv;
+
+ read_lock(&detail->hash_lock);
+ rv = __cache_is_valid(detail, h);
+ read_unlock(&detail->hash_lock);
+ return rv;
+}
+
+static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h)
+{
+ int rv;
+
+ write_lock(&detail->hash_lock);
+ rv = __cache_is_valid(detail, h);
+ if (rv != -EAGAIN) {
+ write_unlock(&detail->hash_lock);
+ return rv;
+ }
+ set_bit(CACHE_NEGATIVE, &h->flags);
+ cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
+ write_unlock(&detail->hash_lock);
+ cache_fresh_unlocked(h, detail);
+ return -ENOENT;
+}
+
/*
* This is the generic cache management routine for all
* the authentication caches.
@@ -251,14 +279,8 @@ int cache_check(struct cache_detail *detail,
case -EINVAL:
clear_bit(CACHE_PENDING, &h->flags);
cache_revisit_request(h);
- if (rv == -EAGAIN) {
- set_bit(CACHE_NEGATIVE, &h->flags);
- cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
- cache_fresh_unlocked(h, detail);
- rv = -ENOENT;
- }
+ rv = try_to_negate_entry(detail, h);
break;
-
case -EAGAIN:
clear_bit(CACHE_PENDING, &h->flags);
cache_revisit_request(h);
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2010-12-29 20:47 [PATCH] svcrpc: modifying positive sunrpc cache entries is racy J. Bruce Fields
2010-12-29 20:59 ` J. Bruce Fields
@ 2011-01-03 22:26 ` J. Bruce Fields
2011-01-04 3:08 ` J. Bruce Fields
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-03 22:26 UTC (permalink / raw)
To: linux-nfs; +Cc: Neil Brown
On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> From: J. Bruce Fields <bfields@redhat.com>
>
> Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> (and allowing any concurrent users to destroy it on last put) instead of
> trying to update it in place.
Or the following seems simpler.
(And I was thinking it was necessary to ensure that the right thing
happened to the cached xprt->xpt_auth_cache entry--though on a second
look I see that sunrpc_cache_update also expires the replaced entry
immediately. Still, this seems simpler if it also works.)
--b.
commit 20771de3185bf0031aa767d3b19f3e744a465a0c
Author: J. Bruce Fields <bfields@redhat.com>
Date: Fri Dec 24 14:03:38 2010 -0500
svcrpc: modifying valid sunrpc cache entries is racy
Once a sunrpc cache entry is non-NEGATIVE, we shouldn't attempt to
modify it in place. Otherwise someone referencing the ip_map we're
modifying here could try to use the m_client just as we're putting the
last reference.
Instead, just set the expiry time so it expires immediately.
The bug should only be seen by users of the legacy nfsd interfaces.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index a04ac91..5edc147 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
return NULL;
if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) {
- if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0)
- auth_domain_put(&ipm->m_client->h);
+ ipm->h.expiry_time = 0;
rv = NULL;
} else {
rv = &ipm->m_client->h;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-03 22:26 ` J. Bruce Fields
@ 2011-01-04 3:08 ` J. Bruce Fields
2011-01-04 4:51 ` NeilBrown
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 3:08 UTC (permalink / raw)
To: linux-nfs; +Cc: Neil Brown
On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote:
> On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> > From: J. Bruce Fields <bfields@redhat.com>
> >
> > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> > (and allowing any concurrent users to destroy it on last put) instead of
> > trying to update it in place.
>
> Or the following seems simpler.
>
> (And I was thinking it was necessary to ensure that the right thing
> happened to the cached xprt->xpt_auth_cache entry--though on a second
> look I see that sunrpc_cache_update also expires the replaced entry
> immediately. Still, this seems simpler if it also works.)
Eh, on third thoughts: we probably do want a real negative entry created
in the cache, so I think the original patch was correct!
--b.
>
> --b.
>
> commit 20771de3185bf0031aa767d3b19f3e744a465a0c
> Author: J. Bruce Fields <bfields@redhat.com>
> Date: Fri Dec 24 14:03:38 2010 -0500
>
> svcrpc: modifying valid sunrpc cache entries is racy
>
> Once a sunrpc cache entry is non-NEGATIVE, we shouldn't attempt to
> modify it in place. Otherwise someone referencing the ip_map we're
> modifying here could try to use the m_client just as we're putting the
> last reference.
>
> Instead, just set the expiry time so it expires immediately.
>
> The bug should only be seen by users of the legacy nfsd interfaces.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index a04ac91..5edc147 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
> return NULL;
>
> if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) {
> - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0)
> - auth_domain_put(&ipm->m_client->h);
> + ipm->h.expiry_time = 0;
> rv = NULL;
> } else {
> rv = &ipm->m_client->h;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 3:08 ` J. Bruce Fields
@ 2011-01-04 4:51 ` NeilBrown
2011-01-04 18:43 ` J. Bruce Fields
2011-01-04 21:46 ` J. Bruce Fields
0 siblings, 2 replies; 20+ messages in thread
From: NeilBrown @ 2011-01-04 4:51 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Mon, 3 Jan 2011 22:08:05 -0500 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote:
> > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> > > From: J. Bruce Fields <bfields@redhat.com>
> > >
> > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> > > (and allowing any concurrent users to destroy it on last put) instead of
> > > trying to update it in place.
> >
> > Or the following seems simpler.
> >
> > (And I was thinking it was necessary to ensure that the right thing
> > happened to the cached xprt->xpt_auth_cache entry--though on a second
> > look I see that sunrpc_cache_update also expires the replaced entry
> > immediately. Still, this seems simpler if it also works.)
>
> Eh, on third thoughts: we probably do want a real negative entry created
> in the cache, so I think the original patch was correct!
I like your second thought better than your third.
I don't see any reason to push this item out of the cache extra quickly.
In fact I think it would still be correct to just remove those two lines and
not set expiry_time to 0. auth_unix_lookup will never find that IP address,
and so it doesn't matter if it remains in the cache or not.
I guess there could result in the cache appearing to contain different data
depending on whether you look at it the 'old' way or the 'new' way, but I
don't that is a real problem, and setting expiry_time to 0 overcomes that.
What was the substance of your third thought?
BTW, you could use sunrpc_invalidate rather than just setting expiry_time to
zero, which would hurry it out of the cache a bit faster.
And this all made me realise that there is more code that can be placed under
CONFIG_NFSD_DEPRECTATED.
SUNRPC: Remove more code when NFSD_DEPRECATED is not configured
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 6950c98..c2aebe8 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -255,10 +255,13 @@ static inline time_t get_expiry(char **bpp)
return rv - boot.tv_sec;
}
+#ifdef CONFIG_NFSD_DEPRECATED
static inline void sunrpc_invalidate(struct cache_head *h,
struct cache_detail *detail)
{
h->expiry_time = seconds_since_boot() - 1;
detail->nextcheck = seconds_since_boot();
}
+#endif /* CONFIG_NFSD_DEPRECATED */
+
#endif /* _LINUX_SUNRPC_CACHE_H_ */
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 560677d..92d47ad 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -30,7 +30,9 @@
struct unix_domain {
struct auth_domain h;
+#ifdef CONFIG_NFSD_DEPRECATED
int addr_changes;
+#endif /* CONFIG_NFSD_DEPRECATED */
/* other stuff later */
};
@@ -64,7 +66,9 @@ struct auth_domain *unix_domain_find(char *name)
return NULL;
}
new->h.flavour = &svcauth_unix;
+#ifdef CONFIG_NFSD_DEPRECATED
new->addr_changes = 0;
+#endif /* CONFIG_NFSD_DEPRECATED */
rv = auth_domain_lookup(name, &new->h);
}
}
@@ -92,7 +96,9 @@ struct ip_map {
char m_class[8]; /* e.g. "nfsd" */
struct in6_addr m_addr;
struct unix_domain *m_client;
+#ifdef CONFIG_NFSD_DEPRECATED
int m_add_change;
+#endif /* CONFIG_NFSD_DEPRECATED */
};
static void ip_map_put(struct kref *kref)
@@ -146,7 +152,9 @@ static void update(struct cache_head *cnew, struct cache_head *citem)
kref_get(&item->m_client->h.ref);
new->m_client = item->m_client;
+#ifdef CONFIG_NFSD_DEPRECATED
new->m_add_change = item->m_add_change;
+#endif /* CONFIG_NFSD_DEPRECATED */
}
static struct cache_head *ip_map_alloc(void)
{
@@ -331,6 +339,7 @@ static int __ip_map_update(struct cache_detail *cd, struct ip_map *ipm,
ip.h.flags = 0;
if (!udom)
set_bit(CACHE_NEGATIVE, &ip.h.flags);
+#ifdef CONFIG_NFSD_DEPRECATED
else {
ip.m_add_change = udom->addr_changes;
/* if this is from the legacy set_client system call,
@@ -339,6 +348,7 @@ static int __ip_map_update(struct cache_detail *cd, struct ip_map *ipm,
if (expiry == NEVER)
ip.m_add_change++;
}
+#endif /* CONFIG_NFSD_DEPRECATED */
ip.h.expiry_time = expiry;
ch = sunrpc_cache_update(cd, &ip.h, &ipm->h,
hash_str(ipm->m_class, IP_HASHBITS) ^
@@ -358,6 +368,7 @@ static inline int ip_map_update(struct net *net, struct ip_map *ipm,
return __ip_map_update(sn->ip_map_cache, ipm, udom, expiry);
}
+#ifdef CONFIG_NFSD_DEPRECATED
int auth_unix_add_addr(struct net *net, struct in6_addr *addr, struct auth_domain *dom)
{
struct unix_domain *udom;
@@ -426,6 +437,7 @@ void svcauth_unix_purge(void)
}
}
EXPORT_SYMBOL_GPL(svcauth_unix_purge);
+#endif /* CONFIG_NFSD_DEPRECATED */
static inline struct ip_map *
ip_map_cached_get(struct svc_xprt *xprt)
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-03 20:55 ` J. Bruce Fields
@ 2011-01-04 5:01 ` NeilBrown
2011-01-04 15:22 ` J. Bruce Fields
0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2011-01-04 5:01 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Mon, 3 Jan 2011 15:55:14 -0500 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Wed, Dec 29, 2010 at 08:57:19PM -0500, J. Bruce Fields wrote:
> > On Thu, Dec 30, 2010 at 12:19:40PM +1100, Neil Brown wrote:
> > > On Wed, 29 Dec 2010 15:59:42 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> > > wrote:
> > > > Also noticed while trying to track down an rhel5 oops in
> > > > svcauth_unix_set_client():
> > > >
> > > > - cache_check() can set an entry negative in place, which if
> > > > nothing else must cause a leak in some cases. (Because when
> > > > the entry is eventually destroyed, it will be assumed to not
> > > > have any contents.) I suppose the fix is again to try to
> > > > adding a new negative entry instead.
> > >
> > > cache_check should only set an entry 'negative' if it is not already valid
> > > (rv == -EAGAIN) and there is no up-call pending.
> >
> > I don't think anything keeps VALID from being set after the
> > cache_is_valid check but before the code that does the
> > set_bit(CACHE_NEGATIVE).
> >
> > > Maybe we should check CACHE_VALID again after the test_and_set of
> > > CACHE_PENDING, but is a very unlikely race (if it is actually a race at all)
> > >
> > > >
> > > > - since cache_check() doesn't use any locking, I can't see what
> > > > guarantees that when it sees the CACHE_VALID bit set and
> > > > CACHE_NEGATIVE cleared, it must necessarily see the new
> > > > contents. I think that'd be fixed by a wmb() before setting
> > > > those bits and a rmb() after checking them. I don't know if
> > > > it's actually possible to hit that bug....
> > >
> > > Yes, we probably want a set_bit_lock in cache_fresh_locked() though I don't
> > > think that exists, so we could use test_and_set_bit_locked() instead.
> > >
> > > But it does feel like maybe we should add some locking to cache_check.
> > > Take the lock at the the start, and release it after the
> > > test_and_set_bit(CACHE_PENDING) or once we have decided not to do that ???
> >
> > Maybe so.
>
> Here's one attempt.
>
> --b.
>
> commit 55563023f85d01698ccf72325c87e3a7039a189b
> Author: J. Bruce Fields <bfields@redhat.com>
> Date: Mon Jan 3 15:10:27 2011 -0500
>
> svcrpc: take locks to fix cache_check races
>
> There are at least a couple races in cache_check:
>
> - We attempt to turn a cache entry negative in place. But that
> entry may already have been filled in by some other task since
> we last checked whether it was valid, so we could be modifying
> an already-valid entry. If nothing else there's a likely leak
> in such a case when the entry is eventually put() and contents
> are not freed because it has CACHE_NEGATIVE set.
> - If cache_check races with an update that is turning the entry
> CACHE_VALID, then it's possible that the CACHE_VALID bit could
> become visible on this CPU before the actual contents do, so
> we could tell the caller this entry is ready to use when in
> fact the caller could still get invalid contents.
>
> Some memory barriers might be sufficient to fix at least the latter; but
> for now let's keep things simple and take the hash_lock when we turn an
> entry negative or check the CACHE_VALID bit.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 0d6002f..2105b40 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -200,8 +200,9 @@ static int cache_make_upcall(struct cache_detail *cd, struct cache_head *h)
> return cd->cache_upcall(cd, h);
> }
>
> -static inline int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> +static int __cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> {
> +
> if (!test_bit(CACHE_VALID, &h->flags))
> return -EAGAIN;
> else {
> @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head
> }
> }
>
> +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> +{
> + int rv;
> +
> + read_lock(&detail->hash_lock);
> + rv = __cache_is_valid(detail, h);
> + read_unlock(&detail->hash_lock);
> + return rv;
> +}
I don't think there is anything in __cache_is_valid that needs to be
protected.
The compiler will almost certainly produce code which loads f->flags once and
then performs 1 or 2 bit tests against the value in the register and produces
one of 3 possible return values based on the result.
There is absolutely no value in putting locking around that, especially as
CACHE_VALID is never cleared.
Maybe you imagine a re-ordering of setting CACHE_NEGATIVE and CACHE_VALID,
but as they are in the same cache line (and in fact in the same byte) they
cannot be re-ordered. We always set CACHE_NEGATIVE before CACHE_VALID and
there is no way those two could get to memory in the wrong order.
> +
> +static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h)
> +{
> + int rv;
> +
> + write_lock(&detail->hash_lock);
> + rv = __cache_is_valid(detail, h);
> + if (rv != -EAGAIN) {
> + write_unlock(&detail->hash_lock);
> + return rv;
> + }
> + set_bit(CACHE_NEGATIVE, &h->flags);
> + cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
> + write_unlock(&detail->hash_lock);
> + cache_fresh_unlocked(h, detail);
> + return -ENOENT;
> +}
> +
> /*
> * This is the generic cache management routine for all
> * the authentication caches.
> @@ -251,14 +279,8 @@ int cache_check(struct cache_detail *detail,
> case -EINVAL:
> clear_bit(CACHE_PENDING, &h->flags);
> cache_revisit_request(h);
> - if (rv == -EAGAIN) {
> - set_bit(CACHE_NEGATIVE, &h->flags);
> - cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
> - cache_fresh_unlocked(h, detail);
> - rv = -ENOENT;
> - }
> + rv = try_to_negate_entry(detail, h);
> break;
This bit looks good those. It feels much better having an 'unlock' between
'cache_fresh_locked' and 'cache_fresh_unlocked' !!
Thanks,
NeilBrown
> -
> case -EAGAIN:
> clear_bit(CACHE_PENDING, &h->flags);
> cache_revisit_request(h);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 5:01 ` NeilBrown
@ 2011-01-04 15:22 ` J. Bruce Fields
2011-01-04 19:23 ` J. Bruce Fields
0 siblings, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 15:22 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs
On Tue, Jan 04, 2011 at 04:01:52PM +1100, NeilBrown wrote:
> On Mon, 3 Jan 2011 15:55:14 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> > @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head
> > }
> > }
> >
> > +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> > +{
> > + int rv;
> > +
> > + read_lock(&detail->hash_lock);
> > + rv = __cache_is_valid(detail, h);
> > + read_unlock(&detail->hash_lock);
> > + return rv;
> > +}
>
> I don't think there is anything in __cache_is_valid that needs to be
> protected.
> The compiler will almost certainly produce code which loads f->flags once and
> then performs 1 or 2 bit tests against the value in the register and produces
> one of 3 possible return values based on the result.
> There is absolutely no value in putting locking around that, especially as
> CACHE_VALID is never cleared.
>
> Maybe you imagine a re-ordering of setting CACHE_NEGATIVE and CACHE_VALID,
> but as they are in the same cache line (and in fact in the same byte) they
> cannot be re-ordered. We always set CACHE_NEGATIVE before CACHE_VALID and
> there is no way those two could get to memory in the wrong order.
The risk would be a reordering of CACHE_VALID with setting of the actual
contents in the !NEGATIVE case, so:
task doing lookup task doing update
^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
1. test for CACHE_VALID 3. set item->contents
& !CACHE_NEGATIVE
2. dereference 4. set CACHE_VALID, clear
item->contents->... CACHE_NEGATIVE.
As I understand it, if we want to gaurantee that item->contents is good
at step 2, then we need a write barrier between 3 and 4, together with a
read barrier between 1 and 2. Taking the spinlock is overkill, but
should accomplish the same thing, as it forces 1 to occur before 3 or
after 4, and adds any necessary memory barriers.
--b.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 4:51 ` NeilBrown
@ 2011-01-04 18:43 ` J. Bruce Fields
2011-01-04 21:15 ` NeilBrown
2011-01-04 21:46 ` J. Bruce Fields
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 18:43 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs
On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote:
> On Mon, 3 Jan 2011 22:08:05 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
>
> > On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote:
> > > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> > > > From: J. Bruce Fields <bfields@redhat.com>
> > > >
> > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> > > > (and allowing any concurrent users to destroy it on last put) instead of
> > > > trying to update it in place.
> > >
> > > Or the following seems simpler.
> > >
> > > (And I was thinking it was necessary to ensure that the right thing
> > > happened to the cached xprt->xpt_auth_cache entry--though on a second
> > > look I see that sunrpc_cache_update also expires the replaced entry
> > > immediately. Still, this seems simpler if it also works.)
> >
> > Eh, on third thoughts: we probably do want a real negative entry created
> > in the cache, so I think the original patch was correct!
>
> I like your second thought better than your third.
> I don't see any reason to push this item out of the cache extra quickly.
> In fact I think it would still be correct to just remove those two lines and
> not set expiry_time to 0. auth_unix_lookup will never find that IP address,
> and so it doesn't matter if it remains in the cache or not.
> I guess there could result in the cache appearing to contain different data
> depending on whether you look at it the 'old' way or the 'new' way, but I
> don't that is a real problem, and setting expiry_time to 0 overcomes that.
>
> What was the substance of your third thought?
I had some idea that we could end up with a cache entry stuck in the
cache forever.
OK, actually I think late last night I reverted into some sort of "maybe
if I type the first thing that comes to my mind, somebody else will
think this through for me" state. Apologies. Um, did I win?
> BTW, you could use sunrpc_invalidate rather than just setting expiry_time to
> zero, which would hurry it out of the cache a bit faster.
Yep, I like that, and I'd forgotten about sunrpc_invalidate, thanks!
Done.... (As below).
> And this all made me realise that there is more code that can be placed under
> CONFIG_NFSD_DEPRECTATED.
Yep, applied.
(When do we get to remove all this? Taking a stab at the 2.6.40 merge
window.... OK, party at my place in May!)
--b.
commit ab5c05c579b0b37b9bf2c79c9c8f0ef6045ee41d
Author: J. Bruce Fields <bfields@redhat.com>
Date: Fri Dec 24 14:03:38 2010 -0500
svcrpc: modifying valid sunrpc cache entries is racy
Once a sunrpc cache entry is VALID, we should be replacing it (and
allowing any concurrent users to destroy it on last put) instead of
trying to update it in place.
Otherwise someone referencing the ip_map we're modifying here could try
to use the m_client just as we're putting the last reference.
The bug should only be seen by users of the legacy nfsd interfaces.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index a04ac91..59a7c52 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
return NULL;
if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) {
- if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0)
- auth_domain_put(&ipm->m_client->h);
+ sunrpc_invalidate(&ipm->h, sn->ip_map_cache);
rv = NULL;
} else {
rv = &ipm->m_client->h;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 15:22 ` J. Bruce Fields
@ 2011-01-04 19:23 ` J. Bruce Fields
2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 19:23 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs
On Tue, Jan 04, 2011 at 10:22:31AM -0500, J. Bruce Fields wrote:
> On Tue, Jan 04, 2011 at 04:01:52PM +1100, NeilBrown wrote:
> > On Mon, 3 Jan 2011 15:55:14 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> > > @@ -213,6 +214,33 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head
> > > }
> > > }
> > >
> > > +static int cache_is_valid(struct cache_detail *detail, struct cache_head *h)
> > > +{
> > > + int rv;
> > > +
> > > + read_lock(&detail->hash_lock);
> > > + rv = __cache_is_valid(detail, h);
> > > + read_unlock(&detail->hash_lock);
> > > + return rv;
> > > +}
> >
> > I don't think there is anything in __cache_is_valid that needs to be
> > protected.
> > The compiler will almost certainly produce code which loads f->flags once and
> > then performs 1 or 2 bit tests against the value in the register and produces
> > one of 3 possible return values based on the result.
> > There is absolutely no value in putting locking around that, especially as
> > CACHE_VALID is never cleared.
> >
> > Maybe you imagine a re-ordering of setting CACHE_NEGATIVE and CACHE_VALID,
> > but as they are in the same cache line (and in fact in the same byte) they
> > cannot be re-ordered. We always set CACHE_NEGATIVE before CACHE_VALID and
> > there is no way those two could get to memory in the wrong order.
>
> The risk would be a reordering of CACHE_VALID with setting of the actual
> contents in the !NEGATIVE case, so:
>
> task doing lookup task doing update
> ^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
>
> 1. test for CACHE_VALID 3. set item->contents
> & !CACHE_NEGATIVE
>
> 2. dereference 4. set CACHE_VALID, clear
> item->contents->... CACHE_NEGATIVE.
>
> As I understand it, if we want to gaurantee that item->contents is good
> at step 2, then we need a write barrier between 3 and 4, together with a
> read barrier between 1 and 2. Taking the spinlock is overkill, but
> should accomplish the same thing, as it forces 1 to occur before 3 or
> after 4, and adds any necessary memory barriers.
So, that being the real problem, perhaps it's clearer to use explicit
memory barriers instead of more locking, and add some comments. Also
split this into a separate patch, as in the following.
--b.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check
2011-01-04 19:23 ` J. Bruce Fields
@ 2011-01-04 19:31 ` J. Bruce Fields
2011-01-04 19:31 ` [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry J. Bruce Fields
2011-01-04 21:10 ` [PATCH] svcrpc: modifying positive sunrpc cache entries is racy NeilBrown
2 siblings, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 19:31 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs, J. Bruce Fields
We attempt to turn a cache entry negative in place. But that entry may
already have been filled in by some other task since we last checked
whether it was valid, so we could be modifying an already-valid entry.
If nothing else there's a likely leak in such a case when the entry is
eventually put() and contents are not freed because it has
CACHE_NEGATIVE set.
So, take the cache_lock just as sunrpc_cache_update() does.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
net/sunrpc/cache.c | 25 ++++++++++++++++++-------
1 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 0d6002f..a6c5733 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -213,6 +213,23 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head
}
}
+static int try_to_negate_entry(struct cache_detail *detail, struct cache_head *h)
+{
+ int rv;
+
+ write_lock(&detail->hash_lock);
+ rv = cache_is_valid(detail, h);
+ if (rv != -EAGAIN) {
+ write_unlock(&detail->hash_lock);
+ return rv;
+ }
+ set_bit(CACHE_NEGATIVE, &h->flags);
+ cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
+ write_unlock(&detail->hash_lock);
+ cache_fresh_unlocked(h, detail);
+ return -ENOENT;
+}
+
/*
* This is the generic cache management routine for all
* the authentication caches.
@@ -251,14 +268,8 @@ int cache_check(struct cache_detail *detail,
case -EINVAL:
clear_bit(CACHE_PENDING, &h->flags);
cache_revisit_request(h);
- if (rv == -EAGAIN) {
- set_bit(CACHE_NEGATIVE, &h->flags);
- cache_fresh_locked(h, seconds_since_boot()+CACHE_NEW_EXPIRY);
- cache_fresh_unlocked(h, detail);
- rv = -ENOENT;
- }
+ rv = try_to_negate_entry(detail, h);
break;
-
case -EAGAIN:
clear_bit(CACHE_PENDING, &h->flags);
cache_revisit_request(h);
--
1.7.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry
2011-01-04 19:23 ` J. Bruce Fields
2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields
@ 2011-01-04 19:31 ` J. Bruce Fields
2011-01-04 21:10 ` [PATCH] svcrpc: modifying positive sunrpc cache entries is racy NeilBrown
2 siblings, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 19:31 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs, J. Bruce Fields
Supposes cache_check runs simultaneously with an update on a different
CPU:
cache_check task doing update
^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
1. test for CACHE_VALID 1'. set entry->data
& !CACHE_NEGATIVE
2. use entry->data 2'. set CACHE_VALID
If the two memory writes performed in step 1' and 2' appear misordered
with respect to the reads in step 1 and 2, then the caller could get
stale data at step 2 even though it saw CACHE_VALID set on the cache
entry.
Add memory barriers to prevent this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
net/sunrpc/cache.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index a6c5733..72ad836 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -128,6 +128,7 @@ static void cache_fresh_locked(struct cache_head *head, time_t expiry)
{
head->expiry_time = expiry;
head->last_refresh = seconds_since_boot();
+ smp_wmb(); /* paired with smp_rmb() in cache_is_valid() */
set_bit(CACHE_VALID, &head->flags);
}
@@ -208,8 +209,16 @@ static inline int cache_is_valid(struct cache_detail *detail, struct cache_head
/* entry is valid */
if (test_bit(CACHE_NEGATIVE, &h->flags))
return -ENOENT;
- else
+ else {
+ /*
+ * In combination with write barrier in
+ * sunrpc_cache_update, ensures that anyone
+ * using the cache entry after this sees the
+ * updated contents:
+ */
+ smp_rmb();
return 0;
+ }
}
}
--
1.7.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 19:23 ` J. Bruce Fields
2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields
2011-01-04 19:31 ` [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry J. Bruce Fields
@ 2011-01-04 21:10 ` NeilBrown
[not found] ` <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2011-01-04 21:10 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Tue, 4 Jan 2011 14:23:51 -0500 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> So, that being the real problem, perhaps it's clearer to use explicit
> memory barriers instead of more locking, and add some comments. Also
> split this into a separate patch, as in the following.
Me likie.
Two separate patches is good, and nice comments next to the memory barriers
is good.
Reviewed-by: NeilBrown <neilb@suse.de>
Thanks!
NeilBrown
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
[not found] ` <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
@ 2011-01-04 21:15 ` J. Bruce Fields
0 siblings, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 21:15 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs
On Wed, Jan 05, 2011 at 08:10:31AM +1100, NeilBrown wrote:
> On Tue, 4 Jan 2011 14:23:51 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
>
> > So, that being the real problem, perhaps it's clearer to use explicit
> > memory barriers instead of more locking, and add some comments. Also
> > split this into a separate patch, as in the following.
>
> Me likie.
>
> Two separate patches is good, and nice comments next to the memory barriers
> is good.
> Reviewed-by: NeilBrown <neilb@suse.de>
Good, thanks--queueing up for 2.6.38.--b.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 18:43 ` J. Bruce Fields
@ 2011-01-04 21:15 ` NeilBrown
2011-01-04 21:21 ` J. Bruce Fields
0 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2011-01-04 21:15 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Tue, 4 Jan 2011 13:43:14 -0500 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote:
> > On Mon, 3 Jan 2011 22:08:05 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> > wrote:
> >
> > > On Mon, Jan 03, 2011 at 05:26:05PM -0500, J. Bruce Fields wrote:
> > > > On Wed, Dec 29, 2010 at 03:47:52PM -0500, bfields wrote:
> > > > > From: J. Bruce Fields <bfields@redhat.com>
> > > > >
> > > > > Once a sunrpc cache entry is non-NEGATIVE, we should be replacing it
> > > > > (and allowing any concurrent users to destroy it on last put) instead of
> > > > > trying to update it in place.
> > > >
> > > > Or the following seems simpler.
> > > >
> > > > (And I was thinking it was necessary to ensure that the right thing
> > > > happened to the cached xprt->xpt_auth_cache entry--though on a second
> > > > look I see that sunrpc_cache_update also expires the replaced entry
> > > > immediately. Still, this seems simpler if it also works.)
> > >
> > > Eh, on third thoughts: we probably do want a real negative entry created
> > > in the cache, so I think the original patch was correct!
> >
> > I like your second thought better than your third.
> > I don't see any reason to push this item out of the cache extra quickly.
> > In fact I think it would still be correct to just remove those two lines and
> > not set expiry_time to 0. auth_unix_lookup will never find that IP address,
> > and so it doesn't matter if it remains in the cache or not.
> > I guess there could result in the cache appearing to contain different data
> > depending on whether you look at it the 'old' way or the 'new' way, but I
> > don't that is a real problem, and setting expiry_time to 0 overcomes that.
> >
> > What was the substance of your third thought?
>
> I had some idea that we could end up with a cache entry stuck in the
> cache forever.
>
> OK, actually I think late last night I reverted into some sort of "maybe
> if I type the first thing that comes to my mind, somebody else will
> think this through for me" state. Apologies. Um, did I win?
This would be the Andrew Morton method of community involvement: say
something obviously wrong as it produces more responses than saying something
right - and some of them might be useful.... I fall for it all the time!
>
> > BTW, you could use sunrpc_invalidate rather than just setting expiry_time to
> > zero, which would hurry it out of the cache a bit faster.
>
> Yep, I like that, and I'd forgotten about sunrpc_invalidate, thanks!
> Done.... (As below).
>
> > And this all made me realise that there is more code that can be placed under
> > CONFIG_NFSD_DEPRECTATED.
>
> Yep, applied.
>
> (When do we get to remove all this? Taking a stab at the 2.6.40 merge
> window....
That is what is says in feature-removal-schedule.txt (which no-one reads).
> OK, party at my place in May!)
Will there still be snow? I'm not coming if there is no snow!
(patch looks good)
NeilBrown
>
> --b.
>
> commit ab5c05c579b0b37b9bf2c79c9c8f0ef6045ee41d
> Author: J. Bruce Fields <bfields@redhat.com>
> Date: Fri Dec 24 14:03:38 2010 -0500
>
> svcrpc: modifying valid sunrpc cache entries is racy
>
> Once a sunrpc cache entry is VALID, we should be replacing it (and
> allowing any concurrent users to destroy it on last put) instead of
> trying to update it in place.
>
> Otherwise someone referencing the ip_map we're modifying here could try
> to use the m_client just as we're putting the last reference.
>
> The bug should only be seen by users of the legacy nfsd interfaces.
>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index a04ac91..59a7c52 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
> return NULL;
>
> if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) {
> - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0)
> - auth_domain_put(&ipm->m_client->h);
> + sunrpc_invalidate(&ipm->h, sn->ip_map_cache);
> rv = NULL;
> } else {
> rv = &ipm->m_client->h;
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 21:15 ` NeilBrown
@ 2011-01-04 21:21 ` J. Bruce Fields
0 siblings, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 21:21 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs
On Wed, Jan 05, 2011 at 08:15:38AM +1100, NeilBrown wrote:
> On Tue, 4 Jan 2011 13:43:14 -0500 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
> > Yep, applied.
> >
> > (When do we get to remove all this? Taking a stab at the 2.6.40 merge
> > window....
>
> That is what is says in feature-removal-schedule.txt (which no-one reads).
>
>
> > OK, party at my place in May!)
>
> Will there still be snow? I'm not coming if there is no snow!
If anyone attends from Australia, I'll find the snow. It may melt
rather quickly, though....
> (patch looks good)
Applied--thanks for the help!--b.
>
> NeilBrown
>
>
>
> >
> > --b.
> >
> > commit ab5c05c579b0b37b9bf2c79c9c8f0ef6045ee41d
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date: Fri Dec 24 14:03:38 2010 -0500
> >
> > svcrpc: modifying valid sunrpc cache entries is racy
> >
> > Once a sunrpc cache entry is VALID, we should be replacing it (and
> > allowing any concurrent users to destroy it on last put) instead of
> > trying to update it in place.
> >
> > Otherwise someone referencing the ip_map we're modifying here could try
> > to use the m_client just as we're putting the last reference.
> >
> > The bug should only be seen by users of the legacy nfsd interfaces.
> >
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> >
> > diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> > index a04ac91..59a7c52 100644
> > --- a/net/sunrpc/svcauth_unix.c
> > +++ b/net/sunrpc/svcauth_unix.c
> > @@ -401,8 +401,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
> > return NULL;
> >
> > if ((ipm->m_client->addr_changes - ipm->m_add_change) >0) {
> > - if (test_and_set_bit(CACHE_NEGATIVE, &ipm->h.flags) == 0)
> > - auth_domain_put(&ipm->m_client->h);
> > + sunrpc_invalidate(&ipm->h, sn->ip_map_cache);
> > rv = NULL;
> > } else {
> > rv = &ipm->m_client->h;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 4:51 ` NeilBrown
2011-01-04 18:43 ` J. Bruce Fields
@ 2011-01-04 21:46 ` J. Bruce Fields
2011-01-04 23:05 ` NeilBrown
1 sibling, 1 reply; 20+ messages in thread
From: J. Bruce Fields @ 2011-01-04 21:46 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-nfs
On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote:
> And this all made me realise that there is more code that can be placed under
> CONFIG_NFSD_DEPRECTATED.
>
>
> SUNRPC: Remove more code when NFSD_DEPRECATED is not configured
>
> Signed-off-by: NeilBrown <neilb@suse.de>
I did need one change to get a succesful !CONFIG_NFSD_DEPRECATED
compile.
--b.
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 46e1cbc..30916b0 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -422,6 +422,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
return rv;
}
EXPORT_SYMBOL_GPL(auth_unix_lookup);
+#endif /* CONFIG_NFSD_DEPRECATED */
void svcauth_unix_purge(void)
{
@@ -435,7 +436,6 @@ void svcauth_unix_purge(void)
}
}
EXPORT_SYMBOL_GPL(svcauth_unix_purge);
-#endif /* CONFIG_NFSD_DEPRECATED */
static inline struct ip_map *
ip_map_cached_get(struct svc_xprt *xprt)
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] svcrpc: modifying positive sunrpc cache entries is racy
2011-01-04 21:46 ` J. Bruce Fields
@ 2011-01-04 23:05 ` NeilBrown
0 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2011-01-04 23:05 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Tue, 4 Jan 2011 16:46:39 -0500 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Tue, Jan 04, 2011 at 03:51:07PM +1100, NeilBrown wrote:
> > And this all made me realise that there is more code that can be placed under
> > CONFIG_NFSD_DEPRECTATED.
> >
> >
> > SUNRPC: Remove more code when NFSD_DEPRECATED is not configured
> >
> > Signed-off-by: NeilBrown <neilb@suse.de>
>
> I did need one change to get a succesful !CONFIG_NFSD_DEPRECATED
> compile.
yep.... I had done a test compile, but I managed to get the sense of
CONFIG_NFSD_DEPRECATED inverted :-(
NeilBrown
>
> --b.
>
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index 46e1cbc..30916b0 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -422,6 +422,7 @@ struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr)
> return rv;
> }
> EXPORT_SYMBOL_GPL(auth_unix_lookup);
> +#endif /* CONFIG_NFSD_DEPRECATED */
>
> void svcauth_unix_purge(void)
> {
> @@ -435,7 +436,6 @@ void svcauth_unix_purge(void)
> }
> }
> EXPORT_SYMBOL_GPL(svcauth_unix_purge);
> -#endif /* CONFIG_NFSD_DEPRECATED */
>
> static inline struct ip_map *
> ip_map_cached_get(struct svc_xprt *xprt)
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2011-01-04 23:05 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-29 20:47 [PATCH] svcrpc: modifying positive sunrpc cache entries is racy J. Bruce Fields
2010-12-29 20:59 ` J. Bruce Fields
2010-12-30 1:19 ` Neil Brown
2010-12-30 1:57 ` J. Bruce Fields
2011-01-03 20:55 ` J. Bruce Fields
2011-01-04 5:01 ` NeilBrown
2011-01-04 15:22 ` J. Bruce Fields
2011-01-04 19:23 ` J. Bruce Fields
2011-01-04 19:31 ` [PATCH 1/2] svcrpc: take lock on turning entry NEGATIVE in cache_check J. Bruce Fields
2011-01-04 19:31 ` [PATCH 2/2] svcrpc: ensure cache_check caller sees updated entry J. Bruce Fields
2011-01-04 21:10 ` [PATCH] svcrpc: modifying positive sunrpc cache entries is racy NeilBrown
[not found] ` <20110105081031.220bfbc9-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2011-01-04 21:15 ` J. Bruce Fields
2011-01-03 22:26 ` J. Bruce Fields
2011-01-04 3:08 ` J. Bruce Fields
2011-01-04 4:51 ` NeilBrown
2011-01-04 18:43 ` J. Bruce Fields
2011-01-04 21:15 ` NeilBrown
2011-01-04 21:21 ` J. Bruce Fields
2011-01-04 21:46 ` J. Bruce Fields
2011-01-04 23:05 ` NeilBrown
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.