All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
@ 2018-11-28  8:45 Vasily Averin
  2018-11-28 23:35 ` J. Bruce Fields
  0 siblings, 1 reply; 7+ messages in thread
From: Vasily Averin @ 2018-11-28  8:45 UTC (permalink / raw)
  To: J. Bruce Fields, Jeff Layton, linux-nfs
  Cc: David S. Miller, NeilBrown, Pavel Tikhomirov

Dear all,
we have found memory leak on OpenVz7 node and believe it affects mainline too.

sunrpc_cache_lookup() removes exprired cache_head from hash,
however if it waits for reply on submitted cache_request 
both of them can leak forever, nobody cleans unhashed cache_heads.

Originally we had claim on busy loop device of stopped container,
that had executed nfs server inside.  
Device was kept by mount that was detached from already destroyed mount namespace.
By using crash search we have found some structure with path struct related to our mount.
Finally we have found that it was alive svc_export struct used by to alive cache_request,
however both of them pointed to already freed cache_detail.

We decided that cache_detail was correctly freed during destroy of net namespace,
however svc_export with taken path struct, cache_request and some other structures 
seems was leaked forever.

This could  happen only if cache_head of svc_export was removed from hash on cache_detail
before its destroy. Finally we have found that it could happen when sunrpc_cache_lookup()
removes expired cache_head from hash.

Usually it works correctly and cache_put(freeme) frees expired cache_head.
However in our case cache_head have an extra reference counter from stalled cache_request.
Becasue of cache_head was removed from hash of cache_detail it cannot be found in cache_clean()
and its cache_request cannot be freed in cache_dequeue(). Memory leaks forever, 
exactly like we observed.

After may attempts we have reproduced this situation on OpenVz7 kernel, 
however our reproducer is quite long and complex. Unfortunately we still
did not reproduced this problem on mainline kernel and did not validated the patch yet.

It would be great if someone advised us some simple way to trigger described scenario.
We are not sure that our patch is correct, please let us know if our analyze missed something.

Vasily Averin (1):
  sunrpc: cache_head leak due queued requests

 net/sunrpc/cache.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
  2018-11-28  8:45 [PATCH 0/1] cache_head leak in sunrpc_cache_lookup() Vasily Averin
@ 2018-11-28 23:35 ` J. Bruce Fields
  2018-11-29  5:35   ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: J. Bruce Fields @ 2018-11-28 23:35 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Jeff Layton, linux-nfs, David S. Miller, NeilBrown, Pavel Tikhomirov

On Wed, Nov 28, 2018 at 11:45:46AM +0300, Vasily Averin wrote:
> Dear all, we have found memory leak on OpenVz7 node and believe it
> affects mainline too.
> 
> sunrpc_cache_lookup() removes exprired cache_head from hash, however
> if it waits for reply on submitted cache_request both of them can leak
> forever, nobody cleans unhashed cache_heads.
> 
> Originally we had claim on busy loop device of stopped container, that
> had executed nfs server inside.  Device was kept by mount that was
> detached from already destroyed mount namespace.  By using crash
> search we have found some structure with path struct related to our
> mount.  Finally we have found that it was alive svc_export struct used
> by to alive cache_request, however both of them pointed to already
> freed cache_detail.
> 
> We decided that cache_detail was correctly freed during destroy of net
> namespace, however svc_export with taken path struct, cache_request
> and some other structures seems was leaked forever.
> 
> This could  happen only if cache_head of svc_export was removed from
> hash on cache_detail before its destroy. Finally we have found that it
> could happen when sunrpc_cache_lookup() removes expired cache_head
> from hash.
> 
> Usually it works correctly and cache_put(freeme) frees expired
> cache_head.  However in our case cache_head have an extra reference
> counter from stalled cache_request.  Becasue of cache_head was removed
> from hash of cache_detail it cannot be found in cache_clean() and its
> cache_request cannot be freed in cache_dequeue(). Memory leaks
> forever, exactly like we observed.
> 
> After may attempts we have reproduced this situation on OpenVz7
> kernel, however our reproducer is quite long and complex.
> Unfortunately we still did not reproduced this problem on mainline
> kernel and did not validated the patch yet.
> 
> It would be great if someone advised us some simple way to trigger
> described scenario.

I think you should be able to produce hung upcalls by flushing the cache
(exportfs -f), then stopping mountd, then trying to access the
filesystem from a client.  Does that help?

> We are not sure that our patch is correct, please let us know if our
> analyze missed something.

It looks OK to me, but it would be helpful to have Neil's review too.

I think I'd also copy some of the above into the changelog--e.g. it
might be useful to document that this can manifest as a stray reference
cuont on a mount.

--b.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
  2018-11-28 23:35 ` J. Bruce Fields
@ 2018-11-29  5:35   ` NeilBrown
  2018-12-04 20:42     ` J. Bruce Fields
  2019-04-04 22:21     ` NeilBrown
  0 siblings, 2 replies; 7+ messages in thread
From: NeilBrown @ 2018-11-29  5:35 UTC (permalink / raw)
  To: J. Bruce Fields, Vasily Averin
  Cc: Jeff Layton, linux-nfs, David S. Miller, Pavel Tikhomirov

[-- Attachment #1: Type: text/plain, Size: 2726 bytes --]

On Wed, Nov 28 2018, J. Bruce Fields wrote:

> On Wed, Nov 28, 2018 at 11:45:46AM +0300, Vasily Averin wrote:
>> Dear all, we have found memory leak on OpenVz7 node and believe it
>> affects mainline too.
>> 
>> sunrpc_cache_lookup() removes exprired cache_head from hash, however
>> if it waits for reply on submitted cache_request both of them can leak
>> forever, nobody cleans unhashed cache_heads.
>> 
>> Originally we had claim on busy loop device of stopped container, that
>> had executed nfs server inside.  Device was kept by mount that was
>> detached from already destroyed mount namespace.  By using crash
>> search we have found some structure with path struct related to our
>> mount.  Finally we have found that it was alive svc_export struct used
>> by to alive cache_request, however both of them pointed to already
>> freed cache_detail.
>> 
>> We decided that cache_detail was correctly freed during destroy of net
>> namespace, however svc_export with taken path struct, cache_request
>> and some other structures seems was leaked forever.
>> 
>> This could  happen only if cache_head of svc_export was removed from
>> hash on cache_detail before its destroy. Finally we have found that it
>> could happen when sunrpc_cache_lookup() removes expired cache_head
>> from hash.
>> 
>> Usually it works correctly and cache_put(freeme) frees expired
>> cache_head.  However in our case cache_head have an extra reference
>> counter from stalled cache_request.  Becasue of cache_head was removed
>> from hash of cache_detail it cannot be found in cache_clean() and its
>> cache_request cannot be freed in cache_dequeue(). Memory leaks
>> forever, exactly like we observed.
>> 
>> After may attempts we have reproduced this situation on OpenVz7
>> kernel, however our reproducer is quite long and complex.
>> Unfortunately we still did not reproduced this problem on mainline
>> kernel and did not validated the patch yet.
>> 
>> It would be great if someone advised us some simple way to trigger
>> described scenario.
>
> I think you should be able to produce hung upcalls by flushing the cache
> (exportfs -f), then stopping mountd, then trying to access the
> filesystem from a client.  Does that help?
>
>> We are not sure that our patch is correct, please let us know if our
>> analyze missed something.
>
> It looks OK to me, but it would be helpful to have Neil's review too.

Yes, it makes sense to me.
 Reviewed-by: NeilBrown <neilb@suse.com>

Thanks,
NeilBrown


>
> I think I'd also copy some of the above into the changelog--e.g. it
> might be useful to document that this can manifest as a stray reference
> cuont on a mount.
>
> --b.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
  2018-11-29  5:35   ` NeilBrown
@ 2018-12-04 20:42     ` J. Bruce Fields
  2019-04-04 22:21     ` NeilBrown
  1 sibling, 0 replies; 7+ messages in thread
From: J. Bruce Fields @ 2018-12-04 20:42 UTC (permalink / raw)
  To: NeilBrown
  Cc: Vasily Averin, Jeff Layton, linux-nfs, David S. Miller, Pavel Tikhomirov

On Thu, Nov 29, 2018 at 04:35:12PM +1100, NeilBrown wrote:
> On Wed, Nov 28 2018, J. Bruce Fields wrote:
> 
> > On Wed, Nov 28, 2018 at 11:45:46AM +0300, Vasily Averin wrote:
> >> Dear all, we have found memory leak on OpenVz7 node and believe it
> >> affects mainline too.
> >> 
> >> sunrpc_cache_lookup() removes exprired cache_head from hash, however
> >> if it waits for reply on submitted cache_request both of them can leak
> >> forever, nobody cleans unhashed cache_heads.
> >> 
> >> Originally we had claim on busy loop device of stopped container, that
> >> had executed nfs server inside.  Device was kept by mount that was
> >> detached from already destroyed mount namespace.  By using crash
> >> search we have found some structure with path struct related to our
> >> mount.  Finally we have found that it was alive svc_export struct used
> >> by to alive cache_request, however both of them pointed to already
> >> freed cache_detail.
> >> 
> >> We decided that cache_detail was correctly freed during destroy of net
> >> namespace, however svc_export with taken path struct, cache_request
> >> and some other structures seems was leaked forever.
> >> 
> >> This could  happen only if cache_head of svc_export was removed from
> >> hash on cache_detail before its destroy. Finally we have found that it
> >> could happen when sunrpc_cache_lookup() removes expired cache_head
> >> from hash.
> >> 
> >> Usually it works correctly and cache_put(freeme) frees expired
> >> cache_head.  However in our case cache_head have an extra reference
> >> counter from stalled cache_request.  Becasue of cache_head was removed
> >> from hash of cache_detail it cannot be found in cache_clean() and its
> >> cache_request cannot be freed in cache_dequeue(). Memory leaks
> >> forever, exactly like we observed.
> >> 
> >> After may attempts we have reproduced this situation on OpenVz7
> >> kernel, however our reproducer is quite long and complex.
> >> Unfortunately we still did not reproduced this problem on mainline
> >> kernel and did not validated the patch yet.
> >> 
> >> It would be great if someone advised us some simple way to trigger
> >> described scenario.
> >
> > I think you should be able to produce hung upcalls by flushing the cache
> > (exportfs -f), then stopping mountd, then trying to access the
> > filesystem from a client.  Does that help?
> >
> >> We are not sure that our patch is correct, please let us know if our
> >> analyze missed something.
> >
> > It looks OK to me, but it would be helpful to have Neil's review too.
> 
> Yes, it makes sense to me.
>  Reviewed-by: NeilBrown <neilb@suse.com>

OK, applied for 4.21.--b.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
  2018-11-29  5:35   ` NeilBrown
  2018-12-04 20:42     ` J. Bruce Fields
@ 2019-04-04 22:21     ` NeilBrown
  2019-04-05  0:34       ` NeilBrown
  1 sibling, 1 reply; 7+ messages in thread
From: NeilBrown @ 2019-04-04 22:21 UTC (permalink / raw)
  To: J. Bruce Fields, Vasily Averin; +Cc: Jeff Layton, linux-nfs, Pavel Tikhomirov

[-- Attachment #1: Type: text/plain, Size: 4364 bytes --]

On Thu, Nov 29 2018, NeilBrown wrote:

> On Wed, Nov 28 2018, J. Bruce Fields wrote:
>
>> On Wed, Nov 28, 2018 at 11:45:46AM +0300, Vasily Averin wrote:
>>> Dear all, we have found memory leak on OpenVz7 node and believe it
>>> affects mainline too.
>>> 
>>> sunrpc_cache_lookup() removes exprired cache_head from hash, however
>>> if it waits for reply on submitted cache_request both of them can leak
>>> forever, nobody cleans unhashed cache_heads.
>>> 
>>> Originally we had claim on busy loop device of stopped container, that
>>> had executed nfs server inside.  Device was kept by mount that was
>>> detached from already destroyed mount namespace.  By using crash
>>> search we have found some structure with path struct related to our
>>> mount.  Finally we have found that it was alive svc_export struct used
>>> by to alive cache_request, however both of them pointed to already
>>> freed cache_detail.
>>> 
>>> We decided that cache_detail was correctly freed during destroy of net
>>> namespace, however svc_export with taken path struct, cache_request
>>> and some other structures seems was leaked forever.
>>> 
>>> This could  happen only if cache_head of svc_export was removed from
>>> hash on cache_detail before its destroy. Finally we have found that it
>>> could happen when sunrpc_cache_lookup() removes expired cache_head
>>> from hash.
>>> 
>>> Usually it works correctly and cache_put(freeme) frees expired
>>> cache_head.  However in our case cache_head have an extra reference
>>> counter from stalled cache_request.  Becasue of cache_head was removed
>>> from hash of cache_detail it cannot be found in cache_clean() and its
>>> cache_request cannot be freed in cache_dequeue(). Memory leaks
>>> forever, exactly like we observed.
>>> 
>>> After may attempts we have reproduced this situation on OpenVz7
>>> kernel, however our reproducer is quite long and complex.
>>> Unfortunately we still did not reproduced this problem on mainline
>>> kernel and did not validated the patch yet.
>>> 
>>> It would be great if someone advised us some simple way to trigger
>>> described scenario.
>>
>> I think you should be able to produce hung upcalls by flushing the cache
>> (exportfs -f), then stopping mountd, then trying to access the
>> filesystem from a client.  Does that help?
>>
>>> We are not sure that our patch is correct, please let us know if our
>>> analyze missed something.
>>
>> It looks OK to me, but it would be helpful to have Neil's review too.
>
> Yes, it makes sense to me.
>  Reviewed-by: NeilBrown <neilb@suse.com>

Unfortunately I was wrong.  See below.

NeilBrown

From: NeilBrown <neilb@suse.com>
Subject: [PATCH] sunrpc: don't mark uninitialised items as VALID.

A recent commit added a call to cache_fresh_locked()
when an expired item was found.
The call sets the CACHE_VALID flag, so it is important
that the item actually is valid.
There are two ways it could be valid:
1/ If ->update has been called to fill in relevant content
2/ If CACHE_NEGATIVE is set, to say that content doesn't exist.

An expired item that is waiting for an update will be neither.
Setting CACHE_VALID will mean that a subsequent call to cache_put()
will be likely to dereference uninitialised pointers.

So we must make sure the item is valid, and we already have code to do
that in try_to_negate_entry().  This takes the hash lock and so cannot
be used directly, so take out the two lines that we need and use them.

Now cache_fresh_locked() is certain to be called only on
a valid item.

Cc: stable@kernel.org # 2.6.35
Fixes: 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued request")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 net/sunrpc/cache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 12bb23b8e0c5..f5f8ce8c3443 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -105,6 +105,8 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail,
 			if (cache_is_expired(detail, tmp)) {
 				hlist_del_init_rcu(&tmp->cache_list);
 				detail->entries --;
+				if (cache_is_valid(tmp) == -EAGAIN)
+					set_bit(CACHE_NEGATIVE, &h->flags);
 				cache_fresh_locked(tmp, 0, detail);
 				freeme = tmp;
 				break;
-- 
2.14.0.rc0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
  2019-04-04 22:21     ` NeilBrown
@ 2019-04-05  0:34       ` NeilBrown
  2019-04-05 23:52         ` J. Bruce Fields
  0 siblings, 1 reply; 7+ messages in thread
From: NeilBrown @ 2019-04-05  0:34 UTC (permalink / raw)
  To: J. Bruce Fields, Vasily Averin; +Cc: Jeff Layton, linux-nfs, Pavel Tikhomirov

[-- Attachment #1: Type: text/plain, Size: 2305 bytes --]

On Fri, Apr 05 2019, NeilBrown wrote:
>
> Unfortunately I was wrong.  See below.
>

Arrghh.. Ignore that patch.  Use this one.

NeilBrown

From b7a8dced1a34869710a5d68a64e8bd37bf2ab426 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Fri, 5 Apr 2019 09:04:41 +1100
Subject: [PATCH] sunrpc: don't mark uninitialised items as VALID.

A recent commit added a call to cache_fresh_locked()
when an expired item was found.
The call sets the CACHE_VALID flag, so it is important
that the item actually is valid.
There are two ways it could be valid:
1/ If ->update has been called to fill in relevant content
2/ if CACHE_NEGATIVE is set, to say that content doesn't exist.

An expired item that is waiting for an update will be neither.
Setting CACHE_VALID will mean that a subsequent call to cache_put()
will be likely to dereference uninitialised pointers.

So we must make sure the item is valid, and we already have code to do
that in try_to_negate_entry().  This takes the hash lock and so cannot
be used directly, so take out the two lines that we need and use them.

Now cache_fresh_locked() is certain to be called only on
a valid item.

Cc: stable@kernel.org # 2.6.35
Fixes: 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued request")
Signed-off-by: NeilBrown <neilb@suse.com>
---
 net/sunrpc/cache.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 12bb23b8e0c5..261131dfa1f1 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -54,6 +54,7 @@ static void cache_init(struct cache_head *h, struct cache_detail *detail)
 	h->last_refresh = now;
 }
 
+static inline int cache_is_valid(struct cache_head *h);
 static void cache_fresh_locked(struct cache_head *head, time_t expiry,
 				struct cache_detail *detail);
 static void cache_fresh_unlocked(struct cache_head *head,
@@ -105,6 +106,8 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail,
 			if (cache_is_expired(detail, tmp)) {
 				hlist_del_init_rcu(&tmp->cache_list);
 				detail->entries --;
+				if (cache_is_valid(tmp) == -EAGAIN)
+					set_bit(CACHE_NEGATIVE, &tmp->flags);
 				cache_fresh_locked(tmp, 0, detail);
 				freeme = tmp;
 				break;
-- 
2.14.0.rc0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/1] cache_head leak in sunrpc_cache_lookup()
  2019-04-05  0:34       ` NeilBrown
@ 2019-04-05 23:52         ` J. Bruce Fields
  0 siblings, 0 replies; 7+ messages in thread
From: J. Bruce Fields @ 2019-04-05 23:52 UTC (permalink / raw)
  To: NeilBrown; +Cc: Vasily Averin, Jeff Layton, linux-nfs, Pavel Tikhomirov

On Fri, Apr 05, 2019 at 11:34:40AM +1100, NeilBrown wrote:
> On Fri, Apr 05 2019, NeilBrown wrote:
> >
> > Unfortunately I was wrong.  See below.
> >
> 
> Arrghh.. Ignore that patch.  Use this one.

Took me a moment to spot the difference.  OK!  Queuing that up for
5.1....

--b.

> 
> NeilBrown
> 
> From b7a8dced1a34869710a5d68a64e8bd37bf2ab426 Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.com>
> Date: Fri, 5 Apr 2019 09:04:41 +1100
> Subject: [PATCH] sunrpc: don't mark uninitialised items as VALID.
> 
> A recent commit added a call to cache_fresh_locked()
> when an expired item was found.
> The call sets the CACHE_VALID flag, so it is important
> that the item actually is valid.
> There are two ways it could be valid:
> 1/ If ->update has been called to fill in relevant content
> 2/ if CACHE_NEGATIVE is set, to say that content doesn't exist.
> 
> An expired item that is waiting for an update will be neither.
> Setting CACHE_VALID will mean that a subsequent call to cache_put()
> will be likely to dereference uninitialised pointers.
> 
> So we must make sure the item is valid, and we already have code to do
> that in try_to_negate_entry().  This takes the hash lock and so cannot
> be used directly, so take out the two lines that we need and use them.
> 
> Now cache_fresh_locked() is certain to be called only on
> a valid item.
> 
> Cc: stable@kernel.org # 2.6.35
> Fixes: 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued request")
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  net/sunrpc/cache.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 12bb23b8e0c5..261131dfa1f1 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -54,6 +54,7 @@ static void cache_init(struct cache_head *h, struct cache_detail *detail)
>  	h->last_refresh = now;
>  }
>  
> +static inline int cache_is_valid(struct cache_head *h);
>  static void cache_fresh_locked(struct cache_head *head, time_t expiry,
>  				struct cache_detail *detail);
>  static void cache_fresh_unlocked(struct cache_head *head,
> @@ -105,6 +106,8 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail,
>  			if (cache_is_expired(detail, tmp)) {
>  				hlist_del_init_rcu(&tmp->cache_list);
>  				detail->entries --;
> +				if (cache_is_valid(tmp) == -EAGAIN)
> +					set_bit(CACHE_NEGATIVE, &tmp->flags);
>  				cache_fresh_locked(tmp, 0, detail);
>  				freeme = tmp;
>  				break;
> -- 
> 2.14.0.rc0.dirty
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-04-05 23:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-28  8:45 [PATCH 0/1] cache_head leak in sunrpc_cache_lookup() Vasily Averin
2018-11-28 23:35 ` J. Bruce Fields
2018-11-29  5:35   ` NeilBrown
2018-12-04 20:42     ` J. Bruce Fields
2019-04-04 22:21     ` NeilBrown
2019-04-05  0:34       ` NeilBrown
2019-04-05 23:52         ` J. Bruce Fields

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.