All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sunrpc/cache: skip checking the pending cache expired
@ 2017-02-04  5:41 Kinglong Mee
  2017-02-05 10:37 ` Kinglong Mee
  0 siblings, 1 reply; 5+ messages in thread
From: Kinglong Mee @ 2017-02-04  5:41 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs; +Cc: Trond Myklebust, NeilBrown, Kinglong Mee

When the first time pynfs runs after rpc/nfsd startup, always get the warning, 

Got error: Connection closed

Commit 778620364ef5 "sunrpc/cache: make cache flushing more reliable." lets
cache_is_expired() checking expired as, 

return (h->expiry_time < seconds_since_boot()) ||
        (detail->flush_time >= h->last_refresh);

The detail->flush_time is equal to h->last_refresh time, when the cache is
new created and then do the upcall immediately. So that, the cache will be
treated as expired and be cleaned when write_flush().

This patch skips checking the pending cache expired when doing upcall.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 include/linux/sunrpc/cache.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 62a60ee..9961c1f 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
 	kref_put(&h->ref, cd->cache_put);
 }
 
-static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
+static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
 {
+	if (test_bit(CACHE_PENDING, &h->flags))
+		return false;
+
 	return  (h->expiry_time < seconds_since_boot()) ||
 		(detail->flush_time >= h->last_refresh);
 }
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sunrpc/cache: skip checking the pending cache expired
  2017-02-04  5:41 [PATCH] sunrpc/cache: skip checking the pending cache expired Kinglong Mee
@ 2017-02-05 10:37 ` Kinglong Mee
  2017-02-05 12:41   ` Kinglong Mee
  0 siblings, 1 reply; 5+ messages in thread
From: Kinglong Mee @ 2017-02-05 10:37 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs; +Cc: Trond Myklebust, NeilBrown, Kinglong Mee

On 2/4/2017 13:41, Kinglong Mee wrote:
> When the first time pynfs runs after rpc/nfsd startup, always get the warning, 
> 
> Got error: Connection closed
> 
> Commit 778620364ef5 "sunrpc/cache: make cache flushing more reliable." lets
> cache_is_expired() checking expired as, 
> 
> return (h->expiry_time < seconds_since_boot()) ||
>         (detail->flush_time >= h->last_refresh);
> 
> The detail->flush_time is equal to h->last_refresh time, when the cache is
> new created and then do the upcall immediately. So that, the cache will be
> treated as expired and be cleaned when write_flush().
> 
> This patch skips checking the pending cache expired when doing upcall.
> 
> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
> ---
>  include/linux/sunrpc/cache.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index 62a60ee..9961c1f 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
>  	kref_put(&h->ref, cd->cache_put);
>  }
>  
> -static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
> +static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>  {
> +	if (test_bit(CACHE_PENDING, &h->flags))
> +		return false;
> +

Sorry, with this patch there is a bug exist,
When a pending cache exist, sunrpc_destroy_cache_detail() cann't
cleanup the cache_detail, but user always free it later, 
that will cause a NULL reference.

Please ignore this patch, maybe another patch for the above problem
 will be sent.

thanks,
Kinglong Mee

>  	return  (h->expiry_time < seconds_since_boot()) ||
>  		(detail->flush_time >= h->last_refresh);
>  }
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sunrpc/cache: skip checking the pending cache expired
  2017-02-05 10:37 ` Kinglong Mee
@ 2017-02-05 12:41   ` Kinglong Mee
  2017-02-05 22:28     ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Kinglong Mee @ 2017-02-05 12:41 UTC (permalink / raw)
  To: J. Bruce Fields, linux-nfs; +Cc: Trond Myklebust, NeilBrown, Kinglong Mee

On 2/5/2017 18:37, Kinglong Mee wrote:
> On 2/4/2017 13:41, Kinglong Mee wrote:
>> When the first time pynfs runs after rpc/nfsd startup, always get the warning, 
>>
>> Got error: Connection closed
>>
>> Commit 778620364ef5 "sunrpc/cache: make cache flushing more reliable." lets
>> cache_is_expired() checking expired as, 
>>
>> return (h->expiry_time < seconds_since_boot()) ||
>>         (detail->flush_time >= h->last_refresh);
>>
>> The detail->flush_time is equal to h->last_refresh time, when the cache is
>> new created and then do the upcall immediately. So that, the cache will be
>> treated as expired and be cleaned when write_flush().

I found the problem is caused by, 
1. A new startup of nfsd, rpc.mountd, etc,
2. A rpc request from client (pynfs test, or normal mounting),
3. An ip_map cache is created but invalid, so upcall to rpc.mountd, 
4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
   do auth_reload(), and check_useipaddr(),
5. For the first time, old_use_ipaddr = -1, that cause a cache_flush,
6. The ip_map cache will be treat as expired and clean,
7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
   and updated, the cache_check of old ip_map(doing the upcall) will
   return -ETIMEDOUT.
8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db05
   "svcauth_gss: Close connection when dropping an incoming message"

Although it is exist only one time, but I don't like the noise.
Is the cache_flush for old_use_ipaddr = -1 necessary in step 5?
I have no idea about how to fix it, any suggests?

thanks,
Kinglong Mee

>>
>> This patch skips checking the pending cache expired when doing upcall.
>>
>> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
>> ---
>>  include/linux/sunrpc/cache.h | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
>> index 62a60ee..9961c1f 100644
>> --- a/include/linux/sunrpc/cache.h
>> +++ b/include/linux/sunrpc/cache.h
>> @@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
>>  	kref_put(&h->ref, cd->cache_put);
>>  }
>>  
>> -static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>> +static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>  {
>> +	if (test_bit(CACHE_PENDING, &h->flags))
>> +		return false;
>> +
> 
> Sorry, with this patch there is a bug exist,
> When a pending cache exist, sunrpc_destroy_cache_detail() cann't
> cleanup the cache_detail, but user always free it later, 
> that will cause a NULL reference.
> 
> Please ignore this patch, maybe another patch for the above problem
>  will be sent.
> 
> thanks,
> Kinglong Mee
> 
>>  	return  (h->expiry_time < seconds_since_boot()) ||
>>  		(detail->flush_time >= h->last_refresh);
>>  }
>>
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sunrpc/cache: skip checking the pending cache expired
  2017-02-05 12:41   ` Kinglong Mee
@ 2017-02-05 22:28     ` NeilBrown
  2017-02-06  0:59       ` Kinglong Mee
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2017-02-05 22:28 UTC (permalink / raw)
  To: Kinglong Mee, J. Bruce Fields, linux-nfs; +Cc: Trond Myklebust, Kinglong Mee

[-- Attachment #1: Type: text/plain, Size: 4601 bytes --]

On Sun, Feb 05 2017, Kinglong Mee wrote:

> On 2/5/2017 18:37, Kinglong Mee wrote:
>> On 2/4/2017 13:41, Kinglong Mee wrote:
>>> When the first time pynfs runs after rpc/nfsd startup, always get the warning, 
>>>
>>> Got error: Connection closed
>>>
>>> Commit 778620364ef5 "sunrpc/cache: make cache flushing more reliable." lets
>>> cache_is_expired() checking expired as, 
>>>
>>> return (h->expiry_time < seconds_since_boot()) ||
>>>         (detail->flush_time >= h->last_refresh);
>>>
>>> The detail->flush_time is equal to h->last_refresh time, when the cache is
>>> new created and then do the upcall immediately. So that, the cache will be
>>> treated as expired and be cleaned when write_flush().
>
> I found the problem is caused by, 
> 1. A new startup of nfsd, rpc.mountd, etc,
> 2. A rpc request from client (pynfs test, or normal mounting),
> 3. An ip_map cache is created but invalid, so upcall to rpc.mountd, 
> 4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
>    do auth_reload(), and check_useipaddr(),
> 5. For the first time, old_use_ipaddr = -1, that cause a cache_flush,
> 6. The ip_map cache will be treat as expired and clean,
> 7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
>    and updated, the cache_check of old ip_map(doing the upcall) will
>    return -ETIMEDOUT.
> 8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db05
>    "svcauth_gss: Close connection when dropping an incoming message"
>
> Although it is exist only one time, but I don't like the noise.
> Is the cache_flush for old_use_ipaddr = -1 necessary in step 5?
> I have no idea about how to fix it, any suggests?
>

Hi,
 I think that your idea that CACHE_PENDING cache entries shouldn't be
 treated as expired make some sense.
 I would actually focus on CACHE_VALID.  If CACHE_VALID is not set, then
 there is no data in the cache item, so there is nothing to expire.
 So it would be nice if cache items that don't have CACHE_VALID are
 never treated as expired.
 However, as you discovered, that causes a problem for cache_purge().
 It purges a cache by setting the ->flush_time, and calling
 cache_clean(), which only removes expired entries.  For that to work,
 Even non-CACHE_VALID entries need to be seen as expired.

 I think the best thing to do is to write a stand-alone cache_purge().
 Rather than using cache_flush(), explicitly free every item on every
 hash chain.

 Something like this?

static int cache_purge(struct cache_detail *cd)
{
	int rv = 0;
	struct list_head *next;
	int i;

	write_lock(&cd->hash_lock);

	for (i = 0; i < cd->hash_size; i++) {
		struct cache_head *ch = NULL;
		struct cache_detail *d;
		struct hlist_head *head;
		struct hlist_node *tmp;

		head = &cd->hash_table[i];
		hlist_for_each_entry_safe(ch, tmp, head, cache_list) {
			hlist_del_init(&ch->cache_list);
			cd->entries--;

			set_bit(CACHE_CLEANED, &ch->flags);
			write_unlock(&cd->hash_lock);
			cache_fresh_unlocked(ch, d);
			cache_put(ch, d);
			write_lock(&cd->hash_lock);
		}
	}
	write_unlock(&cd->hash_lock);

	return rv;
}

Thanks,
NeilBrown


> thanks,
> Kinglong Mee
>
>>>
>>> This patch skips checking the pending cache expired when doing upcall.
>>>
>>> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
>>> ---
>>>  include/linux/sunrpc/cache.h | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
>>> index 62a60ee..9961c1f 100644
>>> --- a/include/linux/sunrpc/cache.h
>>> +++ b/include/linux/sunrpc/cache.h
>>> @@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
>>>  	kref_put(&h->ref, cd->cache_put);
>>>  }
>>>  
>>> -static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>> +static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>>  {
>>> +	if (test_bit(CACHE_PENDING, &h->flags))
>>> +		return false;
>>> +
>> 
>> Sorry, with this patch there is a bug exist,
>> When a pending cache exist, sunrpc_destroy_cache_detail() cann't
>> cleanup the cache_detail, but user always free it later, 
>> that will cause a NULL reference.
>> 
>> Please ignore this patch, maybe another patch for the above problem
>>  will be sent.
>> 
>> thanks,
>> Kinglong Mee
>> 
>>>  	return  (h->expiry_time < seconds_since_boot()) ||
>>>  		(detail->flush_time >= h->last_refresh);
>>>  }
>>>
>> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sunrpc/cache: skip checking the pending cache expired
  2017-02-05 22:28     ` NeilBrown
@ 2017-02-06  0:59       ` Kinglong Mee
  0 siblings, 0 replies; 5+ messages in thread
From: Kinglong Mee @ 2017-02-06  0:59 UTC (permalink / raw)
  To: NeilBrown, J. Bruce Fields, linux-nfs; +Cc: Trond Myklebust, Kinglong Mee

On 2/6/2017 06:28, NeilBrown wrote:
> On Sun, Feb 05 2017, Kinglong Mee wrote:
> 
>> On 2/5/2017 18:37, Kinglong Mee wrote:
>>> On 2/4/2017 13:41, Kinglong Mee wrote:
>>>> When the first time pynfs runs after rpc/nfsd startup, always get the warning, 
>>>>
>>>> Got error: Connection closed
>>>>
>>>> Commit 778620364ef5 "sunrpc/cache: make cache flushing more reliable." lets
>>>> cache_is_expired() checking expired as, 
>>>>
>>>> return (h->expiry_time < seconds_since_boot()) ||
>>>>         (detail->flush_time >= h->last_refresh);
>>>>
>>>> The detail->flush_time is equal to h->last_refresh time, when the cache is
>>>> new created and then do the upcall immediately. So that, the cache will be
>>>> treated as expired and be cleaned when write_flush().
>>
>> I found the problem is caused by, 
>> 1. A new startup of nfsd, rpc.mountd, etc,
>> 2. A rpc request from client (pynfs test, or normal mounting),
>> 3. An ip_map cache is created but invalid, so upcall to rpc.mountd, 
>> 4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
>>    do auth_reload(), and check_useipaddr(),
>> 5. For the first time, old_use_ipaddr = -1, that cause a cache_flush,
>> 6. The ip_map cache will be treat as expired and clean,
>> 7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
>>    and updated, the cache_check of old ip_map(doing the upcall) will
>>    return -ETIMEDOUT.
>> 8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db05
>>    "svcauth_gss: Close connection when dropping an incoming message"
>>
>> Although it is exist only one time, but I don't like the noise.
>> Is the cache_flush for old_use_ipaddr = -1 necessary in step 5?
>> I have no idea about how to fix it, any suggests?
>>
> 
> Hi,
>  I think that your idea that CACHE_PENDING cache entries shouldn't be
>  treated as expired make some sense.
>  I would actually focus on CACHE_VALID.  If CACHE_VALID is not set, then
>  there is no data in the cache item, so there is nothing to expire.
>  So it would be nice if cache items that don't have CACHE_VALID are
>  never treated as expired.
>  However, as you discovered, that causes a problem for cache_purge().
>  It purges a cache by setting the ->flush_time, and calling
>  cache_clean(), which only removes expired entries.  For that to work,
>  Even non-CACHE_VALID entries need to be seen as expired.

rpc.mountd do a write_flush() that cause the cache_flush,
but, it's the same as cache_purge() which setting the ->flush_time.

> 
>  I think the best thing to do is to write a stand-alone cache_purge().
>  Rather than using cache_flush(), explicitly free every item on every
>  hash chain.

The stand-alone cache_purge() as following make sure cleanup all
cache entries it the cache_detail.
It's better than my another patch for the NULL reference bugfix.

Don't treat non-CACHE_VALID entries as expired make sense, 
it's needed also.

thanks,
Kinglong Mee

> 
>  Something like this?
> 
> static int cache_purge(struct cache_detail *cd)
> {
> 	int rv = 0;
> 	struct list_head *next;
> 	int i;
> 
> 	write_lock(&cd->hash_lock);
> 
> 	for (i = 0; i < cd->hash_size; i++) {
> 		struct cache_head *ch = NULL;
> 		struct cache_detail *d;
> 		struct hlist_head *head;
> 		struct hlist_node *tmp;
> 
> 		head = &cd->hash_table[i];
> 		hlist_for_each_entry_safe(ch, tmp, head, cache_list) {
> 			hlist_del_init(&ch->cache_list);
> 			cd->entries--;
> 
> 			set_bit(CACHE_CLEANED, &ch->flags);
> 			write_unlock(&cd->hash_lock);
> 			cache_fresh_unlocked(ch, d);
> 			cache_put(ch, d);
> 			write_lock(&cd->hash_lock);
> 		}
> 	}
> 	write_unlock(&cd->hash_lock);
> 
> 	return rv;
> }
> 
> Thanks,
> NeilBrown
> 
> 
>> thanks,
>> Kinglong Mee
>>
>>>>
>>>> This patch skips checking the pending cache expired when doing upcall.
>>>>
>>>> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
>>>> ---
>>>>  include/linux/sunrpc/cache.h | 5 ++++-
>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
>>>> index 62a60ee..9961c1f 100644
>>>> --- a/include/linux/sunrpc/cache.h
>>>> +++ b/include/linux/sunrpc/cache.h
>>>> @@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
>>>>  	kref_put(&h->ref, cd->cache_put);
>>>>  }
>>>>  
>>>> -static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>>> +static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
>>>>  {
>>>> +	if (test_bit(CACHE_PENDING, &h->flags))
>>>> +		return false;
>>>> +
>>>
>>> Sorry, with this patch there is a bug exist,
>>> When a pending cache exist, sunrpc_destroy_cache_detail() cann't
>>> cleanup the cache_detail, but user always free it later, 
>>> that will cause a NULL reference.
>>>
>>> Please ignore this patch, maybe another patch for the above problem
>>>  will be sent.
>>>
>>> thanks,
>>> Kinglong Mee
>>>
>>>>  	return  (h->expiry_time < seconds_since_boot()) ||
>>>>  		(detail->flush_time >= h->last_refresh);
>>>>  }
>>>>
>>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-02-06  1:00 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-04  5:41 [PATCH] sunrpc/cache: skip checking the pending cache expired Kinglong Mee
2017-02-05 10:37 ` Kinglong Mee
2017-02-05 12:41   ` Kinglong Mee
2017-02-05 22:28     ` NeilBrown
2017-02-06  0:59       ` Kinglong Mee

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.