On 11 Mar 2013 17:13:41 +0100 Bodo Stroesser <bstroesser@ts.fujitsu.com>
wrote:

> Hi,
> 
> AFAICS, there is one more race in the RPC cache.
> 
> The problem showed up for the "auth.unix.ip" cache, which
> has a reader.
> 
> If a server thread tries to find a cache entry, it first
> does a lookup (or calls ip_map_cached_get() in this specific
> case). Then, it calls cache_check() for this entry.
> 
> If the reader updates the cache entry just between the
> thread's lookup and cache_check() execution, cache_check()
> will do an upcall for this cache entry. This is because
> sunrpc_cache_update() calls cache_fresh_locked(old, 0),
> which sets expiry_time to 0.
> 
> Unfortunately, the reply to the upcall will not dequeue
> the queued cache_request, as the reply will be assigned to
> the cache entry newly created by the above cache update.
> 
> The result is a growing queue of orphaned cache_request
> structures --> memory leak.
> 
> I found this on a SLES11 SP1 with a backport of the latest
> patches that fix the other RPC races. On this old kernel,
> the problem also leads to svc_drop() being called for the
> affected RPC request (after svc_defer()).
> 
> Best Regards
> Bodo

I don't think this is a real problem.
The periodic call to "cache_clean" should find these orphaned requests and
purge them.  So you could get a short-term memory leak, but it should
correct itself.
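
To make the sequence concrete, here is a toy userspace model of the race as
described above, plus the kind of cleanup I would expect. It is only a sketch,
not kernel code: every name in it (struct entry, queue_upcall, dequeue_for,
clean_expired, simulate_race) is hypothetical, and the rule "the cleaner drops
requests whose entry has expiry_time 0" is an assumption about what the
periodic cleaning does, not a statement of what cache_clean actually
implements.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model: all types and functions here are hypothetical. */
struct entry {
    long expiry_time;           /* 0 = retired by an update */
};

struct request {
    struct entry *item;         /* entry the upcall was queued for */
    struct request *next;
};

struct request *queue;          /* pending upcall requests */
struct entry old_entry, new_entry;

void queue_upcall(struct entry *e)
{
    struct request *r = malloc(sizeof(*r));
    assert(r != NULL);
    r->item = e;
    r->next = queue;
    queue = r;
}

int queue_len(void)
{
    int n = 0;
    for (struct request *r = queue; r; r = r->next)
        n++;
    return n;
}

/* Reply handling: dequeue only requests attached to entry e. */
int dequeue_for(struct entry *e)
{
    int n = 0;
    struct request **pp = &queue;
    while (*pp) {
        struct request *r = *pp;
        if (r->item == e) {
            *pp = r->next;
            free(r);
            n++;
        } else {
            pp = &r->next;
        }
    }
    return n;
}

/* Assumed cleaner: drop requests whose entry has been retired. */
int clean_expired(void)
{
    int n = 0;
    struct request **pp = &queue;
    while (*pp) {
        struct request *r = *pp;
        if (r->item->expiry_time == 0) {
            *pp = r->next;
            free(r);
            n++;
        } else {
            pp = &r->next;
        }
    }
    return n;
}

/* Replays the race once; returns the number of requests left queued. */
int simulate_race(void)
{
    old_entry.expiry_time = 100;
    struct entry *found = &old_entry;   /* 1. server thread looks up the entry */
    new_entry.expiry_time = 100;        /* 2. reader's update creates a new entry */
    old_entry.expiry_time = 0;          /*    and retires the old one */
    if (found->expiry_time == 0)        /* 3. check sees "expired", queues an upcall */
        queue_upcall(found);
    dequeue_for(&new_entry);            /* 4. reply is assigned to the new entry */
    return queue_len();
}
```

Calling simulate_race() leaves one request on the queue in this model, because
the reply is matched against the new entry while the request points at the old
one. A later clean_expired() removes it, which is the self-correcting
behaviour I am assuming.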
Do you agree?

Thanks,
NeilBrown