linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] cifs: remove unnecessary copies of tcon->crfid.fid
@ 2021-04-13 23:25 Muhammad Usama Anjum
  2021-04-14 12:00 ` Aurélien Aptel
  0 siblings, 1 reply; 3+ messages in thread
From: Muhammad Usama Anjum @ 2021-04-13 23:25 UTC (permalink / raw)
  To: Steve French, Ronnie Sahlberg,
	open list:COMMON INTERNET FILE SYSTEM CLIENT (CIFS),
	moderated list:COMMON INTERNET FILE SYSTEM CLIENT (CIFS),
	open list
  Cc: musamaanjum, kernel-janitors, dan.carpenter, colin.king

pfid is being set to tcon->crfid.fid and they are copied in each other
multiple times. Remove the memcopy between same pointers.

Addresses-Coverity: ("Overlapped copy")
Fixes: 9e81e8ff74b9 ("cifs: return cached_fid from open_shroot")
Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
---
I'm not sure why refcount was being incremented here. This file has been
evoloved so much. Any ideas?

fs/cifs/smb2ops.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 61214b23c57f..6fa35003dcfe 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -847,14 +847,6 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 			.volatile_fid = pfid->volatile_fid,
 		};
 
-		/*
-		 * caller expects this func to set pfid to a valid
-		 * cached root, so we copy the existing one and get a
-		 * reference.
-		 */
-		memcpy(pfid, tcon->crfid.fid, sizeof(*pfid));
-		kref_get(&tcon->crfid.refcount);
-
 		mutex_unlock(&tcon->crfid.fid_mutex);
 
 		if (rc == 0) {
@@ -885,7 +877,6 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
 	oparms.fid->mid = le64_to_cpu(o_rsp->sync_hdr.MessageId);
 #endif /* CIFS_DEBUG2 */
 
-	memcpy(tcon->crfid.fid, pfid, sizeof(struct cifs_fid));
 	tcon->crfid.tcon = tcon;
 	tcon->crfid.is_valid = true;
 	tcon->crfid.dentry = dentry;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] cifs: remove unnecessary copies of tcon->crfid.fid
  2021-04-13 23:25 [PATCH] cifs: remove unnecessary copies of tcon->crfid.fid Muhammad Usama Anjum
@ 2021-04-14 12:00 ` Aurélien Aptel
  2021-04-15 14:52   ` Muhammad Usama Anjum
  0 siblings, 1 reply; 3+ messages in thread
From: Aurélien Aptel @ 2021-04-14 12:00 UTC (permalink / raw)
  To: Muhammad Usama Anjum, Steve French, Ronnie Sahlberg,
	open list:COMMON INTERNET FILE SYSTEM CLIENT (CIFS),
	moderated list:COMMON INTERNET FILE SYSTEM CLIENT (CIFS),
	open list
  Cc: musamaanjum, kernel-janitors, dan.carpenter, colin.king

Muhammad Usama Anjum <musamaanjum@gmail.com> writes:
> pfid is being set to tcon->crfid.fid and they are copied in each other
> multiple times. Remove the memcopy between same pointers.
>
> Addresses-Coverity: ("Overlapped copy")
> Fixes: 9e81e8ff74b9 ("cifs: return cached_fid from open_shroot")
> Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
> ---
> I'm not sure why refcount was being incremented here. This file has been
> evoloved so much. Any ideas?

The fact that pfid is the same as the cache is very weird... Probably
due to recent change.

This function returns a cached dir entry for the root of the share which
can be accessed/shared by multiple task.

The basic logic is:

    open_cached_dir(result) {
    
        if (cache.is_valid) {
            memcpy(result, cache->fid)
            return
        }
    
        // not cached, send open() to server
        dir tmp;
        smb2_open(&tmp...)
        memcpy(cache->fid, tmp)
        cache.is_valid = true
        memcpy(result, cache->fid)
        return
    }

My understanding of this is that all file/dir entry have a refcount so
to prevent callers from releasing the cached entry when they put it we
need to bump it before returning.

    open_cached_dir(result) {
    
        if (cache.is_valid) {
            kref_get(cache)
            memcpy(result, cache->fid)
            return
        }
    
        // not cached, send open() to server
        dir tmp;
        smb2_open(&tmp...)
        memcpy(cache->fid, tmp)
        cache.is_valid = true

        kref_init(cache)
        kref_get(cache)

        memcpy(result, cache->fid)
        return
    }

Now this function can be called from multiple thread, and there are
couple of critical sections.


    process 1                process 2
    -------------------     -----------------
    if (cache.is_valid)     
    => false continue         
    smb2_open(...)            
    
                            if (cache.is_valid)     
                            => false continue         
                            smb2_open(...)            
    
    cache.is_valid = true

In that exemple, we ended up opening twice and overwriting the cache.
So we need to add locks to avoid this race condition.

    open_cached_dir(result) {
      	mutex_lock(cache)
                  
        if (cache.is_valid) {
            kref_get(cache)
            memcpy(result, cache->fid)
            mutex_unlock(cache)
            return cache
        }
    
        // not cached, send open() to server
        dir tmp;
        smb2_open(&tmp...)
        memcpy(cache->fid, tmp)
        cache.is_valid = true

        kref_init(cache)
        kref_get(cache)

        mutex_unlock(cache)

        memcpy(result, cache->fid)
        return
    }

But now we get reports of deadlocks. Turns out smb2_open() in some code
path (if it ends up triggering reconnect) will take the lock
again. Since linux mutex are not reentrant this will block forever
(deadlock). So we need to release for the smb2_open() call.

    open_cached_dir(result) {
      	mutex_lock(cache);
                  
        if (cache.is_valid) {
            kref_get(cache)
            memcpy(result, cache->fid)
            return cache
        }

        // release lock for open
        mutex_unlock(cache)
        // not cached, send open() to server
        dir tmp;
        smb2_open(&tmp...)
        // take it back
        mutex_lock(cache)

        // now we need to check is_valid again since it could have been
        // changed in that small unlocked time frame by a concurrent process
        if (cache.is_valid) {
            // a concurrent call to this func was done already
            // return the existing one to caller
            memcpy(result, cache->fid)
            kref_get(cache)
            mutex_unlock(cache)
            // close the tmp duplicate one we opened
            smb2_close(tmp)
            return
        }

        memcpy(result, cache->fid)
        kref_init(cache)
        kref_get(cache)

        mutex_unlock(cache)

        return
    }

That ^^^ is the pseudo-code of what the function *should* be doing. We
need to go over it and see what it is doing different now. I think it's
likely when we made the code to be used for caching any dir
something diverged wrong.

Cheers,
-- 
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97  8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] cifs: remove unnecessary copies of tcon->crfid.fid
  2021-04-14 12:00 ` Aurélien Aptel
@ 2021-04-15 14:52   ` Muhammad Usama Anjum
  0 siblings, 0 replies; 3+ messages in thread
From: Muhammad Usama Anjum @ 2021-04-15 14:52 UTC (permalink / raw)
  To: Aurélien Aptel, Steve French, Ronnie Sahlberg,
	open list:COMMON INTERNET FILE SYSTEM CLIENT (CIFS),
	moderated list:COMMON INTERNET FILE SYSTEM CLIENT (CIFS),
	open list
  Cc: musamaanjum, kernel-janitors, dan.carpenter, colin.king

On Wed, 2021-04-14 at 14:00 +0200, Aurélien Aptel wrote:
> Muhammad Usama Anjum <musamaanjum@gmail.com> writes:
> > pfid is being set to tcon->crfid.fid and they are copied in each other
> > multiple times. Remove the memcopy between same pointers.
> > 
> > Addresses-Coverity: ("Overlapped copy")
> > Fixes: 9e81e8ff74b9 ("cifs: return cached_fid from open_shroot")
> > Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
> > ---
> > I'm not sure why refcount was being incremented here. This file has been
> > evoloved so much. Any ideas?
> 
> The fact that pfid is the same as the cache is very weird... Probably
> due to recent change.
> 
> This function returns a cached dir entry for the root of the share which
> can be accessed/shared by multiple task.
> 
Aurélien Aptel,

Thank you so much for awesome explanation. The whole function makes
sense now. We need to remove the memcpy calls only. We need to
increment the refcount (it was removed in this patch). I'll send a V2.

Regards,
Usama


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-04-15 14:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 23:25 [PATCH] cifs: remove unnecessary copies of tcon->crfid.fid Muhammad Usama Anjum
2021-04-14 12:00 ` Aurélien Aptel
2021-04-15 14:52   ` Muhammad Usama Anjum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).