* [PATCH] Always update the dentry cache with fresh readdir() results
@ 2012-07-05  8:38 Andrew Bartlett
  2012-07-05 10:02 ` Andrew Bartlett
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Bartlett @ 2012-07-05  8:38 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]

When we do a readdir() in CIFS, we are potentially efficiently
collecting a great deal of current, cache-able stat information.

It is important that we always keep the dentry cache current for two
reasons:
 - the information may have changed (within the actime timeout).
 - if we still have a dentry cache value after that timeout, it is quite
expensive (1xRTT per entry) to find out if it was still correct.

This hits folks who are using CIFS over a WAN very badly.  For example
on an emulated 50ms delay I would have ls --color complete in .1
seconds, and a second run take 4.5 seconds, as each stat() (for the
colouring) would create a trans2 query_path_info query for each file,
right after getting the same information in the trans2 find_first2.
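
As a rough illustration of the per-entry cost described above, the
sketch below multiplies the number of cached entries by one round trip
each. The helper name and the figures in the note that follows are
assumptions for illustration only; the message does not state the
directory size or the exact round-trip time.

```c
/*
 * Back-of-the-envelope model, not kernel code: if every stat() on a
 * stale entry needs one blocking round trip, the total wait grows
 * linearly with the number of entries.
 * revalidation_cost_ms() is a hypothetical helper for illustration.
 */
static unsigned long revalidation_cost_ms(unsigned long entries,
					  unsigned long rtt_ms)
{
	return entries * rtt_ms;
}
```

If the emulated 50ms delay applied in each direction (a 100ms round
trip, which is an assumption), revalidation_cost_ms(45, 100) would give
4500, so roughly 45 entries would account for a 4.5 second run.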

This patch implements the simplest approach; I would welcome
correction if there is a better approach than d_drop() and dput().

Tested on 3.4.4-3.cifsrevalidate.fc17.i686 with a 50ms WANem emulated
WAN against Samba 4.0 beta3.

Thanks,

Andrew Bartlett
-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org

[-- Attachment #2: 0001-fs-cifs-Always-recreate-the-cifs-dentry-cache-when-w.patch --]
[-- Type: text/x-patch, Size: 2575 bytes --]

From 4478a902d3606205313eb37a225da22712841bff Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
Date: Thu, 5 Jul 2012 15:48:08 +1000
Subject: [PATCH] fs/cifs: Always recreate the cifs dentry cache when we have
 fresh information

When we do a readdir() in CIFS, we are potentially efficiently
collecting a great deal of current, cache-able stat information.

It is important that we always keep the dentry cache current for two reasons:
 - the information may have changed (within the actime timeout).
 - if we still have a dentry cache value after that timeout, it is quite
   expensive (1xRTT per entry) to find out if it was still correct.

This hits folks who are using CIFS over a WAN very badly.  For example
on an emulated 50ms delay I would have ls --color complete in .1
seconds, and a second run take 4.5 seconds, as each stat() (for the
colouring) would create a trans2 query_path_info query for each file,
right after getting the same information in the trans2 find_first2.

This patch implements the simplest approach, I would welcome a
correction on if there is a better approach than d_drop() and dput().

Tested on 3.4.4-3.cifsrevalidate.fc17.i686 against Samba 4.0 beta3.

Andrew Bartlett

Signed-off-by: Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
---
 fs/cifs/readdir.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
index 0a8224d..f964b40 100644
--- a/fs/cifs/readdir.c
+++ b/fs/cifs/readdir.c
@@ -66,9 +66,13 @@ static inline void dump_cifs_file_struct(struct file *file, char *label)
 #endif /* DEBUG2 */
 
 /*
- * Find the dentry that matches "name". If there isn't one, create one. If it's
- * a negative dentry or the uniqueid changed, then drop it and recreate it.
+ * Find the dentry that matches "name". If there isn't one, create
+ * one.  Otherwise recreate it fresh so we start the timeout for
+ * revalidation again.  Local locks are much faster than the network
+ * ops to revalidate it, particularly as revalidation will be
+ * per-inode, but here we have a whole directory at once!
  */
+
 static struct dentry *
 cifs_readdir_lookup(struct dentry *parent, struct qstr *name,
 		    struct cifs_fattr *fattr)
@@ -86,9 +90,6 @@ cifs_readdir_lookup(struct dentry *parent, struct qstr *name,
 
 	dentry = d_lookup(parent, name);
 	if (dentry) {
-		/* FIXME: check for inode number changes? */
-		if (dentry->d_inode != NULL)
-			return dentry;
 		d_drop(dentry);
 		dput(dentry);
 	}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] Always update the dentry cache with fresh readdir() results
  2012-07-05  8:38 [PATCH] Always update the dentry cache with fresh readdir() results Andrew Bartlett
@ 2012-07-05 10:02 ` Andrew Bartlett
  2012-07-05 11:24   ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Bartlett @ 2012-07-05 10:02 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA; +Cc: Bill Robertson, Dion Edwards

(CCing in the original reporter)

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
  2012-07-05 10:02 ` Andrew Bartlett
@ 2012-07-05 11:24   ` Jeff Layton
       [not found]     ` <20120705072401.7eb1a7ee-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2012-07-05 11:24 UTC (permalink / raw)
  To: Andrew Bartlett
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

Nice work tracking that down and coding up the patch. While it's not
incorrect to drop the dentry here, we can be a little more efficient
here and just update the inode in place if the uniqueid didn't change.

Something like this (untested) patch should do it. Could you test this
and let me know if it also helps?

-------------------------[snip]--------------------------

cifs: always update the inode cache with the results from a FIND_*

When we get back a FIND_FIRST/NEXT result, we have some info about the
dentry that we use to instantiate a new inode. We were ignoring and
discarding that info when we had an existing dentry in the cache.

Fix this by updating the inode in place when we find an existing dentry
and the uniqueid is the same.

Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # .31.x
Reported-by: Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
Reported-by: Bill Robertson <bill_robertson-nSG1tDLywIjKnmoGZ802fQ@public.gmane.org>
Reported-by: Dion Edwards <dion_edwards-nSG1tDLywIjKnmoGZ802fQ@public.gmane.org>
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 fs/cifs/readdir.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
index 0a8224d..a4217f0 100644
--- a/fs/cifs/readdir.c
+++ b/fs/cifs/readdir.c
@@ -86,9 +86,12 @@ cifs_readdir_lookup(struct dentry *parent, struct qstr *name,
 
 	dentry = d_lookup(parent, name);
 	if (dentry) {
-		/* FIXME: check for inode number changes? */
-		if (dentry->d_inode != NULL)
+		inode = dentry->d_inode;
+		/* update inode in place if i_ino didn't change */
+		if (inode && CIFS_I(inode)->uniqueid == fattr->cf_uniqueid) {
+			cifs_fattr_to_inode(inode, fattr);
 			return dentry;
+		}
 		d_drop(dentry);
 		dput(dentry);
 	}
-- 
1.7.7.6
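
The decision this patch makes can be modelled in user space, which may
help when reasoning about the three possible outcomes. This is only a
sketch under stated assumptions: the structures and the function
model_readdir_lookup() are hypothetical stand-ins, not the kernel's
struct dentry, struct inode or cifs_readdir_lookup(), and the model
ignores reference counting and locking entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins; none of these are kernel types. */
struct model_fattr {
	uint64_t uniqueid;	/* server-side file identity */
	uint32_t mode;
	uint64_t size;
};

struct model_inode {
	struct model_fattr attrs;
	unsigned long refreshed_at;	/* stand-in for the cache timestamp */
};

struct model_dentry {
	struct model_inode *inode;	/* NULL models a negative dentry */
	bool hashed;			/* false once dropped from the cache */
};

enum lookup_result { UPDATED_IN_PLACE, DROPPED, NOT_CACHED };

/*
 * Same identity: refresh the cached attributes and restart the timeout.
 * Negative entry or different identity: drop it so a new one is built.
 */
static enum lookup_result
model_readdir_lookup(struct model_dentry *dentry,
		     const struct model_fattr *fresh, unsigned long now)
{
	if (dentry == NULL || !dentry->hashed)
		return NOT_CACHED;	/* caller would allocate a new entry */

	if (dentry->inode != NULL &&
	    dentry->inode->attrs.uniqueid == fresh->uniqueid) {
		dentry->inode->attrs = *fresh;
		dentry->inode->refreshed_at = now;
		return UPDATED_IN_PLACE;
	}

	dentry->hashed = false;
	return DROPPED;
}
```

In this model, the original patch in the thread corresponds to always
taking the DROPPED branch for any cached entry, while the version above
keeps the entry whenever the uniqueid still matches.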


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
       [not found]     ` <20120705072401.7eb1a7ee-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
@ 2012-07-05 23:31       ` Andrew Bartlett
  2012-07-06  1:46         ` Jeff Layton
  2012-07-06  6:30       ` Andrew Bartlett
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Bartlett @ 2012-07-05 23:31 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Thu, 2012-07-05 at 07:24 -0400, Jeff Layton wrote:
> Nice work tracking that down and coding up the patch. While it's not
> incorrect to drop the dentry here, we can be a little more efficient
> here and just update the inode in place if the uniqueid didn't change.
> 
> Something like this (untested) patch should do it. Could you test this
> and let me know if it also helps?

Is it really safe to update so much without getting a lock over all the
updates?

/* populate an inode with info from a cifs_fattr struct */
void
cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr)
{
	struct cifsInodeInfo *cifs_i = CIFS_I(inode);
	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
	unsigned long oldtime = cifs_i->time;

	cifs_revalidate_cache(inode, fattr);

	inode->i_atime = fattr->cf_atime;
	inode->i_mtime = fattr->cf_mtime;
	inode->i_ctime = fattr->cf_ctime;
	inode->i_rdev = fattr->cf_rdev;
	set_nlink(inode, fattr->cf_nlink);
	inode->i_uid = fattr->cf_uid;
	inode->i_gid = fattr->cf_gid;

	/* if dynperm is set, don't clobber existing mode */
	if (inode->i_state & I_NEW ||
	    !(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_DYNPERM))
		inode->i_mode = fattr->cf_mode;

	cifs_i->cifsAttrs = fattr->cf_cifsattrs;

	if (fattr->cf_flags & CIFS_FATTR_NEED_REVAL)
		cifs_i->time = 0;
	else
		cifs_i->time = jiffies;

	cFYI(1, "inode 0x%p old_time=%ld new_time=%ld", inode,
		 oldtime, cifs_i->time);

	cifs_i->delete_pending = fattr->cf_flags & CIFS_FATTR_DELETE_PENDING;

	cifs_i->server_eof = fattr->cf_eof;
	/*
	 * Can't safely change the file size here if the client is writing to
	 * it due to potential races.
	 */
	spin_lock(&inode->i_lock);
	if (is_size_safe_to_change(cifs_i, fattr->cf_eof)) {
		i_size_write(inode, fattr->cf_eof);

		/*
		 * i_blocks is not related to (i_size / i_blksize),
		 * but instead 512 byte (2**9) size is required for
		 * calculating num blocks.
		 */
		inode->i_blocks = (512 - 1 + fattr->cf_bytes) >> 9;
	}
	spin_unlock(&inode->i_lock);

	if (fattr->cf_flags & CIFS_FATTR_DFS_REFERRAL)
		inode->i_flags |= S_AUTOMOUNT;
	cifs_set_ops(inode);
}

That is, I think the spin_lock() needs to be moved to the top of
cifs_fattr_to_inode().  How is this safe for the current callers?

The equivalent code in NFS does this:

int nfs_refresh_inode(struct inode *inode, struct nfs_fattr *fattr)
{
	int status;

	if ((fattr->valid & NFS_ATTR_FATTR) == 0)
		return 0;
	spin_lock(&inode->i_lock);
	status = nfs_refresh_inode_locked(inode, fattr);
	spin_unlock(&inode->i_lock);

	return status;
}

In our case it will be more difficult, as cifs_fattr_to_inode() takes
the inode->i_lock (but only for some updates).

I agree that it is important to call cifs_fattr_to_inode, because it is
critical to call cifs_revalidate_cache(), to flush the fscache and to
flush any cached pages. 

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
  2012-07-05 23:31       ` Andrew Bartlett
@ 2012-07-06  1:46         ` Jeff Layton
       [not found]           ` <20120705214608.2a3a681b-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2012-07-06  1:46 UTC (permalink / raw)
  To: Andrew Bartlett
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Fri, 06 Jul 2012 09:31:07 +1000
Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:

> Is it really safe to update so much without getting a lock over all the
> updates?
> 

What's your worry, specifically?

The vfs only requires that you hold the lock over i_size updates. I
suppose it's possible that you could have racing updates to an inode,
but in practice, the last one will generally "win".

-- 
Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
       [not found]           ` <20120705214608.2a3a681b-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
@ 2012-07-06  6:20             ` Andrew Bartlett
  2012-07-06 11:03               ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Bartlett @ 2012-07-06  6:20 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Thu, 2012-07-05 at 21:46 -0400, Jeff Layton wrote:
> What's your worry, specifically?
> 
> The vfs only requires that you hold the lock over i_size updates. I
> suppose it's possible that you could have racing updates to an inode,
> but in practice, the last one will generally "win".

Writes are a worry, and I'm not sure I like the idea of parallel updates
being able to leave it in an undefined state, but I was more worried
about a read, that is someone reading just between:

	inode->i_uid = fattr->cf_uid;
	inode->i_gid = fattr->cf_gid;

and

	/* if dynperm is set, don't clobber existing mode */
	if (inode->i_state & I_NEW ||
	    !(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_DYNPERM))
		inode->i_mode = fattr->cf_mode;

Imagine that this file was changing owner, and was setuid.  Don't we
have a race here where a very lucky caller could get the entry from the
dentry cache between the uid change and the permission change (setuid
removal)?

The race is very narrow, and most of the worries are much more mundane,
but isn't this why you would lock the inode for the whole update?
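
To make the window concrete, here is a deterministic single-threaded
sketch of it. It is only a model: the structure, the observer hook and
all names are hypothetical, and the hook stands in for a reader that
happens to run between the owner update and the mode update. It does
not show that the kernel is actually exploitable this way.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_S_ISUID 04000u	/* setuid bit, as in the usual mode bits */

/* Hypothetical stand-in for the few inode fields under discussion. */
struct model_inode {
	unsigned int uid;
	unsigned int gid;
	unsigned int mode;
};

/* What a reader saw when it looked at the inode mid-update. */
struct snapshot {
	unsigned int uid;
	unsigned int mode;
};

typedef void (*observer_fn)(const struct model_inode *inode, void *ctx);

/* Unlocked update in the same field order as the code quoted above. */
static void model_update(struct model_inode *inode,
			 const struct model_inode *fresh,
			 observer_fn observe, void *ctx)
{
	inode->uid = fresh->uid;
	inode->gid = fresh->gid;
	if (observe != NULL)
		observe(inode, ctx);	/* reader runs "just between" */
	inode->mode = fresh->mode;
}

static void record(const struct model_inode *inode, void *ctx)
{
	struct snapshot *seen = ctx;

	seen->uid = inode->uid;
	seen->mode = inode->mode;
}

/* True when the reader saw the new owner combined with the old mode. */
static bool snapshot_is_torn(const struct snapshot *seen,
			     const struct model_inode *old,
			     const struct model_inode *fresh)
{
	return old->uid != fresh->uid && old->mode != fresh->mode &&
	       seen->uid == fresh->uid && seen->mode == old->mode;
}
```

With an inode that starts as uid 1000, mode 04755 and is refreshed to
uid 0, mode 0755, the recorded snapshot pairs uid 0 with the setuid bit
still set, which is the combination I am worried about.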

I'm not fully up on kernel locking rules, which is why I looked to NFS
for the example I mentioned. 

Thanks,

Andrew Bartlett 

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
       [not found]     ` <20120705072401.7eb1a7ee-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  2012-07-05 23:31       ` Andrew Bartlett
@ 2012-07-06  6:30       ` Andrew Bartlett
  2012-07-06 11:11         ` Jeff Layton
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Bartlett @ 2012-07-06  6:30 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Thu, 2012-07-05 at 07:24 -0400, Jeff Layton wrote:
> Nice work tracking that down and coding up the patch. While it's not
> incorrect to drop the dentry here, we can be a little more efficient
> here and just update the inode in place if the uniqueid didn't change.
> 
> Something like this (untested) patch should do it. Could you test this
> and let me know if it also helps?

Yes, same behaviour as with my patch.  Thank you very much, it seems
we are on our way to solving this!

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
  2012-07-06  6:20             ` Andrew Bartlett
@ 2012-07-06 11:03               ` Jeff Layton
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2012-07-06 11:03 UTC (permalink / raw)
  To: Andrew Bartlett
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Fri, 06 Jul 2012 16:20:47 +1000
Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:

> Writes are a worry, and I'm not sure I like the idea of parallel updates
> being able to leave it in an undefined state, but I was more worried
> about a read, that is someone reading just between:
> 
> 	inode->i_uid = fattr->cf_uid;
> 	inode->i_gid = fattr->cf_gid;
> 
> and
> 
> 	/* if dynperm is set, don't clobber existing mode */
> 	if (inode->i_state & I_NEW ||
> 	    !(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_DYNPERM))
> 		inode->i_mode = fattr->cf_mode;
> 
> Imagine that this file was changing owner, and was setuid.  Don't we
> have a race here where a very lucky caller could get the entry from the
> dentry cache between the uid change and the permission change (setuid
> removal)?
> 
> The race is very narrow, and most of the worries are much more mundane,
> but isn't this why you would lock the inode for the whole update?
> 
> I'm not fully up on kernel locking rules, which is why I looked to NFS
> for the example I mentioned. 

Locking only works if both "sides" actually respect the lock. The
decision about how to handle setuid creds is done in prepare_binprm()
and as far as I can tell, there is no lock held over the fetch of the
i_mode and i_uid/i_gid.

I suppose it's possible there's a race there, but it would be for every
filesystem -- not just CIFS and NFS. If you're concerned about this,
the thing to do there is probably to mail linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
with a description and see if there's some mitigating factor we're
not seeing.

Another thing you could do is try to reproduce this. Maybe add a
(switchable) delay after prepare_binprm() fetches the mode, but before
it does the setuid checks. Try to run the program and then quickly
change the ownership to something else and see if the setuid takes
effect...


-- 
Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
  2012-07-06  6:30       ` Andrew Bartlett
@ 2012-07-06 11:11         ` Jeff Layton
       [not found]           ` <20120706071123.2563c615-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2012-07-06 11:11 UTC (permalink / raw)
  To: Andrew Bartlett
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Fri, 06 Jul 2012 16:30:12 +1000
Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:

> Yes, same behaviour as per my patch.  Thank you very much, it seems
> we are on our way to solving this!
> 

Thanks for testing it. I've gone ahead and sent that off to Steve for
inclusion. If there is anything else required wrt serializing the inode
attribute updates, then we can deal with that in a separate patch since
that's really a separate problem.
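
For readers following along, the difference between the two approaches
discussed in this thread (drop the dentry vs. update the inode in place
when the uniqueid is unchanged) can be modelled in plain userspace C.
This is only a toy model under stated assumptions, not the patch that
was sent to Steve: every type and function name below is hypothetical,
and the `hashed` flag merely stands in for what d_drop() does to a real
dentry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the attributes a readdir() entry carries. */
struct model_attrs {
    uint64_t uniqueid;   /* server-side file identity */
    uint32_t mode;
    uint64_t size;
};

/* Hypothetical stand-in for a cached dentry and its inode attributes. */
struct model_dentry {
    bool hashed;                 /* false models a d_drop()ped dentry */
    struct model_attrs cached;
};

enum refresh_result { REFRESH_UPDATED, REFRESH_DROPPED };

/*
 * If the fresh entry still names the same file (same uniqueid), refresh
 * the cached attributes in place. Otherwise unhash the entry so that the
 * next lookup has to build a new one.
 */
enum refresh_result refresh_from_readdir(struct model_dentry *d,
                                         const struct model_attrs *fresh)
{
    if (d->hashed && d->cached.uniqueid == fresh->uniqueid) {
        d->cached = *fresh;      /* in-place update, no extra round trip */
        return REFRESH_UPDATED;
    }
    d->hashed = false;           /* models d_drop() */
    return REFRESH_DROPPED;
}
```

In this model, the in-place path keeps the existing entry usable, which
is why it can be a little cheaper than dropping and re-creating it.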

Cheers,
-- 
Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>


* Re: [PATCH] Always update the dentry cache with fresh readdir() results
       [not found]           ` <20120706071123.2563c615-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2012-07-06 22:42             ` Andrew Bartlett
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Bartlett @ 2012-07-06 22:42 UTC (permalink / raw)
  To: Jeff Layton
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Bill Robertson, Dion Edwards

On Fri, 2012-07-06 at 07:11 -0400, Jeff Layton wrote:
> On Fri, 06 Jul 2012 16:30:12 +1000
> Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:
> 
> > On Thu, 2012-07-05 at 07:24 -0400, Jeff Layton wrote:
> > > On Thu, 05 Jul 2012 20:02:47 +1000
> > > Andrew Bartlett <abartlet-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:
> > > 
> > > > (CCing in the original reporter)
> > > > 
> > > > On Thu, 2012-07-05 at 18:38 +1000, Andrew Bartlett wrote:
> > > > > When we do a readdir() in CIFS, we are potentially efficiently
> > > > > collecting a great deal of current, cacheable stat information.
> > > > > 
> > > > > It is important that we always keep the dentry cache current for two
> > > > > reasons:
> > > > >  - the information may have changed (within the actime timeout).
> > > > >  - if we still have a dentry cache value after that timeout, it is quite
> > > > > expensive (1xRTT per entry) to find out if it was still correct.
> > > > > 
> > > > > This hits folks who are using CIFS over a WAN very badly.  For example
> > > > > on an emulated 50ms delay I would have ls --color complete in .1
> > > > > seconds, and a second run take 4.5 seconds, as each stat() (for the
> > > > > colouring) would create a trans2 query_path_info query for each file,
> > > > > right after getting the same information in the trans2 find_first2.
> > > > > 
> > > > > This patch implements the simplest approach, I would welcome a
> > > > > correction on if there is a better approach than d_drop() and dput().
> > > > > 
> > > > > Tested on 3.4.4-3.cifsrevalidate.fc17.i686 with a 50ms WANem emulated
> > > > > WAN against Samba 4.0 beta3.
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Andrew Bartlett
> > > > 
> > > 
> > > Nice work tracking that down and coding up the patch. While it's not
> > > incorrect to drop the dentry here, we can be a little more efficient
> > > here and just update the inode in place if the uniqueid didn't change.
> > > 
> > > Something like this (untested) patch should do it. Could you test this
> > > and let me know if it also helps?
> > 
> > Yes, same behaviour as per my patch.  Thank you very much, it seems
> > we are on our way to solving this!
> > 
> 
> Thanks for testing it. I've gone ahead and sent that off to Steve for
> inclusion. If there is anything else required wrt serializing the inode
> attribute updates, then we can deal with that in a separate patch since
> that's really a separate problem.

Indeed.  Thanks!

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org


end of thread, other threads:[~2012-07-06 22:42 UTC | newest]

Thread overview: 10+ messages
2012-07-05  8:38 [PATCH] Always update the dentry cache with fresh readdir() results Andrew Bartlett
2012-07-05 10:02 ` Andrew Bartlett
2012-07-05 11:24   ` Jeff Layton
     [not found]     ` <20120705072401.7eb1a7ee-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2012-07-05 23:31       ` Andrew Bartlett
2012-07-06  1:46         ` Jeff Layton
     [not found]           ` <20120705214608.2a3a681b-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2012-07-06  6:20             ` Andrew Bartlett
2012-07-06 11:03               ` Jeff Layton
2012-07-06  6:30       ` Andrew Bartlett
2012-07-06 11:11         ` Jeff Layton
     [not found]           ` <20120706071123.2563c615-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2012-07-06 22:42             ` Andrew Bartlett
