* Re: [PATCH 0/11] nfsd: Summary of "Improve NFS server performance"
[not found] ` <20090325133607.16437.33288.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-04-17 18:46 ` J. Bruce Fields
2009-04-17 19:30 ` Krishna Kumar2
2009-04-21 22:57 ` J. Bruce Fields
1 sibling, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-17 18:46 UTC (permalink / raw)
To: Krishna Kumar; +Cc: jlayton, linux-nfs, Krishna Kumar
On Wed, Mar 25, 2009 at 07:06:07PM +0530, Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> Patch summary:
> --------------
> Change the caching from ino/dev to a file handle model. Advantages:
>
> 1. Since file handles are unique, this patch removes all dependencies on the
kernel readahead parameters and its implementation; and instead caches
> files based on file handles.
> 2. Allows the server to not have to open/close a file multiple times when the
> client reads it, and results in faster lookup times.
> 3. Readahead is automatically taken care of since the file is not closed while
> it is getting read (quickly) by the client.
> 4. Another optimization is to avoid getting the cache lock twice for each read
> operation (after the first time it is taken to update the cache).
> 5. (ra_size, ra_depth retain the "ra" prefix for now, since I have not checked
> whether changing that will break any user programs)
You're referring here to the "ra" line in /proc/net/rpc/nfsd?
--b.
^ permalink raw reply [flat|nested] 19+ messages in thread
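The file-handle-keyed caching idea in the summary above can be modeled in a few lines of userspace C. Everything here (the names, the direct-mapped table, the single 32-bit key) is illustrative only; the patch itself uses chained hash buckets keyed via jhash_1word on part of the handle:

```c
#include <assert.h>
#include <stdint.h>

/* Toy userspace model of the idea in the summary: key the cache on the
 * file handle itself instead of an (inode, device) pair.  Direct-mapped
 * for brevity; the patch uses chained buckets. */

#define CACHE_BITS 8
#define CACHE_SIZE (1u << CACHE_BITS)
#define CACHE_MASK (CACHE_SIZE - 1)

struct toy_entry {
	uint32_t key;      /* derived from the file handle */
	int      in_use;
	int      fd;       /* stand-in for the cached struct file */
};

static struct toy_entry toy_cache[CACHE_SIZE];

/* Mix a 32-bit word of the handle into a bucket index (jhash_1word
 * plays this role in the kernel patch). */
static unsigned int toy_hash(uint32_t key)
{
	key ^= key >> 16;
	key *= 0x45d9f3bu;
	key ^= key >> 16;
	return key & CACHE_MASK;
}

/* Return the cached fd for this handle word, or -1 on a miss. */
static int toy_lookup(uint32_t key)
{
	struct toy_entry *e = &toy_cache[toy_hash(key)];
	return (e->in_use && e->key == key) ? e->fd : -1;
}

static void toy_insert(uint32_t key, int fd)
{
	struct toy_entry *e = &toy_cache[toy_hash(key)];
	e->key = key;
	e->fd = fd;
	e->in_use = 1;
}
```

A hit returns the already-open file, which is what removes the per-read open/close that the summary's point 2 describes.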
* Re: [PATCH 0/11] nfsd: Summary of "Improve NFS server performance"
2009-04-17 18:46 ` [PATCH 0/11] nfsd: Summary of "Improve NFS server performance" J. Bruce Fields
@ 2009-04-17 19:30 ` Krishna Kumar2
0 siblings, 0 replies; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-17 19:30 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: jlayton, Krishna Kumar, linux-nfs
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/18/2009 12:16:47 AM:
> Re: [PATCH 0/11] nfsd: Summary of "Improve NFS server performance"
>
> On Wed, Mar 25, 2009 at 07:06:07PM +0530, Krishna Kumar wrote:
> > From: Krishna Kumar <krkumar2@in.ibm.com>
> >
> > Patch summary:
> > --------------
> > Change the caching from ino/dev to a file handle model. Advantages:
> >
> > 1. Since file handles are unique, this patch removes all dependencies on the
> > kernel readahead parameters and its implementation; and instead caches
> > files based on file handles.
> > 2. Allows the server to not have to open/close a file multiple times when the
> > client reads it, and results in faster lookup times.
> > 3. Readahead is automatically taken care of since the file is not closed while
> > it is getting read (quickly) by the client.
> > 4. Another optimization is to avoid getting the cache lock twice for each read
> > operation (after the first time it is taken to update the cache).
> > 5. (ra_size, ra_depth retain the "ra" prefix for now, since I have not checked
> > whether changing that will break any user programs)
>
> You're referring here to the "ra" line in /proc/net/rpc/nfsd?
Yes - at that time I was thinking of any consumer of this file. From your
comment, I see that it is not an issue now; thanks for pointing it out.
Thanks,
- KK
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/11] nfsd: ADD data structure infrastructure
[not found] ` <20090325133628.16437.11092.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-04-21 22:48 ` J. Bruce Fields
2009-04-22 5:36 ` Krishna Kumar2
2009-04-21 23:05 ` J. Bruce Fields
1 sibling, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-21 22:48 UTC (permalink / raw)
To: Krishna Kumar; +Cc: Krishna Kumar, jlayton, linux-nfs
On Wed, Mar 25, 2009 at 07:06:28PM +0530, Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> ADD infrastructure in terms of new structures.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
>
> fs/nfsd/vfs.c | 35 +++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff -ruNp org/fs/nfsd/vfs.c new1/fs/nfsd/vfs.c
> --- org/fs/nfsd/vfs.c 2009-03-25 17:31:39.000000000 +0530
> +++ new1/fs/nfsd/vfs.c 2009-03-25 17:39:43.000000000 +0530
> @@ -88,6 +88,41 @@ struct raparm_hbucket {
> #define RAPARM_HASH_MASK (RAPARM_HASH_SIZE-1)
> static struct raparm_hbucket raparm_hash[RAPARM_HASH_SIZE];
>
> +/*
> + * This is a cache of file handles to quicken file lookup. This also
> + * helps prevent multiple open/close of a file when a client reads it.
> + *
> + * If you increase the number of cached files very much, you'll need to
> + * add a hash table here.
I'm a little confused here--what do you mean by "add a hash table here"?
--b.
> + */
> +struct fhcache {
> + struct fhcache *p_next;
> +
> + /* Hashed on this parameter */
> + __u32 p_auth;
> +
> + /* Cached information */
> + struct file *p_filp;
> + struct svc_export *p_exp;
> +
> + /* Refcount for overwrite */
> + atomic_t p_count;
> +
> + /* When this entry expires */
> + unsigned long p_expires;
> +
> + unsigned int p_hindex;
> +};
> +
> +struct fhcache_hbucket {
> + struct fhcache *pb_head;
> + spinlock_t pb_lock;
> +} ____cacheline_aligned_in_smp;
> +
> +#define FHPARM_HASH_BITS 8
> +#define FHPARM_HASH_SIZE (1<<FHPARM_HASH_BITS)
> +#define FHPARM_HASH_MASK (FHPARM_HASH_SIZE-1)
> +
> /*
> * Called from nfsd_lookup and encode_dirent. Check if we have crossed
> * a mount point.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/11] nfsd: Summary of "Improve NFS server performance"
[not found] ` <20090325133607.16437.33288.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-04-17 18:46 ` [PATCH 0/11] nfsd: Summary of "Improve NFS server performance" J. Bruce Fields
@ 2009-04-21 22:57 ` J. Bruce Fields
2009-04-22 5:35 ` Krishna Kumar2
1 sibling, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-21 22:57 UTC (permalink / raw)
To: Krishna Kumar; +Cc: jlayton, linux-nfs, Krishna Kumar
On Wed, Mar 25, 2009 at 07:06:07PM +0530, Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> Patch summary:
> --------------
> Change the caching from ino/dev to a file handle model. Advantages:
>
> 1. Since file handles are unique, this patch removes all dependencies on the
> kernel readahead parameters and its implementation; and instead caches
> files based on file handles.
> 2. Allows the server to not have to open/close a file multiple times when the
> client reads it, and results in faster lookup times.
> 3. Readahead is automatically taken care of since the file is not closed while
> it is getting read (quickly) by the client.
> 4. Another optimization is to avoid getting the cache lock twice for each read
> operation (after the first time it is taken to update the cache).
> 5. (ra_size, ra_depth retain the "ra" prefix for now, since I have not checked
> whether changing that will break any user programs)
>
>
> Patches are described as:
> --------------------------
> 1. nfsd: ADD data structure infrastructure
> 2. nfsd: ADD new function infrastructure
> 3. nfsd: CHANGE old function calls to new calls
> 4. nfsd: ADD client "rm" support
> 5. nfsd: ADD nfsd_fhcache_shutdown and nfsd_fhcache_init
> 6. nfsd: CHANGE _ra* calls to _fh* calls in nfssvc
> 7. nfsd: REMOVE old infrastructure's functions
> 8. nfsd: REMOVE old infrastructure's data structures
>
>
> List of changes from Ver1:
> --------------------------
> 1. Implement entire logic of either updating-cache or do-nothing or close-file
> in fh_cache_upd. This simplifies the caller (nfsd_read).
> 2. nfsd_get_fhcache doesn't overwrite existing entries which would require the
> existing entries to be freed up - that is done by the daemon exclusively.
> This saves time at lookup by avoiding freeing entries (fh_cache_put).
> Another change is to optimize the logic of selecting a free slot.
> 3. Due to #2, fh_cache_upd doesn't have to test whether the entry is already
> on the daemon list (it never is, plus list_del_init is changed to list_del).
> 4. As a result of #2, the daemon becomes simpler - there is no race to handle
> where there is an entry on the list but it has no cached file/dentry, etc.
> 5. Made some comments clearer and easier to understand.
> 6. Jeff: Changed NFSD_CACHE_JIFFIES to use HZ.
> 7. Jeff: Changed nfsd_daemon_list to nfsd_daemon_free_list; and changed ra_init
> and ra_shutdown prefixes.
> 8. Jeff: Split patch into smaller patches. Tested each with successful builds.
> 9. Pending:
> - Bruce: But I think I'd prefer some separate operation (probably just
> triggered by a write to a some new file in the nfsd filesystem) that
> told nfsd to release all its references to a given filesystem. An
> administrator would have to know to do this before unmounting (or
> maybe mount could be patched to do this).
>
>
> Performance:
> -------------
>
> This patch was tested with clients running 2, 4, 8, 16 --- 256 test processes,
> each doing reads of different files. Each test includes different I/O sizes.
> 31/77 tests got more than 5% improvement; with 5 tests getting more than 10%
> gain (full results at the end of this post).
The numbers look promising, but very noisy. Maybe an average of a few
tests would give more stable numbers?
--b.
> Please review. Any comments or improvement ideas are appreciated.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
> (#Test Processes on Client == #NFSD's on Server)
> -----------------------------------------------------------------------
> #Test Processes Bufsize Org BW KB/s New BW KB/s %
> -----------------------------------------------------------------------
> 2 256 68022.46 71094.64 4.51
> 2 4096 67833.74 70726.38 4.26
> 2 8192 64541.14 69635.93 7.89
> 2 16384 65708.86 68994.88 5.00
> 2 32768 64272.28 68525.36 6.61
> 2 65536 64684.13 69103.28 6.83
> 2 131072 64765.67 68855.57 6.31
>
> 4 256 60849.48 64702.04 6.33
> 4 4096 60660.67 64309.37 6.01
> 4 8192 60506.00 61142.84 1.05
> 4 16384 60796.86 64069.82 5.38
> 4 32768 60947.07 64648.57 6.07
> 4 65536 60774.45 63735.24 4.87
> 4 131072 61369.66 65261.85 6.34
>
> 8 256 49239.57 54467.33 10.61
> 8 4096 50650.45 55400.01 9.37
> 8 8192 50661.58 51732.16 2.11
> 8 16384 51114.76 56025.31 9.60
> 8 32768 52367.20 54348.05 3.78
> 8 65536 51000.23 54285.63 6.44
> 8 131072 52996.73 54021.11 1.93
>
> 16 256 44534.67 45478.60 2.11
> 16 4096 43897.69 46519.89 5.97
> 16 8192 43787.87 44083.61 .67
> 16 16384 43883.62 46726.03 6.47
> 16 32768 44284.96 44035.86 -.56
> 16 65536 43804.33 44865.20 2.42
> 16 131072 44525.30 43752.62 -1.73
>
> 32 256 40420.30 42575.30 5.33
> 32 4096 39913.14 42279.21 5.92
> 32 8192 40261.19 42399.93 5.31
> 32 16384 38094.95 42645.32 11.94
> 32 32768 40610.27 43015.37 5.92
> 32 65536 41438.05 41794.76 .86
> 32 131072 41869.06 43644.07 4.23
>
> 48 256 36986.45 40185.34 8.64
> 48 4096 36585.79 38227.38 4.48
> 48 8192 38406.78 38055.91 -.91
> 48 16384 34950.05 36688.86 4.97
> 48 32768 38908.71 37900.33 -2.59
> 48 65536 39364.64 40036.67 1.70
> 48 131072 40391.56 40887.11 1.22
>
> 64 256 32821.89 34568.06 5.32
> 64 4096 35468.42 35529.29 .17
> 64 8192 34135.44 36462.31 6.81
> 64 16384 31926.51 32694.91 2.40
> 64 32768 35527.69 35234.60 -.82
> 64 65536 36066.08 36359.77 .81
> 64 131072 35969.04 37462.86 4.15
>
> 96 256 30032.74 29973.45 -.19
> 96 4096 29687.06 30881.64 4.02
> 96 8192 31142.51 32504.66 4.37
> 96 16384 29546.77 30663.39 3.77
> 96 32768 32458.94 32797.36 1.04
> 96 65536 32826.99 33451.66 1.90
> 96 131072 33601.46 34177.39 1.71
>
> 128 256 28584.59 29092.11 1.77
> 128 4096 29311.11 30097.65 2.68
> 128 8192 31398.87 33154.63 5.59
> 128 16384 28283.58 29071.45 2.78
> 128 32768 32819.93 33654.11 2.54
> 128 65536 32617.13 33778.46 3.56
> 128 131072 32972.71 34160.82 3.60
>
> 192 256 25245.92 26331.48 4.29
> 192 4096 27368.48 29318.49 7.12
> 192 8192 30173.74 31477.35 4.32
> 192 16384 26388.54 29719.15 12.62
> 192 32768 31840.91 33606.17 5.54
> 192 65536 33374.85 33607.14 .69
> 192 131072 33523.48 32601.93 -2.74
>
> 256 256 22256.91 21139.79 -5.01
> 256 4096 25192.75 24281.51 -3.61
> 256 8192 26534.95 28100.59 5.90
> 256 16384 24162.85 25607.86 5.98
> 256 32768 29218.38 29417.28 .68
> 256 65536 29609.59 30049.79 1.48
> 256 131072 30014.29 30132.33 .39
> -----------------------------------------------------------------------
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/11] nfsd: ADD data structure infrastructure
[not found] ` <20090325133628.16437.11092.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-04-21 22:48 ` [PATCH 1/11] nfsd: ADD data structure infrastructure J. Bruce Fields
@ 2009-04-21 23:05 ` J. Bruce Fields
1 sibling, 0 replies; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-21 23:05 UTC (permalink / raw)
To: Krishna Kumar; +Cc: Krishna Kumar, jlayton, linux-nfs
On Wed, Mar 25, 2009 at 07:06:28PM +0530, Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> ADD infrastructure in terms of new structures.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
>
> fs/nfsd/vfs.c | 35 +++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff -ruNp org/fs/nfsd/vfs.c new1/fs/nfsd/vfs.c
> --- org/fs/nfsd/vfs.c 2009-03-25 17:31:39.000000000 +0530
> +++ new1/fs/nfsd/vfs.c 2009-03-25 17:39:43.000000000 +0530
> @@ -88,6 +88,41 @@ struct raparm_hbucket {
> #define RAPARM_HASH_MASK (RAPARM_HASH_SIZE-1)
> static struct raparm_hbucket raparm_hash[RAPARM_HASH_SIZE];
>
> +/*
> + * This is a cache of file handles to quicken file lookup. This also
> + * helps prevent multiple open/close of a file when a client reads it.
> + *
> + * If you increase the number of cached files very much, you'll need to
> + * add a hash table here.
> + */
> +struct fhcache {
> + struct fhcache *p_next;
> +
> + /* Hashed on this parameter */
> + __u32 p_auth;
> +
> + /* Cached information */
> + struct file *p_filp;
> + struct svc_export *p_exp;
> +
> + /* Refcount for overwrite */
> + atomic_t p_count;
> +
> + /* When this entry expires */
> + unsigned long p_expires;
> +
> + unsigned int p_hindex;
> +};
> +
> +struct fhcache_hbucket {
> + struct fhcache *pb_head;
> + spinlock_t pb_lock;
> +} ____cacheline_aligned_in_smp;
Hadn't run across __cacheline_aligned_in_smp before--so that just
assures an fhcache_hbucket gets its own cacheline, so that two cpu's
taking the spinlock on different buckets won't be fighting over the
cacheline?
--b.
> +
> +#define FHPARM_HASH_BITS 8
> +#define FHPARM_HASH_SIZE (1<<FHPARM_HASH_BITS)
> +#define FHPARM_HASH_MASK (FHPARM_HASH_SIZE-1)
> +
> /*
> * Called from nfsd_lookup and encode_dirent. Check if we have crossed
> * a mount point.
^ permalink raw reply [flat|nested] 19+ messages in thread
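Bruce's reading is correct: the annotation gives each bucket its own cache line so CPUs spinning on different buckets' locks do not fight over one line. A userspace model, with a GCC alignment attribute standing in for ____cacheline_aligned_in_smp and an assumed 64-byte line (the kernel uses L1_CACHE_BYTES, which varies by architecture):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Userspace model of ____cacheline_aligned_in_smp: align (and thereby
 * pad) each bucket to a full cache line, so the spinlock of bucket N
 * and the spinlock of bucket N+1 never share a line.  64 bytes is an
 * assumption; real code uses L1_CACHE_BYTES. */

#define TOY_CACHELINE 64

struct toy_bucket {
	void *head;             /* stand-in for pb_head */
	int   lock;             /* stand-in for the spinlock */
} __attribute__((aligned(TOY_CACHELINE)));

static struct toy_bucket toy_buckets[4];
```

Because sizeof must be a multiple of the declared alignment, each array element occupies at least one full line; without the attribute, several 16-byte buckets would pack into one 64-byte line and their locks would false-share.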
* Re: [PATCH 2/11] nfsd: ADD new function infrastructure
[not found] ` <20090325133647.16437.59567.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-04-22 2:54 ` J. Bruce Fields
2009-04-22 5:37 ` Krishna Kumar2
0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-22 2:54 UTC (permalink / raw)
To: Krishna Kumar; +Cc: jlayton, linux-nfs, Krishna Kumar
On Wed, Mar 25, 2009 at 07:06:47PM +0530, Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
>
> ADD infrastructure in terms of new functions.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
>
> fs/nfsd/vfs.c | 224 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 224 insertions(+)
>
> diff -ruNp new1/fs/nfsd/vfs.c new2/fs/nfsd/vfs.c
> --- new1/fs/nfsd/vfs.c 2009-03-25 17:39:43.000000000 +0530
> +++ new2/fs/nfsd/vfs.c 2009-03-25 17:42:10.000000000 +0530
> @@ -55,11 +55,15 @@
> #include <linux/security.h>
> #endif /* CONFIG_NFSD_V4 */
> #include <linux/jhash.h>
> +#include <linux/kthread.h>
>
> #include <asm/uaccess.h>
>
> #define NFSDDBG_FACILITY NFSDDBG_FILEOP
>
> +/* Number of jiffies to cache the file before releasing */
> +#define NFSD_CACHE_JIFFIES (HZ/10)
> +
>
> /*
> * This is a cache of readahead params that help us choose the proper
> @@ -111,6 +115,9 @@ struct fhcache {
> /* When this entry expires */
> unsigned long p_expires;
>
> + /* List of entries linked to 'nfsd_daemon_free_list' */
> + struct list_head p_list;
> +
> unsigned int p_hindex;
> };
>
> @@ -122,6 +129,7 @@ struct fhcache_hbucket {
> #define FHPARM_HASH_BITS 8
> #define FHPARM_HASH_SIZE (1<<FHPARM_HASH_BITS)
> #define FHPARM_HASH_MASK (FHPARM_HASH_SIZE-1)
> +static struct fhcache_hbucket fhcache_hash[FHPARM_HASH_SIZE];
>
> /*
> * Called from nfsd_lookup and encode_dirent. Check if we have crossed
> @@ -866,6 +874,222 @@ found:
> return ra;
> }
>
> +/* Synchronization for daemon with enqueuer's */
> +static DEFINE_SPINLOCK(k_nfsd_lock);
> +
> +/* List of FH cache entries that has to be cleaned up when they expire */
> +static struct list_head nfsd_daemon_free_list;
> +
> +/*
> + * Returns cached values of 'file' and svc_export; resets these entries
> + * to NULL.
> + */
> +static inline void fh_get_cached_entries(struct fhcache *fh,
> + struct file **filep,
> + struct svc_export **expp)
> +{
> + *filep = fh->p_filp;
> + *expp = fh->p_exp;
> +
> + fh->p_filp = NULL;
> + fh->p_exp = NULL;
> +}
> +
> +/*
> + * Hold a reference to dentry and svc_export (file already has an extra
> + * reference count as it is not closed normally).
> + */
> +static inline void fh_cache_get(struct file *file, struct svc_export *exp)
> +{
> + dget(file->f_dentry);
> + cache_get(&exp->h);
> +}
> +
> +/* Drop a reference to file, dentry and svc_export */
> +static inline void fh_cache_put(struct file *file, struct svc_export *exp)
> +{
> + cache_put(&exp->h, &svc_export_cache);
> + dput(file->f_dentry);
> + fput(file);
> +}
> +
> +/*
> + * Holds a reference to dentry and svc_export, and caches both. Add fh entry
> + * to list for daemon to cleanup later. Once we add the entry to the list,
> + * we'd rather it expire prematurely rather than updating it on every read.
> + */
> +static inline void fh_cache_upd(struct fhcache *fh, struct file *file,
> + struct svc_export *exp)
> +{
> + if (fh) {
> + if (!fh->p_filp && file) {
> + struct fhcache_hbucket *fhb;
> +
> + fh_cache_get(file, exp);
> +
> + fhb = &fhcache_hash[fh->p_hindex];
> +
> + spin_lock(&fhb->pb_lock);
> + fh->p_filp = file;
> + fh->p_exp = exp;
> + fh->p_expires = jiffies + NFSD_CACHE_JIFFIES;
> +
> + spin_lock(&k_nfsd_lock);
> + list_add_tail(&fh->p_list, &nfsd_daemon_free_list);
> + spin_unlock(&k_nfsd_lock);
> +
> + spin_unlock(&fhb->pb_lock);
> + }
> +
> + /* Drop our reference */
> + atomic_dec(&fh->p_count);
> + } else if (file)
> + nfsd_close(file);
> +}
> +
> +/* Daemon cache cleanup handler */
> +void daemon_free_entries(void)
> +{
> + unsigned long now = jiffies;
> +
> + spin_lock(&k_nfsd_lock);
> + while (!list_empty(&nfsd_daemon_free_list)) {
> + struct fhcache *fh = list_entry(nfsd_daemon_free_list.next,
> + struct fhcache, p_list);
> + struct fhcache_hbucket *fhb;
> + struct file *file;
> + struct svc_export *exp;
> +
> + if (time_after(fh->p_expires, now) || now != jiffies) {
> + /*
> + * This (and all subsequent entries) have not expired;
> + * or we have spent too long in this loop.
> + */
> + break;
> + }
> +
> + fhb = &fhcache_hash[fh->p_hindex];
> +
> + /*
> + * Make sure we do not deadlock with updaters - we can free
> + * entry next time in case of a race.
> + */
> + if (!spin_trylock(&fhb->pb_lock)) {
> + /*
> + * Entry is being used, no need to free this, try later
> + */
> + break;
> + }
> +
> + if (atomic_read(&fh->p_count)) {
> + spin_unlock(&fhb->pb_lock);
> + break;
> + }
> +
> + list_del(&fh->p_list);
> + fh_get_cached_entries(fh, &file, &exp);
> + spin_unlock(&fhb->pb_lock);
> + spin_unlock(&k_nfsd_lock);
> +
> + fh_cache_put(file, exp);
> + spin_lock(&k_nfsd_lock);
> + }
> + spin_unlock(&k_nfsd_lock);
> +}
> +
> +static int k_nfsd_thread(void *unused)
> +{
> + while (!kthread_should_stop()) {
> + schedule_timeout_interruptible(NFSD_CACHE_JIFFIES);
> +
> + if (kthread_should_stop())
> + break;
> +
> + daemon_free_entries();
> + }
> + __set_current_state(TASK_RUNNING);
Is this necessary? The comment before schedule_timeout() claims "The
current task state is guaranteed to be TASK_RUNNING when this routine
returns."
> +
> + return 0;
> +}
> +
> +/*
> + * Obtain the cached file, export and d_inode values for the FH
> + * specified by fh->auth[3]
> + */
> +static inline struct fhcache *
> +nfsd_get_fhcache(__u32 auth)
> +{
> + struct fhcache *fh, **fhp, **ffhp = NULL;
> + int depth = 0;
> + unsigned int hash;
> + struct fhcache_hbucket *fhb;
> +
> + if (!auth)
> + return NULL;
> +
> + hash = jhash_1word(auth, 0xfeedbeef) & FHPARM_HASH_MASK;
> + fhb = &fhcache_hash[hash];
> +
> + spin_lock(&fhb->pb_lock);
> + for (fhp = &fhb->pb_head; (fh = *fhp); fhp = &fh->p_next) {
> + if (fh->p_auth == auth) {
> + /* Same inode */
> + if (unlikely(!fh->p_filp)) {
> + /*
> + * Someone is racing in the same code, and
> + * this is the first reference to this file.
> + */
> + spin_unlock(&fhb->pb_lock);
The locking seems over-complicated. But I have to admit, I don't
understand it yet.
> + return NULL;
> + }
> +
> + /*
> + * Hold an extra reference to dentry/exp since these
> + * are released in fh_put(). 'file' already has an
> + * extra hold from the first lookup which was never
> + * dropped.
Why not take a reference on the file instead of the dentry?
> + */
> + fh_cache_get(fh->p_filp, fh->p_exp);
> + goto found;
> + }
> +
> + depth++;
> +
> + if (!ffhp && !fh->p_filp) {
> + /*
> + * This is an unused inode (or a different one), and no
> + * entry was found till now.
> + */
> + if (!atomic_read(&fh->p_count)) /* Entry is unused */
> + ffhp = fhp;
> + }
> + }
> +
> + if (!ffhp) {
> + spin_unlock(&fhb->pb_lock);
> + return NULL;
> + }
> +
> + depth = nfsdstats.ra_size*11/10;
Help, I'm confused--what's the meaning of this calculation?
--b.
> + fhp = ffhp;
> + fh = *ffhp;
> + fh->p_hindex = hash;
> + fh->p_auth = auth;
> +
> +found:
> + if (fhp != &fhb->pb_head) {
> + *fhp = fh->p_next;
> + fh->p_next = fhb->pb_head;
> + fhb->pb_head = fh;
> + }
> +
> + atomic_inc(&fh->p_count);
> + nfsdstats.ra_depth[depth*10/nfsdstats.ra_size]++;
> + spin_unlock(&fhb->pb_lock);
> +
> + return fh;
> +}
> +
> /*
> * Grab and keep cached pages associated with a file in the svc_rqst
> * so that they can be passed to the network sendmsg/sendpage routines
^ permalink raw reply [flat|nested] 19+ messages in thread
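The found: path in the quoted nfsd_get_fhcache moves a hit to the head of its bucket chain using a pointer-to-previous-link, which avoids needing a doubly linked list. A standalone sketch of just that move-to-front step (names are illustrative):

```c
#include <assert.h>
#include <stddef.h>

/* Singly linked chain walked via a pointer to the previous link (np),
 * mirroring the fhp/p_next walk in nfsd_get_fhcache: on a hit the node
 * can be unlinked through *np and pushed to the front in O(1). */

struct node {
	struct node *next;
	int key;
};

/* Find key in the chain at *headp; on a hit, move the node to the
 * front and return it.  Returns NULL on a miss. */
static struct node *find_mtf(struct node **headp, int key)
{
	struct node *n, **np;

	for (np = headp; (n = *np) != NULL; np = &n->next) {
		if (n->key == key) {
			if (np != headp) {      /* not already at head */
				*np = n->next;  /* unlink from old spot */
				n->next = *headp;
				*headp = n;     /* push to front */
			}
			return n;
		}
	}
	return NULL;
}
```

Keeping hot entries near the head shortens the average chain walk, which is what the ra_depth statistics in the patch are measuring.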
* Re: [PATCH 0/11] nfsd: Summary of "Improve NFS server performance"
2009-04-21 22:57 ` J. Bruce Fields
@ 2009-04-22 5:35 ` Krishna Kumar2
2009-04-22 19:41 ` J. Bruce Fields
0 siblings, 1 reply; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-22 5:35 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: jlayton, Krishna Kumar, linux-nfs
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/22/2009 04:27:28 AM:
> The numbers look promising, but very noisy. Maybe an average of a few
> tests would give more stable numbers?
Is it enough if I cut out some "popular" bufsizes and #processes out of that
table and post it? If so, here is a smaller list:
----------------------------------------------------------------------
#Processes Bufsize OrgBW KB/s New BW KB/s %
----------------------------------------------------------------------
4 4096 60660.67 64309.37 6.01
4 16384 60796.86 64069.82 5.38
4 65536 60774.45 63735.24 4.87
4 131072 61369.66 65261.85 6.34
8 4096 50650.45 55400.01 9.37
8 16384 51114.76 56025.31 9.60
8 65536 51000.23 54285.63 6.44
8 131072 52996.73 54021.11 1.93
16 4096 43897.69 46519.89 5.97
16 16384 43883.62 46726.03 6.47
16 65536 43804.33 44865.20 2.42
16 131072 44525.30 43752.62 -1.73
32 4096 39913.14 42279.21 5.92
32 16384 38094.95 42645.32 11.94
32 65536 41438.05 41794.76 .86
32 131072 41869.06 43644.07 4.23
64 4096 35468.42 35529.29 .17
64 16384 31926.51 32694.91 2.40
64 65536 36066.08 36359.77 .81
64 131072 35969.04 37462.86 4.15
128 4096 29311.11 30097.65 2.68
128 16384 28283.58 29071.45 2.78
128 65536 32617.13 33778.46 3.56
128 131072 32972.71 34160.82 3.60
192 4096 27368.48 29318.49 7.12
192 16384 26388.54 29719.15 12.62
192 65536 33374.85 33607.14 .69
192 131072 33523.48 32601.93 -2.74
-----------------------------------------------------------------------
1170059.88 1223737.36 4.58%
-----------------------------------------------------------------------
Thanks,
- KK
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/11] nfsd: ADD data structure infrastructure
2009-04-21 22:48 ` [PATCH 1/11] nfsd: ADD data structure infrastructure J. Bruce Fields
@ 2009-04-22 5:36 ` Krishna Kumar2
2009-04-22 19:43 ` J. Bruce Fields
0 siblings, 1 reply; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-22 5:36 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: jlayton, Krishna Kumar, linux-nfs
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/22/2009 04:18:38 AM:
> > +/*
> > + * This is a cache of file handles to quicken file lookup. This also
> > + * helps prevent multiple open/close of a file when a client reads it.
> > + *
> > + * If you increase the number of cached files very much, you'll need to
> > + * add a hash table here.
>
> I'm a little confused here--what do you mean by "add a hash table here"?
That is also in the original code. I feel the comment is wrong as we
already have a hash table in raparm_hbucket, but I didn't want to change
typos as part of this patch.
thanks,
- KK
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 2/11] nfsd: ADD new function infrastructure
2009-04-22 2:54 ` [PATCH 2/11] nfsd: ADD new function infrastructure J. Bruce Fields
@ 2009-04-22 5:37 ` Krishna Kumar2
0 siblings, 0 replies; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-22 5:37 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: jlayton, Krishna Kumar, linux-nfs
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/22/2009 08:24:27 AM:
> > +static int k_nfsd_thread(void *unused)
> > +{
> > + while (!kthread_should_stop()) {
> > + schedule_timeout_interruptible(NFSD_CACHE_JIFFIES);
> > +
> > + if (kthread_should_stop())
> > + break;
> > +
> > + daemon_free_entries();
> > + }
> > + __set_current_state(TASK_RUNNING);
>
> Is this necessary? The comment before schedule_timeout() claims "The
> current task state is guaranteed to be TASK_RUNNING when this routine
> returns."
Right, it is not required.
> > +static inline struct fhcache *
> > +nfsd_get_fhcache(__u32 auth)
> > +{
> > + struct fhcache *fh, **fhp, **ffhp = NULL;
> > + int depth = 0;
> > + unsigned int hash;
> > + struct fhcache_hbucket *fhb;
> > +
> > + if (!auth)
> > + return NULL;
> > +
> > + hash = jhash_1word(auth, 0xfeedbeef) & FHPARM_HASH_MASK;
> > + fhb = &fhcache_hash[hash];
> > +
> > + spin_lock(&fhb->pb_lock);
> > + for (fhp = &fhb->pb_head; (fh = *fhp); fhp = &fh->p_next) {
> > + if (fh->p_auth == auth) {
> > + /* Same inode */
> > + if (unlikely(!fh->p_filp)) {
> > + /*
> > + * Someone is racing in the same code, and
> > + * this is the first reference to this file.
> > + */
> > + spin_unlock(&fhb->pb_lock);
>
> The locking seems over-complicated. But I have to admit, I don't
> understand it yet.
I have one new lock - k_nfsd_lock - which is used to add/delete entries on
nfsd_daemon_free_list. The lock is taken either by the daemon, or when an
operation results in creating or deleting a cache entry (read and
unlink).
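The nesting in the quoted fh_cache_upd (per-bucket pb_lock taken first, the global k_nfsd_lock held only around the list add) can be sketched in userspace with C11 atomic-flag spinlocks standing in for the kernel spinlocks; all names here are illustrative:

```c
#include <assert.h>
#include <stdatomic.h>

/* Minimal spinlock built on C11 atomic_flag. */
static atomic_flag toy_bucket_lock = ATOMIC_FLAG_INIT; /* pb_lock */
static atomic_flag toy_list_lock = ATOMIC_FLAG_INIT;   /* k_nfsd_lock */
static int toy_cached, toy_queued;

static void toy_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		;       /* spin until the flag was clear */
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

/* Mirrors the lock nesting in fh_cache_upd: the bucket lock covers the
 * cached-field update, and the global list lock is held only for the
 * brief list-add, keeping contention on the global lock short. */
static void toy_cache_update(void)
{
	toy_lock(&toy_bucket_lock);
	toy_cached = 1;                 /* fh->p_filp, p_exp, p_expires */

	toy_lock(&toy_list_lock);
	toy_queued = 1;                 /* list_add_tail(&fh->p_list, ...) */
	toy_unlock(&toy_list_lock);

	toy_unlock(&toy_bucket_lock);
}
```

As long as every path takes the locks in this same bucket-then-list order (the daemon uses spin_trylock on the bucket to avoid inverting it), the nesting cannot deadlock.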
>
> > + return NULL;
> > + }
> > +
> > + /*
> > + * Hold an extra reference to dentry/exp since these
> > + * are released in fh_put(). 'file' already has an
> > + * extra hold from the first lookup which was never
> > + * dropped.
>
> Why not take a reference on the file instead of the dentry?
I am holding an extra reference to the file in the first lookup, but unless
I also hold the dentry, it gets freed up by fh_put.
> > + depth = nfsdstats.ra_size*11/10;
>
> Help, I'm confused--what's the meaning of this calculation?
That is original code, and I also find it confusing :) What I understand is
that when an entry is not found in the cache, ra_depth[10] is incremented, and
when it is found, ra_depth[0] is incremented (unless the hash table entry
contains elements in the range of 20-30 or more, in which case ra_depth[2],
[3], etc., get incremented). It seems to give an indication of the efficiency
of the hash
table.
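The arithmetic in the quoted code can be modeled directly. The cache size and the histogram length below are assumptions for illustration, not the actual nfsdstats layout; the point is only how a chain depth (or the out-of-range miss marker) scales into a histogram slot:

```c
#include <assert.h>

/* Model of the ra_depth accounting as described above: the chain depth
 * at which an entry was found is scaled into a small histogram, so
 * slot 0 counts near-head hits and the top slot counts misses, which
 * the code marks with the out-of-range depth size*11/10. */

#define TOY_CACHE_SIZE 160      /* stand-in for nfsdstats.ra_size */

static unsigned int toy_depth_stats[12];

/* depth: chain position where the entry was found, or
 * TOY_CACHE_SIZE*11/10 to mark a miss (the quoted code's convention). */
static void toy_record_depth(int depth)
{
	toy_depth_stats[depth * 10 / TOY_CACHE_SIZE]++;
}
```

With these numbers a hit at the head lands in slot 0 and the miss marker 160*11/10 = 176 lands in the top slot (176*10/160 = 11), so a healthy cache shows counts piled up at the low end.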
thanks,
- KK
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 0/11] nfsd: Summary of "Improve NFS server performance"
2009-04-22 5:35 ` Krishna Kumar2
@ 2009-04-22 19:41 ` J. Bruce Fields
0 siblings, 0 replies; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-22 19:41 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: jlayton, Krishna Kumar, linux-nfs
On Wed, Apr 22, 2009 at 11:05:14AM +0530, Krishna Kumar2 wrote:
> "J. Bruce Fields" <bfields@fieldses.org> wrote on 04/22/2009 04:27:28 AM:
>
> > The numbers look promising, but very noisy. Maybe an average of a few
> > tests would give more stable numbers?
>
> Is it enough if I cut out some "popular" bufsizes and #processes out of that
> table and post it? If so, here is a smaller list:
By "noisy" I just mean that they look like there's a lot of randomness.
So, yes, I'm fine with a subset, but I'm curious what the variance is on
repeated runs of each test.
--b.
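One way to get the stable numbers Bruce asks for is to report, for each (processes, bufsize) cell, the mean and sample variance over several repeated runs instead of a single measurement. A minimal sketch (the throughput figures in the test are made up):

```c
#include <assert.h>

/* Mean and sample variance over n repeated throughput measurements.
 * Sample variance divides by n-1, the usual estimator for a small
 * number of benchmark runs. */

static double toy_mean(const double *x, int n)
{
	double s = 0.0;
	for (int i = 0; i < n; i++)
		s += x[i];
	return s / n;
}

static double toy_sample_variance(const double *x, int n)
{
	double m = toy_mean(x, n), s = 0.0;
	for (int i = 0; i < n; i++)
		s += (x[i] - m) * (x[i] - m);
	return s / (n - 1);
}
```

If the variance within a cell is comparable to the org-vs-new difference, that cell's percentage gain is noise; if it is much smaller, the gain is real.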
>
> ----------------------------------------------------------------------
> #Processes Bufsize OrgBW KB/s New BW KB/s %
> ----------------------------------------------------------------------
> 4 4096 60660.67 64309.37 6.01
> 4 16384 60796.86 64069.82 5.38
> 4 65536 60774.45 63735.24 4.87
> 4 131072 61369.66 65261.85 6.34
>
> 8 4096 50650.45 55400.01 9.37
> 8 16384 51114.76 56025.31 9.60
> 8 65536 51000.23 54285.63 6.44
> 8 131072 52996.73 54021.11 1.93
>
> 16 4096 43897.69 46519.89 5.97
> 16 16384 43883.62 46726.03 6.47
> 16 65536 43804.33 44865.20 2.42
> 16 131072 44525.30 43752.62 -1.73
>
> 32 4096 39913.14 42279.21 5.92
> 32 16384 38094.95 42645.32 11.94
> 32 65536 41438.05 41794.76 .86
> 32 131072 41869.06 43644.07 4.23
>
> 64 4096 35468.42 35529.29 .17
> 64 16384 31926.51 32694.91 2.40
> 64 65536 36066.08 36359.77 .81
> 64 131072 35969.04 37462.86 4.15
>
> 128 4096 29311.11 30097.65 2.68
> 128 16384 28283.58 29071.45 2.78
> 128 65536 32617.13 33778.46 3.56
> 128 131072 32972.71 34160.82 3.60
>
> 192 4096 27368.48 29318.49 7.12
> 192 16384 26388.54 29719.15 12.62
> 192 65536 33374.85 33607.14 .69
> 192 131072 33523.48 32601.93 -2.74
> -----------------------------------------------------------------------
> 1170059.88 1223737.36 4.58%
> -----------------------------------------------------------------------
>
> Thanks,
>
> - KK
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/11] nfsd: ADD data structure infrastructure
2009-04-22 5:36 ` Krishna Kumar2
@ 2009-04-22 19:43 ` J. Bruce Fields
0 siblings, 0 replies; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-22 19:43 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: jlayton, Krishna Kumar, linux-nfs
On Wed, Apr 22, 2009 at 11:06:33AM +0530, Krishna Kumar2 wrote:
> "J. Bruce Fields" <bfields@fieldses.org> wrote on 04/22/2009 04:18:38 AM:
>
> > > +/*
> > > + * This is a cache of file handles to quicken file lookup. This also
> > > + * helps prevent multiple open/close of a file when a client reads it.
> > > + *
> > > + * If you increase the number of cached files very much, you'll need to
> > > + * add a hash table here.
> >
> > I'm a little confused here--what do you mean by "add a hash table here"?
>
> That is also in the original code. I feel the comment is wrong as we
> already have a hash table in raparm_hbucket, but I didn't want to change
> typos as part of this patch.
This patch doesn't change anything--it only adds new lines. If you're
trying to make minimal changes, that's admirable, but break up the
patches in such a way that each patch shows that minimal change.
--b.
>
> thanks,
>
> - KK
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
[not found] ` <20090325133707.16437.66360.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
@ 2009-04-22 20:05 ` J. Bruce Fields
2009-04-23 15:55 ` Krishna Kumar2
0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-22 20:05 UTC (permalink / raw)
To: Krishna Kumar; +Cc: Krishna Kumar, jlayton, linux-nfs
On Wed, Mar 25, 2009 at 07:07:07PM +0530, Krishna Kumar wrote:
> @@ -1337,12 +1321,30 @@ nfsd_read(struct svc_rqst *rqstp, struct
> goto out;
> err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
> } else {
> - err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
> - if (err)
> - goto out;
> - err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
> - nfsd_close(file);
> + struct fhcache *fh;
> +
> + /* Check if this fh is cached */
> + fh = nfsd_get_fhcache(fhp->fh_handle.fh_auth[3]);
How do you know fh_auth[3] is sufficient to identify the file reliably?
This looks very fragile to me.
If the goal is to bypass rqst_exp_find(), exportfs_decode_fh() and
friends--that makes me nervous. We should figure out how to make those
lookups faster instead. Or at the very least, make sure we're keying on
the entire filehandle instead of just part of it.
--b.
> + if (fh && fh->p_filp) {
> + /* Got cached values */
> + file = fh->p_filp;
> + fhp->fh_dentry = file->f_dentry;
> + fhp->fh_export = fh->p_exp;
> + err = fh_verify(rqstp, fhp, S_IFREG, NFSD_MAY_READ);
> + } else {
> + /* Nothing in cache */
> + err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ,
> + &file);
> + }
> +
> + if (!err)
> + err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen,
> + count);
> +
> + /* Update cached values if required, and clean up */
> + fh_cache_upd(fh, file, fhp->fh_export);
> }
> +
> out:
> return err;
> }
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-22 20:05 ` [PATCH 3/11] nfsd: CHANGE old function calls to new calls J. Bruce Fields
@ 2009-04-23 15:55 ` Krishna Kumar2
2009-04-23 22:39 ` J. Bruce Fields
0 siblings, 1 reply; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-23 15:55 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: jlayton, Krishna Kumar, linux-nfs
Hi Bruce,
(combining all three of your mails into one response for easier tracking)
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/23/2009 01:35:53 AM:
> How do you know fh_auth[3] is sufficient to identify the file reliably?
> This looks very fragile to me.
>
> If the goal is to bypass rqst_exp_find(), exportfs_decode_fh() and
> friends--that makes me nervous. We should figure out how to make those
> lookups faster instead. Or at the very least, make sure we're keying on
> the entire filehandle instead of just part of it.
To understand how filehandles work, I was looking at nfsd_set_fh_dentry;
here fid is fh->auth[2] (since type is FSID_DEV). However, the unique FH
value is stored in fh->auth[3], while auth[0]/[1] contain the maj/min and
ino. fh->auth[3] is unique for different files and across file systems, but
it is not actually used to find the file (only the exp) - the stored
dev/ino (auth[0] and auth[1]) are used to find the dentry by calling the
underlying filesystem (exportfs_decode_fh).

Keying on the entire filehandle seems reasonable, but I don't think it is
required, as auth[3] seems to be allocated memory which is unique in a
system; we just don't use it for locating the file currently, and I was
proposing that we do. Please correct me if this is wrong, or whether a
better way is possible.
> This patch doesn't change anything--it only adds new lines. If you're
> trying to make minimal changes, that's admirable, but break up the
> patches in such a way that each patch shows that minimal change.
To make each revision compile cleanly, I had to add the infrastructure and
then delete the old one. So it looks like new code. The entire function and
data structures are re-written so it is difficult to break into individual
components where each will compile clean.
> By "noisy" I just mean that they look like there's a lot of randomness.
> So, yes, I'm fine with a subset, but I'm curious what the variance is on
> repeated runs of each test.
Sure, I will try to do this over the weekend and submit.
thanks,
- KK
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-23 15:55 ` Krishna Kumar2
@ 2009-04-23 22:39 ` J. Bruce Fields
2009-04-24 16:17 ` Krishna Kumar2
0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-23 22:39 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: jlayton, Krishna Kumar, linux-nfs
On Thu, Apr 23, 2009 at 09:25:34PM +0530, Krishna Kumar2 wrote:
> Hi Bruce,
>
> (combining all three of your mails into one response for easier tracking)
>
> "J. Bruce Fields" <bfields@fieldses.org> wrote on 04/23/2009 01:35:53 AM:
>
> > How do you know fh_auth[3] is sufficient to identify the file reliably?
> > This looks very fragile to me.
> >
> > If the goal is to bypass rqst_exp_find(), exportfs_decode_fh() and
> > friends--that makes me nervous. We should figure out how to make those
> > lookups faster instead. Or at the very least, make sure we're keying on
> > the entire filehandle instead of just part of it.
>
> To understand how filehandles works, I was looking at nfsd_set_fh_dentry,
> here fid is fh->auth[2] (since type is FSID_DEV). However, the unique FH
> value is stored in fh->auth[3], while auth[0]/[1] contains the maj/min and
> ino. fh->auth[3] is unique for different files and across file systems, but
> it is not actually used to find the file (only the exp) - the stored
> dev/ino (auth[0] and auth[1]) are used to find the dentry by calling the
> underlying filesystem (exportfs_decode_fh).
>
> Keying on the entire filehandle seems reasonable, but I don't think it is
> required as auth[3] seems to be allocated memory which is unique in a
> system,
You lost me. Where do you see auth[3] getting encoded, and what do you
mean by saying it's "allocated memory which is unique in a system"?
There are a lot of different filehandle encoding options. I lose track
of them myself....
> just that we don't use it for locating the file currently and I was
> proposing
> that we do. Please correct me if this is wrong, or whether a better way is
> possible.
>
> > This patch doesn't change anything--it only adds new lines. If you're
> > trying to make minimal changes, that's admirable, but break up the
> > patches in such a way that each patch shows that minimal change.
>
> To make each revision compile cleanly, I had to add the infrastructure and
> then delete the old one. So it looks like new code. The entire function and
> data structures are re-written so it is difficult to break into individual
> components where each will compile clean.
Breaking up a complex change into logical individual steps, each of
which compile and run cleanly, is sometimes tricky. So it may be that
there *is* some way to do it in this case, but that you just haven't
found it yet.
That said, it could be, as you say, that this requires just replacing
everything. In that case, please just start over from scratch and don't
feel bound to keep stuff from the old implementation that you don't
understand.
--b.
> > By "noisy" I just mean that they look like there's a lot of randomness.
> > So, yes, I'm fine with a subset, but I'm curious what the variance is on
> > repeated runs of each test.
>
> Sure, I will try to do this over the weekend and submit.
>
> thanks,
>
> - KK
>
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-23 22:39 ` J. Bruce Fields
@ 2009-04-24 16:17 ` Krishna Kumar2
2009-04-24 16:23 ` J. Bruce Fields
0 siblings, 1 reply; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-24 16:17 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: jlayton, Krishna Kumar, linux-nfs
Hi Bruce,
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/24/2009 04:09:09 AM:
> > > Keying on the entire filehandle seems reasonable, but I don't think it is
> > > required as auth[3] seems to be allocated memory which is unique in a
> > > system,
>
> You lost me. Where do you see auth[3] getting encoded, and what do you
> mean by saying it's "allocated memory which is unique in a system"?
>
> There are a lot of different filehandle encoding options. I lose track
> of them myself....
Sorry, I am wrong - I was thinking of &auth[3] and wrote too fast. I am
testing using your suggestion of hashing on the full filehandle, something
like:
{
	__u32 a = auth[0], b = auth[1], c = auth[2], d = auth[3];

	hash = jhash_3words(a, b, jhash_2words(c, d, 0), 0xfeedbeef) &
	       FHPARM_HASH_MASK;
	...
	/*
	 * Matching check uses something like:
	 * if (fh->p_auth1 == a && fh->p_auth2 == b &&
	 *     fh->p_auth3 == c && fh->p_auth4 == d)
	 */
}
Is this what you had in mind? I am testing some more with this; so far I get
different values for different files and filesystems.

I am not sure if there is an easier way to do a hash and get the unique
file associated with the filehandle - this part of the code is very
complicated.
> > just that we don't use it for locating the file currently and I was
> > proposing
> > that we do. Please correct me if this is wrong, or whether a better way
is
> > possible.
> >
> > > This patch doesn't change anything--it only adds new lines. If you're
> > > trying to make minimal changes, that's admirable, but break up the
> > > patches in such a way that each patch shows that minimal change.
> >
> > To make each revision compile cleanly, I had to add the infrastructure and
> > then delete the old one. So it looks like new code. The entire function and
> > data structures are re-written so it is difficult to break into individual
> > components where each will compile clean.
>
> Breaking up a complex change into logical individual steps, each of
> which compile and run cleanly, is sometimes tricky. So it may be that
> there *is* some way to do it in this case, but that you just haven't
> found it yet.
>
> That said, it could be, as you say, that this requires just replacing
> everything. In that case, please just start over from scratch and don't
> feel bound to keep stuff from the old implementation that you don't
> understand.
OK.
thanks,
- KK
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-24 16:17 ` Krishna Kumar2
@ 2009-04-24 16:23 ` J. Bruce Fields
2009-04-24 16:58 ` Krishna Kumar2
0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-24 16:23 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: jlayton, Krishna Kumar, linux-nfs, Greg Banks
On Fri, Apr 24, 2009 at 09:47:57PM +0530, Krishna Kumar2 wrote:
> Hi Bruce,
>
> "J. Bruce Fields" <bfields@fieldses.org> wrote on 04/24/2009 04:09:09 AM:
>
> > > Keying on the entire filehandle seems reasonable, but I don't think it is
> > > required as auth[3] seems to be allocated memory which is unique in a
> > > system,
> >
> > You lost me. Where do you see auth[3] getting encoded, and what do you
> > mean by saying it's "allocated memory which is unique in a system"?
> >
> > There are a lot of different filehandle encoding options. I lose track
> > of them myself....
>
> Sorry, I am wrong, I was thinking of &auth[3] and wrote too fast. I am
> testing using your suggestion hashing on the full filehandle, something
> like:
>
> {
> 	__u32 a = auth[0], b = auth[1], c = auth[2], d = auth[3];
>
> 	hash = jhash_3words(a, b, jhash_2words(c, d, 0), 0xfeedbeef) &
> 	       FHPARM_HASH_MASK;
> 	...
> 	/*
> 	 * Matching check uses something like:
> 	 * if (fh->p_auth1 == a && fh->p_auth2 == b &&
> 	 *     fh->p_auth3 == c && fh->p_auth4 == d)
> 	 */
> }
>
> Is what you had in mind? I am testing some more with this, so far I get
> different values for different files and filesystems.
>
> I am not sure if there is an easier way to do a hash and get the unique
> file associated with the filehandle, this part of the code is very
> complicated.
Why not just do a hash on the entire filehandle, however long it may be?
(Cc'ing Greg since he says he had some patches which did something
similar, and perhaps he could offer a suggestion.)
--b.
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-24 16:23 ` J. Bruce Fields
@ 2009-04-24 16:58 ` Krishna Kumar2
2009-04-24 19:25 ` J. Bruce Fields
0 siblings, 1 reply; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-24 16:58 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Greg Banks, jlayton, Krishna Kumar, linux-nfs
Hi Bruce,
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/24/2009 09:53:21 PM:
> > {
> > 	__u32 a = auth[0], b = auth[1], c = auth[2], d = auth[3];
> >
> > 	hash = jhash_3words(a, b, jhash_2words(c, d, 0), 0xfeedbeef) &
> > 	       FHPARM_HASH_MASK;
> > 	...
> > 	/*
> > 	 * Matching check uses something like:
> > 	 * if (fh->p_auth1 == a && fh->p_auth2 == b &&
> > 	 *     fh->p_auth3 == c && fh->p_auth4 == d)
> > 	 */
> > }
> >
> > Is what you had in mind? I am testing some more with this, so far I get
> > different values for different files and filesystems.
> >
> > I am not sure if there is an easier way to do a hash and get the unique
> > file associated with the filehandle, this part of the code is very
> > complicated.
>
> Why not just do a hash on the entire filehandle, however long it may be?
I am not sure how many numbers to hash on; usually the first 4 numbers are
the ino, inode generation, parent inode, parent inode generation, etc, and
together they are a unique match. Or filesystems can have their own encode
handlers but store similar stuff in these indices. I guess a memcmp could
also be done if I knew the length of the auth being used.
> (Cc'ing Greg since he says he had some patches which did something
> similar, and perhaps he could offer a suggestion.)
OK, will wait for response from Greg.
thanks,
- KK
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-24 16:58 ` Krishna Kumar2
@ 2009-04-24 19:25 ` J. Bruce Fields
2009-04-26 11:16 ` Krishna Kumar2
0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-04-24 19:25 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: Greg Banks, jlayton, Krishna Kumar, linux-nfs
On Fri, Apr 24, 2009 at 10:28:18PM +0530, Krishna Kumar2 wrote:
> Hi Bruce,
>
> "J. Bruce Fields" <bfields@fieldses.org> wrote on 04/24/2009 09:53:21 PM:
>
> > > {
> > > 	__u32 a = auth[0], b = auth[1], c = auth[2], d = auth[3];
> > >
> > > 	hash = jhash_3words(a, b, jhash_2words(c, d, 0), 0xfeedbeef) &
> > > 	       FHPARM_HASH_MASK;
> > > 	...
> > > 	/*
> > > 	 * Matching check uses something like:
> > > 	 * if (fh->p_auth1 == a && fh->p_auth2 == b &&
> > > 	 *     fh->p_auth3 == c && fh->p_auth4 == d)
> > > 	 */
> > > }
> > >
> > > Is what you had in mind? I am testing some more with this, so far I get
> > > different values for different files and filesystems.
> > >
> > > I am not sure if there is an easier way to do a hash and get the unique
> > > file associated with the filehandle, this part of the code is very
> > > complicated.
> >
> > Why not just do a hash on the entire filehandle, however long it may be?
>
> I am not sure how many numbers to hash on, usually the first 4 numbers are
> the ino, inode generation, parent inode, parent inode generation, etc, and
> is a unique match. Or filesystems can have their own encode handlers but
> store similar stuff in these indices. I guess a memcmp could also be done
> if I know the length of the auth being used.
Why not use the whole thing?:
fh1->fh_size == fh2->fh_size
&& memcmp(fh1->fh_base, fh2->fh_base, fh1->fh_size) == 0
--b.
>
> > (Cc'ing Greg since he says he had some patches which did something
> > similar, and perhaps he could offer a suggestion.)
>
> OK, will wait for response from Greg.
>
> thanks,
>
> - KK
>
* Re: [PATCH 3/11] nfsd: CHANGE old function calls to new calls
2009-04-24 19:25 ` J. Bruce Fields
@ 2009-04-26 11:16 ` Krishna Kumar2
0 siblings, 0 replies; 19+ messages in thread
From: Krishna Kumar2 @ 2009-04-26 11:16 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Greg Banks, jlayton, Krishna Kumar, linux-nfs
"J. Bruce Fields" <bfields@fieldses.org> wrote on 04/25/2009 12:55:06 AM:
> Why not use the whole thing?:
>
> fh1->fh_size == fh2->fh_size
> && memcmp(fh1->fh_base, fh2->fh_base, fh1->fh_size) == 0
OK - so I just add a knfsd_fh in the cache and compare fh's. That
sounds good, let me test and see what is the result.
thanks,
- KK
end of thread, other threads:[~2009-04-26 11:17 UTC | newest]
Thread overview: 19+ messages
-- links below jump to the message on this page --
[not found] <20090325133607.16437.33288.sendpatchset@localhost.localdomain>
[not found] ` <20090325133607.16437.33288.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-04-17 18:46 ` [PATCH 0/11] nfsd: Summary of "Improve NFS server performance" J. Bruce Fields
2009-04-17 19:30 ` Krishna Kumar2
2009-04-21 22:57 ` J. Bruce Fields
2009-04-22 5:35 ` Krishna Kumar2
2009-04-22 19:41 ` J. Bruce Fields
[not found] ` <20090325133628.16437.11092.sendpatchset@localhost.localdomain>
[not found] ` <20090325133628.16437.11092.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-04-21 22:48 ` [PATCH 1/11] nfsd: ADD data structure infrastructure J. Bruce Fields
2009-04-22 5:36 ` Krishna Kumar2
2009-04-22 19:43 ` J. Bruce Fields
2009-04-21 23:05 ` J. Bruce Fields
[not found] ` <20090325133647.16437.59567.sendpatchset@localhost.localdomain>
[not found] ` <20090325133647.16437.59567.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-04-22 2:54 ` [PATCH 2/11] nfsd: ADD new function infrastructure J. Bruce Fields
2009-04-22 5:37 ` Krishna Kumar2
[not found] ` <20090325133707.16437.66360.sendpatchset@localhost.localdomain>
[not found] ` <20090325133707.16437.66360.sendpatchset-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-04-22 20:05 ` [PATCH 3/11] nfsd: CHANGE old function calls to new calls J. Bruce Fields
2009-04-23 15:55 ` Krishna Kumar2
2009-04-23 22:39 ` J. Bruce Fields
2009-04-24 16:17 ` Krishna Kumar2
2009-04-24 16:23 ` J. Bruce Fields
2009-04-24 16:58 ` Krishna Kumar2
2009-04-24 19:25 ` J. Bruce Fields
2009-04-26 11:16 ` Krishna Kumar2