[PATCH RFC 0/1] Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file


* [PATCH RFC 0/1] Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
@ 2013-05-23 21:53 Scott Mayhew
  2013-05-23 21:53 ` [PATCH RFC 1/1] NFS: " Scott Mayhew
  0 siblings, 1 reply; 9+ messages in thread
From: Scott Mayhew @ 2013-05-23 21:53 UTC (permalink / raw)
  To: linux-nfs

We had a customer experience some performance issues after migrating their
MQSeries servers from AIX to Linux.  Their performance benchmark basically puts
50000 messages in a message queue, and a tcpdump captured during these tests
would show a ton of very small writes that were sequential but not contiguous.
After doing some investigation with systemtap we determined that when we called
nfs_updatepage() we were not being allowed to extend the write because the
inode->i_flock was not NULL.  So then later when we'd arrive at
nfs_try_to_update_request() we would always wind up calling nfs_wb_page().
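
For readers unfamiliar with this path, the "extend the write" step is plain
arithmetic on the (offset, count) pair handed to nfs_updatepage(). The
user-space sketch below models only that arithmetic. The struct, the function
name, and MODEL_PAGE_SIZE are hypothetical and do not exist in the kernel;
valid_len stands in for what nfs_page_length(page) returns.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Hypothetical stand-in for the (offset, count) arguments of nfs_updatepage(). */
struct write_span {
	unsigned int offset; /* start of the write within the page */
	unsigned int count;  /* number of bytes to write */
};

/*
 * Model of the extension step: when extension is allowed, the write is
 * widened to start at the beginning of the page and to cover at least the
 * part of the page that lies within the file (valid_len).
 */
static struct write_span extend_write(struct write_span w,
				      unsigned int valid_len, int can_extend)
{
	assert(w.offset + w.count <= MODEL_PAGE_SIZE);
	assert(valid_len <= MODEL_PAGE_SIZE);

	if (can_extend) {
		unsigned int end = w.offset + w.count;

		/* Same shape as: count = max(count + offset, nfs_page_length(page)) */
		w.count = end > valid_len ? end : valid_len;
		w.offset = 0;
	}
	return w;
}
```

With a 1024-byte write at offset 512 into a fully valid page, the extended
span becomes offset 0, count 4096, so neighbouring small writes to the same
page can be merged into one request instead of each forcing a flush.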

I gave the customer a test kernel using a patch similar to the one that follows
and the test results were favorable, with far fewer writes, the majority of
which were utilizing the full wsize.  For example, the top ten write sizes and
number of occurrences from a tcpdump captured while running the benchmark with
an unpatched kernel:

$ tshark -r before.pcap.gz -R "nfs.opcode==write && nfs.stateid4.hash==0xf09c" \
    -T fields -e nfs.write.data_length | sort | uniq -c | sort -nr | head
   5852 512
   5575 1024
   2262 1035
   2160 1121
   1661 1023
   1460 1074
   1413 1073
   1394 1152
   1244 1055
    933 1804

contrasted with a tcpdump captured while running the benchmark with the test
kernel:

$ tshark -r after.pcap.gz -R "nfs.opcode==write && nfs.stateid4.hash==0x9f87" \
    -T fields -e nfs.write.data_length | sort | uniq -c | sort -nr | head
    917 65536
     76 36864
     69 20480
     55 53248
     32 18432
     31 49152
     31 4096
     31 32768
     30 16384
     25 65536,4096

Scott Mayhew (1):
  NFS: Allow nfs_updatepage to extend a write to cover a full page when
    we have a lock that covers the entire file

 fs/nfs/write.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
1.7.11.7


* [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-05-23 21:53 [PATCH RFC 0/1] Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file Scott Mayhew
@ 2013-05-23 21:53 ` Scott Mayhew
  2013-05-23 22:15   ` Myklebust, Trond
  2013-05-23 22:24   ` Jeff Layton
  0 siblings, 2 replies; 9+ messages in thread
From: Scott Mayhew @ 2013-05-23 21:53 UTC (permalink / raw)
  To: linux-nfs

Currently nfs_updatepage allows a write to be extended to cover a full
page only if we don't have a byte range lock on the file... but if we've
got the whole file locked, then we should be allowed to extend the
write.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfs/write.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index a2c7c28..f35fb4f 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
 		file->f_path.dentry->d_name.name, count,
 		(long long)(page_file_offset(page) + offset));
 
-	/* If we're not using byte range locks, and we know the page
+	/* If we're not using byte range locks (or if the range of the
+	 * lock covers the entire file), and we know the page
 	 * is up to date, it may be more efficient to extend the write
 	 * to cover the entire page in order to avoid fragmentation
 	 * inefficiencies.
 	 */
 	if (nfs_write_pageuptodate(page, inode) &&
-			inode->i_flock == NULL &&
+			(inode->i_flock == NULL ||
+			(inode->i_flock->fl_start == 0 &&
+			inode->i_flock->fl_end == OFFSET_MAX)) &&
 			!(file->f_flags & O_DSYNC)) {
 		count = max(count + offset, nfs_page_length(page));
 		offset = 0;
-- 
1.7.11.7



* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-05-23 21:53 ` [PATCH RFC 1/1] NFS: " Scott Mayhew
@ 2013-05-23 22:15   ` Myklebust, Trond
  2013-05-23 22:24   ` Jeff Layton
  1 sibling, 0 replies; 9+ messages in thread
From: Myklebust, Trond @ 2013-05-23 22:15 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: linux-nfs

Hi Scott,

On Thu, 2013-05-23 at 17:53 -0400, Scott Mayhew wrote:
> Currently nfs_updatepage allows a write to be extended to cover a full
> page only if we don't have a byte range lock on the file... but if we've
> got the whole file locked, then we should be allowed to extend the
> write.
> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfs/write.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index a2c7c28..f35fb4f 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
>  		file->f_path.dentry->d_name.name, count,
>  		(long long)(page_file_offset(page) + offset));
>  
> -	/* If we're not using byte range locks, and we know the page
> +	/* If we're not using byte range locks (or if the range of the
> +	 * lock covers the entire file), and we know the page
>  	 * is up to date, it may be more efficient to extend the write
>  	 * to cover the entire page in order to avoid fragmentation
>  	 * inefficiencies.
>  	 */
>  	if (nfs_write_pageuptodate(page, inode) &&
> -			inode->i_flock == NULL &&
> +			(inode->i_flock == NULL ||
> +			(inode->i_flock->fl_start == 0 &&
> +			inode->i_flock->fl_end == OFFSET_MAX)) &&
>  			!(file->f_flags & O_DSYNC)) {

Can we put this condition into a helper function? I started with the
"nfs_write_pageuptodate()" thingy, but now we're starting to add in
extra complications...

Thanks!
  Trond

>  		count = max(count + offset, nfs_page_length(page));
>  		offset = 0;


-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-05-23 21:53 ` [PATCH RFC 1/1] NFS: " Scott Mayhew
  2013-05-23 22:15   ` Myklebust, Trond
@ 2013-05-23 22:24   ` Jeff Layton
  2013-05-23 22:30     ` Myklebust, Trond
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2013-05-23 22:24 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: linux-nfs

On Thu, 23 May 2013 17:53:41 -0400
Scott Mayhew <smayhew@redhat.com> wrote:

> Currently nfs_updatepage allows a write to be extended to cover a full
> page only if we don't have a byte range lock on the file... but if we've
> got the whole file locked, then we should be allowed to extend the
> write.
> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfs/write.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index a2c7c28..f35fb4f 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
>  		file->f_path.dentry->d_name.name, count,
>  		(long long)(page_file_offset(page) + offset));
>  
> -	/* If we're not using byte range locks, and we know the page
> +	/* If we're not using byte range locks (or if the range of the
> +	 * lock covers the entire file), and we know the page
>  	 * is up to date, it may be more efficient to extend the write
>  	 * to cover the entire page in order to avoid fragmentation
>  	 * inefficiencies.
>  	 */
>  	if (nfs_write_pageuptodate(page, inode) &&
> -			inode->i_flock == NULL &&
> +			(inode->i_flock == NULL ||
> +			(inode->i_flock->fl_start == 0 &&
> +			inode->i_flock->fl_end == OFFSET_MAX)) &&
>  			!(file->f_flags & O_DSYNC)) {
>  		count = max(count + offset, nfs_page_length(page));
>  		offset = 0;

Sounds like a reasonable proposition, but I think you might need to do
more vetting of the locks...

For instance, does it make sense to do this if it's a F_RDLCK? Also,
you're only looking at the first lock in the i_flock list. Might it
make more sense to walk the list and see whether the page might be
entirely covered by a lock that doesn't extend over the whole file?

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-05-23 22:24   ` Jeff Layton
@ 2013-05-23 22:30     ` Myklebust, Trond
  2013-05-24 11:24       ` Jeff Layton
  0 siblings, 1 reply; 9+ messages in thread
From: Myklebust, Trond @ 2013-05-23 22:30 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Scott Mayhew, linux-nfs

On Thu, 2013-05-23 at 18:24 -0400, Jeff Layton wrote:
> On Thu, 23 May 2013 17:53:41 -0400
> Scott Mayhew <smayhew@redhat.com> wrote:
> 
> > Currently nfs_updatepage allows a write to be extended to cover a full
> > page only if we don't have a byte range lock on the file... but if we've
> > got the whole file locked, then we should be allowed to extend the
> > write.
> > 
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> >  fs/nfs/write.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > index a2c7c28..f35fb4f 100644
> > --- a/fs/nfs/write.c
> > +++ b/fs/nfs/write.c
> > @@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
> >  		file->f_path.dentry->d_name.name, count,
> >  		(long long)(page_file_offset(page) + offset));
> >  
> > -	/* If we're not using byte range locks, and we know the page
> > +	/* If we're not using byte range locks (or if the range of the
> > +	 * lock covers the entire file), and we know the page
> >  	 * is up to date, it may be more efficient to extend the write
> >  	 * to cover the entire page in order to avoid fragmentation
> >  	 * inefficiencies.
> >  	 */
> >  	if (nfs_write_pageuptodate(page, inode) &&
> > -			inode->i_flock == NULL &&
> > +			(inode->i_flock == NULL ||
> > +			(inode->i_flock->fl_start == 0 &&
> > +			inode->i_flock->fl_end == OFFSET_MAX)) &&
> >  			!(file->f_flags & O_DSYNC)) {
> >  		count = max(count + offset, nfs_page_length(page));
> >  		offset = 0;
> 
> Sounds like a reasonable proposition, but I think you might need to do
> more vetting of the locks...
> 
> For instance, does it make sense to do this if it's a F_RDLCK? Also,
> you're only looking at the first lock in the i_flock list. Might it
> make more sense to walk the list and see whether the page might be
> entirely covered by a lock that doesn't extend over the whole file?
> 

I'm guessing that the answer to both these questions is "no":
- Anybody who is writing while holding a F_RDLCK is likely doing
something wrong.
- Walking the lock list on every write can quickly get painful if we
have lots of small locks.

However it may make a lot of sense to look at whether or not we hold a
NFSv4 write delegation.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-05-23 22:30     ` Myklebust, Trond
@ 2013-05-24 11:24       ` Jeff Layton
  2013-06-04 13:21         ` Scott Mayhew
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2013-05-24 11:24 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: Scott Mayhew, linux-nfs

On Thu, 23 May 2013 22:30:10 +0000
"Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:

> On Thu, 2013-05-23 at 18:24 -0400, Jeff Layton wrote:
> > On Thu, 23 May 2013 17:53:41 -0400
> > Scott Mayhew <smayhew@redhat.com> wrote:
> > 
> > > Currently nfs_updatepage allows a write to be extended to cover a full
> > > page only if we don't have a byte range lock on the file... but if we've
> > > got the whole file locked, then we should be allowed to extend the
> > > write.
> > > 
> > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > ---
> > >  fs/nfs/write.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > > index a2c7c28..f35fb4f 100644
> > > --- a/fs/nfs/write.c
> > > +++ b/fs/nfs/write.c
> > > @@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
> > >  		file->f_path.dentry->d_name.name, count,
> > >  		(long long)(page_file_offset(page) + offset));
> > >  
> > > -	/* If we're not using byte range locks, and we know the page
> > > +	/* If we're not using byte range locks (or if the range of the
> > > +	 * lock covers the entire file), and we know the page
> > >  	 * is up to date, it may be more efficient to extend the write
> > >  	 * to cover the entire page in order to avoid fragmentation
> > >  	 * inefficiencies.
> > >  	 */
> > >  	if (nfs_write_pageuptodate(page, inode) &&
> > > -			inode->i_flock == NULL &&
> > > +			(inode->i_flock == NULL ||
> > > +			(inode->i_flock->fl_start == 0 &&
> > > +			inode->i_flock->fl_end == OFFSET_MAX)) &&
> > >  			!(file->f_flags & O_DSYNC)) {
> > >  		count = max(count + offset, nfs_page_length(page));
> > >  		offset = 0;
> > 
> > Sounds like a reasonable proposition, but I think you might need to do
> > more vetting of the locks...
> > 
> > For instance, does it make sense to do this if it's a F_RDLCK? Also,
> > you're only looking at the first lock in the i_flock list. Might it
> > make more sense to walk the list and see whether the page might be
> > entirely covered by a lock that doesn't extend over the whole file?
> > 
> 
> I'm guessing that the answer to both these questions is "no":
> - Anybody who is writing while holding a F_RDLCK is likely doing
> something wrong.

Right, so I think we ought to be conservative here and not extend the
write if this is an F_RDLCK.

> - Walking the lock list on every write can quickly get painful if we
> have lots of small locks.
> 

True, but it's probably still preferable to do that than to do a bunch
of small I/Os to the server. But, that's an optimization that can be
done later. Hardly anyone does real byte-range locking so I'm fine with
this approach for now.

> However it may make a lot of sense to look at whether or not we hold a
> NFSv4 write delegation.
> 

Yes, that would be a good thing too. Having a helper function like you
suggested should make it easier to encapsulate that logic sanely.
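
To make the deferred optimization concrete, here is a rough user-space sketch
of what "walk the list and see whether the page is covered" could look like.
It is not kernel code: struct model_lock, the MODEL_* constants, and the
function name are hypothetical, and the model ignores lock ownership and the
case where several adjacent locks jointly cover the page.

```c
#include <assert.h>
#include <stddef.h>

enum model_lock_type { MODEL_RDLCK, MODEL_WRLCK };

/* Hypothetical, simplified stand-in for one entry in a per-inode lock list. */
struct model_lock {
	long long start;               /* first byte covered */
	long long end;                 /* last byte covered (inclusive) */
	enum model_lock_type type;
	const struct model_lock *next; /* next lock in the list, or NULL */
};

/*
 * Return 1 if some single write lock in the list covers every byte of the
 * page [page_start, page_start + page_len - 1], otherwise 0. The cost is
 * linear in the number of locks, which is the overhead concern raised in
 * the thread for files with many small locks.
 */
static int page_covered_by_write_lock(const struct model_lock *head,
				      long long page_start, long long page_len)
{
	long long page_end;

	assert(page_start >= 0 && page_len > 0);
	page_end = page_start + page_len - 1;

	for (const struct model_lock *fl = head; fl != NULL; fl = fl->next) {
		if (fl->type == MODEL_WRLCK &&
		    fl->start <= page_start && fl->end >= page_end)
			return 1;
	}
	return 0;
}
```

Read locks are skipped on purpose, in line with the conservative stance on
F_RDLCK discussed above.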

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-05-24 11:24       ` Jeff Layton
@ 2013-06-04 13:21         ` Scott Mayhew
  2013-06-04 14:01           ` Jeff Layton
  2013-06-25 19:15           ` Jeff Layton
  0 siblings, 2 replies; 9+ messages in thread
From: Scott Mayhew @ 2013-06-04 13:21 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Myklebust, Trond, linux-nfs

[-- Attachment #1: Type: text/plain, Size: 3290 bytes --]

On Fri, 24 May 2013, Jeff Layton wrote:

> On Thu, 23 May 2013 22:30:10 +0000
> "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:
> 
> > On Thu, 2013-05-23 at 18:24 -0400, Jeff Layton wrote:
> > > On Thu, 23 May 2013 17:53:41 -0400
> > > Scott Mayhew <smayhew@redhat.com> wrote:
> > > 
> > > > Currently nfs_updatepage allows a write to be extended to cover a full
> > > > page only if we don't have a byte range lock on the file... but if we've
> > > > got the whole file locked, then we should be allowed to extend the
> > > > write.
> > > > 
> > > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > > ---
> > > >  fs/nfs/write.c | 7 +++++--
> > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > > > index a2c7c28..f35fb4f 100644
> > > > --- a/fs/nfs/write.c
> > > > +++ b/fs/nfs/write.c
> > > > @@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
> > > >  		file->f_path.dentry->d_name.name, count,
> > > >  		(long long)(page_file_offset(page) + offset));
> > > >  
> > > > -	/* If we're not using byte range locks, and we know the page
> > > > +	/* If we're not using byte range locks (or if the range of the
> > > > +	 * lock covers the entire file), and we know the page
> > > >  	 * is up to date, it may be more efficient to extend the write
> > > >  	 * to cover the entire page in order to avoid fragmentation
> > > >  	 * inefficiencies.
> > > >  	 */
> > > >  	if (nfs_write_pageuptodate(page, inode) &&
> > > > -			inode->i_flock == NULL &&
> > > > +			(inode->i_flock == NULL ||
> > > > +			(inode->i_flock->fl_start == 0 &&
> > > > +			inode->i_flock->fl_end == OFFSET_MAX)) &&
> > > >  			!(file->f_flags & O_DSYNC)) {
> > > >  		count = max(count + offset, nfs_page_length(page));
> > > >  		offset = 0;
> > > 
> > > Sounds like a reasonable proposition, but I think you might need to do
> > > more vetting of the locks...
> > > 
> > > For instance, does it make sense to do this if it's a F_RDLCK? Also,
> > > you're only looking at the first lock in the i_flock list. Might it
> > > make more sense to walk the list and see whether the page might be
> > > entirely covered by a lock that doesn't extend over the whole file?
> > > 
> > 
> > I'm guessing that the answer to both these questions is "no":
> > - Anybody who is writing while holding a F_RDLCK is likely doing
> > something wrong.
> 
> Right, so I think we ought to be conservative here and not extend the
> write if this is an F_RDLCK.
> 
> > - Walking the lock list on every write can quickly get painful if we
> > have lots of small locks.
> > 
> 
> True, but it's probably still preferable to do that than to do a bunch
> of small I/Os to the server. But, that's an optimization that can be
> done later. Hardly anyone does real byte-range locking so I'm fine with
> this approach for now.
> 
> > However it may make a lot of sense to look at whether or not we hold a
> > NFSv4 write delegation.
> > 
> 
> Yes, that would be a good thing too. Having a helper function like you
> suggested should make it easier to encapsulate that logic sanely.
> 
Here's an updated patch that moves the logic to a helper function,
checks to see if we have a write delegation, and checks the lock type.

-Scott

[-- Attachment #2: 0001-NFS-Allow-nfs_updatepage-to-extend-a-write-under-add.patch --]
[-- Type: text/plain, Size: 2417 bytes --]

From 3938f17ef84f5c4889fd7f827109f89c932df569 Mon Sep 17 00:00:00 2001
From: Scott Mayhew <smayhew@redhat.com>
Date: Wed, 22 May 2013 17:03:17 -0400
Subject: [PATCH RFC] NFS: Allow nfs_updatepage to extend a write under
 additional circumstances

Currently nfs_updatepage allows a write to be extended to cover a full
page only if we don't have a byte range lock on the file... but if
we have a write delegation on the file or if we have the whole file
locked for writing then we should be allowed to extend the write as
well.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfs/write.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index a2c7c28..c8a1bcc 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -888,6 +888,28 @@ out:
 	return PageUptodate(page) != 0;
 }
 
+/* If we know the page is up to date, and we're not using byte range locks (or
+ * if we have the whole file locked for writing), it may be more efficient to
+ * extend the write to cover the entire page in order to avoid fragmentation
+ * inefficiencies.
+ *
+ * If the file is opened for synchronous writes or if we have a write delegation
+ * from the server then we can just skip the rest of the checks.
+ */
+static int nfs_can_extend_write(struct file *file, struct page *page, struct inode *inode)
+{
+	if (file->f_flags & O_DSYNC)
+		return 0;
+	if (nfs_have_delegation(inode, FMODE_WRITE))
+		return 1;
+	if (nfs_write_pageuptodate(page, inode) && (inode->i_flock == NULL ||
+			(inode->i_flock->fl_start == 0 &&
+			inode->i_flock->fl_end == OFFSET_MAX &&
+			inode->i_flock->fl_type != F_RDLCK)))
+		return 1;
+	return 0;
+}
+
 /*
  * Update and possibly write a cached page of an NFS file.
  *
@@ -908,14 +930,7 @@ int nfs_updatepage(struct file *file, struct page *page,
 		file->f_path.dentry->d_name.name, count,
 		(long long)(page_file_offset(page) + offset));
 
-	/* If we're not using byte range locks, and we know the page
-	 * is up to date, it may be more efficient to extend the write
-	 * to cover the entire page in order to avoid fragmentation
-	 * inefficiencies.
-	 */
-	if (nfs_write_pageuptodate(page, inode) &&
-			inode->i_flock == NULL &&
-			!(file->f_flags & O_DSYNC)) {
+	if (nfs_can_extend_write(file, page, inode)) {
 		count = max(count + offset, nfs_page_length(page));
 		offset = 0;
 	}
-- 
1.7.11.7
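
The order of the checks in nfs_can_extend_write() may be easier to follow in
a user-space model. The sketch below mirrors the patch's decision order only;
struct model_state, struct model_lock, and the MODEL_* names are hypothetical
stand-ins for the file flags, the delegation check, the page state, and the
first entry in inode->i_flock.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_OFFSET_MAX LLONG_MAX /* stand-in for the kernel's OFFSET_MAX */

enum model_lock_type { MODEL_RDLCK, MODEL_WRLCK };

/* Hypothetical stand-in for the first lock on the inode. */
struct model_lock {
	long long start;
	long long end;
	enum model_lock_type type;
};

/* Hypothetical bundle of the inputs the helper in the patch consults. */
struct model_state {
	int o_dsync;                         /* file opened with O_DSYNC */
	int have_write_delegation;           /* models nfs_have_delegation(inode, FMODE_WRITE) */
	int page_uptodate;                   /* models nfs_write_pageuptodate(page, inode) */
	const struct model_lock *first_lock; /* NULL when no locks are held */
};

static int model_can_extend_write(const struct model_state *s)
{
	const struct model_lock *fl;

	assert(s != NULL);
	fl = s->first_lock;

	if (s->o_dsync)
		return 0; /* synchronous writes are never extended */
	if (s->have_write_delegation)
		return 1; /* a write delegation short-circuits the lock checks */
	if (!s->page_uptodate)
		return 0;
	if (fl == NULL)
		return 1; /* no byte range locks in use */
	/* Only the first lock is inspected: whole file, and not a read lock. */
	return fl->start == 0 && fl->end == MODEL_OFFSET_MAX &&
	       fl->type != MODEL_RDLCK;
}
```

Note that in this ordering, as in the patch, a write delegation allows the
extension without consulting the page-up-to-date check.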



* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-06-04 13:21         ` Scott Mayhew
@ 2013-06-04 14:01           ` Jeff Layton
  2013-06-25 19:15           ` Jeff Layton
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2013-06-04 14:01 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: Myklebust, Trond, linux-nfs

On Tue, 4 Jun 2013 09:21:49 -0400
Scott Mayhew <smayhew@redhat.com> wrote:


> 
> Currently nfs_updatepage allows a write to be extended to cover a full
> page only if we don't have a byte range lock on the file... but if
> we have a write delegation on the file or if we have the whole file
> locked for writing then we should be allowed to extend the write as
> well.
> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfs/write.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index a2c7c28..c8a1bcc 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -888,6 +888,28 @@ out:
>  	return PageUptodate(page) != 0;
>  }
>  
> +/* If we know the page is up to date, and we're not using byte range locks (or
> + * if we have the whole file locked for writing), it may be more efficient to
> + * extend the write to cover the entire page in order to avoid fragmentation
> + * inefficiencies.
> + *
> + * If the file is opened for synchronous writes or if we have a write delegation
> + * from the server then we can just skip the rest of the checks.
> + */
> +static int nfs_can_extend_write(struct file *file, struct page *page, struct inode *inode)
> +{
> +	if (file->f_flags & O_DSYNC)
> +		return 0;
> +	if (nfs_have_delegation(inode, FMODE_WRITE))
> +		return 1;
> +	if (nfs_write_pageuptodate(page, inode) && (inode->i_flock == NULL ||
> +			(inode->i_flock->fl_start == 0 &&
> +			inode->i_flock->fl_end == OFFSET_MAX &&
> +			inode->i_flock->fl_type != F_RDLCK)))
> +		return 1;
> +	return 0;
> +}
> +
>  /*
>   * Update and possibly write a cached page of an NFS file.
>   *
> @@ -908,14 +930,7 @@ int nfs_updatepage(struct file *file, struct page *page,
>  		file->f_path.dentry->d_name.name, count,
>  		(long long)(page_file_offset(page) + offset));
>  
> -	/* If we're not using byte range locks, and we know the page
> -	 * is up to date, it may be more efficient to extend the write
> -	 * to cover the entire page in order to avoid fragmentation
> -	 * inefficiencies.
> -	 */
> -	if (nfs_write_pageuptodate(page, inode) &&
> -			inode->i_flock == NULL &&
> -			!(file->f_flags & O_DSYNC)) {
> +	if (nfs_can_extend_write(file, page, inode)) {
>  		count = max(count + offset, nfs_page_length(page));
>  		offset = 0;
>  	}

Looks reasonable to me...

Acked-by: Jeff Layton <jlayton@redhat.com>


* Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
  2013-06-04 13:21         ` Scott Mayhew
  2013-06-04 14:01           ` Jeff Layton
@ 2013-06-25 19:15           ` Jeff Layton
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2013-06-25 19:15 UTC (permalink / raw)
  To: Scott Mayhew; +Cc: Myklebust, Trond, linux-nfs

On Tue, 4 Jun 2013 09:21:49 -0400
Scott Mayhew <smayhew@redhat.com> wrote:

> From: Scott Mayhew <smayhew@redhat.com>
> To: Jeff Layton <jlayton@redhat.com>
> Cc: "Myklebust, Trond" <Trond.Myklebust@netapp.com>, "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
> Subject: Re: [PATCH RFC 1/1] NFS: Allow nfs_updatepage to extend a write to cover a full page when we have a lock that covers the entire file
> Date: Tue, 4 Jun 2013 09:21:49 -0400
> Sender: linux-nfs-owner@vger.kernel.org
> User-Agent: Mutt/1.5.20 (2009-06-14)
> 
> On Fri, 24 May 2013, Jeff Layton wrote:
> 
> > On Thu, 23 May 2013 22:30:10 +0000
> > "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:
> >   
> > > On Thu, 2013-05-23 at 18:24 -0400, Jeff Layton wrote:  
> > > > On Thu, 23 May 2013 17:53:41 -0400
> > > > Scott Mayhew <smayhew@redhat.com> wrote:
> > > >   
> > > > > Currently nfs_updatepage allows a write to be extended to cover a full
> > > > > page only if we don't have a byte range lock on the file... but if we've
> > > > > got the whole file locked, then we should be allowed to extend the
> > > > > write.
> > > > > 
> > > > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > > > ---
> > > > >  fs/nfs/write.c | 7 +++++--
> > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > > > > index a2c7c28..f35fb4f 100644
> > > > > --- a/fs/nfs/write.c
> > > > > +++ b/fs/nfs/write.c
> > > > > @@ -908,13 +908,16 @@ int nfs_updatepage(struct file *file, struct page *page,
> > > > >  		file->f_path.dentry->d_name.name, count,
> > > > >  		(long long)(page_file_offset(page) + offset));
> > > > >  
> > > > > -	/* If we're not using byte range locks, and we know the page
> > > > > +	/* If we're not using byte range locks (or if the range of the
> > > > > +	 * lock covers the entire file), and we know the page
> > > > >  	 * is up to date, it may be more efficient to extend the write
> > > > >  	 * to cover the entire page in order to avoid fragmentation
> > > > >  	 * inefficiencies.
> > > > >  	 */
> > > > >  	if (nfs_write_pageuptodate(page, inode) &&
> > > > > -			inode->i_flock == NULL &&
> > > > > +			(inode->i_flock == NULL ||
> > > > > +			(inode->i_flock->fl_start == 0 &&
> > > > > +			inode->i_flock->fl_end == OFFSET_MAX)) &&
> > > > >  			!(file->f_flags & O_DSYNC)) {
> > > > >  		count = max(count + offset, nfs_page_length(page));
> > > > >  		offset = 0;  
> > > > 
> > > > Sounds like a reasonable proposition, but I think you might need to do
> > > > more vetting of the locks...
> > > > 
> > > > For instance, does it make sense to do this if it's a F_RDLCK? Also,
> > > > you're only looking at the first lock in the i_flock list. Might it
> > > > make more sense to walk the list and see whether the page might be
> > > > entirely covered by a lock that doesn't extend over the whole file?
> > > >   
> > > 
> > > > I'm guessing that the answer to both these questions is "no":
> > > - Anybody who is writing while holding a F_RDLCK is likely doing
> > > something wrong.  
> > 
> > Right, so I think we ought to be conservative here and not extend the
> > write if this is an F_RDLCK.
> >   
> > > - Walking the lock list on every write can quickly get painful if we
> > > have lots of small locks.
> > >   
> > 
> > True, but it's probably still preferable to do that than to do a bunch
> > of small I/Os to the server. But, that's an optimization that can be
> > done later. Hardly anyone does real byte-range locking so I'm fine with
> > this approach for now.
> >   
> > > However it may make a lot of sense to look at whether or not we hold a
> > > NFSv4 write delegation.
> > >   
> > 
> > Yes, that would be a good thing too. Having a helper function like you
> > suggested should make it easier to encapsulate that logic sanely.
> >   
> Here's an updated patch that moves the logic to a helper function,
> checks to see if we have a write delegation, and checks the lock type.
> 
> -Scott
> 
> From 3938f17ef84f5c4889fd7f827109f89c932df569 Mon Sep 17 00:00:00 2001
> From: Scott Mayhew <smayhew@redhat.com>
> Date: Wed, 22 May 2013 17:03:17 -0400
> Subject: [PATCH RFC] NFS: Allow nfs_updatepage to extend a write under
>  additional circumstances
> 
> Currently nfs_updatepage allows a write to be extended to cover a full
> page only if we don't have a byte range lock on the file... but if
> we have a write delegation on the file or if we have the whole file
> locked for writing then we should be allowed to extend the write as
> well.
> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfs/write.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index a2c7c28..c8a1bcc 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -888,6 +888,28 @@ out:
>  	return PageUptodate(page) != 0;
>  }
>  
> +/* If we know the page is up to date, and we're not using byte range locks (or
> + * if we have the whole file locked for writing), it may be more efficient to
> + * extend the write to cover the entire page in order to avoid fragmentation
> + * inefficiencies.
> + *
> + * If the file is opened for synchronous writes or if we have a write delegation
> + * from the server then we can just skip the rest of the checks.
> + */
> +static int nfs_can_extend_write(struct file *file, struct page *page, struct inode *inode)
> +{
> +	if (file->f_flags & O_DSYNC)
> +		return 0;
> +	if (nfs_have_delegation(inode, FMODE_WRITE))
> +		return 1;
> +	if (nfs_write_pageuptodate(page, inode) && (inode->i_flock == NULL ||
> +			(inode->i_flock->fl_start == 0 &&
> +			inode->i_flock->fl_end == OFFSET_MAX &&
> +			inode->i_flock->fl_type != F_RDLCK)))
> +		return 1;
> +	return 0;
> +}
> +
>  /*
>   * Update and possibly write a cached page of an NFS file.
>   *
> @@ -908,14 +930,7 @@ int nfs_updatepage(struct file *file, struct page *page,
>  		file->f_path.dentry->d_name.name, count,
>  		(long long)(page_file_offset(page) + offset));
>  
> -	/* If we're not using byte range locks, and we know the page
> -	 * is up to date, it may be more efficient to extend the write
> -	 * to cover the entire page in order to avoid fragmentation
> -	 * inefficiencies.
> -	 */
> -	if (nfs_write_pageuptodate(page, inode) &&
> -			inode->i_flock == NULL &&
> -			!(file->f_flags & O_DSYNC)) {
> +	if (nfs_can_extend_write(file, page, inode)) {
>  		count = max(count + offset, nfs_page_length(page));
>  		offset = 0;
>  	}

Sorry I didn't chime in on this before. Looks sane to me...

Reviewed-by: Jeff Layton <jlayton@redhat.com>

