* [PATCH 0/2] fix nfsd stable write implementation
@ 2012-10-26 21:06 ` J. Bruce Fields
0 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-26 21:06 UTC (permalink / raw)
To: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
Cc: Peter Staubach, J. Bruce Fields
From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Peter pointed out to me that the nfs server is implementing stable
writes by setting the O_SYNC flag. I can't see why we couldn't write
and then sync instead, but I don't know this stuff as well as I should;
does the following look reasonable to people?
--b.
J. Bruce Fields (2):
nfsd: assume writeable exportabled filesystems have f_sync
nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes
fs/nfsd/vfs.c | 26 ++++++--------------------
1 file changed, 6 insertions(+), 20 deletions(-)
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 0/2] fix nfsd stable write implementation
@ 2012-10-26 21:06 ` J. Bruce Fields
0 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-26 21:06 UTC (permalink / raw)
To: linux-nfs, linux-fsdevel; +Cc: Peter Staubach, J. Bruce Fields
From: "J. Bruce Fields" <bfields@redhat.com>
Peter pointed out to me that the nfs server is implementing stable
writes by setting the O_SYNC flag. I can't see why we couldn't write
and then sync instead, but I don't know this stuff as well as I should;
does the following look reasonable to people?
--b.
J. Bruce Fields (2):
nfsd: assume writeable exportabled filesystems have f_sync
nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes
fs/nfsd/vfs.c | 26 ++++++--------------------
1 file changed, 6 insertions(+), 20 deletions(-)
--
1.7.9.5
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] nfsd: assume writeable exportabled filesystems have f_sync
2012-10-26 21:06 ` J. Bruce Fields
@ 2012-10-26 21:06 ` J. Bruce Fields
-1 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-26 21:06 UTC (permalink / raw)
To: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
Cc: Peter Staubach, J. Bruce Fields
From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
I don't really see how you could claim to support nfsd and not support
fsync somehow.
And in practice a quick look through the exportable filesystems suggests
the only ones without an ->fsync are read-only (efs, isofs, squashfs) or
in-memory (shmem).
Also, performing a write and then returning an error if the sync fails
(as we would do here in the wgather case) seems unhelpful to clients.
Also remove an incorrect comment.
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/nfsd/vfs.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c120b48..ed3eb59 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1020,21 +1020,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
inode = dentry->d_inode;
exp = fhp->fh_export;
- /*
- * Request sync writes if
- * - the sync export option has been set, or
- * - the client requested O_SYNC behavior (NFSv3 feature).
- * - The file system doesn't support fsync().
- * When NFSv2 gathered writes have been configured for this volume,
- * flushing the data to disk is handled separately below.
- */
use_wgather = (rqstp->rq_vers == 2) && EX_WGATHER(exp);
- if (!file->f_op->fsync) {/* COMMIT3 cannot work */
- stable = 2;
- *stablep = 2; /* FILE_SYNC */
- }
-
if (!EX_ISSYNC(exp))
stable = 0;
if (stable && !use_wgather) {
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 1/2] nfsd: assume writeable exportabled filesystems have f_sync
@ 2012-10-26 21:06 ` J. Bruce Fields
0 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-26 21:06 UTC (permalink / raw)
To: linux-nfs, linux-fsdevel; +Cc: Peter Staubach, J. Bruce Fields
From: "J. Bruce Fields" <bfields@redhat.com>
I don't really see how you could claim to support nfsd and not support
fsync somehow.
And in practice a quick look through the exportable filesystems suggests
the only ones without an ->fsync are read-only (efs, isofs, squashfs) or
in-memory (shmem).
Also, performing a write and then returning an error if the sync fails
(as we would do here in the wgather case) seems unhelpful to clients.
Also remove an incorrect comment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
fs/nfsd/vfs.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c120b48..ed3eb59 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1020,21 +1020,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
inode = dentry->d_inode;
exp = fhp->fh_export;
- /*
- * Request sync writes if
- * - the sync export option has been set, or
- * - the client requested O_SYNC behavior (NFSv3 feature).
- * - The file system doesn't support fsync().
- * When NFSv2 gathered writes have been configured for this volume,
- * flushing the data to disk is handled separately below.
- */
use_wgather = (rqstp->rq_vers == 2) && EX_WGATHER(exp);
- if (!file->f_op->fsync) {/* COMMIT3 cannot work */
- stable = 2;
- *stablep = 2; /* FILE_SYNC */
- }
-
if (!EX_ISSYNC(exp))
stable = 0;
if (stable && !use_wgather) {
--
1.7.9.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes
2012-10-26 21:06 ` J. Bruce Fields
@ 2012-10-26 21:06 ` J. Bruce Fields
-1 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-26 21:06 UTC (permalink / raw)
To: linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA
Cc: Peter Staubach, J. Bruce Fields
From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
NFSv4 shares the same struct file across multiple writes. (And we'd
like NFSv2 and NFSv3 to do that as well some day.)
So setting O_SYNC on the struct file as a way to request a synchronous
write doesn't work.
Instead, do a vfs_fsync_range() in that case.
Reported-by: Peter Staubach <pstaubach-83r9SdEf25FBDgjK7y7TUQ@public.gmane.org>
Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/nfsd/vfs.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ed3eb59..b584205 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1024,11 +1024,6 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
if (!EX_ISSYNC(exp))
stable = 0;
- if (stable && !use_wgather) {
- spin_lock(&file->f_lock);
- file->f_flags |= O_SYNC;
- spin_unlock(&file->f_lock);
- }
/* Write the data. */
oldfs = get_fs(); set_fs(KERNEL_DS);
@@ -1044,8 +1039,12 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
if (inode->i_mode & (S_ISUID | S_ISGID))
kill_suid(dentry);
- if (stable && use_wgather)
- host_err = wait_for_concurrent_writes(file);
+ if (stable) {
+ if (use_wgather)
+ host_err = wait_for_concurrent_writes(file);
+ else
+ host_err = vfs_fsync_range(file, offset, offset+*cnt, 0);
+ }
out_nfserr:
dprintk("nfsd: write complete host_err=%d\n", host_err);
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes
@ 2012-10-26 21:06 ` J. Bruce Fields
0 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-26 21:06 UTC (permalink / raw)
To: linux-nfs, linux-fsdevel; +Cc: Peter Staubach, J. Bruce Fields
From: "J. Bruce Fields" <bfields@redhat.com>
NFSv4 shares the same struct file across multiple writes. (And we'd
like NFSv2 and NFSv3 to do that as well some day.)
So setting O_SYNC on the struct file as a way to request a synchronous
write doesn't work.
Instead, do a vfs_fsync_range() in that case.
Reported-by: Peter Staubach <pstaubach@exagrid.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
fs/nfsd/vfs.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ed3eb59..b584205 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1024,11 +1024,6 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
if (!EX_ISSYNC(exp))
stable = 0;
- if (stable && !use_wgather) {
- spin_lock(&file->f_lock);
- file->f_flags |= O_SYNC;
- spin_unlock(&file->f_lock);
- }
/* Write the data. */
oldfs = get_fs(); set_fs(KERNEL_DS);
@@ -1044,8 +1039,12 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
if (inode->i_mode & (S_ISUID | S_ISGID))
kill_suid(dentry);
- if (stable && use_wgather)
- host_err = wait_for_concurrent_writes(file);
+ if (stable) {
+ if (use_wgather)
+ host_err = wait_for_concurrent_writes(file);
+ else
+ host_err = vfs_fsync_range(file, offset, offset+*cnt, 0);
+ }
out_nfserr:
dprintk("nfsd: write complete host_err=%d\n", host_err);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] fix nfsd stable write implementation
2012-10-26 21:06 ` J. Bruce Fields
@ 2012-10-29 23:28 ` NeilBrown
-1 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2012-10-29 23:28 UTC (permalink / raw)
To: J. Bruce Fields
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Peter Staubach
[-- Attachment #1: Type: text/plain, Size: 1467 bytes --]
On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
wrote:
> From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>
> Peter pointed out to me that the nfs server is implementing stable
> writes by setting the O_SYNC flag. I can't see why we couldn't write
> and then sync instead, but I don't know this stuff as well as I should;
> does the following look reasonable to people?
Bruce changed the code to implement stable writes by calling
vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
It seems like you are making a change just for the sake of making a change.
Is there some reason that you think a separate 'sync' is more efficient than
using O_SYNC ?
As a general principle, I think it is best to give the file system as much
information as possible to that it can make whatever optimisation decisions
that it wants to.
Setting O_SYNC gives the filesystem more information than not, because it
allows it to change the behaviour of the 'write' request... though I don't
know if any filesystem actually uses the information.
Why the change?
NeilBrown
>
> --b.
>
> J. Bruce Fields (2):
> nfsd: assume writeable exportabled filesystems have f_sync
> nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes
>
> fs/nfsd/vfs.c | 26 ++++++--------------------
> 1 file changed, 6 insertions(+), 20 deletions(-)
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] fix nfsd stable write implementation
@ 2012-10-29 23:28 ` NeilBrown
0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2012-10-29 23:28 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs, linux-fsdevel, Peter Staubach
[-- Attachment #1: Type: text/plain, Size: 1409 bytes --]
On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields@redhat.com>
wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
>
> Peter pointed out to me that the nfs server is implementing stable
> writes by setting the O_SYNC flag. I can't see why we couldn't write
> and then sync instead, but I don't know this stuff as well as I should;
> does the following look reasonable to people?
Bruce changed the code to implement stable writes by calling
vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
It seems like you are making a change just for the sake of making a change.
Is there some reason that you think a separate 'sync' is more efficient than
using O_SYNC ?
As a general principle, I think it is best to give the file system as much
information as possible to that it can make whatever optimisation decisions
that it wants to.
Setting O_SYNC gives the filesystem more information than not, because it
allows it to change the behaviour of the 'write' request... though I don't
know if any filesystem actually uses the information.
Why the change?
NeilBrown
>
> --b.
>
> J. Bruce Fields (2):
> nfsd: assume writeable exportabled filesystems have f_sync
> nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes
>
> fs/nfsd/vfs.c | 26 ++++++--------------------
> 1 file changed, 6 insertions(+), 20 deletions(-)
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] fix nfsd stable write implementation
2012-10-29 23:28 ` NeilBrown
(?)
@ 2012-10-30 14:07 ` J. Bruce Fields
[not found] ` <20121030140725.GC24618-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
-1 siblings, 1 reply; 12+ messages in thread
From: J. Bruce Fields @ 2012-10-30 14:07 UTC (permalink / raw)
To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Peter Staubach
On Tue, Oct 30, 2012 at 10:28:33AM +1100, NeilBrown wrote:
> On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields@redhat.com>
> wrote:
>
> > From: "J. Bruce Fields" <bfields@redhat.com>
> >
> > Peter pointed out to me that the nfs server is implementing stable
> > writes by setting the O_SYNC flag. I can't see why we couldn't write
> > and then sync instead, but I don't know this stuff as well as I should;
> > does the following look reasonable to people?
>
> Bruce changed the code to implement stable writes by calling
> vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
>
> It seems like you are making a change just for the sake of making a change.
> Is there some reason that you think a separate 'sync' is more efficient than
> using O_SYNC ?
Oh, sorry, see the changelog on the second patch: the problem is that
the struct file can be shared across multiple writes in the NFSv4 case,
so a single stable write could make all subsequent writes synchronous.
(And some day people would like filehandle caching for v2/v3, in which
case we'll run into the same problem.)
> As a general principle, I think it is best to give the file system as much
> information as possible to that it can make whatever optimisation decisions
> that it wants to.
>
> Setting O_SYNC gives the filesystem more information than not, because it
> allows it to change the behaviour of the 'write' request... though I don't
> know if any filesystem actually uses the information.
I'm not sure how to figure out if that's a real problem or not.
--b.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] fix nfsd stable write implementation
2012-10-30 14:07 ` J. Bruce Fields
@ 2012-10-30 20:30 ` NeilBrown
0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2012-10-30 20:30 UTC (permalink / raw)
To: J. Bruce Fields
Cc: J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Peter Staubach
[-- Attachment #1: Type: text/plain, Size: 2036 bytes --]
On Tue, 30 Oct 2012 10:07:25 -0400 "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
wrote:
> On Tue, Oct 30, 2012 at 10:28:33AM +1100, NeilBrown wrote:
> > On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields-H+wXaHxf7aJhl2p70BpVqQ@public.gmane.orgm>
> > wrote:
> >
> > > From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > >
> > > Peter pointed out to me that the nfs server is implementing stable
> > > writes by setting the O_SYNC flag. I can't see why we couldn't write
> > > and then sync instead, but I don't know this stuff as well as I should;
> > > does the following look reasonable to people?
> >
> > Bruce changed the code to implement stable writes by calling
> > vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
> >
> > It seems like you are making a change just for the sake of making a change.
> > Is there some reason that you think a separate 'sync' is more efficient than
> > using O_SYNC ?
>
> Oh, sorry, see the changelog on the second patch: the problem is that
> the struct file can be shared across multiple writes in the NFSv4 case,
> so a single stable write could make all subsequent writes synchronous.
>
> (And some day people would like filehandle caching for v2/v3, in which
> case we'll run into the same problem.)
Ahh, I missed that.
Makes sense then, thanks.
>
> > As a general principle, I think it is best to give the file system as much
> > information as possible to that it can make whatever optimisation decisions
> > that it wants to.
> >
> > Setting O_SYNC gives the filesystem more information than not, because it
> > allows it to change the behaviour of the 'write' request... though I don't
> > know if any filesystem actually uses the information.
>
> I'm not sure how to figure out if that's a real problem or not.
Search all filesystems for special handling of O_SYNC or O_DSYNC?
I strongly suspect that it is not a real problem.
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] fix nfsd stable write implementation
@ 2012-10-30 20:30 ` NeilBrown
0 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2012-10-30 20:30 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Peter Staubach
[-- Attachment #1: Type: text/plain, Size: 1950 bytes --]
On Tue, 30 Oct 2012 10:07:25 -0400 "J. Bruce Fields" <bfields@fieldses.org>
wrote:
> On Tue, Oct 30, 2012 at 10:28:33AM +1100, NeilBrown wrote:
> > On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields@redhat.com>
> > wrote:
> >
> > > From: "J. Bruce Fields" <bfields@redhat.com>
> > >
> > > Peter pointed out to me that the nfs server is implementing stable
> > > writes by setting the O_SYNC flag. I can't see why we couldn't write
> > > and then sync instead, but I don't know this stuff as well as I should;
> > > does the following look reasonable to people?
> >
> > Bruce changed the code to implement stable writes by calling
> > vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
> >
> > It seems like you are making a change just for the sake of making a change.
> > Is there some reason that you think a separate 'sync' is more efficient than
> > using O_SYNC ?
>
> Oh, sorry, see the changelog on the second patch: the problem is that
> the struct file can be shared across multiple writes in the NFSv4 case,
> so a single stable write could make all subsequent writes synchronous.
>
> (And some day people would like filehandle caching for v2/v3, in which
> case we'll run into the same problem.)
Ahh, I missed that.
Makes sense then, thanks.
>
> > As a general principle, I think it is best to give the file system as much
> > information as possible to that it can make whatever optimisation decisions
> > that it wants to.
> >
> > Setting O_SYNC gives the filesystem more information than not, because it
> > allows it to change the behaviour of the 'write' request... though I don't
> > know if any filesystem actually uses the information.
>
> I'm not sure how to figure out if that's a real problem or not.
Search all filesystems for special handling of O_SYNC or O_DSYNC?
I strongly suspect that it is not a real problem.
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] fix nfsd stable write implementation
2012-10-30 20:30 ` NeilBrown
(?)
@ 2012-11-08 0:20 ` J. Bruce Fields
-1 siblings, 0 replies; 12+ messages in thread
From: J. Bruce Fields @ 2012-11-08 0:20 UTC (permalink / raw)
To: NeilBrown; +Cc: J. Bruce Fields, linux-nfs, linux-fsdevel, Peter Staubach
On Wed, Oct 31, 2012 at 07:30:27AM +1100, NeilBrown wrote:
> On Tue, 30 Oct 2012 10:07:25 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
>
> > On Tue, Oct 30, 2012 at 10:28:33AM +1100, NeilBrown wrote:
> > > On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields" <bfields@redhat.com>
> > > wrote:
> > >
> > > > From: "J. Bruce Fields" <bfields@redhat.com>
> > > >
> > > > Peter pointed out to me that the nfs server is implementing stable
> > > > writes by setting the O_SYNC flag. I can't see why we couldn't write
> > > > and then sync instead, but I don't know this stuff as well as I should;
> > > > does the following look reasonable to people?
> > >
> > > Bruce changed the code to implement stable writes by calling
> > > vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
> > >
> > > It seems like you are making a change just for the sake of making a change.
> > > Is there some reason that you think a separate 'sync' is more efficient than
> > > using O_SYNC ?
> >
> > Oh, sorry, see the changelog on the second patch: the problem is that
> > the struct file can be shared across multiple writes in the NFSv4 case,
> > so a single stable write could make all subsequent writes synchronous.
> >
> > (And some day people would like filehandle caching for v2/v3, in which
> > case we'll run into the same problem.)
>
> Ahh, I missed that.
> Makes sense then, thanks.
>
> >
> > > As a general principle, I think it is best to give the file system as much
> > > information as possible to that it can make whatever optimisation decisions
> > > that it wants to.
> > >
> > > Setting O_SYNC gives the filesystem more information than not, because it
> > > allows it to change the behaviour of the 'write' request... though I don't
> > > know if any filesystem actually uses the information.
> >
> > I'm not sure how to figure out if that's a real problem or not.
>
> Search all filesystems for special handling of O_SYNC or O_DSYNC?
>
> I strongly suspect that it is not a real problem.
OK, committing for 3.8.
--b.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-11-08 0:21 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-26 21:06 [PATCH 0/2] fix nfsd stable write implementation J. Bruce Fields
2012-10-26 21:06 ` J. Bruce Fields
[not found] ` <1351285617-20450-1-git-send-email-bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-10-26 21:06 ` [PATCH 1/2] nfsd: assume writeable exportabled filesystems have f_sync J. Bruce Fields
2012-10-26 21:06 ` J. Bruce Fields
2012-10-26 21:06 ` [PATCH 2/2] nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes J. Bruce Fields
2012-10-26 21:06 ` J. Bruce Fields
2012-10-29 23:28 ` [PATCH 0/2] fix nfsd stable write implementation NeilBrown
2012-10-29 23:28 ` NeilBrown
2012-10-30 14:07 ` J. Bruce Fields
[not found] ` <20121030140725.GC24618-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-10-30 20:30 ` NeilBrown
2012-10-30 20:30 ` NeilBrown
2012-11-08 0:20 ` J. Bruce Fields
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.