* [PATCH] fsync.2: ERRORS: add EIO and ENOSPC @ 2020-08-29 7:13 milan.opensource 2020-09-07 7:11 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 15+ messages in thread From: milan.opensource @ 2020-08-29 7:13 UTC (permalink / raw) To: mtk.manpages; +Cc: linux-kernel, Milan Shah From: Milan Shah <milan.opensource@gmail.com> This Fix addresses Bug 194757. Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 --- man2/fsync.2 | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/man2/fsync.2 b/man2/fsync.2 index 96401cd..f38b3e4 100644 --- a/man2/fsync.2 +++ b/man2/fsync.2 @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using or .BR sdparm (8) to guarantee safe operation. + +When +.BR fsync () +or +.BR fdatasync () +returns +.B EIO +or +.B ENOSPC +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts +will return without error. It is +.I not +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. .SH SEE ALSO .BR sync (1), .BR bdflush (2), -- 2.7.4 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-08-29 7:13 [PATCH] fsync.2: ERRORS: add EIO and ENOSPC milan.opensource @ 2020-09-07 7:11 ` Michael Kerrisk (man-pages) 2020-09-08 11:27 ` Jan Kara 0 siblings, 1 reply; 15+ messages in thread From: Michael Kerrisk (man-pages) @ 2020-09-07 7:11 UTC (permalink / raw) To: milan.opensource; +Cc: lkml, Andrew Morton, linux-fsdevel [Widening the CC to include Andrew and linux-fsdevel@] [Milan: thanks for the patch, but it's unclear to me from your commit message how/if you verified the details.] Andrew, maybe you (or someone else) can comment, since long ago your commit f79e2abb9bd452d97295f34376dedbec9686b986 Author: Andrew Morton <akpm@osdl.org> Date: Fri Mar 31 02:30:42 2006 -0800 included a comment that is referred to in stackoverflow discussion about this topic (that SO discussion is in turn referred to by https://bugzilla.kernel.org/show_bug.cgi?id=194757). The essence as I understand it, is this: (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data has not been synced. (2) In this case, the EIO/ENOSPC setting is cleared so that... (3) A subsequent fsync() might return success, but... (4) That doesn't mean that the data in (1) landed on the disk. The proposed manual page patch below wants to document this, but I'd be happy to have an FS-knowledgeable person comment before I apply. Thanks, Michael On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: > > From: Milan Shah <milan.opensource@gmail.com> > > This Fix addresses Bug 194757. > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 > --- > man2/fsync.2 | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/man2/fsync.2 b/man2/fsync.2 > index 96401cd..f38b3e4 100644 > --- a/man2/fsync.2 > +++ b/man2/fsync.2 > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using > or > .BR sdparm (8) > to guarantee safe operation. > + > +When > +.BR fsync () > +or > +.BR fdatasync () > +returns > +.B EIO > +or > +.B ENOSPC > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts > +will return without error. It is > +.I not > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. > .SH SEE ALSO > .BR sync (1), > .BR bdflush (2), > -- > 2.7.4 > -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-07 7:11 ` Michael Kerrisk (man-pages) @ 2020-09-08 11:27 ` Jan Kara 2020-09-08 16:10 ` Jeff Layton ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Jan Kara @ 2020-09-08 11:27 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel, Jeff Layton Added Jeff to CC since he has written the code... On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: > [Widening the CC to include Andrew and linux-fsdevel@] > [Milan: thanks for the patch, but it's unclear to me from your commit > message how/if you verified the details.] > > Andrew, maybe you (or someone else) can comment, since long ago your > > commit f79e2abb9bd452d97295f34376dedbec9686b986 > Author: Andrew Morton <akpm@osdl.org> > Date: Fri Mar 31 02:30:42 2006 -0800 > > included a comment that is referred to in stackoverflow discussion > about this topic (that SO discussion is in turn referred to by > https://bugzilla.kernel.org/show_bug.cgi?id=194757). > > The essence as I understand it, is this: > (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data > has not been synced. > (2) In this case, the EIO/ENOSPC setting is cleared so that... > (3) A subsequent fsync() might return success, but... > (4) That doesn't mean that the data in (1) landed on the disk. Correct. > The proposed manual page patch below wants to document this, but I'd > be happy to have an FS-knowledgeable person comment before I apply. Just a small comment below: > On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: > > > > From: Milan Shah <milan.opensource@gmail.com> > > > > This Fix addresses Bug 194757. > > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 > > --- > > man2/fsync.2 | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/man2/fsync.2 b/man2/fsync.2 > > index 96401cd..f38b3e4 100644 > > --- a/man2/fsync.2 > > +++ b/man2/fsync.2 > > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using > > or > > .BR sdparm (8) > > to guarantee safe operation. > > + > > +When > > +.BR fsync () > > +or > > +.BR fdatasync () > > +returns > > +.B EIO > > +or > > +.B ENOSPC > > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts > > +will return without error. It is > > +.I not > > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. > > .SH SEE ALSO > > .BR sync (1), > > .BR bdflush (2), So the error state isn't really stored "on pages in the file mapping". Current implementation (since 4.14) is that error state is stored in struct file (I think this tends to be called "file description" in manpages) and so EIO / ENOSPC is reported once for each file description of the file that was open before the error happened. Not sure if we want to be so precise in the manpages or if it just confuses people. Anyway your takeway that no error on subsequent fsync() does not mean data was written is correct. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-08 11:27 ` Jan Kara @ 2020-09-08 16:10 ` Jeff Layton 2020-09-09 22:50 ` NeilBrown 2020-09-08 19:44 ` Jeff Layton 2020-09-09 10:52 ` Michael Kerrisk (man-pages) 2 siblings, 1 reply; 15+ messages in thread From: Jeff Layton @ 2020-09-08 16:10 UTC (permalink / raw) To: Jan Kara, Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel On Tue, 2020-09-08 at 13:27 +0200, Jan Kara wrote: > Added Jeff to CC since he has written the code... > > On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: > > [Widening the CC to include Andrew and linux-fsdevel@] > > [Milan: thanks for the patch, but it's unclear to me from your commit > > message how/if you verified the details.] > > > > Andrew, maybe you (or someone else) can comment, since long ago your > > > > commit f79e2abb9bd452d97295f34376dedbec9686b986 > > Author: Andrew Morton <akpm@osdl.org> > > Date: Fri Mar 31 02:30:42 2006 -0800 > > > > included a comment that is referred to in stackoverflow discussion > > about this topic (that SO discussion is in turn referred to by > > https://bugzilla.kernel.org/show_bug.cgi?id=194757). > > > > The essence as I understand it, is this: > > (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data > > has not been synced. > > (2) In this case, the EIO/ENOSPC setting is cleared so that... > > (3) A subsequent fsync() might return success, but... > > (4) That doesn't mean that the data in (1) landed on the disk. > > Correct. > > > The proposed manual page patch below wants to document this, but I'd > > be happy to have an FS-knowledgeable person comment before I apply. > > Just a small comment below: > > > On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: > > > From: Milan Shah <milan.opensource@gmail.com> > > > > > > This Fix addresses Bug 194757. > > > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 > > > --- > > > man2/fsync.2 | 13 +++++++++++++ > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/man2/fsync.2 b/man2/fsync.2 > > > index 96401cd..f38b3e4 100644 > > > --- a/man2/fsync.2 > > > +++ b/man2/fsync.2 > > > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using > > > or > > > .BR sdparm (8) > > > to guarantee safe operation. > > > + > > > +When > > > +.BR fsync () > > > +or > > > +.BR fdatasync () > > > +returns > > > +.B EIO > > > +or > > > +.B ENOSPC > > > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts > > > +will return without error. It is > > > +.I not > > > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. > > > .SH SEE ALSO > > > .BR sync (1), > > > .BR bdflush (2), > > So the error state isn't really stored "on pages in the file mapping". > Current implementation (since 4.14) is that error state is stored in struct > file (I think this tends to be called "file description" in manpages) and > so EIO / ENOSPC is reported once for each file description of the file that > was open before the error happened. Not sure if we want to be so precise in > the manpages or if it just confuses people. Anyway your takeway that no > error on subsequent fsync() does not mean data was written is correct. > > Honza > Yep. My only comment is that there is nothing special about EIO and ENOSPC. All errors are the same in this regard. Basically, issuing a new fsync after a failed one doesn't do any good. You need to redirty the pages first. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-08 16:10 ` Jeff Layton @ 2020-09-09 22:50 ` NeilBrown 0 siblings, 0 replies; 15+ messages in thread From: NeilBrown @ 2020-09-09 22:50 UTC (permalink / raw) To: Jeff Layton, Jan Kara, Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel [-- Attachment #1: Type: text/plain, Size: 680 bytes --] On Tue, Sep 08 2020, Jeff Layton wrote: > > Yep. > > My only comment is that there is nothing special about EIO and ENOSPC. There are two type of errors that fsync can return. EBADF EROFS EINVAL - these are usage errors. EIO ENOSPC EDQUOT - these are functional failures. So I would say there *is* something special about those errors, though it isn't *very* special, and it isn't *just* those errors. EDQUOT should be included in the list. NeilBrown > All errors are the same in this regard. Basically, issuing a new fsync > after a failed one doesn't do any good. You need to redirty the pages > first. > -- > Jeff Layton <jlayton@kernel.org> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 853 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-08 11:27 ` Jan Kara 2020-09-08 16:10 ` Jeff Layton @ 2020-09-08 19:44 ` Jeff Layton 2020-09-09 10:53 ` Michael Kerrisk (man-pages) 2020-09-09 23:04 ` NeilBrown 2020-09-09 10:52 ` Michael Kerrisk (man-pages) 2 siblings, 2 replies; 15+ messages in thread From: Jeff Layton @ 2020-09-08 19:44 UTC (permalink / raw) To: Jan Kara, Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel On Tue, 2020-09-08 at 13:27 +0200, Jan Kara wrote: > Added Jeff to CC since he has written the code... > > On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: > > [Widening the CC to include Andrew and linux-fsdevel@] > > [Milan: thanks for the patch, but it's unclear to me from your commit > > message how/if you verified the details.] > > > > Andrew, maybe you (or someone else) can comment, since long ago your > > > > commit f79e2abb9bd452d97295f34376dedbec9686b986 > > Author: Andrew Morton <akpm@osdl.org> > > Date: Fri Mar 31 02:30:42 2006 -0800 > > > > included a comment that is referred to in stackoverflow discussion > > about this topic (that SO discussion is in turn referred to by > > https://bugzilla.kernel.org/show_bug.cgi?id=194757). > > > > The essence as I understand it, is this: > > (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data > > has not been synced. > > (2) In this case, the EIO/ENOSPC setting is cleared so that... > > (3) A subsequent fsync() might return success, but... > > (4) That doesn't mean that the data in (1) landed on the disk. > > Correct. > > > The proposed manual page patch below wants to document this, but I'd > > be happy to have an FS-knowledgeable person comment before I apply. > > Just a small comment below: > > > On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: > > > From: Milan Shah <milan.opensource@gmail.com> > > > > > > This Fix addresses Bug 194757. > > > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 > > > --- > > > man2/fsync.2 | 13 +++++++++++++ > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/man2/fsync.2 b/man2/fsync.2 > > > index 96401cd..f38b3e4 100644 > > > --- a/man2/fsync.2 > > > +++ b/man2/fsync.2 > > > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using > > > or > > > .BR sdparm (8) > > > to guarantee safe operation. > > > + > > > +When > > > +.BR fsync () > > > +or > > > +.BR fdatasync () > > > +returns > > > +.B EIO > > > +or > > > +.B ENOSPC > > > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts > > > +will return without error. It is > > > +.I not > > > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. > > > .SH SEE ALSO > > > .BR sync (1), > > > .BR bdflush (2), > > So the error state isn't really stored "on pages in the file mapping". > Current implementation (since 4.14) is that error state is stored in struct > file (I think this tends to be called "file description" in manpages) and > so EIO / ENOSPC is reported once for each file description of the file that > was open before the error happened. Not sure if we want to be so precise in > the manpages or if it just confuses people. Anyway your takeway that no > error on subsequent fsync() does not mean data was written is correct. > > Thinking about it more, I think we ought to spell this out explicitly as we can in the manpage. This is a point of confusion for a lot of people and not understanding this can lead to data integrity bugs. Maybe something like this in the NOTES section? ''' When fsync returns an error, the file is considered to be "clean". A subsequent call to fsync will not result in a reattempt to write out the data, unless that data has been rewritten. Applications that want to reattempt writing to the file after a transient error must re-write their data. ''' To be clear: In practice, you'd only have to write enough to redirty each page in most cases. Also, it is hard to claim that the above behavior is universally true. A filesystem could opt to keep the pages dirty for some errors, but the vast majority just toss out the data whenever there is a writeback problem. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-08 19:44 ` Jeff Layton @ 2020-09-09 10:53 ` Michael Kerrisk (man-pages) 2020-09-09 23:04 ` NeilBrown 1 sibling, 0 replies; 15+ messages in thread From: Michael Kerrisk (man-pages) @ 2020-09-09 10:53 UTC (permalink / raw) To: Jeff Layton, Jan Kara Cc: mtk.manpages, milan.opensource, lkml, Andrew Morton, linux-fsdevel Hello Jeff, On 9/8/20 9:44 PM, Jeff Layton wrote: > On Tue, 2020-09-08 at 13:27 +0200, Jan Kara wrote: >> Added Jeff to CC since he has written the code... >> >> On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: >>> [Widening the CC to include Andrew and linux-fsdevel@] >>> [Milan: thanks for the patch, but it's unclear to me from your commit >>> message how/if you verified the details.] >>> >>> Andrew, maybe you (or someone else) can comment, since long ago your >>> >>> commit f79e2abb9bd452d97295f34376dedbec9686b986 >>> Author: Andrew Morton <akpm@osdl.org> >>> Date: Fri Mar 31 02:30:42 2006 -0800 >>> >>> included a comment that is referred to in stackoverflow discussion >>> about this topic (that SO discussion is in turn referred to by >>> https://bugzilla.kernel.org/show_bug.cgi?id=194757). >>> >>> The essence as I understand it, is this: >>> (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data >>> has not been synced. >>> (2) In this case, the EIO/ENOSPC setting is cleared so that... >>> (3) A subsequent fsync() might return success, but... >>> (4) That doesn't mean that the data in (1) landed on the disk. >> >> Correct. >> >>> The proposed manual page patch below wants to document this, but I'd >>> be happy to have an FS-knowledgeable person comment before I apply. >> >> Just a small comment below: >> >>> On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: >>>> From: Milan Shah <milan.opensource@gmail.com> >>>> >>>> This Fix addresses Bug 194757. >>>> Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 >>>> --- >>>> man2/fsync.2 | 13 +++++++++++++ >>>> 1 file changed, 13 insertions(+) >>>> >>>> diff --git a/man2/fsync.2 b/man2/fsync.2 >>>> index 96401cd..f38b3e4 100644 >>>> --- a/man2/fsync.2 >>>> +++ b/man2/fsync.2 >>>> @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using >>>> or >>>> .BR sdparm (8) >>>> to guarantee safe operation. >>>> + >>>> +When >>>> +.BR fsync () >>>> +or >>>> +.BR fdatasync () >>>> +returns >>>> +.B EIO >>>> +or >>>> +.B ENOSPC >>>> +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts >>>> +will return without error. It is >>>> +.I not >>>> +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. >>>> .SH SEE ALSO >>>> .BR sync (1), >>>> .BR bdflush (2), >> >> So the error state isn't really stored "on pages in the file mapping". >> Current implementation (since 4.14) is that error state is stored in struct >> file (I think this tends to be called "file description" in manpages) and >> so EIO / ENOSPC is reported once for each file description of the file that >> was open before the error happened. Not sure if we want to be so precise in >> the manpages or if it just confuses people. Anyway your takeway that no >> error on subsequent fsync() does not mean data was written is correct. >> >> > > Thinking about it more, I think we ought to spell this out explicitly as > we can in the manpage. This is a point of confusion for a lot of people > and not understanding this can lead to data integrity bugs. Maybe > something like this in the NOTES section? > > ''' > When fsync returns an error, the file is considered to be "clean". A > subsequent call to fsync will not result in a reattempt to write out the > data, unless that data has been rewritten. Applications that want to > reattempt writing to the file after a transient error must re-write > their data. > ''' Thanks. It's incredibly helpful when someone with the needed domain-specific knowledge suggest a wording! > To be clear: > > In practice, you'd only have to write enough to redirty each page in > most cases. Presumably, this could be accomplished by write(2)-ing exactly the same user space buffers again? So, I'd like to expand your text a little. How would the following be: [[ When fsync() or fdatasync() returns an error, the file is considered to be "clean", even though the corresponding modified ("dirty") buffer cache pages may not have been flushed to the storage device. A subsequent call to fsync() will not result in a reattempt to write out the data, unless that data has in the meantime been rewritten. Applications that want to reattempt writing to the file after a transient error--for example, EIO and ENOSPC can occur because of transient conditions--must rewrite their data. ]] How is that text? > Also, it is hard to claim that the above behavior is universally true. A > filesystem could opt to keep the pages dirty for some errors, but the > vast majority just toss out the data whenever there is a writeback > problem. I think I won't worry about trying to discuss such variations in the manual page. BTW, I added a loosely related question in a reply that I just sent to Jan. Maybe you have some thoughts there also. Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-08 19:44 ` Jeff Layton 2020-09-09 10:53 ` Michael Kerrisk (man-pages) @ 2020-09-09 23:04 ` NeilBrown 2020-09-10 17:42 ` Jeff Layton 1 sibling, 1 reply; 15+ messages in thread From: NeilBrown @ 2020-09-09 23:04 UTC (permalink / raw) To: Jeff Layton, Jan Kara, Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel [-- Attachment #1: Type: text/plain, Size: 5154 bytes --] On Tue, Sep 08 2020, Jeff Layton wrote: > On Tue, 2020-09-08 at 13:27 +0200, Jan Kara wrote: >> Added Jeff to CC since he has written the code... >> >> On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: >> > [Widening the CC to include Andrew and linux-fsdevel@] >> > [Milan: thanks for the patch, but it's unclear to me from your commit >> > message how/if you verified the details.] >> > >> > Andrew, maybe you (or someone else) can comment, since long ago your >> > >> > commit f79e2abb9bd452d97295f34376dedbec9686b986 >> > Author: Andrew Morton <akpm@osdl.org> >> > Date: Fri Mar 31 02:30:42 2006 -0800 >> > >> > included a comment that is referred to in stackoverflow discussion >> > about this topic (that SO discussion is in turn referred to by >> > https://bugzilla.kernel.org/show_bug.cgi?id=194757). >> > >> > The essence as I understand it, is this: >> > (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data >> > has not been synced. >> > (2) In this case, the EIO/ENOSPC setting is cleared so that... >> > (3) A subsequent fsync() might return success, but... >> > (4) That doesn't mean that the data in (1) landed on the disk. >> >> Correct. >> >> > The proposed manual page patch below wants to document this, but I'd >> > be happy to have an FS-knowledgeable person comment before I apply. >> >> Just a small comment below: >> >> > On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: >> > > From: Milan Shah <milan.opensource@gmail.com> >> > > >> > > This Fix addresses Bug 194757. >> > > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 >> > > --- >> > > man2/fsync.2 | 13 +++++++++++++ >> > > 1 file changed, 13 insertions(+) >> > > >> > > diff --git a/man2/fsync.2 b/man2/fsync.2 >> > > index 96401cd..f38b3e4 100644 >> > > --- a/man2/fsync.2 >> > > +++ b/man2/fsync.2 >> > > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using >> > > or >> > > .BR sdparm (8) >> > > to guarantee safe operation. >> > > + >> > > +When >> > > +.BR fsync () >> > > +or >> > > +.BR fdatasync () >> > > +returns >> > > +.B EIO >> > > +or >> > > +.B ENOSPC >> > > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts >> > > +will return without error. It is >> > > +.I not >> > > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. >> > > .SH SEE ALSO >> > > .BR sync (1), >> > > .BR bdflush (2), >> >> So the error state isn't really stored "on pages in the file mapping". >> Current implementation (since 4.14) is that error state is stored in struct >> file (I think this tends to be called "file description" in manpages) and >> so EIO / ENOSPC is reported once for each file description of the file that >> was open before the error happened. Not sure if we want to be so precise in >> the manpages or if it just confuses people. Anyway your takeway that no >> error on subsequent fsync() does not mean data was written is correct. >> >> > > Thinking about it more, I think we ought to spell this out explicitly as > we can in the manpage. This is a point of confusion for a lot of people > and not understanding this can lead to data integrity bugs. Maybe > something like this in the NOTES section? > > ''' > When fsync returns an error, the file is considered to be "clean". A > subsequent call to fsync will not result in a reattempt to write out the > data, unless that data has been rewritten. Applications that want to > reattempt writing to the file after a transient error must re-write > their data. > ''' > > To be clear: > > In practice, you'd only have to write enough to redirty each page in > most cases. Nonononono. In practice you have to repeat the entire write because you cannot know if the cached page is from before the write failure, or has since been flushed and reloaded. > > Also, it is hard to claim that the above behavior is universally true. A > filesystem could opt to keep the pages dirty for some errors, but the > vast majority just toss out the data whenever there is a writeback > problem. ...and any filesystem that doesn't behave that way is wasting effort, because nothing else can be assumed. Regarding your "NOTES" addition, I don't feel comfortable with the "clean" language. I would prefer something like: When fsync() reports a failure (EIO, ENOSPC, EDQUOT) it must be assumed that any write requests initiated since the previous successful fsync was initiated may have failed, and that any cached data may have been lost. A future fsync() will not attempt to write out the same data again. If recovery is possible and desired, the application must repeat all the writes that may have failed. If the regions of a file that were written to prior to a failed fsync() are read, the content reported may not reflect the stored content, and subsequent reads may revert to the stored content at any time. NeilBrown > > > -- > Jeff Layton <jlayton@kernel.org> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 853 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-09 23:04 ` NeilBrown @ 2020-09-10 17:42 ` Jeff Layton 2020-09-16 23:25 ` NeilBrown 0 siblings, 1 reply; 15+ messages in thread From: Jeff Layton @ 2020-09-10 17:42 UTC (permalink / raw) To: NeilBrown, Jan Kara, Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel On Thu, 2020-09-10 at 09:04 +1000, NeilBrown wrote: > On Tue, Sep 08 2020, Jeff Layton wrote: > > > On Tue, 2020-09-08 at 13:27 +0200, Jan Kara wrote: > > > Added Jeff to CC since he has written the code... > > > > > > On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: > > > > [Widening the CC to include Andrew and linux-fsdevel@] > > > > [Milan: thanks for the patch, but it's unclear to me from your commit > > > > message how/if you verified the details.] > > > > > > > > Andrew, maybe you (or someone else) can comment, since long ago your > > > > > > > > commit f79e2abb9bd452d97295f34376dedbec9686b986 > > > > Author: Andrew Morton <akpm@osdl.org> > > > > Date: Fri Mar 31 02:30:42 2006 -0800 > > > > > > > > included a comment that is referred to in stackoverflow discussion > > > > about this topic (that SO discussion is in turn referred to by > > > > https://bugzilla.kernel.org/show_bug.cgi?id=194757). > > > > > > > > The essence as I understand it, is this: > > > > (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data > > > > has not been synced. > > > > (2) In this case, the EIO/ENOSPC setting is cleared so that... > > > > (3) A subsequent fsync() might return success, but... > > > > (4) That doesn't mean that the data in (1) landed on the disk. > > > > > > Correct. > > > > > > > The proposed manual page patch below wants to document this, but I'd > > > > be happy to have an FS-knowledgeable person comment before I apply. > > > > > > Just a small comment below: > > > > > > > On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: > > > > > From: Milan Shah <milan.opensource@gmail.com> > > > > > > > > > > This Fix addresses Bug 194757. > > > > > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 > > > > > --- > > > > > man2/fsync.2 | 13 +++++++++++++ > > > > > 1 file changed, 13 insertions(+) > > > > > > > > > > diff --git a/man2/fsync.2 b/man2/fsync.2 > > > > > index 96401cd..f38b3e4 100644 > > > > > --- a/man2/fsync.2 > > > > > +++ b/man2/fsync.2 > > > > > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using > > > > > or > > > > > .BR sdparm (8) > > > > > to guarantee safe operation. > > > > > + > > > > > +When > > > > > +.BR fsync () > > > > > +or > > > > > +.BR fdatasync () > > > > > +returns > > > > > +.B EIO > > > > > +or > > > > > +.B ENOSPC > > > > > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts > > > > > +will return without error. It is > > > > > +.I not > > > > > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. > > > > > .SH SEE ALSO > > > > > .BR sync (1), > > > > > .BR bdflush (2), > > > > > > So the error state isn't really stored "on pages in the file mapping". > > > Current implementation (since 4.14) is that error state is stored in struct > > > file (I think this tends to be called "file description" in manpages) and > > > so EIO / ENOSPC is reported once for each file description of the file that > > > was open before the error happened. Not sure if we want to be so precise in > > > the manpages or if it just confuses people. Anyway your takeway that no > > > error on subsequent fsync() does not mean data was written is correct. > > > > > > > > > > Thinking about it more, I think we ought to spell this out explicitly as > > we can in the manpage. This is a point of confusion for a lot of people > > and not understanding this can lead to data integrity bugs. Maybe > > something like this in the NOTES section? > > > > ''' > > When fsync returns an error, the file is considered to be "clean". A > > subsequent call to fsync will not result in a reattempt to write out the > > data, unless that data has been rewritten. Applications that want to > > reattempt writing to the file after a transient error must re-write > > their data. > > ''' > > > > To be clear: > > > > In practice, you'd only have to write enough to redirty each page in > > most cases. > > Nonononono. In practice you have to repeat the entire write because you > cannot know if the cached page is from before the write failure, or has > since been flushed and reloaded. > Oh, good point! There's no way for userland to know that, so you really do have to rewrite the whole thing. > > Also, it is hard to claim that the above behavior is universally true. A > > filesystem could opt to keep the pages dirty for some errors, but the > > vast majority just toss out the data whenever there is a writeback > > problem. > > ...and any filesystem that doesn't behave that way is wasting effort, > because nothing else can be assumed. > Yeah. I only made the point to be pedantic. There's no benefit to documenting that, I think... > Regarding your "NOTES" addition, I don't feel comfortable with the > "clean" language. I would prefer something like: > > When fsync() reports a failure (EIO, ENOSPC, EDQUOT) it must be assumed > that any write requests initiated since the previous successful fsync > was initiated may have failed, and that any cached data may have been > lost. A future fsync() will not attempt to write out the same data > again. If recovery is possible and desired, the application must > repeat all the writes that may have failed. > > If the regions of a file that were written to prior to a failed fsync() > are read, the content reported may not reflect the stored content, and > subsequent reads may revert to the stored content at any time. > Much nicer. Should we make a distinction between usage and functional classes of errors in this? The "usage" errors will probably not result in the pages being tossed out, but the functional ones almost certainly will... -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-10 17:42 ` Jeff Layton @ 2020-09-16 23:25 ` NeilBrown 2020-09-17 7:01 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 15+ messages in thread From: NeilBrown @ 2020-09-16 23:25 UTC (permalink / raw) To: Jeff Layton, Jan Kara, Michael Kerrisk (man-pages) Cc: milan.opensource, lkml, Andrew Morton, linux-fsdevel [-- Attachment #1: Type: text/plain, Size: 1423 bytes --] On Thu, Sep 10 2020, Jeff Layton wrote: > >> Regarding your "NOTES" addition, I don't feel comfortable with the >> "clean" language. I would prefer something like: >> >> When fsync() reports a failure (EIO, ENOSPC, EDQUOT) it must be assumed >> that any write requests initiated since the previous successful fsync >> was initiated may have failed, and that any cached data may have been >> lost. A future fsync() will not attempt to write out the same data >> again. If recovery is possible and desired, the application must >> repeat all the writes that may have failed. >> >> If the regions of a file that were written to prior to a failed fsync() >> are read, the content reported may not reflect the stored content, and >> subsequent reads may revert to the stored content at any time. >> > > Much nicer. I guess someone should turn it into a patch.... > > Should we make a distinction between usage and functional classes of > errors in this? The "usage" errors will probably not result in the pages > being tossed out, but the functional ones almost certainly will... Maybe. I think it is a useful distinction, but to be consistent it would be best to make it in all (section 2) man pages. Maybe not all at once though. It is really up to Michael if that is a direction he is interesting in following. NeilBrown > > -- > Jeff Layton <jlayton@kernel.org> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 853 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-16 23:25 ` NeilBrown @ 2020-09-17 7:01 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 15+ messages in thread From: Michael Kerrisk (man-pages) @ 2020-09-17 7:01 UTC (permalink / raw) To: NeilBrown, Jeff Layton, Jan Kara Cc: mtk.manpages, milan.opensource, lkml, Andrew Morton, linux-fsdevel On 9/17/20 1:25 AM, NeilBrown wrote: > On Thu, Sep 10 2020, Jeff Layton wrote: >> >>> Regarding your "NOTES" addition, I don't feel comfortable with the >>> "clean" language. I would prefer something like: >>> >>> When fsync() reports a failure (EIO, ENOSPC, EDQUOT) it must be assumed >>> that any write requests initiated since the previous successful fsync >>> was initiated may have failed, and that any cached data may have been >>> lost. A future fsync() will not attempt to write out the same data >>> again. If recovery is possible and desired, the application must >>> repeat all the writes that may have failed. >>> >>> If the regions of a file that were written to prior to a failed fsync() >>> are read, the content reported may not reflect the stored content, and >>> subsequent reads may revert to the stored content at any time. >>> >> >> Much nicer. > > I guess someone should turn it into a patch.... That woud be great. >> Should we make a distinction between usage and functional classes of >> errors in this? The "usage" errors will probably not result in the pages >> being tossed out, but the functional ones almost certainly will... > > Maybe. I think it is a useful distinction, but to be consistent it > would be best to make it in all (section 2) man pages. Maybe not all at > once though. It is really up to Michael if that is a direction he is > interesting in following. I think it's useful, and I'd accept patches that make such distinctions. Of course, if we said *everything* should get fixed at the same time, nothing would get fixed :-). So, I think I'd just take individual patches that made such changes on an ad hoc basis. Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-08 11:27 ` Jan Kara 2020-09-08 16:10 ` Jeff Layton 2020-09-08 19:44 ` Jeff Layton @ 2020-09-09 10:52 ` Michael Kerrisk (man-pages) 2020-09-09 11:21 ` Jan Kara 2 siblings, 1 reply; 15+ messages in thread From: Michael Kerrisk (man-pages) @ 2020-09-09 10:52 UTC (permalink / raw) To: Jan Kara Cc: mtk.manpages, milan.opensource, lkml, Andrew Morton, linux-fsdevel, Jeff Layton Hello Jan, Thank you for jumping in on this thread. On 9/8/20 1:27 PM, Jan Kara wrote: > Added Jeff to CC since he has written the code... > > On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote: >> [Widening the CC to include Andrew and linux-fsdevel@] >> [Milan: thanks for the patch, but it's unclear to me from your commit >> message how/if you verified the details.] >> >> Andrew, maybe you (or someone else) can comment, since long ago your >> >> commit f79e2abb9bd452d97295f34376dedbec9686b986 >> Author: Andrew Morton <akpm@osdl.org> >> Date: Fri Mar 31 02:30:42 2006 -0800 >> >> included a comment that is referred to in stackoverflow discussion >> about this topic (that SO discussion is in turn referred to by >> https://bugzilla.kernel.org/show_bug.cgi?id=194757). >> >> The essence as I understand it, is this: >> (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data >> has not been synced. >> (2) In this case, the EIO/ENOSPC setting is cleared so that... >> (3) A subsequent fsync() might return success, but... >> (4) That doesn't mean that the data in (1) landed on the disk. > > Correct. Thanks for the confirmation! >> The proposed manual page patch below wants to document this, but I'd >> be happy to have an FS-knowledgeable person comment before I apply. > > Just a small comment below: > >> On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote: >>> >>> From: Milan Shah <milan.opensource@gmail.com> >>> >>> This Fix addresses Bug 194757. >>> Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757 >>> --- >>> man2/fsync.2 | 13 +++++++++++++ >>> 1 file changed, 13 insertions(+) >>> >>> diff --git a/man2/fsync.2 b/man2/fsync.2 >>> index 96401cd..f38b3e4 100644 >>> --- a/man2/fsync.2 >>> +++ b/man2/fsync.2 >>> @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using >>> or >>> .BR sdparm (8) >>> to guarantee safe operation. >>> + >>> +When >>> +.BR fsync () >>> +or >>> +.BR fdatasync () >>> +returns >>> +.B EIO >>> +or >>> +.B ENOSPC >>> +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts >>> +will return without error. It is >>> +.I not >>> +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk. >>> .SH SEE ALSO >>> .BR sync (1), >>> .BR bdflush (2), > > So the error state isn't really stored "on pages in the file mapping". > Current implementation (since 4.14) is that error state is stored in struct > file (I think this tends to be called "file description" in manpages) and (Yes, "open file description" is the POSIX terminology for the thing that sits between the FD and the inode--struct file in kernel parlance--and I try to follow POSIX terminology in the manual pages where possible. > so EIO / ENOSPC is reported once for each file description of the file that > was open before the error happened. Not sure if we want to be so precise in > the manpages or if it just confuses people. Well, people are confused now, so I think more detail would be good. > Anyway your takeway that no > error on subsequent fsync() does not mean data was written is correct. Thanks. (See also my rply to Jeff.) By the way, a question related to your comments above. In the errors section, there is this: EIO An error occurred during synchronization. This error may relate to data written to some other file descriptor on the * same file. Since Linux 4.13, errors from write-back will be reported to all file descriptors that might have written the data which triggered the error. Some filesystems (e.g., NFS) keep close track of which data came through which file descriptor, and give more precise reporting. Other filesystems (e.g., most local filesystems) will report errors to all file descriptors that were open on the * file when the error was recorded. In the marked (*) lines, we have the word "file". Is this accurate? I mean, I would normally take "file" in this context to mean the inode ('struct inode'). But I wonder if really what is meant here is "open file description" ('struct file'). In other words, is the EIO being generated for all FDs connected to the same open file description, or for all FDs for all of the open file descriptions connected to the inode? Your thoughts? Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-09 10:52 ` Michael Kerrisk (man-pages) @ 2020-09-09 11:21 ` Jan Kara 2020-09-09 11:58 ` Michael Kerrisk (man-pages) 0 siblings, 1 reply; 15+ messages in thread From: Jan Kara @ 2020-09-09 11:21 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Jan Kara, milan.opensource, lkml, Andrew Morton, linux-fsdevel, Jeff Layton On Wed 09-09-20 12:52:48, Michael Kerrisk (man-pages) wrote: > > So the error state isn't really stored "on pages in the file mapping". > > Current implementation (since 4.14) is that error state is stored in struct > > file (I think this tends to be called "file description" in manpages) and > > (Yes, "open file description" is the POSIX terminology for the thing that > sits between the FD and the inode--struct file in kernel parlance--and I > try to follow POSIX terminology in the manual pages where possible. > > > so EIO / ENOSPC is reported once for each file description of the file that > > was open before the error happened. Not sure if we want to be so precise in > > the manpages or if it just confuses people. > > Well, people are confused now, so I think more detail would be good. > > > Anyway your takeway that no > > error on subsequent fsync() does not mean data was written is correct. > > Thanks. (See also my rply to Jeff.) > > By the way, a question related to your comments above. In the > errors section, there is this: > > EIO An error occurred during synchronization. This error may > relate to data written to some other file descriptor on the > * same file. Since Linux 4.13, errors from write-back will > be reported to all file descriptors that might have written > the data which triggered the error. Some filesystems > (e.g., NFS) keep close track of which data came through > which file descriptor, and give more precise reporting. > Other filesystems (e.g., most local filesystems) will > report errors to all file descriptors that were open on the > * file when the error was recorded. > > In the marked (*) lines, we have the word "file". Is this accurate? I mean, I > would normally take "file" in this context to mean the inode ('struct inode'). > But I wonder if really what is meant here is "open file description" > ('struct file'). In other words, is the EIO being generated for all FDs > connected to the same open file description, or for all FDs for all of the > open file descriptions connected to the inode? Your thoughts? The error gets reported once for each "open file description" of the file (inode) where the error happened. If there are multiple file descriptors pointing to the same open file description, then only one of those file descriptors will see the error. This is inevitable consequence of kernel storing the error state in struct file and clearing it once it is reported... Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-09 11:21 ` Jan Kara @ 2020-09-09 11:58 ` Michael Kerrisk (man-pages) 2020-09-09 14:14 ` Jan Kara 0 siblings, 1 reply; 15+ messages in thread From: Michael Kerrisk (man-pages) @ 2020-09-09 11:58 UTC (permalink / raw) To: Jan Kara Cc: mtk.manpages, milan.opensource, lkml, Andrew Morton, linux-fsdevel, Jeff Layton, NeilBrown [CC += Neil, since he wrote the text we're talking about] Hello Jan, On 9/9/20 1:21 PM, Jan Kara wrote: > On Wed 09-09-20 12:52:48, Michael Kerrisk (man-pages) wrote: >>> So the error state isn't really stored "on pages in the file mapping". >>> Current implementation (since 4.14) is that error state is stored in struct >>> file (I think this tends to be called "file description" in manpages) and >> >> (Yes, "open file description" is the POSIX terminology for the thing that >> sits between the FD and the inode--struct file in kernel parlance--and I >> try to follow POSIX terminology in the manual pages where possible. >> >>> so EIO / ENOSPC is reported once for each file description of the file that >>> was open before the error happened. Not sure if we want to be so precise in >>> the manpages or if it just confuses people. >> >> Well, people are confused now, so I think more detail would be good. >> >>> Anyway your takeway that no >>> error on subsequent fsync() does not mean data was written is correct. >> >> Thanks. (See also my rply to Jeff.) >> >> By the way, a question related to your comments above. In the >> errors section, there is this: >> >> EIO An error occurred during synchronization. This error may >> relate to data written to some other file descriptor on the >> * same file. Since Linux 4.13, errors from write-back will >> be reported to all file descriptors that might have written >> the data which triggered the error. Some filesystems >> (e.g., NFS) keep close track of which data came through >> which file descriptor, and give more precise reporting. >> Other filesystems (e.g., most local filesystems) will >> report errors to all file descriptors that were open on the >> * file when the error was recorded. >> >> In the marked (*) lines, we have the word "file". Is this accurate? I mean, I >> would normally take "file" in this context to mean the inode ('struct inode'). >> But I wonder if really what is meant here is "open file description" >> ('struct file'). In other words, is the EIO being generated for all FDs >> connected to the same open file description, or for all FDs for all of the >> open file descriptions connected to the inode? Your thoughts? > > The error gets reported once for each "open file description" of the file > (inode) where the error happened. If there are multiple file descriptors > pointing to the same open file description, then only one of those file > descriptors will see the error. This is inevitable consequence of kernel > storing the error state in struct file and clearing it once it is > reported... So, the text in wrong two respects, I believe: * It should be phrased in terms of "open file description", not "file", in the lines that I marked. * Where it says "to all file descriptors" (twice), it should rather say "to any of the file descriptors [that refer to the open file description]" Do you agree? Thanks, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC 2020-09-09 11:58 ` Michael Kerrisk (man-pages) @ 2020-09-09 14:14 ` Jan Kara 0 siblings, 0 replies; 15+ messages in thread From: Jan Kara @ 2020-09-09 14:14 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Jan Kara, milan.opensource, lkml, Andrew Morton, linux-fsdevel, Jeff Layton, NeilBrown On Wed 09-09-20 13:58:50, Michael Kerrisk (man-pages) wrote: > [CC += Neil, since he wrote the text we're talking about] > > Hello Jan, > > On 9/9/20 1:21 PM, Jan Kara wrote: > > On Wed 09-09-20 12:52:48, Michael Kerrisk (man-pages) wrote: > >>> So the error state isn't really stored "on pages in the file mapping". > >>> Current implementation (since 4.14) is that error state is stored in struct > >>> file (I think this tends to be called "file description" in manpages) and > >> > >> (Yes, "open file description" is the POSIX terminology for the thing that > >> sits between the FD and the inode--struct file in kernel parlance--and I > >> try to follow POSIX terminology in the manual pages where possible. > >> > >>> so EIO / ENOSPC is reported once for each file description of the file that > >>> was open before the error happened. Not sure if we want to be so precise in > >>> the manpages or if it just confuses people. > >> > >> Well, people are confused now, so I think more detail would be good. > >> > >>> Anyway your takeway that no > >>> error on subsequent fsync() does not mean data was written is correct. > >> > >> Thanks. (See also my rply to Jeff.) > >> > >> By the way, a question related to your comments above. In the > >> errors section, there is this: > >> > >> EIO An error occurred during synchronization. This error may > >> relate to data written to some other file descriptor on the > >> * same file. Since Linux 4.13, errors from write-back will > >> be reported to all file descriptors that might have written > >> the data which triggered the error. Some filesystems > >> (e.g., NFS) keep close track of which data came through > >> which file descriptor, and give more precise reporting. > >> Other filesystems (e.g., most local filesystems) will > >> report errors to all file descriptors that were open on the > >> * file when the error was recorded. > >> > >> In the marked (*) lines, we have the word "file". Is this accurate? I mean, I > >> would normally take "file" in this context to mean the inode ('struct inode'). > >> But I wonder if really what is meant here is "open file description" > >> ('struct file'). In other words, is the EIO being generated for all FDs > >> connected to the same open file description, or for all FDs for all of the > >> open file descriptions connected to the inode? Your thoughts? > > > > The error gets reported once for each "open file description" of the file > > (inode) where the error happened. If there are multiple file descriptors > > pointing to the same open file description, then only one of those file > > descriptors will see the error. This is inevitable consequence of kernel > > storing the error state in struct file and clearing it once it is > > reported... > > So, the text in wrong two respects, I believe: > > * It should be phrased in terms of "open file description", not "file", > in the lines that I marked. No, I believe 'file' is correct on these two lines. I guess I wasn't precise enough in my explanation of the mechanism :) We actually have two places where we store error state. There's "error counter" in the inode and then "last seen error" counter in struct file. Whenever "last seen error" is less than "inode error counter" we report error from the syscall and set "last seen error" in the used struct file to the current "inode error counter". So whenever writeback error happens for the inode, all 'struct file's will end up reporting the error exactly once. > * Where it says "to all file descriptors" (twice), it should rather say > "to any of the file descriptors [that refer to the open file description]" This is correct but I'm not sure it captures well the fact that each open file description is guaranteed to get a notification. So maybe I'd rephrase it like "reported to all file descriptors" -> "reported to all open file descriptions (if there are multiple file descriptors pointing to the same open file description, the error is reported only to the first call regardless of which of the descriptors it uses)" Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-09-17 7:02 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-29 7:13 [PATCH] fsync.2: ERRORS: add EIO and ENOSPC milan.opensource 2020-09-07 7:11 ` Michael Kerrisk (man-pages) 2020-09-08 11:27 ` Jan Kara 2020-09-08 16:10 ` Jeff Layton 2020-09-09 22:50 ` NeilBrown 2020-09-08 19:44 ` Jeff Layton 2020-09-09 10:53 ` Michael Kerrisk (man-pages) 2020-09-09 23:04 ` NeilBrown 2020-09-10 17:42 ` Jeff Layton 2020-09-16 23:25 ` NeilBrown 2020-09-17 7:01 ` Michael Kerrisk (man-pages) 2020-09-09 10:52 ` Michael Kerrisk (man-pages) 2020-09-09 11:21 ` Jan Kara 2020-09-09 11:58 ` Michael Kerrisk (man-pages) 2020-09-09 14:14 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).