All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Marshall <hubcap@omnibond.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Mike Marshall <hubcap@omnibond.com>
Subject: Re: write() semantics (Re: Orangefs ABI documentation)
Date: Thu, 3 Mar 2016 17:25:08 -0500	[thread overview]
Message-ID: <CAOg9mSTDxoLfHceXOJAwNZ6mhwZs4o=d9sQUjWcAQsQbUPiE4w@mail.gmail.com> (raw)
In-Reply-To: <CA+55aFw_3RpvBuTjrMank5CVRYp+RqOZn53Q3OrV+MNur_vZ6A@mail.gmail.com>

Here is what I have come up with to try and make our return to
interrupted writes more acceptable... in my tests it seems to
work. My test involved running the client-core with a tiny IO
buffer size (4k) and a C program with a large write buffer(32M).
That way there was plenty of time for me to fire off the C program
and hit Ctrl-C while the write was chugging along.

The return value from do_readv_writev always matches the size
of the file written by the aborted C program in my tests.

I changed the C program around so that sometimes I ran it with
(O_CREAT | O_RDWR) on open and sometimes with (O_CREAT | O_RDWR | O_APPEND)
and it seemed to do the right thing. I didn't try to set up a
signal handler so that the signal wasn't fatal to the process,
I guess that would be a test to actually see and verify the correct
short return code to write...

Do you all think this looks like it should work in principle?

BTW: in the distant past someone else attempted to solve this problem
the "nfs intr" way - we have an intr mount option, and that's why there's
all that sweating over whether or not stuff is "interruptible" in
waitqueue.c... I'm not sure if our intr mount option is relevant anymore
given the way the op handling code has evolved...

diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c
index 6f2e0f7..4349c9d 100644
--- a/fs/orangefs/file.c
+++ b/fs/orangefs/file.c
@@ -180,21 +180,54 @@ populate_shared_memory:
  }

  if (ret < 0) {
- /*
- * don't write an error to syslog on signaled operation
- * termination unless we've got debugging turned on, as
- * this can happen regularly (i.e. ctrl-c)
- */
- if (ret == -EINTR)
+ if (ret == -EINTR) {
+ /*
+ * We can't return EINTR if any data was written,
+ * it's not POSIX. It is minimally acceptable
+ * to give a partial write, the way NFS does.
+ *
+ * It would be optimal to return all or nothing,
+ * but if a userspace write is bigger than
+ * an IO buffer, and the interrupt occurs
+ * between buffer writes, that would not be
+ * possible.
+ */
+ switch (new_op->op_state - OP_VFS_STATE_GIVEN_UP) {
+ /*
+ * If the op was waiting when the interrupt
+ * occurred, then the client-core did not
+ * trigger the write.
+ */
+ case OP_VFS_STATE_WAITING:
+ ret = 0;
+ break;
+ /*
+ * If the op was in progress when the interrupt
+ * occurred, then the client-core was able to
+ * trigger the write.
+ */
+ case OP_VFS_STATE_INPROGR:
+ ret = total_size;
+ break;
+ default:
+ gossip_err("%s: unexpected op state :%d:.\n",
+   __func__,
+   new_op->op_state);
+ ret = 0;
+ break;
+ }
  gossip_debug(GOSSIP_FILE_DEBUG,
-     "%s: returning error %ld\n", __func__,
-     (long)ret);
- else
+     "%s: got EINTR, state:%d: %p\n",
+     __func__,
+     new_op->op_state,
+     new_op);
+ } else {
  gossip_err("%s: error in %s handle %pU, returning %zd\n",
  __func__,
  type == ORANGEFS_IO_READ ?
  "read from" : "write to",
  handle, ret);
+ }
  if (orangefs_cancel_op_in_progress(new_op))
  return ret;


On Sat, Jan 23, 2016 at 6:35 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Sat, Jan 23, 2016 at 2:46 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> What should we get?  -EINTR, despite having written some data?
>
> No, that's not acceptable.
>
> Either all or nothing (which is POSIX) or the NFS 'intr' mount
> behavior (partial write return, -EINTR only when nothing was written
> at all). And, like NFS, a mount option might be a good thing.
>
> And of course, for the usual reasons, fatal signals are special in
> that for them we generally say "screw posix, nobody sees the return
> value anyway", but even there the filesystem might as well still
> return the partial return value (just to not introduce yet another
> special case).
>
> In fact, I think that with our "fatal signals interrupt" behavior,
> nobody should likely use the "intr" mount option on NFS. Even if the
> semantics may be "better", there are likely simply just too many
> programs that don't check the return value of "write()" at all, much
> less handle partial writes correctly.
>
> (And yes, our "screw posix" behavior wrt fatal signals is strictly
> wrong even _despite_ the fact that nobody sees the return value -
> other processes can still obviously see that the whole write wasn't
> done. But blocking on a fatal signal is _so_ annoying that it's one of
> those things where we just say "posix was wrong on this one, and if we
> squint a bit we look _almost_ like we're compliant").
>
>               Linus

  reply	other threads:[~2016-03-03 22:25 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-15 21:46 Orangefs ABI documentation Mike Marshall
2016-01-22  7:11 ` Al Viro
2016-01-22 11:09   ` Mike Marshall
2016-01-22 16:59     ` Mike Marshall
2016-01-22 17:08       ` Al Viro
2016-01-22 17:40         ` Mike Marshall
2016-01-22 17:43         ` Al Viro
2016-01-22 18:17           ` Mike Marshall
2016-01-22 18:37             ` Al Viro
2016-01-22 19:07               ` Mike Marshall
2016-01-22 19:21                 ` Mike Marshall
2016-01-22 20:04                   ` Al Viro
2016-01-22 20:30                     ` Mike Marshall
2016-01-23  0:12                       ` Al Viro
2016-01-23  1:28                         ` Al Viro
2016-01-23  2:54                           ` Mike Marshall
2016-01-23 19:10                             ` Al Viro
2016-01-23 19:24                               ` Mike Marshall
2016-01-23 21:35                                 ` Mike Marshall
2016-01-23 22:05                                   ` Al Viro
2016-01-23 21:40                                 ` Al Viro
2016-01-23 22:36                                   ` Mike Marshall
2016-01-24  0:16                                     ` Al Viro
2016-01-24  4:05                                       ` Al Viro
2016-01-24 22:12                                         ` Mike Marshall
2016-01-30 17:22                                           ` Al Viro
2016-01-26 19:52                                         ` Martin Brandenburg
2016-01-30 17:34                                           ` Al Viro
2016-01-30 18:27                                             ` Al Viro
2016-02-04 23:30                                               ` Mike Marshall
2016-02-06 19:42                                                 ` Al Viro
2016-02-07  1:38                                                   ` Al Viro
2016-02-07  3:53                                                     ` Al Viro
2016-02-07 20:01                                                       ` [RFC] bufmap-related wait logics (Re: Orangefs ABI documentation) Al Viro
2016-02-08 22:26                                                       ` Orangefs ABI documentation Mike Marshall
2016-02-08 23:35                                                         ` Al Viro
2016-02-09  3:32                                                           ` Al Viro
2016-02-09 14:34                                                             ` Mike Marshall
2016-02-09 17:40                                                               ` Al Viro
2016-02-09 21:06                                                                 ` Al Viro
2016-02-09 22:25                                                                   ` Mike Marshall
2016-02-11 23:36                                                                   ` Mike Marshall
2016-02-09 22:02                                                                 ` Mike Marshall
2016-02-09 22:16                                                                   ` Al Viro
2016-02-09 22:40                                                                     ` Al Viro
2016-02-09 23:13                                                                       ` Al Viro
2016-02-10 16:44                                                                         ` Al Viro
2016-02-10 21:26                                                                           ` Al Viro
2016-02-11 23:54                                                                           ` Mike Marshall
2016-02-12  0:55                                                                             ` Al Viro
2016-02-12 12:13                                                                               ` Mike Marshall
2016-02-11  0:44                                                                         ` Al Viro
2016-02-11  3:22                                                                           ` Mike Marshall
2016-02-12  4:27                                                                             ` Al Viro
2016-02-12 12:26                                                                               ` Mike Marshall
2016-02-12 18:00                                                                                 ` Martin Brandenburg
2016-02-13 17:18                                                                                   ` Mike Marshall
2016-02-13 17:47                                                                                     ` Al Viro
2016-02-14  2:56                                                                                       ` Al Viro
2016-02-14  3:46                                                                                         ` [RFC] slot allocator - waitqueue use review needed (Re: Orangefs ABI documentation) Al Viro
2016-02-14  4:06                                                                                           ` Al Viro
2016-02-16  2:12                                                                                           ` Al Viro
2016-02-16 19:28                                                                                             ` Al Viro
2016-02-14 22:31                                                                                         ` Orangefs ABI documentation Mike Marshall
2016-02-14 23:43                                                                                           ` Al Viro
2016-02-15 17:46                                                                                             ` Mike Marshall
2016-02-15 18:45                                                                                               ` Al Viro
2016-02-15 22:32                                                                                                 ` Martin Brandenburg
2016-02-15 23:04                                                                                                   ` Al Viro
2016-02-16 23:15                                                                                                     ` Mike Marshall
2016-02-16 23:36                                                                                                       ` Al Viro
2016-02-16 23:54                                                                                                         ` Al Viro
2016-02-17 19:24                                                                                                           ` Mike Marshall
2016-02-17 20:11                                                                                                             ` Al Viro
2016-02-17 21:17                                                                                                               ` Al Viro
2016-02-17 22:24                                                                                                                 ` Mike Marshall
2016-02-17 22:40                                                                                                             ` Martin Brandenburg
2016-02-17 23:09                                                                                                               ` Al Viro
2016-02-17 23:15                                                                                                                 ` Al Viro
2016-02-18  0:04                                                                                                                   ` Al Viro
2016-02-18 11:11                                                                                                                     ` Al Viro
2016-02-18 18:58                                                                                                                       ` Mike Marshall
2016-02-18 19:20                                                                                                                         ` Al Viro
2016-02-18 19:49                                                                                                                         ` Martin Brandenburg
2016-02-18 20:08                                                                                                                           ` Mike Marshall
2016-02-18 20:22                                                                                                                             ` Mike Marshall
2016-02-18 20:38                                                                                                                               ` Mike Marshall
2016-02-18 20:52                                                                                                                                 ` Al Viro
2016-02-18 21:50                                                                                                                                   ` Mike Marshall
2016-02-19  0:25                                                                                                                                     ` Al Viro
2016-02-19 22:11                                                                                                                                       ` Mike Marshall
2016-02-19 22:22                                                                                                                                         ` Al Viro
2016-02-20 12:14                                                                                                                                           ` Mike Marshall
2016-02-20 13:36                                                                                                                                             ` Al Viro
2016-02-22 16:20                                                                                                                                               ` Mike Marshall
2016-02-22 21:22                                                                                                                                                 ` Mike Marshall
2016-02-23 21:58                                                                                                                                                   ` Mike Marshall
2016-02-26 20:21                                                                                                                                                     ` Mike Marshall
2016-02-19 22:32                                                                                                                                         ` Al Viro
2016-02-19 22:45                                                                                                                                           ` Martin Brandenburg
2016-02-19 22:50                                                                                                                                           ` Martin Brandenburg
2016-02-18 20:49                                                                                                                               ` Al Viro
2016-02-15 22:47                                                                                                 ` Mike Marshall
2016-01-23 22:46                                   ` write() semantics (Re: Orangefs ABI documentation) Al Viro
2016-01-23 23:35                                     ` Linus Torvalds
2016-03-03 22:25                                       ` Mike Marshall [this message]
2016-03-04 20:55                                         ` Mike Marshall
2016-01-22 20:51                     ` Orangefs ABI documentation Mike Marshall
2016-01-22 23:53                       ` Mike Marshall
2016-01-22 19:54                 ` Al Viro
2016-01-22 19:50             ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOg9mSTDxoLfHceXOJAwNZ6mhwZs4o=d9sQUjWcAQsQbUPiE4w@mail.gmail.com' \
    --to=hubcap@omnibond.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.