All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>, Laurent Vivier <laurent@vivier.eu>
Subject: [PATCH v2 6/9] linux-user: Split safe-syscall macro into its own header
Date: Wed,  8 Sep 2021 16:44:02 +0100	[thread overview]
Message-ID: <20210908154405.15417-7-peter.maydell@linaro.org> (raw)
In-Reply-To: <20210908154405.15417-1-peter.maydell@linaro.org>

Split the safe-syscall macro from qemu.h into a new safe-syscall.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 linux-user/qemu.h         | 135 ---------------------------------
 linux-user/safe-syscall.h | 154 ++++++++++++++++++++++++++++++++++++++
 linux-user/syscall.c      |   1 +
 3 files changed, 155 insertions(+), 135 deletions(-)
 create mode 100644 linux-user/safe-syscall.h

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 0cb79990579..a82a46236e6 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -240,141 +240,6 @@ void probe_guest_base(const char *image_name,
 
 #include "qemu/log.h"
 
-/* safe_syscall.S */
-
-/**
- * safe_syscall:
- * @int number: number of system call to make
- * ...: arguments to the system call
- *
- * Call a system call if guest signal not pending.
- * This has the same API as the libc syscall() function, except that it
- * may return -1 with errno == TARGET_ERESTARTSYS if a signal was pending.
- *
- * Returns: the system call result, or -1 with an error code in errno
- * (Errnos are host errnos; we rely on TARGET_ERESTARTSYS not clashing
- * with any of the host errno values.)
- */
-
-/*
- * A guide to using safe_syscall() to handle interactions between guest
- * syscalls and guest signals:
- *
- * Guest syscalls come in two flavours:
- *
- * (1) Non-interruptible syscalls
- *
- * These are guest syscalls that never get interrupted by signals and
- * so never return EINTR. They can be implemented straightforwardly in
- * QEMU: just make sure that if the implementation code has to make any
- * blocking calls that those calls are retried if they return EINTR.
- * It's also OK to implement these with safe_syscall, though it will be
- * a little less efficient if a signal is delivered at the 'wrong' moment.
- *
- * Some non-interruptible syscalls need to be handled using block_signals()
- * to block signals for the duration of the syscall. This mainly applies
- * to code which needs to modify the data structures used by the
- * host_signal_handler() function and the functions it calls, including
- * all syscalls which change the thread's signal mask.
- *
- * (2) Interruptible syscalls
- *
- * These are guest syscalls that can be interrupted by signals and
- * for which we need to either return EINTR or arrange for the guest
- * syscall to be restarted. This category includes both syscalls which
- * always restart (and in the kernel return -ERESTARTNOINTR), ones
- * which only restart if there is no handler (kernel returns -ERESTARTNOHAND
- * or -ERESTART_RESTARTBLOCK), and the most common kind which restart
- * if the handler was registered with SA_RESTART (kernel returns
- * -ERESTARTSYS). System calls which are only interruptible in some
- * situations (like 'open') also need to be handled this way.
- *
- * Here it is important that the host syscall is made
- * via this safe_syscall() function, and *not* via the host libc.
- * If the host libc is used then the implementation will appear to work
- * most of the time, but there will be a race condition where a
- * signal could arrive just before we make the host syscall inside libc,
- * and then then guest syscall will not correctly be interrupted.
- * Instead the implementation of the guest syscall can use the safe_syscall
- * function but otherwise just return the result or errno in the usual
- * way; the main loop code will take care of restarting the syscall
- * if appropriate.
- *
- * (If the implementation needs to make multiple host syscalls this is
- * OK; any which might really block must be via safe_syscall(); for those
- * which are only technically blocking (ie which we know in practice won't
- * stay in the host kernel indefinitely) it's OK to use libc if necessary.
- * You must be able to cope with backing out correctly if some safe_syscall
- * you make in the implementation returns either -TARGET_ERESTARTSYS or
- * EINTR though.)
- *
- * block_signals() cannot be used for interruptible syscalls.
- *
- *
- * How and why the safe_syscall implementation works:
- *
- * The basic setup is that we make the host syscall via a known
- * section of host native assembly. If a signal occurs, our signal
- * handler checks the interrupted host PC against the addresse of that
- * known section. If the PC is before or at the address of the syscall
- * instruction then we change the PC to point at a "return
- * -TARGET_ERESTARTSYS" code path instead, and then exit the signal handler
- * (causing the safe_syscall() call to immediately return that value).
- * Then in the main.c loop if we see this magic return value we adjust
- * the guest PC to wind it back to before the system call, and invoke
- * the guest signal handler as usual.
- *
- * This winding-back will happen in two cases:
- * (1) signal came in just before we took the host syscall (a race);
- *   in this case we'll take the guest signal and have another go
- *   at the syscall afterwards, and this is indistinguishable for the
- *   guest from the timing having been different such that the guest
- *   signal really did win the race
- * (2) signal came in while the host syscall was blocking, and the
- *   host kernel decided the syscall should be restarted;
- *   in this case we want to restart the guest syscall also, and so
- *   rewinding is the right thing. (Note that "restart" semantics mean
- *   "first call the signal handler, then reattempt the syscall".)
- * The other situation to consider is when a signal came in while the
- * host syscall was blocking, and the host kernel decided that the syscall
- * should not be restarted; in this case QEMU's host signal handler will
- * be invoked with the PC pointing just after the syscall instruction,
- * with registers indicating an EINTR return; the special code in the
- * handler will not kick in, and we will return EINTR to the guest as
- * we should.
- *
- * Notice that we can leave the host kernel to make the decision for
- * us about whether to do a restart of the syscall or not; we do not
- * need to check SA_RESTART flags in QEMU or distinguish the various
- * kinds of restartability.
- */
-#ifdef HAVE_SAFE_SYSCALL
-/* The core part of this function is implemented in assembly */
-extern long safe_syscall_base(int *pending, long number, ...);
-
-#define safe_syscall(...)                                               \
-    ({                                                                  \
-        long ret_;                                                      \
-        int *psp_ = &((TaskState *)thread_cpu->opaque)->signal_pending; \
-        ret_ = safe_syscall_base(psp_, __VA_ARGS__);                    \
-        if (is_error(ret_)) {                                           \
-            errno = -ret_;                                              \
-            ret_ = -1;                                                  \
-        }                                                               \
-        ret_;                                                           \
-    })
-
-#else
-
-/*
- * Fallback for architectures which don't yet provide a safe-syscall assembly
- * fragment; note that this is racy!
- * This should go away when all host architectures have been updated.
- */
-#define safe_syscall syscall
-
-#endif
-
 /* syscall.c */
 int host_to_target_waitstatus(int status);
 
diff --git a/linux-user/safe-syscall.h b/linux-user/safe-syscall.h
new file mode 100644
index 00000000000..6bc03902628
--- /dev/null
+++ b/linux-user/safe-syscall.h
@@ -0,0 +1,154 @@
+/*
+ * safe-syscall.h: prototypes for linux-user signal-race-safe syscalls
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef LINUX_USER_SAFE_SYSCALL_H
+#define LINUX_USER_SAFE_SYSCALL_H
+
+/**
+ * safe_syscall:
+ * @int number: number of system call to make
+ * ...: arguments to the system call
+ *
+ * Call a system call if guest signal not pending.
+ * This has the same API as the libc syscall() function, except that it
+ * may return -1 with errno == TARGET_ERESTARTSYS if a signal was pending.
+ *
+ * Returns: the system call result, or -1 with an error code in errno
+ * (Errnos are host errnos; we rely on TARGET_ERESTARTSYS not clashing
+ * with any of the host errno values.)
+ */
+
+/*
+ * A guide to using safe_syscall() to handle interactions between guest
+ * syscalls and guest signals:
+ *
+ * Guest syscalls come in two flavours:
+ *
+ * (1) Non-interruptible syscalls
+ *
+ * These are guest syscalls that never get interrupted by signals and
+ * so never return EINTR. They can be implemented straightforwardly in
+ * QEMU: just make sure that if the implementation code has to make any
+ * blocking calls that those calls are retried if they return EINTR.
+ * It's also OK to implement these with safe_syscall, though it will be
+ * a little less efficient if a signal is delivered at the 'wrong' moment.
+ *
+ * Some non-interruptible syscalls need to be handled using block_signals()
+ * to block signals for the duration of the syscall. This mainly applies
+ * to code which needs to modify the data structures used by the
+ * host_signal_handler() function and the functions it calls, including
+ * all syscalls which change the thread's signal mask.
+ *
+ * (2) Interruptible syscalls
+ *
+ * These are guest syscalls that can be interrupted by signals and
+ * for which we need to either return EINTR or arrange for the guest
+ * syscall to be restarted. This category includes both syscalls which
+ * always restart (and in the kernel return -ERESTARTNOINTR), ones
+ * which only restart if there is no handler (kernel returns -ERESTARTNOHAND
+ * or -ERESTART_RESTARTBLOCK), and the most common kind which restart
+ * if the handler was registered with SA_RESTART (kernel returns
+ * -ERESTARTSYS). System calls which are only interruptible in some
+ * situations (like 'open') also need to be handled this way.
+ *
+ * Here it is important that the host syscall is made
+ * via this safe_syscall() function, and *not* via the host libc.
+ * If the host libc is used then the implementation will appear to work
+ * most of the time, but there will be a race condition where a
+ * signal could arrive just before we make the host syscall inside libc,
+ * and then then guest syscall will not correctly be interrupted.
+ * Instead the implementation of the guest syscall can use the safe_syscall
+ * function but otherwise just return the result or errno in the usual
+ * way; the main loop code will take care of restarting the syscall
+ * if appropriate.
+ *
+ * (If the implementation needs to make multiple host syscalls this is
+ * OK; any which might really block must be via safe_syscall(); for those
+ * which are only technically blocking (ie which we know in practice won't
+ * stay in the host kernel indefinitely) it's OK to use libc if necessary.
+ * You must be able to cope with backing out correctly if some safe_syscall
+ * you make in the implementation returns either -TARGET_ERESTARTSYS or
+ * EINTR though.)
+ *
+ * block_signals() cannot be used for interruptible syscalls.
+ *
+ *
+ * How and why the safe_syscall implementation works:
+ *
+ * The basic setup is that we make the host syscall via a known
+ * section of host native assembly. If a signal occurs, our signal
+ * handler checks the interrupted host PC against the addresse of that
+ * known section. If the PC is before or at the address of the syscall
+ * instruction then we change the PC to point at a "return
+ * -TARGET_ERESTARTSYS" code path instead, and then exit the signal handler
+ * (causing the safe_syscall() call to immediately return that value).
+ * Then in the main.c loop if we see this magic return value we adjust
+ * the guest PC to wind it back to before the system call, and invoke
+ * the guest signal handler as usual.
+ *
+ * This winding-back will happen in two cases:
+ * (1) signal came in just before we took the host syscall (a race);
+ *   in this case we'll take the guest signal and have another go
+ *   at the syscall afterwards, and this is indistinguishable for the
+ *   guest from the timing having been different such that the guest
+ *   signal really did win the race
+ * (2) signal came in while the host syscall was blocking, and the
+ *   host kernel decided the syscall should be restarted;
+ *   in this case we want to restart the guest syscall also, and so
+ *   rewinding is the right thing. (Note that "restart" semantics mean
+ *   "first call the signal handler, then reattempt the syscall".)
+ * The other situation to consider is when a signal came in while the
+ * host syscall was blocking, and the host kernel decided that the syscall
+ * should not be restarted; in this case QEMU's host signal handler will
+ * be invoked with the PC pointing just after the syscall instruction,
+ * with registers indicating an EINTR return; the special code in the
+ * handler will not kick in, and we will return EINTR to the guest as
+ * we should.
+ *
+ * Notice that we can leave the host kernel to make the decision for
+ * us about whether to do a restart of the syscall or not; we do not
+ * need to check SA_RESTART flags in QEMU or distinguish the various
+ * kinds of restartability.
+ */
+#ifdef HAVE_SAFE_SYSCALL
+/* The core part of this function is implemented in assembly */
+extern long safe_syscall_base(int *pending, long number, ...);
+
+#define safe_syscall(...)                                               \
+    ({                                                                  \
+        long ret_;                                                      \
+        int *psp_ = &((TaskState *)thread_cpu->opaque)->signal_pending; \
+        ret_ = safe_syscall_base(psp_, __VA_ARGS__);                    \
+        if (is_error(ret_)) {                                           \
+            errno = -ret_;                                              \
+            ret_ = -1;                                                  \
+        }                                                               \
+        ret_;                                                           \
+    })
+
+#else
+
+/*
+ * Fallback for architectures which don't yet provide a safe-syscall assembly
+ * fragment; note that this is racy!
+ * This should go away when all host architectures have been updated.
+ */
+#define safe_syscall syscall
+
+#endif
+
+#endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b6c8406e1dc..9873830b465 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -131,6 +131,7 @@
 #include "signal-common.h"
 #include "loader.h"
 #include "user-mmap.h"
+#include "safe-syscall.h"
 #include "qemu/guest-random.h"
 #include "qemu/selfmap.h"
 #include "user/syscall-trace.h"
-- 
2.20.1



  parent reply	other threads:[~2021-09-08 15:50 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-08 15:43 [PATCH v2 0/9] linux-user: split internals out of qemu.h Peter Maydell
2021-09-08 15:43 ` [PATCH v2 1/9] linux-user: Fix coding style nits in qemu.h Peter Maydell
2021-09-12 16:07   ` Richard Henderson
2021-09-08 15:43 ` [PATCH v2 2/9] linux-user: Split strace prototypes into strace.h Peter Maydell
2021-09-12 16:16   ` Richard Henderson
2021-09-08 15:43 ` [PATCH v2 3/9] linux-user: Split signal-related prototypes into signal-common.h Peter Maydell
2021-09-08 15:59   ` Philippe Mathieu-Daudé
2021-09-12 16:16   ` Richard Henderson
2021-09-08 15:44 ` [PATCH v2 4/9] linux-user: Split loader-related prototypes into loader.h Peter Maydell
2021-09-08 16:00   ` Philippe Mathieu-Daudé
2021-09-12 16:17   ` Richard Henderson
2021-09-08 15:44 ` [PATCH v2 5/9] linux-user: Split mmap prototypes into user-mmap.h Peter Maydell
2021-09-08 16:01   ` Philippe Mathieu-Daudé
2021-09-12 16:18   ` Richard Henderson
2021-09-08 15:44 ` Peter Maydell [this message]
2021-09-12 16:23   ` [PATCH v2 6/9] linux-user: Split safe-syscall macro into its own header Richard Henderson
2021-09-13  8:20     ` Peter Maydell
2021-09-08 15:44 ` [PATCH v2 7/9] linux-user: Split linux-user internals out of qemu.h Peter Maydell
2021-09-12 16:28   ` Richard Henderson
2021-09-08 15:44 ` [PATCH v2 8/9] linux-user: Don't include gdbstub.h in qemu.h Peter Maydell
2021-09-12 16:35   ` Richard Henderson
2021-09-08 15:44 ` [PATCH v2 9/9] linux-user: Drop unneeded includes from qemu.h Peter Maydell
2021-09-12 16:36   ` Richard Henderson
2021-09-13 19:07 ` [PATCH v2 0/9] linux-user: split internals out of qemu.h Laurent Vivier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210908154405.15417-7-peter.maydell@linaro.org \
    --to=peter.maydell@linaro.org \
    --cc=laurent@vivier.eu \
    --cc=qemu-devel@nongnu.org \
    --cc=riku.voipio@iki.fi \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.