From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Gabriel Krisman Bertazi <krisman@collabora.com>, alx.manpages@gmail.com
Cc: mtk.manpages@gmail.com, linux-man@vger.kernel.org, kernel@collabora.com
Subject: Re: [PATCH v6] prctl.2: Document Syscall User Dispatch
Date: Wed, 30 Dec 2020 11:24:04 +0100
Message-ID: <5da9a8bc-e034-1ab4-3f87-328108c1b27d@gmail.com>
In-Reply-To: <20201228173832.347794-1-krisman@collabora.com>

Hello Gabriel

This is looking much better. Thank you! I have a few more
comments still.

On 12/28/20 6:38 PM, Gabriel Krisman Bertazi wrote:
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
> 
> ---
> Changes since v5:
> (suggested by Michael Kerrisk)
>   - Change () punctuation
>   - fix grammar
>   - Add information about interception, return and return value
> 
> Changes since v4:
> (suggested by Michael Kerrisk)
>   - Modify explanation of what dispatch to user space means.
>   - Drop references to emulation.
>   - Document suggestion about placing libc in allowed-region.
>   - Comment about avoiding syscall cost.
> Changes since v3:
> (suggested by Michael Kerrisk)
>   - Explain what dispatch to user space means.
>   - Document the fact that the memory region is a single consecutive
>   range.
>   - Explain failure if *arg5 is set to a bad value.
>   - fix english typo.
>   - Define what 'invalid memory region' means.
> 
> Changes since v2:
> (suggested by Alejandro Colomar)
>   - selective -> selectively
>   - Add missing oxford comma.
> 
> Changes since v1:
> (suggested by Alejandro Colomar)
>   - Use semantic lines
>   - Fix usage of .{B|I}R and .{B|I}
>   - Don't format literals
>   - Fix preferred spelling of userspace
>   - Fix case of word
> ---
>  man2/prctl.2 | 159 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 159 insertions(+)
> 
> diff --git a/man2/prctl.2 b/man2/prctl.2
> index f25f05fdb593..0a0abfb78055 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -1533,6 +1533,135 @@ For more information, see the kernel source file
>  (or
>  .I Documentation/arm64/sve.txt
>  before Linux 5.3).
> +.TP
> +.\" prctl PR_SET_SYSCALL_USER_DISPATCH
> +.\" commit 1446e1df9eb183fdf81c3f0715402f1d7595d4
> +.BR PR_SET_SYSCALL_USER_DISPATCH " (since Linux 5.11, x86 only)"
> +.IP
> +Configure the Syscall User Dispatch mechanism
> +for the calling thread.
> +This mechanism allows an application
> +to selectively intercept system calls
> +so that they can be handled within the application itself.
> +Interception takes the form of a thread-directed
> +.B SIGSYS
> +signal that is delivered to the thread
> +when it makes a system call.
> +If intercepted,
> +the system call is not executed by the kernel.
> +.IP
> +The current Syscall User Dispatch mode is selected via
> +.IR arg2 ,
> +which can either be set to
> +.B PR_SYS_DISPATCH_ON
> +to enable the feature,
> +or to
> +.B PR_SYS_DISPATCH_OFF
> +to turn it off.

So, I realize now that I'm slightly confused.

The value of arg2 can be either PR_SYS_DISPATCH_ON or
PR_SYS_DISPATCH_OFF. The value of the selector pointed to by
arg5 can likewise be PR_SYS_DISPATCH_ON or PR_SYS_DISPATCH_OFF.
What is the relationship between these two attributes? For example,
what does it mean if arg2 is PR_SYS_DISPATCH_ON and, at the time of
the prctl() call, the selector has the value PR_SYS_DISPATCH_OFF?
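
To make the question concrete, the scenario can be sketched roughly as
below. This is only an illustration, not tested against a 5.11 kernel:
the names dispatch_selector, dispatch_enable() and dispatch_disable()
are hypothetical, the fallback constant values are assumed to match the
kernel's include/uapi/linux/prctl.h, the selector values simply follow
the names used in the patch text, and an empty allowed region
(arg3 = arg4 = 0) is assumed for brevity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from the kernel's
   include/uapi/linux/prctl.h. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#endif
#ifndef PR_SYS_DISPATCH_OFF
#define PR_SYS_DISPATCH_OFF 0
#endif
#ifndef PR_SYS_DISPATCH_ON
#define PR_SYS_DISPATCH_ON 1
#endif

/* Hypothetical selector byte passed as arg5. It must remain valid
   for as long as the mechanism is enabled for this thread. */
static volatile char dispatch_selector = PR_SYS_DISPATCH_OFF;

/* The patch describes arg5 as pointing to a char-sized variable. */
static_assert(sizeof(dispatch_selector) == 1, "selector must be char-sized");

/* arg2 is ON while the selector still reads OFF: the case asked about.
   Returns 0 on success, or -1 with errno set (for example, on a kernel
   or architecture without Syscall User Dispatch). */
static int dispatch_enable(void)
{
    return prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON,
                 0UL, 0UL, &dispatch_selector);
}

/* Turn the mechanism off again; the remaining arguments are passed as 0. */
static int dispatch_disable(void)
{
    return prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_OFF,
                 0UL, 0UL, 0UL);
}
```

If the patch text is read literally, calling dispatch_enable() here
should leave every system call unintercepted until the program stores
PR_SYS_DISPATCH_ON into dispatch_selector, which is exactly the
interaction that would be good to spell out in the page.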

> +.IP
> +When
> +.I arg2
> +is set to
> +.BR PR_SYS_DISPATCH_ON ,
> +.I arg3
> +and
> +.I arg4
> +respectively identify the
> +.I offset
> +and
> +.I length
> +of a single contiguous memory region in the process map

Better: s/map/address space/ ?

> +from where system calls are always allowed to be executed,
> +regardless of the switch variable

s/variable/variable./

> +(Typically, this area would include the area of memory
> +containing the C library.)

I think just to ease readability (smaller paragraphs), insert
.IP
here.

> +.I arg5
> +points to a char-sized variable
> +that is a fast switch to enable/disable the mechanism
> +without the overhead of doing a system call.
> +The variable pointed by
> +.I arg5
> +can either be set to
> +.B PR_SYS_DISPATCH_ON
> +to enable the mechanism
> +or to
> +.B PR_SYS_DISPATCH_OFF
> +to temporarily disable it.
> +This value is checked by the kernel
> +on every system call entry,
> +and any unexpected value will raise
> +an uncatchable
> +.B SIGSYS
> +at that time,
> +killing the application.
> +.IP
> +When a system call is intercepted,
> +the kernel sends a thread-directed
> +.B SIGSYS
> +signal to the triggering thread.
> +Various fields will be set in the
> +.I siginfo_t
> +structure (see
> +.BR sigaction (2))
> +associated with the signal:
> +.RS
> +.IP * 3
> +.I si_signo
> +will contain
> +.BR SIGSYS .
> +.IP *
> +.IR si_call_addr
> +will show the address of the system call instruction.
> +.IP *
> +.IR si_syscall
> +and
> +.IR si_arch
> +will indicate which system call was attempted.
> +.IP *
> +.I si_code
> +will contain
> +.BR SYS_USER_DISPATCH .
> +.IP *
> +.I si_errno
> +will be set to 0.
> +.RE
> +.IP
> +The program counter will be as though the system call happened
> +(i.e., the program counter will not point to the system call instruction).
> +.IP
> +When the signal handler returns to the kernel,
> +the system call completes immediately
> +and returns to the calling thread,
> +without actually being executed.
> +If necessary
> +(i.e., when emulating the system call on user space.),
> +the signal handler should set the system call return value
> +to a sane value,
> +by modifying the register context stored in the
> +.I ucontext
> +argument of the signal handler.

Just for my own education, do you have any example code somewhere
that demonstrates setting the syscall return value?
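
For reference, the kind of code being asked about might look roughly
like the sketch below. It is a guess under stated assumptions, not the
author's code: it assumes x86-64 with glibc (where the system call
return value lives in the RAX slot of the saved register context), the
names sud_selector, emulate_syscall(), sigsys_handler() and
install_sigsys_handler() are hypothetical, the fallback value 2 for
SYS_USER_DISPATCH is assumed from the kernel's siginfo definitions, and
the handler clears the selector on the assumption that the sigreturn
system call made on handler exit must not itself be intercepted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>

#ifndef SYS_USER_DISPATCH
#define SYS_USER_DISPATCH 2   /* assumed si_code value from the kernel headers */
#endif

#if defined(__x86_64__)
/* The emulated result is stored in a general-purpose register slot. */
static_assert(sizeof(greg_t) >= sizeof(long), "result must fit in a register");
#endif

/* Hypothetical selector byte that was registered via arg5. */
static volatile char sud_selector;

/* Hypothetical emulation policy: every intercepted call fails with
   ENOSYS. A real program would switch on the system call number. */
static long emulate_syscall(int nr)
{
    (void)nr;
    return -ENOSYS;
}

static void sigsys_handler(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;
    long ret;

    if (sig != SIGSYS || info->si_code != SYS_USER_DISPATCH)
        return;

    /* Assumption: disable interception so that returning from the
       handler is not itself dispatched back to user space. */
    sud_selector = 0;

    ret = emulate_syscall(info->si_syscall);
#if defined(__x86_64__)
    /* The interrupted code sees this value as the system call result. */
    uc->uc_mcontext.gregs[REG_RAX] = ret;
#else
    (void)uc;
    (void)ret;
#endif
}

/* Install the handler with SA_SIGINFO so that the siginfo_t and
   ucontext arguments are supplied. Returns 0 on success. */
static int install_sigsys_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}
```

A negative errno value is used because that is the raw kernel return
convention on x86-64; a libc wrapper around the intercepted call would
then translate it into -1 plus errno as usual.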

Thanks,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Thread overview: 4+ messages
2020-12-28 17:38 [PATCH v6] prctl.2: Document Syscall User Dispatch Gabriel Krisman Bertazi
2020-12-30 10:24 ` Michael Kerrisk (man-pages) [this message]
2020-12-30 16:51   ` Gabriel Krisman Bertazi
2020-12-30 19:50     ` Michael Kerrisk (man-pages)
