* [PATCH v4] prctl.2: Document Syscall User Dispatch
@ 2020-12-22 20:25 Gabriel Krisman Bertazi
  2020-12-23 10:40 ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 3+ messages in thread
From: Gabriel Krisman Bertazi @ 2020-12-22 20:25 UTC (permalink / raw)
  To: alx.manpages, mtk.manpages; +Cc: linux-man, Gabriel Krisman Bertazi

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>

---
Changes since v3:
(suggested by Michael Kerrisk)
  - Explain what dispatch to user space means.
  - Document the fact that the memory region is a single consecutive
  range.
  - Explain failure if *arg5 is set to a bad value.
  - fix english typo.
  - Define what 'invalid memory region' means.

Changes since v2:
(suggested by Alejandro Colomar)
  - selective -> selectively
  - Add missing oxford comma.

Changes since v1:
(suggested by Alejandro Colomar)
  - Use semantic lines
  - Fix usage of .{B|I}R and .{B|I}
  - Don't format literals
  - Fix preferred spelling of userspace
  - Fix case of word
---
 man2/prctl.2 | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)

diff --git a/man2/prctl.2 b/man2/prctl.2
index f25f05fdb593..71261a736964 100644
--- a/man2/prctl.2
+++ b/man2/prctl.2
@@ -1533,6 +1533,98 @@ For more information, see the kernel source file
 (or
 .I Documentation/arm64/sve.txt
 before Linux 5.3).
+.TP
+.\" prctl PR_SET_SYSCALL_USER_DISPATCH
+.\" commit 1446e1df9eb183fdf81c3f0715402f1d7595d4
+.BR PR_SET_SYSCALL_USER_DISPATCH " (since Linux 5.11, x86 only)"
+.IP
+Configure the Syscall User Dispatch mechanism
+for the calling thread,
+to selectively intercept system calls
+and dispatch them back to be instrumented by user space
+through
+.BR SIGSYS .
+This gives user space the opportunity to emulate the system call
+and modify its return value.
+.IP
+When a system call is dispatched back to user space
+by this mechanism,
+it is not executed by the kernel.
+When the signal handler returns,
+the system call completes immediately
+with the return value set
+by the signal handler.
+(See
+.BR sigaction (2)
+for information on setting the return value).
+.IP
+The current Syscall User Dispatch mode is selected via
+.IR arg2 ,
+which can either be set to
+.B PR_SYS_DISPATCH_ON
+to enable the feature,
+or to
+.B PR_SYS_DISPATCH_OFF
+to turn it off.
+.IP
+When
+.I arg2
+is set to
+.BR PR_SYS_DISPATCH_ON ,
+.I arg3
+and
+.I arg4
+respectively identify the
+.I offset
+and
+.I length
+of a single contiguous memory region in the process map
+from where system calls are always allowed to be executed,
+regardless of the switch variable.
+.I arg5
+points to a char-sized variable
+that is a fast switch to enable/disable the mechanism
+without invoking the kernel.
+The variable pointed to by
+.I arg5
+can either be set to
+.B PR_SYS_DISPATCH_ON
+to enable the mechanism
+or to
+.B PR_SYS_DISPATCH_OFF
+to temporarily disable it.
+The value pointed to by
+.I arg5
+is checked by the kernel
+on every system call entry,
+and any unexpected value will raise
+an uncatchable
+.B SIGSYS
+at that time,
+killing the application.
+.IP
+When a system call is intercepted,
+.B SIGSYS
+is raised with
+.I si_code
+set to
+.BR SYS_USER_DISPATCH .
+.IP
+When
+.I arg2
+is set to
+.BR PR_SYS_DISPATCH_OFF ,
+the remaining arguments must be set to 0.
+.IP
+The setting is not preserved across
+.BR fork (2),
+.BR clone (2),
+or
+.BR execve (2).
+.IP
+For more information,
+see the kernel source file
+.IR Documentation/admin-guide/syscall-user-dispatch.rst
 .\" prctl PR_SET_TAGGED_ADDR_CTRL
 .\" commit 63f0c60379650d82250f22e4cf4137ef3dc4f43d
 .TP
@@ -2000,6 +2092,14 @@ and
 .I arg3
 is an invalid address.
 .TP
+.B EFAULT
+.I option
+is
+.B PR_SET_SYSCALL_USER_DISPATCH
+and
+.I arg5
+has an invalid address.
+.TP
 .B EINVAL
 The value of
 .I option
@@ -2231,6 +2331,28 @@ and SVE is not available on this platform.
 .B EINVAL
 .I option
 is
+.B PR_SET_SYSCALL_USER_DISPATCH
+and one of the following is true:
+.RS
+.IP * 3
+.I arg2
+is
+.B PR_SYS_DISPATCH_OFF
+and the remaining arguments are not 0;
+.IP * 3
+.I arg2
+is
+.B PR_SYS_DISPATCH_ON
+and the memory range specified is outside the
+address space of the process.
+.IP * 3
+.I arg2
+is invalid.
+.RE
+.TP
+.B EINVAL
+.I option
+is
 .BR PR_SET_TAGGED_ADDR_CTRL
 and the arguments are invalid or unsupported.
 See the description of
-- 
2.29.2
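
For readers following the thread, the interface documented in the patch
can be sketched in C roughly as below. This is only an illustration
under stated assumptions, not code from the patch: the helper names
(sud_enable, sud_disable, sud_selector_set) are hypothetical, the
fallback constant values are assumed to match the kernel's
include/uapi/linux/prctl.h, and the selector values follow the names
used in the patch text. The selector is a plain static variable here,
which only suits a single-threaded program.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed to match the kernel's
 * include/uapi/linux/prctl.h. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#endif
#ifndef PR_SYS_DISPATCH_OFF
#define PR_SYS_DISPATCH_OFF 0
#endif
#ifndef PR_SYS_DISPATCH_ON
#define PR_SYS_DISPATCH_ON 1
#endif

/* The char-sized switch that arg5 points to.  The kernel reads it on
 * every system call entry once dispatch is enabled. */
static volatile char sud_selector = PR_SYS_DISPATCH_OFF;

/* Enable dispatch for the calling thread.  System calls issued from
 * [offset, offset + length) are always allowed (arg3 and arg4). */
int sud_enable(unsigned long offset, unsigned long length)
{
    return prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON,
                 offset, length, (unsigned long)&sud_selector);
}

/* Disable dispatch.  The remaining arguments must all be 0. */
int sud_disable(void)
{
    return prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_OFF,
                 0UL, 0UL, 0UL);
}

/* Flip the fast switch without making a system call. */
void sud_selector_set(int on)
{
    sud_selector = on ? PR_SYS_DISPATCH_ON : PR_SYS_DISPATCH_OFF;
}
```

A caller would typically install a SIGSYS handler first, call
sud_enable() with the range that should stay unintercepted, and only
then call sud_selector_set(1). On kernels without the feature, prctl()
is expected to fail with EINVAL.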



* Re: [PATCH v4] prctl.2: Document Syscall User Dispatch
  2020-12-22 20:25 [PATCH v4] prctl.2: Document Syscall User Dispatch Gabriel Krisman Bertazi
@ 2020-12-23 10:40 ` Michael Kerrisk (man-pages)
  2020-12-23 17:30   ` Gabriel Krisman Bertazi
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-12-23 10:40 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi, alx.manpages; +Cc: mtk.manpages, linux-man

Hello Gabriel,

On 12/22/20 9:25 PM, Gabriel Krisman Bertazi wrote:
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
> 
> ---
> Changes since v3:
> (suggested by Michael Kerrisk)
>   - Explain what dispatch to user space means.
>   - Document the fact that the memory region is a single consecutive
>   range.
>   - Explain failure if *arg5 is set to a bad value.
>   - fix english typo.
>   - Define what 'invalid memory region' means.
> 
> Changes since v2:
> (suggested by Alejandro Colomar)
>   - selective -> selectively
>   - Add missing oxford comma.
> 
> Changes since v1:
> (suggested by Alejandro Colomar)
>   - Use semantic lines
>   - Fix usage of .{B|I}R and .{B|I}
>   - Don't format literals
>   - Fix preferred spelling of userspace
>   - Fix case of word
> ---
>  man2/prctl.2 | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 122 insertions(+)
> 
> diff --git a/man2/prctl.2 b/man2/prctl.2
> index f25f05fdb593..71261a736964 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -1533,6 +1533,98 @@ For more information, see the kernel source file
>  (or
>  .I Documentation/arm64/sve.txt
>  before Linux 5.3).
> +.TP
> +.\" prctl PR_SET_SYSCALL_USER_DISPATCH
> +.\" commit 1446e1df9eb183fdf81c3f0715402f1d7595d4
> +.BR PR_SET_SYSCALL_USER_DISPATCH " (since Linux 5.11, x86 only)"
> +.IP
> +Configure the Syscall User Dispatch mechanism
> +for the calling thread,
> +to selectively intercept system calls
> +and dispatch them back to be instrumented by user space
> +through
> +.BR SIGSYS .

I think that "dispatch them back to be instrumented by user space" 
doesn't really explain anything to someone unfamiliar with SUD.

How about something like this (if it is correct):

[[
The Syscall User Dispatch mechanism allows an application to
selectively intercept system calls so that they can be emulated
within the application itself. Interception takes the form of a
thread-directed SIGSYS signal that is delivered to the thread
when it makes a system call. Upon rece(The system call is not executed
by the kernel.)
]]

> +This gives user space the opportunity to emulate the system call
> +and modify its return value.

How is the system call emulated? What I mean is: does one 
emulate it from the SIGSYS handler? That needs to be more
clearly stated.

> +.IP
> +When a system call is dispatched back to user space
> +by this mechanism,
> +it is not executed by the kernel.
> +When the signal handler returns,
> +the system call completes immediately
> +with the return value set
> +by the signal handler.
> +(See
> +.BR sigaction (2)
> +for information on setting the return value).

I can't see anything in sigaction(2) that explains how to set the
return value. Am I missing something or do you have a patch in
progress for that page?

> +.IP
> +The current Syscall User Dispatch mode is selected via
> +.IR arg2 ,
> +which can either be set to
> +.B PR_SYS_DISPATCH_ON
> +to enable the feature,
> +or to
> +.B PR_SYS_DISPATCH_OFF
> +to turn it off.
> +.IP
> +When
> +.I arg2
> +is set to
> +.BR PR_SYS_DISPATCH_ON ,
> +.I arg3
> +and
> +.I arg4
> +respectively identify the
> +.I offset
> +and
> +.I length
> +of a single contiguous memory region in the process map
> +from where system calls are always allowed to be executed,
> +regardless of the switch variable.

Perhaps add something here like:

"(Typically, this area would include the area of memory containing
the C library.)
"
?

> +.I arg5
> +points to a char-sized variable
> +that is a fast switch to enable/disable the mechanism
> +without invoking the kernel.

Maybe:
s/invoking the kernel/requiring (the expense of) a system call/
?

> +The variable pointed to by
> +.I arg5
> +can either be set to
> +.B PR_SYS_DISPATCH_ON
> +to enable the mechanism
> +or to
> +.B PR_SYS_DISPATCH_OFF
> +to temporarily disable it.
> +The value pointed to by
> +.I arg5
> +is checked by the kernel
> +on every system call entry,
> +and any unexpected value will raise
> +an uncatchable
> +.B SIGSYS
> +at that time,
> +killing the application.
> +.IP
> +When a system call is intercepted,
> +.B SIGSYS
> +is raised with
> +.I si_code
> +set to
> +.BR SYS_USER_DISPATCH .
> +.IP
> +When
> +.I arg2
> +is set to
> +.BR PR_SYS_DISPATCH_OFF ,
> +the remaining arguments must be set to 0.
> +.IP
> +The setting is not preserved across
> +.BR fork (2),
> +.BR clone (2),
> +or
> +.BR execve (2).
> +.IP
> +For more information,
> +see the kernel source file
> +.IR Documentation/admin-guide/syscall-user-dispatch.rst
>  .\" prctl PR_SET_TAGGED_ADDR_CTRL
>  .\" commit 63f0c60379650d82250f22e4cf4137ef3dc4f43d
>  .TP
> @@ -2000,6 +2092,14 @@ and
>  .I arg3
>  is an invalid address.
>  .TP
> +.B EFAULT
> +.I option
> +is
> +.B PR_SET_SYSCALL_USER_DISPATCH
> +and
> +.I arg5
> +has an invalid address.
> +.TP
>  .B EINVAL
>  The value of
>  .I option
> @@ -2231,6 +2331,28 @@ and SVE is not available on this platform.
>  .B EINVAL
>  .I option
>  is
> +.B PR_SET_SYSCALL_USER_DISPATCH
> +and one of the following is true:
> +.RS
> +.IP * 3
> +.I arg2
> +is
> +.B PR_SYS_DISPATCH_OFF
> +and the remaining arguments are not 0;
> +.IP * 3
> +.I arg2
> +is
> +.B PR_SYS_DISPATCH_ON
> +and the memory range specified is outside the
> +address space of the process.
> +.IP * 3
> +.I arg2
> +is invalid.
> +.RE
> +.TP
> +.B EINVAL
> +.I option
> +is
>  .BR PR_SET_TAGGED_ADDR_CTRL
>  and the arguments are invalid or unsupported.
>  See the description of

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH v4] prctl.2: Document Syscall User Dispatch
  2020-12-23 10:40 ` Michael Kerrisk (man-pages)
@ 2020-12-23 17:30   ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 3+ messages in thread
From: Gabriel Krisman Bertazi @ 2020-12-23 17:30 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages); +Cc: alx.manpages, linux-man

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> Hello Gabriel,
>
> On 12/22/20 9:25 PM, Gabriel Krisman Bertazi wrote:
>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>> 
>> ---
>> Changes since v3:
>> (suggested by Michael Kerrisk)
>>   - Explain what dispatch to user space means.
>>   - Document the fact that the memory region is a single consecutive
>>   range.
>>   - Explain failure if *arg5 is set to a bad value.
>>   - fix english typo.
>>   - Define what 'invalid memory region' means.
>> 
>> Changes since v2:
>> (suggested by Alejandro Colomar)
>>   - selective -> selectively
>>   - Add missing oxford comma.
>> 
>> Changes since v1:
>> (suggested by Alejandro Colomar)
>>   - Use semantic lines
>>   - Fix usage of .{B|I}R and .{B|I}
>>   - Don't format literals
>>   - Fix preferred spelling of userspace
>>   - Fix case of word
>> ---
>>  man2/prctl.2 | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 122 insertions(+)
>> 
>> diff --git a/man2/prctl.2 b/man2/prctl.2
>> index f25f05fdb593..71261a736964 100644
>> --- a/man2/prctl.2
>> +++ b/man2/prctl.2
>> @@ -1533,6 +1533,98 @@ For more information, see the kernel source file
>>  (or
>>  .I Documentation/arm64/sve.txt
>>  before Linux 5.3).
>> +.TP
>> +.\" prctl PR_SET_SYSCALL_USER_DISPATCH
>> +.\" commit 1446e1df9eb183fdf81c3f0715402f1d7595d4
>> +.BR PR_SET_SYSCALL_USER_DISPATCH " (since Linux 5.11, x86 only)"
>> +.IP
>> +Configure the Syscall User Dispatch mechanism
>> +for the calling thread,
>> +to selectively intercept system calls
>> +and dispatch them back to be instrumented by user space
>> +through
>> +.BR SIGSYS .
>
> I think that "dispatch them back to be instrumented by user space" 
> doesn't really explain anything to someone unfamiliar with SUD.
>
> How about something like this (if it is correct):
>
> [[
> The Syscall User Dispatch mechanism allows an application to
> selectively intercept system calls so that they can be emulated
> within the application itself. Interception takes the form of a
> thread-directed SIGSYS signal that is delivered to the thread
> when it makes a system call. Upon rece(The system call is not executed
> by the kernel.)
> ]]
>
>> +This gives user space the opportunity to emulate the system call
>> +and modify its return value.
>
> How is the system call emulated? What I mean is: does one 
> emulate it from the SIGSYS handler? That needs to be more
> clearly stated.

I am rethinking the mention of emulation in the manpage, as that goes
beyond SUD.  In fact, it is one use case that can be implemented using
SUD and signal handlers, but there are others.

I'm using your suggestion above, slightly modified, to avoid the term emulation.

>
>> +.IP
>> +When a system call is dispatched back to user space
>> +by this mechanism,
>> +it is not executed by the kernel.
>> +When the signal handler returns,
>> +the system call completes immediately
>> +with the return value set
>> +by the signal handler.
>> +(See
>> +.BR sigaction (2)
>> +for information on setting the return value).
>
> I can't see anything in sigaction(2) that explains how to set the
> return value. Am I missing something or do you have a patch in
> progress for that page?

The way you modify the syscall return value is not part of SUD; instead,
it is generic to how signals are handled.  So I'm dropping this bit.
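
To illustrate the generic signal-handling route mentioned here, a rough
C sketch follows. It is an assumption-laden example, not part of the
patch: the function names are hypothetical, the fallback value of
SYS_USER_DISPATCH is assumed to match the kernel's siginfo header, and
the register write applies only to x86-64 with a libc that exposes
REG_RAX. The handler edits the saved register context, so the
interrupted code sees the chosen value as the system call's result when
the handler returns.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <ucontext.h>

/* Fallback for older headers; value assumed to match the kernel's
 * include/uapi/asm-generic/siginfo.h. */
#ifndef SYS_USER_DISPATCH
#define SYS_USER_DISPATCH 2
#endif

/* SIGSYS handler: for a dispatched system call, store a result in the
 * saved context.  Here every intercepted call "fails" with ENOSYS. */
static void sigsys_handler(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;

    (void)sig;
    if (info->si_code != SYS_USER_DISPATCH)
        return;
#if defined(__x86_64__) && defined(REG_RAX)
    /* RAX carries the system call return value on x86-64. */
    uc->uc_mcontext.gregs[REG_RAX] = -ENOSYS;
#else
    (void)uc; /* No portable way to set the result in this sketch. */
#endif
}

/* Install the handler with SA_SIGINFO so it receives the context. */
int install_sigsys_handler(void)
{
    struct sigaction sa = {0};

    sa.sa_sigaction = sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}
```

One caveat worth keeping in mind: returning from the handler itself
involves a system call, so a real program would presumably need that
return path to lie inside the always-allowed region, or would need to
flip the selector variable to the disabled state inside the handler.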

-- 
Gabriel Krisman Bertazi
