All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] rseq: x86: implement abort-at-ip extension
@ 2022-01-07 17:03 Mathieu Desnoyers
  2022-01-07 19:31 ` Florian Weimer
  0 siblings, 1 reply; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-07 17:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, carlos, Mathieu Desnoyers

Allow rseq critical section abort handlers to optionally figure out at
which instruction pointer the rseq critical section was aborted.

This allows implementing rseq critical sections containing loops, in
which case the commit side-effect cannot be the last instruction. This
is useful to implement adaptative mutexes aware of preemption in
user-space. (see [1])

This also allows implementing rseq critical sections with multiple
commit steps, and use the abort-at-ip information to figure out what
needs to be undone in the abort handler.

Introduce the RSEQ_FLAG_QUERY_ABORT_AT_IP rseq system call flag. This
lets userspace query whether the kernel and architecture supports the
abort-at-ip rseq extension.

Only provide this information for rseq critical sections for which the
rseq_cs descriptor has the RSEQ_CS_FLAG_ABORT_AT_IP flag set. Those
critical sections need to expect those ecx/rcx registers to be
clobbered on abort.

For x86-32, provide the abort-at-ip information in the %ecx register.
For x86-64, provide the abort-at-ip information in the %rcx register.

[1] https://github.com/compudj/rseq-test/blob/adapt-lock-abort-at-ip/test-rseq-adaptative-lock.c#L80

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 arch/x86/include/asm/ptrace.h | 12 ++++++++++++
 include/uapi/linux/rseq.h     |  4 ++++
 kernel/rseq.c                 | 15 +++++++++++++++
 3 files changed, 31 insertions(+)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index b94f615600d5..0a50e7f14c64 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -387,5 +387,17 @@ extern int do_set_thread_area(struct task_struct *p, int idx,
 # define do_set_thread_area_64(p, s, t)	(0)
 #endif
 
+#ifdef CONFIG_RSEQ
+
+#define RSEQ_ARCH_HAS_ABORT_AT_IP
+
+/* Use ecx/rcx as placeholder for abort-at ip. */
+static inline void rseq_abort_at_ip(struct pt_regs *regs, unsigned long ip)
+{
+	regs->cx = ip;
+}
+
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_PTRACE_H */
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 9a402fdb60e9..3130232e6d0c 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -20,12 +20,14 @@ enum rseq_cpu_id_state {
 
 enum rseq_flags {
 	RSEQ_FLAG_UNREGISTER = (1 << 0),
+	RSEQ_FLAG_QUERY_ABORT_AT_IP = (1 << 1),
 };
 
 enum rseq_cs_flags_bit {
 	RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT	= 0,
 	RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT	= 1,
 	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT	= 2,
+	RSEQ_CS_FLAG_ABORT_AT_IP_BIT		= 3,
 };
 
 enum rseq_cs_flags {
@@ -35,6 +37,8 @@ enum rseq_cs_flags {
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
 	RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE	=
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
+	RSEQ_CS_FLAG_ABORT_AT_IP		=
+		(1U << RSEQ_CS_FLAG_ABORT_AT_IP_BIT),
 };
 
 /*
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 6d45ac3dae7f..6612136412c8 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -21,6 +21,13 @@
 #define RSEQ_CS_PREEMPT_MIGRATE_FLAGS (RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE | \
 				       RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT)
 
+#ifdef RSEQ_ARCH_HAS_ABORT_AT_IP
+static bool rseq_has_abort_at_ip(void) { return true; }
+#else
+static bool rseq_has_abort_at_ip(void) { return false; }
+static void rseq_abort_at_ip(struct pt_regs *regs, unsigned long ip) { }
+#endif
+
 /*
  *
  * Restartable sequences are a lightweight interface that allows
@@ -261,6 +268,8 @@ static int rseq_ip_fixup(struct pt_regs *regs)
 	trace_rseq_ip_fixup(ip, rseq_cs.start_ip, rseq_cs.post_commit_offset,
 			    rseq_cs.abort_ip);
 	instruction_pointer_set(regs, (unsigned long)rseq_cs.abort_ip);
+	if (rseq_cs.flags & RSEQ_CS_FLAG_ABORT_AT_IP)
+		rseq_abort_at_ip(regs, ip);
 	return 0;
 }
 
@@ -330,6 +339,12 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 {
 	int ret;
 
+	if (flags & RSEQ_FLAG_QUERY_ABORT_AT_IP) {
+		if (flags & ~RSEQ_FLAG_QUERY_ABORT_AT_IP)
+			return -EINVAL;
+		return rseq_has_abort_at_ip() ? 0 : -EINVAL;
+	}
+
 	if (flags & RSEQ_FLAG_UNREGISTER) {
 		if (flags & ~RSEQ_FLAG_UNREGISTER)
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-07 17:03 [RFC PATCH] rseq: x86: implement abort-at-ip extension Mathieu Desnoyers
@ 2022-01-07 19:31 ` Florian Weimer
  2022-01-07 19:48   ` Mathieu Desnoyers
  0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-01-07 19:31 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Paul E . McKenney,
	Boqun Feng, H . Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Mathieu Desnoyers:

> Allow rseq critical section abort handlers to optionally figure out at
> which instruction pointer the rseq critical section was aborted.
>
> This allows implementing rseq critical sections containing loops, in
> which case the commit side-effect cannot be the last instruction. This
> is useful to implement adaptative mutexes aware of preemption in
> user-space. (see [1])

Could you write the program counter to the rseq area instead?  This
would avoid discussing which register to clobber.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-07 19:31 ` Florian Weimer
@ 2022-01-07 19:48   ` Mathieu Desnoyers
  2022-01-07 21:27     ` Mathieu Desnoyers
  2022-01-12 15:16     ` Florian Weimer
  0 siblings, 2 replies; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-07 19:48 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 7, 2022, at 2:31 PM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> Allow rseq critical section abort handlers to optionally figure out at
>> which instruction pointer the rseq critical section was aborted.
>>
>> This allows implementing rseq critical sections containing loops, in
>> which case the commit side-effect cannot be the last instruction. This
>> is useful to implement adaptative mutexes aware of preemption in
>> user-space. (see [1])
> 
> Could you write the program counter to the rseq area instead?  This
> would avoid discussing which register to clobber.

Using the rseq area for that purpose would be problematic for nested signal
handlers with rseq critical sections. If a signal happens to be delivered
right after the abort ip adjustment, its signal handler containing a rseq
critical section could overwrite the relevant "abort-at-ip" field in the
rseq per-thread area before it has been read by the abort handler interrupted
by the signal.

Making this architecture-agnostic is indeed a laudable goal, but I don't
think the rseq per-thread area is a good fit for this.

I also though about making the clobbered register configurable on a
per-critical-section basis, but I rather think that it would be
overengineered: too much complexity for the gain. Unless there are
very strong reasons for choosing one register over another on a per
use-case basis ?

I guess if we ever care about the state of a given register within a given
range of instructions, we may lose that information if it is overwritten
by the abort-at-ip value. For instance, in my adaptative mutex prototype,
I use the Zero Flag to check if cmpxchg has succeeded. But if I would have
wanted to use the register modified by cmpxchg, and it would happen to be
clobbered by the abort-at-ip on abort, then it limits what the abort handler
can observe. It's fine as long as instructions can select what registers they
operate on, but instructions like cmpxchg AFAIR work on specific registers,
which might warrant making the abort-at-ip register configurable per
critical section. But maybe just choosing a register for abort-at-ip which
is not typically used by instructions that rely on hardcoded registers might
be sufficient.

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-07 19:48   ` Mathieu Desnoyers
@ 2022-01-07 21:27     ` Mathieu Desnoyers
  2022-01-07 22:27       ` David Laight
  2022-01-12 15:16     ` Florian Weimer
  1 sibling, 1 reply; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-07 21:27 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 7, 2022, at 2:48 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Jan 7, 2022, at 2:31 PM, Florian Weimer fw@deneb.enyo.de wrote:
> 
>> * Mathieu Desnoyers:
>> 
>>> Allow rseq critical section abort handlers to optionally figure out at
>>> which instruction pointer the rseq critical section was aborted.
>>>
>>> This allows implementing rseq critical sections containing loops, in
>>> which case the commit side-effect cannot be the last instruction. This
>>> is useful to implement adaptative mutexes aware of preemption in
>>> user-space. (see [1])
>> 
>> Could you write the program counter to the rseq area instead?  This
>> would avoid discussing which register to clobber.
> 
> Using the rseq area for that purpose would be problematic for nested signal
> handlers with rseq critical sections. If a signal happens to be delivered
> right after the abort ip adjustment, its signal handler containing a rseq
> critical section could overwrite the relevant "abort-at-ip" field in the
> rseq per-thread area before it has been read by the abort handler interrupted
> by the signal.
> 
> Making this architecture-agnostic is indeed a laudable goal, but I don't
> think the rseq per-thread area is a good fit for this.
> 
> I also though about making the clobbered register configurable on a
> per-critical-section basis, but I rather think that it would be
> overengineered: too much complexity for the gain. Unless there are
> very strong reasons for choosing one register over another on a per
> use-case basis ?
> 
> I guess if we ever care about the state of a given register within a given
> range of instructions, we may lose that information if it is overwritten
> by the abort-at-ip value. For instance, in my adaptative mutex prototype,
> I use the Zero Flag to check if cmpxchg has succeeded. But if I would have
> wanted to use the register modified by cmpxchg, and it would happen to be
> clobbered by the abort-at-ip on abort, then it limits what the abort handler
> can observe. It's fine as long as instructions can select what registers they
> operate on, but instructions like cmpxchg AFAIR work on specific registers,
> which might warrant making the abort-at-ip register configurable per
> critical section. But maybe just choosing a register for abort-at-ip which
> is not typically used by instructions that rely on hardcoded registers might
> be sufficient.
> 
> Thoughts ?

That being said, there might be an architecture agnostic alternative available.
On abort of a RSEQ_CS_FLAG_ABORT_AT_IP critical section, we could let the kernel
decrement/increment the stack pointer to make room for a pointer (depending if the
stack grows down or up). It would then store the abort-at-ip value at the top of
stack.

The abort handler would be expected to use this top of stack abort-at-ip value,
and would be required to increment/decrement (depending on the stack direction)
the stack pointer back to its rightful value before the end of the assembly block.

Thoughts ?

Thanks,

Mathieu

> 
> Thanks,
> 
> Mathieu
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-07 21:27     ` Mathieu Desnoyers
@ 2022-01-07 22:27       ` David Laight
  2022-01-08  1:08         ` Mathieu Desnoyers
  0 siblings, 1 reply; 15+ messages in thread
From: David Laight @ 2022-01-07 22:27 UTC (permalink / raw)
  To: 'Mathieu Desnoyers', Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

> That being said, there might be an architecture agnostic alternative available.
> On abort of a RSEQ_CS_FLAG_ABORT_AT_IP critical section, we could let the kernel
> decrement/increment the stack pointer to make room for a pointer (depending if the
> stack grows down or up). It would then store the abort-at-ip value at the top of
> stack.

Stack redzone in a leaf function?

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-07 22:27       ` David Laight
@ 2022-01-08  1:08         ` Mathieu Desnoyers
  2022-01-08  1:33           ` Mathieu Desnoyers
  2022-01-08  1:33           ` Mathieu Desnoyers
  0 siblings, 2 replies; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-08  1:08 UTC (permalink / raw)
  To: David Laight
  Cc: Florian Weimer, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 7, 2022, at 5:27 PM, David Laight David.Laight@ACULAB.COM wrote:

>> That being said, there might be an architecture agnostic alternative available.
>> On abort of a RSEQ_CS_FLAG_ABORT_AT_IP critical section, we could let the kernel
>> decrement/increment the stack pointer to make room for a pointer (depending if
>> the
>> stack grows down or up). It would then store the abort-at-ip value at the top of
>> stack.
> 
> Stack redzone in a leaf function?

Good point!

For x86-64 for instance, which has a redzone of 128 bytes, there are a few alternatives.
On abort:

Approach A) Move the stack pointer as little as possible

   1. On abort in kernel:
   1.1 Subtract 8 bytes from the stack pointer
   1.2 Store the abort-at-ip value at stack pointer - 128
   2. In abort handler (userspace)
   2.1 user-space knows that it can find the abort-at-ip value at stack pointer - 128
   2.2 after using that value, the abort handler needs to add 8 bytes to the stack pointer

Approach B) Move the stack pointer to skip the entire redzone

   1. On abort in kernel:
   1.1 Subtract 128 bytes from the stack pointer
   1.2 Store the abort-at-ip value at stack pointer - 8 (basically within a new red zone)
   2. In abort handler (userspace)
   2.1 user-space knows that it can find the abort-at-ip value at stack pointer - 8
   2.2 after using that value, the abort handler needs to add 128 bytes to the stack pointer

I suspect both approaches should work.

Any preference or other ideas ?

Thanks,

Mathieu

> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT,
> UK
> Registration No: 1397386 (Wales)

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-08  1:08         ` Mathieu Desnoyers
@ 2022-01-08  1:33           ` Mathieu Desnoyers
  2022-01-08  1:33           ` Mathieu Desnoyers
  1 sibling, 0 replies; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-08  1:33 UTC (permalink / raw)
  To: David Laight
  Cc: Florian Weimer, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 7, 2022, at 8:08 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Jan 7, 2022, at 5:27 PM, David Laight David.Laight@ACULAB.COM wrote:
> 
>>> That being said, there might be an architecture agnostic alternative available.
>>> On abort of a RSEQ_CS_FLAG_ABORT_AT_IP critical section, we could let the kernel
>>> decrement/increment the stack pointer to make room for a pointer (depending if
>>> the
>>> stack grows down or up). It would then store the abort-at-ip value at the top of
>>> stack.
>> 
>> Stack redzone in a leaf function?
> 
> Good point!
> 
> For x86-64 for instance, which has a redzone of 128 bytes, there are a few
> alternatives.
> On abort:
> 
> Approach A) Move the stack pointer as little as possible
> 
>   1. On abort in kernel:
>   1.1 Subtract 8 bytes from the stack pointer
>   1.2 Store the abort-at-ip value at stack pointer - 128
>   2. In abort handler (userspace)
>   2.1 user-space knows that it can find the abort-at-ip value at stack pointer -
>   128
>   2.2 after using that value, the abort handler needs to add 8 bytes to the stack
>   pointer
> 
> Approach B) Move the stack pointer to skip the entire redzone
> 
>   1. On abort in kernel:
>   1.1 Subtract 128 bytes from the stack pointer
>   1.2 Store the abort-at-ip value at stack pointer - 8 (basically within a new red
>   zone)
>   2. In abort handler (userspace)
>   2.1 user-space knows that it can find the abort-at-ip value at stack pointer - 8
>   2.2 after using that value, the abort handler needs to add 128 bytes to the
>   stack pointer

Or approach C)

  1. On abort in kernel:
  1.1 Subtract 136 bytes from the stack pointer
  1.2 Store the abort-at-ip value at stack pointer
  2. In abort handler (userspace)
  2.1 user-space knows that it can find the abort-at-ip value at stack pointer, so
      it can pop it from the stack
  2.2 after popping that value, the abort handler needs to add 128 bytes to the stack pointer

So far, approach C seems to be the most elegant IMHO.

Thanks,

Mathieu


> 
> I suspect both approaches should work.
> 
> Any preference or other ideas ?
> 
> Thanks,
> 
> Mathieu
> 
>> 
>> -
>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT,
>> UK
>> Registration No: 1397386 (Wales)
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-08  1:08         ` Mathieu Desnoyers
  2022-01-08  1:33           ` Mathieu Desnoyers
@ 2022-01-08  1:33           ` Mathieu Desnoyers
  1 sibling, 0 replies; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-08  1:33 UTC (permalink / raw)
  To: David Laight
  Cc: Florian Weimer, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 7, 2022, at 8:08 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Jan 7, 2022, at 5:27 PM, David Laight David.Laight@ACULAB.COM wrote:
> 
>>> That being said, there might be an architecture agnostic alternative available.
>>> On abort of a RSEQ_CS_FLAG_ABORT_AT_IP critical section, we could let the kernel
>>> decrement/increment the stack pointer to make room for a pointer (depending if
>>> the
>>> stack grows down or up). It would then store the abort-at-ip value at the top of
>>> stack.
>> 
>> Stack redzone in a leaf function?
> 
> Good point!
> 
> For x86-64 for instance, which has a redzone of 128 bytes, there are a few
> alternatives.
> On abort:
> 
> Approach A) Move the stack pointer as little as possible
> 
>   1. On abort in kernel:
>   1.1 Subtract 8 bytes from the stack pointer
>   1.2 Store the abort-at-ip value at stack pointer - 128
>   2. In abort handler (userspace)
>   2.1 user-space knows that it can find the abort-at-ip value at stack pointer -
>   128
>   2.2 after using that value, the abort handler needs to add 8 bytes to the stack
>   pointer
> 
> Approach B) Move the stack pointer to skip the entire redzone
> 
>   1. On abort in kernel:
>   1.1 Subtract 128 bytes from the stack pointer
>   1.2 Store the abort-at-ip value at stack pointer - 8 (basically within a new red
>   zone)
>   2. In abort handler (userspace)
>   2.1 user-space knows that it can find the abort-at-ip value at stack pointer - 8
>   2.2 after using that value, the abort handler needs to add 128 bytes to the
>   stack pointer

Or approach C)

  1. On abort in kernel:
  1.1 Subtract 136 bytes from the stack pointer
  1.2 Store the abort-at-ip value at stack pointer
  2. In abort handler (userspace)
  2.1 user-space knows that it can find the abort-at-ip value at stack pointer, so
      it can pop it from the stack
  2.2 after popping that value, the abort handler needs to add 128 bytes to the stack pointer

So far, approach C seems to be the most elegant IMHO.

Thanks,

Mathieu


> 
> I suspect both approaches should work.
> 
> Any preference or other ideas ?
> 
> Thanks,
> 
> Mathieu
> 
>> 
>> -
>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT,
>> UK
>> Registration No: 1397386 (Wales)
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-07 19:48   ` Mathieu Desnoyers
  2022-01-07 21:27     ` Mathieu Desnoyers
@ 2022-01-12 15:16     ` Florian Weimer
  2022-01-12 15:26       ` Mathieu Desnoyers
  2022-01-12 15:38       ` Peter Zijlstra
  1 sibling, 2 replies; 15+ messages in thread
From: Florian Weimer @ 2022-01-12 15:16 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Mathieu Desnoyers:

> ----- On Jan 7, 2022, at 2:31 PM, Florian Weimer fw@deneb.enyo.de wrote:
>
>> * Mathieu Desnoyers:
>> 
>>> Allow rseq critical section abort handlers to optionally figure out at
>>> which instruction pointer the rseq critical section was aborted.
>>>
>>> This allows implementing rseq critical sections containing loops, in
>>> which case the commit side-effect cannot be the last instruction. This
>>> is useful to implement adaptative mutexes aware of preemption in
>>> user-space. (see [1])
>> 
>> Could you write the program counter to the rseq area instead?  This
>> would avoid discussing which register to clobber.
>
> Using the rseq area for that purpose would be problematic for nested signal
> handlers with rseq critical sections. If a signal happens to be delivered
> right after the abort ip adjustment, its signal handler containing a rseq
> critical section could overwrite the relevant "abort-at-ip" field in the
> rseq per-thread area before it has been read by the abort handler interrupted
> by the signal.
>
> Making this architecture-agnostic is indeed a laudable goal, but I don't
> think the rseq per-thread area is a good fit for this.
>
> I also though about making the clobbered register configurable on a
> per-critical-section basis, but I rather think that it would be
> overengineered: too much complexity for the gain. Unless there are
> very strong reasons for choosing one register over another on a per
> use-case basis ?

You could perhaps push a signal frame onto the stack.  It's going to
be expensive, but it's already in the context switch path, so maybe it
does not matter.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-12 15:16     ` Florian Weimer
@ 2022-01-12 15:26       ` Mathieu Desnoyers
  2022-01-12 15:38       ` Peter Zijlstra
  1 sibling, 0 replies; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-12 15:26 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 12, 2022, at 10:16 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> ----- On Jan 7, 2022, at 2:31 PM, Florian Weimer fw@deneb.enyo.de wrote:
>>
>>> * Mathieu Desnoyers:
>>> 
>>>> Allow rseq critical section abort handlers to optionally figure out at
>>>> which instruction pointer the rseq critical section was aborted.
>>>>
>>>> This allows implementing rseq critical sections containing loops, in
>>>> which case the commit side-effect cannot be the last instruction. This
>>>> is useful to implement adaptative mutexes aware of preemption in
>>>> user-space. (see [1])
>>> 
>>> Could you write the program counter to the rseq area instead?  This
>>> would avoid discussing which register to clobber.
>>
>> Using the rseq area for that purpose would be problematic for nested signal
>> handlers with rseq critical sections. If a signal happens to be delivered
>> right after the abort ip adjustment, its signal handler containing a rseq
>> critical section could overwrite the relevant "abort-at-ip" field in the
>> rseq per-thread area before it has been read by the abort handler interrupted
>> by the signal.
>>
>> Making this architecture-agnostic is indeed a laudable goal, but I don't
>> think the rseq per-thread area is a good fit for this.
>>
>> I also though about making the clobbered register configurable on a
>> per-critical-section basis, but I rather think that it would be
>> overengineered: too much complexity for the gain. Unless there are
>> very strong reasons for choosing one register over another on a per
>> use-case basis ?
> 
> You could perhaps push a signal frame onto the stack.  It's going to
> be expensive, but it's already in the context switch path, so maybe it
> does not matter.

The route I'm taking in my subsequent version of the patch is very close to
pushing a signal frame: on abort, skip the redzone, and push the abort-at-ip
pointer. Then abort handler is then expected to pop the abort-at-ip pointer
and unskip the redzone.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-12 15:16     ` Florian Weimer
  2022-01-12 15:26       ` Mathieu Desnoyers
@ 2022-01-12 15:38       ` Peter Zijlstra
  2022-01-12 16:00         ` Florian Weimer
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2022-01-12 15:38 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Mathieu Desnoyers, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

On Wed, Jan 12, 2022 at 04:16:36PM +0100, Florian Weimer wrote:

> You could perhaps push a signal frame onto the stack.  It's going to
> be expensive, but it's already in the context switch path, so maybe it
> does not matter.

Please no! Signals are a trainwreck that need change (see the whole
AVX-512 / AMX saga), we shouldn't use more of that just cause.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-12 15:38       ` Peter Zijlstra
@ 2022-01-12 16:00         ` Florian Weimer
  2022-01-12 16:38           ` Mathieu Desnoyers
  0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-01-12 16:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mathieu Desnoyers, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Peter Zijlstra:

> On Wed, Jan 12, 2022 at 04:16:36PM +0100, Florian Weimer wrote:
>
>> You could perhaps push a signal frame onto the stack.  It's going to
>> be expensive, but it's already in the context switch path, so maybe it
>> does not matter.
>
> Please no! Signals are a trainwreck that need change (see the whole
> AVX-512 / AMX saga), we shouldn't use more of that just cause.

If it's a signal, it should be modeled as such.  I think it's pretty
close to a synchronous signal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-12 16:00         ` Florian Weimer
@ 2022-01-12 16:38           ` Mathieu Desnoyers
  2022-01-12 21:00             ` Florian Weimer
  0 siblings, 1 reply; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-12 16:38 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 12, 2022, at 11:00 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Peter Zijlstra:
> 
>> On Wed, Jan 12, 2022 at 04:16:36PM +0100, Florian Weimer wrote:
>>
>>> You could perhaps push a signal frame onto the stack.  It's going to
>>> be expensive, but it's already in the context switch path, so maybe it
>>> does not matter.
>>
>> Please no! Signals are a trainwreck that need change (see the whole
>> AVX-512 / AMX saga), we shouldn't use more of that just cause.
> 
> If it's a signal, it should be modeled as such.  I think it's pretty
> close to a synchronous signal.

Florian, just to validate here: is your argument about AVX-512/AMX or about
rseq abort-at-ip ?

Am I understanding correctly that you would prefer that the kernel push an entire
signal frame onto the stack rather than just push the abort-at-ip value
on abort ? If it is the case, are there advantages in doing so ? Tooling support ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-12 16:38           ` Mathieu Desnoyers
@ 2022-01-12 21:00             ` Florian Weimer
  2022-01-12 21:24               ` Mathieu Desnoyers
  0 siblings, 1 reply; 15+ messages in thread
From: Florian Weimer @ 2022-01-12 21:00 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Mathieu Desnoyers:

> ----- On Jan 12, 2022, at 11:00 AM, Florian Weimer fw@deneb.enyo.de wrote:
>
>> * Peter Zijlstra:
>> 
>>> On Wed, Jan 12, 2022 at 04:16:36PM +0100, Florian Weimer wrote:
>>>
>>>> You could perhaps push a signal frame onto the stack.  It's going to
>>>> be expensive, but it's already in the context switch path, so maybe it
>>>> does not matter.
>>>
>>> Please no! Signals are a trainwreck that need change (see the whole
>>> AVX-512 / AMX saga), we shouldn't use more of that just cause.
>> 
>> If it's a signal, it should be modeled as such.  I think it's pretty
>> close to a synchronous signal.

(an asynchronous signal)

> Florian, just to validate here: is your argument about AVX-512/AMX or about
> rseq abort-at-ip ?

rseq abort-at-ip.  I wonder if it is possible to use regular stack
unwinding (through the signal frame) to figure out where the abort
happened, and use the existing cleanup handler functionality in GCC.
(Although -fnon-call-exceptions is not quite up to this, but in theory
we would have to fix this for POSIX asynchronous cancellation/Ada
asynchronous transfer of control support anyway.)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] rseq: x86: implement abort-at-ip extension
  2022-01-12 21:00             ` Florian Weimer
@ 2022-01-12 21:24               ` Mathieu Desnoyers
  0 siblings, 0 replies; 15+ messages in thread
From: Mathieu Desnoyers @ 2022-01-12 21:24 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jan 12, 2022, at 4:00 PM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> ----- On Jan 12, 2022, at 11:00 AM, Florian Weimer fw@deneb.enyo.de wrote:
>>
>>> * Peter Zijlstra:
>>> 
>>>> On Wed, Jan 12, 2022 at 04:16:36PM +0100, Florian Weimer wrote:
>>>>
>>>>> You could perhaps push a signal frame onto the stack.  It's going to
>>>>> be expensive, but it's already in the context switch path, so maybe it
>>>>> does not matter.
>>>>
>>>> Please no! Signals are a trainwreck that need change (see the whole
>>>> AVX-512 / AMX saga), we shouldn't use more of that just cause.
>>> 
>>> If it's a signal, it should be modeled as such.  I think it's pretty
>>> close to a synchronous signal.
> 
> (an asynchronous signal)
> 
>> Florian, just to validate here: is your argument about AVX-512/AMX or about
>> rseq abort-at-ip ?
> 
> rseq abort-at-ip.  I wonder if it is possible to use regular stack
> unwinding (through the signal frame) to figure out where the abort
> happened, and use the existing cleanup handler functionality in GCC.
> (Although -fnon-call-exceptions is not quite up to this, but in theory
> we would have to fix this for POSIX asynchronous cancellation/Ada
> asynchronous transfer of control support anyway.)

OK, so if we take x86-64 as an example, the abort would do:

/* Skip redzone and sigframe */
regs->sp -= 128 + sizeof(struct rt_sigframe);

It's unclear to me what should be the size of rt_sigframe here. Should it
include floating point state as well ?

Then at regs->sp, we can store a signal frame. The interesting bits would
go into (struct rt_sigframe  __user *)->uc which is a struct sigcontext.
It would contain all register state at the point where the critical section
was aborted.

The only register state we really care about is sigcontext "rip", given that
the userspace abort handler can save all other relevant registers by itself.

Then should we populate the other struct rt_sigframe fields or just zero them ?
I wonder what the meaning of "pretcode" and "struct siginfo" would be in the
context of an rseq abort ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-01-12 21:24 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-07 17:03 [RFC PATCH] rseq: x86: implement abort-at-ip extension Mathieu Desnoyers
2022-01-07 19:31 ` Florian Weimer
2022-01-07 19:48   ` Mathieu Desnoyers
2022-01-07 21:27     ` Mathieu Desnoyers
2022-01-07 22:27       ` David Laight
2022-01-08  1:08         ` Mathieu Desnoyers
2022-01-08  1:33           ` Mathieu Desnoyers
2022-01-08  1:33           ` Mathieu Desnoyers
2022-01-12 15:16     ` Florian Weimer
2022-01-12 15:26       ` Mathieu Desnoyers
2022-01-12 15:38       ` Peter Zijlstra
2022-01-12 16:00         ` Florian Weimer
2022-01-12 16:38           ` Mathieu Desnoyers
2022-01-12 21:00             ` Florian Weimer
2022-01-12 21:24               ` Mathieu Desnoyers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.