linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] Add support for synchronous signals on perf events
@ 2021-02-23 14:34 Marco Elver
  2021-02-23 14:34 ` [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children Marco Elver
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Marco Elver @ 2021-02-23 14:34 UTC (permalink / raw)
  To: elver, peterz, alexander.shishkin, acme, mingo, jolsa,
	mark.rutland, namhyung, tglx
  Cc: glider, viro, arnd, christian, dvyukov, jannh, axboe, mascasa,
	pcc, irogers, kasan-dev, linux-arch, linux-fsdevel, linux-kernel,
	linux-m68k, x86

The perf subsystem today unifies various tracing and monitoring
features, from both software and hardware. One benefit of the perf
subsystem is automatically inheriting events to child tasks, which
enables process-wide events monitoring with low overheads. By default
perf events are non-intrusive, not affecting behaviour of the tasks
being monitored.

For certain use-cases, however, it makes sense to leverage the
generality of the perf events subsystem and optionally allow the tasks
being monitored to receive signals on events they are interested in.
This patch series adds the option to synchronously signal user space on
events.

The discussion at [1] led to the changes proposed in this series. The
approach taken in patch 3/4 to use 'event_limit' to trigger the signal
was kindly suggested by Peter Zijlstra in [2].

[1] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZD=-XQ@mail.gmail.com/
[2] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/ 

Motivation and example uses:

1. 	Our immediate motivation is low-overhead sampling-based race
	detection for user-space [3]. By using perf_event_open() at
	process initialization, we can create hardware
	breakpoint/watchpoint events that are propagated automatically
	to all threads in a process. As far as we are aware, today no
	existing kernel facility (such as ptrace) allows us to set up
	process-wide watchpoints with minimal overheads (that are
	comparable to mprotect() of whole pages).

	[3] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf 

2.	Other low-overhead error detectors that rely on detecting
	accesses to certain memory locations or code, process-wide and
	also only in a specific set of subtasks or threads.

Other example use-cases we found potentially interesting:

3.	Code hot patching without full stop-the-world. Specifically, by
	setting a code breakpoint to entry to the patched routine, then
	send signals to threads and check that they are not in the
	routine, but without stopping them further. If any of the
	threads will enter the routine, it will receive SIGTRAP and
	pause.

4. 	Safepoints without mprotect(). Some Java implementations use
	"load from a known memory location" as a safepoint. When threads
	need to be stopped, the page containing the location is
	mprotect()ed and threads get a signal. This can be replaced with
	a watchpoint, which does not require a whole page nor DTLB
	shootdowns.

5.	Tracking data flow globally.

6.	Threads receiving signals on performance events to
	throttle/unthrottle themselves.


Marco Elver (4):
  perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
  signal: Introduce TRAP_PERF si_code and si_perf to siginfo
  perf/core: Add support for SIGTRAP on perf events
  perf/core: Add breakpoint information to siginfo on SIGTRAP

 arch/m68k/kernel/signal.c          |  3 ++
 arch/x86/kernel/signal_compat.c    |  5 ++-
 fs/signalfd.c                      |  4 +++
 include/linux/compat.h             |  2 ++
 include/linux/signal.h             |  1 +
 include/uapi/asm-generic/siginfo.h |  6 +++-
 include/uapi/linux/perf_event.h    |  3 +-
 include/uapi/linux/signalfd.h      |  4 ++-
 kernel/events/core.c               | 54 +++++++++++++++++++++++++++++-
 kernel/signal.c                    | 11 ++++++
 10 files changed, 88 insertions(+), 5 deletions(-)

-- 
2.30.0.617.g56c4b15f3c-goog


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
  2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
@ 2021-02-23 14:34 ` Marco Elver
  2021-02-23 14:48   ` Dmitry Vyukov
  2021-02-23 14:34 ` [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo Marco Elver
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Marco Elver @ 2021-02-23 14:34 UTC (permalink / raw)
  To: elver, peterz, alexander.shishkin, acme, mingo, jolsa,
	mark.rutland, namhyung, tglx
  Cc: glider, viro, arnd, christian, dvyukov, jannh, axboe, mascasa,
	pcc, irogers, kasan-dev, linux-arch, linux-fsdevel, linux-kernel,
	linux-m68k, x86

As with other ioctls (such as PERF_EVENT_IOC_{ENABLE,DISABLE}), fix up
handling of PERF_EVENT_IOC_MODIFY_ATTRIBUTES to also apply to children.

Link: https://lkml.kernel.org/r/YBqVaY8aTMYtoUnX@hirez.programming.kicks-ass.net
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
---
 kernel/events/core.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 129dee540a8b..37a8297be164 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3179,16 +3179,36 @@ static int perf_event_modify_breakpoint(struct perf_event *bp,
 static int perf_event_modify_attr(struct perf_event *event,
 				  struct perf_event_attr *attr)
 {
+	int (*func)(struct perf_event *, struct perf_event_attr *);
+	struct perf_event *child;
+	int err;
+
 	if (event->attr.type != attr->type)
 		return -EINVAL;
 
 	switch (event->attr.type) {
 	case PERF_TYPE_BREAKPOINT:
-		return perf_event_modify_breakpoint(event, attr);
+		func = perf_event_modify_breakpoint;
+		break;
 	default:
 		/* Place holder for future additions. */
 		return -EOPNOTSUPP;
 	}
+
+	WARN_ON_ONCE(event->ctx->parent_ctx);
+
+	mutex_lock(&event->child_mutex);
+	err = func(event, attr);
+	if (err)
+		goto out;
+	list_for_each_entry(child, &event->child_list, child_list) {
+		err = func(child, attr);
+		if (err)
+			goto out;
+	}
+out:
+	mutex_unlock(&event->child_mutex);
+	return err;
 }
 
 static void ctx_sched_out(struct perf_event_context *ctx,
-- 
2.30.0.617.g56c4b15f3c-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo
  2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
  2021-02-23 14:34 ` [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children Marco Elver
@ 2021-02-23 14:34 ` Marco Elver
  2021-02-23 18:01   ` Geert Uytterhoeven
  2021-02-23 14:34 ` [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events Marco Elver
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Marco Elver @ 2021-02-23 14:34 UTC (permalink / raw)
  To: elver, peterz, alexander.shishkin, acme, mingo, jolsa,
	mark.rutland, namhyung, tglx
  Cc: glider, viro, arnd, christian, dvyukov, jannh, axboe, mascasa,
	pcc, irogers, kasan-dev, linux-arch, linux-fsdevel, linux-kernel,
	linux-m68k, x86

Introduces the TRAP_PERF si_code, and associated siginfo_t field
si_perf. These will be used by the perf event subsystem to send signals
(if requested) to the task where an event occurred.

Signed-off-by: Marco Elver <elver@google.com>
---
 arch/m68k/kernel/signal.c          |  3 +++
 arch/x86/kernel/signal_compat.c    |  5 ++++-
 fs/signalfd.c                      |  4 ++++
 include/linux/compat.h             |  2 ++
 include/linux/signal.h             |  1 +
 include/uapi/asm-generic/siginfo.h |  6 +++++-
 include/uapi/linux/signalfd.h      |  4 +++-
 kernel/signal.c                    | 11 +++++++++++
 8 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index 349570f16a78..a4b7ee1df211 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -622,6 +622,9 @@ static inline void siginfo_build_tests(void)
 	/* _sigfault._addr_pkey */
 	BUILD_BUG_ON(offsetof(siginfo_t, si_pkey) != 0x12);
 
+	/* _sigfault._perf */
+	BUILD_BUG_ON(offsetof(siginfo_t, si_perf) != 0x10);
+
 	/* _sigpoll */
 	BUILD_BUG_ON(offsetof(siginfo_t, si_band)   != 0x0c);
 	BUILD_BUG_ON(offsetof(siginfo_t, si_fd)     != 0x10);
diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
index a5330ff498f0..0e5d0a7e203b 100644
--- a/arch/x86/kernel/signal_compat.c
+++ b/arch/x86/kernel/signal_compat.c
@@ -29,7 +29,7 @@ static inline void signal_compat_build_tests(void)
 	BUILD_BUG_ON(NSIGFPE  != 15);
 	BUILD_BUG_ON(NSIGSEGV != 9);
 	BUILD_BUG_ON(NSIGBUS  != 5);
-	BUILD_BUG_ON(NSIGTRAP != 5);
+	BUILD_BUG_ON(NSIGTRAP != 6);
 	BUILD_BUG_ON(NSIGCHLD != 6);
 	BUILD_BUG_ON(NSIGSYS  != 2);
 
@@ -138,6 +138,9 @@ static inline void signal_compat_build_tests(void)
 	BUILD_BUG_ON(offsetof(siginfo_t, si_pkey) != 0x20);
 	BUILD_BUG_ON(offsetof(compat_siginfo_t, si_pkey) != 0x14);
 
+	BUILD_BUG_ON(offsetof(siginfo_t, si_perf) != 0x18);
+	BUILD_BUG_ON(offsetof(compat_siginfo_t, si_perf) != 0x10);
+
 	CHECK_CSI_OFFSET(_sigpoll);
 	CHECK_CSI_SIZE  (_sigpoll, 2*sizeof(int));
 	CHECK_SI_SIZE   (_sigpoll, 4*sizeof(int));
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 456046e15873..040a1142915f 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -134,6 +134,10 @@ static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo,
 #endif
 		new.ssi_addr_lsb = (short) kinfo->si_addr_lsb;
 		break;
+	case SIL_PERF_EVENT:
+		new.ssi_addr = (long) kinfo->si_addr;
+		new.ssi_perf = kinfo->si_perf;
+		break;
 	case SIL_CHLD:
 		new.ssi_pid    = kinfo->si_pid;
 		new.ssi_uid    = kinfo->si_uid;
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 6e65be753603..c8821d966812 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -236,6 +236,8 @@ typedef struct compat_siginfo {
 					char _dummy_pkey[__COMPAT_ADDR_BND_PKEY_PAD];
 					u32 _pkey;
 				} _addr_pkey;
+				/* used when si_code=TRAP_PERF */
+				compat_u64 _perf;
 			};
 		} _sigfault;
 
diff --git a/include/linux/signal.h b/include/linux/signal.h
index 205526c4003a..1e98548d7cf6 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -43,6 +43,7 @@ enum siginfo_layout {
 	SIL_FAULT_MCEERR,
 	SIL_FAULT_BNDERR,
 	SIL_FAULT_PKUERR,
+	SIL_PERF_EVENT,
 	SIL_CHLD,
 	SIL_RT,
 	SIL_SYS,
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index d2597000407a..d0bb9125c853 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -91,6 +91,8 @@ union __sifields {
 				char _dummy_pkey[__ADDR_BND_PKEY_PAD];
 				__u32 _pkey;
 			} _addr_pkey;
+			/* used when si_code=TRAP_PERF */
+			__u64 _perf;
 		};
 	} _sigfault;
 
@@ -155,6 +157,7 @@ typedef struct siginfo {
 #define si_lower	_sifields._sigfault._addr_bnd._lower
 #define si_upper	_sifields._sigfault._addr_bnd._upper
 #define si_pkey		_sifields._sigfault._addr_pkey._pkey
+#define si_perf		_sifields._sigfault._perf
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
 #define si_call_addr	_sifields._sigsys._call_addr
@@ -253,7 +256,8 @@ typedef struct siginfo {
 #define TRAP_BRANCH     3	/* process taken branch trap */
 #define TRAP_HWBKPT     4	/* hardware breakpoint/watchpoint */
 #define TRAP_UNK	5	/* undiagnosed trap */
-#define NSIGTRAP	5
+#define TRAP_PERF	6	/* perf event with sigtrap=1 */
+#define NSIGTRAP	6
 
 /*
  * There is an additional set of SIGTRAP si_codes used by ptrace
diff --git a/include/uapi/linux/signalfd.h b/include/uapi/linux/signalfd.h
index 83429a05b698..7e333042c7e3 100644
--- a/include/uapi/linux/signalfd.h
+++ b/include/uapi/linux/signalfd.h
@@ -39,6 +39,8 @@ struct signalfd_siginfo {
 	__s32 ssi_syscall;
 	__u64 ssi_call_addr;
 	__u32 ssi_arch;
+	__u32 __pad3;
+	__u64 ssi_perf;
 
 	/*
 	 * Pad strcture to 128 bytes. Remember to update the
@@ -49,7 +51,7 @@ struct signalfd_siginfo {
 	 * comes out of a read(2) and we really don't want to have
 	 * a compat on read(2).
 	 */
-	__u8 __pad[28];
+	__u8 __pad[16];
 };
 
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 5ad8566534e7..943c98782634 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1199,6 +1199,7 @@ static inline bool has_si_pid_and_uid(struct kernel_siginfo *info)
 	case SIL_FAULT_MCEERR:
 	case SIL_FAULT_BNDERR:
 	case SIL_FAULT_PKUERR:
+	case SIL_PERF_EVENT:
 	case SIL_SYS:
 		ret = false;
 		break;
@@ -2531,6 +2532,7 @@ static void hide_si_addr_tag_bits(struct ksignal *ksig)
 	case SIL_FAULT_MCEERR:
 	case SIL_FAULT_BNDERR:
 	case SIL_FAULT_PKUERR:
+	case SIL_PERF_EVENT:
 		ksig->info.si_addr = arch_untagged_si_addr(
 			ksig->info.si_addr, ksig->sig, ksig->info.si_code);
 		break;
@@ -3333,6 +3335,10 @@ void copy_siginfo_to_external32(struct compat_siginfo *to,
 #endif
 		to->si_pkey = from->si_pkey;
 		break;
+	case SIL_PERF_EVENT:
+		to->si_addr = ptr_to_compat(from->si_addr);
+		to->si_perf = from->si_perf;
+		break;
 	case SIL_CHLD:
 		to->si_pid = from->si_pid;
 		to->si_uid = from->si_uid;
@@ -3413,6 +3419,10 @@ static int post_copy_siginfo_from_user32(kernel_siginfo_t *to,
 #endif
 		to->si_pkey = from->si_pkey;
 		break;
+	case SIL_PERF_EVENT:
+		to->si_addr = compat_ptr(from->si_addr);
+		to->si_perf = from->si_perf;
+		break;
 	case SIL_CHLD:
 		to->si_pid    = from->si_pid;
 		to->si_uid    = from->si_uid;
@@ -4593,6 +4603,7 @@ static inline void siginfo_buildtime_checks(void)
 	CHECK_OFFSET(si_lower);
 	CHECK_OFFSET(si_upper);
 	CHECK_OFFSET(si_pkey);
+	CHECK_OFFSET(si_perf);
 
 	/* sigpoll */
 	CHECK_OFFSET(si_band);
-- 
2.30.0.617.g56c4b15f3c-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events
  2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
  2021-02-23 14:34 ` [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children Marco Elver
  2021-02-23 14:34 ` [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo Marco Elver
@ 2021-02-23 14:34 ` Marco Elver
  2021-02-23 14:57   ` Dmitry Vyukov
  2021-02-23 18:13   ` Dmitry Vyukov
  2021-02-23 14:34 ` [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP Marco Elver
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Marco Elver @ 2021-02-23 14:34 UTC (permalink / raw)
  To: elver, peterz, alexander.shishkin, acme, mingo, jolsa,
	mark.rutland, namhyung, tglx
  Cc: glider, viro, arnd, christian, dvyukov, jannh, axboe, mascasa,
	pcc, irogers, kasan-dev, linux-arch, linux-fsdevel, linux-kernel,
	linux-m68k, x86

Adds bit perf_event_attr::sigtrap, which can be set to cause events to
send SIGTRAP (with si_code TRAP_PERF) to the task where the event
occurred. To distinguish perf events and allow user space to decode
si_perf (if set), the event type is set in si_errno.

The primary motivation is to support synchronous signals on perf events
in the task where an event (such as breakpoints) triggered.

Link: https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
---
 include/uapi/linux/perf_event.h |  3 ++-
 kernel/events/core.c            | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ad15e40d7f5d..b9cc6829a40c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -389,7 +389,8 @@ struct perf_event_attr {
 				cgroup         :  1, /* include cgroup events */
 				text_poke      :  1, /* include text poke events */
 				build_id       :  1, /* use build id in mmap2 events */
-				__reserved_1   : 29;
+				sigtrap        :  1, /* send synchronous SIGTRAP on event */
+				__reserved_1   : 28;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 37a8297be164..8718763045fd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6288,6 +6288,17 @@ void perf_event_wakeup(struct perf_event *event)
 	}
 }
 
+static void perf_sigtrap(struct perf_event *event)
+{
+	struct kernel_siginfo info;
+
+	clear_siginfo(&info);
+	info.si_signo = SIGTRAP;
+	info.si_code = TRAP_PERF;
+	info.si_errno = event->attr.type;
+	force_sig_info(&info);
+}
+
 static void perf_pending_event_disable(struct perf_event *event)
 {
 	int cpu = READ_ONCE(event->pending_disable);
@@ -6297,6 +6308,13 @@ static void perf_pending_event_disable(struct perf_event *event)
 
 	if (cpu == smp_processor_id()) {
 		WRITE_ONCE(event->pending_disable, -1);
+
+		if (event->attr.sigtrap) {
+			atomic_inc(&event->event_limit); /* rearm event */
+			perf_sigtrap(event);
+			return;
+		}
+
 		perf_event_disable_local(event);
 		return;
 	}
@@ -11325,6 +11343,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	event->state		= PERF_EVENT_STATE_INACTIVE;
 
+	if (event->attr.sigtrap)
+		atomic_set(&event->event_limit, 1);
+
 	if (task) {
 		event->attach_state = PERF_ATTACH_TASK;
 		/*
-- 
2.30.0.617.g56c4b15f3c-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP
  2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
                   ` (2 preceding siblings ...)
  2021-02-23 14:34 ` [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events Marco Elver
@ 2021-02-23 14:34 ` Marco Elver
  2021-02-23 15:01   ` Dmitry Vyukov
  2021-02-23 20:03 ` [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
  2021-02-23 20:27 ` Andy Lutomirski
  5 siblings, 1 reply; 17+ messages in thread
From: Marco Elver @ 2021-02-23 14:34 UTC (permalink / raw)
  To: elver, peterz, alexander.shishkin, acme, mingo, jolsa,
	mark.rutland, namhyung, tglx
  Cc: glider, viro, arnd, christian, dvyukov, jannh, axboe, mascasa,
	pcc, irogers, kasan-dev, linux-arch, linux-fsdevel, linux-kernel,
	linux-m68k, x86

Encode information from breakpoint attributes into siginfo_t, which
helps disambiguate which breakpoint fired.

Note, providing the event fd may be unreliable, since the event may have
been modified (via PERF_EVENT_IOC_MODIFY_ATTRIBUTES) between the event
triggering and the signal being delivered to user space.

Signed-off-by: Marco Elver <elver@google.com>
---
 kernel/events/core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8718763045fd..d7908322d796 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6296,6 +6296,17 @@ static void perf_sigtrap(struct perf_event *event)
 	info.si_signo = SIGTRAP;
 	info.si_code = TRAP_PERF;
 	info.si_errno = event->attr.type;
+
+	switch (event->attr.type) {
+	case PERF_TYPE_BREAKPOINT:
+		info.si_addr = (void *)(unsigned long)event->attr.bp_addr;
+		info.si_perf = (event->attr.bp_len << 16) | (u64)event->attr.bp_type;
+		break;
+	default:
+		/* No additional info set. */
+		break;
+	}
+
 	force_sig_info(&info);
 }
 
-- 
2.30.0.617.g56c4b15f3c-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
  2021-02-23 14:34 ` [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children Marco Elver
@ 2021-02-23 14:48   ` Dmitry Vyukov
  0 siblings, 0 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2021-02-23 14:48 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, Feb 23, 2021 at 3:34 PM Marco Elver <elver@google.com> wrote:
>
> As with other ioctls (such as PERF_EVENT_IOC_{ENABLE,DISABLE}), fix up
> handling of PERF_EVENT_IOC_MODIFY_ATTRIBUTES to also apply to children.
>
> Link: https://lkml.kernel.org/r/YBqVaY8aTMYtoUnX@hirez.programming.kicks-ass.net
> Suggested-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Marco Elver <elver@google.com>

Reviewed-by: Dmitry Vyukov <dvyukov@google.com>


> ---
>  kernel/events/core.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 129dee540a8b..37a8297be164 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3179,16 +3179,36 @@ static int perf_event_modify_breakpoint(struct perf_event *bp,
>  static int perf_event_modify_attr(struct perf_event *event,
>                                   struct perf_event_attr *attr)
>  {
> +       int (*func)(struct perf_event *, struct perf_event_attr *);
> +       struct perf_event *child;
> +       int err;
> +
>         if (event->attr.type != attr->type)
>                 return -EINVAL;
>
>         switch (event->attr.type) {
>         case PERF_TYPE_BREAKPOINT:
> -               return perf_event_modify_breakpoint(event, attr);
> +               func = perf_event_modify_breakpoint;
> +               break;
>         default:
>                 /* Place holder for future additions. */
>                 return -EOPNOTSUPP;
>         }
> +
> +       WARN_ON_ONCE(event->ctx->parent_ctx);
> +
> +       mutex_lock(&event->child_mutex);
> +       err = func(event, attr);
> +       if (err)
> +               goto out;
> +       list_for_each_entry(child, &event->child_list, child_list) {
> +               err = func(child, attr);
> +               if (err)
> +                       goto out;
> +       }
> +out:
> +       mutex_unlock(&event->child_mutex);
> +       return err;
>  }
>
>  static void ctx_sched_out(struct perf_event_context *ctx,
> --
> 2.30.0.617.g56c4b15f3c-goog
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events
  2021-02-23 14:34 ` [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events Marco Elver
@ 2021-02-23 14:57   ` Dmitry Vyukov
  2021-02-23 18:13   ` Dmitry Vyukov
  1 sibling, 0 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2021-02-23 14:57 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, Feb 23, 2021 at 3:34 PM Marco Elver <elver@google.com> wrote:
>
> Adds bit perf_event_attr::sigtrap, which can be set to cause events to
> send SIGTRAP (with si_code TRAP_PERF) to the task where the event
> occurred. To distinguish perf events and allow user space to decode
> si_perf (if set), the event type is set in si_errno.
>
> The primary motivation is to support synchronous signals on perf events
> in the task where an event (such as breakpoints) triggered.
>
> Link: https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  include/uapi/linux/perf_event.h |  3 ++-
>  kernel/events/core.c            | 21 +++++++++++++++++++++
>  2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index ad15e40d7f5d..b9cc6829a40c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -389,7 +389,8 @@ struct perf_event_attr {
>                                 cgroup         :  1, /* include cgroup events */
>                                 text_poke      :  1, /* include text poke events */
>                                 build_id       :  1, /* use build id in mmap2 events */
> -                               __reserved_1   : 29;
> +                               sigtrap        :  1, /* send synchronous SIGTRAP on event */
> +                               __reserved_1   : 28;
>
>         union {
>                 __u32           wakeup_events;    /* wakeup every n events */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 37a8297be164..8718763045fd 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6288,6 +6288,17 @@ void perf_event_wakeup(struct perf_event *event)
>         }
>  }
>
> +static void perf_sigtrap(struct perf_event *event)
> +{
> +       struct kernel_siginfo info;
> +
> +       clear_siginfo(&info);
> +       info.si_signo = SIGTRAP;
> +       info.si_code = TRAP_PERF;
> +       info.si_errno = event->attr.type;
> +       force_sig_info(&info);
> +}
> +
>  static void perf_pending_event_disable(struct perf_event *event)
>  {
>         int cpu = READ_ONCE(event->pending_disable);
> @@ -6297,6 +6308,13 @@ static void perf_pending_event_disable(struct perf_event *event)
>
>         if (cpu == smp_processor_id()) {
>                 WRITE_ONCE(event->pending_disable, -1);
> +
> +               if (event->attr.sigtrap) {
> +                       atomic_inc(&event->event_limit); /* rearm event */

Can/should this be atomic_set(&event->event_limit, 1)? It should only
go between 1 and 0, right?
Otherwise:

Acked-by: Dmitry Vyukov <dvyukov@google.com>


> +                       perf_sigtrap(event);
> +                       return;
> +               }
> +
>                 perf_event_disable_local(event);
>                 return;
>         }
> @@ -11325,6 +11343,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
>
>         event->state            = PERF_EVENT_STATE_INACTIVE;
>
> +       if (event->attr.sigtrap)
> +               atomic_set(&event->event_limit, 1);
> +
>         if (task) {
>                 event->attach_state = PERF_ATTACH_TASK;
>                 /*
> --
> 2.30.0.617.g56c4b15f3c-goog
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP
  2021-02-23 14:34 ` [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP Marco Elver
@ 2021-02-23 15:01   ` Dmitry Vyukov
  2021-02-23 15:10     ` Marco Elver
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2021-02-23 15:01 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, Feb 23, 2021 at 3:34 PM Marco Elver <elver@google.com> wrote:
>
> Encode information from breakpoint attributes into siginfo_t, which
> helps disambiguate which breakpoint fired.
>
> Note, providing the event fd may be unreliable, since the event may have
> been modified (via PERF_EVENT_IOC_MODIFY_ATTRIBUTES) between the event
> triggering and the signal being delivered to user space.
>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  kernel/events/core.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8718763045fd..d7908322d796 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6296,6 +6296,17 @@ static void perf_sigtrap(struct perf_event *event)
>         info.si_signo = SIGTRAP;
>         info.si_code = TRAP_PERF;
>         info.si_errno = event->attr.type;
> +
> +       switch (event->attr.type) {
> +       case PERF_TYPE_BREAKPOINT:
> +               info.si_addr = (void *)(unsigned long)event->attr.bp_addr;
> +               info.si_perf = (event->attr.bp_len << 16) | (u64)event->attr.bp_type;
> +               break;
> +       default:
> +               /* No additional info set. */

Should we prohibit using attr.sigtrap for !PERF_TYPE_BREAKPOINT if we
don't know what info to pass yet?

> +               break;
> +       }
> +
>         force_sig_info(&info);
>  }
>
> --
> 2.30.0.617.g56c4b15f3c-goog
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP
  2021-02-23 15:01   ` Dmitry Vyukov
@ 2021-02-23 15:10     ` Marco Elver
  2021-02-23 15:16       ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Marco Elver @ 2021-02-23 15:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, 23 Feb 2021 at 16:01, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, Feb 23, 2021 at 3:34 PM Marco Elver <elver@google.com> wrote:
> >
> > Encode information from breakpoint attributes into siginfo_t, which
> > helps disambiguate which breakpoint fired.
> >
> > Note, providing the event fd may be unreliable, since the event may have
> > been modified (via PERF_EVENT_IOC_MODIFY_ATTRIBUTES) between the event
> > triggering and the signal being delivered to user space.
> >
> > Signed-off-by: Marco Elver <elver@google.com>
> > ---
> >  kernel/events/core.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 8718763045fd..d7908322d796 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -6296,6 +6296,17 @@ static void perf_sigtrap(struct perf_event *event)
> >         info.si_signo = SIGTRAP;
> >         info.si_code = TRAP_PERF;
> >         info.si_errno = event->attr.type;
> > +
> > +       switch (event->attr.type) {
> > +       case PERF_TYPE_BREAKPOINT:
> > +               info.si_addr = (void *)(unsigned long)event->attr.bp_addr;
> > +               info.si_perf = (event->attr.bp_len << 16) | (u64)event->attr.bp_type;
> > +               break;
> > +       default:
> > +               /* No additional info set. */
>
> Should we prohibit using attr.sigtrap for !PERF_TYPE_BREAKPOINT if we
> don't know what info to pass yet?

I don't think it's necessary. This way, by default we get support for
other perf events. If user space observes si_perf==0, then there's no
information available. That would require that any event type that
sets si_perf in future, must ensure that it sets si_perf!=0.

I can add a comment to document the requirement here (and user space
facing documentation should get a copy of how the info is encoded,
too).

Alternatively, we could set si_errno to 0 if no info is available, at
the cost of losing the type information for events not explicitly
listed here.

What do you prefer?

> > +               break;
> > +       }
> > +
> >         force_sig_info(&info);
> >  }
> >
> > --
> > 2.30.0.617.g56c4b15f3c-goog
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP
  2021-02-23 15:10     ` Marco Elver
@ 2021-02-23 15:16       ` Dmitry Vyukov
  2021-02-23 17:16         ` Marco Elver
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2021-02-23 15:16 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, Feb 23, 2021 at 4:10 PM 'Marco Elver' via kasan-dev
<kasan-dev@googlegroups.com> wrote:
> > > Encode information from breakpoint attributes into siginfo_t, which
> > > helps disambiguate which breakpoint fired.
> > >
> > > Note, providing the event fd may be unreliable, since the event may have
> > > been modified (via PERF_EVENT_IOC_MODIFY_ATTRIBUTES) between the event
> > > triggering and the signal being delivered to user space.
> > >
> > > Signed-off-by: Marco Elver <elver@google.com>
> > > ---
> > >  kernel/events/core.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index 8718763045fd..d7908322d796 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -6296,6 +6296,17 @@ static void perf_sigtrap(struct perf_event *event)
> > >         info.si_signo = SIGTRAP;
> > >         info.si_code = TRAP_PERF;
> > >         info.si_errno = event->attr.type;
> > > +
> > > +       switch (event->attr.type) {
> > > +       case PERF_TYPE_BREAKPOINT:
> > > +               info.si_addr = (void *)(unsigned long)event->attr.bp_addr;
> > > +               info.si_perf = (event->attr.bp_len << 16) | (u64)event->attr.bp_type;
> > > +               break;
> > > +       default:
> > > +               /* No additional info set. */
> >
> > Should we prohibit using attr.sigtrap for !PERF_TYPE_BREAKPOINT if we
> > don't know what info to pass yet?
>
> I don't think it's necessary. This way, by default we get support for
> other perf events. If user space observes si_perf==0, then there's no
> information available. That would require that any event type that
> sets si_perf in future, must ensure that it sets si_perf!=0.
>
> I can add a comment to document the requirement here (and user space
> facing documentation should get a copy of how the info is encoded,
> too).
>
> Alternatively, we could set si_errno to 0 if no info is available, at
> the cost of losing the type information for events not explicitly
> listed here.
>
> What do you prefer?

Ah, I see.
Let's wait for the opinions of other people. There are a number of
options for how to approach this.

> > > +               break;
> > > +       }
> > > +
> > >         force_sig_info(&info);
> > >  }
> > >
> > > --
> > > 2.30.0.617.g56c4b15f3c-goog
> > >
>
> --
> You received this message because you are subscribed to the Google Groups "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/CANpmjNP1wQvG0SNPP2L9QO%3Dnatf0XU8HXj-r2_-U4QZxtr-dVA%40mail.gmail.com.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP
  2021-02-23 15:16       ` Dmitry Vyukov
@ 2021-02-23 17:16         ` Marco Elver
  0 siblings, 0 replies; 17+ messages in thread
From: Marco Elver @ 2021-02-23 17:16 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, 23 Feb 2021 at 16:16, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, Feb 23, 2021 at 4:10 PM 'Marco Elver' via kasan-dev
> <kasan-dev@googlegroups.com> wrote:
> > > > Encode information from breakpoint attributes into siginfo_t, which
> > > > helps disambiguate which breakpoint fired.
> > > >
> > > > Note, providing the event fd may be unreliable, since the event may have
> > > > been modified (via PERF_EVENT_IOC_MODIFY_ATTRIBUTES) between the event
> > > > triggering and the signal being delivered to user space.
> > > >
> > > > Signed-off-by: Marco Elver <elver@google.com>
> > > > ---
> > > >  kernel/events/core.c | 11 +++++++++++
> > > >  1 file changed, 11 insertions(+)
> > > >
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index 8718763045fd..d7908322d796 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -6296,6 +6296,17 @@ static void perf_sigtrap(struct perf_event *event)
> > > >         info.si_signo = SIGTRAP;
> > > >         info.si_code = TRAP_PERF;
> > > >         info.si_errno = event->attr.type;
> > > > +
> > > > +       switch (event->attr.type) {
> > > > +       case PERF_TYPE_BREAKPOINT:
> > > > +               info.si_addr = (void *)(unsigned long)event->attr.bp_addr;
> > > > +               info.si_perf = (event->attr.bp_len << 16) | (u64)event->attr.bp_type;
> > > > +               break;
> > > > +       default:
> > > > +               /* No additional info set. */
> > >
> > > Should we prohibit using attr.sigtrap for !PERF_TYPE_BREAKPOINT if we
> > > don't know what info to pass yet?
> >
> > I don't think it's necessary. This way, by default we get support for
> > other perf events. If user space observes si_perf==0, then there's no
> > information available. That would require that any event type that
> > sets si_perf in future, must ensure that it sets si_perf!=0.
> >
> > I can add a comment to document the requirement here (and user space
> > facing documentation should get a copy of how the info is encoded,
> > too).
> >
> > Alternatively, we could set si_errno to 0 if no info is available, at
> > the cost of losing the type information for events not explicitly
> > listed here.

Note that PERF_TYPE_HARDWARE == 0, so setting si_errno to 0 does not
work. Which leaves us with:

1. Ensure si_perf==0 (or some other magic value) if no info is
available and !=0 otherwise.

2. Return error for events where we do not officially support
requesting sigtrap.

I'm currently leaning towards (1).

> > What do you prefer?
>
> Ah, I see.
> Let's wait for the opinions of other people. There are a number of
> options for how to approach this.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo
  2021-02-23 14:34 ` [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo Marco Elver
@ 2021-02-23 18:01   ` Geert Uytterhoeven
  2021-02-23 20:06     ` Arnd Bergmann
  0 siblings, 1 reply; 17+ messages in thread
From: Geert Uytterhoeven @ 2021-02-23 18:01 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Dmitry Vyukov, Jann Horn, Jens Axboe, mascasa,
	Peter Collingbourne, irogers, kasan-dev, Linux-Arch,
	Linux FS Devel, Linux Kernel Mailing List, linux-m68k,
	the arch/x86 maintainers

On Tue, Feb 23, 2021 at 3:52 PM Marco Elver <elver@google.com> wrote:
> Introduces the TRAP_PERF si_code, and associated siginfo_t field
> si_perf. These will be used by the perf event subsystem to send signals
> (if requested) to the task where an event occurred.
>
> Signed-off-by: Marco Elver <elver@google.com>

>  arch/m68k/kernel/signal.c          |  3 +++

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events
  2021-02-23 14:34 ` [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events Marco Elver
  2021-02-23 14:57   ` Dmitry Vyukov
@ 2021-02-23 18:13   ` Dmitry Vyukov
  1 sibling, 0 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2021-02-23 18:13 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, Feb 23, 2021 at 3:34 PM Marco Elver <elver@google.com> wrote:
>
> Adds bit perf_event_attr::sigtrap, which can be set to cause events to
> send SIGTRAP (with si_code TRAP_PERF) to the task where the event
> occurred. To distinguish perf events and allow user space to decode
> si_perf (if set), the event type is set in si_errno.
>
> The primary motivation is to support synchronous signals on perf events
> in the task where an event (such as breakpoints) triggered.
>
> Link: https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  include/uapi/linux/perf_event.h |  3 ++-
>  kernel/events/core.c            | 21 +++++++++++++++++++++
>  2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index ad15e40d7f5d..b9cc6829a40c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -389,7 +389,8 @@ struct perf_event_attr {
>                                 cgroup         :  1, /* include cgroup events */
>                                 text_poke      :  1, /* include text poke events */
>                                 build_id       :  1, /* use build id in mmap2 events */
> -                               __reserved_1   : 29;
> +                               sigtrap        :  1, /* send synchronous SIGTRAP on event */
> +                               __reserved_1   : 28;
>
>         union {
>                 __u32           wakeup_events;    /* wakeup every n events */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 37a8297be164..8718763045fd 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6288,6 +6288,17 @@ void perf_event_wakeup(struct perf_event *event)
>         }
>  }
>
> +static void perf_sigtrap(struct perf_event *event)
> +{
> +       struct kernel_siginfo info;
> +
> +       clear_siginfo(&info);
> +       info.si_signo = SIGTRAP;
> +       info.si_code = TRAP_PERF;
> +       info.si_errno = event->attr.type;
> +       force_sig_info(&info);
> +}
> +
>  static void perf_pending_event_disable(struct perf_event *event)
>  {
>         int cpu = READ_ONCE(event->pending_disable);
> @@ -6297,6 +6308,13 @@ static void perf_pending_event_disable(struct perf_event *event)
>
>         if (cpu == smp_processor_id()) {
>                 WRITE_ONCE(event->pending_disable, -1);
> +
> +               if (event->attr.sigtrap) {
> +                       atomic_inc(&event->event_limit); /* rearm event */

We send the signal to the current task. Can this fire outside of the
current task context? E.g. in interrupt/softirq/etc? And then we will
send the signal to the current task. Watchpoint can be set to
userspace address and then something asynchronous (some IO completion)
that does not belong to this task access the userspace address (is
this possible?). But watchpoints can also be set to kernel addresses,
then another context can definitely access it.
(1) can this happen? maybe perf context is somehow disabled when !in_task()?
(2) if yes, what is the desired behavior?




> +                       perf_sigtrap(event);
> +                       return;
> +               }
> +
>                 perf_event_disable_local(event);
>                 return;
>         }
> @@ -11325,6 +11343,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
>
>         event->state            = PERF_EVENT_STATE_INACTIVE;
>
> +       if (event->attr.sigtrap)
> +               atomic_set(&event->event_limit, 1);
> +
>         if (task) {
>                 event->attach_state = PERF_ATTACH_TASK;
>                 /*
> --
> 2.30.0.617.g56c4b15f3c-goog
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 0/4] Add support for synchronous signals on perf events
  2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
                   ` (3 preceding siblings ...)
  2021-02-23 14:34 ` [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP Marco Elver
@ 2021-02-23 20:03 ` Marco Elver
  2021-02-23 20:27 ` Andy Lutomirski
  5 siblings, 0 replies; 17+ messages in thread
From: Marco Elver @ 2021-02-23 20:03 UTC (permalink / raw)
  To: Marco Elver, Peter Zijlstra, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Mark Rutland,
	Namhyung Kim, Thomas Gleixner
  Cc: Alexander Potapenko, Al Viro, Arnd Bergmann, Christian Brauner,
	Dmitry Vyukov, Jann Horn, Jens Axboe, Matt Morehouse,
	Peter Collingbourne, Ian Rogers, kasan-dev, linux-arch,
	linux-fsdevel, LKML, linux-m68k, the arch/x86 maintainers

On Tue, 23 Feb 2021 at 15:34, Marco Elver <elver@google.com> wrote:
>
> The perf subsystem today unifies various tracing and monitoring
> features, from both software and hardware. One benefit of the perf
> subsystem is automatically inheriting events to child tasks, which
> enables process-wide events monitoring with low overheads. By default
> perf events are non-intrusive, not affecting behaviour of the tasks
> being monitored.
>
> For certain use-cases, however, it makes sense to leverage the
> generality of the perf events subsystem and optionally allow the tasks
> being monitored to receive signals on events they are interested in.
> This patch series adds the option to synchronously signal user space on
> events.
>
> The discussion at [1] led to the changes proposed in this series. The
> approach taken in patch 3/4 to use 'event_limit' to trigger the signal
> was kindly suggested by Peter Zijlstra in [2].
>
> [1] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZD=-XQ@mail.gmail.com/
> [2] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
>
> Motivation and example uses:
>
> 1.      Our immediate motivation is low-overhead sampling-based race
>         detection for user-space [3]. By using perf_event_open() at
>         process initialization, we can create hardware
>         breakpoint/watchpoint events that are propagated automatically
>         to all threads in a process. As far as we are aware, today no
>         existing kernel facility (such as ptrace) allows us to set up
>         process-wide watchpoints with minimal overheads (that are
>         comparable to mprotect() of whole pages).
>
>         [3] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
>
> 2.      Other low-overhead error detectors that rely on detecting
>         accesses to certain memory locations or code, process-wide and
>         also only in a specific set of subtasks or threads.
>
> Other example use-cases we found potentially interesting:
>
> 3.      Code hot patching without full stop-the-world. Specifically, by
>         setting a code breakpoint to entry to the patched routine, then
>         send signals to threads and check that they are not in the
>         routine, but without stopping them further. If any of the
>         threads will enter the routine, it will receive SIGTRAP and
>         pause.
>
> 4.      Safepoints without mprotect(). Some Java implementations use
>         "load from a known memory location" as a safepoint. When threads
>         need to be stopped, the page containing the location is
>         mprotect()ed and threads get a signal. This can be replaced with
>         a watchpoint, which does not require a whole page nor DTLB
>         shootdowns.
>
> 5.      Tracking data flow globally.
>
> 6.      Threads receiving signals on performance events to
>         throttle/unthrottle themselves.
>
>
> Marco Elver (4):
>   perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
>   signal: Introduce TRAP_PERF si_code and si_perf to siginfo
>   perf/core: Add support for SIGTRAP on perf events
>   perf/core: Add breakpoint information to siginfo on SIGTRAP

Note that we're currently pondering fork + exec, and suggestions would
be appreciated. We think we'll need some restrictions, like Peter
proposed here: here:
https://lore.kernel.org/lkml/YBvj6eJR%2FDY2TsEB@hirez.programming.kicks-ass.net/

We think what we want is to inherit the events to children only if
cloned with CLONE_SIGHAND. If there's space for a 'inherit_mask' in
perf_event_attr, that'd be most flexible, but perhaps we do not have
the space.

Thanks,
-- Marco

>
>  arch/m68k/kernel/signal.c          |  3 ++
>  arch/x86/kernel/signal_compat.c    |  5 ++-
>  fs/signalfd.c                      |  4 +++
>  include/linux/compat.h             |  2 ++
>  include/linux/signal.h             |  1 +
>  include/uapi/asm-generic/siginfo.h |  6 +++-
>  include/uapi/linux/perf_event.h    |  3 +-
>  include/uapi/linux/signalfd.h      |  4 ++-
>  kernel/events/core.c               | 54 +++++++++++++++++++++++++++++-
>  kernel/signal.c                    | 11 ++++++
>  10 files changed, 88 insertions(+), 5 deletions(-)
>
> --
> 2.30.0.617.g56c4b15f3c-goog
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo
  2021-02-23 18:01   ` Geert Uytterhoeven
@ 2021-02-23 20:06     ` Arnd Bergmann
  0 siblings, 0 replies; 17+ messages in thread
From: Arnd Bergmann @ 2021-02-23 20:06 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Marco Elver, Peter Zijlstra, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Mark Rutland,
	Namhyung Kim, Thomas Gleixner, Alexander Potapenko, Al Viro,
	Arnd Bergmann, Christian Brauner, Dmitry Vyukov, Jann Horn,
	Jens Axboe, mascasa, Peter Collingbourne, irogers, kasan-dev,
	Linux-Arch, Linux FS Devel, Linux Kernel Mailing List,
	linux-m68k, the arch/x86 maintainers

On Tue, Feb 23, 2021 at 7:01 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> On Tue, Feb 23, 2021 at 3:52 PM Marco Elver <elver@google.com> wrote:
> > Introduces the TRAP_PERF si_code, and associated siginfo_t field
> > si_perf. These will be used by the perf event subsystem to send signals
> > (if requested) to the task where an event occurred.
> >
> > Signed-off-by: Marco Elver <elver@google.com>
>
> >  arch/m68k/kernel/signal.c          |  3 +++
>
> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
>

For asm-generic:

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 0/4] Add support for synchronous signals on perf events
  2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
                   ` (4 preceding siblings ...)
  2021-02-23 20:03 ` [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
@ 2021-02-23 20:27 ` Andy Lutomirski
  2021-02-23 22:26   ` Marco Elver
  5 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2021-02-23 20:27 UTC (permalink / raw)
  To: Marco Elver
  Cc: peterz, alexander.shishkin, acme, mingo, jolsa, mark.rutland,
	namhyung, tglx, glider, viro, arnd, christian, dvyukov, jannh,
	axboe, mascasa, pcc, irogers, kasan-dev, linux-arch,
	linux-fsdevel, linux-kernel, linux-m68k, x86


> On Feb 23, 2021, at 6:34 AM, Marco Elver <elver@google.com> wrote:
> 
> The perf subsystem today unifies various tracing and monitoring
> features, from both software and hardware. One benefit of the perf
> subsystem is automatically inheriting events to child tasks, which
> enables process-wide events monitoring with low overheads. By default
> perf events are non-intrusive, not affecting behaviour of the tasks
> being monitored.
> 
> For certain use-cases, however, it makes sense to leverage the
> generality of the perf events subsystem and optionally allow the tasks
> being monitored to receive signals on events they are interested in.
> This patch series adds the option to synchronously signal user space on
> events.

Unless I missed some machinations, which is entirely possible, you can’t call force_sig_info() from NMI context. Not only am I not convinced that the core signal code is NMI safe, but at least x86 can’t correctly deliver signals on NMI return. You probably need an IPI-to-self.

> 
> The discussion at [1] led to the changes proposed in this series. The
> approach taken in patch 3/4 to use 'event_limit' to trigger the signal
> was kindly suggested by Peter Zijlstra in [2].
> 
> [1] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZD=-XQ@mail.gmail.com/
> [2] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/ 
> 
> Motivation and example uses:
> 
> 1.    Our immediate motivation is low-overhead sampling-based race
>    detection for user-space [3]. By using perf_event_open() at
>    process initialization, we can create hardware
>    breakpoint/watchpoint events that are propagated automatically
>    to all threads in a process. As far as we are aware, today no
>    existing kernel facility (such as ptrace) allows us to set up
>    process-wide watchpoints with minimal overheads (that are
>    comparable to mprotect() of whole pages).

This would be doable much more simply with an API to set a breakpoint.  All the machinery exists except the actual user API.

>    [3] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf 
> 
> 2.    Other low-overhead error detectors that rely on detecting
>    accesses to certain memory locations or code, process-wide and
>    also only in a specific set of subtasks or threads.
> 
> Other example use-cases we found potentially interesting:
> 
> 3.    Code hot patching without full stop-the-world. Specifically, by
>    setting a code breakpoint to entry to the patched routine, then
>    send signals to threads and check that they are not in the
>    routine, but without stopping them further. If any of the
>    threads will enter the routine, it will receive SIGTRAP and
>    pause.

Cute.

> 
> 4.    Safepoints without mprotect(). Some Java implementations use
>    "load from a known memory location" as a safepoint. When threads
>    need to be stopped, the page containing the location is
>    mprotect()ed and threads get a signal. This can be replaced with
>    a watchpoint, which does not require a whole page nor DTLB
>    shootdowns.

I’m skeptical. Propagating a hardware breakpoint to all threads involves IPIs and horribly slow writes to DR1 (or 2, 3, or 4) and DR7.  A TLB flush can be accelerated using paravirt or hypothetical future hardware. Or real live hardware on ARM64.

(The hypothetical future hardware is almost present on Zen 3.  A bit of work is needed on the hardware end to make it useful.)

> 
> 5.    Tracking data flow globally.
> 
> 6.    Threads receiving signals on performance events to
>    throttle/unthrottle themselves.
> 
> Marco Elver (4):
>  perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
>  signal: Introduce TRAP_PERF si_code and si_perf to siginfo
>  perf/core: Add support for SIGTRAP on perf events
>  perf/core: Add breakpoint information to siginfo on SIGTRAP
> 
> arch/m68k/kernel/signal.c          |  3 ++
> arch/x86/kernel/signal_compat.c    |  5 ++-
> fs/signalfd.c                      |  4 +++
> include/linux/compat.h             |  2 ++
> include/linux/signal.h             |  1 +
> include/uapi/asm-generic/siginfo.h |  6 +++-
> include/uapi/linux/perf_event.h    |  3 +-
> include/uapi/linux/signalfd.h      |  4 ++-
> kernel/events/core.c               | 54 +++++++++++++++++++++++++++++-
> kernel/signal.c                    | 11 ++++++
> 10 files changed, 88 insertions(+), 5 deletions(-)
> 
> -- 
> 2.30.0.617.g56c4b15f3c-goog
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH RFC 0/4] Add support for synchronous signals on perf events
  2021-02-23 20:27 ` Andy Lutomirski
@ 2021-02-23 22:26   ` Marco Elver
  0 siblings, 0 replies; 17+ messages in thread
From: Marco Elver @ 2021-02-23 22:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Jiri Olsa, Mark Rutland, Namhyung Kim,
	Thomas Gleixner, Alexander Potapenko, Al Viro, Arnd Bergmann,
	Christian Brauner, Dmitry Vyukov, Jann Horn, Jens Axboe,
	Matt Morehouse, Peter Collingbourne, Ian Rogers, kasan-dev,
	linux-arch, linux-fsdevel, LKML, linux-m68k,
	the arch/x86 maintainers

On Tue, 23 Feb 2021 at 21:27, Andy Lutomirski <luto@amacapital.net> wrote:
> > On Feb 23, 2021, at 6:34 AM, Marco Elver <elver@google.com> wrote:
> >
> > The perf subsystem today unifies various tracing and monitoring
> > features, from both software and hardware. One benefit of the perf
> > subsystem is automatically inheriting events to child tasks, which
> > enables process-wide events monitoring with low overheads. By default
> > perf events are non-intrusive, not affecting behaviour of the tasks
> > being monitored.
> >
> > For certain use-cases, however, it makes sense to leverage the
> > generality of the perf events subsystem and optionally allow the tasks
> > being monitored to receive signals on events they are interested in.
> > This patch series adds the option to synchronously signal user space on
> > events.
>
> Unless I missed some machinations, which is entirely possible, you can’t call force_sig_info() from NMI context. Not only am I not convinced that the core signal code is NMI safe, but at least x86 can’t correctly deliver signals on NMI return. You probably need an IPI-to-self.

force_sig_info() is called from an irq_work only: perf_pending_event
-> perf_pending_event_disable -> perf_sigtrap -> force_sig_info. What
did I miss?

> > The discussion at [1] led to the changes proposed in this series. The
> > approach taken in patch 3/4 to use 'event_limit' to trigger the signal
> > was kindly suggested by Peter Zijlstra in [2].
> >
> > [1] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZD=-XQ@mail.gmail.com/
> > [2] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
> >
> > Motivation and example uses:
> >
> > 1.    Our immediate motivation is low-overhead sampling-based race
> >    detection for user-space [3]. By using perf_event_open() at
> >    process initialization, we can create hardware
> >    breakpoint/watchpoint events that are propagated automatically
> >    to all threads in a process. As far as we are aware, today no
> >    existing kernel facility (such as ptrace) allows us to set up
> >    process-wide watchpoints with minimal overheads (that are
> >    comparable to mprotect() of whole pages).
>
> This would be doable much more simply with an API to set a breakpoint.  All the machinery exists except the actual user API.

Isn't perf_event_open() that API?

A new user API implementation will either be a thin wrapper around
perf events or reinvent half of perf events to deal with managing
watchpoints across a set of tasks (process-wide or some subset).

It's not just breakpoints though.

> >    [3] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
> >
> > 2.    Other low-overhead error detectors that rely on detecting
> >    accesses to certain memory locations or code, process-wide and
> >    also only in a specific set of subtasks or threads.
> >
> > Other example use-cases we found potentially interesting:
> >
> > 3.    Code hot patching without full stop-the-world. Specifically, by
> >    setting a code breakpoint to entry to the patched routine, then
> >    send signals to threads and check that they are not in the
> >    routine, but without stopping them further. If any of the
> >    threads will enter the routine, it will receive SIGTRAP and
> >    pause.
>
> Cute.
>
> >
> > 4.    Safepoints without mprotect(). Some Java implementations use
> >    "load from a known memory location" as a safepoint. When threads
> >    need to be stopped, the page containing the location is
> >    mprotect()ed and threads get a signal. This can be replaced with
> >    a watchpoint, which does not require a whole page nor DTLB
> >    shootdowns.
>
> I’m skeptical. Propagating a hardware breakpoint to all threads involves IPIs and horribly slow writes to DR1 (or 2, 3, or 4) and DR7.  A TLB flush can be accelerated using paravirt or hypothetical future hardware. Or real live hardware on ARM64.
>
> (The hypothetical future hardware is almost present on Zen 3.  A bit of work is needed on the hardware end to make it useful.)

Fair enough. Although watchpoints can be much more fine-grained than
an mprotect() which then also has downsides (checking if the accessed
memory was actually the bytes we're interested in). Maybe we should
also ask CPU vendors to give us better watchpoints (perhaps start with
more of them, and easier to set in batch)? We still need a user space
API...

Thanks,
-- Marco



> >
> > 5.    Tracking data flow globally.
> >
> > 6.    Threads receiving signals on performance events to
> >    throttle/unthrottle themselves.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-02-23 22:29 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-23 14:34 [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
2021-02-23 14:34 ` [PATCH RFC 1/4] perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children Marco Elver
2021-02-23 14:48   ` Dmitry Vyukov
2021-02-23 14:34 ` [PATCH RFC 2/4] signal: Introduce TRAP_PERF si_code and si_perf to siginfo Marco Elver
2021-02-23 18:01   ` Geert Uytterhoeven
2021-02-23 20:06     ` Arnd Bergmann
2021-02-23 14:34 ` [PATCH RFC 3/4] perf/core: Add support for SIGTRAP on perf events Marco Elver
2021-02-23 14:57   ` Dmitry Vyukov
2021-02-23 18:13   ` Dmitry Vyukov
2021-02-23 14:34 ` [PATCH RFC 4/4] perf/core: Add breakpoint information to siginfo on SIGTRAP Marco Elver
2021-02-23 15:01   ` Dmitry Vyukov
2021-02-23 15:10     ` Marco Elver
2021-02-23 15:16       ` Dmitry Vyukov
2021-02-23 17:16         ` Marco Elver
2021-02-23 20:03 ` [PATCH RFC 0/4] Add support for synchronous signals on perf events Marco Elver
2021-02-23 20:27 ` Andy Lutomirski
2021-02-23 22:26   ` Marco Elver

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).