linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration
@ 2018-12-17 17:21 Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel Mathieu Poirier
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

In the coresight world there can be more than one sink to aggregate
trace data generated by CPUs, hence the need for users to select which
one to use from the perf command line.

Up until now sysfs was used to communicate sink information to the
kernel but that was never the right way to proceed because it breaks
when more than one perf session are created at the same time.  The
situation was manageable when working with per-thread scenarios where
a single HW trace event is created but in CPU-wide mode a HW trace
event is created for each CPU that is specified on the perf command
line, taking us back to the concurrency problem we have when dealing
with multiple per-thread session.

This work fits in a wider scheme to support CPU-wide trace scenarios on
CoreSight that is available here [1]. The first step in that venture is
to address sysfs concurrency issues, which this patchset does.

The main difference with V4 is the usage of a CoreSight sink's HW start
address (u64) to uniquely identify a sink, negating the need to use the
component's name (char *).

By using a u64 we can theoretically add the sink information to the
perf_event_attr structure, avoiding the need to add a new perf ioctl().
On the flip side it would introduce a very specific CoreSight field
to a structure that is generic.  I have opted for the ioctl() method,
let me know if you want me to proceed with the latter.

Regards,
Mathieu

# Before this set: 

root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/ -C 2,3 sleep 1
failed to mmap with 12 (Cannot allocate memory)

# After this set:

root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/ -C 2,3 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.145 MB perf.data ]

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git/log/?h=cpu-wide-coresight

---
Changes for V5:
. Went from a char * to a u64 for ioctl argument (GKH).
. Added a better description of the problem being fixed to
  the cover letter (GKH).

Changes for V4:
. Made passing of information between ioctl() and PMU simpler.

Mathieu Poirier (6):
  perf: Introduce ioctl to communicate driver configuration to kernel
  perf/core: Use ioctl to communicate driver configuration to kernel
  perf/aux: Make perf_event accessible to setup_aux()
  coresight: Use PMU driver configuration for sink selection
  perf tools: Make perf_evsel accessible to PMU driver configuration
    code
  perf tools: Use ioctl function to send sink information to kernel

 arch/s390/kernel/perf_cpum_sf.c                  |  6 +-
 arch/x86/events/intel/bts.c                      |  4 +-
 arch/x86/events/intel/pt.c                       |  5 +-
 drivers/hwtracing/coresight/coresight-etm-perf.c | 80 +++++++++++++++++-----
 drivers/perf/arm_spe_pmu.c                       |  6 +-
 include/linux/perf_event.h                       | 40 ++++++++++-
 include/uapi/linux/perf_event.h                  |  1 +
 kernel/events/core.c                             | 67 +++++++++++++++++++
 kernel/events/ring_buffer.c                      |  2 +-
 tools/include/uapi/linux/perf_event.h            |  1 +
 tools/perf/arch/arm/util/cs-etm.c                | 84 +++++++++++++++++++++++-
 tools/perf/arch/arm/util/cs-etm.h                |  3 +-
 tools/perf/util/drv_configs.c                    | 30 ++-------
 tools/perf/util/evsel.c                          |  6 ++
 tools/perf/util/evsel.h                          |  1 +
 tools/perf/util/pmu.h                            |  3 +-
 16 files changed, 287 insertions(+), 52 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel
  2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
@ 2018-12-17 17:21 ` Mathieu Poirier
  2018-12-19  8:29   ` Greg KH
  2018-12-17 17:21 ` [RESEND PATCH v5 2/6] perf/core: Use " Mathieu Poirier
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

Adding a new IOCTL command to communicate PMU specific configuration to
PMU kernel drivers.  This can be anything a PMU might need for
configuration that doesn't fit in the perf_event_attr structure, such
as the CoreSight sink to use for a session.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 include/uapi/linux/perf_event.h       | 1 +
 tools/include/uapi/linux/perf_event.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 9de8780ac8d9..a39d8dc2e7a9 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -462,6 +462,7 @@ struct perf_event_query_bpf {
 #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
 #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
 #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
+#define PERF_EVENT_IOC_SET_DRV_CONFIG		_IOW('$', 12, __u64 *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 9de8780ac8d9..a39d8dc2e7a9 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -462,6 +462,7 @@ struct perf_event_query_bpf {
 #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
 #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
 #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
+#define PERF_EVENT_IOC_SET_DRV_CONFIG		_IOW('$', 12, __u64 *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RESEND PATCH v5 2/6] perf/core: Use ioctl to communicate driver configuration to kernel
  2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel Mathieu Poirier
@ 2018-12-17 17:21 ` Mathieu Poirier
  2018-12-19  8:31   ` Greg KH
  2018-12-17 17:21 ` [RESEND PATCH v5 3/6] perf/aux: Make perf_event accessible to setup_aux() Mathieu Poirier
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

This patch adds the mechanic needed for user space to send PMU specific
configuration to the kernel driver using an ioctl() command.  That way
events can keep track of options that don't fit in the perf_event_attr
structure like the selection of a CoreSight sink to use for the session.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 include/linux/perf_event.h | 38 ++++++++++++++++++++++++++
 kernel/events/core.c       | 67 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f0ca79..8e69b7e309e7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -114,6 +114,14 @@ struct hw_perf_event_extra {
 	int		idx;	/* index in shared_regs->regs[] */
 };
 
+/*
+ * PMU driver configuration
+ */
+struct pmu_drv_config {
+	void		*config;
+	raw_spinlock_t	lock;
+};
+
 /**
  * struct hw_perf_event - performance event hardware details:
  */
@@ -178,6 +186,9 @@ struct hw_perf_event {
 	/* Last sync'ed generation of filters */
 	unsigned long			addr_filters_gen;
 
+	/* PMU driver configuration */
+	struct pmu_drv_config		drv_config;
+
 /*
  * hw_perf_event::state flags; used to track the PERF_EF_* state.
  */
@@ -447,6 +458,17 @@ struct pmu {
 	 * Filter events for PMU-specific reasons.
 	 */
 	int (*filter_match)		(struct perf_event *event); /* optional */
+
+	/*
+	 * Validate complex PMU configuration that don't fit in the
+	 * perf_event_attr struct.  Returns a PMU specific pointer or an error
+	 * value < 0.
+	 *
+	 * As with addr_filters_validate(), runs in the context of the ioctl()
+	 * process and is not serialized with the rest of the PMU callbacks.
+	 */
+	void *(*drv_config_validate)	(struct perf_event *event,
+					 u64 value);
 };
 
 enum perf_addr_filter_action_t {
@@ -1235,6 +1257,11 @@ static inline bool has_addr_filter(struct perf_event *event)
 	return event->pmu->nr_addr_filters;
 }
 
+static inline bool has_drv_config(struct perf_event *event)
+{
+	return event->pmu->drv_config_validate;
+}
+
 /*
  * An inherited event uses parent's filters
  */
@@ -1249,6 +1276,17 @@ perf_event_addr_filters(struct perf_event *event)
 	return ifh;
 }
 
+static inline struct pmu_drv_config *
+perf_event_get_drv_config(struct perf_event *event)
+{
+	struct pmu_drv_config *cfg = &event->hw.drv_config;
+
+	if (event->parent)
+		cfg = &event->parent->hw.drv_config;
+
+	return cfg;
+}
+
 extern void perf_event_addr_filters_sync(struct perf_event *event);
 
 extern int perf_output_begin(struct perf_output_handle *handle,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 84530ab358c3..af7a53c97744 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5003,6 +5003,8 @@ static inline int perf_fget_light(int fd, struct fd *p)
 static int perf_event_set_output(struct perf_event *event,
 				 struct perf_event *output_event);
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
+static int perf_event_set_drv_config(struct perf_event *event,
+				     void __user *arg);
 static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd);
 static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			  struct perf_event_attr *attr);
@@ -5089,6 +5091,10 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 
 		return perf_event_modify_attr(event,  &new_attr);
 	}
+
+	case PERF_EVENT_IOC_SET_DRV_CONFIG:
+		return perf_event_set_drv_config(event, (void __user *)arg);
+
 	default:
 		return -ENOTTY;
 	}
@@ -9128,6 +9134,66 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg)
 	return ret;
 }
 
+static void perf_drv_config_replace(struct perf_event *event, void *drv_data)
+{
+	unsigned long flags;
+	struct pmu_drv_config *drv_config = &event->hw.drv_config;
+
+	if (!has_drv_config(event))
+		return;
+
+	/* Children take their configuration from their parent */
+	if (event->parent)
+		return;
+
+	/* Make sure the PMU doesn't get a handle on the data */
+	raw_spin_lock_irqsave(&drv_config->lock, flags);
+	drv_config->config = drv_data;
+	raw_spin_unlock_irqrestore(&drv_config->lock, flags);
+}
+
+static int
+perf_event_process_drv_config(struct perf_event *event, u64 value)
+{
+	int ret = -EINVAL;
+	void *drv_data;
+
+	/* Make sure ctx.mutex is held */
+	lockdep_assert_held(&event->ctx->mutex);
+
+	/* Children take their configuration from their parent */
+	if (WARN_ON_ONCE(event->parent))
+		goto out;
+
+	drv_data = event->pmu->drv_config_validate(event, value);
+	if (IS_ERR(drv_data)) {
+		ret = PTR_ERR(drv_data);
+		goto out;
+	}
+
+	perf_drv_config_replace(event, drv_data);
+
+	ret = 0;
+out:
+	return ret;
+}
+
+static int perf_event_set_drv_config(struct perf_event *event, void __user *arg)
+{
+	int ret;
+	u64 value;
+
+	if (!has_drv_config(event))
+		return -EINVAL;
+
+	if (copy_from_user(&value, arg, sizeof(value)))
+		return -EFAULT;
+
+	ret = perf_event_process_drv_config(event, value);
+
+	return ret;
+}
+
 /*
  * hrtimer based swevent callback
  */
@@ -10052,6 +10118,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	if (attr->freq && attr->sample_freq)
 		hwc->sample_period = 1;
 	hwc->last_period = hwc->sample_period;
+	raw_spin_lock_init(&hwc->drv_config.lock);
 
 	local64_set(&hwc->period_left, hwc->sample_period);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RESEND PATCH v5 3/6] perf/aux: Make perf_event accessible to setup_aux()
  2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 2/6] perf/core: Use " Mathieu Poirier
@ 2018-12-17 17:21 ` Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection Mathieu Poirier
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

When pmu::setup_aux() is called the coresight PMU needs to know which
sink to use for the session by looking up the information in the
drv_config field of the hw_perf_event structure.

As such simply replace the cpu information by the complete perf_event
structure and change all affected customers.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 arch/s390/kernel/perf_cpum_sf.c                  | 6 +++---
 arch/x86/events/intel/bts.c                      | 4 +++-
 arch/x86/events/intel/pt.c                       | 5 +++--
 drivers/hwtracing/coresight/coresight-etm-perf.c | 6 +++---
 drivers/perf/arm_spe_pmu.c                       | 6 +++---
 include/linux/perf_event.h                       | 2 +-
 kernel/events/ring_buffer.c                      | 2 +-
 7 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index 7bf604ff50a1..df92c2af99b6 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -1600,7 +1600,7 @@ static void aux_sdb_init(unsigned long sdb)
 
 /*
  * aux_buffer_setup() - Setup AUX buffer for diagnostic mode sampling
- * @cpu:	On which to allocate, -1 means current
+ * @event:	Event the buffer is setup for, event->cpu == -1 means current
  * @pages:	Array of pointers to buffer pages passed from perf core
  * @nr_pages:	Total pages
  * @snapshot:	Flag for snapshot mode
@@ -1612,8 +1612,8 @@ static void aux_sdb_init(unsigned long sdb)
  *
  * Return the private AUX buffer structure if success or NULL if fails.
  */
-static void *aux_buffer_setup(int cpu, void **pages, int nr_pages,
-			      bool snapshot)
+static void *aux_buffer_setup(struct perf_event *event, void **pages,
+			      int nr_pages, bool snapshot)
 {
 	struct sf_buffer *sfb;
 	struct aux_buffer *aux;
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 24ffa1e88cf9..7139f6bf27ad 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -77,10 +77,12 @@ static size_t buf_size(struct page *page)
 }
 
 static void *
-bts_buffer_setup_aux(int cpu, void **pages, int nr_pages, bool overwrite)
+bts_buffer_setup_aux(struct perf_event *event, void **pages,
+		     int nr_pages, bool overwrite)
 {
 	struct bts_buffer *buf;
 	struct page *page;
+	int cpu = event->cpu;
 	int node = (cpu == -1) ? cpu : cpu_to_node(cpu);
 	unsigned long offset;
 	size_t size = nr_pages << PAGE_SHIFT;
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 3a0aa83cbd07..f7ec5c5fbf0a 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1104,10 +1104,11 @@ static int pt_buffer_init_topa(struct pt_buffer *buf, unsigned long nr_pages,
  * Return:	Our private PT buffer structure.
  */
 static void *
-pt_buffer_setup_aux(int cpu, void **pages, int nr_pages, bool snapshot)
+pt_buffer_setup_aux(struct perf_event *event, void **pages,
+		    int nr_pages, bool snapshot)
 {
 	struct pt_buffer *buf;
-	int node, ret;
+	int node, ret, cpu = event->cpu;
 
 	if (!nr_pages)
 		return NULL;
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index abe8249b893b..f21eb28b6782 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -177,15 +177,15 @@ static void etm_free_aux(void *data)
 	schedule_work(&event_data->work);
 }
 
-static void *etm_setup_aux(int event_cpu, void **pages,
+static void *etm_setup_aux(struct perf_event *event, void **pages,
 			   int nr_pages, bool overwrite)
 {
-	int cpu;
+	int cpu = event->cpu;
 	cpumask_t *mask;
 	struct coresight_device *sink;
 	struct etm_event_data *event_data = NULL;
 
-	event_data = alloc_event_data(event_cpu);
+	event_data = alloc_event_data(cpu);
 	if (!event_data)
 		return NULL;
 	INIT_WORK(&event_data->work, free_event_data);
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 54ec278d2fc4..4dcd7bf14dcc 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -824,10 +824,10 @@ static void arm_spe_pmu_read(struct perf_event *event)
 {
 }
 
-static void *arm_spe_pmu_setup_aux(int cpu, void **pages, int nr_pages,
-				   bool snapshot)
+static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
+				   int nr_pages, bool snapshot)
 {
-	int i;
+	int i, cpu = event->cpu;
 	struct page **pglist;
 	struct arm_spe_pmu_buf *buf;
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 8e69b7e309e7..b772ea2441a7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -420,7 +420,7 @@ struct pmu {
 	/*
 	 * Set up pmu-private data structures for an AUX area
 	 */
-	void *(*setup_aux)		(int cpu, void **pages,
+	void *(*setup_aux)		(struct perf_event *event, void **pages,
 					 int nr_pages, bool overwrite);
 					/* optional */
 
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 4a9937076331..857308295f63 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -658,7 +658,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
 			goto out;
 	}
 
-	rb->aux_priv = event->pmu->setup_aux(event->cpu, rb->aux_pages, nr_pages,
+	rb->aux_priv = event->pmu->setup_aux(event, rb->aux_pages, nr_pages,
 					     overwrite);
 	if (!rb->aux_priv)
 		goto out;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
                   ` (2 preceding siblings ...)
  2018-12-17 17:21 ` [RESEND PATCH v5 3/6] perf/aux: Make perf_event accessible to setup_aux() Mathieu Poirier
@ 2018-12-17 17:21 ` Mathieu Poirier
  2018-12-18 14:14   ` Suzuki K Poulose
  2018-12-17 17:21 ` [RESEND PATCH v5 5/6] perf tools: Make perf_evsel accessible to PMU driver configuration code Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 6/6] perf tools: Use ioctl function to send sink information to kernel Mathieu Poirier
  5 siblings, 1 reply; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

This patch uses the PMU driver configuration held in event::hw::drv_config
to select a sink for each event that is created (the old sysFS way of
working is kept around for backward compatibility).

By proceeding in this way a sink can be used by multiple sessions
without having to play games with entries in sysFS.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 74 ++++++++++++++++++++----
 1 file changed, 62 insertions(+), 12 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index f21eb28b6782..a7e1fdef07f2 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -4,6 +4,7 @@
  * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
  */
 
+#include <linux/amba/bus.h>
 #include <linux/coresight.h>
 #include <linux/coresight-pmu.h>
 #include <linux/cpumask.h>
@@ -11,6 +12,7 @@
 #include <linux/list.h>
 #include <linux/mm.h>
 #include <linux/init.h>
+#include <linux/ioport.h>
 #include <linux/perf_event.h>
 #include <linux/percpu-defs.h>
 #include <linux/slab.h>
@@ -177,6 +179,26 @@ static void etm_free_aux(void *data)
 	schedule_work(&event_data->work);
 }
 
+static struct coresight_device *etm_drv_config_sync(struct perf_event *event)
+{
+	struct coresight_device *sink = NULL;
+	struct pmu_drv_config *drv_config = perf_event_get_drv_config(event);
+
+	/*
+	 * Make sure we don't race with perf_drv_config_replace() in
+	 * kernel/events/core.c.
+	 */
+	raw_spin_lock(&drv_config->lock);
+
+	/* Copy what we got from user space if applicable. */
+	if (drv_config->config)
+		sink = drv_config->config;
+
+	raw_spin_unlock(&drv_config->lock);
+
+	return sink;
+}
+
 static void *etm_setup_aux(struct perf_event *event, void **pages,
 			   int nr_pages, bool overwrite)
 {
@@ -190,18 +212,11 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 		return NULL;
 	INIT_WORK(&event_data->work, free_event_data);
 
-	/*
-	 * In theory nothing prevent tracers in a trace session from being
-	 * associated with different sinks, nor having a sink per tracer.  But
-	 * until we have HW with this kind of topology we need to assume tracers
-	 * in a trace session are using the same sink.  Therefore go through
-	 * the coresight bus and pick the first enabled sink.
-	 *
-	 * When operated from sysFS users are responsible to enable the sink
-	 * while from perf, the perf tools will do it based on the choice made
-	 * on the cmd line.  As such the "enable_sink" flag in sysFS is reset.
-	 */
-	sink = coresight_get_enabled_sink(true);
+	/* First get the sink config from user space. */
+	sink = etm_drv_config_sync(event);
+	if (!sink)
+		sink = coresight_get_enabled_sink(true);
+
 	if (!sink || !sink_ops(sink)->alloc_buffer)
 		goto err;
 
@@ -454,6 +469,40 @@ static void etm_addr_filters_sync(struct perf_event *event)
 	filters->nr_filters = i;
 }
 
+static int etm_drv_config_find_sink(struct device *dev, void *data)
+{
+	struct amba_device *adev = to_amba_device(dev->parent);
+	struct resource *res = &adev->res;
+	u64 value = *((u64 *)data);
+
+	/*
+	 * The HW mapping of a component is unique.  If the value we've been
+	 * given matches the component's start address, then we must have found
+	 * the device we are looking for.
+	 */
+	if (res->start == value)
+		return 1;
+
+	return 0;
+}
+
+static void *etm_drv_config_validate(struct perf_event *event, u64 value)
+{
+	struct device *dev;
+	struct coresight_device *sink;
+
+	/* Look for the device with a res->start equal to @value. */
+	dev = bus_find_device(&coresight_bustype, NULL,
+			      &value, etm_drv_config_find_sink);
+	if (!dev)
+		return ERR_PTR(-EINVAL);
+
+	sink = to_coresight_device(dev);
+	put_device(dev);
+
+	return sink;
+}
+
 int etm_perf_symlink(struct coresight_device *csdev, bool link)
 {
 	char entry[sizeof("cpu9999999")];
@@ -498,6 +547,7 @@ static int __init etm_perf_init(void)
 	etm_pmu.addr_filters_sync	= etm_addr_filters_sync;
 	etm_pmu.addr_filters_validate	= etm_addr_filters_validate;
 	etm_pmu.nr_addr_filters		= ETM_ADDR_CMP_MAX;
+	etm_pmu.drv_config_validate	= etm_drv_config_validate;
 
 	ret = perf_pmu_register(&etm_pmu, CORESIGHT_ETM_PMU_NAME, -1);
 	if (ret == 0)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RESEND PATCH v5 5/6] perf tools: Make perf_evsel accessible to PMU driver configuration code
  2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
                   ` (3 preceding siblings ...)
  2018-12-17 17:21 ` [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection Mathieu Poirier
@ 2018-12-17 17:21 ` Mathieu Poirier
  2018-12-17 17:21 ` [RESEND PATCH v5 6/6] perf tools: Use ioctl function to send sink information to kernel Mathieu Poirier
  5 siblings, 0 replies; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

Make the perf_evsel available to the PMU driver configuration code.  That
way function perf_evsel__run_ioctl() can be used from there and
information pertaining to the perf_evsel_config_term is still available.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 tools/perf/arch/arm/util/cs-etm.c | 22 +++++++++++++++++++++-
 tools/perf/arch/arm/util/cs-etm.h |  3 ++-
 tools/perf/util/drv_configs.c     | 30 +++++++-----------------------
 tools/perf/util/pmu.h             |  3 ++-
 4 files changed, 32 insertions(+), 26 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 2f595cd73da6..d8081c2e6d44 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -635,7 +635,7 @@ static int __printf(2, 3) cs_device__print_file(const char *name, const char *fm
 	return ret;
 }
 
-int cs_etm_set_drv_config(struct perf_evsel_config_term *term)
+static int cs_etm_set_drv_config_term(struct perf_evsel_config_term *term)
 {
 	int ret;
 	char enable_sink[ENABLE_SINK_MAX];
@@ -649,3 +649,23 @@ int cs_etm_set_drv_config(struct perf_evsel_config_term *term)
 
 	return 0;
 }
+
+int cs_etm_set_drv_config(struct perf_evsel *evsel,
+			  struct perf_evsel_config_term **err_term)
+{
+	int err = 0;
+	struct perf_evsel_config_term *term;
+
+	list_for_each_entry(term, &evsel->config_terms, list) {
+		if (term->type != PERF_EVSEL__CONFIG_TERM_DRV_CFG)
+			continue;
+
+		err = cs_etm_set_drv_config_term(term);
+		if (err) {
+			*err_term = term;
+			break;
+		}
+	}
+
+	return err;
+}
diff --git a/tools/perf/arch/arm/util/cs-etm.h b/tools/perf/arch/arm/util/cs-etm.h
index 1a12e64f5127..a3f8dde6ccef 100644
--- a/tools/perf/arch/arm/util/cs-etm.h
+++ b/tools/perf/arch/arm/util/cs-etm.h
@@ -10,6 +10,7 @@
 #include "../../util/evsel.h"
 
 struct auxtrace_record *cs_etm_record_init(int *err);
-int cs_etm_set_drv_config(struct perf_evsel_config_term *term);
+int cs_etm_set_drv_config(struct perf_evsel *evsel,
+			  struct perf_evsel_config_term **err_term);
 
 #endif
diff --git a/tools/perf/util/drv_configs.c b/tools/perf/util/drv_configs.c
index eec754243f4d..f7c1bcf08549 100644
--- a/tools/perf/util/drv_configs.c
+++ b/tools/perf/util/drv_configs.c
@@ -25,7 +25,6 @@ perf_evsel__apply_drv_configs(struct perf_evsel *evsel,
 {
 	bool found = false;
 	int err = 0;
-	struct perf_evsel_config_term *term;
 	struct perf_pmu *pmu = NULL;
 
 	while ((pmu = perf_pmu__scan(pmu)) != NULL)
@@ -34,29 +33,14 @@ perf_evsel__apply_drv_configs(struct perf_evsel *evsel,
 			break;
 		}
 
-	list_for_each_entry(term, &evsel->config_terms, list) {
-		if (term->type != PERF_EVSEL__CONFIG_TERM_DRV_CFG)
-			continue;
+	/*
+	 * No need to continue if we didn't get a match or if there is no
+	 * driver configuration function for this PMU.
+	 */
+	if (!found || !pmu->set_drv_config)
+		return err;
 
-		/*
-		 * We have a configuration term, report an error if we
-		 * can't find the PMU or if the PMU driver doesn't support
-		 * cmd line driver configuration.
-		 */
-		if (!found || !pmu->set_drv_config) {
-			err = -EINVAL;
-			*err_term = term;
-			break;
-		}
-
-		err = pmu->set_drv_config(term);
-		if (err) {
-			*err_term = term;
-			break;
-		}
-	}
-
-	return err;
+	return pmu->set_drv_config(evsel, err_term);
 }
 
 int perf_evlist__apply_drv_configs(struct perf_evlist *evlist,
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 76fecec7b3f9..47f44394042b 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -29,7 +29,8 @@ struct perf_pmu {
 	struct list_head format;  /* HEAD struct perf_pmu_format -> list */
 	struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
 	struct list_head list;    /* ELEM */
-	int (*set_drv_config)	(struct perf_evsel_config_term *term);
+	int (*set_drv_config)	(struct perf_evsel *evsel,
+				 struct perf_evsel_config_term **err_term);
 };
 
 struct perf_pmu_info {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RESEND PATCH v5 6/6] perf tools: Use ioctl function to send sink information to kernel
  2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
                   ` (4 preceding siblings ...)
  2018-12-17 17:21 ` [RESEND PATCH v5 5/6] perf tools: Make perf_evsel accessible to PMU driver configuration code Mathieu Poirier
@ 2018-12-17 17:21 ` Mathieu Poirier
  5 siblings, 0 replies; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-17 17:21 UTC (permalink / raw)
  To: acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

The communication of sink information for a trace session doesn't work when
more than one CPU is involved in the scenario due to the static nature of
sysfs.  As such communicate the sink information to each event by using the
an ioctl command.  The information sent to the kernel is the component's HW
address, which is guaranteed to be unique on the CoreSight bus and known to
the kernel.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 tools/perf/arch/arm/util/cs-etm.c | 66 +++++++++++++++++++++++++++++++++++++--
 tools/perf/util/evsel.c           |  6 ++++
 tools/perf/util/evsel.h           |  1 +
 3 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index d8081c2e6d44..41e394806707 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -22,11 +22,13 @@
 #include "../../util/thread_map.h"
 #include "../../util/cs-etm.h"
 
+#include <stdio.h>
 #include <stdlib.h>
 #include <sys/stat.h>
 
 #define ENABLE_SINK_MAX	128
 #define CS_BUS_DEVICE_PATH "/bus/coresight/devices/"
+#define AMBA_BUS_DEVICE_PATH "/bus/amba/devices/"
 
 struct cs_etm_recording {
 	struct auxtrace_record	itr;
@@ -619,6 +621,41 @@ static FILE *cs_device__open_file(const char *name)
 
 }
 
+static int cs_etm_check_drv_config_term(const char *name, u64 *addr)
+{
+	char path[PATH_MAX];
+	const char *sysfs;
+	int err;
+	FILE *file;
+	u64 start;
+
+	/* CS devices are all found under sysFS */
+	sysfs = sysfs__mountpoint();
+	if (!sysfs)
+		return -EINVAL;
+
+	/* The resource file contains the HW start address of this component */
+	snprintf(path, PATH_MAX,
+		 "%s" AMBA_BUS_DEVICE_PATH "%s" "/resource", sysfs, name);
+
+	file = fopen(path, "r");
+	if (!file) {
+		pr_debug("Unable to open file: %s\n", path);
+		return -EINVAL;
+	}
+
+	/* We just need the first value */
+	err = fscanf(file, "%016lx", &start);
+	if (err != 1) {
+		pr_debug("Unable to get resource start value for: %s\n", path);
+		return -EINVAL;
+	}
+
+	*addr = start;
+
+	return 0;
+}
+
 static int __printf(2, 3) cs_device__print_file(const char *name, const char *fmt, ...)
 {
 	va_list args;
@@ -635,7 +672,7 @@ static int __printf(2, 3) cs_device__print_file(const char *name, const char *fm
 	return ret;
 }
 
-static int cs_etm_set_drv_config_term(struct perf_evsel_config_term *term)
+static int cs_etm_set_drv_config_term_sysfs(struct perf_evsel_config_term *term)
 {
 	int ret;
 	char enable_sink[ENABLE_SINK_MAX];
@@ -650,6 +687,21 @@ static int cs_etm_set_drv_config_term(struct perf_evsel_config_term *term)
 	return 0;
 }
 
+static int cs_etm_set_drv_config_term_ioctl(struct perf_evsel *evsel,
+					    struct perf_evsel_config_term *term)
+{
+	u64 addr;
+	int ret;
+
+	/* First check the input */
+	ret = cs_etm_check_drv_config_term(term->val.drv_cfg, &addr);
+	if (ret)
+		return ret;
+
+	/* All good, apply configuration */
+	return perf_evsel__apply_drv_config(evsel, addr);
+}
+
 int cs_etm_set_drv_config(struct perf_evsel *evsel,
 			  struct perf_evsel_config_term **err_term)
 {
@@ -660,7 +712,17 @@ int cs_etm_set_drv_config(struct perf_evsel *evsel,
 		if (term->type != PERF_EVSEL__CONFIG_TERM_DRV_CFG)
 			continue;
 
-		err = cs_etm_set_drv_config_term(term);
+		/* First try the new interface, i.e ioctl() */
+		err = cs_etm_set_drv_config_term_ioctl(evsel, term);
+		if (!err)
+			continue;
+
+		/*
+		 * Something went wrong, we are probably working with an older
+		 * kernel.  As such use the sysFS interface, which will only
+		 * work for per-thread scenarios.
+		 */
+		err = cs_etm_set_drv_config_term_sysfs(term);
 		if (err) {
 			*err_term = term;
 			break;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6d187059a373..8aa2b79c3314 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1195,6 +1195,12 @@ static int perf_evsel__append_filter(struct perf_evsel *evsel,
 	return -1;
 }
 
+int perf_evsel__apply_drv_config(struct perf_evsel *evsel, u64 addr)
+{
+	return perf_evsel__run_ioctl(evsel,
+				     PERF_EVENT_IOC_SET_DRV_CONFIG, &addr);
+}
+
 int perf_evsel__append_tp_filter(struct perf_evsel *evsel, const char *filter)
 {
 	return perf_evsel__append_filter(evsel, "(%s) && (%s)", filter);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3147ca76c6fc..6be7560bc986 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -276,6 +276,7 @@ int perf_evsel__append_tp_filter(struct perf_evsel *evsel, const char *filter);
 int perf_evsel__append_addr_filter(struct perf_evsel *evsel,
 				   const char *filter);
 int perf_evsel__apply_filter(struct perf_evsel *evsel, const char *filter);
+int perf_evsel__apply_drv_config(struct perf_evsel *evsel, u64 addr);
 int perf_evsel__enable(struct perf_evsel *evsel);
 int perf_evsel__disable(struct perf_evsel *evsel);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-17 17:21 ` [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection Mathieu Poirier
@ 2018-12-18 14:14   ` Suzuki K Poulose
  2018-12-18 15:21     ` Alexander Shishkin
  2018-12-18 17:34     ` Mathieu Poirier
  0 siblings, 2 replies; 16+ messages in thread
From: Suzuki K Poulose @ 2018-12-18 14:14 UTC (permalink / raw)
  To: Mathieu Poirier, acme, peterz, gregkh
  Cc: mingo, tglx, alexander.shishkin, schwidefsky, heiko.carstens,
	will.deacon, mark.rutland, jolsa, namhyung, adrian.hunter, ast,
	hpa, suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

Hi Mathieu,

On 17/12/2018 17:21, Mathieu Poirier wrote:
> This patch uses the PMU driver configuration held in event::hw::drv_config
> to select a sink for each event that is created (the old sysFS way of
> working is kept around for backward compatibility).
> 
> By proceeding in this way a sink can be used by multiple sessions
> without having to play games with entries in sysFS.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>   drivers/hwtracing/coresight/coresight-etm-perf.c | 74 ++++++++++++++++++++----
>   1 file changed, 62 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index f21eb28b6782..a7e1fdef07f2 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -4,6 +4,7 @@
>    * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
>    */
>   
> +#include <linux/amba/bus.h>
>   #include <linux/coresight.h>
>   #include <linux/coresight-pmu.h>
>   #include <linux/cpumask.h>
> @@ -11,6 +12,7 @@
>   #include <linux/list.h>
>   #include <linux/mm.h>
>   #include <linux/init.h>
> +#include <linux/ioport.h>
>   #include <linux/perf_event.h>
>   #include <linux/percpu-defs.h>
>   #include <linux/slab.h>
> @@ -177,6 +179,26 @@ static void etm_free_aux(void *data)
>   	schedule_work(&event_data->work);
>   }
>   
> +static struct coresight_device *etm_drv_config_sync(struct perf_event *event)

minor nit: The name doesn't quite imply what we do here. Did you mean
s/sync/sink ?

> +{
> +	struct coresight_device *sink = NULL;
> +	struct pmu_drv_config *drv_config = perf_event_get_drv_config(event);
> +
> +	/*
> +	 * Make sure we don't race with perf_drv_config_replace() in
> +	 * kernel/events/core.c.
> +	 */
> +	raw_spin_lock(&drv_config->lock);
> +
> +	/* Copy what we got from user space if applicable. */
> +	if (drv_config->config)
> +		sink = drv_config->config;
> +
> +	raw_spin_unlock(&drv_config->lock);
> +
> +	return sink;
> +}
> +
>   static void *etm_setup_aux(struct perf_event *event, void **pages,
>   			   int nr_pages, bool overwrite)
>   {
> @@ -190,18 +212,11 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>   		return NULL;
>   	INIT_WORK(&event_data->work, free_event_data);
>   
> -	/*
> -	 * In theory nothing prevent tracers in a trace session from being
> -	 * associated with different sinks, nor having a sink per tracer.  But
> -	 * until we have HW with this kind of topology we need to assume tracers
> -	 * in a trace session are using the same sink.  Therefore go through
> -	 * the coresight bus and pick the first enabled sink.
> -	 *
> -	 * When operated from sysFS users are responsible to enable the sink
> -	 * while from perf, the perf tools will do it based on the choice made
> -	 * on the cmd line.  As such the "enable_sink" flag in sysFS is reset.
> -	 */
> -	sink = coresight_get_enabled_sink(true);
> +	/* First get the sink config from user space. */
> +	sink = etm_drv_config_sync(event);
> +	if (!sink)
> +		sink = coresight_get_enabled_sink(true);
> +
>   	if (!sink || !sink_ops(sink)->alloc_buffer)
>   		goto err;
>   
> @@ -454,6 +469,40 @@ static void etm_addr_filters_sync(struct perf_event *event)
>   	filters->nr_filters = i;
>   }
>   
> +static int etm_drv_config_find_sink(struct device *dev, void *data)
> +{
> +	struct amba_device *adev = to_amba_device(dev->parent);
> +	struct resource *res = &adev->res;
> +	u64 value = *((u64 *)data);
> +
> +	/*
> +	 * The HW mapping of a component is unique.  If the value we've been
> +	 * given matches the component's start address, then we must have found
> +	 * the device we are looking for.
> +	 */

To be frank, I don't quite like the idea of passing the base address of the
component as the key to locate a device, (even though that is unique and readily
available). I would rather prefer a programmable way to map the keys to the
"sink" devices, which works platform agnostic (e.g, ACPI support, where the base
address is not obvious from the name). Also if we decide to use a platform
agnostic naming scheme, it becomes even more complex.

We could assign a static "id/key" exported either via the device sysfs dir or
the "pmu" dir. I prefer the latter.

Thoughts ?

And whatever we decide to choose, needs to be clearly documented under the 
Documentation/perf/cs_etm.txt.

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-18 14:14   ` Suzuki K Poulose
@ 2018-12-18 15:21     ` Alexander Shishkin
  2018-12-18 18:20       ` Mathieu Poirier
  2018-12-18 17:34     ` Mathieu Poirier
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Shishkin @ 2018-12-18 15:21 UTC (permalink / raw)
  To: Suzuki K Poulose, Mathieu Poirier, acme, peterz, gregkh
  Cc: mingo, tglx, schwidefsky, heiko.carstens, will.deacon,
	mark.rutland, jolsa, namhyung, adrian.hunter, ast, hpa,
	suzuki.poulosi, linux-s390, linux-kernel, linux-arm-kernel

Suzuki K Poulose <suzuki.poulose@arm.com> writes:

>> +	/*
>> +	 * The HW mapping of a component is unique.  If the value we've been
>> +	 * given matches the component's start address, then we must have found
>> +	 * the device we are looking for.
>> +	 */
>
> To be frank, I don't quite like the idea of passing the base address of the
> component as the key to locate a device, (even though that is unique and readily
> available). I would rather prefer a programmable way to map the keys to the
> "sink" devices, which works platform agnostic (e.g, ACPI support, where the base
> address is not obvious from the name). Also if we decide to use a platform
> agnostic naming scheme, it becomes even more complex.
>
> We could assign a static "id/key" exported either via the device sysfs dir or
> the "pmu" dir. I prefer the latter.
>
> Thoughts ?

So, my understanding is that we have a bunch of trace sources and a
bunch of trace sinks to choose from when we set up the perf event. The
current model basically treats trace sources as PMUs and relies on the
sink configuration process to be done via sysfs, which is not ideal as
an API.

The first thing that comes to mind is: can then the sinks be made their
own PMUs, so the above can be done via the existing SET_OUTPUT ioctl?

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-18 14:14   ` Suzuki K Poulose
  2018-12-18 15:21     ` Alexander Shishkin
@ 2018-12-18 17:34     ` Mathieu Poirier
  2018-12-19  9:40       ` Suzuki K Poulose
  1 sibling, 1 reply; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-18 17:34 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Greg KH, Ingo Molnar,
	Thomas Gleixner, Alexander Shishkin, schwidefsky, heiko.carstens,
	Will Deacon, Mark Rutland, Jiri Olsa, Namhyung Kim,
	Adrian Hunter, ast, H. Peter Anvin, suzuki.poulosi, linux-s390,
	Linux Kernel Mailing List, linux-arm-kernel

Good day Suzuki,

On Tue, 18 Dec 2018 at 07:14, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> Hi Mathieu,
>
> On 17/12/2018 17:21, Mathieu Poirier wrote:
> > This patch uses the PMU driver configuration held in event::hw::drv_config
> > to select a sink for each event that is created (the old sysFS way of
> > working is kept around for backward compatibility).
> >
> > By proceeding in this way a sink can be used by multiple sessions
> > without having to play games with entries in sysFS.
> >
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > ---
> >   drivers/hwtracing/coresight/coresight-etm-perf.c | 74 ++++++++++++++++++++----
> >   1 file changed, 62 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> > index f21eb28b6782..a7e1fdef07f2 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> > @@ -4,6 +4,7 @@
> >    * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
> >    */
> >
> > +#include <linux/amba/bus.h>
> >   #include <linux/coresight.h>
> >   #include <linux/coresight-pmu.h>
> >   #include <linux/cpumask.h>
> > @@ -11,6 +12,7 @@
> >   #include <linux/list.h>
> >   #include <linux/mm.h>
> >   #include <linux/init.h>
> > +#include <linux/ioport.h>
> >   #include <linux/perf_event.h>
> >   #include <linux/percpu-defs.h>
> >   #include <linux/slab.h>
> > @@ -177,6 +179,26 @@ static void etm_free_aux(void *data)
> >       schedule_work(&event_data->work);
> >   }
> >
> > +static struct coresight_device *etm_drv_config_sync(struct perf_event *event)
>
> minor nit: The name doesn't quite imply what we do here. Did you mean
> s/sync/sink ?
>

I chose "sync" with "synchronisation" in mind.  I tried to keep things
generic since we could potentially use the same interface to convey
complex PMU configuration.  Arguably we could go with "sink" for now
and change it to "sync" in the future - I'm not strongly opinionated
on that part.

> > +{
> > +     struct coresight_device *sink = NULL;
> > +     struct pmu_drv_config *drv_config = perf_event_get_drv_config(event);
> > +
> > +     /*
> > +      * Make sure we don't race with perf_drv_config_replace() in
> > +      * kernel/events/core.c.
> > +      */
> > +     raw_spin_lock(&drv_config->lock);
> > +
> > +     /* Copy what we got from user space if applicable. */
> > +     if (drv_config->config)
> > +             sink = drv_config->config;
> > +
> > +     raw_spin_unlock(&drv_config->lock);
> > +
> > +     return sink;
> > +}
> > +
> >   static void *etm_setup_aux(struct perf_event *event, void **pages,
> >                          int nr_pages, bool overwrite)
> >   {
> > @@ -190,18 +212,11 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> >               return NULL;
> >       INIT_WORK(&event_data->work, free_event_data);
> >
> > -     /*
> > -      * In theory nothing prevent tracers in a trace session from being
> > -      * associated with different sinks, nor having a sink per tracer.  But
> > -      * until we have HW with this kind of topology we need to assume tracers
> > -      * in a trace session are using the same sink.  Therefore go through
> > -      * the coresight bus and pick the first enabled sink.
> > -      *
> > -      * When operated from sysFS users are responsible to enable the sink
> > -      * while from perf, the perf tools will do it based on the choice made
> > -      * on the cmd line.  As such the "enable_sink" flag in sysFS is reset.
> > -      */
> > -     sink = coresight_get_enabled_sink(true);
> > +     /* First get the sink config from user space. */
> > +     sink = etm_drv_config_sync(event);
> > +     if (!sink)
> > +             sink = coresight_get_enabled_sink(true);
> > +
> >       if (!sink || !sink_ops(sink)->alloc_buffer)
> >               goto err;
> >
> > @@ -454,6 +469,40 @@ static void etm_addr_filters_sync(struct perf_event *event)
> >       filters->nr_filters = i;
> >   }
> >
> > +static int etm_drv_config_find_sink(struct device *dev, void *data)
> > +{
> > +     struct amba_device *adev = to_amba_device(dev->parent);
> > +     struct resource *res = &adev->res;
> > +     u64 value = *((u64 *)data);
> > +
> > +     /*
> > +      * The HW mapping of a component is unique.  If the value we've been
> > +      * given matches the component's start address, then we must have found
> > +      * the device we are looking for.
> > +      */
>
> To be frank, I don't quite like the idea of passing the base address of the
> component as the key to locate a device, (even though that is unique and readily
> available). I would rather prefer a programmable way to map the keys to the
> "sink" devices, which works platform agnostic (e.g, ACPI support, where the base
> address is not obvious from the name). Also if we decide to use a platform
> agnostic naming scheme, it becomes even more complex.

This mechanism doesn't rely on the naming scheme - it exploits the
"resource" interface exported for each amba device [1].  As such
whether the component is discovered using ACPI or DT, we end up on the
same amba bus and using the same interface.

[1]. https://elixir.bootlin.com/linux/latest/source/drivers/amba/bus.c#L128

>
> We could assign a static "id/key" exported either via the device sysfs dir or
> the "pmu" dir. I prefer the latter.

Not sure what you mean by "pmu" directory - would you mind expanding
on that?  Using sysfs would be quite easy but I am reluctant to create
a new id/key mechanism and introduce another entry when we have the
component address that is unique and already available in the amba
directory structure.

Thanks for taking a look,
Mathieu

>
> Thoughts ?
>
> And whatever we decide to choose, needs to be clearly documented under the
> Documentation/perf/cs_etm.txt.
>
> Cheers
> Suzuki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-18 15:21     ` Alexander Shishkin
@ 2018-12-18 18:20       ` Mathieu Poirier
  0 siblings, 0 replies; 16+ messages in thread
From: Mathieu Poirier @ 2018-12-18 18:20 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Suzuki K Poulose, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Greg KH, Ingo Molnar, Thomas Gleixner, schwidefsky,
	heiko.carstens, Will Deacon, Mark Rutland, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, ast, H. Peter Anvin, suzuki.poulosi,
	linux-s390, Linux Kernel Mailing List, linux-arm-kernel

Hi Alex,

On Tue, 18 Dec 2018 at 08:21, Alexander Shishkin
<alexander.shishkin@linux.intel.com> wrote:
>
> Suzuki K Poulose <suzuki.poulose@arm.com> writes:
>
> >> +    /*
> >> +     * The HW mapping of a component is unique.  If the value we've been
> >> +     * given matches the component's start address, then we must have found
> >> +     * the device we are looking for.
> >> +     */
> >
> > To be frank, I don't quite like the idea of passing the base address of the
> > component as the key to locate a device, (even though that is unique and readily
> > available). I would rather prefer a programmable way to map the keys to the
> > "sink" devices, which works platform agnostic (e.g, ACPI support, where the base
> > address is not obvious from the name). Also if we decide to use a platform
> > agnostic naming scheme, it becomes even more complex.
> >
> > We could assign a static "id/key" exported either via the device sysfs dir or
> > the "pmu" dir. I prefer the latter.
> >
> > Thoughts ?
>
> So, my understanding is that we have a bunch of trace sources and a
> bunch of trace sinks to choose from when we set up the perf event. The
> current model basically treats trace sources as PMUs and relies on the
> sink configuration process to be done via sysfs, which is not ideal as
> an API.

That is correct.  Most of the grief comes from the fact that when
tracing CPU-wide session sinks are concurrently used by more than one
CPU.

>
> The first thing that comes to mind is: can then the sinks be made their
> own PMUs, so the above can be done via the existing SET_OUTPUT ioctl?

I had a serious look at the SET_OUTPUT function as part of the
research that pre-dated implementing CPU-wide support for coresight.
The core does not allow events assigned to different CPUs to use the
same mmap'ed area, which is perfectly understandable.

To me the problem of sharing a sink between CPUs is inherent to
coresight and should be fixed within that framework (see full work
here [1] if interested).  The implementation associates a sink with an
mmap'ed area, just like PT and coresight --per-thread.  Where things
differ is that for coresight CPU-wide the sink is kept in function for
as long as a CPU is using it, ignoring other request for updates or to
switch it off.  So the first CPU to use it turns the sink on and the
last turns it off after collecting trace data from it.

Thanks,
Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git/log/?h=cpu-wide-coresight

>
> Regards,
> --
> Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel
  2018-12-17 17:21 ` [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel Mathieu Poirier
@ 2018-12-19  8:29   ` Greg KH
  0 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2018-12-19  8:29 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: acme, peterz, mingo, tglx, alexander.shishkin, schwidefsky,
	heiko.carstens, will.deacon, mark.rutland, jolsa, namhyung,
	adrian.hunter, ast, hpa, suzuki.poulosi, linux-s390,
	linux-kernel, linux-arm-kernel

On Mon, Dec 17, 2018 at 10:21:41AM -0700, Mathieu Poirier wrote:
> Adding a new IOCTL command to communicate PMU specific configuration to
> PMU kernel drivers.  This can be anything a PMU might need for
> configuration that doesn't fit in the perf_event_attr structure, such
> as the CoreSight sink to use for a session.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>  include/uapi/linux/perf_event.h       | 1 +
>  tools/include/uapi/linux/perf_event.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 9de8780ac8d9..a39d8dc2e7a9 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -462,6 +462,7 @@ struct perf_event_query_bpf {
>  #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
>  #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
>  #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
> +#define PERF_EVENT_IOC_SET_DRV_CONFIG		_IOW('$', 12, __u64 *)
>  
>  enum perf_event_ioc_flags {
>  	PERF_IOC_FLAG_GROUP		= 1U << 0,
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index 9de8780ac8d9..a39d8dc2e7a9 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -462,6 +462,7 @@ struct perf_event_query_bpf {
>  #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
>  #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
>  #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
> +#define PERF_EVENT_IOC_SET_DRV_CONFIG		_IOW('$', 12, __u64 *)

A pointer to an unknown structure?  That's just asking for abuse :(


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 2/6] perf/core: Use ioctl to communicate driver configuration to kernel
  2018-12-17 17:21 ` [RESEND PATCH v5 2/6] perf/core: Use " Mathieu Poirier
@ 2018-12-19  8:31   ` Greg KH
  0 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2018-12-19  8:31 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: acme, peterz, mingo, tglx, alexander.shishkin, schwidefsky,
	heiko.carstens, will.deacon, mark.rutland, jolsa, namhyung,
	adrian.hunter, ast, hpa, suzuki.poulosi, linux-s390,
	linux-kernel, linux-arm-kernel

On Mon, Dec 17, 2018 at 10:21:42AM -0700, Mathieu Poirier wrote:
> This patch adds the mechanic needed for user space to send PMU specific
> configuration to the kernel driver using an ioctl() command.  That way
> events can keep track of options that don't fit in the perf_event_attr
> structure like the selection of a CoreSight sink to use for the session.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>  include/linux/perf_event.h | 38 ++++++++++++++++++++++++++
>  kernel/events/core.c       | 67 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 105 insertions(+)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 53c500f0ca79..8e69b7e309e7 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -114,6 +114,14 @@ struct hw_perf_event_extra {
>  	int		idx;	/* index in shared_regs->regs[] */
>  };
>  
> +/*
> + * PMU driver configuration
> + */
> +struct pmu_drv_config {
> +	void		*config;
> +	raw_spinlock_t	lock;
> +};
> +
>  /**
>   * struct hw_perf_event - performance event hardware details:
>   */
> @@ -178,6 +186,9 @@ struct hw_perf_event {
>  	/* Last sync'ed generation of filters */
>  	unsigned long			addr_filters_gen;
>  
> +	/* PMU driver configuration */
> +	struct pmu_drv_config		drv_config;
> +
>  /*
>   * hw_perf_event::state flags; used to track the PERF_EF_* state.
>   */
> @@ -447,6 +458,17 @@ struct pmu {
>  	 * Filter events for PMU-specific reasons.
>  	 */
>  	int (*filter_match)		(struct perf_event *event); /* optional */
> +
> +	/*
> +	 * Validate complex PMU configuration that don't fit in the
> +	 * perf_event_attr struct.  Returns a PMU specific pointer or an error
> +	 * value < 0.
> +	 *
> +	 * As with addr_filters_validate(), runs in the context of the ioctl()
> +	 * process and is not serialized with the rest of the PMU callbacks.
> +	 */
> +	void *(*drv_config_validate)	(struct perf_event *event,
> +					 u64 value);
>  };
>  
>  enum perf_addr_filter_action_t {
> @@ -1235,6 +1257,11 @@ static inline bool has_addr_filter(struct perf_event *event)
>  	return event->pmu->nr_addr_filters;
>  }
>  
> +static inline bool has_drv_config(struct perf_event *event)
> +{
> +	return event->pmu->drv_config_validate;
> +}
> +
>  /*
>   * An inherited event uses parent's filters
>   */
> @@ -1249,6 +1276,17 @@ perf_event_addr_filters(struct perf_event *event)
>  	return ifh;
>  }
>  
> +static inline struct pmu_drv_config *
> +perf_event_get_drv_config(struct perf_event *event)
> +{
> +	struct pmu_drv_config *cfg = &event->hw.drv_config;
> +
> +	if (event->parent)
> +		cfg = &event->parent->hw.drv_config;
> +
> +	return cfg;
> +}
> +
>  extern void perf_event_addr_filters_sync(struct perf_event *event);
>  
>  extern int perf_output_begin(struct perf_output_handle *handle,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 84530ab358c3..af7a53c97744 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5003,6 +5003,8 @@ static inline int perf_fget_light(int fd, struct fd *p)
>  static int perf_event_set_output(struct perf_event *event,
>  				 struct perf_event *output_event);
>  static int perf_event_set_filter(struct perf_event *event, void __user *arg);
> +static int perf_event_set_drv_config(struct perf_event *event,
> +				     void __user *arg);
>  static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd);
>  static int perf_copy_attr(struct perf_event_attr __user *uattr,
>  			  struct perf_event_attr *attr);
> @@ -5089,6 +5091,10 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
>  
>  		return perf_event_modify_attr(event,  &new_attr);
>  	}
> +
> +	case PERF_EVENT_IOC_SET_DRV_CONFIG:
> +		return perf_event_set_drv_config(event, (void __user *)arg);
> +
>  	default:
>  		return -ENOTTY;
>  	}
> @@ -9128,6 +9134,66 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg)
>  	return ret;
>  }
>  
> +static void perf_drv_config_replace(struct perf_event *event, void *drv_data)
> +{
> +	unsigned long flags;
> +	struct pmu_drv_config *drv_config = &event->hw.drv_config;
> +
> +	if (!has_drv_config(event))
> +		return;
> +
> +	/* Children take their configuration from their parent */
> +	if (event->parent)
> +		return;
> +
> +	/* Make sure the PMU doesn't get a handle on the data */
> +	raw_spin_lock_irqsave(&drv_config->lock, flags);
> +	drv_config->config = drv_data;
> +	raw_spin_unlock_irqrestore(&drv_config->lock, flags);
> +}
> +
> +static int
> +perf_event_process_drv_config(struct perf_event *event, u64 value)
> +{
> +	int ret = -EINVAL;
> +	void *drv_data;
> +
> +	/* Make sure ctx.mutex is held */
> +	lockdep_assert_held(&event->ctx->mutex);
> +
> +	/* Children take their configuration from their parent */
> +	if (WARN_ON_ONCE(event->parent))
> +		goto out;
> +
> +	drv_data = event->pmu->drv_config_validate(event, value);
> +	if (IS_ERR(drv_data)) {
> +		ret = PTR_ERR(drv_data);
> +		goto out;
> +	}
> +
> +	perf_drv_config_replace(event, drv_data);
> +
> +	ret = 0;
> +out:
> +	return ret;
> +}
> +
> +static int perf_event_set_drv_config(struct perf_event *event, void __user *arg)
> +{
> +	int ret;
> +	u64 value;
> +
> +	if (!has_drv_config(event))
> +		return -EINVAL;
> +
> +	if (copy_from_user(&value, arg, sizeof(value)))
> +		return -EFAULT;

You are just sending in a pointer to a u64?  Why not just pass a u64
directly instead?  Why is a pointer needed here?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-18 17:34     ` Mathieu Poirier
@ 2018-12-19  9:40       ` Suzuki K Poulose
  2019-01-07 18:18         ` Mathieu Poirier
  0 siblings, 1 reply; 16+ messages in thread
From: Suzuki K Poulose @ 2018-12-19  9:40 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Greg KH, Ingo Molnar,
	Thomas Gleixner, Alexander Shishkin, schwidefsky, heiko.carstens,
	Will Deacon, Mark Rutland, Jiri Olsa, Namhyung Kim,
	Adrian Hunter, ast, H. Peter Anvin, suzuki.poulosi, linux-s390,
	Linux Kernel Mailing List, linux-arm-kernel



On 18/12/2018 17:34, Mathieu Poirier wrote:
> Good day Suzuki,
> 
> On Tue, 18 Dec 2018 at 07:14, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi Mathieu,
>>
>> On 17/12/2018 17:21, Mathieu Poirier wrote:
>>> This patch uses the PMU driver configuration held in event::hw::drv_config
>>> to select a sink for each event that is created (the old sysFS way of
>>> working is kept around for backward compatibility).
>>>
>>> By proceeding in this way a sink can be used by multiple sessions
>>> without having to play games with entries in sysFS.
>>>
>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> ---
>>>    drivers/hwtracing/coresight/coresight-etm-perf.c | 74 ++++++++++++++++++++----
>>>    1 file changed, 62 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
>>> index f21eb28b6782..a7e1fdef07f2 100644
>>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>>> @@ -4,6 +4,7 @@
>>>     * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>     */
>>>
>>> +#include <linux/amba/bus.h>
>>>    #include <linux/coresight.h>
>>>    #include <linux/coresight-pmu.h>
>>>    #include <linux/cpumask.h>
>>> @@ -11,6 +12,7 @@
>>>    #include <linux/list.h>
>>>    #include <linux/mm.h>
>>>    #include <linux/init.h>
>>> +#include <linux/ioport.h>
>>>    #include <linux/perf_event.h>
>>>    #include <linux/percpu-defs.h>
>>>    #include <linux/slab.h>
>>> @@ -177,6 +179,26 @@ static void etm_free_aux(void *data)
>>>        schedule_work(&event_data->work);
>>>    }
>>>
>>> +static struct coresight_device *etm_drv_config_sync(struct perf_event *event)
>>
>> minor nit: The name doesn't quite imply what we do here. Did you mean
>> s/sync/sink ?
>>
> 
> I chose "sync" with "synchronisation" in mind.  I tried to keep things
> generic since we could potentially use the same interface to convey
> complex PMU configuration.  Arguably we could go with "sink" for now
> and change it to "sync" in the future - I'm not strongly opinionated
> on that part.

Ok. I thought we were trying to grab the sink information from the event
drv_config, hence something that implies that would be slightly more
reader friendly. Again, I am not too keen on it.

> 
>>> +{
>>> +     struct coresight_device *sink = NULL;
>>> +     struct pmu_drv_config *drv_config = perf_event_get_drv_config(event);
>>> +
>>> +     /*
>>> +      * Make sure we don't race with perf_drv_config_replace() in
>>> +      * kernel/events/core.c.
>>> +      */
>>> +     raw_spin_lock(&drv_config->lock);
>>> +
>>> +     /* Copy what we got from user space if applicable. */
>>> +     if (drv_config->config)
>>> +             sink = drv_config->config;
>>> +
>>> +     raw_spin_unlock(&drv_config->lock);
>>> +
>>> +     return sink;
>>> +}
>>> +
>>>    static void *etm_setup_aux(struct perf_event *event, void **pages,
>>>                           int nr_pages, bool overwrite)
>>>    {
>>> @@ -190,18 +212,11 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>>                return NULL;
>>>        INIT_WORK(&event_data->work, free_event_data);
>>>
>>> -     /*
>>> -      * In theory nothing prevent tracers in a trace session from being
>>> -      * associated with different sinks, nor having a sink per tracer.  But
>>> -      * until we have HW with this kind of topology we need to assume tracers
>>> -      * in a trace session are using the same sink.  Therefore go through
>>> -      * the coresight bus and pick the first enabled sink.
>>> -      *
>>> -      * When operated from sysFS users are responsible to enable the sink
>>> -      * while from perf, the perf tools will do it based on the choice made
>>> -      * on the cmd line.  As such the "enable_sink" flag in sysFS is reset.
>>> -      */
>>> -     sink = coresight_get_enabled_sink(true);
>>> +     /* First get the sink config from user space. */
>>> +     sink = etm_drv_config_sync(event);
>>> +     if (!sink)
>>> +             sink = coresight_get_enabled_sink(true);
>>> +
>>>        if (!sink || !sink_ops(sink)->alloc_buffer)
>>>                goto err;
>>>
>>> @@ -454,6 +469,40 @@ static void etm_addr_filters_sync(struct perf_event *event)
>>>        filters->nr_filters = i;
>>>    }
>>>
>>> +static int etm_drv_config_find_sink(struct device *dev, void *data)
>>> +{
>>> +     struct amba_device *adev = to_amba_device(dev->parent);
>>> +     struct resource *res = &adev->res;
>>> +     u64 value = *((u64 *)data);
>>> +
>>> +     /*
>>> +      * The HW mapping of a component is unique.  If the value we've been
>>> +      * given matches the component's start address, then we must have found
>>> +      * the device we are looking for.
>>> +      */
>>
>> To be frank, I don't quite like the idea of passing the base address of the
>> component as the key to locate a device, (even though that is unique and readily
>> available). I would rather prefer a programmable way to map the keys to the
>> "sink" devices, which works platform agnostic (e.g, ACPI support, where the base
>> address is not obvious from the name). Also if we decide to use a platform
>> agnostic naming scheme, it becomes even more complex.
> 
> This mechanism doesn't rely on the naming scheme - it exploits the
> "resource" interface exported for each amba device [1].  As such
> whether the component is discovered using ACPI or DT, we end up on the
> same amba bus and using the same interface.
> 
> [1]. https://elixir.bootlin.com/linux/latest/source/drivers/amba/bus.c#L128

Ok. The only problem with this approach would be if the devices doesn't appear
on the AMBA bus (btw, which is not true for the existing IPs).

> 
>>
>> We could assign a static "id/key" exported either via the device sysfs dir or
>> the "pmu" dir. I prefer the latter.
> 
> Not sure what you mean by "pmu" directory - would you mind expanding
> on that?  Using sysfs would be quite easy but I am reluctant to create
> a new id/key mechanism and introduce another entry when we have the
> component address that is unique and already available in the amba
> directory structure.

We could add another directory under :

/sys/bus/event_source/devices/<PMU>/
				\_ events/
				\_ format/
say :
		\_ drv_config/
		Or
		\_ sinks/

and list the sinks, eg:
# cd $sysfs_pmu_dir/sinks
# cat <name_of_the_sink>
ID_of_the_sink

Btw, I am always inclined to using some bits off one of the "config" fields
("config1" or "config2") for the sink configuration. But I understand that
you have explored that avenue and chose this approach as we have further
configurations required for complex ETM settings.

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2018-12-19  9:40       ` Suzuki K Poulose
@ 2019-01-07 18:18         ` Mathieu Poirier
  2019-01-09 14:55           ` Suzuki K Poulose
  0 siblings, 1 reply; 16+ messages in thread
From: Mathieu Poirier @ 2019-01-07 18:18 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Greg KH, Ingo Molnar,
	Thomas Gleixner, Alexander Shishkin, schwidefsky, heiko.carstens,
	Will Deacon, Mark Rutland, Jiri Olsa, Namhyung Kim,
	Adrian Hunter, ast, H. Peter Anvin, suzuki.poulosi, linux-s390,
	Linux Kernel Mailing List, linux-arm-kernel

On Wed, 19 Dec 2018 at 02:40, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
>
>
> On 18/12/2018 17:34, Mathieu Poirier wrote:
> > Good day Suzuki,
> >
> > On Tue, 18 Dec 2018 at 07:14, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >>
> >> Hi Mathieu,
> >>
> >> On 17/12/2018 17:21, Mathieu Poirier wrote:
> >>> This patch uses the PMU driver configuration held in event::hw::drv_config
> >>> to select a sink for each event that is created (the old sysFS way of
> >>> working is kept around for backward compatibility).
> >>>
> >>> By proceeding in this way a sink can be used by multiple sessions
> >>> without having to play games with entries in sysFS.
> >>>
> >>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>> ---
> >>>    drivers/hwtracing/coresight/coresight-etm-perf.c | 74 ++++++++++++++++++++----
> >>>    1 file changed, 62 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> >>> index f21eb28b6782..a7e1fdef07f2 100644
> >>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> >>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> >>> @@ -4,6 +4,7 @@
> >>>     * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>>     */
> >>>
> >>> +#include <linux/amba/bus.h>
> >>>    #include <linux/coresight.h>
> >>>    #include <linux/coresight-pmu.h>
> >>>    #include <linux/cpumask.h>
> >>> @@ -11,6 +12,7 @@
> >>>    #include <linux/list.h>
> >>>    #include <linux/mm.h>
> >>>    #include <linux/init.h>
> >>> +#include <linux/ioport.h>
> >>>    #include <linux/perf_event.h>
> >>>    #include <linux/percpu-defs.h>
> >>>    #include <linux/slab.h>
> >>> @@ -177,6 +179,26 @@ static void etm_free_aux(void *data)
> >>>        schedule_work(&event_data->work);
> >>>    }
> >>>
> >>> +static struct coresight_device *etm_drv_config_sync(struct perf_event *event)
> >>
> >> minor nit: The name doesn't quite imply what we do here. Did you mean
> >> s/sync/sink ?
> >>
> >
> > I chose "sync" with "synchronisation" in mind.  I tried to keep things
> > generic since we could potentially use the same interface to convey
> > complex PMU configuration.  Arguably we could go with "sink" for now
> > and change it to "sync" in the future - I'm not strongly opinionated
> > on that part.
>
> Ok. I thought we were trying to grab the sink information from the event
> drv_config, hence something that implies that would be slightly more
> reader friendly. Again, I am not too keen on it.
>
> >
> >>> +{
> >>> +     struct coresight_device *sink = NULL;
> >>> +     struct pmu_drv_config *drv_config = perf_event_get_drv_config(event);
> >>> +
> >>> +     /*
> >>> +      * Make sure we don't race with perf_drv_config_replace() in
> >>> +      * kernel/events/core.c.
> >>> +      */
> >>> +     raw_spin_lock(&drv_config->lock);
> >>> +
> >>> +     /* Copy what we got from user space if applicable. */
> >>> +     if (drv_config->config)
> >>> +             sink = drv_config->config;
> >>> +
> >>> +     raw_spin_unlock(&drv_config->lock);
> >>> +
> >>> +     return sink;
> >>> +}
> >>> +
> >>>    static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>>                           int nr_pages, bool overwrite)
> >>>    {
> >>> @@ -190,18 +212,11 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>>                return NULL;
> >>>        INIT_WORK(&event_data->work, free_event_data);
> >>>
> >>> -     /*
> >>> -      * In theory nothing prevent tracers in a trace session from being
> >>> -      * associated with different sinks, nor having a sink per tracer.  But
> >>> -      * until we have HW with this kind of topology we need to assume tracers
> >>> -      * in a trace session are using the same sink.  Therefore go through
> >>> -      * the coresight bus and pick the first enabled sink.
> >>> -      *
> >>> -      * When operated from sysFS users are responsible to enable the sink
> >>> -      * while from perf, the perf tools will do it based on the choice made
> >>> -      * on the cmd line.  As such the "enable_sink" flag in sysFS is reset.
> >>> -      */
> >>> -     sink = coresight_get_enabled_sink(true);
> >>> +     /* First get the sink config from user space. */
> >>> +     sink = etm_drv_config_sync(event);
> >>> +     if (!sink)
> >>> +             sink = coresight_get_enabled_sink(true);
> >>> +
> >>>        if (!sink || !sink_ops(sink)->alloc_buffer)
> >>>                goto err;
> >>>
> >>> @@ -454,6 +469,40 @@ static void etm_addr_filters_sync(struct perf_event *event)
> >>>        filters->nr_filters = i;
> >>>    }
> >>>
> >>> +static int etm_drv_config_find_sink(struct device *dev, void *data)
> >>> +{
> >>> +     struct amba_device *adev = to_amba_device(dev->parent);
> >>> +     struct resource *res = &adev->res;
> >>> +     u64 value = *((u64 *)data);
> >>> +
> >>> +     /*
> >>> +      * The HW mapping of a component is unique.  If the value we've been
> >>> +      * given matches the component's start address, then we must have found
> >>> +      * the device we are looking for.
> >>> +      */
> >>
> >> To be frank, I don't quite like the idea of passing the base address of the
> >> component as the key to locate a device, (even though that is unique and readily
> >> available). I would rather prefer a programmable way to map the keys to the
> >> "sink" devices, which works platform agnostic (e.g, ACPI support, where the base
> >> address is not obvious from the name). Also if we decide to use a platform
> >> agnostic naming scheme, it becomes even more complex.
> >
> > This mechanism doesn't rely on the naming scheme - it exploits the
> > "resource" interface exported for each amba device [1].  As such
> > whether the component is discovered using ACPI or DT, we end up on the
> > same amba bus and using the same interface.
> >
> > [1]. https://elixir.bootlin.com/linux/latest/source/drivers/amba/bus.c#L128
>
> Ok. The only problem with this approach would be if the devices doesn't appear
> on the AMBA bus (btw, which is not true for the existing IPs).

This is news to me and definitely requires further attention.  I'd
rather come up with a design that account for those right away than
overhaul the whole thing in a year from now.

>
> >
> >>
> >> We could assign a static "id/key" exported either via the device sysfs dir or
> >> the "pmu" dir. I prefer the latter.
> >
> > Not sure what you mean by "pmu" directory - would you mind expanding
> > on that?  Using sysfs would be quite easy but I am reluctant to create
> > a new id/key mechanism and introduce another entry when we have the
> > component address that is unique and already available in the amba
> > directory structure.
>
> We could add another directory under :
>
> /sys/bus/event_source/devices/<PMU>/
>                                 \_ events/
>                                 \_ format/
> say :
>                 \_ drv_config/
>                 Or
>                 \_ sinks/
>
> and list the sinks, eg:
> # cd $sysfs_pmu_dir/sinks
> # cat <name_of_the_sink>
> ID_of_the_sink

There is merit to this idea.  Thinking along those lines simply using
"sinks" is probably a better approach since we don't yet know how PMU
configuration will unfold.  I'm also wondering if we have to
explicitly list the ID of the sink.  The ID itself should be fetch
from the device specific entries in sysfs like the "resource" property
of sinks that show up on the AMBA bus.  Sinks that don't show up on
the AMBA bus will likely have a "reg" property of something similar
and that is where the ID should be taken from.

Lastly it may be tricky to add a new directory structure to the PMU
entry as it is generic for all PMUs in the system.  But that is up for
investigation and I will look into doing something like that.

>
> Btw, I am always inclined to using some bits off one of the "config" fields
> ("config1" or "config2") for the sink configuration. But I understand that
> you have explored that avenue and chose this approach as we have further
> configurations required for complex ETM settings.

I've had a good conversation with myself over the holidays on that
topic (hence the late-ish reply).  My original approach was to use the
same ioctl() mechanism to do sink selection and complex PMU
configuration.  But as Greg confirmed with his comment on the previous
patch doing so requires to iron out how the latter will be done on ARM
(and probably on Intel too), something that is a different deal
altogether.

Taking a step back and forgetting about complex PMU configuration for
a minute, the ID of a sink can easily be stuffed in one of the
"config" fields of the perf_event_attr structure, something I alluded
to in the patchset's cover letter.  That way we can move along with
this feature and leave the discussion on complex PMU configuration for
another day.

I'll do a respin with the above in mind.

A happy new year to you,
Mathieu

>
> Cheers
> Suzuki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection
  2019-01-07 18:18         ` Mathieu Poirier
@ 2019-01-09 14:55           ` Suzuki K Poulose
  0 siblings, 0 replies; 16+ messages in thread
From: Suzuki K Poulose @ 2019-01-09 14:55 UTC (permalink / raw)
  To: mathieu.poirier
  Cc: acme, peterz, gregkh, mingo, tglx, alexander.shishkin,
	schwidefsky, heiko.carstens, will.deacon, mark.rutland, jolsa,
	namhyung, adrian.hunter, ast, hpa, Suzuki.Poulose, linux-s390,
	linux-kernel, linux-arm-kernel

Hi Mathieu,

On 07/01/2019 18:18, Mathieu Poirier wrote:
> On Wed, 19 Dec 2018 at 02:40, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>>
>>
>> On 18/12/2018 17:34, Mathieu Poirier wrote:
>>> Good day Suzuki,
>>>
>>> On Tue, 18 Dec 2018 at 07:14, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:

>>>> We could assign a static "id/key" exported either via the device sysfs dir or
>>>> the "pmu" dir. I prefer the latter.
>>>
>>> Not sure what you mean by "pmu" directory - would you mind expanding
>>> on that?  Using sysfs would be quite easy but I am reluctant to create
>>> a new id/key mechanism and introduce another entry when we have the
>>> component address that is unique and already available in the amba
>>> directory structure.
>>
>> We could add another directory under :
>>
>> /sys/bus/event_source/devices/<PMU>/
>>                                  \_ events/
>>                                  \_ format/
>> say :
>>                  \_ drv_config/
>>                  Or
>>                  \_ sinks/
>>
>> and list the sinks, eg:
>> # cd $sysfs_pmu_dir/sinks
>> # cat <name_of_the_sink>
>> ID_of_the_sink
> 
> There is merit to this idea.  Thinking along those lines simply using
> "sinks" is probably a better approach since we don't yet know how PMU
> configuration will unfold.  I'm also wondering if we have to
> explicitly list the ID of the sink.  The ID itself should be fetch
> from the device specific entries in sysfs like the "resource" property
> of sinks that show up on the AMBA bus.  Sinks that don't show up on
> the AMBA bus will likely have a "reg" property of something similar
> and that is where the ID should be taken from.

I would recommend this be done by the PMU driver and expose it, rather
than specifying what is expected. If we keep it simple like an integer,
which is then mapped to the sink-device, we could save some bits in the
config field for further use and a the complex set of rules for the ID.

> 
> Lastly it may be tricky to add a new directory structure to the PMU
> entry as it is generic for all PMUs in the system.  But that is up for
> investigation and I will look into doing something like that.

I don't think this is particularly problematic. The PMU driver can provide
a list of attribute_groups which should eventually appear in the device
directory under the "bus/event_source/devices/<PMU>/". And that provides
a central place for finding the SINK id for all the available sinks, rather
than scanning for a sink under all the buses (e.g, platform, amba) wherever
that might show up.

> 
>>
>> Btw, I am always inclined to using some bits off one of the "config" fields
>> ("config1" or "config2") for the sink configuration. But I understand that
>> you have explored that avenue and chose this approach as we have further
>> configurations required for complex ETM settings.
> 
> I've had a good conversation with myself over the holidays on that
> topic (hence the late-ish reply).  My original approach was to use the
> same ioctl() mechanism to do sink selection and complex PMU
> configuration.  But as Greg confirmed with his comment on the previous
> patch doing so requires to iron out how the latter will be done on ARM
> (and probably on Intel too), something that is a different deal
> altogether.
> 
> Taking a step back and forgetting about complex PMU configuration for
> a minute, the ID of a sink can easily be stuffed in one of the
> "config" fields of the perf_event_attr structure, something I alluded
> to in the patchset's cover letter.  That way we can move along with
> this feature and leave the discussion on complex PMU configuration for
> another day.

Cool ! I completely agree with this.

> 
> I'll do a respin with the above in mind.
> 
> A happy new year to you,

You too.

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-01-09 14:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-17 17:21 [RESEND PATCH v5 0/6] perf: Add ioctl for PMU driver configuration Mathieu Poirier
2018-12-17 17:21 ` [RESEND PATCH v5 1/6] perf: Introduce ioctl to communicate driver configuration to kernel Mathieu Poirier
2018-12-19  8:29   ` Greg KH
2018-12-17 17:21 ` [RESEND PATCH v5 2/6] perf/core: Use " Mathieu Poirier
2018-12-19  8:31   ` Greg KH
2018-12-17 17:21 ` [RESEND PATCH v5 3/6] perf/aux: Make perf_event accessible to setup_aux() Mathieu Poirier
2018-12-17 17:21 ` [RESEND PATCH v5 4/6] coresight: Use PMU driver configuration for sink selection Mathieu Poirier
2018-12-18 14:14   ` Suzuki K Poulose
2018-12-18 15:21     ` Alexander Shishkin
2018-12-18 18:20       ` Mathieu Poirier
2018-12-18 17:34     ` Mathieu Poirier
2018-12-19  9:40       ` Suzuki K Poulose
2019-01-07 18:18         ` Mathieu Poirier
2019-01-09 14:55           ` Suzuki K Poulose
2018-12-17 17:21 ` [RESEND PATCH v5 5/6] perf tools: Make perf_evsel accessible to PMU driver configuration code Mathieu Poirier
2018-12-17 17:21 ` [RESEND PATCH v5 6/6] perf tools: Use ioctl function to send sink information to kernel Mathieu Poirier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).