All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 0/4] coresight: Add ETR-PERF polling.
@ 2021-07-13 12:15 Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 1/4] coresight: tmc-etr: Use handle->head from perf_output_handle directly Daniel Kiss
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Daniel Kiss @ 2021-07-13 12:15 UTC (permalink / raw)
  To: coresight
  Cc: denik, leo.yan, linux-arm-kernel, mathieu.poirier, mike.leach,
	suzuki.poulose, Daniel Kiss

This series adds a feature to ETR-PERF that sync the ETR buffer to perf
periodically. This is really handy when the system wide trace is used
because in this case the perf won't sync during the trace. In a per-thread
setup the traced program might not go to the kernel frequvently enought
to collect trace. Polling helps in both usecases. Can be used with strobing.
Tuning polling period is challanging, I'm working on an additional patch
that adds some metrics to help tune the polling period.

Changes from v1:
 * tmc-etr: Advance buffer pointer in sync buffer refactored to use the buffer directly
 * New config option to enable polling.
 * Code of polling is simplified based on the commments.
 * Fix event allocation issue by changing to GFP_ATOMIC.

Daniel Kiss (4):
  coresight: tmc-etr: Use handle->head from perf_output_handle directly.
  coresight: tmc-etr: Track perf handler.
  coresight: etm-perf: Export etm_event_cpu_path.
  coresight: Add ETR-PERF polling.

 .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
 drivers/hwtracing/coresight/Kconfig           |  12 +
 drivers/hwtracing/coresight/Makefile          |   1 +
 .../hwtracing/coresight/coresight-etm-perf.c  |  10 +-
 .../hwtracing/coresight/coresight-etm-perf.h  |   1 +
 .../coresight/coresight-etr-perf-polling.c    | 275 ++++++++++++++++++
 .../coresight/coresight-etr-perf-polling.h    |  38 +++
 .../hwtracing/coresight/coresight-tmc-core.c  |   4 +
 .../hwtracing/coresight/coresight-tmc-etr.c   |  32 +-
 drivers/hwtracing/coresight/coresight-tmc.h   |   2 +
 10 files changed, 372 insertions(+), 11 deletions(-)
 create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
 create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h

-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 1/4] coresight: tmc-etr: Use handle->head from perf_output_handle directly.
  2021-07-13 12:15 [PATCHv2 0/4] coresight: Add ETR-PERF polling Daniel Kiss
@ 2021-07-13 12:15 ` Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 2/4] coresight: tmc-etr: Track perf handler Daniel Kiss
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Daniel Kiss @ 2021-07-13 12:15 UTC (permalink / raw)
  To: coresight
  Cc: denik, leo.yan, linux-arm-kernel, mathieu.poirier, mike.leach,
	suzuki.poulose, Daniel Kiss

Polling might call update multiple times and the cached buffer head will
be out of sync without advancing it. Using the head directly from the
perf_output_handle solves this problem.

Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index acdb59e0e6614..589bb2d56e802 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -32,7 +32,6 @@ struct etr_flat_buf {
  * @etr_buf		- Actual buffer used by the ETR
  * @pid			- The PID this etr_perf_buffer belongs to.
  * @snaphost		- Perf session mode
- * @head		- handle->head at the beginning of the session.
  * @nr_pages		- Number of pages in the ring buffer.
  * @pages		- Array of Pages in the ring buffer.
  */
@@ -41,7 +40,6 @@ struct etr_perf_buffer {
 	struct etr_buf		*etr_buf;
 	pid_t			pid;
 	bool			snapshot;
-	unsigned long		head;
 	int			nr_pages;
 	void			**pages;
 };
@@ -1436,17 +1434,18 @@ static void tmc_free_etr_buffer(void *config)
  * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
  * buffer to the perf ring buffer.
  */
-static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf,
+static void tmc_etr_sync_perf_buffer(struct perf_output_handle *handle,
+				     struct etr_perf_buffer *etr_perf,
 				     unsigned long src_offset,
 				     unsigned long to_copy)
 {
 	long bytes;
 	long pg_idx, pg_offset;
-	unsigned long head = etr_perf->head;
+	unsigned long head;
 	char **dst_pages, *src_buf;
 	struct etr_buf *etr_buf = etr_perf->etr_buf;
 
-	head = etr_perf->head;
+	head = PERF_IDX2OFF(handle->head, etr_perf);
 	pg_idx = head >> PAGE_SHIFT;
 	pg_offset = head & (PAGE_SIZE - 1);
 	dst_pages = (char **)etr_perf->pages;
@@ -1553,7 +1552,7 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	/* Insert barrier packets at the beginning, if there was an overflow */
 	if (lost)
 		tmc_etr_buf_insert_barrier_packet(etr_buf, offset);
-	tmc_etr_sync_perf_buffer(etr_perf, offset, size);
+	tmc_etr_sync_perf_buffer(handle, etr_perf, offset, size);
 
 	/*
 	 * In snapshot mode we simply increment the head by the number of byte
@@ -1605,8 +1604,6 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
 		goto unlock_out;
 	}
 
-	etr_perf->head = PERF_IDX2OFF(handle->head, etr_perf);
-
 	/*
 	 * No HW configuration is needed if the sink is already in
 	 * use for this session.
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCHv2 2/4] coresight: tmc-etr: Track perf handler.
  2021-07-13 12:15 [PATCHv2 0/4] coresight: Add ETR-PERF polling Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 1/4] coresight: tmc-etr: Use handle->head from perf_output_handle directly Daniel Kiss
@ 2021-07-13 12:15 ` Daniel Kiss
  2021-08-25 17:02   ` Mathieu Poirier
  2021-07-13 12:15 ` [PATCHv2 3/4] coresight: etm-perf: Export etm_event_cpu_path Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 4/4] coresight: Add ETR-PERF polling Daniel Kiss
  3 siblings, 1 reply; 11+ messages in thread
From: Daniel Kiss @ 2021-07-13 12:15 UTC (permalink / raw)
  To: coresight
  Cc: denik, leo.yan, linux-arm-kernel, mathieu.poirier, mike.leach,
	suzuki.poulose, Daniel Kiss, Branislav Rankov

Keep track of the perf handler that is registred by the first tracer.
This will be used by the update call from polling.

Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 6 ++++--
 drivers/hwtracing/coresight/coresight-tmc.h     | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 589bb2d56e802..55c9b5fd9f832 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1503,8 +1503,8 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 
-	/* Don't do anything if another tracer is using this sink */
-	if (atomic_read(csdev->refcnt) != 1) {
+	/* Serve only the tracer with the leading perf handler */
+	if (drvdata->perf_handle != handle) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
 		goto out;
 	}
@@ -1619,6 +1619,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
 		drvdata->pid = pid;
 		drvdata->mode = CS_MODE_PERF;
 		drvdata->perf_buf = etr_perf->etr_buf;
+		drvdata->perf_handle = handle;
 		atomic_inc(csdev->refcnt);
 	}
 
@@ -1666,6 +1667,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
 	drvdata->mode = CS_MODE_DISABLED;
 	/* Reset perf specific data */
 	drvdata->perf_buf = NULL;
+	drvdata->perf_handle = NULL;
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index b91ec7dde7bc9..81583ffb973dc 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -184,6 +184,7 @@ struct etr_buf {
  * @idr_mutex:	Access serialisation for idr.
  * @sysfs_buf:	SYSFS buffer for ETR.
  * @perf_buf:	PERF buffer for ETR.
+ * @perf_handle: PERF handle for ETR.
  */
 struct tmc_drvdata {
 	void __iomem		*base;
@@ -207,6 +208,7 @@ struct tmc_drvdata {
 	struct mutex		idr_mutex;
 	struct etr_buf		*sysfs_buf;
 	struct etr_buf		*perf_buf;
+	struct perf_output_handle *perf_handle;
 };
 
 struct etr_buf_operations {
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCHv2 3/4] coresight: etm-perf: Export etm_event_cpu_path.
  2021-07-13 12:15 [PATCHv2 0/4] coresight: Add ETR-PERF polling Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 1/4] coresight: tmc-etr: Use handle->head from perf_output_handle directly Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 2/4] coresight: tmc-etr: Track perf handler Daniel Kiss
@ 2021-07-13 12:15 ` Daniel Kiss
  2021-07-13 12:15 ` [PATCHv2 4/4] coresight: Add ETR-PERF polling Daniel Kiss
  3 siblings, 0 replies; 11+ messages in thread
From: Daniel Kiss @ 2021-07-13 12:15 UTC (permalink / raw)
  To: coresight
  Cc: denik, leo.yan, linux-arm-kernel, mathieu.poirier, mike.leach,
	suzuki.poulose, Daniel Kiss, Branislav Rankov

Polling need to access the etm_event_cpu_path.

Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 2 +-
 drivers/hwtracing/coresight/coresight-etm-perf.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 6f398377fec9e..a3f4c07f5bf8b 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -117,7 +117,7 @@ etm_event_cpu_path_ptr(struct etm_event_data *data, int cpu)
 	return per_cpu_ptr(data->path, cpu);
 }
 
-static inline struct list_head *
+struct list_head *
 etm_event_cpu_path(struct etm_event_data *data, int cpu)
 {
 	return *etm_event_cpu_path_ptr(data, cpu);
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
index 3e4f2ad5e193d..6d02abbad1b0f 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.h
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
@@ -69,6 +69,7 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 		return data->snk_config;
 	return NULL;
 }
+struct list_head *etm_event_cpu_path(struct etm_event_data *data, int cpu);
 #else
 static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
 { return -EINVAL; }
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCHv2 4/4] coresight: Add ETR-PERF polling.
  2021-07-13 12:15 [PATCHv2 0/4] coresight: Add ETR-PERF polling Daniel Kiss
                   ` (2 preceding siblings ...)
  2021-07-13 12:15 ` [PATCHv2 3/4] coresight: etm-perf: Export etm_event_cpu_path Daniel Kiss
@ 2021-07-13 12:15 ` Daniel Kiss
  2021-08-26 16:41   ` Mathieu Poirier
  2021-08-26 17:32   ` Mathieu Poirier
  3 siblings, 2 replies; 11+ messages in thread
From: Daniel Kiss @ 2021-07-13 12:15 UTC (permalink / raw)
  To: coresight
  Cc: denik, leo.yan, linux-arm-kernel, mathieu.poirier, mike.leach,
	suzuki.poulose, Daniel Kiss, Branislav Rankov, Denis Nikitin

ETR might fill up the buffer sooner than an event makes perf to trigger
the synchronisation especially in system wide trace. Polling runs
periodically to sync the ETR buffer. Period is configurable via sysfs,
disabled by default.

Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
Tested-by: Denis Nikitin <denik@chromium.org>
---
 .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
 drivers/hwtracing/coresight/Kconfig           |  12 +
 drivers/hwtracing/coresight/Makefile          |   1 +
 .../hwtracing/coresight/coresight-etm-perf.c  |   8 +
 .../coresight/coresight-etr-perf-polling.c    | 275 ++++++++++++++++++
 .../coresight/coresight-etr-perf-polling.h    |  38 +++
 .../hwtracing/coresight/coresight-tmc-core.c  |   4 +
 .../hwtracing/coresight/coresight-tmc-etr.c   |  13 +
 8 files changed, 359 insertions(+)
 create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
 create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h

diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
index 6aa527296c710..3b411e8a6f417 100644
--- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
@@ -91,3 +91,11 @@ Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
 Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
 		mode. Writable only for TMC-ETR configurations. The value
 		should be aligned to the kernel pagesize.
+
+What:		/sys/bus/coresight/devices/<memory_map>.tmc/polling/period
+Date:		July 2021
+KernelVersion:	5.14
+Contact:	Daniel Kiss <daniel.kiss@arm.com>
+Description:	(RW) Time in milliseconds when the TMC-ETR is synced.
+		Default value is 0, means the feature is disabled.
+		Writable only for TMC-ETR configurations.
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index 84530fd80998c..4e91fb98849f4 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -44,6 +44,18 @@ config CORESIGHT_LINK_AND_SINK_TMC
 	  To compile this driver as a module, choose M here: the
 	  module will be called coresight-tmc.
 
+
+config CORESIGHT_ETR_PERF_POLL
+	bool "Coresight ETR Perf Polling"
+
+	depends on CORESIGHT_LINK_AND_SINK_TMC
+	help
+	  Enable the support for software periodic synchronization of the ETR buffer.
+	  ETR might fill up the buffer sooner than an event makes perf to trigger
+	  the synchronization especially in system wide trace. Polling runs
+	  periodically to sync the ETR buffer. Period is configurable via sysfs,
+	  disabled by default.
+
 config CORESIGHT_CATU
 	tristate "Coresight Address Translation Unit (CATU) driver"
 	depends on CORESIGHT_LINK_AND_SINK_TMC
diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
index d60816509755c..6baac328eea87 100644
--- a/drivers/hwtracing/coresight/Makefile
+++ b/drivers/hwtracing/coresight/Makefile
@@ -5,6 +5,7 @@
 obj-$(CONFIG_CORESIGHT) += coresight.o
 coresight-y := coresight-core.o  coresight-etm-perf.o coresight-platform.o \
 		coresight-sysfs.o
+coresight-$(CONFIG_CORESIGHT_ETR_PERF_POLL) += coresight-etr-perf-polling.o
 obj-$(CONFIG_CORESIGHT_LINK_AND_SINK_TMC) += coresight-tmc.o
 coresight-tmc-y := coresight-tmc-core.o coresight-tmc-etf.o \
 		      coresight-tmc-etr.o
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index a3f4c07f5bf8b..3095840a567c4 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -19,6 +19,7 @@
 #include <linux/workqueue.h>
 
 #include "coresight-etm-perf.h"
+#include "coresight-etr-perf-polling.h"
 #include "coresight-priv.h"
 
 static struct pmu etm_pmu;
@@ -438,6 +439,8 @@ static void etm_event_start(struct perf_event *event, int flags)
 	/* Tell the perf core the event is alive */
 	event->hw.state = 0;
 
+	etr_perf_polling_event_start(event, event_data, handle);
+
 	/* Finally enable the tracer */
 	if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
 		goto fail_disable_path;
@@ -497,6 +500,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	if (!sink)
 		return;
 
+	etr_perf_polling_event_stop(event, event_data);
+
 	/* stop tracer */
 	source_ops(csdev)->disable(csdev, event);
 
@@ -741,6 +746,8 @@ int __init etm_perf_init(void)
 	etm_pmu.addr_filters_validate	= etm_addr_filters_validate;
 	etm_pmu.nr_addr_filters		= ETM_ADDR_CMP_MAX;
 
+	etr_perf_polling_init();
+
 	ret = perf_pmu_register(&etm_pmu, CORESIGHT_ETM_PMU_NAME, -1);
 	if (ret == 0)
 		etm_perf_up = true;
@@ -750,5 +757,6 @@ int __init etm_perf_init(void)
 
 void __exit etm_perf_exit(void)
 {
+	etr_perf_polling_exit();
 	perf_pmu_unregister(&etm_pmu);
 }
diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.c b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
new file mode 100644
index 0000000000000..87e6bc42a62de
--- /dev/null
+++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright(C) 2021 Arm Limited. All rights reserved.
+ * Author: Daniel Kiss <daniel.kiss@arm.com>
+ */
+
+#include <linux/coresight.h>
+#include <linux/coresight-pmu.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/percpu-defs.h>
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+#include <linux/stringhash.h>
+#include <linux/types.h>
+#include <linux/workqueue.h>
+
+#include "coresight-etr-perf-polling.h"
+#include "coresight-priv.h"
+#include "coresight-tmc.h"
+
+struct polling_event_list {
+	struct perf_event *perf_event;
+	struct etm_event_data *etm_event_data;
+	struct perf_output_handle *ctx_handle;
+	void (*tmc_etr_reset_hw)(struct tmc_drvdata *data);
+	struct list_head list;
+};
+
+struct polling {
+	int cpu;
+	struct polling_event_list *polled_event;
+	struct delayed_work delayed_work;
+};
+
+static atomic_t period;
+static spinlock_t spinlock_re;
+static struct list_head registered_events;
+
+static DEFINE_PER_CPU(struct polling, polling);
+
+static ssize_t period_show(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	int temp;
+
+	temp = atomic_read(&period);
+	return sprintf(buf, "%i\n", temp);
+}
+
+static ssize_t period_store(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t count)
+{
+	int temp = 0;
+
+	if (!kstrtoint(buf, 10, &temp) && (temp >= 0))
+		atomic_set(&period, temp);
+	return count;
+}
+
+static DEVICE_ATTR_RW(period);
+
+static struct attribute *coresight_tmc_polling_attrs[] = {
+	&dev_attr_period.attr,
+	NULL,
+};
+const struct attribute_group coresight_tmc_polling_group = {
+	.attrs = coresight_tmc_polling_attrs,
+	.name = "polling",
+};
+EXPORT_SYMBOL_GPL(coresight_tmc_polling_group);
+
+static inline void polling_sched_worker(struct polling *p)
+{
+	int tickrate = atomic_read(&period);
+
+	if (tickrate > 0)
+		schedule_delayed_work_on(p->cpu, &p->delayed_work,
+					 msecs_to_jiffies(tickrate));
+}
+
+static inline bool is_etr_related(struct etm_event_data *etm_event_data, int cpu)
+{
+	struct list_head *path;
+	struct coresight_device *sink;
+	struct tmc_drvdata *drvdata;
+
+	path = etm_event_cpu_path(etm_event_data, cpu);
+	if (WARN_ON(!path))
+		return false;
+	sink = coresight_get_sink(path);
+	if (WARN_ON(!sink))
+		return false;
+	drvdata = dev_get_drvdata(sink->dev.parent);
+	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
+		return false;
+	return true;
+}
+
+/*
+ * Adds the event to the polled events list.
+ */
+void etr_perf_polling_event_start(struct perf_event *event,
+				  struct etm_event_data *etm_event_data,
+				  struct perf_output_handle *ctx_handle)
+{
+	int cpu = smp_processor_id();
+	struct polling *p = per_cpu_ptr(&polling, cpu);
+	struct polling_event_list *element, *tmp;
+
+	if (!is_etr_related(etm_event_data, cpu))
+		return;
+
+	spin_lock(&spinlock_re);
+	list_for_each_entry_safe(element, tmp, &registered_events, list) {
+		if (element->ctx_handle == ctx_handle) {
+			element->perf_event = event;
+			element->etm_event_data = etm_event_data;
+			spin_unlock(&spinlock_re);
+			p->polled_event = element;
+			polling_sched_worker(p);
+			return;
+		}
+	}
+	spin_unlock(&spinlock_re);
+}
+
+/*
+ * Removes the event from the to be polled events list.
+ */
+void etr_perf_polling_event_stop(struct perf_event *event,
+				 struct etm_event_data *etm_event_data)
+{
+	int cpu = smp_processor_id();
+	struct polling *p = per_cpu_ptr(&polling, cpu);
+
+	if (!is_etr_related(etm_event_data, cpu))
+		return;
+
+	if (p->polled_event) {
+		struct polling_event_list *element = p->polled_event;
+
+		if (element->perf_event == event) {
+			p->polled_event = NULL;
+			element->perf_event = NULL;
+			element->etm_event_data = NULL;
+			cancel_delayed_work(&p->delayed_work);
+			return;
+		}
+	}
+}
+
+/*
+ * The polling worker is a workqueue job which is periodically
+ * woken up to update the perf aux buffer from the etr shrink.
+ */
+static void etr_perf_polling_worker(struct work_struct *work)
+{
+	unsigned long flags;
+	struct list_head *path;
+	struct coresight_device *sink;
+	int size;
+	int cpu = smp_processor_id();
+	struct polling *p = per_cpu_ptr(&polling, cpu);
+
+	if (!atomic_read(&period))
+		return;
+
+	if (!p->polled_event)
+		return;
+	/*
+	 * Scheduling would do the same from the perf hooks,
+	 * this should be done in one go.
+	 */
+	local_irq_save(flags);
+
+	polling_sched_worker(p);
+
+	path = etm_event_cpu_path(p->polled_event->etm_event_data, cpu);
+	sink = coresight_get_sink(path);
+	size = sink_ops(sink)->update_buffer(
+		sink, p->polled_event->ctx_handle,
+		p->polled_event->etm_event_data->snk_config);
+
+	/*
+	 * Restart the trace.
+	 */
+	if (p->polled_event->tmc_etr_reset_hw)
+		p->polled_event->tmc_etr_reset_hw(dev_get_drvdata(sink->dev.parent));
+
+	WARN_ON(size < 0);
+	if (size > 0) {
+		struct etm_event_data *new_event_data;
+
+		perf_aux_output_end(p->polled_event->ctx_handle, size);
+		new_event_data = perf_aux_output_begin(
+			p->polled_event->ctx_handle,
+			p->polled_event->perf_event);
+		if (WARN_ON(new_event_data == NULL)) {
+			local_irq_restore(flags);
+			return;
+		}
+
+		p->polled_event->etm_event_data = new_event_data;
+		WARN_ON(new_event_data->snk_config !=
+			p->polled_event->etm_event_data->snk_config);
+	}
+
+	local_irq_restore(flags);
+}
+
+void etr_perf_polling_handle_register(struct perf_output_handle *handle,
+				      void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata))
+{
+	struct polling_event_list *element;
+
+	element = kmalloc(sizeof(*element), GFP_ATOMIC);
+	if (WARN_ON(!element))
+		return;
+	memset(element, 0, sizeof(*element));
+	element->ctx_handle = handle;
+	element->tmc_etr_reset_hw = tmc_etr_reset_hw;
+	spin_lock(&spinlock_re);
+	list_add(&element->list, &registered_events);
+	spin_unlock(&spinlock_re);
+}
+EXPORT_SYMBOL_GPL(etr_perf_polling_handle_register);
+
+void etr_perf_polling_handle_deregister(struct perf_output_handle *handle)
+{
+	struct polling_event_list *element, *tmp;
+
+	spin_lock(&spinlock_re);
+	list_for_each_entry_safe(element, tmp, &registered_events, list) {
+		if (element->ctx_handle == handle) {
+			list_del(&element->list);
+			spin_unlock(&spinlock_re);
+			kfree(element);
+			return;
+		}
+	}
+	spin_unlock(&spinlock_re);
+}
+EXPORT_SYMBOL_GPL(etr_perf_polling_handle_deregister);
+
+void etr_perf_polling_init(void)
+{
+	int cpu;
+
+	spin_lock_init(&spinlock_re);
+	INIT_LIST_HEAD(&registered_events);
+	atomic_set(&period, 0);
+	for_each_possible_cpu(cpu) {
+		struct polling *p = per_cpu_ptr(&polling, cpu);
+
+		p->cpu = cpu;
+		p->polled_event = NULL;
+		INIT_DELAYED_WORK(&p->delayed_work, etr_perf_polling_worker);
+	}
+}
+
+void etr_perf_polling_exit(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		struct polling *p = per_cpu_ptr(&polling, cpu);
+
+		cancel_delayed_work_sync(&p->delayed_work);
+		WARN_ON(p->polled_event);
+	}
+	WARN_ON(!list_empty(&registered_events));
+}
diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.h b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
new file mode 100644
index 0000000000000..d47b4424594e6
--- /dev/null
+++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright(C) 2021 Arm Limited. All rights reserved.
+ * Author: Daniel Kiss <daniel.kiss@arm.com>
+ */
+
+#ifndef _CORESIGHT_ETM_PERF_POLLING_H
+#define _CORESIGHT_ETM_PERF_POLLING_H
+
+#include <linux/coresight.h>
+#include <linux/perf_event.h>
+#include "coresight-etm-perf.h"
+#include "coresight-tmc.h"
+
+#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
+void etr_perf_polling_init(void);
+void etr_perf_polling_exit(void);
+void etr_perf_polling_handle_register(struct perf_output_handle *handle,
+				      void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata));
+void etr_perf_polling_handle_deregister(struct perf_output_handle *handle);
+void etr_perf_polling_event_start(struct perf_event *event,
+				  struct etm_event_data *etm_event_data,
+				  struct perf_output_handle *ctx_handle);
+void etr_perf_polling_event_stop(struct perf_event *event,
+				 struct etm_event_data *etm_event_data);
+
+extern const struct attribute_group coresight_tmc_polling_group;
+
+#else /* !CONFIG_CORESIGHT_ETR_PERF_POLL */
+#define etr_perf_polling_init()
+#define etr_perf_polling_exit()
+#define etr_perf_polling_handle_register(...)
+#define etr_perf_polling_handle_deregister(...)
+#define etr_perf_polling_event_start(...)
+#define etr_perf_polling_event_stop(...)
+#endif
+
+#endif
diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 74c6323d4d6ab..dbcdba162bd38 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -26,6 +26,7 @@
 
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
+#include "coresight-etr-perf-polling.h"
 
 DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
 DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
@@ -365,6 +366,9 @@ static const struct attribute_group coresight_tmc_mgmt_group = {
 static const struct attribute_group *coresight_tmc_groups[] = {
 	&coresight_tmc_group,
 	&coresight_tmc_mgmt_group,
+#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
+	&coresight_tmc_polling_group,
+#endif
 	NULL,
 };
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 55c9b5fd9f832..67cd4bdcda71b 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -16,6 +16,7 @@
 #include <linux/vmalloc.h>
 #include "coresight-catu.h"
 #include "coresight-etm-perf.h"
+#include "coresight-etr-perf-polling.h"
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
 
@@ -1137,6 +1138,16 @@ void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 	drvdata->etr_buf = NULL;
 }
 
+#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
+
+static void tmc_etr_reset_hw(struct tmc_drvdata *drvdata)
+{
+	__tmc_etr_disable_hw(drvdata);
+	__tmc_etr_enable_hw(drvdata);
+}
+
+#endif
+
 static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 {
 	int ret = 0;
@@ -1620,6 +1631,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
 		drvdata->mode = CS_MODE_PERF;
 		drvdata->perf_buf = etr_perf->etr_buf;
 		drvdata->perf_handle = handle;
+		etr_perf_polling_handle_register(handle, tmc_etr_reset_hw);
 		atomic_inc(csdev->refcnt);
 	}
 
@@ -1667,6 +1679,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
 	drvdata->mode = CS_MODE_DISABLED;
 	/* Reset perf specific data */
 	drvdata->perf_buf = NULL;
+	etr_perf_polling_handle_deregister(drvdata->perf_handle);
 	drvdata->perf_handle = NULL;
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/4] coresight: tmc-etr: Track perf handler.
  2021-07-13 12:15 ` [PATCHv2 2/4] coresight: tmc-etr: Track perf handler Daniel Kiss
@ 2021-08-25 17:02   ` Mathieu Poirier
  2021-08-25 19:09     ` Mathieu Poirier
  0 siblings, 1 reply; 11+ messages in thread
From: Mathieu Poirier @ 2021-08-25 17:02 UTC (permalink / raw)
  To: Daniel Kiss
  Cc: coresight, denik, leo.yan, linux-arm-kernel, mike.leach,
	suzuki.poulose, Branislav Rankov

Hi Daniel,

On Tue, Jul 13, 2021 at 02:15:30PM +0200, Daniel Kiss wrote:
> Keep track of the perf handler that is registred by the first tracer.
> This will be used by the update call from polling.
> 
> Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
> Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 6 ++++--
>  drivers/hwtracing/coresight/coresight-tmc.h     | 2 ++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 589bb2d56e802..55c9b5fd9f832 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -1503,8 +1503,8 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
>  
>  	spin_lock_irqsave(&drvdata->spinlock, flags);
>  
> -	/* Don't do anything if another tracer is using this sink */
> -	if (atomic_read(csdev->refcnt) != 1) {
> +	/* Serve only the tracer with the leading perf handler */
> +	if (drvdata->perf_handle != handle) {

In CPU wide trace scenarios the first CPU to enable a sink is not
guaranteed to be the same as the last CPU to use it.  As far as I understand the
above assumes the first and last CPUs to use a sink are the same.

>  		spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  		goto out;
>  	}
> @@ -1619,6 +1619,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
>  		drvdata->pid = pid;
>  		drvdata->mode = CS_MODE_PERF;
>  		drvdata->perf_buf = etr_perf->etr_buf;
> +		drvdata->perf_handle = handle;
>  		atomic_inc(csdev->refcnt);
>  	}
>  
> @@ -1666,6 +1667,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
>  	drvdata->mode = CS_MODE_DISABLED;
>  	/* Reset perf specific data */
>  	drvdata->perf_buf = NULL;
> +	drvdata->perf_handle = NULL;
>  
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index b91ec7dde7bc9..81583ffb973dc 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -184,6 +184,7 @@ struct etr_buf {
>   * @idr_mutex:	Access serialisation for idr.
>   * @sysfs_buf:	SYSFS buffer for ETR.
>   * @perf_buf:	PERF buffer for ETR.
> + * @perf_handle: PERF handle for ETR.
>   */
>  struct tmc_drvdata {
>  	void __iomem		*base;
> @@ -207,6 +208,7 @@ struct tmc_drvdata {
>  	struct mutex		idr_mutex;
>  	struct etr_buf		*sysfs_buf;
>  	struct etr_buf		*perf_buf;
> +	struct perf_output_handle *perf_handle;
>  };
>  
>  struct etr_buf_operations {
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/4] coresight: tmc-etr: Track perf handler.
  2021-08-25 17:02   ` Mathieu Poirier
@ 2021-08-25 19:09     ` Mathieu Poirier
  0 siblings, 0 replies; 11+ messages in thread
From: Mathieu Poirier @ 2021-08-25 19:09 UTC (permalink / raw)
  To: Daniel Kiss
  Cc: coresight, denik, leo.yan, linux-arm-kernel, mike.leach,
	suzuki.poulose, Branislav Rankov

On Wed, Aug 25, 2021 at 11:02:13AM -0600, Mathieu Poirier wrote:
> Hi Daniel,
> 
> On Tue, Jul 13, 2021 at 02:15:30PM +0200, Daniel Kiss wrote:
> > Keep track of the perf handler that is registred by the first tracer.
> > This will be used by the update call from polling.
> > 
> > Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
> > Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
> > ---
> >  drivers/hwtracing/coresight/coresight-tmc-etr.c | 6 ++++--
> >  drivers/hwtracing/coresight/coresight-tmc.h     | 2 ++
> >  2 files changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index 589bb2d56e802..55c9b5fd9f832 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -1503,8 +1503,8 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
> >  
> >  	spin_lock_irqsave(&drvdata->spinlock, flags);
> >  
> > -	/* Don't do anything if another tracer is using this sink */
> > -	if (atomic_read(csdev->refcnt) != 1) {
> > +	/* Serve only the tracer with the leading perf handler */
> > +	if (drvdata->perf_handle != handle) {
> 
> In CPU wide trace scenarios the first CPU to enable a sink is not
> guaranteed to be the same as the last CPU to use it.  As far as I understand the
> above assumes the first and last CPUs to use a sink are the same.

I want to take a little more time to ponder about the rest of this patchset.
More comments to come tomorrow.

Thanks,
Mathieu

> 
> >  		spin_unlock_irqrestore(&drvdata->spinlock, flags);
> >  		goto out;
> >  	}
> > @@ -1619,6 +1619,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
> >  		drvdata->pid = pid;
> >  		drvdata->mode = CS_MODE_PERF;
> >  		drvdata->perf_buf = etr_perf->etr_buf;
> > +		drvdata->perf_handle = handle;
> >  		atomic_inc(csdev->refcnt);
> >  	}
> >  
> > @@ -1666,6 +1667,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
> >  	drvdata->mode = CS_MODE_DISABLED;
> >  	/* Reset perf specific data */
> >  	drvdata->perf_buf = NULL;
> > +	drvdata->perf_handle = NULL;
> >  
> >  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
> >  
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> > index b91ec7dde7bc9..81583ffb973dc 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc.h
> > +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> > @@ -184,6 +184,7 @@ struct etr_buf {
> >   * @idr_mutex:	Access serialisation for idr.
> >   * @sysfs_buf:	SYSFS buffer for ETR.
> >   * @perf_buf:	PERF buffer for ETR.
> > + * @perf_handle: PERF handle for ETR.
> >   */
> >  struct tmc_drvdata {
> >  	void __iomem		*base;
> > @@ -207,6 +208,7 @@ struct tmc_drvdata {
> >  	struct mutex		idr_mutex;
> >  	struct etr_buf		*sysfs_buf;
> >  	struct etr_buf		*perf_buf;
> > +	struct perf_output_handle *perf_handle;
> >  };
> >  
> >  struct etr_buf_operations {
> > -- 
> > 2.25.1
> > 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 4/4] coresight: Add ETR-PERF polling.
  2021-07-13 12:15 ` [PATCHv2 4/4] coresight: Add ETR-PERF polling Daniel Kiss
@ 2021-08-26 16:41   ` Mathieu Poirier
  2021-08-26 17:32   ` Mathieu Poirier
  1 sibling, 0 replies; 11+ messages in thread
From: Mathieu Poirier @ 2021-08-26 16:41 UTC (permalink / raw)
  To: Daniel Kiss
  Cc: coresight, denik, leo.yan, linux-arm-kernel, mike.leach,
	suzuki.poulose, Branislav Rankov, Denis Nikitin

Good day,

On Tue, Jul 13, 2021 at 02:15:32PM +0200, Daniel Kiss wrote:
> ETR might fill up the buffer sooner than an event makes perf to trigger
> the synchronisation especially in system wide trace. Polling runs
> periodically to sync the ETR buffer. Period is configurable via sysfs,
> disabled by default.
> 
> Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
> Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
> Tested-by: Denis Nikitin <denik@chromium.org>
> ---
>  .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
>  drivers/hwtracing/coresight/Kconfig           |  12 +
>  drivers/hwtracing/coresight/Makefile          |   1 +
>  .../hwtracing/coresight/coresight-etm-perf.c  |   8 +
>  .../coresight/coresight-etr-perf-polling.c    | 275 ++++++++++++++++++
>  .../coresight/coresight-etr-perf-polling.h    |  38 +++
>  .../hwtracing/coresight/coresight-tmc-core.c  |   4 +
>  .../hwtracing/coresight/coresight-tmc-etr.c   |  13 +
>  8 files changed, 359 insertions(+)
>  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
>  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> index 6aa527296c710..3b411e8a6f417 100644
> --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> @@ -91,3 +91,11 @@ Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
>  Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
>  		mode. Writable only for TMC-ETR configurations. The value
>  		should be aligned to the kernel pagesize.
> +
> +What:		/sys/bus/coresight/devices/<memory_map>.tmc/polling/period
> +Date:		July 2021
> +KernelVersion:	5.14
> +Contact:	Daniel Kiss <daniel.kiss@arm.com>
> +Description:	(RW) Time in milliseconds when the TMC-ETR is synced.
> +		Default value is 0, means the feature is disabled.
> +		Writable only for TMC-ETR configurations.

The problem with the current implementation is that period is global to all
tracing sessions.  Moreover it seems possible to change the value of a period
while a session is ongoing, which is guaranteed to confuse users.

A better way to proceed is to make the period a perf configurable option the
same way timestamp and ccyacc are.  Look at the top of coresight-etm-perf.c for
examples.  This might be the last time we use configuration attributes, after
that we should start using Mike Leach's new system configuration feature.

> diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> index 84530fd80998c..4e91fb98849f4 100644
> --- a/drivers/hwtracing/coresight/Kconfig
> +++ b/drivers/hwtracing/coresight/Kconfig
> @@ -44,6 +44,18 @@ config CORESIGHT_LINK_AND_SINK_TMC
>  	  To compile this driver as a module, choose M here: the
>  	  module will be called coresight-tmc.
>  
> +
> +config CORESIGHT_ETR_PERF_POLL
> +	bool "Coresight ETR Perf Polling"
> +
> +	depends on CORESIGHT_LINK_AND_SINK_TMC
> +	help
> +	  Enable the support for software periodic synchronization of the ETR buffer.
> +	  ETR might fill up the buffer sooner than an event makes perf to trigger
> +	  the synchronization especially in system wide trace. Polling runs
> +	  periodically to sync the ETR buffer. Period is configurable via sysfs,
> +	  disabled by default.
> +

Polling should be part of the core and not a configurable option.

More comments to come...

>  config CORESIGHT_CATU
>  	tristate "Coresight Address Translation Unit (CATU) driver"
>  	depends on CORESIGHT_LINK_AND_SINK_TMC
> diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> index d60816509755c..6baac328eea87 100644
> --- a/drivers/hwtracing/coresight/Makefile
> +++ b/drivers/hwtracing/coresight/Makefile
> @@ -5,6 +5,7 @@
>  obj-$(CONFIG_CORESIGHT) += coresight.o
>  coresight-y := coresight-core.o  coresight-etm-perf.o coresight-platform.o \
>  		coresight-sysfs.o
> +coresight-$(CONFIG_CORESIGHT_ETR_PERF_POLL) += coresight-etr-perf-polling.o
>  obj-$(CONFIG_CORESIGHT_LINK_AND_SINK_TMC) += coresight-tmc.o
>  coresight-tmc-y := coresight-tmc-core.o coresight-tmc-etf.o \
>  		      coresight-tmc-etr.o
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index a3f4c07f5bf8b..3095840a567c4 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -19,6 +19,7 @@
>  #include <linux/workqueue.h>
>  
>  #include "coresight-etm-perf.h"
> +#include "coresight-etr-perf-polling.h"
>  #include "coresight-priv.h"
>  
>  static struct pmu etm_pmu;
> @@ -438,6 +439,8 @@ static void etm_event_start(struct perf_event *event, int flags)
>  	/* Tell the perf core the event is alive */
>  	event->hw.state = 0;
>  
> +	etr_perf_polling_event_start(event, event_data, handle);
> +
>  	/* Finally enable the tracer */
>  	if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
>  		goto fail_disable_path;
> @@ -497,6 +500,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	if (!sink)
>  		return;
>  
> +	etr_perf_polling_event_stop(event, event_data);
> +
>  	/* stop tracer */
>  	source_ops(csdev)->disable(csdev, event);
>  
> @@ -741,6 +746,8 @@ int __init etm_perf_init(void)
>  	etm_pmu.addr_filters_validate	= etm_addr_filters_validate;
>  	etm_pmu.nr_addr_filters		= ETM_ADDR_CMP_MAX;
>  
> +	etr_perf_polling_init();
> +
>  	ret = perf_pmu_register(&etm_pmu, CORESIGHT_ETM_PMU_NAME, -1);
>  	if (ret == 0)
>  		etm_perf_up = true;
> @@ -750,5 +757,6 @@ int __init etm_perf_init(void)
>  
>  void __exit etm_perf_exit(void)
>  {
> +	etr_perf_polling_exit();
>  	perf_pmu_unregister(&etm_pmu);
>  }
> diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.c b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> new file mode 100644
> index 0000000000000..87e6bc42a62de
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> @@ -0,0 +1,275 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright(C) 2021 Arm Limited. All rights reserved.
> + * Author: Daniel Kiss <daniel.kiss@arm.com>
> + */
> +
> +#include <linux/coresight.h>
> +#include <linux/coresight-pmu.h>
> +#include <linux/cpumask.h>
> +#include <linux/device.h>
> +#include <linux/init.h>
> +#include <linux/list.h>
> +#include <linux/percpu-defs.h>
> +#include <linux/perf_event.h>
> +#include <linux/slab.h>
> +#include <linux/stringhash.h>
> +#include <linux/types.h>
> +#include <linux/workqueue.h>
> +
> +#include "coresight-etr-perf-polling.h"
> +#include "coresight-priv.h"
> +#include "coresight-tmc.h"
> +
> +struct polling_event_list {
> +	struct perf_event *perf_event;
> +	struct etm_event_data *etm_event_data;
> +	struct perf_output_handle *ctx_handle;
> +	void (*tmc_etr_reset_hw)(struct tmc_drvdata *data);
> +	struct list_head list;
> +};
> +
> +struct polling {
> +	int cpu;
> +	struct polling_event_list *polled_event;
> +	struct delayed_work delayed_work;
> +};
> +
> +static atomic_t period;
> +static spinlock_t spinlock_re;
> +static struct list_head registered_events;
> +
> +static DEFINE_PER_CPU(struct polling, polling);
> +
> +static ssize_t period_show(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	int temp;
> +
> +	temp = atomic_read(&period);
> +	return sprintf(buf, "%i\n", temp);
> +}
> +
> +static ssize_t period_store(struct device *dev, struct device_attribute *attr,
> +			    const char *buf, size_t count)
> +{
> +	int temp = 0;
> +
> +	if (!kstrtoint(buf, 10, &temp) && (temp >= 0))
> +		atomic_set(&period, temp);
> +	return count;
> +}
> +
> +static DEVICE_ATTR_RW(period);
> +
> +static struct attribute *coresight_tmc_polling_attrs[] = {
> +	&dev_attr_period.attr,
> +	NULL,
> +};
> +const struct attribute_group coresight_tmc_polling_group = {
> +	.attrs = coresight_tmc_polling_attrs,
> +	.name = "polling",
> +};
> +EXPORT_SYMBOL_GPL(coresight_tmc_polling_group);
> +
> +static inline void polling_sched_worker(struct polling *p)
> +{
> +	int tickrate = atomic_read(&period);
> +
> +	if (tickrate > 0)
> +		schedule_delayed_work_on(p->cpu, &p->delayed_work,
> +					 msecs_to_jiffies(tickrate));
> +}
> +
> +static inline bool is_etr_related(struct etm_event_data *etm_event_data, int cpu)
> +{
> +	struct list_head *path;
> +	struct coresight_device *sink;
> +	struct tmc_drvdata *drvdata;
> +
> +	path = etm_event_cpu_path(etm_event_data, cpu);
> +	if (WARN_ON(!path))
> +		return false;
> +	sink = coresight_get_sink(path);
> +	if (WARN_ON(!sink))
> +		return false;
> +	drvdata = dev_get_drvdata(sink->dev.parent);
> +	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
> +		return false;
> +	return true;
> +}
> +
> +/*
> + * Adds the event to the polled events list.
> + */
> +void etr_perf_polling_event_start(struct perf_event *event,
> +				  struct etm_event_data *etm_event_data,
> +				  struct perf_output_handle *ctx_handle)
> +{
> +	int cpu = smp_processor_id();
> +	struct polling *p = per_cpu_ptr(&polling, cpu);
> +	struct polling_event_list *element, *tmp;
> +
> +	if (!is_etr_related(etm_event_data, cpu))
> +		return;
> +
> +	spin_lock(&spinlock_re);
> +	list_for_each_entry_safe(element, tmp, &registered_events, list) {
> +		if (element->ctx_handle == ctx_handle) {
> +			element->perf_event = event;
> +			element->etm_event_data = etm_event_data;
> +			spin_unlock(&spinlock_re);
> +			p->polled_event = element;
> +			polling_sched_worker(p);
> +			return;
> +		}
> +	}
> +	spin_unlock(&spinlock_re);
> +}
> +
> +/*
> + * Removes the event from the to be polled events list.
> + */
> +void etr_perf_polling_event_stop(struct perf_event *event,
> +				 struct etm_event_data *etm_event_data)
> +{
> +	int cpu = smp_processor_id();
> +	struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +	if (!is_etr_related(etm_event_data, cpu))
> +		return;
> +
> +	if (p->polled_event) {
> +		struct polling_event_list *element = p->polled_event;
> +
> +		if (element->perf_event == event) {
> +			p->polled_event = NULL;
> +			element->perf_event = NULL;
> +			element->etm_event_data = NULL;
> +			cancel_delayed_work(&p->delayed_work);
> +			return;
> +		}
> +	}
> +}
> +
> +/*
> + * The polling worker is a workqueue job which is periodically
> + * woken up to update the perf aux buffer from the etr shrink.
> + */
> +static void etr_perf_polling_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct list_head *path;
> +	struct coresight_device *sink;
> +	int size;
> +	int cpu = smp_processor_id();
> +	struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +	if (!atomic_read(&period))
> +		return;
> +
> +	if (!p->polled_event)
> +		return;
> +	/*
> +	 * Scheduling would do the same from the perf hooks,
> +	 * this should be done in one go.
> +	 */
> +	local_irq_save(flags);
> +
> +	polling_sched_worker(p);
> +
> +	path = etm_event_cpu_path(p->polled_event->etm_event_data, cpu);
> +	sink = coresight_get_sink(path);
> +	size = sink_ops(sink)->update_buffer(
> +		sink, p->polled_event->ctx_handle,
> +		p->polled_event->etm_event_data->snk_config);
> +
> +	/*
> +	 * Restart the trace.
> +	 */
> +	if (p->polled_event->tmc_etr_reset_hw)
> +		p->polled_event->tmc_etr_reset_hw(dev_get_drvdata(sink->dev.parent));
> +
> +	WARN_ON(size < 0);
> +	if (size > 0) {
> +		struct etm_event_data *new_event_data;
> +
> +		perf_aux_output_end(p->polled_event->ctx_handle, size);
> +		new_event_data = perf_aux_output_begin(
> +			p->polled_event->ctx_handle,
> +			p->polled_event->perf_event);
> +		if (WARN_ON(new_event_data == NULL)) {
> +			local_irq_restore(flags);
> +			return;
> +		}
> +
> +		p->polled_event->etm_event_data = new_event_data;
> +		WARN_ON(new_event_data->snk_config !=
> +			p->polled_event->etm_event_data->snk_config);
> +	}
> +
> +	local_irq_restore(flags);
> +}
> +
> +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> +				      void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata))
> +{
> +	struct polling_event_list *element;
> +
> +	element = kmalloc(sizeof(*element), GFP_ATOMIC);
> +	if (WARN_ON(!element))
> +		return;
> +	memset(element, 0, sizeof(*element));
> +	element->ctx_handle = handle;
> +	element->tmc_etr_reset_hw = tmc_etr_reset_hw;
> +	spin_lock(&spinlock_re);
> +	list_add(&element->list, &registered_events);
> +	spin_unlock(&spinlock_re);
> +}
> +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_register);
> +
> +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle)
> +{
> +	struct polling_event_list *element, *tmp;
> +
> +	spin_lock(&spinlock_re);
> +	list_for_each_entry_safe(element, tmp, &registered_events, list) {
> +		if (element->ctx_handle == handle) {
> +			list_del(&element->list);
> +			spin_unlock(&spinlock_re);
> +			kfree(element);
> +			return;
> +		}
> +	}
> +	spin_unlock(&spinlock_re);
> +}
> +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_deregister);
> +
> +void etr_perf_polling_init(void)
> +{
> +	int cpu;
> +
> +	spin_lock_init(&spinlock_re);
> +	INIT_LIST_HEAD(&registered_events);
> +	atomic_set(&period, 0);
> +	for_each_possible_cpu(cpu) {
> +		struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +		p->cpu = cpu;
> +		p->polled_event = NULL;
> +		INIT_DELAYED_WORK(&p->delayed_work, etr_perf_polling_worker);
> +	}
> +}
> +
> +void etr_perf_polling_exit(void)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +		cancel_delayed_work_sync(&p->delayed_work);
> +		WARN_ON(p->polled_event);
> +	}
> +	WARN_ON(!list_empty(&registered_events));
> +}
> diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.h b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> new file mode 100644
> index 0000000000000..d47b4424594e6
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright(C) 2021 Arm Limited. All rights reserved.
> + * Author: Daniel Kiss <daniel.kiss@arm.com>
> + */
> +
> +#ifndef _CORESIGHT_ETM_PERF_POLLING_H
> +#define _CORESIGHT_ETM_PERF_POLLING_H
> +
> +#include <linux/coresight.h>
> +#include <linux/perf_event.h>
> +#include "coresight-etm-perf.h"
> +#include "coresight-tmc.h"
> +
> +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> +void etr_perf_polling_init(void);
> +void etr_perf_polling_exit(void);
> +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> +				      void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata));
> +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle);
> +void etr_perf_polling_event_start(struct perf_event *event,
> +				  struct etm_event_data *etm_event_data,
> +				  struct perf_output_handle *ctx_handle);
> +void etr_perf_polling_event_stop(struct perf_event *event,
> +				 struct etm_event_data *etm_event_data);
> +
> +extern const struct attribute_group coresight_tmc_polling_group;
> +
> +#else /* !CONFIG_CORESIGHT_ETR_PERF_POLL */
> +#define etr_perf_polling_init()
> +#define etr_perf_polling_exit()
> +#define etr_perf_polling_handle_register(...)
> +#define etr_perf_polling_handle_deregister(...)
> +#define etr_perf_polling_event_start(...)
> +#define etr_perf_polling_event_stop(...)
> +#endif
> +
> +#endif
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> index 74c6323d4d6ab..dbcdba162bd38 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> @@ -26,6 +26,7 @@
>  
>  #include "coresight-priv.h"
>  #include "coresight-tmc.h"
> +#include "coresight-etr-perf-polling.h"
>  
>  DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>  DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
> @@ -365,6 +366,9 @@ static const struct attribute_group coresight_tmc_mgmt_group = {
>  static const struct attribute_group *coresight_tmc_groups[] = {
>  	&coresight_tmc_group,
>  	&coresight_tmc_mgmt_group,
> +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> +	&coresight_tmc_polling_group,
> +#endif
>  	NULL,
>  };
>  
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 55c9b5fd9f832..67cd4bdcda71b 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -16,6 +16,7 @@
>  #include <linux/vmalloc.h>
>  #include "coresight-catu.h"
>  #include "coresight-etm-perf.h"
> +#include "coresight-etr-perf-polling.h"
>  #include "coresight-priv.h"
>  #include "coresight-tmc.h"
>  
> @@ -1137,6 +1138,16 @@ void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
>  	drvdata->etr_buf = NULL;
>  }
>  
> +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> +
> +static void tmc_etr_reset_hw(struct tmc_drvdata *drvdata)
> +{
> +	__tmc_etr_disable_hw(drvdata);
> +	__tmc_etr_enable_hw(drvdata);
> +}
> +
> +#endif
> +
>  static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  {
>  	int ret = 0;
> @@ -1620,6 +1631,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
>  		drvdata->mode = CS_MODE_PERF;
>  		drvdata->perf_buf = etr_perf->etr_buf;
>  		drvdata->perf_handle = handle;
> +		etr_perf_polling_handle_register(handle, tmc_etr_reset_hw);
>  		atomic_inc(csdev->refcnt);
>  	}
>  
> @@ -1667,6 +1679,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
>  	drvdata->mode = CS_MODE_DISABLED;
>  	/* Reset perf specific data */
>  	drvdata->perf_buf = NULL;
> +	etr_perf_polling_handle_deregister(drvdata->perf_handle);
>  	drvdata->perf_handle = NULL;
>  
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 4/4] coresight: Add ETR-PERF polling.
  2021-07-13 12:15 ` [PATCHv2 4/4] coresight: Add ETR-PERF polling Daniel Kiss
  2021-08-26 16:41   ` Mathieu Poirier
@ 2021-08-26 17:32   ` Mathieu Poirier
  2021-08-31 13:31     ` Mike Leach
  1 sibling, 1 reply; 11+ messages in thread
From: Mathieu Poirier @ 2021-08-26 17:32 UTC (permalink / raw)
  To: Daniel Kiss
  Cc: coresight, denik, leo.yan, linux-arm-kernel, mike.leach,
	suzuki.poulose, Branislav Rankov, Denis Nikitin

On Tue, Jul 13, 2021 at 02:15:32PM +0200, Daniel Kiss wrote:
> ETR might fill up the buffer sooner than an event makes perf to trigger
> the synchronisation especially in system wide trace. Polling runs
> periodically to sync the ETR buffer. Period is configurable via sysfs,
> disabled by default.
> 
> Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
> Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
> Tested-by: Denis Nikitin <denik@chromium.org>
> ---
>  .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
>  drivers/hwtracing/coresight/Kconfig           |  12 +
>  drivers/hwtracing/coresight/Makefile          |   1 +
>  .../hwtracing/coresight/coresight-etm-perf.c  |   8 +
>  .../coresight/coresight-etr-perf-polling.c    | 275 ++++++++++++++++++
>  .../coresight/coresight-etr-perf-polling.h    |  38 +++
>  .../hwtracing/coresight/coresight-tmc-core.c  |   4 +
>  .../hwtracing/coresight/coresight-tmc-etr.c   |  13 +
>  8 files changed, 359 insertions(+)
>  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
>  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> index 6aa527296c710..3b411e8a6f417 100644
> --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> @@ -91,3 +91,11 @@ Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
>  Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
>  		mode. Writable only for TMC-ETR configurations. The value
>  		should be aligned to the kernel pagesize.
> +
> +What:		/sys/bus/coresight/devices/<memory_map>.tmc/polling/period
> +Date:		July 2021
> +KernelVersion:	5.14
> +Contact:	Daniel Kiss <daniel.kiss@arm.com>
> +Description:	(RW) Time in milliseconds when the TMC-ETR is synced.
> +		Default value is 0, means the feature is disabled.
> +		Writable only for TMC-ETR configurations.
> diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> index 84530fd80998c..4e91fb98849f4 100644
> --- a/drivers/hwtracing/coresight/Kconfig
> +++ b/drivers/hwtracing/coresight/Kconfig
> @@ -44,6 +44,18 @@ config CORESIGHT_LINK_AND_SINK_TMC
>  	  To compile this driver as a module, choose M here: the
>  	  module will be called coresight-tmc.
>  
> +
> +config CORESIGHT_ETR_PERF_POLL
> +	bool "Coresight ETR Perf Polling"
> +
> +	depends on CORESIGHT_LINK_AND_SINK_TMC
> +	help
> +	  Enable the support for software periodic synchronization of the ETR buffer.
> +	  ETR might fill up the buffer sooner than an event makes perf to trigger
> +	  the synchronization especially in system wide trace. Polling runs
> +	  periodically to sync the ETR buffer. Period is configurable via sysfs,
> +	  disabled by default.
> +
>  config CORESIGHT_CATU
>  	tristate "Coresight Address Translation Unit (CATU) driver"
>  	depends on CORESIGHT_LINK_AND_SINK_TMC
> diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> index d60816509755c..6baac328eea87 100644
> --- a/drivers/hwtracing/coresight/Makefile
> +++ b/drivers/hwtracing/coresight/Makefile
> @@ -5,6 +5,7 @@
>  obj-$(CONFIG_CORESIGHT) += coresight.o
>  coresight-y := coresight-core.o  coresight-etm-perf.o coresight-platform.o \
>  		coresight-sysfs.o
> +coresight-$(CONFIG_CORESIGHT_ETR_PERF_POLL) += coresight-etr-perf-polling.o
>  obj-$(CONFIG_CORESIGHT_LINK_AND_SINK_TMC) += coresight-tmc.o
>  coresight-tmc-y := coresight-tmc-core.o coresight-tmc-etf.o \
>  		      coresight-tmc-etr.o
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index a3f4c07f5bf8b..3095840a567c4 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -19,6 +19,7 @@
>  #include <linux/workqueue.h>
>  
>  #include "coresight-etm-perf.h"
> +#include "coresight-etr-perf-polling.h"
>  #include "coresight-priv.h"
>  
>  static struct pmu etm_pmu;
> @@ -438,6 +439,8 @@ static void etm_event_start(struct perf_event *event, int flags)
>  	/* Tell the perf core the event is alive */
>  	event->hw.state = 0;
>  
> +	etr_perf_polling_event_start(event, event_data, handle);
> +
>  	/* Finally enable the tracer */
>  	if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
>  		goto fail_disable_path;
> @@ -497,6 +500,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	if (!sink)
>  		return;
>  
> +	etr_perf_polling_event_stop(event, event_data);
> +
>  	/* stop tracer */
>  	source_ops(csdev)->disable(csdev, event);
>  
> @@ -741,6 +746,8 @@ int __init etm_perf_init(void)
>  	etm_pmu.addr_filters_validate	= etm_addr_filters_validate;
>  	etm_pmu.nr_addr_filters		= ETM_ADDR_CMP_MAX;
>  
> +	etr_perf_polling_init();

The problem here is that a function specific to an ETR is inserted in code that
should be generic.  So if we want to do the same for ETB, ETF and any kind of
sink that's out there, we'd have to duplicate code and it would quickly get
messy.  What is needed is a generic solution.

Moreover what is proposed here is centered on sinks, while the entire coresight
framework is centered on sources.  The polling function, although specific to
sinks, should emanate from and be driven by the generic code that handles
sources.  There should also be a single event responsible for doing the polling
rather than all of them as it is the case in this patchset.  In per-thread mode
where a single thread is traced it isn't a problem.  In CPU-wide scenarios the
first event should do the polling.  There will be a gap between the time that
event stops and all the other events in the trace session stop but that is a
limitation we will have to live with.  

That leaves us with per-thread sessions where child threads are spun off.  You
will have to look into how events are handle in that case.  Suzuki has done work
in that area and might remember a few things. 

There are other design problems with this set that I won't get into since it 
needs a refactoring anyway.

Regards,
Mathieu

> +
>  	ret = perf_pmu_register(&etm_pmu, CORESIGHT_ETM_PMU_NAME, -1);
>  	if (ret == 0)
>  		etm_perf_up = true;
> @@ -750,5 +757,6 @@ int __init etm_perf_init(void)
>  
>  void __exit etm_perf_exit(void)
>  {
> +	etr_perf_polling_exit();
>  	perf_pmu_unregister(&etm_pmu);
>  }
> diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.c b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> new file mode 100644
> index 0000000000000..87e6bc42a62de
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> @@ -0,0 +1,275 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright(C) 2021 Arm Limited. All rights reserved.
> + * Author: Daniel Kiss <daniel.kiss@arm.com>
> + */
> +
> +#include <linux/coresight.h>
> +#include <linux/coresight-pmu.h>
> +#include <linux/cpumask.h>
> +#include <linux/device.h>
> +#include <linux/init.h>
> +#include <linux/list.h>
> +#include <linux/percpu-defs.h>
> +#include <linux/perf_event.h>
> +#include <linux/slab.h>
> +#include <linux/stringhash.h>
> +#include <linux/types.h>
> +#include <linux/workqueue.h>
> +
> +#include "coresight-etr-perf-polling.h"
> +#include "coresight-priv.h"
> +#include "coresight-tmc.h"
> +
> +struct polling_event_list {
> +	struct perf_event *perf_event;
> +	struct etm_event_data *etm_event_data;
> +	struct perf_output_handle *ctx_handle;
> +	void (*tmc_etr_reset_hw)(struct tmc_drvdata *data);
> +	struct list_head list;
> +};
> +
> +struct polling {
> +	int cpu;
> +	struct polling_event_list *polled_event;
> +	struct delayed_work delayed_work;
> +};
> +
> +static atomic_t period;
> +static spinlock_t spinlock_re;
> +static struct list_head registered_events;
> +
> +static DEFINE_PER_CPU(struct polling, polling);
> +
> +static ssize_t period_show(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	int temp;
> +
> +	temp = atomic_read(&period);
> +	return sprintf(buf, "%i\n", temp);
> +}
> +
> +static ssize_t period_store(struct device *dev, struct device_attribute *attr,
> +			    const char *buf, size_t count)
> +{
> +	int temp = 0;
> +
> +	if (!kstrtoint(buf, 10, &temp) && (temp >= 0))
> +		atomic_set(&period, temp);
> +	return count;
> +}
> +
> +static DEVICE_ATTR_RW(period);
> +
> +static struct attribute *coresight_tmc_polling_attrs[] = {
> +	&dev_attr_period.attr,
> +	NULL,
> +};
> +const struct attribute_group coresight_tmc_polling_group = {
> +	.attrs = coresight_tmc_polling_attrs,
> +	.name = "polling",
> +};
> +EXPORT_SYMBOL_GPL(coresight_tmc_polling_group);
> +
> +static inline void polling_sched_worker(struct polling *p)
> +{
> +	int tickrate = atomic_read(&period);
> +
> +	if (tickrate > 0)
> +		schedule_delayed_work_on(p->cpu, &p->delayed_work,
> +					 msecs_to_jiffies(tickrate));
> +}
> +
> +static inline bool is_etr_related(struct etm_event_data *etm_event_data, int cpu)
> +{
> +	struct list_head *path;
> +	struct coresight_device *sink;
> +	struct tmc_drvdata *drvdata;
> +
> +	path = etm_event_cpu_path(etm_event_data, cpu);
> +	if (WARN_ON(!path))
> +		return false;
> +	sink = coresight_get_sink(path);
> +	if (WARN_ON(!sink))
> +		return false;
> +	drvdata = dev_get_drvdata(sink->dev.parent);
> +	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
> +		return false;
> +	return true;
> +}
> +
> +/*
> + * Adds the event to the polled events list.
> + */
> +void etr_perf_polling_event_start(struct perf_event *event,
> +				  struct etm_event_data *etm_event_data,
> +				  struct perf_output_handle *ctx_handle)
> +{
> +	int cpu = smp_processor_id();
> +	struct polling *p = per_cpu_ptr(&polling, cpu);
> +	struct polling_event_list *element, *tmp;
> +
> +	if (!is_etr_related(etm_event_data, cpu))
> +		return;
> +
> +	spin_lock(&spinlock_re);
> +	list_for_each_entry_safe(element, tmp, &registered_events, list) {
> +		if (element->ctx_handle == ctx_handle) {
> +			element->perf_event = event;
> +			element->etm_event_data = etm_event_data;
> +			spin_unlock(&spinlock_re);
> +			p->polled_event = element;
> +			polling_sched_worker(p);
> +			return;
> +		}
> +	}
> +	spin_unlock(&spinlock_re);
> +}
> +
> +/*
> + * Removes the event from the to be polled events list.
> + */
> +void etr_perf_polling_event_stop(struct perf_event *event,
> +				 struct etm_event_data *etm_event_data)
> +{
> +	int cpu = smp_processor_id();
> +	struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +	if (!is_etr_related(etm_event_data, cpu))
> +		return;
> +
> +	if (p->polled_event) {
> +		struct polling_event_list *element = p->polled_event;
> +
> +		if (element->perf_event == event) {
> +			p->polled_event = NULL;
> +			element->perf_event = NULL;
> +			element->etm_event_data = NULL;
> +			cancel_delayed_work(&p->delayed_work);
> +			return;
> +		}
> +	}
> +}
> +
> +/*
> + * The polling worker is a workqueue job which is periodically
> + * woken up to update the perf aux buffer from the etr shrink.
> + */
> +static void etr_perf_polling_worker(struct work_struct *work)
> +{
> +	unsigned long flags;
> +	struct list_head *path;
> +	struct coresight_device *sink;
> +	int size;
> +	int cpu = smp_processor_id();
> +	struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +	if (!atomic_read(&period))
> +		return;
> +
> +	if (!p->polled_event)
> +		return;
> +	/*
> +	 * Scheduling would do the same from the perf hooks,
> +	 * this should be done in one go.
> +	 */
> +	local_irq_save(flags);
> +
> +	polling_sched_worker(p);
> +
> +	path = etm_event_cpu_path(p->polled_event->etm_event_data, cpu);
> +	sink = coresight_get_sink(path);
> +	size = sink_ops(sink)->update_buffer(
> +		sink, p->polled_event->ctx_handle,
> +		p->polled_event->etm_event_data->snk_config);
> +
> +	/*
> +	 * Restart the trace.
> +	 */
> +	if (p->polled_event->tmc_etr_reset_hw)
> +		p->polled_event->tmc_etr_reset_hw(dev_get_drvdata(sink->dev.parent));
> +
> +	WARN_ON(size < 0);
> +	if (size > 0) {
> +		struct etm_event_data *new_event_data;
> +
> +		perf_aux_output_end(p->polled_event->ctx_handle, size);
> +		new_event_data = perf_aux_output_begin(
> +			p->polled_event->ctx_handle,
> +			p->polled_event->perf_event);
> +		if (WARN_ON(new_event_data == NULL)) {
> +			local_irq_restore(flags);
> +			return;
> +		}
> +
> +		p->polled_event->etm_event_data = new_event_data;
> +		WARN_ON(new_event_data->snk_config !=
> +			p->polled_event->etm_event_data->snk_config);
> +	}
> +
> +	local_irq_restore(flags);
> +}
> +
> +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> +				      void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata))
> +{
> +	struct polling_event_list *element;
> +
> +	element = kmalloc(sizeof(*element), GFP_ATOMIC);
> +	if (WARN_ON(!element))
> +		return;
> +	memset(element, 0, sizeof(*element));
> +	element->ctx_handle = handle;
> +	element->tmc_etr_reset_hw = tmc_etr_reset_hw;
> +	spin_lock(&spinlock_re);
> +	list_add(&element->list, &registered_events);
> +	spin_unlock(&spinlock_re);
> +}
> +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_register);
> +
> +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle)
> +{
> +	struct polling_event_list *element, *tmp;
> +
> +	spin_lock(&spinlock_re);
> +	list_for_each_entry_safe(element, tmp, &registered_events, list) {
> +		if (element->ctx_handle == handle) {
> +			list_del(&element->list);
> +			spin_unlock(&spinlock_re);
> +			kfree(element);
> +			return;
> +		}
> +	}
> +	spin_unlock(&spinlock_re);
> +}
> +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_deregister);
> +
> +void etr_perf_polling_init(void)
> +{
> +	int cpu;
> +
> +	spin_lock_init(&spinlock_re);
> +	INIT_LIST_HEAD(&registered_events);
> +	atomic_set(&period, 0);
> +	for_each_possible_cpu(cpu) {
> +		struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +		p->cpu = cpu;
> +		p->polled_event = NULL;
> +		INIT_DELAYED_WORK(&p->delayed_work, etr_perf_polling_worker);
> +	}
> +}
> +
> +void etr_perf_polling_exit(void)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct polling *p = per_cpu_ptr(&polling, cpu);
> +
> +		cancel_delayed_work_sync(&p->delayed_work);
> +		WARN_ON(p->polled_event);
> +	}
> +	WARN_ON(!list_empty(&registered_events));
> +}
> diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.h b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> new file mode 100644
> index 0000000000000..d47b4424594e6
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright(C) 2021 Arm Limited. All rights reserved.
> + * Author: Daniel Kiss <daniel.kiss@arm.com>
> + */
> +
> +#ifndef _CORESIGHT_ETM_PERF_POLLING_H
> +#define _CORESIGHT_ETM_PERF_POLLING_H
> +
> +#include <linux/coresight.h>
> +#include <linux/perf_event.h>
> +#include "coresight-etm-perf.h"
> +#include "coresight-tmc.h"
> +
> +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> +void etr_perf_polling_init(void);
> +void etr_perf_polling_exit(void);
> +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> +				      void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata));
> +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle);
> +void etr_perf_polling_event_start(struct perf_event *event,
> +				  struct etm_event_data *etm_event_data,
> +				  struct perf_output_handle *ctx_handle);
> +void etr_perf_polling_event_stop(struct perf_event *event,
> +				 struct etm_event_data *etm_event_data);
> +
> +extern const struct attribute_group coresight_tmc_polling_group;
> +
> +#else /* !CONFIG_CORESIGHT_ETR_PERF_POLL */
> +#define etr_perf_polling_init()
> +#define etr_perf_polling_exit()
> +#define etr_perf_polling_handle_register(...)
> +#define etr_perf_polling_handle_deregister(...)
> +#define etr_perf_polling_event_start(...)
> +#define etr_perf_polling_event_stop(...)
> +#endif
> +
> +#endif
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> index 74c6323d4d6ab..dbcdba162bd38 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> @@ -26,6 +26,7 @@
>  
>  #include "coresight-priv.h"
>  #include "coresight-tmc.h"
> +#include "coresight-etr-perf-polling.h"
>  
>  DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>  DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
> @@ -365,6 +366,9 @@ static const struct attribute_group coresight_tmc_mgmt_group = {
>  static const struct attribute_group *coresight_tmc_groups[] = {
>  	&coresight_tmc_group,
>  	&coresight_tmc_mgmt_group,
> +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> +	&coresight_tmc_polling_group,
> +#endif
>  	NULL,
>  };
>  
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 55c9b5fd9f832..67cd4bdcda71b 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -16,6 +16,7 @@
>  #include <linux/vmalloc.h>
>  #include "coresight-catu.h"
>  #include "coresight-etm-perf.h"
> +#include "coresight-etr-perf-polling.h"
>  #include "coresight-priv.h"
>  #include "coresight-tmc.h"
>  
> @@ -1137,6 +1138,16 @@ void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
>  	drvdata->etr_buf = NULL;
>  }
>  
> +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> +
> +static void tmc_etr_reset_hw(struct tmc_drvdata *drvdata)
> +{
> +	__tmc_etr_disable_hw(drvdata);
> +	__tmc_etr_enable_hw(drvdata);
> +}
> +
> +#endif
> +
>  static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  {
>  	int ret = 0;
> @@ -1620,6 +1631,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
>  		drvdata->mode = CS_MODE_PERF;
>  		drvdata->perf_buf = etr_perf->etr_buf;
>  		drvdata->perf_handle = handle;
> +		etr_perf_polling_handle_register(handle, tmc_etr_reset_hw);
>  		atomic_inc(csdev->refcnt);
>  	}
>  
> @@ -1667,6 +1679,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
>  	drvdata->mode = CS_MODE_DISABLED;
>  	/* Reset perf specific data */
>  	drvdata->perf_buf = NULL;
> +	etr_perf_polling_handle_deregister(drvdata->perf_handle);
>  	drvdata->perf_handle = NULL;
>  
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 4/4] coresight: Add ETR-PERF polling.
  2021-08-26 17:32   ` Mathieu Poirier
@ 2021-08-31 13:31     ` Mike Leach
  2021-09-01 18:02       ` Mathieu Poirier
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Leach @ 2021-08-31 13:31 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Daniel Kiss, Coresight ML, Denis Nikitin, Leo Yan,
	linux-arm-kernel, Suzuki K. Poulose, Branislav Rankov,
	Denis Nikitin

iI Mathieu, Daniel,.

On Thu, 26 Aug 2021 at 18:33, Mathieu Poirier
<mathieu.poirier@linaro.org> wrote:
>
> On Tue, Jul 13, 2021 at 02:15:32PM +0200, Daniel Kiss wrote:
> > ETR might fill up the buffer sooner than an event makes perf to trigger
> > the synchronisation especially in system wide trace. Polling runs
> > periodically to sync the ETR buffer. Period is configurable via sysfs,
> > disabled by default.
> >
> > Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
> > Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
> > Tested-by: Denis Nikitin <denik@chromium.org>
> > ---
> >  .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
> >  drivers/hwtracing/coresight/Kconfig           |  12 +
> >  drivers/hwtracing/coresight/Makefile          |   1 +
> >  .../hwtracing/coresight/coresight-etm-perf.c  |   8 +
> >  .../coresight/coresight-etr-perf-polling.c    | 275 ++++++++++++++++++
> >  .../coresight/coresight-etr-perf-polling.h    |  38 +++
> >  .../hwtracing/coresight/coresight-tmc-core.c  |   4 +
> >  .../hwtracing/coresight/coresight-tmc-etr.c   |  13 +
> >  8 files changed, 359 insertions(+)
> >  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> >  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> >
> > diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> > index 6aa527296c710..3b411e8a6f417 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> > +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> > @@ -91,3 +91,11 @@ Contact:   Mathieu Poirier <mathieu.poirier@linaro.org>
> >  Description: (RW) Size of the trace buffer for TMC-ETR when used in SYSFS
> >               mode. Writable only for TMC-ETR configurations. The value
> >               should be aligned to the kernel pagesize.
> > +
> > +What:                /sys/bus/coresight/devices/<memory_map>.tmc/polling/period
> > +Date:                July 2021
> > +KernelVersion:       5.14
> > +Contact:     Daniel Kiss <daniel.kiss@arm.com>
> > +Description: (RW) Time in milliseconds when the TMC-ETR is synced.
> > +             Default value is 0, means the feature is disabled.
> > +             Writable only for TMC-ETR configurations.
> > diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> > index 84530fd80998c..4e91fb98849f4 100644
> > --- a/drivers/hwtracing/coresight/Kconfig
> > +++ b/drivers/hwtracing/coresight/Kconfig
> > @@ -44,6 +44,18 @@ config CORESIGHT_LINK_AND_SINK_TMC
> >         To compile this driver as a module, choose M here: the
> >         module will be called coresight-tmc.
> >
> > +
> > +config CORESIGHT_ETR_PERF_POLL
> > +     bool "Coresight ETR Perf Polling"
> > +
> > +     depends on CORESIGHT_LINK_AND_SINK_TMC
> > +     help
> > +       Enable the support for software periodic synchronization of the ETR buffer.
> > +       ETR might fill up the buffer sooner than an event makes perf to trigger
> > +       the synchronization especially in system wide trace. Polling runs
> > +       periodically to sync the ETR buffer. Period is configurable via sysfs,
> > +       disabled by default.
> > +
> >  config CORESIGHT_CATU
> >       tristate "Coresight Address Translation Unit (CATU) driver"
> >       depends on CORESIGHT_LINK_AND_SINK_TMC
> > diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> > index d60816509755c..6baac328eea87 100644
> > --- a/drivers/hwtracing/coresight/Makefile
> > +++ b/drivers/hwtracing/coresight/Makefile
> > @@ -5,6 +5,7 @@
> >  obj-$(CONFIG_CORESIGHT) += coresight.o
> >  coresight-y := coresight-core.o  coresight-etm-perf.o coresight-platform.o \
> >               coresight-sysfs.o
> > +coresight-$(CONFIG_CORESIGHT_ETR_PERF_POLL) += coresight-etr-perf-polling.o
> >  obj-$(CONFIG_CORESIGHT_LINK_AND_SINK_TMC) += coresight-tmc.o
> >  coresight-tmc-y := coresight-tmc-core.o coresight-tmc-etf.o \
> >                     coresight-tmc-etr.o
> > diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> > index a3f4c07f5bf8b..3095840a567c4 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> > @@ -19,6 +19,7 @@
> >  #include <linux/workqueue.h>
> >
> >  #include "coresight-etm-perf.h"
> > +#include "coresight-etr-perf-polling.h"
> >  #include "coresight-priv.h"
> >
> >  static struct pmu etm_pmu;
> > @@ -438,6 +439,8 @@ static void etm_event_start(struct perf_event *event, int flags)
> >       /* Tell the perf core the event is alive */
> >       event->hw.state = 0;
> >
> > +     etr_perf_polling_event_start(event, event_data, handle);
> > +
> >       /* Finally enable the tracer */
> >       if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
> >               goto fail_disable_path;
> > @@ -497,6 +500,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
> >       if (!sink)
> >               return;
> >
> > +     etr_perf_polling_event_stop(event, event_data);
> > +
> >       /* stop tracer */
> >       source_ops(csdev)->disable(csdev, event);
> >
> > @@ -741,6 +746,8 @@ int __init etm_perf_init(void)
> >       etm_pmu.addr_filters_validate   = etm_addr_filters_validate;
> >       etm_pmu.nr_addr_filters         = ETM_ADDR_CMP_MAX;
> >
> > +     etr_perf_polling_init();
>
> The problem here is that a function specific to an ETR is inserted in code that
> should be generic.  So if we want to do the same for ETB, ETF and any kind of
> sink that's out there, we'd have to duplicate code and it would quickly get
> messy.  What is needed is a generic solution.
>
> Moreover what is proposed here is centered on sinks, while the entire coresight
> framework is centered on sources.  The polling function, although specific to
> sinks, should emanate from and be driven by the generic code that handles
> sources.  There should also be a single event responsible for doing the polling
> rather than all of them as it is the case in this patchset.  In per-thread mode
> where a single thread is traced it isn't a problem.  In CPU-wide scenarios the
> first event should do the polling.  There will be a gap between the time that
> event stops and all the other events in the trace session stop but that is a
> limitation we will have to live with.
>

In my previous reviews of this I have suggested that this must be ETR
specific - the object of feature is to use polling to mimic the
interrupt we get with somehting  like TRBE, and get more consistent
output across the entire trace run - which is the objective of the
feature and has been backed up by evidence from those that have tried
it out..
Granted, TRBE is per core and thus can only be operating on the single
source, but the guiding idea remains the same - save data whenever the
sink hardware is full (or artificailly stopped in the case of
polling). TRBE does not, as far as I recall wait for an event to stop
before servicing the full interrupt.,

The worry is that is this is source event driven then any advantages
would be wiped out by waiting for events in the trace session to stop.

Regards

Mike


> That leaves us with per-thread sessions where child threads are spun off.  You
> will have to look into how events are handle in that case.  Suzuki has done work
> in that area and might remember a few things.
>
> There are other design problems with this set that I won't get into since it
> needs a refactoring anyway.
>
> Regards,
> Mathieu
>
> > +
> >       ret = perf_pmu_register(&etm_pmu, CORESIGHT_ETM_PMU_NAME, -1);
> >       if (ret == 0)
> >               etm_perf_up = true;
> > @@ -750,5 +757,6 @@ int __init etm_perf_init(void)
> >
> >  void __exit etm_perf_exit(void)
> >  {
> > +     etr_perf_polling_exit();
> >       perf_pmu_unregister(&etm_pmu);
> >  }
> > diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.c b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> > new file mode 100644
> > index 0000000000000..87e6bc42a62de
> > --- /dev/null
> > +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> > @@ -0,0 +1,275 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright(C) 2021 Arm Limited. All rights reserved.
> > + * Author: Daniel Kiss <daniel.kiss@arm.com>
> > + */
> > +
> > +#include <linux/coresight.h>
> > +#include <linux/coresight-pmu.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/device.h>
> > +#include <linux/init.h>
> > +#include <linux/list.h>
> > +#include <linux/percpu-defs.h>
> > +#include <linux/perf_event.h>
> > +#include <linux/slab.h>
> > +#include <linux/stringhash.h>
> > +#include <linux/types.h>
> > +#include <linux/workqueue.h>
> > +
> > +#include "coresight-etr-perf-polling.h"
> > +#include "coresight-priv.h"
> > +#include "coresight-tmc.h"
> > +
> > +struct polling_event_list {
> > +     struct perf_event *perf_event;
> > +     struct etm_event_data *etm_event_data;
> > +     struct perf_output_handle *ctx_handle;
> > +     void (*tmc_etr_reset_hw)(struct tmc_drvdata *data);
> > +     struct list_head list;
> > +};
> > +
> > +struct polling {
> > +     int cpu;
> > +     struct polling_event_list *polled_event;
> > +     struct delayed_work delayed_work;
> > +};
> > +
> > +static atomic_t period;
> > +static spinlock_t spinlock_re;
> > +static struct list_head registered_events;
> > +
> > +static DEFINE_PER_CPU(struct polling, polling);
> > +
> > +static ssize_t period_show(struct device *dev, struct device_attribute *attr,
> > +                        char *buf)
> > +{
> > +     int temp;
> > +
> > +     temp = atomic_read(&period);
> > +     return sprintf(buf, "%i\n", temp);
> > +}
> > +
> > +static ssize_t period_store(struct device *dev, struct device_attribute *attr,
> > +                         const char *buf, size_t count)
> > +{
> > +     int temp = 0;
> > +
> > +     if (!kstrtoint(buf, 10, &temp) && (temp >= 0))
> > +             atomic_set(&period, temp);
> > +     return count;
> > +}
> > +
> > +static DEVICE_ATTR_RW(period);
> > +
> > +static struct attribute *coresight_tmc_polling_attrs[] = {
> > +     &dev_attr_period.attr,
> > +     NULL,
> > +};
> > +const struct attribute_group coresight_tmc_polling_group = {
> > +     .attrs = coresight_tmc_polling_attrs,
> > +     .name = "polling",
> > +};
> > +EXPORT_SYMBOL_GPL(coresight_tmc_polling_group);
> > +
> > +static inline void polling_sched_worker(struct polling *p)
> > +{
> > +     int tickrate = atomic_read(&period);
> > +
> > +     if (tickrate > 0)
> > +             schedule_delayed_work_on(p->cpu, &p->delayed_work,
> > +                                      msecs_to_jiffies(tickrate));
> > +}
> > +
> > +static inline bool is_etr_related(struct etm_event_data *etm_event_data, int cpu)
> > +{
> > +     struct list_head *path;
> > +     struct coresight_device *sink;
> > +     struct tmc_drvdata *drvdata;
> > +
> > +     path = etm_event_cpu_path(etm_event_data, cpu);
> > +     if (WARN_ON(!path))
> > +             return false;
> > +     sink = coresight_get_sink(path);
> > +     if (WARN_ON(!sink))
> > +             return false;
> > +     drvdata = dev_get_drvdata(sink->dev.parent);
> > +     if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
> > +             return false;
> > +     return true;
> > +}
> > +
> > +/*
> > + * Adds the event to the polled events list.
> > + */
> > +void etr_perf_polling_event_start(struct perf_event *event,
> > +                               struct etm_event_data *etm_event_data,
> > +                               struct perf_output_handle *ctx_handle)
> > +{
> > +     int cpu = smp_processor_id();
> > +     struct polling *p = per_cpu_ptr(&polling, cpu);
> > +     struct polling_event_list *element, *tmp;
> > +
> > +     if (!is_etr_related(etm_event_data, cpu))
> > +             return;
> > +
> > +     spin_lock(&spinlock_re);
> > +     list_for_each_entry_safe(element, tmp, &registered_events, list) {
> > +             if (element->ctx_handle == ctx_handle) {
> > +                     element->perf_event = event;
> > +                     element->etm_event_data = etm_event_data;
> > +                     spin_unlock(&spinlock_re);
> > +                     p->polled_event = element;
> > +                     polling_sched_worker(p);
> > +                     return;
> > +             }
> > +     }
> > +     spin_unlock(&spinlock_re);
> > +}
> > +
> > +/*
> > + * Removes the event from the to be polled events list.
> > + */
> > +void etr_perf_polling_event_stop(struct perf_event *event,
> > +                              struct etm_event_data *etm_event_data)
> > +{
> > +     int cpu = smp_processor_id();
> > +     struct polling *p = per_cpu_ptr(&polling, cpu);
> > +
> > +     if (!is_etr_related(etm_event_data, cpu))
> > +             return;
> > +
> > +     if (p->polled_event) {
> > +             struct polling_event_list *element = p->polled_event;
> > +
> > +             if (element->perf_event == event) {
> > +                     p->polled_event = NULL;
> > +                     element->perf_event = NULL;
> > +                     element->etm_event_data = NULL;
> > +                     cancel_delayed_work(&p->delayed_work);
> > +                     return;
> > +             }
> > +     }
> > +}
> > +
> > +/*
> > + * The polling worker is a workqueue job which is periodically
> > + * woken up to update the perf aux buffer from the etr shrink.
> > + */
> > +static void etr_perf_polling_worker(struct work_struct *work)
> > +{
> > +     unsigned long flags;
> > +     struct list_head *path;
> > +     struct coresight_device *sink;
> > +     int size;
> > +     int cpu = smp_processor_id();
> > +     struct polling *p = per_cpu_ptr(&polling, cpu);
> > +
> > +     if (!atomic_read(&period))
> > +             return;
> > +
> > +     if (!p->polled_event)
> > +             return;
> > +     /*
> > +      * Scheduling would do the same from the perf hooks,
> > +      * this should be done in one go.
> > +      */
> > +     local_irq_save(flags);
> > +
> > +     polling_sched_worker(p);
> > +
> > +     path = etm_event_cpu_path(p->polled_event->etm_event_data, cpu);
> > +     sink = coresight_get_sink(path);
> > +     size = sink_ops(sink)->update_buffer(
> > +             sink, p->polled_event->ctx_handle,
> > +             p->polled_event->etm_event_data->snk_config);
> > +
> > +     /*
> > +      * Restart the trace.
> > +      */
> > +     if (p->polled_event->tmc_etr_reset_hw)
> > +             p->polled_event->tmc_etr_reset_hw(dev_get_drvdata(sink->dev.parent));
> > +
> > +     WARN_ON(size < 0);
> > +     if (size > 0) {
> > +             struct etm_event_data *new_event_data;
> > +
> > +             perf_aux_output_end(p->polled_event->ctx_handle, size);
> > +             new_event_data = perf_aux_output_begin(
> > +                     p->polled_event->ctx_handle,
> > +                     p->polled_event->perf_event);
> > +             if (WARN_ON(new_event_data == NULL)) {
> > +                     local_irq_restore(flags);
> > +                     return;
> > +             }
> > +
> > +             p->polled_event->etm_event_data = new_event_data;
> > +             WARN_ON(new_event_data->snk_config !=
> > +                     p->polled_event->etm_event_data->snk_config);
> > +     }
> > +
> > +     local_irq_restore(flags);
> > +}
> > +
> > +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> > +                                   void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata))
> > +{
> > +     struct polling_event_list *element;
> > +
> > +     element = kmalloc(sizeof(*element), GFP_ATOMIC);
> > +     if (WARN_ON(!element))
> > +             return;
> > +     memset(element, 0, sizeof(*element));
> > +     element->ctx_handle = handle;
> > +     element->tmc_etr_reset_hw = tmc_etr_reset_hw;
> > +     spin_lock(&spinlock_re);
> > +     list_add(&element->list, &registered_events);
> > +     spin_unlock(&spinlock_re);
> > +}
> > +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_register);
> > +
> > +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle)
> > +{
> > +     struct polling_event_list *element, *tmp;
> > +
> > +     spin_lock(&spinlock_re);
> > +     list_for_each_entry_safe(element, tmp, &registered_events, list) {
> > +             if (element->ctx_handle == handle) {
> > +                     list_del(&element->list);
> > +                     spin_unlock(&spinlock_re);
> > +                     kfree(element);
> > +                     return;
> > +             }
> > +     }
> > +     spin_unlock(&spinlock_re);
> > +}
> > +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_deregister);
> > +
> > +void etr_perf_polling_init(void)
> > +{
> > +     int cpu;
> > +
> > +     spin_lock_init(&spinlock_re);
> > +     INIT_LIST_HEAD(&registered_events);
> > +     atomic_set(&period, 0);
> > +     for_each_possible_cpu(cpu) {
> > +             struct polling *p = per_cpu_ptr(&polling, cpu);
> > +
> > +             p->cpu = cpu;
> > +             p->polled_event = NULL;
> > +             INIT_DELAYED_WORK(&p->delayed_work, etr_perf_polling_worker);
> > +     }
> > +}
> > +
> > +void etr_perf_polling_exit(void)
> > +{
> > +     int cpu;
> > +
> > +     for_each_possible_cpu(cpu) {
> > +             struct polling *p = per_cpu_ptr(&polling, cpu);
> > +
> > +             cancel_delayed_work_sync(&p->delayed_work);
> > +             WARN_ON(p->polled_event);
> > +     }
> > +     WARN_ON(!list_empty(&registered_events));
> > +}
> > diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.h b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> > new file mode 100644
> > index 0000000000000..d47b4424594e6
> > --- /dev/null
> > +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright(C) 2021 Arm Limited. All rights reserved.
> > + * Author: Daniel Kiss <daniel.kiss@arm.com>
> > + */
> > +
> > +#ifndef _CORESIGHT_ETM_PERF_POLLING_H
> > +#define _CORESIGHT_ETM_PERF_POLLING_H
> > +
> > +#include <linux/coresight.h>
> > +#include <linux/perf_event.h>
> > +#include "coresight-etm-perf.h"
> > +#include "coresight-tmc.h"
> > +
> > +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> > +void etr_perf_polling_init(void);
> > +void etr_perf_polling_exit(void);
> > +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> > +                                   void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata));
> > +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle);
> > +void etr_perf_polling_event_start(struct perf_event *event,
> > +                               struct etm_event_data *etm_event_data,
> > +                               struct perf_output_handle *ctx_handle);
> > +void etr_perf_polling_event_stop(struct perf_event *event,
> > +                              struct etm_event_data *etm_event_data);
> > +
> > +extern const struct attribute_group coresight_tmc_polling_group;
> > +
> > +#else /* !CONFIG_CORESIGHT_ETR_PERF_POLL */
> > +#define etr_perf_polling_init()
> > +#define etr_perf_polling_exit()
> > +#define etr_perf_polling_handle_register(...)
> > +#define etr_perf_polling_handle_deregister(...)
> > +#define etr_perf_polling_event_start(...)
> > +#define etr_perf_polling_event_stop(...)
> > +#endif
> > +
> > +#endif
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> > index 74c6323d4d6ab..dbcdba162bd38 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> > @@ -26,6 +26,7 @@
> >
> >  #include "coresight-priv.h"
> >  #include "coresight-tmc.h"
> > +#include "coresight-etr-perf-polling.h"
> >
> >  DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
> >  DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
> > @@ -365,6 +366,9 @@ static const struct attribute_group coresight_tmc_mgmt_group = {
> >  static const struct attribute_group *coresight_tmc_groups[] = {
> >       &coresight_tmc_group,
> >       &coresight_tmc_mgmt_group,
> > +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> > +     &coresight_tmc_polling_group,
> > +#endif
> >       NULL,
> >  };
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index 55c9b5fd9f832..67cd4bdcda71b 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/vmalloc.h>
> >  #include "coresight-catu.h"
> >  #include "coresight-etm-perf.h"
> > +#include "coresight-etr-perf-polling.h"
> >  #include "coresight-priv.h"
> >  #include "coresight-tmc.h"
> >
> > @@ -1137,6 +1138,16 @@ void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
> >       drvdata->etr_buf = NULL;
> >  }
> >
> > +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> > +
> > +static void tmc_etr_reset_hw(struct tmc_drvdata *drvdata)
> > +{
> > +     __tmc_etr_disable_hw(drvdata);
> > +     __tmc_etr_enable_hw(drvdata);
> > +}
> > +
> > +#endif
> > +
> >  static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
> >  {
> >       int ret = 0;
> > @@ -1620,6 +1631,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
> >               drvdata->mode = CS_MODE_PERF;
> >               drvdata->perf_buf = etr_perf->etr_buf;
> >               drvdata->perf_handle = handle;
> > +             etr_perf_polling_handle_register(handle, tmc_etr_reset_hw);
> >               atomic_inc(csdev->refcnt);
> >       }
> >
> > @@ -1667,6 +1679,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
> >       drvdata->mode = CS_MODE_DISABLED;
> >       /* Reset perf specific data */
> >       drvdata->perf_buf = NULL;
> > +     etr_perf_polling_handle_deregister(drvdata->perf_handle);
> >       drvdata->perf_handle = NULL;
> >
> >       spin_unlock_irqrestore(&drvdata->spinlock, flags);
> > --
> > 2.25.1
> >



-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 4/4] coresight: Add ETR-PERF polling.
  2021-08-31 13:31     ` Mike Leach
@ 2021-09-01 18:02       ` Mathieu Poirier
  0 siblings, 0 replies; 11+ messages in thread
From: Mathieu Poirier @ 2021-09-01 18:02 UTC (permalink / raw)
  To: Mike Leach
  Cc: Daniel Kiss, Coresight ML, Denis Nikitin, Leo Yan,
	linux-arm-kernel, Suzuki K. Poulose, Branislav Rankov,
	Denis Nikitin

Good morning guys,

On Tue, Aug 31, 2021 at 02:31:28PM +0100, Mike Leach wrote:
> iI Mathieu, Daniel,.
> 
> On Thu, 26 Aug 2021 at 18:33, Mathieu Poirier
> <mathieu.poirier@linaro.org> wrote:
> >
> > On Tue, Jul 13, 2021 at 02:15:32PM +0200, Daniel Kiss wrote:
> > > ETR might fill up the buffer sooner than an event makes perf to trigger
> > > the synchronisation especially in system wide trace. Polling runs
> > > periodically to sync the ETR buffer. Period is configurable via sysfs,
> > > disabled by default.
> > >
> > > Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
> > > Signed-off-by: Branislav Rankov <Branislav.Rankov@arm.com>
> > > Tested-by: Denis Nikitin <denik@chromium.org>
> > > ---
> > >  .../testing/sysfs-bus-coresight-devices-tmc   |   8 +
> > >  drivers/hwtracing/coresight/Kconfig           |  12 +
> > >  drivers/hwtracing/coresight/Makefile          |   1 +
> > >  .../hwtracing/coresight/coresight-etm-perf.c  |   8 +
> > >  .../coresight/coresight-etr-perf-polling.c    | 275 ++++++++++++++++++
> > >  .../coresight/coresight-etr-perf-polling.h    |  38 +++
> > >  .../hwtracing/coresight/coresight-tmc-core.c  |   4 +
> > >  .../hwtracing/coresight/coresight-tmc-etr.c   |  13 +
> > >  8 files changed, 359 insertions(+)
> > >  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> > >  create mode 100644 drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> > >
> > > diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> > > index 6aa527296c710..3b411e8a6f417 100644
> > > --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> > > +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> > > @@ -91,3 +91,11 @@ Contact:   Mathieu Poirier <mathieu.poirier@linaro.org>
> > >  Description: (RW) Size of the trace buffer for TMC-ETR when used in SYSFS
> > >               mode. Writable only for TMC-ETR configurations. The value
> > >               should be aligned to the kernel pagesize.
> > > +
> > > +What:                /sys/bus/coresight/devices/<memory_map>.tmc/polling/period
> > > +Date:                July 2021
> > > +KernelVersion:       5.14
> > > +Contact:     Daniel Kiss <daniel.kiss@arm.com>
> > > +Description: (RW) Time in milliseconds when the TMC-ETR is synced.
> > > +             Default value is 0, means the feature is disabled.
> > > +             Writable only for TMC-ETR configurations.
> > > diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> > > index 84530fd80998c..4e91fb98849f4 100644
> > > --- a/drivers/hwtracing/coresight/Kconfig
> > > +++ b/drivers/hwtracing/coresight/Kconfig
> > > @@ -44,6 +44,18 @@ config CORESIGHT_LINK_AND_SINK_TMC
> > >         To compile this driver as a module, choose M here: the
> > >         module will be called coresight-tmc.
> > >
> > > +
> > > +config CORESIGHT_ETR_PERF_POLL
> > > +     bool "Coresight ETR Perf Polling"
> > > +
> > > +     depends on CORESIGHT_LINK_AND_SINK_TMC
> > > +     help
> > > +       Enable the support for software periodic synchronization of the ETR buffer.
> > > +       ETR might fill up the buffer sooner than an event makes perf to trigger
> > > +       the synchronization especially in system wide trace. Polling runs
> > > +       periodically to sync the ETR buffer. Period is configurable via sysfs,
> > > +       disabled by default.
> > > +
> > >  config CORESIGHT_CATU
> > >       tristate "Coresight Address Translation Unit (CATU) driver"
> > >       depends on CORESIGHT_LINK_AND_SINK_TMC
> > > diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> > > index d60816509755c..6baac328eea87 100644
> > > --- a/drivers/hwtracing/coresight/Makefile
> > > +++ b/drivers/hwtracing/coresight/Makefile
> > > @@ -5,6 +5,7 @@
> > >  obj-$(CONFIG_CORESIGHT) += coresight.o
> > >  coresight-y := coresight-core.o  coresight-etm-perf.o coresight-platform.o \
> > >               coresight-sysfs.o
> > > +coresight-$(CONFIG_CORESIGHT_ETR_PERF_POLL) += coresight-etr-perf-polling.o
> > >  obj-$(CONFIG_CORESIGHT_LINK_AND_SINK_TMC) += coresight-tmc.o
> > >  coresight-tmc-y := coresight-tmc-core.o coresight-tmc-etf.o \
> > >                     coresight-tmc-etr.o
> > > diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> > > index a3f4c07f5bf8b..3095840a567c4 100644
> > > --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> > > +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> > > @@ -19,6 +19,7 @@
> > >  #include <linux/workqueue.h>
> > >
> > >  #include "coresight-etm-perf.h"
> > > +#include "coresight-etr-perf-polling.h"
> > >  #include "coresight-priv.h"
> > >
> > >  static struct pmu etm_pmu;
> > > @@ -438,6 +439,8 @@ static void etm_event_start(struct perf_event *event, int flags)
> > >       /* Tell the perf core the event is alive */
> > >       event->hw.state = 0;
> > >
> > > +     etr_perf_polling_event_start(event, event_data, handle);
> > > +
> > >       /* Finally enable the tracer */
> > >       if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
> > >               goto fail_disable_path;
> > > @@ -497,6 +500,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
> > >       if (!sink)
> > >               return;
> > >
> > > +     etr_perf_polling_event_stop(event, event_data);
> > > +
> > >       /* stop tracer */
> > >       source_ops(csdev)->disable(csdev, event);
> > >
> > > @@ -741,6 +746,8 @@ int __init etm_perf_init(void)
> > >       etm_pmu.addr_filters_validate   = etm_addr_filters_validate;
> > >       etm_pmu.nr_addr_filters         = ETM_ADDR_CMP_MAX;
> > >
> > > +     etr_perf_polling_init();
> >
> > The problem here is that a function specific to an ETR is inserted in code that
> > should be generic.  So if we want to do the same for ETB, ETF and any kind of
> > sink that's out there, we'd have to duplicate code and it would quickly get
> > messy.  What is needed is a generic solution.
> >
> > Moreover what is proposed here is centered on sinks, while the entire coresight
> > framework is centered on sources.  The polling function, although specific to
> > sinks, should emanate from and be driven by the generic code that handles
> > sources.  There should also be a single event responsible for doing the polling
> > rather than all of them as it is the case in this patchset.  In per-thread mode
> > where a single thread is traced it isn't a problem.  In CPU-wide scenarios the
> > first event should do the polling.  There will be a gap between the time that
> > event stops and all the other events in the trace session stop but that is a
> > limitation we will have to live with.
> >
> 
> In my previous reviews of this I have suggested that this must be ETR
> specific - the object of feature is to use polling to mimic the
> interrupt we get with somehting  like TRBE, and get more consistent
> output across the entire trace run - which is the objective of the
> feature and has been backed up by evidence from those that have tried
> it out..
> Granted, TRBE is per core and thus can only be operating on the single
> source, but the guiding idea remains the same - save data whenever the
> sink hardware is full (or artificailly stopped in the case of
> polling). TRBE does not, as far as I recall wait for an event to stop
> before servicing the full interrupt.,

I think we all recognize that TRBEs are not part of this conversation, something I
should have made clearer when I commented on "the kind of sink out there".

I also think we have the same vision for this feature, that is to make it look
like an interrupt except that instead of being triggered by the HW it comes from
a timer.

> 
> The worry is that is this is source event driven then any advantages
> would be wiped out by waiting for events in the trace session to stop.

I see three main problems with the proposed implementation:

1) It doesn't apply to other N:1 sinks such as ETB and ETF.

2) Every event in a session starts a timer.  What we need is a single timer
driving the polling. 

3) A reference to the perf handle (of type perf_output_handle) is taken in the
polling_event_list structure.  I understand why this is done but that handle
belongs to the perf core and we shouldn't rely on it.  First because it is not
guaranteed to be valid [1] and second we don't know what the perf core will do
with it in the future.

When referring to "source" driven, it really means "event" driven.  The timer
should belong to the etm_event_data structure.  In per-thread mode there is only
one invent and in CPU wide scenarios there is one per CPU.  In the latter case the
first event of the session should handles the timer.  Each time an event is installed
on a CPU the timer, if present, is armed and inversely disarmed when it is removed.
That way we don't have to mess with the perf handle and there is only one timer per
trace session.

When the timer fires the etm_ctxt is valid because the event is running (when
the task is yanked away from a CPU the timer is cancelled).  As such the timer
handler can call sink_ops(sink)->update_buffer() and deal with the
perf_aux_{input | output}_end() functions seamlessly.  That would take care of
the perf handle context problem and would be usable by any of the N:1 sinks we
currently support.  Note that ->update_buffer() would need a new parameter,
something like "bool timer", to overwrite the sink's reference count.

The only scenario I'm not tackling is --per-thread that spuns off child thread
because I don't remember the details, but Suzuki should.

Thanks,
Mathieu

PS: Daniel: I should have warned you - this is a complex feature in a subsystem
that is already quite complex.  As such I expect we'll iterate a few times over
the design to make sure everything is handled properly.  

[1]. https://elixir.bootlin.com/linux/latest/source/drivers/hwtracing/coresight/coresight-etm-perf.c#L28

> 
> Regards
> 
> Mike
> 
> 
> > That leaves us with per-thread sessions where child threads are spun off.  You
> > will have to look into how events are handle in that case.  Suzuki has done work
> > in that area and might remember a few things.
> >
> > There are other design problems with this set that I won't get into since it
> > needs a refactoring anyway.
> >
> > Regards,
> > Mathieu
> >
> > > +
> > >       ret = perf_pmu_register(&etm_pmu, CORESIGHT_ETM_PMU_NAME, -1);
> > >       if (ret == 0)
> > >               etm_perf_up = true;
> > > @@ -750,5 +757,6 @@ int __init etm_perf_init(void)
> > >
> > >  void __exit etm_perf_exit(void)
> > >  {
> > > +     etr_perf_polling_exit();
> > >       perf_pmu_unregister(&etm_pmu);
> > >  }
> > > diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.c b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> > > new file mode 100644
> > > index 0000000000000..87e6bc42a62de
> > > --- /dev/null
> > > +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.c
> > > @@ -0,0 +1,275 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright(C) 2021 Arm Limited. All rights reserved.
> > > + * Author: Daniel Kiss <daniel.kiss@arm.com>
> > > + */
> > > +
> > > +#include <linux/coresight.h>
> > > +#include <linux/coresight-pmu.h>
> > > +#include <linux/cpumask.h>
> > > +#include <linux/device.h>
> > > +#include <linux/init.h>
> > > +#include <linux/list.h>
> > > +#include <linux/percpu-defs.h>
> > > +#include <linux/perf_event.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/stringhash.h>
> > > +#include <linux/types.h>
> > > +#include <linux/workqueue.h>
> > > +
> > > +#include "coresight-etr-perf-polling.h"
> > > +#include "coresight-priv.h"
> > > +#include "coresight-tmc.h"
> > > +
> > > +struct polling_event_list {
> > > +     struct perf_event *perf_event;
> > > +     struct etm_event_data *etm_event_data;
> > > +     struct perf_output_handle *ctx_handle;
> > > +     void (*tmc_etr_reset_hw)(struct tmc_drvdata *data);
> > > +     struct list_head list;
> > > +};
> > > +
> > > +struct polling {
> > > +     int cpu;
> > > +     struct polling_event_list *polled_event;
> > > +     struct delayed_work delayed_work;
> > > +};
> > > +
> > > +static atomic_t period;
> > > +static spinlock_t spinlock_re;
> > > +static struct list_head registered_events;
> > > +
> > > +static DEFINE_PER_CPU(struct polling, polling);
> > > +
> > > +static ssize_t period_show(struct device *dev, struct device_attribute *attr,
> > > +                        char *buf)
> > > +{
> > > +     int temp;
> > > +
> > > +     temp = atomic_read(&period);
> > > +     return sprintf(buf, "%i\n", temp);
> > > +}
> > > +
> > > +static ssize_t period_store(struct device *dev, struct device_attribute *attr,
> > > +                         const char *buf, size_t count)
> > > +{
> > > +     int temp = 0;
> > > +
> > > +     if (!kstrtoint(buf, 10, &temp) && (temp >= 0))
> > > +             atomic_set(&period, temp);
> > > +     return count;
> > > +}
> > > +
> > > +static DEVICE_ATTR_RW(period);
> > > +
> > > +static struct attribute *coresight_tmc_polling_attrs[] = {
> > > +     &dev_attr_period.attr,
> > > +     NULL,
> > > +};
> > > +const struct attribute_group coresight_tmc_polling_group = {
> > > +     .attrs = coresight_tmc_polling_attrs,
> > > +     .name = "polling",
> > > +};
> > > +EXPORT_SYMBOL_GPL(coresight_tmc_polling_group);
> > > +
> > > +static inline void polling_sched_worker(struct polling *p)
> > > +{
> > > +     int tickrate = atomic_read(&period);
> > > +
> > > +     if (tickrate > 0)
> > > +             schedule_delayed_work_on(p->cpu, &p->delayed_work,
> > > +                                      msecs_to_jiffies(tickrate));
> > > +}
> > > +
> > > +static inline bool is_etr_related(struct etm_event_data *etm_event_data, int cpu)
> > > +{
> > > +     struct list_head *path;
> > > +     struct coresight_device *sink;
> > > +     struct tmc_drvdata *drvdata;
> > > +
> > > +     path = etm_event_cpu_path(etm_event_data, cpu);
> > > +     if (WARN_ON(!path))
> > > +             return false;
> > > +     sink = coresight_get_sink(path);
> > > +     if (WARN_ON(!sink))
> > > +             return false;
> > > +     drvdata = dev_get_drvdata(sink->dev.parent);
> > > +     if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
> > > +             return false;
> > > +     return true;
> > > +}
> > > +
> > > +/*
> > > + * Adds the event to the polled events list.
> > > + */
> > > +void etr_perf_polling_event_start(struct perf_event *event,
> > > +                               struct etm_event_data *etm_event_data,
> > > +                               struct perf_output_handle *ctx_handle)
> > > +{
> > > +     int cpu = smp_processor_id();
> > > +     struct polling *p = per_cpu_ptr(&polling, cpu);
> > > +     struct polling_event_list *element, *tmp;
> > > +
> > > +     if (!is_etr_related(etm_event_data, cpu))
> > > +             return;
> > > +
> > > +     spin_lock(&spinlock_re);
> > > +     list_for_each_entry_safe(element, tmp, &registered_events, list) {
> > > +             if (element->ctx_handle == ctx_handle) {
> > > +                     element->perf_event = event;
> > > +                     element->etm_event_data = etm_event_data;
> > > +                     spin_unlock(&spinlock_re);
> > > +                     p->polled_event = element;
> > > +                     polling_sched_worker(p);
> > > +                     return;
> > > +             }
> > > +     }
> > > +     spin_unlock(&spinlock_re);
> > > +}
> > > +
> > > +/*
> > > + * Removes the event from the to be polled events list.
> > > + */
> > > +void etr_perf_polling_event_stop(struct perf_event *event,
> > > +                              struct etm_event_data *etm_event_data)
> > > +{
> > > +     int cpu = smp_processor_id();
> > > +     struct polling *p = per_cpu_ptr(&polling, cpu);
> > > +
> > > +     if (!is_etr_related(etm_event_data, cpu))
> > > +             return;
> > > +
> > > +     if (p->polled_event) {
> > > +             struct polling_event_list *element = p->polled_event;
> > > +
> > > +             if (element->perf_event == event) {
> > > +                     p->polled_event = NULL;
> > > +                     element->perf_event = NULL;
> > > +                     element->etm_event_data = NULL;
> > > +                     cancel_delayed_work(&p->delayed_work);
> > > +                     return;
> > > +             }
> > > +     }
> > > +}
> > > +
> > > +/*
> > > + * The polling worker is a workqueue job which is periodically
> > > + * woken up to update the perf aux buffer from the etr shrink.
> > > + */
> > > +static void etr_perf_polling_worker(struct work_struct *work)
> > > +{
> > > +     unsigned long flags;
> > > +     struct list_head *path;
> > > +     struct coresight_device *sink;
> > > +     int size;
> > > +     int cpu = smp_processor_id();
> > > +     struct polling *p = per_cpu_ptr(&polling, cpu);
> > > +
> > > +     if (!atomic_read(&period))
> > > +             return;
> > > +
> > > +     if (!p->polled_event)
> > > +             return;
> > > +     /*
> > > +      * Scheduling would do the same from the perf hooks,
> > > +      * this should be done in one go.
> > > +      */
> > > +     local_irq_save(flags);
> > > +
> > > +     polling_sched_worker(p);
> > > +
> > > +     path = etm_event_cpu_path(p->polled_event->etm_event_data, cpu);
> > > +     sink = coresight_get_sink(path);
> > > +     size = sink_ops(sink)->update_buffer(
> > > +             sink, p->polled_event->ctx_handle,
> > > +             p->polled_event->etm_event_data->snk_config);
> > > +
> > > +     /*
> > > +      * Restart the trace.
> > > +      */
> > > +     if (p->polled_event->tmc_etr_reset_hw)
> > > +             p->polled_event->tmc_etr_reset_hw(dev_get_drvdata(sink->dev.parent));
> > > +
> > > +     WARN_ON(size < 0);
> > > +     if (size > 0) {
> > > +             struct etm_event_data *new_event_data;
> > > +
> > > +             perf_aux_output_end(p->polled_event->ctx_handle, size);
> > > +             new_event_data = perf_aux_output_begin(
> > > +                     p->polled_event->ctx_handle,
> > > +                     p->polled_event->perf_event);
> > > +             if (WARN_ON(new_event_data == NULL)) {
> > > +                     local_irq_restore(flags);
> > > +                     return;
> > > +             }
> > > +
> > > +             p->polled_event->etm_event_data = new_event_data;
> > > +             WARN_ON(new_event_data->snk_config !=
> > > +                     p->polled_event->etm_event_data->snk_config);
> > > +     }
> > > +
> > > +     local_irq_restore(flags);
> > > +}
> > > +
> > > +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> > > +                                   void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata))
> > > +{
> > > +     struct polling_event_list *element;
> > > +
> > > +     element = kmalloc(sizeof(*element), GFP_ATOMIC);
> > > +     if (WARN_ON(!element))
> > > +             return;
> > > +     memset(element, 0, sizeof(*element));
> > > +     element->ctx_handle = handle;
> > > +     element->tmc_etr_reset_hw = tmc_etr_reset_hw;
> > > +     spin_lock(&spinlock_re);
> > > +     list_add(&element->list, &registered_events);
> > > +     spin_unlock(&spinlock_re);
> > > +}
> > > +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_register);
> > > +
> > > +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle)
> > > +{
> > > +     struct polling_event_list *element, *tmp;
> > > +
> > > +     spin_lock(&spinlock_re);
> > > +     list_for_each_entry_safe(element, tmp, &registered_events, list) {
> > > +             if (element->ctx_handle == handle) {
> > > +                     list_del(&element->list);
> > > +                     spin_unlock(&spinlock_re);
> > > +                     kfree(element);
> > > +                     return;
> > > +             }
> > > +     }
> > > +     spin_unlock(&spinlock_re);
> > > +}
> > > +EXPORT_SYMBOL_GPL(etr_perf_polling_handle_deregister);
> > > +
> > > +void etr_perf_polling_init(void)
> > > +{
> > > +     int cpu;
> > > +
> > > +     spin_lock_init(&spinlock_re);
> > > +     INIT_LIST_HEAD(&registered_events);
> > > +     atomic_set(&period, 0);
> > > +     for_each_possible_cpu(cpu) {
> > > +             struct polling *p = per_cpu_ptr(&polling, cpu);
> > > +
> > > +             p->cpu = cpu;
> > > +             p->polled_event = NULL;
> > > +             INIT_DELAYED_WORK(&p->delayed_work, etr_perf_polling_worker);
> > > +     }
> > > +}
> > > +
> > > +void etr_perf_polling_exit(void)
> > > +{
> > > +     int cpu;
> > > +
> > > +     for_each_possible_cpu(cpu) {
> > > +             struct polling *p = per_cpu_ptr(&polling, cpu);
> > > +
> > > +             cancel_delayed_work_sync(&p->delayed_work);
> > > +             WARN_ON(p->polled_event);
> > > +     }
> > > +     WARN_ON(!list_empty(&registered_events));
> > > +}
> > > diff --git a/drivers/hwtracing/coresight/coresight-etr-perf-polling.h b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> > > new file mode 100644
> > > index 0000000000000..d47b4424594e6
> > > --- /dev/null
> > > +++ b/drivers/hwtracing/coresight/coresight-etr-perf-polling.h
> > > @@ -0,0 +1,38 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +/*
> > > + * Copyright(C) 2021 Arm Limited. All rights reserved.
> > > + * Author: Daniel Kiss <daniel.kiss@arm.com>
> > > + */
> > > +
> > > +#ifndef _CORESIGHT_ETM_PERF_POLLING_H
> > > +#define _CORESIGHT_ETM_PERF_POLLING_H
> > > +
> > > +#include <linux/coresight.h>
> > > +#include <linux/perf_event.h>
> > > +#include "coresight-etm-perf.h"
> > > +#include "coresight-tmc.h"
> > > +
> > > +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> > > +void etr_perf_polling_init(void);
> > > +void etr_perf_polling_exit(void);
> > > +void etr_perf_polling_handle_register(struct perf_output_handle *handle,
> > > +                                   void (*tmc_etr_reset_hw)(struct tmc_drvdata *drvdata));
> > > +void etr_perf_polling_handle_deregister(struct perf_output_handle *handle);
> > > +void etr_perf_polling_event_start(struct perf_event *event,
> > > +                               struct etm_event_data *etm_event_data,
> > > +                               struct perf_output_handle *ctx_handle);
> > > +void etr_perf_polling_event_stop(struct perf_event *event,
> > > +                              struct etm_event_data *etm_event_data);
> > > +
> > > +extern const struct attribute_group coresight_tmc_polling_group;
> > > +
> > > +#else /* !CONFIG_CORESIGHT_ETR_PERF_POLL */
> > > +#define etr_perf_polling_init()
> > > +#define etr_perf_polling_exit()
> > > +#define etr_perf_polling_handle_register(...)
> > > +#define etr_perf_polling_handle_deregister(...)
> > > +#define etr_perf_polling_event_start(...)
> > > +#define etr_perf_polling_event_stop(...)
> > > +#endif
> > > +
> > > +#endif
> > > diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> > > index 74c6323d4d6ab..dbcdba162bd38 100644
> > > --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> > > +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> > > @@ -26,6 +26,7 @@
> > >
> > >  #include "coresight-priv.h"
> > >  #include "coresight-tmc.h"
> > > +#include "coresight-etr-perf-polling.h"
> > >
> > >  DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
> > >  DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
> > > @@ -365,6 +366,9 @@ static const struct attribute_group coresight_tmc_mgmt_group = {
> > >  static const struct attribute_group *coresight_tmc_groups[] = {
> > >       &coresight_tmc_group,
> > >       &coresight_tmc_mgmt_group,
> > > +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> > > +     &coresight_tmc_polling_group,
> > > +#endif
> > >       NULL,
> > >  };
> > >
> > > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > > index 55c9b5fd9f832..67cd4bdcda71b 100644
> > > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > > @@ -16,6 +16,7 @@
> > >  #include <linux/vmalloc.h>
> > >  #include "coresight-catu.h"
> > >  #include "coresight-etm-perf.h"
> > > +#include "coresight-etr-perf-polling.h"
> > >  #include "coresight-priv.h"
> > >  #include "coresight-tmc.h"
> > >
> > > @@ -1137,6 +1138,16 @@ void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
> > >       drvdata->etr_buf = NULL;
> > >  }
> > >
> > > +#ifdef CONFIG_CORESIGHT_ETR_PERF_POLL
> > > +
> > > +static void tmc_etr_reset_hw(struct tmc_drvdata *drvdata)
> > > +{
> > > +     __tmc_etr_disable_hw(drvdata);
> > > +     __tmc_etr_enable_hw(drvdata);
> > > +}
> > > +
> > > +#endif
> > > +
> > >  static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
> > >  {
> > >       int ret = 0;
> > > @@ -1620,6 +1631,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
> > >               drvdata->mode = CS_MODE_PERF;
> > >               drvdata->perf_buf = etr_perf->etr_buf;
> > >               drvdata->perf_handle = handle;
> > > +             etr_perf_polling_handle_register(handle, tmc_etr_reset_hw);
> > >               atomic_inc(csdev->refcnt);
> > >       }
> > >
> > > @@ -1667,6 +1679,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
> > >       drvdata->mode = CS_MODE_DISABLED;
> > >       /* Reset perf specific data */
> > >       drvdata->perf_buf = NULL;
> > > +     etr_perf_polling_handle_deregister(drvdata->perf_handle);
> > >       drvdata->perf_handle = NULL;
> > >
> > >       spin_unlock_irqrestore(&drvdata->spinlock, flags);
> > > --
> > > 2.25.1
> > >
> 
> 
> 
> -- 
> Mike Leach
> Principal Engineer, ARM Ltd.
> Manchester Design Centre. UK

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-09-01 18:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-13 12:15 [PATCHv2 0/4] coresight: Add ETR-PERF polling Daniel Kiss
2021-07-13 12:15 ` [PATCHv2 1/4] coresight: tmc-etr: Use handle->head from perf_output_handle directly Daniel Kiss
2021-07-13 12:15 ` [PATCHv2 2/4] coresight: tmc-etr: Track perf handler Daniel Kiss
2021-08-25 17:02   ` Mathieu Poirier
2021-08-25 19:09     ` Mathieu Poirier
2021-07-13 12:15 ` [PATCHv2 3/4] coresight: etm-perf: Export etm_event_cpu_path Daniel Kiss
2021-07-13 12:15 ` [PATCHv2 4/4] coresight: Add ETR-PERF polling Daniel Kiss
2021-08-26 16:41   ` Mathieu Poirier
2021-08-26 17:32   ` Mathieu Poirier
2021-08-31 13:31     ` Mike Leach
2021-09-01 18:02       ` Mathieu Poirier

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.