linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
@ 2019-03-25 21:56 Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 01/16] coresight: pmu: Adding ITRACE property to cs_etm PMU Mathieu Poirier
                   ` (17 more replies)
  0 siblings, 18 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

This is the second revision of a patchset that adds support for CPU-wide
trace scenarios and as such, it is now possible to issue the following
commands:

	# perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
	# perf record -e cs_etm/@20070000.etr/ -a $COMMAND

The solution is designed to work for both 1:1 and N:1 source/sink
topologies, though the former hasn't been tested for lack of access to HW.

Most of the changes revolve around allowing more than one event to use
a sink when operated from perf.  More specifically the first event to
use a sink switches it on while the last one is tasked to aggregate traces
and switching off the device.

This is the kernel part of the solution, with the user space portion to be
released in a later set.  All patches (user and kernel) have been rebased
on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
410c dragonboard platforms.

Regards,
Mathieu

[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git (5.1-rc2-cpu-wide-v2) 

== Changes for V2 ==
* Using define ETM4_CFG_BIT_CTXTID rather than hard coded value (Suzuki).
* Moved pid out of struct etr_buf and into struct etr_perf_buffer (Suzuki).
* Removed code related to forcing double buffering (Suzuki).
* Fixed function reallocarray() for older distributions (Mike).
* Fixed counter configuration when dealing with errors(Leo).
* Automatically selecting PID_IN_CONTEXTIDR with ETMv4 driver.
* Rebased to v5.1-rc2.

Mathieu Poirier (16):
  coresight: pmu: Adding ITRACE property to cs_etm PMU
  coresight: etm4x: Add kernel configuration for CONTEXTID
  coresight: etm4x: Configure tracers to emit timestamps
  coresight: Adding return code to sink::disable() operation
  coresight: Move reference counting inside sink drivers
  coresight: Properly address errors in sink::disable() functions
  coresight: Properly address concurrency in sink::update() functions
  coresight: perf: Clean up function etm_setup_aux()
  coresight: perf: Refactor function free_event_data()
  coresight: Communicate perf event to sink buffer allocation function
  coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf()
  coresight: tmc-etr: Introduce the notion of process ID to ETR devices
  coresight: tmc-etr: Allow events to use the same ETR buffer
  coresight: tmc-etr: Add support for CPU-wide trace scenarios
  coresight: tmc-etf: Add support for CPU-wide trace scenarios
  coresight: etb10: Add support for CPU-wide trace scenarios

 drivers/hwtracing/coresight/Kconfig           |   1 +
 drivers/hwtracing/coresight/coresight-etb10.c |  83 +++++--
 .../hwtracing/coresight/coresight-etm-perf.c  |  37 +++-
 drivers/hwtracing/coresight/coresight-etm4x.c | 120 ++++++++++-
 .../hwtracing/coresight/coresight-tmc-etf.c   |  82 +++++--
 .../hwtracing/coresight/coresight-tmc-etr.c   | 204 +++++++++++++++---
 drivers/hwtracing/coresight/coresight-tmc.c   |   2 +
 drivers/hwtracing/coresight/coresight-tmc.h   |   6 +
 drivers/hwtracing/coresight/coresight-tpiu.c  |   9 +-
 drivers/hwtracing/coresight/coresight.c       |  28 +--
 include/linux/coresight-pmu.h                 |   2 +
 include/linux/coresight.h                     |   7 +-
 tools/include/linux/coresight-pmu.h           |   2 +
 13 files changed, 482 insertions(+), 101 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2 01/16] coresight: pmu: Adding ITRACE property to cs_etm PMU
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 02/16] coresight: etm4x: Add kernel configuration for CONTEXTID Mathieu Poirier
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

Add to the capabilities the ITRACE property so that ITRACE START events
are generated when the PMU is switched on by the core.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 4d5a2b9f9d6a..25ae56e924bb 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -566,7 +566,8 @@ static int __init etm_perf_init(void)
 {
 	int ret;
 
-	etm_pmu.capabilities		= PERF_PMU_CAP_EXCLUSIVE;
+	etm_pmu.capabilities		= (PERF_PMU_CAP_EXCLUSIVE |
+					   PERF_PMU_CAP_ITRACE);
 
 	etm_pmu.attr_groups		= etm_pmu_attr_groups;
 	etm_pmu.task_ctx_nr		= perf_sw_context;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 02/16] coresight: etm4x: Add kernel configuration for CONTEXTID
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 01/16] coresight: pmu: Adding ITRACE property to cs_etm PMU Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 11:59   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 03/16] coresight: etm4x: Configure tracers to emit timestamps Mathieu Poirier
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

Set the proper bit in the configuration register when contextID tracing
has been requested by user space.  That way PE_CONTEXT elements are
generated by the tracers when a process is installed on a CPU.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/Kconfig              | 1 +
 drivers/hwtracing/coresight/coresight-etm-perf.c | 2 ++
 drivers/hwtracing/coresight/coresight-etm4x.c    | 5 +++++
 include/linux/coresight-pmu.h                    | 2 ++
 tools/include/linux/coresight-pmu.h              | 2 ++
 5 files changed, 12 insertions(+)

diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index ad34380cac49..44d1650f398e 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -75,6 +75,7 @@ config CORESIGHT_SOURCE_ETM4X
 	bool "CoreSight Embedded Trace Macrocell 4.x driver"
 	depends on ARM64
 	select CORESIGHT_LINKS_AND_SINKS
+	select PID_IN_CONTEXTIDR
 	help
 	  This driver provides support for the ETM4.x tracer module, tracing the
 	  instructions that a processor is executing. This is primarily useful
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 25ae56e924bb..bbfed70b3402 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -29,6 +29,7 @@ static DEFINE_PER_CPU(struct coresight_device *, csdev_src);
 
 /* ETMv3.5/PTM's ETMCR is 'config' */
 PMU_FORMAT_ATTR(cycacc,		"config:" __stringify(ETM_OPT_CYCACC));
+PMU_FORMAT_ATTR(contextid,	"config:" __stringify(ETM_OPT_CTXTID));
 PMU_FORMAT_ATTR(timestamp,	"config:" __stringify(ETM_OPT_TS));
 PMU_FORMAT_ATTR(retstack,	"config:" __stringify(ETM_OPT_RETSTK));
 /* Sink ID - same for all ETMs */
@@ -36,6 +37,7 @@ PMU_FORMAT_ATTR(sinkid,		"config2:0-31");
 
 static struct attribute *etm_config_formats_attr[] = {
 	&format_attr_cycacc.attr,
+	&format_attr_contextid.attr,
 	&format_attr_timestamp.attr,
 	&format_attr_retstack.attr,
 	&format_attr_sinkid.attr,
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index 08ce37c9475d..732ae12fca9b 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -239,6 +239,11 @@ static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
 	if (attr->config & BIT(ETM_OPT_TS))
 		/* bit[11], Global timestamp tracing bit */
 		config->cfg |= BIT(11);
+
+	if (attr->config & BIT(ETM_OPT_CTXTID))
+		/* bit[6], Context ID tracing bit */
+		config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
+
 	/* return stack - enable if selected and supported */
 	if ((attr->config & BIT(ETM_OPT_RETSTK)) && drvdata->retstack)
 		/* bit[12], Return stack enable bit */
diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
index a1a959ba24ff..b0e35eec6499 100644
--- a/include/linux/coresight-pmu.h
+++ b/include/linux/coresight-pmu.h
@@ -12,11 +12,13 @@
 
 /* ETMv3.5/PTM's ETMCR config bit */
 #define ETM_OPT_CYCACC  12
+#define ETM_OPT_CTXTID	14
 #define ETM_OPT_TS      28
 #define ETM_OPT_RETSTK	29
 
 /* ETMv4 CONFIGR programming bits for the ETM OPTs */
 #define ETM4_CFG_BIT_CYCACC	4
+#define ETM4_CFG_BIT_CTXTID	6
 #define ETM4_CFG_BIT_TS		11
 #define ETM4_CFG_BIT_RETSTK	12
 
diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
index a1a959ba24ff..b0e35eec6499 100644
--- a/tools/include/linux/coresight-pmu.h
+++ b/tools/include/linux/coresight-pmu.h
@@ -12,11 +12,13 @@
 
 /* ETMv3.5/PTM's ETMCR config bit */
 #define ETM_OPT_CYCACC  12
+#define ETM_OPT_CTXTID	14
 #define ETM_OPT_TS      28
 #define ETM_OPT_RETSTK	29
 
 /* ETMv4 CONFIGR programming bits for the ETM OPTs */
 #define ETM4_CFG_BIT_CYCACC	4
+#define ETM4_CFG_BIT_CTXTID	6
 #define ETM4_CFG_BIT_TS		11
 #define ETM4_CFG_BIT_RETSTK	12
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 03/16] coresight: etm4x: Configure tracers to emit timestamps
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 01/16] coresight: pmu: Adding ITRACE property to cs_etm PMU Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 02/16] coresight: etm4x: Add kernel configuration for CONTEXTID Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 11:53   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 04/16] coresight: Adding return code to sink::disable() operation Mathieu Poirier
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

Configure timestamps to be emitted at regular intervals in the trace
stream to temporally correlate instructions executed on different CPUs.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etm4x.c | 115 +++++++++++++++++-
 1 file changed, 112 insertions(+), 3 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index 732ae12fca9b..45c341a5aa0b 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -138,8 +138,11 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
 			       drvdata->base + TRCCNTVRn(i));
 	}
 
-	/* Resource selector pair 0 is always implemented and reserved */
-	for (i = 0; i < drvdata->nr_resource * 2; i++)
+	/*
+	 * Resource selector pair 0 is always implemented and reserved.  As
+	 * such start at 2.
+	 */
+	for (i = 2; i < drvdata->nr_resource * 2; i++)
 		writel_relaxed(config->res_ctrl[i],
 			       drvdata->base + TRCRSCTLRn(i));
 
@@ -201,6 +204,97 @@ static void etm4_enable_hw_smp_call(void *info)
 	arg->rc = etm4_enable_hw(arg->drvdata);
 }
 
+/*
+ * The goal of function etm4_config_timestamp_event() is to configure a
+ * counter that will tell the tracer to emit a timestamp packet when it
+ * reaches zero.  This is done in order to get a more fine grained idea
+ * of when instructions are executed so that they can be correlated
+ * with execution on other CPUs.
+ *
+ * To do this the counter itself is configured to self reload and
+ * TRCRSCTLR1 (always true) used to get the counter to decrement.  From
+ * there a resource selector is configured with the counter and the
+ * timestamp control register to use the resource selector to trigger the
+ * event that will insert a timestamp packet in the stream.
+ */
+static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata)
+{
+	int i, ctridx, ret = -EINVAL;
+	int counter, rselector;
+	u32 val = 0;
+	struct etmv4_config *config = &drvdata->config;
+
+	/* No point in trying if we don't have at least one counter */
+	if (!drvdata->nr_cntr)
+		goto out;
+
+	/* Find a counter that hasn't been initialised */
+	for (i = 0; i < drvdata->nr_cntr; i++)
+		if (config->cntr_val[i] == 0)
+			break;
+
+	/* Remember what counter we used */
+	counter = 1 << i;
+	ctridx = i;
+
+	/* All the counters have been configured already, bail out */
+	if (i == drvdata->nr_cntr) {
+		pr_err("%s: no available counter found\n", __func__);
+		goto out;
+	}
+
+	/*
+	 * Initialise original and reload counter value to the smallest
+	 * possible value in order to get as much precision as we can.
+	 */
+	config->cntr_val[ctridx] = 1;
+	config->cntrldvr[ctridx] = 1;
+
+	/* Set the trace counter control register */
+	val =  0x1 << 16	|  /* Bit 16, reload counter automatically */
+	       0x0 << 7		|  /* Select single resource selector */
+	       0x1;		   /* Resource selector 1, i.e always true */
+
+	config->cntr_ctrl[ctridx] = val;
+
+	/*
+	 * Searching for an available resource selector to use, starting at
+	 * '2' since every implementation has at least 2 resource selector.
+	 * ETMIDR4 gives the number of resource selector _pairs_,
+	 * hence multiply by 2.
+	 */
+	for (i = 2; i < drvdata->nr_resource * 2; i++)
+		if (!config->res_ctrl[i])
+			break;
+
+	/* Remember what resource selector we used */
+	rselector = i;
+
+	if (i == drvdata->nr_resource * 2) {
+		pr_err("%s: no available resource selector found\n", __func__);
+
+		/* Backout what we did and exit */
+		config->cntr_ctrl[ctridx] = 0;
+		config->cntrldvr[ctridx] = 0;
+		config->cntr_val[ctridx] = 0;
+		goto out;
+	}
+
+	val = 0x2 << 16		| /* Group 0b0010 - Counter and sequencers */
+	      counter << 0;	  /* Counter to use */
+
+	config->res_ctrl[i] = val;
+
+	val = 0x0 << 7		| /* Select single resource selector */
+	      rselector;	  /* Resource selector */
+
+	config->ts_ctrl = val;
+
+	ret = 0;
+out:
+	return ret;
+}
+
 static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
 				   struct perf_event *event)
 {
@@ -236,9 +330,24 @@ static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
 		/* TRM: Must program this for cycacc to work */
 		config->ccctlr = ETM_CYC_THRESHOLD_DEFAULT;
 	}
-	if (attr->config & BIT(ETM_OPT_TS))
+	if (attr->config & BIT(ETM_OPT_TS)) {
+		/*
+		 * Configure timestamps to be emitted at regular intervals in
+		 * order to correlate instructions executed on different CPUs
+		 * (CPU-wide trace scenarios).
+		 */
+		ret = etm4_config_timestamp_event(drvdata);
+
+		/*
+		 * No need to go further if timestamp intervals can't
+		 * be configured.
+		 */
+		if (ret)
+			goto out;
+
 		/* bit[11], Global timestamp tracing bit */
 		config->cfg |= BIT(11);
+	}
 
 	if (attr->config & BIT(ETM_OPT_CTXTID))
 		/* bit[6], Context ID tracing bit */
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 04/16] coresight: Adding return code to sink::disable() operation
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (2 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 03/16] coresight: etm4x: Configure tracers to emit timestamps Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 14:55   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 05/16] coresight: Move reference counting inside sink drivers Mathieu Poirier
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

In preparation to handle device reference counting inside of the sink
drivers, add a return code to the sink::disable() operation so that
proper action can be taken if a sink has not been disabled.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etb10.c   | 3 ++-
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 5 +++--
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 5 +++--
 drivers/hwtracing/coresight/coresight-tpiu.c    | 3 ++-
 drivers/hwtracing/coresight/coresight.c         | 6 +++++-
 include/linux/coresight.h                       | 2 +-
 6 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 105782ea64c7..71c2a3cdb866 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -325,7 +325,7 @@ static void etb_disable_hw(struct etb_drvdata *drvdata)
 	coresight_disclaim_device(drvdata->base);
 }
 
-static void etb_disable(struct coresight_device *csdev)
+static int etb_disable(struct coresight_device *csdev)
 {
 	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 	unsigned long flags;
@@ -340,6 +340,7 @@ static void etb_disable(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	dev_dbg(drvdata->dev, "ETB disabled\n");
+	return 0;
 }
 
 static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu,
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index a5f053f2db2c..d4213e7c2c45 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -273,7 +273,7 @@ static int tmc_enable_etf_sink(struct coresight_device *csdev,
 	return 0;
 }
 
-static void tmc_disable_etf_sink(struct coresight_device *csdev)
+static int tmc_disable_etf_sink(struct coresight_device *csdev)
 {
 	unsigned long flags;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
@@ -281,7 +281,7 @@ static void tmc_disable_etf_sink(struct coresight_device *csdev)
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 	if (drvdata->reading) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
-		return;
+		return -EBUSY;
 	}
 
 	/* Disable the TMC only if it needs to */
@@ -293,6 +293,7 @@ static void tmc_disable_etf_sink(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	dev_dbg(drvdata->dev, "TMC-ETB/ETF disabled\n");
+	return 0;
 }
 
 static int tmc_enable_etf_link(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index f684283890d3..33501777038a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1392,7 +1392,7 @@ static int tmc_enable_etr_sink(struct coresight_device *csdev,
 	return -EINVAL;
 }
 
-static void tmc_disable_etr_sink(struct coresight_device *csdev)
+static int tmc_disable_etr_sink(struct coresight_device *csdev)
 {
 	unsigned long flags;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
@@ -1400,7 +1400,7 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 	if (drvdata->reading) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
-		return;
+		return -EBUSY;
 	}
 
 	/* Disable the TMC only if it needs to */
@@ -1412,6 +1412,7 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	dev_dbg(drvdata->dev, "TMC-ETR disabled\n");
+	return 0;
 }
 
 static const struct coresight_ops_sink tmc_etr_sink_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c
index b2f72a1fa402..0d13da1b9df1 100644
--- a/drivers/hwtracing/coresight/coresight-tpiu.c
+++ b/drivers/hwtracing/coresight/coresight-tpiu.c
@@ -94,13 +94,14 @@ static void tpiu_disable_hw(struct tpiu_drvdata *drvdata)
 	CS_LOCK(drvdata->base);
 }
 
-static void tpiu_disable(struct coresight_device *csdev)
+static int tpiu_disable(struct coresight_device *csdev)
 {
 	struct tpiu_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
 	tpiu_disable_hw(drvdata);
 
 	dev_dbg(drvdata->dev, "TPIU disabled\n");
+	return 0;
 }
 
 static const struct coresight_ops_sink tpiu_sink_ops = {
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index 29cef898afba..13eda4693f81 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -239,9 +239,13 @@ static int coresight_enable_sink(struct coresight_device *csdev,
 
 static void coresight_disable_sink(struct coresight_device *csdev)
 {
+	int ret;
+
 	if (atomic_dec_return(csdev->refcnt) == 0) {
 		if (sink_ops(csdev)->disable) {
-			sink_ops(csdev)->disable(csdev);
+			ret = sink_ops(csdev)->disable(csdev);
+			if (ret)
+				return;
 			csdev->enable = false;
 		}
 	}
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 7b87965f7a65..189cc6ddc92b 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -192,7 +192,7 @@ struct coresight_device {
  */
 struct coresight_ops_sink {
 	int (*enable)(struct coresight_device *csdev, u32 mode, void *data);
-	void (*disable)(struct coresight_device *csdev);
+	int (*disable)(struct coresight_device *csdev);
 	void *(*alloc_buffer)(struct coresight_device *csdev, int cpu,
 			      void **pages, int nr_pages, bool overwrite);
 	void (*free_buffer)(void *config);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 05/16] coresight: Move reference counting inside sink drivers
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (3 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 04/16] coresight: Adding return code to sink::disable() operation Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 15:04   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 06/16] coresight: Properly address errors in sink::disable() functions Mathieu Poirier
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

When operating in CPU-wide mode with an N:1 source/sink HW topology,
multiple CPUs can access a sink concurrently.  As such reference counting
needs to happen when the device's spinlock is held to avoid racing with
other operations (start(), update(), stop()), such as:

session A				Session B
-----					-------

enable_sink
atomic_inc(refcount)  = 1

...

atomic_dec(refcount) = 0		enable_sink
if (refcount == 0) disable_sink
					atomic_inc()

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etb10.c | 21 ++++++++++----
 .../hwtracing/coresight/coresight-tmc-etf.c   | 21 +++++++++++---
 .../hwtracing/coresight/coresight-tmc-etr.c   | 19 +++++++++++--
 drivers/hwtracing/coresight/coresight-tpiu.c  |  6 +++-
 drivers/hwtracing/coresight/coresight.c       | 28 +++++++++----------
 5 files changed, 66 insertions(+), 29 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 71c2a3cdb866..5af50a852e87 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -5,6 +5,7 @@
  * Description: CoreSight Embedded Trace Buffer driver
  */
 
+#include <linux/atomic.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/types.h>
@@ -159,14 +160,15 @@ static int etb_enable_sysfs(struct coresight_device *csdev)
 		goto out;
 	}
 
-	/* Nothing to do, the tracer is already enabled. */
-	if (drvdata->mode == CS_MODE_SYSFS)
-		goto out;
+	if (drvdata->mode == CS_MODE_DISABLED) {
+		ret = etb_enable_hw(drvdata);
+		if (ret)
+			goto out;
 
-	ret = etb_enable_hw(drvdata);
-	if (!ret)
 		drvdata->mode = CS_MODE_SYSFS;
+	}
 
+	atomic_inc(csdev->refcnt);
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 	return ret;
@@ -196,8 +198,10 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data)
 		goto out;
 
 	ret = etb_enable_hw(drvdata);
-	if (!ret)
+	if (!ret) {
 		drvdata->mode = CS_MODE_PERF;
+		atomic_inc(csdev->refcnt);
+	}
 
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
@@ -332,6 +336,11 @@ static int etb_disable(struct coresight_device *csdev)
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 
+	if (atomic_dec_return(csdev->refcnt)) {
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+		return -EBUSY;
+	}
+
 	/* Disable the ETB only if it needs to */
 	if (drvdata->mode != CS_MODE_DISABLED) {
 		etb_disable_hw(drvdata);
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index d4213e7c2c45..d50a608a60f1 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -4,6 +4,7 @@
  * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
  */
 
+#include <linux/atomic.h>
 #include <linux/circ_buf.h>
 #include <linux/coresight.h>
 #include <linux/perf_event.h>
@@ -180,8 +181,10 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
 	 * sink is already enabled no memory is needed and the HW need not be
 	 * touched.
 	 */
-	if (drvdata->mode == CS_MODE_SYSFS)
+	if (drvdata->mode == CS_MODE_SYSFS) {
+		atomic_inc(csdev->refcnt);
 		goto out;
+	}
 
 	/*
 	 * If drvdata::buf isn't NULL, memory was allocated for a previous
@@ -200,11 +203,13 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
 	}
 
 	ret = tmc_etb_enable_hw(drvdata);
-	if (!ret)
+	if (!ret) {
 		drvdata->mode = CS_MODE_SYSFS;
-	else
+		atomic_inc(csdev->refcnt);
+	} else {
 		/* Free up the buffer if we failed to enable */
 		used = false;
+	}
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
@@ -239,8 +244,10 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data)
 		if (ret)
 			break;
 		ret  = tmc_etb_enable_hw(drvdata);
-		if (!ret)
+		if (!ret) {
 			drvdata->mode = CS_MODE_PERF;
+			atomic_inc(csdev->refcnt);
+		}
 	} while (0);
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
@@ -279,11 +286,17 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev)
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
+
 	if (drvdata->reading) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
 		return -EBUSY;
 	}
 
+	if (atomic_dec_return(csdev->refcnt)) {
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+		return -EBUSY;
+	}
+
 	/* Disable the TMC only if it needs to */
 	if (drvdata->mode != CS_MODE_DISABLED) {
 		tmc_etb_disable_hw(drvdata);
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 33501777038a..f90bca971367 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -4,6 +4,7 @@
  * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
  */
 
+#include <linux/atomic.h>
 #include <linux/coresight.h>
 #include <linux/dma-mapping.h>
 #include <linux/iommu.h>
@@ -1124,8 +1125,10 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	 * sink is already enabled no memory is needed and the HW need not be
 	 * touched, even if the buffer size has changed.
 	 */
-	if (drvdata->mode == CS_MODE_SYSFS)
+	if (drvdata->mode == CS_MODE_SYSFS) {
+		atomic_inc(csdev->refcnt);
 		goto out;
+	}
 
 	/*
 	 * If we don't have a buffer or it doesn't match the requested size,
@@ -1138,8 +1141,10 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	}
 
 	ret = tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
-	if (!ret)
+	if (!ret) {
 		drvdata->mode = CS_MODE_SYSFS;
+		atomic_inc(csdev->refcnt);
+	}
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
@@ -1370,8 +1375,10 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
 	etr_perf->head = PERF_IDX2OFF(handle->head, etr_perf);
 	drvdata->perf_data = etr_perf;
 	rc = tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
-	if (!rc)
+	if (!rc) {
 		drvdata->mode = CS_MODE_PERF;
+		atomic_inc(csdev->refcnt);
+	}
 
 unlock_out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
@@ -1398,11 +1405,17 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
+
 	if (drvdata->reading) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
 		return -EBUSY;
 	}
 
+	if (atomic_dec_return(csdev->refcnt)) {
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+		return -EBUSY;
+	}
+
 	/* Disable the TMC only if it needs to */
 	if (drvdata->mode != CS_MODE_DISABLED) {
 		tmc_etr_disable_hw(drvdata);
diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c
index 0d13da1b9df1..7acbeffcc137 100644
--- a/drivers/hwtracing/coresight/coresight-tpiu.c
+++ b/drivers/hwtracing/coresight/coresight-tpiu.c
@@ -5,6 +5,7 @@
  * Description: CoreSight Trace Port Interface Unit driver
  */
 
+#include <linux/atomic.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/device.h>
@@ -73,7 +74,7 @@ static int tpiu_enable(struct coresight_device *csdev, u32 mode, void *__unused)
 	struct tpiu_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
 	tpiu_enable_hw(drvdata);
-
+	atomic_inc(csdev->refcnt);
 	dev_dbg(drvdata->dev, "TPIU enabled\n");
 	return 0;
 }
@@ -98,6 +99,9 @@ static int tpiu_disable(struct coresight_device *csdev)
 {
 	struct tpiu_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
+	if (atomic_dec_return(csdev->refcnt))
+		return -EBUSY;
+
 	tpiu_disable_hw(drvdata);
 
 	dev_dbg(drvdata->dev, "TPIU disabled\n");
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index 13eda4693f81..19ba121d7451 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -225,14 +225,13 @@ static int coresight_enable_sink(struct coresight_device *csdev,
 	 * We need to make sure the "new" session is compatible with the
 	 * existing "mode" of operation.
 	 */
-	if (sink_ops(csdev)->enable) {
-		ret = sink_ops(csdev)->enable(csdev, mode, data);
-		if (ret)
-			return ret;
-		csdev->enable = true;
-	}
+	if (!sink_ops(csdev)->enable)
+		return -EINVAL;
 
-	atomic_inc(csdev->refcnt);
+	ret = sink_ops(csdev)->enable(csdev, mode, data);
+	if (ret)
+		return ret;
+	csdev->enable = true;
 
 	return 0;
 }
@@ -241,14 +240,13 @@ static void coresight_disable_sink(struct coresight_device *csdev)
 {
 	int ret;
 
-	if (atomic_dec_return(csdev->refcnt) == 0) {
-		if (sink_ops(csdev)->disable) {
-			ret = sink_ops(csdev)->disable(csdev);
-			if (ret)
-				return;
-			csdev->enable = false;
-		}
-	}
+	if (!sink_ops(csdev)->disable)
+		return;
+
+	ret = sink_ops(csdev)->disable(csdev);
+	if (ret)
+		return;
+	csdev->enable = false;
 }
 
 static int coresight_enable_link(struct coresight_device *csdev,
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 06/16] coresight: Properly address errors in sink::disable() functions
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (4 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 05/16] coresight: Move reference counting inside sink drivers Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 07/16] coresight: Properly address concurrency in sink::update() functions Mathieu Poirier
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

When disabling a sink the reference counter ensures the operation goes
through if nobody else is using it.  As such if drvdata::mode is already
set do CS_MODE_DISABLED, it is an error and should be reported as such.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etb10.c   | 9 ++++-----
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 9 ++++-----
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 9 ++++-----
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 5af50a852e87..52b7d95ab498 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -341,11 +341,10 @@ static int etb_disable(struct coresight_device *csdev)
 		return -EBUSY;
 	}
 
-	/* Disable the ETB only if it needs to */
-	if (drvdata->mode != CS_MODE_DISABLED) {
-		etb_disable_hw(drvdata);
-		drvdata->mode = CS_MODE_DISABLED;
-	}
+	/* Complain if we (somehow) got out of sync */
+	WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED);
+	etb_disable_hw(drvdata);
+	drvdata->mode = CS_MODE_DISABLED;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	dev_dbg(drvdata->dev, "ETB disabled\n");
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index d50a608a60f1..30f868676540 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -297,11 +297,10 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev)
 		return -EBUSY;
 	}
 
-	/* Disable the TMC only if it needs to */
-	if (drvdata->mode != CS_MODE_DISABLED) {
-		tmc_etb_disable_hw(drvdata);
-		drvdata->mode = CS_MODE_DISABLED;
-	}
+	/* Complain if we (somehow) got out of sync */
+	WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED);
+	tmc_etb_disable_hw(drvdata);
+	drvdata->mode = CS_MODE_DISABLED;
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index f90bca971367..86e748d09dc3 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1416,11 +1416,10 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
 		return -EBUSY;
 	}
 
-	/* Disable the TMC only if it needs to */
-	if (drvdata->mode != CS_MODE_DISABLED) {
-		tmc_etr_disable_hw(drvdata);
-		drvdata->mode = CS_MODE_DISABLED;
-	}
+	/* Complain if we (somehow) got out of sync */
+	WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED);
+	tmc_etr_disable_hw(drvdata);
+	drvdata->mode = CS_MODE_DISABLED;
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 07/16] coresight: Properly address concurrency in sink::update() functions
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (5 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 06/16] coresight: Properly address errors in sink::disable() functions Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 08/16] coresight: perf: Clean up function etm_setup_aux() Mathieu Poirier
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

When operating in CPU-wide trace scenarios and working with an N:1
source/sink HW topology, update() functions need to be made atomic
in order to avoid racing with start and stop operations.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etb10.c   | 4 +++-
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 52b7d95ab498..6b50e781dc57 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -413,7 +413,7 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev,
 	const u32 *barrier;
 	u32 read_ptr, write_ptr, capacity;
 	u32 status, read_data;
-	unsigned long offset, to_read;
+	unsigned long offset, to_read, flags;
 	struct cs_buffers *buf = sink_config;
 	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
@@ -422,6 +422,7 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev,
 
 	capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
 
+	spin_lock_irqsave(&drvdata->spinlock, flags);
 	__etb_disable_hw(drvdata);
 	CS_UNLOCK(drvdata->base);
 
@@ -532,6 +533,7 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev,
 	}
 	__etb_enable_hw(drvdata);
 	CS_LOCK(drvdata->base);
+	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	return to_read;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 30f868676540..a38ad2b0d95a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -413,7 +413,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	u32 *buf_ptr;
 	u64 read_ptr, write_ptr;
 	u32 status;
-	unsigned long offset, to_read;
+	unsigned long offset, to_read, flags;
 	struct cs_buffers *buf = sink_config;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
@@ -424,6 +424,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF))
 		return 0;
 
+	spin_lock_irqsave(&drvdata->spinlock, flags);
 	CS_UNLOCK(drvdata->base);
 
 	tmc_flush_and_stop(drvdata);
@@ -517,6 +518,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 		to_read = buf->nr_pages << PAGE_SHIFT;
 	}
 	CS_LOCK(drvdata->base);
+	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	return to_read;
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 08/16] coresight: perf: Clean up function etm_setup_aux()
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (6 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 07/16] coresight: Properly address concurrency in sink::update() functions Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 09/16] coresight: perf: Refactor function free_event_data() Mathieu Poirier
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

There is no point in allocating sink memory for a trace session if
there is not a way to free it once it is no longer needed.  As such make
sure the sink API function to allocate and free memory have been
implemented before moving ahead with the establishment of a trace
session.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index bbfed70b3402..b8ca3800b56b 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -134,8 +134,7 @@ static void free_event_data(struct work_struct *work)
 	if (event_data->snk_config && !WARN_ON(cpumask_empty(mask))) {
 		cpu = cpumask_first(mask);
 		sink = coresight_get_sink(etm_event_cpu_path(event_data, cpu));
-		if (sink_ops(sink)->free_buffer)
-			sink_ops(sink)->free_buffer(event_data->snk_config);
+		sink_ops(sink)->free_buffer(event_data->snk_config);
 	}
 
 	for_each_cpu(cpu, mask) {
@@ -215,7 +214,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 		sink = coresight_get_enabled_sink(true);
 	}
 
-	if (!sink || !sink_ops(sink)->alloc_buffer)
+	if (!sink)
 		goto err;
 
 	mask = &event_data->mask;
@@ -261,6 +260,9 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 	if (cpu >= nr_cpu_ids)
 		goto err;
 
+	if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
+		goto err;
+
 	/* Allocate the sink buffer for this session */
 	event_data->snk_config =
 			sink_ops(sink)->alloc_buffer(sink, cpu, pages,
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 09/16] coresight: perf: Refactor function free_event_data()
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (7 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 08/16] coresight: perf: Clean up function etm_setup_aux() Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 15:07   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 10/16] coresight: Communicate perf event to sink buffer allocation function Mathieu Poirier
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

Function free_event_data() is already busy and is bound to become
worse with the addition of CPU-wide trace scenarios.  As such spin
off a new function to strickly take care of the sink buffers.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 .../hwtracing/coresight/coresight-etm-perf.c  | 24 ++++++++++++++-----
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index b8ca3800b56b..806b3dd5872d 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -120,22 +120,34 @@ static int etm_event_init(struct perf_event *event)
 	return ret;
 }
 
+static void free_sink_buffer(struct etm_event_data *event_data)
+{
+	int cpu;
+	cpumask_t *mask = &event_data->mask;
+	struct coresight_device *sink;
+
+	if (WARN_ON(cpumask_empty(mask)))
+		return;
+
+	if (!event_data->snk_config)
+		return;
+
+	cpu = cpumask_first(mask);
+	sink = coresight_get_sink(etm_event_cpu_path(event_data, cpu));
+	sink_ops(sink)->free_buffer(event_data->snk_config);
+}
+
 static void free_event_data(struct work_struct *work)
 {
 	int cpu;
 	cpumask_t *mask;
 	struct etm_event_data *event_data;
-	struct coresight_device *sink;
 
 	event_data = container_of(work, struct etm_event_data, work);
 	mask = &event_data->mask;
 
 	/* Free the sink buffers, if there are any */
-	if (event_data->snk_config && !WARN_ON(cpumask_empty(mask))) {
-		cpu = cpumask_first(mask);
-		sink = coresight_get_sink(etm_event_cpu_path(event_data, cpu));
-		sink_ops(sink)->free_buffer(event_data->snk_config);
-	}
+	free_sink_buffer(event_data);
 
 	for_each_cpu(cpu, mask) {
 		struct list_head **ppath;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 10/16] coresight: Communicate perf event to sink buffer allocation function
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (8 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 09/16] coresight: perf: Refactor function free_event_data() Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 15:12   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf() Mathieu Poirier
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

Make struct perf_event available to sink buffer allocation functions in
order to use the pid they carry to allocate and free buffer memory along
with regimenting access to what source a sink can collect data for.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etb10.c    | 7 ++++---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 2 +-
 drivers/hwtracing/coresight/coresight-tmc-etf.c  | 7 ++++---
 drivers/hwtracing/coresight/coresight-tmc-etr.c  | 6 ++++--
 include/linux/coresight.h                        | 5 +++--
 5 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 6b50e781dc57..7d64c41cd8ac 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -351,10 +351,11 @@ static int etb_disable(struct coresight_device *csdev)
 	return 0;
 }
 
-static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu,
-			      void **pages, int nr_pages, bool overwrite)
+static void *etb_alloc_buffer(struct coresight_device *csdev,
+			      struct perf_event *event, void **pages,
+			      int nr_pages, bool overwrite)
 {
-	int node;
+	int node, cpu = event->cpu;
 	struct cs_buffers *buf;
 
 	if (cpu == -1)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 806b3dd5872d..3c6294432748 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -277,7 +277,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 
 	/* Allocate the sink buffer for this session */
 	event_data->snk_config =
-			sink_ops(sink)->alloc_buffer(sink, cpu, pages,
+			sink_ops(sink)->alloc_buffer(sink, event, pages,
 						     nr_pages, overwrite);
 	if (!event_data->snk_config)
 		goto err;
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index a38ad2b0d95a..1df1f8fade71 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -350,10 +350,11 @@ static void tmc_disable_etf_link(struct coresight_device *csdev,
 	dev_dbg(drvdata->dev, "TMC-ETF disabled\n");
 }
 
-static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, int cpu,
-				  void **pages, int nr_pages, bool overwrite)
+static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
+				  struct perf_event *event, void **pages,
+				  int nr_pages, bool overwrite)
 {
-	int node;
+	int node, cpu = event->cpu;
 	struct cs_buffers *buf;
 
 	if (cpu == -1)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 86e748d09dc3..6e2c2aa130d5 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -9,6 +9,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/iommu.h>
 #include <linux/slab.h>
+#include <linux/types.h>
 #include <linux/vmalloc.h>
 #include "coresight-catu.h"
 #include "coresight-etm-perf.h"
@@ -1210,9 +1211,10 @@ tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
 
 
 static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
-				  int cpu, void **pages, int nr_pages,
-				  bool snapshot)
+				  struct perf_event *event, void **pages,
+				  int nr_pages, bool snapshot)
 {
+	int cpu = event->cpu;
 	struct etr_perf_buffer *etr_perf;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 189cc6ddc92b..62a520df8add 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -193,8 +193,9 @@ struct coresight_device {
 struct coresight_ops_sink {
 	int (*enable)(struct coresight_device *csdev, u32 mode, void *data);
 	int (*disable)(struct coresight_device *csdev);
-	void *(*alloc_buffer)(struct coresight_device *csdev, int cpu,
-			      void **pages, int nr_pages, bool overwrite);
+	void *(*alloc_buffer)(struct coresight_device *csdev,
+			      struct perf_event *event, void **pages,
+			      int nr_pages, bool overwrite);
 	void (*free_buffer)(void *config);
 	unsigned long (*update_buffer)(struct coresight_device *csdev,
 			      struct perf_output_handle *handle,
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf()
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (9 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 10/16] coresight: Communicate perf event to sink buffer allocation function Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 15:29   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices Mathieu Poirier
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

Refactoring function tmc_etr_setup_perf_buf() so that it only deals
with the high level etr_perf_buffer, and leaving the allocation of the
backend buffer (i.e etr_buf) to another function.

That way the backend buffer allocation function can decide if it wants
to reuse an existing buffer (CPU-wide trace scenarios) or simply create
a new one.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 .../hwtracing/coresight/coresight-tmc-etr.c   | 46 +++++++++++++------
 1 file changed, 31 insertions(+), 15 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 6e2c2aa130d5..79fee9341446 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1159,25 +1159,13 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	return ret;
 }
 
-/*
- * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
- * The size of the hardware buffer is dependent on the size configured
- * via sysfs and the perf ring buffer size. We prefer to allocate the
- * largest possible size, scaling down the size by half until it
- * reaches a minimum limit (1M), beyond which we give up.
- */
-static struct etr_perf_buffer *
-tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
-		       void **pages, bool snapshot)
+static struct etr_buf *
+tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
+		    int nr_pages, void **pages)
 {
 	struct etr_buf *etr_buf;
-	struct etr_perf_buffer *etr_perf;
 	unsigned long size;
 
-	etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node);
-	if (!etr_perf)
-		return ERR_PTR(-ENOMEM);
-
 	/*
 	 * Try to match the perf ring buffer size if it is larger
 	 * than the size requested via sysfs.
@@ -1201,6 +1189,34 @@ tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
 		size /= 2;
 	} while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
 
+	return ERR_PTR(-ENOMEM);
+
+done:
+	return etr_buf;
+}
+
+/*
+ * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
+ * The size of the hardware buffer is dependent on the size configured
+ * via sysfs and the perf ring buffer size. We prefer to allocate the
+ * largest possible size, scaling down the size by half until it
+ * reaches a minimum limit (1M), beyond which we give up.
+ */
+static struct etr_perf_buffer *
+tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node,
+		       int nr_pages, void **pages, bool snapshot)
+{
+	struct etr_buf *etr_buf;
+	struct etr_perf_buffer *etr_perf;
+
+	etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node);
+	if (!etr_perf)
+		return ERR_PTR(-ENOMEM);
+
+	etr_buf = tmc_etr_get_etr_buf(drvdata, node, nr_pages, pages);
+	if (!IS_ERR(etr_buf))
+		goto done;
+
 	kfree(etr_perf);
 	return ERR_PTR(-ENOMEM);
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (10 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf() Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 16:46   ` Suzuki K Poulose
  2019-03-25 21:56 ` [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer Mathieu Poirier
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

In preparation to support CPU-wide trace scenarios, add the notion of
process ID to ETR devices so that memory buffers can be shared between
events.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 .../hwtracing/coresight/coresight-tmc-etr.c   | 22 +++++++++++++++++--
 drivers/hwtracing/coresight/coresight-tmc.h   |  3 +++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 79fee9341446..7254fafdf1c2 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -26,6 +26,7 @@ struct etr_flat_buf {
 /*
  * etr_perf_buffer - Perf buffer used for ETR
  * @etr_buf		- Actual buffer used by the ETR
+ * @pid			- The PID this etr_perf_buffer belongs to.
  * @snaphost		- Perf session mode
  * @head		- handle->head at the beginning of the session.
  * @nr_pages		- Number of pages in the ring buffer.
@@ -33,6 +34,7 @@ struct etr_flat_buf {
  */
 struct etr_perf_buffer {
 	struct etr_buf		*etr_buf;
+	pid_t			pid;
 	bool			snapshot;
 	unsigned long		head;
 	int			nr_pages;
@@ -1160,8 +1162,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 }
 
 static struct etr_buf *
-tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
-		    int nr_pages, void **pages)
+alloc_etr_buf(struct tmc_drvdata *drvdata, int node,
+	      int nr_pages, void **pages)
 {
 	struct etr_buf *etr_buf;
 	unsigned long size;
@@ -1195,6 +1197,20 @@ tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
 	return etr_buf;
 }
 
+static struct etr_buf *
+tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
+		    int nr_pages, void **pages)
+{
+	struct etr_buf *etr_buf;
+
+	etr_buf = alloc_etr_buf(drvdata, node, nr_pages, pages);
+	if (IS_ERR(etr_buf))
+		return etr_buf;
+
+	refcount_set(&etr_buf->refcount, 1);
+	return etr_buf;
+}
+
 /*
  * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
  * The size of the hardware buffer is dependent on the size configured
@@ -1231,6 +1247,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
 				  int nr_pages, bool snapshot)
 {
 	int cpu = event->cpu;
+	pid_t pid = task_pid_nr(event->owner);
 	struct etr_perf_buffer *etr_perf;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
@@ -1244,6 +1261,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
 		return NULL;
 	}
 
+	etr_perf->pid = pid;
 	etr_perf->snapshot = snapshot;
 	etr_perf->nr_pages = nr_pages;
 	etr_perf->pages = pages;
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 487c53701e9c..7675138ff9fc 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -9,6 +9,7 @@
 
 #include <linux/dma-mapping.h>
 #include <linux/miscdevice.h>
+#include <linux/refcount.h>
 
 #define TMC_RSZ			0x004
 #define TMC_STS			0x00c
@@ -133,6 +134,7 @@ struct etr_buf_operations;
 
 /**
  * struct etr_buf - Details of the buffer used by ETR
+ * @refcount	: Number of sources currently using this etr_buf.
  * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
  * @full	: Trace data overflow
  * @size	: Size of the buffer.
@@ -143,6 +145,7 @@ struct etr_buf_operations;
  * @private	: Backend specific information for the buf
  */
 struct etr_buf {
+	refcount_t			refcount;
 	enum etr_mode			mode;
 	bool				full;
 	ssize_t				size;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (11 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-26 16:18   ` Suzuki K Poulose
  2019-03-30 15:43   ` Leo Yan
  2019-03-25 21:56 ` [PATCH v2 14/16] coresight: tmc-etr: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (4 subsequent siblings)
  17 siblings, 2 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

This patch uses the pid of the process being traced to aggregate traces
coming from different processors in the same sink, something that is
required when collecting traces in CPU-wide mode when the CoreSight HW
enacts a N:1 source/sink topology.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
 1 file changed, 65 insertions(+), 6 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 7254fafdf1c2..cbabf88bd51d 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -8,6 +8,7 @@
 #include <linux/coresight.h>
 #include <linux/dma-mapping.h>
 #include <linux/iommu.h>
+#include <linux/idr.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/vmalloc.h>
@@ -41,6 +42,9 @@ struct etr_perf_buffer {
 	void			**pages;
 };
 
+static DEFINE_IDR(session_idr);
+static DEFINE_MUTEX(session_idr_lock);
+
 /* Convert the perf index to an offset within the ETR buffer */
 #define PERF_IDX2OFF(idx, buf)	((idx) % ((buf)->nr_pages << PAGE_SHIFT))
 
@@ -1198,16 +1202,48 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, int node,
 }
 
 static struct etr_buf *
-tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
+tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, pid_t pid, int node,
 		    int nr_pages, void **pages)
 {
+	int ret;
 	struct etr_buf *etr_buf;
 
+retry:
+	/* See if a buffer has been allocated for this session */
+	mutex_lock(&session_idr_lock);
+	etr_buf = idr_find(&session_idr, pid);
+	if (etr_buf) {
+		refcount_inc(&etr_buf->refcount);
+		mutex_unlock(&session_idr_lock);
+		return etr_buf;
+	}
+
+	/* If we made it here no buffer has been allocated, do so now. */
+	mutex_unlock(&session_idr_lock);
+
 	etr_buf = alloc_etr_buf(drvdata, node, nr_pages, pages);
 	if (IS_ERR(etr_buf))
 		return etr_buf;
 
 	refcount_set(&etr_buf->refcount, 1);
+
+	/* Now that we have a buffer, add it to the IDR. */
+	mutex_lock(&session_idr_lock);
+	ret = idr_alloc(&session_idr, etr_buf, pid, pid + 1, GFP_KERNEL);
+	mutex_unlock(&session_idr_lock);
+
+	/* Another event whith this session ID has allocated this buffer. */
+	if (ret == -ENOSPC) {
+		tmc_free_etr_buf(etr_buf);
+		goto retry;
+	}
+
+	/* The IDR can't allocate room for a new session, abandon ship. */
+	if (ret == -ENOMEM) {
+		tmc_free_etr_buf(etr_buf);
+		return ERR_PTR(ret);
+	}
+
 	return etr_buf;
 }
 
@@ -1219,7 +1255,7 @@ tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
  * reaches a minimum limit (1M), beyond which we give up.
  */
 static struct etr_perf_buffer *
-tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node,
+tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, pid_t pid, int node,
 		       int nr_pages, void **pages, bool snapshot)
 {
 	struct etr_buf *etr_buf;
@@ -1229,7 +1265,7 @@ tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node,
 	if (!etr_perf)
 		return ERR_PTR(-ENOMEM);
 
-	etr_buf = tmc_etr_get_etr_buf(drvdata, node, nr_pages, pages);
+	etr_buf = tmc_etr_get_etr_buf(drvdata, pid, node, nr_pages, pages);
 	if (!IS_ERR(etr_buf))
 		goto done;
 
@@ -1254,7 +1290,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
 	if (cpu == -1)
 		cpu = smp_processor_id();
 
-	etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu),
+	etr_perf = tmc_etr_setup_perf_buf(drvdata, pid, cpu_to_node(cpu),
 					  nr_pages, pages, snapshot);
 	if (IS_ERR(etr_perf)) {
 		dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n");
@@ -1272,9 +1308,32 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
 static void tmc_free_etr_buffer(void *config)
 {
 	struct etr_perf_buffer *etr_perf = config;
+	struct etr_buf *buf, *etr_buf = etr_perf->etr_buf;
+
+	if (!etr_buf)
+		goto free_etr_perf_buffer;
+
+	mutex_lock(&session_idr_lock);
+	/* If we are not the last one to use the buffer, don't touch it. */
+	if (!refcount_dec_and_test(&etr_buf->refcount)) {
+		mutex_unlock(&session_idr_lock);
+		goto free_etr_perf_buffer;
+	}
+
+	/* We are the last one, remove from the IDR and free the buffer. */
+	buf = idr_remove(&session_idr, etr_perf->pid);
+	mutex_unlock(&session_idr_lock);
+
+	/*
+	 * Something went very wrong if the buffer associated with this ID
+	 * is not the same in the IDR.  Leak to avoid use after free.
+	 */
+	if (WARN_ON(buf != etr_buf))
+		goto free_etr_perf_buffer;
+
+	tmc_free_etr_buf(etr_perf->etr_buf);
 
-	if (etr_perf->etr_buf)
-		tmc_free_etr_buf(etr_perf->etr_buf);
+free_etr_perf_buffer:
 	kfree(etr_perf);
 }
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 14/16] coresight: tmc-etr: Add support for CPU-wide trace scenarios
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (12 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 15/16] coresight: tmc-etf: " Mathieu Poirier
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

This patch adds support for CPU-wide trace scenarios by making sure that
only the sources monitoring the same process have access to a common sink.
Because the sink is shared between sources, the first source to use the
sink switches it on while the last one does the cleanup.  Any attempt to
modify the HW is overlooked for as long as more than one source is using
a sink.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 .../hwtracing/coresight/coresight-tmc-etr.c   | 38 ++++++++++++++++---
 drivers/hwtracing/coresight/coresight-tmc.c   |  2 +
 drivers/hwtracing/coresight/coresight-tmc.h   |  3 ++
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index cbabf88bd51d..48093605331a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1408,6 +1408,13 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	struct etr_buf *etr_buf = etr_perf->etr_buf;
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
+
+	/* Don't do anything if another tracer is using this sink */
+	if (atomic_read(csdev->refcnt) != 1) {
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+		goto out;
+	}
+
 	if (WARN_ON(drvdata->perf_data != etr_perf)) {
 		lost = true;
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
@@ -1447,17 +1454,15 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
 {
 	int rc = 0;
+	pid_t pid;
 	unsigned long flags;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 	struct perf_output_handle *handle = data;
 	struct etr_perf_buffer *etr_perf = etm_perf_sink_config(handle);
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
-	/*
-	 * There can be only one writer per sink in perf mode. If the sink
-	 * is already open in SYSFS mode, we can't use it.
-	 */
-	if (drvdata->mode != CS_MODE_DISABLED || WARN_ON(drvdata->perf_data)) {
+	 /* Don't use this sink if it is already claimed by sysFS */
+	if (drvdata->mode == CS_MODE_SYSFS) {
 		rc = -EBUSY;
 		goto unlock_out;
 	}
@@ -1467,10 +1472,31 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
 		goto unlock_out;
 	}
 
+	/* Get a handle on the pid of the process to monitor */
+	pid = etr_perf->pid;
+
+	/* Do not proceed if this device is associated with another session */
+	if (drvdata->pid != -1 && drvdata->pid != pid) {
+		rc = -EBUSY;
+		goto unlock_out;
+	}
+
 	etr_perf->head = PERF_IDX2OFF(handle->head, etr_perf);
 	drvdata->perf_data = etr_perf;
+
+	/*
+	 * No HW configuration is needed if the sink is already in
+	 * use for this session.
+	 */
+	if (drvdata->pid == pid) {
+		atomic_inc(csdev->refcnt);
+		goto unlock_out;
+	}
+
 	rc = tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
 	if (!rc) {
+		/* Associate with monitored process. */
+		drvdata->pid = pid;
 		drvdata->mode = CS_MODE_PERF;
 		atomic_inc(csdev->refcnt);
 	}
@@ -1514,6 +1540,8 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev)
 	/* Complain if we (somehow) got out of sync */
 	WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED);
 	tmc_etr_disable_hw(drvdata);
+	/* Dissociate from monitored process. */
+	drvdata->pid = -1;
 	drvdata->mode = CS_MODE_DISABLED;
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 2a02da3d630f..b01e587f376d 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -415,6 +415,8 @@ static int tmc_probe(struct amba_device *adev, const struct amba_id *id)
 	devid = readl_relaxed(drvdata->base + CORESIGHT_DEVID);
 	drvdata->config_type = BMVAL(devid, 6, 7);
 	drvdata->memwidth = tmc_get_memwidth(devid);
+	/* This device is not associated with a session */
+	drvdata->pid = -1;
 
 	if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) {
 		if (np)
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 7675138ff9fc..c66cbce28bd4 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -163,6 +163,8 @@ struct etr_buf {
  * @csdev:	component vitals needed by the framework.
  * @miscdev:	specifics to handle "/dev/xyz.tmc" entry.
  * @spinlock:	only one at a time pls.
+ * @pid:	Process ID of the process being monitored by the session
+ *		that is using this component.
  * @buf:	Snapshot of the trace data for ETF/ETB.
  * @etr_buf:	details of buffer used in TMC-ETR
  * @len:	size of the available trace for ETF/ETB.
@@ -182,6 +184,7 @@ struct tmc_drvdata {
 	struct coresight_device	*csdev;
 	struct miscdevice	miscdev;
 	spinlock_t		spinlock;
+	pid_t			pid;
 	bool			reading;
 	union {
 		char		*buf;		/* TMC ETB */
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 15/16] coresight: tmc-etf: Add support for CPU-wide trace scenarios
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (13 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 14/16] coresight: tmc-etr: Add support for CPU-wide trace scenarios Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-25 21:56 ` [PATCH v2 16/16] coresight: etb10: " Mathieu Poirier
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

This patch adds support for CPU-wide trace scenarios by making sure that
only the sources monitoring the same process have access to a common sink.
Because the sink is shared between sources, the first source to use the
sink switches it on while the last one does the cleanup.  Any attempt to
modify the HW is overlooked for as long as more than one source is using
a sink.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 .../hwtracing/coresight/coresight-tmc-etf.c   | 40 ++++++++++++++++---
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 1df1f8fade71..2527b5d3b65e 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -223,6 +223,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
 static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data)
 {
 	int ret = 0;
+	pid_t pid;
 	unsigned long flags;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 	struct perf_output_handle *handle = data;
@@ -233,18 +234,39 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data)
 		if (drvdata->reading)
 			break;
 		/*
-		 * In Perf mode there can be only one writer per sink.  There
-		 * is also no need to continue if the ETB/ETF is already
-		 * operated from sysFS.
+		 * No need to continue if the ETB/ETF is already operated
+		 * from sysFS.
 		 */
-		if (drvdata->mode != CS_MODE_DISABLED)
+		if (drvdata->mode == CS_MODE_SYSFS) {
+			ret = -EBUSY;
 			break;
+		}
+
+		/* Get a handle on the pid of the process to monitor */
+		pid = task_pid_nr(handle->event->owner);
+
+		if (drvdata->pid != -1 && drvdata->pid != pid) {
+			ret = -EBUSY;
+			break;
+		}
 
 		ret = tmc_set_etf_buffer(csdev, handle);
 		if (ret)
 			break;
+
+		/*
+		 * No HW configuration is needed if the sink is already in
+		 * use for this session.
+		 */
+		if (drvdata->pid == pid) {
+			atomic_inc(csdev->refcnt);
+			break;
+		}
+
 		ret  = tmc_etb_enable_hw(drvdata);
 		if (!ret) {
+			/* Associate with monitored process. */
+			drvdata->pid = pid;
 			drvdata->mode = CS_MODE_PERF;
 			atomic_inc(csdev->refcnt);
 		}
@@ -300,6 +322,8 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev)
 	/* Complain if we (somehow) got out of sync */
 	WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED);
 	tmc_etb_disable_hw(drvdata);
+	/* Dissociate from monitored process. */
+	drvdata->pid = -1;
 	drvdata->mode = CS_MODE_DISABLED;
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
@@ -414,7 +438,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	u32 *buf_ptr;
 	u64 read_ptr, write_ptr;
 	u32 status;
-	unsigned long offset, to_read, flags;
+	unsigned long offset, to_read = 0, flags;
 	struct cs_buffers *buf = sink_config;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
@@ -426,6 +450,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 		return 0;
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
+
+	/* Don't do anything if another tracer is using this sink */
+	if (atomic_read(csdev->refcnt) != 1)
+		goto out;
+
 	CS_UNLOCK(drvdata->base);
 
 	tmc_flush_and_stop(drvdata);
@@ -519,6 +548,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 		to_read = buf->nr_pages << PAGE_SHIFT;
 	}
 	CS_LOCK(drvdata->base);
+out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	return to_read;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 16/16] coresight: etb10: Add support for CPU-wide trace scenarios
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (14 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 15/16] coresight: tmc-etf: " Mathieu Poirier
@ 2019-03-25 21:56 ` Mathieu Poirier
  2019-03-27  7:52 ` [PATCH v2 00/16] coresight: " Leo Yan
  2019-04-11 18:52 ` Robert Walker
  17 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-25 21:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, leo.yan

This patch adds support for CPU-wide trace scenarios by making sure that
only the sources monitoring the same process have access to a common sink.
Because the sink is shared between sources, the first source to use the
sink switches it on while the last one does the cleanup.  Any attempt to
modify the HW is overlooked for as long as more than one source is using
a sink.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etb10.c | 43 +++++++++++++++++--
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 7d64c41cd8ac..a2379c00d635 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -72,6 +72,8 @@
  * @miscdev:	specifics to handle "/dev/xyz.etb" entry.
  * @spinlock:	only one at a time pls.
  * @reading:	synchronise user space access to etb buffer.
+ * @pid:	Process ID of the process being monitored by the session
+ *		that is using this component.
  * @buf:	area of memory where ETB buffer content gets sent.
  * @mode:	this ETB is being used.
  * @buffer_depth: size of @buf.
@@ -85,6 +87,7 @@ struct etb_drvdata {
 	struct miscdevice	miscdev;
 	spinlock_t		spinlock;
 	local_t			reading;
+	pid_t			pid;
 	u8			*buf;
 	u32			mode;
 	u32			buffer_depth;
@@ -177,28 +180,49 @@ static int etb_enable_sysfs(struct coresight_device *csdev)
 static int etb_enable_perf(struct coresight_device *csdev, void *data)
 {
 	int ret = 0;
+	pid_t pid;
 	unsigned long flags;
 	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct perf_output_handle *handle = data;
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 
-	/* No need to continue if the component is already in use. */
-	if (drvdata->mode != CS_MODE_DISABLED) {
+	/* No need to continue if the component is already in used by sysFS. */
+	if (drvdata->mode == CS_MODE_SYSFS) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	/* Get a handle on the pid of the process to monitor */
+	pid = task_pid_nr(handle->event->owner);
+
+	if (drvdata->pid != -1 && drvdata->pid != pid) {
 		ret = -EBUSY;
 		goto out;
 	}
 
+	/*
+	 * No HW configuration is needed if the sink is already in
+	 * use for this session.
+	 */
+	if (drvdata->pid == pid) {
+		atomic_inc(csdev->refcnt);
+		goto out;
+	}
+
 	/*
 	 * We don't have an internal state to clean up if we fail to setup
 	 * the perf buffer. So we can perform the step before we turn the
 	 * ETB on and leave without cleaning up.
 	 */
-	ret = etb_set_buffer(csdev, (struct perf_output_handle *)data);
+	ret = etb_set_buffer(csdev, handle);
 	if (ret)
 		goto out;
 
 	ret = etb_enable_hw(drvdata);
 	if (!ret) {
+		/* Associate with monitored process. */
+		drvdata->pid = pid;
 		drvdata->mode = CS_MODE_PERF;
 		atomic_inc(csdev->refcnt);
 	}
@@ -344,6 +368,8 @@ static int etb_disable(struct coresight_device *csdev)
 	/* Complain if we (somehow) got out of sync */
 	WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED);
 	etb_disable_hw(drvdata);
+	/* Dissociate from monitored process. */
+	drvdata->pid = -1;
 	drvdata->mode = CS_MODE_DISABLED;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
@@ -414,7 +440,7 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev,
 	const u32 *barrier;
 	u32 read_ptr, write_ptr, capacity;
 	u32 status, read_data;
-	unsigned long offset, to_read, flags;
+	unsigned long offset, to_read = 0, flags;
 	struct cs_buffers *buf = sink_config;
 	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
@@ -424,6 +450,11 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev,
 	capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
+
+	/* Don't do anything if another tracer is using this sink */
+	if (atomic_read(csdev->refcnt) != 1)
+		goto out;
+
 	__etb_disable_hw(drvdata);
 	CS_UNLOCK(drvdata->base);
 
@@ -534,6 +565,7 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev,
 	}
 	__etb_enable_hw(drvdata);
 	CS_LOCK(drvdata->base);
+out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	return to_read;
@@ -742,6 +774,9 @@ static int etb_probe(struct amba_device *adev, const struct amba_id *id)
 	if (!drvdata->buf)
 		return -ENOMEM;
 
+	/* This device is not associated with a session */
+	drvdata->pid = -1;
+
 	desc.type = CORESIGHT_DEV_TYPE_SINK;
 	desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_BUFFER;
 	desc.ops = &etb_cs_ops;
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 03/16] coresight: etm4x: Configure tracers to emit timestamps
  2019-03-25 21:56 ` [PATCH v2 03/16] coresight: etm4x: Configure tracers to emit timestamps Mathieu Poirier
@ 2019-03-26 11:53   ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 11:53 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> Configure timestamps to be emitted at regular intervals in the trace
> stream to temporally correlate instructions executed on different CPUs.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>   drivers/hwtracing/coresight/coresight-etm4x.c | 115 +++++++++++++++++-
>   1 file changed, 112 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
> index 732ae12fca9b..45c341a5aa0b 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.c
> @@ -138,8 +138,11 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
>   			       drvdata->base + TRCCNTVRn(i));
>   	}
>   
> -	/* Resource selector pair 0 is always implemented and reserved */
> -	for (i = 0; i < drvdata->nr_resource * 2; i++)
> +	/*
> +	 * Resource selector pair 0 is always implemented and reserved.  As
> +	 * such start at 2.
> +	 */
> +	for (i = 2; i < drvdata->nr_resource * 2; i++)
>   		writel_relaxed(config->res_ctrl[i],
>   			       drvdata->base + TRCRSCTLRn(i));


To be frank, that looks like a separate fix for the existing driver,
from this series.

>   
> @@ -201,6 +204,97 @@ static void etm4_enable_hw_smp_call(void *info)
>   	arg->rc = etm4_enable_hw(arg->drvdata);
>   }
>   
> +/*
> + * The goal of function etm4_config_timestamp_event() is to configure a
> + * counter that will tell the tracer to emit a timestamp packet when it
> + * reaches zero.  This is done in order to get a more fine grained idea
> + * of when instructions are executed so that they can be correlated
> + * with execution on other CPUs.
> + *
> + * To do this the counter itself is configured to self reload and
> + * TRCRSCTLR1 (always true) used to get the counter to decrement.  From
> + * there a resource selector is configured with the counter and the
> + * timestamp control register to use the resource selector to trigger the
> + * event that will insert a timestamp packet in the stream.
> + */
> +static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata)
> +{
> +	int i, ctridx, ret = -EINVAL;
> +	int counter, rselector;
> +	u32 val = 0;
> +	struct etmv4_config *config = &drvdata->config;
> +
> +	/* No point in trying if we don't have at least one counter */
> +	if (!drvdata->nr_cntr)
> +		goto out;
> +
> +	/* Find a counter that hasn't been initialised */
> +	for (i = 0; i < drvdata->nr_cntr; i++)
> +		if (config->cntr_val[i] == 0)
> +			break;
> +
> +	/* Remember what counter we used */
> +	counter = 1 << i;
> +	ctridx = i;
> +
> +	/* All the counters have been configured already, bail out */
> +	if (i == drvdata->nr_cntr) {
> +		pr_err("%s: no available counter found\n", __func__);
> +		goto out;
> +	}

Should this be pr_debug ? This could be easily triggered to flood the
dmesg. Also, I think the return code could be -ENODEV or -ENOSPC rather
than -EINVAL in this case.

> +
> +	/*
> +	 * Initialise original and reload counter value to the smallest
> +	 * possible value in order to get as much precision as we can.
> +	 */
> +	config->cntr_val[ctridx] = 1;
> +	config->cntrldvr[ctridx] = 1;

We could delay the initialisation until we find the resource selector
to avoid clearing them later.

> +
> +	/* Set the trace counter control register */
> +	val =  0x1 << 16	|  /* Bit 16, reload counter automatically */
> +	       0x0 << 7		|  /* Select single resource selector */
> +	       0x1;		   /* Resource selector 1, i.e always true */
> +
> +	config->cntr_ctrl[ctridx] = val;
> +
> +	/*
> +	 * Searching for an available resource selector to use, starting at
> +	 * '2' since every implementation has at least 2 resource selector.
> +	 * ETMIDR4 gives the number of resource selector _pairs_,
> +	 * hence multiply by 2.
> +	 */
> +	for (i = 2; i < drvdata->nr_resource * 2; i++)
> +		if (!config->res_ctrl[i])
> +			break;
> +
> +	/* Remember what resource selector we used */
> +	rselector = i;
> +
> +	if (i == drvdata->nr_resource * 2) {
> +		pr_err("%s: no available resource selector found\n", __func__);
> +

Same as above. This shouldn't be an error, may be a debug. It is a
crunch of resources, a usage error from the user and not kernel.

> +		/* Backout what we did and exit */
> +		config->cntr_ctrl[ctridx] = 0;
> +		config->cntrldvr[ctridx] = 0;
> +		config->cntr_val[ctridx] = 0;
> +		goto out;
> +	}
> +
> +	val = 0x2 << 16		| /* Group 0b0010 - Counter and sequencers */
> +	      counter << 0;	  /* Counter to use */
> +
> +	config->res_ctrl[i] = val;
> +
> +	val = 0x0 << 7		| /* Select single resource selector */
> +	      rselector;	  /* Resource selector */
> +
> +	config->ts_ctrl = val;
> +
> +	ret = 0;
> +out:
> +	return ret;
> +}
> +
>   static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
>   				   struct perf_event *event)
>   {
> @@ -236,9 +330,24 @@ static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
>   		/* TRM: Must program this for cycacc to work */
>   		config->ccctlr = ETM_CYC_THRESHOLD_DEFAULT;
>   	}
> -	if (attr->config & BIT(ETM_OPT_TS))
> +	if (attr->config & BIT(ETM_OPT_TS)) {
> +		/*
> +		 * Configure timestamps to be emitted at regular intervals in
> +		 * order to correlate instructions executed on different CPUs
> +		 * (CPU-wide trace scenarios).
> +		 */
> +		ret = etm4_config_timestamp_event(drvdata);
> +
> +		/*
> +		 * No need to go further if timestamp intervals can't
> +		 * be configured.
> +		 */
> +		if (ret)
> +			goto out;
> +
>   		/* bit[11], Global timestamp tracing bit */
>   		config->cfg |= BIT(11);
> +	}
>   
>   	if (attr->config & BIT(ETM_OPT_CTXTID))
>   		/* bit[6], Context ID tracing bit */
> 


Rest all looks good to me.

Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 02/16] coresight: etm4x: Add kernel configuration for CONTEXTID
  2019-03-25 21:56 ` [PATCH v2 02/16] coresight: etm4x: Add kernel configuration for CONTEXTID Mathieu Poirier
@ 2019-03-26 11:59   ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 11:59 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> Set the proper bit in the configuration register when contextID tracing
> has been requested by user space.  That way PE_CONTEXT elements are
> generated by the tracers when a process is installed on a CPU.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 04/16] coresight: Adding return code to sink::disable() operation
  2019-03-25 21:56 ` [PATCH v2 04/16] coresight: Adding return code to sink::disable() operation Mathieu Poirier
@ 2019-03-26 14:55   ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 14:55 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> In preparation to handle device reference counting inside of the sink
> drivers, add a return code to the sink::disable() operation so that
> proper action can be taken if a sink has not been disabled.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

This one and the next patch in the series, together fixes an issue
where we could leave the a sink still enabled, but with refcounts
dropped. (i.e, if we collide with someone reading the sink, while
trying to disable).

As such the issue has been lying around since the beginning, we don't
have to bother about fixing the issue in one single patch.

Thus,

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 05/16] coresight: Move reference counting inside sink drivers
  2019-03-25 21:56 ` [PATCH v2 05/16] coresight: Move reference counting inside sink drivers Mathieu Poirier
@ 2019-03-26 15:04   ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 15:04 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> When operating in CPU-wide mode with an N:1 source/sink HW topology,
> multiple CPUs can access a sink concurrently.  As such reference counting
> needs to happen when the device's spinlock is held to avoid racing with
> other operations (start(), update(), stop()), such as:
> 
> session A				Session B
> -----					-------
> 
> enable_sink
> atomic_inc(refcount)  = 1
> 
> ...
> 
> atomic_dec(refcount) = 0		enable_sink
> if (refcount == 0) disable_sink
> 					atomic_inc()
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 09/16] coresight: perf: Refactor function free_event_data()
  2019-03-25 21:56 ` [PATCH v2 09/16] coresight: perf: Refactor function free_event_data() Mathieu Poirier
@ 2019-03-26 15:07   ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 15:07 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> Function free_event_data() is already busy and is bound to become
> worse with the addition of CPU-wide trace scenarios.  As such spin
> off a new function to strickly take care of the sink buffers.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 10/16] coresight: Communicate perf event to sink buffer allocation function
  2019-03-25 21:56 ` [PATCH v2 10/16] coresight: Communicate perf event to sink buffer allocation function Mathieu Poirier
@ 2019-03-26 15:12   ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 15:12 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> Make struct perf_event available to sink buffer allocation functions in
> order to use the pid they carry to allocate and free buffer memory along
> with regimenting access to what source a sink can collect data for.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> 

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf()
  2019-03-25 21:56 ` [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf() Mathieu Poirier
@ 2019-03-26 15:29   ` Suzuki K Poulose
  2019-03-26 16:29     ` Mathieu Poirier
  0 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 15:29 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> Refactoring function tmc_etr_setup_perf_buf() so that it only deals
> with the high level etr_perf_buffer, and leaving the allocation of the
> backend buffer (i.e etr_buf) to another function.
> 
> That way the backend buffer allocation function can decide if it wants
> to reuse an existing buffer (CPU-wide trace scenarios) or simply create
> a new one.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

Looks good to me, except for one minor nit:


> ---
>   .../hwtracing/coresight/coresight-tmc-etr.c   | 46 +++++++++++++------
>   1 file changed, 31 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 6e2c2aa130d5..79fee9341446 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -1159,25 +1159,13 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>   	return ret;
>   }
>   
> -/*
> - * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
> - * The size of the hardware buffer is dependent on the size configured
> - * via sysfs and the perf ring buffer size. We prefer to allocate the
> - * largest possible size, scaling down the size by half until it
> - * reaches a minimum limit (1M), beyond which we give up.
> - */
> -static struct etr_perf_buffer *
> -tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
> -		       void **pages, bool snapshot)
> +static struct etr_buf *
> +tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> +		    int nr_pages, void **pages)

nit: The name tmc_etr_get_etr_buf() sounds too generic and has nothing
to do with the perf. It would be good to make it explicit that it is for
perf session.

May be, tmc_etr_perf_get_etr_buf() ? or may be even, simply
get_perf_etr_buf().

Otherwise,

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-25 21:56 ` [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer Mathieu Poirier
@ 2019-03-26 16:18   ` Suzuki K Poulose
  2019-03-26 17:55     ` Mathieu Poirier
  2019-03-30 15:43   ` Leo Yan
  1 sibling, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 16:18 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> This patch uses the pid of the process being traced to aggregate traces
> coming from different processors in the same sink, something that is
> required when collecting traces in CPU-wide mode when the CoreSight HW
> enacts a N:1 source/sink topology.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>   .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
>   1 file changed, 65 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 7254fafdf1c2..cbabf88bd51d 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -8,6 +8,7 @@
>   #include <linux/coresight.h>
>   #include <linux/dma-mapping.h>
>   #include <linux/iommu.h>
> +#include <linux/idr.h>
>   #include <linux/slab.h>
>   #include <linux/types.h>
>   #include <linux/vmalloc.h>
> @@ -41,6 +42,9 @@ struct etr_perf_buffer {
>   	void			**pages;
>   };
>   
> +static DEFINE_IDR(session_idr);
> +static DEFINE_MUTEX(session_idr_lock);

Please correct me if I am wrong here. What we now do with this series is

- One event per CPU and thus one ETR perf buf per CPU.
- One *ETR buf* per PID, from the IDR hash list.

However, if we have 1:1 situation, we will have :

N (say 2 ETRs), but one *ETR buf* as they all share the same PID, 
implying multiple multiple ETRs will try to use the same etr buf,
which could corrupt the trace data ? Instead,  what we need in that
situation is :

One ETR buf perf PID+ETR device. i.e, etr_bufs must be per ETR.
So should the IDR be specific to an ETR ?

Or do we not support a session with multiple ETRs involved (1:1) ?

Cheers
Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf()
  2019-03-26 15:29   ` Suzuki K Poulose
@ 2019-03-26 16:29     ` Mathieu Poirier
  0 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-26 16:29 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: alexander.shishkin, coresight, peterz, Mike.Leach, leo.yan,
	linux-arm-kernel

On Tue, Mar 26, 2019 at 03:29:03PM +0000, Suzuki K Poulose wrote:
> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> > Refactoring function tmc_etr_setup_perf_buf() so that it only deals
> > with the high level etr_perf_buffer, and leaving the allocation of the
> > backend buffer (i.e etr_buf) to another function.
> > 
> > That way the backend buffer allocation function can decide if it wants
> > to reuse an existing buffer (CPU-wide trace scenarios) or simply create
> > a new one.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> 
> Looks good to me, except for one minor nit:
> 
> 
> > ---
> >   .../hwtracing/coresight/coresight-tmc-etr.c   | 46 +++++++++++++------
> >   1 file changed, 31 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index 6e2c2aa130d5..79fee9341446 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -1159,25 +1159,13 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
> >   	return ret;
> >   }
> > -/*
> > - * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
> > - * The size of the hardware buffer is dependent on the size configured
> > - * via sysfs and the perf ring buffer size. We prefer to allocate the
> > - * largest possible size, scaling down the size by half until it
> > - * reaches a minimum limit (1M), beyond which we give up.
> > - */
> > -static struct etr_perf_buffer *
> > -tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
> > -		       void **pages, bool snapshot)
> > +static struct etr_buf *
> > +tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> > +		    int nr_pages, void **pages)
> 
> nit: The name tmc_etr_get_etr_buf() sounds too generic and has nothing
> to do with the perf. It would be good to make it explicit that it is for
> perf session.
> 
> May be, tmc_etr_perf_get_etr_buf() ? or may be even, simply
> get_perf_etr_buf().

I don't have a strong preference here, the latter looks fine to me.

> 
> Otherwise,
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Ok

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices
  2019-03-25 21:56 ` [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices Mathieu Poirier
@ 2019-03-26 16:46   ` Suzuki K Poulose
  2019-03-26 18:06     ` Mathieu Poirier
  0 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-26 16:46 UTC (permalink / raw)
  To: mathieu.poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, Mike.Leach, leo.yan

Mathieu,

On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> In preparation to support CPU-wide trace scenarios, add the notion of
> process ID to ETR devices so that memory buffers can be shared between
> events.

This patch looks fine to me. However it seems to do more than what the 
commit describes. e.g, refcounting. It may be worth fixing the
description.

Either way:

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>   .../hwtracing/coresight/coresight-tmc-etr.c   | 22 +++++++++++++++++--
>   drivers/hwtracing/coresight/coresight-tmc.h   |  3 +++
>   2 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 79fee9341446..7254fafdf1c2 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -26,6 +26,7 @@ struct etr_flat_buf {
>   /*
>    * etr_perf_buffer - Perf buffer used for ETR
>    * @etr_buf		- Actual buffer used by the ETR
> + * @pid			- The PID this etr_perf_buffer belongs to.
>    * @snaphost		- Perf session mode
>    * @head		- handle->head at the beginning of the session.
>    * @nr_pages		- Number of pages in the ring buffer.
> @@ -33,6 +34,7 @@ struct etr_flat_buf {
>    */
>   struct etr_perf_buffer {
>   	struct etr_buf		*etr_buf;
> +	pid_t			pid;
>   	bool			snapshot;
>   	unsigned long		head;
>   	int			nr_pages;
> @@ -1160,8 +1162,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>   }
>   
>   static struct etr_buf *
> -tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> -		    int nr_pages, void **pages)
> +alloc_etr_buf(struct tmc_drvdata *drvdata, int node,
> +	      int nr_pages, void **pages)
>   {
>   	struct etr_buf *etr_buf;
>   	unsigned long size;
> @@ -1195,6 +1197,20 @@ tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
>   	return etr_buf;
>   }
>   
> +static struct etr_buf *
> +tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> +		    int nr_pages, void **pages)
> +{
> +	struct etr_buf *etr_buf;
> +
> +	etr_buf = alloc_etr_buf(drvdata, node, nr_pages, pages);
> +	if (IS_ERR(etr_buf))
> +		return etr_buf;
> +
> +	refcount_set(&etr_buf->refcount, 1);
> +	return etr_buf;
> +}
> +
>   /*
>    * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
>    * The size of the hardware buffer is dependent on the size configured
> @@ -1231,6 +1247,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
>   				  int nr_pages, bool snapshot)
>   {
>   	int cpu = event->cpu;
> +	pid_t pid = task_pid_nr(event->owner);
>   	struct etr_perf_buffer *etr_perf;
>   	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>   
> @@ -1244,6 +1261,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
>   		return NULL;
>   	}
>   
> +	etr_perf->pid = pid;
>   	etr_perf->snapshot = snapshot;
>   	etr_perf->nr_pages = nr_pages;
>   	etr_perf->pages = pages;
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 487c53701e9c..7675138ff9fc 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -9,6 +9,7 @@
>   
>   #include <linux/dma-mapping.h>
>   #include <linux/miscdevice.h>
> +#include <linux/refcount.h>
>   
>   #define TMC_RSZ			0x004
>   #define TMC_STS			0x00c
> @@ -133,6 +134,7 @@ struct etr_buf_operations;
>   
>   /**
>    * struct etr_buf - Details of the buffer used by ETR
> + * @refcount	: Number of sources currently using this etr_buf.
>    * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
>    * @full	: Trace data overflow
>    * @size	: Size of the buffer.
> @@ -143,6 +145,7 @@ struct etr_buf_operations;
>    * @private	: Backend specific information for the buf
>    */
>   struct etr_buf {
> +	refcount_t			refcount;
>   	enum etr_mode			mode;
>   	bool				full;
>   	ssize_t				size;
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-26 16:18   ` Suzuki K Poulose
@ 2019-03-26 17:55     ` Mathieu Poirier
  2019-03-27 11:32       ` Suzuki K Poulose
  0 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-26 17:55 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: alexander.shishkin, coresight, peterz, Mike.Leach, leo.yan,
	linux-arm-kernel

On Tue, Mar 26, 2019 at 04:18:35PM +0000, Suzuki K Poulose wrote:
> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> > This patch uses the pid of the process being traced to aggregate traces
> > coming from different processors in the same sink, something that is
> > required when collecting traces in CPU-wide mode when the CoreSight HW
> > enacts a N:1 source/sink topology.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > ---
> >   .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
> >   1 file changed, 65 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index 7254fafdf1c2..cbabf88bd51d 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -8,6 +8,7 @@
> >   #include <linux/coresight.h>
> >   #include <linux/dma-mapping.h>
> >   #include <linux/iommu.h>
> > +#include <linux/idr.h>
> >   #include <linux/slab.h>
> >   #include <linux/types.h>
> >   #include <linux/vmalloc.h>
> > @@ -41,6 +42,9 @@ struct etr_perf_buffer {
> >   	void			**pages;
> >   };
> > +static DEFINE_IDR(session_idr);
> > +static DEFINE_MUTEX(session_idr_lock);
> 
> Please correct me if I am wrong here. What we now do with this series is
> 
> - One event per CPU and thus one ETR perf buf per CPU.
> - One *ETR buf* per PID, from the IDR hash list.
> 
> However, if we have 1:1 situation, we will have :
> 
> N (say 2 ETRs), but one *ETR buf* as they all share the same PID, implying
> multiple multiple ETRs will try to use the same etr buf,
> which could corrupt the trace data ? Instead,  what we need in that
> situation is :
> 
> One ETR buf perf PID+ETR device. i.e, etr_bufs must be per ETR.
> So should the IDR be specific to an ETR ?
> 
> Or do we not support a session with multiple ETRs involved (1:1) ?

At this time 1:1 topologies are not supported and a fair amount of work will go
in doing so.  When I started working on this serie my thought was that because
of said amount of work there is no point thinking about 1:1, and that we can
deal with it when we get there.

But taking a step back and having dealt with the harder (concurrency) problems
inherent to CPU-wide scenarios, it would be trivial to make the code 1:1 ready
and it will be one less thing to worry about down the road.

That being said and answering your question above, I think we need and IDR per
ETR (in the drvdata) to avoid contention issues on systems with several ETRs.  

Thanks for bringing this back to my attention.
Mathieu

> 
> Cheers
> Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices
  2019-03-26 16:46   ` Suzuki K Poulose
@ 2019-03-26 18:06     ` Mathieu Poirier
  0 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-26 18:06 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: alexander.shishkin, coresight, peterz, Mike.Leach, leo.yan,
	linux-arm-kernel

On Tue, Mar 26, 2019 at 04:46:06PM +0000, Suzuki K Poulose wrote:
> Mathieu,
> 
> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> > In preparation to support CPU-wide trace scenarios, add the notion of
> > process ID to ETR devices so that memory buffers can be shared between
> > events.
> 
> This patch looks fine to me. However it seems to do more than what the
> commit describes. e.g, refcounting. It may be worth fixing the
> description.

I wasted a lot of time thinking about this... Theoretically it should be split
in two but the changes would be so trivial it would hardly be worth the trouble.
I'll rework the description. 

> 
> Either way:
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> 
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > ---
> >   .../hwtracing/coresight/coresight-tmc-etr.c   | 22 +++++++++++++++++--
> >   drivers/hwtracing/coresight/coresight-tmc.h   |  3 +++
> >   2 files changed, 23 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index 79fee9341446..7254fafdf1c2 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -26,6 +26,7 @@ struct etr_flat_buf {
> >   /*
> >    * etr_perf_buffer - Perf buffer used for ETR
> >    * @etr_buf		- Actual buffer used by the ETR
> > + * @pid			- The PID this etr_perf_buffer belongs to.
> >    * @snaphost		- Perf session mode
> >    * @head		- handle->head at the beginning of the session.
> >    * @nr_pages		- Number of pages in the ring buffer.
> > @@ -33,6 +34,7 @@ struct etr_flat_buf {
> >    */
> >   struct etr_perf_buffer {
> >   	struct etr_buf		*etr_buf;
> > +	pid_t			pid;
> >   	bool			snapshot;
> >   	unsigned long		head;
> >   	int			nr_pages;
> > @@ -1160,8 +1162,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
> >   }
> >   static struct etr_buf *
> > -tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> > -		    int nr_pages, void **pages)
> > +alloc_etr_buf(struct tmc_drvdata *drvdata, int node,
> > +	      int nr_pages, void **pages)
> >   {
> >   	struct etr_buf *etr_buf;
> >   	unsigned long size;
> > @@ -1195,6 +1197,20 @@ tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> >   	return etr_buf;
> >   }
> > +static struct etr_buf *
> > +tmc_etr_get_etr_buf(struct tmc_drvdata *drvdata, int node,
> > +		    int nr_pages, void **pages)
> > +{
> > +	struct etr_buf *etr_buf;
> > +
> > +	etr_buf = alloc_etr_buf(drvdata, node, nr_pages, pages);
> > +	if (IS_ERR(etr_buf))
> > +		return etr_buf;
> > +
> > +	refcount_set(&etr_buf->refcount, 1);
> > +	return etr_buf;
> > +}
> > +
> >   /*
> >    * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
> >    * The size of the hardware buffer is dependent on the size configured
> > @@ -1231,6 +1247,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
> >   				  int nr_pages, bool snapshot)
> >   {
> >   	int cpu = event->cpu;
> > +	pid_t pid = task_pid_nr(event->owner);
> >   	struct etr_perf_buffer *etr_perf;
> >   	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> > @@ -1244,6 +1261,7 @@ static void *tmc_alloc_etr_buffer(struct coresight_device *csdev,
> >   		return NULL;
> >   	}
> > +	etr_perf->pid = pid;
> >   	etr_perf->snapshot = snapshot;
> >   	etr_perf->nr_pages = nr_pages;
> >   	etr_perf->pages = pages;
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> > index 487c53701e9c..7675138ff9fc 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc.h
> > +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> > @@ -9,6 +9,7 @@
> >   #include <linux/dma-mapping.h>
> >   #include <linux/miscdevice.h>
> > +#include <linux/refcount.h>
> >   #define TMC_RSZ			0x004
> >   #define TMC_STS			0x00c
> > @@ -133,6 +134,7 @@ struct etr_buf_operations;
> >   /**
> >    * struct etr_buf - Details of the buffer used by ETR
> > + * @refcount	: Number of sources currently using this etr_buf.
> >    * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
> >    * @full	: Trace data overflow
> >    * @size	: Size of the buffer.
> > @@ -143,6 +145,7 @@ struct etr_buf_operations;
> >    * @private	: Backend specific information for the buf
> >    */
> >   struct etr_buf {
> > +	refcount_t			refcount;
> >   	enum etr_mode			mode;
> >   	bool				full;
> >   	ssize_t				size;
> > 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (15 preceding siblings ...)
  2019-03-25 21:56 ` [PATCH v2 16/16] coresight: etb10: " Mathieu Poirier
@ 2019-03-27  7:52 ` Leo Yan
  2019-03-27 14:40   ` Mathieu Poirier
  2019-04-11 18:52 ` Robert Walker
  17 siblings, 1 reply; 41+ messages in thread
From: Leo Yan @ 2019-03-27  7:52 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, linux-arm-kernel

On Mon, Mar 25, 2019 at 03:56:16PM -0600, Mathieu Poirier wrote:
> This is the second revision of a patchset that adds support for CPU-wide
> trace scenarios and as such, it is now possible to issue the following
> commands:
> 
> 	# perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
> 	# perf record -e cs_etm/@20070000.etr/ -a $COMMAND
> 
> The solution is designed to work for both 1:1 and N:1 source/sink
> topologies, though the former hasn't been tested for lack of access to HW.
> 
> Most of the changes revolve around allowing more than one event to use
> a sink when operated from perf.  More specifically the first event to
> use a sink switches it on while the last one is tasked to aggregate traces
> and switching off the device.
> 
> This is the kernel part of the solution, with the user space portion to be
> released in a later set.  All patches (user and kernel) have been rebased
> on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
> 410c dragonboard platforms.

FWIW, gave the testing on my Hikey620 board and these patches also work
well with below commands:

  # perf record -e cs_etm/@f6402000.etf/ -a uname
  # perf record -e cs_etm/@f6402000.etf/ -C 0,4 uname

  # perf record -e cs_etm/@f6404000.etr/ -a uname
  # perf record -e cs_etm/@f6404000.etr/ -C 1,2,7 uname

P.s. just note here and this is off topic to this patch set, the
'perf report' command took below time for decoding ~4MB CoreSight trace
data on my Hikey board, seems we can take a look for decoding speed
optimization later:

  # time perf report  --vmlinux /mnt/linux-kernel/linux-next/vmlinux
  real    2m11.153s
  user    1m47.979s
  sys     0m0.204s

Thanks,
Leo Yan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-26 17:55     ` Mathieu Poirier
@ 2019-03-27 11:32       ` Suzuki K Poulose
  2019-03-27 17:01         ` Mathieu Poirier
  0 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2019-03-27 11:32 UTC (permalink / raw)
  To: mathieu.poirier
  Cc: alexander.shishkin, coresight, peterz, Mike.Leach, leo.yan,
	linux-arm-kernel

On 03/26/2019 05:55 PM, Mathieu Poirier wrote:
> On Tue, Mar 26, 2019 at 04:18:35PM +0000, Suzuki K Poulose wrote:
>> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
>>> This patch uses the pid of the process being traced to aggregate traces
>>> coming from different processors in the same sink, something that is
>>> required when collecting traces in CPU-wide mode when the CoreSight HW
>>> enacts a N:1 source/sink topology.
>>>
>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> ---
>>>    .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
>>>    1 file changed, 65 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> index 7254fafdf1c2..cbabf88bd51d 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> @@ -8,6 +8,7 @@
>>>    #include <linux/coresight.h>
>>>    #include <linux/dma-mapping.h>
>>>    #include <linux/iommu.h>
>>> +#include <linux/idr.h>
>>>    #include <linux/slab.h>
>>>    #include <linux/types.h>
>>>    #include <linux/vmalloc.h>
>>> @@ -41,6 +42,9 @@ struct etr_perf_buffer {
>>>    	void			**pages;
>>>    };
>>> +static DEFINE_IDR(session_idr);
>>> +static DEFINE_MUTEX(session_idr_lock);
>>
>> Please correct me if I am wrong here. What we now do with this series is
>>
>> - One event per CPU and thus one ETR perf buf per CPU.
>> - One *ETR buf* per PID, from the IDR hash list.
>>
>> However, if we have 1:1 situation, we will have :
>>
>> N (say 2 ETRs), but one *ETR buf* as they all share the same PID, implying
>> multiple multiple ETRs will try to use the same etr buf,
>> which could corrupt the trace data ? Instead,  what we need in that
>> situation is :
>>
>> One ETR buf perf PID+ETR device. i.e, etr_bufs must be per ETR.
>> So should the IDR be specific to an ETR ?
>>
>> Or do we not support a session with multiple ETRs involved (1:1) ?
> 
> At this time 1:1 topologies are not supported and a fair amount of work will go
> in doing so.  When I started working on this serie my thought was that because
> of said amount of work there is no point thinking about 1:1, and that we can
> deal with it when we get there.
> 
> But taking a step back and having dealt with the harder (concurrency) problems
> inherent to CPU-wide scenarios, it would be trivial to make the code 1:1 ready
> and it will be one less thing to worry about down the road.
> 
> That being said and answering your question above, I think we need and IDR per
> ETR (in the drvdata) to avoid contention issues on systems with several ETRs.
> 
> Thanks for bringing this back to my attention.

Cool. Thanks for explaining the rationale. So, when we do that, I think
we may be able to have one "etr_perf_buffer" per ETR and thus we may be
able to move the refcount back to the etr_perf_buffer, just like we
moved the PID and index etr_perf_buffer instead of the etr_buf ?

Cheers
Suzuki


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
  2019-03-27  7:52 ` [PATCH v2 00/16] coresight: " Leo Yan
@ 2019-03-27 14:40   ` Mathieu Poirier
  2019-03-27 14:44     ` Leo Yan
  0 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-27 14:40 UTC (permalink / raw)
  To: Leo Yan
  Cc: Suzuki K. Poulose, Alexander Shishkin, Coresight ML,
	Peter Zijlstra, Mike Leach, linux-arm-kernel

On Wed, 27 Mar 2019 at 01:52, Leo Yan <leo.yan@linaro.org> wrote:
>
> On Mon, Mar 25, 2019 at 03:56:16PM -0600, Mathieu Poirier wrote:
> > This is the second revision of a patchset that adds support for CPU-wide
> > trace scenarios and as such, it is now possible to issue the following
> > commands:
> >
> >       # perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
> >       # perf record -e cs_etm/@20070000.etr/ -a $COMMAND
> >
> > The solution is designed to work for both 1:1 and N:1 source/sink
> > topologies, though the former hasn't been tested for lack of access to HW.
> >
> > Most of the changes revolve around allowing more than one event to use
> > a sink when operated from perf.  More specifically the first event to
> > use a sink switches it on while the last one is tasked to aggregate traces
> > and switching off the device.
> >
> > This is the kernel part of the solution, with the user space portion to be
> > released in a later set.  All patches (user and kernel) have been rebased
> > on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
> > 410c dragonboard platforms.
>
> FWIW, gave the testing on my Hikey620 board and these patches also work
> well with below commands:
>
>   # perf record -e cs_etm/@f6402000.etf/ -a uname
>   # perf record -e cs_etm/@f6402000.etf/ -C 0,4 uname
>
>   # perf record -e cs_etm/@f6404000.etr/ -a uname
>   # perf record -e cs_etm/@f6404000.etr/ -C 1,2,7 uname

Should I add your "Tested-by:" to the patches then?

>
> P.s. just note here and this is off topic to this patch set, the
> 'perf report' command took below time for decoding ~4MB CoreSight trace
> data on my Hikey board, seems we can take a look for decoding speed
> optimization later:
>
>   # time perf report  --vmlinux /mnt/linux-kernel/linux-next/vmlinux
>   real    2m11.153s
>   user    1m47.979s
>   sys     0m0.204s
>
> Thanks,
> Leo Yan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
  2019-03-27 14:40   ` Mathieu Poirier
@ 2019-03-27 14:44     ` Leo Yan
  0 siblings, 0 replies; 41+ messages in thread
From: Leo Yan @ 2019-03-27 14:44 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Suzuki K. Poulose, Alexander Shishkin, Coresight ML,
	Peter Zijlstra, Mike Leach, linux-arm-kernel

On Wed, Mar 27, 2019 at 08:40:18AM -0600, Mathieu Poirier wrote:
> On Wed, 27 Mar 2019 at 01:52, Leo Yan <leo.yan@linaro.org> wrote:
> >
> > On Mon, Mar 25, 2019 at 03:56:16PM -0600, Mathieu Poirier wrote:
> > > This is the second revision of a patchset that adds support for CPU-wide
> > > trace scenarios and as such, it is now possible to issue the following
> > > commands:
> > >
> > >       # perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
> > >       # perf record -e cs_etm/@20070000.etr/ -a $COMMAND
> > >
> > > The solution is designed to work for both 1:1 and N:1 source/sink
> > > topologies, though the former hasn't been tested for lack of access to HW.
> > >
> > > Most of the changes revolve around allowing more than one event to use
> > > a sink when operated from perf.  More specifically the first event to
> > > use a sink switches it on while the last one is tasked to aggregate traces
> > > and switching off the device.
> > >
> > > This is the kernel part of the solution, with the user space portion to be
> > > released in a later set.  All patches (user and kernel) have been rebased
> > > on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
> > > 410c dragonboard platforms.
> >
> > FWIW, gave the testing on my Hikey620 board and these patches also work
> > well with below commands:
> >
> >   # perf record -e cs_etm/@f6402000.etf/ -a uname
> >   # perf record -e cs_etm/@f6402000.etf/ -C 0,4 uname
> >
> >   # perf record -e cs_etm/@f6404000.etr/ -a uname
> >   # perf record -e cs_etm/@f6404000.etr/ -C 1,2,7 uname
> 
> Should I add your "Tested-by:" to the patches then?

Yes.  Please add my test tag:

Tested-by: Leo Yan <leo.yan@linaro.org>

Thanks,
Leo Yan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-27 11:32       ` Suzuki K Poulose
@ 2019-03-27 17:01         ` Mathieu Poirier
  2019-04-01 13:01           ` Suzuki K Poulose
  0 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-03-27 17:01 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Alexander Shishkin, Coresight ML, Peter Zijlstra, Mike Leach,
	Leo Yan, linux-arm-kernel

On Wed, 27 Mar 2019 at 05:30, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> On 03/26/2019 05:55 PM, Mathieu Poirier wrote:
> > On Tue, Mar 26, 2019 at 04:18:35PM +0000, Suzuki K Poulose wrote:
> >> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> >>> This patch uses the pid of the process being traced to aggregate traces
> >>> coming from different processors in the same sink, something that is
> >>> required when collecting traces in CPU-wide mode when the CoreSight HW
> >>> enacts a N:1 source/sink topology.
> >>>
> >>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>> ---
> >>>    .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
> >>>    1 file changed, 65 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> >>> index 7254fafdf1c2..cbabf88bd51d 100644
> >>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> >>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> >>> @@ -8,6 +8,7 @@
> >>>    #include <linux/coresight.h>
> >>>    #include <linux/dma-mapping.h>
> >>>    #include <linux/iommu.h>
> >>> +#include <linux/idr.h>
> >>>    #include <linux/slab.h>
> >>>    #include <linux/types.h>
> >>>    #include <linux/vmalloc.h>
> >>> @@ -41,6 +42,9 @@ struct etr_perf_buffer {
> >>>     void                    **pages;
> >>>    };
> >>> +static DEFINE_IDR(session_idr);
> >>> +static DEFINE_MUTEX(session_idr_lock);
> >>
> >> Please correct me if I am wrong here. What we now do with this series is
> >>
> >> - One event per CPU and thus one ETR perf buf per CPU.
> >> - One *ETR buf* per PID, from the IDR hash list.
> >>
> >> However, if we have 1:1 situation, we will have :
> >>
> >> N (say 2 ETRs), but one *ETR buf* as they all share the same PID, implying
> >> multiple multiple ETRs will try to use the same etr buf,
> >> which could corrupt the trace data ? Instead,  what we need in that
> >> situation is :
> >>
> >> One ETR buf perf PID+ETR device. i.e, etr_bufs must be per ETR.
> >> So should the IDR be specific to an ETR ?
> >>
> >> Or do we not support a session with multiple ETRs involved (1:1) ?
> >
> > At this time 1:1 topologies are not supported and a fair amount of work will go
> > in doing so.  When I started working on this serie my thought was that because
> > of said amount of work there is no point thinking about 1:1, and that we can
> > deal with it when we get there.
> >
> > But taking a step back and having dealt with the harder (concurrency) problems
> > inherent to CPU-wide scenarios, it would be trivial to make the code 1:1 ready
> > and it will be one less thing to worry about down the road.
> >
> > That being said and answering your question above, I think we need and IDR per
> > ETR (in the drvdata) to avoid contention issues on systems with several ETRs.
> >
> > Thanks for bringing this back to my attention.
>
> Cool. Thanks for explaining the rationale. So, when we do that, I think
> we may be able to have one "etr_perf_buffer" per ETR and thus we may be
> able to move the refcount back to the etr_perf_buffer, just like we
> moved the PID and index etr_perf_buffer instead of the etr_buf ?

An etr_perf_buffer is associated with an event and holds the AUX ring
buffer that was created for that event.  In CPU-wide N:1 mode multiple
events (one per CPU), each with its own AUX ring buffer, share a sink
and as such we can't have a single etr_perf_buffer per ETR.

Thanks,
Mathieu

>
> Cheers
> Suzuki
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-25 21:56 ` [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer Mathieu Poirier
  2019-03-26 16:18   ` Suzuki K Poulose
@ 2019-03-30 15:43   ` Leo Yan
  2019-04-01  7:29     ` Mathieu Poirier
  1 sibling, 1 reply; 41+ messages in thread
From: Leo Yan @ 2019-03-30 15:43 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: suzuki.poulose, alexander.shishkin, coresight, peterz,
	mike.leach, linux-arm-kernel

Hi Mathieu,

On Mon, Mar 25, 2019 at 03:56:29PM -0600, Mathieu Poirier wrote:
> This patch uses the pid of the process being traced to aggregate traces
> coming from different processors in the same sink, something that is
> required when collecting traces in CPU-wide mode when the CoreSight HW
> enacts a N:1 source/sink topology.

I tried to use kprobe to verify the flow, so I observe something is
not expected (Maybe I misunderstand it), please check the detailed
info as below.

- The testing steps are as below:

  # perf record -e cs_etm/@f6404000.etr/ -a -- sleep 10 &
  # ls
  # ls
  # ls

  So I expect the CoreSight to do global tracing for 10 seconds, and
  then simply execute 'ls' command for 3 times and these three
  processes will be traced into perf data.

- I used kprobe event to add dynamic points for function tracing
  tmc_alloc_etr_buffer() and tmc_etr_get_etr_buf():

  echo 'p:my_probe tmc_alloc_etr_buffer' > /sys/kernel/debug/tracing/kprobe_events
  echo 'p:my_probe2 0xffff000010ba63c4 pid=%x1:u32' >> kprobe_events

  When start the 'perf record' command it will create etr_perf and
  etr_buf for every CPU, but afterwards for the three 'ls' processes,
  I cannot see any any ftrace log for them.  So finally I capture the
  trace log as below, it only creates buffer for every CPU for one
  etr_perf and one etr_buf but has no any buffer is created for 'ls'
  processes.

#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
            perf-2502  [000] d...  2003.550571: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.550595: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.556306: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.556329: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.557694: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.557708: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.559079: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.559095: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.560567: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.560581: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.561954: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.561965: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.563338: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.563352: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
            perf-2502  [000] d...  2003.564782: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
            perf-2502  [000] d...  2003.564796: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502


  Question: When every time execute 'ls' program, should we expect the
  tmc-etr driver to create etr_buf for every process?

- If I use the command 'perf report --sort pid --stdio' to output result,
  I also can only see one process and doesn't have any samples for new
  created 'ls' processes:

  # Samples: 168  of event 'branches'
  # Event count (approx.): 168
  #
  # Children      Self      Pid:Command
  # ........  ........  ...............
  #
     100.00%   100.00%     2502:perf

Thanks,
Leo Yan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-30 15:43   ` Leo Yan
@ 2019-04-01  7:29     ` Mathieu Poirier
  0 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-04-01  7:29 UTC (permalink / raw)
  To: Leo Yan
  Cc: Suzuki K. Poulose, Alexander Shishkin, Coresight ML,
	Peter Zijlstra, Mike Leach, linux-arm-kernel

Good day Leo,

On Sat, 30 Mar 2019 at 09:43, Leo Yan <leo.yan@linaro.org> wrote:
>
> Hi Mathieu,
>
> On Mon, Mar 25, 2019 at 03:56:29PM -0600, Mathieu Poirier wrote:
> > This patch uses the pid of the process being traced to aggregate traces
> > coming from different processors in the same sink, something that is
> > required when collecting traces in CPU-wide mode when the CoreSight HW
> > enacts a N:1 source/sink topology.
>
> I tried to use kprobe to verify the flow, so I observe something is
> not expected (Maybe I misunderstand it), please check the detailed
> info as below.
>
> - The testing steps are as below:
>
>   # perf record -e cs_etm/@f6404000.etr/ -a -- sleep 10 &
>   # ls
>   # ls
>   # ls
>
>   So I expect the CoreSight to do global tracing for 10 seconds, and
>   then simply execute 'ls' command for 3 times and these three
>   processes will be traced into perf data.

Right.

>
> - I used kprobe event to add dynamic points for function tracing
>   tmc_alloc_etr_buffer() and tmc_etr_get_etr_buf():
>
>   echo 'p:my_probe tmc_alloc_etr_buffer' > /sys/kernel/debug/tracing/kprobe_events
>   echo 'p:my_probe2 0xffff000010ba63c4 pid=%x1:u32' >> kprobe_events
>
>   When start the 'perf record' command it will create etr_perf and
>   etr_buf for every CPU, but afterwards for the three 'ls' processes,
>   I cannot see any any ftrace log for them.  So finally I capture the
>   trace log as below, it only creates buffer for every CPU for one
>   etr_perf and one etr_buf but has no any buffer is created for 'ls'
>   processes.
>

That is the correct behavior.  More specifically it creates an
etr_perf for every event but the etr_buf is shared between those
events because there is only one ETR device on your system.

> #                              _-----=> irqs-off
> #                             / _----=> need-resched
> #                            | / _---=> hardirq/softirq
> #                            || / _--=> preempt-depth
> #                            ||| /     delay
> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
> #              | |       |   ||||       |         |
>             perf-2502  [000] d...  2003.550571: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.550595: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.556306: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.556329: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.557694: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.557708: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.559079: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.559095: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.560567: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.560581: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.561954: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.561965: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.563338: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.563352: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>             perf-2502  [000] d...  2003.564782: my_probe: (tmc_alloc_etr_buffer+0x0/0x280)
>             perf-2502  [000] d...  2003.564796: my_probe2: (tmc_alloc_etr_buffer+0xbc/0x280) pid=2502
>
>
>   Question: When every time execute 'ls' program, should we expect the
>   tmc-etr driver to create etr_buf for every process?

Buffers are created at the beginning of the trace session for every
CPU (as you reported) but after that everything that happens on those
CPU is using the same buffers.  As such you won't see buffers created
for every program you execute.

>
> - If I use the command 'perf report --sort pid --stdio' to output result,
>   I also can only see one process and doesn't have any samples for new
>   created 'ls' processes:
>
>   # Samples: 168  of event 'branches'
>   # Event count (approx.): 168
>   #
>   # Children      Self      Pid:Command
>   # ........  ........  ...............
>   #
>      100.00%   100.00%     2502:perf

When working with an N:1 source/sink topology and doing a CPU-wide
session, the first event to use the sink will switch it on and the
last one will collect trace data.  With a 10 second trace session it
is very likely traces associated with the 'ls' processes have been
overwritten.  This is a problem inherent to the HW topology and there
is nothing we can currently do to alleviate it.

Let me know if you need more information.
Mathieu

>
> Thanks,
> Leo Yan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-03-27 17:01         ` Mathieu Poirier
@ 2019-04-01 13:01           ` Suzuki K Poulose
  2019-04-03  2:13             ` Mathieu Poirier
  0 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2019-04-01 13:01 UTC (permalink / raw)
  To: mathieu.poirier
  Cc: alexander.shishkin, coresight, peterz, Mike.leach, leo.yan,
	linux-arm-kernel

Hi Mathieu,

On 27/03/2019 17:01, Mathieu Poirier wrote:
> On Wed, 27 Mar 2019 at 05:30, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> On 03/26/2019 05:55 PM, Mathieu Poirier wrote:
>>> On Tue, Mar 26, 2019 at 04:18:35PM +0000, Suzuki K Poulose wrote:
>>>> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
>>>>> This patch uses the pid of the process being traced to aggregate traces
>>>>> coming from different processors in the same sink, something that is
>>>>> required when collecting traces in CPU-wide mode when the CoreSight HW
>>>>> enacts a N:1 source/sink topology.
>>>>>
>>>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>>> ---
>>>>>     .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
>>>>>     1 file changed, 65 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>>> index 7254fafdf1c2..cbabf88bd51d 100644
>>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>>> @@ -8,6 +8,7 @@
>>>>>     #include <linux/coresight.h>
>>>>>     #include <linux/dma-mapping.h>
>>>>>     #include <linux/iommu.h>
>>>>> +#include <linux/idr.h>
>>>>>     #include <linux/slab.h>
>>>>>     #include <linux/types.h>
>>>>>     #include <linux/vmalloc.h>
>>>>> @@ -41,6 +42,9 @@ struct etr_perf_buffer {
>>>>>      void                    **pages;
>>>>>     };
>>>>> +static DEFINE_IDR(session_idr);
>>>>> +static DEFINE_MUTEX(session_idr_lock);
>>>>
>>>> Please correct me if I am wrong here. What we now do with this series is
>>>>
>>>> - One event per CPU and thus one ETR perf buf per CPU.
>>>> - One *ETR buf* per PID, from the IDR hash list.
>>>>
>>>> However, if we have 1:1 situation, we will have :
>>>>
>>>> N (say 2 ETRs), but one *ETR buf* as they all share the same PID, implying
>>>> multiple multiple ETRs will try to use the same etr buf,
>>>> which could corrupt the trace data ? Instead,  what we need in that
>>>> situation is :
>>>>
>>>> One ETR buf perf PID+ETR device. i.e, etr_bufs must be per ETR.
>>>> So should the IDR be specific to an ETR ?
>>>>
>>>> Or do we not support a session with multiple ETRs involved (1:1) ?
>>>
>>> At this time 1:1 topologies are not supported and a fair amount of work will go
>>> in doing so.  When I started working on this serie my thought was that because
>>> of said amount of work there is no point thinking about 1:1, and that we can
>>> deal with it when we get there.
>>>
>>> But taking a step back and having dealt with the harder (concurrency) problems
>>> inherent to CPU-wide scenarios, it would be trivial to make the code 1:1 ready
>>> and it will be one less thing to worry about down the road.
>>>
>>> That being said and answering your question above, I think we need and IDR per
>>> ETR (in the drvdata) to avoid contention issues on systems with several ETRs.
>>>
>>> Thanks for bringing this back to my attention.
>>
>> Cool. Thanks for explaining the rationale. So, when we do that, I think
>> we may be able to have one "etr_perf_buffer" per ETR and thus we may be
>> able to move the refcount back to the etr_perf_buffer, just like we
>> moved the PID and index etr_perf_buffer instead of the etr_buf ?
> 
> An etr_perf_buffer is associated with an event and holds the AUX ring
> buffer that was created for that event.  In CPU-wide N:1 mode multiple
> events (one per CPU), each with its own AUX ring buffer, share a sink
> and as such we can't have a single etr_perf_buffer per ETR.

Ok. Thanks for clarifying this. So we have one AUX buffer per event, but
they all could end up in the same ETR HW buffer and may be copied to only
one of the AUX buffer, which happens to really disable the ETR. And thus
we need to have a sufficiently large AUX buffer in place to allow we don't
overflow all the time with trace from multiple events. Makes sense for the
N:1 topology. Also the decoding phase must extract the trace to the
corresponding event based on the TRACEDI of the packets.

Thats the best we could do at the moment.
If there was a way to have one single large AUX buffer for a set of events,
which is easily accessible from a given event in the set, rather than "N" large
buffers which could potentially impact memory consumption.

It would be good to have this clearly documented somewhere in the implementation
to keep us in track.

Cheers
Suzuki

> 
> Thanks,
> Mathieu
> 
>>
>> Cheers
>> Suzuki
>>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer
  2019-04-01 13:01           ` Suzuki K Poulose
@ 2019-04-03  2:13             ` Mathieu Poirier
  0 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2019-04-03  2:13 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Alexander Shishkin, Coresight ML, Peter Zijlstra, Mike Leach,
	Leo Yan, linux-arm-kernel

On Mon, 1 Apr 2019 at 07:02, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> Hi Mathieu,
>
> On 27/03/2019 17:01, Mathieu Poirier wrote:
> > On Wed, 27 Mar 2019 at 05:30, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >>
> >> On 03/26/2019 05:55 PM, Mathieu Poirier wrote:
> >>> On Tue, Mar 26, 2019 at 04:18:35PM +0000, Suzuki K Poulose wrote:
> >>>> On 03/25/2019 09:56 PM, Mathieu Poirier wrote:
> >>>>> This patch uses the pid of the process being traced to aggregate traces
> >>>>> coming from different processors in the same sink, something that is
> >>>>> required when collecting traces in CPU-wide mode when the CoreSight HW
> >>>>> enacts a N:1 source/sink topology.
> >>>>>
> >>>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>>>> ---
> >>>>>     .../hwtracing/coresight/coresight-tmc-etr.c   | 71 +++++++++++++++++--
> >>>>>     1 file changed, 65 insertions(+), 6 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> >>>>> index 7254fafdf1c2..cbabf88bd51d 100644
> >>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> >>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> >>>>> @@ -8,6 +8,7 @@
> >>>>>     #include <linux/coresight.h>
> >>>>>     #include <linux/dma-mapping.h>
> >>>>>     #include <linux/iommu.h>
> >>>>> +#include <linux/idr.h>
> >>>>>     #include <linux/slab.h>
> >>>>>     #include <linux/types.h>
> >>>>>     #include <linux/vmalloc.h>
> >>>>> @@ -41,6 +42,9 @@ struct etr_perf_buffer {
> >>>>>      void                    **pages;
> >>>>>     };
> >>>>> +static DEFINE_IDR(session_idr);
> >>>>> +static DEFINE_MUTEX(session_idr_lock);
> >>>>
> >>>> Please correct me if I am wrong here. What we now do with this series is
> >>>>
> >>>> - One event per CPU and thus one ETR perf buf per CPU.
> >>>> - One *ETR buf* per PID, from the IDR hash list.
> >>>>
> >>>> However, if we have 1:1 situation, we will have :
> >>>>
> >>>> N (say 2 ETRs), but one *ETR buf* as they all share the same PID, implying
> >>>> multiple multiple ETRs will try to use the same etr buf,
> >>>> which could corrupt the trace data ? Instead,  what we need in that
> >>>> situation is :
> >>>>
> >>>> One ETR buf perf PID+ETR device. i.e, etr_bufs must be per ETR.
> >>>> So should the IDR be specific to an ETR ?
> >>>>
> >>>> Or do we not support a session with multiple ETRs involved (1:1) ?
> >>>
> >>> At this time 1:1 topologies are not supported and a fair amount of work will go
> >>> in doing so.  When I started working on this serie my thought was that because
> >>> of said amount of work there is no point thinking about 1:1, and that we can
> >>> deal with it when we get there.
> >>>
> >>> But taking a step back and having dealt with the harder (concurrency) problems
> >>> inherent to CPU-wide scenarios, it would be trivial to make the code 1:1 ready
> >>> and it will be one less thing to worry about down the road.
> >>>
> >>> That being said and answering your question above, I think we need and IDR per
> >>> ETR (in the drvdata) to avoid contention issues on systems with several ETRs.
> >>>
> >>> Thanks for bringing this back to my attention.
> >>
> >> Cool. Thanks for explaining the rationale. So, when we do that, I think
> >> we may be able to have one "etr_perf_buffer" per ETR and thus we may be
> >> able to move the refcount back to the etr_perf_buffer, just like we
> >> moved the PID and index etr_perf_buffer instead of the etr_buf ?
> >
> > An etr_perf_buffer is associated with an event and holds the AUX ring
> > buffer that was created for that event.  In CPU-wide N:1 mode multiple
> > events (one per CPU), each with its own AUX ring buffer, share a sink
> > and as such we can't have a single etr_perf_buffer per ETR.
>
> Ok. Thanks for clarifying this. So we have one AUX buffer per event, but
> they all could end up in the same ETR HW buffer and may be copied to only
> one of the AUX buffer, which happens to really disable the ETR.

Exactly.

> And thus
> we need to have a sufficiently large AUX buffer in place to allow we don't
> overflow all the time with trace from multiple events. Makes sense for the
> N:1 topology.

Right - fortunately ring buffer size can be modified from the perf
tools command line.

> Also the decoding phase must extract the trace to the
> corresponding event based on the TRACEDI of the packets.

Yes, a good portion of the user space changes is related to that.

>
> Thats the best we could do at the moment.
> If there was a way to have one single large AUX buffer for a set of events,
> which is easily accessible from a given event in the set, rather than "N" large
> buffers which could potentially impact memory consumption.

That was my original idea but it would have required significant
changes to the kernel perf framework.

>
> It would be good to have this clearly documented somewhere in the implementation
> to keep us in track.

Ok, I'll put together a good description of the design choices in the
code.  There is already a significant amount but nothing that really
summarises this conversation.

>
> Cheers
> Suzuki
>
> >
> > Thanks,
> > Mathieu
> >
> >>
> >> Cheers
> >> Suzuki
> >>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
  2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
                   ` (16 preceding siblings ...)
  2019-03-27  7:52 ` [PATCH v2 00/16] coresight: " Leo Yan
@ 2019-04-11 18:52 ` Robert Walker
  2019-04-16 19:37   ` Mathieu Poirier
  17 siblings, 1 reply; 41+ messages in thread
From: Robert Walker @ 2019-04-11 18:52 UTC (permalink / raw)
  To: Mathieu Poirier, linux-arm-kernel
  Cc: alexander.shishkin, peterz, coresight, mike.leach

Hi Mathieu,

On 25/03/2019 21:56, Mathieu Poirier wrote:
> This is the second revision of a patchset that adds support for CPU-wide
> trace scenarios and as such, it is now possible to issue the following
> commands:
>
> 	# perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
> 	# perf record -e cs_etm/@20070000.etr/ -a $COMMAND
>
> The solution is designed to work for both 1:1 and N:1 source/sink
> topologies, though the former hasn't been tested for lack of access to HW.
>
> Most of the changes revolve around allowing more than one event to use
> a sink when operated from perf.  More specifically the first event to
> use a sink switches it on while the last one is tasked to aggregate traces
> and switching off the device.
>
> This is the kernel part of the solution, with the user space portion to be
> released in a later set.  All patches (user and kernel) have been rebased
> on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
> 410c dragonboard platforms.
>
> Regards,
> Mathieu
>
> [1]. https://git.linaro.org/people/mathieu.poirier/coresight.git (5.1-rc2-cpu-wide-v2)
>
I've tested this patch set and the associated perf patches on the HiKey 
960 - trace collection and decode appears to work OK. However, in order 
to get the timestamps in the trace stream, I needed to enable the 
CoreSight Timestamp generator before starting trace.  Without this, all 
the timestamp packets had a value of 0. This will likely affect other 
platforms.  For testing purposes, I enabled it by poking the control 
register directly via /dev/mem, but for full support you will need a 
driver for this component (it's fairly simple - just a single register 
to write to enable / disable) and entries in the device tree / ACPI 
tables - it's similar to the helper devices like CTI & CATU which aren't 
on the trace data path, but are associated with a device that is.

Also, the use of a counter to generate the timestamps periodically 
conflicts with the ETM strobing patch we've been using for AutoFDO.  
This strobing requires 2 counters and as most ETM implementations only 
have 2 counters, there is only one available if one is used for 
timestamps.  We'll have to do some investigation to work out a way 
around this.

Regards

Rob


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
  2019-04-11 18:52 ` Robert Walker
@ 2019-04-16 19:37   ` Mathieu Poirier
  2019-04-24 16:22     ` Robert Walker
  0 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2019-04-16 19:37 UTC (permalink / raw)
  To: Robert Walker
  Cc: Alexander Shishkin, Peter Zijlstra, Coresight ML, Mike Leach,
	linux-arm-kernel

Hi Robert,

On Thu, 11 Apr 2019 at 12:52, Robert Walker <robert.walker@arm.com> wrote:
>
> Hi Mathieu,
>
> On 25/03/2019 21:56, Mathieu Poirier wrote:
> > This is the second revision of a patchset that adds support for CPU-wide
> > trace scenarios and as such, it is now possible to issue the following
> > commands:
> >
> >       # perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
> >       # perf record -e cs_etm/@20070000.etr/ -a $COMMAND
> >
> > The solution is designed to work for both 1:1 and N:1 source/sink
> > topologies, though the former hasn't been tested for lack of access to HW.
> >
> > Most of the changes revolve around allowing more than one event to use
> > a sink when operated from perf.  More specifically the first event to
> > use a sink switches it on while the last one is tasked to aggregate traces
> > and switching off the device.
> >
> > This is the kernel part of the solution, with the user space portion to be
> > released in a later set.  All patches (user and kernel) have been rebased
> > on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
> > 410c dragonboard platforms.
> >
> > Regards,
> > Mathieu
> >
> > [1]. https://git.linaro.org/people/mathieu.poirier/coresight.git (5.1-rc2-cpu-wide-v2)
> >
> I've tested this patch set and the associated perf patches on the HiKey
> 960 - trace collection and decode appears to work OK. However, in order
> to get the timestamps in the trace stream, I needed to enable the
> CoreSight Timestamp generator before starting trace.  Without this, all
> the timestamp packets had a value of 0. This will likely affect other
> platforms.  For testing purposes, I enabled it by poking the control
> register directly via /dev/mem, but for full support you will need a
> driver for this component (it's fairly simple - just a single register
> to write to enable / disable) and entries in the device tree / ACPI
> tables - it's similar to the helper devices like CTI & CATU which aren't
> on the trace data path, but are associated with a device that is.
>

Thank you for taking the time to test this.  Can I add your
"Tested-by:" to the set?

Platforms where the timestamp generator needs to explicitly be enabled
are slowly emerging - I have heard of the issue on the CS mailing list
about a month ago.  Since I don't have HW to test the feature it will
not be part of this set.

> Also, the use of a counter to generate the timestamps periodically
> conflicts with the ETM strobing patch we've been using for AutoFDO.
> This strobing requires 2 counters and as most ETM implementations only
> have 2 counters, there is only one available if one is used for
> timestamps.  We'll have to do some investigation to work out a way
> around this.

I noticed that clocks were in short supply and as such added an
explicity test to make sure there were enough of them before
proceeding.  Like topology issues there isn't much we can currently do
other than preventing a trace session from moving forward if there
isn't enough counters.

Thanks,
Mathieu

>
> Regards
>
> Rob
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios
  2019-04-16 19:37   ` Mathieu Poirier
@ 2019-04-24 16:22     ` Robert Walker
  0 siblings, 0 replies; 41+ messages in thread
From: Robert Walker @ 2019-04-24 16:22 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Alexander Shishkin, Peter Zijlstra, Coresight ML, Mike Leach,
	linux-arm-kernel

Hi Mathieu,

On 16/04/2019 20:37, Mathieu Poirier wrote:
> Hi Robert,
>
> On Thu, 11 Apr 2019 at 12:52, Robert Walker <robert.walker@arm.com> wrote:
>> Hi Mathieu,
>>
>> On 25/03/2019 21:56, Mathieu Poirier wrote:
>>> This is the second revision of a patchset that adds support for CPU-wide
>>> trace scenarios and as such, it is now possible to issue the following
>>> commands:
>>>
>>>        # perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
>>>        # perf record -e cs_etm/@20070000.etr/ -a $COMMAND
>>>
>>> The solution is designed to work for both 1:1 and N:1 source/sink
>>> topologies, though the former hasn't been tested for lack of access to HW.
>>>
>>> Most of the changes revolve around allowing more than one event to use
>>> a sink when operated from perf.  More specifically the first event to
>>> use a sink switches it on while the last one is tasked to aggregate traces
>>> and switching off the device.
>>>
>>> This is the kernel part of the solution, with the user space portion to be
>>> released in a later set.  All patches (user and kernel) have been rebased
>>> on v5.1-rc2 and are hosted here[1].  Everything has been tested on Juno and
>>> 410c dragonboard platforms.
>>>
>>> Regards,
>>> Mathieu
>>>
>>> [1]. https://git.linaro.org/people/mathieu.poirier/coresight.git (5.1-rc2-cpu-wide-v2)
>>>
>> I've tested this patch set and the associated perf patches on the HiKey
>> 960 - trace collection and decode appears to work OK. However, in order
>> to get the timestamps in the trace stream, I needed to enable the
>> CoreSight Timestamp generator before starting trace.  Without this, all
>> the timestamp packets had a value of 0. This will likely affect other
>> platforms.  For testing purposes, I enabled it by poking the control
>> register directly via /dev/mem, but for full support you will need a
>> driver for this component (it's fairly simple - just a single register
>> to write to enable / disable) and entries in the device tree / ACPI
>> tables - it's similar to the helper devices like CTI & CATU which aren't
>> on the trace data path, but are associated with a device that is.
>>
> Thank you for taking the time to test this.  Can I add your
> "Tested-by:" to the set?

Yes, please do.  I've also tested v3 of these patches.

> Platforms where the timestamp generator needs to explicitly be enabled
> are slowly emerging - I have heard of the issue on the CS mailing list
> about a month ago.  Since I don't have HW to test the feature it will
> not be part of this set.
>
>> Also, the use of a counter to generate the timestamps periodically
>> conflicts with the ETM strobing patch we've been using for AutoFDO.
>> This strobing requires 2 counters and as most ETM implementations only
>> have 2 counters, there is only one available if one is used for
>> timestamps.  We'll have to do some investigation to work out a way
>> around this.
> I noticed that clocks were in short supply and as such added an
> explicity test to make sure there were enough of them before
> proceeding.  Like topology issues there isn't much we can currently do
> other than preventing a trace session from moving forward if there
> isn't enough counters.
>
My current thinking on this is that when using the strobing mode for 
AutoFDO, we only get short bursts of trace from each core and are only 
interested in following the program flow during that burst, so precise 
alignment between the instructions streams of each core is less 
important (and unlikely - we wouldn't expect the strobes of multiple 
cores to coincide).  We get a timestamp as a result of the trace burst 
starting which is sufficient for coarse alignment of the bursts.  So 
I've reworked my strobing patch to use both counters for the strobing 
and not enable the timestamp counter when strobing is enabled.

Regards


Rob



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2019-04-24 16:23 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-25 21:56 [PATCH v2 00/16] coresight: Add support for CPU-wide trace scenarios Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 01/16] coresight: pmu: Adding ITRACE property to cs_etm PMU Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 02/16] coresight: etm4x: Add kernel configuration for CONTEXTID Mathieu Poirier
2019-03-26 11:59   ` Suzuki K Poulose
2019-03-25 21:56 ` [PATCH v2 03/16] coresight: etm4x: Configure tracers to emit timestamps Mathieu Poirier
2019-03-26 11:53   ` Suzuki K Poulose
2019-03-25 21:56 ` [PATCH v2 04/16] coresight: Adding return code to sink::disable() operation Mathieu Poirier
2019-03-26 14:55   ` Suzuki K Poulose
2019-03-25 21:56 ` [PATCH v2 05/16] coresight: Move reference counting inside sink drivers Mathieu Poirier
2019-03-26 15:04   ` Suzuki K Poulose
2019-03-25 21:56 ` [PATCH v2 06/16] coresight: Properly address errors in sink::disable() functions Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 07/16] coresight: Properly address concurrency in sink::update() functions Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 08/16] coresight: perf: Clean up function etm_setup_aux() Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 09/16] coresight: perf: Refactor function free_event_data() Mathieu Poirier
2019-03-26 15:07   ` Suzuki K Poulose
2019-03-25 21:56 ` [PATCH v2 10/16] coresight: Communicate perf event to sink buffer allocation function Mathieu Poirier
2019-03-26 15:12   ` Suzuki K Poulose
2019-03-25 21:56 ` [PATCH v2 11/16] coresight: tmc-etr: Refactor function tmc_etr_setup_perf_buf() Mathieu Poirier
2019-03-26 15:29   ` Suzuki K Poulose
2019-03-26 16:29     ` Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 12/16] coresight: tmc-etr: Introduce the notion of process ID to ETR devices Mathieu Poirier
2019-03-26 16:46   ` Suzuki K Poulose
2019-03-26 18:06     ` Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 13/16] coresight: tmc-etr: Allow events to use the same ETR buffer Mathieu Poirier
2019-03-26 16:18   ` Suzuki K Poulose
2019-03-26 17:55     ` Mathieu Poirier
2019-03-27 11:32       ` Suzuki K Poulose
2019-03-27 17:01         ` Mathieu Poirier
2019-04-01 13:01           ` Suzuki K Poulose
2019-04-03  2:13             ` Mathieu Poirier
2019-03-30 15:43   ` Leo Yan
2019-04-01  7:29     ` Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 14/16] coresight: tmc-etr: Add support for CPU-wide trace scenarios Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 15/16] coresight: tmc-etf: " Mathieu Poirier
2019-03-25 21:56 ` [PATCH v2 16/16] coresight: etb10: " Mathieu Poirier
2019-03-27  7:52 ` [PATCH v2 00/16] coresight: " Leo Yan
2019-03-27 14:40   ` Mathieu Poirier
2019-03-27 14:44     ` Leo Yan
2019-04-11 18:52 ` Robert Walker
2019-04-16 19:37   ` Mathieu Poirier
2019-04-24 16:22     ` Robert Walker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).