* [PATCH 00/17] coresight: perf: TMC ETR backend support
@ 2017-10-19 17:15 Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 01/17] coresight etr: Disallow perf mode temporarily Suzuki K Poulose
                   ` (17 more replies)
  0 siblings, 18 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

The TMC-ETR supports routing the Coresight trace data to system
memory. It supports two different modes in which the memory can
be used.

1) Contiguous memory - The memory is assumed to be physically
contiguous.

2) Scatter-Gather list - The memory can be made up of chunks of 4K
pages, described by a table of pointers, where the table itself
could span multiple 4K pages.

To hide the complications of managing the buffer, this series
adds a layer for managing the ETR buffer, which makes the best possible
choice based on what is available. The allocation can be tuned by passing
in flags, existing pages (e.g, the perf ring buffer) etc.

Towards supporting ETR Scatter Gather mode, we introduce a generic TMC
scatter-gather table which can be used to manage the data and table pages.
The table can be filled in the format expected by the Scatter-Gather
mode.

The TMC ETR-SG mechanism doesn't allow starting the trace at a non-zero
offset (required by perf). So we make some careful changes to the table
at run time to allow starting at any page aligned offset and then
wrapping around to the beginning of the buffer with very little overhead.
See the individual patches for more details.

The series also improves the way the ETR is controlled by the different
modes (sysfs vs. perf) by keeping mode specific data. This allows access
to the trace data collected in sysfs mode, even when the ETR is
operated in perf mode. Also, with the transparent management of the
buffer and the scatter-gather mechanism, we can allow the user to
request larger trace buffers for sysfs mode. This is supported
by providing a sysfs file, "buffer_size", which accepts a page aligned
size to be used by the ETR when allocating a buffer.
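
As an illustration (the device name below is only a placeholder, the
actual name is platform specific, and the accepted value format is
whatever the new sysfs ABI documents), the buffer size could be tuned
before collecting a trace in sysfs mode:

  # Use a 16MB (page aligned) buffer for the next ETR trace session
  echo 0x1000000 > /sys/bus/coresight/devices/<etr-device>/buffer_size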

Finally, it cleans up the etm perf sink callbacks a little bit and
then adds support for the ETR sink. For the ETR, we try our best to
use the perf ring buffer as the target hardware buffer, provided:
 1) The ETR is dma coherent (since the pages will be shared with
    userspace perf tool).
 2) perf is used in snapshot mode (the ETR cannot be stopped
    based on the size of the data written, hence we could easily
    overwrite the buffer. We may be able to fix this in the future.)
 3) The ETR supports the Scatter-Gather mode.

If we can't use the perf buffers directly, we fall back to software
buffering, where we have to copy the trace data back to the perf
ring buffer.
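
For reference, a hedged example of the intended perf usage, with the
sink enabled through the existing sysfs handle (the device name is a
placeholder) and the session run in snapshot mode as required above:

  echo 1 > /sys/bus/coresight/devices/<etr-device>/enable_sink
  perf record -e cs_etm//u --per-thread --snapshot -- ./my_app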

Suzuki K Poulose (17):
  coresight etr: Disallow perf mode temporarily
  coresight tmc: Hide trace buffer handling for file read
  coresight: Add helper for inserting synchronization packets
  coresight: Add generic TMC sg table framework
  coresight: Add support for TMC ETR SG unit
  coresight: tmc: Make ETR SG table circular
  coresight: tmc etr: Add transparent buffer management
  coresight: tmc: Add configuration support for trace buffer size
  coresight: Convert driver messages to dev_dbg
  coresight: etr: Track if the device is coherent
  coresight etr: Handle driver mode specific ETR buffers
  coresight etr: Relax collection of trace from sysfs mode
  coresight etr: Do not clean ETR trace buffer
  coresight: etr: Add support for save restore buffers
  coresight: etr_buf: Add helper for padding an area of trace data
  coresight: perf: Remove reset_buffer call back for sinks
  coresight perf: Add ETR backend support for etm-perf

 .../ABI/testing/sysfs-bus-coresight-devices-tmc    |    8 +
 .../coresight/coresight-dynamic-replicator.c       |    4 +-
 drivers/hwtracing/coresight/coresight-etb10.c      |   72 +-
 drivers/hwtracing/coresight/coresight-etm-perf.c   |    9 +-
 drivers/hwtracing/coresight/coresight-etm3x.c      |    4 +-
 drivers/hwtracing/coresight/coresight-etm4x.c      |    4 +-
 drivers/hwtracing/coresight/coresight-funnel.c     |    4 +-
 drivers/hwtracing/coresight/coresight-priv.h       |    8 +
 drivers/hwtracing/coresight/coresight-replicator.c |    4 +-
 drivers/hwtracing/coresight/coresight-stm.c        |    4 +-
 drivers/hwtracing/coresight/coresight-tmc-etf.c    |  109 +-
 drivers/hwtracing/coresight/coresight-tmc-etr.c    | 1665 ++++++++++++++++++--
 drivers/hwtracing/coresight/coresight-tmc.c        |   75 +-
 drivers/hwtracing/coresight/coresight-tmc.h        |  128 +-
 drivers/hwtracing/coresight/coresight-tpiu.c       |    4 +-
 include/linux/coresight.h                          |    5 +-
 16 files changed, 1837 insertions(+), 270 deletions(-)

-- 
2.13.6


* [PATCH 01/17] coresight etr: Disallow perf mode temporarily
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read Suzuki K Poulose
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

We don't support ETR in perf mode yet. Temporarily fail the
operation until we add proper support.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 28 ++-----------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 68fbc8f7450e..d0208f01afd9 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -192,32 +192,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 
 static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
 {
-	int ret = 0;
-	unsigned long flags;
-	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
-
-	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (drvdata->reading) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	/*
-	 * In Perf mode there can be only one writer per sink.  There
-	 * is also no need to continue if the ETR is already operated
-	 * from sysFS.
-	 */
-	if (drvdata->mode != CS_MODE_DISABLED) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	drvdata->mode = CS_MODE_PERF;
-	tmc_etr_enable_hw(drvdata);
-out:
-	spin_unlock_irqrestore(&drvdata->spinlock, flags);
-
-	return ret;
+	/* We don't support perf mode yet ! */
+	return -EINVAL;
 }
 
 static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode)
-- 
2.13.6


* [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 01/17] coresight etr: Disallow perf mode temporarily Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-20 12:34   ` Julien Thierry
  2017-10-19 17:15 ` [PATCH 03/17] coresight: Add helper for inserting synchronization packets Suzuki K Poulose
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

At the moment we adjust the buffer pointers for reading the trace
data via the misc device in the common code for ETF/ETB and ETR. Since
we are going to change how we manage the buffer for ETR, let us
move the buffer manipulation to the respective driver files, hiding
it from the common code. We do so by adding type specific helpers
which return the length of the available data and a pointer into the
buffer, for a given length at a file position.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 16 ++++++++++++
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 33 ++++++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-tmc.c     | 34 ++++++++++++++-----------
 drivers/hwtracing/coresight/coresight-tmc.h     |  4 +++
 4 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index e2513b786242..0b6f1eb746de 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -120,6 +120,22 @@ static void tmc_etf_disable_hw(struct tmc_drvdata *drvdata)
 	CS_LOCK(drvdata->base);
 }
 
+/*
+ * Return the available trace data in the buffer from @pos, with
+ * a maximum limit of @len, updating the @bufpp on where to
+ * find it.
+ */
+ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
+				  loff_t pos, size_t len, char **bufpp)
+{
+	/* Adjust the len to available size @pos */
+	if (pos + len > drvdata->len)
+		len = drvdata->len - pos;
+	if (len > 0)
+		*bufpp = drvdata->buf + pos;
+	return len;
+}
+
 static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
 {
 	int ret = 0;
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index d0208f01afd9..063f253f1c99 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -69,6 +69,39 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 	CS_LOCK(drvdata->base);
 }
 
+/*
+ * Return the available trace data in the buffer @pos, with a maximum
+ * limit of @len, also updating the @bufpp on where to find it.
+ */
+ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
+			    loff_t pos, size_t len, char **bufpp)
+{
+	char *bufp = drvdata->buf + pos;
+	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
+
+	/* Adjust the len to available size @pos */
+	if (pos + len > drvdata->len)
+		len = drvdata->len - pos;
+
+	if (len <= 0)
+		return len;
+
+	/*
+	 * Since we use a circular buffer, with trace data starting
+	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
+	 * wrap the current @pos to within the buffer.
+	 */
+	if (bufp >= bufend)
+		bufp -= drvdata->size;
+	/*
+	 * For simplicity, avoid copying over a wrapped around buffer.
+	 */
+	if ((bufp + len) > bufend)
+		len = bufend - bufp;
+	*bufpp = bufp;
+	return len;
+}
+
 static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 {
 	const u32 *barrier;
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 2ff4a66a3caa..c7201e40d737 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -131,24 +131,29 @@ static int tmc_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static inline ssize_t tmc_get_sysfs_trace(struct tmc_drvdata *drvdata,
+					loff_t pos, size_t len, char **bufpp)
+{
+	switch (drvdata->config_type) {
+	case TMC_CONFIG_TYPE_ETB:
+	case TMC_CONFIG_TYPE_ETF:
+		return tmc_etb_get_sysfs_trace(drvdata, pos, len, bufpp);
+	case TMC_CONFIG_TYPE_ETR:
+		return tmc_etr_get_sysfs_trace(drvdata, pos, len, bufpp);
+	}
+
+	return  -EINVAL;
+}
+
 static ssize_t tmc_read(struct file *file, char __user *data, size_t len,
 			loff_t *ppos)
 {
+	char *bufp;
 	struct tmc_drvdata *drvdata = container_of(file->private_data,
 						   struct tmc_drvdata, miscdev);
-	char *bufp = drvdata->buf + *ppos;
-
-	if (*ppos + len > drvdata->len)
-		len = drvdata->len - *ppos;
-
-	if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) {
-		if (bufp == (char *)(drvdata->vaddr + drvdata->size))
-			bufp = drvdata->vaddr;
-		else if (bufp > (char *)(drvdata->vaddr + drvdata->size))
-			bufp -= drvdata->size;
-		if ((bufp + len) > (char *)(drvdata->vaddr + drvdata->size))
-			len = (char *)(drvdata->vaddr + drvdata->size) - bufp;
-	}
+	len = tmc_get_sysfs_trace(drvdata, *ppos, len, &bufp);
+	if (len <= 0)
+		return 0;
 
 	if (copy_to_user(data, bufp, len)) {
 		dev_dbg(drvdata->dev, "%s: copy_to_user failed\n", __func__);
@@ -156,9 +161,8 @@ static ssize_t tmc_read(struct file *file, char __user *data, size_t len,
 	}
 
 	*ppos += len;
+	dev_dbg(drvdata->dev, "%zu bytes copied\n", len);
 
-	dev_dbg(drvdata->dev, "%s: %zu bytes copied, %d bytes left\n",
-		__func__, len, (int)(drvdata->len - *ppos));
 	return len;
 }
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 8df7a813f537..6deb3afe9db8 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -183,10 +183,14 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata);
 extern const struct coresight_ops tmc_etb_cs_ops;
 extern const struct coresight_ops tmc_etf_cs_ops;
 
+ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
+			    loff_t pos, size_t len, char **bufpp);
 /* ETR functions */
 int tmc_read_prepare_etr(struct tmc_drvdata *drvdata);
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata);
 extern const struct coresight_ops tmc_etr_cs_ops;
+ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
+			    loff_t pos, size_t len, char **bufpp);
 
 
 #define TMC_REG_PAIR(name, lo_off, hi_off)				\
-- 
2.13.6


* [PATCH 03/17] coresight: Add helper for inserting synchronization packets
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 01/17] coresight etr: Disallow perf mode temporarily Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-30 21:44   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 04/17] coresight: Add generic TMC sg table framework Suzuki K Poulose
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Right now we open-code filling the trace buffer with synchronization
packets, when the circular buffer wraps around, in the individual
drivers. Move this to a common place.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etb10.c   | 10 +++------
 drivers/hwtracing/coresight/coresight-priv.h    |  8 ++++++++
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 27 ++++++++-----------------
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 +-----------
 4 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 56ecd7aff5eb..d7164ab8e229 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -203,7 +203,6 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
 	bool lost = false;
 	int i;
 	u8 *buf_ptr;
-	const u32 *barrier;
 	u32 read_data, depth;
 	u32 read_ptr, write_ptr;
 	u32 frame_off, frame_endoff;
@@ -234,19 +233,16 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
 
 	depth = drvdata->buffer_depth;
 	buf_ptr = drvdata->buf;
-	barrier = barrier_pkt;
 	for (i = 0; i < depth; i++) {
 		read_data = readl_relaxed(drvdata->base +
 					  ETB_RAM_READ_DATA_REG);
-		if (lost && *barrier) {
-			read_data = *barrier;
-			barrier++;
-		}
-
 		*(u32 *)buf_ptr = read_data;
 		buf_ptr += 4;
 	}
 
+	if (lost)
+		coresight_insert_barrier_packet(drvdata->buf);
+
 	if (frame_off) {
 		buf_ptr -= (frame_endoff * 4);
 		for (i = 0; i < frame_endoff; i++) {
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f1d0e21d8cab..d12f64928c00 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -65,6 +65,7 @@ static DEVICE_ATTR_RO(name)
 	__coresight_simple_func(type, NULL, name, lo_off, hi_off)
 
 extern const u32 barrier_pkt[5];
+#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt) - sizeof(u32))
 
 enum etm_addr_type {
 	ETM_ADDR_TYPE_NONE,
@@ -98,6 +99,13 @@ struct cs_buffers {
 	void			**data_pages;
 };
 
+static inline void coresight_insert_barrier_packet(void *buf)
+{
+	if (buf)
+		memcpy(buf, barrier_pkt, CORESIGHT_BARRIER_PKT_SIZE);
+}
+
+
 static inline void CS_LOCK(void __iomem *addr)
 {
 	do {
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 0b6f1eb746de..d89bfb3042a2 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -43,39 +43,28 @@ static void tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
 
 static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
 {
-	bool lost = false;
 	char *bufp;
-	const u32 *barrier;
-	u32 read_data, status;
+	u32 read_data, lost;
 	int i;
 
-	/*
-	 * Get a hold of the status register and see if a wrap around
-	 * has occurred.
-	 */
-	status = readl_relaxed(drvdata->base + TMC_STS);
-	if (status & TMC_STS_FULL)
-		lost = true;
-
+	/* Check if the buffer was wrapped around. */
+	lost = readl_relaxed(drvdata->base + TMC_STS) & TMC_STS_FULL;
 	bufp = drvdata->buf;
 	drvdata->len = 0;
-	barrier = barrier_pkt;
 	while (1) {
 		for (i = 0; i < drvdata->memwidth; i++) {
 			read_data = readl_relaxed(drvdata->base + TMC_RRD);
 			if (read_data == 0xFFFFFFFF)
-				return;
-
-			if (lost && *barrier) {
-				read_data = *barrier;
-				barrier++;
-			}
-
+				goto done;
 			memcpy(bufp, &read_data, 4);
 			bufp += 4;
 			drvdata->len += 4;
 		}
 	}
+done:
+	if (lost)
+		coresight_insert_barrier_packet(drvdata->buf);
+	return;
 }
 
 static void tmc_etb_disable_hw(struct tmc_drvdata *drvdata)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 063f253f1c99..41535fa6b6cf 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -104,9 +104,7 @@ ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 
 static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 {
-	const u32 *barrier;
 	u32 val;
-	u32 *temp;
 	u64 rwp;
 
 	rwp = tmc_read_rwp(drvdata);
@@ -119,16 +117,7 @@ static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 	if (val & TMC_STS_FULL) {
 		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
 		drvdata->len = drvdata->size;
-
-		barrier = barrier_pkt;
-		temp = (u32 *)drvdata->buf;
-
-		while (*barrier) {
-			*temp = *barrier;
-			temp++;
-			barrier++;
-		}
-
+		coresight_insert_barrier_packet(drvdata->buf);
 	} else {
 		drvdata->buf = drvdata->vaddr;
 		drvdata->len = rwp - drvdata->paddr;
-- 
2.13.6


* [PATCH 04/17] coresight: Add generic TMC sg table framework
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (2 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 03/17] coresight: Add helper for inserting synchronization packets Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-31 22:13   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 05/17] coresight: Add support for TMC ETR SG unit Suzuki K Poulose
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose, Mathieu Poirier

This patch introduces a generic sg table data structure and
associated operations. An SG table can be used to map a set
of Data pages where the trace data could be stored by the TMC
ETR. The information about the data pages could be stored in
different formats, depending on the type of the underlying
SG mechanism (e.g, TMC ETR SG vs Coresight CATU). The generic
structure provides bookkeeping of the pages used for the data
as well as the table contents. The table should be filled by
the user of the infrastructure.

A table can be created by specifying the number of data pages
as well as the number of table pages required to hold the
pointers, where the latter could be different for different
types of tables. The pages are mapped in the appropriate dma
data direction mode (i.e, DMA_TO_DEVICE for table pages
and DMA_FROM_DEVICE for data pages).  The framework can optionally
accept a set of allocated data pages (e.g, perf ring buffer) and
map them accordingly. The table and data pages are vmap'ed to allow
easier access by the drivers. The framework also provides helpers to
sync the data written to the pages with appropriate directions.

This will be later used by the TMC ETR SG unit.
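
For orientation, a minimal, hypothetical usage sketch of the API added
by this patch (the caller, the page counts and the way the table is
filled are made up for illustration; the real users appear in later
patches of this series):

/* Hypothetical user: 1 table page, 16 data pages allocated on @node */
static struct tmc_sg_table *example_sg_table_setup(struct device *dev,
						   int node)
{
	char *buf;
	ssize_t len;
	struct tmc_sg_table *sg_table;

	/* Let the framework allocate the data pages (pages == NULL) */
	sg_table = tmc_alloc_sg_table(dev, node, 1, 16, NULL);
	if (IS_ERR(sg_table))
		return sg_table;

	/*
	 * The user fills sg_table->table_vaddr in its own format
	 * (e.g, ETR SG entries) and pushes the table out to the device.
	 */
	tmc_sg_table_sync_table(sg_table);

	/* After a trace run, pull in the device writes and peek at them */
	tmc_sg_table_sync_data_range(sg_table, 0, 4096);
	len = tmc_sg_table_get_data(sg_table, 0, 4096, &buf);
	if (len > 0)
		pr_debug("%zd bytes of trace data at %p\n", len, buf);

	return sg_table;
}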

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 289 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-tmc.h     |  44 ++++
 2 files changed, 332 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 41535fa6b6cf..4b9e2b276122 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -16,10 +16,297 @@
  */
 
 #include <linux/coresight.h>
-#include <linux/dma-mapping.h>
+#include <linux/slab.h>
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
 
+/*
+ * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
+ * and map @addr to an offset within the buffer.
+ */
+static long
+tmc_pages_get_offset(struct tmc_pages *tmc_pages, dma_addr_t addr)
+{
+	int i;
+	dma_addr_t page_start;
+
+	for (i = 0; i < tmc_pages->nr_pages; i++) {
+		page_start = tmc_pages->daddrs[i];
+		if (addr >= page_start && addr < (page_start + PAGE_SIZE))
+			return i * PAGE_SIZE + (addr - page_start);
+	}
+
+	return -EINVAL;
+}
+
+/*
+ * tmc_pages_free : Unmap and free the pages used by tmc_pages.
+ */
+static void tmc_pages_free(struct tmc_pages *tmc_pages,
+			   struct device *dev, enum dma_data_direction dir)
+{
+	int i;
+
+	for (i = 0; i < tmc_pages->nr_pages; i++) {
+		if (tmc_pages->daddrs && tmc_pages->daddrs[i])
+			dma_unmap_page(dev, tmc_pages->daddrs[i],
+					 PAGE_SIZE, dir);
+		if (tmc_pages->pages && tmc_pages->pages[i])
+			__free_page(tmc_pages->pages[i]);
+	}
+
+	kfree(tmc_pages->pages);
+	kfree(tmc_pages->daddrs);
+	tmc_pages->pages = NULL;
+	tmc_pages->daddrs = NULL;
+	tmc_pages->nr_pages = 0;
+}
+
+/*
+ * tmc_pages_alloc : Allocate and map pages for a given @tmc_pages.
+ * If @pages is not NULL, the list of page virtual addresses are
+ * used as the data pages. The pages are then dma_map'ed for @dev
+ * with dma_direction @dir.
+ *
+ * Returns 0 upon success, else the error number.
+ */
+static int tmc_pages_alloc(struct tmc_pages *tmc_pages,
+			   struct device *dev, int node,
+			   enum dma_data_direction dir, void **pages)
+{
+	int i, nr_pages;
+	dma_addr_t paddr;
+	struct page *page;
+
+	nr_pages = tmc_pages->nr_pages;
+	tmc_pages->daddrs = kcalloc(nr_pages, sizeof(*tmc_pages->daddrs),
+					 GFP_KERNEL);
+	if (!tmc_pages->daddrs)
+		return -ENOMEM;
+	tmc_pages->pages = kcalloc(nr_pages, sizeof(*tmc_pages->pages),
+					 GFP_KERNEL);
+	if (!tmc_pages->pages) {
+		kfree(tmc_pages->daddrs);
+		tmc_pages->daddrs = NULL;
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		if (pages && pages[i]) {
+			page = virt_to_page(pages[i]);
+			get_page(page);
+		} else {
+			page = alloc_pages_node(node,
+						GFP_KERNEL | __GFP_ZERO, 0);
+		}
+		paddr = dma_map_page(dev, page, 0, PAGE_SIZE, dir);
+		if (dma_mapping_error(dev, paddr))
+			goto err;
+		tmc_pages->daddrs[i] = paddr;
+		tmc_pages->pages[i] = page;
+	}
+	return 0;
+err:
+	tmc_pages_free(tmc_pages, dev, dir);
+	return -ENOMEM;
+}
+
+static inline dma_addr_t tmc_sg_table_base_paddr(struct tmc_sg_table *sg_table)
+{
+	if (WARN_ON(!sg_table->data_pages.pages[0]))
+		return 0;
+	return sg_table->table_daddr;
+}
+
+static inline void *tmc_sg_table_base_vaddr(struct tmc_sg_table *sg_table)
+{
+	if (WARN_ON(!sg_table->data_pages.pages[0]))
+		return NULL;
+	return sg_table->table_vaddr;
+}
+
+static inline void *
+tmc_sg_table_data_vaddr(struct tmc_sg_table *sg_table)
+{
+	if (WARN_ON(!sg_table->data_pages.nr_pages))
+		return 0;
+	return sg_table->data_vaddr;
+}
+
+static inline unsigned long
+tmc_sg_table_buf_size(struct tmc_sg_table *sg_table)
+{
+	return sg_table->data_pages.nr_pages << PAGE_SHIFT;
+}
+
+static inline long
+tmc_sg_get_data_page_offset(struct tmc_sg_table *sg_table, dma_addr_t addr)
+{
+	return tmc_pages_get_offset(&sg_table->data_pages, addr);
+}
+
+static inline void tmc_free_table_pages(struct tmc_sg_table *sg_table)
+{
+	if (sg_table->table_vaddr)
+		vunmap(sg_table->table_vaddr);
+	tmc_pages_free(&sg_table->table_pages, sg_table->dev, DMA_TO_DEVICE);
+}
+
+static void tmc_free_data_pages(struct tmc_sg_table *sg_table)
+{
+	if (sg_table->data_vaddr)
+		vunmap(sg_table->data_vaddr);
+	tmc_pages_free(&sg_table->data_pages, sg_table->dev, DMA_FROM_DEVICE);
+}
+
+void tmc_free_sg_table(struct tmc_sg_table *sg_table)
+{
+	tmc_free_table_pages(sg_table);
+	tmc_free_data_pages(sg_table);
+}
+
+/*
+ * Alloc pages for the table. Since this will be used by the device,
+ * allocate the pages closer to the device (i.e, dev_to_node(dev)
+ * rather than the CPU node).
+ */
+static int tmc_alloc_table_pages(struct tmc_sg_table *sg_table)
+{
+	int rc;
+	struct tmc_pages *table_pages = &sg_table->table_pages;
+
+	rc = tmc_pages_alloc(table_pages, sg_table->dev,
+			     dev_to_node(sg_table->dev),
+			     DMA_TO_DEVICE, NULL);
+	if (rc)
+		return rc;
+	sg_table->table_vaddr = vmap(table_pages->pages,
+				     table_pages->nr_pages,
+				     VM_MAP,
+				     PAGE_KERNEL);
+	if (!sg_table->table_vaddr)
+		rc = -ENOMEM;
+	else
+		sg_table->table_daddr = table_pages->daddrs[0];
+	return rc;
+}
+
+static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages)
+{
+	int rc;
+
+	rc = tmc_pages_alloc(&sg_table->data_pages,
+			     sg_table->dev, sg_table->node,
+			     DMA_FROM_DEVICE, pages);
+	if (!rc) {
+		sg_table->data_vaddr = vmap(sg_table->data_pages.pages,
+					   sg_table->data_pages.nr_pages,
+					   VM_MAP,
+					   PAGE_KERNEL);
+		if (!sg_table->data_vaddr)
+			rc = -ENOMEM;
+	}
+	return rc;
+}
+
+/*
+ * tmc_alloc_sg_table: Allocate and setup dma pages for the TMC SG table
+ * and data buffers. TMC writes to the data buffers and reads from the SG
+ * Table pages.
+ *
+ * @dev		- Device to which page should be DMA mapped.
+ * @node	- Numa node for mem allocations
+ * @nr_tpages	- Number of pages for the table entries.
+ * @nr_dpages	- Number of pages for Data buffer.
+ * @pages	- Optional list of virtual address of pages.
+ */
+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages)
+{
+	long rc;
+	struct tmc_sg_table *sg_table;
+
+	sg_table = kzalloc(sizeof(*sg_table), GFP_KERNEL);
+	if (!sg_table)
+		return ERR_PTR(-ENOMEM);
+	sg_table->data_pages.nr_pages = nr_dpages;
+	sg_table->table_pages.nr_pages = nr_tpages;
+	sg_table->node = node;
+	sg_table->dev = dev;
+
+	rc  = tmc_alloc_data_pages(sg_table, pages);
+	if (!rc)
+		rc = tmc_alloc_table_pages(sg_table);
+	if (rc) {
+		tmc_free_sg_table(sg_table);
+		kfree(sg_table);
+		return ERR_PTR(rc);
+	}
+
+	return sg_table;
+}
+
+/*
+ * tmc_sg_table_sync_data_range: Sync the data buffer written
+ * by the device from @offset up to @size bytes.
+ */
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size)
+{
+	int i, index, start;
+	int npages = DIV_ROUND_UP(size, PAGE_SIZE);
+	struct device *dev = table->dev;
+	struct tmc_pages *data = &table->data_pages;
+
+	start = offset >> PAGE_SHIFT;
+	for (i = start; i < (start + npages); i++) {
+		index = i % data->nr_pages;
+		dma_sync_single_for_cpu(dev, data->daddrs[index],
+					PAGE_SIZE, DMA_FROM_DEVICE);
+	}
+}
+
+/* tmc_sg_table_sync_table: Sync the page table */
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table)
+{
+	int i;
+	struct device *dev = sg_table->dev;
+	struct tmc_pages *table_pages = &sg_table->table_pages;
+
+	for (i = 0; i < table_pages->nr_pages; i++)
+		dma_sync_single_for_device(dev, table_pages->daddrs[i],
+					   PAGE_SIZE, DMA_TO_DEVICE);
+}
+
+/*
+ * tmc_sg_table_get_data: Get the buffer pointer for data @offset
+ * in the SG buffer. The @bufpp is updated to point to the buffer.
+ * Returns :
+ *	the length of linear data available at @offset.
+ *	or
+ *	<= 0 if no data is available.
+ */
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+				u64 offset, size_t len, char **bufpp)
+{
+	size_t size;
+	int pg_idx = offset >> PAGE_SHIFT;
+	int pg_offset = offset & (PAGE_SIZE - 1);
+	struct tmc_pages *data_pages = &sg_table->data_pages;
+
+	size = tmc_sg_table_buf_size(sg_table);
+	if (offset >= size)
+		return -EINVAL;
+	len = (len < (size - offset)) ? len : size - offset;
+	len = (len < (PAGE_SIZE - pg_offset)) ? len : (PAGE_SIZE - pg_offset);
+	if (len > 0)
+		*bufpp = page_address(data_pages->pages[pg_idx]) + pg_offset;
+	return len;
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 6deb3afe9db8..5e49c035a1ac 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -19,6 +19,7 @@
 #define _CORESIGHT_TMC_H
 
 #include <linux/miscdevice.h>
+#include <linux/dma-mapping.h>
 
 #define TMC_RSZ			0x004
 #define TMC_STS			0x00c
@@ -171,6 +172,38 @@ struct tmc_drvdata {
 	u32			etr_caps;
 };
 
+/**
+ * struct tmc_pages - Collection of pages used for SG.
+ * @nr_pages:		Number of pages in the list.
+ * @daddrs:		Array of DMA'able page addresses from dma_map_page().
+ * @pages:		Array of the pages backing the buffer.
+ */
+struct tmc_pages {
+	int nr_pages;
+	dma_addr_t	*daddrs;
+	struct page	**pages;
+};
+
+/*
+ * struct tmc_sg_table : Generic SG table for TMC
+ * @dev:		Device for DMA allocations
+ * @table_vaddr:	Contiguous Virtual address for PageTable
+ * @data_vaddr:		Contiguous Virtual address for Data Buffer
+ * @table_daddr:	DMA address of the PageTable base
+ * @node:		Node for Page allocations
+ * @table_pages:	List of pages & dma address for Table
+ * @data_pages:		List of pages & dma address for Data
+ */
+struct tmc_sg_table {
+	struct device *dev;
+	void *table_vaddr;
+	void *data_vaddr;
+	dma_addr_t table_daddr;
+	int node;
+	struct tmc_pages table_pages;
+	struct tmc_pages data_pages;
+};
+
 /* Generic functions */
 void tmc_wait_for_tmcready(struct tmc_drvdata *drvdata);
 void tmc_flush_and_stop(struct tmc_drvdata *drvdata);
@@ -226,4 +259,15 @@ static inline bool tmc_etr_has_cap(struct tmc_drvdata *drvdata, u32 cap)
 	return !!(drvdata->etr_caps & cap);
 }
 
+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages);
+void tmc_free_sg_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size);
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+			      u64 offset, size_t len, char **bufpp);
 #endif
-- 
2.13.6


* [PATCH 05/17] coresight: Add support for TMC ETR SG unit
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (3 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 04/17] coresight: Add generic TMC sg table framework Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-20 16:25   ` Julien Thierry
  2017-11-01 20:41   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 06/17] coresight: tmc: Make ETR SG table circular Suzuki K Poulose
                   ` (12 subsequent siblings)
  17 siblings, 2 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

This patch adds support for setting up an SG table used by the
TMC ETR built-in SG unit. The TMC ETR uses 4K page sized tables
to hold pointers to the 4K data pages, with the last entry in a
table pointing to the next table, forming a chain. The 2 LSBs
determine the type of the table entry, which is one of:

 Normal - Points to a 4KB data page.
 Last   - Points to a 4KB data page, but is the last entry in the
          page table.
 Link   - Points to another 4KB table page with pointers to data.

The code takes care of handling a system page size which could
be different from 4K. So we could end up putting multiple ETR
SG tables in a single system page, and vice versa for the data pages.
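
To put some numbers on this (a worked example assuming the constants
introduced below, with sizeof(sgte_t) == 4):

  ETR_SG_PTRS_PER_PAGE            = 4096 / 4 = 1024 entries per table page
  4K system pages                 : ETR_SG_PAGES_PER_SYSPAGE = 1
  64K system pages                : ETR_SG_PAGES_PER_SYSPAGE = 16
  1MB buffer with 4K system pages : 256 data pointers and no Link
                                    entries, i.e a single table page.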

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 256 ++++++++++++++++++++++++
 1 file changed, 256 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4b9e2b276122..4424eb67a54c 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -21,6 +21,89 @@
 #include "coresight-tmc.h"
 
 /*
+ * The TMC ETR SG has a page size of 4K. The SG table contains pointers
+ * to 4KB buffers. However, the OS may use a PAGE_SIZE different from
+ * 4K (i.e, 16KB or 64KB). This implies that a single OS page could
+ * contain more than one SG buffer and more than one table.
+ *
+ * A table entry has the following format:
+ *
+ * ---Bit31------------Bit4-------Bit1-----Bit0--
+ * |     Address[39:12]    | SBZ |  Entry Type  |
+ * ----------------------------------------------
+ *
+ * Address: Bits [39:12] of a physical page address. Bits [11:0] are
+ *	    always zero.
+ *
+ * Entry type:
+ *	b00 - Reserved.
+ *	b01 - Last entry in the tables, points to 4K page buffer.
+ *	b10 - Normal entry, points to 4K page buffer.
+ *	b11 - Link. The address points to the base of next table.
+ */
+
+typedef u32 sgte_t;
+
+#define ETR_SG_PAGE_SHIFT		12
+#define ETR_SG_PAGE_SIZE		(1UL << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_PAGES_PER_SYSPAGE	(1UL << \
+					 (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))
+#define ETR_SG_PTRS_PER_PAGE		(ETR_SG_PAGE_SIZE / sizeof(sgte_t))
+#define ETR_SG_PTRS_PER_SYSPAGE		(PAGE_SIZE / sizeof(sgte_t))
+
+#define ETR_SG_ET_MASK			0x3
+#define ETR_SG_ET_LAST			0x1
+#define ETR_SG_ET_NORMAL		0x2
+#define ETR_SG_ET_LINK			0x3
+
+#define ETR_SG_ADDR_SHIFT		4
+
+#define ETR_SG_ENTRY(addr, type) \
+	(sgte_t)((((addr) >> ETR_SG_PAGE_SHIFT) << ETR_SG_ADDR_SHIFT) | \
+		 (type & ETR_SG_ET_MASK))
+
+#define ETR_SG_ADDR(entry) \
+	(((dma_addr_t)(entry) >> ETR_SG_ADDR_SHIFT) << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_ET(entry)		((entry) & ETR_SG_ET_MASK)
+
+/*
+ * struct etr_sg_table : ETR SG Table
+ * @sg_table:		Generic SG Table holding the data/table pages.
+ * @hwaddr:		hwaddress used by the TMC, which is the base
+ *			address of the table.
+ */
+struct etr_sg_table {
+	struct tmc_sg_table	*sg_table;
+	dma_addr_t		hwaddr;
+};
+
+/*
+ * tmc_etr_sg_table_entries: Total number of table entries required to map
+ * @nr_pages system pages.
+ *
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
+ * with the last entry pointing to the page containing the table
+ * entries. If we spill over to a new page for mapping 1 entry,
+ * we could as well replace the link entry of the previous page
+ * with the last entry.
+ */
+static inline unsigned long __attribute_const__
+tmc_etr_sg_table_entries(int nr_pages)
+{
+	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
+	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
+	/*
+	 * If we spill over to a new page for 1 entry, we could as well
+	 * make it the LAST entry in the previous page, skipping the Link
+	 * address.
+	 */
+	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+		nr_sglinks--;
+	return nr_sgpages + nr_sglinks;
+}
+
+/*
  * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
  * and map @phys_addr to an offset within the buffer.
  */
@@ -307,6 +390,179 @@ ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
 	return len;
 }
 
+#ifdef ETR_SG_DEBUG
+/* Map a dma address to virtual address */
+static unsigned long
+tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
+			dma_addr_t addr, bool table)
+{
+	long offset;
+	unsigned long base;
+	struct tmc_pages *tmc_pages;
+
+	if (table) {
+		tmc_pages = &sg_table->table_pages;
+		base = (unsigned long)sg_table->table_vaddr;
+	} else {
+		tmc_pages = &sg_table->data_pages;
+		base = (unsigned long)sg_table->data_vaddr;
+	}
+
+	offset = tmc_pages_get_offset(tmc_pages, addr);
+	if (offset < 0)
+		return 0;
+	return base + offset;
+}
+
+/* Dump the given sg_table */
+static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
+{
+	sgte_t *ptr;
+	int i = 0;
+	dma_addr_t addr;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+
+	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+					      etr_table->hwaddr, true);
+	while (ptr) {
+		addr = ETR_SG_ADDR(*ptr);
+		switch (ETR_SG_ET(*ptr)) {
+		case ETR_SG_ET_NORMAL:
+			pr_debug("%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
+			ptr++;
+			break;
+		case ETR_SG_ET_LINK:
+			pr_debug("%05d: *** %p\t:{L} 0x%llx ***\n",
+				 i, ptr, addr);
+			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+							      addr, true);
+			break;
+		case ETR_SG_ET_LAST:
+			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
+				 i, ptr, addr);
+			return;
+		}
+		i++;
+	}
+	pr_debug("******* End of Table *****\n");
+}
+#endif
+
+/*
+ * Populate the SG Table page table entries from table/data
+ * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
+ * So does a Table page. So we keep track of indices of the tables
+ * in each system page and move the pointers accordingly.
+ */
+#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)
+static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
+{
+	dma_addr_t paddr;
+	int i, type, nr_entries;
+	int tpidx = 0; /* index to the current system table_page */
+	int sgtidx = 0;	/* index to the sg_table within the current syspage */
+	int sgtoffset = 0; /* offset to the next entry within the sg_table */
+	int dpidx = 0; /* index to the current system data_page */
+	int spidx = 0; /* index to the SG page within the current data page */
+	sgte_t *ptr; /* pointer to the table entry to fill */
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
+	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
+
+	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
+	/*
+	 * Use the contiguous virtual address of the table to update entries.
+	 */
+	ptr = sg_table->table_vaddr;
+	/*
+	 * Fill all the entries, except the last entry to avoid special
+	 * checks within the loop.
+	 */
+	for (i = 0; i < nr_entries - 1; i++) {
+		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
+			/*
+			 * Last entry in a sg_table page is a link address to
+			 * the next table page. If this sg_table is the last
+			 * one in the system page, it links to the first
+			 * sg_table in the next system page. Otherwise, it
+			 * links to the next sg_table page within the system
+			 * page.
+			 */
+			if (sgtidx == ETR_SG_PAGES_PER_SYSPAGE - 1) {
+				paddr = table_daddrs[tpidx + 1];
+			} else {
+				paddr = table_daddrs[tpidx] +
+					(ETR_SG_PAGE_SIZE * (sgtidx + 1));
+			}
+			type = ETR_SG_ET_LINK;
+		} else {
+			/*
+			 * Update the indices of the data_pages to point to the
+			 * next sg_page in the data buffer.
+			 */
+			type = ETR_SG_ET_NORMAL;
+			paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
+			if (!INC_IDX_ROUND(spidx, ETR_SG_PAGES_PER_SYSPAGE))
+				dpidx++;
+		}
+		*ptr++ = ETR_SG_ENTRY(paddr, type);
+		/*
+		 * Move to the next table pointer, moving the table page index
+		 * if necessary
+		 */
+		if (!INC_IDX_ROUND(sgtoffset, ETR_SG_PTRS_PER_PAGE)) {
+			if (!INC_IDX_ROUND(sgtidx, ETR_SG_PAGES_PER_SYSPAGE))
+				tpidx++;
+		}
+	}
+
+	/* Set up the last entry, which is always a data pointer */
+	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
+	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
+}
+
+/*
+ * tmc_init_etr_sg_table: Allocate a TMC ETR SG table, data buffer of @size and
+ * populate the table.
+ *
+ * @dev		- Device pointer for the TMC
+ * @node	- NUMA node where the memory should be allocated
+ * @size	- Total size of the data buffer
+ * @pages	- Optional list of page virtual address
+ */
+static struct etr_sg_table __maybe_unused *
+tmc_init_etr_sg_table(struct device *dev, int node,
+		  unsigned long size, void **pages)
+{
+	int nr_entries, nr_tpages;
+	int nr_dpages = size >> PAGE_SHIFT;
+	struct tmc_sg_table *sg_table;
+	struct etr_sg_table *etr_table;
+
+	etr_table = kzalloc(sizeof(*etr_table), GFP_KERNEL);
+	if (!etr_table)
+		return ERR_PTR(-ENOMEM);
+	nr_entries = tmc_etr_sg_table_entries(nr_dpages);
+	nr_tpages = DIV_ROUND_UP(nr_entries, ETR_SG_PTRS_PER_SYSPAGE);
+
+	sg_table = tmc_alloc_sg_table(dev, node, nr_tpages, nr_dpages, pages);
+	if (IS_ERR(sg_table)) {
+		kfree(etr_table);
+		return ERR_PTR(PTR_ERR(sg_table));
+	}
+
+	etr_table->sg_table = sg_table;
+	/* TMC should use table base address for DBA */
+	etr_table->hwaddr = sg_table->table_daddr;
+	tmc_etr_sg_table_populate(etr_table);
+	/* Sync the table pages for the HW */
+	tmc_sg_table_sync_table(sg_table);
+#ifdef ETR_SG_DEBUG
+	tmc_etr_sg_table_dump(etr_table);
+#endif
+	return etr_table;
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
-- 
2.13.6


* [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (4 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 05/17] coresight: Add support for TMC ETR SG unit Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-20 17:11   ` Julien Thierry
                     ` (2 more replies)
  2017-10-19 17:15 ` [PATCH 07/17] coresight: tmc etr: Add transparent buffer management Suzuki K Poulose
                   ` (11 subsequent siblings)
  17 siblings, 3 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Make the ETR SG table a circular buffer so that we can start
at any of the SG pages and use the entire buffer for tracing.
This is achieved by:

1) Keeping an additional LINK pointer at the very end of the
SG table, i.e, after the LAST buffer entry, pointing back to
the beginning of the first table. This allows us to use
the buffer normally when we start the trace at offset 0 of
the buffer, as the LAST buffer entry hints the TMC-ETR to
automatically wrap to offset 0.

2) If we want to start at any other ETR SG page aligned offset,
we:
 a) Mark the preceding page entry as the LAST entry.
 b) Make the original LAST entry a normal entry.
 c) Use the table pointer for the "new" start offset as the
    base address of the table.
This works because the TMC doesn't mandate that the page table
base address be 4K page aligned.
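
As a small illustration (not part of the patch), consider a buffer of
four ETR SG pages P0-P3 whose pointers fit in a single table, followed
by the circular LINK added by this patch. Rotating the base to P2:

  before: [P0:NORMAL] [P1:NORMAL] [P2:NORMAL] [P3:LAST]   [LINK -> P0]
  after : [P0:NORMAL] [P1:LAST]   [P2:NORMAL] [P3:NORMAL] [LINK -> P0]

with the DBA updated to point at the table entry for P2.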

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
 1 file changed, 139 insertions(+), 20 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4424eb67a54c..c171b244e12a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -71,36 +71,41 @@ typedef u32 sgte_t;
  * @sg_table:		Generic SG Table holding the data/table pages.
  * @hwaddr:		hwaddress used by the TMC, which is the base
  *			address of the table.
+ * @nr_entries:		Total number of pointers in the table.
+ * @first_entry:	Index to the current "start" of the buffer.
+ * @last_entry:		Index to the last entry of the buffer.
  */
 struct etr_sg_table {
 	struct tmc_sg_table	*sg_table;
 	dma_addr_t		hwaddr;
+	u32			nr_entries;
+	u32			first_entry;
+	u32			last_entry;
 };
 
 /*
  * tmc_etr_sg_table_entries: Total number of table entries required to map
  * @nr_pages system pages.
  *
- * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages and
+ * an additional Link pointer for making it a Circular buffer.
  * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
  * with the last entry pointing to the page containing the table
- * entries. If we spill over to a new page for mapping 1 entry,
- * we could as well replace the link entry of the previous page
- * with the last entry.
+ * entries. If we fill the last table in full with the pointers, (i.e,
+ * nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) == 0, we don't have to allocate
+ * another table and hence skip the Link pointer. Also we could use the
+ * link entry of the last page to make it circular.
  */
 static inline unsigned long __attribute_const__
 tmc_etr_sg_table_entries(int nr_pages)
 {
 	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
 	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
-	/*
-	 * If we spill over to a new page for 1 entry, we could as well
-	 * make it the LAST entry in the previous page, skipping the Link
-	 * address.
-	 */
-	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+
+	if (nr_sglinks && !(nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1)))
 		nr_sglinks--;
-	return nr_sgpages + nr_sglinks;
+	/* Add an entry for the circular link */
+	return nr_sgpages + nr_sglinks + 1;
 }
 
 /*
@@ -417,14 +422,21 @@ tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
 /* Dump the given sg_table */
 static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 {
-	sgte_t *ptr;
+	sgte_t *ptr, *start;
 	int i = 0;
 	dma_addr_t addr;
 	struct tmc_sg_table *sg_table = etr_table->sg_table;
 
-	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+	start = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
 					      etr_table->hwaddr, true);
-	while (ptr) {
+	if (!start) {
+		pr_debug("ERROR: Failed to translate table base: 0x%llx\n",
+					 etr_table->hwaddr);
+		return;
+	}
+
+	ptr = start;
+	do {
 		addr = ETR_SG_ADDR(*ptr);
 		switch (ETR_SG_ET(*ptr)) {
 		case ETR_SG_ET_NORMAL:
@@ -436,14 +448,17 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 				 i, ptr, addr);
 			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
 							      addr, true);
+			if (!ptr)
+				pr_debug("ERROR: Bad Link 0x%llx\n", addr);
 			break;
 		case ETR_SG_ET_LAST:
 			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
 				 i, ptr, addr);
-			return;
+			ptr++;
+			break;
 		}
 		i++;
-	}
+	} while (ptr && ptr != start);
 	pr_debug("******* End of Table *****\n");
 }
 #endif
@@ -458,7 +473,7 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 {
 	dma_addr_t paddr;
-	int i, type, nr_entries;
+	int i, type;
 	int tpidx = 0; /* index to the current system table_page */
 	int sgtidx = 0;	/* index to the sg_table within the current syspage */
 	int sgtoffset = 0; /* offset to the next entry within the sg_table */
@@ -469,16 +484,16 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
 	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
 
-	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
 	/*
 	 * Use the contiguous virtual address of the table to update entries.
 	 */
 	ptr = sg_table->table_vaddr;
 	/*
-	 * Fill all the entries, except the last entry to avoid special
+	 * Fill all the entries, except the last two entries (i.e, the last
+	 * buffer and the circular link back to the base) to avoid special
 	 * checks within the loop.
 	 */
-	for (i = 0; i < nr_entries - 1; i++) {
+	for (i = 0; i < etr_table->nr_entries - 2; i++) {
 		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
 			/*
 			 * Last entry in a sg_table page is a link address to
@@ -519,6 +534,107 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 	/* Set up the last entry, which is always a data pointer */
 	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
 	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
+	/* followed by a circular link, back to the start of the table */
+	*ptr++ = ETR_SG_ENTRY(sg_table->table_daddr, ETR_SG_ET_LINK);
+}
+
+/*
+ * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
+ * to the index of the page table "entry". Data pointers always have
+ * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
+ * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE -1) entries.
+ */
+static inline u32
+tmc_etr_sg_offset_to_table_index(u64 offset)
+{
+	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
+
+	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
+}
+
+/*
+ * tmc_etr_sg_update_type: Update the type of a given entry in the
+ * table to the requested entry. This is only used for data buffers
+ * to toggle the "NORMAL" vs "LAST" buffer entries.
+ */
+static inline void tmc_etr_sg_update_type(sgte_t *entry, u32 type)
+{
+	WARN_ON(ETR_SG_ET(*entry) == ETR_SG_ET_LINK);
+	WARN_ON(!ETR_SG_ET(*entry));
+	*entry &= ~ETR_SG_ET_MASK;
+	*entry |= type;
+}
+
+/*
+ * tmc_etr_sg_table_index_to_daddr: Return the hardware address to the table
+ * entry @index. Use this address to let the table begin @index.
+ */
+static inline dma_addr_t
+tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
+{
+	u32 sys_page_idx = index / ETR_SG_PTRS_PER_SYSPAGE;
+	u32 sys_page_offset = index % ETR_SG_PTRS_PER_SYSPAGE;
+	sgte_t *ptr;
+
+	ptr = (sgte_t *)sg_table->table_pages.daddrs[sys_page_idx];
+	return (dma_addr_t)&ptr[sys_page_offset];
+}
+
+/*
+ * tmc_etr_sg_table_rotate : Rotate the SG circular buffer, moving
+ * the "base" to a requested offset. We do so by :
+ *
+ * 1) Reset the current LAST buffer.
+ * 2) Mark the "previous" buffer in the table to the "base" as LAST.
+ * 3) Update the hwaddr to point to the table pointer for the buffer
+ *    which starts at "base".
+ */
+static int __maybe_unused
+tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
+{
+	u32 last_entry, first_entry;
+	u64 last_offset;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+	sgte_t *table_ptr = sg_table->table_vaddr;
+	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
+
+	/* Offset should always be SG PAGE_SIZE aligned */
+	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
+		pr_debug("unaligned base offset %llx\n", base_offset);
+		return -EINVAL;
+	}
+	/* Make sure the offset is within the range */
+	if (base_offset < 0 || base_offset > buf_size) {
+		base_offset = (base_offset + buf_size) % buf_size;
+		pr_debug("Resetting offset to %llx\n", base_offset);
+	}
+	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
+	if (first_entry == etr_table->first_entry) {
+		pr_debug("Head is already at %llx, skipping\n", base_offset);
+		return 0;
+	}
+
+	/* Last entry should be the previous one to the new "base" */
+	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
+	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
+
+	/* Reset the current LAST page to NORMAL and the new last page to LAST */
+	tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
+				 ETR_SG_ET_NORMAL);
+	tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
+	etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
+							    first_entry);
+	etr_table->first_entry = first_entry;
+	etr_table->last_entry = last_entry;
+	pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
+			base_offset, last_offset, first_entry, last_entry,
+			etr_table->hwaddr);
+	/* Sync the table for device */
+	tmc_sg_table_sync_table(sg_table);
+#ifdef ETR_SG_DEBUG
+	tmc_etr_sg_table_dump(etr_table);
+#endif
+	return 0;
 }
 
 /*
@@ -552,6 +668,9 @@ tmc_init_etr_sg_table(struct device *dev, int node,
 	}
 
 	etr_table->sg_table = sg_table;
+	etr_table->nr_entries = nr_entries;
+	etr_table->first_entry = 0;
+	etr_table->last_entry = nr_entries - 2;
 	/* TMC should use table base address for DBA */
 	etr_table->hwaddr = sg_table->table_daddr;
 	tmc_etr_sg_table_populate(etr_table);
-- 
2.13.6


* [PATCH 07/17] coresight: tmc etr: Add transparent buffer management
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (5 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 06/17] coresight: tmc: Make ETR SG table circular Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-02 17:48   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 08/17] coresight: tmc: Add configuration support for trace buffer size Suzuki K Poulose
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

At the moment we always use contiguous memory for TMC ETR tracing
when used from sysfs. The size of the buffer is fixed at boot time
and can only be changed by modifying the DT. With the introduction
of SG support we can support really large buffers in that mode.
This patch abstracts the buffer used for the ETR to switch between a
contiguous buffer or an SG table depending on the availability of
the memory.

This also enables the sysfs mode to use the ETR in SG mode, depending
on the configured trace buffer size. Also, since the ETR will use the
new infrastructure to manage the buffer, we can get rid of some
of the members in the tmc_drvdata and clean up the fields a bit.
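
A rough sketch of how a backend could be chosen through the
etr_buf_operations introduced below; the etr_buf_ops[] array, the
example_alloc_etr_buf() helper and the "flat first, then SG" policy
shown here are illustrative assumptions, not the code added by this
patch:

static const struct etr_buf_operations *etr_buf_ops[] = {
	[ETR_MODE_FLAT]		= &etr_flat_buf_ops,
	[ETR_MODE_ETR_SG]	= &etr_sg_buf_ops,
};

/* Hypothetical allocator, assuming ETR_MODE_* are small enum values */
static int example_alloc_etr_buf(struct tmc_drvdata *drvdata,
				 struct etr_buf *etr_buf,
				 int node, void **pages)
{
	int rc;

	/* Prefer a physically contiguous buffer if the DMA API has one */
	rc = etr_buf_ops[ETR_MODE_FLAT]->alloc(drvdata, etr_buf, node, pages);
	if (rc)
		/* Otherwise fall back to the ETR scatter-gather unit */
		rc = etr_buf_ops[ETR_MODE_ETR_SG]->alloc(drvdata, etr_buf,
							 node, pages);
	return rc;
}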

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++++++++++++++++++-----
 drivers/hwtracing/coresight/coresight-tmc.h     |  60 +++-
 2 files changed, 403 insertions(+), 90 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index c171b244e12a..9e41eeaa5284 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -16,6 +16,7 @@
  */
 
 #include <linux/coresight.h>
+#include <linux/iommu.h>
 #include <linux/slab.h>
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
@@ -646,7 +647,7 @@ tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
  * @size	- Total size of the data buffer
  * @pages	- Optional list of page virtual address
  */
-static struct etr_sg_table __maybe_unused *
+static struct etr_sg_table *
 tmc_init_etr_sg_table(struct device *dev, int node,
 		  unsigned long size, void **pages)
 {
@@ -682,19 +683,298 @@ tmc_init_etr_sg_table(struct device *dev, int node,
 	return etr_table;
 }
 
+/*
+ * tmc_etr_alloc_flat_buf: Allocate a contiguous DMA buffer.
+ * We keep the tmc_drvdata in the @private field to retrieve the
+ * device information, while the DMA address and virtual address are
+ * stored already in @hwaddr and @vaddr respectively, which never changes.
+ */
+static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
+				  struct etr_buf *etr_buf, int node,
+				  void **pages)
+{
+	dma_addr_t paddr;
+	void *vaddr = dma_alloc_coherent(drvdata->dev, etr_buf->size,
+					   &paddr, GFP_KERNEL);
+	if (!vaddr)
+		return -ENOMEM;
+	etr_buf->vaddr = vaddr;
+	etr_buf->hwaddr = paddr;
+	etr_buf->mode = ETR_MODE_FLAT;
+	etr_buf->private = drvdata;
+	return 0;
+}
+
+static void tmc_etr_free_flat_buf(struct etr_buf *etr_buf)
+{
+	struct tmc_drvdata *drvdata = etr_buf->private;
+
+	if (etr_buf->hwaddr)
+		dma_free_coherent(drvdata->dev, etr_buf->size,
+					etr_buf->vaddr, etr_buf->hwaddr);
+}
+
+static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
+{
+	/*
+	 * Adjust the buffer to point to the beginning of the trace data
+	 * and update the available trace data.
+	 */
+	etr_buf->offset = rrp - etr_buf->hwaddr;
+	if (etr_buf->full)
+		etr_buf->len = etr_buf->size;
+	else
+		etr_buf->len = rwp - rrp;
+}
+
+static ssize_t tmc_etr_get_data_flat_buf(struct etr_buf *etr_buf,
+					 u64 offset, size_t len, char **bufpp)
+{
+	/*
+	 * tmc_etr_buf_get_data already adjusts the length to handle
+	 * buffer wrapping around.
+	 */
+	*bufpp = (char *)((unsigned long)etr_buf->vaddr + offset);
+	return len;
+}
+
+static const struct etr_buf_operations etr_flat_buf_ops = {
+	.alloc = tmc_etr_alloc_flat_buf,
+	.free = tmc_etr_free_flat_buf,
+	.sync = tmc_etr_sync_flat_buf,
+	.get_data = tmc_etr_get_data_flat_buf,
+};
+
+/*
+ * tmc_etr_alloc_sg_buf: Allocate an ETR SG table for @etr_buf and set up
+ * the buffer parameters appropriately.
+ */
+static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata,
+				struct etr_buf *etr_buf, int node,
+				void **pages)
+{
+	struct etr_sg_table *etr_table;
+
+	etr_table = tmc_init_etr_sg_table(drvdata->dev, node,
+					  etr_buf->size, pages);
+	if (IS_ERR(etr_table))
+		return -ENOMEM;
+	etr_buf->vaddr = tmc_sg_table_data_vaddr(etr_table->sg_table);
+	etr_buf->hwaddr = etr_table->hwaddr;
+	etr_buf->mode = ETR_MODE_ETR_SG;
+	etr_buf->private = etr_table;
+	return 0;
+}
+
+static void tmc_etr_free_sg_buf(struct etr_buf *etr_buf)
+{
+	struct etr_sg_table *etr_table = etr_buf->private;
+
+	if (etr_table) {
+		tmc_free_sg_table(etr_table->sg_table);
+		kfree(etr_table);
+	}
+}
+
+static ssize_t tmc_etr_get_data_sg_buf(struct etr_buf *etr_buf, u64 offset,
+				       size_t len, char **bufpp)
+{
+	struct etr_sg_table *etr_table = etr_buf->private;
+
+	return tmc_sg_table_get_data(etr_table->sg_table, offset, len, bufpp);
+}
+
+static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
+{
+	long r_offset, w_offset;
+	struct etr_sg_table *etr_table = etr_buf->private;
+	struct tmc_sg_table *table = etr_table->sg_table;
+
+	r_offset = tmc_sg_get_data_page_offset(table, rrp);
+	if (r_offset < 0) {
+		dev_warn(table->dev, "Unable to map RRP %llx to offset\n",
+				rrp);
+		etr_buf->len = 0;
+		return;
+	}
+
+	w_offset = tmc_sg_get_data_page_offset(table, rwp);
+	if (w_offset < 0) {
+		dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
+				rwp);
+		etr_buf->len = 0;
+		return;
+	}
+
+	etr_buf->offset = r_offset;
+	if (etr_buf->full)
+		etr_buf->len = etr_buf->size;
+	else
+		etr_buf->len = (w_offset < r_offset) ?
+			etr_buf->size + w_offset - r_offset :
+			w_offset - r_offset;
+	tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
+}
+
+static const struct etr_buf_operations etr_sg_buf_ops = {
+	.alloc = tmc_etr_alloc_sg_buf,
+	.free = tmc_etr_free_sg_buf,
+	.sync = tmc_etr_sync_sg_buf,
+	.get_data = tmc_etr_get_data_sg_buf,
+};
+
+static const struct etr_buf_operations *etr_buf_ops[] = {
+	[ETR_MODE_FLAT] = &etr_flat_buf_ops,
+	[ETR_MODE_ETR_SG] = &etr_sg_buf_ops,
+};
+
+static inline int tmc_etr_mode_alloc_buf(int mode,
+				  struct tmc_drvdata *drvdata,
+				  struct etr_buf *etr_buf, int node,
+				  void **pages)
+{
+	int rc;
+
+	switch (mode) {
+	case ETR_MODE_FLAT:
+	case ETR_MODE_ETR_SG:
+		rc = etr_buf_ops[mode]->alloc(drvdata, etr_buf, node, pages);
+		if (!rc)
+			etr_buf->ops = etr_buf_ops[mode];
+		return rc;
+	default:
+		return -EINVAL;
+	}
+}
+
+/*
+ * tmc_alloc_etr_buf: Allocate a buffer for use by the ETR.
+ * @drvdata	: ETR device details.
+ * @size	: size of the requested buffer.
+ * @flags	: Required properties of the type of buffer.
+ * @node	: Node for memory allocations.
+ * @pages	: An optional list of pages.
+ */
+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
+					  ssize_t size, int flags,
+					  int node, void **pages)
+{
+	int rc = -ENOMEM;
+	bool has_etr_sg, has_iommu;
+	struct etr_buf *etr_buf;
+
+	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
+	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
+
+	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
+	if (!etr_buf)
+		return ERR_PTR(-ENOMEM);
+
+	etr_buf->size = size;
+
+	/*
+	 * If we have to use an existing list of pages, we cannot reliably
+	 * use contiguous DMA memory (even if we have an IOMMU). Otherwise,
+	 * we use contiguous DMA memory if:
+	 *  a) the ETR cannot use Scatter-Gather, or
+	 *  b) we have an IOMMU to back the contiguous mapping, or
+	 *  c) the requested buffer is small (< 1M).
+	 *
+	 * If the above doesn't apply or the allocation fails, fall back
+	 * to the ETR Scatter-Gather mode.
+	 *
+	 */
+	if (!pages &&
+	    (!has_etr_sg || has_iommu || size < SZ_1M))
+		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
+					    etr_buf, node, pages);
+	if (rc && has_etr_sg)
+		rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
+					    etr_buf, node, pages);
+	if (rc) {
+		kfree(etr_buf);
+		return ERR_PTR(rc);
+	}
+
+	return etr_buf;
+}
+
+static void tmc_free_etr_buf(struct etr_buf *etr_buf)
+{
+	WARN_ON(!etr_buf->ops || !etr_buf->ops->free);
+	etr_buf->ops->free(etr_buf);
+	kfree(etr_buf);
+}
+
+/*
+ * tmc_etr_buf_get_data: Get a pointer to the trace data at @offset,
+ * limited to a maximum of @len bytes.
+ * Returns: The size of the linear data available at @offset, with *bufpp
+ * updated to point to the buffer.
+ */
+static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf,
+				    u64 offset, size_t len, char **bufpp)
+{
+	/* Adjust the length to limit this transaction to end of buffer */
+	len = (len < (etr_buf->size - offset)) ? len : etr_buf->size - offset;
+
+	return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp);
+}
+
+static inline s64
+tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
+{
+	ssize_t len;
+	char *bufp;
+
+	len = tmc_etr_buf_get_data(etr_buf, offset,
+				   CORESIGHT_BARRIER_PKT_SIZE, &bufp);
+	if (WARN_ON(len < CORESIGHT_BARRIER_PKT_SIZE))
+		return -EINVAL;
+	coresight_insert_barrier_packet(bufp);
+	return offset + CORESIGHT_BARRIER_PKT_SIZE;
+}
+
+/*
+ * tmc_sync_etr_buf: Sync the trace buffer availability with drvdata.
+ * Makes sure the trace data is synced to the memory for consumption.
+ * @etr_buf->offset will hold the offset to the beginning of the trace data
+ * within the buffer, with @etr_buf->len bytes to consume. @etr_buf->vaddr
+ * will always point to the beginning of the "trace buffer".
+ */
+static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
+{
+	struct etr_buf *etr_buf = drvdata->etr_buf;
+	u64 rrp, rwp;
+	u32 status;
+
+	rrp = tmc_read_rrp(drvdata);
+	rwp = tmc_read_rwp(drvdata);
+	status = readl_relaxed(drvdata->base + TMC_STS);
+	etr_buf->full = status & TMC_STS_FULL;
+
+	WARN_ON(!etr_buf->ops || !etr_buf->ops->sync);
+
+	etr_buf->ops->sync(etr_buf, rrp, rwp);
+
+	/* Insert barrier packets at the beginning, if there was an overflow */
+	if (etr_buf->full)
+		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
+	struct etr_buf *etr_buf = drvdata->etr_buf;
 
 	/* Zero out the memory to help with debug */
-	memset(drvdata->vaddr, 0, drvdata->size);
+	memset(etr_buf->vaddr, 0, etr_buf->size);
 
 	CS_UNLOCK(drvdata->base);
 
 	/* Wait for TMCSReady bit to be set */
 	tmc_wait_for_tmcready(drvdata);
 
-	writel_relaxed(drvdata->size / 4, drvdata->base + TMC_RSZ);
+	writel_relaxed(etr_buf->size / 4, drvdata->base + TMC_RSZ);
 	writel_relaxed(TMC_MODE_CIRCULAR_BUFFER, drvdata->base + TMC_MODE);
 
 	axictl = readl_relaxed(drvdata->base + TMC_AXICTL);
@@ -707,16 +987,22 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 		axictl |= TMC_AXICTL_ARCACHE_OS;
 	}
 
+	if (etr_buf->mode == ETR_MODE_ETR_SG) {
+		if (WARN_ON(!tmc_etr_has_cap(drvdata, TMC_ETR_SG)))
+			return;
+		axictl |= TMC_AXICTL_SCT_GAT_MODE;
+	}
+
 	writel_relaxed(axictl, drvdata->base + TMC_AXICTL);
-	tmc_write_dba(drvdata, drvdata->paddr);
+	tmc_write_dba(drvdata, etr_buf->hwaddr);
 	/*
 	 * If the TMC pointers must be programmed before the session,
 	 * we have to set it properly (i.e, RRP/RWP to base address and
 	 * STS to "not full").
 	 */
 	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
-		tmc_write_rrp(drvdata, drvdata->paddr);
-		tmc_write_rwp(drvdata, drvdata->paddr);
+		tmc_write_rrp(drvdata, etr_buf->hwaddr);
+		tmc_write_rwp(drvdata, etr_buf->hwaddr);
 		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
 		writel_relaxed(sts, drvdata->base + TMC_STS);
 	}
@@ -732,62 +1018,52 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 }
 
 /*
- * Return the available trace data in the buffer @pos, with a maximum
- * limit of @len, also updating the @bufpp on where to find it.
+ * Return the available trace data in the buffer (starts at etr_buf->offset,
+ * limited by etr_buf->len) from @pos, with a maximum limit of @len,
+ * also updating the @bufpp on where to find it. Since the trace data
+ * starts at anywhere in the buffer, depending on the RRP, we adjust the
+ * @len returned to handle buffer wrapping around.
  */
 ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 			    loff_t pos, size_t len, char **bufpp)
 {
-	char *bufp = drvdata->buf + pos;
-	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
-
-	/* Adjust the len to available size @pos */
-	if (pos + len > drvdata->len)
-		len = drvdata->len - pos;
+	s64 offset;
+	struct etr_buf *etr_buf = drvdata->etr_buf;
 
+	if (pos + len > etr_buf->len)
+		len = etr_buf->len - pos;
 	if (len <= 0)
 		return len;
 
-	/*
-	 * Since we use a circular buffer, with trace data starting
-	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
-	 * wrap the current @pos to within the buffer.
-	 */
-	if (bufp >= bufend)
-		bufp -= drvdata->size;
-	/*
-	 * For simplicity, avoid copying over a wrapped around buffer.
-	 */
-	if ((bufp + len) > bufend)
-		len = bufend - bufp;
-	*bufpp = bufp;
-	return len;
+	/* Compute the offset from which we read the data */
+	offset = etr_buf->offset + pos;
+	if (offset >= etr_buf->size)
+		offset -= etr_buf->size;
+	return tmc_etr_buf_get_data(etr_buf, offset, len, bufpp);
 }
 
-static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
+static struct etr_buf *
+tmc_etr_setup_sysfs_buf(struct tmc_drvdata *drvdata)
 {
-	u32 val;
-	u64 rwp;
+	return tmc_alloc_etr_buf(drvdata, drvdata->size, 0,
+				  cpu_to_node(0), NULL);
+}
 
-	rwp = tmc_read_rwp(drvdata);
-	val = readl_relaxed(drvdata->base + TMC_STS);
+static void
+tmc_etr_free_sysfs_buf(struct etr_buf *buf)
+{
+	if (buf)
+		tmc_free_etr_buf(buf);
+}
 
-	/*
-	 * Adjust the buffer to point to the beginning of the trace data
-	 * and update the available trace data.
-	 */
-	if (val & TMC_STS_FULL) {
-		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
-		drvdata->len = drvdata->size;
-		coresight_insert_barrier_packet(drvdata->buf);
-	} else {
-		drvdata->buf = drvdata->vaddr;
-		drvdata->len = rwp - drvdata->paddr;
-	}
+static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
+{
+	tmc_sync_etr_buf(drvdata);
 }
 
 static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 {
+
 	CS_UNLOCK(drvdata->base);
 
 	tmc_flush_and_stop(drvdata);
@@ -796,7 +1072,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 	 * read before the TMC is disabled.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
-		tmc_etr_dump_hw(drvdata);
+		tmc_etr_sync_sysfs_buf(drvdata);
+
 	tmc_disable_hw(drvdata);
 
 	CS_LOCK(drvdata->base);
@@ -807,34 +1084,31 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	int ret = 0;
 	bool used = false;
 	unsigned long flags;
-	void __iomem *vaddr = NULL;
-	dma_addr_t paddr;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_buf *new_buf = NULL, *free_buf = NULL;
 
 
 	/*
-	 * If we don't have a buffer release the lock and allocate memory.
-	 * Otherwise keep the lock and move along.
+	 * If we are enabling the ETR from disabled state, we need to make
+	 * sure we have a buffer with the right size. The etr_buf is not reset
+	 * immediately after we stop the tracing in SYSFS mode as we wait for
+	 * the user to collect the data. We may be able to reuse the existing
+	 * buffer, provided the size matches. Any allocation has to be done
+	 * with the lock released.
 	 */
 	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (!drvdata->vaddr) {
+	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
-
-		/*
-		 * Contiguous  memory can't be allocated while a spinlock is
-		 * held.  As such allocate memory here and free it if a buffer
-		 * has already been allocated (from a previous session).
-		 */
-		vaddr = dma_alloc_coherent(drvdata->dev, drvdata->size,
-					   &paddr, GFP_KERNEL);
-		if (!vaddr)
-			return -ENOMEM;
+		/* Allocate memory with the spinlock released */
+		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
+		if (IS_ERR(new_buf))
+			return PTR_ERR(new_buf);
 
 		/* Let's try again */
 		spin_lock_irqsave(&drvdata->spinlock, flags);
 	}
 
-	if (drvdata->reading) {
+	if (drvdata->reading || drvdata->mode == CS_MODE_PERF) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -842,21 +1116,20 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	/*
 	 * In sysFS mode we can have multiple writers per sink.  Since this
 	 * sink is already enabled no memory is needed and the HW need not be
-	 * touched.
+	 * touched, even if the buffer size has changed.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		goto out;
 
 	/*
-	 * If drvdata::buf == NULL, use the memory allocated above.
-	 * Otherwise a buffer still exists from a previous session, so
-	 * simply use that.
+	 * If we don't have a buffer or it doesn't match the requested size,
+	 * use the memory allocated above. Otherwise reuse it.
 	 */
-	if (drvdata->buf == NULL) {
+	if (!drvdata->etr_buf ||
+	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
 		used = true;
-		drvdata->vaddr = vaddr;
-		drvdata->paddr = paddr;
-		drvdata->buf = drvdata->vaddr;
+		free_buf = drvdata->etr_buf;
+		drvdata->etr_buf = new_buf;
 	}
 
 	drvdata->mode = CS_MODE_SYSFS;
@@ -865,8 +1138,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free memory outside the spinlock if need be */
-	if (!used && vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (free_buf)
+		tmc_etr_free_sysfs_buf(free_buf);
 
 	if (!ret)
 		dev_info(drvdata->dev, "TMC-ETR enabled\n");
@@ -945,8 +1218,8 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}
 
-	/* If drvdata::buf is NULL the trace data has been read already */
-	if (drvdata->buf == NULL) {
+	/* If drvdata::etr_buf is NULL the trace data has been read already */
+	if (drvdata->etr_buf == NULL) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -965,8 +1238,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 {
 	unsigned long flags;
-	dma_addr_t paddr;
-	void __iomem *vaddr = NULL;
+	struct etr_buf *etr_buf = NULL;
 
 	/* config types are set a boot time and never change */
 	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
@@ -988,17 +1260,16 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 		 * The ETR is not tracing and the buffer was just read.
 		 * As such prepare to free the trace buffer.
 		 */
-		vaddr = drvdata->vaddr;
-		paddr = drvdata->paddr;
-		drvdata->buf = drvdata->vaddr = NULL;
+		etr_buf = drvdata->etr_buf;
+		drvdata->etr_buf = NULL;
 	}
 
 	drvdata->reading = false;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free allocated memory out side of the spinlock */
-	if (vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (etr_buf)
+		tmc_free_etr_buf(etr_buf);
 
 	return 0;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 5e49c035a1ac..50ebc17c4645 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -55,6 +55,8 @@
 #define TMC_STS_TMCREADY_BIT	2
 #define TMC_STS_FULL		BIT(0)
 #define TMC_STS_TRIGGERED	BIT(1)
+#define TMC_STS_MEMERR		BIT(5)
+
 /*
  * TMC_AXICTL - 0x110
  *
@@ -134,6 +136,37 @@ enum tmc_mem_intf_width {
 #define CORESIGHT_SOC_600_ETR_CAPS	\
 	(TMC_ETR_SAVE_RESTORE | TMC_ETR_AXI_ARCACHE)
 
+enum etr_mode {
+	ETR_MODE_FLAT,		/* Uses contiguous flat buffer */
+	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
+};
+
+struct etr_buf_operations;
+
+/**
+ * struct etr_buf - Details of the buffer used by ETR
+ * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
+ * @full	: Trace data overflow
+ * @size	: Size of the buffer.
+ * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
+ * @vaddr	: Virtual address of the buffer used for trace.
+ * @offset	: Offset of the trace data in the buffer for consumption.
+ * @len		: Available trace data in the buffer (may wrap around to the beginning).
+ * @ops		: ETR buffer operations for the mode.
+ * @private	: Backend specific information for the buf
+ */
+struct etr_buf {
+	enum etr_mode			mode;
+	bool				full;
+	ssize_t				size;
+	dma_addr_t			hwaddr;
+	void				*vaddr;
+	unsigned long			offset;
+	u64				len;
+	const struct etr_buf_operations	*ops;
+	void				*private;
+};
+
 /**
  * struct tmc_drvdata - specifics associated to an TMC component
  * @base:	memory mapped base address for this component.
@@ -141,11 +174,10 @@ enum tmc_mem_intf_width {
  * @csdev:	component vitals needed by the framework.
  * @miscdev:	specifics to handle "/dev/xyz.tmc" entry.
  * @spinlock:	only one at a time pls.
- * @buf:	area of memory where trace data get sent.
- * @paddr:	DMA start location in RAM.
- * @vaddr:	virtual representation of @paddr.
- * @size:	trace buffer size.
- * @len:	size of the available trace.
+ * @buf:	Snapshot of the trace data for ETF/ETB.
+ * @etr_buf:	details of buffer used in TMC-ETR
+ * @len:	size of the available trace for ETF/ETB.
+ * @size:	trace buffer size for this TMC (common for all modes).
  * @mode:	how this TMC is being used.
  * @config_type: TMC variant, must be of type @tmc_config_type.
  * @memwidth:	width of the memory interface databus, in bytes.
@@ -160,11 +192,12 @@ struct tmc_drvdata {
 	struct miscdevice	miscdev;
 	spinlock_t		spinlock;
 	bool			reading;
-	char			*buf;
-	dma_addr_t		paddr;
-	void __iomem		*vaddr;
-	u32			size;
+	union {
+		char		*buf;		/* TMC ETB */
+		struct etr_buf	*etr_buf;	/* TMC ETR */
+	};
 	u32			len;
+	u32			size;
 	u32			mode;
 	enum tmc_config_type	config_type;
 	enum tmc_mem_intf_width	memwidth;
@@ -172,6 +205,15 @@ struct tmc_drvdata {
 	u32			etr_caps;
 };
 
+struct etr_buf_operations {
+	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
+			int node, void **pages);
+	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
+	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
+				char **bufpp);
+	void (*free)(struct etr_buf *etr_buf);
+};
+
 /**
  * struct tmc_pages - Collection of pages used for SG.
  * @nr_pages:		Number of pages in the list.
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 08/17] coresight: tmc: Add configuration support for trace buffer size
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (6 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 07/17] coresight: tmc etr: Add transparent buffer management Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-02 19:26   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 09/17] coresight: Convert driver messages to dev_dbg Suzuki K Poulose
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Now that we can dynamically switch between a contiguous memory buffer
and an SG table depending on the trace buffer size, provide support
for selecting an appropriate buffer size via sysfs.
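
For example (illustrative only; the device name is platform specific),
writing a page-aligned value such as 0x1000000 to
/sys/bus/coresight/devices/<name>.tmc/buffer_size before enabling the
sink selects a 16MB trace buffer for the next sysfs session.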

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 .../ABI/testing/sysfs-bus-coresight-devices-tmc    |  8 ++++++
 drivers/hwtracing/coresight/coresight-tmc.c        | 32 ++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
index 4fe677ed1305..3675c380caf8 100644
--- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
@@ -83,3 +83,11 @@ KernelVersion:	4.7
 Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
 Description:	(R) Indicates the capabilities of the Coresight TMC.
 		The value is read directly from the DEVID register, 0xFC8,
+
+What:		/sys/bus/coresight/devices/<memory_map>.tmc/buffer_size
+Date:		September 2017
+KernelVersion:	4.15
+Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
+Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
+		mode. Writable only for TMC-ETR configurations. The value
+		should be aligned to the kernel page size.
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index c7201e40d737..2349b1805694 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -283,8 +283,40 @@ static ssize_t trigger_cntr_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(trigger_cntr);
 
+static ssize_t buffer_size_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	return sprintf(buf, "%#x\n", drvdata->size);
+}
+
+static ssize_t buffer_size_store(struct device *dev,
+			     struct device_attribute *attr,
+			     const char *buf, size_t size)
+{
+	int ret;
+	unsigned long val;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
+		return -EPERM;
+
+	ret = kstrtoul(buf, 0, &val);
+	if (ret)
+		return ret;
+	/* The buffer size should be page aligned */
+	if (val & (PAGE_SIZE - 1))
+		return -EINVAL;
+	drvdata->size = val;
+	return size;
+}
+
+static DEVICE_ATTR_RW(buffer_size);
+
 static struct attribute *coresight_tmc_attrs[] = {
 	&dev_attr_trigger_cntr.attr,
+	&dev_attr_buffer_size.attr,
 	NULL,
 };
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 09/17] coresight: Convert driver messages to dev_dbg
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (7 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 08/17] coresight: tmc: Add configuration support for trace buffer size Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 10/17] coresight: etr: Track if the device is coherent Suzuki K Poulose
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Convert component enable/disable messages from dev_info to dev_dbg.
This is required to prevent LOCKDEP splats when operating in perf
mode, where we could be called with locks held to enable a coresight
path. If someone really wants to see the messages, they can always
enable them at runtime via dynamic_debug.
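
For instance (one possible dynamic_debug query; the exact match
expression is up to the user), the messages can be re-enabled at
runtime by writing "file coresight-tmc* +p" to
/sys/kernel/debug/dynamic_debug/control on a kernel built with
CONFIG_DYNAMIC_DEBUG.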

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-dynamic-replicator.c | 4 ++--
 drivers/hwtracing/coresight/coresight-etb10.c              | 6 +++---
 drivers/hwtracing/coresight/coresight-etm3x.c              | 4 ++--
 drivers/hwtracing/coresight/coresight-etm4x.c              | 4 ++--
 drivers/hwtracing/coresight/coresight-funnel.c             | 4 ++--
 drivers/hwtracing/coresight/coresight-replicator.c         | 4 ++--
 drivers/hwtracing/coresight/coresight-stm.c                | 4 ++--
 drivers/hwtracing/coresight/coresight-tmc-etf.c            | 8 ++++----
 drivers/hwtracing/coresight/coresight-tmc-etr.c            | 4 ++--
 drivers/hwtracing/coresight/coresight-tmc.c                | 4 ++--
 drivers/hwtracing/coresight/coresight-tpiu.c               | 4 ++--
 11 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-dynamic-replicator.c b/drivers/hwtracing/coresight/coresight-dynamic-replicator.c
index accc2056f7c6..49efa9d90367 100644
--- a/drivers/hwtracing/coresight/coresight-dynamic-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-dynamic-replicator.c
@@ -64,7 +64,7 @@ static int replicator_enable(struct coresight_device *csdev, int inport,
 
 	CS_LOCK(drvdata->base);
 
-	dev_info(drvdata->dev, "REPLICATOR enabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR enabled\n");
 	return 0;
 }
 
@@ -83,7 +83,7 @@ static void replicator_disable(struct coresight_device *csdev, int inport,
 
 	CS_LOCK(drvdata->base);
 
-	dev_info(drvdata->dev, "REPLICATOR disabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR disabled\n");
 }
 
 static const struct coresight_ops_link replicator_link_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index d7164ab8e229..757f556975f7 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -164,7 +164,7 @@ static int etb_enable(struct coresight_device *csdev, u32 mode)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 out:
-	dev_info(drvdata->dev, "ETB enabled\n");
+	dev_dbg(drvdata->dev, "ETB enabled\n");
 	return 0;
 }
 
@@ -270,7 +270,7 @@ static void etb_disable(struct coresight_device *csdev)
 
 	local_set(&drvdata->mode, CS_MODE_DISABLED);
 
-	dev_info(drvdata->dev, "ETB disabled\n");
+	dev_dbg(drvdata->dev, "ETB disabled\n");
 }
 
 static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu,
@@ -513,7 +513,7 @@ static void etb_dump(struct etb_drvdata *drvdata)
 	}
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "ETB dumped\n");
+	dev_dbg(drvdata->dev, "ETB dumped\n");
 }
 
 static int etb_open(struct inode *inode, struct file *file)
diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c b/drivers/hwtracing/coresight/coresight-etm3x.c
index e5b1ec57dbde..aa8a2b076ad4 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x.c
@@ -510,7 +510,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev)
 	drvdata->sticky_enable = true;
 	spin_unlock(&drvdata->spinlock);
 
-	dev_info(drvdata->dev, "ETM tracing enabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing enabled\n");
 	return 0;
 
 err:
@@ -613,7 +613,7 @@ static void etm_disable_sysfs(struct coresight_device *csdev)
 	spin_unlock(&drvdata->spinlock);
 	cpus_read_unlock();
 
-	dev_info(drvdata->dev, "ETM tracing disabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing disabled\n");
 }
 
 static void etm_disable(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index e84d80b008fc..c9c73c2f7fd8 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -274,7 +274,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev)
 	drvdata->sticky_enable = true;
 	spin_unlock(&drvdata->spinlock);
 
-	dev_info(drvdata->dev, "ETM tracing enabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing enabled\n");
 	return 0;
 
 err:
@@ -387,7 +387,7 @@ static void etm4_disable_sysfs(struct coresight_device *csdev)
 	spin_unlock(&drvdata->spinlock);
 	cpus_read_unlock();
 
-	dev_info(drvdata->dev, "ETM tracing disabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing disabled\n");
 }
 
 static void etm4_disable(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c
index 77642e0e955b..afdf4807c2dc 100644
--- a/drivers/hwtracing/coresight/coresight-funnel.c
+++ b/drivers/hwtracing/coresight/coresight-funnel.c
@@ -72,7 +72,7 @@ static int funnel_enable(struct coresight_device *csdev, int inport,
 
 	funnel_enable_hw(drvdata, inport);
 
-	dev_info(drvdata->dev, "FUNNEL inport %d enabled\n", inport);
+	dev_dbg(drvdata->dev, "FUNNEL inport %d enabled\n", inport);
 	return 0;
 }
 
@@ -96,7 +96,7 @@ static void funnel_disable(struct coresight_device *csdev, int inport,
 
 	funnel_disable_hw(drvdata, inport);
 
-	dev_info(drvdata->dev, "FUNNEL inport %d disabled\n", inport);
+	dev_dbg(drvdata->dev, "FUNNEL inport %d disabled\n", inport);
 }
 
 static const struct coresight_ops_link funnel_link_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
index 3756e71cb8f5..4f7781203fd4 100644
--- a/drivers/hwtracing/coresight/coresight-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-replicator.c
@@ -42,7 +42,7 @@ static int replicator_enable(struct coresight_device *csdev, int inport,
 {
 	struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
-	dev_info(drvdata->dev, "REPLICATOR enabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR enabled\n");
 	return 0;
 }
 
@@ -51,7 +51,7 @@ static void replicator_disable(struct coresight_device *csdev, int inport,
 {
 	struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
-	dev_info(drvdata->dev, "REPLICATOR disabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR disabled\n");
 }
 
 static const struct coresight_ops_link replicator_link_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c
index 92a780a6df1d..696455891ec4 100644
--- a/drivers/hwtracing/coresight/coresight-stm.c
+++ b/drivers/hwtracing/coresight/coresight-stm.c
@@ -218,7 +218,7 @@ static int stm_enable(struct coresight_device *csdev,
 	stm_enable_hw(drvdata);
 	spin_unlock(&drvdata->spinlock);
 
-	dev_info(drvdata->dev, "STM tracing enabled\n");
+	dev_dbg(drvdata->dev, "STM tracing enabled\n");
 	return 0;
 }
 
@@ -281,7 +281,7 @@ static void stm_disable(struct coresight_device *csdev,
 		pm_runtime_put(drvdata->dev);
 
 		local_set(&drvdata->mode, CS_MODE_DISABLED);
-		dev_info(drvdata->dev, "STM tracing disabled\n");
+		dev_dbg(drvdata->dev, "STM tracing disabled\n");
 	}
 }
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index d89bfb3042a2..aa4e8f03ef49 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -242,7 +242,7 @@ static int tmc_enable_etf_sink(struct coresight_device *csdev, u32 mode)
 	if (ret)
 		return ret;
 
-	dev_info(drvdata->dev, "TMC-ETB/ETF enabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETB/ETF enabled\n");
 	return 0;
 }
 
@@ -265,7 +265,7 @@ static void tmc_disable_etf_sink(struct coresight_device *csdev)
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETB/ETF disabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETB/ETF disabled\n");
 }
 
 static int tmc_enable_etf_link(struct coresight_device *csdev,
@@ -284,7 +284,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev,
 	drvdata->mode = CS_MODE_SYSFS;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETF enabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETF enabled\n");
 	return 0;
 }
 
@@ -304,7 +304,7 @@ static void tmc_disable_etf_link(struct coresight_device *csdev,
 	drvdata->mode = CS_MODE_DISABLED;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETF disabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETF disabled\n");
 }
 
 static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, int cpu,
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 9e41eeaa5284..f12b7c5f68b2 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1142,7 +1142,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 		tmc_etr_free_sysfs_buf(free_buf);
 
 	if (!ret)
-		dev_info(drvdata->dev, "TMC-ETR enabled\n");
+		dev_dbg(drvdata->dev, "TMC-ETR enabled\n");
 
 	return ret;
 }
@@ -1185,7 +1185,7 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETR disabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETR disabled\n");
 }
 
 static const struct coresight_ops_sink tmc_etr_sink_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 2349b1805694..4939333cc6c7 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -88,7 +88,7 @@ static int tmc_read_prepare(struct tmc_drvdata *drvdata)
 	}
 
 	if (!ret)
-		dev_info(drvdata->dev, "TMC read start\n");
+		dev_dbg(drvdata->dev, "TMC read start\n");
 
 	return ret;
 }
@@ -110,7 +110,7 @@ static int tmc_read_unprepare(struct tmc_drvdata *drvdata)
 	}
 
 	if (!ret)
-		dev_info(drvdata->dev, "TMC read end\n");
+		dev_dbg(drvdata->dev, "TMC read end\n");
 
 	return ret;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c
index d7a3e453016d..7b105001dc32 100644
--- a/drivers/hwtracing/coresight/coresight-tpiu.c
+++ b/drivers/hwtracing/coresight/coresight-tpiu.c
@@ -77,7 +77,7 @@ static int tpiu_enable(struct coresight_device *csdev, u32 mode)
 
 	tpiu_enable_hw(drvdata);
 
-	dev_info(drvdata->dev, "TPIU enabled\n");
+	dev_dbg(drvdata->dev, "TPIU enabled\n");
 	return 0;
 }
 
@@ -99,7 +99,7 @@ static void tpiu_disable(struct coresight_device *csdev)
 
 	tpiu_disable_hw(drvdata);
 
-	dev_info(drvdata->dev, "TPIU disabled\n");
+	dev_dbg(drvdata->dev, "TPIU disabled\n");
 }
 
 static const struct coresight_ops_sink tpiu_sink_ops = {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 10/17] coresight: etr: Track if the device is coherent
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (8 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 09/17] coresight: Convert driver messages to dev_dbg Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-02 19:40   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers Suzuki K Poulose
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Track whether the ETR is dma-coherent. This will be useful in
deciding whether we should use software buffering for perf.
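
A minimal sketch of how this capability is meant to be consumed by the
perf support added later in the series (the check is illustrative;
use_sw_buffering() below is a made-up placeholder, not a function from
this series):

	/* Illustrative: hand the perf ring buffer pages directly to the
	 * ETR only when the device snoops the CPU caches; otherwise fall
	 * back to software buffering and copy the trace data out.
	 * use_sw_buffering() is hypothetical.
	 */
	if (!tmc_etr_has_cap(drvdata, TMC_ETR_COHERENT))
		return use_sw_buffering(drvdata);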

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc.c | 5 ++++-
 drivers/hwtracing/coresight/coresight-tmc.h | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 4939333cc6c7..5a8c41130f96 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -347,6 +347,9 @@ static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
 	if (!(devid & TMC_DEVID_NOSCAT))
 		tmc_etr_set_cap(drvdata, TMC_ETR_SG);
 
+	if (device_get_dma_attr(drvdata->dev) == DEV_DMA_COHERENT)
+		tmc_etr_set_cap(drvdata, TMC_ETR_COHERENT);
+
 	/* Check if the AXI address width is available */
 	if (devid & TMC_DEVID_AXIAW_VALID)
 		dma_mask = ((devid >> TMC_DEVID_AXIAW_SHIFT) &
@@ -397,7 +400,7 @@ static int tmc_probe(struct amba_device *adev, const struct amba_id *id)
 	if (!drvdata)
 		goto out;
 
-	drvdata->dev = &adev->dev;
+	drvdata->dev = dev;
 	dev_set_drvdata(dev, drvdata);
 
 	/* Validity for the resource is already checked by the AMBA core */
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 50ebc17c4645..69da0b584a6b 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -131,6 +131,7 @@ enum tmc_mem_intf_width {
  * so we have to rely on PID of the IP to detect the functionality.
  */
 #define TMC_ETR_SAVE_RESTORE		(0x1U << 2)
+#define TMC_ETR_COHERENT		(0x1U << 3)
 
 /* Coresight SoC-600 TMC-ETR unadvertised capabilities */
 #define CORESIGHT_SOC_600_ETR_CAPS	\
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (9 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 10/17] coresight: etr: Track if the device is coherent Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-02 20:26   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 12/17] coresight etr: Relax collection of trace from sysfs mode Suzuki K Poulose
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Since the ETR could be driven either by SYSFS or by perf, managing
the buffers used by each of these modes becomes complicated. The ETR
driver cannot simply free the currently attached buffer without
knowing its provider (i.e., sysfs vs. perf).

To solve this issue, we provide that:
1) the driver-mode specific etr buffer is retained in the drvdata
2) the etr_buf for a session is passed in when enabling the
   hardware and is stored in drvdata->etr_buf. This reference is
   dropped (the buffer itself is not freed) as soon as the hardware
   is disabled, after the necessary sync operation.

The advantages of this are:

1) The common code path doesn't need to worry about how to dispose of
   an existing buffer if it is about to start a new session with a
   different buffer, possibly in a different mode.
2) The driver mode can control its buffers and can get access to the
   saved session even when the hardware is operating in a different
   mode. (e.g., we can still access a trace buffer from a sysfs
   session even if the ETR is now used in perf mode, without
   disrupting the current session.)

Towards this, we introduce a sysfs-specific buffer pointer which holds
the etr_buf used for the sysfs mode of operation, controlled solely by
the sysfs mode handling code.
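
A condensed sketch of the resulting sysfs enable/disable flow (taken
from the patch below and compressed for illustration; error handling
and the buffer (re)allocation step are omitted):

	/* sysfs enable: the sysfs code owns sysfs_buf and merely lends
	 * it to the hardware for the duration of the session.
	 */
	spin_lock_irqsave(&drvdata->spinlock, flags);
	drvdata->mode = CS_MODE_SYSFS;
	tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
	spin_unlock_irqrestore(&drvdata->spinlock, flags);

	/* ... tracing ... */

	/* disable: tmc_etr_disable_hw() syncs the data and clears
	 * drvdata->etr_buf, while drvdata->sysfs_buf (and the trace it
	 * holds) stays around for the user to read.
	 */
	tmc_etr_disable_hw(drvdata);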

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 59 ++++++++++++++++---------
 drivers/hwtracing/coresight/coresight-tmc.h     |  2 +
 2 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index f12b7c5f68b2..ef7498f05b34 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -961,11 +961,16 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
 		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
 }
 
-static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
+static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
+			      struct etr_buf *etr_buf)
 {
 	u32 axictl, sts;
-	struct etr_buf *etr_buf = drvdata->etr_buf;
 
+	/* Callers should provide an appropriate buffer for use */
+	if (WARN_ON(!etr_buf || drvdata->etr_buf))
+		return;
+
+	drvdata->etr_buf = etr_buf;
 	/* Zero out the memory to help with debug */
 	memset(etr_buf->vaddr, 0, etr_buf->size);
 
@@ -1023,12 +1028,15 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
  * also updating the @bufpp on where to find it. Since the trace data
 * can start anywhere in the buffer, depending on the RRP, we adjust the
  * @len returned to handle buffer wrapping around.
+ *
+ * We are protected here by drvdata->reading != 0, which ensures the
+ * sysfs_buf stays alive.
  */
 ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 			    loff_t pos, size_t len, char **bufpp)
 {
 	s64 offset;
-	struct etr_buf *etr_buf = drvdata->etr_buf;
+	struct etr_buf *etr_buf = drvdata->sysfs_buf;
 
 	if (pos + len > etr_buf->len)
 		len = etr_buf->len - pos;
@@ -1058,7 +1066,14 @@ tmc_etr_free_sysfs_buf(struct etr_buf *buf)
 
 static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
 {
-	tmc_sync_etr_buf(drvdata);
+	struct etr_buf *etr_buf = drvdata->etr_buf;
+
+	if (WARN_ON(drvdata->sysfs_buf != etr_buf)) {
+		tmc_etr_free_sysfs_buf(drvdata->sysfs_buf);
+		drvdata->sysfs_buf = NULL;
+	} else {
+		tmc_sync_etr_buf(drvdata);
+	}
 }
 
 static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
@@ -1077,6 +1092,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 	tmc_disable_hw(drvdata);
 
 	CS_LOCK(drvdata->base);
+	/* Reset the ETR buf used by hardware */
+	drvdata->etr_buf = NULL;
 }
 
 static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
@@ -1085,7 +1102,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	bool used = false;
 	unsigned long flags;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
-	struct etr_buf *new_buf = NULL, *free_buf = NULL;
+	struct etr_buf *sysfs_buf = NULL, *new_buf = NULL, *free_buf = NULL;
 
 
 	/*
@@ -1097,7 +1114,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	 * with the lock released.
 	 */
 	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
+	sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
+	if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
 		/* Allocate memory with the spinlock released */
 		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
@@ -1125,15 +1143,16 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	 * If we don't have a buffer or it doesn't match the requested size,
 	 * use the memory allocated above. Otherwise reuse it.
 	 */
-	if (!drvdata->etr_buf ||
-	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
+	sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
+	if (!sysfs_buf ||
+	    (new_buf && sysfs_buf->size != new_buf->size)) {
 		used = true;
-		free_buf = drvdata->etr_buf;
-		drvdata->etr_buf = new_buf;
+		free_buf = sysfs_buf;
+		drvdata->sysfs_buf = new_buf;
 	}
 
 	drvdata->mode = CS_MODE_SYSFS;
-	tmc_etr_enable_hw(drvdata);
+	tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
 out:
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
@@ -1218,13 +1237,13 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}
 
-	/* If drvdata::etr_buf is NULL the trace data has been read already */
-	if (drvdata->etr_buf == NULL) {
+	/* If sysfs_buf is NULL the trace data has been read already */
+	if (!drvdata->sysfs_buf) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	/* Disable the TMC if need be */
+	/* Disable the TMC if we are trying to read from a running session */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		tmc_etr_disable_hw(drvdata);
 
@@ -1238,7 +1257,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 {
 	unsigned long flags;
-	struct etr_buf *etr_buf = NULL;
+	struct etr_buf *sysfs_buf = NULL;
 
 	/* config types are set a boot time and never change */
 	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
@@ -1254,22 +1273,22 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 		 * so we don't have to explicitly clear it. Also, since the
 		 * tracer is still enabled drvdata::buf can't be NULL.
 		 */
-		tmc_etr_enable_hw(drvdata);
+		tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
 	} else {
 		/*
 		 * The ETR is not tracing and the buffer was just read.
 		 * As such prepare to free the trace buffer.
 		 */
-		etr_buf =  drvdata->etr_buf;
-		drvdata->etr_buf = NULL;
+		sysfs_buf = drvdata->sysfs_buf;
+		drvdata->sysfs_buf = NULL;
 	}
 
 	drvdata->reading = false;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free allocated memory out side of the spinlock */
-	if (etr_buf)
-		tmc_free_etr_buf(etr_buf);
+	if (sysfs_buf)
+		tmc_etr_free_sysfs_buf(sysfs_buf);
 
 	return 0;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 69da0b584a6b..14a3dec50b0f 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -185,6 +185,7 @@ struct etr_buf {
  * @trigger_cntr: amount of words to store after a trigger.
  * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
  *		device configuration register (DEVID)
+ * @sysfs_buf:	SYSFS buffer for ETR.
  */
 struct tmc_drvdata {
 	void __iomem		*base;
@@ -204,6 +205,7 @@ struct tmc_drvdata {
 	enum tmc_mem_intf_width	memwidth;
 	u32			trigger_cntr;
 	u32			etr_caps;
+	struct etr_buf		*sysfs_buf;
 };
 
 struct etr_buf_operations {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 12/17] coresight etr: Relax collection of trace from sysfs mode
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (10 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 13/17] coresight etr: Do not clean ETR trace buffer Suzuki K Poulose
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Since the ETR now uses mode-specific buffers, we can reliably
provide the trace data captured in sysfs mode, even when the ETR
is operating in PERF mode.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index ef7498f05b34..31353fc34b53 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1231,19 +1231,17 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}
 
-	/* Don't interfere if operated from Perf */
-	if (drvdata->mode == CS_MODE_PERF) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	/* If sysfs_buf is NULL the trace data has been read already */
+	/*
+	 * We can safely allow reads even if the ETR is operating in PERF mode,
+	 * since the sysfs session is captured in mode specific data.
+	 * If drvdata::sysfs_buf is NULL the trace data has been read already.
+	 */
 	if (!drvdata->sysfs_buf) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	/* Disable the TMC if we are trying to read from a running session */
+	/* Disable the TMC if we are trying to read from a running session. */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		tmc_etr_disable_hw(drvdata);
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 13/17] coresight etr: Do not clean ETR trace buffer
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (11 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 12/17] coresight etr: Relax collection of trace from sysfs mode Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-02 20:36   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 14/17] coresight: etr: Add support for save restore buffers Suzuki K Poulose
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

We zero out the entire trace buffer used for ETR before it
is enabled, to help with debugging. Since we could be
restoring a session in perf mode, this could destroy the data.
Get rid of this step; if someone needs it for debugging, they
can always add it back as and when needed.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 31353fc34b53..849684f85443 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
 		return;
 
 	drvdata->etr_buf = etr_buf;
-	/* Zero out the memory to help with debug */
-	memset(etr_buf->vaddr, 0, etr_buf->size);
 
 	CS_UNLOCK(drvdata->base);
 
@@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 	if (drvdata->mode == CS_MODE_SYSFS) {
 		/*
 		 * The trace run will continue with the same allocated trace
-		 * buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
-		 * so we don't have to explicitly clear it. Also, since the
-		 * tracer is still enabled drvdata::buf can't be NULL.
+		 * buffer. Since the tracer is still enabled,
+		 * drvdata::sysfs_buf can't be NULL.
 		 */
 		tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
 	} else {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 14/17] coresight: etr: Add support for save restore buffers
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (12 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 13/17] coresight etr: Do not clean ETR trace buffer Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-03 22:22   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 15/17] coresight: etr_buf: Add helper for padding an area of trace data Suzuki K Poulose
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Add support for creating buffers which can be used in save-restore
mode (e.g., for use by perf). If the TMC-ETR supports the save-restore
feature, we can support this mode with all buffer backends. However,
if it doesn't, we fall back to using the in-built SG mechanism,
where we can rotate the SG table by making some adjustments in the
page table.
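
A minimal sketch of how a caller might request such a buffer with the
new flags. tmc_alloc_etr_buf(), tmc_restore_etr_buf() and the
ETR_BUF_F_RESTORE_* flags come from this series; the caller, and where
the saved r_offset/w_offset/status values come from, are hypothetical:

	/* Prefer full save-restore (pointers + Full status) when the ETR
	 * supports it; otherwise ask only for pointer restore, which the
	 * circular ETR-SG backend can provide.
	 */
	flags = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE) ?
			ETR_BUF_F_RESTORE_FULL : ETR_BUF_F_RESTORE_MINIMAL;

	etr_buf = tmc_alloc_etr_buf(drvdata, size, flags, node, pages);
	if (IS_ERR(etr_buf))
		return PTR_ERR(etr_buf);

	/* Before re-enabling the hardware for a previously saved session,
	 * push the saved read/write offsets (and Full status) back into
	 * the buffer so that tmc_etr_enable_hw() programs RRP/RWP/STS
	 * accordingly.
	 */
	rc = tmc_restore_etr_buf(drvdata, etr_buf,
				 saved_r_offset, saved_w_offset, saved_status);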

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 132 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-tmc.h     |  15 +++
 2 files changed, 143 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 849684f85443..f8e654e1f5b2 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -590,7 +590,7 @@ tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
  * 3) Update the hwaddr to point to the table pointer for the buffer
  *    which starts at "base".
  */
-static int __maybe_unused
+static int
 tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
 {
 	u32 last_entry, first_entry;
@@ -700,6 +700,9 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
 		return -ENOMEM;
 	etr_buf->vaddr = vaddr;
 	etr_buf->hwaddr = paddr;
+	etr_buf->rrp = paddr;
+	etr_buf->rwp = paddr;
+	etr_buf->status = 0;
 	etr_buf->mode = ETR_MODE_FLAT;
 	etr_buf->private = drvdata;
 	return 0;
@@ -754,13 +757,19 @@ static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata,
 				void **pages)
 {
 	struct etr_sg_table *etr_table;
+	struct tmc_sg_table *sg_table;
 
 	etr_table = tmc_init_etr_sg_table(drvdata->dev, node,
 					  etr_buf->size, pages);
 	if (IS_ERR(etr_table))
 		return -ENOMEM;
+	sg_table = etr_table->sg_table;
 	etr_buf->vaddr = tmc_sg_table_data_vaddr(etr_table->sg_table);
 	etr_buf->hwaddr = etr_table->hwaddr;
+	/* TMC ETR SG automatically sets the RRP/RWP when enabled */
+	etr_buf->rrp = etr_table->hwaddr;
+	etr_buf->rwp = etr_table->hwaddr;
+	etr_buf->status = 0;
 	etr_buf->mode = ETR_MODE_ETR_SG;
 	etr_buf->private = etr_table;
 	return 0;
@@ -816,11 +825,49 @@ static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
 	tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
 }
 
+static int tmc_etr_restore_sg_buf(struct etr_buf *etr_buf,
+				   u64 r_offset, u64 w_offset,
+				   u32 status, bool has_save_restore)
+{
+	int rc;
+	struct etr_sg_table *etr_table = etr_buf->private;
+	struct device *dev = etr_table->sg_table->dev;
+
+	/*
+	 * It is highly unlikely that we have an ETR with both the in-built
+	 * SG and the Save-Restore capability, and we are not sure how the
+	 * pointers would be updated in that case.
+	 */
+	if (has_save_restore) {
+		dev_warn_once(dev,
+		"Unexpected feature combination of SG and save-restore\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Since we cannot program RRP/RWP different from DBAL, the offsets
+	 * should match.
+	 */
+	if (r_offset != w_offset) {
+		dev_dbg(dev, "Mismatched RRP/RWP offsets\n");
+		return -EINVAL;
+	}
+
+	rc = tmc_etr_sg_table_rotate(etr_table, w_offset);
+	if (!rc) {
+		etr_buf->hwaddr = etr_table->hwaddr;
+		etr_buf->rrp = etr_table->hwaddr;
+		etr_buf->rwp = etr_table->hwaddr;
+	}
+	return rc;
+}
+
 static const struct etr_buf_operations etr_sg_buf_ops = {
 	.alloc = tmc_etr_alloc_sg_buf,
 	.free = tmc_etr_free_sg_buf,
 	.sync = tmc_etr_sync_sg_buf,
 	.get_data = tmc_etr_get_data_sg_buf,
+	.restore = tmc_etr_restore_sg_buf,
 };
 
 static const struct etr_buf_operations *etr_buf_ops[] = {
@@ -861,10 +908,42 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
 {
 	int rc = -ENOMEM;
 	bool has_etr_sg, has_iommu;
+	bool has_flat, has_save_restore;
 	struct etr_buf *etr_buf;
 
 	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
 	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
+	has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE);
+
+	/*
+	 * We can normally use flat DMA buffer provided that the buffer
+	 * is not used in save restore fashion without hardware support.
+	 */
+	has_flat = !(flags & ETR_BUF_F_RESTORE_PTRS) || has_save_restore;
+
+	/*
+	 * To support save-restore on a given ETR we have the following
+	 * conditions:
+	 *  1) If the buffer requires save-restore of the pointers as well
+	 *     as the Status bit, we require ETR support for it and we could
+	 *     support all the backends.
+	 *  2) If the buffer requires only save-restore of pointers, then
+	 *     we could exploit a circular ETR SG list. None of the other
+	 *     backends can support it without the ETR feature.
+	 *
+	 * If the buffer will be used in a save-restore mode without
+	 * the ETR support for SAVE_RESTORE, we can only support TMC
+	 * ETR in-built SG tables which can be rotated to make it work.
+	 */
+	if ((flags & ETR_BUF_F_RESTORE_STATUS) && !has_save_restore)
+		return ERR_PTR(-EINVAL);
+
+	if (!has_flat && !has_etr_sg) {
+		dev_dbg(drvdata->dev,
+			"No available backends for ETR buffer with flags %x\n",
+			flags);
+		return ERR_PTR(-EINVAL);
+	}
 
 	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
 	if (!etr_buf)
@@ -883,7 +962,7 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
 	 * Fallback to available mechanisms.
 	 *
 	 */
-	if (!pages &&
+	if (!pages && has_flat &&
 	    (!has_etr_sg || has_iommu || size < SZ_1M))
 		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
 					    etr_buf, node, pages);
@@ -961,6 +1040,51 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
 		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
 }
 
+/*
+ * tmc_etr_buf_generic_restore: Common helper to restore the buffer
+ * status for FLAT buffers, which use a linear TMC ETR address range.
+ * This is only possible with the in-built ETR capability to save-restore
+ * the pointers. The DBA will still point to the original start of the
+ * buffer.
+ */
+static int tmc_etr_buf_generic_restore(struct etr_buf *etr_buf,
+					u64 r_offset, u64 w_offset,
+					u32 status, bool has_save_restore)
+{
+	u64 size = etr_buf->size;
+
+	if (!has_save_restore)
+		return -EINVAL;
+	etr_buf->rrp = etr_buf->hwaddr + (r_offset % size);
+	etr_buf->rwp = etr_buf->hwaddr + (w_offset % size);
+	etr_buf->status = status;
+	return 0;
+}
+
+static int __maybe_unused
+tmc_restore_etr_buf(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
+		    u64 r_offset, u64 w_offset, u32 status)
+{
+	bool has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE);
+
+	if (WARN_ON_ONCE(!has_save_restore && etr_buf->mode != ETR_MODE_ETR_SG))
+		return -EINVAL;
+	/*
+	 * If we use a circular SG list without ETR support, we can't
+	 * support restoring "Full" bit.
+	 */
+	if (WARN_ON_ONCE(!has_save_restore && status))
+		return -EINVAL;
+	if (status & ~TMC_STS_FULL)
+		return -EINVAL;
+	if (etr_buf->ops->restore)
+		return etr_buf->ops->restore(etr_buf, r_offset, w_offset,
+					      status, has_save_restore);
+	else
+		return tmc_etr_buf_generic_restore(etr_buf, r_offset, w_offset,
+					       status, has_save_restore);
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
 			      struct etr_buf *etr_buf)
 {
@@ -1004,8 +1128,8 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
 	 * STS to "not full").
 	 */
 	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
-		tmc_write_rrp(drvdata, etr_buf->hwaddr);
-		tmc_write_rwp(drvdata, etr_buf->hwaddr);
+		tmc_write_rrp(drvdata, etr_buf->rrp);
+		tmc_write_rwp(drvdata, etr_buf->rwp);
 		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
 		writel_relaxed(sts, drvdata->base + TMC_STS);
 	}
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 14a3dec50b0f..2c5b905b6494 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -142,12 +142,22 @@ enum etr_mode {
 	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
 };
 
+/* ETR buffer should support save-restore */
+#define ETR_BUF_F_RESTORE_PTRS		0x1
+#define ETR_BUF_F_RESTORE_STATUS	0x2
+
+#define ETR_BUF_F_RESTORE_MINIMAL	ETR_BUF_F_RESTORE_PTRS
+#define ETR_BUF_F_RESTORE_FULL		(ETR_BUF_F_RESTORE_PTRS |\
+					 ETR_BUF_F_RESTORE_STATUS)
 struct etr_buf_operations;
 
 /**
  * struct etr_buf - Details of the buffer used by ETR
  * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
  * @full	: Trace data overflow
+ * @status	: Value for STATUS if the ETR supports save-restore.
+ * @rrp		: Value for RRP{LO:HI} if the ETR supports save-restore
+ * @rwp		: Value for RWP{LO:HI} if the ETR supports save-restore
  * @size	: Size of the buffer.
  * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
  * @vaddr	: Virtual address of the buffer used for trace.
@@ -159,6 +169,9 @@ struct etr_buf_operations;
 struct etr_buf {
 	enum etr_mode			mode;
 	bool				full;
+	u32				status;
+	dma_addr_t			rrp;
+	dma_addr_t			rwp;
 	ssize_t				size;
 	dma_addr_t			hwaddr;
 	void				*vaddr;
@@ -212,6 +225,8 @@ struct etr_buf_operations {
 	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
 			int node, void **pages);
 	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
+	int (*restore)(struct etr_buf *etr_buf, u64 r_offset,
+		       u64 w_offset, u32 status, bool has_save_restore);
 	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
 				char **bufpp);
 	void (*free)(struct etr_buf *etr_buf);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 15/17] coresight: etr_buf: Add helper for padding an area of trace data
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (13 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 14/17] coresight: etr: Add support for save restore buffers Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-10-19 17:15 ` [PATCH 16/17] coresight: perf: Remove reset_buffer call back for sinks Suzuki K Poulose
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

This patch adds a helper to insert barrier packets for a given
size (aligned to the barrier packet size) at a given offset in an
etr_buf. This will be used later in perf mode when we try to start
in the middle of an SG buffer.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 52 ++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index f8e654e1f5b2..229c36b7266c 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -999,18 +999,58 @@ static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf,
 	return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp);
 }
 
+/*
+ * tmc_etr_buf_insert_barrier_packets : Insert barrier packets covering
+ * @size bytes starting at @offset in the given buffer. @size should be
+ * aligned to the barrier packet size.
+ *
+ * Returns the new @offset after filling the barriers on success. Otherwise
+ * returns error.
+ */
 static inline s64
-tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
+tmc_etr_buf_insert_barrier_packets(struct etr_buf *etr_buf,
+				   u64 offset, u64 size)
 {
 	ssize_t len;
 	char *bufp;
 
-	len = tmc_etr_buf_get_data(etr_buf, offset,
-				   CORESIGHT_BARRIER_PKT_SIZE, &bufp);
-	if (WARN_ON(len <= CORESIGHT_BARRIER_PKT_SIZE))
+	if ((size % CORESIGHT_BARRIER_PKT_SIZE) ||
+	    (offset % CORESIGHT_BARRIER_PKT_SIZE))
 		return -EINVAL;
-	coresight_insert_barrier_packet(bufp);
-	return offset + CORESIGHT_BARRIER_PKT_SIZE;
+	do {
+		len = tmc_etr_buf_get_data(etr_buf, offset, size, &bufp);
+		if (WARN_ON(len <= 0))
+			return -EINVAL;
+		/*
+		 * We are guaranteed that @bufp will point to a linear range
+		 * of @len bytes, where @len <= @size.
+		 */
+		size -= len;
+		offset += len;
+		while (len >= CORESIGHT_BARRIER_PKT_SIZE) {
+			coresight_insert_barrier_packet(bufp);
+			bufp += CORESIGHT_BARRIER_PKT_SIZE;
+			len -= CORESIGHT_BARRIER_PKT_SIZE;
+		}
+
+		/*
+		 * Normally we shouldn't have anything left over here, as the
+		 * trace should always be aligned to the ETR frame size.
+		 */
+		WARN_ON(len);
+		/* If we reached the end of the buffer, wrap around */
+		if (offset == etr_buf->size)
+			offset -= etr_buf->size;
+	} while (size);
+
+	return offset;
+}
+
+static inline s64
+tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
+{
+	return tmc_etr_buf_insert_barrier_packets(etr_buf, offset,
+					  CORESIGHT_BARRIER_PKT_SIZE);
 }
 
 /*
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 16/17] coresight: perf: Remove reset_buffer call back for sinks
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (14 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 15/17] coresight: etr_buf: Add helper for padding an area of trace data Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-06 21:10   ` Mathieu Poirier
  2017-10-19 17:15 ` [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf Suzuki K Poulose
  2017-10-20 11:00 ` [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Right now we issue an update_buffer() and a reset_buffer() callback
in succession when we stop tracing an event. update_buffer() is
supposed to check the status of the buffer and make sure the ring buffer
is updated with the trace data. We then store information about the
size of the data collected, only to be consumed by the reset_buffer()
callback, which always follows update_buffer(). This patch gets
rid of the reset_buffer() callback altogether and performs those actions
in update_buffer(), making it return the size collected.

This removes a not-so-pretty hack (storing the new head in the
size field for snapshot mode) and cleans things up a little bit.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etb10.c    | 56 +++++------------------
 drivers/hwtracing/coresight/coresight-etm-perf.c |  9 +---
 drivers/hwtracing/coresight/coresight-tmc-etf.c  | 58 +++++-------------------
 include/linux/coresight.h                        |  5 +-
 4 files changed, 26 insertions(+), 102 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 757f556975f7..75c5699000b0 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -323,37 +323,7 @@ static int etb_set_buffer(struct coresight_device *csdev,
 	return ret;
 }
 
-static unsigned long etb_reset_buffer(struct coresight_device *csdev,
-				      struct perf_output_handle *handle,
-				      void *sink_config)
-{
-	unsigned long size = 0;
-	struct cs_buffers *buf = sink_config;
-
-	if (buf) {
-		/*
-		 * In snapshot mode ->data_size holds the new address of the
-		 * ring buffer's head.  The size itself is the whole address
-		 * range since we want the latest information.
-		 */
-		if (buf->snapshot)
-			handle->head = local_xchg(&buf->data_size,
-						  buf->nr_pages << PAGE_SHIFT);
-
-		/*
-		 * Tell the tracer PMU how much we got in this run and if
-		 * something went wrong along the way.  Nobody else can use
-		 * this cs_buffers instance until we are done.  As such
-		 * resetting parameters here and squaring off with the ring
-		 * buffer API in the tracer PMU is fine.
-		 */
-		size = local_xchg(&buf->data_size, 0);
-	}
-
-	return size;
-}
-
-static void etb_update_buffer(struct coresight_device *csdev,
+static unsigned long etb_update_buffer(struct coresight_device *csdev,
 			      struct perf_output_handle *handle,
 			      void *sink_config)
 {
@@ -362,13 +332,13 @@ static void etb_update_buffer(struct coresight_device *csdev,
 	u8 *buf_ptr;
 	const u32 *barrier;
 	u32 read_ptr, write_ptr, capacity;
-	u32 status, read_data, to_read;
-	unsigned long offset;
+	u32 status, read_data;
+	unsigned long offset, to_read;
 	struct cs_buffers *buf = sink_config;
 	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
 	if (!buf)
-		return;
+		return 0;
 
 	capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
 
@@ -473,18 +443,17 @@ static void etb_update_buffer(struct coresight_device *csdev,
 	writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
 
 	/*
-	 * In snapshot mode all we have to do is communicate to
-	 * perf_aux_output_end() the address of the current head.  In full
-	 * trace mode the same function expects a size to move rb->aux_head
-	 * forward.
+	 * In snapshot mode we have to update the handle->head to point
+	 * to the new location.
 	 */
-	if (buf->snapshot)
-		local_set(&buf->data_size, (cur * PAGE_SIZE) + offset);
-	else
-		local_add(to_read, &buf->data_size);
-
+	if (buf->snapshot) {
+		handle->head = (cur * PAGE_SIZE) + offset;
+		to_read = buf->nr_pages << PAGE_SHIFT;
+	}
 	etb_enable_hw(drvdata);
 	CS_LOCK(drvdata->base);
+
+	return to_read;
 }
 
 static const struct coresight_ops_sink etb_sink_ops = {
@@ -493,7 +462,6 @@ static const struct coresight_ops_sink etb_sink_ops = {
 	.alloc_buffer	= etb_alloc_buffer,
 	.free_buffer	= etb_free_buffer,
 	.set_buffer	= etb_set_buffer,
-	.reset_buffer	= etb_reset_buffer,
 	.update_buffer	= etb_update_buffer,
 };
 
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 8a0ad77574e7..e5f9567c87c4 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -342,15 +342,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
 		if (!sink_ops(sink)->update_buffer)
 			return;
 
-		sink_ops(sink)->update_buffer(sink, handle,
+		size = sink_ops(sink)->update_buffer(sink, handle,
 					      event_data->snk_config);
-
-		if (!sink_ops(sink)->reset_buffer)
-			return;
-
-		size = sink_ops(sink)->reset_buffer(sink, handle,
-						    event_data->snk_config);
-
 		perf_aux_output_end(handle, size);
 	}
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index aa4e8f03ef49..073198e7b46e 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -358,36 +358,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev,
 	return ret;
 }
 
-static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,
-					  struct perf_output_handle *handle,
-					  void *sink_config)
-{
-	long size = 0;
-	struct cs_buffers *buf = sink_config;
-
-	if (buf) {
-		/*
-		 * In snapshot mode ->data_size holds the new address of the
-		 * ring buffer's head.  The size itself is the whole address
-		 * range since we want the latest information.
-		 */
-		if (buf->snapshot)
-			handle->head = local_xchg(&buf->data_size,
-						  buf->nr_pages << PAGE_SHIFT);
-		/*
-		 * Tell the tracer PMU how much we got in this run and if
-		 * something went wrong along the way.  Nobody else can use
-		 * this cs_buffers instance until we are done.  As such
-		 * resetting parameters here and squaring off with the ring
-		 * buffer API in the tracer PMU is fine.
-		 */
-		size = local_xchg(&buf->data_size, 0);
-	}
-
-	return size;
-}
-
-static void tmc_update_etf_buffer(struct coresight_device *csdev,
+static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 				  struct perf_output_handle *handle,
 				  void *sink_config)
 {
@@ -396,17 +367,17 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,
 	const u32 *barrier;
 	u32 *buf_ptr;
 	u64 read_ptr, write_ptr;
-	u32 status, to_read;
-	unsigned long offset;
+	u32 status;
+	unsigned long offset, to_read;
 	struct cs_buffers *buf = sink_config;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
 	if (!buf)
-		return;
+		return 0;
 
 	/* This shouldn't happen */
 	if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF))
-		return;
+		return 0;
 
 	CS_UNLOCK(drvdata->base);
 
@@ -495,18 +466,14 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,
 		}
 	}
 
-	/*
-	 * In snapshot mode all we have to do is communicate to
-	 * perf_aux_output_end() the address of the current head.  In full
-	 * trace mode the same function expects a size to move rb->aux_head
-	 * forward.
-	 */
-	if (buf->snapshot)
-		local_set(&buf->data_size, (cur * PAGE_SIZE) + offset);
-	else
-		local_add(to_read, &buf->data_size);
-
+	/* In snapshot mode we have to update the head */
+	if (buf->snapshot) {
+		handle->head = (cur * PAGE_SIZE) + offset;
+		to_read = buf->nr_pages << PAGE_SHIFT;
+	}
 	CS_LOCK(drvdata->base);
+
+	return to_read;
 }
 
 static const struct coresight_ops_sink tmc_etf_sink_ops = {
@@ -515,7 +482,6 @@ static const struct coresight_ops_sink tmc_etf_sink_ops = {
 	.alloc_buffer	= tmc_alloc_etf_buffer,
 	.free_buffer	= tmc_free_etf_buffer,
 	.set_buffer	= tmc_set_etf_buffer,
-	.reset_buffer	= tmc_reset_etf_buffer,
 	.update_buffer	= tmc_update_etf_buffer,
 };
 
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index d950dad5056a..5c9e5fe2bf32 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -199,10 +199,7 @@ struct coresight_ops_sink {
 	int (*set_buffer)(struct coresight_device *csdev,
 			  struct perf_output_handle *handle,
 			  void *sink_config);
-	unsigned long (*reset_buffer)(struct coresight_device *csdev,
-				      struct perf_output_handle *handle,
-				      void *sink_config);
-	void (*update_buffer)(struct coresight_device *csdev,
+	unsigned long (*update_buffer)(struct coresight_device *csdev,
 			      struct perf_output_handle *handle,
 			      void *sink_config);
 };
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (15 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 16/17] coresight: perf: Remove reset_buffer call back for sinks Suzuki K Poulose
@ 2017-10-19 17:15 ` Suzuki K Poulose
  2017-11-07  0:24   ` Mathieu Poirier
  2017-10-20 11:00 ` [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
  17 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-19 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier,
	Suzuki K Poulose

Add necessary support for using ETR as a sink in ETM perf tracing.
We try to make the best use of the available buffer modes to
avoid software double buffering where we can.

We can use the perf ring buffer for ETR directly if all of the
conditions below are met :
 1) ETR is DMA coherent
 2) perf is used in snapshot mode. In full tracing mode, we cannot
    guarantee that the ETR will stop before it overwrites the data
    which may not have been consumed by the user.
 3) ETR supports save-restore with a scatter-gather mechanism
    which can use a given set of pages. If we have an in-built
    TMC ETR Scatter Gather unit, we make use of a circular SG list
    to restart from a given head. However, we need to align the
    starting offset to 4K in this case.

If the ETR doesn't support either of these, we fall back to software
double buffering.
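
Roughly, the buffer selection ends up like the sketch below (simplified;
"alloc_buf" here just stands in for the real allocation helper):

	struct etr_buf *buf = ERR_PTR(-EINVAL);

	if (etr_is_coherent && snapshot) {
		/* Try to use the perf ring buffer pages directly */
		buf = alloc_buf(size, ETR_BUF_F_RESTORE_FULL, pages);
		if (IS_ERR(buf))
			buf = alloc_buf(size, ETR_BUF_F_RESTORE_MINIMAL, pages);
	}
	/* Otherwise, a private buffer for software double buffering */
	if (IS_ERR(buf))
		buf = alloc_buf(size, 0, NULL);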

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 372 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-tmc.h     |   2 +
 2 files changed, 372 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 229c36b7266c..1dfe7cf7c721 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -21,6 +21,9 @@
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
 
+/* Lower limit for ETR hardware buffer in double buffering mode */
+#define TMC_ETR_PERF_MIN_BUF_SIZE	SZ_1M
+
 /*
  * The TMC ETR SG has a page size of 4K. The SG table contains pointers
  * to 4KB buffers. However, the OS may be use PAGE_SIZE different from
@@ -1328,10 +1331,371 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	return ret;
 }
 
+/*
+ * etr_perf_buffer - Perf buffer used for ETR
+ * @etr_buf		- Actual buffer used by the ETR
+ * @snapshot		- Perf session mode
+ * @head		- handle->head at the beginning of the session.
+ * @nr_pages		- Number of pages in the ring buffer.
+ * @pages		- Pages in the ring buffer.
+ * @flags		- Capabilities of the hardware buffer used in the
+ *			  session. If flags == 0, we use software double
+ *			  buffering.
+ */
+struct etr_perf_buffer {
+	struct etr_buf		*etr_buf;
+	bool			snapshot;
+	unsigned long		head;
+	int			nr_pages;
+	void			**pages;
+	u32			flags;
+};
+
+
+/*
+ * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf. We try to
+ * use perf ring buffer pages for the ETR when we can. In the worst case
+ * we fall back to software double buffering. The size of the hardware buffer
+ * in this case depends on the size configured via sysfs, if we can't
+ * match the perf ring buffer size. We halve the size until it reaches
+ * a limit of 1M, beyond which we give up.
+ */
+static struct etr_perf_buffer *
+tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
+		       void **pages, bool snapshot)
+{
+	int i;
+	struct etr_buf *etr_buf;
+	struct etr_perf_buffer *etr_perf;
+	unsigned long size;
+	unsigned long buf_flags[] = {
+					ETR_BUF_F_RESTORE_FULL,
+					ETR_BUF_F_RESTORE_MINIMAL,
+					0,
+				    };
+
+	etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node);
+	if (!etr_perf)
+		return ERR_PTR(-ENOMEM);
+
+	size = nr_pages << PAGE_SHIFT;
+	/*
+	 * We can use the perf ring buffer for the ETR only if it is coherent
+	 * and perf is used in snapshot mode, as we cannot control how much
+	 * data will be written before we stop the ETR.
+	 */
+	if (tmc_etr_has_cap(drvdata, TMC_ETR_COHERENT) && snapshot) {
+		for (i = 0; buf_flags[i]; i++) {
+			etr_buf = tmc_alloc_etr_buf(drvdata, size,
+						 buf_flags[i], node, pages);
+			if (!IS_ERR(etr_buf)) {
+				etr_perf->flags = buf_flags[i];
+				goto done;
+			}
+		}
+	}
+
+	/*
+	 * We now have to fall back to software double buffering.
+	 * The tricky decision is choosing a size for the hardware buffer.
+	 * We start with drvdata->size (configurable via sysfs) and
+	 * scale it down until the allocation succeeds.
+	 */
+	etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
+	if (!IS_ERR(etr_buf))
+		goto done;
+	size = drvdata->size;
+	do {
+		etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
+		if (!IS_ERR(etr_buf))
+			goto done;
+		size /= 2;
+	} while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
+
+	kfree(etr_perf);
+	return ERR_PTR(-ENOMEM);
+
+done:
+	etr_perf->etr_buf = etr_buf;
+	return etr_perf;
+}
+
+
+static void *tmc_etr_alloc_perf_buffer(struct coresight_device *csdev,
+					int cpu, void **pages, int nr_pages,
+					bool snapshot)
+{
+	struct etr_perf_buffer *etr_perf;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+
+	if (cpu == -1)
+		cpu = smp_processor_id();
+
+	etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu),
+					     nr_pages, pages, snapshot);
+	if (IS_ERR(etr_perf)) {
+		dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n");
+		return NULL;
+	}
+
+	etr_perf->snapshot = snapshot;
+	etr_perf->nr_pages = nr_pages;
+	etr_perf->pages = pages;
+
+	return etr_perf;
+}
+
+static void tmc_etr_free_perf_buffer(void *config)
+{
+	struct etr_perf_buffer *etr_perf = config;
+
+	if (etr_perf->etr_buf)
+		tmc_free_etr_buf(etr_perf->etr_buf);
+	kfree(etr_perf);
+}
+
+/*
+ * Pad the etr buffer with barrier packets to align the head to a 4K aligned
+ * offset. This is required for ETR SG backed buffers, so that we can rotate
+ * the buffer easily and avoid software double buffering.
+ */
+static s64 tmc_etr_pad_perf_buffer(struct etr_perf_buffer *etr_perf, s64 head)
+{
+	s64 new_head;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	head %= etr_buf->size;
+	new_head = ALIGN(head, SZ_4K);
+	if (head == new_head)
+		return head;
+	/*
+	 * If the padding is not aligned to barrier packet size
+	 * we can't do much.
+	 */
+	if ((new_head - head) % CORESIGHT_BARRIER_PKT_SIZE)
+		return -EINVAL;
+	return tmc_etr_buf_insert_barrier_packets(etr_buf, head,
+						  new_head - head);
+}
+
+static int tmc_etr_set_perf_buffer(struct coresight_device *csdev,
+				   struct perf_output_handle *handle,
+				   void *config)
+{
+	int rc;
+	unsigned long flags;
+	s64 head, new_head;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_perf_buffer *etr_perf = config;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	etr_perf->head = handle->head;
+	head = etr_perf->head % etr_buf->size;
+	switch (etr_perf->flags) {
+	case ETR_BUF_F_RESTORE_MINIMAL:
+		new_head = tmc_etr_pad_perf_buffer(etr_perf, head);
+		if (new_head < 0)
+			return new_head;
+		if (head != new_head) {
+			rc = perf_aux_output_skip(handle, new_head - head);
+			if (rc)
+				return rc;
+			etr_perf->head = handle->head;
+			head = new_head;
+		}
+		/* Fall through */
+	case ETR_BUF_F_RESTORE_FULL:
+		rc = tmc_restore_etr_buf(drvdata, etr_buf, head, head, 0);
+		break;
+	case 0:
+		/* Nothing to do here. */
+		rc = 0;
+		break;
+	default:
+		dev_warn(drvdata->dev, "Unexpected flags in etr_perf buffer\n");
+		WARN_ON(1);
+		rc = -EINVAL;
+	}
+
+	/*
+	 * This sink is going to be used in perf mode. No other session can
+	 * grab it from us. So set the perf mode specific data here. This will
+	 * be released just before we disable the sink from the update_buffer
+	 * callback.
+	 */
+	if (!rc) {
+		spin_lock_irqsave(&drvdata->spinlock, flags);
+		if (WARN_ON(drvdata->perf_data))
+			rc = -EBUSY;
+		else
+			drvdata->perf_data = etr_perf;
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+	}
+	return rc;
+}
+
+/*
+ * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
+ * buffer to the perf ring buffer.
+ */
+static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf)
+{
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+	unsigned long bytes, to_copy, head = etr_perf->head;
+	unsigned long pg_idx, pg_offset, src_offset;
+	char **dst_pages, *src_buf;
+
+	head = etr_perf->head % (etr_perf->nr_pages << PAGE_SHIFT);
+	pg_idx = head >> PAGE_SHIFT;
+	pg_offset = head & (PAGE_SIZE - 1);
+	dst_pages = (char **)etr_perf->pages;
+	src_offset = etr_buf->offset;
+	to_copy = etr_buf->len;
+
+	while (to_copy > 0) {
+		/*
+		 * In one iteration, we can copy the minimum of :
+		 *  1) what is left to copy from the source buffer,
+		 *  2) what is available in the source buffer before it
+		 *     wraps around,
+		 *  3) what is available in the destination page.
+		 */
+		bytes = tmc_etr_buf_get_data(etr_buf, src_offset, to_copy,
+					     &src_buf);
+		if (WARN_ON_ONCE(bytes <= 0))
+			break;
+		bytes = min(PAGE_SIZE - pg_offset, bytes);
+
+		memcpy(dst_pages[pg_idx] + pg_offset, src_buf, bytes);
+		to_copy -= bytes;
+		/* Move destination pointers */
+		pg_offset += bytes;
+		if (pg_offset == PAGE_SIZE) {
+			pg_offset = 0;
+			if (++pg_idx == etr_perf->nr_pages)
+				pg_idx = 0;
+		}
+
+		/* Move source pointers */
+		src_offset += bytes;
+		if (src_offset >= etr_buf->size)
+			src_offset -= etr_buf->size;
+	}
+}
+
+/*
+ * XXX: What is the expected behavior here in the following cases ?
+ *  1) Full trace mode, without double buffering : What should be the size
+ *     reported back when the buffer is full and has wrapped around ? Ideally,
+ *     we should account for the lost trace to make sure the "head" in the
+ *     ring buffer comes back to the position as in the trace buffer, rather
+ *     than returning the "total size" of the buffer.
+ *  2) In snapshot mode, should we always return "full buffer size" ?
+ */
+static unsigned long
+tmc_etr_update_perf_buffer(struct coresight_device *csdev,
+			   struct perf_output_handle *handle,
+			   void *config)
+{
+	bool double_buffer, lost = false;
+	unsigned long flags, offset, size = 0;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_perf_buffer *etr_perf = config;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	double_buffer = (etr_perf->flags == 0);
+
+	spin_lock_irqsave(&drvdata->spinlock, flags);
+	if (WARN_ON(drvdata->perf_data != etr_perf)) {
+		lost = true;
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+		goto out;
+	}
+
+	CS_UNLOCK(drvdata->base);
+
+	tmc_flush_and_stop(drvdata);
+
+	tmc_sync_etr_buf(drvdata);
+	CS_LOCK(drvdata->base);
+	/* Reset perf specific data */
+	drvdata->perf_data = NULL;
+	spin_unlock_irqrestore(&drvdata->spinlock, flags);
+
+	offset = etr_buf->offset + etr_buf->len;
+	if (offset > etr_buf->size)
+		offset -= etr_buf->size;
+
+	if (double_buffer) {
+		/*
+		 * If we use software double buffering, update the ring buffer.
+		 * And the size is what we have in the hardware buffer.
+		 */
+		size = etr_buf->len;
+		tmc_etr_sync_perf_buffer(etr_perf);
+	} else {
+		/*
+		 * If the hardware uses the perf ring buffer directly, the size
+		 * of the data is from the old head to the current head of the
+		 * buffer. This also means that in non-snapshot mode, we have
+		 * lost one full buffer worth of data if the buffer wrapped.
+		 */
+		unsigned long old_head;
+
+		old_head = (etr_perf->head % etr_buf->size);
+		size = (offset - old_head + etr_buf->size) % etr_buf->size;
+	}
+
+	/*
+	 * Update handle->head in snapshot mode. Also update the size to the
+	 * hardware buffer size if there was an overflow.
+	 */
+	if (etr_perf->snapshot) {
+		if (double_buffer)
+			handle->head += size;
+		else
+			handle->head = offset;
+		if (etr_buf->full)
+			size = etr_buf->size;
+	}
+
+	lost |= etr_buf->full;
+out:
+	if (lost)
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
+	return size;
+}
+
 static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
 {
-	/* We don't support perf mode yet ! */
-	return -EINVAL;
+	int rc = 0;
+	unsigned long flags;
+	struct etr_perf_buffer *etr_perf;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+
+	spin_lock_irqsave(&drvdata->spinlock, flags);
+	/*
+	 * There can be only one writer per sink in perf mode. If the sink
+	 * is already open in SYSFS mode, we can't use it.
+	 */
+	if (drvdata->mode != CS_MODE_DISABLED) {
+		rc = -EBUSY;
+		goto unlock_out;
+	}
+
+	etr_perf = drvdata->perf_data;
+	if (!etr_perf || !etr_perf->etr_buf) {
+		rc = -EINVAL;
+		goto unlock_out;
+	}
+
+	drvdata->mode = CS_MODE_PERF;
+	tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
+
+unlock_out:
+	spin_unlock_irqrestore(&drvdata->spinlock, flags);
+	return rc;
 }
 
 static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode)
@@ -1372,6 +1736,10 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 static const struct coresight_ops_sink tmc_etr_sink_ops = {
 	.enable		= tmc_enable_etr_sink,
 	.disable	= tmc_disable_etr_sink,
+	.alloc_buffer	= tmc_etr_alloc_perf_buffer,
+	.update_buffer	= tmc_etr_update_perf_buffer,
+	.set_buffer	= tmc_etr_set_perf_buffer,
+	.free_buffer	= tmc_etr_free_perf_buffer,
 };
 
 const struct coresight_ops tmc_etr_cs_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 2c5b905b6494..06386ceb7866 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -198,6 +198,7 @@ struct etr_buf {
  * @trigger_cntr: amount of words to store after a trigger.
  * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
  *		device configuration register (DEVID)
+ * @perf_data:	PERF buffer for ETR.
  * @sysfs_data:	SYSFS buffer for ETR.
  */
 struct tmc_drvdata {
@@ -219,6 +220,7 @@ struct tmc_drvdata {
 	u32			trigger_cntr;
 	u32			etr_caps;
 	struct etr_buf		*sysfs_buf;
+	void			*perf_data;
 };
 
 struct etr_buf_operations {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/17] coresight: perf: TMC ETR backend support
  2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
                   ` (16 preceding siblings ...)
  2017-10-19 17:15 ` [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf Suzuki K Poulose
@ 2017-10-20 11:00 ` Suzuki K Poulose
  17 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-10-20 11:00 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, robert.walker, mike.leach, coresight, mathieu.poirier

On 19/10/17 18:15, Suzuki K Poulose wrote:
> The TMC-ETR supports routing the Coresight trace data to the
> System memory. It supports two different modes in which the memory
> could be used.
> 
> 1) Contiguous memory - The memory is assumed to be physically
> contiguous.
> 
> 2) Scatter Gather list - The memory can be chunks of 4K pages,
> which are specified in a table of pointers which itself could be
> multiple 4K size pages.
> 
> To avoid the complications of the managing the buffer, this series
> adds a layer for managing the ETR buffer, which makes the best possibly
> choice based on what is available. The allocation can be tuned by passing
> in flags, existing pages (e.g, perf ring buffer) etc.
> 
> Towards supporting ETR Scatter Gather mode, we introduce a generic TMC
> scatter-gather table which can be used to manage the data and table pages.
> The table can be filled in the format expected by the Scatter-Gather
> mode.
> 
> The TMC ETR-SG mechanism doesn't allow starting the trace at non-zero
> offset (required by perf). So we make some tricky changes to the table
> at run time to allow starting at any "Page aligned" offset and then
> wrap around to the beginning of the buffer with very less overhead.
> See patches for more description.
> 
> The series also improves the way the ETR is controlled by different modes
> (sysfs vs. perf) by keeping mode specific data. This allows access
> to the trace data collected in sysfs mode, even when the ETR is
> operated in perf mode. Also with the transparent management of the
> buffer and scatter-gather mechanism, we can allow the user to
> request for larger trace buffers for sysfs mode. This is supported
> by providing a sysfs file, "buffer_size" which accepts a page aligned
> size, which will be used by the ETR when allocating a buffer.
> 
> Finally, it cleans up the etm perf sink callbacks a little bit and
> then adds the support for ETR sink. For the ETR, we try our best to
> use the perf ring buffer as the target hardware buffer, provided :
>   1) The ETR is dma coherent (since the pages will be shared with
>      userspace perf tool).
>   2) The perf is used in snapshot mode (The ETR cannot be stopped
>      based on the size of the data written hence we could easily
>      overwrite the buffer. We may be able to fix this in the future)
>   3) The ETR supports the Scatter-Gather mode.
> 
> If we can't use the perf buffers directly, we fallback to using
> software buffering where we have to copy the trace data back
> to the perf ring buffer.
> 

Just to be clear :

The perf tool doesn't yet support the perf AUX API for CoreSight. I have
used the perf tool from the perf-OpenCSD [1] project to control the tracing.

[1] https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-4.14-rc1


Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read
  2017-10-19 17:15 ` [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read Suzuki K Poulose
@ 2017-10-20 12:34   ` Julien Thierry
  2017-11-01  9:55     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Julien Thierry @ 2017-10-20 12:34 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: mathieu.poirier, coresight, linux-kernel, rob.walker, mike.leach

Hi Suzuki,

On 19/10/17 18:15, Suzuki K Poulose wrote:
> At the moment we adjust the buffer pointers for reading the trace
> data via misc device in the common code for ETF/ETB and ETR. Since
> we are going to change how we manage the buffer for ETR, let us
> move the buffer manipulation to the respective driver files, hiding
> it from the common code. We do so by adding type specific helpers
> for finding the length of data and the pointer to the buffer,
> for a given length at a file position.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-etf.c | 16 ++++++++++++
>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 33 ++++++++++++++++++++++++
>   drivers/hwtracing/coresight/coresight-tmc.c     | 34 ++++++++++++++-----------
>   drivers/hwtracing/coresight/coresight-tmc.h     |  4 +++
>   4 files changed, 72 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index e2513b786242..0b6f1eb746de 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -120,6 +120,22 @@ static void tmc_etf_disable_hw(struct tmc_drvdata *drvdata)
>   	CS_LOCK(drvdata->base);
>   }
>   
> +/*
> + * Return the available trace data in the buffer from @pos, with
> + * a maximum limit of @len, updating the @bufpp on where to
> + * find it.
> + */
> +ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
> +				  loff_t pos, size_t len, char **bufpp)
> +{
> +	/* Adjust the len to available size @pos */
> +	if (pos + len > drvdata->len)
> +		len = drvdata->len - pos;
> +	if (len > 0)

Do we have some guarantee that "pos <= drvdata->len"? Because len is
unsigned, this check only covers the case where len is 0.

Maybe it would be better to use a signed variable to store the result of 
the difference.
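
For illustration only (untested), something along these lines:

	ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
					loff_t pos, size_t len, char **bufpp)
	{
		ssize_t actual = drvdata->len - pos;

		/* Nothing to read at or beyond the end of the trace data */
		if (actual <= 0)
			return 0;
		if (len > actual)
			len = actual;
		*bufpp = drvdata->buf + pos;
		return len;
	}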

> +		*bufpp = drvdata->buf + pos;
> +	return len;
> +}
> +
>   static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
>   {
>   	int ret = 0;
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index d0208f01afd9..063f253f1c99 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -69,6 +69,39 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>   	CS_LOCK(drvdata->base);
>   }
>   
> +/*
> + * Return the available trace data in the buffer @pos, with a maximum
> + * limit of @len, also updating the @bufpp on where to find it.
> + */
> +ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
> +			    loff_t pos, size_t len, char **bufpp)
> +{
> +	char *bufp = drvdata->buf + pos;
> +	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
> +
> +	/* Adjust the len to available size @pos */
> +	if (pos + len > drvdata->len)
> +		len = drvdata->len - pos;
> +
> +	if (len <= 0)
> +		return len;

Similar issue here.

Cheers,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 05/17] coresight: Add support for TMC ETR SG unit
  2017-10-19 17:15 ` [PATCH 05/17] coresight: Add support for TMC ETR SG unit Suzuki K Poulose
@ 2017-10-20 16:25   ` Julien Thierry
  2017-11-01 10:11     ` Suzuki K Poulose
  2017-11-01 20:41   ` Mathieu Poirier
  1 sibling, 1 reply; 56+ messages in thread
From: Julien Thierry @ 2017-10-20 16:25 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier

Hi Suzuki,

On 19/10/17 18:15, Suzuki K Poulose wrote:
> This patch adds support for setting up an SG table used by the
> TMC ETR inbuilt SG unit. The TMC ETR uses 4K page sized tables
> to hold pointers to the 4K data pages with the last entry in a
> table pointing to the next table with the entries, by kind of
> chaining. The 2 LSBs determine the type of the table entry, to
> one of :
> 
>   Normal - Points to a 4KB data page.
>   Last   - Points to a 4KB data page, but is the last entry in the
>            page table.
>   Link   - Points to another 4KB table page with pointers to data.
> 
> The code takes care of handling the system page size which could
> be different than 4K. So we could end up putting multiple ETR
> SG tables in a single system page, vice versa for the data pages.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 256 ++++++++++++++++++++++++
>   1 file changed, 256 insertions(+)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 4b9e2b276122..4424eb67a54c 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -21,6 +21,89 @@
>   #include "coresight-tmc.h"
>   
>   /*
> + * The TMC ETR SG has a page size of 4K. The SG table contains pointers
> + * to 4KB buffers. However, the OS may be use PAGE_SIZE different from

nit:
"the OS may use a PAGE_SIZE different from".

> + * 4K (i.e, 16KB or 64KB). This implies that a single OS page could
> + * contain more than one SG buffer and tables.
> + *
> + * A table entry has the following format:
> + *
> + * ---Bit31------------Bit4-------Bit1-----Bit0--
> + * |     Address[39:12]    | SBZ |  Entry Type  |
> + * ----------------------------------------------
> + *
> + * Address: Bits [39:12] of a physical page address. Bits [11:0] are
> + *	    always zero.
> + *
> + * Entry type:
> + *	b00 - Reserved.
> + *	b01 - Last entry in the tables, points to 4K page buffer.
> + *	b10 - Normal entry, points to 4K page buffer.
> + *	b11 - Link. The address points to the base of next table.
> + */
> +
> +typedef u32 sgte_t;
> +
> +#define ETR_SG_PAGE_SHIFT		12
> +#define ETR_SG_PAGE_SIZE		(1UL << ETR_SG_PAGE_SHIFT)
> +#define ETR_SG_PAGES_PER_SYSPAGE	(1UL << \
> +					 (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))

I think this would be slightly easier to understand if defined as:
"(PAGE_SIZE / ETR_SG_PAGE_SIZE)".

> +#define ETR_SG_PTRS_PER_PAGE		(ETR_SG_PAGE_SIZE / sizeof(sgte_t))
> +#define ETR_SG_PTRS_PER_SYSPAGE		(PAGE_SIZE / sizeof(sgte_t))
> +
> +#define ETR_SG_ET_MASK			0x3
> +#define ETR_SG_ET_LAST			0x1
> +#define ETR_SG_ET_NORMAL		0x2
> +#define ETR_SG_ET_LINK			0x3
> +
> +#define ETR_SG_ADDR_SHIFT		4
> +
> +#define ETR_SG_ENTRY(addr, type) \
> +	(sgte_t)((((addr) >> ETR_SG_PAGE_SHIFT) << ETR_SG_ADDR_SHIFT) | \
> +		 (type & ETR_SG_ET_MASK))
> +
> +#define ETR_SG_ADDR(entry) \
> +	(((dma_addr_t)(entry) >> ETR_SG_ADDR_SHIFT) << ETR_SG_PAGE_SHIFT)
> +#define ETR_SG_ET(entry)		((entry) & ETR_SG_ET_MASK)
> +
> +/*
> + * struct etr_sg_table : ETR SG Table
> + * @sg_table:		Generic SG Table holding the data/table pages.
> + * @hwaddr:		hwaddress used by the TMC, which is the base
> + *			address of the table.
> + */
> +struct etr_sg_table {
> +	struct tmc_sg_table	*sg_table;
> +	dma_addr_t		hwaddr;
> +};
> +
> +/*
> + * tmc_etr_sg_table_entries: Total number of table entries required to map
> + * @nr_pages system pages.
> + *
> + * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
> + * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
> + * with the last entry pointing to the page containing the table
> + * entries. If we spill over to a new page for mapping 1 entry,
> + * we could as well replace the link entry of the previous page
> + * with the last entry.
> + */
> +static inline unsigned long __attribute_const__
> +tmc_etr_sg_table_entries(int nr_pages)
> +{
> +	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
> +	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
> +	/*
> +	 * If we spill over to a new page for 1 entry, we could as well
> +	 * make it the LAST entry in the previous page, skipping the Link
> +	 * address.
> +	 */
> +	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
> +		nr_sglinks--;
> +	return nr_sgpages + nr_sglinks;
> +}
> +
> +/*
>    * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
>    * and map @phys_addr to an offset within the buffer.
>    */
> @@ -307,6 +390,179 @@ ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
>   	return len;
>   }
>   
> +#ifdef ETR_SG_DEBUG
> +/* Map a dma address to virtual address */
> +static unsigned long
> +tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
> +			dma_addr_t addr, bool table)
> +{
> +	long offset;
> +	unsigned long base;
> +	struct tmc_pages *tmc_pages;
> +
> +	if (table) {
> +		tmc_pages = &sg_table->table_pages;
> +		base = (unsigned long)sg_table->table_vaddr;
> +	} else {
> +		tmc_pages = &sg_table->data_pages;
> +		base = (unsigned long)sg_table->data_vaddr;
> +	}
> +
> +	offset = tmc_pages_get_offset(tmc_pages, addr);
> +	if (offset < 0)
> +		return 0;
> +	return base + offset;
> +}
> +
> +/* Dump the given sg_table */
> +static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
> +{
> +	sgte_t *ptr;
> +	int i = 0;
> +	dma_addr_t addr;
> +	struct tmc_sg_table *sg_table = etr_table->sg_table;
> +
> +	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
> +					      etr_table->hwaddr, true);
> +	while (ptr) {
> +		addr = ETR_SG_ADDR(*ptr);
> +		switch (ETR_SG_ET(*ptr)) {
> +		case ETR_SG_ET_NORMAL:
> +			pr_debug("%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
> +			ptr++;
> +			break;
> +		case ETR_SG_ET_LINK:
> +			pr_debug("%05d: *** %p\t:{L} 0x%llx ***\n",
> +				 i, ptr, addr);
> +			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
> +							      addr, true);
> +			break;
> +		case ETR_SG_ET_LAST:
> +			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
> +				 i, ptr, addr);
> +			return;

I get that this is debug code, but it seems like if ETR_SG_ET(*ptr) is 0 we
get stuck in an infinite loop. I guess it is something that supposedly
doesn't happen; still, I'd prefer having a default case saying the table
might be corrupted and either incrementing ptr to try and get more info
or breaking out of the loop.
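
Something along these lines (untested, just to illustrate):

	default:
		pr_debug("%05d: %p\t: Unexpected entry type 0x%x, table corrupted?\n",
			 i, ptr, *ptr);
		return;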

> +		}
> +		i++;
> +	}
> +	pr_debug("******* End of Table *****\n");
> +}
> +#endif
> +
> +/*
> + * Populate the SG Table page table entries from table/data
> + * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
> + * So does a Table page. So we keep track of indices of the tables
> + * in each system page and move the pointers accordingly.
> + */
> +#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)

Needs more parentheses around idx and size.
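
Something like (untested):

	#define INC_IDX_ROUND(idx, size) ((idx) = ((idx) + 1) % (size))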

> +static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
> +{
> +	dma_addr_t paddr;
> +	int i, type, nr_entries;
> +	int tpidx = 0; /* index to the current system table_page */
> +	int sgtidx = 0;	/* index to the sg_table within the current syspage */
> +	int sgtoffset = 0; /* offset to the next entry within the sg_table */

That's misleading; this seems to be the index of an entry within an
ETR_SG_PAGE rather than an offset in bytes.

Maybe ptridx or entryidx would be a better name.

> +	int dpidx = 0; /* index to the current system data_page */
> +	int spidx = 0; /* index to the SG page within the current data page */
> +	sgte_t *ptr; /* pointer to the table entry to fill */
> +	struct tmc_sg_table *sg_table = etr_table->sg_table;
> +	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
> +	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
> +
> +	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
> +	/*
> +	 * Use the contiguous virtual address of the table to update entries.
> +	 */
> +	ptr = sg_table->table_vaddr;
> +	/*
> +	 * Fill all the entries, except the last entry to avoid special
> +	 * checks within the loop.
> +	 */
> +	for (i = 0; i < nr_entries - 1; i++) {
> +		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
> +			/*
> +			 * Last entry in a sg_table page is a link address to
> +			 * the next table page. If this sg_table is the last
> +			 * one in the system page, it links to the first
> +			 * sg_table in the next system page. Otherwise, it
> +			 * links to the next sg_table page within the system
> +			 * page.
> +			 */
> +			if (sgtidx == ETR_SG_PAGES_PER_SYSPAGE - 1) {
> +				paddr = table_daddrs[tpidx + 1];
> +			} else {
> +				paddr = table_daddrs[tpidx] +
> +					(ETR_SG_PAGE_SIZE * (sgtidx + 1));
> +			}
> +			type = ETR_SG_ET_LINK;
> +		} else {
> +			/*
> +			 * Update the idices to the data_pages to point to the

nit: indices

Cheers,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-10-19 17:15 ` [PATCH 06/17] coresight: tmc: Make ETR SG table circular Suzuki K Poulose
@ 2017-10-20 17:11   ` Julien Thierry
  2017-11-01 10:12     ` Suzuki K Poulose
  2017-11-01 23:47   ` Mathieu Poirier
  2017-11-06 19:07   ` Mathieu Poirier
  2 siblings, 1 reply; 56+ messages in thread
From: Julien Thierry @ 2017-10-20 17:11 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier

Hi Suzuki,

On 19/10/17 18:15, Suzuki K Poulose wrote:
> Make the ETR SG table Circular buffer so that we could start
> at any of the SG pages and use the entire buffer for tracing.
> This can be achieved by :
> 
> 1) Keeping an additional LINK pointer at the very end of the
> SG table, i.e, after the LAST buffer entry, to point back to
> the beginning of the first table. This will allow us to use
> the buffer normally when we start the trace at offset 0 of
> the buffer, as the LAST buffer entry hints the TMC-ETR and
> it automatically wraps to the offset 0.
> 
> 2) If we want to start at any other ETR SG page aligned offset,
> we could :
>   a) Make the preceding page entry as LAST entry.
>   b) Make the original LAST entry a normal entry.
>   c) Use the table pointer to the "new" start offset as the
>      base of the table address.
> This works as the TMC doesn't mandate that the page table
> base address should be 4K page aligned.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
>   1 file changed, 139 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 4424eb67a54c..c171b244e12a 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c

[...]

> @@ -519,6 +534,107 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>   	/* Set up the last entry, which is always a data pointer */
>   	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
>   	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
> +	/* followed by a circular link, back to the start of the table */
> +	*ptr++ = ETR_SG_ENTRY(sg_table->table_daddr, ETR_SG_ET_LINK);
> +}
> +
> +/*
> + * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
> + * to the index of the page table "entry". Data pointers always have
> + * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
> + * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE -1) entries.
> + */
> +static inline u32
> +tmc_etr_sg_offset_to_table_index(u64 offset)
> +{
> +	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
> +
> +	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
> +}
> +
> +/*
> + * tmc_etr_sg_update_type: Update the type of a given entry in the
> + * table to the requested entry. This is only used for data buffers
> + * to toggle the "NORMAL" vs "LAST" buffer entries.
> + */
> +static inline void tmc_etr_sg_update_type(sgte_t *entry, u32 type)
> +{
> +	WARN_ON(ETR_SG_ET(*entry) == ETR_SG_ET_LINK);
> +	WARN_ON(!ETR_SG_ET(*entry));
> +	*entry &= ~ETR_SG_ET_MASK;
> +	*entry |= type;
> +}
> +
> +/*
> + * tmc_etr_sg_table_index_to_daddr: Return the hardware address to the table
> + * entry @index. Use this address to let the table begin @index.
> + */
> +static inline dma_addr_t
> +tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
> +{
> +	u32 sys_page_idx = index / ETR_SG_PTRS_PER_SYSPAGE;
> +	u32 sys_page_offset = index % ETR_SG_PTRS_PER_SYSPAGE;
> +	sgte_t *ptr;
> +
> +	ptr = (sgte_t *)sg_table->table_pages.daddrs[sys_page_idx];
> +	return (dma_addr_t)&ptr[sys_page_offset];
> +}
> +
> +/*
> + * tmc_etr_sg_table_rotate : Rotate the SG circular buffer, moving
> + * the "base" to a requested offset. We do so by :
> + *
> + * 1) Reset the current LAST buffer.
> + * 2) Mark the "previous" buffer in the table to the "base" as LAST.
> + * 3) Update the hwaddr to point to the table pointer for the buffer
> + *    which starts at "base".
> + */
> +static int __maybe_unused
> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
> +{
> +	u32 last_entry, first_entry;
> +	u64 last_offset;
> +	struct tmc_sg_table *sg_table = etr_table->sg_table;
> +	sgte_t *table_ptr = sg_table->table_vaddr;
> +	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
> +
> +	/* Offset should always be SG PAGE_SIZE aligned */
> +	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
> +		pr_debug("unaligned base offset %llx\n", base_offset);
> +		return -EINVAL;
> +	}
> +	/* Make sure the offset is within the range */
> +	if (base_offset < 0 || base_offset > buf_size) {

base_offset is unsigned, so the left operand of the '||' is useless 
(would've expected the compiler to emit a warning for this).

> +		base_offset = (base_offset + buf_size) % buf_size;
> +		pr_debug("Resetting offset to %llx\n", base_offset);
> +	}
> +	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
> +	if (first_entry == etr_table->first_entry) {
> +		pr_debug("Head is already at %llx, skipping\n", base_offset);
> +		return 0;
> +	}
> +
> +	/* Last entry should be the previous one to the new "base" */
> +	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
> +	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
> +
> +	/* Reset the current Last page to Normal and new Last page to NORMAL */

Current Last page to NORMAL and new Last page to LAST?

Cheers,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 03/17] coresight: Add helper for inserting synchronization packets
  2017-10-19 17:15 ` [PATCH 03/17] coresight: Add helper for inserting synchronization packets Suzuki K Poulose
@ 2017-10-30 21:44   ` Mathieu Poirier
  2017-11-01 10:01     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-10-30 21:44 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:39PM +0100, Suzuki K Poulose wrote:
> Right now we open code filling the trace buffer with synchronization
> packets when the circular buffer wraps around in different drivers.
> Move this to a common place.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-etb10.c   | 10 +++------
>  drivers/hwtracing/coresight/coresight-priv.h    |  8 ++++++++
>  drivers/hwtracing/coresight/coresight-tmc-etf.c | 27 ++++++++-----------------
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 +-----------
>  4 files changed, 20 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
> index 56ecd7aff5eb..d7164ab8e229 100644
> --- a/drivers/hwtracing/coresight/coresight-etb10.c
> +++ b/drivers/hwtracing/coresight/coresight-etb10.c
> @@ -203,7 +203,6 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
>  	bool lost = false;
>  	int i;
>  	u8 *buf_ptr;
> -	const u32 *barrier;
>  	u32 read_data, depth;
>  	u32 read_ptr, write_ptr;
>  	u32 frame_off, frame_endoff;
> @@ -234,19 +233,16 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
>  
>  	depth = drvdata->buffer_depth;
>  	buf_ptr = drvdata->buf;
> -	barrier = barrier_pkt;
>  	for (i = 0; i < depth; i++) {
>  		read_data = readl_relaxed(drvdata->base +
>  					  ETB_RAM_READ_DATA_REG);
> -		if (lost && *barrier) {
> -			read_data = *barrier;
> -			barrier++;
> -		}
> -
>  		*(u32 *)buf_ptr = read_data;
>  		buf_ptr += 4;
>  	}
>  
> +	if (lost)
> +		coresight_insert_barrier_packet(drvdata->buf);
> +
>  	if (frame_off) {
>  		buf_ptr -= (frame_endoff * 4);
>  		for (i = 0; i < frame_endoff; i++) {
> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> index f1d0e21d8cab..d12f64928c00 100644
> --- a/drivers/hwtracing/coresight/coresight-priv.h
> +++ b/drivers/hwtracing/coresight/coresight-priv.h
> @@ -65,6 +65,7 @@ static DEVICE_ATTR_RO(name)
>  	__coresight_simple_func(type, NULL, name, lo_off, hi_off)
>  
>  extern const u32 barrier_pkt[5];
> +#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt) - sizeof(u32))

When using a memcpy() there is no need to have a 0x0 at the end of the
barrier_pkt array.  As such I suggest you remove that and simply use sizeof()
in coresight_insert_barrier_packet().
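
For reference, a minimal sketch of that suggestion (hypothetical, not what this
patch currently does): drop the trailing zero and let sizeof() cover the whole
packet.

	extern const u32 barrier_pkt[4];

	static inline void coresight_insert_barrier_packet(void *buf)
	{
		if (buf)
			memcpy(buf, barrier_pkt, sizeof(barrier_pkt));
	}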

I'll review the rest of your patches tomorrow.

>  
>  enum etm_addr_type {
>  	ETM_ADDR_TYPE_NONE,
> @@ -98,6 +99,13 @@ struct cs_buffers {
>  	void			**data_pages;
>  };
>  
> +static inline void coresight_insert_barrier_packet(void *buf)
> +{
> +	if (buf)
> +		memcpy(buf, barrier_pkt, CORESIGHT_BARRIER_PKT_SIZE);
> +}
> +
> +
>  static inline void CS_LOCK(void __iomem *addr)
>  {
>  	do {
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index 0b6f1eb746de..d89bfb3042a2 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -43,39 +43,28 @@ static void tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
>  
>  static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
>  {
> -	bool lost = false;
>  	char *bufp;
> -	const u32 *barrier;
> -	u32 read_data, status;
> +	u32 read_data, lost;
>  	int i;
>  
> -	/*
> -	 * Get a hold of the status register and see if a wrap around
> -	 * has occurred.
> -	 */
> -	status = readl_relaxed(drvdata->base + TMC_STS);
> -	if (status & TMC_STS_FULL)
> -		lost = true;
> -
> +	/* Check if the buffer was wrapped around. */
> +	lost = readl_relaxed(drvdata->base + TMC_STS) & TMC_STS_FULL;
>  	bufp = drvdata->buf;
>  	drvdata->len = 0;
> -	barrier = barrier_pkt;
>  	while (1) {
>  		for (i = 0; i < drvdata->memwidth; i++) {
>  			read_data = readl_relaxed(drvdata->base + TMC_RRD);
>  			if (read_data == 0xFFFFFFFF)
> -				return;
> -
> -			if (lost && *barrier) {
> -				read_data = *barrier;
> -				barrier++;
> -			}
> -
> +				goto done;
>  			memcpy(bufp, &read_data, 4);
>  			bufp += 4;
>  			drvdata->len += 4;
>  		}
>  	}
> +done:
> +	if (lost)
> +		coresight_insert_barrier_packet(drvdata->buf);
> +	return;
>  }
>  
>  static void tmc_etb_disable_hw(struct tmc_drvdata *drvdata)
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 063f253f1c99..41535fa6b6cf 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -104,9 +104,7 @@ ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
>  
>  static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
>  {
> -	const u32 *barrier;
>  	u32 val;
> -	u32 *temp;
>  	u64 rwp;
>  
>  	rwp = tmc_read_rwp(drvdata);
> @@ -119,16 +117,7 @@ static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
>  	if (val & TMC_STS_FULL) {
>  		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
>  		drvdata->len = drvdata->size;
> -
> -		barrier = barrier_pkt;
> -		temp = (u32 *)drvdata->buf;
> -
> -		while (*barrier) {
> -			*temp = *barrier;
> -			temp++;
> -			barrier++;
> -		}
> -
> +		coresight_insert_barrier_packet(drvdata->buf);
>  	} else {
>  		drvdata->buf = drvdata->vaddr;
>  		drvdata->len = rwp - drvdata->paddr;
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 04/17] coresight: Add generic TMC sg table framework
  2017-10-19 17:15 ` [PATCH 04/17] coresight: Add generic TMC sg table framework Suzuki K Poulose
@ 2017-10-31 22:13   ` Mathieu Poirier
  2017-11-01 10:09     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-10-31 22:13 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach,
	coresight, Mathieu Poirier

On Thu, Oct 19, 2017 at 06:15:40PM +0100, Suzuki K Poulose wrote:
> This patch introduces a generic sg table data structure and
> associated operations. An SG table can be used to map a set
> of Data pages where the trace data could be stored by the TMC
> ETR. The information about the data pages could be stored in
> different formats, depending on the type of the underlying
> SG mechanism (e.g, TMC ETR SG vs Coresight CATU). The generic
> structure provides book keeping of the pages used for the data
> as well as the table contents. The table should be filled by
> the user of the infrastructure.
> 
> A table can be created by specifying the number of data pages
> as well as the number of table pages required to hold the
> pointers, where the latter could be different for different
> types of tables. The pages are mapped in the appropriate dma
> data direction mode (i.e, DMA_TO_DEVICE for table pages
> and DMA_FROM_DEVICE for data pages).  The framework can optionally
> accept a set of allocated data pages (e.g, perf ring buffer) and
> map them accordingly. The table and data pages are vmap'ed to allow
> easier access by the drivers. The framework also provides helpers to
> sync the data written to the pages with appropriate directions.
> 
> This will be later used by the TMC ETR SG unit.
> 
> Cc: Mathieu Poirier <matheiu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 289 +++++++++++++++++++++++-
>  drivers/hwtracing/coresight/coresight-tmc.h     |  44 ++++
>  2 files changed, 332 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 41535fa6b6cf..4b9e2b276122 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -16,10 +16,297 @@
>   */
>  
>  #include <linux/coresight.h>
> -#include <linux/dma-mapping.h>
> +#include <linux/slab.h>
>  #include "coresight-priv.h"
>  #include "coresight-tmc.h"
>  
> +/*
> + * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
> + * and map @phys_addr to an offset within the buffer.

Did you mean "... map @addr"?  It might also be worth it to explicitly mention
that it maps a physical address to an offset in the contiguous range.
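
Something along these lines, perhaps (a sketch of the reworded comment only):

	/*
	 * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
	 * and map @addr, the DMA/physical address of a page, to an offset
	 * within the contiguous buffer covered by tmc_pages.
	 */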

> + */
> +static long
> +tmc_pages_get_offset(struct tmc_pages *tmc_pages, dma_addr_t addr)
> +{
> +	int i;
> +	dma_addr_t page_start;
> +
> +	for (i = 0; i < tmc_pages->nr_pages; i++) {
> +		page_start = tmc_pages->daddrs[i];
> +		if (addr >= page_start && addr < (page_start + PAGE_SIZE))
> +			return i * PAGE_SIZE + (addr - page_start);
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +/*
> + * tmc_pages_free : Unmap and free the pages used by tmc_pages.
> + */
> +static void tmc_pages_free(struct tmc_pages *tmc_pages,
> +			   struct device *dev, enum dma_data_direction dir)
> +{
> +	int i;
> +
> +	for (i = 0; i < tmc_pages->nr_pages; i++) {
> +		if (tmc_pages->daddrs && tmc_pages->daddrs[i])
> +			dma_unmap_page(dev, tmc_pages->daddrs[i],
> +					 PAGE_SIZE, dir);
> +		if (tmc_pages->pages && tmc_pages->pages[i])
> +			__free_page(tmc_pages->pages[i]);
> +	}
> +
> +	kfree(tmc_pages->pages);
> +	kfree(tmc_pages->daddrs);
> +	tmc_pages->pages = NULL;
> +	tmc_pages->daddrs = NULL;
> +	tmc_pages->nr_pages = 0;
> +}
> +
> +/*
> + * tmc_pages_alloc : Allocate and map pages for a given @tmc_pages.
> + * If @pages is not NULL, the list of page virtual addresses are
> + * used as the data pages. The pages are then dma_map'ed for @dev
> + * with dma_direction @dir.
> + *
> + * Returns 0 upon success, else the error number.
> + */
> +static int tmc_pages_alloc(struct tmc_pages *tmc_pages,
> +			   struct device *dev, int node,
> +			   enum dma_data_direction dir, void **pages)
> +{
> +	int i, nr_pages;
> +	dma_addr_t paddr;
> +	struct page *page;
> +
> +	nr_pages = tmc_pages->nr_pages;
> +	tmc_pages->daddrs = kcalloc(nr_pages, sizeof(*tmc_pages->daddrs),
> +					 GFP_KERNEL);
> +	if (!tmc_pages->daddrs)
> +		return -ENOMEM;
> +	tmc_pages->pages = kcalloc(nr_pages, sizeof(*tmc_pages->pages),
> +					 GFP_KERNEL);
> +	if (!tmc_pages->pages) {
> +		kfree(tmc_pages->daddrs);
> +		tmc_pages->daddrs = NULL;
> +		return -ENOMEM;
> +	}
> +
> +	for (i = 0; i < nr_pages; i++) {
> +		if (pages && pages[i]) {
> +			page = virt_to_page(pages[i]);
> +			get_page(page);
> +		} else {
> +			page = alloc_pages_node(node,
> +						GFP_KERNEL | __GFP_ZERO, 0);
> +		}
> +		paddr = dma_map_page(dev, page, 0, PAGE_SIZE, dir);
> +		if (dma_mapping_error(dev, paddr))
> +			goto err;
> +		tmc_pages->daddrs[i] = paddr;
> +		tmc_pages->pages[i] = page;
> +	}
> +	return 0;
> +err:
> +	tmc_pages_free(tmc_pages, dev, dir);
> +	return -ENOMEM;
> +}
> +
> +static inline dma_addr_t tmc_sg_table_base_paddr(struct tmc_sg_table *sg_table)
> +{
> +	if (WARN_ON(!sg_table->data_pages.pages[0]))
> +		return 0;
> +	return sg_table->table_daddr;
> +}
> +
> +static inline void *tmc_sg_table_base_vaddr(struct tmc_sg_table *sg_table)
> +{
> +	if (WARN_ON(!sg_table->data_pages.pages[0]))
> +		return NULL;
> +	return sg_table->table_vaddr;
> +}
> +
> +static inline void *
> +tmc_sg_table_data_vaddr(struct tmc_sg_table *sg_table)
> +{
> +	if (WARN_ON(!sg_table->data_pages.nr_pages))
> +		return 0;
> +	return sg_table->data_vaddr;
> +}
> +
> +static inline unsigned long
> +tmc_sg_table_buf_size(struct tmc_sg_table *sg_table)
> +{
> +	return sg_table->data_pages.nr_pages << PAGE_SHIFT;
> +}
> +
> +static inline long
> +tmc_sg_get_data_page_offset(struct tmc_sg_table *sg_table, dma_addr_t addr)
> +{
> +	return tmc_pages_get_offset(&sg_table->data_pages, addr);
> +}
> +
> +static inline void tmc_free_table_pages(struct tmc_sg_table *sg_table)
> +{
> +	if (sg_table->table_vaddr)
> +		vunmap(sg_table->table_vaddr);
> +	tmc_pages_free(&sg_table->table_pages, sg_table->dev, DMA_TO_DEVICE);
> +}
> +
> +static void tmc_free_data_pages(struct tmc_sg_table *sg_table)
> +{
> +	if (sg_table->data_vaddr)
> +		vunmap(sg_table->data_vaddr);
> +	tmc_pages_free(&sg_table->data_pages, sg_table->dev, DMA_FROM_DEVICE);
> +}
> +
> +void tmc_free_sg_table(struct tmc_sg_table *sg_table)
> +{
> +	tmc_free_table_pages(sg_table);
> +	tmc_free_data_pages(sg_table);
> +}
> +
> +/*
> + * Alloc pages for the table. Since this will be used by the device,
> + * allocate the pages closer to the device (i.e, dev_to_node(dev)
> + * rather than the CPU node).
> + */
> +static int tmc_alloc_table_pages(struct tmc_sg_table *sg_table)
> +{
> +	int rc;
> +	struct tmc_pages *table_pages = &sg_table->table_pages;
> +
> +	rc = tmc_pages_alloc(table_pages, sg_table->dev,
> +			     dev_to_node(sg_table->dev),
> +			     DMA_TO_DEVICE, NULL);
> +	if (rc)
> +		return rc;
> +	sg_table->table_vaddr = vmap(table_pages->pages,
> +				     table_pages->nr_pages,
> +				     VM_MAP,
> +				     PAGE_KERNEL);
> +	if (!sg_table->table_vaddr)
> +		rc = -ENOMEM;
> +	else
> +		sg_table->table_daddr = table_pages->daddrs[0];
> +	return rc;
> +}
> +
> +static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages)
> +{
> +	int rc;
> +
> +	rc = tmc_pages_alloc(&sg_table->data_pages,
> +			     sg_table->dev, sg_table->node,

Am I missing something very subtle here, or should sg_table->node be the same as
dev_to_node(sg_table->dev)?  If they are the same, both tmc_alloc_table_pages() and
tmc_alloc_data_pages() should use the same construct.  Otherwise please add
a comment to justify the difference.

> +			     DMA_FROM_DEVICE, pages);
> +	if (!rc) {
> +		sg_table->data_vaddr = vmap(sg_table->data_pages.pages,
> +					   sg_table->data_pages.nr_pages,
> +					   VM_MAP,
> +					   PAGE_KERNEL);
> +		if (!sg_table->data_vaddr)
> +			rc = -ENOMEM;
> +	}
> +	return rc;
> +}
> +
> +/*
> + * tmc_alloc_sg_table: Allocate and setup dma pages for the TMC SG table
> + * and data buffers. TMC writes to the data buffers and reads from the SG
> + * Table pages.
> + *
> + * @dev		- Device to which page should be DMA mapped.
> + * @node	- Numa node for mem allocations
> + * @nr_tpages	- Number of pages for the table entries.
> + * @nr_dpages	- Number of pages for Data buffer.
> + * @pages	- Optional list of virtual address of pages.
> + */
> +struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
> +					int node,
> +					int nr_tpages,
> +					int nr_dpages,
> +					void **pages)
> +{
> +	long rc;
> +	struct tmc_sg_table *sg_table;
> +
> +	sg_table = kzalloc(sizeof(*sg_table), GFP_KERNEL);
> +	if (!sg_table)
> +		return ERR_PTR(-ENOMEM);
> +	sg_table->data_pages.nr_pages = nr_dpages;
> +	sg_table->table_pages.nr_pages = nr_tpages;
> +	sg_table->node = node;
> +	sg_table->dev = dev;
> +
> +	rc  = tmc_alloc_data_pages(sg_table, pages);
> +	if (!rc)
> +		rc = tmc_alloc_table_pages(sg_table);
> +	if (rc) {
> +		tmc_free_sg_table(sg_table);
> +		kfree(sg_table);
> +		return ERR_PTR(rc);
> +	}
> +
> +	return sg_table;
> +}
> +
> +/*
> + * tmc_sg_table_sync_data_range: Sync the data buffer written
> + * by the device from @offset upto a @size bytes.
> + */
> +void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
> +				  u64 offset, u64 size)
> +{
> +	int i, index, start;
> +	int npages = DIV_ROUND_UP(size, PAGE_SIZE);
> +	struct device *dev = table->dev;
> +	struct tmc_pages *data = &table->data_pages;
> +
> +	start = offset >> PAGE_SHIFT;
> +	for (i = start; i < (start + npages); i++) {
> +		index = i % data->nr_pages;
> +		dma_sync_single_for_cpu(dev, data->daddrs[index],
> +					PAGE_SIZE, DMA_FROM_DEVICE);
> +	}
> +}
> +
> +/* tmc_sg_sync_table: Sync the page table */
> +void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table)
> +{
> +	int i;
> +	struct device *dev = sg_table->dev;
> +	struct tmc_pages *table_pages = &sg_table->table_pages;
> +
> +	for (i = 0; i < table_pages->nr_pages; i++)
> +		dma_sync_single_for_device(dev, table_pages->daddrs[i],
> +					   PAGE_SIZE, DMA_TO_DEVICE);
> +}
> +
> +/*
> + * tmc_sg_table_get_data: Get the buffer pointer for data @offset
> + * in the SG buffer. The @bufpp is updated to point to the buffer.
> + * Returns :
> + *	the length of linear data available at @offset.
> + *	or
> + *	<= 0 if no data is available.
> + */
> +ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
> +				u64 offset, size_t len, char **bufpp)
> +{
> +	size_t size;
> +	int pg_idx = offset >> PAGE_SHIFT;
> +	int pg_offset = offset & (PAGE_SIZE - 1);
> +	struct tmc_pages *data_pages = &sg_table->data_pages;
> +
> +	size = tmc_sg_table_buf_size(sg_table);
> +	if (offset >= size)
> +		return -EINVAL;
> +	len = (len < (size - offset)) ? len : size - offset;
> +	len = (len < (PAGE_SIZE - pg_offset)) ? len : (PAGE_SIZE - pg_offset);
> +	if (len > 0)
> +		*bufpp = page_address(data_pages->pages[pg_idx]) + pg_offset;
> +	return len;
> +}
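
As a usage illustration only (hypothetical caller, not part of this patch), the
helper is meant to be called in a loop, advancing by the linear length returned
on each iteration:

	char *bufp;
	ssize_t len;
	u64 offset = 0;
	u64 size = tmc_sg_table_buf_size(sg_table);

	while (offset < size) {
		len = tmc_sg_table_get_data(sg_table, offset, size - offset, &bufp);
		if (len <= 0)
			break;
		/* consume 'len' bytes of linear trace data at bufp */
		offset += len;
	}
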
> +
>  static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>  {
>  	u32 axictl, sts;
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 6deb3afe9db8..5e49c035a1ac 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -19,6 +19,7 @@
>  #define _CORESIGHT_TMC_H
>  
>  #include <linux/miscdevice.h>
> +#include <linux/dma-mapping.h>
>  
>  #define TMC_RSZ			0x004
>  #define TMC_STS			0x00c
> @@ -171,6 +172,38 @@ struct tmc_drvdata {
>  	u32			etr_caps;
>  };
>  
> +/**
> + * struct tmc_pages - Collection of pages used for SG.
> + * @nr_pages:		Number of pages in the list.
> + * @daddr:		DMA'able page address returned by dma_map_page().
> + * @vaddr:		Virtual address returned by page_address().

This isn't accurate.

> + */
> +struct tmc_pages {
> +	int nr_pages;
> +	dma_addr_t	*daddrs;
> +	struct page	**pages;
> +};
> +
> +/*
> + * struct tmc_sg_table : Generic SG table for TMC

Use a '-' as above or fix the above to be ':'.  I don't mind which is used as
long as they are the same.

> + * @dev:		Device for DMA allocations
> + * @table_vaddr:	Contiguous Virtual address for PageTable
> + * @data_vaddr:		Contiguous Virtual address for Data Buffer
> + * @table_daddr:	DMA address of the PageTable base
> + * @node:		Node for Page allocations
> + * @table_pages:	List of pages & dma address for Table
> + * @data_pages:		List of pages & dma address for Data
> + */
> +struct tmc_sg_table {
> +	struct device *dev;
> +	void *table_vaddr;
> +	void *data_vaddr;
> +	dma_addr_t table_daddr;
> +	int node;
> +	struct tmc_pages table_pages;
> +	struct tmc_pages data_pages;
> +};
> +
>  /* Generic functions */
>  void tmc_wait_for_tmcready(struct tmc_drvdata *drvdata);
>  void tmc_flush_and_stop(struct tmc_drvdata *drvdata);
> @@ -226,4 +259,15 @@ static inline bool tmc_etr_has_cap(struct tmc_drvdata *drvdata, u32 cap)
>  	return !!(drvdata->etr_caps & cap);
>  }
>  
> +struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
> +					int node,
> +					int nr_tpages,
> +					int nr_dpages,
> +					void **pages);
> +void tmc_free_sg_table(struct tmc_sg_table *sg_table);
> +void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table);
> +void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
> +				  u64 offset, u64 size);
> +ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
> +			      u64 offset, size_t len, char **bufpp);
>  #endif

I like this implementation, much cleaner than what I previously had.

> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read
  2017-10-20 12:34   ` Julien Thierry
@ 2017-11-01  9:55     ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-01  9:55 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel
  Cc: mathieu.poirier, coresight, linux-kernel, rob.walker, mike.leach

On 20/10/17 13:34, Julien Thierry wrote:
> Hi Suzuki,
> 
> On 19/10/17 18:15, Suzuki K Poulose wrote:
>> At the moment we adjust the buffer pointers for reading the trace
>> data via misc device in the common code for ETF/ETB and ETR. Since
>> we are going to change how we manage the buffer for ETR, let us
>> move the buffer manipulation to the respective driver files, hiding
>> it from the common code. We do so by adding type specific helpers
>> for finding the length of data and the pointer to the buffer,
>> for a given length at a file position.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-etf.c | 16 ++++++++++++
>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 33 ++++++++++++++++++++++++
>>   drivers/hwtracing/coresight/coresight-tmc.c     | 34 ++++++++++++++-----------
>>   drivers/hwtracing/coresight/coresight-tmc.h     |  4 +++
>>   4 files changed, 72 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
>> index e2513b786242..0b6f1eb746de 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
>> @@ -120,6 +120,22 @@ static void tmc_etf_disable_hw(struct tmc_drvdata *drvdata)
>>       CS_LOCK(drvdata->base);
>>   }
>> +/*
>> + * Return the available trace data in the buffer from @pos, with
>> + * a maximum limit of @len, updating the @bufpp on where to
>> + * find it.
>> + */
>> +ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
>> +                  loff_t pos, size_t len, char **bufpp)
>> +{
>> +    /* Adjust the len to available size @pos */
>> +    if (pos + len > drvdata->len)
>> +        len = drvdata->len - pos;
>> +    if (len > 0)
> 
> Do we have some guarantee that "pos <= drvdata->len"? Since len is unsigned, this check only covers the case where len is 0.
> 
> Maybe it would be better to use a signed variable to store the result of the difference.
> 
>> +        *bufpp = drvdata->buf + pos;


>> + * Return the available trace data in the buffer @pos, with a maximum
>> + * limit of @len, also updating the @bufpp on where to find it.
>> + */
>> +ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
>> +                loff_t pos, size_t len, char **bufpp)
>> +{
>> +    char *bufp = drvdata->buf + pos;
>> +    char *bufend = (char *)(drvdata->vaddr + drvdata->size);
>> +
>> +    /* Adjust the len to available size @pos */
>> +    if (pos + len > drvdata->len)
>> +        len = drvdata->len - pos;
>> +
>> +    if (len <= 0)
>> +        return len;
> 
> Similar issue here.

Thanks for spotting. I will fix it.

Cheers

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 03/17] coresight: Add helper for inserting synchronization packets
  2017-10-30 21:44   ` Mathieu Poirier
@ 2017-11-01 10:01     ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-01 10:01 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On 30/10/17 21:44, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:39PM +0100, Suzuki K Poulose wrote:
>> Right now we open code filling the trace buffer with synchronization
>> packets when the circular buffer wraps around in different drivers.
>> Move this to a common place.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Mike Leach <mike.leach@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-etb10.c   | 10 +++------
>>   drivers/hwtracing/coresight/coresight-priv.h    |  8 ++++++++
>>   drivers/hwtracing/coresight/coresight-tmc-etf.c | 27 ++++++++-----------------
>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 +-----------
>>   4 files changed, 20 insertions(+), 38 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
>> index 56ecd7aff5eb..d7164ab8e229 100644
>> --- a/drivers/hwtracing/coresight/coresight-etb10.c
>> +++ b/drivers/hwtracing/coresight/coresight-etb10.c
>> @@ -203,7 +203,6 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
>>   	bool lost = false;
>>   	int i;
>>   	u8 *buf_ptr;
>> -	const u32 *barrier;
>>   	u32 read_data, depth;
>>   	u32 read_ptr, write_ptr;
>>   	u32 frame_off, frame_endoff;
>> @@ -234,19 +233,16 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
>>   
>>   	depth = drvdata->buffer_depth;
>>   	buf_ptr = drvdata->buf;
>> -	barrier = barrier_pkt;
>>   	for (i = 0; i < depth; i++) {
>>   		read_data = readl_relaxed(drvdata->base +
>>   					  ETB_RAM_READ_DATA_REG);
>> -		if (lost && *barrier) {
>> -			read_data = *barrier;
>> -			barrier++;
>> -		}
>> -
>>   		*(u32 *)buf_ptr = read_data;
>>   		buf_ptr += 4;
>>   	}
>>   
>> +	if (lost)
>> +		coresight_insert_barrier_packet(drvdata->buf);
>> +
>>   	if (frame_off) {
>>   		buf_ptr -= (frame_endoff * 4);
>>   		for (i = 0; i < frame_endoff; i++) {
>> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
>> index f1d0e21d8cab..d12f64928c00 100644
>> --- a/drivers/hwtracing/coresight/coresight-priv.h
>> +++ b/drivers/hwtracing/coresight/coresight-priv.h
>> @@ -65,6 +65,7 @@ static DEVICE_ATTR_RO(name)
>>   	__coresight_simple_func(type, NULL, name, lo_off, hi_off)
>>   
>>   extern const u32 barrier_pkt[5];
>> +#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt) - sizeof(u32))
> 
> When using a memcpy() there is no need to have a 0x0 at the end of the
> barrier_pkt array.  As such I suggest you remove that and simply use sizeof()
> in coresight_insert_barrier_packet().

There is one place where we can't simply do a memcpy(), in tmc_update_etf_buffer(),
where we could potentially move over to the next PAGE while filling the barrier packets.
This is why I didn't trim it off. However, I believe this shouldn't trigger, as the trace
data should always be aligned to the frame size of the TMC and the perf buffer size is
page aligned. So we should be able to use memcpy() in that case too. I will fix it
in the next version.

Thanks
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 04/17] coresight: Add generic TMC sg table framework
  2017-10-31 22:13   ` Mathieu Poirier
@ 2017-11-01 10:09     ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-01 10:09 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, robert.walker, mike.leach,
	coresight, Mathieu Poirier

On 31/10/17 22:13, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:40PM +0100, Suzuki K Poulose wrote:
>> This patch introduces a generic sg table data structure and
>> associated operations. An SG table can be used to map a set
>> of Data pages where the trace data could be stored by the TMC
>> ETR. The information about the data pages could be stored in
>> different formats, depending on the type of the underlying
>> SG mechanism (e.g, TMC ETR SG vs Coresight CATU). The generic
>> structure provides book keeping of the pages used for the data
>> as well as the table contents. The table should be filled by
>> the user of the infrastructure.
>>
>> A table can be created by specifying the number of data pages
>> as well as the number of table pages required to hold the
>> pointers, where the latter could be different for different
>> types of tables. The pages are mapped in the appropriate dma
>> data direction mode (i.e, DMA_TO_DEVICE for table pages
>> and DMA_FROM_DEVICE for data pages).  The framework can optionally
>> accept a set of allocated data pages (e.g, perf ring buffer) and
>> map them accordingly. The table and data pages are vmap'ed to allow
>> easier access by the drivers. The framework also provides helpers to
>> sync the data written to the pages with appropriate directions.
>>
>> This will be later used by the TMC ETR SG unit.
>>
>> Cc: Mathieu Poirier <matheiu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 289 +++++++++++++++++++++++-
>>   drivers/hwtracing/coresight/coresight-tmc.h     |  44 ++++
>>   2 files changed, 332 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> index 41535fa6b6cf..4b9e2b276122 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> @@ -16,10 +16,297 @@
>>    */
>>   
>>   #include <linux/coresight.h>
>> -#include <linux/dma-mapping.h>
>> +#include <linux/slab.h>
>>   #include "coresight-priv.h"
>>   #include "coresight-tmc.h"
>>   
>> +/*
>> + * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
>> + * and map @phys_addr to an offset within the buffer.
> 
> Did you mean "... map @addr"?  It might also be worth it to explicitly mention
> that it maps a physical address to an offset in the contiguous range.

Yes, definitely. I will fix it.


...

>> +/*
>> + * Alloc pages for the table. Since this will be used by the device,
>> + * allocate the pages closer to the device (i.e, dev_to_node(dev)
>> + * rather than the CPU node).
>> + */
>> +static int tmc_alloc_table_pages(struct tmc_sg_table *sg_table)
>> +{
>> +	int rc;
>> +	struct tmc_pages *table_pages = &sg_table->table_pages;
>> +
>> +	rc = tmc_pages_alloc(table_pages, sg_table->dev,
>> +			     dev_to_node(sg_table->dev),
>> +			     DMA_TO_DEVICE, NULL);
>> +	if (rc)
>> +		return rc;
>> +	sg_table->table_vaddr = vmap(table_pages->pages,
>> +				     table_pages->nr_pages,
>> +				     VM_MAP,
>> +				     PAGE_KERNEL);
>> +	if (!sg_table->table_vaddr)
>> +		rc = -ENOMEM;
>> +	else
>> +		sg_table->table_daddr = table_pages->daddrs[0];
>> +	return rc;
>> +}
>> +
>> +static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages)
>> +{
>> +	int rc;
>> +
>> +	rc = tmc_pages_alloc(&sg_table->data_pages,
>> +			     sg_table->dev, sg_table->node,
> 
> Am I missing something very subtle here, or should sg_table->node be the same as
> dev_to_node(sg_table->dev)?  If they are the same, both tmc_alloc_table_pages() and
> tmc_alloc_data_pages() should use the same construct.  Otherwise please add
> a comment to justify the difference.

Yes, it was a last-minute change to switch the table to use dev_to_node(), while the
data pages are allocated as requested by the user. Eventually the user would consume
the data pages (even though the device produces them). However, the table pages are solely
for the consumption of the device, hence the dev_to_node().

I will add a comment to make that explicit.

>>   	u32 axictl, sts;
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
>> index 6deb3afe9db8..5e49c035a1ac 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc.h
>> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
>> @@ -19,6 +19,7 @@
>>   #define _CORESIGHT_TMC_H
>>   
>>   #include <linux/miscdevice.h>
>> +#include <linux/dma-mapping.h>
>>   
>>   #define TMC_RSZ			0x004
>>   #define TMC_STS			0x00c
>> @@ -171,6 +172,38 @@ struct tmc_drvdata {
>>   	u32			etr_caps;
>>   };
>>   
>> +/**
>> + * struct tmc_pages - Collection of pages used for SG.
>> + * @nr_pages:		Number of pages in the list.
>> + * @daddr:		DMA'able page address returned by dma_map_page().
>> + * @vaddr:		Virtual address returned by page_address().
> 
> This isn't accurate.
> 

Yes, I will clean that up. It kind of shows how many revisions this
series has gone through before reaching here ;-)

>> + */
>> +struct tmc_pages {
>> +	int nr_pages;
>> +	dma_addr_t	*daddrs;
>> +	struct page	**pages;
>> +};
>> +
>> +/*
>> + * struct tmc_sg_table : Generic SG table for TMC
> 
> Use a '-' as above or fix the above to be ':'.  I don't mind which is used as
> long as they are the same.
> 

Ok.


>> + * @dev:		Device for DMA allocations
>> + * @table_vaddr:	Contiguous Virtual address for PageTable
>> + * @data_vaddr:		Contiguous Virtual address for Data Buffer
>> + * @table_daddr:	DMA address of the PageTable base
>> + * @node:		Node for Page allocations
>> + * @table_pages:	List of pages & dma address for Table
>> + * @data_pages:		List of pages & dma address for Data
>> + */
>> +struct tmc_sg_table {
>> +	struct device *dev;
>> +	void *table_vaddr;
>> +	void *data_vaddr;
>> +	dma_addr_t table_daddr;
>> +	int node;
>> +	struct tmc_pages table_pages;
>> +	struct tmc_pages data_pages;
>> +};
>> +
>>   /* Generic functions */
>>   void tmc_wait_for_tmcready(struct tmc_drvdata *drvdata);
>>   void tmc_flush_and_stop(struct tmc_drvdata *drvdata);
>> @@ -226,4 +259,15 @@ static inline bool tmc_etr_has_cap(struct tmc_drvdata *drvdata, u32 cap)
>>   	return !!(drvdata->etr_caps & cap);
>>   }
>>   
>> +struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
>> +					int node,
>> +					int nr_tpages,
>> +					int nr_dpages,
>> +					void **pages);
>> +void tmc_free_sg_table(struct tmc_sg_table *sg_table);
>> +void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table);
>> +void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
>> +				  u64 offset, u64 size);
>> +ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
>> +			      u64 offset, size_t len, char **bufpp);
>>   #endif
> 
> I like this implementation, much cleaner than what I previously had.
> 

Thanks for the review !

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 05/17] coresight: Add support for TMC ETR SG unit
  2017-10-20 16:25   ` Julien Thierry
@ 2017-11-01 10:11     ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-01 10:11 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel
  Cc: linux-kernel, rob.walker, mike.leach, coresight, mathieu.poirier


>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> index 4b9e2b276122..4424eb67a54c 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> @@ -21,6 +21,89 @@
>>   #include "coresight-tmc.h"
>>   /*
>> + * The TMC ETR SG has a page size of 4K. The SG table contains pointers
>> + * to 4KB buffers. However, the OS may be use PAGE_SIZE different from
> 
> nit:
> "the OS may use a PAGE_SIZE different from".
> 


>> +#define ETR_SG_PAGE_SHIFT        12
>> +#define ETR_SG_PAGE_SIZE        (1UL << ETR_SG_PAGE_SHIFT)
>> +#define ETR_SG_PAGES_PER_SYSPAGE    (1UL << \
>> +                     (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))
> 
> I think this would be slightly easier to understand if defined as:
> "(PAGE_SIZE / ETR_SG_PAGE_SIZE)".
> 

>> +/* Dump the given sg_table */
>> +static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>> +{
>> +    sgte_t *ptr;
>> +    int i = 0;
>> +    dma_addr_t addr;
>> +    struct tmc_sg_table *sg_table = etr_table->sg_table;
>> +
>> +    ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
>> +                          etr_table->hwaddr, true);
>> +    while (ptr) {
>> +        addr = ETR_SG_ADDR(*ptr);
>> +        switch (ETR_SG_ET(*ptr)) {
>> +        case ETR_SG_ET_NORMAL:
>> +            pr_debug("%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
>> +            ptr++;
>> +            break;
>> +        case ETR_SG_ET_LINK:
>> +            pr_debug("%05d: *** %p\t:{L} 0x%llx ***\n",
>> +                 i, ptr, addr);
>> +            ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
>> +                                  addr, true);
>> +            break;
>> +        case ETR_SG_ET_LAST:
>> +            pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
>> +                 i, ptr, addr);
>> +            return;
> 
> I get this is debug code, but it seems like if ETR_SG_ET(*ptr) is 0 we get stuck in an infinite loop. I guess it is something that supposedly doesn't happen; still, I'd prefer having a default case saying the table might be corrupted and either incrementing ptr to try and get more info or breaking out of the loop.
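
A possible shape for that, sketched here only (reusing the names from the quoted
debug helper):

	default:
		pr_debug("%05d: %p\t: unexpected entry 0x%x, table corrupted?\n",
			 i, ptr, *ptr);
		return;
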
> 

>> +        }
>> +        i++;
>> +    }
>> +    pr_debug("******* End of Table *****\n");
>> +}
>> +#endif
>> +
>> +/*
>> + * Populate the SG Table page table entries from table/data
>> + * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
>> + * So does a Table page. So we keep track of indices of the tables
>> + * in each system page and move the pointers accordingly.
>> + */
>> +#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)
> 
> Needs more parenthesis around idx and size.
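
I.e., something along these lines (sketch):

	#define INC_IDX_ROUND(idx, size)	((idx) = ((idx) + 1) % (size))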
> 

>> +static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>> +{
>> +    dma_addr_t paddr;
>> +    int i, type, nr_entries;
>> +    int tpidx = 0; /* index to the current system table_page */
>> +    int sgtidx = 0;    /* index to the sg_table within the current syspage */
>> +    int sgtoffset = 0; /* offset to the next entry within the sg_table */
> 
> That's misleading, this seems to be the index of an entry within an ETR_SG_PAGE rather than an offset in bytes.
> 
> Maybe ptridx or entryidx would be a better name.

You're right, I have chosen sgtentry for now.

Thanks for the detailed look, I will fix all of them.

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-10-20 17:11   ` Julien Thierry
@ 2017-11-01 10:12     ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-01 10:12 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel
  Cc: linux-kernel, robert.walker, mike.leach, coresight, mathieu.poirier

On 20/10/17 18:11, Julien Thierry wrote:

>> +static int __maybe_unused
>> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
>> +{
>> +    u32 last_entry, first_entry;
>> +    u64 last_offset;
>> +    struct tmc_sg_table *sg_table = etr_table->sg_table;
>> +    sgte_t *table_ptr = sg_table->table_vaddr;
>> +    ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
>> +
>> +    /* Offset should always be SG PAGE_SIZE aligned */
>> +    if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
>> +        pr_debug("unaligned base offset %llx\n", base_offset);
>> +        return -EINVAL;
>> +    }
>> +    /* Make sure the offset is within the range */
>> +    if (base_offset < 0 || base_offset > buf_size) {
> 
> base_offset is unsigned, so the left operand of the '||' is useless (would've expected the compiler to emit a warning for this).
> 
>> +        base_offset = (base_offset + buf_size) % buf_size;
>> +        pr_debug("Resetting offset to %llx\n", base_offset);
>> +    }
>> +    first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
>> +    if (first_entry == etr_table->first_entry) {
>> +        pr_debug("Head is already at %llx, skipping\n", base_offset);
>> +        return 0;
>> +    }
>> +
>> +    /* Last entry should be the previous one to the new "base" */
>> +    last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
>> +    last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
>> +
>> +    /* Reset the current Last page to Normal and new Last page to NORMAL */
> 
> Current Last page to NORMAL and new Last page to LAST?

Thanks again, will fix them

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 05/17] coresight: Add support for TMC ETR SG unit
  2017-10-19 17:15 ` [PATCH 05/17] coresight: Add support for TMC ETR SG unit Suzuki K Poulose
  2017-10-20 16:25   ` Julien Thierry
@ 2017-11-01 20:41   ` Mathieu Poirier
  1 sibling, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-01 20:41 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, Mike Leach, coresight

On 19 October 2017 at 11:15, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> This patch adds support for setting up an SG table used by the
> TMC ETR inbuilt SG unit. The TMC ETR uses 4K page sized tables
> to hold pointers to the 4K data pages with the last entry in a
> table pointing to the next table with the entries, by kind of
> chaining. The 2 LSBs determine the type of the table entry, to
> one of :
>
>  Normal - Points to a 4KB data page.
>  Last   - Points to a 4KB data page, but is the last entry in the
>           page table.
>  Link   - Points to another 4KB table page with pointers to data.
>
> The code takes care of handling the system page size which could
> be different than 4K. So we could end up putting multiple ETR
> SG tables in a single system page, vice versa for the data pages.
>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 256 ++++++++++++++++++++++++
>  1 file changed, 256 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 4b9e2b276122..4424eb67a54c 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -21,6 +21,89 @@
>  #include "coresight-tmc.h"
>
>  /*
> + * The TMC ETR SG has a page size of 4K. The SG table contains pointers
> + * to 4KB buffers. However, the OS may be use PAGE_SIZE different from
> + * 4K (i.e, 16KB or 64KB). This implies that a single OS page could
> + * contain more than one SG buffer and tables.
> + *
> + * A table entry has the following format:
> + *
> + * ---Bit31------------Bit4-------Bit1-----Bit0--
> + * |     Address[39:12]    | SBZ |  Entry Type  |
> + * ----------------------------------------------
> + *
> + * Address: Bits [39:12] of a physical page address. Bits [11:0] are
> + *         always zero.
> + *
> + * Entry type:
> + *     b00 - Reserved.
> + *     b01 - Last entry in the tables, points to 4K page buffer.
> + *     b10 - Normal entry, points to 4K page buffer.
> + *     b11 - Link. The address points to the base of next table.
> + */
> +
> +typedef u32 sgte_t;
> +
> +#define ETR_SG_PAGE_SHIFT              12
> +#define ETR_SG_PAGE_SIZE               (1UL << ETR_SG_PAGE_SHIFT)
> +#define ETR_SG_PAGES_PER_SYSPAGE       (1UL << \
> +                                        (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))
> +#define ETR_SG_PTRS_PER_PAGE           (ETR_SG_PAGE_SIZE / sizeof(sgte_t))
> +#define ETR_SG_PTRS_PER_SYSPAGE                (PAGE_SIZE / sizeof(sgte_t))
> +
> +#define ETR_SG_ET_MASK                 0x3
> +#define ETR_SG_ET_LAST                 0x1
> +#define ETR_SG_ET_NORMAL               0x2
> +#define ETR_SG_ET_LINK                 0x3
> +
> +#define ETR_SG_ADDR_SHIFT              4
> +
> +#define ETR_SG_ENTRY(addr, type) \
> +       (sgte_t)((((addr) >> ETR_SG_PAGE_SHIFT) << ETR_SG_ADDR_SHIFT) | \
> +                (type & ETR_SG_ET_MASK))
> +
> +#define ETR_SG_ADDR(entry) \
> +       (((dma_addr_t)(entry) >> ETR_SG_ADDR_SHIFT) << ETR_SG_PAGE_SHIFT)
> +#define ETR_SG_ET(entry)               ((entry) & ETR_SG_ET_MASK)
> +
> +/*
> + * struct etr_sg_table : ETR SG Table
> + * @sg_table:          Generic SG Table holding the data/table pages.
> + * @hwaddr:            hwaddress used by the TMC, which is the base
> + *                     address of the table.
> + */
> +struct etr_sg_table {
> +       struct tmc_sg_table     *sg_table;
> +       dma_addr_t              hwaddr;
> +};
> +
> +/*
> + * tmc_etr_sg_table_entries: Total number of table entries required to map
> + * @nr_pages system pages.
> + *
> + * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
> + * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
> + * with the last entry pointing to the page containing the table

... with the last entry pointing to another page of table entries.  If we ...

> + * entries. If we spill over to a new page for mapping 1 entry,
> + * we could as well replace the link entry of the previous page
> + * with the last entry.
> + */
> +static inline unsigned long __attribute_const__
> +tmc_etr_sg_table_entries(int nr_pages)
> +{
> +       unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
> +       unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
> +       /*
> +        * If we spill over to a new page for 1 entry, we could as well
> +        * make it the LAST entry in the previous page, skipping the Link
> +        * address.
> +        */
> +       if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
> +               nr_sglinks--;
> +       return nr_sgpages + nr_sglinks;
> +}
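
To make the arithmetic concrete (a worked example, assuming a 4KB system PAGE_SIZE
so that ETR_SG_PAGES_PER_SYSPAGE == 1 and ETR_SG_PTRS_PER_PAGE == 1024): mapping
2048 data pages needs 2048 data pointers plus 2048 / 1023 = 2 link entries, i.e.
2050 table entries spread over three 4K table pages. With 1024 data pages, the
single entry that would spill into a new page is folded into the link slot of the
previous page instead, so nr_sglinks drops back to 0 and the total stays at 1024
entries.
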
> +
> +/*
>   * tmc_pages_get_offset:  Go through all the pages in the tmc_pages
>   * and map @phys_addr to an offset within the buffer.
>   */
> @@ -307,6 +390,179 @@ ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
>         return len;
>  }
>
> +#ifdef ETR_SG_DEBUG
> +/* Map a dma address to virtual address */
> +static unsigned long
> +tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
> +                       dma_addr_t addr, bool table)
> +{
> +       long offset;
> +       unsigned long base;
> +       struct tmc_pages *tmc_pages;
> +
> +       if (table) {
> +               tmc_pages = &sg_table->table_pages;
> +               base = (unsigned long)sg_table->table_vaddr;
> +       } else {
> +               tmc_pages = &sg_table->data_pages;
> +               base = (unsigned long)sg_table->data_vaddr;
> +       }
> +
> +       offset = tmc_pages_get_offset(tmc_pages, addr);
> +       if (offset < 0)
> +               return 0;
> +       return base + offset;
> +}
> +
> +/* Dump the given sg_table */
> +static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
> +{
> +       sgte_t *ptr;
> +       int i = 0;
> +       dma_addr_t addr;
> +       struct tmc_sg_table *sg_table = etr_table->sg_table;
> +
> +       ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
> +                                             etr_table->hwaddr, true);
> +       while (ptr) {
> +               addr = ETR_SG_ADDR(*ptr);
> +               switch (ETR_SG_ET(*ptr)) {
> +               case ETR_SG_ET_NORMAL:
> +                       pr_debug("%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
> +                       ptr++;
> +                       break;
> +               case ETR_SG_ET_LINK:
> +                       pr_debug("%05d: *** %p\t:{L} 0x%llx ***\n",
> +                                i, ptr, addr);
> +                       ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
> +                                                             addr, true);
> +                       break;
> +               case ETR_SG_ET_LAST:
> +                       pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
> +                                i, ptr, addr);
> +                       return;
> +               }
> +               i++;
> +       }
> +       pr_debug("******* End of Table *****\n");
> +}
> +#endif
> +
> +/*
> + * Populate the SG Table page table entries from table/data
> + * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
> + * So does a Table page. So we keep track of indices of the tables
> + * in each system page and move the pointers accordingly.
> + */
> +#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)
> +static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
> +{
> +       dma_addr_t paddr;
> +       int i, type, nr_entries;
> +       int tpidx = 0; /* index to the current system table_page */
> +       int sgtidx = 0; /* index to the sg_table within the current syspage */
> +       int sgtoffset = 0; /* offset to the next entry within the sg_table */
> +       int dpidx = 0; /* index to the current system data_page */
> +       int spidx = 0; /* index to the SG page within the current data page */
> +       sgte_t *ptr; /* pointer to the table entry to fill */
> +       struct tmc_sg_table *sg_table = etr_table->sg_table;
> +       dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
> +       dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
> +
> +       nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
> +       /*
> +        * Use the contiguous virtual address of the table to update entries.
> +        */
> +       ptr = sg_table->table_vaddr;
> +       /*
> +        * Fill all the entries, except the last entry to avoid special
> +        * checks within the loop.
> +        */
> +       for (i = 0; i < nr_entries - 1; i++) {
> +               if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
> +                       /*
> +                        * Last entry in a sg_table page is a link address to
> +                        * the next table page. If this sg_table is the last
> +                        * one in the system page, it links to the first
> +                        * sg_table in the next system page. Otherwise, it
> +                        * links to the next sg_table page within the system
> +                        * page.
> +                        */
> +                       if (sgtidx == ETR_SG_PAGES_PER_SYSPAGE - 1) {
> +                               paddr = table_daddrs[tpidx + 1];
> +                       } else {
> +                               paddr = table_daddrs[tpidx] +
> +                                       (ETR_SG_PAGE_SIZE * (sgtidx + 1));
> +                       }
> +                       type = ETR_SG_ET_LINK;
> +               } else {
> +                       /*
> +                        * Update the idices to the data_pages to point to the
> +                        * next sg_page in the data buffer.
> +                        */
> +                       type = ETR_SG_ET_NORMAL;
> +                       paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
> +                       if (!INC_IDX_ROUND(spidx, ETR_SG_PAGES_PER_SYSPAGE))
> +                               dpidx++;
> +               }
> +               *ptr++ = ETR_SG_ENTRY(paddr, type);
> +               /*
> +                * Move to the next table pointer, moving the table page index
> +                * if necessary
> +                */
> +               if (!INC_IDX_ROUND(sgtoffset, ETR_SG_PTRS_PER_PAGE)) {
> +                       if (!INC_IDX_ROUND(sgtidx, ETR_SG_PAGES_PER_SYSPAGE))
> +                               tpidx++;
> +               }
> +       }
> +
> +       /* Set up the last entry, which is always a data pointer */
> +       paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
> +       *ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
> +}
> +
> +/*
> + * tmc_init_etr_sg_table: Allocate a TMC ETR SG table, data buffer of @size and
> + * populate the table.
> + *
> + * @dev                - Device pointer for the TMC
> + * @node       - NUMA node where the memory should be allocated
> + * @size       - Total size of the data buffer
> + * @pages      - Optional list of page virtual address
> + */
> +static struct etr_sg_table __maybe_unused *
> +tmc_init_etr_sg_table(struct device *dev, int node,
> +                 unsigned long size, void **pages)
> +{
> +       int nr_entries, nr_tpages;
> +       int nr_dpages = size >> PAGE_SHIFT;
> +       struct tmc_sg_table *sg_table;
> +       struct etr_sg_table *etr_table;
> +
> +       etr_table = kzalloc(sizeof(*etr_table), GFP_KERNEL);
> +       if (!etr_table)
> +               return ERR_PTR(-ENOMEM);
> +       nr_entries = tmc_etr_sg_table_entries(nr_dpages);
> +       nr_tpages = DIV_ROUND_UP(nr_entries, ETR_SG_PTRS_PER_SYSPAGE);
> +
> +       sg_table = tmc_alloc_sg_table(dev, node, nr_tpages, nr_dpages, pages);
> +       if (IS_ERR(sg_table)) {
> +               kfree(etr_table);
> +               return ERR_PTR(PTR_ERR(sg_table));
> +       }
> +
> +       etr_table->sg_table = sg_table;
> +       /* TMC should use table base address for DBA */
> +       etr_table->hwaddr = sg_table->table_daddr;
> +       tmc_etr_sg_table_populate(etr_table);
> +       /* Sync the table pages for the HW */
> +       tmc_sg_table_sync_table(sg_table);
> +#ifdef ETR_SG_DEBUG
> +       tmc_etr_sg_table_dump(etr_table);
> +#endif
> +       return etr_table;
> +}
> +
>  static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>  {
>         u32 axictl, sts;
> --
> 2.13.6
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-10-19 17:15 ` [PATCH 06/17] coresight: tmc: Make ETR SG table circular Suzuki K Poulose
  2017-10-20 17:11   ` Julien Thierry
@ 2017-11-01 23:47   ` Mathieu Poirier
  2017-11-02 12:00     ` Suzuki K Poulose
  2017-11-06 19:07   ` Mathieu Poirier
  2 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-01 23:47 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
> Make the ETR SG table Circular buffer so that we could start
> at any of the SG pages and use the entire buffer for tracing.
> This can be achieved by :
> 
> 1) Keeping an additional LINK pointer at the very end of the
> SG table, i.e, after the LAST buffer entry, to point back to
> the beginning of the first table. This will allow us to use
> the buffer normally when we start the trace at offset 0 of
> the buffer, as the LAST buffer entry hints the TMC-ETR and
> it automatically wraps to the offset 0.
> 
> 2) If we want to start at any other ETR SG page aligned offset,
> we could :
>  a) Make the preceding page entry as LAST entry.
>  b) Make the original LAST entry a normal entry.
>  c) Use the table pointer to the "new" start offset as the
>     base of the table address.
> This works as the TMC doesn't mandate that the page table
> base address should be 4K page aligned.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
>  1 file changed, 139 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 4424eb67a54c..c171b244e12a 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -71,36 +71,41 @@ typedef u32 sgte_t;
>   * @sg_table:		Generic SG Table holding the data/table pages.
>   * @hwaddr:		hwaddress used by the TMC, which is the base
>   *			address of the table.
> + * @nr_entries:		Total number of pointers in the table.
> + * @first_entry:	Index to the current "start" of the buffer.
> + * @last_entry:		Index to the last entry of the buffer.
>   */
>  struct etr_sg_table {
>  	struct tmc_sg_table	*sg_table;
>  	dma_addr_t		hwaddr;
> +	u32			nr_entries;
> +	u32			first_entry;
> +	u32			last_entry;
>  };
>  
>  /*
>   * tmc_etr_sg_table_entries: Total number of table entries required to map
>   * @nr_pages system pages.
>   *
> - * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
> + * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages and
> + * an additional Link pointer for making it a Circular buffer.
>   * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
>   * with the last entry pointing to the page containing the table
> - * entries. If we spill over to a new page for mapping 1 entry,
> - * we could as well replace the link entry of the previous page
> - * with the last entry.
> + * entries. If we fill the last table in full with the pointers, (i.e,
> + * nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) == 0, we don't have to allocate
> + * another table and hence skip the Link pointer. Also we could use the
> + * link entry of the last page to make it circular.
>   */
>  static inline unsigned long __attribute_const__
>  tmc_etr_sg_table_entries(int nr_pages)
>  {
>  	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
>  	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
> -	/*
> -	 * If we spill over to a new page for 1 entry, we could as well
> -	 * make it the LAST entry in the previous page, skipping the Link
> -	 * address.
> -	 */
> -	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
> +
> +	if (nr_sglinks && !(nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1)))
>  		nr_sglinks--;
> -	return nr_sgpages + nr_sglinks;
> +	/* Add an entry for the circular link */
> +	return nr_sgpages + nr_sglinks + 1;
>  }
>  
>  /*
> @@ -417,14 +422,21 @@ tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
>  /* Dump the given sg_table */
>  static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>  {
> -	sgte_t *ptr;
> +	sgte_t *ptr, *start;
>  	int i = 0;
>  	dma_addr_t addr;
>  	struct tmc_sg_table *sg_table = etr_table->sg_table;
>  
> -	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
> +	start = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
>  					      etr_table->hwaddr, true);
> -	while (ptr) {
> +	if (!start) {
> +		pr_debug("ERROR: Failed to translate table base: 0x%llx\n",
> +					 etr_table->hwaddr);
> +		return;
> +	}
> +
> +	ptr = start;
> +	do {
>  		addr = ETR_SG_ADDR(*ptr);
>  		switch (ETR_SG_ET(*ptr)) {
>  		case ETR_SG_ET_NORMAL:
> @@ -436,14 +448,17 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>  				 i, ptr, addr);
>  			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
>  							      addr, true);
> +			if (!ptr)
> +				pr_debug("ERROR: Bad Link 0x%llx\n", addr);
>  			break;
>  		case ETR_SG_ET_LAST:
>  			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
>  				 i, ptr, addr);
> -			return;
> +			ptr++;
> +			break;
>  		}
>  		i++;
> -	}
> +	} while (ptr && ptr != start);
>  	pr_debug("******* End of Table *****\n");
>  }
>  #endif
> @@ -458,7 +473,7 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>  static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  {
>  	dma_addr_t paddr;
> -	int i, type, nr_entries;
> +	int i, type;
>  	int tpidx = 0; /* index to the current system table_page */
>  	int sgtidx = 0;	/* index to the sg_table within the current syspage */
>  	int sgtoffset = 0; /* offset to the next entry within the sg_table */
> @@ -469,16 +484,16 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
>  	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
>  
> -	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
>  	/*
>  	 * Use the contiguous virtual address of the table to update entries.
>  	 */
>  	ptr = sg_table->table_vaddr;
>  	/*
> -	 * Fill all the entries, except the last entry to avoid special
> +	 * Fill all the entries, except the last two entries (i.e, the last
> +	 * buffer and the circular link back to the base) to avoid special
>  	 * checks within the loop.
>  	 */
> -	for (i = 0; i < nr_entries - 1; i++) {
> +	for (i = 0; i < etr_table->nr_entries - 2; i++) {
>  		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
>  			/*
>  			 * Last entry in a sg_table page is a link address to
> @@ -519,6 +534,107 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  	/* Set up the last entry, which is always a data pointer */
>  	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
>  	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
> +	/* followed by a circular link, back to the start of the table */
> +	*ptr++ = ETR_SG_ENTRY(sg_table->table_daddr, ETR_SG_ET_LINK);
> +}
> +
> +/*
> + * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
> + * to the index of the page table "entry". Data pointers always have
> + * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
> + * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE -1) entries.
> + */
> +static inline u32
> +tmc_etr_sg_offset_to_table_index(u64 offset)
> +{
> +	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
> +
> +	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
> +}
> +
> +/*
> + * tmc_etr_sg_update_type: Update the type of a given entry in the
> + * table to the requested entry. This is only used for data buffers
> + * to toggle the "NORMAL" vs "LAST" buffer entries.
> + */
> +static inline void tmc_etr_sg_update_type(sgte_t *entry, u32 type)
> +{
> +	WARN_ON(ETR_SG_ET(*entry) == ETR_SG_ET_LINK);
> +	WARN_ON(!ETR_SG_ET(*entry));
> +	*entry &= ~ETR_SG_ET_MASK;
> +	*entry |= type;
> +}
> +
> +/*
> + * tmc_etr_sg_table_index_to_daddr: Return the hardware address to the table
> + * entry @index. Use this address to let the table begin @index.
> + */
> +static inline dma_addr_t
> +tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
> +{
> +	u32 sys_page_idx = index / ETR_SG_PTRS_PER_SYSPAGE;
> +	u32 sys_page_offset = index % ETR_SG_PTRS_PER_SYSPAGE;
> +	sgte_t *ptr;
> +
> +	ptr = (sgte_t *)sg_table->table_pages.daddrs[sys_page_idx];
> +	return (dma_addr_t)&ptr[sys_page_offset];
> +}
> +
> +/*
> + * tmc_etr_sg_table_rotate : Rotate the SG circular buffer, moving
> + * the "base" to a requested offset. We do so by :
> + *
> + * 1) Reset the current LAST buffer.
> + * 2) Mark the "previous" buffer in the table to the "base" as LAST.
> + * 3) Update the hwaddr to point to the table pointer for the buffer
> + *    which starts at "base".
> + */
> +static int __maybe_unused
> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
> +{
> +	u32 last_entry, first_entry;
> +	u64 last_offset;
> +	struct tmc_sg_table *sg_table = etr_table->sg_table;
> +	sgte_t *table_ptr = sg_table->table_vaddr;
> +	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
> +
> +	/* Offset should always be SG PAGE_SIZE aligned */
> +	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
> +		pr_debug("unaligned base offset %llx\n", base_offset);
> +		return -EINVAL;
> +	}
> +	/* Make sure the offset is within the range */
> +	if (base_offset < 0 || base_offset > buf_size) {
> +		base_offset = (base_offset + buf_size) % buf_size;
> +		pr_debug("Resetting offset to %llx\n", base_offset);
> +	}
> +	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
> +	if (first_entry == etr_table->first_entry) {
> +		pr_debug("Head is already at %llx, skipping\n", base_offset);
> +		return 0;
> +	}
> +
> +	/* Last entry should be the previous one to the new "base" */
> +	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
> +	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
> +
> +	/* Reset the current Last page to Normal and new Last page to NORMAL */
> +	tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
> +				 ETR_SG_ET_NORMAL);
> +	tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
> +	etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
> +							    first_entry);
> +	etr_table->first_entry = first_entry;
> +	etr_table->last_entry = last_entry;
> +	pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
> +			base_offset, last_offset, first_entry, last_entry,
> +			etr_table->hwaddr);

The above line generates a warning when compiling for ARMv7.

> +	/* Sync the table for device */
> +	tmc_sg_table_sync_table(sg_table);
> +#ifdef ETR_SG_DEBUG
> +	tmc_etr_sg_table_dump(etr_table);
> +#endif
> +	return 0;
>  }
>  
>  /*
> @@ -552,6 +668,9 @@ tmc_init_etr_sg_table(struct device *dev, int node,
>  	}
>  
>  	etr_table->sg_table = sg_table;
> +	etr_table->nr_entries = nr_entries;
> +	etr_table->first_entry = 0;
> +	etr_table->last_entry = nr_entries - 2;
>  	/* TMC should use table base address for DBA */
>  	etr_table->hwaddr = sg_table->table_daddr;
>  	tmc_etr_sg_table_populate(etr_table);
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-11-01 23:47   ` Mathieu Poirier
@ 2017-11-02 12:00     ` Suzuki K Poulose
  2017-11-02 14:40       ` Mathieu Poirier
  0 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-02 12:00 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On 01/11/17 23:47, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
>> Make the ETR SG table Circular buffer so that we could start
>> at any of the SG pages and use the entire buffer for tracing.
>> This can be achieved by :
>>
>> 1) Keeping an additional LINK pointer at the very end of the
>> SG table, i.e, after the LAST buffer entry, to point back to
>> the beginning of the first table. This will allow us to use
>> the buffer normally when we start the trace at offset 0 of
>> the buffer, as the LAST buffer entry hints the TMC-ETR and
>> it automatically wraps to the offset 0.
>>
>> 2) If we want to start at any other ETR SG page aligned offset,
>> we could :
>>   a) Make the preceding page entry as LAST entry.
>>   b) Make the original LAST entry a normal entry.
>>   c) Use the table pointer to the "new" start offset as the
>>      base of the table address.
>> This works as the TMC doesn't mandate that the page table
>> base address should be 4K page aligned.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---

>> +static int __maybe_unused
>> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
>> +{
>> +	u32 last_entry, first_entry;
>> +	u64 last_offset;
>> +	struct tmc_sg_table *sg_table = etr_table->sg_table;
>> +	sgte_t *table_ptr = sg_table->table_vaddr;
>> +	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
>> +
>> +	/* Offset should always be SG PAGE_SIZE aligned */
>> +	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
>> +		pr_debug("unaligned base offset %llx\n", base_offset);
>> +		return -EINVAL;
>> +	}
>> +	/* Make sure the offset is within the range */
>> +	if (base_offset < 0 || base_offset > buf_size) {
>> +		base_offset = (base_offset + buf_size) % buf_size;
>> +		pr_debug("Resetting offset to %llx\n", base_offset);
>> +	}
>> +	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
>> +	if (first_entry == etr_table->first_entry) {
>> +		pr_debug("Head is already at %llx, skipping\n", base_offset);
>> +		return 0;
>> +	}
>> +
>> +	/* Last entry should be the previous one to the new "base" */
>> +	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
>> +	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
>> +
>> +	/* Reset the current Last page to Normal and new Last page to NORMAL */
>> +	tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
>> +				 ETR_SG_ET_NORMAL);
>> +	tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
>> +	etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
>> +							    first_entry);
>> +	etr_table->first_entry = first_entry;
>> +	etr_table->last_entry = last_entry;
>> +	pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
>> +			base_offset, last_offset, first_entry, last_entry,
>> +			etr_table->hwaddr);
> 
> The above line generates a warning when compiling for ARMv7.

Were you running with LPAE off? That could probably be the case,
where hwaddr could be 32-bit or 64-bit depending on whether LPAE
is enabled. I will fix it.
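
For reference, a minimal sketch of where the width difference comes from
(the Kconfig symbol is quoted from memory; on ARM it is selected when
LPAE is enabled):

	/* include/linux/types.h, simplified */
	#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
	typedef u64 dma_addr_t;		/* e.g. ARMv7 with LPAE */
	#else
	typedef u32 dma_addr_t;		/* e.g. ARMv7 without LPAE */
	#endif

So no single plain format specifier matches hwaddr in both
configurations without an explicit cast.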

I have fixed some other warnings with ARMv7 with LPAE.

Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-11-02 12:00     ` Suzuki K Poulose
@ 2017-11-02 14:40       ` Mathieu Poirier
  2017-11-02 15:13         ` Russell King - ARM Linux
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-02 14:40 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, Mike Leach, coresight

On 2 November 2017 at 06:00, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 01/11/17 23:47, Mathieu Poirier wrote:
>>
>> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
>>>
>>> Make the ETR SG table Circular buffer so that we could start
>>> at any of the SG pages and use the entire buffer for tracing.
>>> This can be achieved by :
>>>
>>> 1) Keeping an additional LINK pointer at the very end of the
>>> SG table, i.e, after the LAST buffer entry, to point back to
>>> the beginning of the first table. This will allow us to use
>>> the buffer normally when we start the trace at offset 0 of
>>> the buffer, as the LAST buffer entry hints the TMC-ETR and
>>> it automatically wraps to the offset 0.
>>>
>>> 2) If we want to start at any other ETR SG page aligned offset,
>>> we could :
>>>   a) Make the preceding page entry as LAST entry.
>>>   b) Make the original LAST entry a normal entry.
>>>   c) Use the table pointer to the "new" start offset as the
>>>      base of the table address.
>>> This works as the TMC doesn't mandate that the page table
>>> base address should be 4K page aligned.
>>>
>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>
>
>>> +static int __maybe_unused
>>> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
>>> +{
>>> +       u32 last_entry, first_entry;
>>> +       u64 last_offset;
>>> +       struct tmc_sg_table *sg_table = etr_table->sg_table;
>>> +       sgte_t *table_ptr = sg_table->table_vaddr;
>>> +       ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
>>> +
>>> +       /* Offset should always be SG PAGE_SIZE aligned */
>>> +       if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
>>> +               pr_debug("unaligned base offset %llx\n", base_offset);
>>> +               return -EINVAL;
>>> +       }
>>> +       /* Make sure the offset is within the range */
>>> +       if (base_offset < 0 || base_offset > buf_size) {
>>> +               base_offset = (base_offset + buf_size) % buf_size;
>>> +               pr_debug("Resetting offset to %llx\n", base_offset);
>>> +       }
>>> +       first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
>>> +       if (first_entry == etr_table->first_entry) {
>>> +               pr_debug("Head is already at %llx, skipping\n",
>>> base_offset);
>>> +               return 0;
>>> +       }
>>> +
>>> +       /* Last entry should be the previous one to the new "base" */
>>> +       last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) %
>>> buf_size;
>>> +       last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
>>> +
>>> +       /* Reset the current Last page to Normal and new Last page to
>>> NORMAL */
>>> +       tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
>>> +                                ETR_SG_ET_NORMAL);
>>> +       tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
>>> +       etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
>>> +                                                           first_entry);
>>> +       etr_table->first_entry = first_entry;
>>> +       etr_table->last_entry = last_entry;
>>> +       pr_debug("table rotated to offset %llx-%llx, entries (%d - %d),
>>> dba: %llx\n",
>>> +                       base_offset, last_offset, first_entry,
>>> last_entry,
>>> +                       etr_table->hwaddr);
>>
>>
>> The above line generates a warning when compiling for ARMv7.
>
>
> Were you running with LPAE off? That could probably be the case,
> where hwaddr could be 32-bit or 64-bit depending on whether LPAE
> is enabled. I will fix it.

My original setup did not have LPAE configured but even when I do
configure it I can generate the warnings.

Compiler:

arm-linux-gnueabi-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

Let me know if you want my .config file.

>
> I have fixed some other warnings with ARMv7 with LPAE.
>
> Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-11-02 14:40       ` Mathieu Poirier
@ 2017-11-02 15:13         ` Russell King - ARM Linux
  0 siblings, 0 replies; 56+ messages in thread
From: Russell King - ARM Linux @ 2017-11-02 15:13 UTC (permalink / raw)
  To: Mathieu Poirier, Randy Dunlap, Andrew Murray
  Cc: Suzuki K Poulose, coresight, rob.walker, linux-kernel,
	linux-arm-kernel, Mike Leach

On Thu, Nov 02, 2017 at 08:40:16AM -0600, Mathieu Poirier wrote:
> On 2 November 2017 at 06:00, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> > On 01/11/17 23:47, Mathieu Poirier wrote:
> >>
> >> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
> >>>
> >>> Make the ETR SG table Circular buffer so that we could start
> >>> at any of the SG pages and use the entire buffer for tracing.
> >>> This can be achieved by :
> >>>
> >>> 1) Keeping an additional LINK pointer at the very end of the
> >>> SG table, i.e, after the LAST buffer entry, to point back to
> >>> the beginning of the first table. This will allow us to use
> >>> the buffer normally when we start the trace at offset 0 of
> >>> the buffer, as the LAST buffer entry hints the TMC-ETR and
> >>> it automatically wraps to the offset 0.
> >>>
> >>> 2) If we want to start at any other ETR SG page aligned offset,
> >>> we could :
> >>>   a) Make the preceding page entry as LAST entry.
> >>>   b) Make the original LAST entry a normal entry.
> >>>   c) Use the table pointer to the "new" start offset as the
> >>>      base of the table address.
> >>> This works as the TMC doesn't mandate that the page table
> >>> base address should be 4K page aligned.
> >>>
> >>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>> ---
> >
> >
> >>> +static int __maybe_unused
> >>> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
> >>> +{
> >>> +       u32 last_entry, first_entry;
> >>> +       u64 last_offset;
> >>> +       struct tmc_sg_table *sg_table = etr_table->sg_table;
> >>> +       sgte_t *table_ptr = sg_table->table_vaddr;
> >>> +       ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
> >>> +
> >>> +       /* Offset should always be SG PAGE_SIZE aligned */
> >>> +       if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
> >>> +               pr_debug("unaligned base offset %llx\n", base_offset);
> >>> +               return -EINVAL;
> >>> +       }
> >>> +       /* Make sure the offset is within the range */
> >>> +       if (base_offset < 0 || base_offset > buf_size) {
> >>> +               base_offset = (base_offset + buf_size) % buf_size;
> >>> +               pr_debug("Resetting offset to %llx\n", base_offset);
> >>> +       }
> >>> +       first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
> >>> +       if (first_entry == etr_table->first_entry) {
> >>> +               pr_debug("Head is already at %llx, skipping\n",
> >>> base_offset);
> >>> +               return 0;
> >>> +       }
> >>> +
> >>> +       /* Last entry should be the previous one to the new "base" */
> >>> +       last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) %
> >>> buf_size;
> >>> +       last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
> >>> +
> >>> +       /* Reset the current Last page to Normal and new Last page to
> >>> NORMAL */
> >>> +       tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
> >>> +                                ETR_SG_ET_NORMAL);
> >>> +       tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
> >>> +       etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
> >>> +                                                           first_entry);
> >>> +       etr_table->first_entry = first_entry;
> >>> +       etr_table->last_entry = last_entry;
> >>> +       pr_debug("table rotated to offset %llx-%llx, entries (%d - %d),
> >>> dba: %llx\n",
> >>> +                       base_offset, last_offset, first_entry,
> >>> last_entry,
> >>> +                       etr_table->hwaddr);
> >>
> >>
> >> The above line generates a warning when compiling for ARMv7.
> >
> >
> > Were you running with LPAE off? That could probably be the case,
> > where hwaddr could be 32-bit or 64-bit depending on whether LPAE
> > is enabled. I will fix it.
> 
> My original setup did not have LPAE configured but even when I do
> configure it I can generate the warnings.
> 
> Compiler:
> 
> arm-linux-gnueabi-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
> 
> Let me know if you want my .config file.

For those who don't know... (it seems it's an all too common mistake).

In the printf format definition, the flag can be used to determine the
data type for the format.  Of those flags, the two which are relevant
here are:

	l	printf expects a "long" or "unsigned long" type.
	ll	printf expects a "long long" or "unsigned long long" type.

The size of "long" or "long long" is ABI dependent.  Typically, on
32-bit platforms, "long" is 32-bit and "long long" is 64-bit.  On
64-bit platforms, "long" is 64-bit and "long long" is 128-bit.

Moreover, ABIs can mandate alignment requirements for these data types.
This can cause problems if you do something like:

	u64 var = 1;

	printk("foo %lx\n", var);

If a 32-bit architecture mandates that arguments are passed in ascending
32-bit registers from r0, and 64-bit arguments are to be passed in an
even,odd register pair, then the above printk() is a problem.

The format specifies a "long" type, which is a 32-bit type: printk() will
expect the value in r1 for the above, but the compiler has arranged to
pass "var" in r2,r3.  So the end result is that the above prints an
undefined value - whatever happens to be in r1 at the time.

Don't laugh, exactly this problem is in kexec-tools right now!

The kernel data types (and C99 data types) that are typed using the bit
width map to the standard C types.  What this means is that a "u64"
might be "long" if building on a 64-bit platform, or "long long" if
building on a 32-bit platform:

include/uapi/asm-generic/int-ll64.h:typedef unsigned long long __u64;
include/uapi/asm-generic/int-l64.h:typedef unsigned long __u64;

This means if you try and pass a u64 integer variable to printf, you
really can't use either of the "l" or "ll" flags, because you don't
know which you should be using.

The guidance at the top of Documentation/printk-formats.txt concerning
s64/u64 is basically incorrect:

Integer types
=============

::

        If variable is of Type,         use printk format specifier:
        ------------------------------------------------------------
                s64                     %lld or %llx
                u64                     %llu or %llx


The only ways around this are:

1) to cast to the appropriate non-bitwidth defined type and use its
   format flag (e.g., unsigned long long if you need at least 64-bit
   precision, and use "ll" in the format.  Yes, on 64-bit it means you
   get 128-bit values, but that's a small price to pay for stuff working
   correctly.)

2) to do as done in userspace, and define a PRI64 which can be inserted
   into the format, but this is messy (iow, e.g., "%"PRI64"x").  I
   personally find this rather horrid, and I suspect it'll trip other
   kernel developers' sanity filters too.
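
As a minimal, illustrative sketch of option 1 (the pr_debug() arguments
are taken from the patch hunk quoted earlier in this thread):

	u64 var = 1;

	/* cast so the "ll" flag always matches the argument width */
	printk("foo %llx\n", (unsigned long long)var);

	/* the same applied to the line that triggered the warning */
	pr_debug("dba: %llx\n", (unsigned long long)etr_table->hwaddr);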

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/17] coresight: tmc etr: Add transparent buffer management
  2017-10-19 17:15 ` [PATCH 07/17] coresight: tmc etr: Add transparent buffer management Suzuki K Poulose
@ 2017-11-02 17:48   ` Mathieu Poirier
  2017-11-03 10:02     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-02 17:48 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:
> At the moment we always use contiguous memory for TMC ETR tracing
> when used from sysfs. The size of the buffer is fixed at boot time
> and can only be changed by modifiying the DT. With the introduction
> of SG support we could support really large buffers in that mode.
> This patch abstracts the buffer used for ETR to switch between a
> contiguous buffer or a SG table depending on the availability of
> the memory.
> 
> This also enables the sysfs mode to use the ETR in SG mode depending
> on configured the trace buffer size. Also, since ETR will use the
> new infrastructure to manage the buffer, we can get rid of some
> of the members in the tmc_drvdata and clean up the fields a bit.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++++++++++++++++++-----
>  drivers/hwtracing/coresight/coresight-tmc.h     |  60 +++-
>  2 files changed, 403 insertions(+), 90 deletions(-)
>

[..]
 
> +
> +static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
> +{
> +	long r_offset, w_offset;
> +	struct etr_sg_table *etr_table = etr_buf->private;
> +	struct tmc_sg_table *table = etr_table->sg_table;
> +
> +	r_offset = tmc_sg_get_data_page_offset(table, rrp);
> +	if (r_offset < 0) {
> +		dev_warn(table->dev, "Unable to map RRP %llx to offset\n",
> +				rrp);
> +		etr_buf->len = 0;
> +		return;
> +	}
> +
> +	w_offset = tmc_sg_get_data_page_offset(table, rwp);
> +	if (w_offset < 0) {
> +		dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
> +				rwp);

                dev_warn(table->dev,
                         "Unable to map RWP %llx to offset\n", rwq);

It looks a little better and we respect indentation rules.  Same for r_offset.

> +		etr_buf->len = 0;
> +		return;
> +	}
> +
> +	etr_buf->offset = r_offset;
> +	if (etr_buf->full)
> +		etr_buf->len = etr_buf->size;
> +	else
> +		etr_buf->len = (w_offset < r_offset) ?
> +			etr_buf->size + w_offset - r_offset :
> +			w_offset - r_offset;
> +	tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
> +}
> +
> +static const struct etr_buf_operations etr_sg_buf_ops = {
> +	.alloc = tmc_etr_alloc_sg_buf,
> +	.free = tmc_etr_free_sg_buf,
> +	.sync = tmc_etr_sync_sg_buf,
> +	.get_data = tmc_etr_get_data_sg_buf,
> +};
> +
> +static const struct etr_buf_operations *etr_buf_ops[] = {
> +	[ETR_MODE_FLAT] = &etr_flat_buf_ops,
> +	[ETR_MODE_ETR_SG] = &etr_sg_buf_ops,
> +};
> +
> +static inline int tmc_etr_mode_alloc_buf(int mode,
> +				  struct tmc_drvdata *drvdata,
> +				  struct etr_buf *etr_buf, int node,
> +				  void **pages)

static inline int
tmc_etr_mode_alloc_buf(int mode,
                       struct tmc_drvdata *drvdata,
                       struct etr_buf *etr_buf, int node,
                       void **pages)

> +{
> +	int rc;
> +
> +	switch (mode) {
> +	case ETR_MODE_FLAT:
> +	case ETR_MODE_ETR_SG:
> +		rc = etr_buf_ops[mode]->alloc(drvdata, etr_buf, node, pages);
> +		if (!rc)
> +			etr_buf->ops = etr_buf_ops[mode];
> +		return rc;
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +/*
> + * tmc_alloc_etr_buf: Allocate a buffer use by ETR.
> + * @drvdata	: ETR device details.
> + * @size	: size of the requested buffer.
> + * @flags	: Required properties of the type of buffer.
> + * @node	: Node for memory allocations.
> + * @pages	: An optional list of pages.
> + */
> +static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
> +					  ssize_t size, int flags,
> +					  int node, void **pages)

Please fix indentation.  Also @flags isn't used.

> +{
> +	int rc = -ENOMEM;
> +	bool has_etr_sg, has_iommu;
> +	struct etr_buf *etr_buf;
> +
> +	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
> +	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
> +
> +	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
> +	if (!etr_buf)
> +		return ERR_PTR(-ENOMEM);
> +
> +	etr_buf->size = size;
> +
> +	/*
> +	 * If we have to use an existing list of pages, we cannot reliably
> +	 * use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
> +	 * we use the contiguous DMA memory if :
> +	 *  a) The ETR cannot use Scatter-Gather.
> +	 *  b) if not a, we have an IOMMU backup

Please rework the above sentence.

> +	 *  c) if none of the above holds, use it for smaller memory (< 1M).
> +	 *
> +	 * Fallback to available mechanisms.
> +	 *
> +	 */
> +	if (!pages &&
> +	    (!has_etr_sg || has_iommu || size < SZ_1M))
> +		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
> +					    etr_buf, node, pages);
> +	if (rc && has_etr_sg)
> +		rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
> +					    etr_buf, node, pages);
> +	if (rc) {
> +		kfree(etr_buf);
> +		return ERR_PTR(rc);
> +	}
> +
> +	return etr_buf;
> +}
> +
> +static void tmc_free_etr_buf(struct etr_buf *etr_buf)
> +{
> +	WARN_ON(!etr_buf->ops || !etr_buf->ops->free);
> +	etr_buf->ops->free(etr_buf);
> +	kfree(etr_buf);
> +}
> +
> +/*
> + * tmc_etr_buf_get_data: Get the pointer the trace data at @offset
> + * with a maximum of @len bytes.
> + * Returns: The size of the linear data available @pos, with *bufpp
> + * updated to point to the buffer.
> + */
> +static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf,
> +				    u64 offset, size_t len, char **bufpp)
> +{
> +	/* Adjust the length to limit this transaction to end of buffer */
> +	len = (len < (etr_buf->size - offset)) ? len : etr_buf->size - offset;
> +
> +	return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp);
> +}
> +
> +static inline s64
> +tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
> +{
> +	ssize_t len;
> +	char *bufp;
> +
> +	len = tmc_etr_buf_get_data(etr_buf, offset,
> +				   CORESIGHT_BARRIER_PKT_SIZE, &bufp);
> +	if (WARN_ON(len <= CORESIGHT_BARRIER_PKT_SIZE))
> +		return -EINVAL;
> +	coresight_insert_barrier_packet(bufp);
> +	return offset + CORESIGHT_BARRIER_PKT_SIZE;
> +}
> +
> +/*
> + * tmc_sync_etr_buf: Sync the trace buffer availability with drvdata.
> + * Makes sure the trace data is synced to the memory for consumption.
> + * @etr_buf->offset will hold the offset to the beginning of the trace data
> + * within the buffer, with @etr_buf->len bytes to consume. @etr_buf->vaddr
> + * will always point to the beginning of the "trace buffer".
> + */
> +static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
> +{
> +	struct etr_buf *etr_buf = drvdata->etr_buf;
> +	u64 rrp, rwp;
> +	u32 status;
> +
> +	rrp = tmc_read_rrp(drvdata);
> +	rwp = tmc_read_rwp(drvdata);
> +	status = readl_relaxed(drvdata->base + TMC_STS);
> +	etr_buf->full = status & TMC_STS_FULL;
> +
> +	WARN_ON(!etr_buf->ops || !etr_buf->ops->sync);
> +
> +	etr_buf->ops->sync(etr_buf, rrp, rwp);
> +
> +	/* Insert barrier packets at the beginning, if there was an overflow */
> +	if (etr_buf->full)
> +		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
> +}
> +
>  static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>  {
>  	u32 axictl, sts;
> +	struct etr_buf *etr_buf = drvdata->etr_buf;
>  
>  	/* Zero out the memory to help with debug */
> -	memset(drvdata->vaddr, 0, drvdata->size);
> +	memset(etr_buf->vaddr, 0, etr_buf->size);
>  
>  	CS_UNLOCK(drvdata->base);
>  
>  	/* Wait for TMCSReady bit to be set */
>  	tmc_wait_for_tmcready(drvdata);
>  
> -	writel_relaxed(drvdata->size / 4, drvdata->base + TMC_RSZ);
> +	writel_relaxed(etr_buf->size / 4, drvdata->base + TMC_RSZ);
>  	writel_relaxed(TMC_MODE_CIRCULAR_BUFFER, drvdata->base + TMC_MODE);
>  
>  	axictl = readl_relaxed(drvdata->base + TMC_AXICTL);
> @@ -707,16 +987,22 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>  		axictl |= TMC_AXICTL_ARCACHE_OS;
>  	}
>  
> +	if (etr_buf->mode == ETR_MODE_ETR_SG) {
> +		if (WARN_ON(!tmc_etr_has_cap(drvdata, TMC_ETR_SG)))
> +			return;
> +		axictl |= TMC_AXICTL_SCT_GAT_MODE;
> +	}
> +
>  	writel_relaxed(axictl, drvdata->base + TMC_AXICTL);
> -	tmc_write_dba(drvdata, drvdata->paddr);
> +	tmc_write_dba(drvdata, etr_buf->hwaddr);
>  	/*
>  	 * If the TMC pointers must be programmed before the session,
>  	 * we have to set it properly (i.e, RRP/RWP to base address and
>  	 * STS to "not full").
>  	 */
>  	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
> -		tmc_write_rrp(drvdata, drvdata->paddr);
> -		tmc_write_rwp(drvdata, drvdata->paddr);
> +		tmc_write_rrp(drvdata, etr_buf->hwaddr);
> +		tmc_write_rwp(drvdata, etr_buf->hwaddr);
>  		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
>  		writel_relaxed(sts, drvdata->base + TMC_STS);
>  	}
> @@ -732,62 +1018,52 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>  }
>  
>  /*
> - * Return the available trace data in the buffer @pos, with a maximum
> - * limit of @len, also updating the @bufpp on where to find it.
> + * Return the available trace data in the buffer (starts at etr_buf->offset,
> + * limited by etr_buf->len) from @pos, with a maximum limit of @len,
> + * also updating the @bufpp on where to find it. Since the trace data
> + * starts at anywhere in the buffer, depending on the RRP, we adjust the
> + * @len returned to handle buffer wrapping around.
>   */
>  ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
>  			    loff_t pos, size_t len, char **bufpp)

Please fix indentation

>  {
> -	char *bufp = drvdata->buf + pos;
> -	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
> -
> -	/* Adjust the len to available size @pos */
> -	if (pos + len > drvdata->len)
> -		len = drvdata->len - pos;
> +	s64 offset;
> +	struct etr_buf *etr_buf = drvdata->etr_buf;
>  
> +	if (pos + len > etr_buf->len)
> +		len = etr_buf->len - pos;
>  	if (len <= 0)
>  		return len;
>  
> -	/*
> -	 * Since we use a circular buffer, with trace data starting
> -	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
> -	 * wrap the current @pos to within the buffer.
> -	 */
> -	if (bufp >= bufend)
> -		bufp -= drvdata->size;
> -	/*
> -	 * For simplicity, avoid copying over a wrapped around buffer.
> -	 */
> -	if ((bufp + len) > bufend)
> -		len = bufend - bufp;
> -	*bufpp = bufp;
> -	return len;
> +	/* Compute the offset from which we read the data */
> +	offset = etr_buf->offset + pos;
> +	if (offset >= etr_buf->size)
> +		offset -= etr_buf->size;
> +	return tmc_etr_buf_get_data(etr_buf, offset, len, bufpp);
>  }
>  
> -static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
> +static struct etr_buf *
> +tmc_etr_setup_sysfs_buf(struct tmc_drvdata *drvdata)
>  {
> -	u32 val;
> -	u64 rwp;
> +	return tmc_alloc_etr_buf(drvdata, drvdata->size, 0,
> +				  cpu_to_node(0), NULL);

Indentation

> +}
>  
> -	rwp = tmc_read_rwp(drvdata);
> -	val = readl_relaxed(drvdata->base + TMC_STS);
> +static void
> +tmc_etr_free_sysfs_buf(struct etr_buf *buf)
> +{
> +	if (buf)
> +		tmc_free_etr_buf(buf);
> +}
>  
> -	/*
> -	 * Adjust the buffer to point to the beginning of the trace data
> -	 * and update the available trace data.
> -	 */
> -	if (val & TMC_STS_FULL) {
> -		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
> -		drvdata->len = drvdata->size;
> -		coresight_insert_barrier_packet(drvdata->buf);
> -	} else {
> -		drvdata->buf = drvdata->vaddr;
> -		drvdata->len = rwp - drvdata->paddr;
> -	}
> +static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
> +{
> +	tmc_sync_etr_buf(drvdata);
>  }
>  
>  static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
>  {
> +
>  	CS_UNLOCK(drvdata->base);
>  
>  	tmc_flush_and_stop(drvdata);
> @@ -796,7 +1072,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
>  	 * read before the TMC is disabled.
>  	 */
>  	if (drvdata->mode == CS_MODE_SYSFS)
> -		tmc_etr_dump_hw(drvdata);
> +		tmc_etr_sync_sysfs_buf(drvdata);
> +
>  	tmc_disable_hw(drvdata);
>  
>  	CS_LOCK(drvdata->base);
> @@ -807,34 +1084,31 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	int ret = 0;
>  	bool used = false;
>  	unsigned long flags;
> -	void __iomem *vaddr = NULL;
> -	dma_addr_t paddr;
>  	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> +	struct etr_buf *new_buf = NULL, *free_buf = NULL;
>  
>  
>  	/*
> -	 * If we don't have a buffer release the lock and allocate memory.
> -	 * Otherwise keep the lock and move along.
> +	 * If we are enabling the ETR from disabled state, we need to make
> +	 * sure we have a buffer with the right size. The etr_buf is not reset
> +	 * immediately after we stop the tracing in SYSFS mode as we wait for
> +	 * the user to collect the data. We may be able to reuse the existing
> +	 * buffer, provided the size matches. Any allocation has to be done
> +	 * with the lock released.
>  	 */
>  	spin_lock_irqsave(&drvdata->spinlock, flags);
> -	if (!drvdata->vaddr) {
> +	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
>  		spin_unlock_irqrestore(&drvdata->spinlock, flags);
> -
> -		/*
> -		 * Contiguous  memory can't be allocated while a spinlock is
> -		 * held.  As such allocate memory here and free it if a buffer
> -		 * has already been allocated (from a previous session).
> -		 */
> -		vaddr = dma_alloc_coherent(drvdata->dev, drvdata->size,
> -					   &paddr, GFP_KERNEL);
> -		if (!vaddr)
> -			return -ENOMEM;
> +		/* Allocate memory with the spinlock released */
> +		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
> +		if (IS_ERR(new_buf))
> +			return PTR_ERR(new_buf);
>  
>  		/* Let's try again */
>  		spin_lock_irqsave(&drvdata->spinlock, flags);
>  	}
>  
> -	if (drvdata->reading) {
> +	if (drvdata->reading || drvdata->mode == CS_MODE_PERF) {
>  		ret = -EBUSY;
>  		goto out;
>  	}
> @@ -842,21 +1116,20 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	/*
>  	 * In sysFS mode we can have multiple writers per sink.  Since this
>  	 * sink is already enabled no memory is needed and the HW need not be
> -	 * touched.
> +	 * touched, even if the buffer size has changed.
>  	 */
>  	if (drvdata->mode == CS_MODE_SYSFS)
>  		goto out;
>  
>  	/*
> -	 * If drvdata::buf == NULL, use the memory allocated above.
> -	 * Otherwise a buffer still exists from a previous session, so
> -	 * simply use that.
> +	 * If we don't have a buffer or it doesn't match the requested size,
> +	 * use the memory allocated above. Otherwise reuse it.
>  	 */
> -	if (drvdata->buf == NULL) {
> +	if (!drvdata->etr_buf ||
> +	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
>  		used = true;
> -		drvdata->vaddr = vaddr;
> -		drvdata->paddr = paddr;
> -		drvdata->buf = drvdata->vaddr;
> +		free_buf = drvdata->etr_buf;
> +		drvdata->etr_buf = new_buf;
>  	}
>  
>  	drvdata->mode = CS_MODE_SYSFS;
> @@ -865,8 +1138,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  
>  	/* Free memory outside the spinlock if need be */
> -	if (!used && vaddr)
> -		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
> +	if (free_buf)
> +		tmc_etr_free_sysfs_buf(free_buf);
>  
>  	if (!ret)
>  		dev_info(drvdata->dev, "TMC-ETR enabled\n");
> @@ -945,8 +1218,8 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
>  		goto out;
>  	}
>  
> -	/* If drvdata::buf is NULL the trace data has been read already */
> -	if (drvdata->buf == NULL) {
> +	/* If drvdata::etr_buf is NULL the trace data has been read already */
> +	if (drvdata->etr_buf == NULL) {
>  		ret = -EINVAL;
>  		goto out;
>  	}
> @@ -965,8 +1238,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
>  int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
>  {
>  	unsigned long flags;
> -	dma_addr_t paddr;
> -	void __iomem *vaddr = NULL;
> +	struct etr_buf *etr_buf = NULL;
>  
>  	/* config types are set a boot time and never change */
>  	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
> @@ -988,17 +1260,16 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
>  		 * The ETR is not tracing and the buffer was just read.
>  		 * As such prepare to free the trace buffer.
>  		 */
> -		vaddr = drvdata->vaddr;
> -		paddr = drvdata->paddr;
> -		drvdata->buf = drvdata->vaddr = NULL;
> +		etr_buf =  drvdata->etr_buf;
> +		drvdata->etr_buf = NULL;
>  	}
>  
>  	drvdata->reading = false;
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  
>  	/* Free allocated memory out side of the spinlock */
> -	if (vaddr)
> -		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
> +	if (etr_buf)
> +		tmc_free_etr_buf(etr_buf);
>  
>  	return 0;
>  }
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 5e49c035a1ac..50ebc17c4645 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -55,6 +55,8 @@
>  #define TMC_STS_TMCREADY_BIT	2
>  #define TMC_STS_FULL		BIT(0)
>  #define TMC_STS_TRIGGERED	BIT(1)
> +#define TMC_STS_MEMERR		BIT(5)
> +
>  /*
>   * TMC_AXICTL - 0x110
>   *
> @@ -134,6 +136,37 @@ enum tmc_mem_intf_width {
>  #define CORESIGHT_SOC_600_ETR_CAPS	\
>  	(TMC_ETR_SAVE_RESTORE | TMC_ETR_AXI_ARCACHE)
>  
> +enum etr_mode {
> +	ETR_MODE_FLAT,		/* Uses contiguous flat buffer */
> +	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
> +};
> +
> +struct etr_buf_operations;
> +
> +/**
> + * struct etr_buf - Details of the buffer used by ETR
> + * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
> + * @full	: Trace data overflow
> + * @size	: Size of the buffer.
> + * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
> + * @vaddr	: Virtual address of the buffer used for trace.
> + * @offset	: Offset of the trace data in the buffer for consumption.
> + * @len		: Available trace data @buf (may round up to the beginning).
> + * @ops		: ETR buffer operations for the mode.
> + * @private	: Backend specific information for the buf
> + */
> +struct etr_buf {
> +	enum etr_mode			mode;
> +	bool				full;
> +	ssize_t				size;
> +	dma_addr_t			hwaddr;
> +	void				*vaddr;
> +	unsigned long			offset;
> +	u64				len;
> +	const struct etr_buf_operations	*ops;
> +	void				*private;
> +};
> +
>  /**
>   * struct tmc_drvdata - specifics associated to an TMC component
>   * @base:	memory mapped base address for this component.
> @@ -141,11 +174,10 @@ enum tmc_mem_intf_width {
>   * @csdev:	component vitals needed by the framework.
>   * @miscdev:	specifics to handle "/dev/xyz.tmc" entry.
>   * @spinlock:	only one at a time pls.
> - * @buf:	area of memory where trace data get sent.
> - * @paddr:	DMA start location in RAM.
> - * @vaddr:	virtual representation of @paddr.
> - * @size:	trace buffer size.
> - * @len:	size of the available trace.
> + * @buf:	Snapshot of the trace data for ETF/ETB.
> + * @etr_buf:	details of buffer used in TMC-ETR
> + * @len:	size of the available trace for ETF/ETB.
> + * @size:	trace buffer size for this TMC (common for all modes).
>   * @mode:	how this TMC is being used.
>   * @config_type: TMC variant, must be of type @tmc_config_type.
>   * @memwidth:	width of the memory interface databus, in bytes.
> @@ -160,11 +192,12 @@ struct tmc_drvdata {
>  	struct miscdevice	miscdev;
>  	spinlock_t		spinlock;
>  	bool			reading;
> -	char			*buf;
> -	dma_addr_t		paddr;
> -	void __iomem		*vaddr;
> -	u32			size;
> +	union {
> +		char		*buf;		/* TMC ETB */
> +		struct etr_buf	*etr_buf;	/* TMC ETR */
> +	};
>  	u32			len;
> +	u32			size;
>  	u32			mode;
>  	enum tmc_config_type	config_type;
>  	enum tmc_mem_intf_width	memwidth;
> @@ -172,6 +205,15 @@ struct tmc_drvdata {
>  	u32			etr_caps;
>  };
>  
> +struct etr_buf_operations {
> +	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
> +			int node, void **pages);
> +	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
> +	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
> +				char **bufpp);
> +	void (*free)(struct etr_buf *etr_buf);
> +};
> +
>  /**
>   * struct tmc_pages - Collection of pages used for SG.
>   * @nr_pages:		Number of pages in the list.
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 08/17] coresight: tmc: Add configuration support for trace buffer size
  2017-10-19 17:15 ` [PATCH 08/17] coresight: tmc: Add configuration support for trace buffer size Suzuki K Poulose
@ 2017-11-02 19:26   ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-02 19:26 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:44PM +0100, Suzuki K Poulose wrote:
> Now that we can dynamically switch between contiguous memory and
> SG table depending on the trace buffer size, provide the support
> for selecting an appropriate buffer size.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  .../ABI/testing/sysfs-bus-coresight-devices-tmc    |  8 ++++++
>  drivers/hwtracing/coresight/coresight-tmc.c        | 32 ++++++++++++++++++++++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> index 4fe677ed1305..3675c380caf8 100644
> --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
> @@ -83,3 +83,11 @@ KernelVersion:	4.7
>  Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
>  Description:	(R) Indicates the capabilities of the Coresight TMC.
>  		The value is read directly from the DEVID register, 0xFC8,
> +
> +What:		/sys/bus/coresight/devices/<memory_map>.tmc/buffer-size
> +Date:		September 2017
> +KernelVersion:	4.15

More like 4.16 now.

> +Contact:	Mathieu Poirier <mathieu.poirier@linaro.org>
> +Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
> +		mode. Writable only for TMC-ETR configurations. The value
> +		should be aligned to the kernel pagesize.
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
> index c7201e40d737..2349b1805694 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc.c
> @@ -283,8 +283,40 @@ static ssize_t trigger_cntr_store(struct device *dev,
>  }
>  static DEVICE_ATTR_RW(trigger_cntr);
>  
> +static ssize_t buffer_size_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
> +
> +	return sprintf(buf, "%#x\n", drvdata->size);
> +}
> +
> +static ssize_t buffer_size_store(struct device *dev,
> +			     struct device_attribute *attr,
> +			     const char *buf, size_t size)

Indentation (I know trigger_cntr_store() is wrong).

> +{
> +	int ret;
> +	unsigned long val;
> +	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
> +
> +	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
> +		return -EPERM;

I think -EINVAL would be more appropriate but definitely not a big deal.

> +
> +	ret = kstrtoul(buf, 0, &val);
> +	if (ret)
> +		return ret;
> +	/* The buffer size should be page aligned */
> +	if (val & (PAGE_SIZE - 1))
> +		return -EINVAL;
> +	drvdata->size = val;
> +	return size;
> +}
> +
> +static DEVICE_ATTR_RW(buffer_size);
> +
>  static struct attribute *coresight_tmc_attrs[] = {
>  	&dev_attr_trigger_cntr.attr,
> +	&dev_attr_buffer_size.attr,
>  	NULL,
>  };
>  
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 10/17] coresight: etr: Track if the device is coherent
  2017-10-19 17:15 ` [PATCH 10/17] coresight: etr: Track if the device is coherent Suzuki K Poulose
@ 2017-11-02 19:40   ` Mathieu Poirier
  2017-11-03 10:03     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-02 19:40 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:46PM +0100, Suzuki K Poulose wrote:
> Track if the ETR is dma-coherent or not. This will be useful
> in deciding if we should use software buffering for perf.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc.c | 5 ++++-
>  drivers/hwtracing/coresight/coresight-tmc.h | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
> index 4939333cc6c7..5a8c41130f96 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc.c
> @@ -347,6 +347,9 @@ static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
>  	if (!(devid & TMC_DEVID_NOSCAT))
>  		tmc_etr_set_cap(drvdata, TMC_ETR_SG);
>  
> +	if (device_get_dma_attr(drvdata->dev) == DEV_DMA_COHERENT)
> +		tmc_etr_set_cap(drvdata, TMC_ETR_COHERENT);
> +
>  	/* Check if the AXI address width is available */
>  	if (devid & TMC_DEVID_AXIAW_VALID)
>  		dma_mask = ((devid >> TMC_DEVID_AXIAW_SHIFT) &
> @@ -397,7 +400,7 @@ static int tmc_probe(struct amba_device *adev, const struct amba_id *id)
>  	if (!drvdata)
>  		goto out;
>  
> -	drvdata->dev = &adev->dev;
> +	drvdata->dev = dev;

What is that one for?

>  	dev_set_drvdata(dev, drvdata);
>  
>  	/* Validity for the resource is already checked by the AMBA core */
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 50ebc17c4645..69da0b584a6b 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -131,6 +131,7 @@ enum tmc_mem_intf_width {
>   * so we have to rely on PID of the IP to detect the functionality.
>   */
>  #define TMC_ETR_SAVE_RESTORE		(0x1U << 2)
> +#define TMC_ETR_COHERENT		(0x1U << 3)
>  
>  /* Coresight SoC-600 TMC-ETR unadvertised capabilities */
>  #define CORESIGHT_SOC_600_ETR_CAPS	\
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers
  2017-10-19 17:15 ` [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers Suzuki K Poulose
@ 2017-11-02 20:26   ` Mathieu Poirier
  2017-11-03 10:08     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-02 20:26 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:
> Since the ETR could be driven either by SYSFS or by perf, it
> becomes complicated how we deal with the buffers used for each
> of these modes. The ETR driver cannot simply free the current
> attached buffer without knowing the provider (i.e, sysfs vs perf).
> 
> To solve this issue, we provide:
> 1) the driver-mode specific etr buffer to be retained in the drvdata
> 2) the etr_buf for a session should be passed on when enabling the
>    hardware, which will be stored in drvdata->etr_buf. This will be
>    replaced (not free'd) as soon as the hardware is disabled, after
>    necessary sync operation.

If I get you right, the problem you're trying to solve is what to do with a
sysFS buffer that hasn't been read (and freed) when a perf session is
requested.  In my opinion it should simply be freed.  Indeed the user probably
doesn't care much about that sysFS buffer; if they did, the data would have
been harvested.

> 
> The advantages of this are :
> 
> 1) The common code path doesn't need to worry about how to dispose
>    an existing buffer, if it is about to start a new session with a
>    different buffer, possibly in a different mode.
> 2) The driver mode can control its buffers and can get access to the
>    saved session even when the hardware is operating in a different
>    mode. (e.g, we can still access a trace buffer from a sysfs mode
>    even if the etr is now used in perf mode, without disrupting the
>    current session.)
> 
> Towards this, we introduce a sysfs specific data which will hold the
> etr_buf used for sysfs mode of operation, controlled solely by the
> sysfs mode handling code.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 59 ++++++++++++++++---------
>  drivers/hwtracing/coresight/coresight-tmc.h     |  2 +
>  2 files changed, 41 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index f12b7c5f68b2..ef7498f05b34 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -961,11 +961,16 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
>  		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
>  }
>  
> -static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
> +static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
> +			      struct etr_buf *etr_buf)
>  {
>  	u32 axictl, sts;
> -	struct etr_buf *etr_buf = drvdata->etr_buf;
>  
> +	/* Callers should provide an appropriate buffer for use */
> +	if (WARN_ON(!etr_buf || drvdata->etr_buf))
> +		return;
> +
> +	drvdata->etr_buf = etr_buf;
>  	/* Zero out the memory to help with debug */
>  	memset(etr_buf->vaddr, 0, etr_buf->size);
>  
> @@ -1023,12 +1028,15 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
>   * also updating the @bufpp on where to find it. Since the trace data
>   * starts at anywhere in the buffer, depending on the RRP, we adjust the
>   * @len returned to handle buffer wrapping around.
> + *
> + * We are protected here by drvdata->reading != 0, which ensures the
> + * sysfs_buf stays alive.
>   */
>  ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
>  			    loff_t pos, size_t len, char **bufpp)
>  {
>  	s64 offset;
> -	struct etr_buf *etr_buf = drvdata->etr_buf;
> +	struct etr_buf *etr_buf = drvdata->sysfs_buf;
>  
>  	if (pos + len > etr_buf->len)
>  		len = etr_buf->len - pos;
> @@ -1058,7 +1066,14 @@ tmc_etr_free_sysfs_buf(struct etr_buf *buf)
>  
>  static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
>  {
> -	tmc_sync_etr_buf(drvdata);
> +	struct etr_buf *etr_buf = drvdata->etr_buf;
> +
> +	if (WARN_ON(drvdata->sysfs_buf != etr_buf)) {
> +		tmc_etr_free_sysfs_buf(drvdata->sysfs_buf);
> +		drvdata->sysfs_buf = NULL;
> +	} else {
> +		tmc_sync_etr_buf(drvdata);
> +	}
>  }
>  
>  static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
> @@ -1077,6 +1092,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
>  	tmc_disable_hw(drvdata);
>  
>  	CS_LOCK(drvdata->base);
> +	/* Reset the ETR buf used by hardware */
> +	drvdata->etr_buf = NULL;
>  }
>  
>  static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
> @@ -1085,7 +1102,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	bool used = false;
>  	unsigned long flags;
>  	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> -	struct etr_buf *new_buf = NULL, *free_buf = NULL;
> +	struct etr_buf *sysfs_buf = NULL, *new_buf = NULL, *free_buf = NULL;
>  
>  
>  	/*
> @@ -1097,7 +1114,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	 * with the lock released.
>  	 */
>  	spin_lock_irqsave(&drvdata->spinlock, flags);
> -	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
> +	sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
> +	if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) {
>  		spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  		/* Allocate memory with the spinlock released */
>  		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
> @@ -1125,15 +1143,16 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	 * If we don't have a buffer or it doesn't match the requested size,
>  	 * use the memory allocated above. Otherwise reuse it.
>  	 */
> -	if (!drvdata->etr_buf ||
> -	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
> +	sysfs_buf = READ_ONCE(drvdata->sysfs_buf);
> +	if (!sysfs_buf ||
> +	    (new_buf && sysfs_buf->size != new_buf->size)) {
>  		used = true;
> -		free_buf = drvdata->etr_buf;
> -		drvdata->etr_buf = new_buf;
> +		free_buf = sysfs_buf;
> +		drvdata->sysfs_buf = new_buf;
>  	}
>  
>  	drvdata->mode = CS_MODE_SYSFS;
> -	tmc_etr_enable_hw(drvdata);
> +	tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
>  out:
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  
> @@ -1218,13 +1237,13 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
>  		goto out;
>  	}
>  
> -	/* If drvdata::etr_buf is NULL the trace data has been read already */
> -	if (drvdata->etr_buf == NULL) {
> +	/* If sysfs_buf is NULL the trace data has been read already */
> +	if (!drvdata->sysfs_buf) {
>  		ret = -EINVAL;
>  		goto out;
>  	}
>  
> -	/* Disable the TMC if need be */
> +	/* Disable the TMC if we are trying to read from a running session */
>  	if (drvdata->mode == CS_MODE_SYSFS)
>  		tmc_etr_disable_hw(drvdata);
>  
> @@ -1238,7 +1257,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
>  int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
>  {
>  	unsigned long flags;
> -	struct etr_buf *etr_buf = NULL;
> +	struct etr_buf *sysfs_buf = NULL;
>  
>  	/* config types are set a boot time and never change */
>  	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
> @@ -1254,22 +1273,22 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
>  		 * so we don't have to explicitly clear it. Also, since the
>  		 * tracer is still enabled drvdata::buf can't be NULL.
>  		 */
> -		tmc_etr_enable_hw(drvdata);
> +		tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
>  	} else {
>  		/*
>  		 * The ETR is not tracing and the buffer was just read.
>  		 * As such prepare to free the trace buffer.
>  		 */
> -		etr_buf =  drvdata->etr_buf;
> -		drvdata->etr_buf = NULL;
> +		sysfs_buf = drvdata->sysfs_buf;
> +		drvdata->sysfs_buf = NULL;
>  	}
>  
>  	drvdata->reading = false;
>  	spin_unlock_irqrestore(&drvdata->spinlock, flags);
>  
>  	/* Free allocated memory out side of the spinlock */
> -	if (etr_buf)
> -		tmc_free_etr_buf(etr_buf);
> +	if (sysfs_buf)
> +		tmc_etr_free_sysfs_buf(sysfs_buf);
>  
>  	return 0;
>  }
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 69da0b584a6b..14a3dec50b0f 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -185,6 +185,7 @@ struct etr_buf {
>   * @trigger_cntr: amount of words to store after a trigger.
>   * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
>   *		device configuration register (DEVID)
> + * @sysfs_data:	SYSFS buffer for ETR.
>   */
>  struct tmc_drvdata {
>  	void __iomem		*base;
> @@ -204,6 +205,7 @@ struct tmc_drvdata {
>  	enum tmc_mem_intf_width	memwidth;
>  	u32			trigger_cntr;
>  	u32			etr_caps;
> +	struct etr_buf		*sysfs_buf;
>  };
>  
>  struct etr_buf_operations {
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 13/17] coresight etr: Do not clean ETR trace buffer
  2017-10-19 17:15 ` [PATCH 13/17] coresight etr: Do not clean ETR trace buffer Suzuki K Poulose
@ 2017-11-02 20:36   ` Mathieu Poirier
  2017-11-03 10:10     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-02 20:36 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
> We zero out the entire trace buffer used for ETR before it
> is enabled, for helping with debugging. Since we could be
> restoring a session in perf mode, this could destroy the data.

I'm not sure I follow you with "... restoring a session in perf mode ...".
When operating from the perf interface, all the memory allocated for a session
is cleaned up afterwards; there is no re-use of memory as in sysFS.

> Get rid of this step, if someone wants to debug, they can always
> add it as and when needed.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 31353fc34b53..849684f85443 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
>  		return;
>  
>  	drvdata->etr_buf = etr_buf;
> -	/* Zero out the memory to help with debug */
> -	memset(etr_buf->vaddr, 0, etr_buf->size);

I agree, this can be costly when dealing with large areas of memory.

>  
>  	CS_UNLOCK(drvdata->base);
>  
> @@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
>  	if (drvdata->mode == CS_MODE_SYSFS) {
>  		/*
>  		 * The trace run will continue with the same allocated trace
> -		 * buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
> -		 * so we don't have to explicitly clear it. Also, since the
> -		 * tracer is still enabled drvdata::buf can't be NULL.
> +		 * buffer. Since the tracer is still enabled drvdata::buf can't
> +		 * be NULL.
>  		 */
>  		tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
>  	} else {
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/17] coresight: tmc etr: Add transparent buffer management
  2017-11-02 17:48   ` Mathieu Poirier
@ 2017-11-03 10:02     ` Suzuki K Poulose
  2017-11-03 20:13       ` Mathieu Poirier
  0 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-03 10:02 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On 02/11/17 17:48, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:
>> At the moment we always use contiguous memory for TMC ETR tracing
>> when used from sysfs. The size of the buffer is fixed at boot time
>> and can only be changed by modifying the DT. With the introduction
>> of SG support we could support really large buffers in that mode.
>> This patch abstracts the buffer used for ETR to switch between a
>> contiguous buffer or a SG table depending on the availability of
>> the memory.
>>
>> This also enables the sysfs mode to use the ETR in SG mode depending
>> on the configured trace buffer size. Also, since ETR will use the
>> new infrastructure to manage the buffer, we can get rid of some
>> of the members in the tmc_drvdata and clean up the fields a bit.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++++++++++++++++++-----
>>   drivers/hwtracing/coresight/coresight-tmc.h     |  60 +++-
>>   2 files changed, 403 insertions(+), 90 deletions(-)
>>
> 
> [..]
>   
>> +
>> +static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)

>> +	w_offset = tmc_sg_get_data_page_offset(table, rwp);
>> +	if (w_offset < 0) {
>> +		dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
>> +				rwp);
> 
>                  dev_warn(table->dev,
>                           "Unable to map RWP %llx to offset\n", rwq);
> 
> It looks a little better and we respect indentation rules.  Same for r_offset.
> 

>> +static inline int tmc_etr_mode_alloc_buf(int mode,
>> +				  struct tmc_drvdata *drvdata,
>> +				  struct etr_buf *etr_buf, int node,
>> +				  void **pages)
> 
> static inline int
> tmc_etr_mode_alloc_buf(int mode,
>                         struct tmc_drvdata *drvdata,
>                         struct etr_buf *etr_buf, int node,
>                         void **pages)

>> + * tmc_alloc_etr_buf: Allocate a buffer used by ETR.
>> + * @drvdata	: ETR device details.
>> + * @size	: size of the requested buffer.
>> + * @flags	: Required properties of the type of buffer.
>> + * @node	: Node for memory allocations.
>> + * @pages	: An optional list of pages.
>> + */
>> +static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
>> +					  ssize_t size, int flags,
>> +					  int node, void **pages)
> 
> Please fix indentation.  Also @flags isn't used.
> 

Yep, flags is only used later; I can move it to the patch where we use it.

>> +{
>> +	int rc = -ENOMEM;
>> +	bool has_etr_sg, has_iommu;
>> +	struct etr_buf *etr_buf;
>> +
>> +	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
>> +	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
>> +
>> +	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
>> +	if (!etr_buf)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	etr_buf->size = size;
>> +
>> +	/*
>> +	 * If we have to use an existing list of pages, we cannot reliably
>> +	 * use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
>> +	 * we use the contiguous DMA memory if :
>> +	 *  a) The ETR cannot use Scatter-Gather.
>> +	 *  b) if not a, we have an IOMMU backup
> 
> Please rework the above sentence.

How about :
	   b) if (a) is not true and we have an IOMMU connected to the ETR.
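
In context, the allocation fallback that comment describes would read roughly
like the sketch below (illustrative only, not necessarily the exact final
patch; the helper and variable names are the ones used in this series):

	/*
	 * If we have to use an existing list of pages, we cannot reliably
	 * use contiguous DMA memory (even if we have an IOMMU). Otherwise,
	 * we use contiguous DMA memory if:
	 *  a) The ETR cannot use Scatter-Gather, or
	 *  b) (a) is not true and we have an IOMMU connected to the ETR
	 *     (plus a small-size heuristic to prefer flat for small buffers).
	 */
	rc = -ENOMEM;
	if (!pages && (!has_etr_sg || has_iommu || size < SZ_1M))
		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
					    etr_buf, node, pages);
	if (rc && has_etr_sg)
		rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
					    etr_buf, node, pages);
	if (rc) {
		kfree(etr_buf);
		return ERR_PTR(rc);
	}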

I will address the other comments on indentation.

Thanks for the detailed look

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 10/17] coresight: etr: Track if the device is coherent
  2017-11-02 19:40   ` Mathieu Poirier
@ 2017-11-03 10:03     ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-03 10:03 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On 02/11/17 19:40, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:46PM +0100, Suzuki K Poulose wrote:
>> Track if the ETR is dma-coherent or not. This will be useful
>> in deciding if we should use software buffering for perf.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc.c | 5 ++++-
>>   drivers/hwtracing/coresight/coresight-tmc.h | 1 +
>>   2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
>> index 4939333cc6c7..5a8c41130f96 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc.c
>> @@ -347,6 +347,9 @@ static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
>>   	if (!(devid & TMC_DEVID_NOSCAT))
>>   		tmc_etr_set_cap(drvdata, TMC_ETR_SG);
>>   
>> +	if (device_get_dma_attr(drvdata->dev) == DEV_DMA_COHERENT)
>> +		tmc_etr_set_cap(drvdata, TMC_ETR_COHERENT);
>> +
>>   	/* Check if the AXI address width is available */
>>   	if (devid & TMC_DEVID_AXIAW_VALID)
>>   		dma_mask = ((devid >> TMC_DEVID_AXIAW_SHIFT) &
>> @@ -397,7 +400,7 @@ static int tmc_probe(struct amba_device *adev, const struct amba_id *id)
>>   	if (!drvdata)
>>   		goto out;
>>   
>> -	drvdata->dev = &adev->dev;
>> +	drvdata->dev = dev;
> 
> What is that one for?
> 

Oops, that was a minor cleanup and need not be part of this patch. I will leave things
as they are. It is not worth a separate patch.

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers
  2017-11-02 20:26   ` Mathieu Poirier
@ 2017-11-03 10:08     ` Suzuki K Poulose
  2017-11-03 20:30       ` Mathieu Poirier
  0 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-03 10:08 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, robert.walker, mike.leach, coresight

On 02/11/17 20:26, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:
>> Since the ETR could be driven either by SYSFS or by perf, it
>> becomes complicated to manage the buffers used for each
>> of these modes. The ETR driver cannot simply free the currently
>> attached buffer without knowing the provider (i.e., sysfs vs perf).
>>
>> To solve this issue, we provide:
>> 1) the driver-mode specific etr buffer to be retained in the drvdata
>> 2) the etr_buf for a session should be passed on when enabling the
>>     hardware, which will be stored in drvdata->etr_buf. This will be
>>     replaced (not free'd) as soon as the hardware is disabled, after
>>     necessary sync operation.
> 
> If I get you right, the problem you're trying to solve is what to do with a
> sysFS buffer that hasn't been read (and freed) when a perf session is
> requested.  In my opinion it should simply be freed.  Indeed, the user
> probably doesn't care much about that sysFS buffer; if it did, the data
> would have been harvested.

Not only that. If we simply use the drvdata->etr_buf, we cannot track the mode
which uses it. If we keep the etr_buf around, how does the new mode's user decide
how to free the existing one? (e.g., the perf etr_buf could be associated with
other perf data structures). This change would allow us to leave the handling
of the etr_buf to its respective modes.

And whether to keep the sysfs etr_buf around is a separate decision from the
above.
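
To make the intended split a bit more concrete, here is a rough sketch of the
ownership model (names as in the series; this is only an illustration, not the
final code):

	/*
	 * drvdata->sysfs_buf - allocated, retained and freed by the sysfs
	 *                      mode (tmc_etr_free_sysfs_buf()).
	 * perf etr_buf       - allocated and freed by the perf mode, and
	 *                      possibly tied to other perf data structures.
	 * drvdata->etr_buf   - transient pointer to whichever buffer the
	 *                      hardware is currently using; it is set when
	 *                      the ETR is enabled and is never freed by the
	 *                      common ETR code.
	 */
	static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
				      struct etr_buf *etr_buf)
	{
		drvdata->etr_buf = etr_buf;
		/* ... program DBA/RRP/RWP from etr_buf and start the TMC ... */
	}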


Cheers
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 13/17] coresight etr: Do not clean ETR trace buffer
  2017-11-02 20:36   ` Mathieu Poirier
@ 2017-11-03 10:10     ` Suzuki K Poulose
  2017-11-03 20:17       ` Mathieu Poirier
  0 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-03 10:10 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On 02/11/17 20:36, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
>> We zero out the entire trace buffer used for ETR before it
>> is enabled, for helping with debugging. Since we could be
>> restoring a session in perf mode, this could destroy the data.
> 
> I'm not sure I follow you with "... restoring a session in perf mode ...".
> When operating from the perf interface, all the memory allocated for a session
> is cleaned up afterwards; there is no re-use of memory as in sysFS.

We could directly use the perf ring buffer for the ETR. In that case, the perf
ring buffer could contain trace data collected from the previous "schedule"
which the userspace hasn't collected yet. So, doing a memset here would
destroy that data.

Cheers
Suzuki

> 
>> Get rid of this step, if someone wants to debug, they can always
>> add it as and when needed.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++-----
>>   1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> index 31353fc34b53..849684f85443 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>> @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
>>   		return;
>>   
>>   	drvdata->etr_buf = etr_buf;
>> -	/* Zero out the memory to help with debug */
>> -	memset(etr_buf->vaddr, 0, etr_buf->size);
> 
> I agree, this can be costly when dealing with large areas of memory.
> 
>>   
>>   	CS_UNLOCK(drvdata->base);
>>   
>> @@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
>>   	if (drvdata->mode == CS_MODE_SYSFS) {
>>   		/*
>>   		 * The trace run will continue with the same allocated trace
>> -		 * buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
>> -		 * so we don't have to explicitly clear it. Also, since the
>> -		 * tracer is still enabled drvdata::buf can't be NULL.
>> +		 * buffer. Since the tracer is still enabled drvdata::buf can't
>> +		 * be NULL.
>>   		 */
>>   		tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
>>   	} else {
>> -- 
>> 2.13.6
>>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/17] coresight: tmc etr: Add transparent buffer management
  2017-11-03 10:02     ` Suzuki K Poulose
@ 2017-11-03 20:13       ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-03 20:13 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, Mike Leach, coresight

On 3 November 2017 at 04:02, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 02/11/17 17:48, Mathieu Poirier wrote:
>>
>> On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:
>>>
>>> At the moment we always use contiguous memory for TMC ETR tracing
>>> when used from sysfs. The size of the buffer is fixed at boot time
>>> and can only be changed by modifying the DT. With the introduction
>>> of SG support we could support really large buffers in that mode.
>>> This patch abstracts the buffer used for ETR to switch between a
>>> contiguous buffer or a SG table depending on the availability of
>>> the memory.
>>>
>>> This also enables the sysfs mode to use the ETR in SG mode depending
>>> on the configured trace buffer size. Also, since ETR will use the
>>> new infrastructure to manage the buffer, we can get rid of some
>>> of the members in the tmc_drvdata and clean up the fields a bit.
>>>
>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 433
>>> +++++++++++++++++++-----
>>>   drivers/hwtracing/coresight/coresight-tmc.h     |  60 +++-
>>>   2 files changed, 403 insertions(+), 90 deletions(-)
>>>
>>
>> [..]
>>
>>>
>>> +
>>> +static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64
>>> rwp)
>
>
>>> +       w_offset = tmc_sg_get_data_page_offset(table, rwp);
>>> +       if (w_offset < 0) {
>>> +               dev_warn(table->dev, "Unable to map RWP %llx to
>>> offset\n",
>>> +                               rwp);
>>
>>
>>                  dev_warn(table->dev,
>>                           "Unable to map RWP %llx to offset\n", rwq);
>>
>> It looks a little better and we respect indentation rules.  Same for
>> r_offset.
>>
>
>>> +static inline int tmc_etr_mode_alloc_buf(int mode,
>>> +                                 struct tmc_drvdata *drvdata,
>>> +                                 struct etr_buf *etr_buf, int node,
>>> +                                 void **pages)
>>
>>
>> static inline int
>> tmc_etr_mode_alloc_buf(int mode,
>>                         struct tmc_drvdata *drvdata,
>>                         struct etr_buf *etr_buf, int node,
>>                         void **pages)
>
>
>>> + * tmc_alloc_etr_buf: Allocate a buffer used by ETR.
>>> + * @drvdata    : ETR device details.
>>> + * @size       : size of the requested buffer.
>>> + * @flags      : Required properties of the type of buffer.
>>> + * @node       : Node for memory allocations.
>>> + * @pages      : An optional list of pages.
>>> + */
>>> +static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
>>> +                                         ssize_t size, int flags,
>>> +                                         int node, void **pages)
>>
>>
>> Please fix indentation.  Also @flags isn't used.
>>

Ok, I haven't made it that far yet.  If it's used later on, just leave
it as it is.

>
> Yep, flags is only used later; I can move it to the patch where we use it.
>
>>> +{
>>> +       int rc = -ENOMEM;
>>> +       bool has_etr_sg, has_iommu;
>>> +       struct etr_buf *etr_buf;
>>> +
>>> +       has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
>>> +       has_iommu = iommu_get_domain_for_dev(drvdata->dev);
>>> +
>>> +       etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
>>> +       if (!etr_buf)
>>> +               return ERR_PTR(-ENOMEM);
>>> +
>>> +       etr_buf->size = size;
>>> +
>>> +       /*
>>> +        * If we have to use an existing list of pages, we cannot
>>> reliably
>>> +        * use a contiguous DMA memory (even if we have an IOMMU).
>>> Otherwise,
>>> +        * we use the contiguous DMA memory if :
>>> +        *  a) The ETR cannot use Scatter-Gather.
>>> +        *  b) if not a, we have an IOMMU backup
>>
>>
>> Please rework the above sentence.
>
>
> How about :
>            b) if (a) is not true and we have an IOMMU connected to the ETR.

I'm good with that.

>
> I will address the other comments on indentation.
>
> Thanks for the detailed look
>
> Cheers
> Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 13/17] coresight etr: Do not clean ETR trace buffer
  2017-11-03 10:10     ` Suzuki K Poulose
@ 2017-11-03 20:17       ` Mathieu Poirier
  2017-11-07 10:37         ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-03 20:17 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, Mike Leach, coresight

On 3 November 2017 at 04:10, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 02/11/17 20:36, Mathieu Poirier wrote:
>>
>> On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
>>>
>>> We zero out the entire trace buffer used for ETR before it
>>> is enabled, for helping with debugging. Since we could be
>>> restoring a session in perf mode, this could destroy the data.
>>
>>
>> I'm not sure I follow you with "... restoring a session in perf mode
>> ...".
>> When operating from the perf interface, all the memory allocated for a
>> session is cleaned up afterwards; there is no re-use of memory as in
>> sysFS.
>
>
> We could directly use the perf ring buffer for the ETR. In that case, the
> perf
> ring buffer could contain trace data collected from the previous "schedule"
> which the userspace hasn't collected yet. So, doing a memset here would
> destroy that data.

I originally thought your comment was about re-using the memory from a
previous trace session, hence the confusion.  Please rework your
changelog to include this clarification as I am sure other people can
be misled.

>
> Cheers
> Suzuki
>
>>
>>> Get rid of this step, if someone wants to debug, they can always
>>> add it as and when needed.
>>>
>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>>>   drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++-----
>>>   1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> index 31353fc34b53..849684f85443 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>> @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata
>>> *drvdata,
>>>                 return;
>>>         drvdata->etr_buf = etr_buf;
>>> -       /* Zero out the memory to help with debug */
>>> -       memset(etr_buf->vaddr, 0, etr_buf->size);
>>
>>
>> I agree, this can be costly when dealing with large areas of memory.
>>
>>>         CS_UNLOCK(drvdata->base);
>>>   @@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata
>>> *drvdata)
>>>         if (drvdata->mode == CS_MODE_SYSFS) {
>>>                 /*
>>>                  * The trace run will continue with the same allocated
>>> trace
>>> -                * buffer. The trace buffer is cleared in
>>> tmc_etr_enable_hw(),
>>> -                * so we don't have to explicitly clear it. Also, since
>>> the
>>> -                * tracer is still enabled drvdata::buf can't be NULL.
>>> +                * buffer. Since the tracer is still enabled drvdata::buf
>>> can't
>>> +                * be NULL.
>>>                  */
>>>                 tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf);
>>>         } else {
>>> --
>>> 2.13.6
>>>
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers
  2017-11-03 10:08     ` Suzuki K Poulose
@ 2017-11-03 20:30       ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-03 20:30 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, Robert Walker, Mike Leach, coresight

On 3 November 2017 at 04:08, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 02/11/17 20:26, Mathieu Poirier wrote:
>>
>> On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:
>>>
>>> Since the ETR could be driven either by SYSFS or by perf, it
>>> becomes complicated to manage the buffers used for each
>>> of these modes. The ETR driver cannot simply free the currently
>>> attached buffer without knowing the provider (i.e., sysfs vs perf).
>>>
>>> To solve this issue, we provide:
>>> 1) the driver-mode specific etr buffer to be retained in the drvdata
>>> 2) the etr_buf for a session should be passed on when enabling the
>>>     hardware, which will be stored in drvdata->etr_buf. This will be
>>>     replaced (not free'd) as soon as the hardware is disabled, after
>>>     necessary sync operation.
>>
>>
>> If I get you right, the problem you're trying to solve is what to do with
>> a sysFS buffer that hasn't been read (and freed) when a perf session is
>> requested.  In my opinion it should simply be freed.  Indeed, the user
>> probably doesn't care much about that sysFS buffer; if it did, the data
>> would have been harvested.
>
>
> Not only that. If we simply use the drvdata->etr_buf, we cannot track the
> mode which uses it. If we keep the etr_buf around, how does the new mode's
> user decide how to free the existing one? (e.g., the perf etr_buf could be
> associated with other perf data structures). This change would allow us to
> leave the handling of the etr_buf to its respective modes.

struct etr_buf has a 'mode' and an '*ops'; how is that not sufficient?
I'll try to finish reviewing your patches today; maybe I'll find the
answer later on...

>
> And whether to keep the sysfs etr_buf around is a separate decision from the
> above.
>
>
> Cheers
> Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 14/17] coresight: etr: Add support for save restore buffers
  2017-10-19 17:15 ` [PATCH 14/17] coresight: etr: Add support for save restore buffers Suzuki K Poulose
@ 2017-11-03 22:22   ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-03 22:22 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:50PM +0100, Suzuki K Poulose wrote:
> Add support for creating buffers which can be used in save-restore
> mode (e.g., for use by perf). If the TMC-ETR supports the save-restore
> feature, we could support the mode in all buffer backends. However,

Instead of using the term backend, simply write contiguous buffer or SG mode.  It
is a lot less cryptic that way.

> if it doesn't, we should fall back to using the in-built SG mechanism,
> where we can rotate the SG table by making some adjustments in the
> page table.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 132 +++++++++++++++++++++++-
>  drivers/hwtracing/coresight/coresight-tmc.h     |  15 +++
>  2 files changed, 143 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 849684f85443..f8e654e1f5b2 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -590,7 +590,7 @@ tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
>   * 3) Update the hwaddr to point to the table pointer for the buffer
>   *    which starts at "base".
>   */
> -static int __maybe_unused
> +static int
>  tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
>  {
>  	u32 last_entry, first_entry;
> @@ -700,6 +700,9 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
>  		return -ENOMEM;
>  	etr_buf->vaddr = vaddr;
>  	etr_buf->hwaddr = paddr;
> +	etr_buf->rrp = paddr;
> +	etr_buf->rwp = paddr;
> +	etr_buf->status = 0;
>  	etr_buf->mode = ETR_MODE_FLAT;
>  	etr_buf->private = drvdata;
>  	return 0;
> @@ -754,13 +757,19 @@ static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata,
>  				void **pages)
>  {
>  	struct etr_sg_table *etr_table;
> +	struct tmc_sg_table *sg_table;
>  
>  	etr_table = tmc_init_etr_sg_table(drvdata->dev, node,
>  					  etr_buf->size, pages);
>  	if (IS_ERR(etr_table))
>  		return -ENOMEM;
> +	sg_table = etr_table->sg_table;

As far as I can tell this doesn't do anything.

>  	etr_buf->vaddr = tmc_sg_table_data_vaddr(etr_table->sg_table);
>  	etr_buf->hwaddr = etr_table->hwaddr;
> +	/* TMC ETR SG automatically sets the RRP/RWP when enabled */

If TMC ETR SG automatically sets the RRP/RWP, why set it explicitly?

> +	etr_buf->rrp = etr_table->hwaddr;
> +	etr_buf->rwp = etr_table->hwaddr;
> +	etr_buf->status = 0;
>  	etr_buf->mode = ETR_MODE_ETR_SG;
>  	etr_buf->private = etr_table;
>  	return 0;
> @@ -816,11 +825,49 @@ static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
>  	tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
>  }
>  
> +static int tmc_etr_restore_sg_buf(struct etr_buf *etr_buf,
> +				   u64 r_offset, u64 w_offset,
> +				   u32 status, bool has_save_restore)

Indentation

> +{
> +	int rc;
> +	struct etr_sg_table *etr_table = etr_buf->private;
> +	struct device *dev = etr_table->sg_table->dev;
> +
> +	/*
> +	 * It is highly unlikely that we have an ETR with in-built SG and
> +	 * Save-Restore capability and we are not sure if the PTRs will
> +	 * be updated.
> +	 */
> +	if (has_save_restore) {
> +		dev_warn_once(dev,
> +		"Unexpected feature combination of SG and save-restore\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Since we cannot program RRP/RWP different from DBAL, the offsets
> +	 * should match.
> +	 */
> +	if (r_offset != w_offset) {
> +		dev_dbg(dev, "Mismatched RRP/RWP offsets\n");
> +		return -EINVAL;
> +	}
> +
> +	rc = tmc_etr_sg_table_rotate(etr_table, w_offset);
> +	if (!rc) {
> +		etr_buf->hwaddr = etr_table->hwaddr;
> +		etr_buf->rrp = etr_table->hwaddr;
> +		etr_buf->rwp = etr_table->hwaddr;
> +	}
> +	return rc;
> +}
> +
>  static const struct etr_buf_operations etr_sg_buf_ops = {
>  	.alloc = tmc_etr_alloc_sg_buf,
>  	.free = tmc_etr_free_sg_buf,
>  	.sync = tmc_etr_sync_sg_buf,
>  	.get_data = tmc_etr_get_data_sg_buf,
> +	.restore = tmc_etr_restore_sg_buf,
>  };
>  
>  static const struct etr_buf_operations *etr_buf_ops[] = {
> @@ -861,10 +908,42 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
>  {
>  	int rc = -ENOMEM;
>  	bool has_etr_sg, has_iommu;
> +	bool has_flat, has_save_restore;
>  	struct etr_buf *etr_buf;
>  
>  	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
>  	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
> +	has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE);
> +
> +	/*
> +	 * We can normally use flat DMA buffer provided that the buffer
> +	 * is not used in save restore fashion without hardware support.
> +	 */
> +	has_flat = !(flags & ETR_BUF_F_RESTORE_PTRS) || has_save_restore;
> +
> +	/*
> +	 * To support save-restore on a given ETR we have the following
> +	 * conditions:
> +	 *  1) If the buffer requires save-restore of the pointers as well
> +	 *     as the Status bit, we require ETR support for it and we could
> +	 *     support all the backends.
> +	 *  2) If the buffer requires only save-restore of pointers, then
> +	 *     we could exploit a circular ETR SG list. None of the other
> +	 *     backends can support it without the ETR feature.
> +	 *
> +	 * If the buffer will be used in a save-restore mode without
> +	 * the ETR support for SAVE_RESTORE, we can only support TMC
> +	 * ETR in-built SG tables which can be rotated to make it work.
> +	 */
> +	if ((flags & ETR_BUF_F_RESTORE_STATUS) && !has_save_restore)
> +		return ERR_PTR(-EINVAL);
> +
> +	if (!has_flat && !has_etr_sg) {
> +		dev_dbg(drvdata->dev,
> +			"No available backends for ETR buffer with flags %x\n",
> +			flags);
> +		return ERR_PTR(-EINVAL);
> +	}
>  
>  	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
>  	if (!etr_buf)
> @@ -883,7 +962,7 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
>  	 * Fallback to available mechanisms.
>  	 *
>  	 */
> -	if (!pages &&
> +	if (!pages && has_flat &&
>  	    (!has_etr_sg || has_iommu || size < SZ_1M))
>  		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
>  					    etr_buf, node, pages);
> @@ -961,6 +1040,51 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
>  		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
>  }
>  
> +/*
> + * tmc_etr_restore_generic: Common helper to restore the buffer
> + * status for FLAT buffers, which use linear TMC ETR address range.
> + * This is only possible with in-built ETR capability to save-restore
> + * the pointers. The DBA will still point to the original start of the
> + * buffer.
> + */
> +static int tmc_etr_buf_generic_restore(struct etr_buf *etr_buf,
> +					u64 r_offset, u64 w_offset,
> +					u32 status, bool has_save_restore)

Indentation

> +{
> +	u64 size = etr_buf->size;
> +
> +	if (!has_save_restore)
> +		return -EINVAL;
> +	etr_buf->rrp = etr_buf->hwaddr + (r_offset % size);
> +	etr_buf->rwp = etr_buf->hwaddr + (w_offset % size);
> +	etr_buf->status = status;
> +	return 0;
> +}
> +
> +static int __maybe_unused
> +tmc_restore_etr_buf(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
> +		    u64 r_offset, u64 w_offset, u32 status)
> +{
> +	bool has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE);
> +
> +	if (WARN_ON_ONCE(!has_save_restore && etr_buf->mode != ETR_MODE_ETR_SG))
> +		return -EINVAL;
> +	/*
> +	 * If we use a circular SG list without ETR support, we can't
> +	 * support restoring "Full" bit.
> +	 */
> +	if (WARN_ON_ONCE(!has_save_restore && status))
> +		return -EINVAL;
> +	if (status & ~TMC_STS_FULL)
> +		return -EINVAL;
> +	if (etr_buf->ops->restore)
> +		return etr_buf->ops->restore(etr_buf, r_offset, w_offset,
> +					      status, has_save_restore);
> +	else
> +		return tmc_etr_buf_generic_restore(etr_buf, r_offset, w_offset,
> +					       status, has_save_restore);
> +}
> +
>  static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
>  			      struct etr_buf *etr_buf)
>  {
> @@ -1004,8 +1128,8 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata,
>  	 * STS to "not full").
>  	 */
>  	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
> -		tmc_write_rrp(drvdata, etr_buf->hwaddr);
> -		tmc_write_rwp(drvdata, etr_buf->hwaddr);
> +		tmc_write_rrp(drvdata, etr_buf->rrp);
> +		tmc_write_rwp(drvdata, etr_buf->rwp);
>  		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
>  		writel_relaxed(sts, drvdata->base + TMC_STS);
>  	}
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 14a3dec50b0f..2c5b905b6494 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -142,12 +142,22 @@ enum etr_mode {
>  	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
>  };
>  
> +/* ETR buffer should support save-restore */
> +#define ETR_BUF_F_RESTORE_PTRS		0x1
> +#define ETR_BUF_F_RESTORE_STATUS	0x2
> +
> +#define ETR_BUF_F_RESTORE_MINIMAL	ETR_BUF_F_RESTORE_PTRS
> +#define ETR_BUF_F_RESTORE_FULL		(ETR_BUF_F_RESTORE_PTRS |\
> +					 ETR_BUF_F_RESTORE_STATUS)
>  struct etr_buf_operations;
>  
>  /**
>   * struct etr_buf - Details of the buffer used by ETR
>   * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
>   * @full	: Trace data overflow
> + * @status	: Value for STATUS if the ETR supports save-restore.
> + * @rrp		: Value for RRP{LO:HI} if the ETR supports save-restore
> + * @rwp		: Value for RWP{LO:HI} if the ETR supports save-restore
>   * @size	: Size of the buffer.
>   * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
>   * @vaddr	: Virtual address of the buffer used for trace.
> @@ -159,6 +169,9 @@ struct etr_buf_operations;
>  struct etr_buf {
>  	enum etr_mode			mode;
>  	bool				full;
> +	u32				status;
> +	dma_addr_t			rrp;
> +	dma_addr_t			rwp;
>  	ssize_t				size;
>  	dma_addr_t			hwaddr;
>  	void				*vaddr;
> @@ -212,6 +225,8 @@ struct etr_buf_operations {
>  	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
>  			int node, void **pages);
>  	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
> +	int (*restore)(struct etr_buf *etr_buf, u64 r_offset,
> +		       u64 w_offset, u32 status, bool has_save_restore);
>  	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
>  				char **bufpp);
>  	void (*free)(struct etr_buf *etr_buf);
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-10-19 17:15 ` [PATCH 06/17] coresight: tmc: Make ETR SG table circular Suzuki K Poulose
  2017-10-20 17:11   ` Julien Thierry
  2017-11-01 23:47   ` Mathieu Poirier
@ 2017-11-06 19:07   ` Mathieu Poirier
  2017-11-07 10:36     ` Suzuki K Poulose
  2 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-06 19:07 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
> Make the ETR SG table a circular buffer so that we can start
> at any of the SG pages and use the entire buffer for tracing.
> This can be achieved by:
> 
> 1) Keeping an additional LINK pointer at the very end of the
> SG table, i.e., after the LAST buffer entry, to point back to
> the beginning of the first table. This will allow us to use
> the buffer normally when we start the trace at offset 0 of
> the buffer, as the LAST buffer entry hints the TMC-ETR and
> it automatically wraps to the offset 0.
> 
> 2) If we want to start at any other ETR SG page aligned offset,
> we could:
>  a) Make the preceding page entry as LAST entry.
>  b) Make the original LAST entry a normal entry.
>  c) Use the table pointer to the "new" start offset as the
>     base of the table address.
> This works as the TMC doesn't mandate that the page table
> base address should be 4K page aligned.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
>  1 file changed, 139 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 4424eb67a54c..c171b244e12a 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -71,36 +71,41 @@ typedef u32 sgte_t;
>   * @sg_table:		Generic SG Table holding the data/table pages.
>   * @hwaddr:		hwaddress used by the TMC, which is the base
>   *			address of the table.
> + * @nr_entries:		Total number of pointers in the table.
> + * @first_entry:	Index to the current "start" of the buffer.
> + * @last_entry:		Index to the last entry of the buffer.
>   */
>  struct etr_sg_table {
>  	struct tmc_sg_table	*sg_table;
>  	dma_addr_t		hwaddr;
> +	u32			nr_entries;
> +	u32			first_entry;
> +	u32			last_entry;
>  };
>  
>  /*
>   * tmc_etr_sg_table_entries: Total number of table entries required to map
>   * @nr_pages system pages.
>   *
> - * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
> + * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages and
> + * an additional Link pointer for making it a Circular buffer.
>   * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
>   * with the last entry pointing to the page containing the table
> - * entries. If we spill over to a new page for mapping 1 entry,
> - * we could as well replace the link entry of the previous page
> - * with the last entry.
> + * entries. If we fill the last table in full with the pointers, (i.e,
> + * nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) == 0, we don't have to allocate
> + * another table and hence skip the Link pointer. Also we could use the
> + * link entry of the last page to make it circular.
>   */
>  static inline unsigned long __attribute_const__
>  tmc_etr_sg_table_entries(int nr_pages)
>  {
>  	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
>  	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
> -	/*
> -	 * If we spill over to a new page for 1 entry, we could as well
> -	 * make it the LAST entry in the previous page, skipping the Link
> -	 * address.
> -	 */
> -	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
> +
> +	if (nr_sglinks && !(nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1)))
>  		nr_sglinks--;
> -	return nr_sgpages + nr_sglinks;
> +	/* Add an entry for the circular link */
> +	return nr_sgpages + nr_sglinks + 1;
>  }
>  
>  /*
> @@ -417,14 +422,21 @@ tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
>  /* Dump the given sg_table */
>  static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>  {
> -	sgte_t *ptr;
> +	sgte_t *ptr, *start;
>  	int i = 0;
>  	dma_addr_t addr;
>  	struct tmc_sg_table *sg_table = etr_table->sg_table;
>  
> -	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
> +	start = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
>  					      etr_table->hwaddr, true);
> -	while (ptr) {
> +	if (!start) {
> +		pr_debug("ERROR: Failed to translate table base: 0x%llx\n",
> +					 etr_table->hwaddr);
> +		return;
> +	}
> +
> +	ptr = start;
> +	do {
>  		addr = ETR_SG_ADDR(*ptr);
>  		switch (ETR_SG_ET(*ptr)) {
>  		case ETR_SG_ET_NORMAL:
> @@ -436,14 +448,17 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>  				 i, ptr, addr);
>  			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
>  							      addr, true);
> +			if (!ptr)
> +				pr_debug("ERROR: Bad Link 0x%llx\n", addr);
>  			break;
>  		case ETR_SG_ET_LAST:
>  			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
>  				 i, ptr, addr);
> -			return;
> +			ptr++;
> +			break;
>  		}
>  		i++;
> -	}
> +	} while (ptr && ptr != start);
>  	pr_debug("******* End of Table *****\n");
>  }
>  #endif
> @@ -458,7 +473,7 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
>  static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  {
>  	dma_addr_t paddr;
> -	int i, type, nr_entries;
> +	int i, type;
>  	int tpidx = 0; /* index to the current system table_page */
>  	int sgtidx = 0;	/* index to the sg_table within the current syspage */
>  	int sgtoffset = 0; /* offset to the next entry within the sg_table */
> @@ -469,16 +484,16 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
>  	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
>  
> -	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
>  	/*
>  	 * Use the contiguous virtual address of the table to update entries.
>  	 */
>  	ptr = sg_table->table_vaddr;
>  	/*
> -	 * Fill all the entries, except the last entry to avoid special
> +	 * Fill all the entries, except the last two entries (i.e, the last
> +	 * buffer and the circular link back to the base) to avoid special
>  	 * checks within the loop.
>  	 */
> -	for (i = 0; i < nr_entries - 1; i++) {
> +	for (i = 0; i < etr_table->nr_entries - 2; i++) {
>  		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
>  			/*
>  			 * Last entry in a sg_table page is a link address to
> @@ -519,6 +534,107 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  	/* Set up the last entry, which is always a data pointer */
>  	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
>  	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
> +	/* followed by a circular link, back to the start of the table */
> +	*ptr++ = ETR_SG_ENTRY(sg_table->table_daddr, ETR_SG_ET_LINK);
> +}
> +
> +/*
> + * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
> + * to the index of the page table "entry". Data pointers always have
> + * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
> + * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE -1) entries.
> + */
> +static inline u32
> +tmc_etr_sg_offset_to_table_index(u64 offset)
> +{
> +	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
> +
> +	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
> +}

This function is the source of a bizarre linking error when compiling [14/17] on
armv7 as pasted here:

  UPD     include/generated/compile.h
  CC      init/version.o
  AR      init/built-in.o
  AR      built-in.o
  LD      vmlinux.o
  MODPOST vmlinux.o
drivers/hwtracing/coresight/coresight-tmc-etr.o: In function
`tmc_etr_sg_offset_to_table_index':
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553:
undefined reference to `__aeabi_uldivmod'
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:551:
undefined reference to `__aeabi_uldivmod'
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553:
undefined reference to `__aeabi_uldivmod'
drivers/hwtracing/coresight/coresight-tmc-etr.o: In function
`tmc_etr_sg_table_rotate':
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:609:
undefined reference to `__aeabi_uldivmod'

Please see if you can reproduce on your side.
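
If it does, one possible way around it (untested sketch) is to avoid the
open-coded 64-bit divisions and use div_u64() from <linux/math64.h>, so
that the compiler doesn't emit calls to __aeabi_uldivmod on 32-bit ARM:

	#include <linux/math64.h>

	static inline u32
	tmc_etr_sg_offset_to_table_index(u64 offset)
	{
		u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;

		return sgpage_idx + div_u64(sgpage_idx,
					    ETR_SG_PTRS_PER_PAGE - 1);
	}

The same treatment would be needed for the '%' arithmetic on 64-bit values
in tmc_etr_sg_table_rotate().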

Thanks,
Mathieu

> +
> +/*
> + * tmc_etr_sg_update_type: Update the type of a given entry in the
> + * table to the requested entry. This is only used for data buffers
> + * to toggle the "NORMAL" vs "LAST" buffer entries.
> + */
> +static inline void tmc_etr_sg_update_type(sgte_t *entry, u32 type)
> +{
> +	WARN_ON(ETR_SG_ET(*entry) == ETR_SG_ET_LINK);
> +	WARN_ON(!ETR_SG_ET(*entry));
> +	*entry &= ~ETR_SG_ET_MASK;
> +	*entry |= type;
> +}
> +
> +/*
> + * tmc_etr_sg_table_index_to_daddr: Return the hardware address to the table
> + * entry @index. Use this address to let the table begin @index.
> + */
> +static inline dma_addr_t
> +tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
> +{
> +	u32 sys_page_idx = index / ETR_SG_PTRS_PER_SYSPAGE;
> +	u32 sys_page_offset = index % ETR_SG_PTRS_PER_SYSPAGE;
> +	sgte_t *ptr;
> +
> +	ptr = (sgte_t *)sg_table->table_pages.daddrs[sys_page_idx];
> +	return (dma_addr_t)&ptr[sys_page_offset];
> +}
> +
> +/*
> + * tmc_etr_sg_table_rotate : Rotate the SG circular buffer, moving
> + * the "base" to a requested offset. We do so by :
> + *
> + * 1) Reset the current LAST buffer.
> + * 2) Mark the "previous" buffer in the table to the "base" as LAST.
> + * 3) Update the hwaddr to point to the table pointer for the buffer
> + *    which starts at "base".
> + */
> +static int __maybe_unused
> +tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
> +{
> +	u32 last_entry, first_entry;
> +	u64 last_offset;
> +	struct tmc_sg_table *sg_table = etr_table->sg_table;
> +	sgte_t *table_ptr = sg_table->table_vaddr;
> +	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
> +
> +	/* Offset should always be SG PAGE_SIZE aligned */
> +	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
> +		pr_debug("unaligned base offset %llx\n", base_offset);
> +		return -EINVAL;
> +	}
> +	/* Make sure the offset is within the range */
> +	if (base_offset < 0 || base_offset > buf_size) {
> +		base_offset = (base_offset + buf_size) % buf_size;
> +		pr_debug("Resetting offset to %llx\n", base_offset);
> +	}
> +	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
> +	if (first_entry == etr_table->first_entry) {
> +		pr_debug("Head is already at %llx, skipping\n", base_offset);
> +		return 0;
> +	}
> +
> +	/* Last entry should be the previous one to the new "base" */
> +	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
> +	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
> +
> +	/* Reset the current Last page to Normal and new Last page to NORMAL */
> +	tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
> +				 ETR_SG_ET_NORMAL);
> +	tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
> +	etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
> +							    first_entry);
> +	etr_table->first_entry = first_entry;
> +	etr_table->last_entry = last_entry;
> +	pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
> +			base_offset, last_offset, first_entry, last_entry,
> +			etr_table->hwaddr);
> +	/* Sync the table for device */
> +	tmc_sg_table_sync_table(sg_table);
> +#ifdef ETR_SG_DEBUG
> +	tmc_etr_sg_table_dump(etr_table);
> +#endif
> +	return 0;
>  }
>  
>  /*
> @@ -552,6 +668,9 @@ tmc_init_etr_sg_table(struct device *dev, int node,
>  	}
>  
>  	etr_table->sg_table = sg_table;
> +	etr_table->nr_entries = nr_entries;
> +	etr_table->first_entry = 0;
> +	etr_table->last_entry = nr_entries - 2;
>  	/* TMC should use table base address for DBA */
>  	etr_table->hwaddr = sg_table->table_daddr;
>  	tmc_etr_sg_table_populate(etr_table);
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 16/17] coresight: perf: Remove reset_buffer call back for sinks
  2017-10-19 17:15 ` [PATCH 16/17] coresight: perf: Remove reset_buffer call back for sinks Suzuki K Poulose
@ 2017-11-06 21:10   ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-06 21:10 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:52PM +0100, Suzuki K Poulose wrote:
> Right now we issue the update_buffer() and reset_buffer() callbacks
> in succession when we stop tracing an event. The update_buffer is
> supposed to check the status of the buffer and make sure the ring buffer
> is updated with the trace data. And we store information about the
> size of the data collected only to be consumed by the reset_buffer
> callback which always follows the update_buffer. This patch gets
> rid of the reset_buffer callback altogether and performs the actions
> in update_buffer, making it return the size collected.
> 
> This removes a not-so-pretty hack (storing the new head in the
> size field for snapshot mode) and cleans it up a little bit.

The idea in splitting the update and reset operation was to seamlessly support
sinks that generate an interrupt when their buffer gets full.  Those are coming,
and when we do need to support them we'll find ourselves splitting the update
and reset operation again.

See comment below.

> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-etb10.c    | 56 +++++------------------
>  drivers/hwtracing/coresight/coresight-etm-perf.c |  9 +---
>  drivers/hwtracing/coresight/coresight-tmc-etf.c  | 58 +++++-------------------
>  include/linux/coresight.h                        |  5 +-
>  4 files changed, 26 insertions(+), 102 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
> index 757f556975f7..75c5699000b0 100644
> --- a/drivers/hwtracing/coresight/coresight-etb10.c
> +++ b/drivers/hwtracing/coresight/coresight-etb10.c
> @@ -323,37 +323,7 @@ static int etb_set_buffer(struct coresight_device *csdev,
>  	return ret;
>  }
>  
> -static unsigned long etb_reset_buffer(struct coresight_device *csdev,
> -				      struct perf_output_handle *handle,
> -				      void *sink_config)
> -{
> -	unsigned long size = 0;
> -	struct cs_buffers *buf = sink_config;
> -
> -	if (buf) {
> -		/*
> -		 * In snapshot mode ->data_size holds the new address of the
> -		 * ring buffer's head.  The size itself is the whole address
> -		 * range since we want the latest information.
> -		 */
> -		if (buf->snapshot)
> -			handle->head = local_xchg(&buf->data_size,
> -						  buf->nr_pages << PAGE_SHIFT);
> -
> -		/*
> -		 * Tell the tracer PMU how much we got in this run and if
> -		 * something went wrong along the way.  Nobody else can use
> -		 * this cs_buffers instance until we are done.  As such
> -		 * resetting parameters here and squaring off with the ring
> -		 * buffer API in the tracer PMU is fine.
> -		 */
> -		size = local_xchg(&buf->data_size, 0);
> -	}
> -
> -	return size;
> -}
> -
> -static void etb_update_buffer(struct coresight_device *csdev,
> +static unsigned long etb_update_buffer(struct coresight_device *csdev,
>  			      struct perf_output_handle *handle,
>  			      void *sink_config)
>  {
> @@ -362,13 +332,13 @@ static void etb_update_buffer(struct coresight_device *csdev,
>  	u8 *buf_ptr;
>  	const u32 *barrier;
>  	u32 read_ptr, write_ptr, capacity;
> -	u32 status, read_data, to_read;
> -	unsigned long offset;
> +	u32 status, read_data;
> +	unsigned long offset, to_read;
>  	struct cs_buffers *buf = sink_config;
>  	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>  
>  	if (!buf)
> -		return;
> +		return 0;
>  
>  	capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
>  
> @@ -473,18 +443,17 @@ static void etb_update_buffer(struct coresight_device *csdev,
>  	writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
>  
>  	/*
> -	 * In snapshot mode all we have to do is communicate to
> -	 * perf_aux_output_end() the address of the current head.  In full
> -	 * trace mode the same function expects a size to move rb->aux_head
> -	 * forward.
> +	 * In snapshot mode we have to update the handle->head to point
> +	 * to the new location.
>  	 */
> -	if (buf->snapshot)
> -		local_set(&buf->data_size, (cur * PAGE_SIZE) + offset);
> -	else
> -		local_add(to_read, &buf->data_size);
> -
> +	if (buf->snapshot) {
> +		handle->head = (cur * PAGE_SIZE) + offset;
> +		to_read = buf->nr_pages << PAGE_SHIFT;
> +	}
>  	etb_enable_hw(drvdata);
>  	CS_LOCK(drvdata->base);
> +
> +	return to_read;
>  }
>  
>  static const struct coresight_ops_sink etb_sink_ops = {
> @@ -493,7 +462,6 @@ static const struct coresight_ops_sink etb_sink_ops = {
>  	.alloc_buffer	= etb_alloc_buffer,
>  	.free_buffer	= etb_free_buffer,
>  	.set_buffer	= etb_set_buffer,
> -	.reset_buffer	= etb_reset_buffer,
>  	.update_buffer	= etb_update_buffer,
>  };
>  
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 8a0ad77574e7..e5f9567c87c4 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -342,15 +342,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  		if (!sink_ops(sink)->update_buffer)
>  			return;
>  
> -		sink_ops(sink)->update_buffer(sink, handle,
> +		size = sink_ops(sink)->update_buffer(sink, handle,
>  					      event_data->snk_config);
> -
> -		if (!sink_ops(sink)->reset_buffer)
> -			return;
> -
> -		size = sink_ops(sink)->reset_buffer(sink, handle,
> -						    event_data->snk_config);

For the current sinks we support, i.e. those that don't generate interrupts when
their buffer is full, I'm in agreement with your work.  I suggest you don't
touch the current implementation of etm_event_stop() and move everything you've
done to the reset operation.  The end result is the same and we don't have to
rework (again) etm_event_stop() when we need to support IPs that do send
interrupts.

> -
>  		perf_aux_output_end(handle, size);
>  	}
>  
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index aa4e8f03ef49..073198e7b46e 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -358,36 +358,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev,
>  	return ret;
>  }
>  
> -static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,
> -					  struct perf_output_handle *handle,
> -					  void *sink_config)
> -{
> -	long size = 0;
> -	struct cs_buffers *buf = sink_config;
> -
> -	if (buf) {
> -		/*
> -		 * In snapshot mode ->data_size holds the new address of the
> -		 * ring buffer's head.  The size itself is the whole address
> -		 * range since we want the latest information.
> -		 */
> -		if (buf->snapshot)
> -			handle->head = local_xchg(&buf->data_size,
> -						  buf->nr_pages << PAGE_SHIFT);
> -		/*
> -		 * Tell the tracer PMU how much we got in this run and if
> -		 * something went wrong along the way.  Nobody else can use
> -		 * this cs_buffers instance until we are done.  As such
> -		 * resetting parameters here and squaring off with the ring
> -		 * buffer API in the tracer PMU is fine.
> -		 */
> -		size = local_xchg(&buf->data_size, 0);
> -	}
> -
> -	return size;
> -}
> -
> -static void tmc_update_etf_buffer(struct coresight_device *csdev,
> +static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
>  				  struct perf_output_handle *handle,
>  				  void *sink_config)
>  {
> @@ -396,17 +367,17 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,
>  	const u32 *barrier;
>  	u32 *buf_ptr;
>  	u64 read_ptr, write_ptr;
> -	u32 status, to_read;
> -	unsigned long offset;
> +	u32 status;
> +	unsigned long offset, to_read;
>  	struct cs_buffers *buf = sink_config;
>  	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>  
>  	if (!buf)
> -		return;
> +		return 0;
>  
>  	/* This shouldn't happen */
>  	if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF))
> -		return;
> +		return 0;
>  
>  	CS_UNLOCK(drvdata->base);
>  
> @@ -495,18 +466,14 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,
>  		}
>  	}
>  
> -	/*
> -	 * In snapshot mode all we have to do is communicate to
> -	 * perf_aux_output_end() the address of the current head.  In full
> -	 * trace mode the same function expects a size to move rb->aux_head
> -	 * forward.
> -	 */
> -	if (buf->snapshot)
> -		local_set(&buf->data_size, (cur * PAGE_SIZE) + offset);
> -	else
> -		local_add(to_read, &buf->data_size);
> -
> +	/* In snapshot mode we have to update the head */
> +	if (buf->snapshot) {
> +		handle->head = (cur * PAGE_SIZE) + offset;
> +		to_read = buf->nr_pages << PAGE_SHIFT;
> +	}
>  	CS_LOCK(drvdata->base);
> +
> +	return to_read;
>  }
>  
>  static const struct coresight_ops_sink tmc_etf_sink_ops = {
> @@ -515,7 +482,6 @@ static const struct coresight_ops_sink tmc_etf_sink_ops = {
>  	.alloc_buffer	= tmc_alloc_etf_buffer,
>  	.free_buffer	= tmc_free_etf_buffer,
>  	.set_buffer	= tmc_set_etf_buffer,
> -	.reset_buffer	= tmc_reset_etf_buffer,
>  	.update_buffer	= tmc_update_etf_buffer,
>  };
>  
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index d950dad5056a..5c9e5fe2bf32 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -199,10 +199,7 @@ struct coresight_ops_sink {
>  	int (*set_buffer)(struct coresight_device *csdev,
>  			  struct perf_output_handle *handle,
>  			  void *sink_config);
> -	unsigned long (*reset_buffer)(struct coresight_device *csdev,
> -				      struct perf_output_handle *handle,
> -				      void *sink_config);
> -	void (*update_buffer)(struct coresight_device *csdev,
> +	unsigned long (*update_buffer)(struct coresight_device *csdev,
>  			      struct perf_output_handle *handle,
>  			      void *sink_config);
>  };
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf
  2017-10-19 17:15 ` [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf Suzuki K Poulose
@ 2017-11-07  0:24   ` Mathieu Poirier
  2017-11-07 10:52     ` Suzuki K Poulose
  0 siblings, 1 reply; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-07  0:24 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On Thu, Oct 19, 2017 at 06:15:53PM +0100, Suzuki K Poulose wrote:
> Add necessary support for using ETR as a sink in ETM perf tracing.
> We try to make the best use of the available buffer modes to
> avoid software double buffering.
> 
> We can use the perf ring buffer for ETR directly if all of the
> conditions below are met :
>  1) ETR is DMA coherent
>  2) perf is used in snapshot mode. In full tracing mode, we cannot
>     guarantee that the ETR will stop before it overwrites the data
>     which may not have been consumed by the user.
>  3) ETR supports save-restore together with a scatter-gather mechanism
>     that can use a given set of pages, in which case we use the perf ring
>     buffer directly. If we have an in-built TMC ETR Scatter Gather unit,
>     we make use of a circular SG list to restart from a given head.
>     However, we need to align the starting offset to 4K in this case.
> 
> If these conditions are not met, we fall back to software
> double buffering.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 372 +++++++++++++++++++++++-
>  drivers/hwtracing/coresight/coresight-tmc.h     |   2 +
>  2 files changed, 372 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 229c36b7266c..1dfe7cf7c721 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -21,6 +21,9 @@
>  #include "coresight-priv.h"
>  #include "coresight-tmc.h"
>  
> +/* Lower limit for ETR hardware buffer in double buffering mode */
> +#define TMC_ETR_PERF_MIN_BUF_SIZE	SZ_1M
> +
>  /*
>   * The TMC ETR SG has a page size of 4K. The SG table contains pointers
>   * to 4KB buffers. However, the OS may use a PAGE_SIZE different from
> @@ -1328,10 +1331,371 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>  	return ret;
>  }
>  
> +/*
> + * etr_perf_buffer - Perf buffer used for ETR
> + * @etr_buf		- Actual buffer used by the ETR
> + * @snapshot		- Perf session mode
> + * @head		- handle->head at the beginning of the session.
> + * @nr_pages		- Number of pages in the ring buffer.
> + * @pages		- Pages in the ring buffer.
> + * @flags		- Capabilities of the hardware buffer used in the
> + *			  session. If flags == 0, we use software double
> + *			  buffering.
> + */
> +struct etr_perf_buffer {
> +	struct etr_buf		*etr_buf;
> +	bool			snapshot;
> +	unsigned long		head;
> +	int			nr_pages;
> +	void			**pages;
> +	u32			flags;
> +};

Please move this to the top, just below the declaration for etr_sg_table.

> +
> +
> +/*
> + * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf. We try to
> + * use perf ring buffer pages for the ETR when we can. In the worst case
> + * we fallback to software double buffering. The size of the hardware buffer
> + * in this case is dependent on the size configured via sysfs, if we can't
> + * match the perf ring buffer size. We scale down the size by half until
> + * it reaches a limit of 1M, beyond which we give up.
> + */
> +static struct etr_perf_buffer *
> +tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
> +		       void **pages, bool snapshot)
> +{
> +	int i;
> +	struct etr_buf *etr_buf;
> +	struct etr_perf_buffer *etr_perf;
> +	unsigned long size;
> +	unsigned long buf_flags[] = {
> +					ETR_BUF_F_RESTORE_FULL,
> +					ETR_BUF_F_RESTORE_MINIMAL,
> +					0,
> +				    };
> +
> +	etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node);
> +	if (!etr_perf)
> +		return ERR_PTR(-ENOMEM);
> +
> +	size = nr_pages << PAGE_SHIFT;
> +	/*
> +	 * We can use the perf ring buffer for ETR only if it is coherent
> +	 * and in snapshot mode as we cannot control how much data will be
> +	 * written before we stop.
> +	 */
> +	if (tmc_etr_has_cap(drvdata, TMC_ETR_COHERENT) && snapshot) {
> +		for (i = 0; buf_flags[i]; i++) {
> +			etr_buf = tmc_alloc_etr_buf(drvdata, size,
> +						 buf_flags[i], node, pages);
> +			if (!IS_ERR(etr_buf)) {
> +				etr_perf->flags = buf_flags[i];
> +				goto done;
> +			}
> +		}
> +	}
> +
> +	/*
> +	 * We have to now fallback to software double buffering.
> +	 * The tricky decision is choosing a size for the hardware buffer.
> +	 * We could start with drvdata->size (configurable via sysfs) and
> +	 * scale it down until we can allocate the data.
> +	 */
> +	etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
> +	if (!IS_ERR(etr_buf))
> +		goto done;
> +	size = drvdata->size;
> +	do {
> +		etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
> +		if (!IS_ERR(etr_buf))
> +			goto done;
> +		size /= 2;
> +	} while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
> +
> +	kfree(etr_perf);
> +	return ERR_PTR(-ENOMEM);
> +
> +done:
> +	etr_perf->etr_buf = etr_buf;
> +	return etr_perf;
> +}
> +
> +
> +static void *tmc_etr_alloc_perf_buffer(struct coresight_device *csdev,
> +					int cpu, void **pages, int nr_pages,
> +					bool snapshot)
> +{
> +	struct etr_perf_buffer *etr_perf;
> +	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> +
> +	if (cpu == -1)
> +		cpu = smp_processor_id();
> +
> +	etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu),
> +					     nr_pages, pages, snapshot);
> +	if (IS_ERR(etr_perf)) {
> +		dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n");
> +		return NULL;
> +	}
> +
> +	etr_perf->snapshot = snapshot;
> +	etr_perf->nr_pages = nr_pages;
> +	etr_perf->pages = pages;
> +
> +	return etr_perf;
> +}
> +
> +static void tmc_etr_free_perf_buffer(void *config)
> +{
> +	struct etr_perf_buffer *etr_perf = config;
> +
> +	if (etr_perf->etr_buf)
> +		tmc_free_etr_buf(etr_perf->etr_buf);
> +	kfree(etr_perf);
> +}
> +
> +/*
> + * Pad the etr buffer with barrier packets to align the head to 4K aligned
> + * offset. This is required for ETR SG backed buffers, so that we can rotate
> + * the buffer easily and avoid a software double buffering.
> + */
> +static s64 tmc_etr_pad_perf_buffer(struct etr_perf_buffer *etr_perf, s64 head)
> +{
> +	s64 new_head;
> +	struct etr_buf *etr_buf = etr_perf->etr_buf;
> +
> +	head %= etr_buf->size;
> +	new_head = ALIGN(head, SZ_4K);
> +	if (head == new_head)
> +		return head;
> +	/*
> +	 * If the padding is not aligned to barrier packet size
> +	 * we can't do much.
> +	 */
> +	if ((new_head - head) % CORESIGHT_BARRIER_PKT_SIZE)
> +		return -EINVAL;
> +	return tmc_etr_buf_insert_barrier_packets(etr_buf, head,
> +						  new_head - head);
> +}
> +
> +static int tmc_etr_set_perf_buffer(struct coresight_device *csdev,
> +				   struct perf_output_handle *handle,
> +				   void *config)
> +{
> +	int rc;
> +	unsigned long flags;
> +	s64 head, new_head;
> +	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> +	struct etr_perf_buffer *etr_perf = config;
> +	struct etr_buf *etr_buf = etr_perf->etr_buf;
> +
> +	etr_perf->head = handle->head;
> +	head = etr_perf->head % etr_buf->size;
> +	switch (etr_perf->flags) {
> +	case ETR_BUF_F_RESTORE_MINIMAL:
> +		new_head = tmc_etr_pad_perf_buffer(etr_perf, head);
> +		if (new_head < 0)
> +			return new_head;
> +		if (head != new_head) {
> +			rc = perf_aux_output_skip(handle, new_head - head);
> +			if (rc)
> +				return rc;
> +			etr_perf->head = handle->head;
> +			head = new_head;
> +		}
> +		/* Fall through */
> +	case ETR_BUF_F_RESTORE_FULL:
> +		rc = tmc_restore_etr_buf(drvdata, etr_buf, head, head, 0);
> +		break;
> +	case 0:
> +		/* Nothing to do here. */
> +		rc = 0;
> +		break;
> +	default:
> +		dev_warn(drvdata->dev, "Unexpected flags in etr_perf buffer\n");
> +		WARN_ON(1);
> +		rc = -EINVAL;
> +	}
> +
> +	/*
> +	 * This sink is going to be used in perf mode. No other session can
> +	 * grab it from us. So set the perf mode specific data here. This will
> +	 * be released just before we disable the sink from update_buffer call
> +	 * back.
> +	 */
> +	if (!rc) {
> +		spin_lock_irqsave(&drvdata->spinlock, flags);
> +		if (WARN_ON(drvdata->perf_data))
> +			rc = -EBUSY;
> +		else
> +			drvdata->perf_data = etr_perf;
> +		spin_unlock_irqrestore(&drvdata->spinlock, flags);
> +	}
> +	return rc;
> +}
> +
> +/*
> + * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
> + * buffer to the perf ring buffer.
> + */
> +static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf)
> +{
> +	struct etr_buf *etr_buf = etr_perf->etr_buf;
> +	unsigned long bytes, to_copy, head = etr_perf->head;
> +	unsigned long pg_idx, pg_offset, src_offset;
> +	char **dst_pages, *src_buf;
> +
> +	head = etr_perf->head % (etr_perf->nr_pages << PAGE_SHIFT);
> +	pg_idx = head >> PAGE_SHIFT;
> +	pg_offset = head & (PAGE_SIZE - 1);
> +	dst_pages = (char **)etr_perf->pages;
> +	src_offset = etr_buf->offset;
> +	to_copy = etr_buf->len;
> +
> +	while (to_copy > 0) {
> +		/*
> +		 * We can copy minimum of :
> +		 *  1) what is available in the source buffer,
> +		 *  2) what is available in the source buffer, before it
> +		 *     wraps around.
> +		 *  3) what is available in the destination page.
> +		 * in one iteration.
> +		 */
> +		bytes = tmc_etr_buf_get_data(etr_buf, src_offset, to_copy,
> +					     &src_buf);
> +		if (WARN_ON_ONCE(bytes <= 0))
> +			break;
> +		bytes = min(PAGE_SIZE - pg_offset, bytes);
> +
> +		memcpy(dst_pages[pg_idx] + pg_offset, src_buf, bytes);
> +		to_copy -= bytes;
> +		/* Move destination pointers */
> +		pg_offset += bytes;
> +		if (pg_offset == PAGE_SIZE) {
> +			pg_offset = 0;
> +			if (++pg_idx == etr_perf->nr_pages)
> +				pg_idx = 0;
> +		}
> +
> +		/* Move source pointers */
> +		src_offset += bytes;
> +		if (src_offset >= etr_buf->size)
> +			src_offset -= etr_buf->size;
> +	}
> +}
> +
> +/*
> + * XXX: What is the expected behavior here in the following cases ?
> + *  1) Full trace mode, without double buffering : What should be the size
> + *     reported back when the buffer is full and has wrapped around. Ideally,
> + *     we should report for the lost trace to make sure the "head" in the ring
> + *     buffer comes back to the position as in the trace buffer, rather than
> + *     returning "total size" of the buffer.
> + * 2) In snapshot mode, should we always return "full buffer size" ?
> + */
> +static unsigned long
> +tmc_etr_update_perf_buffer(struct coresight_device *csdev,
> +			   struct perf_output_handle *handle,
> +			   void *config)
> +{
> +	bool double_buffer, lost = false;
> +	unsigned long flags, offset, size = 0;
> +	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> +	struct etr_perf_buffer *etr_perf = config;
> +	struct etr_buf *etr_buf = etr_perf->etr_buf;
> +
> +	double_buffer = (etr_perf->flags == 0);
> +
> +	spin_lock_irqsave(&drvdata->spinlock, flags);
> +	if (WARN_ON(drvdata->perf_data != etr_perf)) {
> +		lost = true;

If we are here something went seriously wrong - I don't think much more can be
done other than a WARN_ON()...

> +		spin_unlock_irqrestore(&drvdata->spinlock, flags);
> +		goto out;
> +	}
> +
> +	CS_UNLOCK(drvdata->base);
> +
> +	tmc_flush_and_stop(drvdata);
> +
> +	tmc_sync_etr_buf(drvdata);
> +	CS_LOCK(drvdata->base);
> +	/* Reset perf specific data */
> +	drvdata->perf_data = NULL;
> +	spin_unlock_irqrestore(&drvdata->spinlock, flags);
> +
> +	offset = etr_buf->offset + etr_buf->len;
> +	if (offset > etr_buf->size)
> +		offset -= etr_buf->size;
> +
> +	if (double_buffer) {
> +		/*
> +		 * If we use software double buffering, update the ring buffer.
> +		 * And the size is what we have in the hardware buffer.
> +		 */
> +		size = etr_buf->len;
> +		tmc_etr_sync_perf_buffer(etr_perf);
> +	} else {
> +		/*
> +		 * If the hardware uses perf ring buffer the size of the data
> +		 * we have is from the old-head to the current head of the
> +		 * buffer. This also means in non-snapshot mode, we have lost
> +		 * one-full-buffer-size worth data, if the buffer wraps around.
> +		 */
> +		unsigned long old_head;
> +
> +		old_head = (etr_perf->head % etr_buf->size);
> +		size = (offset - old_head + etr_buf->size) % etr_buf->size;
> +	}
> +
> +	/*
> +	 * Update handle->head in snapshot mode. Also update the size to the
> +	 * hardware buffer size if there was an overflow.
> +	 */
> +	if (etr_perf->snapshot) {
> +		if (double_buffer)
> +			handle->head += size;
> +		else
> +			handle->head = offset;
> +		if (etr_buf->full)
> +			size = etr_buf->size;
> +	}
> +
> +	lost |= etr_buf->full;
> +out:
> +	if (lost)
> +		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> +	return size;
> +}
> +
>  static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
>  {
> -	/* We don't support perf mode yet ! */
> -	return -EINVAL;
> +	int rc = 0;
> +	unsigned long flags;
> +	struct etr_perf_buffer *etr_perf;
> +	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> +
> +	spin_lock_irqsave(&drvdata->spinlock, flags);
> +	/*
> +	 * There can be only one writer per sink in perf mode. If the sink
> +	 * is already open in SYSFS mode, we can't use it.
> +	 */
> +	if (drvdata->mode != CS_MODE_DISABLED) {
> +		rc = -EBUSY;
> +		goto unlock_out;
> +	}
> +
> +	etr_perf = drvdata->perf_data;
> +	if (!etr_perf || !etr_perf->etr_buf) {
> +		rc = -EINVAL;

This is a serious malfunction - I would WARN_ON() before unlocking.

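Something along these lines is what I have in mind - just a sketch, reusing
the names from the hunk above:

	etr_perf = drvdata->perf_data;
	if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) {
		rc = -EINVAL;
		goto unlock_out;
	}
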
> +		goto unlock_out;
> +	}
> +
> +	drvdata->mode = CS_MODE_PERF;
> +	tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
> +
> +unlock_out:
> +	spin_unlock_irqrestore(&drvdata->spinlock, flags);
> +	return rc;
>  }
>  
>  static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode)
> @@ -1372,6 +1736,10 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
>  static const struct coresight_ops_sink tmc_etr_sink_ops = {
>  	.enable		= tmc_enable_etr_sink,
>  	.disable	= tmc_disable_etr_sink,
> +	.alloc_buffer	= tmc_etr_alloc_perf_buffer,
> +	.update_buffer	= tmc_etr_update_perf_buffer,
> +	.set_buffer	= tmc_etr_set_perf_buffer,
> +	.free_buffer	= tmc_etr_free_perf_buffer,
>  };
>  
>  const struct coresight_ops tmc_etr_cs_ops = {
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index 2c5b905b6494..06386ceb7866 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -198,6 +198,7 @@ struct etr_buf {
>   * @trigger_cntr: amount of words to store after a trigger.
>   * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
>   *		device configuration register (DEVID)
> + * @perf_data:	PERF buffer for ETR.
>   * @sysfs_data:	SYSFS buffer for ETR.
>   */
>  struct tmc_drvdata {
> @@ -219,6 +220,7 @@ struct tmc_drvdata {
>  	u32			trigger_cntr;
>  	u32			etr_caps;
>  	struct etr_buf		*sysfs_buf;
> +	void			*perf_data;

This is a temporary placeholder while an event is active, i.e. theoretically it
doesn't stay the same for the entire trace session.  In situations where there
could be one ETR per CPU, the same ETR could be used to serve more than one
trace session (since only one session can be active at a time on a CPU).  As
such I would call it curr_perf_data or something similar.  I'd also make that
clear in the above documentation.

Have you tried your implementation on a Dragonboard or a HiKey?

Thanks,
Mathieu

>  };
>  
>  struct etr_buf_operations {
> -- 
> 2.13.6
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-11-06 19:07   ` Mathieu Poirier
@ 2017-11-07 10:36     ` Suzuki K Poulose
  2017-11-09 16:19       ` Mathieu Poirier
  0 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-07 10:36 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, mike.leach, coresight

On 06/11/17 19:07, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:

...

>> +/*
>> + * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
>> + * to the index of the page table "entry". Data pointers always have
>> + * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
>> + * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE -1) entries.
>> + */
>> +static inline u32
>> +tmc_etr_sg_offset_to_table_index(u64 offset)
>> +{
>> +	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
>> +
>> +	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
>> +}
> 
> This function is the source of a bizarre linking error when compiling [14/17] on
> armv7 as pasted here:
> 
>    UPD     include/generated/compile.h
>    CC      init/version.o
>    AR      init/built-in.o
>    AR      built-in.o
>    LD      vmlinux.o
>    MODPOST vmlinux.o
> drivers/hwtracing/coresight/coresight-tmc-etr.o: In function
> `tmc_etr_sg_offset_to_table_index':
> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553:
> undefined reference to `__aeabi_uldivmod'
> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:551:
> undefined reference to `__aeabi_uldivmod'
> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553:
> undefined reference to `__aeabi_uldivmod'
> drivers/hwtracing/coresight/coresight-tmc-etr.o: In function
> `tmc_etr_sg_table_rotate':
> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:609:
> undefined reference to `__aeabi_uldivmod'
> 
> Please see if you can reproduce on your side.

Uh! I had gcc-7, which didn't complain about it. But if I switch to 4.9, it does.
It looks like the 64-bit division on arm32 is triggering it. We don't need this
to be u64 above, as it is only the page index, so we can simply switch to
"unsigned long" rather than explicitly using the div64 helpers.

The following change fixes the issue for me. Could you please check if it
solves the problem for you?


@@ -551,7 +553,7 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
  static inline u32
  tmc_etr_sg_offset_to_table_index(u64 offset)
  {
-       u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
+       unsigned long sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
  
         return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
  }

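If the unsigned long change turns out not to be enough, the other option
would be to make the 64-bit division explicit with the div_u64() helper
from linux/math64.h - an untested sketch, using the same names as in the
patch:

 static inline u32
 tmc_etr_sg_offset_to_table_index(u64 offset)
 {
	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;

	return sgpage_idx + div_u64(sgpage_idx, ETR_SG_PTRS_PER_PAGE - 1);
 }
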


Thanks for testing!

Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 13/17] coresight etr: Do not clean ETR trace buffer
  2017-11-03 20:17       ` Mathieu Poirier
@ 2017-11-07 10:37         ` Suzuki K Poulose
  0 siblings, 0 replies; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-07 10:37 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, rob.walker, Mike Leach, coresight

On 03/11/17 20:17, Mathieu Poirier wrote:
> On 3 November 2017 at 04:10, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
>> On 02/11/17 20:36, Mathieu Poirier wrote:
>>>
>>> On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
>>>>
>>>> We zero out the entire trace buffer used for ETR before it
>>>> is enabled, to help with debugging. Since we could be
>>>> restoring a session in perf mode, this could destroy the data.
>>>
>>>
>>> I'm not sure I follow you with "... restoring a session in perf mode
>>> ...".
>>> When operating from the perf interface, all the memory allocated for a
>>> session is cleaned up afterwards; there is no re-using of memory as in
>>> sysFS.
>>
>>
>> We could directly use the perf ring buffer for the ETR. In that case, the
>> perf
>> ring buffer could contain trace data collected from the previous "schedule"
>> which the userspace hasn't collected yet. So, doing a memset here would
>> destroy that data.
> 
> I originally thought your comment was about re-using the memory from a
> previous trace session, hence the confusion.  Please rework your
> changelog to include this clarification as I am sure other people can
> be misled.

Sure, will do.

Thanks
Suzuki

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf
  2017-11-07  0:24   ` Mathieu Poirier
@ 2017-11-07 10:52     ` Suzuki K Poulose
  2017-11-07 15:17       ` Mike Leach
  0 siblings, 1 reply; 56+ messages in thread
From: Suzuki K Poulose @ 2017-11-07 10:52 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: linux-arm-kernel, linux-kernel, robert.walker, mike.leach, coresight

On 07/11/17 00:24, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:53PM +0100, Suzuki K Poulose wrote:
>> Add necessary support for using ETR as a sink in ETM perf tracing.
>> We try make the best use of the available modes of buffers to
>> try and avoid software double buffering.
>>
>> We can use the perf ring buffer for ETR directly if all of the
>> conditions below are met :
>>   1) ETR is DMA coherent
>>   2) perf is used in snapshot mode. In full tracing mode, we cannot
>>      guarantee that the ETR will stop before it overwrites the data
>>      which may not have been consumed by the user.
>>   3) ETR supports save-restore with a scatter-gather mechanism
>>      which can use a given set of pages we use the perf ring buffer
>>      directly. If we have an in-built TMC ETR Scatter Gather unit,
>>      we make use of a circular SG list to restart from a given head.
>>      However, we need to align the starting offset to 4K in this case.
>>
>> If the ETR doesn't support either of this, we fallback to software
>> double buffering.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

  
>> +/*
>> + * etr_perf_buffer - Perf buffer used for ETR
>> + * @etr_buf		- Actual buffer used by the ETR
>> + * @snapshot		- Perf session mode
>> + * @head		- handle->head at the beginning of the session.
>> + * @nr_pages		- Number of pages in the ring buffer.
>> + * @pages		- Pages in the ring buffer.
>> + * @flags		- Capabilities of the hardware buffer used in the
>> + *			  session. If flags == 0, we use software double
>> + *			  buffering.
>> + */
>> +struct etr_perf_buffer {
>> +	struct etr_buf		*etr_buf;
>> +	bool			snapshot;
>> +	unsigned long		head;
>> +	int			nr_pages;
>> +	void			**pages;
>> +	u32			flags;
>> +};
> 
> Please move this to the top, just below the declaration for etr_sg_table.

Sure.


>> +
>> +/*
>> + * XXX: What is the expected behavior here in the following cases ?
>> + *  1) Full trace mode, without double buffering : What should be the size
>> + *     reported back when the buffer is full and has wrapped around. Ideally,
>> + *     we should report for the lost trace to make sure the "head" in the ring
>> + *     buffer comes back to the position as in the trace buffer, rather than
>> + *     returning "total size" of the buffer.
>> + * 2) In snapshot mode, should we always return "full buffer size" ?
>> + */
>> +static unsigned long
>> +tmc_etr_update_perf_buffer(struct coresight_device *csdev,
>> +			   struct perf_output_handle *handle,
>> +			   void *config)
>> +{
>> +	bool double_buffer, lost = false;
>> +	unsigned long flags, offset, size = 0;
>> +	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>> +	struct etr_perf_buffer *etr_perf = config;
>> +	struct etr_buf *etr_buf = etr_perf->etr_buf;
>> +
>> +	double_buffer = (etr_perf->flags == 0);
>> +
>> +	spin_lock_irqsave(&drvdata->spinlock, flags);
>> +	if (WARN_ON(drvdata->perf_data != etr_perf)) {
>> +		lost = true;
> 
> If we are here something went seriously wrong - I don't think much more can be
> done other than a WARN_ON()...
> 

right. I will do it for the case below as well.

>>   static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
>>   {
...
>> +
>> +	etr_perf = drvdata->perf_data;
>> +	if (!etr_perf || !etr_perf->etr_buf) {
>> +		rc = -EINVAL;
> 
> This is a serious malfunction - I would WARN_ON() before unlocking.
> 


>> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
>> index 2c5b905b6494..06386ceb7866 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc.h
>> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
>> @@ -198,6 +198,7 @@ struct etr_buf {
>>    * @trigger_cntr: amount of words to store after a trigger.
>>    * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
>>    *		device configuration register (DEVID)
>> + * @perf_data:	PERF buffer for ETR.
>>    * @sysfs_data:	SYSFS buffer for ETR.
>>    */
>>   struct tmc_drvdata {
>> @@ -219,6 +220,7 @@ struct tmc_drvdata {
>>   	u32			trigger_cntr;
>>   	u32			etr_caps;
>>   	struct etr_buf		*sysfs_buf;
>> +	void			*perf_data;
> 
> This is a temporary place holder while an event is active, i.e theoretically it
> doesn't stay the same for the entire trace session.  In situations where there
> could be one ETR per CPU, the same ETR could be used to serve more than one
> trace session (since only one session can be active at a time on a CPU).  As
> such I would call it curr_perf_data or something similar.  I'd also make that
> clear in the above documentation.

You're right. However, from the ETR's perspective, it doesn't care how perf
uses it. From the ETR driver's side, it is simply the buffer to be used when
the sink is enabled in perf mode. I could definitely add a comment to
describe this (a rough wording is below), but I am not sure we have to
rename the variable.
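For the @perf_data kernel-doc, something along these lines (wording is only
a suggestion):

 * @perf_data:	Trace buffer handed to us by etm-perf; set from
 *		set_buffer() and released from update_buffer(), i.e only
 *		valid while the ETR is enabled as a perf sink.
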

> 
> Have you tried your implementation on a dragonboard or a Hikey?

No, I haven't. But Mike and Rob are trying on the Dragonboard and HiKey
respectively. We are hitting some issues in the Scatter Gather mode, which we
are still debugging. The SG table looks correct, but the ETR hangs up. It
works fine in the flat memory mode, so it is something to do with the READ
(sg table pointers) vs WRITE (trace data) pressure on the ETR.

One change I am working on with the perf buffer is to limit the "size" of the
trace buffer used by the ETR (when it writes into the perf ring buffer) to
handle->size. Otherwise we could be corrupting collected trace that is still
waiting for consumption by the user. This is easy to do with our SG table, but
with the flat buffer we have to limit the size to the minimum of (handle->size,
space-in-circular-buffer-before-wrapping) - see the rough sketch below.

In either case, we could lose data if we overflow the buffer, something we
can't help at the moment.
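For the flat buffer case, roughly something like the below (a hypothetical
helper, not part of the posted series):

	/*
	 * Sketch only: cap how much of the ETR buffer may be filled when
	 * the ETR writes straight into the perf ring buffer.
	 */
	static unsigned long
	tmc_etr_perf_limit(struct perf_output_handle *handle,
			   struct etr_buf *etr_buf, unsigned long head)
	{
		/* Don't run past the wrap-around point of the flat buffer */
		unsigned long to_wrap = etr_buf->size - (head % etr_buf->size);

		return min_t(unsigned long, handle->size, to_wrap);
	}
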


Suzuki


> 
> Thanks,
> Mathieu
> 
>>   };
>>   
>>   struct etr_buf_operations {
>> -- 
>> 2.13.6
>>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf
  2017-11-07 10:52     ` Suzuki K Poulose
@ 2017-11-07 15:17       ` Mike Leach
  2017-11-07 15:46         ` Mathieu Poirier
  0 siblings, 1 reply; 56+ messages in thread
From: Mike Leach @ 2017-11-07 15:17 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Mathieu Poirier, linux-arm-kernel, linux-kernel, Robert Walker,
	coresight

Hi Suzuki, Mathieu,

A follow up on Dragonboard issues...

=====
 Using Suzuki's debug code and some home-spun updates of my own, I've
got the following logging out of a typical ETR-SG session on the
DB410.
Session initiated using the command line
'./perf record -e cs_etm/@826000.etr/ --per-thread sort'

root@linaro-developer:~# [  122.075896] tmc_etr_sg_table_dump:455:
Table base; Vaddr:ffffff800978d000; DAddr:0xb10b1000; Table Pages 1;
Table Entries 1024
[  122.075932] tmc_etr_sg_table_dump:462: 00000: ffffff800978d000:[N] 0xb14b0000
[  122.086281] tmc_etr_sg_table_dump:462: 00001: ffffff800978d004:[N] 0xb14b1000
[  122.093410] tmc_etr_sg_table_dump:462: 00002: ffffff800978d008:[N] 0xb14b2000
----- snip -----
[  129.438535] tmc_etr_sg_table_dump:462: 01021: ffffff800978dff4:[N] 0xb10ad000
[  129.445741] tmc_etr_sg_table_dump:475: 01022: ###
ffffff800978dff8:[L] 0xb10ae000 ###
[  129.452945] tmc_etr_sg_table_dump:479: 01023: empty line
[  129.460840] tmc_etr_sg_table_dump:485: ******* End of Table *****
[  129.466333] tmc_etr_alloc_sg_buf:822: coresight-tmc 826000.etr: ETR
- alloc SG buffer

== SG table looks fine - I've removed the last circular link used for
rotating the table as that is not happening anyway and wanted to
eliminate it as an issue

== first pass trace capture - long running program

[  129.481359] tmc_etr_enable_hw:1239:  Set DBA 0xb10b1000; AXICTL 0x000007bd
[  129.484297] tmc_etr_enable_hw:1260: exit()
[  129.491251] tmc_enable_etf_link:306: coresight-tmc 825000.etf:
TMC-ETF enabled
[  129.794350] tmc_sync_etr_buf:1124: enter()
[  129.794377] tmc_sync_etr_buf:1131: ETR regs: RRP=0xb14b0000,
RWP=0xB14B0000, STS=0x0000003C:full=false

== this shows the data page values for the first SG page from the
table have been loaded into the RRP / RWP registers - an indication
that the SG table has been read. However, status indicates that the
buffer is empty and that the AXI bus has returned an error (bit 5).
(Messing with permissions made no difference.)
== Error ignored by the driver (but I think the system is
irretrievably broken now anyway).
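For reference, my decoding of STS=0x3C against the usual TMC STS bit
assignments - worth double checking against the TMC TRM:

	0x3C = TMCReady(bit 2) | FtEmpty(bit 3) | Empty(bit 4) | MemErr(bit 5)

i.e. the unit reports ready and empty, with a memory (AXI) error latched.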

[  129.794383] tmc_etr_sync_sg_buf:849: enter()
[  129.806616] tmc_etr_sync_sg_buf:876: WARNING: Buffer Data Len == 0;
force sync some pages
[  129.811051] tmc_etr_sync_sg_buf:881: exit()
[  129.819116] tmc_etr_sg_dump_pages:505: PG(0) : 0xcdcdcdcd::0xcdcdcdcd
[  129.823112] tmc_etr_sg_dump_pages:505: PG(1) : 0xcdcdcdcd::0xcdcdcdcd
[  129.829709] tmc_etr_sg_dump_pages:505: PG(2) : 0xcdcdcdcd::0xcdcdcdcd
[  129.836133] tmc_etr_sg_dump_pages:505: PG(3) : 0xcdcdcdcd::0xcdcdcdcd

== 1st 4 pages were pre-filled - seem untouched

[  129.842556] tmc_sync_etr_buf:1143: exit()
[  129.848977] tmc_etr_sync_perf_buffer:1635: sync_perf 16384 bytes
[  129.853016] tmc_etf_print_regs_debug:37: TMC-ETF regs; RRP:0xF20
RWP:0xF20; Status:0x10

== ETF - operating as a FIFO link - has received data and has been
emptied, so the trace system has been running.

[  129.859058] tmc_disable_etf_link:327: coresight-tmc 825000.etf:
TMC-ETF disabled
[  129.866778] tmc_etr_disable_hw:1322: enter()
[  129.874410] tmc_etr_disable_hw:1336: exit()
[  129.878666] tmc_disable_etr_sink:1815: coresight-tmc 826000.etr:
TMC-ETR disabled

== At this point we have the AXI bus errored out, and apparently no
trace sent to the ETR memory.

== Second pass - perf tries to restart the trace.

[  129.882636] tmc_etr_enable_hw:1197: enter()
[  129.890230] tmc_etr_enable_hw:1239:  Set DBA 0xb10b1000; AXICTL 0x000007bd
[  129.894205] tmc_etr_enable_hw:1260: exit()
[  129.901157] tmc_enable_etf_link:306: coresight-tmc 825000.etf:
TMC-ETF enabled
[  129.922498] coresight-tmc 826000.etr: timeout while waiting for
completion of Manual Flush
[  129.922672] coresight-tmc 826000.etr: timeout while waiting for TMC
to be Ready
[  129.929645] tmc_sync_etr_buf:1124: enter()
[  129.936850] tmc_sync_etr_buf:1131: ETR regs: RRP=0xb10b1000,
RWP=0xB10B1000, STS=0x00000010:full=false

== this is bad - somehow the ETR regs have been set to the table base
address, not the data page base address. No apparent AXI bus fault at
this point, but it is likely that the restart cleared the bit and the
AXI is no longer responding.

[  129.936856] tmc_etr_sync_sg_buf:849: enter()
[  129.950311] coresight-tmc 826000.etr: Unable to map RRP b10b1000 to offset

== driver error in response to invalid RRP value

[  129.954733] tmc_etr_sg_dump_pages:505: PG(0) : 0xcdcdcdcd::0xcdcdcdcd
[  129.961417] tmc_etr_sg_dump_pages:505: PG(1) : 0xcdcdcdcd::0xcdcdcdcd
[  129.967928] tmc_etr_sg_dump_pages:505: PG(2) : 0xcdcdcdcd::0xcdcdcdcd
[  129.974350] tmc_etr_sg_dump_pages:505: PG(3) : 0xcdcdcdcd::0xcdcdcdcd
[  129.980772] tmc_sync_etr_buf:1143: exit()
[  129.987194] tmc_etr_sync_perf_buffer:1635: sync_perf 0 bytes
[  129.991201] tmc_etf_print_regs_debug:37: TMC-ETF regs; RRP:0x1C0
RWP:0x1C0; Status:0x1

== ETF is full - still trying to collect trace data.

[  129.997066] coresight-tmc 825000.etf: timeout while waiting for
completion of Manual Flush
[  130.004789] coresight-tmc 825000.etf: timeout while waiting for TMC
to be Ready
[  130.012896] tmc_disable_etf_link:327: coresight-tmc 825000.etf:
TMC-ETF disabled
[  130.020099] tmc_etr_disable_hw:1322: enter()
[  130.027879] coresight-tmc 826000.etr: timeout while waiting for
completion of Manual Flush
[  130.032135] coresight-tmc 826000.etr: timeout while waiting for TMC
to be Ready

== flushing not working at any point in the system here - probably due
to incorrect ETR operation - can't flush if downstream not accepting
data.

[  130.040062] tmc_etr_disable_hw:1336: exit()
[  130.047266] tmc_disable_etr_sink:1815: coresight-tmc 826000.etr:
TMC-ETR disabled

== Beyond this point, things pretty much repeat, but other systems
start failing too - missing interrupts etc.
== Symptoms would seem to indicate a locked out AXI bus - but that is
pure speculation.
== Eventually, the system automatically reboots itself - some watchdog
element I guess.

=====

Conclusion:-

At this point ETR-SG is non-operational for unknown reasons - likely a
memory system issue. Whether this is a software configuration problem or a
hardware fault is not yet known.
However, this does raise the question of upstreaming this patchset. As it
stands it will break existing ETR functionality on DB410c (and possibly
HiKey 960).
Currently the patchset decides between flat mapped and SG based on buffer
size. I would like to see a parameter added, something like an SG threshold
size, at or above which the implementation will choose SG, and below which
it will choose flat mapped. There also needs to be a special value - 0/-1 -
where SG is always disabled for the device (a rough sketch follows below).
If the parameter is available in device tree and sysfs then it will give
the control needed should the ETR-SG issue on the currently non-operational
platforms turn out to be insurmountable. At the very least it will allow the
current patchset to be merged in a way that preserves what is currently
working until a solution is found.
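To make the idea concrete, the buffer-mode selection could end up looking
something like the sketch below - the sg_threshold field, the flat-buffer
helper and the argument lists are invented for illustration only:

	/*
	 * Illustrative only: sg_threshold would come from DT and/or sysfs.
	 * A value of 0 means SG is never used on this device.
	 */
	if (!drvdata->sg_threshold || size < drvdata->sg_threshold)
		etr_buf = tmc_etr_alloc_flat_buf(drvdata, size, node, pages);
	else
		etr_buf = tmc_etr_alloc_sg_buf(drvdata, size, node, pages);
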

Regards

Mike


On 7 November 2017 at 10:52, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 07/11/17 00:24, Mathieu Poirier wrote:
>>
>> On Thu, Oct 19, 2017 at 06:15:53PM +0100, Suzuki K Poulose wrote:
>>>
>>> Add necessary support for using ETR as a sink in ETM perf tracing.
>>> We try make the best use of the available modes of buffers to
>>> try and avoid software double buffering.
>>>
>>> We can use the perf ring buffer for ETR directly if all of the
>>> conditions below are met :
>>>   1) ETR is DMA coherent
>>>   2) perf is used in snapshot mode. In full tracing mode, we cannot
>>>      guarantee that the ETR will stop before it overwrites the data
>>>      which may not have been consumed by the user.
>>>   3) ETR supports save-restore with a scatter-gather mechanism
>>>      which can use a given set of pages we use the perf ring buffer
>>>      directly. If we have an in-built TMC ETR Scatter Gather unit,
>>>      we make use of a circular SG list to restart from a given head.
>>>      However, we need to align the starting offset to 4K in this case.
>>>
>>> If the ETR doesn't support either of this, we fallback to software
>>> double buffering.
>>>
>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>
>
>
>>>
>>> +/*
>>> + * etr_perf_buffer - Perf buffer used for ETR
>>> + * @etr_buf            - Actual buffer used by the ETR
>>> + * @snapshot           - Perf session mode
>>> + * @head               - handle->head at the beginning of the session.
>>> + * @nr_pages           - Number of pages in the ring buffer.
>>> + * @pages              - Pages in the ring buffer.
>>> + * @flags              - Capabilities of the hardware buffer used in the
>>> + *                       session. If flags == 0, we use software double
>>> + *                       buffering.
>>> + */
>>> +struct etr_perf_buffer {
>>> +       struct etr_buf          *etr_buf;
>>> +       bool                    snapshot;
>>> +       unsigned long           head;
>>> +       int                     nr_pages;
>>> +       void                    **pages;
>>> +       u32                     flags;
>>> +};
>>
>>
>> Please move this to the top, just below the declaration for etr_sg_table.
>
>
> Sure.
>
>
>
>>> +
>>> +/*
>>> + * XXX: What is the expected behavior here in the following cases ?
>>> + *  1) Full trace mode, without double buffering : What should be the
>>> size
>>> + *     reported back when the buffer is full and has wrapped around.
>>> Ideally,
>>> + *     we should report for the lost trace to make sure the "head" in
>>> the ring
>>> + *     buffer comes back to the position as in the trace buffer, rather
>>> than
>>> + *     returning "total size" of the buffer.
>>> + * 2) In snapshot mode, should we always return "full buffer size" ?
>>> + */
>>> +static unsigned long
>>> +tmc_etr_update_perf_buffer(struct coresight_device *csdev,
>>> +                          struct perf_output_handle *handle,
>>> +                          void *config)
>>> +{
>>> +       bool double_buffer, lost = false;
>>> +       unsigned long flags, offset, size = 0;
>>> +       struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>> +       struct etr_perf_buffer *etr_perf = config;
>>> +       struct etr_buf *etr_buf = etr_perf->etr_buf;
>>> +
>>> +       double_buffer = (etr_perf->flags == 0);
>>> +
>>> +       spin_lock_irqsave(&drvdata->spinlock, flags);
>>> +       if (WARN_ON(drvdata->perf_data != etr_perf)) {
>>> +               lost = true;
>>
>>
>> If we are here something went seriously wrong - I don't think much more
>> can be
>> done other than a WARN_ON()...
>>
>
> right. I will do it for the case below as well.
>
>>>   static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
>>>   {
>
> ...
>>>
>>> +
>>> +       etr_perf = drvdata->perf_data;
>>> +       if (!etr_perf || !etr_perf->etr_buf) {
>>> +               rc = -EINVAL;
>>
>>
>> This is a serious malfunction - I would WARN_ON() before unlocking.
>>
>
>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h
>>> b/drivers/hwtracing/coresight/coresight-tmc.h
>>> index 2c5b905b6494..06386ceb7866 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc.h
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
>>> @@ -198,6 +198,7 @@ struct etr_buf {
>>>    * @trigger_cntr: amount of words to store after a trigger.
>>>    * @etr_caps: Bitmask of capabilities of the TMC ETR, inferred from the
>>>    *            device configuration register (DEVID)
>>> + * @perf_data: PERF buffer for ETR.
>>>    * @sysfs_data:       SYSFS buffer for ETR.
>>>    */
>>>   struct tmc_drvdata {
>>> @@ -219,6 +220,7 @@ struct tmc_drvdata {
>>>         u32                     trigger_cntr;
>>>         u32                     etr_caps;
>>>         struct etr_buf          *sysfs_buf;
>>> +       void                    *perf_data;
>>
>>
>> This is a temporary place holder while an event is active, i.e
>> theoretically it
>> doesn't stay the same for the entire trace session.  In situations where
>> there
>> could be one ETR per CPU, the same ETR could be used to serve more than
>> one
>> trace session (since only one session can be active at a time on a CPU).
>> As
>> such I would call it curr_perf_data or something similar.  I'd also make
>> that
>> clear in the above documentation.
>
>
> You're right. However, from the ETR's perspective, it doesn't care how the
> perf
> uses it. So from the ETR driver side, it still is something used by the perf
> mode.
> All it stands for is the buffer to be used when enabled in perf mode. I
> could
> definitely add some comment to describe this. But I am not sure if we have
> to
> rename the variable.
>
>>
>> Have you tried your implementation on a dragonboard or a Hikey?
>
>
> No, I haven't. But Mike and Rob are trying on the Dragonboard & HiKey
> respectively.
> We are hitting some issues in the Scatter Gather mode, which is under
> debugging.
> The SG table looks correct, just that the ETR hangs up. It works fine in the
> flat memory mode. So, it is something to do with the READ  (sg table
> pointers) vs
> WRITE (write trace data) pressure on the ETR.
>
> One change I am working on with the perf buffer is to limit the "size" of
> the
> trace buffer used by the ETR (in case of the perf-ring buffer) to the
> handle->size.
> Otherwise we could be corrupting the collected trace waiting for consumption
> by
> the user. This is easily possible with our SG table. But with the flat
> buffer, we have to
> limit the size the minimum of (handle->size, space-in-circular-buffer-before
> wrapping).
>
> In either case, we could lose data if we overflow the buffer, something we
> can't help
> at the moment.
>
>
> Suzuki
>
>
>
>>
>> Thanks,
>> Mathieu
>>
>>>   };
>>>     struct etr_buf_operations {
>>> --
>>> 2.13.6
>>>
>



-- 
Mike Leach
Principal Engineer, ARM Ltd.
Blackburn Design Centre. UK

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf
  2017-11-07 15:17       ` Mike Leach
@ 2017-11-07 15:46         ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-07 15:46 UTC (permalink / raw)
  To: Mike Leach
  Cc: Suzuki K Poulose, linux-arm-kernel, linux-kernel, Robert Walker,
	coresight

On 7 November 2017 at 08:17, Mike Leach <mike.leach@linaro.org> wrote:
> Hi Suzuki, Mathieu,
>
> A follow up on Dragonboard issues...
>
> =====
>  Using Suzuki's debug code and some of my own home spun updates I've
> got the following logging out of a typical ETR-SG session from the
> DB410.
> Session initiated using command line
> './perf record -e cs_etm/@826000.etr/ --per-thread sort'
>
> root@linaro-developer:~# [  122.075896] tmc_etr_sg_table_dump:455:
> Table base; Vaddr:ffffff800978d000; DAddr:0xb10b1000; Table Pages 1;
> Table Entries 1024
> [  122.075932] tmc_etr_sg_table_dump:462: 00000: ffffff800978d000:[N] 0xb14b0000
> [  122.086281] tmc_etr_sg_table_dump:462: 00001: ffffff800978d004:[N] 0xb14b1000
> [  122.093410] tmc_etr_sg_table_dump:462: 00002: ffffff800978d008:[N] 0xb14b2000
> ----- snip -----
> [  129.438535] tmc_etr_sg_table_dump:462: 01021: ffffff800978dff4:[N] 0xb10ad000
> [  129.445741] tmc_etr_sg_table_dump:475: 01022: ###
> ffffff800978dff8:[L] 0xb10ae000 ###
> [  129.452945] tmc_etr_sg_table_dump:479: 01023: empty line
> [  129.460840] tmc_etr_sg_table_dump:485: ******* End of Table *****
> [  129.466333] tmc_etr_alloc_sg_buf:822: coresight-tmc 826000.etr: ETR
> - alloc SG buffer
>
> == SG table looks fine - I've removed the last circular link used for
> rotating the table as that is not happening anyway and wanted to
> eliminate it as an issue
>
> == first pass trace capture - long running program
>
> [  129.481359] tmc_etr_enable_hw:1239:  Set DBA 0xb10b1000; AXICTL 0x000007bd
> [  129.484297] tmc_etr_enable_hw:1260: exit()
> [  129.491251] tmc_enable_etf_link:306: coresight-tmc 825000.etf:
> TMC-ETF enabled
> [  129.794350] tmc_sync_etr_buf:1124: enter()
> [  129.794377] tmc_sync_etr_buf:1131: ETR regs: RRP=0xb14b0000,
> RWP=0xB14B0000, STS=0x0000003C:full=false
>
> == this shows the data page values for the first SG page from the
> table have been loaded into the RRP / RWP registers. Indication that
> the
> == SG table has been read. However status indicates that the buffer is
> empty, and that the AXI bus has returned an error (bit 5). (messing
> with permissions made no difference)
> == Error ignored by the driver (but I think the system is
> irretrievably broken now anyway).
>
> [  129.794383] tmc_etr_sync_sg_buf:849: enter()
> [  129.806616] tmc_etr_sync_sg_buf:876: WARNING: Buffer Data Len == 0;
> force sync some pages
> [  129.811051] tmc_etr_sync_sg_buf:881: exit()
> [  129.819116] tmc_etr_sg_dump_pages:505: PG(0) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.823112] tmc_etr_sg_dump_pages:505: PG(1) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.829709] tmc_etr_sg_dump_pages:505: PG(2) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.836133] tmc_etr_sg_dump_pages:505: PG(3) : 0xcdcdcdcd::0xcdcdcdcd
>
> == 1st 4 pages were pre-filled - seem untouched
>
> [  129.842556] tmc_sync_etr_buf:1143: exit()
> [  129.848977] tmc_etr_sync_perf_buffer:1635: sync_perf 16384 bytes
> [  129.853016] tmc_etf_print_regs_debug:37: TMC-ETF regs; RRP:0xF20
> RWP:0xF20; Status:0x10
>
> == ETF - operating as a FIFO link has received data and has been
> emptied - so the trace system has been running.
>
> [  129.859058] tmc_disable_etf_link:327: coresight-tmc 825000.etf:
> TMC-ETF disabled
> [  129.866778] tmc_etr_disable_hw:1322: enter()
> [  129.874410] tmc_etr_disable_hw:1336: exit()
> [  129.878666] tmc_disable_etr_sink:1815: coresight-tmc 826000.etr:
> TMC-ETR disabled
>
> == At this point we have the AXI bus errored out, and apparently no
> trace sent to the ETR memory.
>
> == Second pass - perf tries to restart the trace.
>
> [  129.882636] tmc_etr_enable_hw:1197: enter()
> [  129.890230] tmc_etr_enable_hw:1239:  Set DBA 0xb10b1000; AXICTL 0x000007bd
> [  129.894205] tmc_etr_enable_hw:1260: exit()
> [  129.901157] tmc_enable_etf_link:306: coresight-tmc 825000.etf:
> TMC-ETF enabled
> [  129.922498] coresight-tmc 826000.etr: timeout while waiting for
> completion of Manual Flush
> [  129.922672] coresight-tmc 826000.etr: timeout while waiting for TMC
> to be Ready
> [  129.929645] tmc_sync_etr_buf:1124: enter()
> [  129.936850] tmc_sync_etr_buf:1131: ETR regs: RRP=0xb10b1000,
> RWP=0xB10B1000, STS=0x00000010:full=false
>
> == this is bad - somehow the ETR regs have been set to the table base
> address, not the data page base address. No apparent AXI bus fault at
> this point,
> == but it is likely that the restart cleared the bit and the AXI is no
> longer responding.
>
> [  129.936856] tmc_etr_sync_sg_buf:849: enter()
> [  129.950311] coresight-tmc 826000.etr: Unable to map RRP b10b1000 to offset
>
> == driver error in response to invalid RRP value
>
> [  129.954733] tmc_etr_sg_dump_pages:505: PG(0) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.961417] tmc_etr_sg_dump_pages:505: PG(1) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.967928] tmc_etr_sg_dump_pages:505: PG(2) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.974350] tmc_etr_sg_dump_pages:505: PG(3) : 0xcdcdcdcd::0xcdcdcdcd
> [  129.980772] tmc_sync_etr_buf:1143: exit()
> [  129.987194] tmc_etr_sync_perf_buffer:1635: sync_perf 0 bytes
> [  129.991201] tmc_etf_print_regs_debug:37: TMC-ETF regs; RRP:0x1C0
> RWP:0x1C0; Status:0x1
>
> == ETF is full - still trying to collect trace data.
>
> [  129.997066] coresight-tmc 825000.etf: timeout while waiting for
> completion of Manual Flush
> [  130.004789] coresight-tmc 825000.etf: timeout while waiting for TMC
> to be Ready
> [  130.012896] tmc_disable_etf_link:327: coresight-tmc 825000.etf:
> TMC-ETF disabled
> [  130.020099] tmc_etr_disable_hw:1322: enter()
> [  130.027879] coresight-tmc 826000.etr: timeout while waiting for
> completion of Manual Flush
> [  130.032135] coresight-tmc 826000.etr: timeout while waiting for TMC
> to be Ready
>
> == flushing not working at any point in the system here - probably due
> to incorrect ETR operation - can't flush if downstream not accepting
> data.
>
> [  130.040062] tmc_etr_disable_hw:1336: exit()
> [  130.047266] tmc_disable_etr_sink:1815: coresight-tmc 826000.etr:
> TMC-ETR disabled
>
> == Beyond this point, things pretty much repeat, but other systems
> start failing too - missing interrupts etc.
> == Symptoms would seem to indicate a locked out AXI bus - but that is
> pure speculation.
> == Eventually, the system automatically reboots itself - some watchdog
> element I guess.
>
> =====
>
> Conclusion:-
>
> At this point ETR-SG is non-operational for unknown reasons - likely a
> memory system issue. Whether this is software config or hardware fault
> is not known at this point.
> However - this does raise the question about upstreaming this
> patchset. As it stands it will break existing ETR functionality on
> DB410c (and possibly HiKey 960).
> Currently the patchset decides on flat mapped / SG on buffer size. I
> would like to see a parameter added, something like SG threshold size,
> above which the implementation will choose SG, and below it will
> choose flat mapped. There also needs to be a special value - 0/-1
> where SG is always disabled for the device. If the parameter is
> available in device tree and sysfs then it will give the control
> needed should the ETR-SG issue with the current non-operational
> platforms turn out to be insurmountable. At the very least it will
> allow the current patchset to be implemented in a way that can
> preserve what is currently working till a solution is found.

Right, the patchset won't go upstream if it breaks things.  Before
thinking about mitigations I'd like to see what the root cause of the
problem is - when we have that we can discuss the best way to work
around it.

>
> Regards
>
> Mike
>
>
> On 7 November 2017 at 10:52, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
>> On 07/11/17 00:24, Mathieu Poirier wrote:
>>>
>>> On Thu, Oct 19, 2017 at 06:15:53PM +0100, Suzuki K Poulose wrote:
>>>>
>>>> Add necessary support for using ETR as a sink in ETM perf tracing.
>>>> We try make the best use of the available modes of buffers to
>>>> try and avoid software double buffering.
>>>>
>>>> We can use the perf ring buffer for ETR directly if all of the
>>>> conditions below are met :
>>>>   1) ETR is DMA coherent
>>>>   2) perf is used in snapshot mode. In full tracing mode, we cannot
>>>>      guarantee that the ETR will stop before it overwrites the data
>>>>      which may not have been consumed by the user.
>>>>   3) ETR supports save-restore with a scatter-gather mechanism
>>>>      which can use a given set of pages we use the perf ring buffer
>>>>      directly. If we have an in-built TMC ETR Scatter Gather unit,
>>>>      we make use of a circular SG list to restart from a given head.
>>>>      However, we need to align the starting offset to 4K in this case.
>>>>
>>>> If the ETR doesn't support either of this, we fallback to software
>>>> double buffering.
>>>>
>>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>
>>
>>
>>>>
>>>> +/*
>>>> + * etr_perf_buffer - Perf buffer used for ETR
>>>> + * @etr_buf            - Actual buffer used by the ETR
>>>> + * @snapshot           - Perf session mode
>>>> + * @head               - handle->head at the beginning of the session.
>>>> + * @nr_pages           - Number of pages in the ring buffer.
>>>> + * @pages              - Pages in the ring buffer.
>>>> + * @flags              - Capabilities of the hardware buffer used in the
>>>> + *                       session. If flags == 0, we use software double
>>>> + *                       buffering.
>>>> + */
>>>> +struct etr_perf_buffer {
>>>> +       struct etr_buf          *etr_buf;
>>>> +       bool                    snapshot;
>>>> +       unsigned long           head;
>>>> +       int                     nr_pages;
>>>> +       void                    **pages;
>>>> +       u32                     flags;
>>>> +};
>>>
>>>
>>> Please move this to the top, just below the declaration for etr_sg_table.
>>
>>
>> Sure.
>>
>>
>>
>>>> +
>>>> +/*
>>>> + * XXX: What is the expected behavior here in the following cases ?
>>>> + *  1) Full trace mode, without double buffering : What should be the
>>>> size
>>>> + *     reported back when the buffer is full and has wrapped around.
>>>> Ideally,
>>>> + *     we should report for the lost trace to make sure the "head" in
>>>> the ring
>>>> + *     buffer comes back to the position as in the trace buffer, rather
>>>> than
>>>> + *     returning "total size" of the buffer.
>>>> + * 2) In snapshot mode, should we always return "full buffer size" ?
>>>> + */
>>>> +static unsigned long
>>>> +tmc_etr_update_perf_buffer(struct coresight_device *csdev,
>>>> +                          struct perf_output_handle *handle,
>>>> +                          void *config)
>>>> +{
>>>> +       bool double_buffer, lost = false;
>>>> +       unsigned long flags, offset, size = 0;
>>>> +       struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>>> +       struct etr_perf_buffer *etr_perf = config;
>>>> +       struct etr_buf *etr_buf = etr_perf->etr_buf;
>>>> +
>>>> +       double_buffer = (etr_perf->flags == 0);
>>>> +
>>>> +       spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> +       if (WARN_ON(drvdata->perf_data != etr_perf)) {
>>>> +               lost = true;
>>>
>>>
>>> If we are here something went seriously wrong - I don't think much more
>>> can be
>>> done other than a WARN_ON()...
>>>
>>
>> right. I will do it for the case below as well.
>>
>>>>   static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
>>>>   {
>>
>> ...
>>>>
>>>> +
>>>> +       etr_perf = drvdata->perf_data;
>>>> +       if (!etr_perf || !etr_perf->etr_buf) {
>>>> +               rc = -EINVAL;
>>>
>>>
>>> This is a serious malfunction - I would WARN_ON() before unlocking.
>>>
>>
>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h
>>>> b/drivers/hwtracing/coresight/coresight-tmc.h
>>>> index 2c5b905b6494..06386ceb7866 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-tmc.h
>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
>>>> @@ -198,6 +198,7 @@ struct etr_buf {
>>>>    * @trigger_cntr: amount of words to store after a trigger.
>>>>    * @etr_caps: Bitmask of capabilities of the TMC ETR, inferred from the
>>>>    *            device configuration register (DEVID)
>>>> + * @perf_data: PERF buffer for ETR.
>>>>    * @sysfs_data:       SYSFS buffer for ETR.
>>>>    */
>>>>   struct tmc_drvdata {
>>>> @@ -219,6 +220,7 @@ struct tmc_drvdata {
>>>>         u32                     trigger_cntr;
>>>>         u32                     etr_caps;
>>>>         struct etr_buf          *sysfs_buf;
>>>> +       void                    *perf_data;
>>>
>>>
>>> This is a temporary place holder while an event is active, i.e
>>> theoretically it
>>> doesn't stay the same for the entire trace session.  In situations where
>>> there
>>> could be one ETR per CPU, the same ETR could be used to serve more than
>>> one
>>> trace session (since only one session can be active at a time on a CPU).
>>> As
>>> such I would call it curr_perf_data or something similar.  I'd also make
>>> that
>>> clear in the above documentation.
>>
>>
>> You're right. However, from the ETR's perspective, it doesn't care how the
>> perf
>> uses it. So from the ETR driver side, it still is something used by the perf
>> mode.
>> All it stands for is the buffer to be used when enabled in perf mode. I
>> could
>> definitely add some comment to describe this. But I am not sure if we have
>> to
>> rename the variable.
>>
>>>
>>> Have you tried your implementation on a dragonboard or a Hikey?
>>
>>
>> No, I haven't. But Mike and Rob are trying on the Dragonboard & HiKey
>> respectively.
>> We are hitting some issues in the Scatter Gather mode, which is under
>> debugging.
>> The SG table looks correct, just that the ETR hangs up. It works fine in the
>> flat memory mode. So, it is something to do with the READ  (sg table
>> pointers) vs
>> WRITE (write trace data) pressure on the ETR.
>>
>> One change I am working on with the perf buffer is to limit the "size" of
>> the
>> trace buffer used by the ETR (in case of the perf-ring buffer) to the
>> handle->size.
>> Otherwise we could be corrupting the collected trace waiting for consumption
>> by
>> the user. This is easily possible with our SG table. But with the flat
>> buffer, we have to
>> limit the size the minimum of (handle->size, space-in-circular-buffer-before
>> wrapping).
>>
>> In either case, we could lose data if we overflow the buffer, something we
>> can't help
>> at the moment.
>>
>>
>> Suzuki
>>
>>
>>
>>>
>>> Thanks,
>>> Mathieu
>>>
>>>>   };
>>>>     struct etr_buf_operations {
>>>> --
>>>> 2.13.6
>>>>
>>
>
>
>
> --
> Mike Leach
> Principal Engineer, ARM Ltd.
> Blackburn Design Centre. UK

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/17] coresight: tmc: Make ETR SG table circular
  2017-11-07 10:36     ` Suzuki K Poulose
@ 2017-11-09 16:19       ` Mathieu Poirier
  0 siblings, 0 replies; 56+ messages in thread
From: Mathieu Poirier @ 2017-11-09 16:19 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, rob.walker, Mike Leach, coresight

On 7 November 2017 at 03:36, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 06/11/17 19:07, Mathieu Poirier wrote:
>>
>> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
>
>
> ...
>
>>> +/*
>>> + * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
>>> + * to the index of the page table "entry". Data pointers always have
>>> + * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
>>> + * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE -1) entries.
>>> + */
>>> +static inline u32
>>> +tmc_etr_sg_offset_to_table_index(u64 offset)
>>> +{
>>> +       u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
>>> +
>>> +       return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
>>> +}
>>
>>
>> This function is the source of a bizarre linking error when compiling
>> [14/17] on armv7, as pasted here:
>>
>>    UPD     include/generated/compile.h
>>    CC      init/version.o
>>    AR      init/built-in.o
>>    AR      built-in.o
>>    LD      vmlinux.o
>>    MODPOST vmlinux.o
>> drivers/hwtracing/coresight/coresight-tmc-etr.o: In function
>> `tmc_etr_sg_offset_to_table_index':
>>
>> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553:
>> undefined reference to `__aeabi_uldivmod'
>>
>> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:551:
>> undefined reference to `__aeabi_uldivmod'
>>
>> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553:
>> undefined reference to `__aeabi_uldivmod'
>> drivers/hwtracing/coresight/coresight-tmc-etr.o: In function
>> `tmc_etr_sg_table_rotate':
>>
>> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:609:
>> undefined reference to `__aeabi_uldivmod'
>>
>> Please see if you can reproduce on your side.
>
>
> Uh! I had gcc-7, which didn't complain about it, but if I switch to 4.9
> it does. It looks like the division of a 64-bit entity on arm32 is
> triggering it. We don't need this to be u64 above, as it is the page
> index; we could simply switch to "unsigned long" rather than explicitly
> using the div64 helpers.
>
> The following change fixes the issue for me. Could you please check if
> it solves the problem for you?

Unfortunately it doesn't.

Mathieu

>
>
> @@ -551,7 +553,7 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  static inline u32
>  tmc_etr_sg_offset_to_table_index(u64 offset)
>  {
> -       u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
> +       unsigned long sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
>
>         return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
>  }
>
>
>
> Thanks for the testing !
>
> Suzuki
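
For context on the failure above: the 32-bit ARM kernel does not link
against libgcc, so an open-coded '/' on a u64 is compiled into a call to
__aeabi_uldivmod that has no definition, which is exactly what MODPOST
reports. The log also flags tmc_etr_sg_table_rotate (line 609), which is
likely why the unsigned long change alone did not resolve it for Mathieu.
The other usual approach is the kernel's 64-bit division helpers from
<linux/math64.h>; the sketch below is only an illustration of that, not
the change adopted in this thread.

#include <linux/math64.h>

/*
 * Alternative sketch only -- not the fix adopted in this thread.
 * div_u64() does a 64-by-32 bit division without emitting a call to
 * the libgcc helper __aeabi_uldivmod on 32-bit architectures.
 */
static inline u32
tmc_etr_sg_offset_to_table_index(u64 offset)
{
	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;

	/*
	 * Each SG table page holds ETR_SG_PTRS_PER_PAGE - 1 data pointers
	 * plus one link pointer, hence one extra table entry for every
	 * (ETR_SG_PTRS_PER_PAGE - 1) data pages.
	 */
	return sgpage_idx + div_u64(sgpage_idx, ETR_SG_PTRS_PER_PAGE - 1);
}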

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2017-11-09 16:19 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-19 17:15 [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 01/17] coresight etr: Disallow perf mode temporarily Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 02/17] coresight tmc: Hide trace buffer handling for file read Suzuki K Poulose
2017-10-20 12:34   ` Julien Thierry
2017-11-01  9:55     ` Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 03/17] coresight: Add helper for inserting synchronization packets Suzuki K Poulose
2017-10-30 21:44   ` Mathieu Poirier
2017-11-01 10:01     ` Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 04/17] coresight: Add generic TMC sg table framework Suzuki K Poulose
2017-10-31 22:13   ` Mathieu Poirier
2017-11-01 10:09     ` Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 05/17] coresight: Add support for TMC ETR SG unit Suzuki K Poulose
2017-10-20 16:25   ` Julien Thierry
2017-11-01 10:11     ` Suzuki K Poulose
2017-11-01 20:41   ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 06/17] coresight: tmc: Make ETR SG table circular Suzuki K Poulose
2017-10-20 17:11   ` Julien Thierry
2017-11-01 10:12     ` Suzuki K Poulose
2017-11-01 23:47   ` Mathieu Poirier
2017-11-02 12:00     ` Suzuki K Poulose
2017-11-02 14:40       ` Mathieu Poirier
2017-11-02 15:13         ` Russell King - ARM Linux
2017-11-06 19:07   ` Mathieu Poirier
2017-11-07 10:36     ` Suzuki K Poulose
2017-11-09 16:19       ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 07/17] coresight: tmc etr: Add transparent buffer management Suzuki K Poulose
2017-11-02 17:48   ` Mathieu Poirier
2017-11-03 10:02     ` Suzuki K Poulose
2017-11-03 20:13       ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 08/17] coresight: tmc: Add configuration support for trace buffer size Suzuki K Poulose
2017-11-02 19:26   ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 09/17] coresight: Convert driver messages to dev_dbg Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 10/17] coresight: etr: Track if the device is coherent Suzuki K Poulose
2017-11-02 19:40   ` Mathieu Poirier
2017-11-03 10:03     ` Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers Suzuki K Poulose
2017-11-02 20:26   ` Mathieu Poirier
2017-11-03 10:08     ` Suzuki K Poulose
2017-11-03 20:30       ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 12/17] coresight etr: Relax collection of trace from sysfs mode Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 13/17] coresight etr: Do not clean ETR trace buffer Suzuki K Poulose
2017-11-02 20:36   ` Mathieu Poirier
2017-11-03 10:10     ` Suzuki K Poulose
2017-11-03 20:17       ` Mathieu Poirier
2017-11-07 10:37         ` Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 14/17] coresight: etr: Add support for save restore buffers Suzuki K Poulose
2017-11-03 22:22   ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 15/17] coresight: etr_buf: Add helper for padding an area of trace data Suzuki K Poulose
2017-10-19 17:15 ` [PATCH 16/17] coresight: perf: Remove reset_buffer call back for sinks Suzuki K Poulose
2017-11-06 21:10   ` Mathieu Poirier
2017-10-19 17:15 ` [PATCH 17/17] coresight perf: Add ETR backend support for etm-perf Suzuki K Poulose
2017-11-07  0:24   ` Mathieu Poirier
2017-11-07 10:52     ` Suzuki K Poulose
2017-11-07 15:17       ` Mike Leach
2017-11-07 15:46         ` Mathieu Poirier
2017-10-20 11:00 ` [PATCH 00/17] coresight: perf: TMC ETR backend support Suzuki K Poulose
