All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
@ 2020-09-04  2:41 Linu Cherian
  2020-09-04  2:41 ` [PATCH v4 1/2] coresight: etm: perf: Sink selection using sysfs is deprecated Linu Cherian
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Linu Cherian @ 2020-09-04  2:41 UTC (permalink / raw)
  To: mathieu.poirier, suzuki.poulose, mike.leach
  Cc: coresight, linuc.decode, linux-arm-kernel, Linu Cherian

This patch series tries to fix the sysfs breakage on topologies
with per core sink.

Changes since v3:
- References to coresight_get_enabled_sink in perf interface
  has been removed and marked deprecated as a new patch.
- To avoid changes to coresight_find_sink for ease of maintenance,
  search function specific to sysfs usage has been added. 
- Sysfs being the only user for coresight_get_enabled sink,
  reset option is removed as well.

Changes since v2:
- Fixed checkpatch issue

Changes since v1:
- Misc fixes in commit message

Applies on https://git.linaro.org/kernel/coresight.git/log/?h=next

Linu Cherian (2):
  coresight: etm: perf: Sink selection using sysfs is deprecated
  coresight: Make sysFS functional on topologies with per core sink

 .../hwtracing/coresight/coresight-etm-perf.c  |  2 -
 drivers/hwtracing/coresight/coresight-priv.h  |  3 +-
 drivers/hwtracing/coresight/coresight.c       | 58 +++++++++----------
 3 files changed, 29 insertions(+), 34 deletions(-)


base-commit: 17f17c8f02a35a746376c2ecd054386575835b8b
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v4 1/2] coresight: etm: perf: Sink selection using sysfs is deprecated
  2020-09-04  2:41 [PATCH v4 0/2] Make sysFS functional on topologies with per core sink Linu Cherian
@ 2020-09-04  2:41 ` Linu Cherian
  2020-09-04  2:41 ` [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink Linu Cherian
  2020-10-05 11:27 ` [PATCH v4 0/2] " Suzuki K Poulose
  2 siblings, 0 replies; 18+ messages in thread
From: Linu Cherian @ 2020-09-04  2:41 UTC (permalink / raw)
  To: mathieu.poirier, suzuki.poulose, mike.leach
  Cc: coresight, linuc.decode, linux-arm-kernel, Linu Cherian

When using the perf interface, sink selection using sysfs is
deprecated.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 1a3169e69bb1..d1bf62f47f72 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -222,8 +222,6 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 	if (event->attr.config2) {
 		id = (u32)event->attr.config2;
 		sink = coresight_get_sink_by_id(id);
-	} else {
-		sink = coresight_get_enabled_sink(true);
 	}
 
 	mask = &event_data->mask;
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink
  2020-09-04  2:41 [PATCH v4 0/2] Make sysFS functional on topologies with per core sink Linu Cherian
  2020-09-04  2:41 ` [PATCH v4 1/2] coresight: etm: perf: Sink selection using sysfs is deprecated Linu Cherian
@ 2020-09-04  2:41 ` Linu Cherian
  2020-09-14 19:36   ` Mathieu Poirier
  2020-10-05 11:27 ` [PATCH v4 0/2] " Suzuki K Poulose
  2 siblings, 1 reply; 18+ messages in thread
From: Linu Cherian @ 2020-09-04  2:41 UTC (permalink / raw)
  To: mathieu.poirier, suzuki.poulose, mike.leach
  Cc: coresight, linuc.decode, linux-arm-kernel, Linu Cherian

Coresight driver assumes sink is common across all the ETMs,
and tries to build a path between ETM and the first enabled
sink found using bus based search. This breaks sysFS usage
on implementations that has multiple per core sinks in
enabled state.

To fix this, coresight_get_enabled_sink API is updated to
do a connection based search starting from the given source,
instead of bus based search.
With sink selection using sysfs depecrated for perf interface,
provision for reset is removed as well in this API.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
---
 drivers/hwtracing/coresight/coresight-priv.h |  3 +-
 drivers/hwtracing/coresight/coresight.c      | 58 +++++++++-----------
 2 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f2dc625ea585..5fe773c4d6cc 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -148,7 +148,8 @@ static inline void coresight_write_reg_pair(void __iomem *addr, u64 val,
 void coresight_disable_path(struct list_head *path);
 int coresight_enable_path(struct list_head *path, u32 mode, void *sink_data);
 struct coresight_device *coresight_get_sink(struct list_head *path);
-struct coresight_device *coresight_get_enabled_sink(bool reset);
+struct coresight_device *
+coresight_get_enabled_sink(struct coresight_device *source);
 struct coresight_device *coresight_get_sink_by_id(u32 id);
 struct coresight_device *
 coresight_find_default_sink(struct coresight_device *csdev);
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index e9c90f2de34a..562dc3d5f5b2 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -540,50 +540,46 @@ struct coresight_device *coresight_get_sink(struct list_head *path)
 	return csdev;
 }
 
-static int coresight_enabled_sink(struct device *dev, const void *data)
+static struct coresight_device *
+coresight_find_enabled_sink(struct coresight_device *csdev)
 {
-	const bool *reset = data;
-	struct coresight_device *csdev = to_coresight_device(dev);
+	int i;
+	struct coresight_device *sink;
 
 	if ((csdev->type == CORESIGHT_DEV_TYPE_SINK ||
 	     csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) &&
-	     csdev->activated) {
-		/*
-		 * Now that we have a handle on the sink for this session,
-		 * disable the sysFS "enable_sink" flag so that possible
-		 * concurrent perf session that wish to use another sink don't
-		 * trip on it.  Doing so has no ramification for the current
-		 * session.
-		 */
-		if (*reset)
-			csdev->activated = false;
+	    csdev->activated)
+		return csdev;
 
-		return 1;
+	/*
+	 * Recursively explore each port found on this element.
+	 */
+	for (i = 0; i < csdev->pdata->nr_outport; i++) {
+		struct coresight_device *child_dev;
+
+		child_dev = csdev->pdata->conns[i].child_dev;
+		if (child_dev)
+			sink = coresight_find_enabled_sink(child_dev);
+		if (sink)
+			return sink;
 	}
 
-	return 0;
+	return NULL;
 }
 
 /**
- * coresight_get_enabled_sink - returns the first enabled sink found on the bus
- * @deactivate:	Whether the 'enable_sink' flag should be reset
- *
- * When operated from perf the deactivate parameter should be set to 'true'.
- * That way the "enabled_sink" flag of the sink that was selected can be reset,
- * allowing for other concurrent perf sessions to choose a different sink.
+ * coresight_get_enabled_sink - returns the first enabled sink using
+ * connection based search starting from the source reference
  *
- * When operated from sysFS users have full control and as such the deactivate
- * parameter should be set to 'false', hence mandating users to explicitly
- * clear the flag.
+ * @source: Coresight source device reference
  */
-struct coresight_device *coresight_get_enabled_sink(bool deactivate)
+struct coresight_device *
+coresight_get_enabled_sink(struct coresight_device *source)
 {
-	struct device *dev = NULL;
-
-	dev = bus_find_device(&coresight_bustype, NULL, &deactivate,
-			      coresight_enabled_sink);
+	if (!source)
+		return NULL;
 
-	return dev ? to_coresight_device(dev) : NULL;
+	return coresight_find_enabled_sink(source);
 }
 
 static int coresight_sink_by_id(struct device *dev, const void *data)
@@ -992,7 +988,7 @@ int coresight_enable(struct coresight_device *csdev)
 	 * Search for a valid sink for this session but don't reset the
 	 * "enable_sink" flag in sysFS.  Users get to do that explicitly.
 	 */
-	sink = coresight_get_enabled_sink(false);
+	sink = coresight_get_enabled_sink(csdev);
 	if (!sink) {
 		ret = -EINVAL;
 		goto out;
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink
  2020-09-04  2:41 ` [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink Linu Cherian
@ 2020-09-14 19:36   ` Mathieu Poirier
  2020-09-15  4:40     ` Linu Cherian
  0 siblings, 1 reply; 18+ messages in thread
From: Mathieu Poirier @ 2020-09-14 19:36 UTC (permalink / raw)
  To: Linu Cherian
  Cc: coresight, mike.leach, linuc.decode, linux-arm-kernel, suzuki.poulose

Hi Linu,

On Fri, Sep 04, 2020 at 08:11:06AM +0530, Linu Cherian wrote:
> Coresight driver assumes sink is common across all the ETMs,
> and tries to build a path between ETM and the first enabled
> sink found using bus based search. This breaks sysFS usage
> on implementations that has multiple per core sinks in
> enabled state.
> 
> To fix this, coresight_get_enabled_sink API is updated to
> do a connection based search starting from the given source,
> instead of bus based search.
> With sink selection using sysfs depecrated for perf interface,
> provision for reset is removed as well in this API.
> 
> Signed-off-by: Linu Cherian <lcherian@marvell.com>
> ---
>  drivers/hwtracing/coresight/coresight-priv.h |  3 +-
>  drivers/hwtracing/coresight/coresight.c      | 58 +++++++++-----------
>  2 files changed, 29 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> index f2dc625ea585..5fe773c4d6cc 100644
> --- a/drivers/hwtracing/coresight/coresight-priv.h
> +++ b/drivers/hwtracing/coresight/coresight-priv.h
> @@ -148,7 +148,8 @@ static inline void coresight_write_reg_pair(void __iomem *addr, u64 val,
>  void coresight_disable_path(struct list_head *path);
>  int coresight_enable_path(struct list_head *path, u32 mode, void *sink_data);
>  struct coresight_device *coresight_get_sink(struct list_head *path);
> -struct coresight_device *coresight_get_enabled_sink(bool reset);
> +struct coresight_device *
> +coresight_get_enabled_sink(struct coresight_device *source);
>  struct coresight_device *coresight_get_sink_by_id(u32 id);
>  struct coresight_device *
>  coresight_find_default_sink(struct coresight_device *csdev);
> diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
> index e9c90f2de34a..562dc3d5f5b2 100644
> --- a/drivers/hwtracing/coresight/coresight.c
> +++ b/drivers/hwtracing/coresight/coresight.c
> @@ -540,50 +540,46 @@ struct coresight_device *coresight_get_sink(struct list_head *path)
>  	return csdev;
>  }
>  
> -static int coresight_enabled_sink(struct device *dev, const void *data)
> +static struct coresight_device *
> +coresight_find_enabled_sink(struct coresight_device *csdev)
>  {
> -	const bool *reset = data;
> -	struct coresight_device *csdev = to_coresight_device(dev);
> +	int i;
> +	struct coresight_device *sink;
>  
>  	if ((csdev->type == CORESIGHT_DEV_TYPE_SINK ||
>  	     csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) &&
> -	     csdev->activated) {
> -		/*
> -		 * Now that we have a handle on the sink for this session,
> -		 * disable the sysFS "enable_sink" flag so that possible
> -		 * concurrent perf session that wish to use another sink don't
> -		 * trip on it.  Doing so has no ramification for the current
> -		 * session.
> -		 */
> -		if (*reset)
> -			csdev->activated = false;
> +	    csdev->activated)

Indentation problem.

> +		return csdev;
>  
> -		return 1;
> +	/*
> +	 * Recursively explore each port found on this element.
> +	 */
> +	for (i = 0; i < csdev->pdata->nr_outport; i++) {
> +		struct coresight_device *child_dev;
> +
> +		child_dev = csdev->pdata->conns[i].child_dev;
> +		if (child_dev)
> +			sink = coresight_find_enabled_sink(child_dev);
> +		if (sink)
> +			return sink;
>  	}
>  
> -	return 0;
> +	return NULL;
>  }
>  
>  /**
> - * coresight_get_enabled_sink - returns the first enabled sink found on the bus
> - * @deactivate:	Whether the 'enable_sink' flag should be reset
> - *
> - * When operated from perf the deactivate parameter should be set to 'true'.
> - * That way the "enabled_sink" flag of the sink that was selected can be reset,
> - * allowing for other concurrent perf sessions to choose a different sink.
> + * coresight_get_enabled_sink - returns the first enabled sink using
> + * connection based search starting from the source reference
>   *
> - * When operated from sysFS users have full control and as such the deactivate
> - * parameter should be set to 'false', hence mandating users to explicitly
> - * clear the flag.
> + * @source: Coresight source device reference
>   */
> -struct coresight_device *coresight_get_enabled_sink(bool deactivate)
> +struct coresight_device *
> +coresight_get_enabled_sink(struct coresight_device *source)
>  {
> -	struct device *dev = NULL;
> -
> -	dev = bus_find_device(&coresight_bustype, NULL, &deactivate,
> -			      coresight_enabled_sink);
> +	if (!source)
> +		return NULL;
>  
> -	return dev ? to_coresight_device(dev) : NULL;
> +	return coresight_find_enabled_sink(source);
>  }
>  
>  static int coresight_sink_by_id(struct device *dev, const void *data)
> @@ -992,7 +988,7 @@ int coresight_enable(struct coresight_device *csdev)
>  	 * Search for a valid sink for this session but don't reset the
>  	 * "enable_sink" flag in sysFS.  Users get to do that explicitly.
>  	 */

With this patch the above comment is no longer relevant.  Since you were nice
enough to do the extra work I fixed both problem and applied your patches. 

Thanks,
Mathieu

> -	sink = coresight_get_enabled_sink(false);
> +	sink = coresight_get_enabled_sink(csdev);
>  	if (!sink) {
>  		ret = -EINVAL;
>  		goto out;
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink
  2020-09-14 19:36   ` Mathieu Poirier
@ 2020-09-15  4:40     ` Linu Cherian
  0 siblings, 0 replies; 18+ messages in thread
From: Linu Cherian @ 2020-09-15  4:40 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: coresight, mike.leach, linux-arm-kernel, Linu Cherian, suzuki.poulose


Hi Mathieu,

 >  
> >  /**
> > - * coresight_get_enabled_sink - returns the first enabled sink found on the bus
> > - * @deactivate:	Whether the 'enable_sink' flag should be reset
> > - *
> > - * When operated from perf the deactivate parameter should be set to 'true'.
> > - * That way the "enabled_sink" flag of the sink that was selected can be reset,
> > - * allowing for other concurrent perf sessions to choose a different sink.
> > + * coresight_get_enabled_sink - returns the first enabled sink using
> > + * connection based search starting from the source reference
> >   *
> > - * When operated from sysFS users have full control and as such the deactivate
> > - * parameter should be set to 'false', hence mandating users to explicitly
> > - * clear the flag.
> > + * @source: Coresight source device reference
> >   */
> > -struct coresight_device *coresight_get_enabled_sink(bool deactivate)
> > +struct coresight_device *
> > +coresight_get_enabled_sink(struct coresight_device *source)
> >  {
> > -	struct device *dev = NULL;
> > -
> > -	dev = bus_find_device(&coresight_bustype, NULL, &deactivate,
> > -			      coresight_enabled_sink);
> > +	if (!source)
> > +		return NULL;
> >  
> > -	return dev ? to_coresight_device(dev) : NULL;
> > +	return coresight_find_enabled_sink(source);
> >  }
> >  
> >  static int coresight_sink_by_id(struct device *dev, const void *data)
> > @@ -992,7 +988,7 @@ int coresight_enable(struct coresight_device *csdev)
> >  	 * Search for a valid sink for this session but don't reset the
> >  	 * "enable_sink" flag in sysFS.  Users get to do that explicitly.
> >  	 */
> 
> With this patch the above comment is no longer relevant.  Since you were nice
> enough to do the extra work I fixed both problem and applied your patches. 

Thanks.

-- 
Linu cherian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-09-04  2:41 [PATCH v4 0/2] Make sysFS functional on topologies with per core sink Linu Cherian
  2020-09-04  2:41 ` [PATCH v4 1/2] coresight: etm: perf: Sink selection using sysfs is deprecated Linu Cherian
  2020-09-04  2:41 ` [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink Linu Cherian
@ 2020-10-05 11:27 ` Suzuki K Poulose
  2020-10-06 13:21   ` Linu Cherian
                     ` (2 more replies)
  2 siblings, 3 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2020-10-05 11:27 UTC (permalink / raw)
  To: lcherian, mathieu.poirier, mike.leach
  Cc: coresight, linuc.decode, linux-arm-kernel

Hi Linu,

On 09/04/2020 03:41 AM, Linu Cherian wrote:
> This patch series tries to fix the sysfs breakage on topologies
> with per core sink.
> 
> Changes since v3:
> - References to coresight_get_enabled_sink in perf interface
>    has been removed and marked deprecated as a new patch.
> - To avoid changes to coresight_find_sink for ease of maintenance,
>    search function specific to sysfs usage has been added.
> - Sysfs being the only user for coresight_get_enabled sink,
>    reset option is removed as well.

Have you tried running perf with --per-thread option ? I believe
this will be impacted as well, as we choose a single sink at the
moment and this may not be reachable from the other CPUs, where
the event may be scheduled. Eventually loosing trace for the
duration where the task is scheduled on a different CPU.

Please could you try this patch and see if helps ? I have lightly
tested this on a fast model.

---8>---

coresight: etm-perf: Allow an event to use multiple sinks

When there are multiple sinks on the system, in the absence
of a specified sink, it is quite possible that a default sink
for an ETM could be different from that of another ETM (e.g, on
systems with per-CPU sinks). However we do not support having
multiple sinks for an event yet. This patch allows the event to
use the default sinks on the ETMs where they are scheduled as
long as the sinks are of the same type.

e.g, if we have 1x1 topology with per-CPU ETRs, the event can
use the per-CPU ETR for the session. However, if the sinks
are of different type, e.g TMC-ETR on one and a custom sink
on another, the event will only trace on the first detected
sink (just like we have today).

Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
  .../hwtracing/coresight/coresight-etm-perf.c  | 69 +++++++++++++------
  1 file changed, 49 insertions(+), 20 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c 
b/drivers/hwtracing/coresight/coresight-etm-perf.c
index c2c9b127d074..19fe38010474 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -204,14 +204,28 @@ static void etm_free_aux(void *data)
  	schedule_work(&event_data->work);
  }

+/*
+ * When an event could be scheduled on more than one CPUs, we have to make
+ * sure that the sinks are of the same type, so that the sink_buffer could
+ * be reused.
+ */
+static bool sinks_match(struct coresight_device *a, struct coresight_device *b)
+{
+	if (!a || !b)
+		return false;
+	return (sink_ops(a) == sink_ops(b)) &&
+	       (a->subtype.sink_subtype == b->subtype.sink_subtype);
+}
+
  static void *etm_setup_aux(struct perf_event *event, void **pages,
  			   int nr_pages, bool overwrite)
  {
  	u32 id;
  	int cpu = event->cpu;
  	cpumask_t *mask;
-	struct coresight_device *sink;
+	struct coresight_device *sink = NULL;
  	struct etm_event_data *event_data = NULL;
+	bool sink_forced = false;

  	event_data = alloc_event_data(cpu);
  	if (!event_data)
@@ -222,6 +236,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
  	if (event->attr.config2) {
  		id = (u32)event->attr.config2;
  		sink = coresight_get_sink_by_id(id);
+		sink_forced = true;
  	}

  	mask = &event_data->mask;
@@ -235,7 +250,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
  	 */
  	for_each_cpu(cpu, mask) {
  		struct list_head *path;
-		struct coresight_device *csdev;
+		struct coresight_device *csdev, *cpu_sink;

  		csdev = per_cpu(csdev_src, cpu);
  		/*
@@ -243,33 +258,42 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
  		 * the mask and continue with the rest. If ever we try to trace
  		 * on this CPU, we handle it accordingly.
  		 */
-		if (!csdev) {
-			cpumask_clear_cpu(cpu, mask);
-			continue;
-		}
-
+		if (!csdev)
+			goto clear_cpu;
  		/*
-		 * No sink provided - look for a default sink for one of the
-		 * devices. At present we only support topology where all CPUs
-		 * use the same sink [N:1], so only need to find one sink. The
-		 * coresight_build_path later will remove any CPU that does not
-		 * attach to the sink, or if we have not found a sink.
+		 * No sink provided - look for a default sink for all the devices.
+		 * We only support multiple sinks, only if all the default sinks
+		 * are of the same type, so that the sink buffer can be shared
+		 * as the event moves around. As earlier, we don't trace on a
+		 * CPU, if we can't find a suitable sink.
  		 */
-		if (!sink)
-			sink = coresight_find_default_sink(csdev);
+		if (!sink_forced) {
+			cpu_sink = coresight_find_default_sink(csdev);
+			if (!cpu_sink)
+				goto clear_cpu;
+			/* First sink for this event */
+			if (!sink) {
+				sink = cpu_sink;
+			} else if (!sinks_match(cpu_sink, sink)) {
+				goto clear_cpu;
+			}
+
+		} else {
+			cpu_sink = sink;
+		}

  		/*
  		 * Building a path doesn't enable it, it simply builds a
  		 * list of devices from source to sink that can be
  		 * referenced later when the path is actually needed.
  		 */
-		path = coresight_build_path(csdev, sink);
-		if (IS_ERR(path)) {
-			cpumask_clear_cpu(cpu, mask);
+		path = coresight_build_path(csdev, cpu_sink);
+		if (!IS_ERR(path)) {
+			*etm_event_cpu_path_ptr(event_data, cpu) = path;
  			continue;
  		}
-
-		*etm_event_cpu_path_ptr(event_data, cpu) = path;
+clear_cpu:
+		cpumask_clear_cpu(cpu, mask);
  	}

  	/* no sink found for any CPU - cannot trace */
@@ -284,7 +308,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
  	if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
  		goto err;

-	/* Allocate the sink buffer for this session */
+	/*
+	 * Allocate the sink buffer for this session. All the sinks
+	 * where this event can be scheduled are ensured to be of the
+	 * same type. Thus the same sink configuration is used by the
+	 * sinks.
+	 */
  	event_data->snk_config =
  			sink_ops(sink)->alloc_buffer(sink, event, pages,
  						     nr_pages, overwrite);
-- 
2.24.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-05 11:27 ` [PATCH v4 0/2] " Suzuki K Poulose
@ 2020-10-06 13:21   ` Linu Cherian
  2020-10-06 16:12   ` Mathieu Poirier
  2020-10-26  4:33   ` Linu Cherian
  2 siblings, 0 replies; 18+ messages in thread
From: Linu Cherian @ 2020-10-06 13:21 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, coresight, mathieu.poirier, lcherian, mike.leach


Hi Suzuki,

On Mon Oct 05, 2020 at 12:27:07PM +0100, Suzuki K Poulose wrote:
> Hi Linu,
> 
> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> >This patch series tries to fix the sysfs breakage on topologies
> >with per core sink.
> >
> >Changes since v3:
> >- References to coresight_get_enabled_sink in perf interface
> >   has been removed and marked deprecated as a new patch.
> >- To avoid changes to coresight_find_sink for ease of maintenance,
> >   search function specific to sysfs usage has been added.
> >- Sysfs being the only user for coresight_get_enabled sink,
> >   reset option is removed as well.
> 
> Have you tried running perf with --per-thread option ? I believe
> this will be impacted as well, as we choose a single sink at the
> moment and this may not be reachable from the other CPUs, where
> the event may be scheduled. Eventually loosing trace for the
> duration where the task is scheduled on a different CPU.
>

Our testing with perf was limited(using -C option)
and havent tried with --per-thread option.

 
> Please could you try this patch and see if helps ? I have lightly
> tested this on a fast model.

Sure, will get back on this.

-- 
Linu cherian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-05 11:27 ` [PATCH v4 0/2] " Suzuki K Poulose
  2020-10-06 13:21   ` Linu Cherian
@ 2020-10-06 16:12   ` Mathieu Poirier
  2020-10-06 16:43     ` Suzuki K Poulose
  2020-10-26  4:33   ` Linu Cherian
  2 siblings, 1 reply; 18+ messages in thread
From: Mathieu Poirier @ 2020-10-06 16:12 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: coresight, linuc.decode, linux-arm-kernel, lcherian, mike.leach

On Mon, Oct 05, 2020 at 12:27:07PM +0100, Suzuki K Poulose wrote:
> Hi Linu,
> 
> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> > This patch series tries to fix the sysfs breakage on topologies
> > with per core sink.
> > 
> > Changes since v3:
> > - References to coresight_get_enabled_sink in perf interface
> >    has been removed and marked deprecated as a new patch.
> > - To avoid changes to coresight_find_sink for ease of maintenance,
> >    search function specific to sysfs usage has been added.
> > - Sysfs being the only user for coresight_get_enabled sink,
> >    reset option is removed as well.
> 
> Have you tried running perf with --per-thread option ? I believe
> this will be impacted as well, as we choose a single sink at the
> moment and this may not be reachable from the other CPUs, where
> the event may be scheduled. Eventually loosing trace for the
> duration where the task is scheduled on a different CPU.

Right, I considered this set in the context of sysfs only.  I expect supporting
1:1 configuration on perf to require changes in the kernel drivers and the perf
tools.

> 
> Please could you try this patch and see if helps ? I have lightly
> tested this on a fast model.
> 
> ---8>---
> 
> coresight: etm-perf: Allow an event to use multiple sinks
> 
> When there are multiple sinks on the system, in the absence
> of a specified sink, it is quite possible that a default sink
> for an ETM could be different from that of another ETM (e.g, on
> systems with per-CPU sinks). However we do not support having
> multiple sinks for an event yet. This patch allows the event to
> use the default sinks on the ETMs where they are scheduled as
> long as the sinks are of the same type.
> 
> e.g, if we have 1x1 topology with per-CPU ETRs, the event can
> use the per-CPU ETR for the session. However, if the sinks
> are of different type, e.g TMC-ETR on one and a custom sink
> on another, the event will only trace on the first detected
> sink (just like we have today).
> 
> Cc: Linu Cherian <lcherian@marvell.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  .../hwtracing/coresight/coresight-etm-perf.c  | 69 +++++++++++++------
>  1 file changed, 49 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
> b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index c2c9b127d074..19fe38010474 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -204,14 +204,28 @@ static void etm_free_aux(void *data)
>  	schedule_work(&event_data->work);
>  }
> 
> +/*
> + * When an event could be scheduled on more than one CPUs, we have to make
> + * sure that the sinks are of the same type, so that the sink_buffer could
> + * be reused.
> + */
> +static bool sinks_match(struct coresight_device *a, struct coresight_device *b)
> +{
> +	if (!a || !b)
> +		return false;
> +	return (sink_ops(a) == sink_ops(b)) &&
> +	       (a->subtype.sink_subtype == b->subtype.sink_subtype);
> +}
> +
>  static void *etm_setup_aux(struct perf_event *event, void **pages,
>  			   int nr_pages, bool overwrite)
>  {
>  	u32 id;
>  	int cpu = event->cpu;
>  	cpumask_t *mask;
> -	struct coresight_device *sink;
> +	struct coresight_device *sink = NULL;
>  	struct etm_event_data *event_data = NULL;
> +	bool sink_forced = false;
> 
>  	event_data = alloc_event_data(cpu);
>  	if (!event_data)
> @@ -222,6 +236,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>  	if (event->attr.config2) {
>  		id = (u32)event->attr.config2;
>  		sink = coresight_get_sink_by_id(id);
> +		sink_forced = true;
>  	}
> 
>  	mask = &event_data->mask;
> @@ -235,7 +250,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>  	 */
>  	for_each_cpu(cpu, mask) {
>  		struct list_head *path;
> -		struct coresight_device *csdev;
> +		struct coresight_device *csdev, *cpu_sink;
> 
>  		csdev = per_cpu(csdev_src, cpu);
>  		/*
> @@ -243,33 +258,42 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>  		 * the mask and continue with the rest. If ever we try to trace
>  		 * on this CPU, we handle it accordingly.
>  		 */
> -		if (!csdev) {
> -			cpumask_clear_cpu(cpu, mask);
> -			continue;
> -		}
> -
> +		if (!csdev)
> +			goto clear_cpu;
>  		/*
> -		 * No sink provided - look for a default sink for one of the
> -		 * devices. At present we only support topology where all CPUs
> -		 * use the same sink [N:1], so only need to find one sink. The
> -		 * coresight_build_path later will remove any CPU that does not
> -		 * attach to the sink, or if we have not found a sink.
> +		 * No sink provided - look for a default sink for all the devices.
> +		 * We only support multiple sinks, only if all the default sinks
> +		 * are of the same type, so that the sink buffer can be shared
> +		 * as the event moves around. As earlier, we don't trace on a
> +		 * CPU, if we can't find a suitable sink.
>  		 */
> -		if (!sink)
> -			sink = coresight_find_default_sink(csdev);
> +		if (!sink_forced) {
> +			cpu_sink = coresight_find_default_sink(csdev);
> +			if (!cpu_sink)
> +				goto clear_cpu;
> +			/* First sink for this event */
> +			if (!sink) {
> +				sink = cpu_sink;
> +			} else if (!sinks_match(cpu_sink, sink)) {
> +				goto clear_cpu;
> +			}
> +
> +		} else {
> +			cpu_sink = sink;
> +		}
> 
>  		/*
>  		 * Building a path doesn't enable it, it simply builds a
>  		 * list of devices from source to sink that can be
>  		 * referenced later when the path is actually needed.
>  		 */
> -		path = coresight_build_path(csdev, sink);
> -		if (IS_ERR(path)) {
> -			cpumask_clear_cpu(cpu, mask);
> +		path = coresight_build_path(csdev, cpu_sink);
> +		if (!IS_ERR(path)) {
> +			*etm_event_cpu_path_ptr(event_data, cpu) = path;
>  			continue;
>  		}
> -
> -		*etm_event_cpu_path_ptr(event_data, cpu) = path;
> +clear_cpu:
> +		cpumask_clear_cpu(cpu, mask);
>  	}
> 
>  	/* no sink found for any CPU - cannot trace */
> @@ -284,7 +308,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>  	if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
>  		goto err;
> 
> -	/* Allocate the sink buffer for this session */
> +	/*
> +	 * Allocate the sink buffer for this session. All the sinks
> +	 * where this event can be scheduled are ensured to be of the
> +	 * same type. Thus the same sink configuration is used by the
> +	 * sinks.
> +	 */
>  	event_data->snk_config =
>  			sink_ops(sink)->alloc_buffer(sink, event, pages,
>  						     nr_pages, overwrite);
> -- 
> 2.24.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-06 16:12   ` Mathieu Poirier
@ 2020-10-06 16:43     ` Suzuki K Poulose
  2020-10-06 16:56       ` Mathieu Poirier
  0 siblings, 1 reply; 18+ messages in thread
From: Suzuki K Poulose @ 2020-10-06 16:43 UTC (permalink / raw)
  To: mathieu.poirier
  Cc: coresight, linuc.decode, linux-arm-kernel, lcherian, mike.leach

On 10/06/2020 05:12 PM, Mathieu Poirier wrote:
> On Mon, Oct 05, 2020 at 12:27:07PM +0100, Suzuki K Poulose wrote:
>> Hi Linu,
>>
>> On 09/04/2020 03:41 AM, Linu Cherian wrote:
>>> This patch series tries to fix the sysfs breakage on topologies
>>> with per core sink.
>>>
>>> Changes since v3:
>>> - References to coresight_get_enabled_sink in perf interface
>>>     has been removed and marked deprecated as a new patch.
>>> - To avoid changes to coresight_find_sink for ease of maintenance,
>>>     search function specific to sysfs usage has been added.
>>> - Sysfs being the only user for coresight_get_enabled sink,
>>>     reset option is removed as well.
>>
>> Have you tried running perf with --per-thread option ? I believe
>> this will be impacted as well, as we choose a single sink at the
>> moment and this may not be reachable from the other CPUs, where
>> the event may be scheduled. Eventually loosing trace for the
>> duration where the task is scheduled on a different CPU.
> 
> Right, I considered this set in the context of sysfs only.  I expect supporting
> 1:1 configuration on perf to require changes in the kernel drivers and the perf
> tools.

Correct. We can only support the perf case where the user opts for automatic
sink selection. If the user explicitly specifies a sink this could be either :

1) Due to the ignorance of the topology.
  OR
2) Purposefully forcing to use a global ETR on the system (for various
different reasons. e.g, debugging the interaction of multiple CPUs.)

So, the kernel cannot correct the user by choosing a different sink, when
something is specified. That is something the proposed patch below does.
As for the perf tools, as long as the tool can decode multiple AUX records
from different ETMs (generated on different sinks), I think we should be
done.

Suzuki


> 
>>
>> Please could you try this patch and see if helps ? I have lightly
>> tested this on a fast model.
>>
>> ---8>---
>>
>> coresight: etm-perf: Allow an event to use multiple sinks
>>
>> When there are multiple sinks on the system, in the absence
>> of a specified sink, it is quite possible that a default sink
>> for an ETM could be different from that of another ETM (e.g, on
>> systems with per-CPU sinks). However we do not support having
>> multiple sinks for an event yet. This patch allows the event to
>> use the default sinks on the ETMs where they are scheduled as
>> long as the sinks are of the same type.
>>
>> e.g, if we have 1x1 topology with per-CPU ETRs, the event can
>> use the per-CPU ETR for the session. However, if the sinks
>> are of different type, e.g TMC-ETR on one and a custom sink
>> on another, the event will only trace on the first detected
>> sink (just like we have today).
>>
>> Cc: Linu Cherian <lcherian@marvell.com>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Mike Leach <mike.leach@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   .../hwtracing/coresight/coresight-etm-perf.c  | 69 +++++++++++++------
>>   1 file changed, 49 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index c2c9b127d074..19fe38010474 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -204,14 +204,28 @@ static void etm_free_aux(void *data)
>>   	schedule_work(&event_data->work);
>>   }
>>
>> +/*
>> + * When an event could be scheduled on more than one CPUs, we have to make
>> + * sure that the sinks are of the same type, so that the sink_buffer could
>> + * be reused.
>> + */
>> +static bool sinks_match(struct coresight_device *a, struct coresight_device *b)
>> +{
>> +	if (!a || !b)
>> +		return false;
>> +	return (sink_ops(a) == sink_ops(b)) &&
>> +	       (a->subtype.sink_subtype == b->subtype.sink_subtype);
>> +}
>> +
>>   static void *etm_setup_aux(struct perf_event *event, void **pages,
>>   			   int nr_pages, bool overwrite)
>>   {
>>   	u32 id;
>>   	int cpu = event->cpu;
>>   	cpumask_t *mask;
>> -	struct coresight_device *sink;
>> +	struct coresight_device *sink = NULL;
>>   	struct etm_event_data *event_data = NULL;
>> +	bool sink_forced = false;
>>
>>   	event_data = alloc_event_data(cpu);
>>   	if (!event_data)
>> @@ -222,6 +236,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>   	if (event->attr.config2) {
>>   		id = (u32)event->attr.config2;
>>   		sink = coresight_get_sink_by_id(id);
>> +		sink_forced = true;
>>   	}
>>
>>   	mask = &event_data->mask;
>> @@ -235,7 +250,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>   	 */
>>   	for_each_cpu(cpu, mask) {
>>   		struct list_head *path;
>> -		struct coresight_device *csdev;
>> +		struct coresight_device *csdev, *cpu_sink;
>>
>>   		csdev = per_cpu(csdev_src, cpu);
>>   		/*
>> @@ -243,33 +258,42 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>   		 * the mask and continue with the rest. If ever we try to trace
>>   		 * on this CPU, we handle it accordingly.
>>   		 */
>> -		if (!csdev) {
>> -			cpumask_clear_cpu(cpu, mask);
>> -			continue;
>> -		}
>> -
>> +		if (!csdev)
>> +			goto clear_cpu;
>>   		/*
>> -		 * No sink provided - look for a default sink for one of the
>> -		 * devices. At present we only support topology where all CPUs
>> -		 * use the same sink [N:1], so only need to find one sink. The
>> -		 * coresight_build_path later will remove any CPU that does not
>> -		 * attach to the sink, or if we have not found a sink.
>> +		 * No sink provided - look for a default sink for all the devices.
>> +		 * We only support multiple sinks, only if all the default sinks
>> +		 * are of the same type, so that the sink buffer can be shared
>> +		 * as the event moves around. As earlier, we don't trace on a
>> +		 * CPU, if we can't find a suitable sink.
>>   		 */
>> -		if (!sink)
>> -			sink = coresight_find_default_sink(csdev);
>> +		if (!sink_forced) {
>> +			cpu_sink = coresight_find_default_sink(csdev);
>> +			if (!cpu_sink)
>> +				goto clear_cpu;
>> +			/* First sink for this event */
>> +			if (!sink) {
>> +				sink = cpu_sink;
>> +			} else if (!sinks_match(cpu_sink, sink)) {
>> +				goto clear_cpu;
>> +			}
>> +
>> +		} else {
>> +			cpu_sink = sink;
>> +		}
>>
>>   		/*
>>   		 * Building a path doesn't enable it, it simply builds a
>>   		 * list of devices from source to sink that can be
>>   		 * referenced later when the path is actually needed.
>>   		 */
>> -		path = coresight_build_path(csdev, sink);
>> -		if (IS_ERR(path)) {
>> -			cpumask_clear_cpu(cpu, mask);
>> +		path = coresight_build_path(csdev, cpu_sink);
>> +		if (!IS_ERR(path)) {
>> +			*etm_event_cpu_path_ptr(event_data, cpu) = path;
>>   			continue;
>>   		}
>> -
>> -		*etm_event_cpu_path_ptr(event_data, cpu) = path;
>> +clear_cpu:
>> +		cpumask_clear_cpu(cpu, mask);
>>   	}
>>
>>   	/* no sink found for any CPU - cannot trace */
>> @@ -284,7 +308,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>   	if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
>>   		goto err;
>>
>> -	/* Allocate the sink buffer for this session */
>> +	/*
>> +	 * Allocate the sink buffer for this session. All the sinks
>> +	 * where this event can be scheduled are ensured to be of the
>> +	 * same type. Thus the same sink configuration is used by the
>> +	 * sinks.
>> +	 */
>>   	event_data->snk_config =
>>   			sink_ops(sink)->alloc_buffer(sink, event, pages,
>>   						     nr_pages, overwrite);
>> -- 
>> 2.24.1
>>


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-06 16:43     ` Suzuki K Poulose
@ 2020-10-06 16:56       ` Mathieu Poirier
  0 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2020-10-06 16:56 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Coresight ML, Linu Cherian, linux-arm-kernel, Linu Cherian, Mike Leach

On Tue, 6 Oct 2020 at 10:38, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> On 10/06/2020 05:12 PM, Mathieu Poirier wrote:
> > On Mon, Oct 05, 2020 at 12:27:07PM +0100, Suzuki K Poulose wrote:
> >> Hi Linu,
> >>
> >> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> >>> This patch series tries to fix the sysfs breakage on topologies
> >>> with per core sink.
> >>>
> >>> Changes since v3:
> >>> - References to coresight_get_enabled_sink in perf interface
> >>>     has been removed and marked deprecated as a new patch.
> >>> - To avoid changes to coresight_find_sink for ease of maintenance,
> >>>     search function specific to sysfs usage has been added.
> >>> - Sysfs being the only user for coresight_get_enabled sink,
> >>>     reset option is removed as well.
> >>
> >> Have you tried running perf with --per-thread option ? I believe
> >> this will be impacted as well, as we choose a single sink at the
> >> moment and this may not be reachable from the other CPUs, where
> >> the event may be scheduled. Eventually loosing trace for the
> >> duration where the task is scheduled on a different CPU.
> >
> > Right, I considered this set in the context of sysfs only.  I expect supporting
> > 1:1 configuration on perf to require changes in the kernel drivers and the perf
> > tools.
>
> Correct. We can only support the perf case where the user opts for automatic
> sink selection. If the user explicitly specifies a sink this could be either :
>
> 1) Due to the ignorance of the topology.
>   OR
> 2) Purposefully forcing to use a global ETR on the system (for various
> different reasons. e.g, debugging the interaction of multiple CPUs.)
>

I agree.

> So, the kernel cannot correct the user by choosing a different sink, when
> something is specified. That is something the proposed patch below does.
> As for the perf tools, as long as the tool can decode multiple AUX records
> from different ETMs (generated on different sinks), I think we should be
> done.
>

IntelPT uses a 1:1 topology and as such Alex designed the perf tools
AUX mechanic to handle that kind of design.  In that regard the heavy
lifting has already been done but it won't work right away for CS,
some changes will be needed to tell the platform dependent code what
to expect.  The decoding part _should_ be fine but a careful
examination the code is the only way to be certain of that.

Thanks,
Mathieu

> Suzuki
>
>
> >
> >>
> >> Please could you try this patch and see if helps ? I have lightly
> >> tested this on a fast model.
> >>
> >> ---8>---
> >>
> >> coresight: etm-perf: Allow an event to use multiple sinks
> >>
> >> When there are multiple sinks on the system, in the absence
> >> of a specified sink, it is quite possible that a default sink
> >> for an ETM could be different from that of another ETM (e.g, on
> >> systems with per-CPU sinks). However we do not support having
> >> multiple sinks for an event yet. This patch allows the event to
> >> use the default sinks on the ETMs where they are scheduled as
> >> long as the sinks are of the same type.
> >>
> >> e.g, if we have 1x1 topology with per-CPU ETRs, the event can
> >> use the per-CPU ETR for the session. However, if the sinks
> >> are of different type, e.g TMC-ETR on one and a custom sink
> >> on another, the event will only trace on the first detected
> >> sink (just like we have today).
> >>
> >> Cc: Linu Cherian <lcherian@marvell.com>
> >> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> >> Cc: Mike Leach <mike.leach@linaro.org>
> >> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >> ---
> >>   .../hwtracing/coresight/coresight-etm-perf.c  | 69 +++++++++++++------
> >>   1 file changed, 49 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
> >> b/drivers/hwtracing/coresight/coresight-etm-perf.c
> >> index c2c9b127d074..19fe38010474 100644
> >> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> >> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> >> @@ -204,14 +204,28 @@ static void etm_free_aux(void *data)
> >>      schedule_work(&event_data->work);
> >>   }
> >>
> >> +/*
> >> + * When an event could be scheduled on more than one CPUs, we have to make
> >> + * sure that the sinks are of the same type, so that the sink_buffer could
> >> + * be reused.
> >> + */
> >> +static bool sinks_match(struct coresight_device *a, struct coresight_device *b)
> >> +{
> >> +    if (!a || !b)
> >> +            return false;
> >> +    return (sink_ops(a) == sink_ops(b)) &&
> >> +           (a->subtype.sink_subtype == b->subtype.sink_subtype);
> >> +}
> >> +
> >>   static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>                         int nr_pages, bool overwrite)
> >>   {
> >>      u32 id;
> >>      int cpu = event->cpu;
> >>      cpumask_t *mask;
> >> -    struct coresight_device *sink;
> >> +    struct coresight_device *sink = NULL;
> >>      struct etm_event_data *event_data = NULL;
> >> +    bool sink_forced = false;
> >>
> >>      event_data = alloc_event_data(cpu);
> >>      if (!event_data)
> >> @@ -222,6 +236,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>      if (event->attr.config2) {
> >>              id = (u32)event->attr.config2;
> >>              sink = coresight_get_sink_by_id(id);
> >> +            sink_forced = true;
> >>      }
> >>
> >>      mask = &event_data->mask;
> >> @@ -235,7 +250,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>       */
> >>      for_each_cpu(cpu, mask) {
> >>              struct list_head *path;
> >> -            struct coresight_device *csdev;
> >> +            struct coresight_device *csdev, *cpu_sink;
> >>
> >>              csdev = per_cpu(csdev_src, cpu);
> >>              /*
> >> @@ -243,33 +258,42 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>               * the mask and continue with the rest. If ever we try to trace
> >>               * on this CPU, we handle it accordingly.
> >>               */
> >> -            if (!csdev) {
> >> -                    cpumask_clear_cpu(cpu, mask);
> >> -                    continue;
> >> -            }
> >> -
> >> +            if (!csdev)
> >> +                    goto clear_cpu;
> >>              /*
> >> -             * No sink provided - look for a default sink for one of the
> >> -             * devices. At present we only support topology where all CPUs
> >> -             * use the same sink [N:1], so only need to find one sink. The
> >> -             * coresight_build_path later will remove any CPU that does not
> >> -             * attach to the sink, or if we have not found a sink.
> >> +             * No sink provided - look for a default sink for all the devices.
> >> +             * We only support multiple sinks, only if all the default sinks
> >> +             * are of the same type, so that the sink buffer can be shared
> >> +             * as the event moves around. As earlier, we don't trace on a
> >> +             * CPU, if we can't find a suitable sink.
> >>               */
> >> -            if (!sink)
> >> -                    sink = coresight_find_default_sink(csdev);
> >> +            if (!sink_forced) {
> >> +                    cpu_sink = coresight_find_default_sink(csdev);
> >> +                    if (!cpu_sink)
> >> +                            goto clear_cpu;
> >> +                    /* First sink for this event */
> >> +                    if (!sink) {
> >> +                            sink = cpu_sink;
> >> +                    } else if (!sinks_match(cpu_sink, sink)) {
> >> +                            goto clear_cpu;
> >> +                    }
> >> +
> >> +            } else {
> >> +                    cpu_sink = sink;
> >> +            }
> >>
> >>              /*
> >>               * Building a path doesn't enable it, it simply builds a
> >>               * list of devices from source to sink that can be
> >>               * referenced later when the path is actually needed.
> >>               */
> >> -            path = coresight_build_path(csdev, sink);
> >> -            if (IS_ERR(path)) {
> >> -                    cpumask_clear_cpu(cpu, mask);
> >> +            path = coresight_build_path(csdev, cpu_sink);
> >> +            if (!IS_ERR(path)) {
> >> +                    *etm_event_cpu_path_ptr(event_data, cpu) = path;
> >>                      continue;
> >>              }
> >> -
> >> -            *etm_event_cpu_path_ptr(event_data, cpu) = path;
> >> +clear_cpu:
> >> +            cpumask_clear_cpu(cpu, mask);
> >>      }
> >>
> >>      /* no sink found for any CPU - cannot trace */
> >> @@ -284,7 +308,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> >>      if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
> >>              goto err;
> >>
> >> -    /* Allocate the sink buffer for this session */
> >> +    /*
> >> +     * Allocate the sink buffer for this session. All the sinks
> >> +     * where this event can be scheduled are ensured to be of the
> >> +     * same type. Thus the same sink configuration is used by the
> >> +     * sinks.
> >> +     */
> >>      event_data->snk_config =
> >>                      sink_ops(sink)->alloc_buffer(sink, event, pages,
> >>                                                   nr_pages, overwrite);
> >> --
> >> 2.24.1
> >>
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-05 11:27 ` [PATCH v4 0/2] " Suzuki K Poulose
  2020-10-06 13:21   ` Linu Cherian
  2020-10-06 16:12   ` Mathieu Poirier
@ 2020-10-26  4:33   ` Linu Cherian
  2020-10-26 18:17     ` Suzuki K Poulose
  2 siblings, 1 reply; 18+ messages in thread
From: Linu Cherian @ 2020-10-26  4:33 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Suzuki,

On Mon, Oct 5, 2020 at 4:52 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> Hi Linu,
>
> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> > This patch series tries to fix the sysfs breakage on topologies
> > with per core sink.
> >
> > Changes since v3:
> > - References to coresight_get_enabled_sink in perf interface
> >    has been removed and marked deprecated as a new patch.
> > - To avoid changes to coresight_find_sink for ease of maintenance,
> >    search function specific to sysfs usage has been added.
> > - Sysfs being the only user for coresight_get_enabled sink,
> >    reset option is removed as well.
>
> Have you tried running perf with --per-thread option ? I believe
> this will be impacted as well, as we choose a single sink at the
> moment and this may not be reachable from the other CPUs, where
> the event may be scheduled. Eventually loosing trace for the
> duration where the task is scheduled on a different CPU.
>
> Please could you try this patch and see if helps ? I have lightly
> tested this on a fast model.

We are seeing some issues while testing with this patch.
The issue is that, always buffer allocation for the sink happens to be on the
first core in cpu mask and this doesn't match with the core on which
event is started. Please see below for additional comments.


>
> ---8>---
>
> coresight: etm-perf: Allow an event to use multiple sinks
>
> When there are multiple sinks on the system, in the absence
> of a specified sink, it is quite possible that a default sink
> for an ETM could be different from that of another ETM (e.g, on
> systems with per-CPU sinks). However we do not support having
> multiple sinks for an event yet. This patch allows the event to
> use the default sinks on the ETMs where they are scheduled as
> long as the sinks are of the same type.
>
> e.g, if we have 1x1 topology with per-CPU ETRs, the event can
> use the per-CPU ETR for the session. However, if the sinks
> are of different type, e.g TMC-ETR on one and a custom sink
> on another, the event will only trace on the first detected
> sink (just like we have today).
>
> Cc: Linu Cherian <lcherian@marvell.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   .../hwtracing/coresight/coresight-etm-perf.c  | 69 +++++++++++++------
>   1 file changed, 49 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
> b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index c2c9b127d074..19fe38010474 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -204,14 +204,28 @@ static void etm_free_aux(void *data)
>         schedule_work(&event_data->work);
>   }
>
> +/*
> + * When an event could be scheduled on more than one CPUs, we have to make
> + * sure that the sinks are of the same type, so that the sink_buffer could
> + * be reused.
> + */
> +static bool sinks_match(struct coresight_device *a, struct coresight_device *b)
> +{
> +       if (!a || !b)
> +               return false;
> +       return (sink_ops(a) == sink_ops(b)) &&
> +              (a->subtype.sink_subtype == b->subtype.sink_subtype);
> +}
> +
>   static void *etm_setup_aux(struct perf_event *event, void **pages,
>                            int nr_pages, bool overwrite)
>   {
>         u32 id;
>         int cpu = event->cpu;
>         cpumask_t *mask;
> -       struct coresight_device *sink;
> +       struct coresight_device *sink = NULL;
>         struct etm_event_data *event_data = NULL;
> +       bool sink_forced = false;
>
>         event_data = alloc_event_data(cpu);
>         if (!event_data)
> @@ -222,6 +236,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>         if (event->attr.config2) {
>                 id = (u32)event->attr.config2;
>                 sink = coresight_get_sink_by_id(id);
> +               sink_forced = true;
>         }
>
>         mask = &event_data->mask;
> @@ -235,7 +250,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>          */
>         for_each_cpu(cpu, mask) {
>                 struct list_head *path;
> -               struct coresight_device *csdev;
> +               struct coresight_device *csdev, *cpu_sink;
>
>                 csdev = per_cpu(csdev_src, cpu);
>                 /*
> @@ -243,33 +258,42 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>                  * the mask and continue with the rest. If ever we try to trace
>                  * on this CPU, we handle it accordingly.
>                  */
> -               if (!csdev) {
> -                       cpumask_clear_cpu(cpu, mask);
> -                       continue;
> -               }
> -
> +               if (!csdev)
> +                       goto clear_cpu;
>                 /*
> -                * No sink provided - look for a default sink for one of the
> -                * devices. At present we only support topology where all CPUs
> -                * use the same sink [N:1], so only need to find one sink. The
> -                * coresight_build_path later will remove any CPU that does not
> -                * attach to the sink, or if we have not found a sink.
> +                * No sink provided - look for a default sink for all the devices.
> +                * We only support multiple sinks, only if all the default sinks
> +                * are of the same type, so that the sink buffer can be shared
> +                * as the event moves around. As earlier, we don't trace on a
> +                * CPU, if we can't find a suitable sink.
>                  */
> -               if (!sink)
> -                       sink = coresight_find_default_sink(csdev);
> +               if (!sink_forced) {
> +                       cpu_sink = coresight_find_default_sink(csdev);
> +                       if (!cpu_sink)
> +                               goto clear_cpu;
> +                       /* First sink for this event */
> +                       if (!sink) {
> +                               sink = cpu_sink;
> +                       } else if (!sinks_match(cpu_sink, sink)) {
> +                               goto clear_cpu;
> +                       }
>


In per-thread option, cpu_sink always happens to be on core 0,
since all cores are enabled in the cpu mask.
So feels like we need to take into consideration of the core on which
the event gets started(the core on which the process gets executed)
while doing the buffer allocation.

On a related note, I had another question on this.
Don't we also need to address cases when multiple threads are forked
in a process ?

Thanks.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-26  4:33   ` Linu Cherian
@ 2020-10-26 18:17     ` Suzuki K Poulose
  2020-10-27 13:13       ` Linu Cherian
  0 siblings, 1 reply; 18+ messages in thread
From: Suzuki K Poulose @ 2020-10-26 18:17 UTC (permalink / raw)
  To: Linu Cherian
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Linu,

Thanks for the feedback. My responses inline.

On 10/26/20 4:33 AM, Linu Cherian wrote:
> Hi Suzuki,
> 
> On Mon, Oct 5, 2020 at 4:52 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi Linu,
>>
>> On 09/04/2020 03:41 AM, Linu Cherian wrote:
>>> This patch series tries to fix the sysfs breakage on topologies
>>> with per core sink.
>>>
>>> Changes since v3:
>>> - References to coresight_get_enabled_sink in perf interface
>>>     has been removed and marked deprecated as a new patch.
>>> - To avoid changes to coresight_find_sink for ease of maintenance,
>>>     search function specific to sysfs usage has been added.
>>> - Sysfs being the only user for coresight_get_enabled sink,
>>>     reset option is removed as well.
>>
>> Have you tried running perf with --per-thread option ? I believe
>> this will be impacted as well, as we choose a single sink at the
>> moment and this may not be reachable from the other CPUs, where
>> the event may be scheduled. Eventually loosing trace for the
>> duration where the task is scheduled on a different CPU.
>>
>> Please could you try this patch and see if helps ? I have lightly
>> tested this on a fast model.
> 
> We are seeing some issues while testing with this patch.
> The issue is that, always buffer allocation for the sink happens to be on the
> first core in cpu mask and this doesn't match with the core on which
> event is started. Please see below for additional comments.

Please could you clarify the "issues" ? How is the buffer allocation
a problem ?


> 
> 
>>
>> ---8>---
>>
>> coresight: etm-perf: Allow an event to use multiple sinks
>>
>> When there are multiple sinks on the system, in the absence
>> of a specified sink, it is quite possible that a default sink
>> for an ETM could be different from that of another ETM (e.g, on
>> systems with per-CPU sinks). However we do not support having
>> multiple sinks for an event yet. This patch allows the event to
>> use the default sinks on the ETMs where they are scheduled as
>> long as the sinks are of the same type.
>>
>> e.g, if we have 1x1 topology with per-CPU ETRs, the event can
>> use the per-CPU ETR for the session. However, if the sinks
>> are of different type, e.g TMC-ETR on one and a custom sink
>> on another, the event will only trace on the first detected
>> sink (just like we have today).
>>
>> Cc: Linu Cherian <lcherian@marvell.com>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Mike Leach <mike.leach@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>    .../hwtracing/coresight/coresight-etm-perf.c  | 69 +++++++++++++------
>>    1 file changed, 49 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index c2c9b127d074..19fe38010474 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -204,14 +204,28 @@ static void etm_free_aux(void *data)
>>          schedule_work(&event_data->work);
>>    }
>>
>> +/*
>> + * When an event could be scheduled on more than one CPUs, we have to make
>> + * sure that the sinks are of the same type, so that the sink_buffer could
>> + * be reused.
>> + */
>> +static bool sinks_match(struct coresight_device *a, struct coresight_device *b)
>> +{
>> +       if (!a || !b)
>> +               return false;
>> +       return (sink_ops(a) == sink_ops(b)) &&
>> +              (a->subtype.sink_subtype == b->subtype.sink_subtype);
>> +}
>> +
>>    static void *etm_setup_aux(struct perf_event *event, void **pages,
>>                             int nr_pages, bool overwrite)
>>    {
>>          u32 id;
>>          int cpu = event->cpu;
>>          cpumask_t *mask;
>> -       struct coresight_device *sink;
>> +       struct coresight_device *sink = NULL;
>>          struct etm_event_data *event_data = NULL;
>> +       bool sink_forced = false;
>>
>>          event_data = alloc_event_data(cpu);
>>          if (!event_data)
>> @@ -222,6 +236,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>          if (event->attr.config2) {
>>                  id = (u32)event->attr.config2;
>>                  sink = coresight_get_sink_by_id(id);
>> +               sink_forced = true;
>>          }
>>
>>          mask = &event_data->mask;
>> @@ -235,7 +250,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>           */
>>          for_each_cpu(cpu, mask) {
>>                  struct list_head *path;
>> -               struct coresight_device *csdev;
>> +               struct coresight_device *csdev, *cpu_sink;
>>
>>                  csdev = per_cpu(csdev_src, cpu);
>>                  /*
>> @@ -243,33 +258,42 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>>                   * the mask and continue with the rest. If ever we try to trace
>>                   * on this CPU, we handle it accordingly.
>>                   */
>> -               if (!csdev) {
>> -                       cpumask_clear_cpu(cpu, mask);
>> -                       continue;
>> -               }
>> -
>> +               if (!csdev)
>> +                       goto clear_cpu;
>>                  /*
>> -                * No sink provided - look for a default sink for one of the
>> -                * devices. At present we only support topology where all CPUs
>> -                * use the same sink [N:1], so only need to find one sink. The
>> -                * coresight_build_path later will remove any CPU that does not
>> -                * attach to the sink, or if we have not found a sink.
>> +                * No sink provided - look for a default sink for all the devices.
>> +                * We only support multiple sinks, only if all the default sinks
>> +                * are of the same type, so that the sink buffer can be shared
>> +                * as the event moves around. As earlier, we don't trace on a
>> +                * CPU, if we can't find a suitable sink.
>>                   */
>> -               if (!sink)
>> -                       sink = coresight_find_default_sink(csdev);
>> +               if (!sink_forced) {
>> +                       cpu_sink = coresight_find_default_sink(csdev);
>> +                       if (!cpu_sink)
>> +                               goto clear_cpu;
>> +                       /* First sink for this event */
>> +                       if (!sink) {
>> +                               sink = cpu_sink;
>> +                       } else if (!sinks_match(cpu_sink, sink)) {
>> +                               goto clear_cpu;
>> +                       }
>>
> 
> 
> In per-thread option, cpu_sink always happens to be on core 0,
> since all cores are enabled in the cpu mask.
> So feels like we need to take into consideration of the core on which
> the event gets started(the core on which the process gets executed)
> while doing the buffer allocation.

event can only be scheduled on a single CPU at anytime (even though, the
mask indicates it could be scheduled on "cpu_mask"). So it doesn't
really matter where the "buffer" is allocated, unless the NUMA situation
comes into picture. But that doesn't help anyway with the AUX buffer,
which is attached to the "event" and not the ETMs.  When the event
moves, the buffer is moved to the corresponding "sink" attached to the
CPU. We make sure that the new sink can use the buffer. So, it should
work fine.

> 
> On a related note, I had another question on this.
> Don't we also need to address cases when multiple threads are forked
> in a process ?

I don't think we should worry about that in the underlying driver. All
we care about is an event and where it can be scheduled and which sink
can we use.

--per-thread doesn't handle multiple-threads anyway.

In the normal mode (i.e, when  none of -a / -C / --per-thread are 
specified), we should get an event per CPU. Each of those events
would get a buffer allocated on the "sink" closer to it. In reality
buffer allocation is only an issue on NUMA. When a thread is scheduled
on the CPU, the event for that CPU kicks in the sink is automatically
used.

One potential solution to solve the NUMA case is having a buffer 
allocated per-node. But the AUX buffer would need handling too, which
is controlled by the generic perf.

Suzuki

> 
> Thanks.
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-26 18:17     ` Suzuki K Poulose
@ 2020-10-27 13:13       ` Linu Cherian
  2020-11-07  5:43         ` Linu Cherian
  0 siblings, 1 reply; 18+ messages in thread
From: Linu Cherian @ 2020-10-27 13:13 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Suzuki,


On Mon, Oct 26, 2020 at 11:47 PM Suzuki K Poulose
<suzuki.poulose@arm.com> wrote:
>
> Hi Linu,
>
> Thanks for the feedback. My responses inline.
>
> On 10/26/20 4:33 AM, Linu Cherian wrote:
> > Hi Suzuki,
> >
> > On Mon, Oct 5, 2020 at 4:52 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >>
> >> Hi Linu,
> >>
> >> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> >>> This patch series tries to fix the sysfs breakage on topologies
> >>> with per core sink.
> >>>
> >>> Changes since v3:
> >>> - References to coresight_get_enabled_sink in perf interface
> >>>     has been removed and marked deprecated as a new patch.
> >>> - To avoid changes to coresight_find_sink for ease of maintenance,
> >>>     search function specific to sysfs usage has been added.
> >>> - Sysfs being the only user for coresight_get_enabled sink,
> >>>     reset option is removed as well.
> >>
> >> Have you tried running perf with --per-thread option ? I believe
> >> this will be impacted as well, as we choose a single sink at the
> >> moment and this may not be reachable from the other CPUs, where
> >> the event may be scheduled. Eventually loosing trace for the
> >> duration where the task is scheduled on a different CPU.
> >>
> >> Please could you try this patch and see if helps ? I have lightly
> >> tested this on a fast model.
> >
> > We are seeing some issues while testing with this patch.
> > The issue is that, always buffer allocation for the sink happens to be on the
> > first core in cpu mask and this doesn't match with the core on which
> > event is started. Please see below for additional comments.
>
> Please could you clarify the "issues" ? How is the buffer allocation
> a problem ?

1. Just realized that the issue that we are seeing with this patch is something
specific to our test setup, since we had some custom patches that was required
for supporting the secure trace buffer configuration for our silicon.

And to be specific, our changeset was relying on the drvdata->etr_buf at the
time of tmc_etr_sync_perf_buffer.

In per core case during buffer allocation,
the sink chosen is always for the first core, core 0.
Let's consider the event started on say, core 4.
So w.r.t drvdata of tmc_etr4,
drvdata->etr_buf would get initialized while starting the event.
And w.r.t drvdata of tmc_etr0,
drvdata->etr_buf would be NULL here and our custom changeset
was expecting to be initialized with the etr_buf.

So will try to rebase our patches accordingly and test this again.

2. Related to iommu enabled configuration.

This on the assumption that there can be dedicated stream id (device id)
possible for each per core ETR device. Please ignore otherwise.

When the buffer allocation happens on tmc_etr0, and provided we have
 a iommu enabled case, the iommu mapping would be w.r.t to tmc_etr0.
But then, the actual DMA might get triggered from a non matching device,
ie. tmc_etr4 which would then fail.
Should we take this into consideration ?

Thanks.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-10-27 13:13       ` Linu Cherian
@ 2020-11-07  5:43         ` Linu Cherian
  2020-11-10 12:57           ` Linu Cherian
  0 siblings, 1 reply; 18+ messages in thread
From: Linu Cherian @ 2020-11-07  5:43 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Suzuki,

On Tue, Oct 27, 2020 at 6:43 PM Linu Cherian <linuc.decode@gmail.com> wrote:
>
> Hi Suzuki,
>
>
> On Mon, Oct 26, 2020 at 11:47 PM Suzuki K Poulose
> <suzuki.poulose@arm.com> wrote:
> >
> > Hi Linu,
> >
> > Thanks for the feedback. My responses inline.
> >
> > On 10/26/20 4:33 AM, Linu Cherian wrote:
> > > Hi Suzuki,
> > >
> > > On Mon, Oct 5, 2020 at 4:52 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> > >>
> > >> Hi Linu,
> > >>
> > >> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> > >>> This patch series tries to fix the sysfs breakage on topologies
> > >>> with per core sink.
> > >>>
> > >>> Changes since v3:
> > >>> - References to coresight_get_enabled_sink in perf interface
> > >>>     has been removed and marked deprecated as a new patch.
> > >>> - To avoid changes to coresight_find_sink for ease of maintenance,
> > >>>     search function specific to sysfs usage has been added.
> > >>> - Sysfs being the only user for coresight_get_enabled sink,
> > >>>     reset option is removed as well.
> > >>
> > >> Have you tried running perf with --per-thread option ? I believe
> > >> this will be impacted as well, as we choose a single sink at the
> > >> moment and this may not be reachable from the other CPUs, where
> > >> the event may be scheduled. Eventually loosing trace for the
> > >> duration where the task is scheduled on a different CPU.
> > >>
> > >> Please could you try this patch and see if helps ? I have lightly
> > >> tested this on a fast model.
> > >
> > > We are seeing some issues while testing with this patch.
> > > The issue is that, always buffer allocation for the sink happens to be on the
> > > first core in cpu mask and this doesn't match with the core on which
> > > event is started. Please see below for additional comments.
> >
> > Please could you clarify the "issues" ? How is the buffer allocation
> > a problem ?
>
> 1. Just realized that the issue that we are seeing with this patch is something
> specific to our test setup, since we had some custom patches that was required
> for supporting the secure trace buffer configuration for our silicon.
>
> And to be specific, our changeset was relying on the drvdata->etr_buf at the
> time of tmc_etr_sync_perf_buffer.
>
> In per core case during buffer allocation,
> the sink chosen is always for the first core, core 0.
> Let's consider the event started on say, core 4.
> So w.r.t drvdata of tmc_etr4,
> drvdata->etr_buf would get initialized while starting the event.
> And w.r.t drvdata of tmc_etr0,
> drvdata->etr_buf would be NULL here and our custom changeset
> was expecting to be initialized with the etr_buf.
>
> So will try to rebase our patches accordingly and test this again.
>

We are facing some issues while trying out perf. This doesn't appear
to be related to your patch though. Will share the details once we
do some initial analysis on it.

Thanks.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-11-07  5:43         ` Linu Cherian
@ 2020-11-10 12:57           ` Linu Cherian
  2020-11-10 14:57             ` Suzuki K Poulose
  0 siblings, 1 reply; 18+ messages in thread
From: Linu Cherian @ 2020-11-10 12:57 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Suzuki,

On Sat, Nov 7, 2020 at 11:13 AM Linu Cherian <linuc.decode@gmail.com> wrote:
>
> Hi Suzuki,
>
> On Tue, Oct 27, 2020 at 6:43 PM Linu Cherian <linuc.decode@gmail.com> wrote:
> >
> > Hi Suzuki,
> >
> >
> > On Mon, Oct 26, 2020 at 11:47 PM Suzuki K Poulose
> > <suzuki.poulose@arm.com> wrote:
> > >
> > > Hi Linu,
> > >
> > > Thanks for the feedback. My responses inline.
> > >
> > > On 10/26/20 4:33 AM, Linu Cherian wrote:
> > > > Hi Suzuki,
> > > >
> > > > On Mon, Oct 5, 2020 at 4:52 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> > > >>
> > > >> Hi Linu,
> > > >>
> > > >> On 09/04/2020 03:41 AM, Linu Cherian wrote:
> > > >>> This patch series tries to fix the sysfs breakage on topologies
> > > >>> with per core sink.
> > > >>>
> > > >>> Changes since v3:
> > > >>> - References to coresight_get_enabled_sink in perf interface
> > > >>>     has been removed and marked deprecated as a new patch.
> > > >>> - To avoid changes to coresight_find_sink for ease of maintenance,
> > > >>>     search function specific to sysfs usage has been added.
> > > >>> - Sysfs being the only user for coresight_get_enabled sink,
> > > >>>     reset option is removed as well.
> > > >>
> > > >> Have you tried running perf with --per-thread option ? I believe
> > > >> this will be impacted as well, as we choose a single sink at the
> > > >> moment and this may not be reachable from the other CPUs, where
> > > >> the event may be scheduled. Eventually loosing trace for the
> > > >> duration where the task is scheduled on a different CPU.
> > > >>
> > > >> Please could you try this patch and see if helps ? I have lightly
> > > >> tested this on a fast model.
> > > >
> > > > We are seeing some issues while testing with this patch.
> > > > The issue is that, always buffer allocation for the sink happens to be on the
> > > > first core in cpu mask and this doesn't match with the core on which
> > > > event is started. Please see below for additional comments.
> > >
> > > Please could you clarify the "issues" ? How is the buffer allocation
> > > a problem ?
> >
> > 1. Just realized that the issue that we are seeing with this patch is something
> > specific to our test setup, since we had some custom patches that was required
> > for supporting the secure trace buffer configuration for our silicon.
> >
> > And to be specific, our changeset was relying on the drvdata->etr_buf at the
> > time of tmc_etr_sync_perf_buffer.
> >
> > In per core case during buffer allocation,
> > the sink chosen is always for the first core, core 0.
> > Let's consider the event started on say, core 4.
> > So w.r.t drvdata of tmc_etr4,
> > drvdata->etr_buf would get initialized while starting the event.
> > And w.r.t drvdata of tmc_etr0,
> > drvdata->etr_buf would be NULL here and our custom changeset
> > was expecting to be initialized with the etr_buf.
> >
> > So will try to rebase our patches accordingly and test this again.
> >
>
> We are facing some issues while trying out perf. This doesn't appear
> to be related to your patch though. Will share the details once we
> do some initial analysis on it.
>
> Thanks.

# ./perf record -vvv -e cs_etm// --per-thread uname -a
Using CPUID 0x00000000430f0b40
Attempting to add event pmu 'cs_etm' with '' that may result in non-fatal errors
nr_cblocks: 0
affinity: SYS
mmap flush: 1
comp level: 0
maps__set_modules_path_dir: cannot open
/lib/modules/5.9.0-rc5-00116-g91c9ea890e1a dir
Problems setting modules path maps, continuing anyway...
------------------------------------------------------------
perf_event_attr:
  type                             8
  size                             120
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             120
  config                           0x9
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  exclude_kernel                   1
  exclude_hv                       1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  context_switch                   1
  ksymbol                          1
  bpf_event                        1
------------------------------------------------------------
sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 6
mmap size 589824B
AUX area mmap length 4194304
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             120
  config                           0x9
  watermark                        1
  sample_id_all                    1
  bpf_event                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -22
switching off bpf_event
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             120
  config                           0x9
  watermark                        1
  sample_id_all                    1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -22
switching off cloexec flag
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             120
  config                           0x9
  watermark                        1
  sample_id_all                    1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
sys_perf_event_open failed, error -22
switching off sample_id_all
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             120
  config                           0x9
  watermark                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
sys_perf_event_open failed, error -22
Couldn't start the BPF side band thread:
BPF programs starting from now on won't be annotatable
Synthesizing auxtrace information
cannot find cgroup mount point
Couldn't synthesize cgroup events.
Control descriptor is not initialized
Linux marvell 5.9.0-rc5-00116-g91c9ea890e1a #823 SMP PREEMPT Tue Nov
10 10:49:15 IST 2020 aarch64 aarch64 aarch64 GNU/Linux
auxtrace idx 0 old 0 head 0xdd50 diff 0xdd50
[ perf record: Woken up 1 times to write data ]
symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null)
snip ..
symbol:memory_mallopt file:(null) line:0 offset:0 return:0 lazy:(null)
failed to write feature CPUDESC
failed to write feature MEM_TOPOLOGY
failed to write feature CPU_PMU_CAPS
[ perf record: Captured and wrote 0.056 MB perf.data ]

# ./perf report
0x368 [0x50]: failed to process type: 1 [Cannot allocate memory]
Error:
failed to process sample
# To display the perf.data header info, please use --header/--header-only option

============================================================================

Appreciate your help on getting some debug hints on what is going wrong.

One strange thing noted here is sys_perf_event_open, passing cpu = -1
and pid = -1,
which doesnt appear to be valid as per tools/perf/design.txt

Thanks.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-11-10 12:57           ` Linu Cherian
@ 2020-11-10 14:57             ` Suzuki K Poulose
  2020-11-12  8:57               ` Linu Cherian
  0 siblings, 1 reply; 18+ messages in thread
From: Suzuki K Poulose @ 2020-11-10 14:57 UTC (permalink / raw)
  To: Linu Cherian
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Linu

On 11/10/20 12:57 PM, Linu Cherian wrote:
> Hi Suzuki,
> 
...

>> We are facing some issues while trying out perf. This doesn't appear
>> to be related to your patch though. Will share the details once we
>> do some initial analysis on it.
>>
>> Thanks.
> 
> # ./perf record -vvv -e cs_etm// --per-thread uname -a
> Using CPUID 0x00000000430f0b40
> Attempting to add event pmu 'cs_etm' with '' that may result in non-fatal errors
> nr_cblocks: 0
> affinity: SYS
> mmap flush: 1
> comp level: 0
> maps__set_modules_path_dir: cannot open
> /lib/modules/5.9.0-rc5-00116-g91c9ea890e1a dir
> Problems setting modules path maps, continuing anyway...
> ------------------------------------------------------------
> perf_event_attr:
>    type                             8
>    size                             120
>    { sample_period, sample_freq }   1
>    sample_type                      IP|TID|IDENTIFIER
>    read_format                      ID
>    disabled                         1
>    enable_on_exec                   1
>    sample_id_all                    1
>    exclude_guest                    1
> ------------------------------------------------------------
> sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 5
> ------------------------------------------------------------
> perf_event_attr:
>    type                             1
>    size                             120
>    config                           0x9
>    { sample_period, sample_freq }   1
>    sample_type                      IP|TID|IDENTIFIER
>    read_format                      ID
>    disabled                         1
>    exclude_kernel                   1
>    exclude_hv                       1
>    mmap                             1
>    comm                             1
>    enable_on_exec                   1
>    task                             1
>    sample_id_all                    1
>    exclude_guest                    1
>    mmap2                            1
>    comm_exec                        1
>    context_switch                   1
>    ksymbol                          1
>    bpf_event                        1
> ------------------------------------------------------------
> sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 6
> mmap size 589824B
> AUX area mmap length 4194304
> ------------------------------------------------------------
> perf_event_attr:
>    type                             1
>    size                             120
>    config                           0x9
>    watermark                        1
>    sample_id_all                    1
>    bpf_event                        1
>    { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -22
> switching off bpf_event
> ------------------------------------------------------------
> perf_event_attr:
>    type                             1
>    size                             120
>    config                           0x9
>    watermark                        1
>    sample_id_all                    1
>    { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -22
> switching off cloexec flag
> ------------------------------------------------------------
> perf_event_attr:
>    type                             1
>    size                             120
>    config                           0x9
>    watermark                        1
>    sample_id_all                    1
>    { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
> sys_perf_event_open failed, error -22
> switching off sample_id_all
> ------------------------------------------------------------
> perf_event_attr:
>    type                             1
>    size                             120
>    config                           0x9
>    watermark                        1
>    { wakeup_events, wakeup_watermark } 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
> sys_perf_event_open failed, error -22
> Couldn't start the BPF side band thread:
> BPF programs starting from now on won't be annotatable
> Synthesizing auxtrace information
> cannot find cgroup mount point
> Couldn't synthesize cgroup events.
> Control descriptor is not initialized
> Linux marvell 5.9.0-rc5-00116-g91c9ea890e1a #823 SMP PREEMPT Tue Nov
> 10 10:49:15 IST 2020 aarch64 aarch64 aarch64 GNU/Linux

> auxtrace idx 0 old 0 head 0xdd50 diff 0xdd50

I haven't seen this in the normal verbose output.

> [ perf record: Woken up 1 times to write data ]
> symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null)
> snip ..
> symbol:memory_mallopt file:(null) line:0 offset:0 return:0 lazy:(null)
> failed to write feature CPUDESC
> failed to write feature MEM_TOPOLOGY
> failed to write feature CPU_PMU_CAPS
> [ perf record: Captured and wrote 0.056 MB perf.data ]
> 
> # ./perf report
> 0x368 [0x50]: failed to process type: 1 [Cannot allocate memory]
> Error:
> failed to process sample

I have no clue about it. Are you able to run it under GDB ? (Looks like
you have built the perf, so if you have sources, it may be a good idea
to run under the GDB and figure out where that error is coming from).

Also, what is

perf --version ?


> # To display the perf.data header info, please use --header/--header-only option
> 
> ============================================================================
> 
> Appreciate your help on getting some debug hints on what is going wrong.


> 
> One strange thing noted here is sys_perf_event_open, passing cpu = -1
> and pid = -1,
> which doesnt appear to be valid as per tools/perf/design.txt

I see that on my Juno, but it still works. I believe that is for the
generic PMU (pmu.type == 1) and not the coresight PMU, which I believe
is (type == 8) in your case (the first event).

Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-11-10 14:57             ` Suzuki K Poulose
@ 2020-11-12  8:57               ` Linu Cherian
  2020-11-12  9:20                 ` Suzuki K Poulose
  0 siblings, 1 reply; 18+ messages in thread
From: Linu Cherian @ 2020-11-12  8:57 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

Hi Suzuki,

On Tue, Nov 10, 2020 at 8:27 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
> Hi Linu
>
> On 11/10/20 12:57 PM, Linu Cherian wrote:
> > Hi Suzuki,
> >
> ...
>
> >> We are facing some issues while trying out perf. This doesn't appear
> >> to be related to your patch though. Will share the details once we
> >> do some initial analysis on it.
> >>
> >> Thanks.
> >
> > # ./perf record -vvv -e cs_etm// --per-thread uname -a
> > Using CPUID 0x00000000430f0b40
> > Attempting to add event pmu 'cs_etm' with '' that may result in non-fatal errors
> > nr_cblocks: 0
> > affinity: SYS
> > mmap flush: 1
> > comp level: 0
> > maps__set_modules_path_dir: cannot open
> > /lib/modules/5.9.0-rc5-00116-g91c9ea890e1a dir
> > Problems setting modules path maps, continuing anyway...
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             8
> >    size                             120
> >    { sample_period, sample_freq }   1
> >    sample_type                      IP|TID|IDENTIFIER
> >    read_format                      ID
> >    disabled                         1
> >    enable_on_exec                   1
> >    sample_id_all                    1
> >    exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 5
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    { sample_period, sample_freq }   1
> >    sample_type                      IP|TID|IDENTIFIER
> >    read_format                      ID
> >    disabled                         1
> >    exclude_kernel                   1
> >    exclude_hv                       1
> >    mmap                             1
> >    comm                             1
> >    enable_on_exec                   1
> >    task                             1
> >    sample_id_all                    1
> >    exclude_guest                    1
> >    mmap2                            1
> >    comm_exec                        1
> >    context_switch                   1
> >    ksymbol                          1
> >    bpf_event                        1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 6
> > mmap size 589824B
> > AUX area mmap length 4194304
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    sample_id_all                    1
> >    bpf_event                        1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -22
> > switching off bpf_event
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    sample_id_all                    1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -22
> > switching off cloexec flag
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    sample_id_all                    1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
> > sys_perf_event_open failed, error -22
> > switching off sample_id_all
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
> > sys_perf_event_open failed, error -22
> > Couldn't start the BPF side band thread:
> > BPF programs starting from now on won't be annotatable
> > Synthesizing auxtrace information
> > cannot find cgroup mount point
> > Couldn't synthesize cgroup events.
> > Control descriptor is not initialized
> > Linux marvell 5.9.0-rc5-00116-g91c9ea890e1a #823 SMP PREEMPT Tue Nov
> > 10 10:49:15 IST 2020 aarch64 aarch64 aarch64 GNU/Linux
>
> > auxtrace idx 0 old 0 head 0xdd50 diff 0xdd50
>
> I haven't seen this in the normal verbose output.
>
> > [ perf record: Woken up 1 times to write data ]
> > symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null)
> > snip ..
> > symbol:memory_mallopt file:(null) line:0 offset:0 return:0 lazy:(null)
> > failed to write feature CPUDESC
> > failed to write feature MEM_TOPOLOGY
> > failed to write feature CPU_PMU_CAPS
> > [ perf record: Captured and wrote 0.056 MB perf.data ]
> >
> > # ./perf report
> > 0x368 [0x50]: failed to process type: 1 [Cannot allocate memory]
> > Error:
> > failed to process sample
>
> I have no clue about it. Are you able to run it under GDB ? (Looks like
> you have built the perf, so if you have sources, it may be a good idea
> to run under the GDB and figure out where that error is coming from).
>

Yeah gdb helped figuring out the issue.
The issue is in the opencsd, where it doesn't seem to support multiple streams
when the formatter is not enabled. .
Note:Our Silicon has formatter disabled and we already had changes in perf tool
to take care of the formatter status.

The below hack helped.

diff --git a/decoder/source/ocsd_dcd_tree.cpp b/decoder/source/ocsd_dcd_tree.cpp
index be15e36..0210dec 100644
--- a/decoder/source/ocsd_dcd_tree.cpp
+++ b/decoder/source/ocsd_dcd_tree.cpp
@@ -401,7 +401,7 @@ ocsd_err_t DecodeTree::createDecoder(const
std::string &decoderName, const int c
     int crtFlags = createFlags;

     uint8_t CSID = 0;   // default for single stream decoder (no
deformatter) - we ignore the ID
-    if(usingFormatter())
+    //if(usingFormatter())
     {
         CSID = pConfig->getTraceID();
         crtFlags |= OCSD_CREATE_FLG_INST_ID;


Not sure if this is the right fix though.

This is how i tested,

1. # taskset 0x2 ./perf record -e cs_etm//u -F 10 --per-thread ping -c
30 127.0.0.1

2. # Ctrl-Z // Put the process in background

3. # taskset -p 0x4 <pid of ping process> // Move the ping process to core 2

4. # fg // Get the process to foreground

5. ./perf report
snip ...

# Samples: 66K of event 'branches:uH'
# Event count (approx.): 66953
#
# Children      Self  Command  Shared Object          Symbol
# ........  ........  .......  .....................
........................................
#
    15.94%    15.94%  ping     ld-2.31.so             [.] _dl_lookup_symbol_x
    14.93%    14.93%  ping     ld-2.31.so             [.] do_lookup_x
    10.68%    10.68%  ping     libc-2.31.so           [.] _dl_addr
     9.87%     9.87%  ping     ld-2.31.so             [.] _dl_relocate_object
     6.75%     6.75%  ping     ld-2.31.so             [.] strcmp
     3.62%     3.62%  ping     ld-2.31.so             [.] check_match
     2.72%     2.72%  ping     libc-2.31.so           [.] __vfprintf_internal
     1.90%     1.90%  ping     libc-2.31.so           [.] _int_malloc
     1.29%     1.29%  ping     libc-2.31.so           [.] getenv
     1.28%     1.28%  ping     libc-2.31.so           [.] strcmp
     1.17%     1.17%  ping     libc-2.31.so           [.]
_IO_file_xsputn@@GLIBC_2.17
     1.16%     1.16%  ping     ld-2.31.so             [.] _dl_name_match_p

snip ...

Also i could verify using prints in the tmc-etr-driver that the trace
buffer gets reused across cores
as well.


> Also, what is
>
> perf --version ?

perf version 5.9.0-rc5


>
>
> > # To display the perf.data header info, please use --header/--header-only option
> >
> > ============================================================================
> >
> > Appreciate your help on getting some debug hints on what is going wrong.
>
>
> >
> > One strange thing noted here is sys_perf_event_open, passing cpu = -1
> > and pid = -1,
> > which doesnt appear to be valid as per tools/perf/design.txt
>
> I see that on my Juno, but it still works. I believe that is for the
> generic PMU (pmu.type == 1) and not the coresight PMU, which I believe
> is (type == 8) in your case (the first event).
>
> Suzuki

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v4 0/2] Make sysFS functional on topologies with per core sink
  2020-11-12  8:57               ` Linu Cherian
@ 2020-11-12  9:20                 ` Suzuki K Poulose
  0 siblings, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2020-11-12  9:20 UTC (permalink / raw)
  To: Linu Cherian
  Cc: linux-arm-kernel, Coresight ML, Mathieu Poirier, Linu Cherian,
	Mike Leach

On 11/12/20 8:57 AM, Linu Cherian wrote:
> Hi Suzuki,
> 
> On Tue, Nov 10, 2020 at 8:27 PM Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi Linu
>>
>> On 11/10/20 12:57 PM, Linu Cherian wrote:
>>> Hi Suzuki,
>>>
>> ...
>>

....

>>> # ./perf report
>>> 0x368 [0x50]: failed to process type: 1 [Cannot allocate memory]
>>> Error:
>>> failed to process sample
>>
>> I have no clue about it. Are you able to run it under GDB ? (Looks like
>> you have built the perf, so if you have sources, it may be a good idea
>> to run under the GDB and figure out where that error is coming from).
>>
> 
> Yeah gdb helped figuring out the issue.
> The issue is in the opencsd, where it doesn't seem to support multiple streams
> when the formatter is not enabled. .
> Note:Our Silicon has formatter disabled and we already had changes in perf tool
> to take care of the formatter status.
> 
> The below hack helped.
> 
> diff --git a/decoder/source/ocsd_dcd_tree.cpp b/decoder/source/ocsd_dcd_tree.cpp
> index be15e36..0210dec 100644
> --- a/decoder/source/ocsd_dcd_tree.cpp
> +++ b/decoder/source/ocsd_dcd_tree.cpp
> @@ -401,7 +401,7 @@ ocsd_err_t DecodeTree::createDecoder(const
> std::string &decoderName, const int c
>       int crtFlags = createFlags;
> 
>       uint8_t CSID = 0;   // default for single stream decoder (no
> deformatter) - we ignore the ID
> -    if(usingFormatter())
> +    //if(usingFormatter())
>       {
>           CSID = pConfig->getTraceID();
>           crtFlags |= OCSD_CREATE_FLG_INST_ID;
> 
> 
> Not sure if this is the right fix though.

That may work for you, but would break the existing platforms
and the drivers which enable formatting by default.
We need a way to address this in the perf side. This would be
needed for the ETE/TRBE trace scenario as well, where
the formatting is not supported by TRBE.

> 
> This is how i tested,
> 
> 1. # taskset 0x2 ./perf record -e cs_etm//u -F 10 --per-thread ping -c
> 30 127.0.0.1
> 
> 2. # Ctrl-Z // Put the process in background
> 
> 3. # taskset -p 0x4 <pid of ping process> // Move the ping process to core 2
> 
> 4. # fg // Get the process to foreground
> 
> 5. ./perf report
> snip ...
> 
> # Samples: 66K of event 'branches:uH'
> # Event count (approx.): 66953
> #
> # Children      Self  Command  Shared Object          Symbol
> # ........  ........  .......  .....................
> ........................................
> #
>      15.94%    15.94%  ping     ld-2.31.so             [.] _dl_lookup_symbol_x
>      14.93%    14.93%  ping     ld-2.31.so             [.] do_lookup_x
>      10.68%    10.68%  ping     libc-2.31.so           [.] _dl_addr
>       9.87%     9.87%  ping     ld-2.31.so             [.] _dl_relocate_object
>       6.75%     6.75%  ping     ld-2.31.so             [.] strcmp
>       3.62%     3.62%  ping     ld-2.31.so             [.] check_match
>       2.72%     2.72%  ping     libc-2.31.so           [.] __vfprintf_internal
>       1.90%     1.90%  ping     libc-2.31.so           [.] _int_malloc
>       1.29%     1.29%  ping     libc-2.31.so           [.] getenv
>       1.28%     1.28%  ping     libc-2.31.so           [.] strcmp
>       1.17%     1.17%  ping     libc-2.31.so           [.]
> _IO_file_xsputn@@GLIBC_2.17
>       1.16%     1.16%  ping     ld-2.31.so             [.] _dl_name_match_p
> 
> snip ...
> 
> Also i could verify using prints in the tmc-etr-driver that the trace
> buffer gets reused across cores
> as well.

Cool ! So please could you test the newer version of this patch (not functionally
different, but slightly modified code) and add a Tested-by if you are happy
with it ?

https://lore.kernel.org/linux-arm-kernel/1605012309-24812-3-git-send-email-anshuman.khandual@arm.com/

Cheers
Suzuki



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-11-12  9:22 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04  2:41 [PATCH v4 0/2] Make sysFS functional on topologies with per core sink Linu Cherian
2020-09-04  2:41 ` [PATCH v4 1/2] coresight: etm: perf: Sink selection using sysfs is deprecated Linu Cherian
2020-09-04  2:41 ` [PATCH v4 2/2] coresight: Make sysFS functional on topologies with per core sink Linu Cherian
2020-09-14 19:36   ` Mathieu Poirier
2020-09-15  4:40     ` Linu Cherian
2020-10-05 11:27 ` [PATCH v4 0/2] " Suzuki K Poulose
2020-10-06 13:21   ` Linu Cherian
2020-10-06 16:12   ` Mathieu Poirier
2020-10-06 16:43     ` Suzuki K Poulose
2020-10-06 16:56       ` Mathieu Poirier
2020-10-26  4:33   ` Linu Cherian
2020-10-26 18:17     ` Suzuki K Poulose
2020-10-27 13:13       ` Linu Cherian
2020-11-07  5:43         ` Linu Cherian
2020-11-10 12:57           ` Linu Cherian
2020-11-10 14:57             ` Suzuki K Poulose
2020-11-12  8:57               ` Linu Cherian
2020-11-12  9:20                 ` Suzuki K Poulose

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.