linux-perf-users.vger.kernel.org archive mirror
* [PATCH] perf script python: Fix buffer size to report iregs in perf script
@ 2021-06-28  6:23 Kajol Jain
  2021-06-28  7:15 ` Nageswara Sastry
  2021-06-28 14:49 ` Paul A. Clarke
  0 siblings, 2 replies; 8+ messages in thread
From: Kajol Jain @ 2021-06-28  6:23 UTC (permalink / raw)
  To: acme
  Cc: maddy, atrajeev, kjain, pc, linux-kernel, jolsa, ravi.bangoria,
	linux-perf-users, linuxppc-dev, rnsastry

Commit 48a1f565261d ("perf script python: Add more PMU fields
to event handler dict") added functionality to report fields such as
weight, iregs and uregs via perf report.
That commit hard-coded a 512-byte buffer for printing those fields.

On powerpc, however, extended regs support has since been added in
commits:

Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
Counter SPRs as part of extended regs")
Commit d735599a069f ("powerpc/perf: Add extended regs support for
power10 platform")

As a result, iregs can now carry more data, and the fixed buffer size
can cause data loss in perf script output.

This patch resolves the issue by sizing the buffer dynamically, based on
the number of registers to print. It also changes the return type of
"regs_map" from int to void, as the return value is not used by the
caller, "set_regs_in_dict".

Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
Counter SPRs as part of extended regs")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
 .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 4e4aa4c97ac5..c8c9706b4643 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -687,7 +687,7 @@ static void set_sample_datasrc_in_dict(PyObject *dict,
 			_PyUnicode_FromString(decode));
 }
 
-static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
+static void regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
 {
 	unsigned int i = 0, r;
 	int printed = 0;
@@ -695,7 +695,7 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
 	bf[0] = 0;
 
 	if (!regs || !regs->regs)
-		return 0;
+		return;
 
 	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
 		u64 val = regs->regs[i++];
@@ -704,8 +704,6 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
 				     "%5s:0x%" PRIx64 " ",
 				     perf_reg_name(r), val);
 	}
-
-	return printed;
 }
 
 static void set_regs_in_dict(PyObject *dict,
@@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
 			     struct evsel *evsel)
 {
 	struct perf_event_attr *attr = &evsel->core.attr;
-	char bf[512];
+
+	/*
+	 * Here value 28 is a constant size which can be used to print
+	 * one register value and its corresponds to:
+	 * 16 chars is to specify 64 bit register in hexadecimal.
+	 * 2 chars is for appending "0x" to the hexadecimal value and
+	 * 10 chars is for register name.
+	 */
+	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
+	char bf[size];
 
 	regs_map(&sample->intr_regs, attr->sample_regs_intr, bf, sizeof(bf));
 
-- 
2.31.1
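
To illustrate the truncation described in the commit message, here is a
minimal standalone sketch. It is not the perf code: scnprintf_sketch(),
regs_map_sketch() and regs_buf_size_sketch() are hypothetical stand-ins
for perf's scnprintf(), regs_map() and the __sw_hweight64()-based sizing,
and the generated "r<N>" names stand in for perf_reg_name().

```c
#include <inttypes.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for perf's scnprintf(): returns the number of characters
 * actually stored (excluding the NUL), so at most size - 1. */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Portable population count, standing in for __sw_hweight64(). */
static unsigned int popcount64_sketch(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Buffer size as computed in the patch: 28 bytes per sampled register. */
static size_t regs_buf_size_sketch(uint64_t mask)
{
	return (size_t)popcount64_sketch(mask) * 28;
}

/* Same loop shape as regs_map(), but with made-up register names.
 * Returns the number of characters stored in bf. */
static int regs_map_sketch(const uint64_t *vals, uint64_t mask,
			   char *bf, size_t size)
{
	unsigned int i = 0, r;
	int printed = 0;

	if (size == 0)
		return 0;
	bf[0] = '\0';
	if (!vals)
		return 0;

	for (r = 0; r < 64; r++) {
		char name[16];

		if (!(mask & (UINT64_C(1) << r)))
			continue;
		/* Hypothetical name; the real code calls perf_reg_name(r). */
		snprintf(name, sizeof(name), "r%u", r);
		printed += scnprintf_sketch(bf + printed, size - printed,
					    "%5s:0x%" PRIx64 " ",
					    name, vals[i++]);
	}
	return printed;
}
```

With 64 sampled registers holding full-width values, each entry in this
sketch takes 25 characters, so a 512-byte buffer stops after roughly 20
registers, while a buffer sized by regs_buf_size_sketch() holds all of
them. Note that a mask with no bits set yields a size of 0, which the
sketch guards against explicitly.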



* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-06-28  6:23 [PATCH] perf script python: Fix buffer size to report iregs in perf script Kajol Jain
@ 2021-06-28  7:15 ` Nageswara Sastry
  2021-06-28 14:49 ` Paul A. Clarke
  1 sibling, 0 replies; 8+ messages in thread
From: Nageswara Sastry @ 2021-06-28  7:15 UTC (permalink / raw)
  To: Kajol Jain, acme
  Cc: maddy, atrajeev, pc, linux-kernel, jolsa, ravi.bangoria,
	linux-perf-users, linuxppc-dev

Tested by generating perf-script.py with perf script
and printing the iregs. More values are shown with this patch.


Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>

On 28/06/21 11:53 am, Kajol Jain wrote:
> Commit 48a1f565261d ("perf script python: Add more PMU fields
> to event handler dict") added functionality to report fields like
> weight, iregs, uregs etc via perf report.
> That commit predefined buffer size to 512 bytes to print those fields.
> 
> But incase of powerpc, since we added extended regs support
> in commits:
> 
> Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
> Counter SPRs as part of extended regs")
> Commit d735599a069f ("powerpc/perf: Add extended regs support for
> power10 platform")
> 
> Now iregs can carry more bytes of data and this predefined buffer size
> can result to data loss in perf script output.
> 
> Patch resolve this issue by making buffer size dynamic based on number
> of registers needed to print. It also changed return type for function
> "regs_map" from int to void, as the return value is not being used by
> the caller function "set_regs_in_dict".
> 
> Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
> Counter SPRs as part of extended regs")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
>   .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
>   1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
> index 4e4aa4c97ac5..c8c9706b4643 100644
> --- a/tools/perf/util/scripting-engines/trace-event-python.c
> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
> @@ -687,7 +687,7 @@ static void set_sample_datasrc_in_dict(PyObject *dict,
>   			_PyUnicode_FromString(decode));
>   }
>   
> -static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
> +static void regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
>   {
>   	unsigned int i = 0, r;
>   	int printed = 0;
> @@ -695,7 +695,7 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
>   	bf[0] = 0;
>   
>   	if (!regs || !regs->regs)
> -		return 0;
> +		return;
>   
>   	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
>   		u64 val = regs->regs[i++];
> @@ -704,8 +704,6 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, char *bf, int size)
>   				     "%5s:0x%" PRIx64 " ",
>   				     perf_reg_name(r), val);
>   	}
> -
> -	return printed;
>   }
>   
>   static void set_regs_in_dict(PyObject *dict,
> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
>   			     struct evsel *evsel)
>   {
>   	struct perf_event_attr *attr = &evsel->core.attr;
> -	char bf[512];
> +
> +	/*
> +	 * Here value 28 is a constant size which can be used to print
> +	 * one register value and its corresponds to:
> +	 * 16 chars is to specify 64 bit register in hexadecimal.
> +	 * 2 chars is for appending "0x" to the hexadecimal value and
> +	 * 10 chars is for register name.
> +	 */
> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
> +	char bf[size];
>   
>   	regs_map(&sample->intr_regs, attr->sample_regs_intr, bf, sizeof(bf));
>   
> 

-- 
Thanks and Regards
R.Nageswara Sastry


* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-06-28  6:23 [PATCH] perf script python: Fix buffer size to report iregs in perf script Kajol Jain
  2021-06-28  7:15 ` Nageswara Sastry
@ 2021-06-28 14:49 ` Paul A. Clarke
  2021-06-29  7:09   ` kajoljain
  1 sibling, 1 reply; 8+ messages in thread
From: Paul A. Clarke @ 2021-06-28 14:49 UTC (permalink / raw)
  To: Kajol Jain
  Cc: acme, maddy, atrajeev, linux-kernel, jolsa, ravi.bangoria,
	linux-perf-users, linuxppc-dev, rnsastry

On Mon, Jun 28, 2021 at 11:53:41AM +0530, Kajol Jain wrote:
> Commit 48a1f565261d ("perf script python: Add more PMU fields
> to event handler dict") added functionality to report fields like
> weight, iregs, uregs etc via perf report.
> That commit predefined buffer size to 512 bytes to print those fields.
> 
> But incase of powerpc, since we added extended regs support
> in commits:
> 
> Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
> Counter SPRs as part of extended regs")
> Commit d735599a069f ("powerpc/perf: Add extended regs support for
> power10 platform")
> 
> Now iregs can carry more bytes of data and this predefined buffer size
> can result to data loss in perf script output.
> 
> Patch resolve this issue by making buffer size dynamic based on number
> of registers needed to print. It also changed return type for function
> "regs_map" from int to void, as the return value is not being used by
> the caller function "set_regs_in_dict".
> 
> Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
> Counter SPRs as part of extended regs")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
>  .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
> index 4e4aa4c97ac5..c8c9706b4643 100644
> --- a/tools/perf/util/scripting-engines/trace-event-python.c
> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
[...]
> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
>  			     struct evsel *evsel)
>  {
>  	struct perf_event_attr *attr = &evsel->core.attr;
> -	char bf[512];
> +
> +	/*
> +	 * Here value 28 is a constant size which can be used to print
> +	 * one register value and its corresponds to:
> +	 * 16 chars is to specify 64 bit register in hexadecimal.
> +	 * 2 chars is for appending "0x" to the hexadecimal value and
> +	 * 10 chars is for register name.
> +	 */
> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
> +	char bf[size];

I propose using a template rather than a magic number here. Something like:
const char reg_name_tmpl[] = "10 chars  ";
const char reg_value_tmpl[] = "0x0123456789abcdef";
const int size = __sw_hweight64(attr->sample_regs_intr) +
                 sizeof reg_name_tmpl + sizeof reg_value_tmpl;

Pardon my ignorance, but is there no separation/whitespace between the name
and the value? And is there some significance to 10 characters for the
register name, or is that a magic number?
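
A possible shape for the template idea, as a standalone sketch only. It
assumes the intent is to scale the per-register template size by the
number of sampled registers (a multiplication), and adds one byte so that
an empty mask still leaves room for the terminating NUL.
popcount64_sketch() and regs_buf_size_tmpl() are hypothetical names;
popcount64_sketch() stands in for __sw_hweight64().

```c
#include <stddef.h>
#include <stdint.h>

/* Templates from the suggestion above: widest expected name and value. */
static const char reg_name_tmpl[]  = "10 chars  ";
static const char reg_value_tmpl[] = "0x0123456789abcdef";

/* Portable population count, standing in for __sw_hweight64(). */
static unsigned int popcount64_sketch(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Bytes to reserve for all sampled registers plus a terminating NUL. */
static size_t regs_buf_size_tmpl(uint64_t mask)
{
	/*
	 * sizeof counts each template's NUL byte. In this sketch those
	 * two extra bytes cover the ':' after the name and the ' ' after
	 * the value, giving 11 + 19 = 30 bytes per register.
	 */
	size_t per_reg = sizeof(reg_name_tmpl) + sizeof(reg_value_tmpl);

	return (size_t)popcount64_sketch(mask) * per_reg + 1;
}
```

The templates keep the widths visible in the source instead of in a
comment, at the cost of two extra file-scope objects.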

PC


* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-06-28 14:49 ` Paul A. Clarke
@ 2021-06-29  7:09   ` kajoljain
  2021-07-06 11:56     ` kajoljain
  0 siblings, 1 reply; 8+ messages in thread
From: kajoljain @ 2021-06-29  7:09 UTC (permalink / raw)
  To: Paul A. Clarke
  Cc: acme, maddy, atrajeev, linux-kernel, jolsa, ravi.bangoria,
	linux-perf-users, linuxppc-dev, rnsastry



On 6/28/21 8:19 PM, Paul A. Clarke wrote:
> On Mon, Jun 28, 2021 at 11:53:41AM +0530, Kajol Jain wrote:
>> Commit 48a1f565261d ("perf script python: Add more PMU fields
>> to event handler dict") added functionality to report fields like
>> weight, iregs, uregs etc via perf report.
>> That commit predefined buffer size to 512 bytes to print those fields.
>>
>> But incase of powerpc, since we added extended regs support
>> in commits:
>>
>> Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
>> Counter SPRs as part of extended regs")
>> Commit d735599a069f ("powerpc/perf: Add extended regs support for
>> power10 platform")
>>
>> Now iregs can carry more bytes of data and this predefined buffer size
>> can result to data loss in perf script output.
>>
>> Patch resolve this issue by making buffer size dynamic based on number
>> of registers needed to print. It also changed return type for function
>> "regs_map" from int to void, as the return value is not being used by
>> the caller function "set_regs_in_dict".
>>
>> Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
>> Counter SPRs as part of extended regs")
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>> ---
>>  .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
>>  1 file changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
>> index 4e4aa4c97ac5..c8c9706b4643 100644
>> --- a/tools/perf/util/scripting-engines/trace-event-python.c
>> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
> [...]
>> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
>>  			     struct evsel *evsel)
>>  {
>>  	struct perf_event_attr *attr = &evsel->core.attr;
>> -	char bf[512];
>> +
>> +	/*
>> +	 * Here value 28 is a constant size which can be used to print
>> +	 * one register value and its corresponds to:
>> +	 * 16 chars is to specify 64 bit register in hexadecimal.
>> +	 * 2 chars is for appending "0x" to the hexadecimal value and
>> +	 * 10 chars is for register name.
>> +	 */
>> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
>> +	char bf[size];
> 
> I propose using a template rather than a magic number here. Something like:
> const char reg_name_tmpl[] = "10 chars  ";
> const char reg_value_tmpl[] = "0x0123456789abcdef";
> const int size = __sw_hweight64(attr->sample_regs_intr) +
>                  sizeof reg_name_tmpl + sizeof reg_value_tmpl;
> 

Hi Paul,
   Thanks for reviewing the patch. Yes, this is a standardization
we could do by creating macros for the different fields.
The basic idea is to provide a sufficiently large buffer, based on
the number of registers present in sample_regs_intr, to accommodate
all the data.

But before optimizing the code: Arnaldo/Jiri, does this approach look good to you?

> Pardon my ignorance, but is there no separation/whitespace between the name
> and the value?

This is how the data appears in perf script output:

r0:0xc000000000112008
r1:0xc000000023b37920
r2:0xc00000000144c900
r3:0xc0000000bc566120
r4:0xc0000000c5600000
r5:0x2606c6506ca
r6:0xc000000023b378f8
r7:0xfffffd9f93a48f0e
.....

> And is there some significance to 10 characters for the
> register name, or is that a magic number?

Most register names are within 10 characters; we chose this magic number
to make sure the buffer has enough space to hold every register name
along with its colon.
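
For reference, the per-entry length can also be measured directly from
the format string used in regs_map(). This is a standalone sketch, not
perf code; reg_entry_len_sketch() is a hypothetical helper and the names
passed to it are placeholders, not necessarily real perf_reg_name()
results.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Number of characters one "name:0xvalue " entry needs, excluding the
 * terminating NUL, for a full-width 64-bit value. snprintf() with a
 * NULL buffer and size 0 only reports the length it would write.
 */
static int reg_entry_len_sketch(const char *name)
{
	return snprintf(NULL, 0, "%5s:0x%" PRIx64 " ", name, UINT64_MAX);
}
```

A short name such as "r0" is padded to five characters and needs 25
bytes per entry, while a ten-character name needs 30, because the colon
and the trailing space are counted as well. Under that assumption, 28
bytes per register is enough on average whenever most names are short.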

Thanks,
Kajol Jain
 
> 
> PC
> 


* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-06-29  7:09   ` kajoljain
@ 2021-07-06 11:56     ` kajoljain
  2021-07-06 19:15       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: kajoljain @ 2021-07-06 11:56 UTC (permalink / raw)
  To: acme, Jiri Olsa
  Cc: maddy, atrajeev, linux-kernel, ravi.bangoria, linux-perf-users,
	linuxppc-dev, rnsastry, Paul A. Clarke



On 6/29/21 12:39 PM, kajoljain wrote:
> 
> 
> On 6/28/21 8:19 PM, Paul A. Clarke wrote:
>> On Mon, Jun 28, 2021 at 11:53:41AM +0530, Kajol Jain wrote:
>>> Commit 48a1f565261d ("perf script python: Add more PMU fields
>>> to event handler dict") added functionality to report fields like
>>> weight, iregs, uregs etc via perf report.
>>> That commit predefined buffer size to 512 bytes to print those fields.
>>>
>>> But incase of powerpc, since we added extended regs support
>>> in commits:
>>>
>>> Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
>>> Counter SPRs as part of extended regs")
>>> Commit d735599a069f ("powerpc/perf: Add extended regs support for
>>> power10 platform")
>>>
>>> Now iregs can carry more bytes of data and this predefined buffer size
>>> can result to data loss in perf script output.
>>>
>>> Patch resolve this issue by making buffer size dynamic based on number
>>> of registers needed to print. It also changed return type for function
>>> "regs_map" from int to void, as the return value is not being used by
>>> the caller function "set_regs_in_dict".
>>>
>>> Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
>>> Counter SPRs as part of extended regs")
>>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>>> ---
>>>  .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
>>>  1 file changed, 12 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
>>> index 4e4aa4c97ac5..c8c9706b4643 100644
>>> --- a/tools/perf/util/scripting-engines/trace-event-python.c
>>> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
>> [...]
>>> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
>>>  			     struct evsel *evsel)
>>>  {
>>>  	struct perf_event_attr *attr = &evsel->core.attr;
>>> -	char bf[512];
>>> +
>>> +	/*
>>> +	 * Here value 28 is a constant size which can be used to print
>>> +	 * one register value and its corresponds to:
>>> +	 * 16 chars is to specify 64 bit register in hexadecimal.
>>> +	 * 2 chars is for appending "0x" to the hexadecimal value and
>>> +	 * 10 chars is for register name.
>>> +	 */
>>> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
>>> +	char bf[size];
>>
>> I propose using a template rather than a magic number here. Something like:
>> const char reg_name_tmpl[] = "10 chars  ";
>> const char reg_value_tmpl[] = "0x0123456789abcdef";
>> const int size = __sw_hweight64(attr->sample_regs_intr) +
>>                  sizeof reg_name_tmpl + sizeof reg_value_tmpl;
>>
> 
> Hi Paul,
>    Thanks for reviewing the patch. Yes these are
> some standardization we can do by creating macros for different
> fields.
> The basic idea is, we want to provide significant buffer size
> based on number of registers present in sample_regs_intr to accommodate
> all data.
> 

Hi Arnaldo/Jiri,
   Does the approach used in this patch look fine to you?

Thanks,
Kajol Jain

> But before going to optimizing code, Arnaldo/Jiri, is this approach looks good to you?
> 
>> Pardon my ignorance, but is there no separation/whitespace between the name
>> and the value?
> 
> This is how we will get data via perf script
> 
> r0:0xc000000000112008
> r1:0xc000000023b37920
> r2:0xc00000000144c900
> r3:0xc0000000bc566120
> r4:0xc0000000c5600000
> r5:0x2606c6506ca
> r6:0xc000000023b378f8
> r7:0xfffffd9f93a48f0e
> .....
> 
>  And is there some significance to 10 characters for the
>> register name, or is that a magic number?
> 
> Most of the register name are within 10 characters, basically we are giving this
> magic number to make sure we have enough space in buffer to contain all registers
> name with colon.
> 
> Thanks,
> Kajol Jain
>  
>>
>> PC
>>


* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-07-06 11:56     ` kajoljain
@ 2021-07-06 19:15       ` Arnaldo Carvalho de Melo
  2021-07-07  5:46         ` kajoljain
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-06 19:15 UTC (permalink / raw)
  To: kajoljain
  Cc: Jiri Olsa, maddy, atrajeev, linux-kernel, ravi.bangoria,
	linux-perf-users, linuxppc-dev, rnsastry, Paul A. Clarke

Em Tue, Jul 06, 2021 at 05:26:12PM +0530, kajoljain escreveu:
> 
> 
> On 6/29/21 12:39 PM, kajoljain wrote:
> > 
> > 
> > On 6/28/21 8:19 PM, Paul A. Clarke wrote:
> >> On Mon, Jun 28, 2021 at 11:53:41AM +0530, Kajol Jain wrote:
> >>> Commit 48a1f565261d ("perf script python: Add more PMU fields
> >>> to event handler dict") added functionality to report fields like
> >>> weight, iregs, uregs etc via perf report.
> >>> That commit predefined buffer size to 512 bytes to print those fields.
> >>>
> >>> But incase of powerpc, since we added extended regs support
> >>> in commits:
> >>>
> >>> Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
> >>> Counter SPRs as part of extended regs")
> >>> Commit d735599a069f ("powerpc/perf: Add extended regs support for
> >>> power10 platform")
> >>>
> >>> Now iregs can carry more bytes of data and this predefined buffer size
> >>> can result to data loss in perf script output.
> >>>
> >>> Patch resolve this issue by making buffer size dynamic based on number
> >>> of registers needed to print. It also changed return type for function
> >>> "regs_map" from int to void, as the return value is not being used by
> >>> the caller function "set_regs_in_dict".
> >>>
> >>> Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
> >>> Counter SPRs as part of extended regs")
> >>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> >>> ---
> >>>  .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
> >>>  1 file changed, 12 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
> >>> index 4e4aa4c97ac5..c8c9706b4643 100644
> >>> --- a/tools/perf/util/scripting-engines/trace-event-python.c
> >>> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
> >> [...]
> >>> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
> >>>  			     struct evsel *evsel)
> >>>  {
> >>>  	struct perf_event_attr *attr = &evsel->core.attr;
> >>> -	char bf[512];
> >>> +
> >>> +	/*
> >>> +	 * Here value 28 is a constant size which can be used to print
> >>> +	 * one register value and its corresponds to:
> >>> +	 * 16 chars is to specify 64 bit register in hexadecimal.
> >>> +	 * 2 chars is for appending "0x" to the hexadecimal value and
> >>> +	 * 10 chars is for register name.
> >>> +	 */
> >>> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
> >>> +	char bf[size];
> >>
> >> I propose using a template rather than a magic number here. Something like:
> >> const char reg_name_tmpl[] = "10 chars  ";
> >> const char reg_value_tmpl[] = "0x0123456789abcdef";
> >> const int size = __sw_hweight64(attr->sample_regs_intr) +
> >>                  sizeof reg_name_tmpl + sizeof reg_value_tmpl;
> >>
> > 
> > Hi Paul,
> >    Thanks for reviewing the patch. Yes these are
> > some standardization we can do by creating macros for different
> > fields.
> > The basic idea is, we want to provide significant buffer size
> > based on number of registers present in sample_regs_intr to accommodate
> > all data.
> > 
> 
> Hi Arnaldo/Jiri,
>    Is the approach used in this patch looks fine to you?

Yeah, and the comment you provide right above it explains it, so I think
that is enough, ok?

- Arnaldo
 
> Thanks,
> Kajol Jain
> 
> > But before going to optimizing code, Arnaldo/Jiri, is this approach looks good to you?
> > 
> >> Pardon my ignorance, but is there no separation/whitespace between the name
> >> and the value?
> > 
> > This is how we will get data via perf script
> > 
> > r0:0xc000000000112008
> > r1:0xc000000023b37920
> > r2:0xc00000000144c900
> > r3:0xc0000000bc566120
> > r4:0xc0000000c5600000
> > r5:0x2606c6506ca
> > r6:0xc000000023b378f8
> > r7:0xfffffd9f93a48f0e
> > .....
> > 
> >  And is there some significance to 10 characters for the
> >> register name, or is that a magic number?
> > 
> > Most of the register name are within 10 characters, basically we are giving this
> > magic number to make sure we have enough space in buffer to contain all registers
> > name with colon.
> > 
> > Thanks,
> > Kajol Jain
> >  
> >>
> >> PC
> >>

-- 

- Arnaldo


* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-07-06 19:15       ` Arnaldo Carvalho de Melo
@ 2021-07-07  5:46         ` kajoljain
  2021-07-07 14:04           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: kajoljain @ 2021-07-07  5:46 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, maddy, atrajeev, linux-kernel, ravi.bangoria,
	linux-perf-users, linuxppc-dev, rnsastry, Paul A. Clarke



On 7/7/21 12:45 AM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jul 06, 2021 at 05:26:12PM +0530, kajoljain escreveu:
>>
>>
>> On 6/29/21 12:39 PM, kajoljain wrote:
>>>
>>>
>>> On 6/28/21 8:19 PM, Paul A. Clarke wrote:
>>>> On Mon, Jun 28, 2021 at 11:53:41AM +0530, Kajol Jain wrote:
>>>>> Commit 48a1f565261d ("perf script python: Add more PMU fields
>>>>> to event handler dict") added functionality to report fields like
>>>>> weight, iregs, uregs etc via perf report.
>>>>> That commit predefined buffer size to 512 bytes to print those fields.
>>>>>
>>>>> But incase of powerpc, since we added extended regs support
>>>>> in commits:
>>>>>
>>>>> Commit 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
>>>>> Counter SPRs as part of extended regs")
>>>>> Commit d735599a069f ("powerpc/perf: Add extended regs support for
>>>>> power10 platform")
>>>>>
>>>>> Now iregs can carry more bytes of data and this predefined buffer size
>>>>> can result to data loss in perf script output.
>>>>>
>>>>> Patch resolve this issue by making buffer size dynamic based on number
>>>>> of registers needed to print. It also changed return type for function
>>>>> "regs_map" from int to void, as the return value is not being used by
>>>>> the caller function "set_regs_in_dict".
>>>>>
>>>>> Fixes: 068aeea3773a ("perf powerpc: Support exposing Performance Monitor
>>>>> Counter SPRs as part of extended regs")
>>>>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>>>>> ---
>>>>>  .../util/scripting-engines/trace-event-python.c | 17 ++++++++++++-----
>>>>>  1 file changed, 12 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
>>>>> index 4e4aa4c97ac5..c8c9706b4643 100644
>>>>> --- a/tools/perf/util/scripting-engines/trace-event-python.c
>>>>> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
>>>> [...]
>>>>> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
>>>>>  			     struct evsel *evsel)
>>>>>  {
>>>>>  	struct perf_event_attr *attr = &evsel->core.attr;
>>>>> -	char bf[512];
>>>>> +
>>>>> +	/*
>>>>> +	 * Here value 28 is a constant size which can be used to print
>>>>> +	 * one register value and its corresponds to:
>>>>> +	 * 16 chars is to specify 64 bit register in hexadecimal.
>>>>> +	 * 2 chars is for appending "0x" to the hexadecimal value and
>>>>> +	 * 10 chars is for register name.
>>>>> +	 */
>>>>> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
>>>>> +	char bf[size];
>>>>
>>>> I propose using a template rather than a magic number here. Something like:
>>>> const char reg_name_tmpl[] = "10 chars  ";
>>>> const char reg_value_tmpl[] = "0x0123456789abcdef";
>>>> const int size = __sw_hweight64(attr->sample_regs_intr) +
>>>>                  sizeof reg_name_tmpl + sizeof reg_value_tmpl;
>>>>
>>>
>>> Hi Paul,
>>>    Thanks for reviewing the patch. Yes these are
>>> some standardization we can do by creating macros for different
>>> fields.
>>> The basic idea is, we want to provide significant buffer size
>>> based on number of registers present in sample_regs_intr to accommodate
>>> all data.
>>>
>>
>> Hi Arnaldo/Jiri,
>>    Is the approach used in this patch looks fine to you?
> 
> Yeah, and the comment you provide right above it explains it, so I think
> that is enough, ok?
> 

Hi Arnaldo,
    Thanks for reviewing it. As you said, the added comment already explains
why we use 28 as the size constant, so should we skip the macros part?
Could you please pull this patch?

Thanks,
Kajol Jain

> - Arnaldo
>  
>> Thanks,
>> Kajol Jain
>>
>>> But before going to optimizing code, Arnaldo/Jiri, is this approach looks good to you?
>>>
>>>> Pardon my ignorance, but is there no separation/whitespace between the name
>>>> and the value?
>>>
>>> This is how we will get data via perf script
>>>
>>> r0:0xc000000000112008
>>> r1:0xc000000023b37920
>>> r2:0xc00000000144c900
>>> r3:0xc0000000bc566120
>>> r4:0xc0000000c5600000
>>> r5:0x2606c6506ca
>>> r6:0xc000000023b378f8
>>> r7:0xfffffd9f93a48f0e
>>> .....
>>>
>>>  And is there some significance to 10 characters for the
>>>> register name, or is that a magic number?
>>>
>>> Most of the register name are within 10 characters, basically we are giving this
>>> magic number to make sure we have enough space in buffer to contain all registers
>>> name with colon.
>>>
>>> Thanks,
>>> Kajol Jain
>>>  
>>>>
>>>> PC
>>>>
> 


* Re: [PATCH] perf script python: Fix buffer size to report iregs in perf script
  2021-07-07  5:46         ` kajoljain
@ 2021-07-07 14:04           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-07 14:04 UTC (permalink / raw)
  To: kajoljain
  Cc: Jiri Olsa, maddy, atrajeev, linux-kernel, ravi.bangoria,
	linux-perf-users, linuxppc-dev, rnsastry, Paul A. Clarke

Em Wed, Jul 07, 2021 at 11:16:20AM +0530, kajoljain escreveu:
> On 7/7/21 12:45 AM, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Jul 06, 2021 at 05:26:12PM +0530, kajoljain escreveu:
> >> On 6/29/21 12:39 PM, kajoljain wrote:
> >>> On 6/28/21 8:19 PM, Paul A. Clarke wrote:
> >>>> On Mon, Jun 28, 2021 at 11:53:41AM +0530, Kajol Jain wrote:
> >>>>> @@ -713,7 +711,16 @@ static void set_regs_in_dict(PyObject *dict,
> >>>>>  			     struct evsel *evsel)
> >>>>>  {
> >>>>>  	struct perf_event_attr *attr = &evsel->core.attr;
> >>>>> -	char bf[512];
> >>>>> +
> >>>>> +	/*
> >>>>> +	 * Here value 28 is a constant size which can be used to print
> >>>>> +	 * one register value and its corresponds to:
> >>>>> +	 * 16 chars is to specify 64 bit register in hexadecimal.
> >>>>> +	 * 2 chars is for appending "0x" to the hexadecimal value and
> >>>>> +	 * 10 chars is for register name.
> >>>>> +	 */
> >>>>> +	int size = __sw_hweight64(attr->sample_regs_intr) * 28;
> >>>>> +	char bf[size];

> >>>> I propose using a template rather than a magic number here. Something like:
> >>>> const char reg_name_tmpl[] = "10 chars  ";
> >>>> const char reg_value_tmpl[] = "0x0123456789abcdef";
> >>>> const int size = __sw_hweight64(attr->sample_regs_intr) +
> >>>>                  sizeof reg_name_tmpl + sizeof reg_value_tmpl;

> >>>    Thanks for reviewing the patch. Yes these are
> >>> some standardization we can do by creating macros for different
> >>> fields.
> >>> The basic idea is, we want to provide significant buffer size
> >>> based on number of registers present in sample_regs_intr to accommodate
> >>> all data.

> >>    Is the approach used in this patch looks fine to you?

> > Yeah, and the comment you provide right above it explains it, so I think
> > that is enough, ok?
 
>     Thanks for reviewing it. As you said added comment already explains
> why we are taking size constant as 28, should we skip adding macros part?
> Can you pull this patch.

Sure.

- Arnaldo

