linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: James Clark <james.clark@arm.com>
To: Tanmay Jagdale <tanmay@marvell.com>,
	"linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>,
	"leo.yan@linaro.org" <leo.yan@linaro.org>,
	"mike.leach@linaro.org" <mike.leach@linaro.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>,
	George Cherian <gcherian@marvell.com>,
	Linu Cherian <lcherian@marvell.com>,
	Bharat Bhushan <bbhushan2@marvell.com>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	John Garry <john.g.garry@oracle.com>,
	Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	"coresight@lists.linaro.org" <coresight@lists.linaro.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 0/7] perf cs_etm: Basic support for virtual/kernel timestamps
Date: Fri, 13 Jan 2023 11:12:46 +0000	[thread overview]
Message-ID: <70a94ad3-fe5d-b013-7f6b-dd83d2332e0e@arm.com> (raw)
In-Reply-To: <PH0PR18MB50178B52ED081276D2404493D6FD9@PH0PR18MB5017.namprd18.prod.outlook.com>



On 12/01/2023 15:33, Tanmay Jagdale wrote:
> Hi James
> 
>>>>> base-commit: 09e6f9f98370be9a9f8978139e0eb1be87d1125f
>>>> I have tested this patch set on our platform and able to see updated
>>>> timestamp values in perf script's output.
>>>>
>>>> $ perf record -e cs_etm/cycacc,@tmc_etr0/k -C 9 taskset -c 9 sleep 2
>>>> $ perf script --itrace=i1ns --ns -Fcomm,tid,pid,time,cpu,event,ip,sym,addr,symoff,flags,callindent
>>>>
>>>> At certain points noticed that hardware emits same timestamp packets
>>>> but with updated cycle count (CC) values. A small snippet of the log:
>>>>
>>>> Idx:100; ID:12; I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0xFFFF8000086A761C ~[0x761C]
>>>> Idx:103; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x2f373e37e02; CC=0x3d
>>>> Idx:107; ID:12; I_ATOM_F2 : Atom format 2.; EN
>>>> Idx:108; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x2f373e37e02; CC=0x3f
>>>> Idx:112; ID:12; I_ATOM_F1 : Atom format 1.; N
>>>> Idx:113; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x2f373e37e02; CC=0x45
>>>> Idx:116; ID:12; I_ATOM_F1 : Atom format 1.; E
>>>> Idx:117; ID:12; I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0xFFFF8000086B52D4 ~[0x152D4]
>>>>
>>>> Since the source of timestamp is the Generic Timer block and the CPUs
>>>> run at higher frequencies, this behaviour could be possible on high
>>>> performance ARM cores.
>>>>
>>>> Having consecutive timestamps with same value is resulting in a
>>>> slightly jumbled order (in nanosecs) in perf script's time column.
>>>> A snippet corresponding to the Coresight trace data mentioned above:
>>>> ...
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   return                               0 ffff8000086a761c coresight_timeout+0xc8
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   return                               0 ffff8000086a7620 coresight_timeout+0xcc
>>>> perf   965/965   [001]  3182.286629046:            instructions:k:   jmp                                  0 ffff8000086a75c8 coresight_timeout+0x74
>>>> perf   965/965   [001]  3182.286629046:            instructions:k:   jmp                                  0 ffff8000086a75cc coresight_timeout+0x78
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75d0 coresight_timeout+0x7c
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75d4 coresight_timeout+0x80
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75d8 coresight_timeout+0x84
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75dc coresight_timeout+0x88
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75e0 coresight_timeout+0x8c
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75e4 coresight_timeout+0x90
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75e8 coresight_timeout+0x94
>>>> perf   965/965   [001]  3182.286629044:            instructions:k:   jcc                                  0 ffff8000086a75ec coresight_timeout+0x98
>>>>
>>>> Perf's do_soft_timestamp() logic in cs_etm_decoder.c file is incrementing
>>>> the HW timestamp based on instruction count. Since the next timestamp
>>>> also has the same value, it could be leading to this jumbled order.
>>>>
>>>> We would like to know if this has been observed on other platforms ?
>>>> And what could be a solution in SW for this ?
>>>
>>> Nice catch. If I'm understanding this correctly it looks like the issue
>>> is a combination of the cycle count in the packet being ignored by Perf,
>>> and the instruction count being reset to 0 when a new timestamp is received.
>>>
>>> It looks like it can be fixed by some combination of combining the cycle
>>> count and instruction count and maybe not resetting instruction count if
>>> the newly received timestamp is the same as the last one. I will look
>>> into this.
>>>
>>> We haven't noticed it on any other platforms, but we haven't been
>>> looking too closely at the timestamps until now. Perhaps I can add a
>>> test that checks if the trace in a known function goes in the correct
>>> time order.
>>>
>>
>> I'm thinking of something like the following patch to fix the ordering.
>> It doesn't use the cycle count, but I'm not sure if that would be
>> worthwhile in the end, considering that it would have sub nanosecond
>> resolution so wouldn't affect the Perf timestamps:
> Thanks for coming up with a quick solution for this issue !
> 
>>
>>
>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> index 31fa3b45134a..08a028e3e87a 100644
>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> @@ -112,6 +112,19 @@ int cs_etm_decoder__get_packet(struct cs_etm_packet_queue *packet_queue,
>>  	return 1;
>>  }
>>
>> +static u64 cs_etm_decoder__instr_count_to_ns(u64 instr_count)
>> +{
>> +	/*
>> +	 * Assume a maximum of 0.1ns elapsed per instruction. This would be the
>> +	 * case with a theoretical 10GHz core executing 1 instruction per cycle.
>> +	 * Used to estimate the sample time for synthesized instructions because
>> +	 * Coresight only emits a timestamp for a range of instructions rather
>> +	 * than per instruction.
>> +	 */
>> +	const int INSTR_PER_NS = 10;
>> +
>> +	return instr_count / INSTR_PER_NS;
>> +}
>>  static int cs_etm_decoder__gen_etmv3_config(struct cs_etm_trace_params *params,
>>  					    ocsd_etmv3_cfg *config)
>>  {
>> @@ -267,7 +280,7 @@ cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
>>  	packet_queue->cs_timestamp = packet_queue->next_cs_timestamp;
>>
>>  	/* Estimate the timestamp for the next range packet */
>> -	packet_queue->next_cs_timestamp += packet_queue->instr_count;
>> +	packet_queue->next_cs_timestamp += cs_etm_decoder__instr_count_to_ns(packet_queue->instr_count);
>>  	packet_queue->instr_count = 0;
>>
>>  	/* Tell the front end which traceid_queue needs attention */
>> @@ -295,11 +308,17 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq,
>>  	 * hence asking the decoder to keep decoding rather than stopping.
>>  	 */
>>  	if (packet_queue->cs_timestamp) {
>> -		packet_queue->next_cs_timestamp = elem->timestamp;
>> +		/*
>> +		 * Don't allow next_cs_timestamp to be less than the last one estimated by
>> +		 * cs_etm_decoder__do_soft_timestamp() otherwise new instructions would
>> +		 * appear to go back in time. In theory this should never happen, but if
>> +		 * it did, then next_cs_timestamp should eventually catch up to real time
>> +		 * unless every single range was predicted to be too long for some reason.
>> +		 */
>> +		packet_queue->next_cs_timestamp = max(elem->timestamp, packet_queue->next_cs_timestamp);
>>  		return OCSD_RESP_CONT;
>>  	}
>>
>> -
>>  	if (!elem->timestamp) {
>>  		/*
>>  		 * Zero timestamps can be seen due to misconfiguration or hardware bugs.
>> @@ -312,7 +331,7 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq,
>>  					". Decoding may be improved by prepending 'Z' to your current --itrace
>> arguments.\n",
>>  					indx);
>>
>> -	} else if (packet_queue->instr_count > elem->timestamp) {
>> +	} else if (cs_etm_decoder__instr_count_to_ns(packet_queue->instr_count) > elem->timestamp) {
>>  		/*
>>  		 * Sanity check that the elem->timestamp - packet_queue->instr_count would not
>>  		 * result in an underflow. Warn and clamp at 0 if it would.
>> @@ -327,7 +346,8 @@ cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq,
>>  		 * which instructions started by subtracting the number of instructions
>>  		 * executed to the timestamp.
>>  		 */
>> -		packet_queue->cs_timestamp = elem->timestamp - packet_queue->instr_count;
>> +		packet_queue->cs_timestamp = elem->timestamp -
>> +			cs_etm_decoder__instr_count_to_ns(packet_queue->instr_count);
>>  	}
>>  	packet_queue->next_cs_timestamp = elem->timestamp;
>>  	packet_queue->instr_count = 0;
> I have tested this diff along with the patchset and found that
> timestamps are generated/estimated in an increasing order.
> 
> However, found few corner cases where they weren't in order.
> I'm currently gathering more information on these corner cases.
> 
> Used the following steps to find inconsistencies:
> $ perf record -e cs_etm/@tmc_etr0/k -C 4 taskset -c 4 sleep 1
> $ perf script --itrace=i1ns --ns -Fcomm,tid,pid,time,cpu,event,ip,sym,addr,symoff,flags,callindent > itrace
> $ sed 's/://g' itrace | awk -F '.' ' { print $2 } ' | awk '{ if ($1 < prev) { print "line:" NR " " $0 } {prev=$1}}'
> 
> Setting INSTR_PER_NS to higher values (14,16,20 etc) results in
> lower inconsistencies but it would defeat the purpose of
> estimating values in software since we could be returning 0
> from __instr_count_to_ns() in many cases.
> 
> Please share your thoughts.

I believe this could be because I was adding nanosecond values
to the raw coresight counter before it was converted which was
a mistake. I have made another change to move the conversion
as early as possible to avoid mistakes like this.

Increasing the INSTR_PER_NS shouldn't be required because of
the max() call, in fact INSTR_PER_NS isn't necessarily
needed at all, I just thought it would make it more accurate.

I'll test the new version on the file that you sent before
pasting it here. If you have another recording that has more
of the edge cases then please also upload that one too.

James

  reply	other threads:[~2023-01-13 11:19 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-03 16:20 [PATCH v3 0/7] perf cs_etm: Basic support for virtual/kernel timestamps James Clark
2023-01-03 16:20 ` [PATCH v3 1/7] perf: Remove duplication around EVENT_SOURCE_DEVICE_PATH James Clark
2023-01-03 16:20 ` [PATCH v3 2/7] perf: Use perf_pmu__open_file() and perf_pmu__scan_file() James Clark
2023-01-11  7:09   ` Leo Yan
2023-01-11 15:14     ` James Clark
2023-01-03 16:20 ` [PATCH v3 3/7] perf: Remove remaining duplication of bus/event_source/devices/ James Clark
2023-01-03 16:20 ` [PATCH v3 4/7] perf pmu: Add function to check if a pmu file exists James Clark
2023-01-03 16:20 ` [PATCH v3 5/7] perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters James Clark
2023-01-03 16:20 ` [PATCH v3 6/7] perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE James Clark
2023-01-03 16:20 ` [PATCH v3 7/7] perf cs_etm: Set the time field in the synthetic samples James Clark
2023-01-10 16:42 ` [PATCH v3 0/7] perf cs_etm: Basic support for virtual/kernel timestamps Tanmay Jagdale
2023-01-11 10:56   ` James Clark
2023-01-11 17:47     ` James Clark
2023-01-12 15:33       ` Tanmay Jagdale
2023-01-13 11:12         ` James Clark [this message]
2023-01-19 15:53           ` James Clark
2023-01-20 17:30             ` Tanmay Jagdale
2023-01-20 17:47               ` Arnaldo Carvalho de Melo
2023-01-21  5:13                 ` Tanmay Jagdale
2023-01-11 11:35   ` James Clark

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=70a94ad3-fe5d-b013-7f6b-dd83d2332e0e@arm.com \
    --to=james.clark@arm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=bbhushan2@marvell.com \
    --cc=coresight@lists.linaro.org \
    --cc=gcherian@marvell.com \
    --cc=john.g.garry@oracle.com \
    --cc=jolsa@kernel.org \
    --cc=lcherian@marvell.com \
    --cc=leo.yan@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.poirier@linaro.org \
    --cc=mike.leach@linaro.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sgoutham@marvell.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tanmay@marvell.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).