From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "peterz@infradead.org" <peterz@infradead.org>,
	"Intel-gfx@lists.freedesktop.org"
	<Intel-gfx@lists.freedesktop.org>
Subject: Re: [RFC v3 00/11] i915 PMU and engine busy stats
Date: Wed, 13 Sep 2017 10:01:14 +0100
Message-ID: <234289d7-99e5-f78c-a34f-e28682e30576@linux.intel.com>
In-Reply-To: <1505224750.6296.13.camel@intel.com>


On 12/09/2017 23:01, Rogozhkin, Dmitry V wrote:
> On Tue, 2017-09-12 at 15:54 +0100, Tvrtko Ursulin wrote:
>> On 12/09/2017 03:03, Rogozhkin, Dmitry V wrote:
>>> Hi,
>>>
>>> I just tried the v3 series. perf-stat works fine. Of the IGT tests I wrote for i915 PMU
>>> (https://patchwork.freedesktop.org/series/29313/) all pass (assuming pmu.enabled will be exposed
>>> in debugfs) except the cpu_online subtest. And that one is pretty interesting - see details below.
>>>
>>> Ok, be prepared for a long mail :)...
>>>
>>> So, the cpu_online subtest loads the RCS0 engine to 100% and starts putting CPUs offline one by one, starting
>>> from CPU0 (don't forget to have CONFIG_BOOTPARAM_HOTPLUG_CPU0=y in .config). The test expectation is
>>> that the i915 PMU will report almost 100% (with a 2% tolerance) busyness of RCS0. The test reads the metric
>>> just twice: before starting the load on RCS0 and right after it finishes. This fails as follows:
>>>
>>> Executed on rcs0 for 32004262184us
>>>     i915/rcs0-busy/: 2225723999us
>>> (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
>>> (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
>>> Stack trace:
>>>     #0 [__igt_fail_assert+0xf1]
>>>     #1 [__real_main773+0xff1]
>>>     #2 [main+0x35]
>>>     #3 [__libc_start_main+0xf5]
>>>     #4 [_start+0x29]
>>>     #5 [<unknown>+0x29]
>>> Subtest cpu_online failed.
>>> **** DEBUG ****
>>> (perf_pmu:6325) DEBUG: Test requirement passed: is_hotplug_cpu0()
>>> (perf_pmu:6325) INFO: perf_init: enabled 1 metrics from 1 requested
>>> (perf_pmu:6325) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
>>> (perf_pmu:6325) INFO: Executed on rcs0 for 32004262184us
>>> (perf_pmu:6325) INFO:   i915/rcs0-busy/: 2225723999us
>>> (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
>>> (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
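>>>
>>> (In sketch form the subtest is basically the following; the names from
>>> the assertion are verbatim, the load/hotplug helpers are hypothetical
>>> stand-ins for what the test does:)
>>>
>>> 	struct timespec start, now;
>>>
>>> 	clock_gettime(CLOCK_MONOTONIC, &start);
>>> 	spawn_busy_load(fd, rcs0);        /* hypothetical: 100% RCS0 load */
>>> 	for (cpu = 0; cpu < num_cpus; cpu++)
>>> 		cycle_cpu_hotplug(cpu);   /* hypothetical: offline, then online */
>>> 	stop_busy_load();                 /* hypothetical */
>>> 	clock_gettime(CLOCK_MONOTONIC, &now);
>>> 	igt_assert(perf_elapsed(&pm.metrics[0]) >
>>> 		   (1 - USAGE_TOLERANCE) * elapsed_ns(&start, &now));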
>>>
>>> Now, it looks like PMU context migration by itself works. For example, if you comment out
>>> "perf_pmu_migrate_context(&pmu->base, cpu, target)" you get:
>>>
>>>       Executed on rcs0 for 32004434918us
>>>         i915/rcs0-busy/:     76623707us
>>>
>>> Compare with previous:
>>>       Executed on rcs0 for 32004262184us
>>>         i915/rcs0-busy/:    2225723999us
>>>
>>> This test passed with the previous set of patches, i.e. Tvrtko's v2 series + my patches.
>>>
>>> So, it seems we are losing counter values somehow. I saw in the patches that this place was indeed modified - you
>>> have added a subtraction of the initial counter value:
>>> static void i915_pmu_event_read(struct perf_event *event)
>>> {
>>> 	local64_set(&event->count,
>>> 		    __i915_pmu_event_read(event) -
>>> 		    local64_read(&event->hw.prev_count));
>>> }
>>>
>>> But it looks like the problem is that with the PMU context migration we get a sequence of start/stop (or maybe
>>> add/del) events which eventually call our i915_pmu_enable/disable. Here is the dmesg log with an obvious
>>> printk added to both hooks - roughly the following (a sketch reconstructed from the log format):
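>>>
>>> 	printk("%s: event->hw.prev_count=%llu\n", __func__,
>>> 	       (unsigned long long)local64_read(&event->hw.prev_count));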
>>>
>>> [  153.971096] [IGT] perf_pmu: starting subtest cpu_online
>>> [  153.971151] i915_pmu_enable: event->hw.prev_count=0
>>> [  154.036015] i915_pmu_disable: event->hw.prev_count=0
>>> [  154.048027] i915_pmu_enable: event->hw.prev_count=0
>>> [  154.049343] smpboot: CPU 0 is now offline
>>> [  155.059028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  155.155078] smpboot: CPU 1 is now offline
>>> [  156.161026] x86: Booting SMP configuration:
>>> [  156.161027] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  156.197065] IRQ 122: no longer affine to CPU2
>>> [  156.198087] smpboot: CPU 2 is now offline
>>> [  157.208028] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  157.263093] smpboot: CPU 3 is now offline
>>> [  158.273026] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  158.310026] i915_pmu_disable: event->hw.prev_count=76648307
>>> [  158.319020] i915_pmu_enable: event->hw.prev_count=76648307
>>> [  158.319098] IRQ 124: no longer affine to CPU4
>>> [  158.320368] smpboot: CPU 4 is now offline
>>> [  159.326030] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  159.365306] smpboot: CPU 5 is now offline
>>> [  160.371030] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  160.421077] IRQ 125: no longer affine to CPU6
>>> [  160.422093] smpboot: CPU 6 is now offline
>>> [  161.429030] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  161.467091] smpboot: CPU 7 is now offline
>>> [  162.473027] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  162.527019] i915_pmu_disable: event->hw.prev_count=4347548222
>>> [  162.546017] i915_pmu_enable: event->hw.prev_count=4347548222
>>> [  162.547317] smpboot: CPU 0 is now offline
>>> [  163.553028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  163.621089] smpboot: CPU 1 is now offline
>>> [  164.627028] x86: Booting SMP configuration:
>>> [  164.627029] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  164.669308] smpboot: CPU 2 is now offline
>>> [  165.679025] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  165.717089] smpboot: CPU 3 is now offline
>>> [  166.723025] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  166.775016] i915_pmu_disable: event->hw.prev_count=8574197312
>>> [  166.787016] i915_pmu_enable: event->hw.prev_count=8574197312
>>> [  166.788309] smpboot: CPU 4 is now offline
>>> [  167.794025] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  167.837114] smpboot: CPU 5 is now offline
>>> [  168.847025] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  168.889312] smpboot: CPU 6 is now offline
>>> [  169.899030] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  169.944104] smpboot: CPU 7 is now offline
>>> [  170.954032] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  171.000016] i915_pmu_disable: event->hw.prev_count=12815138319
>>> [  171.008017] i915_pmu_enable: event->hw.prev_count=12815138319
>>> [  171.009304] smpboot: CPU 0 is now offline
>>> [  172.017028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  172.096104] smpboot: CPU 1 is now offline
>>> [  173.106025] x86: Booting SMP configuration:
>>> [  173.106026] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  173.147078] smpboot: CPU 2 is now offline
>>> [  174.153025] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  174.192093] smpboot: CPU 3 is now offline
>>> [  175.198028] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  175.229042] i915_pmu_disable: event->hw.prev_count=17035889079
>>> [  175.242030] i915_pmu_enable: event->hw.prev_count=17035889079
>>> [  175.242163] IRQ fixup: irq 120 move in progress, old vector 131
>>> [  175.242165] IRQ fixup: irq 121 move in progress, old vector 147
>>> [  175.242171] IRQ 124: no longer affine to CPU4
>>> [  175.243435] smpboot: CPU 4 is now offline
>>> [  176.248040] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  176.285328] smpboot: CPU 5 is now offline
>>> [  177.296039] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  177.325067] IRQ 125: no longer affine to CPU6
>>> [  177.326087] smpboot: CPU 6 is now offline
>>> [  178.335036] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  178.377063] IRQ 122: no longer affine to CPU7
>>> [  178.378086] smpboot: CPU 7 is now offline
>>> [  179.388028] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  179.454030] i915_pmu_disable: event->hw.prev_count=21269856967
>>> [  179.470026] i915_pmu_enable: event->hw.prev_count=21269856967
>>> [  179.471110] smpboot: CPU 0 is now offline
>>> [  180.481028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  180.551075] smpboot: CPU 1 is now offline
>>> [  181.558029] x86: Booting SMP configuration:
>>> [  181.558030] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  181.595096] smpboot: CPU 2 is now offline
>>> [  182.605029] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  182.657084] smpboot: CPU 3 is now offline
>>> [  183.668030] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  183.709017] i915_pmu_disable: event->hw.prev_count=25497358644
>>> [  183.727016] i915_pmu_enable: event->hw.prev_count=25497358644
>>> [  183.728305] smpboot: CPU 4 is now offline
>>> [  184.734027] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  184.767090] smpboot: CPU 5 is now offline
>>> [  185.777036] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  185.823096] smpboot: CPU 6 is now offline
>>> [  186.829051] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  186.856350] smpboot: CPU 7 is now offline
>>> [  187.862051] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  187.871216] [IGT] perf_pmu: exiting, ret=99
>>> [  187.889199] Console: switching to colour frame buffer device 240x67
>>> [  187.889583] i915_pmu_disable: event->hw.prev_count=29754080941
>>>
>>> And the result which I got in userspace for this run was:
>>>       Executed on rcs0 for 32003587981us
>>>         i915/rcs0-busy/: 2247436461us
>>>
>>> After that I decided to roll back the change to the counting which I mentioned before, i.e.:
>>> static void i915_pmu_event_read(struct perf_event *event)
>>> {
>>> 	local64_set(&event->count,
>>> 		    __i915_pmu_event_read(event) /*-
>>> 		    local64_read(&event->hw.prev_count)*/);
>>> }
>>>
>>> And the test PASSED :):
>>>       Executed on rcs0 for 32002282603us
>>>         i915/rcs0-busy/: 31998855052us
>>>       Subtest cpu_online: SUCCESS (33.950s)
>>>
>>> At this point I need to go home :). Maybe you will have time to look into this issue? If not, I will continue
>>> tomorrow.
>>
>> I forgot to run this test since I did not have the kernel feature
>> enabled. But yes, now that I tried it, it is failing.
>>
>> What is happening is that the event's del (and hence counter stop) gets
>> called when the CPU goes offline, followed by add->start, and the
>> initial counter value then gets reloaded.
>>
>> I don't see a way for i915 to distinguish between userspace
>> starting/stopping the event, and perf core doing the same in the CPU
>> migration process. Perhaps Peter could help here?
>>
>> I am storing the initial counter value when the counter is started so
>> that I can report its relative value, in other words the change from
>> event start to stop. Perhaps that is not correct and it should be left
>> to userspace to handle?
>>
>> Otherwise we have counters like energy use, and even engine busyness,
>> which can already be at some large value before PMU monitoring
>> starts, which makes things like "perf stat -a -I <command>", or even
>> just normal "perf stat <command>", attribute all previous usage (from
>> before the command profiling started) to the reported stats.
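>>
>> In sketch form (the exact hook body is paraphrased from the
>> description above), the v3 scheme is:
>>
>> static void i915_pmu_event_start(struct perf_event *event, int flags)
>> {
>> 	/* latch the absolute counter so read() can report a relative value */
>> 	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
>> }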
>>
> 
> Actually that's pretty easy to fix. The following patch does it:
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index bce4951..277098d 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -390,10 +390,18 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
>   
>   static void i915_pmu_event_read(struct perf_event *event)
>   {
> +       struct hw_perf_event *hwc = &event->hw;
> +       u64 prev_raw_count, new_raw_count;
>   
> -       local64_set(&event->count,
> -                   __i915_pmu_event_read(event) -
> -                   local64_read(&event->hw.prev_count));
> +again:
> +       prev_raw_count = local64_read(&hwc->prev_count);
> +       new_raw_count = __i915_pmu_event_read(event);
> +
> +       if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> +                           new_raw_count) != prev_raw_count)
> +               goto again;
> +
> +       local64_add(new_raw_count - prev_raw_count, &event->count);
>   }
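> 
> (The cmpxchg loop is the standard perf driver read pattern: if another
> reader races in and advances prev_count first, the cmpxchg fails and we
> retry, so each delta gets folded into event->count exactly once.)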
> 
> 
> I believe you need to squash it into the main i915 PMU enabling patch.
> 
> So, the idea is:
> 1. event->count contains the current counter value; it is ever growing
> for the particular event _instance_, i.e. even if the event is
> stopped/started or added/deleted, for as long as the event exists (is
> not destroyed) it holds an ever growing value.
> 2. Since it is ever growing, in read() we always add a _delta_ to the
> event count, where the starting point is when the event got enabled
> (started).
> 3. On PMU context migration to another CPU we get a call to
> del(PERF_EF_UPDATE). Thus, here is the trick:
> 3.1. The first thing we do is _update_ the event count, adding to it
> everything gathered on the previous CPU.
> 3.2. The second thing is to update event->hw.prev_count to the new
> value; next time we will add the delta counting from it (sketched below).
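> 
> In sketch form, assuming the usual perf driver structure for the
> stop/del hooks (only the PERF_EF_UPDATE handling matters here):
> 
> static void i915_pmu_event_stop(struct perf_event *event, int flags)
> {
> 	if (flags & PERF_EF_UPDATE)
> 		/* folds the delta into event->count and advances
> 		 * hw.prev_count, exactly as in the read() above */
> 		i915_pmu_event_read(event);
> 	event->hw.state = PERF_HES_STOPPED;
> }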
> 
> With this, all the tests I have for the PMU passed. The code above is
> taken from arch/x86/events/intel/cstate.c as is. So, that's really how
> it should be done.

Makes sense - a simple solution which I overlooked. :I

I've sent a v6 with this fix and the updated v4 changelog. I initially 
sent it against the wrong message in the thread though; hopefully 
patchwork is not confused beyond repair now.

> And... I should say that was the last technical problem I saw with the
> i915 PMU implementation. With it, the puzzle looks complete to me. :)

So time to poke Chris I guess.

Regards,

Tvrtko

