All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t 1/6] test/perf: Drop caches when closing perf stream
Date: Thu, 5 Mar 2020 01:29:50 +0200	[thread overview]
Message-ID: <8e2012f7-4e18-be2d-0226-458056fcd81b@intel.com> (raw)
In-Reply-To: <20200304230438.GA58678@orsosgc001.amr.corp.intel.com>

On 05/03/2020 01:04, Umesh Nerlige Ramappa wrote:
> On Thu, Mar 05, 2020 at 12:05:55AM +0200, Lionel Landwerlin wrote:
>> On 04/03/2020 19:51, Umesh Nerlige Ramappa wrote:
>>> On Wed, Mar 04, 2020 at 10:45:55AM +0200, Lionel Landwerlin wrote:
>>>> On 04/03/2020 00:57, Umesh Nerlige Ramappa wrote:
>>>>> On Tue, Mar 03, 2020 at 02:38:08PM -0800, Umesh Nerlige Ramappa 
>>>>> wrote:
>>>>>> Running ./build/tests/perf will run all the perf subtests in a 
>>>>>> sequence.
>>>>>> When running tests in a sequence, subsequent tests may not run 
>>>>>> with a
>>>>>> clean slate. For resources that are lazily released, drop caches in
>>>>>> __perf_close.
>>>>>
>>>>> Hi Lionel, Chris,
>>>>>
>>>>> I notice an issue on TGL when running the entire suite of perf 
>>>>> tests.  In my setup, the polling test was failing with invalid 
>>>>> reports being seen in the beginning of the OA buffer. This issue 
>>>>> is seen more prominently with the newly added subtests which call 
>>>>> perf_open and perf_close a couple of times (say 
>>>>> blocking-with-interrupt).
>>>>>
>>>>> What I see in some runs is that the second test would result in a 
>>>>> bunch of unlanded reports in the beginning of the OA buffer. 
>>>>> Assuming that we are already waiting for the NOA config with a 
>>>>> noa_wait bo, I tried to look into this further.
>>>>>
>>>>> free_oa_buffer is called to free the oa_buffer bo and this work is 
>>>>> deferred by the driver. If a test is called before this free 
>>>>> completes, we see the issue. Just to test out this theory, if I 
>>>>> comment out the free_oa_buffer entirely, I see that the tests pass 
>>>>> without any issues since new gtt memory is being allocated each time.
>>>>>
>>>>> I guess the deferred free and the new allocation of the OA buffer 
>>>>> for subsequent test has something missing. Maybe TLBs not being 
>>>>> dropped? I imagine the OA unit might write valid reports somewhere 
>>>>> based on what it sees in the TLBs and cpu is looking for them 
>>>>> elsewhere (until the free completes). Just a theory though. Let me 
>>>>> know what you think.
>>>>>
>>>>> For now igt_drop_caches_set(DROP_FREED) is what is helping and 
>>>>> hence this patch.
>>>>
>>>>
>>>> Hey Umesh,
>>>>
>>>>
>>>> I guess this could be fixed by this commit :
>>>>
>>>>
>>>> commit 4b4e973d5eb89244b67d3223b60f752d0479f253
>>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Date:   Mon Mar 2 08:57:57 2020 +0000
>>>>
>>>>     drm/i915/perf: Reintroduce wait on OA configuration completion
>>>>
>>>> If you can give this commit a try or rebase on drm-tip it would be 
>>>> great to confirm.
>>>
>>> I thought this commit was ensuring that the noa_wait is executed 
>>> completely before we enable the OA buffer captures. That still does 
>>> not explain why the issue goes away for me when I comment out 
>>> free_oa_buffer.
>>
>>
>> If noa_wait is not waited upon, either from CPU or GPU, then we 
>> enable OA while the configuration is not completely applied.
>>
>> Hence the invalid data at the beginning of the buffer.
>>
>>
>> Are you saying this commit didn't help?
>>
> No, I haven't tried it yet. I lost my reservation on the TGL machine 
> :(, so waiting for another one.
>
> What I meant is that - not freeing the OA buffer (only for 
> experimenting) results in a new gtt offset everytime we allocate the 
> OA buffer. When I do this, I don't see any invalid OA reports. This is 
> without the commit you mention above. If waiting for the NOA config to 
> complete were indeed the issue, I should have seen it even with my 
> experiment. Right?


I see, thanks that's a useful experiment.

Only thing I can think of would be HEAD/TAIL register writes that didn't 
land before the OA unit was turned on.

I'll dig into the code.


Is this only on Gen12?


Thanks,


-Lionel


>
> Thanks,
> Umesh
>
>>
>> Thanks,
>>
>>
>> -Lionel
>>
>>
>>>
>>> Thanks,
>>> Umesh
>>>
>>>>
>>>> Otherwise we might need more digging to figure what's going on.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> -Lionel
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Umesh
>>>>>
>>>>>>
>>>>>> Signed-off-by: Umesh Nerlige Ramappa 
>>>>>> <umesh.nerlige.ramappa@intel.com>
>>>>>> ---
>>>>>> tests/perf.c | 7 ++++++-
>>>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/tests/perf.c b/tests/perf.c
>>>>>> index 5e818030..189c6aa1 100644
>>>>>> --- a/tests/perf.c
>>>>>> +++ b/tests/perf.c
>>>>>> @@ -244,6 +244,12 @@ __perf_close(int fd)
>>>>>>         close(pm_fd);
>>>>>>         pm_fd = -1;
>>>>>>     }
>>>>>> +
>>>>>> +    /* When running tests in a sequence, subsequent tests may 
>>>>>> not run with a
>>>>>> +     * clean slate. For resources that are lazily released, 
>>>>>> cleanup here.
>>>>>> +     */
>>>>>> +    if (drm_fd >= 0 && !getgid() && !getuid())
>>>>>> +        gem_quiescent_gpu(drm_fd);
>>>>>> }
>>>>>>
>>>>>> static int
>>>>>> @@ -3993,7 +3999,6 @@ test_rc6_disable(void)
>>>>>>     igt_assert_eq(n_events_end - n_events_start, 0);
>>>>>>
>>>>>>     __perf_close(stream_fd);
>>>>>> -    gem_quiescent_gpu(drm_fd);
>>>>>>
>>>>>>     n_events_start = rc6_residency_ms();
>>>>>>     nanosleep(&(struct timespec){ .tv_sec = 1, .tv_nsec = 0 }, 
>>>>>> NULL);
>>>>>> -- 
>>>>>> 2.20.1
>>>>>>
>>>>>> _______________________________________________
>>>>>> igt-dev mailing list
>>>>>> igt-dev@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/igt-dev
>>>>
>>>>
>>

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

  reply	other threads:[~2020-03-04 23:29 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-03 22:38 [igt-dev] [PATCH 0/6] Add igt tests for perf OA interrupts Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 1/6] test/perf: Drop caches when closing perf stream Umesh Nerlige Ramappa
2020-03-03 22:57   ` Umesh Nerlige Ramappa
2020-03-04  8:45     ` Lionel Landwerlin
2020-03-04 17:51       ` Umesh Nerlige Ramappa
2020-03-04 22:05         ` Lionel Landwerlin
2020-03-04 23:04           ` Umesh Nerlige Ramappa
2020-03-04 23:29             ` Lionel Landwerlin [this message]
2020-03-04 23:51               ` Umesh Nerlige Ramappa
2020-03-05 19:36                 ` Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 2/6] include/drm-uapi: Update i915_drm.h for perf OA APIs Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 3/6] tests/perf: new tests for parameterized OA buffer polling Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 4/6] tests/perf: new tests for OA interrupt Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 5/6] tests/perf: add flush data ioctl test Umesh Nerlige Ramappa
2020-03-03 22:38 ` [igt-dev] [PATCH i-g-t 6/6] tools: Enable interrupt support in i915 perf recorder Umesh Nerlige Ramappa
2020-03-03 23:56 ` [igt-dev] ✓ Fi.CI.BAT: success for Add igt tests for perf OA interrupts Patchwork
2020-03-04 14:44 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8e2012f7-4e18-be2d-0226-458056fcd81b@intel.com \
    --to=lionel.g.landwerlin@intel.com \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.