Date: Wed, 09 Nov 2022 20:20:30 -0800
From: "Dixit, Ashutosh"
To: "Belgaumkar, Vinay"
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t] tests/perf_pmu: Compare against requested freq in frequency subtest

On Wed, 09 Nov 2022 17:37:18 -0800, Belgaumkar, Vinay wrote:
>
> On 11/8/2022 5:53 PM, Dixit, Ashutosh wrote:
> > On Tue, 08 Nov 2022 13:02:33 -0800, Belgaumkar, Vinay wrote:
> >>
> >> On 11/8/2022 1:24 AM, Tvrtko Ursulin wrote:
> >>> On 08/11/2022 01:31, Dixit, Ashutosh wrote:
> >>>> On Mon, 07 Nov 2022 16:57:24 -0800, Belgaumkar, Vinay wrote:
> >>>>> On 11/7/2022 4:22 PM, Dixit, Ashutosh wrote:
> >>>>>> On Mon, 07 Nov 2022 16:18:31 -0800, Dixit, Ashutosh wrote:
> >>>>>> Hi Vinay,
> >>>>>>
> >>>>>> A question for you below.
> >>>>>>
> >>>>>>> So I submitted this patch to repro the issue and to print out the
> >>>>>>> requested freq from sysfs:
> >>>>>>>
> >>>>>>> https://patchwork.freedesktop.org/series/110630/
> >>>>>>>
> >>>>>>> And we can see the output here:
> >>>>>>>
> >>>>>>> https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8061/bat-dg2-11/igt@perf_pmu@frequency.html
> >>>>>>>
> >>>>>>> ```
> >>>>>>> IGT-Version: 1.26-g1bef4d081 (x86_64) (Linux: 6.1.0-rc4-CI_DRM_12352-gc55ac6a74bd1+ x86_64)
> >>>>>>> Starting subtest: frequency
> >>>>>>> Frequency: min=300, max=2050, boost=2050 MHz
> >>>>>>> Min frequency: requested 349.7, actual 349.7
> >>>>>>> Max frequency: requested 2048.0, actual 2048.0
> >>>>>>> Sysfs requested: min 350, max 2050
> >>>>>>> Stack trace:
> >>>>>>>     #0 ../../../usr/src/igt-gpu-tools/lib/igt_core.c:1908 __igt_fail_assert()
> >>>>>>>     #1 ../../../usr/src/igt-gpu-tools/tests/i915/perf_pmu.c:1656 __igt_unique____real_main2147()
> >>>>>>>     #2 ../../../usr/src/igt-gpu-tools/tests/i915/perf_pmu.c:2147 main()
> >>>>>>>     #3 [__libc_start_main+0xf3]
> >>>>>>>     #4 [_start+0x2e]
> >>>>>>> Subtest frequency: FAIL (2.212s)
> >>>>>>> ```
> >>>>>>>
> >>>>>>> So we clearly see the requested freq from sysfs is indeed 350 MHz so
> >>>>>>> SLPC/PCODE is not honoring the set min == max == boost freq (and PMU is
> >>>>>>> measuring what sysfs is showing). In general PCODE is the final arbiter in
> >>>>>>> such cases and we do occasionally see instances where set freq limits are
> >>>>>>> not honored.
> >>>>>>>
> >>>>>>> I would say if igt@perf_pmu@frequency is testing freq measured by PMU then
> >>>>>>> the patch below is correct. Whether SLPC/PCODE is honoring the set freq
> >>>>>>> limits should be tested in a SLPC test (which we also have).
> >>>>>> igt@perf_pmu@frequency sets 'min == max == boost == 300 MHz' but we still
> >>>>>> see the requested freq to be 350 MHz. Do we have a SLPC test covering this
> >>>>>> scenario or should we add one? This is failing on one of the DG2's.
> >>>>> Does adding a delay help (around 20 ms for the h2g to go through
> >>>>> typically)?
> >>>> There is no delay but the test calls gem_quiescent_gpu() after setting 'min
> >>>> == max == boost == 300 MHz' and then launches a spinner. We are checking
> >>>> the requested freq 500 ms after the spinner is started (so plenty of time
> >>>> for the h2g) and still the requested freq is 350 MHz.
> >>>>
> >>>>> Also, is there a workload running when we change the min=max=boost to
> >>>>> 300?
> >>>> No, the workload is started after setting the freq and calling
> >>>> gem_quiescent_gpu().
> >>> Implication here seems to be that gem_quiescent_gpu() would need to cover
> >>> H2G communication - does it?
> >> No, more like there needs to be a WL running in order for SLPC to actively
> >> make changes to the requested frequency.
> > Well here we set the freq's first and later when the WL runs FW should
> > select an appropriate requested freq.
> >
> >>>>> We already check these things in our SLPC selftests.
> >>> Is it then expected to respect the 300MHz max in this case? Or if it
> >>> can't, should it be reflected in the sysfs readback?
> >> It should respect the 300 Mhz. The only question in my mind is regarding
> >> efficient frequency. Can we print out what the RP1 is here?
> > RP1 is also 300. We can see it here:
> >
> > https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8071/bat-dg2-11/igt@perf_pmu@frequency.html
> >
> > with this patch:
> >
> > https://patchwork.freedesktop.org/series/110630/#rev3
> >
> > In this case, min == max == 300, but requested freq is 400 (previously it
> > was 350).
>
> Ok, this might be happening due to the following -
>
> 1. We have efficient frequency enabled now, so GuC will use that instead of
> min even on light loads etc.
>
> 2. It is also known that this efficient frequency is "dynamic", especially
> on DG2.
>
> 3. When we set min freq to a value lower than efficient, we will turn off
> efficient frequency usage. But, here min = efficient = 300, so it will
> remain ON.
>
> This is why we see 350 or even 400 sometimes as we are not bound to the min
> (or even a single freq level) when efficient freq usage is allowed.

Hi Vinay, thanks for the explanation, makes sense.

> The solution for this case may be to turn off efficient frequency
> forcibly for this test and see if that helps. We do that in the selftests
> to ensure proper frequency bounds.

I don't believe we can do this from userland, can we?

Thanks.
--
Ashutosh
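
The efficient-frequency rule in points 1-3 above, and the reason the subtest fails when it compares against the configured min, can be put into a small model. This is only a sketch in Python, not GuC/SLPC or IGT code. The function names and the 5% tolerance are assumptions made up for illustration; only the 300, 350 and 349.7 MHz values come from the CI log quoted earlier in the thread.

```python
def efficient_freq_stays_enabled(min_mhz: float, efficient_mhz: float) -> bool:
    """Model of point 3: efficient-frequency usage is turned off only when
    the min is set *below* the efficient frequency (RP1). Hypothetical helper."""
    return not (min_mhz < efficient_mhz)


def within_tolerance(measured_mhz: float, reference_mhz: float,
                     tolerance_pct: float) -> bool:
    """True if the measured value is within tolerance_pct of the reference.
    The tolerance used below is an assumed value, not the one IGT uses."""
    return abs(measured_mhz - reference_mhz) <= reference_mhz * tolerance_pct / 100.0


if __name__ == "__main__":
    configured_min = 300.0   # min == max == boost set by the test
    rp1 = 300.0              # RP1 (efficient) reported on bat-dg2-11
    pmu_requested = 349.7    # "Min frequency: requested 349.7" from the log
    sysfs_requested = 350.0  # "Sysfs requested: min 350" from the log
    tolerance = 5.0          # assumed tolerance in percent

    # min == efficient, so efficient-frequency usage remains ON and the
    # requested frequency is not pinned to the configured min.
    print("efficient freq enabled:",
          efficient_freq_stays_enabled(configured_min, rp1))

    # Comparing the PMU value against the configured min fails ...
    print("PMU vs configured min:",
          within_tolerance(pmu_requested, configured_min, tolerance))
    # ... while comparing it against the sysfs requested freq passes.
    print("PMU vs sysfs requested:",
          within_tolerance(pmu_requested, sysfs_requested, tolerance))
```

In this model, setting min one step below RP1 would switch efficient-frequency usage off, which matches the behaviour described in point 3. Whether that is something the test can rely on is exactly the open question above.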