From mboxrd@z Thu Jan 1 00:00:00 1970 From: Grigori Goronzy Subject: Re: [PATCH] drm/radeon: Adding UVD handle basis fps estimation v2 Date: Fri, 15 Aug 2014 17:32:51 +0200 Message-ID: <53EE2823.7030101@chown.ath.cx> References: <1407411210-1939-1-git-send-email-deathsimple@vodafone.de> <53E39C0A.6090109@vodafone.de> <53E88811.9030302@vodafone.de> <53E9E5A8.8010004@vodafone.de> <53EDC943.3020306@amd.com> <53EE151B.6050105@amd.com> <53EE2530.2030203@chown.ath.cx> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0720569443==" Return-path: Received: from pygmy.kinoho.net (pygmy.kinoho.net [134.0.27.24]) by gabe.freedesktop.org (Postfix) with ESMTP id 9CB386E570 for ; Fri, 15 Aug 2014 08:32:55 -0700 (PDT) In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Alex Deucher Cc: =?UTF-8?B?Q2hyaXN0aWFuIEvDtm5pZw==?= , Maling list - DRI developers , "Peter.Fruehberger@gmail.com" List-Id: dri-devel@lists.freedesktop.org This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --===============0720569443== Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="okWh0su6W1OwQtGCfKMgeT4FTIuOtAMbI" This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --okWh0su6W1OwQtGCfKMgeT4FTIuOtAMbI Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 15.08.2014 17:26, Alex Deucher wrote: > On Fri, Aug 15, 2014 at 11:20 AM, Grigori Goronzy w= rote: >> On 15.08.2014 16:11, Christian K=C3=B6nig wrote: >>> Hi Marco, >>> >>> the problem with an CS ioctl flag is that we sometimes don't know how= >>> much SCLK/MCLK boost is needed, for example when we do post processin= g >>> in the player using OpenGL and UVD decoding with VDPAU. In this case >>> VDPAU don't has the slightest idea how high SCLK/MCLK must be and so >>> can't give that info to the kernel either. >>> >> >> Maybe it's an acceptable workaround to simply disable dynamic UVD stat= e >> selection in case the UVD states only have a single power level. That >> will avoid the performance issues on affected systems, while still >> allowing dynamic UVD states on systems that have a saner DPM table >> setup. I think it is mosly older systems that suffer from this. >> >=20 > That is exactly what we do now. >=20 Is it? In 3.17-wip, dynamic UVD state selection (according to active streams) is still completely disabled. It will always use the generic UVD state. In fact wasn't it reverted again because of the performance issues on some systems? Grigori > Alex >=20 >=20 >> Best regards >> Grigori >> >>> Regards, >>> Christian. >>> >>> Am 15.08.2014 um 15:21 schrieb Marco Benatto: >>>> Hey all, >>>> >>>> I also had a talk with Alex yesterday about post-processing issues >>>> when using dynamic UVD profiles and a chamge on CS ioctl >>>> including a flag to let user mode driver tell to the kernel which >>>> performance requirement it wants for post processing. A commom >>>> point for both discussion is to stablish the default values for thes= e >>>> profiles, but probably this ioctl change would be more impacting/com= plex >>>> to implement than a sysfs entry. >>>> >>>> If a sysfs entry is anough for now I can handle the code to create i= t >>>> and, with your help, the code to setup the UVD profile requested >>>> through it. >>>> >>>> Is there any suggestion? >>>> >>>> Thanks all for your help, >>>> >>>> >>>> On Fri, Aug 15, 2014 at 5:48 AM, Christian K=C3=B6nig >>>> > wrote: >>>> >>>> Hi guys, >>>> >>>> to make a long story short every time I watch a movie my laptop >>>> start to heat up because we always select the standard UVD power= >>>> profile without actually measuring if that is necessary. >>>> >>>> Marco came up with a patch that seems to reliable measure the fp= s >>>> send down to the kernel and so together with knowing the frame >>>> size of the video should allow us to select the right UVD power >>>> profile. >>>> >>>> The problem is that Alex (unnoticed by me) completely disabled >>>> selecting the UVD profiles because of some issues with advanced >>>> post processing discussed on IRC. The problem seems to be that t= he >>>> lower UVD profiles have a to low SCLK/MCLK to handle the 3D load= >>>> that comes with scaling, deinterlacing etc... >>>> >>>> I unfortunately don't have time for it, cause this only affects >>>> the hardware generations R600-SI and not the newest one CIK. So >>>> could you guys stick together and come up with a solution? >>>> Something like a sysfs entry that let's us select the minimum UV= D >>>> power level allowed? >>>> >>>> I think Marco is happy to come up with a patch, we just need to >>>> know what's really needed and what should be the default values.= >>>> I'm happy to review everything that comes out of it, just don't >>>> have time to do it myself. >>>> >>>> Happy discussion and thanks in advance, >>>> Christian. >>>> >>>> Am 12.08.2014 um 15:05 schrieb Alex Deucher: >>>> >>>> On Tue, Aug 12, 2014 at 6:00 AM, Christian K=C3=B6nig >>>> > w= rote: >>>> >>>> Am 11.08.2014 um 16:52 schrieb Alex Deucher: >>>> >>>> On Mon, Aug 11, 2014 at 5:08 AM, Christian K=C3=B6ni= g >>>> >>> > wrote: >>>> >>>> Am 07.08.2014 um 21:43 schrieb Alex Deucher: >>>> >>>> On Thu, Aug 7, 2014 at 11:32 AM, Christian K= =C3=B6nig >>>> >>> > wrote: >>>> >>>> Am 07.08.2014 um 16:32 schrieb Alex Deuc= her: >>>> >>>> On Thu, Aug 7, 2014 at 7:33 AM, >>>> Christian K=C3=B6nig >>>> >>> > >>>> wrote: >>>> >>>> From: Marco A Benatto >>>> >>> > >>>> >>>> Adding a Frames Per Second >>>> estimation logic on UVD handles >>>> when it has being used. This >>>> estimation is per handle basis >>>> and will help on DPM profile >>>> calculation. >>>> >>>> v2 (chk): fix timestamp type, mo= ve >>>> functions around and >>>> cleanup code a bit= =2E >>>> >>>> Will this really help much? I thoug= ht >>>> the problem was mainly due to >>>> sclk and mclk for post processing. >>>> >>>> >>>> It should at least handle the UVD side f= or >>>> upclocking when you get a >>>> lot >>>> of >>>> streams / fps. And at on my NI the patch= >>>> seems to do exactly that. >>>> >>>> Switching sclk and mclk for post >>>> processing is a different task, and I >>>> actually have no idea what to do with th= em. >>>> >>>> At this point we always choose the plain UVD= >>>> state anyway so this >>>> patch would only take effect if we re-enable= d >>>> the dynamic UVD state >>>> selection. >>>> >>>> >>>> Hui? I thought we already re-enabled the dynamic= >>>> UVD state selection, but >>>> double checking this I found it disabled again. >>>> >>>> What was the problem with that? Looks like I >>>> somehow missed the >>>> discussion >>>> around it. >>>> >>>> We did, but after doing so a number of people >>>> complained about a >>>> regression on IRC because when apps like xmbc enable= d >>>> post processing, >>>> performance went down. >>>> >>>> >>>> That's strange, from my experience the different UVD >>>> performance states only >>>> affect UVDs dclk/vclk, not sclk/mclk. I need to get the >>>> DPM dumps to >>>> confirms this. >>>> >>>> The sclks and mclks are usually different as well, especiall= y >>>> on APUs. >>>> I can send you some examples. >>>> >>>> You not off hand remember who complained on IRC? Finding= >>>> something in the >>>> IRC logs is like searching for a needle in a haystack. >>>> >>>> I don't remember off hand. I think zgreg was involved in so= me >>>> of the >>>> discussions. >>>> >>>> Alex >>>> >>>> Thanks, >>>> Christian. >>>> >>>> >>>> Alex >>>> >>>> >>>> Christian. >>>> >>>> >>>> For the post processing, we probably need a >>>> hint we can >>>> pass to the driver in the CS ioctl to denote= >>>> what state we need. >>>> Although if we did that, this could would >>>> largely be moot. That said, >>>> newer asics support dynamic UVD clocks so we= >>>> really only need >>>> something like that for older asics and I >>>> guess VCE. >>>> >>>> Alex >>>> >>>> Christian. >>>> >>>> >>>> Alex >>>> >>>> Signed-off-by: Marco A Benatto >>>> >>> > >>>> Signed-off-by: Christian K=C3=B6= nig >>>> >>> > >>>> --- >>>> >>>> drivers/gpu/drm/radeon/radeon.h= >>>> | 10 ++++++ >>>> >>>> drivers/gpu/drm/radeon/radeon_u= vd.c >>>> | 64 >>>> ++++++++++++++++++++++++++++++++= +---- >>>> 2 files changed, 68 >>>> insertions(+), 6 deletions(-) >>>> >>>> diff --git >>>> a/drivers/gpu/drm/radeon/radeon.= h >>>> b/drivers/gpu/drm/radeon/radeon.= h >>>> index 9e1732e..e92f6cb 100644 >>>> --- a/drivers/gpu/drm/radeon/rad= eon.h >>>> +++ b/drivers/gpu/drm/radeon/rad= eon.h >>>> @@ -1617,6 +1617,15 @@ int >>>> radeon_pm_get_type_index(struct >>>> radeon_device >>>> *rdev, >>>> #define >>>> RADEON_UVD_STACK_SIZE (1024*102= 4) >>>> #define RADEON_UVD_HEAP_SIZ= E >>>> (1024*1024) >>>> >>>> +#define RADEON_UVD_FPS_EVENTS_M= AX 8 >>>> +#define RADEON_UVD_DEFAULT_FPS = 60 >>>> + >>>> +struct radeon_uvd_fps { >>>> + uint64_t timestam= p; >>>> + uint8_t event_in= dex; >>>> + uint8_t >>>> events[RADEON_UVD_FPS_EVENTS_MA= X]; >>>> +}; >>>> + >>>> struct radeon_uvd { >>>> struct radeon_bo >>>> *vcpu_bo; >>>> void >>>> *cpu_addr; >>>> @@ -1626,6 +1635,7 @@ struct >>>> radeon_uvd { >>>> struct drm_file >>>> *filp[RADEON_MAX_UVD_HANDLES]; >>>> unsigned >>>> img_size[RADEON_MAX_UVD_HANDLE= S]; >>>> struct delayed_work >>>> idle_work; >>>> + struct radeon_uvd_fps >>>> fps_info[RADEON_MAX_UVD_HANDLES= ]; >>>> }; >>>> >>>> int radeon_uvd_init(struct >>>> radeon_device *rdev); >>>> diff --git >>>> a/drivers/gpu/drm/radeon/radeon_= uvd.c >>>> b/drivers/gpu/drm/radeon/radeon_= uvd.c >>>> index 6bf55ec..ef5667a 100644 >>>> --- >>>> a/drivers/gpu/drm/radeon/radeon_= uvd.c >>>> +++ >>>> b/drivers/gpu/drm/radeon/radeon_= uvd.c >>>> @@ -237,6 +237,51 @@ void >>>> radeon_uvd_force_into_uvd_segmen= t(struct >>>> radeon_bo *rbo) >>>> rbo->placement.lpfn = =3D >>>> (256 * 1024 * 1024) >> PAGE_SHIF= T; >>>> } >>>> >>>> +static void >>>> radeon_uvd_fps_clear_events(stru= ct >>>> radeon_device *rdev, >>>> int >>>> idx) >>>> +{ >>>> + struct radeon_uvd_fps *f= ps >>>> =3D &rdev->uvd.fps_info[idx]; >>>> + unsigned i; >>>> + >>>> + fps->timestamp =3D jiffi= es_64; >>>> + fps->event_index =3D 0; >>>> + for (i =3D 0; i < >>>> RADEON_UVD_FPS_EVENTS_MAX; i++) >>>> + fps->events[i] =3D= 0; >>>> +} >>>> + >>>> +static void >>>> radeon_uvd_fps_note_event(struct= >>>> radeon_device *rdev, >>>> int >>>> idx) >>>> +{ >>>> + struct radeon_uvd_fps *f= ps >>>> =3D &rdev->uvd.fps_info[idx]; >>>> + uint64_t timestamp =3D >>>> jiffies_64; >>>> + unsigned rate =3D 0; >>>> + >>>> + uint8_t index =3D >>>> fps->event_index++; >>>> + fps->event_index %=3D >>>> RADEON_UVD_FPS_EVENTS_MAX; >>>> + >>>> + rate =3D div64_u64(HZ, >>>> max(timestamp - fps->timestamp, >>>> 1ULL)); >>>> + >>>> + fps->timestamp =3D times= tamp; >>>> + fps->events[index] =3D >>>> min(rate, 120u); >>>> +} >>>> + >>>> +static unsigned >>>> radeon_uvd_estimate_fps(struct >>>> radeon_device *rdev, >>>> int >>>> idx) >>>> +{ >>>> + struct radeon_uvd_fps *f= ps >>>> =3D &rdev->uvd.fps_info[idx]; >>>> + unsigned i, valid =3D 0,= >>>> count =3D 0; >>>> + >>>> + for (i =3D 0; i < >>>> RADEON_UVD_FPS_EVENTS_MAX; i++) = { >>>> + /* We should >>>> ignore zero values */ >>>> + if (fps->events[= i] >>>> !=3D 0) { >>>> + count +=3D= >>>> fps->events[i]; >>>> + valid++;= >>>> + } >>>> + } >>>> + >>>> + if (valid > 0) >>>> + return count / v= alid; >>>> + else >>>> + return >>>> RADEON_UVD_DEFAULT_FPS; >>>> +} >>>> + >>>> void >>>> radeon_uvd_free_handles(struct >>>> radeon_device *rdev, struct >>>> drm_file *filp) >>>> { >>>> int i, r; >>>> @@ -419,8 +464,10 @@ static int >>>> radeon_uvd_cs_msg(struct >>>> radeon_cs_parser >>>> *p, struct radeon_bo *bo, >>>> >>>> /* create or decode,= >>>> validate the handle */ >>>> for (i =3D 0; i < >>>> RADEON_MAX_UVD_HANDLES; ++i) { >>>> - if >>>> (atomic_read(&p->rdev->uvd.handl= es[i]) >>>> =3D=3D handle) >>>> + if >>>> (atomic_read(&p->rdev->uvd.handl= es[i]) >>>> =3D=3D handle) >>>> { >>>> + >>>> radeon_uvd_fps_note_event(p->rd= ev, i); >>>> retu= rn 0; >>>> + } >>>> } >>>> >>>> /* handle not found >>>> try to alloc a new one */ >>>> @@ -428,6 +475,7 @@ static int >>>> radeon_uvd_cs_msg(struct >>>> radeon_cs_parser >>>> *p, struct radeon_bo *bo, >>>> if >>>> (!atomic_cmpxchg(&p->rdev->uvd.h= andles[i], >>>> 0, >>>> handle)) { >>>> >>>> p->rdev->uvd.filp[i] =3D p->filp= ; >>>> >>>> p->rdev->uvd.img_size[i] =3D img= _size; >>>> + >>>> radeon_uvd_fps_clear_events(p->= rdev, >>>> i); >>>> retu= rn 0; >>>> } >>>> } >>>> @@ -763,7 +811,7 @@ int >>>> radeon_uvd_get_destroy_msg(struc= t >>>> radeon_device >>>> *rdev, int ring, >>>> static void >>>> radeon_uvd_count_handles(struct >>>> radeon_device *rdev, >>>> >>>> unsigned *sd, unsigned *h= d) >>>> { >>>> - unsigned i; >>>> + unsigned i, fps_rate =3D= 0; >>>> >>>> *sd =3D 0; >>>> *hd =3D 0; >>>> @@ -772,10 +820,13 @@ static voi= d >>>> radeon_uvd_count_handles(struct >>>> radeon_device *rdev, >>>> if >>>> (!atomic_read(&rdev->uvd.handles= [i])) >>>> cont= inue; >>>> >>>> - if >>>> (rdev->uvd.img_size[i] >=3D 720*= 576) >>>> - ++(*hd);= >>>> - else >>>> - ++(*sd);= >>>> + fps_rate =3D >>>> radeon_uvd_estimate_fps(rdev, i)= ; >>>> + >>>> + if >>>> (rdev->uvd.img_size[i] >=3D 720*= 576) { >>>> + (*hd) +=3D= >>>> fps_rate > 30 ? 1 : 2; >>>> + } else { >>>> + (*sd) +=3D= >>>> fps_rate > 30 ? 1 : 2; >>>> + } >>>> } >>>> } >>>> >>>> @@ -805,6 +856,7 @@ void >>>> radeon_uvd_note_usage(struct >>>> radeon_device >>>> *rdev) >>>> set_clocks &=3D >>>> schedule_delayed_work(&rdev->uvd= =2Eidle_work, >>>> >>>> msecs_to_jiffies(UVD_IDLE_TIMEOU= T_MS)); >>>> >>>> + >>>> if >>>> ((rdev->pm.pm_method =3D=3D >>>> PM_METHOD_DPM) && >>>> rdev->pm.dpm_enabled) { >>>> unsigned hd = =3D >>>> 0, sd =3D 0; >>>> >>>> radeon_uvd_count_handles(rdev, >>>> &sd, &hd); >>>> -- >>>> 1.9.1 >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> Marco Antonio Benatto >>>> Linux user ID:#506236 >>> >> >> --okWh0su6W1OwQtGCfKMgeT4FTIuOtAMbI Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJT7igjAAoJEOpmYE8Sf8H2Sh4QAJ1JPL5XS5E+uhD/hCZ8NIXx CDXteCUO2YpnTRmsvJfpIcSnqHI/HuHV+K1HnQEODaAvbmKZC04D2WsQ6Rfzlmbg hprcz3zfJSxw9oiCpex6RfifEpvutxyITSQ6DKQdziApAO+H1m/B+X7Yw3BdOHoO vEWtED9U8CmQ+rRk1f8zJcCGDmq2ZSjcYjNp3mqxCld/BnkQyaFGYk4r6PMleKON ut69SXnkOlgMKum+c43EzBVISEVc9EzPmbJiA5eCmE9ZCxV+y0yenQjzGSLJCh2n jLuubwe2clNjCNsHglUwV5gGVsBn1eZ6Vcn0ktrmRo7wu1B1mPt0Ba2soZtJiWJ1 UctUOnDaU1VgrsDq/yeQYQ4Pc2AoCpaM9NWmXXr7W/oG8lS765oLZxA/P8N7jSIj yZeJl98b6uX5S7nlGg4FT9u+8+GX9SrEV216fudGvc588whtdv8M85LS2k7KFPY9 hOujat5CjefArgS4/Tbt74HX4Wd4LuwZbEBMSmk5HP62YpTIQZQZF+Z90VXevhp4 V2Y0QINK9r4moRpVH7BHLMEK0BLa3srCGVINxvh72WNYfOkKmoup8PfjZgQj3jjk SFA42EskDSVREKCfK0iWVElhTOhlghHGXMQ3uEE7vffUuswn+E2+bIsqOAsb9ZKo CCoOc6gJbYLJFKDkpdRB =ZXwU -----END PGP SIGNATURE----- --okWh0su6W1OwQtGCfKMgeT4FTIuOtAMbI-- --===============0720569443== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel --===============0720569443==--