All of lore.kernel.org
 help / color / mirror / Atom feed
* Looking for a start point for fixing a bug
@ 2014-07-22 12:39 Адонай Элохим
  2014-07-22 15:01 ` Rob Clark
  2014-07-22 16:05 ` Alex Deucher
  0 siblings, 2 replies; 9+ messages in thread
From: Адонай Элохим @ 2014-07-22 12:39 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1042 bytes --]

Hello all!

I have some spare time and knowledge in C to try to fix some bugs I am
seeing on my machine.
So I've checked out and compiled all git trees that I may need and now I'm
beginning to read articles.

And this is the point from where I don't know where to go. I want to fix
particular bug #79806 [1].
For me there are many places where this bug can hide - mesa? dri? radeon
kernel module? and I just don't know whether should I start reading
articles about mesa hacking or about dri architecture or about kernel
module development.

Now I think the best thing for me is to start looking through radeon kernel
module code (I've got ingenious idea that power management resides there)
and read more about its architecture. Is this right? I mean, I just want to
find out, is this a right place to start looking at for this bug?

P.S. Sorry for my English in case it's not good, I'm learning it now

P.P.S. And thanks for your hard work!

-------------------------------------------
[1] https://bugs.freedesktop.org/show_bug.cgi?id=79806

[-- Attachment #1.2: Type: text/html, Size: 1324 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-07-22 12:39 Looking for a start point for fixing a bug Адонай Элохим
@ 2014-07-22 15:01 ` Rob Clark
  2014-07-22 16:05 ` Alex Deucher
  1 sibling, 0 replies; 9+ messages in thread
From: Rob Clark @ 2014-07-22 15:01 UTC (permalink / raw)
  To: Адонай
	Элохим
  Cc: dri-devel

On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com> wrote:
> Hello all!
>
> I have some spare time and knowledge in C to try to fix some bugs I am
> seeing on my machine.
> So I've checked out and compiled all git trees that I may need and now I'm
> beginning to read articles.
>
> And this is the point from where I don't know where to go. I want to fix
> particular bug #79806 [1].
> For me there are many places where this bug can hide - mesa? dri? radeon
> kernel module? and I just don't know whether should I start reading articles
> about mesa hacking or about dri architecture or about kernel module
> development.
>
> Now I think the best thing for me is to start looking through radeon kernel
> module code (I've got ingenious idea that power management resides there)
> and read more about its architecture. Is this right? I mean, I just want to
> find out, is this a right place to start looking at for this bug?

Kernel module seems like the place to start..  not sure if radeon
exposes current gpu/memory/etc clocks via sysfs somewhere?  (There is
some hwmon stuff but I've not looked too close)

Somehow, ending up back at bootup memory and gpu clocks sounds like a
fairly plausible explanation for slowdown after resume.

BR,
-R

> P.S. Sorry for my English in case it's not good, I'm learning it now
>
> P.P.S. And thanks for your hard work!
>
> -------------------------------------------
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-07-22 12:39 Looking for a start point for fixing a bug Адонай Элохим
  2014-07-22 15:01 ` Rob Clark
@ 2014-07-22 16:05 ` Alex Deucher
  2014-08-08 12:48   ` Адонай Элохим
  2014-08-10 21:19   ` Адонай Элохим
  1 sibling, 2 replies; 9+ messages in thread
From: Alex Deucher @ 2014-07-22 16:05 UTC (permalink / raw)
  To: Адонай
	Элохим
  Cc: Maling list - DRI developers

On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com> wrote:
> Hello all!
>
> I have some spare time and knowledge in C to try to fix some bugs I am
> seeing on my machine.
> So I've checked out and compiled all git trees that I may need and now I'm
> beginning to read articles.
>
> And this is the point from where I don't know where to go. I want to fix
> particular bug #79806 [1].
> For me there are many places where this bug can hide - mesa? dri? radeon
> kernel module? and I just don't know whether should I start reading articles
> about mesa hacking or about dri architecture or about kernel module
> development.
>
> Now I think the best thing for me is to start looking through radeon kernel
> module code (I've got ingenious idea that power management resides there)
> and read more about its architecture. Is this right? I mean, I just want to
> find out, is this a right place to start looking at for this bug?

The power management is handled in the kernel driver.  See radeon_pm.c
and the relevant *_dpm.c files depending on what asic you have.

Alex

>
> P.S. Sorry for my English in case it's not good, I'm learning it now
>
> P.P.S. And thanks for your hard work!
>
> -------------------------------------------
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-07-22 16:05 ` Alex Deucher
@ 2014-08-08 12:48   ` Адонай Элохим
  2014-08-10 21:19   ` Адонай Элохим
  1 sibling, 0 replies; 9+ messages in thread
From: Адонай Элохим @ 2014-08-08 12:48 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Maling list - DRI developers

I started looking through the code week or so ago.
No much progress though but could you explain meaning of this to me:

if (running == 0) {
    if (running) {
        blackout = RREG32(MC_SHARED_BLACKOUT_CNTL);
        WREG32(MC_SHARED_BLACKOUT_CNTL, blackout | 1);
    }
...

It's in si.c, line 1564 in current master branch. Is it some
concurrent trick or so?

Thanks,
Oleg

2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher@gmail.com>:
> On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com> wrote:
>> Hello all!
>>
>> I have some spare time and knowledge in C to try to fix some bugs I am
>> seeing on my machine.
>> So I've checked out and compiled all git trees that I may need and now I'm
>> beginning to read articles.
>>
>> And this is the point from where I don't know where to go. I want to fix
>> particular bug #79806 [1].
>> For me there are many places where this bug can hide - mesa? dri? radeon
>> kernel module? and I just don't know whether should I start reading articles
>> about mesa hacking or about dri architecture or about kernel module
>> development.
>>
>> Now I think the best thing for me is to start looking through radeon kernel
>> module code (I've got ingenious idea that power management resides there)
>> and read more about its architecture. Is this right? I mean, I just want to
>> find out, is this a right place to start looking at for this bug?
>
> The power management is handled in the kernel driver.  See radeon_pm.c
> and the relevant *_dpm.c files depending on what asic you have.
>
> Alex
>
>>
>> P.S. Sorry for my English in case it's not good, I'm learning it now
>>
>> P.P.S. And thanks for your hard work!
>>
>> -------------------------------------------
>> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-07-22 16:05 ` Alex Deucher
  2014-08-08 12:48   ` Адонай Элохим
@ 2014-08-10 21:19   ` Адонай Элохим
  2014-08-10 21:53     ` Niels Ole Salscheider
  1 sibling, 1 reply; 9+ messages in thread
From: Адонай Элохим @ 2014-08-10 21:19 UTC (permalink / raw)
  To: Alex Deucher, Rob Clark; +Cc: Maling list - DRI developers

Hello again, hope you are still reading my texts...

I digged through the code and narrowed down the issue I wanted to fix.
It appears to be related to the `bool thermal_active` dpm struct
member and this piece of code:

if (rdev->asic->dpm.force_performance_level) {
        if (rdev->pm.dpm.thermal_active) {
            enum radeon_dpm_forced_level level = rdev->pm.dpm.forced_level;
            /* force low perf level for thermal */
            radeon_dpm_force_performance_level(rdev,
RADEON_DPM_FORCED_LEVEL_LOW);
            /* save the user's level */
            rdev->pm.dpm.forced_level = level;
        } else {
            /* otherwise, user selected level */
            radeon_dpm_force_performance_level(rdev, rdev->pm.dpm.forced_level);
        }
    }

I did a double check here - at boot `thermal_active` is `false` and
thus, performance level is properly initiated. But at resume from
suspend `thermal_active` is true and performance level is strictly
bound to low profile.
Besides you cannot change it via echo 1 > /sys/.../force_dpm_level,
again thanks to `thermal_active` checked there.

Could you explain meaning of this small boolean to me? I'd like to
make a small neat patch fixing this, but I'm scared of doing it in
wrong way.
Sorry if I'm being too persistent.

Thanks,
Oleg

2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher@gmail.com>:
> On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com> wrote:
>> Hello all!
>>
>> I have some spare time and knowledge in C to try to fix some bugs I am
>> seeing on my machine.
>> So I've checked out and compiled all git trees that I may need and now I'm
>> beginning to read articles.
>>
>> And this is the point from where I don't know where to go. I want to fix
>> particular bug #79806 [1].
>> For me there are many places where this bug can hide - mesa? dri? radeon
>> kernel module? and I just don't know whether should I start reading articles
>> about mesa hacking or about dri architecture or about kernel module
>> development.
>>
>> Now I think the best thing for me is to start looking through radeon kernel
>> module code (I've got ingenious idea that power management resides there)
>> and read more about its architecture. Is this right? I mean, I just want to
>> find out, is this a right place to start looking at for this bug?
>
> The power management is handled in the kernel driver.  See radeon_pm.c
> and the relevant *_dpm.c files depending on what asic you have.
>
> Alex
>
>>
>> P.S. Sorry for my English in case it's not good, I'm learning it now
>>
>> P.P.S. And thanks for your hard work!
>>
>> -------------------------------------------
>> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-08-10 21:19   ` Адонай Элохим
@ 2014-08-10 21:53     ` Niels Ole Salscheider
  2014-08-11  4:03       ` Адонай Элохим
  0 siblings, 1 reply; 9+ messages in thread
From: Niels Ole Salscheider @ 2014-08-10 21:53 UTC (permalink / raw)
  To: dri-devel

On Monday 11 August 2014, 01:19:32, Адонай Элохим wrote:
> Hello again, hope you are still reading my texts...
> 
> I digged through the code and narrowed down the issue I wanted to fix.
> It appears to be related to the `bool thermal_active` dpm struct
> member and this piece of code:
> 
> if (rdev->asic->dpm.force_performance_level) {
>         if (rdev->pm.dpm.thermal_active) {
>             enum radeon_dpm_forced_level level = rdev->pm.dpm.forced_level;
>             /* force low perf level for thermal */
>             radeon_dpm_force_performance_level(rdev,
> RADEON_DPM_FORCED_LEVEL_LOW);
>             /* save the user's level */
>             rdev->pm.dpm.forced_level = level;
>         } else {
>             /* otherwise, user selected level */
>             radeon_dpm_force_performance_level(rdev,
> rdev->pm.dpm.forced_level); }
>     }
> 
> I did a double check here - at boot `thermal_active` is `false` and
> thus, performance level is properly initiated. But at resume from
> suspend `thermal_active` is true and performance level is strictly
> bound to low profile.
> Besides you cannot change it via echo 1 > /sys/.../force_dpm_level,
> again thanks to `thermal_active` checked there.
> 
> Could you explain meaning of this small boolean to me? I'd like to
> make a small neat patch fixing this, but I'm scared of doing it in
> wrong way.
> Sorry if I'm being too persistent.

I think thermal_active means that the temperature got too high so that low 
clocks have to be used.

Just some idea, but thermal.work only gets scheduled when the high to low 
temperature interrupt occurs. When the temperature is too high before suspend 
(so that thermal_active is true) and it gets low during standby this interrupt 
will not occur. thermal.work is therefore not scheduled...

Ole

> Thanks,
> Oleg
> 
> 2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher@gmail.com>:
> > On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com> 
wrote:
> >> Hello all!
> >> 
> >> I have some spare time and knowledge in C to try to fix some bugs I am
> >> seeing on my machine.
> >> So I've checked out and compiled all git trees that I may need and now
> >> I'm
> >> beginning to read articles.
> >> 
> >> And this is the point from where I don't know where to go. I want to fix
> >> particular bug #79806 [1].
> >> For me there are many places where this bug can hide - mesa? dri? radeon
> >> kernel module? and I just don't know whether should I start reading
> >> articles about mesa hacking or about dri architecture or about kernel
> >> module development.
> >> 
> >> Now I think the best thing for me is to start looking through radeon
> >> kernel
> >> module code (I've got ingenious idea that power management resides there)
> >> and read more about its architecture. Is this right? I mean, I just want
> >> to
> >> find out, is this a right place to start looking at for this bug?
> > 
> > The power management is handled in the kernel driver.  See radeon_pm.c
> > and the relevant *_dpm.c files depending on what asic you have.
> > 
> > Alex
> > 
> >> P.S. Sorry for my English in case it's not good, I'm learning it now
> >> 
> >> P.P.S. And thanks for your hard work!
> >> 
> >> -------------------------------------------
> >> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
> >> 
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-08-10 21:53     ` Niels Ole Salscheider
@ 2014-08-11  4:03       ` Адонай Элохим
  2014-08-11  6:50         ` Niels Ole Salscheider
  2014-08-11 14:49         ` Alex Deucher
  0 siblings, 2 replies; 9+ messages in thread
From: Адонай Элохим @ 2014-08-11  4:03 UTC (permalink / raw)
  To: Niels Ole Salscheider; +Cc: Maling list - DRI developers

2014-08-11 1:53 GMT+04:00 Niels Ole Salscheider
<niels_ole@salscheider-online.de>:
> On Monday 11 August 2014, 01:19:32, Адонай Элохим wrote:
>> Hello again, hope you are still reading my texts...
>>
>> I digged through the code and narrowed down the issue I wanted to fix.
>> It appears to be related to the `bool thermal_active` dpm struct
>> member and this piece of code:
>>
>> if (rdev->asic->dpm.force_performance_level) {
>>         if (rdev->pm.dpm.thermal_active) {
>>             enum radeon_dpm_forced_level level = rdev->pm.dpm.forced_level;
>>             /* force low perf level for thermal */
>>             radeon_dpm_force_performance_level(rdev,
>> RADEON_DPM_FORCED_LEVEL_LOW);
>>             /* save the user's level */
>>             rdev->pm.dpm.forced_level = level;
>>         } else {
>>             /* otherwise, user selected level */
>>             radeon_dpm_force_performance_level(rdev,
>> rdev->pm.dpm.forced_level); }
>>     }
>>
>> I did a double check here - at boot `thermal_active` is `false` and
>> thus, performance level is properly initiated. But at resume from
>> suspend `thermal_active` is true and performance level is strictly
>> bound to low profile.
>> Besides you cannot change it via echo 1 > /sys/.../force_dpm_level,
>> again thanks to `thermal_active` checked there.
>>
>> Could you explain meaning of this small boolean to me? I'd like to
>> make a small neat patch fixing this, but I'm scared of doing it in
>> wrong way.
>> Sorry if I'm being too persistent.
>
> I think thermal_active means that the temperature got too high so that low
> clocks have to be used.
>
> Just some idea, but thermal.work only gets scheduled when the high to low
> temperature interrupt occurs. When the temperature is too high before suspend
> (so that thermal_active is true) and it gets low during standby this interrupt
> will not occur. thermal.work is therefore not scheduled...
>
> Ole
>

You were right, Ole. The driver thinks the temperature is too high.
Thanks a lot!
It seems the function ci_set_thermal_temperature_range is missing some lines:


diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c
index 584090a..102a4bc 100644
--- a/radeon/ci_dpm.c
+++ b/radeon/ci_dpm.c
@@ -869,6 +869,9 @@ static int ci_set_thermal_temperature_range(struct
radeon_device *rdev,
        WREG32_SMC(CG_THERMAL_CTRL, tmp);
 #endif

+    rdev->pm.dpm.thermal.min_temp = low_temp;
+    rdev->pm.dpm.thermal.max_temp = high_temp;
+
        return 0;
 }


All other similar callbacks for different families of cards have these
lines. I wonder if there is any specific case for not doing this...
How do I propose it as a patch anyway?

>> Thanks,
>> Oleg
>>
>> 2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher@gmail.com>:
>> > On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com>
> wrote:
>> >> Hello all!
>> >>
>> >> I have some spare time and knowledge in C to try to fix some bugs I am
>> >> seeing on my machine.
>> >> So I've checked out and compiled all git trees that I may need and now
>> >> I'm
>> >> beginning to read articles.
>> >>
>> >> And this is the point from where I don't know where to go. I want to fix
>> >> particular bug #79806 [1].
>> >> For me there are many places where this bug can hide - mesa? dri? radeon
>> >> kernel module? and I just don't know whether should I start reading
>> >> articles about mesa hacking or about dri architecture or about kernel
>> >> module development.
>> >>
>> >> Now I think the best thing for me is to start looking through radeon
>> >> kernel
>> >> module code (I've got ingenious idea that power management resides there)
>> >> and read more about its architecture. Is this right? I mean, I just want
>> >> to
>> >> find out, is this a right place to start looking at for this bug?
>> >
>> > The power management is handled in the kernel driver.  See radeon_pm.c
>> > and the relevant *_dpm.c files depending on what asic you have.
>> >
>> > Alex
>> >
>> >> P.S. Sorry for my English in case it's not good, I'm learning it now
>> >>
>> >> P.P.S. And thanks for your hard work!
>> >>
>> >> -------------------------------------------
>> >> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>> >>
>> >> _______________________________________________
>> >> dri-devel mailing list
>> >> dri-devel@lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-08-11  4:03       ` Адонай Элохим
@ 2014-08-11  6:50         ` Niels Ole Salscheider
  2014-08-11 14:49         ` Alex Deucher
  1 sibling, 0 replies; 9+ messages in thread
From: Niels Ole Salscheider @ 2014-08-11  6:50 UTC (permalink / raw)
  To: Адонай
	Элохим
  Cc: Maling list - DRI developers

On Monday 11 August 2014, 08:03:57, Адонай Элохим wrote:
> 2014-08-11 1:53 GMT+04:00 Niels Ole Salscheider
> 
> <niels_ole@salscheider-online.de>:
> > On Monday 11 August 2014, 01:19:32, Адонай Элохим wrote:
> >> Hello again, hope you are still reading my texts...
> >> 
> >> I digged through the code and narrowed down the issue I wanted to fix.
> >> It appears to be related to the `bool thermal_active` dpm struct
> >> member and this piece of code:
> >> 
> >> if (rdev->asic->dpm.force_performance_level) {
> >> 
> >>         if (rdev->pm.dpm.thermal_active) {
> >>         
> >>             enum radeon_dpm_forced_level level =
> >>             rdev->pm.dpm.forced_level;
> >>             /* force low perf level for thermal */
> >>             radeon_dpm_force_performance_level(rdev,
> >> 
> >> RADEON_DPM_FORCED_LEVEL_LOW);
> >> 
> >>             /* save the user's level */
> >>             rdev->pm.dpm.forced_level = level;
> >>         
> >>         } else {
> >>         
> >>             /* otherwise, user selected level */
> >>             radeon_dpm_force_performance_level(rdev,
> >> 
> >> rdev->pm.dpm.forced_level); }
> >> 
> >>     }
> >> 
> >> I did a double check here - at boot `thermal_active` is `false` and
> >> thus, performance level is properly initiated. But at resume from
> >> suspend `thermal_active` is true and performance level is strictly
> >> bound to low profile.
> >> Besides you cannot change it via echo 1 > /sys/.../force_dpm_level,
> >> again thanks to `thermal_active` checked there.
> >> 
> >> Could you explain meaning of this small boolean to me? I'd like to
> >> make a small neat patch fixing this, but I'm scared of doing it in
> >> wrong way.
> >> Sorry if I'm being too persistent.
> > 
> > I think thermal_active means that the temperature got too high so that low
> > clocks have to be used.
> > 
> > Just some idea, but thermal.work only gets scheduled when the high to low
> > temperature interrupt occurs. When the temperature is too high before
> > suspend (so that thermal_active is true) and it gets low during standby
> > this interrupt will not occur. thermal.work is therefore not scheduled...
> > 
> > Ole
> 
> You were right, Ole. The driver thinks the temperature is too high.
> Thanks a lot!
> It seems the function ci_set_thermal_temperature_range is missing some
> lines:
> 
> 
> diff --git a/drivers/gpu/drm/radeon/ci_dpm.c
> b/drivers/gpu/drm/radeon/ci_dpm.c index 584090a..102a4bc 100644
> --- a/radeon/ci_dpm.c
> +++ b/radeon/ci_dpm.c
> @@ -869,6 +869,9 @@ static int ci_set_thermal_temperature_range(struct
> radeon_device *rdev,
>         WREG32_SMC(CG_THERMAL_CTRL, tmp);
>  #endif
> 
> +    rdev->pm.dpm.thermal.min_temp = low_temp;
> +    rdev->pm.dpm.thermal.max_temp = high_temp;
> +
>         return 0;
>  }
> 
> 
> All other similar callbacks for different families of cards have these
> lines. I wonder if there is any specific case for not doing this...

This seems to be wrong indeed. I wonder why it happens after a suspend - 
resume cycle, though. Does the hardware generate a corresponding interrupt 
after resume?

There is however still an XXX comment in that function... Maybe Alex can 
comment on that.

> How do I propose it as a patch anyway?

Fix it in your local git checkout, commit it (don't forget to pass -s to get 
your signed-of-by line) and use git format-patch / git send-email to send it 
to this list...

> >> Thanks,
> >> Oleg
> >> 
> >> 2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher@gmail.com>:
> >> > On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com>
> > 
> > wrote:
> >> >> Hello all!
> >> >> 
> >> >> I have some spare time and knowledge in C to try to fix some bugs I am
> >> >> seeing on my machine.
> >> >> So I've checked out and compiled all git trees that I may need and now
> >> >> I'm
> >> >> beginning to read articles.
> >> >> 
> >> >> And this is the point from where I don't know where to go. I want to
> >> >> fix
> >> >> particular bug #79806 [1].
> >> >> For me there are many places where this bug can hide - mesa? dri?
> >> >> radeon
> >> >> kernel module? and I just don't know whether should I start reading
> >> >> articles about mesa hacking or about dri architecture or about kernel
> >> >> module development.
> >> >> 
> >> >> Now I think the best thing for me is to start looking through radeon
> >> >> kernel
> >> >> module code (I've got ingenious idea that power management resides
> >> >> there)
> >> >> and read more about its architecture. Is this right? I mean, I just
> >> >> want
> >> >> to
> >> >> find out, is this a right place to start looking at for this bug?
> >> > 
> >> > The power management is handled in the kernel driver.  See radeon_pm.c
> >> > and the relevant *_dpm.c files depending on what asic you have.
> >> > 
> >> > Alex
> >> > 
> >> >> P.S. Sorry for my English in case it's not good, I'm learning it now
> >> >> 
> >> >> P.P.S. And thanks for your hard work!
> >> >> 
> >> >> -------------------------------------------
> >> >> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
> >> >> 
> >> >> _______________________________________________
> >> >> dri-devel mailing list
> >> >> dri-devel@lists.freedesktop.org
> >> >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
> >> 
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Looking for a start point for fixing a bug
  2014-08-11  4:03       ` Адонай Элохим
  2014-08-11  6:50         ` Niels Ole Salscheider
@ 2014-08-11 14:49         ` Alex Deucher
  1 sibling, 0 replies; 9+ messages in thread
From: Alex Deucher @ 2014-08-11 14:49 UTC (permalink / raw)
  To: Адонай
	Элохим
  Cc: Maling list - DRI developers

On Mon, Aug 11, 2014 at 12:03 AM, Адонай Элохим <algonkvel@gmail.com> wrote:
> 2014-08-11 1:53 GMT+04:00 Niels Ole Salscheider
> <niels_ole@salscheider-online.de>:
>> On Monday 11 August 2014, 01:19:32, Адонай Элохим wrote:
>>> Hello again, hope you are still reading my texts...
>>>
>>> I digged through the code and narrowed down the issue I wanted to fix.
>>> It appears to be related to the `bool thermal_active` dpm struct
>>> member and this piece of code:
>>>
>>> if (rdev->asic->dpm.force_performance_level) {
>>>         if (rdev->pm.dpm.thermal_active) {
>>>             enum radeon_dpm_forced_level level = rdev->pm.dpm.forced_level;
>>>             /* force low perf level for thermal */
>>>             radeon_dpm_force_performance_level(rdev,
>>> RADEON_DPM_FORCED_LEVEL_LOW);
>>>             /* save the user's level */
>>>             rdev->pm.dpm.forced_level = level;
>>>         } else {
>>>             /* otherwise, user selected level */
>>>             radeon_dpm_force_performance_level(rdev,
>>> rdev->pm.dpm.forced_level); }
>>>     }
>>>
>>> I did a double check here - at boot `thermal_active` is `false` and
>>> thus, performance level is properly initiated. But at resume from
>>> suspend `thermal_active` is true and performance level is strictly
>>> bound to low profile.
>>> Besides you cannot change it via echo 1 > /sys/.../force_dpm_level,
>>> again thanks to `thermal_active` checked there.
>>>
>>> Could you explain meaning of this small boolean to me? I'd like to
>>> make a small neat patch fixing this, but I'm scared of doing it in
>>> wrong way.
>>> Sorry if I'm being too persistent.
>>
>> I think thermal_active means that the temperature got too high so that low
>> clocks have to be used.
>>
>> Just some idea, but thermal.work only gets scheduled when the high to low
>> temperature interrupt occurs. When the temperature is too high before suspend
>> (so that thermal_active is true) and it gets low during standby this interrupt
>> will not occur. thermal.work is therefore not scheduled...
>>
>> Ole
>>
>
> You were right, Ole. The driver thinks the temperature is too high.
> Thanks a lot!
> It seems the function ci_set_thermal_temperature_range is missing some lines:
>
>
> diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c
> index 584090a..102a4bc 100644
> --- a/radeon/ci_dpm.c
> +++ b/radeon/ci_dpm.c
> @@ -869,6 +869,9 @@ static int ci_set_thermal_temperature_range(struct
> radeon_device *rdev,
>         WREG32_SMC(CG_THERMAL_CTRL, tmp);
>  #endif
>
> +    rdev->pm.dpm.thermal.min_temp = low_temp;
> +    rdev->pm.dpm.thermal.max_temp = high_temp;
> +
>         return 0;
>  }
>
>
> All other similar callbacks for different families of cards have these
> lines. I wonder if there is any specific case for not doing this...
> How do I propose it as a patch anyway?

You analysis is correct.  Thanks for tracking this down.

Alex

>
>>> Thanks,
>>> Oleg
>>>
>>> 2014-07-22 20:05 GMT+04:00 Alex Deucher <alexdeucher@gmail.com>:
>>> > On Tue, Jul 22, 2014 at 8:39 AM, Адонай Элохим <algonkvel@gmail.com>
>> wrote:
>>> >> Hello all!
>>> >>
>>> >> I have some spare time and knowledge in C to try to fix some bugs I am
>>> >> seeing on my machine.
>>> >> So I've checked out and compiled all git trees that I may need and now
>>> >> I'm
>>> >> beginning to read articles.
>>> >>
>>> >> And this is the point from where I don't know where to go. I want to fix
>>> >> particular bug #79806 [1].
>>> >> For me there are many places where this bug can hide - mesa? dri? radeon
>>> >> kernel module? and I just don't know whether should I start reading
>>> >> articles about mesa hacking or about dri architecture or about kernel
>>> >> module development.
>>> >>
>>> >> Now I think the best thing for me is to start looking through radeon
>>> >> kernel
>>> >> module code (I've got ingenious idea that power management resides there)
>>> >> and read more about its architecture. Is this right? I mean, I just want
>>> >> to
>>> >> find out, is this a right place to start looking at for this bug?
>>> >
>>> > The power management is handled in the kernel driver.  See radeon_pm.c
>>> > and the relevant *_dpm.c files depending on what asic you have.
>>> >
>>> > Alex
>>> >
>>> >> P.S. Sorry for my English in case it's not good, I'm learning it now
>>> >>
>>> >> P.P.S. And thanks for your hard work!
>>> >>
>>> >> -------------------------------------------
>>> >> [1] https://bugs.freedesktop.org/show_bug.cgi?id=79806
>>> >>
>>> >> _______________________________________________
>>> >> dri-devel mailing list
>>> >> dri-devel@lists.freedesktop.org
>>> >> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-08-11 14:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-22 12:39 Looking for a start point for fixing a bug Адонай Элохим
2014-07-22 15:01 ` Rob Clark
2014-07-22 16:05 ` Alex Deucher
2014-08-08 12:48   ` Адонай Элохим
2014-08-10 21:19   ` Адонай Элохим
2014-08-10 21:53     ` Niels Ole Salscheider
2014-08-11  4:03       ` Адонай Элохим
2014-08-11  6:50         ` Niels Ole Salscheider
2014-08-11 14:49         ` Alex Deucher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.