* Runtime PM and the block layer
@ 2010-08-23 19:17 Alan Stern
  2010-08-23 19:53 ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Stern @ 2010-08-23 19:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

Jens:

I want to implement runtime power management for the SCSI sd driver.  
The idea is that the device should automatically be suspended after a
certain amount of time spent idle.

The basic outline is simple enough.  If the device is in low power when
a request arrives, delay handling the request until the device can be
brought back to high power.  When a request completes and the request
queue is empty, schedule a runtime-suspend for the appropriate time in
the future.

The difficulty is that I don't know the right way these things should
interact with the request-queue management.  A request can be deferred
by making the prep_req_fn return BLKPREP_DEFER, right?  But then what
happens to the request and to the queue?  How does the runtime-resume
routine tell the block layer that the deferred request should be
restarted?

How does this all relate to the queue being stopped or plugged?

Another thing: The runtime-resume routine needs to send its own
commands to the device (to spin up a drive, for example).  These
commands must be sent before anything on the request queue, and they
must be handled right away even though the normal requests on the queue
are still deferred.

What's the right way to do all this?

Thanks,

Alan Stern



* Re: Runtime PM and the block layer
  2010-08-23 19:17 Runtime PM and the block layer Alan Stern
@ 2010-08-23 19:53 ` Jens Axboe
  2010-08-23 21:51   ` Alan Stern
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2010-08-23 19:53 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list

On 08/23/2010 09:17 PM, Alan Stern wrote:
> Jens:
> 
> I want to implement runtime power management for the SCSI sd driver.  
> The idea is that the device should automatically be suspended after a
> certain amount of time spent idle.
> 
> The basic outline is simple enough.  If the device is in low power when
> a request arrives, delay handling the request until the device can be
> brought back to high power.  When a request completes and the request
> queue is empty, schedule a runtime-suspend for the appropriate time in
> the future.

So if it's in low power mode, you need to defer because you want to
issue some special request first to bring it back to life?

> The difficulty is that I don't know the right way these things should
> interact with the request-queue management.  A request can be deferred
> by making the prep_req_fn return BLKPREP_DEFER, right?  But then what

Right, that is used for resource starvation. So usually very short
conditions.

> happens to the request and to the queue?  How does the runtime-resume
> routine tell the block layer that the deferred request should be
> restarted?

Internally, it uses the block queue plugging to set a timer to defer a
bit. That's purely implementation detail and it will change in the
not-so-distant future if I kill the per-queue plugging. The effect will
still be the same though, the action will be automatically retried after
some defined interval.

> How does this all relate to the queue being stopped or plugged?

A stopped queue is usually the driver telling the block layer to bugger
off for a while, and the driver will tell us when it's ok to resume
operations. So we can't control that part. Plugging we can control. But
if the device is plugged, the driver is idle _and_ we have IO pending.
So you would not be entering a lower power mode at that point, and the
driver should already be in an operational state; when it got plugged,
we should have issued the special req to send it into live mode.

> Another thing: The runtime-resume routine needs to send its own
> commands to the device (to spin up a drive, for example).  These
> commands must be sent before anything on the request queue, and they
> must be handled right away even though the normal requests on the queue
> are still deferred.

We can flag those requests as being of some category that is allowed to
bypass the sleep state of the device. Handling right away can be
accomplished by just inserting at the front and having that flag set.
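
As a rough illustration of the front-insertion idea, here is a minimal
userspace sketch. It is a toy model, not block-layer code: struct
toy_queue, the pm_bypass flag and all toy_* functions are hypothetical
names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the block layer. */
#define TOY_QUEUE_CAP 8

struct toy_request {
	int id;
	bool pm_bypass;		/* hypothetical "may run while asleep" flag */
};

struct toy_queue {
	struct toy_request slots[TOY_QUEUE_CAP];
	size_t count;
	bool device_asleep;
};

/* Append an ordinary request at the tail; false if the queue is full. */
static bool toy_insert_tail(struct toy_queue *q, struct toy_request rq)
{
	assert(q != NULL);
	if (q->count == TOY_QUEUE_CAP)
		return false;
	q->slots[q->count++] = rq;
	return true;
}

/* Insert at the head so the request is seen before everything queued. */
static bool toy_insert_head(struct toy_queue *q, struct toy_request rq)
{
	assert(q != NULL);
	if (q->count == TOY_QUEUE_CAP)
		return false;
	for (size_t i = q->count; i > 0; i--)
		q->slots[i] = q->slots[i - 1];
	q->slots[0] = rq;
	q->count++;
	return true;
}

/*
 * Hand the head request to the "driver". Returns its id, or -1 when the
 * queue is empty or the head must wait for the device to wake up.
 */
static int toy_dispatch(struct toy_queue *q)
{
	assert(q != NULL);
	if (q->count == 0)
		return -1;
	if (q->device_asleep && !q->slots[0].pm_bypass)
		return -1;
	int id = q->slots[0].id;
	for (size_t i = 1; i < q->count; i++)
		q->slots[i - 1] = q->slots[i];
	q->count--;
	return id;
}
```

In this model an ordinary request queued while device_asleep is set is
held back, while a request inserted at the head with pm_bypass set is
dispatched immediately.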

> What's the right way to do all this?

It needs to be done carefully. A queue can go in and out of idle/busy
state extremely fast. I did quite a few tricks on the queue timeout
handling to ensure that it didn't have much overhead on a per-rq basis.
So we could probably add an idle timer that is set to some suitable
timeout for this and would be added when the queue first goes empty. If
new requests come in, just let it simmer and defer checking the state to
when it actually fires. If nothing has happened, issue a new
q->power_mode(new_state) callback that would then queue a suitable
request to change the power state of the device. Queueing a new request
could check the state and issue a q->power_mode(RUNNING) or similar call
to bring things back to life.
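
The lazy idle-timer scheme above might look roughly like this. It is
only a userspace sketch with a logical clock passed in as "now"; the
toy_* names and the struct layout are assumptions made for the example,
and the power field merely stands in for the proposed (nonexistent)
q->power_mode() callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; a real version would use kernel timers. */
enum toy_power { TOY_RUNNING, TOY_SUSPENDED };

struct toy_idle {
	unsigned long last_activity;	/* tick of last submit/complete */
	unsigned long timer_expires;	/* meaningful when timer_armed */
	unsigned long idle_timeout;
	unsigned int in_flight;
	bool timer_armed;
	enum toy_power power;
};

/* Hot path: record activity, never touch the timer. */
static void toy_submit(struct toy_idle *s, unsigned long now)
{
	s->in_flight++;
	s->last_activity = now;
	if (s->power == TOY_SUSPENDED)
		s->power = TOY_RUNNING;	/* stands in for power_mode(RUNNING) */
}

/* Arm the timer only when the queue first goes empty. */
static void toy_complete(struct toy_idle *s, unsigned long now)
{
	assert(s->in_flight > 0);
	s->in_flight--;
	s->last_activity = now;
	if (s->in_flight == 0 && !s->timer_armed) {
		s->timer_armed = true;
		s->timer_expires = now + s->idle_timeout;
	}
}

/*
 * Timer handler: the state is checked only here. Returns true if the
 * device was put to sleep on this call.
 */
static bool toy_timer_fire(struct toy_idle *s, unsigned long now)
{
	if (!s->timer_armed || now < s->timer_expires)
		return false;
	s->timer_armed = false;
	if (s->in_flight > 0)
		return false;	/* busy; the next completion re-arms */
	if (now - s->last_activity < s->idle_timeout) {
		/* Something happened since arming: push the deadline out. */
		s->timer_armed = true;
		s->timer_expires = s->last_activity + s->idle_timeout;
		return false;
	}
	s->power = TOY_SUSPENDED;	/* stands in for power_mode(new_state) */
	return true;
}
```

The point of the design is that submit and complete stay cheap: new
requests never cancel or modify the timer, and the handler decides when
it fires whether the queue has really been idle for the whole interval.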

Just a few ideas...

-- 
Jens Axboe



* Re: Runtime PM and the block layer
  2010-08-23 19:53 ` Jens Axboe
@ 2010-08-23 21:51   ` Alan Stern
  2010-08-24 13:15     ` Runtime power management during system resume Raj Kumar
  2010-08-24 13:38     ` Runtime PM and the block layer Jens Axboe
  0 siblings, 2 replies; 24+ messages in thread
From: Alan Stern @ 2010-08-23 21:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

On Mon, 23 Aug 2010, Jens Axboe wrote:

> On 08/23/2010 09:17 PM, Alan Stern wrote:
> > Jens:
> > 
> > I want to implement runtime power management for the SCSI sd driver.  
> > The idea is that the device should automatically be suspended after a
> > certain amount of time spent idle.
> > 
> > The basic outline is simple enough.  If the device is in low power when
> > a request arrives, delay handling the request until the device can be
> > brought back to high power.  When a request completes and the request
> > queue is empty, schedule a runtime-suspend for the appropriate time in
> > the future.
> 
> So if it's in low power mode, you need to defer because you want to
> issue some special request first to bring it back to life?

Exactly.  And also because if the device is in low-power mode then its 
parent might be in low-power too, meaning that we would have to wait 
for both the parent and the device to return to full power before 
sending the request.

The PM framework is set up so that power-state changes are always done
in process context -- meaning in this case that a workqueue would be
needed.  The PM core has a special workqueue for just this purpose.  
But obviously a prep function can't sit around and wait for the work to 
get done.

> > The difficulty is that I don't know the right way these things should
> > interact with the request-queue management.  A request can be deferred
> > by making the prep_req_fn return BLKPREP_DEFER, right?  But then what
> 
> Right, that is used for resource starvation. So usually very short
> conditions.
> 
> > happens to the request and to the queue?  How does the runtime-resume
> > routine tell the block layer that the deferred request should be
> > restarted?
> 
> Internally, it uses the block queue plugging to set a timer to defer a
> bit. That's purely implementation detail and it will change in the
> not-so-distant future if I kill the per-queue plugging. The effect will
> still be the same though, the action will be automatically retried after
> some defined interval.

Hmm.  That doesn't sound quite like what I need.  Ideally the request
would go back to the head of the queue and stay there until the driver
tells the block layer to let it through (when the device is ready to 
accept it).

> > How does this all relate to the queue being stopped or plugged?
> 
> A stopped queue is usually the driver telling the block layer to bugger
> off for a while, and the driver will tell us when it's ok to resume
> operations.

Yes, that sounds more like it.  Put the request back on the queue 
and stop the queue.  If the prep fn calls blk_stop_queue() and then 
returns BLKPREP_DEFER, will that do it?

>  So we can't control that part. Plugging we can control. But

I probably didn't make it clear in the earlier message: The changes
to implement all this PM stuff will go in the driver, with nothing (or
almost nothing) changed in the block layer.  Hence stopping the queue
_is_ under my control.

Unless you think it would be better to change the block layer 
instead...

> if the device is plugged, the driver is idle _and_ we have IO pending.
> So you would not be entering a lower power mode at that point, and the
> driver should already be in an operational state; when it got plugged,
> we should have issued the special req to send it into live mode.

Plugging doesn't seem like the right mechanism for this.

> > Another thing: The runtime-resume routine needs to send its own
> > commands to the device (to spin up a drive, for example).  These
> > commands must be sent before anything on the request queue, and they
> > must be handled right away even though the normal requests on the queue
> > are still deferred.
> 
> We can flag those requests as being of some category that is allowed to
> bypass the sleep state of the device. Handling right away can be
> accomplished by just inserting at the front and having that flag set.

Okay, good.  But if the queue is stopped when the requests are
inserted at the front (with the flag set), will they be allowed to go 
through to the driver?  In other words, is there a way to force certain 
requests to be processed even while the queue is stopped?

> > What's the right way to do all this?
> 
> It needs to be done carefully. A queue can go in and out of idle/busy
> state extremely fast. I did quite a few tricks on the queue timeout
> handling to ensure that it didn't have much overhead on a per-rq basis.
> So we could probably add an idle timer that is set to some suitable
> timeout for this and would be added when the queue first goes empty. If
> new requests come in, just let it simmer and defer checking the state to
> when it actually fires. If nothing has happened, issue a new
> q->power_mode(new_state) callback that would then queue a suitable
> request to change the power state of the device. Queueing a new request
> could check the state and issue a q->power_mode(RUNNING) or similar call
> to bring things back to life.
> 
> Just a few ideas...

The idle-time management can be handled in a couple of different ways,
and the PM core already contains routines to do it.  I'm not worried
about that (I have a very clear understanding of the PM core).  The 
interactions with the block layer are where I need help.

Speaking of which...  What is this q->power_mode stuff?  I haven't run
across it before and it doesn't seem to be mentioned in
include/linux/blkdev.h.  Is it connected with request_pm_state?  I
don't know what that is either, or how it is meant to be used.

Alan Stern



* Runtime power management during system resume
  2010-08-23 21:51   ` Alan Stern
@ 2010-08-24 13:15     ` Raj Kumar
  2010-08-24 14:30       ` Alan Stern
  2010-08-24 13:38     ` Runtime PM and the block layer Jens Axboe
  1 sibling, 1 reply; 24+ messages in thread
From: Raj Kumar @ 2010-08-24 13:15 UTC (permalink / raw)
  To: stern; +Cc: linux-kernel


 
Hi Alan,

I have implemented runtime power management in my drivers, and I have an
issue regarding system resume.

As documented, when system sleep is triggered the power management core
increments the power_usage counter during prepare and decrements it when
complete is called.

I have a few questions:

1) When system resume is done, the power_usage counter is not increased,
right?

If so, does the driver need to update the power_usage counter with the
runtime power management core and set the status back to RPM_ACTIVE?

2) Suppose the device is active, meaning its power_usage counter is
already one. During system sleep, does the driver first suspend it through
the runtime power management core and then continue with system suspend?

3) I have read the power management core code and did not see the runtime
PM status being updated to RPM_SUSPENDED during system suspend. Is that
right?

What do you think?

Regards
Raj


* Re: Runtime PM and the block layer
  2010-08-23 21:51   ` Alan Stern
  2010-08-24 13:15     ` Runtime power management during system resume Raj Kumar
@ 2010-08-24 13:38     ` Jens Axboe
  2010-08-24 14:42       ` Alan Stern
  2010-08-30 16:32       ` Alan Stern
  1 sibling, 2 replies; 24+ messages in thread
From: Jens Axboe @ 2010-08-24 13:38 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list

On 2010-08-23 23:51, Alan Stern wrote:
>>> happens to the request and to the queue?  How does the runtime-resume
>>> routine tell the block layer that the deferred request should be
>>> restarted?
>>
>> Internally, it uses the block queue plugging to set a timer to defer a
>> bit. That's purely implementation detail and it will change in the
>> not-so-distant future if I kill the per-queue plugging. The effect will
>> still be the same though, the action will be automatically retried after
>> some defined interval.
> 
> Hmm.  That doesn't sound quite like what I need.  Ideally the request
> would go back to the head of the queue and stay there until the driver
> tells the block layer to let it through (when the device is ready to 
> accept it).

It depends on where you want to handle it. If you want the driver to
reject it, then we don't have to change the block layer bits a lot. We
could add a DEFER_AND_STOP or something, which would never retry and it
would stop the queue. If the driver passed that back, then it would be
responsible for starting the queue at some point in the future.

>>> How does this all relate to the queue being stopped or plugged?
>>
>> A stopped queue is usually the driver telling the block layer to bugger
>> off for a while, and the driver will tell us when it's ok to resume
>> operations.
> 
> Yes, that sounds more like it.  Put the request back on the queue 
> and stop the queue.  If the prep fn calls blk_stop_queue() and then 
> returns BLKPREP_DEFER, will that do it?

I think it will be a lot cleaner to add specific support for this, as
per the DEFER_AND_STOP above.

>>  So we can't control that part. Plugging we can control. But
> 
> I probably didn't make it clear in the earlier message: The changes
> to implement all this PM stuff will go in the driver, with nothing (or
> almost nothing) changed in the block layer.  Hence stopping the queue
> _is_ under my control.
> 
> Unless you think it would be better to change the block layer 
> instead...

Doing it in the driver is fine. We can always make things more generic
and share them across drivers if there's sharing to be had there.

It also means we don't need special request types that are allowed to
bypass certain queue states, since the driver will track the state and
know what to defer and what to pass through.

>> It needs to be done carefully. A queue can go in and out of idle/busy
>> state extremely fast. I did quite a few tricks on the queue timeout
>> handling to ensure that it didn't have much overhead on a per-rq basis.
>> So we could probably add an idle timer that is set to some suitable
>> timeout for this and would be added when the queue first goes empty. If
>> new requests come in, just let it simmer and defer checking the state to
>> when it actually fires. If nothing has happened, issue a new
>> q->power_mode(new_state) callback that would then queue a suitable
>> request to change the power state of the device. Queueing a new request
>> could check the state and issue a q->power_mode(RUNNING) or similar call
>> to bring things back to life.
>>
>> Just a few ideas...
> 
> The idle-time management can be handled in a couple of different ways,
> and the PM core already contains routines to do it.  I'm not worried
> about that (I have a very clear understanding of the PM core).  The 
> interactions with the block layer are where I need help.
> 
> Speaking of which...  What is this q->power_mode stuff?  I haven't run
> across it before and it doesn't seem to be mentioned in
> include/linux/blkdev.h.  Is it connected with request_pm_state?  I
> don't know what that is either, or how it is meant to be used.

->power_mode() was just a suggested way to implement this, it doesn't
exist. But if you want to push it to the driver, then great, less work
for me :-)

Sounds like all you need is a way to return BLKPREP_DEFER_AND_STOP and
have the block layer stop the queue for you. When you need to restart,
you would insert a special request at the head of the queue and call
blk_start_queue() to get things going again.
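
A small userspace sketch of that flow, under stated assumptions:
TOY_PREP_DEFER_AND_STOP mirrors the proposed (not yet existing)
BLKPREP_DEFER_AND_STOP, toy_restart() stands in for "insert a special
request at the head and call blk_start_queue()", and the special request
is assumed to complete as soon as it is dispatched. None of the toy_*
names are real block-layer API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; not block-layer code. */
#define TOY_CAP 8

enum toy_prep { TOY_PREP_OK, TOY_PREP_DEFER_AND_STOP };

struct toy_rq {
	int id;
	bool is_pm;		/* special power-management request */
};

struct toy_queue {
	struct toy_rq rq[TOY_CAP];
	size_t count;
	bool stopped;
	bool device_powered;
};

/* Driver prep function: refuse normal I/O while the device is down. */
static enum toy_prep toy_prep_fn(const struct toy_queue *q,
				 const struct toy_rq *rq)
{
	if (!q->device_powered && !rq->is_pm)
		return TOY_PREP_DEFER_AND_STOP;
	return TOY_PREP_OK;
}

/* Dispatch until the queue is empty or stopped; returns ids written. */
static size_t toy_run_queue(struct toy_queue *q, int *out, size_t max)
{
	size_t n = 0;

	while (!q->stopped && q->count > 0 && n < max) {
		struct toy_rq head = q->rq[0];

		if (toy_prep_fn(q, &head) == TOY_PREP_DEFER_AND_STOP) {
			q->stopped = true;	/* head stays put, no retry timer */
			break;
		}
		for (size_t i = 1; i < q->count; i++)
			q->rq[i - 1] = q->rq[i];
		q->count--;
		if (head.is_pm)
			q->device_powered = true; /* spin-up completes at once */
		out[n++] = head.id;
	}
	return n;
}

/* Driver restart path: special request at the head, then start queue. */
static void toy_restart(struct toy_queue *q, int pm_id)
{
	assert(q->count < TOY_CAP);
	for (size_t i = q->count; i > 0; i--)
		q->rq[i] = q->rq[i - 1];
	q->rq[0] = (struct toy_rq){ .id = pm_id, .is_pm = true };
	q->count++;
	q->stopped = false;
}
```

With requests 1 and 2 queued and the device powered down, the first run
dispatches nothing and leaves the queue stopped; after toy_restart() the
order seen by the driver is the special request, then 1, then 2.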

The only missing bit would then be the idle detection. That would need
to be in the block layer itself, and the scheme I described should be
fine for that still.

-- 
Jens Axboe



* Re: Runtime power management during system resume
  2010-08-24 13:15     ` Runtime power management during system resume Raj Kumar
@ 2010-08-24 14:30       ` Alan Stern
  2010-08-24 15:17         ` Raj Kumar
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Stern @ 2010-08-24 14:30 UTC (permalink / raw)
  To: Raj Kumar; +Cc: linux-kernel

On Tue, 24 Aug 2010, Raj Kumar wrote:

> Hi Alan,
>  
> I have implemented runtime power management in my drivers, and I have an
> issue regarding system resume.
>  
> As documented, when system sleep is triggered the power management core
> increments the power_usage counter during prepare and decrements it when
> complete is called.
>  
> I have a few questions:
>  
> 1) When system resume is done, the power_usage counter is not increased,
> right?

That's right.

> If so, does the driver need to update the power_usage counter with the
> runtime power management core and set the status back to RPM_ACTIVE?

Read section 6 of Documentation/power/runtime_pm.txt.  It explains this.

> 2) Suppose the device is active, meaning its power_usage counter is
> already one. During system sleep, does the driver first suspend it through
> the runtime power management core and then continue with system suspend?

No.

> 3) I have read the power management core code and did not see the runtime
> PM status being updated to RPM_SUSPENDED during system suspend. Is that
> right?

I don't understand your question.

Alan Stern



* Re: Runtime PM and the block layer
  2010-08-24 13:38     ` Runtime PM and the block layer Jens Axboe
@ 2010-08-24 14:42       ` Alan Stern
  2010-08-24 17:09         ` Jens Axboe
  2010-08-30 16:32       ` Alan Stern
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Stern @ 2010-08-24 14:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

On Tue, 24 Aug 2010, Jens Axboe wrote:

> > Hmm.  That doesn't sound quite like what I need.  Ideally the request
> > would go back to the head of the queue and stay there until the driver
> > tells the block layer to let it through (when the device is ready to 
> > accept it).
> 
> It depends on where you want to handle it. If you want the driver to
> reject it, then we don't have to change the block layer bits a lot. We
> could add a DEFER_AND_STOP or something, which would never retry and it
> would stop the queue. If the driver passed that back, then it would be
> responsible for starting the queue at some point in the future.
> 
> >>> How does this all relate to the queue being stopped or plugged?
> >>
> >> A stopped queue is usually the driver telling the block layer to bugger
> >> off for a while, and the driver will tell us when it's ok to resume
> >> operations.
> > 
> > Yes, that sounds more like it.  Put the request back on the queue 
> > and stop the queue.  If the prep fn calls blk_stop_queue() and then 
> > returns BLKPREP_DEFER, will that do it?
> 
> I think it will be a lot cleaner to add specific support for this, as
> per the DEFER_AND_STOP above.

Okay, good.  I'll try to implement that and see how it goes.

> Sounds like all you need is a way to return BLKPREP_DEFER_AND_STOP and
> have the block layer stop the queue for you. When you need to restart,
> you would insert a special request at the head of the queue and call
> blk_start_queue() to get things going again.

Yes.

Suppose the driver needs to send two of these special requests before
going back to normal operation.  Won't restarting the queue for the
first special request also cause the following regular request to be
passed to the driver before the second special request can be inserted?  
Of course, the driver could cope with this simply by returning another
BLKPREP_DEFER_AND_STOP.

> The only missing bit would then be the idle detection. That would need
> to be in the block layer itself, and the scheme I described should be
> fine for that still.

Are you sure it needs to be in the block layer?  Is there no way for 
the driver's completion handler to tell whether the queue is now empty?  
Certainly it already has enough information to know whether the device 
is still busy processing another request.  When the device is no longer 
busy and the queue is empty, that's when the idle timer should be 
started or restarted.

Alan Stern



* Re: Runtime power management during system resume
  2010-08-24 14:30       ` Alan Stern
@ 2010-08-24 15:17         ` Raj Kumar
  2010-08-24 17:50           ` Alan Stern
  2010-08-25 13:27           ` Raj Kumar
  0 siblings, 2 replies; 24+ messages in thread
From: Raj Kumar @ 2010-08-24 15:17 UTC (permalink / raw)
  To: stern, linux-pm


 
Hi,

My questions are:

1) When the system suspends, pm_runtime_put_sync is invoked when the
complete callback is called, which in turn tries to idle the device.

Does that mean the runtime power management core sets the device status
to RPM_SUSPENDED?

2) As you said, the power_usage counter is not incremented during system
resume. That is fine; the driver will increase the power_usage counter.

But should the driver also set the status to RPM_ACTIVE, i.e. call
pm_runtime_set_active?

3) During hibernation, if the driver is registered using
platform_driver_register, will the power management core call its suspend
and resume callbacks?

Regards,
Raj




* Re: Runtime PM and the block layer
  2010-08-24 14:42       ` Alan Stern
@ 2010-08-24 17:09         ` Jens Axboe
  2010-08-24 20:06           ` Alan Stern
  2010-09-27 15:22           ` Alan Stern
  0 siblings, 2 replies; 24+ messages in thread
From: Jens Axboe @ 2010-08-24 17:09 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list

On 2010-08-24 16:42, Alan Stern wrote:
>> Sounds like all you need is a way to return BLKPREP_DEFER_AND_STOP and
>> have the block layer stop the queue for you. When you need to restart,
>> you would insert a special request at the head of the queue and call
>> blk_start_queue() to get things going again.
> 
> Yes.
> 
> Suppose the driver needs to send two of these special requests before
> going back to normal operation.  Won't restarting the queue for the
> first special request also cause the following regular request to be
> passed to the driver before the second special request can be inserted?  
> Of course, the driver could cope with this simply by returning another
> BLKPREP_DEFER_AND_STOP.

For that special request, you are sure to have some ->end_io() hook to
know when it's complete. When that triggers, you queue the 2nd special
request. And so on, for how many you need.

>> The only missing bit would then be the idle detection. That would need
>> to be in the block layer itself, and the scheme I described should be
>> fine for that still.
> 
> Are you sure it needs to be in the block layer?  Is there no way for 
> the driver's completion handler to tell whether the queue is now empty?  
> Certainly it already has enough information to know whether the device 
> is still busy processing another request.  When the device is no longer 
> busy and the queue is empty, that's when the idle timer should be 
> started or restarted.

To some extent there is, but there can be context outside of the queue
it doesn't know about. That is the case for the plugging rework, for
instance. That also removes the queue_empty() call. Then there's
blk_fetch_request(), but that may return NULL while there's IO pending
in the block layer - so not reliable for that either. The block layer is
tracking this state anyway; if you leave it to the driver, then it
would have to check every time it completes the last request it has. It's
cheaper to do in the block layer.

-- 
Jens Axboe



* Re: Runtime power management during system resume
  2010-08-24 15:17         ` Raj Kumar
@ 2010-08-24 17:50           ` Alan Stern
  2010-08-25 13:27           ` Raj Kumar
  1 sibling, 0 replies; 24+ messages in thread
From: Alan Stern @ 2010-08-24 17:50 UTC (permalink / raw)
  To: Raj Kumar; +Cc: linux-pm

On Tue, 24 Aug 2010, Raj Kumar wrote:

> Hi,
>  
> My questions are:
>  
> 1) When the system suspends, pm_runtime_put_sync is invoked when the
> complete callback is called, which in turn tries to idle the device.
>  
> Does that mean the runtime power management core sets the device status
> to RPM_SUSPENDED?

If the runtime_idle callback routine decides to call
pm_runtime_suspend() then the status will be set to RPM_SUSPENDED.  
Otherwise the status will remain RPM_ACTIVE.
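
To make the put/idle/suspend sequence concrete, here is a simplified
userspace sketch. It only models the behaviour described above; the
toy_* names are invented for the example, and the idle_wants_suspend
hook stands in for a driver's runtime_idle callback deciding whether to
call pm_runtime_suspend().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; not the PM core's data structures. */
enum toy_rpm_status { TOY_RPM_ACTIVE, TOY_RPM_SUSPENDED };

struct toy_device {
	int usage_count;
	enum toy_rpm_status status;
	bool io_pending;
	/* Models the runtime_idle decision: true means "suspend now". */
	bool (*idle_wants_suspend)(const struct toy_device *dev);
};

/* Example policy: suspend only when no I/O is outstanding. */
static bool toy_suspend_when_quiet(const struct toy_device *dev)
{
	return !dev->io_pending;
}

/* Take a reference and make sure the device is active. */
static void toy_get(struct toy_device *dev)
{
	dev->usage_count++;
	dev->status = TOY_RPM_ACTIVE;
}

/*
 * Drop a reference. Only when the count reaches zero is the idle hook
 * consulted, and only if it agrees does the status change.
 */
static void toy_put_sync(struct toy_device *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count > 0)
		return;
	if (dev->idle_wants_suspend != NULL && dev->idle_wants_suspend(dev))
		dev->status = TOY_RPM_SUSPENDED;
}
```

So a put that drops the count to zero does not by itself imply
RPM_SUSPENDED; in the model, as in the description above, the idle
callback has the final say.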

> 2) As you said, the power_usage counter is not incremented during system
> resume. That is fine; the driver will increase the power_usage counter.
>  
> But should the driver also set the status to RPM_ACTIVE, i.e. call
> pm_runtime_set_active?

The driver is supposed to do that.

> 3) During hibernation, if the driver is registered using
> platform_driver_register, will the power management core call its suspend
> and resume callbacks?

Yes.

Alan Stern


* Re: Runtime PM and the block layer
  2010-08-24 17:09         ` Jens Axboe
@ 2010-08-24 20:06           ` Alan Stern
  2010-08-24 20:10             ` Jens Axboe
  2010-09-27 15:22           ` Alan Stern
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Stern @ 2010-08-24 20:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

On Tue, 24 Aug 2010, Jens Axboe wrote:

> On 2010-08-24 16:42, Alan Stern wrote:
> >> Sounds like all you need is a way to return BLKPREP_DEFER_AND_STOP and
> >> have the block layer stop the queue for you. When you need to restart,
> >> you would insert a special request at the head of the queue and call
> >> blk_start_queue() to get things going again.
> > 
> > Yes.
> > 
> > Suppose the driver needs to send two of these special requests before
> > going back to normal operation.  Won't restarting the queue for the
> > first special request also cause the following regular request to be
> > passed to the driver before the second special request can be inserted?  
> > Of course, the driver could cope with this simply by returning another
> > BLKPREP_DEFER_AND_STOP.
> 
> For that special request, you are sure to have some ->end_io() hook to
> know when it's complete. When that triggers, you queue the 2nd special
> request. And so on, for how many you need.

That's not what I meant.  Suppose the driver wants to carry out special
requests A and B before carrying out request R, which is initially at
the head of the queue.  The driver inserts A at the front, calls
blk_start_queue(), and inserts B at the front when A completes.  
What's to prevent the block layer from sending R to the driver while A
is running?

> >> The only missing bit would then be the idle detection. That would need
> >> to be in the block layer itself, and the scheme I described should be
> >> fine for that still.
> > 
> > Are you sure it needs to be in the block layer?  Is there no way for 
> > the driver's completion handler to tell whether the queue is now empty?  
> > Certainly it already has enough information to know whether the device 
> > is still busy processing another request.  When the device is no longer 
> > busy and the queue is empty, that's when the idle timer should be 
> > started or restarted.
> 
> To some extent there is, but there can be context outside of the queue
> it doesn't know about. That is the case for the plugging rework, for
> instance. That also removes the queue_empty() call. Then there's
> blk_fetch_request(), but that may return NULL while there's IO pending
> in the block layer - so not reliable for that either. The block layer is
> tracking this state anyway, if you are leaving it to the driver then it
> would have to check every time it completes the last request it has. It's
> cheaper to do in the block layer.

I see.  You're suggesting we add a new "power_mode" or "queue_idle"  
callback to the request_queue struct, and make the block layer invoke
this callback whenever a request completes and there are no other
requests pending or in flight.  Right?  And similarly, invoke the
callback (with a different argument) when the first request gets added
to an otherwise empty queue.

That would suit my needs.
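
Something along these lines, sketched in userspace. The queue_idle
member, the event enum and every toy_* name are hypothetical; nothing
like this exists in struct request_queue today. The callback fires only
on the two transitions: empty to busy, and busy to fully idle.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; not struct request_queue. */
enum toy_queue_event { TOY_QUEUE_BUSY, TOY_QUEUE_IDLE };

struct toy_queue;
typedef void (*toy_queue_idle_fn)(struct toy_queue *q,
				  enum toy_queue_event ev);

struct toy_queue {
	unsigned int pending;		/* queued, not yet started */
	unsigned int in_flight;		/* started, not yet completed */
	toy_queue_idle_fn queue_idle;	/* hypothetical driver callback */
	void *driver_data;
};

struct toy_event_log {
	unsigned int busy;
	unsigned int idle;
};

/* Example driver callback: just count the transitions. */
static void toy_count_events(struct toy_queue *q, enum toy_queue_event ev)
{
	struct toy_event_log *log = q->driver_data;

	if (ev == TOY_QUEUE_BUSY)
		log->busy++;
	else
		log->idle++;
}

/* First request added to an otherwise empty queue: report BUSY. */
static void toy_add_request(struct toy_queue *q)
{
	if (q->pending + q->in_flight == 0 && q->queue_idle != NULL)
		q->queue_idle(q, TOY_QUEUE_BUSY);
	q->pending++;
}

static void toy_start_request(struct toy_queue *q)
{
	assert(q->pending > 0);
	q->pending--;
	q->in_flight++;
}

/* Last request done with nothing pending or in flight: report IDLE. */
static void toy_complete_request(struct toy_queue *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
	if (q->pending + q->in_flight == 0 && q->queue_idle != NULL)
		q->queue_idle(q, TOY_QUEUE_IDLE);
}
```

A driver would then start or restart its idle timer on the IDLE event
and request a runtime resume on the BUSY event.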

Alan Stern



* Re: Runtime PM and the block layer
  2010-08-24 20:06           ` Alan Stern
@ 2010-08-24 20:10             ` Jens Axboe
  2010-08-24 21:09               ` Alan Stern
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2010-08-24 20:10 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kernel development list

On 08/24/2010 10:06 PM, Alan Stern wrote:
> On Tue, 24 Aug 2010, Jens Axboe wrote:
> 
>> On 2010-08-24 16:42, Alan Stern wrote:
>>>> Sounds like all you need is a way to return BLKPREP_DEFER_AND_STOP and
>>>> have the block layer stop the queue for you. When you need to restart,
>>>> you would insert a special request at the head of the queue and call
>>>> blk_start_queue() to get things going again.
>>>
>>> Yes.
>>>
>>> Suppose the driver needs to send two of these special requests before
>>> going back to normal operation.  Won't restarting the queue for the
>>> first special request also cause the following regular request to be
>>> passed to the driver before the second special request can be inserted?  
>>> Of course, the driver could cope with this simply by returning another
>>> BLKPREP_DEFER_AND_STOP.
>>
>> For that special request, you are sure to have some ->end_io() hook to
>> know when it's complete. When that triggers, you queue the 2nd special
>> request. And so on, for how many you need.
> 
> That's not what I meant.  Suppose the driver wants to carry out special
> requests A and B before carrying out request R, which is initially at
> the head of the queue.  The driver inserts A at the front, calls
> blk_start_queue(), and inserts B at the front when A completes.  
> What's to prevent the block layer from sending R to the driver while A
> is running?

Nothing; you will have to maintain that state and defer when
appropriate. That should happen automatically, since you would not be
switching your state to running until request B has completed anyway.
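
A small userspace sketch of that driver-side state tracking, assuming
a three-state device and a counter of outstanding special requests.
All names here (model_dev, model_prep, PREP_DEFER and so on) are
hypothetical; PREP_DEFER merely stands in for whatever "defer" value
the prep function would return.

```c
/* Hypothetical model of the driver-side state described above. */
enum dev_state { DEV_SUSPENDED, DEV_RESUMING, DEV_RUNNING };
enum prep_result { PREP_OK, PREP_DEFER };

struct model_dev {
	enum dev_state state;
	int specials_left;	/* special requests (A, B, ...) not yet done */
};

/* Start a resume that needs nr_specials special requests in sequence. */
static void model_begin_resume(struct model_dev *d, int nr_specials)
{
	d->specials_left = nr_specials;
	d->state = nr_specials > 0 ? DEV_RESUMING : DEV_RUNNING;
}

/* Prep step: special requests always pass, normal requests (such as R)
 * are deferred until the device is back in the running state. */
static enum prep_result model_prep(const struct model_dev *d, int is_special)
{
	if (is_special || d->state == DEV_RUNNING)
		return PREP_OK;
	return PREP_DEFER;
}

/* Completion hook of a special request: only the last one switches
 * the device to running; earlier ones would queue the next special. */
static void model_special_done(struct model_dev *d)
{
	if (d->state != DEV_RESUMING || d->specials_left == 0)
		return;
	if (--d->specials_left == 0)
		d->state = DEV_RUNNING;
}
```

In this model R is deferred both while A is running and while B is
running, and passes only after B's completion hook has run.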

>>>> The only missing bit would then be the idle detection. That would need
>>>> to be in the block layer itself, and the scheme I described should be
>>>> fine for that still.
>>>
>>> Are you sure it needs to be in the block layer?  Is there no way for 
>>> the driver's completion handler to tell whether the queue is now empty?  
>>> Certainly it already has enough information to know whether the device 
>>> is still busy processing another request.  When the device is no longer 
>>> busy and the queue is empty, that's when the idle timer should be 
>>> started or restarted.
>>
>> To some extent there is, but there can be context outside of the queue
>> it doesn't know about. That is the case for the plugging rework, for
>> instance. That also removes the queue_empty() call. Then there's
>> blk_fetch_request(), but that may return NULL while there's IO pending
>> in the block layer - so not reliable for that either. The block layer is
>> tracking this state anyway, if you are leaving it to the driver then it
>> would have to check every time it completes the last request it has. It's
>> cheaper to do in the block layer.
> 
> I see.  You're suggesting we add a new "power_mode" or "queue_idle"  
> callback to the request_queue struct, and make the block layer invoke
> this callback whenever a request completes and there are no other
> requests pending or in flight.  Right?  And similarly, invoke the
> callback (with a different argument) when the first request gets added
> to an otherwise empty queue.
> 
> That would suit my needs.

Yep, that is what I'm suggesting.

-- 
Jens Axboe


* Re: Runtime PM and the block layer
  2010-08-24 20:10             ` Jens Axboe
@ 2010-08-24 21:09               ` Alan Stern
  0 siblings, 0 replies; 24+ messages in thread
From: Alan Stern @ 2010-08-24 21:09 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

On Tue, 24 Aug 2010, Jens Axboe wrote:

> > I see.  You're suggesting we add a new "power_mode" or "queue_idle"  
> > callback to the request_queue struct, and make the block layer invoke
> > this callback whenever a request completes and there are no other
> > requests pending or in flight.  Right?  And similarly, invoke the
> > callback (with a different argument) when the first request gets added
> > to an otherwise empty queue.
> > 
> > That would suit my needs.
> 
> Yep, that is what I'm suggesting.

All right, I'll work on this and get back to you when I need more help.  
Thanks for the advice.

Alan Stern


* Re: Runtime power management during system resume
  2010-08-24 15:17         ` Raj Kumar
  2010-08-24 17:50           ` Alan Stern
@ 2010-08-25 13:27           ` Raj Kumar
  2010-08-25 14:51             ` Alan Stern
  2010-08-26 13:40             ` Raj Kumar
  1 sibling, 2 replies; 24+ messages in thread
From: Raj Kumar @ 2010-08-25 13:27 UTC (permalink / raw)
  To: stern, linux-pm


Hi Alan,

Thanks for the quick replies. As you said:

> >> 2) Suppose device is active, means its power_usage counter is already one, Now during system
> >> sleep, does the driver first suspend it with run time power management core and then continue
> >> System suspend?
> >
> > No.

But during system suspend the power_usage counter is incremented by 1.
If the device is active, meaning its state is RPM_ACTIVE, its
power_usage counter is already 1 in the runtime power management core.

When the system suspend happens, the counter is incremented by 1
during dpm_prepare, so the power_usage counter is now 2.

When dpm_complete is invoked, it decrements the power_usage counter
by 1.

So once the driver has been through the system suspend, its
power_usage counter is still 1.

In this scenario, does the device decrement the power_usage counter
itself?

Thanks
Raj


* Re: Runtime power management during system resume
  2010-08-25 13:27           ` Raj Kumar
@ 2010-08-25 14:51             ` Alan Stern
  2010-08-26 13:40             ` Raj Kumar
  1 sibling, 0 replies; 24+ messages in thread
From: Alan Stern @ 2010-08-25 14:51 UTC (permalink / raw)
  To: Raj Kumar; +Cc: linux-pm

On Wed, 25 Aug 2010, Raj Kumar wrote:

> But during system suspend the power_usage counter is incremented by 1.
> If the device is active, meaning its state is RPM_ACTIVE, its
> power_usage counter is already 1 in the runtime power management core.
>
> When the system suspend happens, the counter is incremented by 1
> during dpm_prepare, so the power_usage counter is now 2.
>
> When dpm_complete is invoked, it decrements the power_usage counter
> by 1.
>
> So once the driver has been through the system suspend, its
> power_usage counter is still 1.
>
> In this scenario, does the device decrement the power_usage counter
> itself?

Firstly, the _device_ can't change the power_usage counter.  Only the
_driver_ can.  You seem to keep forgetting this point; you need to keep
it straight:

		Driver != Device

Secondly, after the system suspend the power_usage counter has the same
value as it did before.  In your case the power_usage counter was 1
before the system suspend and it is 1 after the system suspend.  Whatever
routine was responsible for setting the counter to 1 originally will
also be responsible for decrementing the counter to 0 some time later.
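
The bookkeeping can be sketched in a few lines of userspace C.  This
is only a model: struct model_power and the model_* helpers are
hypothetical names, and the increment in the prepare phase and the
decrement in the complete phase are taken from the description quoted
above.

```c
/* Hypothetical stand-in for the per-device usage counter. */
struct model_power {
	int usage_count;
};

static void model_get(struct model_power *p)
{
	p->usage_count++;
}

static void model_put(struct model_power *p)
{
	if (p->usage_count > 0)
		p->usage_count--;
}

/* One full system sleep cycle as seen by the counter.  Returns the
 * value the counter has while the system is asleep. */
static int model_system_sleep_cycle(struct model_power *p)
{
	int during_sleep;

	model_get(p);			/* PM core, prepare phase */
	during_sleep = p->usage_count;
	model_put(p);			/* PM core, complete phase */
	return during_sleep;
}
```

If the driver took one reference before the sleep, the counter reads
2 during the sleep and 1 again afterwards; the driver's own put is
still what brings it back to 0.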

Alan Stern

* Re: Runtime power management during system resume
  2010-08-25 13:27           ` Raj Kumar
  2010-08-25 14:51             ` Alan Stern
@ 2010-08-26 13:40             ` Raj Kumar
  2010-08-26 14:33               ` Alan Stern
  2010-09-18 11:49               ` (no subject) Raj Kumar
  1 sibling, 2 replies; 24+ messages in thread
From: Raj Kumar @ 2010-08-26 13:40 UTC (permalink / raw)
  To: stern, linux-pm


Hi Alan,

I have a few more questions:

1) If a platform device driver uses platform_bus_type as its bus type,
is the parent of the device its bus?

2) I saw in the code for platform_device_register that when any
platform device is registered, its parent is set to the platform bus.
Right?

int platform_device_add(struct platform_device *pdev)
{
        ...
        if (!pdev->dev.parent)
                pdev->dev.parent = &platform_bus;
        ...
}

So when the platform device wants to use a parent other than the
platform bus, is it possible to set the parent of the platform device
to some other device rather than platform_bus?

3) The third question is about this function:

int __pm_runtime_resume(struct device *dev, bool from_wq)
        __releases(&dev->power.lock) __acquires(&dev->power.lock)
{
        ...
        if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume) {
                spin_unlock_irq(&dev->power.lock);

                retval = dev->bus->pm->runtime_resume(dev);

                spin_lock_irq(&dev->power.lock);
                dev->power.runtime_error = retval;
        }
        ...
 out:
        if (parent) {
                spin_unlock_irq(&dev->power.lock);

                pm_runtime_put(parent);

                spin_lock_irq(&dev->power.lock);
        }
}

When the device is resumed and its status is set to RPM_ACTIVE, control
reaches pm_runtime_put(parent) (suppose the device resumed correctly).

That decrements the parent's power usage count and calls idle for the
parent.

Why, after resuming the device, does it try to schedule idle for the
parent?  Since the device has been resumed, its parent should be active.

Regards
Raj


* Re: Runtime power management during system resume
  2010-08-26 13:40             ` Raj Kumar
@ 2010-08-26 14:33               ` Alan Stern
  2010-09-18 11:49               ` (no subject) Raj Kumar
  1 sibling, 0 replies; 24+ messages in thread
From: Alan Stern @ 2010-08-26 14:33 UTC (permalink / raw)
  To: Raj Kumar; +Cc: Linux-pm mailing list

On Thu, 26 Aug 2010, Raj Kumar wrote:

> Hi Alan,

You know, it would be a lot easier to reply to your emails if you
didn't put so many blank lines in them and you told your email client
to wrap lines after 72 columns or so.

> I have a few more questions:
>
> 1) If a platform device driver uses platform_bus_type as its bus type,
> is the parent of the device its bus?

Maybe yes, maybe no.

> 2) I saw in the code for platform_device_register that when any
> platform device is registered, its parent is set to the platform bus.
> Right?

No.  Look at the code again:

> int platform_device_add(struct platform_device *pdev)
> {
>         ...
>         if (!pdev->dev.parent)
>                 pdev->dev.parent = &platform_bus;
>         ...
> }

The parent is set to platform_bus _only_ if the parent wasn't already
set.

> So when the platform device wants to use a parent other than the
> platform bus, is it possible to set the parent of the platform device
> to some other device rather than platform_bus?

You're making that same mistake again.  The _device_ doesn't get to 
choose what the parent is; the _driver_ does.

Repeat after me:

		Driver != Device

You need to learn that.  If you can't remember the distinction between 
a device and a driver then you will never be able to write kernel code.

Getting back to your question: Of course it is possible.  The driver 
merely has to set pdev->dev.parent before calling 
platform_device_add().
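
The "only if not already set" rule is easy to see in a userspace
model.  The names below (struct model_device, model_platform_bus,
model_platform_device_add) are hypothetical stand-ins written for this
sketch, not the kernel's own definitions; only the if-test is copied
from the code quoted above.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct device. */
struct model_device {
	const char *name;
	struct model_device *parent;
};

/* Stand-in for the default parent of platform devices. */
static struct model_device model_platform_bus = { "platform", NULL };

/* Same rule as the quoted code: fill in the default parent only when
 * the caller has not chosen one already. */
static void model_platform_device_add(struct model_device *dev)
{
	if (!dev->parent)
		dev->parent = &model_platform_bus;
}
```

A driver that assigns the parent pointer before the add call keeps its
choice; one that leaves it NULL ends up under the default parent.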

> 3) The third question is about this function:
>
> int __pm_runtime_resume(struct device *dev, bool from_wq)
>         __releases(&dev->power.lock) __acquires(&dev->power.lock)
> {
>         ...
>         if (dev->bus && dev->bus->pm && dev->bus->pm->runtime_resume) {
>                 spin_unlock_irq(&dev->power.lock);
>
>                 retval = dev->bus->pm->runtime_resume(dev);
>
>                 spin_lock_irq(&dev->power.lock);
>                 dev->power.runtime_error = retval;
>         }
>         ...
>  out:
>         if (parent) {
>                 spin_unlock_irq(&dev->power.lock);
>
>                 pm_runtime_put(parent);
>
>                 spin_lock_irq(&dev->power.lock);
>         }
> }
>
> When the device is resumed and its status is set to RPM_ACTIVE, control
> reaches pm_runtime_put(parent) (suppose the device resumed correctly).
>
> That decrements the parent's power usage count and calls idle for the
> parent.

Be careful.  This calls pm_runtime_idle() for the parent, but
pm_runtime_idle() probably won't call the runtime_idle callback for the
parent.  You'll see why if you read __pm_runtime_idle();  
pm_children_suspended(parent) will return 0 unless
parent->power.ignore_children is set.

> Why, after resuming the device, does it try to schedule idle for the
> parent?  Since the device has been resumed, its parent should be active.

Of course the parent is active.  That's why pm_runtime_idle() is
called; only active devices get idle callbacks.  The opposite of
"active" is "suspended" -- obviously we don't want to make idle
callbacks for suspended devices!

It is possible for an active device to have an idle or suspended
parent.  This can happen if the parent's power.ignore_children flag is
set.  For example, consider a situation where the device remains at 
full power but the link between it and the computer has been powered 
down.  The device is still active, but its parent (the link) is 
suspended.
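
To make the role of ignore_children concrete, here is a simplified
userspace model.  The structure and function names are hypothetical,
and the idle check is reduced to the two conditions discussed here
(usage count and children); the real idle path checks more than that.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-in for a device's runtime PM fields. */
struct model_pm {
	bool ignore_children;	/* models power.ignore_children */
	int active_children;	/* children that are not suspended */
	int usage_count;	/* outstanding "get" references */
};

/* With ignore_children set, the children are treated as suspended no
 * matter what state they are really in. */
static bool model_children_suspended(const struct model_pm *p)
{
	return p->ignore_children || p->active_children == 0;
}

/* Would the idle callback be invoked for this (active) parent? */
static bool model_idle_callback_runs(const struct model_pm *p)
{
	return p->usage_count == 0 && model_children_suspended(p);
}

/* Tail of a child's resume: drop the reference held on the parent and
 * report whether the parent's idle callback would run as a result. */
static bool model_put_parent(struct model_pm *parent)
{
	if (parent->usage_count > 0)
		parent->usage_count--;
	return model_idle_callback_runs(parent);
}
```

For an ordinary parent with one freshly resumed child the put does not
lead to an idle callback; for a parent with ignore_children set (the
powered-down link in the example above) it does.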

Alan Stern

P.S.: You do not need to include copies of old emails at the bottom of 
your messages.  Please stop doing it.


* Re: Runtime PM and the block layer
  2010-08-24 13:38     ` Runtime PM and the block layer Jens Axboe
  2010-08-24 14:42       ` Alan Stern
@ 2010-08-30 16:32       ` Alan Stern
  1 sibling, 0 replies; 24+ messages in thread
From: Alan Stern @ 2010-08-30 16:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

On Tue, 24 Aug 2010, Jens Axboe wrote:

> > Unless you think it would be better to change the block layer 
> > instead...
> 
> Doing it in the driver is fine. We can always make things more generic
> and share them across drivers if there's sharing to be had there.

After giving this some thought, I have decided that it would be best to
implement much of this in the block layer.  It's a simpler approach and
it offers greater generality.

The changes would be fairly small.  Two additional fields will be added
to struct request_queue: a PM status (active, suspending, suspended,
resuming) and a pointer to the queue's struct device (for carrying out
PM operations).  Actually I'm a little surprised there isn't already a
pointer to the struct device; it seems like a very natural thing to 
have.

There also will be four new functions for drivers/subsystems to call at
the beginning and end of their suspend and resume routines.

> It also means we don't need special request types that are allowed to
> bypass certain queue states, since the driver will track the state and
> know what to defer and what to pass through.

It turns out there already are a couple of special request types for
this: REQ_TYPE_PM_SUSPEND and REQ_TYPE_PM_RESUME.  It's not clear why
two different types are needed, but blk_execute_rq_nowait() contains a 
clue:

	/* the queue is stopped so it won't be plugged+unplugged */
	if (rq->cmd_type == REQ_TYPE_PM_RESUME)
		q->request_fn(q);

The purpose for this is unclear.  It seems to have been added
specifically for the IDE driver, which is the only driver using these
request types.  (In fact, the entire request_pm_state structure also
isn't used anywhere else -- which indicates that it should be defined
in a private header for IDE alone instead of in blkdev.h.)  Maybe it
won't be needed after these changes.

My idea is that a queue shouldn't need to be explicitly stopped when
its device is suspended.  Instead, blk_peek_request() can check the
queue state and simply return NULL if the queue is suspending,
suspended, or resuming and the request type isn't REQ_TYPE_PM_SUSPEND
or _RESUME.  That should work, since blk_peek_request() is the only
path for moving requests from the queue to the driver, right?
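
In userspace terms the check would look roughly like the sketch below.
The enum values, struct model_queue, and model_peek_request() are
hypothetical names made up for the sketch; only the rule itself (hand
out PM requests in any state, normal requests only when the queue is
active) comes from the paragraph above.

```c
#include <stddef.h>

/* Hypothetical queue PM states and request types for this model. */
enum model_rpm { Q_ACTIVE, Q_SUSPENDING, Q_SUSPENDED, Q_RESUMING };
enum model_req_type { REQ_NORMAL, REQ_PM_SUSPEND, REQ_PM_RESUME };

struct model_request {
	enum model_req_type type;
};

struct model_queue {
	enum model_rpm pm_status;
	struct model_request *head;	/* request at the head, or NULL */
};

/* Return the head request unless the queue is not active and the
 * request is an ordinary one; PM requests pass in every state. */
static struct model_request *model_peek_request(struct model_queue *q)
{
	struct model_request *rq = q->head;

	if (rq == NULL)
		return NULL;
	if (q->pm_status != Q_ACTIVE && rq->type == REQ_NORMAL)
		return NULL;
	return rq;
}
```

The queue never has to be stopped explicitly in this model: ordinary
requests simply stay put until pm_status returns to Q_ACTIVE.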

> The only missing bit would then be the idle detection. That would need
> to be in the block layer itself, and the scheme I described should be
> fine for that still.

This is where I will need help.  From what I gather, a request's path
through the block layer starts at __elv_add_request() and ends at
blk_finish_request().  Updating a counter at these points should be
good enough -- except for elv_merge() and possibly other things I don't
know about.  Not to mention any changes you may be planning.

Basically I just need to call some new routines when a request is first
added to an idle queue and when a queue becomes idle because the last
request has completed.  Can you suggest the best way to do this?

Alan Stern


* (no subject)
  2010-08-26 13:40             ` Raj Kumar
  2010-08-26 14:33               ` Alan Stern
@ 2010-09-18 11:49               ` Raj Kumar
  2010-09-18 15:36                 ` your mail Alan Stern
  2010-10-05 21:40                 ` Question about hibernation Raj Kumar
  1 sibling, 2 replies; 24+ messages in thread
From: Raj Kumar @ 2010-09-18 11:49 UTC (permalink / raw)
  To: stern, linux-pm


Hi Alan,

I have a question about the CPU frequency subsystem, through which the
frequency of the CPU is scaled.

Suppose we have a device driver (device X contains a processor) that
wants to scale its own processor based on the workload.  A DVFS driver
for this processor is implemented and registered with the cpufreq
subsystem.  When the X device driver detects a workload, can it call
the policy governor APIs to scale the clock, or can it call the DVFS
driver APIs directly?

I just want to know how our device driver should call the DVFS driver
when the DVFS driver is registered with the cpufreq subsystem.

Regards
Raj


* Re: your mail
  2010-09-18 11:49               ` (no subject) Raj Kumar
@ 2010-09-18 15:36                 ` Alan Stern
  2010-09-18 15:56                   ` Dominik Brodowski
  2010-10-05 21:40                 ` Question about hibernation Raj Kumar
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Stern @ 2010-09-18 15:36 UTC (permalink / raw)
  To: Raj Kumar; +Cc: linux-pm

On Sat, 18 Sep 2010, Raj Kumar wrote:

> Hi Alan,
>
> I have a question about the CPU frequency subsystem, through which the
> frequency of the CPU is scaled.
>
> Suppose we have a device driver (device X contains a processor) that
> wants to scale its own processor based on the workload.  A DVFS driver
> for this processor is implemented and registered with the cpufreq
> subsystem.  When the X device driver detects a workload, can it call
> the policy governor APIs to scale the clock, or can it call the DVFS
> driver APIs directly?

I don't know.  I have never used cpufreq and I don't know how it works.

> I just want to know how our device driver should call the DVFS driver
> when the DVFS driver is registered with the cpufreq subsystem.

Then you should ask somebody else.

Alan Stern

* Re: your mail
  2010-09-18 15:36                 ` your mail Alan Stern
@ 2010-09-18 15:56                   ` Dominik Brodowski
  0 siblings, 0 replies; 24+ messages in thread
From: Dominik Brodowski @ 2010-09-18 15:56 UTC (permalink / raw)
  To: Raj Kumar, linux-pm

Hi Raj,

On Sat, Sep 18, 2010 at 11:36:16AM -0400, Alan Stern wrote:
> > I have a question about the CPU frequency subsystem, through which the
> > frequency of the CPU is scaled.
> >
> > Suppose we have a device driver (device X contains a processor) that
> > wants to scale its own processor based on the workload.  A DVFS driver
> > for this processor is implemented and registered with the cpufreq
> > subsystem.  When the X device driver detects a workload, can it call
> > the policy governor APIs to scale the clock, or can it call the DVFS
> > driver APIs directly?

Best would be to register a cpufreq policy notifier
(cpufreq_register_notifier()) and then call cpufreq_update_policy()
whenever the "X device driver" needs to modify its frequency constraints.

In addition, it would be best to discuss this on the cpufreq mailing list at

cpufreq@vger.kernel.org

Best,
	Dominik

* Re: Runtime PM and the block layer
  2010-08-24 17:09         ` Jens Axboe
  2010-08-24 20:06           ` Alan Stern
@ 2010-09-27 15:22           ` Alan Stern
  1 sibling, 0 replies; 24+ messages in thread
From: Alan Stern @ 2010-09-27 15:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Kernel development list

On Tue, 24 Aug 2010, Jens Axboe wrote:

> > Are you sure it needs to be in the block layer?  Is there no way for 
> > the driver's completion handler to tell whether the queue is now empty?  
> > Certainly it already has enough information to know whether the device 
> > is still busy processing another request.  When the device is no longer 
> > busy and the queue is empty, that's when the idle timer should be 
> > started or restarted.
> 
> To some extent there is, but there can be context outside of the queue
> it doesn't know about. That is the case for the plugging rework, for
> instance. That also removes the queue_empty() call. Then there's
> blk_fetch_request(), but that may return NULL while there's IO pending
> in the block layer - so not reliable for that either. The block layer is
> tracking this state anyway, if you are leaving it to the driver then it
> > would have to check every time it completes the last request it has. It's
> cheaper to do in the block layer.

I'm going to take your advice.  To make this work, I need to know when 
the number of pending requests increases from 0 and when it drops to 0.

The most direct approach is to keep a count of the number of pending
requests for each request_queue.  As far as I can tell, a request
enters the system in elv_insert(), so that's where the count should be
incremented (except in the ELEVATOR_INSERT_REQUEUE case).  And a
request leaves the system in blk_finish_request(), so that's where the
count should be decremented.
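
As a sanity check on the counting rules, here is a userspace sketch.
The names are hypothetical (model_elv_insert and model_finish_request
only mirror the roles of the functions mentioned above), and the
return values are an addition of the sketch: they flag the two
transitions that matter, from 0 pending to 1 and back to 0.

```c
/* Hypothetical insertion modes; only the requeue case matters here. */
enum model_where { INSERT_FRONT, INSERT_BACK, INSERT_REQUEUE };

struct model_queue {
	unsigned int nr_pending;	/* requests inside the system */
};

/* Count a request entering the system.  A requeued request was
 * counted when it first entered, so it is skipped.  Returns 1 when
 * the queue just went from idle to busy. */
static int model_elv_insert(struct model_queue *q, enum model_where where)
{
	if (where == INSERT_REQUEUE)
		return 0;
	return q->nr_pending++ == 0;
}

/* Count a request leaving the system.  Returns 1 when the queue just
 * became idle. */
static int model_finish_request(struct model_queue *q)
{
	if (q->nr_pending == 0)
		return 0;	/* unbalanced call, ignore */
	return --q->nr_pending == 0;
}
```

Merges are deliberately left out of this model; whether a merged
request needs its own decrement is exactly the open question below.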

Am I missing anything?  What about elv_merge_requests()?

Alan Stern


* Question about hibernation
  2010-09-18 11:49               ` (no subject) Raj Kumar
  2010-09-18 15:36                 ` your mail Alan Stern
@ 2010-10-05 21:40                 ` Raj Kumar
  2010-10-05 22:43                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 24+ messages in thread
From: Raj Kumar @ 2010-10-05 21:40 UTC (permalink / raw)
  To: stern, linux-pm


Hi Alan,

I have a question about hibernation in Linux.

1) In a normal suspend to RAM, is freezing of tasks necessary in the
current Linux power management core?
2) In hibernation, are tasks frozen before the normal suspend/resumes
so that the hibernation image is in sync?

My concern is that part of a device driver may live in user space and
another part in kernel space.  In suspend to RAM and in hibernation, is
all of user space frozen before the kernel side, or does this differ
between suspend to RAM and hibernation?

Regards
Raj


* Re: Question about hibernation
  2010-10-05 21:40                 ` Question about hibernation Raj Kumar
@ 2010-10-05 22:43                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-10-05 22:43 UTC (permalink / raw)
  To: linux-pm

On Tuesday, October 05, 2010, Raj Kumar wrote:
> 
> Hi Alan,
>
> I have a question about hibernation in Linux.
>
> 1) In a normal suspend to RAM, is freezing of tasks necessary in the
> current Linux power management core?

Yes, it is.

> 2) In hibernation, are tasks frozen before the normal suspend/resumes
> so that the hibernation image is in sync?

That depends on what you mean by normal "suspend/resumes".  Generally speaking,
the hibernate code freezes tasks before preallocating image memory.

> My concern is that part of a device driver may live in user space and
> another part in kernel space.  In suspend to RAM and in hibernation, is
> all of user space frozen before the kernel side, or does this differ
> between suspend to RAM and hibernation?

User space is always frozen before suspend and hibernation.

Thanks,
Rafael

Thread overview: 24+ messages
2010-08-23 19:17 Runtime PM and the block layer Alan Stern
2010-08-23 19:53 ` Jens Axboe
2010-08-23 21:51   ` Alan Stern
2010-08-24 13:15     ` Runtime power management during system resume Raj Kumar
2010-08-24 14:30       ` Alan Stern
2010-08-24 15:17         ` Raj Kumar
2010-08-24 17:50           ` Alan Stern
2010-08-25 13:27           ` Raj Kumar
2010-08-25 14:51             ` Alan Stern
2010-08-26 13:40             ` Raj Kumar
2010-08-26 14:33               ` Alan Stern
2010-09-18 11:49               ` (no subject) Raj Kumar
2010-09-18 15:36                 ` your mail Alan Stern
2010-09-18 15:56                   ` Dominik Brodowski
2010-10-05 21:40                 ` Question about hibernation Raj Kumar
2010-10-05 22:43                   ` Rafael J. Wysocki
2010-08-24 13:38     ` Runtime PM and the block layer Jens Axboe
2010-08-24 14:42       ` Alan Stern
2010-08-24 17:09         ` Jens Axboe
2010-08-24 20:06           ` Alan Stern
2010-08-24 20:10             ` Jens Axboe
2010-08-24 21:09               ` Alan Stern
2010-09-27 15:22           ` Alan Stern
2010-08-30 16:32       ` Alan Stern
