linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v2 PATCH 0/4]: block layer runtime pm
@ 2012-05-17 15:48 Lin Ming
  2012-05-17 15:48 ` [RFC v2 PATCH 1/4] block: add a flag to identify PM request Lin Ming
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-17 15:48 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern, Jeff Moyer; +Cc: linux-kernel, linux-pm, linux-scsi

Hi,

In August 2010, Jens and Alan discussed about "Runtime PM and the block
layer". http://marc.info/?t=128259108400001&r=1&w=2

Here are the RFC v2 patches that try to implement the ideas discussed.
And it's a workable version now.
Welcome to give it a try.

The test steps, for example

# ls -l /sys/block/sda
/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda

# echo auto > /sys/devices/pci0000:00/0000:00:1f.2/ata1/power/control
# echo auto > /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/power/control

Then you'll see sda is suspended after 10secs idle.

# cat /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/power/runtime_status
suspended

And if you do some IO, it will resume immediately.

v2:
- remove queue idle timer, use runtime pm core's auto suspend

Lin Ming (4):
      block: add a flag to identify PM request
      block: add queue runtime pm callbacks
      block: implement block layer runtime pm
      [SCSI] sd: change to auto suspend mode

 block/blk-core.c           |   12 +++++++
 block/blk-settings.c       |    8 +++++
 block/elevator.c           |    9 +++++
 drivers/scsi/scsi_lib.c    |   25 +++++++++++++--
 drivers/scsi/scsi_pm.c     |    7 ++--
 drivers/scsi/sd.c          |   72 ++++++++++++++++++++++++++++++++++++--------
 include/linux/blk_types.h  |    2 +
 include/linux/blkdev.h     |    7 ++++
 include/scsi/scsi_device.h |    4 ++
 9 files changed, 127 insertions(+), 19 deletions(-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC v2 PATCH 1/4] block: add a flag to identify PM request
  2012-05-17 15:48 [RFC v2 PATCH 0/4]: block layer runtime pm Lin Ming
@ 2012-05-17 15:48 ` Lin Ming
  2012-05-17 17:46   ` Alan Stern
  2012-05-17 15:48 ` [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks Lin Ming
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Lin Ming @ 2012-05-17 15:48 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern, Jeff Moyer
  Cc: linux-kernel, linux-pm, linux-scsi, Lin Ming

Add a flag REQ_PM to identify the request is PM related.
As an example, modify scsi code to use this flag.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 drivers/scsi/scsi_lib.c    |   25 ++++++++++++++++++++++---
 drivers/scsi/sd.c          |    9 +++++----
 include/linux/blk_types.h  |    2 ++
 include/scsi/scsi_device.h |    4 ++++
 4 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ead6405..5a1eb5a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -256,10 +256,10 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
 EXPORT_SYMBOL(scsi_execute);
 
 
-int scsi_execute_req(struct scsi_device *sdev, const unsigned char *cmd,
+static int __scsi_execute_req(struct scsi_device *sdev, const unsigned char *cmd,
 		     int data_direction, void *buffer, unsigned bufflen,
 		     struct scsi_sense_hdr *sshdr, int timeout, int retries,
-		     int *resid)
+		     int *resid, int flags)
 {
 	char *sense = NULL;
 	int result;
@@ -270,15 +270,34 @@ int scsi_execute_req(struct scsi_device *sdev, const unsigned char *cmd,
 			return DRIVER_ERROR << 24;
 	}
 	result = scsi_execute(sdev, cmd, data_direction, buffer, bufflen,
-			      sense, timeout, retries, 0, resid);
+			      sense, timeout, retries, flags, resid);
 	if (sshdr)
 		scsi_normalize_sense(sense, SCSI_SENSE_BUFFERSIZE, sshdr);
 
 	kfree(sense);
 	return result;
 }
+
+int scsi_execute_req(struct scsi_device *sdev, const unsigned char *cmd,
+		     int data_direction, void *buffer, unsigned bufflen,
+		     struct scsi_sense_hdr *sshdr, int timeout, int retries,
+		     int *resid)
+{
+	return __scsi_execute_req(sdev, cmd, data_direction, buffer, bufflen,
+			      sshdr, timeout, retries, resid, 0);
+}
 EXPORT_SYMBOL(scsi_execute_req);
 
+int scsi_execute_req_flag(struct scsi_device *sdev, const unsigned char *cmd,
+		     int data_direction, void *buffer, unsigned bufflen,
+		     struct scsi_sense_hdr *sshdr, int timeout, int retries,
+		     int *resid, int flags)
+{
+	return __scsi_execute_req(sdev, cmd, data_direction, buffer, bufflen,
+			      sshdr, timeout, retries, resid, flags);
+}
+EXPORT_SYMBOL(scsi_execute_req_flag);
+
 /*
  * Function:    scsi_init_cmd_errh()
  *
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 5ba5c2a..1205a8b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1269,8 +1269,9 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
 		 * Leave the rest of the command zero to indicate
 		 * flush everything.
 		 */
-		res = scsi_execute_req(sdp, cmd, DMA_NONE, NULL, 0, &sshdr,
-				       SD_FLUSH_TIMEOUT, SD_MAX_RETRIES, NULL);
+		res = scsi_execute_req_flag(sdp, cmd, DMA_NONE, NULL, 0, &sshdr,
+				       SD_FLUSH_TIMEOUT, SD_MAX_RETRIES, NULL,
+				       REQ_PM);
 		if (res == 0)
 			break;
 	}
@@ -2812,8 +2813,8 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
 	if (!scsi_device_online(sdp))
 		return -ENODEV;
 
-	res = scsi_execute_req(sdp, cmd, DMA_NONE, NULL, 0, &sshdr,
-			       SD_TIMEOUT, SD_MAX_RETRIES, NULL);
+	res = scsi_execute_req_flag(sdp, cmd, DMA_NONE, NULL, 0, &sshdr,
+			       SD_TIMEOUT, SD_MAX_RETRIES, NULL, REQ_PM);
 	if (res) {
 		sd_printk(KERN_WARNING, sdkp, "START_STOP FAILED\n");
 		sd_print_result(sdkp, res);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 4053cbd..a75f854 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -150,6 +150,7 @@ enum rq_flag_bits {
 	__REQ_FLUSH_SEQ,	/* request for flush sequence */
 	__REQ_IO_STAT,		/* account I/O stat */
 	__REQ_MIXED_MERGE,	/* merge of different types, fail separately */
+	__REQ_PM,		/* runtime pm request */
 	__REQ_NR_BITS,		/* stops here */
 };
 
@@ -191,5 +192,6 @@ enum rq_flag_bits {
 #define REQ_IO_STAT		(1 << __REQ_IO_STAT)
 #define REQ_MIXED_MERGE		(1 << __REQ_MIXED_MERGE)
 #define REQ_SECURE		(1 << __REQ_SECURE)
+#define REQ_PM			(1 << __REQ_PM)
 
 #endif /* __LINUX_BLK_TYPES_H */
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 6efb2e1..7735b46 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -387,6 +387,10 @@ extern int scsi_execute_req(struct scsi_device *sdev, const unsigned char *cmd,
 			    int data_direction, void *buffer, unsigned bufflen,
 			    struct scsi_sense_hdr *, int timeout, int retries,
 			    int *resid);
+extern int scsi_execute_req_flag(struct scsi_device *sdev, const unsigned char *cmd,
+			    int data_direction, void *buffer, unsigned bufflen,
+			    struct scsi_sense_hdr *, int timeout, int retries,
+			    int *resid, int flags);
 
 #ifdef CONFIG_PM_RUNTIME
 extern int scsi_autopm_get_device(struct scsi_device *);
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-17 15:48 [RFC v2 PATCH 0/4]: block layer runtime pm Lin Ming
  2012-05-17 15:48 ` [RFC v2 PATCH 1/4] block: add a flag to identify PM request Lin Ming
@ 2012-05-17 15:48 ` Lin Ming
  2012-05-17 18:29   ` Alan Stern
  2012-05-17 15:48 ` [RFC v2 PATCH 3/4] block: implement block layer runtime pm Lin Ming
  2012-05-17 15:48 ` [RFC v2 PATCH 4/4] [SCSI] sd: change to auto suspend mode Lin Ming
  3 siblings, 1 reply; 22+ messages in thread
From: Lin Ming @ 2012-05-17 15:48 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern, Jeff Moyer
  Cc: linux-kernel, linux-pm, linux-scsi, Lin Ming

Add runtime pm suspend/resume callbacks to request queue.
As an example, implement these callbacks in sd driver.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 block/blk-settings.c   |    8 ++++++++
 drivers/scsi/sd.c      |   21 +++++++++++++++++++++
 include/linux/blkdev.h |    6 ++++++
 3 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index d3234fc..120d9fc 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -99,6 +99,14 @@ void blk_queue_lld_busy(struct request_queue *q, lld_busy_fn *fn)
 }
 EXPORT_SYMBOL_GPL(blk_queue_lld_busy);
 
+void blk_queue_runtime_pm(struct request_queue *q,
+		runtime_pm_fn *suspend_fn, runtime_pm_fn *resume_fn)
+{
+	q->runtime_suspend = suspend_fn;
+	q->runtime_resume = resume_fn;
+}
+EXPORT_SYMBOL_GPL(blk_queue_runtime_pm);
+
 /**
  * blk_set_default_limits - reset limits to default values
  * @lim:  the queue_limits structure to reset
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 1205a8b..0335cde 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -938,6 +938,25 @@ static int sd_prep_fn(struct request_queue *q, struct request *rq)
 	return scsi_prep_return(q, rq, ret);
 }
 
+static int sd_block_runtime_suspend(struct request_queue *q)
+{
+	struct scsi_device *sdp = q->queuedata;
+
+	pm_runtime_mark_last_busy(&sdp->sdev_gendev);
+	pm_runtime_put_autosuspend(&sdp->sdev_gendev);
+
+	return 0;
+}
+
+static int sd_block_runtime_resume(struct request_queue *q)
+{
+	struct scsi_device *sdp = q->queuedata;
+
+	pm_runtime_get(&sdp->sdev_gendev);
+
+	return 0;
+}
+
 /**
  *	sd_open - open a scsi disk device
  *	@inode: only i_rdev member may be used
@@ -2616,6 +2635,8 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 
 	blk_queue_prep_rq(sdp->request_queue, sd_prep_fn);
 	blk_queue_unprep_rq(sdp->request_queue, sd_unprep_fn);
+	blk_queue_runtime_pm(sdp->request_queue,
+		sd_block_runtime_suspend, sd_block_runtime_resume);
 
 	gd->driverfs_dev = &sdp->sdev_gendev;
 	gd->flags = GENHD_FL_EXT_DEVT;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2aa2466..f52f518 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -216,6 +216,7 @@ typedef void (softirq_done_fn)(struct request *);
 typedef int (dma_drain_needed_fn)(struct request *);
 typedef int (lld_busy_fn) (struct request_queue *q);
 typedef int (bsg_job_fn) (struct bsg_job *);
+typedef int (runtime_pm_fn) (struct request_queue *q);
 
 enum blk_eh_timer_return {
 	BLK_EH_NOT_HANDLED,
@@ -289,6 +290,9 @@ struct request_queue {
 	rq_timed_out_fn		*rq_timed_out_fn;
 	dma_drain_needed_fn	*dma_drain_needed;
 	lld_busy_fn		*lld_busy_fn;
+	runtime_pm_fn		*runtime_suspend;
+	runtime_pm_fn		*runtime_resume;
+	int			rpm_status;
 
 	/*
 	 * Dispatch queue sorting
@@ -855,6 +859,8 @@ extern void blk_queue_segment_boundary(struct request_queue *, unsigned long);
 extern void blk_queue_prep_rq(struct request_queue *, prep_rq_fn *pfn);
 extern void blk_queue_unprep_rq(struct request_queue *, unprep_rq_fn *ufn);
 extern void blk_queue_merge_bvec(struct request_queue *, merge_bvec_fn *);
+extern void blk_queue_runtime_pm(struct request_queue *q,
+		runtime_pm_fn *, runtime_pm_fn *);
 extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_softirq_done(struct request_queue *, softirq_done_fn *);
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC v2 PATCH 3/4] block: implement block layer runtime pm
  2012-05-17 15:48 [RFC v2 PATCH 0/4]: block layer runtime pm Lin Ming
  2012-05-17 15:48 ` [RFC v2 PATCH 1/4] block: add a flag to identify PM request Lin Ming
  2012-05-17 15:48 ` [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks Lin Ming
@ 2012-05-17 15:48 ` Lin Ming
  2012-05-17 15:48 ` [RFC v2 PATCH 4/4] [SCSI] sd: change to auto suspend mode Lin Ming
  3 siblings, 0 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-17 15:48 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern, Jeff Moyer
  Cc: linux-kernel, linux-pm, linux-scsi, Lin Ming

A new queue field "nr_pending" is added to record the nummber of
requests pending. When queue becomes empty, an auto delayed suspend is
requested. If the queue continue to be empty for some time then it'll be
actually suspended. The time interval can be set via runtime pm sysfs
interface: autosuspend_delay_ms.

If new request comes and queue goes out of empty, then device is async
resumed.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 block/blk-core.c       |   12 ++++++++++++
 block/elevator.c       |    9 +++++++++
 drivers/scsi/sd.c      |   28 ++++++++++++++++++++++++++++
 include/linux/blkdev.h |    1 +
 4 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1f61b74..b0e381e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1129,6 +1129,12 @@ void __blk_put_request(struct request_queue *q, struct request *req)
 	if (unlikely(--req->ref_count))
 		return;
 
+	/* PM request is not accounted */
+	if (!(req->cmd_flags & REQ_PM)) {
+		if (!(--q->nr_pending) && q->runtime_suspend)
+			q->runtime_suspend(q);
+	}
+
 	elv_completed_request(q, req);
 
 	/* this is a bio leak */
@@ -1917,6 +1923,12 @@ struct request *blk_peek_request(struct request_queue *q)
 	int ret;
 
 	while ((rq = __elv_next_request(q)) != NULL) {
+		/* Only PM request is allowed to go if the queue is suspended */
+		if (q->rpm_status != RPM_ACTIVE && !(rq->cmd_flags & REQ_PM)) {
+			rq = NULL;
+			break;
+		}
+
 		if (!(rq->cmd_flags & REQ_STARTED)) {
 			/*
 			 * This is the first time the device driver
diff --git a/block/elevator.c b/block/elevator.c
index f016855..ac67bff 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -546,6 +546,9 @@ void elv_requeue_request(struct request_queue *q, struct request *rq)
 
 	rq->cmd_flags &= ~REQ_STARTED;
 
+	/* __elv_add_request will increment the count */
+	if (!(rq->cmd_flags & REQ_PM))
+		q->nr_pending--;
 	__elv_add_request(q, rq, ELEVATOR_INSERT_REQUEUE);
 }
 
@@ -587,6 +590,12 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where)
 {
 	trace_block_rq_insert(q, rq);
 
+	if (!(rq->cmd_flags & REQ_PM)) {
+		if (q->nr_pending++ == 0 && q->runtime_resume)
+			/* async resume */
+			q->runtime_resume(q);
+	}
+
 	rq->q = q;
 
 	if (rq->cmd_flags & REQ_SOFTBARRIER) {
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 0335cde..2c3c8e7 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2878,11 +2878,21 @@ static void sd_shutdown(struct device *dev)
 static int sd_suspend(struct device *dev, pm_message_t mesg)
 {
 	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct request_queue *q = sdkp->disk->queue;
 	int ret = 0;
 
 	if (!sdkp)
 		return 0;	/* this can happen */
 
+	spin_lock_irq(q->queue_lock);
+	if (q->nr_pending) {
+		spin_unlock_irq(q->queue_lock);
+		ret = -EBUSY;
+		goto done;
+	}
+	q->rpm_status = RPM_SUSPENDING;
+	spin_unlock_irq(q->queue_lock);
+
 	if (sdkp->WCE) {
 		sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
 		ret = sd_sync_cache(sdkp);
@@ -2893,6 +2903,12 @@ static int sd_suspend(struct device *dev, pm_message_t mesg)
 	if ((mesg.event & PM_EVENT_SLEEP) && sdkp->device->manage_start_stop) {
 		sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n");
 		ret = sd_start_stop_device(sdkp, 0);
+		spin_lock_irq(q->queue_lock);
+		if (!ret)
+			q->rpm_status = RPM_SUSPENDED;
+		else
+			q->rpm_status = RPM_ACTIVE;
+		spin_unlock_irq(q->queue_lock);
 	}
 
 done:
@@ -2903,13 +2919,25 @@ static int sd_suspend(struct device *dev, pm_message_t mesg)
 static int sd_resume(struct device *dev)
 {
 	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
+	struct request_queue *q = sdkp->disk->queue;
 	int ret = 0;
 
 	if (!sdkp->device->manage_start_stop)
 		goto done;
 
+	spin_lock_irq(q->queue_lock);
+	q->rpm_status = RPM_RESUMING;
+	spin_unlock_irq(q->queue_lock);
+
 	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
 	ret = sd_start_stop_device(sdkp, 1);
+	if (!ret) {
+		spin_lock_irq(q->queue_lock);
+		q->rpm_status = RPM_ACTIVE;
+		pm_runtime_mark_last_busy(dev);
+		__blk_run_queue(sdkp->disk->queue);
+		spin_unlock_irq(q->queue_lock);
+	}
 
 done:
 	scsi_disk_put(sdkp);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f52f518..bde2021 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -349,6 +349,7 @@ struct request_queue {
 	unsigned int		nr_congestion_on;
 	unsigned int		nr_congestion_off;
 	unsigned int		nr_batching;
+	unsigned int		nr_pending;
 
 	unsigned int		dma_drain_size;
 	void			*dma_drain_buffer;
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC v2 PATCH 4/4] [SCSI] sd: change to auto suspend mode
  2012-05-17 15:48 [RFC v2 PATCH 0/4]: block layer runtime pm Lin Ming
                   ` (2 preceding siblings ...)
  2012-05-17 15:48 ` [RFC v2 PATCH 3/4] block: implement block layer runtime pm Lin Ming
@ 2012-05-17 15:48 ` Lin Ming
  3 siblings, 0 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-17 15:48 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern, Jeff Moyer
  Cc: linux-kernel, linux-pm, linux-scsi, Lin Ming

Now we don't need to do runtime pm get/put on open/release/remove path.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 drivers/scsi/scsi_pm.c |    7 ++++---
 drivers/scsi/sd.c      |   14 +++++---------
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
index c467064..a133345 100644
--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -171,9 +171,10 @@ static int scsi_runtime_idle(struct device *dev)
 
 	/* Insert hooks here for targets, hosts, and transport classes */
 
-	if (scsi_is_sdev_device(dev))
-		err = pm_schedule_suspend(dev, 100);
-	else
+	if (scsi_is_sdev_device(dev)) {
+		pm_runtime_mark_last_busy(dev);
+		err = pm_request_autosuspend(dev);
+	} else
 		err = pm_runtime_suspend(dev);
 	return err;
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2c3c8e7..97b5a67 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -985,10 +985,6 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
 
 	sdev = sdkp->device;
 
-	retval = scsi_autopm_get_device(sdev);
-	if (retval)
-		goto error_autopm;
-
 	/*
 	 * If the device is in error recovery, wait until it is done.
 	 * If the device is offline, then disallow any access to it.
@@ -1033,8 +1029,6 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
 	return 0;
 
 error_out:
-	scsi_autopm_put_device(sdev);
-error_autopm:
 	scsi_disk_put(sdkp);
 	return retval;	
 }
@@ -1069,7 +1063,6 @@ static int sd_release(struct gendisk *disk, fmode_t mode)
 	 * XXX is followed by a "rmmod sd_mod"?
 	 */
 
-	scsi_autopm_put_device(sdev);
 	scsi_disk_put(sdkp);
 	return 0;
 }
@@ -2638,6 +2631,10 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 	blk_queue_runtime_pm(sdp->request_queue,
 		sd_block_runtime_suspend, sd_block_runtime_resume);
 
+	pm_runtime_use_autosuspend(&sdp->sdev_gendev);
+	/* Set to 10 secs for test only, user can reset it via sysfs */
+	pm_runtime_set_autosuspend_delay(&sdp->sdev_gendev, 10 * 1000);
+
 	gd->driverfs_dev = &sdp->sdev_gendev;
 	gd->flags = GENHD_FL_EXT_DEVT;
 	if (sdp->removable) {
@@ -2652,7 +2649,7 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 
 	sd_printk(KERN_NOTICE, sdkp, "Attached SCSI %sdisk\n",
 		  sdp->removable ? "removable " : "");
-	scsi_autopm_put_device(sdp);
+	pm_runtime_put_sync_autosuspend(&sdp->sdev_gendev);
 	put_device(&sdkp->dev);
 }
 
@@ -2776,7 +2773,6 @@ static int sd_remove(struct device *dev)
 	struct scsi_disk *sdkp;
 
 	sdkp = dev_get_drvdata(dev);
-	scsi_autopm_get_device(sdkp->device);
 
 	async_synchronize_full();
 	blk_queue_prep_rq(sdkp->device->request_queue, scsi_prep_fn);
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 1/4] block: add a flag to identify PM request
  2012-05-17 15:48 ` [RFC v2 PATCH 1/4] block: add a flag to identify PM request Lin Ming
@ 2012-05-17 17:46   ` Alan Stern
  0 siblings, 0 replies; 22+ messages in thread
From: Alan Stern @ 2012-05-17 17:46 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Thu, 17 May 2012, Lin Ming wrote:

> Add a flag REQ_PM to identify the request is PM related.
> As an example, modify scsi code to use this flag.

Some systems need to use the SCSI error handler during suspend and 
resume.  You'll need to mark every command sent by the error handler 
with the REQ_PM flag.

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-17 15:48 ` [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks Lin Ming
@ 2012-05-17 18:29   ` Alan Stern
  2012-05-17 18:56     ` Alan Stern
                       ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Alan Stern @ 2012-05-17 18:29 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Thu, 17 May 2012, Lin Ming wrote:

> Add runtime pm suspend/resume callbacks to request queue.
> As an example, implement these callbacks in sd driver.

This is not the way to do it.  The block subsystem should not use 
suspend/resume callbacks.

Instead, there should be block functions that can be called by client
drivers: block_pre_runtime_suspend, block_post_runtime_suspend,
bock_pre_runtime_resume, and block_post_runtime_resume.

They should do something like this:

	block_pre_runtime_suspend:
		If any requests are in the queue, return -EBUSY.
		Otherwise set q->rpm_status to RPM_SUSPENDING and
		return 0.

	block_post_runtime_suspend:
		If the suspend succeeded then set q->rpm_status to 
		RPM_SUSPENDED.  Otherwise set it to RPM_ACTIVE and
		call pm_runtime_mark_last_busy().

	block_pre_runtime_resume:
		Set q->rpm_status to RPM_RESUMING.

	block_post_runtime_resume:
		If the resume succeeded then set q->rpm_status to
		RPM_ACTIVE and call pm_runtime_mark_last_busy() and
		pm_runtime_request_autosuspend().
		Otherwise set q->rpm_status to RPM_SUSPENDED.

There should also be an initialization function for client drivers to
call.  block_runtime_pm_init() should call pm_runtime_mark_last_busy(),
pm_runtime_use_autosuspend(), and pm_runtime_autosuspend().

Next, you have to modify the parts of the block layer that run when a 
new request is added to the queue or a request is removed.

	When a request is added:
		If q->rpm_status is RPM_SUSPENDED, or if q->rpm_status
		is RPM_SUSPENDING and the REQ_PM flag isn't set, call
		pm_runtime_request_resume().

	When a request finishes:
		Call pm_runtime_mark_last_busy().

Next, you have to change the parts of the block layer responsible for
taking a request from the queue and handing it to the lower-level
driver (both peek and get).  If q->rpm_status is RPM_SUSPENDED, they
shouldn't do anything -- act as though the queue is empty.  If
q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they should hand over
the request only if it has the REQ_PM flag set.

For this to work, the block layer has to know what struct device
pointer to pass to the pm_runtime_* routines.  You'll have to add that
information to the request_queue structure; I guess q->dev can get set
by block_pm_runtime_init().  In fact, when that's done you won't need
q->rpm_status any more.  You'll be able to use q->dev->power.rpm_status
directly, and you won't have to update it because the PM core does that
for you.

(Or maybe it would be easier to make q->rpm_status be a pointer to 
q->dev->power.rpm_status.  That way, if CONFIG_PM_RUNTIME isn't enabled 
or block_runtime_pm_init() hasn't been called, you can have 
q->rpm_status simply point to a static value that is permanently set to 
RPM_ACTIVE.)

I may have left some parts out from this brief description.  Hopefully 
you'll be able to figure out the general idea and get it to work.

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-17 18:29   ` Alan Stern
@ 2012-05-17 18:56     ` Alan Stern
  2012-05-18 14:53     ` Lin Ming
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Alan Stern @ 2012-05-17 18:56 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Thu, 17 May 2012, Alan Stern wrote:

> On Thu, 17 May 2012, Lin Ming wrote:
> 
> > Add runtime pm suspend/resume callbacks to request queue.
> > As an example, implement these callbacks in sd driver.
> 
> This is not the way to do it.  The block subsystem should not use 
> suspend/resume callbacks.
> 
> Instead, there should be block functions that can be called by client
> drivers: block_pre_runtime_suspend, block_post_runtime_suspend,
> bock_pre_runtime_resume, and block_post_runtime_resume.
> 
> They should do something like this:
> 
> 	block_pre_runtime_suspend:
> 		If any requests are in the queue, return -EBUSY.
> 		Otherwise set q->rpm_status to RPM_SUSPENDING and
> 		return 0.
> 
> 	block_post_runtime_suspend:
> 		If the suspend succeeded then set q->rpm_status to 
> 		RPM_SUSPENDED.  Otherwise set it to RPM_ACTIVE and
> 		call pm_runtime_mark_last_busy().
> 
> 	block_pre_runtime_resume:
> 		Set q->rpm_status to RPM_RESUMING.
> 
> 	block_post_runtime_resume:
> 		If the resume succeeded then set q->rpm_status to
> 		RPM_ACTIVE and call pm_runtime_mark_last_busy() and
> 		pm_runtime_request_autosuspend().
> 		Otherwise set q->rpm_status to RPM_SUSPENDED.
> 
> There should also be an initialization function for client drivers to
> call.  block_runtime_pm_init() should call pm_runtime_mark_last_busy(),
> pm_runtime_use_autosuspend(), and pm_runtime_autosuspend().
> 
> Next, you have to modify the parts of the block layer that run when a 
> new request is added to the queue or a request is removed.
> 
> 	When a request is added:
> 		If q->rpm_status is RPM_SUSPENDED, or if q->rpm_status
> 		is RPM_SUSPENDING and the REQ_PM flag isn't set, call
> 		pm_runtime_request_resume().
> 
> 	When a request finishes:
> 		Call pm_runtime_mark_last_busy().
> 
> Next, you have to change the parts of the block layer responsible for
> taking a request from the queue and handing it to the lower-level
> driver (both peek and get).  If q->rpm_status is RPM_SUSPENDED, they
> shouldn't do anything -- act as though the queue is empty.  If
> q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they should hand over
> the request only if it has the REQ_PM flag set.
> 
> For this to work, the block layer has to know what struct device
> pointer to pass to the pm_runtime_* routines.  You'll have to add that
> information to the request_queue structure; I guess q->dev can get set
> by block_pm_runtime_init().  In fact, when that's done you won't need
> q->rpm_status any more.  You'll be able to use q->dev->power.rpm_status
> directly, and you won't have to update it because the PM core does that
> for you.
> 
> (Or maybe it would be easier to make q->rpm_status be a pointer to 
> q->dev->power.rpm_status.  That way, if CONFIG_PM_RUNTIME isn't enabled 
> or block_runtime_pm_init() hasn't been called, you can have 
> q->rpm_status simply point to a static value that is permanently set to 
> RPM_ACTIVE.)
> 
> I may have left some parts out from this brief description.  Hopefully 
> you'll be able to figure out the general idea and get it to work.

Oh, yes -- I knew I had forgotten something.

The client driver has to be changed.  That means scsi_sysfs_add_sdev() 
should call block_pm_runtime_init().

scsi_runtime_suspend() should call block_pre_runtime_suspend() and fail
if that routine returns an error.  Then after calling
scsi_dev_type_suspend(), it should call block_post_runtime_suspend() 
to let it know whether the suspend operation succeeded.

Similarly, scsi_runtime_resume() should call block_pre_runtime_resume() 
before scsi_dev_type_resume() and block_post_runtime_resume() 
afterward.

Once that's done, you can remove the scsi_autpm stuff from sd_open() 
and sd_release().

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-17 18:29   ` Alan Stern
  2012-05-17 18:56     ` Alan Stern
@ 2012-05-18 14:53     ` Lin Ming
  2012-05-21 14:41     ` Lin Ming
  2012-05-22  6:12     ` Lin Ming
  3 siblings, 0 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-18 14:53 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Thu, 2012-05-17 at 14:29 -0400, Alan Stern wrote:

[ ... ]

> 
> I may have left some parts out from this brief description.  Hopefully 
> you'll be able to figure out the general idea and get it to work.

Thanks for the detail idea.

Let me look at it closely first.

> 
> Alan Stern
> 



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-17 18:29   ` Alan Stern
  2012-05-17 18:56     ` Alan Stern
  2012-05-18 14:53     ` Lin Ming
@ 2012-05-21 14:41     ` Lin Ming
  2012-05-21 15:00       ` Alan Stern
  2012-05-22  6:12     ` Lin Ming
  3 siblings, 1 reply; 22+ messages in thread
From: Lin Ming @ 2012-05-21 14:41 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Fri, May 18, 2012 at 2:29 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
[snip]
> I may have left some parts out from this brief description.  Hopefully
> you'll be able to figure out the general idea and get it to work.

All journal threads and flusher thread of the disk need to be freezed
before suspend and thaw after resume.

>
> Alan Stern

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-21 14:41     ` Lin Ming
@ 2012-05-21 15:00       ` Alan Stern
  2012-05-21 15:14         ` Lin Ming
  0 siblings, 1 reply; 22+ messages in thread
From: Alan Stern @ 2012-05-21 15:00 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=UTF-8, Size: 532 bytes --]

On Mon, 21 May 2012, Lin Ming wrote:

> On Fri, May 18, 2012 at 2:29 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> [snip]
> > I may have left some parts out from this brief description.  Hopefully
> > you'll be able to figure out the general idea and get it to work.
> 
> All journal threads and flusher thread of the disk need to be freezed
> before suspend and thaw after resume.

Why?  If any of those threads needs to write something to the disk 
while the disk is suspended, the disk will simply be resumed.

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-21 15:00       ` Alan Stern
@ 2012-05-21 15:14         ` Lin Ming
  2012-05-21 18:32           ` Alan Stern
  2012-05-24 10:19           ` Lin Ming
  0 siblings, 2 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-21 15:14 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Mon, May 21, 2012 at 11:00 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Mon, 21 May 2012, Lin Ming wrote:
>
>> On Fri, May 18, 2012 at 2:29 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> [snip]
>> > I may have left some parts out from this brief description.  Hopefully
>> > you'll be able to figure out the general idea and get it to work.
>>
>> All journal threads and flusher thread of the disk need to be freezed
>> before suspend and thaw after resume.
>
> Why?  If any of those threads needs to write something to the disk
> while the disk is suspended, the disk will simply be resumed.

When tested the patches, I found that kjournald and flusher thread
frequently resume the disk.

I'm not familiar with journal.
Are the journal threads still need to be in active state when the disk
is already suspended?

>
> Alan Stern

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-21 15:14         ` Lin Ming
@ 2012-05-21 18:32           ` Alan Stern
  2012-05-24 10:19           ` Lin Ming
  1 sibling, 0 replies; 22+ messages in thread
From: Alan Stern @ 2012-05-21 18:32 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=UTF-8, Size: 1275 bytes --]

On Mon, 21 May 2012, Lin Ming wrote:

> On Mon, May 21, 2012 at 11:00 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Mon, 21 May 2012, Lin Ming wrote:
> >
> >> On Fri, May 18, 2012 at 2:29 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> >> [snip]
> >> > I may have left some parts out from this brief description.  Hopefully
> >> > you'll be able to figure out the general idea and get it to work.
> >>
> >> All journal threads and flusher thread of the disk need to be freezed
> >> before suspend and thaw after resume.
> >
> > Why?  If any of those threads needs to write something to the disk
> > while the disk is suspended, the disk will simply be resumed.
> 
> When tested the patches, I found that kjournald and flusher thread
> frequently resume the disk.
> 
> I'm not familiar with journal.
> Are the journal threads still need to be in active state when the disk
> is already suspended?

I don't know the answers.  Maybe Jens can tell us.

Ideally, a rotating disk wouldn't be suspended unless all the dirty 
blocks were already flushed and the journal was up-to-date.  For an 
SSD, frequently suspending and resuming doesn't matter quite so much.  
This may come down to a matter of setting the right value for the 
autosuspend timeout.

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-17 18:29   ` Alan Stern
                       ` (2 preceding siblings ...)
  2012-05-21 14:41     ` Lin Ming
@ 2012-05-22  6:12     ` Lin Ming
  2012-05-22 15:56       ` Alan Stern
  2012-06-05 17:27       ` Alan Stern
  3 siblings, 2 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-22  6:12 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Fri, May 18, 2012 at 2:29 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 17 May 2012, Lin Ming wrote:
>
>> Add runtime pm suspend/resume callbacks to request queue.
>> As an example, implement these callbacks in sd driver.
>
> This is not the way to do it.  The block subsystem should not use
> suspend/resume callbacks.
>
> Instead, there should be block functions that can be called by client
> drivers: block_pre_runtime_suspend, block_post_runtime_suspend,
> bock_pre_runtime_resume, and block_post_runtime_resume.
>
> They should do something like this:
>
>        block_pre_runtime_suspend:
>                If any requests are in the queue, return -EBUSY.
>                Otherwise set q->rpm_status to RPM_SUSPENDING and
>                return 0.
>
>        block_post_runtime_suspend:
>                If the suspend succeeded then set q->rpm_status to
>                RPM_SUSPENDED.  Otherwise set it to RPM_ACTIVE and
>                call pm_runtime_mark_last_busy().
>
>        block_pre_runtime_resume:
>                Set q->rpm_status to RPM_RESUMING.
>
>        block_post_runtime_resume:
>                If the resume succeeded then set q->rpm_status to
>                RPM_ACTIVE and call pm_runtime_mark_last_busy() and
>                pm_runtime_request_autosuspend().
>                Otherwise set q->rpm_status to RPM_SUSPENDED.
>
> There should also be an initialization function for client drivers to
> call.  block_runtime_pm_init() should call pm_runtime_mark_last_busy(),
> pm_runtime_use_autosuspend(), and pm_runtime_autosuspend().
>
> Next, you have to modify the parts of the block layer that run when a
> new request is added to the queue or a request is removed.
>
>        When a request is added:
>                If q->rpm_status is RPM_SUSPENDED, or if q->rpm_status
>                is RPM_SUSPENDING and the REQ_PM flag isn't set, call
>                pm_runtime_request_resume().
>
>        When a request finishes:
>                Call pm_runtime_mark_last_busy().
>
> Next, you have to change the parts of the block layer responsible for
> taking a request from the queue and handing it to the lower-level
> driver (both peek and get).  If q->rpm_status is RPM_SUSPENDED, they
> shouldn't do anything -- act as though the queue is empty.  If
> q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they should hand over
> the request only if it has the REQ_PM flag set.
>
> For this to work, the block layer has to know what struct device
> pointer to pass to the pm_runtime_* routines.  You'll have to add that
> information to the request_queue structure; I guess q->dev can get set
> by block_pm_runtime_init().  In fact, when that's done you won't need
> q->rpm_status any more.  You'll be able to use c
> directly, and you won't have to update it because the PM core does that
> for you.
>
> (Or maybe it would be easier to make q->rpm_status be a pointer to
> q->dev->power.rpm_status.  That way, if CONFIG_PM_RUNTIME isn't enabled
> or block_runtime_pm_init() hasn't been called, you can have
> q->rpm_status simply point to a static value that is permanently set to
> RPM_ACTIVE.)

I think we need q->rpm_status.
Block layer check ->rpm_status and client driver set this status.
And the status is synchronized with queue's spin lock.

If we use q->dev->power.rpm_status then how to sync it between block
layer and client driver?
Do you mean block layer will need to acquire q->dev->power.lock?

Lin Ming

>
> I may have left some parts out from this brief description.  Hopefully
> you'll be able to figure out the general idea and get it to work.
>
> Alan Stern

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-22  6:12     ` Lin Ming
@ 2012-05-22 15:56       ` Alan Stern
  2012-05-23  8:41         ` Lin Ming
  2012-06-05 17:27       ` Alan Stern
  1 sibling, 1 reply; 22+ messages in thread
From: Alan Stern @ 2012-05-22 15:56 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=UTF-8, Size: 2003 bytes --]

On Tue, 22 May 2012, Lin Ming wrote:

> > (Or maybe it would be easier to make q->rpm_status be a pointer to
> > q->dev->power.rpm_status.  That way, if CONFIG_PM_RUNTIME isn't enabled
> > or block_runtime_pm_init() hasn't been called, you can have
> > q->rpm_status simply point to a static value that is permanently set to
> > RPM_ACTIVE.)
> 
> I think we need q->rpm_status.
> Block layer check ->rpm_status and client driver set this status.

No, the client driver should not have to set any status values.  The 
client driver should do as little as possible.

> And the status is synchronized with queue's spin lock.

Right, and the client driver should not need to acquire the queue's 
lock.

> If we use q->dev->power.rpm_status then how to sync it between block
> layer and client driver?
> Do you mean block layer will need to acquire q->dev->power.lock?

That's not what I mean.

What synchronization are you concerned about?  The most important race 
seems to be when a new request is added to the queue at the same time 
as a runtime suspend begins.

If q->dev->power.rpm_status has already been set to RPM_SUSPENDING or
RPM_SUSPENDED when the request is submitted, the block layer should
call pm_runtime_request_resume().  Thus if the suspend succeeds, the
device will be resumed immediately afterward.  And if the suspend
fails, the new request will be handled as usual (note that the
block_*_runtime_* routines might need to kick-start the queue to get it
going again).

Alternatively, if q->dev->power.rpm_status is still equal to RPM_ACTIVE 
when the request is submitted, the block layer will simply accept the 
request.  If the request is still on the queue when 
block_pre_runtime_suspend is called, it will return -EBUSY and the 
suspend will fail.

The only synchronization needed to make this work is that the
block_{pre,post}_runtime_suspend routines need to hold the queue's lock 
while checking to see if any requests are in the queue.  You'd expect 
that anyway.

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-22 15:56       ` Alan Stern
@ 2012-05-23  8:41         ` Lin Ming
  2012-05-23 14:58           ` Alan Stern
  0 siblings, 1 reply; 22+ messages in thread
From: Lin Ming @ 2012-05-23  8:41 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Tue, May 22, 2012 at 11:56 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 22 May 2012, Lin Ming wrote:
>
>> > (Or maybe it would be easier to make q->rpm_status be a pointer to
>> > q->dev->power.rpm_status.  That way, if CONFIG_PM_RUNTIME isn't enabled
>> > or block_runtime_pm_init() hasn't been called, you can have
>> > q->rpm_status simply point to a static value that is permanently set to
>> > RPM_ACTIVE.)
>>
>> I think we need q->rpm_status.
>> Block layer check ->rpm_status and client driver set this status.
>
> No, the client driver should not have to set any status values.  The
> client driver should do as little as possible.

Ah, I mean runtime pm core will set the status, not the client driver.

>
>> And the status is synchronized with queue's spin lock.
>
> Right, and the client driver should not need to acquire the queue's
> lock.
>
>> If we use q->dev->power.rpm_status then how to sync it between block
>> layer and client driver?
>> Do you mean block layer will need to acquire q->dev->power.lock?
>
> That's not what I mean.
>
> What synchronization are you concerned about?  The most important race

Let's consider below code.

@@ -587,6 +591,11 @@ void __elv_add_request(struct request_queue *q,
struct request *rq, int where)
 {
       trace_block_rq_insert(q, rq);

+       if (!(rq->cmd_flags & REQ_PM))
+               if (q->nr_pending++ == 0 && (q->rpm_status == RPM_SUSPENDED ||
+                               q->rpm_status == RPM_SUSPENDING) && q->dev)
+                       pm_request_resume(q->dev);
+
       rq->q = q;

       if (rq->cmd_flags & REQ_SOFTBARRIER) {

Block layer reads runtime status and pm core writes this status.
PM core uses dev->power.lock to protect this status.

I was thinking will it have problem if block layer does not acquire
dev->power.lock?
>From your explanation below, it seems does not have problem.

Thanks,
Lin Ming

> seems to be when a new request is added to the queue at the same time
> as a runtime suspend begins.
>
> If q->dev->power.rpm_status has already been set to RPM_SUSPENDING or
> RPM_SUSPENDED when the request is submitted, the block layer should
> call pm_runtime_request_resume().  Thus if the suspend succeeds, the
> device will be resumed immediately afterward.  And if the suspend
> fails, the new request will be handled as usual (note that the
> block_*_runtime_* routines might need to kick-start the queue to get it
> going again).
>
> Alternatively, if q->dev->power.rpm_status is still equal to RPM_ACTIVE
> when the request is submitted, the block layer will simply accept the
> request.  If the request is still on the queue when
> block_pre_runtime_suspend is called, it will return -EBUSY and the
> suspend will fail.
>
> The only synchronization needed to make this work is that the
> block_{pre,post}_runtime_suspend routines need to hold the queue's lock
> while checking to see if any requests are in the queue.  You'd expect
> that anyway.
>
> Alan Stern

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-23  8:41         ` Lin Ming
@ 2012-05-23 14:58           ` Alan Stern
  2012-05-23 15:59             ` Lin Ming
  0 siblings, 1 reply; 22+ messages in thread
From: Alan Stern @ 2012-05-23 14:58 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Wed, 23 May 2012, Lin Ming wrote:

> Let's consider below code.
> 
> @@ -587,6 +591,11 @@ void __elv_add_request(struct request_queue *q,
> struct request *rq, int where)
>  {
>        trace_block_rq_insert(q, rq);
> 
> +       if (!(rq->cmd_flags & REQ_PM))
> +               if (q->nr_pending++ == 0 && (q->rpm_status == RPM_SUSPENDED ||
> +                               q->rpm_status == RPM_SUSPENDING) && q->dev)
> +                       pm_request_resume(q->dev);
> +
>        rq->q = q;
> 
>        if (rq->cmd_flags & REQ_SOFTBARRIER) {
> 
> Block layer reads runtime status and pm core writes this status.
> PM core uses dev->power.lock to protect this status.
> 
> I was thinking will it have problem if block layer does not acquire
> dev->power.lock?
> From your explanation below, it seems does not have problem.

I don't think it's a problem, because all you're doing is reading 
dev->power.rpm_status -- you're not writing it.

On the other hand, there's nothing really wrong with keeping your own
local copy of rpm_status.  You could think of it as being the queue's
status as opposed to the device's status.  (Also, some people might
argue that dev->power.rpm_status is supposed to be private to the
runtime PM core and shouldn't be used by other code.)

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-23 14:58           ` Alan Stern
@ 2012-05-23 15:59             ` Lin Ming
  0 siblings, 0 replies; 22+ messages in thread
From: Lin Ming @ 2012-05-23 15:59 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Wed, May 23, 2012 at 10:58 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Wed, 23 May 2012, Lin Ming wrote:
>
>> Let's consider below code.
>>
>> @@ -587,6 +591,11 @@ void __elv_add_request(struct request_queue *q,
>> struct request *rq, int where)
>>  {
>>        trace_block_rq_insert(q, rq);
>>
>> +       if (!(rq->cmd_flags & REQ_PM))
>> +               if (q->nr_pending++ == 0 && (q->rpm_status == RPM_SUSPENDED ||
>> +                               q->rpm_status == RPM_SUSPENDING) && q->dev)
>> +                       pm_request_resume(q->dev);
>> +
>>        rq->q = q;
>>
>>        if (rq->cmd_flags & REQ_SOFTBARRIER) {
>>
>> Block layer reads runtime status and pm core writes this status.
>> PM core uses dev->power.lock to protect this status.
>>
>> I was thinking will it have problem if block layer does not acquire
>> dev->power.lock?
>> From your explanation below, it seems does not have problem.
>
> I don't think it's a problem, because all you're doing is reading
> dev->power.rpm_status -- you're not writing it.
>
> On the other hand, there's nothing really wrong with keeping your own
> local copy of rpm_status.  You could think of it as being the queue's
> status as opposed to the device's status.  (Also, some people might
> argue that dev->power.rpm_status is supposed to be private to the
> runtime PM core and shouldn't be used by other code.)

Agree.
So I'd like to keep local copy of rpm_status.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-21 15:14         ` Lin Ming
  2012-05-21 18:32           ` Alan Stern
@ 2012-05-24 10:19           ` Lin Ming
  2012-05-24 14:52             ` Alan Stern
  1 sibling, 1 reply; 22+ messages in thread
From: Lin Ming @ 2012-05-24 10:19 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Mon, May 21, 2012 at 11:14 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Mon, May 21, 2012 at 11:00 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> On Mon, 21 May 2012, Lin Ming wrote:
>>
>>> On Fri, May 18, 2012 at 2:29 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>>> [snip]
>>> > I may have left some parts out from this brief description.  Hopefully
>>> > you'll be able to figure out the general idea and get it to work.
>>>
>>> All journal threads and flusher thread of the disk need to be freezed
>>> before suspend and thaw after resume.
>>
>> Why?  If any of those threads needs to write something to the disk
>> while the disk is suspended, the disk will simply be resumed.
>
> When tested the patches, I found that kjournald and flusher thread
> frequently resume the disk.

Just found that it's because "printk".

When disk is suspended, it prints out some message, for example,

[  670.597103] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  670.597827] sd 0:0:0:0: [sda] Stopping disk

Then syslogd is waken up to write the log.
So disk is resumed right after suspended.

Lin Ming

>
> I'm not familiar with journal.
> Are the journal threads still need to be in active state when the disk
> is already suspended?
>
>>
>> Alan Stern

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-24 10:19           ` Lin Ming
@ 2012-05-24 14:52             ` Alan Stern
  0 siblings, 0 replies; 22+ messages in thread
From: Alan Stern @ 2012-05-24 14:52 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Thu, 24 May 2012, Lin Ming wrote:

> > When tested the patches, I found that kjournald and flusher thread
> > frequently resume the disk.
> 
> Just found that it's because "printk".
> 
> When disk is suspended, it prints out some message, for example,
> 
> [  670.597103] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> [  670.597827] sd 0:0:0:0: [sda] Stopping disk
> 
> Then syslogd is waken up to write the log.
> So disk is resumed right after suspended.

:-)  Very amusing!  So if you change those two messages in sd_suspend
(and maybe also the message in sd_resume) from KERN_NOTICE to
KERN_DEBUG, things ought to improve.

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-05-22  6:12     ` Lin Ming
  2012-05-22 15:56       ` Alan Stern
@ 2012-06-05 17:27       ` Alan Stern
  2012-06-06 14:26         ` Lin Ming
  1 sibling, 1 reply; 22+ messages in thread
From: Alan Stern @ 2012-06-05 17:27 UTC (permalink / raw)
  To: Lin Ming; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=UTF-8, Size: 1745 bytes --]

On Tue, 22 May 2012, Lin Ming wrote:

> > Next, you have to change the parts of the block layer responsible for
> > taking a request from the queue and handing it to the lower-level
> > driver (both peek and get).  If q->rpm_status is RPM_SUSPENDED, they
> > shouldn't do anything -- act as though the queue is empty.  If
> > q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they should hand over
> > the request only if it has the REQ_PM flag set.

I just realized this isn't quite right.  In fact, when the status is
RPM_SUSPENDING or RPM_RESUMING you need to search through the queue for
the first request with REQ_PM set.

Otherwise you can end up in a deadlock.  For example, suppose the
device is suspended and a normal I/O request arrives.  It gets put on
the queue and a runtime resume is issued.  The sd resume method then
submits a "spin-up drive" command to the queue with REQ_PM set, and it
doesn't return until this command is finished.  But the command won't
even start, because it isn't at the head of the queue -- the I/O
request is, and it prevents the "spin-up" command from running.

There are two other issues you also need to address.  The first is
simple: Make sure your changes don't cause any harm if CONFIG_PM or
CONFIG_PM_RUNTIME is set to N.

Second, near the end of scsi_pm.c:scsi_bus_resume_common(), you can see
that during resume from a system sleep, the device automatically gets
set back to RPM_ACTIVE.  You want the queue's runtime status to remain
in sync with the device's status; therefore you need to call
blk_pre_runtime_resume near the start of this function and
blk_post_runtime_resume near the end.  (This is the drawback of having
separate rpm_status fields for the queue and the device.)

Alan Stern


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks
  2012-06-05 17:27       ` Alan Stern
@ 2012-06-06 14:26         ` Lin Ming
  0 siblings, 0 replies; 22+ messages in thread
From: Lin Ming @ 2012-06-06 14:26 UTC (permalink / raw)
  To: Alan Stern; +Cc: Jens Axboe, Jeff Moyer, linux-kernel, linux-pm, linux-scsi

On Wed, Jun 6, 2012 at 2:27 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 22 May 2012, Lin Ming wrote:
>
>> > Next, you have to change the parts of the block layer responsible for
>> > taking a request from the queue and handing it to the lower-level
>> > driver (both peek and get).  If q->rpm_status is RPM_SUSPENDED, they
>> > shouldn't do anything -- act as though the queue is empty.  If
>> > q->rpm_status is RPM_SUSPENDING or RPM_RESUMING, they should hand over
>> > the request only if it has the REQ_PM flag set.
>
> I just realized this isn't quite right.  In fact, when the status is
> RPM_SUSPENDING or RPM_RESUMING you need to search through the queue for
> the first request with REQ_PM set.
>
> Otherwise you can end up in a deadlock.  For example, suppose the
> device is suspended and a normal I/O request arrives.  It gets put on
> the queue and a runtime resume is issued.  The sd resume method then
> submits a "spin-up drive" command to the queue with REQ_PM set, and it
> doesn't return until this command is finished.  But the command won't
> even start, because it isn't at the head of the queue -- the I/O
> request is, and it prevents the "spin-up" command from running.

At least for SCSI sd driver, the PM request is always queued at the head.

scsi_execute()
  blk_execute_rq(req->q, NULL, req, 1);

"1" means the request will be inserted at the head.

blk_execute_rq_nowait()
   spin_lock_irq(q->queue_lock);
   __elv_add_request(q, rq, where);
   __blk_run_queue(q)
   spin_unlock_irq(q->queue_lock);

So the request is inserted at the head and executed immediatly.

>
> There are two other issues you also need to address.  The first is
> simple: Make sure your changes don't cause any harm if CONFIG_PM or
> CONFIG_PM_RUNTIME is set to N.
>
> Second, near the end of scsi_pm.c:scsi_bus_resume_common(), you can see
> that during resume from a system sleep, the device automatically gets
> set back to RPM_ACTIVE.  You want the queue's runtime status to remain
> in sync with the device's status; therefore you need to call
> blk_pre_runtime_resume near the start of this function and
> blk_post_runtime_resume near the end.  (This is the drawback of having
> separate rpm_status fields for the queue and the device.)
>
> Alan Stern

I am travelling through June 10.
I'll investigate these when I'm back.

Thanks,
Lin Ming

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2012-06-06 14:26 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-17 15:48 [RFC v2 PATCH 0/4]: block layer runtime pm Lin Ming
2012-05-17 15:48 ` [RFC v2 PATCH 1/4] block: add a flag to identify PM request Lin Ming
2012-05-17 17:46   ` Alan Stern
2012-05-17 15:48 ` [RFC v2 PATCH 2/4] block: add queue runtime pm callbacks Lin Ming
2012-05-17 18:29   ` Alan Stern
2012-05-17 18:56     ` Alan Stern
2012-05-18 14:53     ` Lin Ming
2012-05-21 14:41     ` Lin Ming
2012-05-21 15:00       ` Alan Stern
2012-05-21 15:14         ` Lin Ming
2012-05-21 18:32           ` Alan Stern
2012-05-24 10:19           ` Lin Ming
2012-05-24 14:52             ` Alan Stern
2012-05-22  6:12     ` Lin Ming
2012-05-22 15:56       ` Alan Stern
2012-05-23  8:41         ` Lin Ming
2012-05-23 14:58           ` Alan Stern
2012-05-23 15:59             ` Lin Ming
2012-06-05 17:27       ` Alan Stern
2012-06-06 14:26         ` Lin Ming
2012-05-17 15:48 ` [RFC v2 PATCH 3/4] block: implement block layer runtime pm Lin Ming
2012-05-17 15:48 ` [RFC v2 PATCH 4/4] [SCSI] sd: change to auto suspend mode Lin Ming

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).