* [PATCH] block/elevator updates + deadline i/o scheduler
@ 2002-07-26 12:02 Jens Axboe
  2002-07-26 18:31 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Jens Axboe @ 2002-07-26 12:02 UTC (permalink / raw)
  To: Linux Kernel

[-- Attachment #1: Type: text/plain, Size: 4660 bytes --]

Hi,

The 2nd version of the deadline i/o scheduler is ready for some testing.
The first patch makes various changes to the elevator/block layer that are
needed for the new i/o scheduler. Most of these are just correcting
assumptions about q->queue_head being the i/o scheduler sort list. In
detail:

o elevator_merge_req_fn takes a queue argument as well

o __make_request() inits insert_here to NULL instead of
  q->queue_head.prev, which means that the i/o schedulers must
  explicitly check for this condition now.

o incorporate elv_queue_empty(); it was just a placeholder before

o add elv_get_sort_head(). It returns the sort head of the elevator for
  a given request; attempt_{back,front}_merge use it to determine
  whether a neighbouring request is valid or not (see the short sketch
  after this list). Maybe attempt_{back,front}_merge should just be
  killed, I doubt they have much relevance with the wake-up batching.

o call the merge_cleanup functions of the elevator _after_ the merge has
  been done, not before. This way the elevator functions get the new
  state of the request, which is what is actually interesting.

o Kill extra nr_sectors check in ll_merge_requests_fn()

o bio->bi_bdev is always set in __make_request(), so kill the check.
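
As a small illustration of the elv_get_sort_head() usage, this is the
attempt_back_merge() hunk from the attached patch, slightly condensed:

	static inline void attempt_back_merge(request_queue_t *q, struct request *rq)
	{
		struct list_head *next = rq->queuelist.next;
		struct list_head *sort_head = elv_get_sort_head(q, rq);

		/* only merge if the next entry is a request on the same sort
		 * list, not the sort head itself */
		if (next != sort_head)
			attempt_merge(q, rq, list_entry_rq(next));
	}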

Once that is out of the way, we can plop the new i/o scheduler in. It's
not really ready for inclusion yet, due to a few 'issues':

o Barrier requests are not handled properly right now. This isn't a
  problem in the current kernel, but could be in the future.

o hard coded values

o hash probably needs a bit of work :-)

That said, how does it work? Well, the current Linux i/o schedulers use
q->queue_head as the main queue for dispatch, sorting, and merging.
Pending requests are on that queue until the driver takes them off by
doing

	rq = elv_next_request(q);
	blkdev_dequeue_request(rq);
	/* now dispatch to hardware */

The layout of the deadline i/o scheduler is roughly:

	[1]       [2]
         |         |
         |         |
         |         |
         ====[3]====
              |
              |
              |
              |
             [4]

where [1] is the regular ascendingly sorted pending list of requests,
[2] is a fifo list (well, really two lists, one for reads and one for
writes) of pending requests which each have an expire time assigned, [3]
is the elv_next_request() worker, and [4] is the dispatch queue
(q->queue_head again). When a request enters the i/o scheduler, it is
sorted into the [1] list, assigned an expire time, and added to the tail
of the appropriate fifo list [2].

[1] is the main list where we serve requests from. If a request deadline
gets exceeded, we move a number of entries (known as the 'fifo_batch'
count) from the sorted list starting from the expired entry onto the
dispatch queue. This makes sure that we at least attempt to start an
expired request immediately, but don't completely fall back to FCFS i/o
scheduling (that said, set fifo_batch == 1 and you will get FCFS with an
appropriately low expire time).
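
In other words, the [3] worker roughly does the following (a simplified
sketch of deadline_next_request() from the attached patch, which uses
gotos instead):

	struct request *deadline_next_request(request_queue_t *q)
	{
		struct deadline_data *dd = q->elevator.elevator_data;

		/* anything already on the dispatch queue [4] goes out first */
		if (!list_empty(&q->queue_head))
			return list_entry_rq(q->queue_head.next);

		/* an expired fifo [2] entry moves a batch from the sorted
		 * list [1] onto the dispatch queue */
		if (deadline_check_fifo(q, READ) || deadline_check_fifo(q, WRITE))
			return list_entry_rq(q->queue_head.next);

		/* otherwise just serve the front of the sorted list [1] */
		if (!list_empty(&dd->sort_list)) {
			deadline_move_to_dispatch(q, list_entry_rq(dd->sort_list.next));
			return list_entry_rq(q->queue_head.next);
		}

		return NULL;
	}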

So the tunables in this i/o scheduler are:

o READ fifo_expire. Attempt to start a READ before this time. Set to 1
  second in the patch.

o WRITE fifo_expire. Ditto for writes. Set to 8 seconds in the patch.

o fifo_batch. See above explanation. Set to 8 in the patch.

I was too lazy to make them runtime-changeable in this version, so you
have to edit deadline-iosched.c to set them.
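
Concretely, these are the lines to edit at the top of deadline-iosched.c
(as in the attached patch):

	static int read_expire = HZ;		/* expire time for READs: 1 second */
	static int write_expire = 8 * HZ;	/* expire time for WRITEs: 8 seconds */
	static int fifo_batch = 8;		/* entries moved when a deadline expires */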

This version also includes a request merge hash. On back merging a bio
into an existing request, we can potentially save a queue scan by just
finding the rq in the hash. This is similar in idea to the bio hash that
was in the first 2.5.1-pre patches, only it works on requests instead
(ie it isn't broken like the bio version :-). I resurrected this idea
when Bartlomiej brought it up last week. I only added a back merge hash,
since previous experiments showed that 90% or more of merge hits are
back merge hits. Front merges are still checked, but only by doing the
queue scan itself. I should do new statistics on this :-)
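
The hash is keyed on the end sector of a request, so the back merge check
is basically a single lookup instead of a queue scan (condensed from
deadline_merge() in the attached patch):

	/* requests are hashed on their last sector (sector + nr_sectors) */
	#define DL_HASH_FN(sec)	(((sec) >> 3) & DL_HASH_MASK)

	/* see if the merge hash can satisfy a back merge for this bio */
	if ((__rq = deadline_find_hash(dd, bio->bi_sector))) {
		if (elv_rq_merge_ok(__rq, bio) &&
		    __rq->sector + __rq->nr_sectors == bio->bi_sector) {
			*req = __rq;
			return ELEVATOR_BACK_MERGE;
		}
	}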

Finally, I've done some testing on it. No testing on whether this really
works well in real life (that's what I want testers to do), and no
testing on benchmark performance changes etc. What I have done is
beat-up testing, making sure it works without corrupting your data. I'm
fairly confident it does. Most testing was on SCSI (naturally), but IDE
has also been tested briefly.

On to the patches (I'm losing track of this email). Both are attached.

elv-queue_head-misc-1
	elevator/block misc changes mentioned. This has gone to Linus
	for 2.5.28 inclusion

deadline-iosched-5
	the deadline i/o scheduler

-- 
Jens Axboe


[-- Attachment #2: elv-queue_head-misc-1 --]
[-- Type: text/plain, Size: 6834 bytes --]

diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/drivers/block/elevator.c linux/drivers/block/elevator.c
--- /opt/kernel/linux-2.5.28/drivers/block/elevator.c	Wed Jul 24 23:03:30 2002
+++ linux/drivers/block/elevator.c	Fri Jul 26 11:05:00 2002
@@ -220,7 +220,8 @@
 	}
 }
 
-void elevator_linus_merge_req(struct request *req, struct request *next)
+void elevator_linus_merge_req(request_queue_t *q, struct request *req,
+			      struct request *next)
 {
 	if (elv_linus_sequence(next) < elv_linus_sequence(req))
 		elv_linus_sequence(req) = elv_linus_sequence(next);
@@ -232,6 +233,9 @@
 	elevator_t *e = &q->elevator;
 	int lat = 0, *latency = e->elevator_data;
 
+	if (!insert_here)
+		insert_here = q->queue_head.prev;
+
 	if (!(rq->flags & REQ_BARRIER))
 		lat = latency[rq_data_dir(rq)];
 
@@ -318,7 +322,7 @@
 
 struct request *elevator_noop_next_request(request_queue_t *q)
 {
-	if (!blk_queue_empty(q))
+	if (!list_empty(&q->queue_head))
 		return list_entry_rq(q->queue_head.next);
 
 	return NULL;
@@ -376,7 +380,7 @@
 	elevator_t *e = &q->elevator;
 
 	if (e->elevator_merge_req_fn)
-		e->elevator_merge_req_fn(rq, next);
+		e->elevator_merge_req_fn(q, rq, next);
 }
 
 /*
@@ -433,6 +437,27 @@
 		e->elevator_remove_req_fn(q, rq);
 }
 
+int elv_queue_empty(request_queue_t *q)
+{
+	elevator_t *e = &q->elevator;
+
+	if (e->elevator_queue_empty_fn)
+		return e->elevator_queue_empty_fn(q);
+
+	return list_empty(&q->queue_head);
+}
+
+inline struct list_head *elv_get_sort_head(request_queue_t *q,
+					   struct request *rq)
+{
+	elevator_t *e = &q->elevator;
+
+	if (e->elevator_get_sort_head_fn)
+		return e->elevator_get_sort_head_fn(q, rq);
+
+	return &q->queue_head;
+}
+
 elevator_t elevator_linus = {
 	elevator_merge_fn:		elevator_linus_merge,
 	elevator_merge_cleanup_fn:	elevator_linus_merge_cleanup,
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.5.28/drivers/block/ll_rw_blk.c	Wed Jul 24 23:03:20 2002
+++ linux/drivers/block/ll_rw_blk.c	Fri Jul 26 12:25:14 2002
@@ -1350,7 +1354,6 @@
 
 	if (rq_data_dir(req) != rq_data_dir(next)
 	    || !kdev_same(req->rq_dev, next->rq_dev)
-	    || req->nr_sectors + next->nr_sectors > q->max_sectors
 	    || next->waiting || next->special)
 		return;
 
@@ -1361,15 +1364,14 @@
 	 * counts here.
 	 */
 	if (q->merge_requests_fn(q, req, next)) {
-		elv_merge_requests(q, req, next);
-
-		blkdev_dequeue_request(next);
-
 		req->biotail->bi_next = next->bio;
 		req->biotail = next->biotail;
 
 		req->nr_sectors = req->hard_nr_sectors += next->hard_nr_sectors;
 
+		elv_merge_requests(q, req, next);
+
+		blkdev_dequeue_request(next);
 		blkdev_release_request(next);
 	}
 }
@@ -1377,16 +1379,18 @@
 static inline void attempt_back_merge(request_queue_t *q, struct request *rq)
 {
 	struct list_head *next = rq->queuelist.next;
+	struct list_head *sort_head = elv_get_sort_head(q, rq);
 
-	if (next != &q->queue_head)
+	if (next != sort_head)
 		attempt_merge(q, rq, list_entry_rq(next));
 }
 
 static inline void attempt_front_merge(request_queue_t *q, struct request *rq)
 {
 	struct list_head *prev = rq->queuelist.prev;
+	struct list_head *sort_head = elv_get_sort_head(q, rq);
 
-	if (prev != &q->queue_head)
+	if (prev != sort_head)
 		attempt_merge(q, list_entry_rq(prev), rq);
 }
 
@@ -1446,7 +1450,7 @@
 	spin_lock_irq(q->queue_lock);
 again:
 	req = NULL;
-	insert_here = q->queue_head.prev;
+	insert_here = NULL;
 
 	if (blk_queue_empty(q)) {
 		blk_plug_device(q);
@@ -1464,11 +1468,10 @@
 				break;
 			}
 
-			elv_merge_cleanup(q, req, nr_sectors);
-
 			req->biotail->bi_next = bio;
 			req->biotail = bio;
 			req->nr_sectors = req->hard_nr_sectors += nr_sectors;
+			elv_merge_cleanup(q, req, nr_sectors);
 			drive_stat_acct(req, nr_sectors, 0);
 			attempt_back_merge(q, req);
 			goto out;
@@ -1480,8 +1483,6 @@
 				break;
 			}
 
-			elv_merge_cleanup(q, req, nr_sectors);
-
 			bio->bi_next = req->bio;
 			req->bio = bio;
 			/*
@@ -1494,6 +1495,7 @@
 			req->hard_cur_sectors = cur_nr_sectors;
 			req->sector = req->hard_sector = sector;
 			req->nr_sectors = req->hard_nr_sectors += nr_sectors;
+			elv_merge_cleanup(q, req, nr_sectors);
 			drive_stat_acct(req, nr_sectors, 0);
 			attempt_front_merge(q, req);
 			goto out;
@@ -1562,9 +1564,7 @@
 	req->buffer = bio_data(bio);	/* see ->buffer comment above */
 	req->waiting = NULL;
 	req->bio = req->biotail = bio;
-	if (bio->bi_bdev)
-		req->rq_dev = to_kdev_t(bio->bi_bdev->bd_dev);
-	else	req->rq_dev = NODEV;
+	req->rq_dev = to_kdev_t(bio->bi_bdev->bd_dev);
 	add_request(q, req, insert_here);
 out:
 	if (freereq)
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/include/linux/elevator.h linux/include/linux/elevator.h
--- /opt/kernel/linux-2.5.28/include/linux/elevator.h	Wed Jul 24 23:03:18 2002
+++ linux/include/linux/elevator.h	Fri Jul 26 11:10:58 2002
@@ -6,13 +6,14 @@
 
 typedef void (elevator_merge_cleanup_fn) (request_queue_t *, struct request *, int);
 
-typedef void (elevator_merge_req_fn) (struct request *, struct request *);
+typedef void (elevator_merge_req_fn) (request_queue_t *, struct request *, struct request *);
 
 typedef struct request *(elevator_next_req_fn) (request_queue_t *);
 
 typedef void (elevator_add_req_fn) (request_queue_t *, struct request *, struct list_head *);
 typedef int (elevator_queue_empty_fn) (request_queue_t *);
 typedef void (elevator_remove_req_fn) (request_queue_t *, struct request *);
+typedef struct list_head *(elevator_get_sort_head_fn) (request_queue_t *, struct request *);
 
 typedef int (elevator_init_fn) (request_queue_t *, elevator_t *);
 typedef void (elevator_exit_fn) (request_queue_t *, elevator_t *);
@@ -28,6 +29,7 @@
 	elevator_remove_req_fn *elevator_remove_req_fn;
 
 	elevator_queue_empty_fn *elevator_queue_empty_fn;
+	elevator_get_sort_head_fn *elevator_get_sort_head_fn;
 
 	elevator_init_fn *elevator_init_fn;
 	elevator_exit_fn *elevator_exit_fn;
@@ -45,6 +47,8 @@
 extern void elv_merge_requests(request_queue_t *, struct request *,
 			       struct request *);
 extern void elv_remove_request(request_queue_t *, struct request *);
+extern int elv_queue_empty(request_queue_t *);
+extern inline struct list_head *elv_get_sort_head(request_queue_t *, struct request *);
 
 /*
  * noop I/O scheduler. always merges, always inserts new request at tail
@@ -72,6 +82,9 @@
 
 extern int elevator_init(request_queue_t *, elevator_t *, elevator_t);
 extern void elevator_exit(request_queue_t *, elevator_t *);
+extern inline int bio_rq_in_between(struct bio *, struct request *, struct list_head *);
+extern inline int elv_rq_merge_ok(struct request *, struct bio *);
+extern inline int elv_try_merge(struct request *, struct bio *);
 
 /*
  * Return values from elevator merger

[-- Attachment #3: deadline-iosched-5 --]
[-- Type: text/plain, Size: 12783 bytes --]

diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/drivers/block/Makefile linux/drivers/block/Makefile
--- /opt/kernel/linux-2.5.28/drivers/block/Makefile	Wed Jul 24 23:03:31 2002
+++ linux/drivers/block/Makefile	Fri Jul 26 11:05:00 2002
@@ -10,7 +10,7 @@
 
 export-objs	:= elevator.o ll_rw_blk.o blkpg.o loop.o DAC960.o genhd.o block_ioctl.o
 
-obj-y	:= elevator.o ll_rw_blk.o blkpg.o genhd.o block_ioctl.o
+obj-y	:= elevator.o ll_rw_blk.o blkpg.o genhd.o block_ioctl.o deadline-iosched.o
 
 obj-$(CONFIG_MAC_FLOPPY)	+= swim3.o
 obj-$(CONFIG_BLK_DEV_FD)	+= floppy.o
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/drivers/block/deadline-iosched.c linux/drivers/block/deadline-iosched.c
--- /opt/kernel/linux-2.5.28/drivers/block/deadline-iosched.c	Thu Jan  1 01:00:00 1970
+++ linux/drivers/block/deadline-iosched.c	Fri Jul 26 13:34:52 2002
@@ -0,0 +1,430 @@
+/*
+ *  linux/drivers/block/deadline-iosched.c
+ *
+ *  Deadline i/o scheduler.
+ *
+ *  Copyright (C) 2002 Jens Axboe <axboe@suse.de>
+ */
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <linux/elevator.h>
+#include <linux/bio.h>
+#include <linux/blk.h>
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/compiler.h>
+
+/*
+ * feel free to try other values :-). *_expire values are the timeouts
+ * for reads and writes, our goal is to start a request "around" the time
+ * when it expires. fifo_batch is how many steps along the sorted list we
+ * will take when the front fifo request expires.
+ */
+static int read_expire = HZ;
+static int write_expire = 8 * HZ;
+static int fifo_batch = 8;
+
+static const int deadline_hash_shift = 6;
+#define DL_HASH_MASK	((1 << deadline_hash_shift) - 1)
+#define DL_HASH_FN(sec)	(((sec) >> 3) & DL_HASH_MASK)
+#define DL_HASH_ENTRIES	(1 << deadline_hash_shift)
+
+struct deadline_data {
+	struct list_head sort_list;
+	struct list_head fifo_list[2];
+	struct list_head *hash;
+	unsigned int fifo_batch;
+	unsigned long fifo_expire[2];
+};
+
+struct deadline_rq {
+	struct list_head fifo;
+	struct list_head hash;
+	struct request *request;
+	unsigned long expires;
+};
+
+static kmem_cache_t *drq_pool;
+
+#define RQ_DATA(rq)		((struct deadline_rq *) (rq)->elevator_private)
+
+inline void deadline_move_to_dispatch(request_queue_t *q, struct request *rq)
+{
+	struct deadline_rq *drq = RQ_DATA(rq);
+
+	list_move_tail(&rq->queuelist, &q->queue_head);
+	list_del_init(&drq->fifo);
+}
+
+#define list_entry_hash(ptr)	list_entry((ptr), struct deadline_rq, hash)
+struct request *deadline_find_hash(struct deadline_data *dd, sector_t offset)
+{
+	struct list_head *hash_list = &dd->hash[DL_HASH_FN(offset)];
+	struct list_head *entry = hash_list;
+	struct deadline_rq *drq;
+	struct request *rq = NULL;
+
+	while ((entry = entry->next) != hash_list) {
+		drq = list_entry_hash(entry);
+
+		if (drq->request->sector + drq->request->nr_sectors == offset) {
+			rq = drq->request;
+			break;
+		}
+	}
+
+	return rq;
+}
+
+#define ON_HASH(drq)	!list_empty(&(drq)->hash)
+inline void deadline_del_rq_hash(struct deadline_rq *drq)
+{
+	if (ON_HASH(drq))
+		list_del_init(&drq->hash);
+}
+
+inline void deadline_add_rq_hash(struct deadline_data *dd,
+				 struct deadline_rq *drq)
+{
+	struct request *rq = drq->request;
+
+	BUG_ON(ON_HASH(drq));
+
+	list_add(&drq->hash, &dd->hash[DL_HASH_FN(rq->sector +rq->nr_sectors)]);
+}
+
+int deadline_merge(request_queue_t *q, struct request **req, struct bio *bio)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct list_head *entry = &dd->sort_list;
+	struct request *__rq;
+	int ret = ELEVATOR_NO_MERGE;
+
+	/*
+	 * see if the merge hash can satisfy a back merge
+	 */
+	if ((__rq = deadline_find_hash(dd, bio->bi_sector))) {
+		if (elv_rq_merge_ok(__rq, bio)) {
+			if (__rq->sector + __rq->nr_sectors == bio->bi_sector) {
+				*req = __rq;
+				ret = ELEVATOR_BACK_MERGE;
+				goto out;
+			}
+		}
+	}
+
+	while ((entry = entry->prev) != &dd->sort_list) {
+		__rq = list_entry_rq(entry);
+
+		BUG_ON(__rq->flags & REQ_STARTED);
+		BUG_ON(__rq->flags & REQ_BARRIER);
+
+		if (!(__rq->flags & REQ_CMD))
+			continue;
+
+		if (!*req && bio_rq_in_between(bio, __rq, &dd->sort_list))
+			*req = __rq;
+
+		if (elv_rq_merge_ok(__rq, bio)) {
+			if (__rq->sector - bio_sectors(bio) == bio->bi_sector) {
+				*req = __rq;
+				ret = ELEVATOR_FRONT_MERGE;
+				break;
+			}
+		}
+	}
+
+out:
+	return ret;
+}
+
+void deadline_merge_cleanup(request_queue_t *q, struct request *req, int count)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq = RQ_DATA(req);
+
+	deadline_del_rq_hash(drq);
+	deadline_add_rq_hash(dd, drq);
+}
+
+#define ON_FIFO(drq)	!list_empty(&(drq)->fifo)
+void deadline_merge_request(request_queue_t *q, struct request *req,
+			    struct request *next)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq = RQ_DATA(req);
+	struct deadline_rq *dnext = RQ_DATA(next);
+
+	BUG_ON(!drq);
+	BUG_ON(!dnext);
+
+	/*
+	 * if dnext expires before drq, assign it's expire time to drq
+	 * and move into dnext position (dnext will be deleted) in fifo
+	 */
+	if (ON_FIFO(drq) && ON_FIFO(dnext)) {
+		if (time_before(dnext->expires, drq->expires)) {
+			list_move(&drq->fifo, &dnext->fifo);
+			drq->expires = dnext->expires;
+		}
+	}
+
+	deadline_del_rq_hash(drq);
+	deadline_add_rq_hash(dd, drq);
+}
+
+#define list_entry_fifo(ptr)	list_entry((ptr), struct deadline_rq, fifo)
+int deadline_check_fifo(request_queue_t *q, int ddir)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct list_head *fifo_list = &dd->fifo_list[ddir];
+	struct deadline_rq *drq;
+	struct request *rq;
+	int entries;
+
+	if (list_empty(fifo_list))
+		return 0;
+
+	drq = list_entry_fifo(fifo_list->next);
+	if (time_before(jiffies, drq->expires))
+		return 0;
+
+	rq = drq->request;
+	entries = dd->fifo_batch;
+	while (entries--) {
+		struct list_head *nxt = rq->queuelist.next;
+
+		/*
+		 * take it off the sort and fifo list, move
+		 * to dispatch queue
+		 */
+		deadline_move_to_dispatch(q, rq);
+
+		if (nxt == &dd->sort_list)
+			break;
+
+		rq = list_entry_rq(nxt);
+	}
+
+	/*
+	 * should be entries there now, at least 1
+	 */
+	return 1;
+}
+
+struct request *deadline_next_request(request_queue_t *q)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	/*
+	 * if still requests on the dispatch queue, just grab the first one
+	 */
+	if (!list_empty(&q->queue_head))
+dispatch:
+		return list_entry_rq(q->queue_head.next);
+
+	/*
+	 * decide whether to just grab first from sorted list, or move a
+	 * batch from the expire list. if it has expired, move 'batch'
+	 * entries to dispatch queue
+	 */
+	if (deadline_check_fifo(q, READ))
+		goto dispatch;
+	if (deadline_check_fifo(q, WRITE))
+		goto dispatch;
+
+	if (!list_empty(&dd->sort_list)) {
+		deadline_move_to_dispatch(q, list_entry_rq(dd->sort_list.next));
+		goto dispatch;
+	}
+
+	return NULL;
+}
+
+void deadline_add_request(request_queue_t *q, struct request *rq, struct list_head *insert_here)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq;
+
+	/*
+	 * insert barriers directly into the back of the dispatch queue,
+	 * don't use the sorted or fifo list for those
+	 */
+	if (unlikely(rq->flags & REQ_BARRIER)) {
+		list_add_tail(&rq->queuelist, &q->queue_head);
+		return;
+	}
+
+	/*
+	 * add to sort list
+	 */
+	if (!insert_here)
+		insert_here = dd->sort_list.prev;
+
+	list_add(&rq->queuelist, insert_here);
+
+	if (unlikely(!(rq->flags & REQ_CMD)))
+		return;
+
+	/*
+	 * set expire time and add to fifo list
+	 */
+	drq = RQ_DATA(rq);
+	drq->expires = jiffies + dd->fifo_expire[rq_data_dir(rq)];
+	list_add_tail(&drq->fifo, &dd->fifo_list[rq_data_dir(rq)]);
+
+	if (rq_mergeable(rq))
+		deadline_add_rq_hash(dd, drq);
+}
+
+void deadline_remove_request(request_queue_t *q, struct request *rq)
+{
+	struct deadline_rq *drq = RQ_DATA(rq);
+
+	if (drq) {
+		list_del_init(&drq->fifo);
+		deadline_del_rq_hash(drq);
+	}
+}
+
+int deadline_queue_empty(request_queue_t *q)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	if (!list_empty(&q->queue_head) || !list_empty(&dd->sort_list))
+		return 0;
+
+	BUG_ON(!list_empty(&dd->fifo_list[READ]));
+	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
+	return 1;
+}
+
+struct list_head *deadline_get_sort_head(request_queue_t *q, struct request *rq)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	return &dd->sort_list;
+}
+
+void deadline_exit(request_queue_t *q, elevator_t *e)
+{
+	struct deadline_data *dd = e->elevator_data;
+	struct deadline_rq *drq;
+	struct request *rq;
+	int i;
+
+	BUG_ON(!list_empty(&dd->fifo_list[READ]));
+	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
+	BUG_ON(!list_empty(&dd->sort_list));
+
+	for (i = 0; i < 2; i++) {
+		struct request_list *rl = &q->rq[i];
+		struct list_head *entry = &rl->free;
+
+		if (list_empty(&rl->free))
+			continue;
+	
+		while ((entry = entry->next) != &rl->free) {
+			rq = list_entry_rq(entry);
+
+			if ((drq = RQ_DATA(rq)) == NULL)
+				continue;
+
+			rq->elevator_private = NULL;
+			kmem_cache_free(drq_pool, drq);
+		}
+	}
+
+	kfree(dd->hash);
+	kfree(dd);
+}
+
+int deadline_init(request_queue_t *q, elevator_t *e)
+{
+	struct deadline_data *dd;
+	struct deadline_rq *drq;
+	struct request *rq;
+	int i, ret = 0;
+
+	if (!drq_pool)
+		return -ENOMEM;
+
+	dd = kmalloc(sizeof(*dd), GFP_KERNEL);
+	if (!dd)
+		return -ENOMEM;
+	memset(dd, 0, sizeof(*dd));
+
+	dd->hash = kmalloc(sizeof(struct list_head)*DL_HASH_ENTRIES,GFP_KERNEL);
+	if (!dd->hash) {
+		kfree(dd);
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < DL_HASH_ENTRIES; i++)
+		INIT_LIST_HEAD(&dd->hash[i]);
+
+	INIT_LIST_HEAD(&dd->fifo_list[READ]);
+	INIT_LIST_HEAD(&dd->fifo_list[WRITE]);
+	INIT_LIST_HEAD(&dd->sort_list);
+	dd->fifo_batch = fifo_batch;
+	dd->fifo_expire[READ] = read_expire;
+	dd->fifo_expire[WRITE] = write_expire;
+	e->elevator_data = dd;
+
+	for (i = 0; i < 2; i++) {
+		struct request_list *rl = &q->rq[i];
+		struct list_head *entry = &rl->free;
+
+		if (list_empty(&rl->free))
+			continue;
+	
+		while ((entry = entry->next) != &rl->free) {
+			rq = list_entry_rq(entry);
+
+			drq = kmem_cache_alloc(drq_pool, GFP_KERNEL);
+			if (!drq) {
+				ret = -ENOMEM;
+				break;
+			}
+
+			INIT_LIST_HEAD(&drq->fifo);
+			INIT_LIST_HEAD(&drq->hash);
+			drq->request = rq;
+			drq->expires = 0;
+			rq->elevator_private = drq;
+		}
+	}
+
+	if (ret)
+		deadline_exit(q, e);
+
+	return ret;
+}
+
+int __init deadline_slab_setup(void)
+{
+	drq_pool = kmem_cache_create("deadline_drq", sizeof(struct deadline_rq),
+				     0, SLAB_HWCACHE_ALIGN, NULL, NULL);
+
+	if (!drq_pool)
+		panic("deadline: can't init slab pool\n");
+
+	return 0;
+}
+
+module_init(deadline_slab_setup);
+
+elevator_t iosched_deadline = {
+	.elevator_merge_fn = 		deadline_merge,
+	.elevator_merge_req_fn =	deadline_merge_request,
+	.elevator_next_req_fn =		deadline_next_request,
+	.elevator_add_req_fn =		deadline_add_request,
+	.elevator_remove_req_fn =	deadline_remove_request,
+	.elevator_queue_empty_fn =	deadline_queue_empty,
+	.elevator_get_sort_head_fn =	deadline_get_sort_head,
+	.elevator_init_fn =		deadline_init,
+	.elevator_exit_fn =		deadline_exit,
+};
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.5.28/drivers/block/ll_rw_blk.c	Wed Jul 24 23:03:20 2002
+++ linux/drivers/block/ll_rw_blk.c	Fri Jul 26 12:25:14 2002
@@ -1114,7 +1114,11 @@
 	if (blk_init_free_list(q))
 		return -ENOMEM;
 
+#if 1
+	if ((ret = elevator_init(q, &q->elevator, iosched_deadline))) {
+#else
 	if ((ret = elevator_init(q, &q->elevator, elevator_linus))) {
+#endif
 		blk_cleanup_queue(q);
 		return ret;
 	}
@@ -2034,8 +2034,8 @@
 	queue_nr_requests = (total_ram >> 8) & ~15;	/* One per quarter-megabyte */
 	if (queue_nr_requests < 32)
 		queue_nr_requests = 32;
-	if (queue_nr_requests > 256)
-		queue_nr_requests = 256;
+	if (queue_nr_requests > 768)
+		queue_nr_requests = 768;
 
 	/*
 	 * Batch frees according to queue length
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.28/include/linux/elevator.h linux/include/linux/elevator.h
--- /opt/kernel/linux-2.5.28/include/linux/elevator.h	Wed Jul 24 23:03:18 2002
+++ linux/include/linux/elevator.h	Fri Jul 26 11:10:58 2002
@@ -59,6 +63,12 @@
 #define elv_linus_sequence(rq)	((long)(rq)->elevator_private)
 
 /*
+ * deadline i/o scheduler. uses request time outs to prevent indefinite
+ * starvation
+ */
+extern elevator_t iosched_deadline;
+
+/*
  * use the /proc/iosched interface, all the below is history ->
  */
 typedef struct blkelv_ioctl_arg_s {


* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-26 12:02 [PATCH] block/elevator updates + deadline i/o scheduler Jens Axboe
@ 2002-07-26 18:31 ` Andrew Morton
  2002-07-28 19:12   ` Jens Axboe
  2002-07-30  7:57   ` Jens Axboe
  2002-07-27  1:22 ` Adam Kropelin
  2002-07-30 17:33 ` Bill Davidsen
  2 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2002-07-26 18:31 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel

Jens Axboe wrote:
> 
> The layout of the deadline i/o scheduler is roughly:
> 
>         [1]       [2]
>          |         |
>          |         |
>          |         |
>          ====[3]====
>               |
>               |
>               |
>               |
>              [4]
> 
> where [1] is the regular ascendingly sorted pending list of requests,
> [2] is a fifo list (well really two lists, one for reads and one for
> writes) of pending requests which each have an expire time assigned, [3]
> is the elv_next_request() worker, and [4] is the dispatch queue
> (q->queue_head again). When a request enters the i/o scheduler, it is
> sorted into the [1] list, assigned an expire time, and sorted into the
> fifo list [2] (the fifo list is really two lists, one for reads and one
> for writes).
> 
> [1] is the main list where we serve requests from. If a request deadline
> gets exceeded, we move a number of entries (known as the 'fifo_batch'
> count) from the sorted list starting from the expired entry onto the
> dispatch queue. This makes sure that we at least attempt to start an
> expired request immediately, but don't completely fall back to FCFS i/o
> scheduling (well set fifo_batch == 1, and you will get FCFS with an
> appropriately low expire time).

I don't quite understand...  When expired requests are moved from the
fifo [2] onto the dispatch queue [4], is merging performed at the
dispatch queue?

In other words, if the fifo queue has blocks (1,3,5,7,2,4,6,8) or
(1,10,20,5,15,25), and they expire, will they be sorted in some manner
before going to the hardware?  If so, where?

> ...
> 
> Finally, I've done some testing on it. No testing on whether this really
> works well in real life (that's what I want testers to do), and no
> testing on benchmark performance changes etc. What I have done is
> beat-up testing, making sure it works without corrupting your data.

I'll give it a whizz over the weekend.

-


* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-26 12:02 [PATCH] block/elevator updates + deadline i/o scheduler Jens Axboe
  2002-07-26 18:31 ` Andrew Morton
@ 2002-07-27  1:22 ` Adam Kropelin
  2002-07-30 17:33 ` Bill Davidsen
  2 siblings, 0 replies; 11+ messages in thread
From: Adam Kropelin @ 2002-07-27  1:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

On Fri, Jul 26, 2002 at 02:02:48PM +0200, Jens Axboe wrote:
> The 2nd version of the deadline i/o scheduler is ready for some testing.
...
> Finally, I've done some testing on it. No testing on whether this really
> works well in real life (that's what I want testers to do), and no
> testing on benchmark performance changes etc. What I have done is
> beat-up testing, making sure it works without corrupting your data. I'm
> fairly confident that does. Most testing was on SCSI (naturally),
> however IDE has also been tested briefly.

Hi, Jens,

I'm interested in i/o performance (though more so in throughput than latency)
so I tried out the patches. I needed an extra bit (below) to get a compile
on 2.5.28.

My performance testing showed essentially no change with the deadline i/o
scheduler, but there are a multitude of possible reasons for that, including
user stupidity. Here's an outline of what I tried and what I observed:

The concept of deadline scheduling makes me think "latency" rather than 
"throughput" so I tried to run tests involving small i/os in the
presence of large streaming reads. I expected to see an improvement in
the time taken to service the small i/os. (Obviously, let me know if my
assumptions are all washed up ;) At the same time I wanted to see if there was
any impact (negative or positive) on the streaming workload.

Test #1: Simultaneously untarring two kernel trees (untar only, no gzip). The
idea here was that reading the source tarball was essentially a streaming read
while writing the output was a large number of relatively small writes. There
was less than 3 seconds difference in overall time between stock 2.5.28 and
-deadline. 

Test #2: Same as Test #1 with the addition of reading each *.c file in another
kernel tree while the untarring is going in the background. The idea here was
to see if the large set of small reads in the readonly workload would benefit.
Again, the difference was only a few seconds over several minutes.

Part of the issue here may be my test setup... All i/o was to the same (rather
slow) IDE disk. (The good news being 2.5 IDE did not blow up in my face under
all this stress.) Machine is a 2 CPU SMP PPro.

If these results are totally bogus, feel free to ignore them. If you have ideas
for other tests I should run, let me know and I'll oblige.

--Adam

--- elevator.h.orig	Fri Jul 26 20:59:06 2002
+++ elevator.h	Fri Jul 26 20:59:15 2002
@@ -93,10 +93,4 @@
 #define ELEVATOR_FRONT_MERGE	1
 #define ELEVATOR_BACK_MERGE	2
 
-/*
- * will change once we move to a more complex data structure than a simple
- * list for pending requests
- */
-#define elv_queue_empty(q)	list_empty(&(q)->queue_head)
-
 #endif



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-26 18:31 ` Andrew Morton
@ 2002-07-28 19:12   ` Jens Axboe
  2002-07-28 19:17     ` Randy.Dunlap
  2002-07-30  7:57   ` Jens Axboe
  1 sibling, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2002-07-28 19:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel

On Fri, Jul 26 2002, Andrew Morton wrote:
> Jens Axboe wrote:
> > 
> > The layout of the deadline i/o scheduler is roughly:
> > 
> >         [1]       [2]
> >          |         |
> >          |         |
> >          |         |
> >          ====[3]====
> >               |
> >               |
> >               |
> >               |
> >              [4]
> > 
> > where [1] is the regular ascendingly sorted pending list of requests,
> > [2] is a fifo list (well really two lists, one for reads and one for
> > writes) of pending requests which each have an expire time assigned, [3]
> > is the elv_next_request() worker, and [4] is the dispatch queue
> > (q->queue_head again). When a request enters the i/o scheduler, it is
> > sorted into the [1] list, assigned an expire time, and sorted into the
> > fifo list [2] (the fifo list is really two lists, one for reads and one
> > for writes).
> > 
> > [1] is the main list where we serve requests from. If a request deadline
> > gets exceeded, we move a number of entries (known as the 'fifo_batch'
> > count) from the sorted list starting from the expired entry onto the
> > dispatch queue. This makes sure that we at least attempt to start an
> > expired request immediately, but don't completely fall back to FCFS i/o
> > scheduling (well set fifo_batch == 1, and you will get FCFS with an
> > appropriately low expire time).
> 
> I don't quite understand...  When expired requests are moved from the
> fifo [2] onto the dispatch queue [4], is merging performed at the
> dispatch queue?

They are not moved from the fifo queue as such; a single entry (the
front one of course, always the oldest one) is chosen and the sorted
list is followed from that request. So fifo_batch entries are moved
from the sorted list, starting at location X, where X is the front of
the fifo and might be anywhere on the sorted list.

Merging has already been done etc, that takes place on the sorted queue
'as usual'. So it's just a matter of moving entries.
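
In code terms, it is roughly this (condensed from deadline_check_fifo()
in the patch):

	/* the front fifo entry has expired, follow the sort list from there */
	drq = list_entry_fifo(fifo_list->next);
	rq = drq->request;
	entries = dd->fifo_batch;
	while (entries--) {
		struct list_head *nxt = rq->queuelist.next;

		/* off the sort and fifo lists, onto the dispatch queue */
		deadline_move_to_dispatch(q, rq);

		if (nxt == &dd->sort_list)
			break;

		rq = list_entry_rq(nxt);
	}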

> In other words, if the fifo queue has blocks (1,3,5,7,2,4,6,8) or
> (1,10,20,5,15,25), and they expire, will they be sorted in some manner
> before going to the hardware?  If so, where?

If the fifo queue has the following entries (5,1,4,2,8,7,3,9) then the
sorted list looks like this (1,2,3,4,5,7,8,9). If fifo_batch is 4 in
this case, we would move (5,7,8,9) to the dispatch queue.

Think of it as applying deadlines to any type of i/o scheduler. The
fifo queue could essentially be tacked on to any type of other queue,
not just a plain sorted one.

> > Finally, I've done some testing on it. No testing on whether this really
> > works well in real life (that's what I want testers to do), and no
> > testing on benchmark performance changes etc. What I have done is
> > beat-up testing, making sure it works without corrupting your data.
> 
> I'll give it a whizz over the weekend.

Cool. I'd be interested in latency and throughput results at this point;
I have none of these. BTW, does anyone know of a good benchmark that
also cares about latency?

-- 
Jens Axboe



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-28 19:12   ` Jens Axboe
@ 2002-07-28 19:17     ` Randy.Dunlap
  2002-07-28 19:26       ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Randy.Dunlap @ 2002-07-28 19:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, Linux Kernel

On Sun, 28 Jul 2002, Jens Axboe wrote:

| Cool. I'd be interested in latency and throughput results at this point,
| I have none of these. BTW, does anyone know of a good benchmark that
| also cares about latency?

Danger, use of 'good' and 'benchmark' together.
Nevertheless, tiobench (tiobench.sf.net) tries to care about &
measure latency.

-- 
~Randy



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-28 19:17     ` Randy.Dunlap
@ 2002-07-28 19:26       ` Jens Axboe
  0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2002-07-28 19:26 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: Andrew Morton, Linux Kernel

On Sun, Jul 28 2002, Randy.Dunlap wrote:
> On Sun, 28 Jul 2002, Jens Axboe wrote:
> 
> | Cool. I'd be interested in latency and throughput results at this point,
> | I have none of these. BTW, does anyone know of a good benchmark that
> | also cares about latency?
> 
> Danger, use of 'good' and 'benchmark' together.

:-)

> Nevertheless, tiobench (tiobench.sf.net) tries to care about &
> measure latency.

Does it? Hmm my version is probably too old then, thanks for the hint,
I'll try it first thing tomorrow.

-- 
Jens Axboe



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-26 18:31 ` Andrew Morton
  2002-07-28 19:12   ` Jens Axboe
@ 2002-07-30  7:57   ` Jens Axboe
  1 sibling, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2002-07-30  7:57 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1570 bytes --]

Hi,

An updated version of the deadline i/o scheduler against 2.5.29 is now
available. Changes since last version:

features:

o 'fifo_batch' meant "move this number of entries to dispatch queue" in
the previous version; now it means "move entries to dispatch queue at
this max cost". It is complemented by 'seek_cost', which is a simple
measure of transfer-time vs seek-time cost. If a request X+1 is
contiguous to request X, then it is accounted at cost '1'. If not, it is
accounted at cost 'seek_cost'. (A condensed sketch of the cost
accounting follows after this list.)
o Doing numbers on q->last_merge hits in the linus elevator showed an
impressive number of hits even when lots of threads were banging the
queue at the same time. So add q->last_merge hints to this scheduler
too.
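
A condensed sketch of the cost accounting, taken from
deadline_check_fifo() in the attached deadline-iosched-7:

	rq = drq->request;
	batch_count = dd->fifo_batch;
	last_sec = dd->last_sector;
	do {
		struct list_head *nxt = rq->queuelist.next;

		deadline_move_to_dispatch(q, rq);

		/* a contiguous request costs 1, anything else a full seek */
		if (rq->sector == last_sec)
			batch_count--;
		else
			batch_count -= dd->seek_cost;

		if (nxt == &dd->sort_list)
			break;

		last_sec = rq->sector + rq->nr_sectors;
		rq = list_entry_rq(nxt);
	} while (batch_count > 0);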

fixes:

o remember to actually register e->merge_cleanup so the merge cleanup
path works. Not registering it caused us to miss out on a number of
back merges.

o use wli's hash_long() as the merge hash (see the snippet after this
list). It gets good distribution on various benchmark runs and is fast,
very nice :-)

o various cleanups
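
For reference, the merge hash function now looks like this (from the
attached patch):

	#define DL_HASH_BLOCK(sec)	((sec) >> 3)
	#define DL_HASH_FN(sec)		(hash_long(DL_HASH_BLOCK((sec)), deadline_hash_shift))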

So no serious bugs found (ie crashes or data corruption issues). Again,
mainly tested on SCSI, briefly tested on IDE as well. Due to better
design of the i/o scheduler interface in 2.5 in general, integrity
testing on IDE isn't nearly as important as it was in 2.4 for instance.
So I consider the patch stable for general use right now.

elv-queue_head-misc-2
	various block- and i/o scheduler interface updates. needs to be
	applied first.

deadline-iosched-7
	the deadline i/o scheduler.

Find them on kernel.org as well of course. Testing welcome!

-- 
Jens Axboe


[-- Attachment #2: deadline-iosched-7 --]
[-- Type: text/plain, Size: 14383 bytes --]

diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/drivers/block/Makefile linux/drivers/block/Makefile
--- /opt/kernel/linux-2.5.29/drivers/block/Makefile	Wed Jul 24 23:03:31 2002
+++ linux/drivers/block/Makefile	Fri Jul 26 11:05:00 2002
@@ -10,7 +10,7 @@
 
 export-objs	:= elevator.o ll_rw_blk.o blkpg.o loop.o DAC960.o genhd.o block_ioctl.o
 
-obj-y	:= elevator.o ll_rw_blk.o blkpg.o genhd.o block_ioctl.o
+obj-y	:= elevator.o ll_rw_blk.o blkpg.o genhd.o block_ioctl.o deadline-iosched.o
 
 obj-$(CONFIG_MAC_FLOPPY)	+= swim3.o
 obj-$(CONFIG_BLK_DEV_FD)	+= floppy.o
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/drivers/block/deadline-iosched.c linux/drivers/block/deadline-iosched.c
--- /opt/kernel/linux-2.5.29/drivers/block/deadline-iosched.c	Thu Jan  1 01:00:00 1970
+++ linux/drivers/block/deadline-iosched.c	Tue Jul 30 09:40:49 2002
@@ -0,0 +1,490 @@
+/*
+ *  linux/drivers/block/deadline-iosched.c
+ *
+ *  Deadline i/o scheduler.
+ *
+ *  Copyright (C) 2002 Jens Axboe <axboe@suse.de>
+ */
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <linux/elevator.h>
+#include <linux/bio.h>
+#include <linux/blk.h>
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/compiler.h>
+#include <linux/hash.h>
+
+/*
+ * feel free to try other values :-). *_expire values are the timeouts
+ * for reads and writes, our goal is to start a request "around" the time
+ * when it expires. fifo_batch is how many steps along the sorted list we
+ * will take when the front fifo request expires.
+ */
+static int read_expire = HZ;
+static int write_expire = 10 * HZ;
+static int fifo_batch = 80;
+static int seek_cost = 20;
+
+static const int deadline_hash_shift = 6;
+#define DL_HASH_BLOCK(sec)	((sec) >> 3)
+#define DL_HASH_FN(sec)		(hash_long(DL_HASH_BLOCK((sec)), deadline_hash_shift))
+#define DL_HASH_ENTRIES		(1 << deadline_hash_shift)
+
+#define DL_INVALIDATE_HASH(dd)				\
+	do {						\
+		if (!++(dd)->hash_valid_count)		\
+			(dd)->hash_valid_count = 1;	\
+	} while (0)
+
+struct deadline_data {
+	struct list_head sort_list;
+	struct list_head fifo_list[2];
+	struct list_head *hash;
+	sector_t last_sector;
+	unsigned long hash_valid_count;
+
+	/*
+	 * settings that change how the i/o scheduler behaves
+	 */
+	unsigned int fifo_batch;
+	unsigned long fifo_expire[2];
+	unsigned int seek_cost;
+};
+
+struct deadline_rq {
+	struct list_head fifo;
+	struct list_head hash;
+	unsigned long hash_valid_count;
+	struct request *request;
+	unsigned long expires;
+};
+
+static kmem_cache_t *drq_pool;
+
+#define RQ_DATA(rq)		((struct deadline_rq *) (rq)->elevator_private)
+
+static inline void deadline_move_to_dispatch(request_queue_t *q,
+					     struct request *rq)
+{
+	struct deadline_rq *drq = RQ_DATA(rq);
+
+	list_move_tail(&rq->queuelist, &q->queue_head);
+	list_del_init(&drq->fifo);
+}
+
+#define ON_HASH(drq)	(drq)->hash_valid_count
+static inline void deadline_del_rq_hash(struct deadline_rq *drq)
+{
+	if (ON_HASH(drq)) {
+		drq->hash_valid_count = 0;
+		list_del_init(&drq->hash);
+	}
+}
+
+static inline void deadline_add_rq_hash(struct deadline_data *dd,
+					struct deadline_rq *drq)
+{
+	struct request *rq = drq->request;
+
+	BUG_ON(ON_HASH(drq));
+
+	drq->hash_valid_count = dd->hash_valid_count;
+	list_add(&drq->hash, &dd->hash[DL_HASH_FN(rq->sector +rq->nr_sectors)]);
+}
+
+#define list_entry_hash(ptr)	list_entry((ptr), struct deadline_rq, hash)
+static struct request *deadline_find_hash(struct deadline_data *dd,
+					  sector_t offset)
+{
+	struct list_head *hash_list = &dd->hash[DL_HASH_FN(offset)];
+	struct list_head *entry, *next = hash_list->next;
+	struct deadline_rq *drq;
+	struct request *rq = NULL;
+
+	while ((entry = next) != hash_list) {
+		next = entry->next;
+		drq = list_entry_hash(entry);
+
+		BUG_ON(!drq->hash_valid_count);
+
+		if (!rq_mergeable(drq->request)
+		    || drq->hash_valid_count != dd->hash_valid_count) {
+			deadline_del_rq_hash(drq);
+			continue;
+		}
+
+		if (drq->request->sector + drq->request->nr_sectors == offset) {
+			rq = drq->request;
+			break;
+		}
+	}
+
+	return rq;
+}
+
+static int deadline_merge(request_queue_t *q, struct request **req,
+			  struct bio *bio)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct list_head *entry = &dd->sort_list;
+	struct request *__rq;
+	int ret = ELEVATOR_NO_MERGE;
+
+	/*
+	 * try last_merge to avoid going to hash
+	 */
+	if ((ret = elv_try_last_merge(q, req, bio)))
+		goto out;
+
+	/*
+	 * see if the merge hash can satisfy a back merge
+	 */
+	if ((__rq = deadline_find_hash(dd, bio->bi_sector))) {
+		if (elv_rq_merge_ok(__rq, bio)) {
+			if (__rq->sector + __rq->nr_sectors == bio->bi_sector) {
+				*req = __rq;
+				q->last_merge = &__rq->queuelist;
+				ret = ELEVATOR_BACK_MERGE;
+				goto out;
+			}
+		}
+	}
+
+	while ((entry = entry->prev) != &dd->sort_list) {
+		__rq = list_entry_rq(entry);
+
+		BUG_ON(__rq->flags & REQ_STARTED);
+		BUG_ON(__rq->flags & REQ_BARRIER);
+
+		if (!(__rq->flags & REQ_CMD))
+			continue;
+
+		if (!*req && bio_rq_in_between(bio, __rq, &dd->sort_list))
+			*req = __rq;
+
+		if (elv_rq_merge_ok(__rq, bio)) {
+			if (__rq->sector - bio_sectors(bio) == bio->bi_sector) {
+				*req = __rq;
+				q->last_merge = &__rq->queuelist;
+				ret = ELEVATOR_FRONT_MERGE;
+				break;
+			} else if (__rq->sector + __rq->nr_sectors == bio->bi_sector)
+				printk("deadline_merge: back merge hit on list\n");
+		}
+	}
+
+out:
+	return ret;
+}
+
+static void deadline_merge_cleanup(request_queue_t *q, struct request *req,
+				   int count)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq = RQ_DATA(req);
+
+	deadline_del_rq_hash(drq);
+	deadline_add_rq_hash(dd, drq);
+}
+
+#define ON_FIFO(drq)	!list_empty(&(drq)->fifo)
+static void deadline_merge_request(request_queue_t *q, struct request *req,
+				   struct request *next)
+{
+	struct deadline_rq *drq = RQ_DATA(req);
+	struct deadline_rq *dnext = RQ_DATA(next);
+
+	BUG_ON(!drq);
+	BUG_ON(!dnext);
+
+	deadline_merge_cleanup(q, req, 0);
+
+	/*
+	 * if dnext expires before drq, assign it's expire time to drq
+	 * and move into dnext position (dnext will be deleted) in fifo
+	 */
+	if (ON_FIFO(drq) && ON_FIFO(dnext)) {
+		if (time_before(dnext->expires, drq->expires)) {
+			list_move(&drq->fifo, &dnext->fifo);
+			drq->expires = dnext->expires;
+		}
+	}
+}
+
+#define list_entry_fifo(ptr)	list_entry((ptr), struct deadline_rq, fifo)
+static int deadline_check_fifo(request_queue_t *q, int ddir)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct list_head *fifo_list = &dd->fifo_list[ddir];
+	struct deadline_rq *drq;
+	struct request *rq;
+	sector_t last_sec;
+	int batch_count;
+
+	if (list_empty(fifo_list))
+		return 0;
+
+	drq = list_entry_fifo(fifo_list->next);
+	if (time_before(jiffies, drq->expires))
+		return 0;
+
+	rq = drq->request;
+	batch_count = dd->fifo_batch;
+	last_sec = dd->last_sector;
+	do {
+		struct list_head *nxt = rq->queuelist.next;
+
+		/*
+		 * take it off the sort and fifo list, move
+		 * to dispatch queue
+		 */
+		deadline_move_to_dispatch(q, rq);
+
+		if (rq->sector == last_sec)
+			batch_count--;
+		else
+			batch_count -= dd->seek_cost;
+
+		if (nxt == &dd->sort_list)
+			break;
+
+		last_sec = rq->sector + rq->nr_sectors;
+		rq = list_entry_rq(nxt);
+	} while (batch_count > 0);
+
+	/*
+	 * should be entries there now, at least 1
+	 */
+	return 1;
+}
+
+static struct request *deadline_next_request(request_queue_t *q)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct request *rq;
+
+	/*
+	 * if still requests on the dispatch queue, just grab the first one
+	 */
+	if (!list_empty(&q->queue_head)) {
+dispatch:
+		rq = list_entry_rq(q->queue_head.next);
+		dd->last_sector = rq->sector + rq->nr_sectors;
+		return rq;
+	}
+
+	/*
+	 * decide whether to just grab first from sorted list, or move a
+	 * batch from the expire list. if it has expired, move 'batch'
+	 * entries to dispatch queue
+	 */
+	if (deadline_check_fifo(q, READ))
+		goto dispatch;
+	if (deadline_check_fifo(q, WRITE))
+		goto dispatch;
+
+	if (!list_empty(&dd->sort_list)) {
+		deadline_move_to_dispatch(q, list_entry_rq(dd->sort_list.next));
+		goto dispatch;
+	}
+
+	return NULL;
+}
+
+static void deadline_add_request(request_queue_t *q, struct request *rq, struct list_head *insert_here)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq;
+
+	/*
+	 * insert barriers directly into the back of the dispatch queue,
+	 * don't use the sorted or fifo list for those
+	 */
+	if (unlikely(rq->flags & REQ_BARRIER)) {
+		DL_INVALIDATE_HASH(dd);
+		list_add_tail(&rq->queuelist, &q->queue_head);
+		q->last_merge = NULL;
+		return;
+	}
+
+	/*
+	 * add to sort list
+	 */
+	if (!insert_here)
+		insert_here = dd->sort_list.prev;
+
+	list_add(&rq->queuelist, insert_here);
+
+	if (unlikely(!(rq->flags & REQ_CMD)))
+		return;
+
+	/*
+	 * set expire time and add to fifo list
+	 */
+	drq = RQ_DATA(rq);
+	drq->expires = jiffies + dd->fifo_expire[rq_data_dir(rq)];
+	list_add_tail(&drq->fifo, &dd->fifo_list[rq_data_dir(rq)]);
+
+	if (rq_mergeable(rq))
+		deadline_add_rq_hash(dd, drq);
+
+	if (!q->last_merge)
+		q->last_merge = &rq->queuelist;
+}
+
+static void deadline_remove_request(request_queue_t *q, struct request *rq)
+{
+	struct deadline_rq *drq = RQ_DATA(rq);
+
+	if (drq) {
+		list_del_init(&drq->fifo);
+		deadline_del_rq_hash(drq);
+	}
+}
+
+static int deadline_queue_empty(request_queue_t *q)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	if (!list_empty(&q->queue_head) || !list_empty(&dd->sort_list))
+		return 0;
+
+	BUG_ON(!list_empty(&dd->fifo_list[READ]));
+	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
+	return 1;
+}
+
+static struct list_head *deadline_get_sort_head(request_queue_t *q,
+						struct request *rq)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	return &dd->sort_list;
+}
+
+static void deadline_exit(request_queue_t *q, elevator_t *e)
+{
+	struct deadline_data *dd = e->elevator_data;
+	struct deadline_rq *drq;
+	struct request *rq;
+	int i;
+
+	BUG_ON(!list_empty(&dd->fifo_list[READ]));
+	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
+	BUG_ON(!list_empty(&dd->sort_list));
+
+	for (i = 0; i < 2; i++) {
+		struct request_list *rl = &q->rq[i];
+		struct list_head *entry = &rl->free;
+
+		if (list_empty(&rl->free))
+			continue;
+	
+		while ((entry = entry->next) != &rl->free) {
+			rq = list_entry_rq(entry);
+
+			if ((drq = RQ_DATA(rq)) == NULL)
+				continue;
+
+			rq->elevator_private = NULL;
+			kmem_cache_free(drq_pool, drq);
+		}
+	}
+
+	kfree(dd->hash);
+	kfree(dd);
+}
+
+static int deadline_init(request_queue_t *q, elevator_t *e)
+{
+	struct deadline_data *dd;
+	struct deadline_rq *drq;
+	struct request *rq;
+	int i, ret = 0;
+
+	if (!drq_pool)
+		return -ENOMEM;
+
+	dd = kmalloc(sizeof(*dd), GFP_KERNEL);
+	if (!dd)
+		return -ENOMEM;
+	memset(dd, 0, sizeof(*dd));
+
+	dd->hash = kmalloc(sizeof(struct list_head)*DL_HASH_ENTRIES,GFP_KERNEL);
+	if (!dd->hash) {
+		kfree(dd);
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < DL_HASH_ENTRIES; i++)
+		INIT_LIST_HEAD(&dd->hash[i]);
+
+	INIT_LIST_HEAD(&dd->fifo_list[READ]);
+	INIT_LIST_HEAD(&dd->fifo_list[WRITE]);
+	INIT_LIST_HEAD(&dd->sort_list);
+	dd->fifo_batch = fifo_batch;
+	dd->fifo_expire[READ] = read_expire;
+	dd->fifo_expire[WRITE] = write_expire;
+	dd->seek_cost = seek_cost;
+	dd->hash_valid_count = 1;
+	e->elevator_data = dd;
+
+	for (i = 0; i < 2; i++) {
+		struct request_list *rl = &q->rq[i];
+		struct list_head *entry = &rl->free;
+
+		if (list_empty(&rl->free))
+			continue;
+	
+		while ((entry = entry->next) != &rl->free) {
+			rq = list_entry_rq(entry);
+
+			drq = kmem_cache_alloc(drq_pool, GFP_KERNEL);
+			if (!drq) {
+				ret = -ENOMEM;
+				break;
+			}
+
+			memset(drq, 0, sizeof(*drq));
+			INIT_LIST_HEAD(&drq->fifo);
+			INIT_LIST_HEAD(&drq->hash);
+			drq->request = rq;
+			rq->elevator_private = drq;
+		}
+	}
+
+	if (ret)
+		deadline_exit(q, e);
+
+	return ret;
+}
+
+static int __init deadline_slab_setup(void)
+{
+	drq_pool = kmem_cache_create("deadline_drq", sizeof(struct deadline_rq),
+				     0, SLAB_HWCACHE_ALIGN, NULL, NULL);
+
+	if (!drq_pool)
+		panic("deadline: can't init slab pool\n");
+
+	return 0;
+}
+
+module_init(deadline_slab_setup);
+
+elevator_t iosched_deadline = {
+	.elevator_merge_fn = 		deadline_merge,
+	.elevator_merge_cleanup_fn =	deadline_merge_cleanup,
+	.elevator_merge_req_fn =	deadline_merge_request,
+	.elevator_next_req_fn =		deadline_next_request,
+	.elevator_add_req_fn =		deadline_add_request,
+	.elevator_remove_req_fn =	deadline_remove_request,
+	.elevator_queue_empty_fn =	deadline_queue_empty,
+	.elevator_get_sort_head_fn =	deadline_get_sort_head,
+	.elevator_init_fn =		deadline_init,
+	.elevator_exit_fn =		deadline_exit,
+};
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.5.29/drivers/block/ll_rw_blk.c	Tue Jul 30 09:04:57 2002
+++ linux/drivers/block/ll_rw_blk.c	Mon Jul 29 21:15:27 2002
@@ -1114,7 +1114,11 @@
 	if (blk_init_free_list(q))
 		return -ENOMEM;
 
+#if 1
+	if ((ret = elevator_init(q, &q->elevator, iosched_deadline))) {
+#else
 	if ((ret = elevator_init(q, &q->elevator, elevator_linus))) {
+#endif
 		blk_cleanup_queue(q);
 		return ret;
 	}
@@ -2072,8 +2072,8 @@
 	queue_nr_requests = (total_ram >> 8) & ~15;	/* One per quarter-megabyte */
 	if (queue_nr_requests < 32)
 		queue_nr_requests = 32;
-	if (queue_nr_requests > 256)
-		queue_nr_requests = 256;
+	if (queue_nr_requests > 768)
+		queue_nr_requests = 768;
 
 	/*
 	 * Batch frees according to queue length
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/include/linux/elevator.h linux/include/linux/elevator.h
--- /opt/kernel/linux-2.5.29/include/linux/elevator.h	Fri Jul 26 14:01:22 2002
+++ linux/include/linux/elevator.h	Tue Jul 30 09:15:00 2002
@@ -59,6 +63,12 @@
 #define elv_linus_sequence(rq)	((long)(rq)->elevator_private)
 
 /*
+ * deadline i/o scheduler. uses request time outs to prevent indefinite
+ * starvation
+ */
+extern elevator_t iosched_deadline;
+
+/*
  * use the /proc/iosched interface, all the below is history ->
  */
 typedef struct blkelv_ioctl_arg_s {

[-- Attachment #3: elv-queue_head-misc-2 --]
[-- Type: text/plain, Size: 7185 bytes --]

diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/drivers/block/elevator.c linux/drivers/block/elevator.c
--- /opt/kernel/linux-2.5.29/drivers/block/elevator.c	Fri Jul 26 14:01:22 2002
+++ linux/drivers/block/elevator.c	Tue Jul 30 09:11:53 2002
@@ -220,7 +220,8 @@
 	}
 }
 
-void elevator_linus_merge_req(struct request *req, struct request *next)
+void elevator_linus_merge_req(request_queue_t *q, struct request *req,
+			      struct request *next)
 {
 	if (elv_linus_sequence(next) < elv_linus_sequence(req))
 		elv_linus_sequence(req) = elv_linus_sequence(next);
@@ -232,6 +233,9 @@
 	elevator_t *e = &q->elevator;
 	int lat = 0, *latency = e->elevator_data;
 
+	if (!insert_here)
+		insert_here = q->queue_head.prev;
+
 	if (!(rq->flags & REQ_BARRIER))
 		lat = latency[rq_data_dir(rq)];
 
@@ -318,7 +322,7 @@
 
 struct request *elevator_noop_next_request(request_queue_t *q)
 {
-	if (!blk_queue_empty(q))
+	if (!list_empty(&q->queue_head))
 		return list_entry_rq(q->queue_head.next);
 
 	return NULL;
@@ -376,7 +380,7 @@
 	elevator_t *e = &q->elevator;
 
 	if (e->elevator_merge_req_fn)
-		e->elevator_merge_req_fn(rq, next);
+		e->elevator_merge_req_fn(q, rq, next);
 }
 
 /*
@@ -433,6 +437,27 @@
 		e->elevator_remove_req_fn(q, rq);
 }
 
+int elv_queue_empty(request_queue_t *q)
+{
+	elevator_t *e = &q->elevator;
+
+	if (e->elevator_queue_empty_fn)
+		return e->elevator_queue_empty_fn(q);
+
+	return list_empty(&q->queue_head);
+}
+
+inline struct list_head *elv_get_sort_head(request_queue_t *q,
+					   struct request *rq)
+{
+	elevator_t *e = &q->elevator;
+
+	if (e->elevator_get_sort_head_fn)
+		return e->elevator_get_sort_head_fn(q, rq);
+
+	return &q->queue_head;
+}
+
 elevator_t elevator_linus = {
 	elevator_merge_fn:		elevator_linus_merge,
 	elevator_merge_cleanup_fn:	elevator_linus_merge_cleanup,
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.5.29/drivers/block/ll_rw_blk.c	Tue Jul 30 09:04:57 2002
+++ linux/drivers/block/ll_rw_blk.c	Mon Jul 29 21:15:27 2002
@@ -1388,7 +1392,6 @@
 
 	if (rq_data_dir(req) != rq_data_dir(next)
 	    || !kdev_same(req->rq_dev, next->rq_dev)
-	    || req->nr_sectors + next->nr_sectors > q->max_sectors
 	    || next->waiting || next->special)
 		return;
 
@@ -1399,15 +1402,14 @@
 	 * counts here.
 	 */
 	if (q->merge_requests_fn(q, req, next)) {
-		elv_merge_requests(q, req, next);
-
-		blkdev_dequeue_request(next);
-
 		req->biotail->bi_next = next->bio;
 		req->biotail = next->biotail;
 
 		req->nr_sectors = req->hard_nr_sectors += next->hard_nr_sectors;
 
+		elv_merge_requests(q, req, next);
+
+		blkdev_dequeue_request(next);
 		blk_put_request(next);
 	}
 }
@@ -1415,16 +1417,18 @@
 static inline void attempt_back_merge(request_queue_t *q, struct request *rq)
 {
 	struct list_head *next = rq->queuelist.next;
+	struct list_head *sort_head = elv_get_sort_head(q, rq);
 
-	if (next != &q->queue_head)
+	if (next != sort_head)
 		attempt_merge(q, rq, list_entry_rq(next));
 }
 
 static inline void attempt_front_merge(request_queue_t *q, struct request *rq)
 {
 	struct list_head *prev = rq->queuelist.prev;
+	struct list_head *sort_head = elv_get_sort_head(q, rq);
 
-	if (prev != &q->queue_head)
+	if (prev != sort_head)
 		attempt_merge(q, list_entry_rq(prev), rq);
 }
 
@@ -1484,7 +1488,7 @@
 	spin_lock_irq(q->queue_lock);
 again:
 	req = NULL;
-	insert_here = q->queue_head.prev;
+	insert_here = NULL;
 
 	if (blk_queue_empty(q)) {
 		blk_plug_device(q);
@@ -1502,11 +1506,10 @@
 				break;
 			}
 
-			elv_merge_cleanup(q, req, nr_sectors);
-
 			req->biotail->bi_next = bio;
 			req->biotail = bio;
 			req->nr_sectors = req->hard_nr_sectors += nr_sectors;
+			elv_merge_cleanup(q, req, nr_sectors);
 			drive_stat_acct(req, nr_sectors, 0);
 			attempt_back_merge(q, req);
 			goto out;
@@ -1518,8 +1521,6 @@
 				break;
 			}
 
-			elv_merge_cleanup(q, req, nr_sectors);
-
 			bio->bi_next = req->bio;
 			req->bio = bio;
 			/*
@@ -1532,6 +1533,7 @@
 			req->hard_cur_sectors = cur_nr_sectors;
 			req->sector = req->hard_sector = sector;
 			req->nr_sectors = req->hard_nr_sectors += nr_sectors;
+			elv_merge_cleanup(q, req, nr_sectors);
 			drive_stat_acct(req, nr_sectors, 0);
 			attempt_front_merge(q, req);
 			goto out;
@@ -1600,9 +1602,7 @@
 	req->buffer = bio_data(bio);	/* see ->buffer comment above */
 	req->waiting = NULL;
 	req->bio = req->biotail = bio;
-	if (bio->bi_bdev)
-		req->rq_dev = to_kdev_t(bio->bi_bdev->bd_dev);
-	else	req->rq_dev = NODEV;
+	req->rq_dev = to_kdev_t(bio->bi_bdev->bd_dev);
 	add_request(q, req, insert_here);
 out:
 	if (freereq)
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.29/include/linux/elevator.h linux/include/linux/elevator.h
--- /opt/kernel/linux-2.5.29/include/linux/elevator.h	Fri Jul 26 14:01:22 2002
+++ linux/include/linux/elevator.h	Tue Jul 30 09:02:22 2002
@@ -6,13 +6,14 @@
 
 typedef void (elevator_merge_cleanup_fn) (request_queue_t *, struct request *, int);
 
-typedef void (elevator_merge_req_fn) (struct request *, struct request *);
+typedef void (elevator_merge_req_fn) (request_queue_t *, struct request *, struct request *);
 
 typedef struct request *(elevator_next_req_fn) (request_queue_t *);
 
 typedef void (elevator_add_req_fn) (request_queue_t *, struct request *, struct list_head *);
 typedef int (elevator_queue_empty_fn) (request_queue_t *);
 typedef void (elevator_remove_req_fn) (request_queue_t *, struct request *);
+typedef struct list_head *(elevator_get_sort_head_fn) (request_queue_t *, struct request *);
 
 typedef int (elevator_init_fn) (request_queue_t *, elevator_t *);
 typedef void (elevator_exit_fn) (request_queue_t *, elevator_t *);
@@ -28,6 +29,7 @@
 	elevator_remove_req_fn *elevator_remove_req_fn;
 
 	elevator_queue_empty_fn *elevator_queue_empty_fn;
+	elevator_get_sort_head_fn *elevator_get_sort_head_fn;
 
 	elevator_init_fn *elevator_init_fn;
 	elevator_exit_fn *elevator_exit_fn;
@@ -45,6 +47,8 @@
 extern void elv_merge_requests(request_queue_t *, struct request *,
 			       struct request *);
 extern void elv_remove_request(request_queue_t *, struct request *);
+extern int elv_queue_empty(request_queue_t *);
+extern inline struct list_head *elv_get_sort_head(request_queue_t *, struct request *);
 
 /*
  * noop I/O scheduler. always merges, always inserts new request at tail
@@ -72,6 +82,10 @@
 
 extern int elevator_init(request_queue_t *, elevator_t *, elevator_t);
 extern void elevator_exit(request_queue_t *, elevator_t *);
+extern inline int bio_rq_in_between(struct bio *, struct request *, struct list_head *);
+extern inline int elv_rq_merge_ok(struct request *, struct bio *);
+extern inline int elv_try_merge(struct request *, struct bio *);
+extern inline int elv_try_last_merge(request_queue_t *, struct request **, struct bio *);
 
 /*
  * Return values from elevator merger
@@ -80,10 +94,4 @@
 #define ELEVATOR_FRONT_MERGE	1
 #define ELEVATOR_BACK_MERGE	2
 
-/*
- * will change once we move to a more complex data structure than a simple
- * list for pending requests
- */
-#define elv_queue_empty(q)	list_empty(&(q)->queue_head)
-
 #endif


* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-26 12:02 [PATCH] block/elevator updates + deadline i/o scheduler Jens Axboe
  2002-07-26 18:31 ` Andrew Morton
  2002-07-27  1:22 ` Adam Kropelin
@ 2002-07-30 17:33 ` Bill Davidsen
  2002-08-01  8:51   ` Jens Axboe
  2 siblings, 1 reply; 11+ messages in thread
From: Bill Davidsen @ 2002-07-30 17:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel

On Fri, 26 Jul 2002, Jens Axboe wrote:
> Finally, I've done some testing on it. No testing on whether this really
> works well in real life (that's what I want testers to do), and no
> testing on benchmark performance changes etc. What I have done is
> beat-up testing, making sure it works without corrupting your data. I'm
> fairly confident that it does. Most testing was on SCSI (naturally),
> however IDE has also been tested briefly.

First, great job on the explanation, it went right in my folder for "when
the docs are clear" explanations.

Now a request: if someone is running a database app and tests this, I'd
love to see the results. I expect things like LMbench to show more threads
ending at the same time, but will it help a real app?

I bet it was tested briefly on IDE, my last use of IDE a week or so ago
lasted until I did "make dep" and the output went all over every attached
drive :-( Still, nice to know it will work if IDE makes it into 2.5.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-07-30 17:33 ` Bill Davidsen
@ 2002-08-01  8:51   ` Jens Axboe
  2002-08-01 18:24     ` Bill Davidsen
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2002-08-01  8:51 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel

On Tue, Jul 30 2002, Bill Davidsen wrote:
> On Fri, 26 Jul 2002, Jens Axboe wrote:
> > Finally, I've done some testing on it. No testing on whether this really
> > works well in real life (that's what I want testers to do), and no
> > testing on benchmark performance changes etc. What I have done is
> > beat-up testing, making sure it works without corrupting your data. I'm
> > fairly confident that it does. Most testing was on SCSI (naturally),
> > however IDE has also been tested briefly.
> 
> First, great job on the explanation, it went right in my folder for "when
> the docs are clear" explanations.

Thanks :-)

> Now a request: if someone is running a database app and tests this, I'd
> love to see the results. I expect things like LMbench to show more threads
> ending at the same time, but will it help a real app?

Note that the deadline i/o scheduler only considers deadlines on
individual requests so far, so there's no real guarantee that processes X,
Y, and Z will receive an equal share of the bandwidth. This is something
I'm thinking about, though.
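
(To make that concrete: the only deadline state lives on the request
itself. The sketch below is purely illustrative; the names loosely follow
the patch, but it is not the patch code. It also shows why there is
nothing to balance bandwidth between tasks: no per-process state exists
at all.)

	#include <linux/blkdev.h>	/* struct request */
	#include <linux/sched.h>	/* jiffies */

	/* illustrative sketch only, not the patch code */
	struct deadline_rq_sketch {
		struct request *request;	/* the block layer request */
		unsigned long expires;		/* per-request deadline, in jiffies */
	};

	/* on insert, the deadline depends only on the data direction */
	static void sketch_set_deadline(struct deadline_rq_sketch *drq,
					unsigned long fifo_expire[2], int dir)
	{
		drq->expires = jiffies + fifo_expire[dir];
		/*
		 * a per-process bandwidth guarantee would need extra state
		 * here, e.g. which task queued the request and how much i/o
		 * it has outstanding; none of that is tracked today
		 */
	}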

My testing does seem to indicate that the deadline scheduler is fairer
than the linus scheduler, but ymmv.

> I bet it was tested briefly on IDE, my last use of IDE a week or so ago
> lasted until I did "make dep" and the output went all over every attached
> drive :-( Still, nice to know it will work if IDE makes it into 2.5.

:/

I'll say that 2.5.29 IDE did work fine for the testing I did with the
deadline scheduler, at least it survived a dbench 64 (that's about the
testing it got).

-- 
Jens Axboe



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-08-01  8:51   ` Jens Axboe
@ 2002-08-01 18:24     ` Bill Davidsen
  2002-08-02  8:16       ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Bill Davidsen @ 2002-08-01 18:24 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel

On Thu, 1 Aug 2002, Jens Axboe wrote:

> On Tue, Jul 30 2002, Bill Davidsen wrote:

> > Now a request: if someone is running a database app and tests this, I'd
> > love to see the results. I expect things like LMbench to show more threads
> > ending at the same time, but will it help a real app?
> 
> Note that the deadline i/o scheduler only considers deadlines on
> individual requests so far, so there's no real guarantee that processes X,
> Y, and Z will receive an equal share of the bandwidth. This is something
> I'm thinking about, though.

I'm not sure that equal bandwidth and deadlines are compatible; some
processes may need more bandwidth to meet deadlines. I'm not unhappy about
that; it's the reads waiting in a flood of writes, or vice versa, that
occasionally make an app really rot.
 
> My testing does seem to indicate that the deadline scheduler is fairer
> than the linus scheduler, but ymmv.
> 
> > I bet it was tested briefly on IDE, my last use of IDE a week or so ago
> > lasted until I did "make dep" and the output went all over every attached
> > drive :-( Still, nice to know it will work if IDE makes it into 2.5.
> 
> :/
> 
> I'll say that 2.5.29 IDE did work fine for the testing I did with the
> deadline scheduler, at least it survived a dbench 64 (that's about the
> testing it got).

Hum, looks as if I should plan to retest with 29 and the deadline patches.
My personal test is a program doing one read/sec while making CD images
(mkisofs) on a machine with large memory. It tends to build the whole
700MB in memory and then bdflush decides it should be written and the read
latency totally goes to hell. I will say most of this can be tuned out at
some small cost to total performance (thanks Andrea).
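
(A rough sketch of such a once-a-second read probe, purely for
illustration; the path and offsets are made up, and the target file
should be much larger than RAM, or opened O_DIRECT, so the reads
actually reach the disk rather than the page cache:)

	#define _XOPEN_SOURCE 500
	#include <stdio.h>
	#include <stdlib.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/time.h>
	#include <sys/types.h>

	int main(int argc, char **argv)
	{
		const char *path = argc > 1 ? argv[1] : "/mnt/test/bigfile";
		char buf[4096];
		struct timeval t1, t2;
		int fd = open(path, O_RDONLY);

		if (fd < 0) {
			perror("open");
			return 1;
		}

		for (;;) {
			/* pick a random 4k block somewhere in the file */
			off_t off = (off_t)(random() % (1 << 18)) * 4096;
			long ms;

			gettimeofday(&t1, NULL);
			if (pread(fd, buf, sizeof(buf), off) < 0)
				perror("pread");
			gettimeofday(&t2, NULL);

			ms = (t2.tv_sec - t1.tv_sec) * 1000 +
			     (t2.tv_usec - t1.tv_usec) / 1000;
			printf("read latency: %ld ms\n", ms);
			fflush(stdout);
			sleep(1);
		}
	}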

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: [PATCH] block/elevator updates + deadline i/o scheduler
  2002-08-01 18:24     ` Bill Davidsen
@ 2002-08-02  8:16       ` Jens Axboe
  0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2002-08-02  8:16 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 2301 bytes --]

On Thu, Aug 01 2002, Bill Davidsen wrote:
> On Thu, 1 Aug 2002, Jens Axboe wrote:
> 
> > On Tue, Jul 30 2002, Bill Davidsen wrote:
> 
> > > Now a request: if someone is running a database app and tests this, I'd
> > > love to see the results. I expect things like LMbench to show more threads
> > > ending at the same time, but will it help a real app?
> > 
> > Note that the deadline i/o scheduler only considers deadlines on
> > individual requests so far, so there's no real guarantee that processes X,
> > Y, and Z will receive an equal share of the bandwidth. This is something
> > I'm thinking about, though.
> 
> I'm not sure that equal bandwidth and deadlines are compatible; some
> processes may need more bandwidth to meet deadlines. I'm not unhappy about
> that; it's the reads waiting in a flood of writes, or vice versa, that
> occasionally make an app really rot.

Correct, that's _exactly_ the case that made me attempt this type of io
scheduler. The current scheduler does well wrt optimizing for
throughput.

> > My testing does seem to indicate that the deadline scheduler is fairer
> > than the linus scheduler, but ymmv.
> > 
> > > I bet it was tested briefly on IDE, my last use of IDE a week or so ago
> > > lasted until I did "make dep" and the output went all over every attached
> > > drive :-( Still, nice to know it will work if IDE makes it into 2.5.
> > 
> > :/
> > 
> > I'll say that 2.5.29 IDE did work fine for the testing I did with the
> > deadline scheduler, at least it survived a dbench 64 (that's about the
> > testing it got).
> 
> Hum, looks as if I should plan to retest with 29 and the deadline patches.
> My personal test is a program doing one read/sec while making CD images
> (mkisofs) on a machine with large memory. It tends to build the whole
> 700MB in memory and then bdflush decides it should be written and the read
> latency totally goes to hell. I will say most of this can be tuned out at
> some small cost to total performance (thanks Andrea).

You are most welcome to test, of course; I would very much like you to do
just that. 2.5.30 is out now and includes the elv-queue* patch, which
provides the necessary base changes for applying deadline-iosched, so
it's even easier now. I've attached the latest deadline-iosched-8 for
you.

-- 
Jens Axboe


[-- Attachment #2: deadline-iosched-8 --]
[-- Type: text/plain, Size: 14373 bytes --]

diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.30/drivers/block/deadline-iosched.c linux/drivers/block/deadline-iosched.c
--- /opt/kernel/linux-2.5.30/drivers/block/deadline-iosched.c	1970-01-01 01:00:00.000000000 +0100
+++ linux/drivers/block/deadline-iosched.c	2002-07-31 12:56:33.000000000 +0200
@@ -0,0 +1,514 @@
+/*
+ *  linux/drivers/block/deadline-iosched.c
+ *
+ *  Deadline i/o scheduler.
+ *
+ *  Copyright (C) 2002 Jens Axboe <axboe@suse.de>
+ */
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <linux/elevator.h>
+#include <linux/bio.h>
+#include <linux/blk.h>
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/compiler.h>
+#include <linux/hash.h>
+
+/*
+ * feel free to try other values :-). *_expire values are the timeouts
+ * for reads and writes, our goal is to start a request "around" the time
+ * when it expires. fifo_batch is how many steps along the sorted list we
+ * will take when the front fifo request expires.
+ */
+static int read_expire = HZ;
+static int write_expire = 10 * HZ;
+static int fifo_batch = 80;
+static int seek_cost = 20;
+
+static const int deadline_hash_shift = 8;
+#define DL_HASH_BLOCK(sec)	((sec) >> 3)
+#define DL_HASH_FN(sec)		(hash_long(DL_HASH_BLOCK((sec)), deadline_hash_shift))
+#define DL_HASH_ENTRIES		(1 << deadline_hash_shift)
+
+#define DL_INVALIDATE_HASH(dd)				\
+	do {						\
+		if (!++(dd)->hash_valid_count)		\
+			(dd)->hash_valid_count = 1;	\
+	} while (0)
+
+struct deadline_data {
+	struct list_head sort_list;	/* sorted list */
+	struct list_head fifo_list[2];	/* fifo list */
+	struct list_head *dispatch;	/* driver dispatch queue */
+	struct list_head *hash;		/* request hash */
+	sector_t last_sector;		/* last sector sent to drive */
+	unsigned long hash_valid_count;	/* barrier hash count */
+	int fifo_check;			/* check this fifo list first */
+
+	/*
+	 * settings that change how the i/o scheduler behaves
+	 */
+	unsigned int fifo_batch;
+	unsigned long fifo_expire[2];
+	unsigned int seek_cost;
+};
+
+struct deadline_rq {
+	struct list_head fifo;
+	struct list_head hash;
+	unsigned long hash_valid_count;
+	struct request *request;
+	unsigned long expires;
+};
+
+static kmem_cache_t *drq_pool;
+
+#define RQ_DATA(rq)		((struct deadline_rq *) (rq)->elevator_private)
+
+static inline void
+deadline_move_to_dispatch(struct deadline_data *dd, struct request *rq)
+{
+	struct deadline_rq *drq = RQ_DATA(rq);
+
+	list_move_tail(&rq->queuelist, dd->dispatch);
+	list_del_init(&drq->fifo);
+}
+
+static inline void __deadline_del_rq_hash(struct deadline_rq *drq)
+{
+	drq->hash_valid_count = 0;
+	list_del_init(&drq->hash);
+}
+
+#define ON_HASH(drq)	(drq)->hash_valid_count
+static inline void deadline_del_rq_hash(struct deadline_rq *drq)
+{
+	if (ON_HASH(drq))
+		__deadline_del_rq_hash(drq);
+}
+
+static inline void
+deadline_add_rq_hash(struct deadline_data *dd, struct deadline_rq *drq)
+{
+	struct request *rq = drq->request;
+
+	BUG_ON(ON_HASH(drq));
+
+	drq->hash_valid_count = dd->hash_valid_count;
+	list_add(&drq->hash, &dd->hash[DL_HASH_FN(rq->sector +rq->nr_sectors)]);
+}
+
+#define list_entry_hash(ptr)	list_entry((ptr), struct deadline_rq, hash)
+static struct request *
+deadline_find_hash(struct deadline_data *dd, sector_t offset)
+{
+	struct list_head *hash_list = &dd->hash[DL_HASH_FN(offset)];
+	struct list_head *entry, *next = hash_list->next;
+	struct deadline_rq *drq;
+	struct request *rq = NULL;
+
+	while ((entry = next) != hash_list) {
+		next = entry->next;
+		drq = list_entry_hash(entry);
+
+		BUG_ON(!drq->hash_valid_count);
+
+		if (!rq_mergeable(drq->request)
+		    || drq->hash_valid_count != dd->hash_valid_count) {
+			__deadline_del_rq_hash(drq);
+			continue;
+		}
+
+		if (drq->request->sector + drq->request->nr_sectors == offset) {
+			rq = drq->request;
+			break;
+		}
+	}
+
+	return rq;
+}
+
+static int
+deadline_merge(request_queue_t *q, struct request **req, struct bio *bio)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct list_head *entry = &dd->sort_list;
+	struct request *__rq;
+	int ret = ELEVATOR_NO_MERGE;
+
+	/*
+	 * try last_merge to avoid going to hash
+	 */
+	if ((ret = elv_try_last_merge(q, req, bio)))
+		goto out;
+
+	/*
+	 * see if the merge hash can satisfy a back merge
+	 */
+	if ((__rq = deadline_find_hash(dd, bio->bi_sector))) {
+		BUG_ON(__rq->sector + __rq->nr_sectors != bio->bi_sector);
+
+		if (elv_rq_merge_ok(__rq, bio)) {
+			*req = __rq;
+			q->last_merge = &__rq->queuelist;
+			ret = ELEVATOR_BACK_MERGE;
+			goto out;
+		}
+	}
+
+	while ((entry = entry->prev) != &dd->sort_list) {
+		__rq = list_entry_rq(entry);
+
+		BUG_ON(__rq->flags & REQ_STARTED);
+		BUG_ON(__rq->flags & REQ_BARRIER);
+
+		if (!(__rq->flags & REQ_CMD))
+			continue;
+
+		if (!*req && bio_rq_in_between(bio, __rq, &dd->sort_list))
+			*req = __rq;
+
+		if (__rq->sector - bio_sectors(bio) == bio->bi_sector) {
+			if ((ret = elv_try_merge(__rq, bio))) {
+				*req = __rq;
+				q->last_merge = &__rq->queuelist;
+				break;
+			}
+		}
+	}
+
+out:
+	return ret;
+}
+
+static void
+deadline_merge_cleanup(request_queue_t *q, struct request *req, int count)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq = RQ_DATA(req);
+
+	deadline_del_rq_hash(drq);
+	deadline_add_rq_hash(dd, drq);
+}
+
+#define ON_FIFO(drq)	!list_empty(&(drq)->fifo)
+static void
+deadline_merge_request(request_queue_t *q, struct request *req, struct request *next)
+{
+	struct deadline_rq *drq = RQ_DATA(req);
+	struct deadline_rq *dnext = RQ_DATA(next);
+
+	BUG_ON(!drq);
+	BUG_ON(!dnext);
+
+	deadline_merge_cleanup(q, req, 0);
+
+	/*
+	 * if dnext expires before drq, assign its expire time to drq
+	 * and move into dnext position (dnext will be deleted) in fifo
+	 */
+	if (ON_FIFO(drq) && ON_FIFO(dnext)) {
+		if (time_before(dnext->expires, drq->expires)) {
+			list_move(&drq->fifo, &dnext->fifo);
+			drq->expires = dnext->expires;
+		}
+	}
+}
+
+#define list_entry_fifo(ptr)	list_entry((ptr), struct deadline_rq, fifo)
+static int
+deadline_check_fifo(struct deadline_data *dd, struct list_head *fifo_list)
+{
+	struct deadline_rq *drq;
+	struct request *rq;
+	sector_t last_sec;
+	int batch_count;
+
+	if (list_empty(fifo_list))
+		return 0;
+
+	drq = list_entry_fifo(fifo_list->next);
+	if (time_before(jiffies, drq->expires))
+		return 0;
+
+	rq = drq->request;
+	batch_count = dd->fifo_batch;
+	last_sec = dd->last_sector;
+	do {
+		struct list_head *nxt = rq->queuelist.next;
+
+		/*
+		 * take it off the sort and fifo list, move
+		 * to dispatch queue
+		 */
+		deadline_move_to_dispatch(dd, rq);
+
+		if (rq->sector == last_sec)
+			batch_count--;
+		else
+			batch_count -= dd->seek_cost;
+
+		if (nxt == &dd->sort_list)
+			break;
+
+		last_sec = rq->sector + rq->nr_sectors;
+		rq = list_entry_rq(nxt);
+	} while (batch_count > 0);
+
+	/*
+	 * should be entries there now, at least 1
+	 */
+	return 1;
+}
+
+static inline int deadline_check_fifos(struct deadline_data *dd)
+{
+	/*
+	 * check current list, flip if entries were moved
+	 */
+	if (deadline_check_fifo(dd, &dd->fifo_list[dd->fifo_check])) {
+		dd->fifo_check ^= 1;
+		return 1;
+	}
+
+	/*
+	 * check alternate list, leave current list if entries were moved
+	 */
+	if (deadline_check_fifo(dd, &dd->fifo_list[dd->fifo_check ^ 1]))
+		return 1;
+
+	return 0;
+}
+
+static struct request *deadline_next_request(request_queue_t *q)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct request *rq;
+
+	/*
+	 * if still requests on the dispatch queue, just grab the first one
+	 */
+	if (!list_empty(&q->queue_head)) {
+dispatch:
+		rq = list_entry_rq(q->queue_head.next);
+		dd->last_sector = rq->sector + rq->nr_sectors;
+		return rq;
+	}
+
+	/*
+	 * decide whether to just grab first from sorted list, or move a
+	 * batch from the expire list. if it has expired, move 'batch'
+	 * entries to dispatch queue
+	 */
+	if (deadline_check_fifos(dd))
+		goto dispatch;
+
+	if (!list_empty(&dd->sort_list)) {
+		deadline_move_to_dispatch(dd,list_entry_rq(dd->sort_list.next));
+		goto dispatch;
+	}
+
+	return NULL;
+}
+
+static void
+deadline_add_request(request_queue_t *q, struct request *rq, struct list_head *insert_here)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+	struct deadline_rq *drq;
+
+	/*
+	 * insert barriers directly into the back of the dispatch queue,
+	 * don't use the sorted or fifo list for those
+	 */
+	if (unlikely(rq->flags & REQ_BARRIER)) {
+		DL_INVALIDATE_HASH(dd);
+		list_add_tail(&rq->queuelist, &q->queue_head);
+		q->last_merge = NULL;
+		return;
+	}
+
+	/*
+	 * add to sort list
+	 */
+	if (!insert_here)
+		insert_here = dd->sort_list.prev;
+
+	list_add(&rq->queuelist, insert_here);
+
+	if (unlikely(!(rq->flags & REQ_CMD)))
+		return;
+
+	/*
+	 * set expire time and add to fifo list
+	 */
+	drq = RQ_DATA(rq);
+	drq->expires = jiffies + dd->fifo_expire[rq_data_dir(rq)];
+	list_add_tail(&drq->fifo, &dd->fifo_list[rq_data_dir(rq)]);
+
+	if (rq_mergeable(rq))
+		deadline_add_rq_hash(dd, drq);
+
+	if (!q->last_merge)
+		q->last_merge = &rq->queuelist;
+}
+
+static void deadline_remove_request(request_queue_t *q, struct request *rq)
+{
+	struct deadline_rq *drq = RQ_DATA(rq);
+
+	if (drq) {
+		list_del_init(&drq->fifo);
+		deadline_del_rq_hash(drq);
+	}
+}
+
+static int deadline_queue_empty(request_queue_t *q)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	if (!list_empty(&q->queue_head) || !list_empty(&dd->sort_list))
+		return 0;
+
+	BUG_ON(!list_empty(&dd->fifo_list[READ]));
+	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
+	return 1;
+}
+
+static struct list_head *
+deadline_get_sort_head(request_queue_t *q, struct request *rq)
+{
+	struct deadline_data *dd = q->elevator.elevator_data;
+
+	return &dd->sort_list;
+}
+
+static void deadline_exit(request_queue_t *q, elevator_t *e)
+{
+	struct deadline_data *dd = e->elevator_data;
+	struct deadline_rq *drq;
+	struct request *rq;
+	int i;
+
+	BUG_ON(!list_empty(&dd->fifo_list[READ]));
+	BUG_ON(!list_empty(&dd->fifo_list[WRITE]));
+	BUG_ON(!list_empty(&dd->sort_list));
+
+	for (i = 0; i < 2; i++) {
+		struct request_list *rl = &q->rq[i];
+		struct list_head *entry = &rl->free;
+
+		if (list_empty(&rl->free))
+			continue;
+	
+		while ((entry = entry->next) != &rl->free) {
+			rq = list_entry_rq(entry);
+
+			if ((drq = RQ_DATA(rq)) == NULL)
+				continue;
+
+			rq->elevator_private = NULL;
+			kmem_cache_free(drq_pool, drq);
+		}
+	}
+
+	kfree(dd->hash);
+	kfree(dd);
+}
+
+static int deadline_init(request_queue_t *q, elevator_t *e)
+{
+	struct deadline_data *dd;
+	struct deadline_rq *drq;
+	struct request *rq;
+	int i, ret = 0;
+
+	if (!drq_pool)
+		return -ENOMEM;
+
+	dd = kmalloc(sizeof(*dd), GFP_KERNEL);
+	if (!dd)
+		return -ENOMEM;
+	memset(dd, 0, sizeof(*dd));
+
+	dd->hash = kmalloc(sizeof(struct list_head)*DL_HASH_ENTRIES,GFP_KERNEL);
+	if (!dd->hash) {
+		kfree(dd);
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < DL_HASH_ENTRIES; i++)
+		INIT_LIST_HEAD(&dd->hash[i]);
+
+	INIT_LIST_HEAD(&dd->fifo_list[READ]);
+	INIT_LIST_HEAD(&dd->fifo_list[WRITE]);
+	INIT_LIST_HEAD(&dd->sort_list);
+	dd->dispatch = &q->queue_head;
+	dd->fifo_batch = fifo_batch;
+	dd->fifo_expire[READ] = read_expire;
+	dd->fifo_expire[WRITE] = write_expire;
+	dd->seek_cost = seek_cost;
+	dd->hash_valid_count = 1;
+	dd->fifo_check = READ;
+	e->elevator_data = dd;
+
+	for (i = 0; i < 2; i++) {
+		struct request_list *rl = &q->rq[i];
+		struct list_head *entry = &rl->free;
+
+		if (list_empty(&rl->free))
+			continue;
+	
+		while ((entry = entry->next) != &rl->free) {
+			rq = list_entry_rq(entry);
+
+			drq = kmem_cache_alloc(drq_pool, GFP_KERNEL);
+			if (!drq) {
+				ret = -ENOMEM;
+				break;
+			}
+
+			memset(drq, 0, sizeof(*drq));
+			INIT_LIST_HEAD(&drq->fifo);
+			INIT_LIST_HEAD(&drq->hash);
+			drq->request = rq;
+			rq->elevator_private = drq;
+		}
+	}
+
+	if (ret)
+		deadline_exit(q, e);
+
+	return ret;
+}
+
+static int __init deadline_slab_setup(void)
+{
+	drq_pool = kmem_cache_create("deadline_drq", sizeof(struct deadline_rq),
+				     0, SLAB_HWCACHE_ALIGN, NULL, NULL);
+
+	if (!drq_pool)
+		panic("deadline: can't init slab pool\n");
+
+	return 0;
+}
+
+module_init(deadline_slab_setup);
+
+elevator_t iosched_deadline = {
+	.elevator_merge_fn = 		deadline_merge,
+	.elevator_merge_cleanup_fn =	deadline_merge_cleanup,
+	.elevator_merge_req_fn =	deadline_merge_request,
+	.elevator_next_req_fn =		deadline_next_request,
+	.elevator_add_req_fn =		deadline_add_request,
+	.elevator_remove_req_fn =	deadline_remove_request,
+	.elevator_queue_empty_fn =	deadline_queue_empty,
+	.elevator_get_sort_head_fn =	deadline_get_sort_head,
+	.elevator_init_fn =		deadline_init,
+	.elevator_exit_fn =		deadline_exit,
+};
+EXPORT_SYMBOL(iosched_deadline);
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.30/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.5.30/drivers/block/ll_rw_blk.c	2002-08-01 23:16:07.000000000 +0200
+++ linux/drivers/block/ll_rw_blk.c	2002-08-02 09:21:44.000000000 +0200
@@ -1114,7 +1114,11 @@
 	if (blk_init_free_list(q))
 		return -ENOMEM;
 
+#if 1
+	if ((ret = elevator_init(q, &q->elevator, iosched_deadline))) {
+#else
 	if ((ret = elevator_init(q, &q->elevator, elevator_linus))) {
+#endif
 		blk_cleanup_queue(q);
 		return ret;
 	}
@@ -2071,8 +2075,8 @@
 	queue_nr_requests = (total_ram >> 8) & ~15;	/* One per quarter-megabyte */
 	if (queue_nr_requests < 32)
 		queue_nr_requests = 32;
-	if (queue_nr_requests > 256)
-		queue_nr_requests = 256;
+	if (queue_nr_requests > 768)
+		queue_nr_requests = 768;
 
 	/*
 	 * Batch frees according to queue length
diff -urN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.5.30/include/linux/elevator.h linux/include/linux/elevator.h
--- /opt/kernel/linux-2.5.30/include/linux/elevator.h	2002-08-01 23:16:03.000000000 +0200
+++ linux/include/linux/elevator.h	2002-07-30 09:15:00.000000000 +0200
@@ -63,6 +63,12 @@
 #define elv_linus_sequence(rq)	((long)(rq)->elevator_private)
 
 /*
+ * deadline i/o scheduler. uses request time outs to prevent indefinite
+ * starvation
+ */
+extern elevator_t iosched_deadline;
+
+/*
  * use the /proc/iosched interface, all the below is history ->
  */
 typedef struct blkelv_ioctl_arg_s {


